Outsourcing and Out-tasking Best Practices

I recently published this post first at InformationWeek and it generated quite a few comments, both published and several sent directly via e-mail.  I would note that a strong theme is the frustration of talented staff dealing with senior leadership that does not understand how IT works or does not appear to be focused on the long-term interests of the company. It is a key responsibility of leadership to keep these interests at the core of their approach, especially when executing complex efforts like outsourcing or offshoring, so that they achieve the benefits and do not harm their company. I think the national debate occurring at this time between Romney and Obama only serves to show how complex executing these efforts is. As part of a team, we were able to adjust and effectively resolve many different situations, and I have extracted much of that knowledge here. If you are looking to outsource or are dealing with an inherited arrangement, this post should assist you in improving your approach and execution.

While the general trend of more IT outsourcing via smaller, more focused deals continues, it remains an area that is difficult for IT management to navigate successfully.  In my experience, every large shop that I have turned around had significant problems caused or made worse by its outsourcing arrangements, particularly the large deals. While these shops performed poorly primarily for other reasons (leadership, process failures, talent issues), achieving better performance in these situations required substantially revamping or reversing the outsourcing arrangements. And various industries continue to be littered with examples of failed outsourcing, many involving leading outsource firms (IBM, Accenture, etc.) and reputable clients. While formal statistics are hard to come by (in part because companies are loath to report failure publicly), my estimate is that at least 25%, and possibly more than 50%, fail or perform very poorly. Why do the failures occur? And what should you do when engaging in outsourcing to improve the probability of success?

Much of the success – or failure – depends on what you choose to outsource, followed by effectively managing the vendor and service. You should be highly selective about both the extent and the activities you choose for outsourcing. A frequent mistake is the assumption that any activity that is not ‘core’ to a company can and should be outsourced to enable focus on the ‘core’ competencies. I think this perspective originates from principles first proposed in The Discipline of Market Leaders by Michael Treacy and Fred Wiersema. In essence, Treacy and Wiersema state that companies that are market leaders do not try to be all things to all customers. Instead, market leaders recognize their competency in either product and innovation leadership, customer service and intimacy, or operational excellence. Good corporate examples of each would be 3M for product, Nordstrom for service, and FedEx for operational excellence. Thus business strategy should not attempt to excel at all three areas but instead leverage an area of strength and extend it further while maintaining acceptable performance elsewhere. And by focusing on corporate competency, the company can improve market position and success. But IT is generally absolutely critical to improving customer knowledge and intimacy, and thus customer service. Similarly, achieving outstanding operational competency requires highly reliable and effective IT systems backing your operational processes.  And even in product innovation, IT plays a larger and larger role as products become more digital and smarter.

Because of this intrinsic linkage to company products and services, IT is not like a security guard force, nor like legal staff — two areas that are commonly fully or highly outsourced (and generally, quite successfully). And by outsourcing intrinsic capabilities, companies put their core competency at risk. In a recent University of Utah business school article, the authors found significantly higher rates of failure among firms that had outsourced. They concluded that “companies need to retain adequate control over specialized components that differentiate their products or have unique interdependencies, or they are more likely to fail to survive.” My IT best practice rule is: ‘You must control your critical IP (intellectual property).’ If you use an outsourcer to develop and deliver the key features or services that differentiate your products and define your company’s success, then you have someone with different goals and interests doing your work, someone who can typically turn around and sell those advances to your competitors. Why would you turn over your company’s fate to someone else? Be wary of approaches that recommend outsourcing because IT is not a ‘core’ competency when, with every year that passes, there is greater IT content in products in nearly every industry. Choose instead to outsource those activities where you do not have scale (or cost advantage), capacity, or competence, but ensure that you either retain or build the key design, integration, and management capabilities in-house.

Another frequent reason for outsourcing is to achieve cost savings. And while most small and mid-sized companies do not have the scale to achieve cost parity with a large outsourcer, nearly all large companies, and many mid-sized ones, do have the scale.  Further, nearly every outsourcing deal that I have reversed in the past 20 years yielded savings of at least 30% and often much more. For a large firm, an outsourcer can deliver cost savings across a broad set of services only if the current shop is mediocre. If you have a well-run shop, your all-in costs will be similar to the better outsource firms’ costs. If you are world-class, you can beat the outsourcer by 20-40%.

Even more, the outsourcer’s cost advantage typically degrades over time. Note that the goals of the outsourcer are to increase revenue and margin (or, put another way, to increase your costs while spending fewer resources doing your work). Invariably, the outsourcer will find ways to charge you more, usually for changes to services, while minimizing the work being done. And where previously you could use your ‘run’ resources to complete minor fixes and upgrades, once outsourced you may find you are charged separately for those very same efforts. I have often seen ‘run’ functions hollowed out and minimized while the customer pays a premium for every change or increase in volume. And while the usual response to such a situation is that the customer can put terms in the contract to avoid this, I have yet to see terms that ensure the outsourcer works in your best interest, doing the ‘right’ thing throughout the life of the contract. One interesting example that I reversed a few years back was an outsourced desktop provisioning and field support function for a major bank (a $55M/year contract). When an initial (surprise) review of the function was done, there were warehouses full of both obsolete equipment that should have been disposed of and new equipment that should have been deployed. Why? Because the outsourcer was paid to maintain all equipment, whether in use in the offices or in a warehouse, and they had full control of the logistics function (here, the critical IP). So, in effect, they had ordered up their own revenue. Further, the service had degraded over the years as the initial workforce had been hollowed out and replaced with less qualified individuals. The solution? We immediately in-sourced the logistics function back to a rebuilt in-house team with cost and quality goals established. Then we split the field support geography and conducted a competitive auction to select two firms to handle the work.
Every six months each firm’s performance would be evaluated for quality, timeliness, and cost, and the higher-performing firm would gain further territory. The lower-performing firm would lose territory or be at risk of replacement. And we maintained a small but important pool of field support experts to ensure training and capabilities were kept up to par, service routines were updated, and chronic issues resolved. The end result was far better quality and service, and the cost of the services was slashed by over 40% (from $55M/year to less than $30M/year). And these results — better quality at lower cost through effective management of the functions and keeping key IP and staff in-house — are the typical results achieved with similar actions across a wide range of services, organizations, and locales.
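The semiannual dual-vendor review described above can be sketched as a simple scoring routine. The weights, the score tolerance, and the 10% territory shift below are illustrative assumptions, not the actual contract terms used at the bank:

```python
def score(vendor):
    """Composite score, higher is better. Inputs are assumed normalized to 0-100.
    The 40/30/30 weighting is a hypothetical choice, not from the contract."""
    return (0.4 * vendor["quality"]
            + 0.3 * vendor["timeliness"]
            + 0.3 * vendor["cost_efficiency"])

def review(vendor_a, vendor_b, shift_pct=10):
    """Semiannual review: the higher-performing firm gains territory."""
    a, b = score(vendor_a), score(vendor_b)
    if abs(a - b) < 2:  # within tolerance: no reallocation this cycle
        return "no change"
    winner = vendor_a["name"] if a > b else vendor_b["name"]
    return f"{winner} gains {shift_pct}% territory"
```

The key design point is that both firms know the next review is coming and that territory (revenue) moves with measured performance, which keeps the competitive pressure in place for the life of the arrangement.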

When I was at BankOne, working under Jamie Dimon and his COO Austin Adams, they provided the support for us to tackle bringing back in-house what had been, at its consummation in 1998, the largest outsourcing deal ever. Three years after the outsourcing had started, it had become a millstone around BankOne’s neck. Costs had gone up every year, and quality had continued to erode to the point where systems availability and customer complaints were the worst in the industry. In sum, it was a burning platform. In 2001 we cut the deal short (it was scheduled to run another four years). In the next 18 months, after hiring 2,200 infrastructure staff (via best practice talent acquisition) and revamping the processes and infrastructure, we reduced defects (and downtime) to 1/20th of the 2001 levels and reduced our ongoing expenses by over $200M per year. This significantly supported the bank’s turnaround and enabled the merger with JP Morgan a few years later.  As for having in-house staff do critical work, Jamie Dimon said it best: ‘Who do you want doing your key work? Patriots or mercenaries?’

Delivering comparable cost to an outsourcer is not that difficult for mid-sized to large IT shops. Note that the outsourcer must include roughly a 20% margin in their long-term costs (though they may opt to reduce profits in the first year or two of the contract) as well as the cost of an account team. And, in Europe, they must add 15% to 20% VAT. Further, they will typically avoid making the small investments required for continuous improvement over time. Thus, three to five years out, nearly all outsourcing arrangements cost 25% to 50% more than a well-run in-house service (which will have the further benefit of higher quality). You should set the bar that your in-house services deliver comparable or better value than typical outsourced alternatives. But ensure you have the leadership in place, and provide the support for them to reach such a capability.
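As a rough illustration of the arithmetic above: the 20% margin and the 15% to 20% VAT come from the text, while the 5% account-team overhead is an assumed figure for the sketch:

```python
def outsourcer_all_in(in_house_base, margin=0.20, account_team=0.05, vat=0.0):
    """Rough all-in outsourcer price for the same service volume.
    margin and vat mirror the figures in the text; the 5% account-team
    overhead is an illustrative assumption."""
    cost = in_house_base * (1 + margin) * (1 + account_team)
    return cost * (1 + vat)

# A $10M/year in-house service, priced by an outsourcer in Europe (20% VAT):
# 10.0 * 1.20 * 1.05 * 1.20 = ~15.1, i.e. roughly 50% above the in-house cost,
# before any degradation from skipped continuous-improvement investment.
```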

But like any tool or management approach, used properly and in the right circumstances, outsourcing is a benefit to the company. As a leader you cannot focus on all company priorities at once, nor would you have the staff to deliver on them even if you could. And in some areas, such as field support, there are natural economies of scale that benefit a third party doing the same work for many companies. So consider outsourcing in these areas, but weigh the extent of the outsourcing carefully. Ensure that you still retain critical IP and control. Or use outsourcing to augment and increase your capacity, or where you can leverage best-in-class specialized services to your company’s benefit. Then, once the vendor is selected and the deal effectively negotiated, manage the outsourcing vendor effectively. Since effective management of large deals is complex and extremely difficult, it is far better to do small outsourcing deals or selective out-tasking. The management of the outsourcing should be handled like any significant in-house function: SLAs are established, proper operational metrics are gathered, performance is regularly reviewed with management, and actions are noted and tracked to address issues or improve service. Properly constructed contracts that accommodate potential failure are key if things do not go well. Senior management should jointly review the service every three to six months, and consequences must be in place for performance (good or bad).
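As a small sketch of the SLA review loop described above. The metric names and targets are hypothetical, and for simplicity every metric is treated as "higher is better":

```python
def sla_review(metrics, slas):
    """Return the metrics breaching their SLA targets.
    Both dicts map metric name -> value; all metrics here are
    'higher is better' percentages (an illustrative simplification)."""
    return {name: value for name, value in metrics.items()
            if name in slas and value < slas[name]}

# Example review cycle with hypothetical figures:
# sla_review({"availability": 99.2, "first_call_resolution": 85.0},
#            {"availability": 99.9, "first_call_resolution": 80.0})
# flags availability as a breach to be tracked with remediation actions.
```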

Well-selected and well-managed outsourcing will then complement your in-house team, along with more traditional approaches that leverage contractors for peak workloads or projects, or the modern alternative of using cloud services to out-task some functions and applications. With these best practices in place and a selective hand, your IT shop and company can benefit from outsourcing and avoid the failures.

What experiences have you had with outsourcing? Do you see improvement in how companies leverage such services? I look forward to your comments.

Best, Jim Ditmore

Optimizing Technology Infrastructure with Master Craftsmen

One of my former colleagues, Ralph Bertrum, has provided the primary material for today’s post on how to optimize a technology infrastructure with master craftsmen. Ralph is one of these master craftsmen in the mainframe infrastructure space. If you are a CIO or senior IT leader looking to improve your shop’s cost or performance, I recommend optimizing your infrastructure and systems through high payback platform efficiency reviews.

In today’s shops, often with development and coding partially or fully outsourced, and not enough experienced and capable resources on staff, many applications are built for functionality without much regard for efficiency.  And nearly every shop has legacy applications where few engineers, if any, actually understand how they work. These applications have often been patched and extended so much that just having them run is viewed as the goal, rather than having them run effectively and efficiently. The result is that, for most shops, 10%, 20%, or even 30% of their compute and storage capacity is wasted on inefficient systems. This happens partly because it is easiest to just throw hardware at a problem and partly because shops do not have the superior engineering resources — or master craftsmen — required to locate, identify, and resolve such inefficiencies. Yet it is a tremendous waste and a major recurring cost for the IT shop, and attacking these inefficiencies is a significant opportunity for IT leaders.  In my experience, every one of these master craftsmen, if given the framework and support, can easily return 4 to 10 times their annual compensation in savings each quarter!

So, how do you go about building and then leveraging such an efficiency engineering capability? First, to build the capability, you must be willing to invest in select engineers who are heavily dedicated to this work. I recommend focusing on mainframe efficiency and server efficiency (Unix and Windows) as the primary areas of opportunity. Given the different skill sets, you should establish two separate efficiency teams for these two areas. Storage usage should be reviewed as part of each team’s mission. A small team of two to four individuals is a good starting point. You can either acquire experienced talent or build up by leveraging promising engineers on staff and augmenting with experienced contractors until your staff have attained full capability. Ensure you invest in the more sophisticated tools needed to instrument the systems and diagnose the issues.  And importantly, ensure their recommended application and system changes are treated with priority and implemented so the savings can be achieved. A monthly report on the recommendations and results completes the team and framework.

Now for the approach, which Ralph Bertrum, an experienced (perhaps even old-school) efficiency engineer, has provided for mainframe systems:

Having spent 50 years in Information Technology working on Mainframe Computers, I have seen a great many misunderstandings.  The greatest single misunderstanding is of the value and impact of systems engineering training and experience and its use in performing maintenance on a very costly investment. Many CIOs prefer to purchase a computer engine upgrade and continue to run a wasteful collection of jobs on a new, faster machine.  It is the easiest way out but definitely not the most cost effective.  It is the equivalent of trading in your car every time the air filter, spark plugs, or hoses need changing or the tires need air, and then moving the old air filter, old spark plugs, old hoses, and old tires to the new car.

Would you drive around with a thousand pound bag of sand in the trunk of your car?  Would you pull a thousand pound anchor down the street behind your car?  That is exactly what you are doing when you don’t regularly review  and improve the Job Control Language (JCL), Programs, and Files that run on your Mainframe Computer.  And would you transfer that thousand pound bag of sand and that anchor to your new car every time you purchased a new one?  Most IT shops are doing that with every new mainframe upgrade they make.  Every time they upgrade their computer they simply port all the inefficiencies to the upgrade.

Platform efficiency reviews will reduce waste impacting all kinds of resources: CPU, storage, memory, I/O, networks. And the results will make the data center greener and reduce electricity bills and footprints of equipment, speed online and batch processing, eliminate or delay the need for  upgrades, reduce processing wall times and cycle time, and ultimately improve employee efficiency, customer satisfaction and company profitability.

You can apply platform efficiency reviews to any server but let’s use the mainframe as a primary example. And, we will extend the analogy to a car because there are many relevant similarities.

Both automobiles and computers have a common need for maintenance.  An automobile needs to have the oil, the air filter, and spark plugs changed, tires rotated and the tire air pressure checked.  All of these are performed regularly and save a large amount of gas over the useful life of the automobile and extend the life of the car.  Reasonable maintenance on a car can improve mileage by three to four miles per gallon or about a 20% improvement. When maintenance is not performed the gas mileage begins to degrade and the automobile becomes sluggish, loses its reliability and soon will be traded in for a newer model.  The sand is growing in weight every day and the anchor is getting heavier.

For a mainframe, maintenance is not just upgrading the systems software to the most recent version. Additional maintenance work must be done to the application software and its databases. The transactions, files, programs, and JCL must be reviewed, adjusted, and tuned in order to identify hot spots, inefficient code, and poor configurations that have been introduced with application changes, additional volume, or different mixes. Over the last twenty-five years I have analyzed and tuned millions of Mainframe Computer jobs, files, JCL, and programs for more than one thousand major data centers, and all of them were improved.  I have never seen a Mainframe Computer that couldn’t have its costs reduced by at least 10% to 15%, and more likely 20%, through a platform efficiency review.

Often, there are concerns that doing such tuning can introduce issues. Adjusting JCL or file definitions is just as safe as changing a spark plug or putting air in a tire.  It is simple and easy and does not change program logic.   The typical effect is that everything runs better and costs less.  The best thing about maintenance in a data center is that almost all the maintenance lasts much longer than it does in an automobile and stays in effect with continued savings in upgrade after upgrade, year after year.

Think of a Mainframe Computer as a special kind of motor vehicle with thousands of under-inflated tires.  By making simple adjustments you can get improvements in performance from every one of these under-inflated tires. And even though each improvement is small, because there are so many, the improvements multiply to a significant effect.   You get this cost reduction every time the file is used or the transaction executed, and when all the savings from all the little improvements are added together you will get a 15% to 20% reduction in processing costs. The maintenance is a one-time cost that will pay for itself over and over, upgrade after upgrade.
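A quick worked example of the "many small savings" arithmetic above; the figures are hypothetical, chosen only to show how per-execution microsavings compound at volume:

```python
# One small tuning fix saves a fraction of a millisecond of CPU per
# transaction, but the fix is exercised millions of times a day.
ms_saved_per_txn = 0.5          # hypothetical: half a millisecond saved
txns_per_day = 4_000_000        # hypothetical daily transaction volume

cpu_seconds_saved = ms_saved_per_txn * txns_per_day / 1000
# One half-millisecond tweak recovers 2,000 CPU-seconds every day,
# and the saving recurs on every run, upgrade after upgrade.
```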

Here are some areas to focus for performance improvement with examples:

Avoid excessive data retention:  Many IT shops leave data in a file long after its useful life is over, or process data that is not meaningful.  Examples would be payroll records for an employee no longer with the company, General Ledger transactions from previous years, or inventory parts that are no longer sold.  By removing these records from the active file, and saving them in separate archive storage, you save CPU every time the file is used, and work may complete much faster.  For example, an IT shop had an Accounts Receivable file that had 14 million records.  Every day they would run the file through a billing program that produced and mailed invoices.  At that time a first-class postage stamp cost $0.32.  A recommendation was made to the CFO that they purge all billing amounts of $0.32 or less from the billing file; it was silly to pay $0.32 to collect $0.32 or less.  Two million records were removed from the file, the daily job ran four hours faster, and they saved $35,000 a month on CPU and DASD space, to say nothing of employee time and postage costs.  After a trial test period the minimum billing amount was raised to $1.00 and another set of very large savings was accomplished.
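The purge described above amounts to splitting the active file by a billing threshold. A minimal sketch in Python, assuming an illustrative record layout (the real system was a mainframe batch job):

```python
def purge_uncollectable(records, min_amount=0.32):
    """Split receivables: keep records worth billing, archive the rest.
    min_amount mirrors the $0.32 postage-cost threshold in the example;
    the dict record layout is an illustrative assumption."""
    keep = [r for r in records if r["amount"] > min_amount]
    archive = [r for r in records if r["amount"] <= min_amount]
    return keep, archive
```

Every downstream job then processes only the `keep` records, so the CPU saving recurs on every run; the archived records remain available if ever needed.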

Optimize databases for their use:  An IT shop was looking to reduce the run time of a mailing list label system.  After looking at the data, it was found that 90% of the labels were located in California and that the program looked up each city and state from a zip code table.  Each time the program needed a California city name, it had to do ninety thousand zip code table compares before finding the correct city and state for the address.  The table was rearranged to optimize searching for California zip codes, and the job went from running twenty hours to running only one hour.  CPU dropped by over 90%.  This has also worked with online transaction tables and file placement in Local Shared Resource (LSR) buffer pools. Optimizing databases is a key improvement technique.
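The zip-code anecdote is a classic access-pattern fix: a linear scan pays for every non-matching compare, while reordering the table (as was done here) or using a hashed index eliminates that cost entirely. A small Python sketch with a toy stand-in table:

```python
# Tiny illustrative stand-in for the real ~90,000-entry zip code table.
zip_table = [("10001", "New York", "NY"),
             ("60601", "Chicago", "IL"),
             ("90001", "Los Angeles", "CA")]

def linear_lookup(zip_code):
    """The original pattern: compares every entry until it finds a hit,
    so the hottest entries at the end of the table cost the most."""
    for z, city, state in zip_table:
        if z == zip_code:
            return city, state
    return None

# The fix in modern terms: a hashed index makes every lookup one probe,
# regardless of where the entry sits in the table.
zip_index = {z: (city, state) for z, city, state in zip_table}

def indexed_lookup(zip_code):
    return zip_index.get(zip_code)
```

On the mainframe the same effect was achieved simply by moving the most frequently hit (California) entries to the front of the sequentially searched table.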

Optimize the infrastructure configuration for your system.   One shop had jobs that would run very quickly one day and very slowly the next day.  After analyzing the jobs and the file locations it was determined that the public storage pools contained two different types of disk drives.  The Temporary work files would be placed on different disk drives every day.  What was the very best setting for one type of disk was the very worst for the other and this was causing the erratic behavior.  The storage pool was changed to contain only one type of disk device. The problem went away and the jobs ran fast every day.

Tune your systems to match your applications:   A mainframe comes with a great many abilities and features, but if your team is not adept with them, your systems will not be optimized to run well.  I have analyzed over 1,000 data centers and applications and never once failed to discover significant tuning that could be accomplished within the existing system configuration. This occurs because of a lack of training, experience, or focus. Ensure your team places this as a priority and, if needed, bring in experts to adopt best practices. As an example of system tuning, I worked with a shop that had an online file accessing a database and was having a major response time problem. They were afraid they were going to need a very costly upgrade. Every time they entered a transaction the system would go into a 110% utilization mode with paging.  An efficiency analysis was conducted and a system file was discovered that was doing sixteen million input/output (I/O) operations a day. After working with IBM to optimize the configuration, we achieved a 50% drop in I/O, to eight million per day, and response time improved to less than one second.  Apparently the shop had installed the system file as it was delivered and never modified it for their environment.

Tune your configurations to match your hardware:  When you make a hardware change, be sure to make all the necessary software changes as well.  Last year I worked with a very large bank that upgraded their disk drives but forgot to change a System Managed Storage (SMS) storage pool definition, and continued to run forty-five thousand monthly jobs using the worst blocksize possible in two thousand five hundred files.  When found and corrected, the forty-five thousand jobs ran 68% faster, with significant CPU savings as well as a 20% disk space savings.

Ensure you are optimizing the most costly resource:  Remember that the efficient use of disk space is important, but not nearly as important as CPU consumption.  Analysis at another company discovered that, in order to save disk space, many files were using a compression option.  The storage group had implemented this to save DASD space, but in doing so, the increased CPU usage had unwittingly forced a multi-million dollar CPU upgrade. Compression was removed on some of the files, CPU dropped by 20% across the board for both batch and online processing, and another upgrade was delayed for two more years. Optimizing disk usage at the expense of CPU resources may not be a good strategy.
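The DASD-versus-CPU tradeoff above can be measured directly before committing to compression. A rough sketch using Python's zlib as a stand-in for the mainframe compression option:

```python
import time
import zlib

def compression_tradeoff(payload, level=6):
    """Measure the space saved and the CPU time spent compressing.
    A crude stand-in for the DASD-vs-CPU analysis in the example;
    level is zlib's compression level (1 = fast, 9 = smallest)."""
    start = time.process_time()
    compressed = zlib.compress(payload, level)
    cpu_seconds = time.process_time() - start
    space_saved = 1 - len(compressed) / len(payload)
    return space_saved, cpu_seconds

# Keep compression only where the storage saved is worth the CPU burned,
# e.g.: ratio, cpu = compression_tradeoff(b"repetitive record data " * 1000)
```

The decision then becomes an explicit cost comparison (price of the recovered DASD versus price of the extra CPU cycles across every read and write) rather than a blanket storage policy.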

Tune vendor software for your configuration:  Remember that vendors sell their product to thousands of customers, not just you. Each vendor’s product must run in all IBM-compatible environments, and many of those environments will be older or smaller than yours.  When you install vendor software, it should always be adjusted to fit properly in your environment.  Last year I did an analysis for a company that was beginning to have a run time problem.  They had an online viewing product from a vendor, set up to create an online file for each customer.  They had created over three million online files and were adding one hundred thousand new ones every day.  They had run into serious performance issues because they did not understand the vendor software and the setup had been done incorrectly.  So don’t add more sand to your computer by failing to understand how to best use a vendor product and configure it correctly for your system.

Understand your systems and avoid duplication whenever possible:  Duplication of data and work is a common issue. We reviewed one IT shop that backed up the three million online files mentioned above to one hundred volumes of DASD every night, taking five to six hours each night (and missing their SLA every morning).  Analysis showed that these were viewing files that were never updated or changed in any way.  Except for a small number of new files, they were exactly the same unchanged report files backed up over and over.  It would have been much better to just back up the newly created files each night.  After all, how many copies of a report file do you need?
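The fix above (backing up only what is new) is a basic incremental backup. A minimal sketch, where tracking files by name is an illustrative simplification of the real change-detection:

```python
def incremental_backup(all_files, already_backed_up):
    """Return only the files not captured by a previous backup run.
    Mirrors the fix above: static report files never change, so
    re-copying them nightly is pure waste."""
    return [f for f in all_files if f not in already_backed_up]

# e.g. with 3,000,000 files and 100,000 new per day, the nightly run
# copies only the 100,000 new files instead of the full set.
```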

If it’s not used, remove it:  Remember that every file you are not using is typically being backed up and stored every night.  If a file is not used, it should be backed up, archived, and removed.  This space will be released and can be used for other purposes.

So the next time you think you need a computer upgrade, don’t move the thousand-pound bag of sand or connect the anchor to your new computer. Remember that maintenance is easy, simple, safe, and green.  Maintenance has a much greater return on investment than an upgrade.  Conduct a thorough platform efficiency review; it will save you a great deal more than you think, over and over, year after year, upgrade after upgrade, from now on.

Best, Ralph Bertrum and Jim Ditmore

About Ralph: Since 1986, Ralph has been the co-founder and principal of Critical Path Software. He is an inventor, designer, and software developer of the TURBO suite of mainframe analysis tools and expert performance tuning database. He has provided performance tuning services and analyses for over 1,100 major Fortune 5,000 corporations worldwide.  He is a former MVS, VSE, VM, CICS, and EDOS/VSE systems programmer, and an IDMS and IMS DBA.