There is continual pressure in IT to deliver greater efficiency and reduced costs. In addition to the short-term tactics discussed elsewhere in these best practices reference pages, this page covers how to deliver long-term efficiency and cost reduction in IT, and particularly how to do it while maintaining or improving your capabilities. Here we cover how to leverage metrics, benchmarking and continuous process improvement (CPI) to drive significant results over time. Doing so will enable you to move from a reactive approach with mediocre results to a proactive, fully managed approach with first quartile results.
Let’s assume you have begun to tackle the quality and team issues in your shop. One of the key weapons for making better decisions as a manager, and for identifying the areas you need to focus on, is transparency around the operational performance of your shop. This doesn’t mean your team must produce reams of reports for senior management or for you. In fact, that is typically wasted effort, as such reports are as much show as real data. What you require for effective operational transparency is regular reporting of the key operational metrics as they are used by the teams themselves. You and your senior team should be reviewing the Key Process Metrics (KPMs) that the operational teams use to drive their processes day-to-day and week-to-week. These would include things such as change success rates by team, the utilization rates of servers or storage, or a summary of project statuses with regular project reporting behind each status.
The ability to produce and manage using metrics depends substantially on the maturity of your team. You should match the metrics to be reported and leveraged to your team’s maturity, with an eye toward the team mastering its current level and then moving to the next level.
For example, for production metrics, a Level 1 or 2 organization (based on a CMM scale) should start with basic asset and configuration information and totals by type and by business. In addition, you should have basic change volumes and success statistics, incident volumes by severity, initial cause data and service restoration times. A Level 3 organization would have additional information on detailed causal factors. So in addition to change or incident statistics, there would be data on why a change failed or a production incident occurred (the root cause), as well as further clarification as to why that cause occurred (e.g., for a failed change, was it because the change was not documented, because of human error, or because the change did not work as designed?). A Level 4 organization would have information on resolving the root root cause as well as on managing the correction and process improvement itself. Thus you would have information on how long root cause analysis takes to complete, how many root causes are getting resolved, what areas have chronic issues, where you have repeat problems, etc. Consider the analogy of having information on a car. At the lowest level, we have information on the car’s position; then we add information on the car’s velocity; and at the highest level we add its acceleration, engine performance and so on.
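To make the lowest level concrete, here is a minimal sketch of how the basic change and incident statistics might be computed from raw records. The record layouts (`Change`, `Incident`), the field names and the sample data are all assumptions made for illustration; your own tooling will have its own schema.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Change:
    team: str          # hypothetical field: team that made the change
    succeeded: bool    # hypothetical field: True if the change worked


@dataclass
class Incident:
    severity: int              # hypothetical scale: 1 is the most severe
    minutes_to_restore: float  # time until service was restored


def change_success_rate_by_team(changes):
    """Return {team: fraction of that team's changes that succeeded}."""
    totals, successes = Counter(), Counter()
    for change in changes:
        totals[change.team] += 1
        if change.succeeded:
            successes[change.team] += 1
    return {team: successes[team] / totals[team] for team in totals}


def incident_volume_by_severity(incidents):
    """Return {severity: number of incidents at that severity}."""
    return dict(Counter(incident.severity for incident in incidents))


def mean_restore_minutes(incidents):
    """Return the average service restoration time in minutes."""
    return mean(incident.minutes_to_restore for incident in incidents)


# Invented sample data, for illustration only.
changes = [
    Change("network", True), Change("network", True),
    Change("network", True), Change("network", False),
    Change("database", True), Change("database", True),
]
incidents = [Incident(1, 90.0), Incident(2, 30.0), Incident(2, 60.0)]

print(change_success_rate_by_team(changes))
print(incident_volume_by_severity(incidents))
print(mean_restore_minutes(incidents))
```

A Level 3 organization would extend records like these with cause fields, and a Level 4 organization would add tracking of the corrective actions themselves.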
These three levels of information (basic position and volumes; then results, causes and verification metrics; then patterns and root root causes, improvement metrics and effects) can be applied across any IT service or component. And in fact, if they are applied, your management team can move from guessing what to do, to knowing what the performance and capabilities are, to understanding what is causing problems, to scientifically fixing the problems. I would note that many IT managers view putting the metrics in place as a laborious, time-consuming step that may not yield results. They would rather fly and direct by the seat of their pants. This is why so many shops have failed improvement programs and a lack of effective sustained results. You obtain far better results with a scientific, process-oriented approach, and you garner repeatable, cumulative benefits that enable your improvement programs to accelerate and multiply their impact over time.
Another advantage of this level of transparency is that both your team and your business partners can now see measurable results rather than promises or rosy descriptions. You should always seek to make key data a commodity in your organization. Publish key process metrics broadly. This will reinforce your goals and the importance of quality throughout your team, and you will be surprised by the additional benefits gained when employees are able to do a better job because they have visibility into the thrust and impact of their efforts.
And when you do develop key process metrics, ensure you have a solid base of operational metrics coupled with performance and verification metrics. Develop metrics for your key services and products that have not just technical meaning but also business meaning. For example, why publish metrics on number of incidents or outage time (or system time availability percentages) to your business partners? While there is some merit in these data for the technical team, you can publish for the business in more appropriate terms. For availability, publish the customer impact availability: the total number of transactions, less those that failed or where the customer was impacted, divided by the total number of transactions that typically occur or would have occurred in the reporting period. If you tell the business your company had 98.2% Customer Impact Availability, with 50,000 customers impacted by systems issues last month, rather than telling them you had 98.7% system time availability, it becomes real for them. And they will then better understand the importance of investing in quality production.
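As a rough sketch of the difference between the two measures, the calculation might look like the following. The function names and all of the input figures are assumptions chosen for illustration (the transaction counts are invented so that the results land on the percentages used above); they are not real data.

```python
def customer_impact_availability(expected_transactions, impacted_transactions):
    """Percentage of expected transactions completed without customer impact.

    expected_transactions: transactions that typically occur, or would have
        occurred, in the reporting period.
    impacted_transactions: transactions that failed or where the customer
        was impacted.
    """
    if expected_transactions <= 0:
        raise ValueError("expected_transactions must be positive")
    unaffected = expected_transactions - impacted_transactions
    return 100.0 * unaffected / expected_transactions


def system_time_availability(period_minutes, outage_minutes):
    """Percentage of the reporting period during which the system was up."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return 100.0 * (period_minutes - outage_minutes) / period_minutes


# Invented figures: one million expected transactions, 18,000 of them
# impacted, over a 30-day month with 561.6 minutes of outage.
cia = customer_impact_availability(1_000_000, 18_000)
sta = system_time_availability(30 * 24 * 60, 561.6)

print(f"Customer impact availability: {cia:.1f}%")
print(f"System time availability:     {sta:.1f}%")
```

The two numbers diverge whenever outages fall during peak transaction periods, which is exactly the information the business cares about and the time-based figure hides.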
In essence, by taking a metrics-based and CPI approach, we in IT are following in the footsteps of the manufacturing revolution in process that has been underway for the past 60 years. Your team should know the basics of Continuous Process Improvement and Lean. Perhaps give them some books on Deming or others. Our businesses must compete in a lean manufacturing/services world, and our work now demands a level of discipline and sustainable precision not possible without these techniques.
On a final note, once you have base metrics in place, I recommend benchmarking. You will have one of two good outcomes: either you will find that you have great performance and are a leader in the industry (and you can then market this to your business sponsors), or you will find plenty of areas to improve and learn what you should work on.
Have you seen success by applying these approaches? How has your team embraced or resisted taking such a disciplined approach? What would you add or change?
Best, Jim Ditmore