2 ITIL for Guerrillas
2.1 Introduction
This chapter is the only place in the book where I emphasize the process of capacity planning and how it relates to the business side of the operation, rather than the tools and techniques that support the discipline of capacity planning. An internationally recognized framework that emphasizes business processes for IT is called ITIL. The acronym ITIL stands for Information Technology Infrastructure Library; ITIL is quite literally a collection of related manuals and copyrighted books. The objective of this chapter is to outline the ITIL framework and provide some ideas for how Guerrilla capacity planning (GCaP) can be included within the ITIL framework. One of the more significant benefits that derive from understanding the ITIL framework is that it forces you to think about the business impact of capacity planning rather than remaining narrowly focused on the tools and technologies used to achieve capacity planning. One example is to avoid using overly technical capacity planning terminology when presenting the conclusions of your analysis to your ITIL customers. Use their business terminology and units rather than performance metrics like throughput and utilization.
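As a minimal sketch of that advice, the following Python fragment restates a utilization measurement in business units. The function name, the orders-per-hour unit, and all of the numbers are hypothetical. It also assumes that utilization grows in direct proportion to the business workload (constant service demand per order), which is only an approximation.

```python
def headroom_in_business_units(current_rate, current_util, target_util):
    """Return how many more business units per interval fit below target_util.

    Hypothetical simplification: utilization is proportional to the
    business rate, i.e., util = demand_per_unit * rate.
    """
    if current_rate <= 0:
        raise ValueError("current_rate must be positive")
    if not (0 < current_util <= 1 and 0 < target_util <= 1):
        raise ValueError("utilizations must be fractions in (0, 1]")
    demand_per_unit = current_util / current_rate   # busy fraction per unit of rate
    max_rate = target_util / demand_per_unit        # rate at which target_util is reached
    return max_rate - current_rate


# Illustrative numbers only: 12,000 orders/hour observed at 45% busy,
# with a (hypothetical) planning ceiling of 75% busy.
extra = headroom_in_business_units(current_rate=12000, current_util=0.45, target_util=0.75)
print(f"Room for about {extra:,.0f} more orders per hour before reaching 75% busy")
```

The statement handed to the business side is then about orders per hour, not about utilization, even though a utilization measurement produced it.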
2.2 ITIL Background
Historically, ITIL was initiated in the 1980s by the British Office of Government Commerce (OGC), which owns the copyright, as a set of best practices for IT Service Management (ITSM). The original outcome has now come to be known as ITIL version one. Since then it has been updated and published as version two. Although it has been used widely in the UK, allied British Commonwealth countries, and some European countries, it has found much slower adoption in the USA. To further ITIL promotion, a number of user groups have been established. The IT Service Management Forum (itSMF) is an international user group with a Web site at www.itsmf.com, whereas
www.itsmfusa.org is the corresponding USA Web site. ITIL is intended to integrate with other standards such as:

ISO: (International Organization for Standardization, www.iso.org) Perhaps best known for the ISO 9000 standard, which has become an international reference for quality requirements in business-to-business dealings; ISO 14000 looks set to achieve at least as much, if not more, in helping organizations to meet their environmental challenges.

COBIT: (Control Objectives for Information and Related Technology, www.isaca.org) COBIT is an IT governance framework and supporting toolset that allows managers to bridge the gap between control requirements, technical issues and business risks.

CMM: (Capability Maturity Model) CMM for Software, Carnegie Mellon Software Engineering Institute (SEI), has been a model used by many organizations to identify best practices useful in helping them increase the maturity of their processes.

MOF: (Microsoft Operations Framework) A set of Microsoft publications containing their guidelines for IT service management. Although MOF is not the same as ITIL, the framework is built on best practices from ITIL, but directed at the Windows Server platform.

The ITIL framework (Fig. 2.1) addresses seven management areas:
1. service support
2. service delivery
3. planning to implement service management
4. information communication technology infrastructure management
5. applications management
6. the business perspective
7. security management
In this chapter, we focus on ITIL management area 2 (service delivery), because that is where the capacity management processes are defined. Recall from Sect. 1.2.4 that the performance and capacity planning components do not have equal weighting in terms of significance or resources with these other areas of systems management. Capacity management can rightly be regarded as just a subset of systems management, but the infrastructure requirements for successful capacity planning (both the tools and knowledgeable humans to use them) are necessarily out of proportion with the requirements for simpler systems management tasks like software distribution, security, backup, etc. It is self-defeating to try doing capacity planning on the cheap. Remark 2.1. The adoption rate for ITIL in the USA runs the gamut from those companies adopting it wholesale, to others seeing it as just another fad. As a GCaP planner, it should be pretty obvious which environment you are in. Nonetheless, it may be prudent for you to become conversant with
Fig. 2.1. The ITIL framework showing the relative location of the service level management process and the capacity management process
some of the ITIL framework. There are barriers to entry, unfortunately. One of the greatest that I ran into was trying to obtain introductory literature on ITIL, just to see whether I needed to investigate it more thoroughly or not. The published manuals and books comprising the ITIL library are not written at an introductory level and are also prohibitively expensive for an individual to purchase. The closest I came to an “ITIL for Dummies” type of exposition was a complimentary booklet (Rudd 2004) published by itSMF. Try requesting it via email:
[email protected]. The ITIL Toolkit (www.itil-toolkit.com) is another resource designed to guide the novice through the ITIL diagram and acronym jungle. It contains a whole series of resources to help simplify, explain, and manage the process.

2.2.1 Business Perspective
ITIL views IT as a business, so it emphasizes process rather than tools, in order to provide an appropriate interface between business processes and technology. Internal IT shops that were accustomed to having a captive audience or customer base now find this is no longer true because of the advent of such things as outsourcing. Encompassing the broader scope imposed by these recent developments requires a broader framework. Part of the ITIL framework is to recognize that IT products, such as application hosting, are actually composed of services that utilize devices, such as servers, storage, and networks.
Within the context of ITIL management area 2 (service delivery), service level management (SLM) provides the interface to the business (Fig. 2.2). The SLM process negotiates, agrees to, and reviews service requirements for the business side such as SLAs (service level agreements). SLM further specifies service targets that are contained in a set of OLAs (Operational Level Agreements).
Fig. 2.2. The relationship of the capacity management process (and its possible GCaP implementation) to other immediate ITIL processes and the capacity management database (CDB)
Although ITIL is quite literally a collection of related manuals and copyrighted books that document best practices for ITSM, these materials should not be regarded as providing stepwise procedure manuals. ITIL is more about what needs to be done to provide an efficient coupling between IT and business rather than how that coupling is to be achieved. The actual implementation details are left open. Therefore, a lot is left to individual interpretation. In many respects, best practices are really an admission of failure. Copying someone else’s apparent success is like cheating on a test. You may make the grade, but how far is the bluff going to take you? Very quickly you reach the point where the implementation details are needed, and in the area of ITIL capacity management, that is where GCaP comes in.
2.2.2 Capacity Management
The capacity manager is an ITIL process owner responsible for such things as supply and demand, cost-benefit analysis, capability, and requirements. To satisfy these responsibilities under ITIL requires structure, discipline, and organization. Compare this with the capacity homunculus discussed in Chap. 1. To implement the ITIL capacity management process requires skills, technology, and enterprise-wide support. The idea is that the ITIL processes should surmount typical organizational boundaries. One of the most important components within the ITIL specification for capacity management is the Capacity Management Database (CDB) shown in Fig. 2.2. This repository is intended to be far more encompassing than the typical database of operating system performance metrics supplied with commercial performance management products. It can and should include previous capacity reports, various statistical analyses, and capacity planning models.
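To make the scope of such a repository concrete, here is a minimal Python sketch of a CDB that holds more than raw performance metrics. ITIL does not prescribe any schema, so the class names, fields, and the four entry kinds below are purely hypothetical illustrations of the categories just mentioned.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Hypothetical entry categories, taken from the kinds of content named in the text.
ALLOWED_KINDS = {"metrics", "report", "analysis", "model"}


@dataclass
class CdbEntry:
    """One item stored in a hypothetical capacity management database."""
    kind: str                 # one of ALLOWED_KINDS
    title: str
    created: date
    tags: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown CDB entry kind: {self.kind!r}")


class CapacityDb:
    """In-memory stand-in for a CDB; a real one would sit on persistent storage."""

    def __init__(self):
        self._entries: List[CdbEntry] = []

    def add(self, entry: CdbEntry) -> None:
        self._entries.append(entry)

    def by_kind(self, kind: str) -> List[CdbEntry]:
        return [e for e in self._entries if e.kind == kind]


# Illustrative contents only.
cdb = CapacityDb()
cdb.add(CdbEntry("metrics", "Web tier CPU utilization, weekly", date(2006, 3, 6)))
cdb.add(CdbEntry("report", "Q1 capacity report", date(2006, 4, 3), tags=["quarterly"]))
cdb.add(CdbEntry("model", "Spreadsheet scalability model, app server", date(2006, 4, 10)))
print([e.title for e in cdb.by_kind("model")])
```

The point of the sketch is only that reports, analyses, and models are first-class entries alongside the metrics, so that earlier planning work can be retrieved and reused.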
2.3 The Wheel of Capacity Management
Modern business practice demands rapid product development to meet narrow market windows. Time to market is everything, so management is constantly forced to seek ways of shrinking product development schedules. Traditional engineering practices, such as design reviews, prototyping, and performance analysis, have become common casualties of such schedule squeezing. This creates a dilemma for the performance analyst. The product is expected to perform, but performance analysis tends to get squeezed out of product design. In this climate, performance analysis is reduced to mere post mortem evaluation long after the crucial design decisions have been made, or even long after the product has actually been released. Here I am using the terms business, market, customer, and product in their most generic sense. A product may be a full-blown commercial offering or an artifact for internal consumption only. The product may be a piece of computer hardware, a software application, or an integrated computer system.

2.3.1 Traditional Capacity Planning
The frustration of today’s capacity planner stems from trying to combat these business pressures. The purist’s position, that capacity management is the “right thing to do” because it helps to ensure a more cost-effective product, tends to fall on deaf ears. On the other hand, it is relatively easy to cite an ongoing litany of multimillion-dollar computer projects that have failed as a consequence of a more short-sighted approach to system design. To help clarify the nature of this paradox, we introduce a visual aid: the wheel of capacity management in Fig. 2.3.
Fig. 2.3. The gorilla wheel of capacity management (segments: Measure, Model, Design, Build, Deploy)
The wheel is read clockwise starting at any position and consists of five segments corresponding to nominal phases in the development cycle of any product:

Measure: Measurements are made on the current product, if it exists. When a completely new product line is being developed, back-of-the-envelope estimates (with appropriate fudge factors) can be based on the previous generation of product. These measurements might be made as part of quality assurance, for example, and these data are fed into the modeling phase.

Model: Since capacity planning involves predictions (by definition), performance models are a natural part of any capacity plan. Relevant parameters are extracted from performance data collected in the measurement phase and are used to define the inputs of the performance model, e.g., the spreadsheet scalability models discussed in Chaps. 1 and 5.

Design: Architectural design decisions should be inclusive of capacity and performance projections from the modeling phase. Elsewhere (Gunther 2000) I have called this approach performance-by-design because it is a cost-effective way to build performance into the product, which, in turn, increases the chances that it will meet performance expectations. That keeps costs down and customers happy.

Build: In general, the capacity planner will be less involved during this engineering phase, but it is still worthwhile to participate in the relevant engineering meetings, where useful information may be acquired for use in modeling phases of the future, e.g., unit test or functional test data.

Deploy: The day of reckoning. The greater the investment in the modeling and design phases, the more likely the product will meet performance expectations and remain on track for the capacity plan. As in the build phase, measurements should be made where possible, and that is more easily facilitated if some degree of instrumentation (data collection points) is built into the product as part of the design phase.
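The hand-off from the measure phase to the model phase can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scalability models of Chaps. 1 and 5: it uses the standard utilization law U = X * D, and the measured values, the fudge factor of 1.2, and the 75% planning threshold are all hypothetical.

```python
def service_demand(utilization, throughput):
    """Service demand law: D = U / X, busy seconds per completed request."""
    return utilization / throughput


def projected_utilization(demand, future_throughput, fudge_factor=1.0):
    """Utilization law U = X * D, padded by a back-of-the-envelope fudge factor."""
    return future_throughput * demand * fudge_factor


# Measure phase: hypothetical averages taken from the current product.
measured_util = 0.30     # fraction of time the server was busy
measured_tput = 150.0    # completed requests per second

# Model phase: extract the model parameter, then project forward.
demand = service_demand(measured_util, measured_tput)

THRESHOLD = 0.75         # hypothetical planning ceiling on utilization
for load in (200, 300, 400):
    u = projected_utilization(demand, load, fudge_factor=1.2)
    status = "OK" if u < THRESHOLD else "needs more capacity"
    print(f"{load} req/s -> projected utilization {u:.2f} ({status})")
```

The projections feed the design phase; deliberately coarse inputs such as the fudge factor are acceptable here because the output is a sense of direction, not a precise forecast.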
The phases apply to either hardware or software artifacts, no matter whether those artifacts are built for internal use or as part of a commercial product. Figure 2.3 is meant to convey the more traditional approach as practiced in the heyday of centralized mainframe computing, which I referred to as gorilla capacity planning in Chap. 1. The appropriate visual, therefore, would seem to be a big, fat, tractor tire capable of doing a lot of heavy lifting.
Fig. 2.4. Running the wheel of capacity management (Fig. 2.3) on the rim because the important capacity planning phases have been dismissed as inflationary for the product schedule (Sect. 1.2.3) (remaining segments: Design, Build, Deploy)
2.3.2 Running on the Rim
To much of modern management, capacity planning conjures up the image of a tractor tire, viz., a cumbersome expander of time that tends to inflate precious product development schedules. Under prevailing business pressures, managers tend to react to this “tractor tire” image by rushing to the other extreme, whereby the measurement and modeling phases are dropped altogether (or are never included in the first place), and decision making is largely reduced to guesswork. As Fig. 2.4 shows, it certainly shortens the skeletonized development cycle of guess, build, and guess again, but it also makes for a rather bumpy and uncertain ride because the development cycle is running partly on the rim. Remark 2.2. As if this were not bad enough, there are other compelling incentives for pursuing this approach. The strategy in Fig. 2.4 is aimed exclusively at releasing a product within a narrow market window. Once the product becomes available and adopted, so-called performance enhancements merely provide additional revenue as part of the customer service contract. Therefore, management can hardly be faulted for concluding that, if customers are willing to pay more for the next “performance version,” why design it in? In
short, performance analysis gets dropped on the proverbial floor, products are released with inferior performance, and the customer ends up financing the enhancements. Against this kind of economic incentive, the purist does not stand a chance. Things look bleak for the modern capacity planner. Is there any hope? As the old adage goes, if you can’t beat ’em, join ’em! In Chap. 1 we pointed out that your management might be more receptive to your input if you can offer them capacity management that is streamlined to meet their own high-pressure constraints.
Fig. 2.5. The leaner and meaner GCaP wheel of performance. It has exactly the same periodic phases, in exactly the same order (cf. Fig. 2.3), but with an emphasis on higher planning efficiency (see Table 1.1)
2.3.3 Guerrilla Racing Wheel
Enter the racing wheel of GCaP (Fig. 2.5). It repairs the broken wheel of performance (Fig. 2.4) by reinstating the modern capacity planner as an active player in today’s fast-paced development process. Notice also that Figs. 2.3 and 2.5 look similar. That is because the basic methodologies are very similar. In fact, in Chap. 8 I discuss a GCaP approach to Web site capacity planning that derives from a mainframe technique called latent demand. Mainframe methods are mature, and many of them (e.g., queueing models) can be adapted to the analysis of modern computing environments (Gunther 2005a). The important difference between Figs. 2.3 and 2.5 is that no matter which capacity planning techniques you choose, they must be a good match for the high-pressure demands of shortened development cycles. Since management is unlikely to change its ways, you have to change yours. Assuming that the GCaP approach to capacity planning depicted in Fig. 2.5 is actually implemented, one has to remain vigilant against unbridled enthusiasm in the measure and model phases, otherwise the GCaP wheel
Fig. 2.6. The GCaP wheel becoming overweight because of unbridled enthusiasm in the capacity planning phases (segments: Measure, Model, Design, Build, Deploy)
can end up looking more like Fig. 2.6. In other words, those phases can become overinflated. It is important to keep in mind that management today is very sensitive to such inflationary expansion of their schedules (Sect. 1.2.3). Management no longer focuses on the speed of the product as much as the speed of producing the product. Nowadays, production performance matters more than product performance. GCaP requires that you remain cognizant of this management constraint and, accordingly, keep your capacity planning style lean and mean. Clearly, I have somewhat oversimplified things to make a point. The real world is usually more confused than I have described it here. These days, when it comes to the trade-off between the speed with which a decision can be made and its accuracy, speed wins. Purists find this point difficult to accept. Most design decisions, however, do not require fine detail, and the decision makers are usually just looking for a sense of direction rather than a precise compass bearing. If they do not want precision, why waste time providing it? Moreover, design decisions are often revised many times throughout the product development cycle, so precision gets lost in the flux.
2.4 Summary
This chapter has provided a brief overview of the ITIL framework and its history. How widely it is adopted, and for how long, remains to be seen. The formal copyright structure, expense, and lack of early regional advocate groups have slowed the adoption of ITIL in the USA. This has started to change over the last few years. Our focus in this chapter was on the position of capacity management within the ITIL framework. We saw (Fig. 2.1) that it resides within the service level management process, which, in turn, resides within the service delivery
area, one of the seven top-level components of the ITIL framework. Although the ITIL framework emphasizes process over procedure, and the actual implementation details are left open to interpretation, we suggested in Sect. 2.3 that GCaP was intrinsically compatible with ITIL processes and best practices. Perhaps one of the most significant benefits to come out of understanding the ITIL framework is that it forces you to think about the business impact of capacity planning rather than remaining narrowly focused on the tools and technologies of capacity planning. Having acknowledged that point in this chapter, we now go on to examine in detail the tools and methodologies that can be applied to Guerrilla capacity planning.