MANAGING BUSINESS CONTINUITY
Twenty-First Century Challenges for Competitiveness
ASSURANCE AND ADVISORY SERVICES
“Information technology pervades all aspects of our daily lives, of our national lives. Its presence is felt almost every moment of every day, by every American. It pervades everything from a shipment of goods, to communications, to emergency services, and the delivery of water and electricity to our homes. All of these aspects of our life depend on a complex network of critical infrastructure information systems. Protecting this infrastructure is critically important. Disrupt it, destroy it, or shut down these information networks, and you shut down America as we know it and as we live it and as we experience it every day. We need to prevent disruptions; and when they occur, we need to make sure they are infrequent, short, and manageable.”1
Thomas Ridge, Director of Homeland Security, Oct. 9, 2001
1 The White House, Office of the Press Secretary. “New Counter-Terrorism and CyberSpace Security Positions Announced,” Personnel Announcement by National Security Adviser Condoleezza Rice and Director of Homeland Security Thomas Ridge, Oct. 9, 2001.
CONTENTS
Introduction
The Current Environment
Mapping Organizational Risks and Requirements
An Approach to Managing Business Continuity
Implications and Opportunities
Conclusion
Appendix: Interviews With Leaders
Margaret E. McKeough and Valerie Holt, Metropolitan Washington Airports Authority
John Ball, Standard Chartered Bank
The following white paper was developed as part of a series by KPMG’s Assurance and Advisory Services Center.
INTRODUCTION
Faced with rising exposure to new risks and declining tolerance for disruptions to operations, leading organizations are evaluating their capabilities to respond to crises and mitigate future risk. These leaders embrace the moral imperative to protect their people, and they understand that the ability to continuously perform and satisfy customers is fundamental to sustaining competitive advantage in the twenty-first century.
The case for implementing a strategy to manage the risk of disasters has always been compelling. If disaster strikes, and an organization cannot recover in a timely way, the consequences could include loss of revenue, defection of customers, deterioration of brand equity, and permanent loss of shareholder value. Indeed, 40 percent of businesses that suffer a disaster go out of business within two years.2 Today, as the economics of information, globalization, and technology continue to change the nature of business worldwide,3 traditional approaches to business continuity no longer address a widening array of threats. Traditionally, organizations planned for natural or man-made disasters disrupting production, distribution, and data processing capabilities at a single facility. These threats are becoming more frequent, and their impact is growing.
Simultaneously, threats to information assets are quickly becoming significant for enterprises of almost any size. Computer viruses, information security issues, software quality, inadequate data storage, complex technology architectures, and ineffective information asset management practices can open the doors to a catastrophe with the same business impact (if not a more severe one) as that posed by a physical threat.
Moreover, traditional approaches are reaching the end of their useful life as stand-alone solutions. Organizations increasingly operate in multiple locations and depend on information systems. Business processes are carried out in real time, so that a disruption has consequences along an entire value chain. The effects of downtime (see Figure 1) are measured in hours or even minutes, instead of days. In this environment, preparing for disasters must become part of a larger effort to mitigate risk. Instead of responding to particular events, organizations need to focus on maintaining operations in spite of any event.
2 Vic Wheatman. “Aftermath: Disaster Recovery,” GartnerGroup, September 21, 2001.
3 The e-Business value chain: Winning strategies in seven global industries. Economist Intelligence Unit research report written in cooperation with KPMG, vii. 2000.
Thus, the key question for leaders is no longer, “How do I respond in the event of a crisis?” Rather, organizations increasingly need to ask, “How do I manage risk so that I’m always there for my customers and my other stakeholders?” The answer, and the challenge, is to implement a strategy that takes into account the totality of risk, ensures the welfare of people, and balances the costs of risk management with the opportunity cost of not taking appropriate action. According to the experiences of leaders in a variety of industries, answering the challenge will result in a successful defense against disasters as well as other benefits with strategic payoffs.
This white paper examines the variety of issues that organizations face today. It introduces a framework for managing the risk of disasters in the context of managing the continuity of the enterprise from an information asset perspective. It also discusses a process for implementing a chosen business continuity strategy, integrating it with organizational strategy, and capitalizing on opportunities to achieve and sustain competitive advantage. Finally, the Appendix provides interviews with industry leaders whose thoughts reflect issues many organizations are now addressing.
Figure 1: A Growing, Potentially Costly Capabilities Gap
The gap between the cost of downtime and ability to deliver is widening.
Chart labels: Increasing cost of unplanned downtime; Ability to deliver through traditional mechanisms.
● Downtime in excess of two hours is unacceptable for 24% of organizations, and an additional 48% of organizations cannot tolerate more than 24 hours of downtime.4
● More than 60% of organizations do not have corporatewide disaster recovery plans in effect. For their most recent interruption, almost 70% failed to completely meet their disaster recovery objectives.5
Source: KPMG, 2001.
As downtime is increasingly measured in hours and minutes, not days, organizations’ tolerance for it is decreasing. A capabilities gap has developed, and is widening, between the cost of downtime and the effectiveness of traditional response mechanisms.
4 KPMG LLP and “Contingency Planning and Management” Survey. “A Review of Factors Influencing Business Continuity in the Next Millennium,” 2000.
5 Ibid.
THE CURRENT ENVIRONMENT
Efforts to manage unforeseen circumstances that render assets useless and disrupt operations have been a management priority at leading organizations for decades. For example, in the 1960s when American Airlines developed SABRE, the widely used airline reservations system, engineers took great care to ensure the system’s reliability. They also kept a standby computer in the event that an outage affected the primary computer running the system. SABRE succeeded while competing services failed in large part because of American Airlines’ innovation, but the reliability of the system also played a role.6
Prudent organizations maintain and test plans for responding to likely catastrophic events. However, the effects of numerous global trends, and the risks that arise from them, are prompting leaders to question the adequacy of their current capabilities. At the same time, emerging technologies are enabling new risk management strategies that were cost prohibitive just a few years ago. Appreciating these forces will lead to a wider view of risks and ultimately a broader approach to managing business continuity.
The Emergence and Growth of Information Assets and Related Risks
Organizations are rapidly evolving from manually operated stand-alone entities to information-dependent extended enterprises. For these new entities, information facilitates competitive positioning, value chains depend on the timeliness of service, and supply chains are technology-dependent (see the sidebar “The Emergence of Information-Driven Extended Enterprises”). Trends contributing to this evolution include initiatives common in most industries: enterprise resource planning, customer relationship management, supply chain management, mergers and acquisitions, outsourcing, alliances, and e-commerce. Internet-based service suppliers and the globalization of business models are also key factors.
An important consequence of these changes is that the flow of information is no longer connected with the distribution of physical objects. As a result of immense capacity to share information electronically (across the Internet, wide-area corporate networks, wireless networks, and other media), the linkage between information-based or virtual processes and physically based processes is dissolving and becoming more complex. Consequently, value chains are dividing into two interdependent streams: one consisting of processes in the physical world and the other made up of information flows in the virtual realm.
6 Martin Campbell-Kelly and William Aspray. “Computer: A History of the Information Machine,” New York: Basic Books, 1996, pp. 169–176.
The Emergence of Information-Driven Extended Enterprises
Figure: a Traditional Business Model compared with a Virtual Extended Enterprise, using a financial services example.
Traditional Business Model: product origination and packaging (loans, insurance, investments, checking and savings accounts); transaction processing and account origination (mainframes); retail sales and distribution (ATMs, 800 numbers, tellers); customer.
Virtual Extended Enterprise: groupings labeled Suppliers and Navigators; entities shown include Mutual Fund Manager, Auto Lender, Other Banks, Primary Bank, CarPoint, Morningstar, Motley Fool, Lipper, Yahoo!, Financial Advisor, e•Trade, SchwabOne, Bank Software, Browser, Quicken, AOL, and Phone Call; customer.
Source: Adapted from Philip B. Evans and Thomas S. Wurster, “Strategy and the New Economics of Information,” Harvard Business Review, September-October 1997.
Virtual extended enterprises are an emerging feature of e-business and collaborative commerce, as this financial services industry example illustrates. Whereas customers traditionally dealt with their banks through human-operated channels or ATM networks, they can now gain access through online as well as offline channels, and several players have a hand in delivering services.7
Extended enterprises present special challenges to how leaders will manage business continuity. Traditionally, an organization would write disaster recovery plans for critical processes and applications. As infrastructure becomes more complex, however, organizations need to consolidate and streamline their contingency planning practices. Moreover, they need to focus on mitigating risk: assuring customers, partners, and other stakeholders of the availability of information assets.
Many agree that information assets, and the complex extended enterprises they enable, are changing the nature of competition. Leaders increasingly appreciate that these assets are also exposing organizations to a new set of risks in the virtual world.
7 Philip B. Evans and Thomas S. Wurster, “Strategy and the New Economics of Information,” Harvard Business Review, September-October 1997, p. 78.
Leaders responsible for evaluating and developing future business continuity strategies will need to focus on this division and its potential implications. Ensuring the usefulness of both physical assets and information assets, as well as protecting the people that are central to both, will be critical to creating and sustaining competitive advantage and business continuity. Figure 2 shows how organizations can map themselves based on the value and complexity of their information assets, their organizational complexity, and their linkages with business partners.
The Paradigm Shift
The links between physical and virtual assets will have ramifications for how leaders assess the true cost of unplanned downtime, evaluate their exposure to risk, and set an agenda for management action. Unplanned downtime is estimated to have cost businesses worldwide some $1.6 trillion in lost revenues alone in the year 2000.8 That number will certainly climb as downtime is increasingly assessed in terms of how it affects both physical and virtual links in the organizational value chain.
Figure 2: The Information-Dependent Evolution of Enterprises
Axes: Value and Complexity of Information Assets; Organizational Complexity and Degree of Collaboration. Scale: Low to High.
Stages, in ascending order: Manual Operations, Functional Automation, Integrated Enterprise, e-Business Integration, Virtual Extended Enterprise.
Annotation along the progression: globalization and technology; real-time processes; collaboration; extended enterprises; greater and growing value of information assets.
Source: KPMG, 2001.
Information assets are driving organizations toward networked business models. Such models let organizations create and sustain increased value, but the risk of unplanned downtime becomes more significant.
Emerging approaches to business continuity have to take into account the extent to which organizational value is now embodied in information (and information-based, real-time processes) as well as physical assets. Real-time capabilities make processes more efficient and predictable, increasing an organization’s capacity to take on new value-adding activities and improve customer satisfaction. As business processes move closer to real time, the cost of downtime goes up, in part because its direct financial consequences become greater. But the more significant issue is the impact of downtime on customer satisfaction, efficiency, reputation, and shareholder value, and the domino effect that problems in these areas can have on profitability and market share. Assessing the cost of downtime in a broadened context often leads to a new appreciation of the risk of disaster, and to the question of whether an organization should measure its exposure in terms of its tolerance for downtime or its need for availability.
8 “It’s Time to Clamp Down,” Informationweek.com, July 10, 2000.
Traditional approaches to managing business continuity emphasize recovering from a disaster before a predefined amount of time elapses. The availability-based perspective (see Figure 3) focuses instead on ensuring that the organization will always be able to produce an output or reach some desired conclusion when it needs to do so. Determining what degree of availability is most appropriate within an organization’s business model is an important step, as discussed in the next section.
Figure 3: The Evolution of Business Continuity Management Practices Toward an Availability-Based Perspective
FOCUS
Traditional: Minimizing the financial impact of disasters
Emerging: Ensuring financial continuity, customer satisfaction, and productivity despite a catastrophe
APPROACH
Traditional: Recovery from single episodes of prolonged downtime
Emerging: Business-driven continuous availability through management of information and operational risk
RISKS
Traditional: Low-frequency, high-impact disasters
Emerging: Traditional threats to physical assets and emerging threats to information infrastructure
BENEFITS
Traditional: Recovery of degraded service levels 12 to 72 hours after a disaster event
Emerging: Up to 99.999% availability of critical infrastructure as well as performance improvement
ENABLERS
Traditional: Documented plans relying on after-the-fact recovery
Emerging: Emerging technologies and operational excellence
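To put the availability figure in Figure 3 in concrete terms, the arithmetic below converts an availability percentage into the downtime it permits per year. It is a minimal sketch for illustration only: the function name and the assumption of a 365-day year of continuous operation are not part of the framework described in this paper.

```python
# Illustrative assumption: a non-leap year of round-the-clock operation.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


for level in (99.0, 99.9, 99.999):
    minutes = allowed_downtime_minutes(level)
    print(f"{level}% availability permits about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.999% availability leaves roughly five minutes of downtime a year, which is a very different target from recovering degraded service 12 to 72 hours after a disaster.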
MAPPING ORGANIZATIONAL RISKS AND REQUIREMENTS
In managing business continuity, leaders evaluating their own organizations need to ask, “How much protection is enough?” Such an evaluation will require examining the organization’s current exposure to the risk of downtime as well as the effectiveness of its strategy for managing that risk.
A useful way to assess the current state, as well as estimate future needs, is to benchmark or map current risk exposure (see Figure 4). An organization can gauge the usefulness of its current practices by mapping exposure, on the one hand, and the commitment required to manage risk and value achieved, on the other hand. Such an effort can also help leaders evaluate alternative scenarios as they weigh future courses of action. Quantifying risk, however, is not the ultimate goal. Rather, this effort is intended to help managers understand the value propositions that alternative strategies will support, the commitment needed, and the opportunity costs of inaction (see the sidebar “Identifying and Understanding Organizational Risks and Requirements”).
Mapping can reveal opportunities and risks.
Figure 4: Value Layers in a Business Continuity Framework
Axes: Risk Exposure; Commitment and Value (React, Control, Transform).
Value layers: Timely Recovery, Always There, Strategic Use.
Value statements shown, in ascending order:
● I can recover my physical assets and data in the event of a disaster.
● I can resume my automated processes in the event of a disaster.
● I'm always there for my customers.
● The performance of my information assets exceeds my stakeholders' expectations.
● How I manage information supports value chain excellence among all stakeholders.
Source: KPMG, 2001.
Identifying and Understanding Organizational Risks and Requirements
Scenarios: REACT, CONTROL, TRANSFORM
Manual Enterprise
Value proposition: I can recover my physical assets and data in the event of a disaster.
Organizational characteristics: Manual operations characterized by limited use of information technology.
Typical outcomes: Stand-alone plans for limited recovery of critical assets.
Functional Automation
Value proposition: I can resume my automated processes and protect my people in the event of a catastrophe.
Organizational characteristics: Automation of business processes and increasingly distributed information technology.
Typical outcomes: Contingency plans for restoring computer systems and resuming operations.
Integrated Enterprise
Value proposition: I’m always there for my customers in maintaining information availability.
Organizational characteristics: Integration of automated processes into complex information systems and reliance on information for timely performance and customer satisfaction.
Typical outcomes: Infrastructure for maintaining availability of information assets, monitoring risk in real-time, and managing routine, non-routine, and catastrophic events.
e-Business Integration
Value proposition: The performance of my information assets exceeds my stakeholders’ expectations.
Organizational characteristics: Integration of business practices with the Internet and online collaborators.
Typical outcomes: Integration of infrastructure for maintaining availability with systems for managing operational risk to enhance operational excellence and improve flexibility.
Virtual Extended Enterprise
Value proposition: How I manage information supports value chain excellence among all stakeholders.
Organizational characteristics: Integration of information assets with a network of organizations focusing on core competencies and collaborating through extended enterprises.
Typical outcomes: Integration of standards for managing business continuity across collaborating enterprises with business intelligence systems used to capitalize on information.
KEY QUESTIONS
● Is downtime in excess of 72 hours tolerable?
● Is downtime in excess of 24 hours tolerable?
● Is downtime of minutes or hours intolerable?
● Are business processes highly manual?
● Do manual alternatives to automated processes exist?
● Is technology highly distributed and complex?
● Will the performance of information assets determine success or failure?
● Are strategies based on real-time, information-driven business models?
● Will resolution of a crisis depend on a formal plan?
● Is the organization leveraging the Internet?
● Do services depend on the performance of outsourcers?
● Will the integrity and availability of information determine success or failure?
● Does the organization depend on information to respond to changing market needs?
● Are facilities located in a disaster-prone area?
● Does customer service depend on proactive resolution of unexpected events?
When assessing business continuity risk exposure, both present and future, organizations’ strategies will typically address one of three scenarios:
● React: Is it sufficient to react to a disaster?
● Control: Is mitigating risk and controlling the availability of operations necessary?
● Transform: Should business continuity capabilities be provided across an extended enterprise to assure the reliability of collaborative commerce?
Scenario 1: Reaction Stage
Reacting to disasters is an adequate risk posture if an organization can live without business processes, applications, and other capabilities for 24 hours or longer. Contingency plans that focus on reacting to a disaster are less expensive to develop than proactive business continuity measures. Such plans, however, are inherently focused on single catastrophic events rather than the cumulative impact of downtime.
Scenario 2: Control Stage
Controlling availability is advantageous or necessary for organizations that cannot tolerate: 1) downtime in excess of a few minutes or hours or 2) failure to deliver adequate service levels across the value chain. While the commitment required is higher in many cases, the indirect benefits can include better vendor management, productivity improvements, better responsiveness to stakeholders, and lower total cost of ownership for technology infrastructure.
Scenario 3: Transformation Stage
Trust between entities will become more important in managing business continuity as trends driving economic change take root and become a part of everyday life. As the sidebar “The Emergence of Information-Driven Extended Enterprises” describes, one organization’s actions will increasingly affect another’s success. To cope, organizations that depend on collaboration with third parties should extend business continuity capabilities to these components of their value chains. In addition to better stability of a value chain, the benefits may also encompass improved customer service, marketplace responsiveness, and mutual trust.
Putting Strategy in Context
These strategies (reacting to crises, controlling the availability of operations, and providing availability across the extended enterprise) are not mutually exclusive. One approach builds on the next in an evolutionary manner, and leading organizations’ strategies will blend elements of all three. A large data center, for example, might have contingency plans to restore computer hardware, redundant telecommunications services to prevent a network outage, and service-level agreements to ensure the productivity of outsourcers. Differences among organizations will be revealed by the tactical decisions that lead to action plans and processes for managing risk.
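As a rough illustration of how the three scenarios relate to downtime tolerance, the sketch below maps two inputs to a starting scenario. It is a simplification under stated assumptions: the function name, the two inputs, and the rule that partner dependence takes precedence are hypothetical, and only the 24-hour threshold comes from the Reaction Stage description. Because real strategies blend all three scenarios, the result is a first-pass conversation starter rather than a recommendation.

```python
def suggest_scenario(tolerable_downtime_hours: float, depends_on_partners: bool) -> str:
    """Suggest a starting scenario: 'react', 'control', or 'transform'.

    Hypothetical triage rule, not a method prescribed by this paper.
    """
    if tolerable_downtime_hours < 0:
        raise ValueError("tolerable downtime cannot be negative")
    # Assumption: reliance on third-party collaboration points to Transform.
    if depends_on_partners:
        return "transform"
    # From the text: reacting is adequate if the organization can live
    # without its capabilities for 24 hours or longer.
    if tolerable_downtime_hours >= 24:
        return "react"
    # Otherwise downtime tolerance is measured in minutes or hours.
    return "control"


print(suggest_scenario(72, depends_on_partners=False))
print(suggest_scenario(2, depends_on_partners=False))
print(suggest_scenario(2, depends_on_partners=True))
```

A large data center of the kind described above would likely draw on more than one of these answers at once, which is why the output should be read as a starting point only.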
The next section describes an approach to managing the risk of disasters that applies to all three of these scenarios and can also help organizations continuously improve their practices in keeping with changing risks.
AN APPROACH TO MANAGING BUSINESS CONTINUITY
Once an organization understands its current state and determines how it should evolve, it can develop a practicable strategy for managing business continuity in the future. An effective approach may be depicted in an ongoing “life cycle” encompassing four phases aligned with the organization’s business strategy (as illustrated in Figure 5 and described below). A life cycle approach enables risk management, and it can also facilitate an organization’s evolution from reacting to a disaster, to controlling availability by mitigating risk, to ensuring the reliability of the extended enterprise.
Figure 5: An Approach to Managing Business Continuity
Cycle diagram: Organizational Strategy at the center, surrounded by the four phases of Business Continuity Management: Assess Risks, Develop Strategy, Implement Change, and Measure Risk and Performance.
Source: KPMG, 2001.
1. Assess Risks
Achieving effective business continuity starts with assessing organizational risks and requirements for managing them in the future. An organization needs to understand how it relies on its people, processes, and technology as well as its relationships with customers, suppliers, and other contributors to its value chain. This knowledge will help the organization understand its tolerance for downtime so that it can define the requirements that a business continuity strategy, once developed and implemented, should satisfy.
A structured assessment can help model the disruptive impact of downtime and reveal the organization’s vulnerabilities. Such an assessment should provide detailed information about the configuration of information assets and effectiveness of operations. It should also facilitate development, implementation, and continued use of risk measures, controls, and contingency plans.
Phase 1 Critical Outcomes: Requirements for managing…
REACT: …crises and priorities for restoring critical operations
CONTROL: …operational risk and priorities for improving availability in critical areas
TRANSFORM: …risks associated with collaborative arrangements and the availability of shared infrastructure
2. Develop Strategy
After an organization defines a set of requirements for improving business continuity capabilities, the next step is to define a strategy that will integrate business continuity as a risk management program into the fabric of the organization. The strategy will focus on the complementary elements of monitoring, mitigating, and responding to risk; and these issues will encompass people, processes, technology, and, increasingly, the interdependence of organizations. If the organization’s focus is on reacting to disasters, its strategy will primarily address the structure and enablers of contingency plans. Organizations focused on continuous availability will look at the resiliency of technology infrastructure, the performance needs of stakeholders, and the reliability of processes. When an organization is focused on transformation, this scenario will also encompass steps for maintaining trust between collaborating parties and the integrity of their mutual infrastructure.
In some cases, organizations can use the strategy to drive other organizational improvements. Such benefits can include improving the reliability of outsourcers and other vendors, phasing out dependence on commercial hot sites, and consolidating and automating technology management practices. Organizations may also be able to improve the payoffs from alliances and lower the cost of compliance with industry regulations.
Phase 2 Critical Outcomes: Strategy for…
REACT: …responding to crises and restoring critical operations at an alternative location
CONTROL: …maintaining continuous availability and resolving non-routine events
TRANSFORM: …continuous availability across the extended enterprise and for optimizing the value of information flowing across the value chain
3. Implement Change
The organization makes the Phase 2 strategy effort an ongoing reality by investing in a structured program to apply risk management standards, tackle cultural issues, and improve technology and processes. If its focus is on recovering from a catastrophe, the organization will establish arrangements for storing data at an offsite location, secure an alternate location where restoration of computer systems and operations will take place, document contingency plans, and establish an employee awareness and training program. When the focus is continuous availability, leaders will seek to improve critical infrastructure, consolidate and automate infrastructure management processes, and implement real-time monitoring capabilities. Continuous availability, especially in an extended enterprise, will also depend on establishing an event management shared service that is responsible for providing strategic leadership and coordinating interdependent risk management activities.
Tackling cultural issues is critical to implementing change.
Phase 3 Critical Outcomes
REACT: Develop and test contingency plans
CONTROL: Ongoing delivery of continuously available internal infrastructure
TRANSFORM: Continuous availability of infrastructure shared across the extended enterprise; strategic use of information assets
As tolerance for downtime diminishes, organizations must be able to measure risks and performance as well as monitor both in real time. These efforts should be part of an ongoing continuous improvement effort—one in which the organization maintains an adequate risk posture and improves the effectiveness of its risk management program. If the organization is focused primarily on disaster recovery, efforts to measure risk and performance will involve testing and updating contingency plans. These efforts will also encompass the execution of emergency response, disaster recovery, and business resumption plans in the event of a catastrophe. In the context of continuous availability within an organization or across its extended enterprise, the focus of measurement will address the operational risks that threaten the organization’s ability to accomplish its goals efficiently and effectively. Monitoring in this context will link and integrate incident response, crisis management, disaster recovery, and other processes. Monitoring will also encompass real-time performance measurement from a customer’s perspective.
The technology-driven benefits of measuring performance can be significant. Organizations will likely see improvements in security management and end-user support as well as greater scalability and flexibility of critical infrastructure. They will also enjoy better alignment of business and technology.
Putting Plans to the Test
Organizations that develop emergency response, crisis management, and disaster recovery plans need a mechanism to determine if they are effective. In the absence of an actual disaster, testing is the best tool. Testing helps leaders answer such questions as:
1. Are assumptions about threats and vulnerabilities correct?
2. Is the risk management strategy adequate and comprehensive?
3. Will crisis management processes handle all relevant contingencies?
4. Is the business continuity strategy effectively integrated with the people, processes, and technology it supports?
5. Is the organization maintaining the integrity of its information?
6. Is the organization aligned to maintain its desired risk posture?
Since more than 40 percent of organizations do not test their disaster recovery plans, and fewer than 30 percent use performance reviews,9 testing practices literally will separate the winners from the losers in the race to manage risk and protect the enterprise.
Phase 4 Critical Outcomes: 4. Measure Risk and Performance
REACT: Maintainability and timely execution of contingency plans
CONTROL: Real-time monitoring and ongoing improvement of critical infrastructure availability and customer service levels
TRANSFORM: Continuous measurement and improvement of the extended enterprise and information asset capabilities
9 KPMG LLP and “Contingency Planning and Management” Survey. “A Review of Factors Influencing Business Continuity in the Next Millennium,” 2000.
IMPLICATIONS AND OPPORTUNITIES
Efforts to manage business continuity can enable organizations to respond appropriately to crises as well as improve their ability to mitigate the risks of such events. Such achievements can help organizations build and sustain continuous competitive advantages—including lasting customer service, ongoing productivity, employee welfare, and asset integrity.
Efforts to improve competitiveness are more important than ever as the cost of downtime increases while the value of traditional response mechanisms decreases. Faced with this growing capabilities gap, organizations need to consider whether their current approaches to risk management and business continuity remain appropriate. As shown in Figure 6, choosing solely to achieve timely recovery from unplanned downtime (react) may be appropriate for certain organizations; for others, seeking to maintain information availability so that they are always there for their customers (control) may be the better alternative, depending on the degree to which information assets drive value in the organization. When organizational strategies rely heavily on information assets and begin to leverage extended-enterprise scenarios, embedding availability throughout critical areas of the value chain (transform) will help ensure the sustainability of competitive advantage.
Figure 6: Fundamental Drivers of Business Continuity Success
Competitiveness. Transform: improve information asset performance and sustain competitive advantage across the extended enterprise
Availability. Control: design, implement, and maintain 24x7 information availability infrastructure
Recoverability. React: achieve timely recovery from unplanned downtime
Source: KPMG, 2001.
Addressing the three fundamentals helps leaders manage risks and improve competitiveness across the value chain.
As entities evolve into extended enterprises they will rely increasingly on information assets. As they transform, they must ensure that their business continuity strategies evolve along with them. To leverage the opportunities inherent in business continuity management, organizations must ensure that such efforts are aligned and integrated with their overall organizational strategies. Managing business continuity within organizational strategy promotes a variety of benefits, which can include operational excellence, scalable technology platforms, cost-effective technology management, and improved vendor management.
Ten Questions for Leaders
As leaders assess their contingency plans and other business continuity efforts, they should consider a number of critical questions, including:
● Is our business continuity strategy event-driven or risk-driven and stakeholder-focused?
● How critical is information availability to our success?
● Are capabilities for managing business continuity aligned with organizational strategy?
● Who are our stakeholders and what is their tolerance for unplanned downtime?
● Does the risk management program address people, processes, and technology as well as the extended enterprise?
● Does the business continuity strategy eliminate single points of failure?
● How do we reinforce key management disciplines to ensure reliable service delivery to all stakeholders?
● Does the risk management program support real-time service monitoring and reporting with predictive capabilities for critical infrastructure?
● How do we optimize the value of information flowing across the value chain?
● Does management have timely, independent assurance that its business continuity capabilities are adequate?
Leveraging Y2K Investments
When the world ushered in a new millennium, it also concluded a $600 billion effort to correct the Y2K computer glitch. As part of these efforts, many organizations developed sophisticated contingency plans to respond to potential Y2K-related crises. An unexpected payoff: “Companies including 7-Eleven, Amgen Inc. and drugstore-chain CVS Corp. activated emergency plans after [the September 11th] terrorist attacks halted air traffic, threatening to cut off supply channels for many goods nationwide. Those plans had been designed to address the effects of potential shutdowns in the year-2000 transition.”10
When assessing and improving business continuity practices, leaders often find Y2K-related contingency plans an excellent repository. These plans reflect thoughtful, comprehensive methodologies and capture detailed knowledge of critical infrastructure. Determining where organizational change and new technology assets have invalidated that knowledge is a challenge, but leveraging Y2K investments can help leaders achieve efficiency and effectiveness.
10 Joe Richter. “Companies Say Y2K Steps Got Goods to Market When Flights Halted,” Bloomberg News, September 17, 2001.
CONCLUSION
As risks have evolved and multiplied while tolerance for downtime has declined, leading organizations are taking new steps to implement business continuity programs that protect their people as well as the information and physical assets on which they depend (see Figure 7).
A program for business continuity can influence business strategy by facilitating the competitive advantage that evolves from continuous availability and customer satisfaction. Moreover, it can help organizations shift their focus from crisis response, to proactive crisis management, to improved risk and performance measurement, and, ultimately, to enhanced and sustained customer satisfaction.
Figure 7: Business Continuity Management Scenarios
[Figure: a matrix comparing three scenarios (REACT, CONTROL, TRANSFORM) across five dimensions (VALUE, RISK, IMPACT, FOCUS, DOWNTIME TOLERANCE).
REACT: Recoverability; Physical Assets; Facilities/Processes; Event; Days.
CONTROL: Availability; Information Assets; Customer Satisfaction/Productivity; Customer; Hours/Minutes.
TRANSFORM: Competitiveness; Extended Enterprise/Value Chain; Competitive Position; All Stakeholders; Zero Downtime.]
APPENDIX: INTERVIEWS WITH LEADERS
Margaret E. McKeough and Valerie Holt, Metropolitan Washington Airports Authority
Margaret E. McKeough is vice president–business administration and Valerie Holt is vice president–audit at the Metropolitan Washington Airports Authority (MWAA), an independent entity established in 1987 to operate the two federally owned airport systems in Washington, D.C.: Ronald Reagan National Airport and Dulles International Airport. They talk here of the wide variety of business continuity issues they have faced, and will face, as their industry evolves in the wake of the events of September 11, 2001.11
Describe the mission and services of the Authority.
Margaret E. McKeough: The Authority is responsible for operating and maintaining Ronald Reagan National Airport and Dulles International Airport. The Authority is a self-sustaining corporation. We must generate revenues to cover expenses and secure financing to construct new facilities and needed infrastructure. Our priority in 1987 was to build a new terminal facility at National, which was completed in 1997. Now growing demand requires significant capital investment at Dulles. A year ago the Authority began a six-year, $3.4 billion expansion program at Dulles that includes a fourth runway, a new air traffic control tower, gate facilities on midfield concourses, two new parking garages, an underground passenger train system, and various infrastructure improvements.
Valerie Holt: An airport requires services similar to a city. To support the aviation operations, the Authority operates bus systems, fire and police departments, retail businesses, and licensing and permits functions. The Authority itself employs 1,200 persons. The total employment base at National is approximately 10,000 people, while Dulles Airport provides employment to over 15,000 people.
How does managing the risk of disasters figure in the day-to-day work of the Authority?
McKeough: The aviation business requires a great focus on safety. Every airport is obligated by federal regulation to have an emergency response recovery plan. Safety issues also extend to our corporate functions. The stability of the Authority impacts regional transportation and the regional economy.
Holt: An unfortunate example is the three-week closure at National Airport after September 11. The temporary closure sent ripples throughout the regional economy, and we are still experiencing its effect.
McKeough: When National Airport was closed for an extended period of time after the September 11 terrorist attacks, countless people advocated the need for the airport to reopen. Those dramatic responses vividly demonstrated how tied the two airports are to the health of the regional economy.
11 Telephone interview with Margaret E. McKeough and Valerie Holt, October 24, 2001.
What impact is technology having on risk management and on the operational side, specifically as a business enabler?
McKeough: Technology is a key factor in efforts to heighten security and baggage screening processes. For example, we had new, highly sensitive CTX baggage screening equipment in limited deployment before September 11. The initial plan was to have one machine at each airport. We now plan to have multiple machines at both airports.
Holt: Regardless of the industry, management is always looking to technology to improve efficiency. For example, the highest source of profit revenue for airports is usually parking operations. Traditionally, airports have used staff to manage the lots and to track the huge sums of revenue associated with parking operations. We are completely changing the technology that supports our revenue controls systems, to improve its efficiency and accuracy, thereby reducing loss. We’re also implementing technology that will read license plates electronically, documenting how long a car has actually been in the lot. This will eliminate the need for manpower to take manual inventory of the vehicles parked in the lots. We’re also planning to eliminate the exit lane personnel by using the automated payment technology employed in transit systems.
What thoughts do you have for airports and other industries that may be less well prepared?
McKeough: I certainly can’t say we’ve got all the answers. We reopened both airports, but we’re not fully back in business yet, and clearly, with every day of not being back, our challenges mount.
Holt: When the revenue flow into your business is impacted, you need to manage costs and, in our case, the hundreds of third-party contracts that support our operations. Security, for example, could represent a tremendous cost. If the federal government adjusts the way that’s handled, it could also adjust how it’s funded. The business challenges are enormous.
McKeough: Every airport relies on its retail and food and beverage concessions to generate a certain amount of revenue. In some cases, those revenues help reduce the rents charged to the airlines. So, higher food and beverage and retail revenues mean lower square footage rates on leased space for the airlines. Until recently, industry standards suggested revenue from these services would be maximized if facilities were located near the airline gates, past security, where people wait to board aircraft. With the change in airport security regulations, however, it may now be better to locate those venues pre-security, where more people have access to them. We don’t know how the economic model will actually be affected. With the market dynamics continuously changing, the business decisions made to generate revenue are going to need to be reevaluated. It’s way too soon for clarity on these issues.
Recent events are certainly affecting how airports and the airline industry plan for catastrophes. What issues are at the forefront for you?
McKeough: Airports and airlines will continue to work closely in planning for and responding to disaster. Clearly, the September 11 incident identified vulnerabilities at airports throughout the country. Aviation security is highly regulated. Improvements to the passenger and bag screening processes will require greater attention, too. Catastrophe planning is ingrained in airport systems. If there is an aircraft incident on the airfield, a response plan clicks into activation. When we evacuated National on September 11, we had to look at the ripple effects. What do you do with the checked luggage that is left behind? What do you do about the cash registers that were left in open stores? Our focus has always been on the “air side” of airport operations, but we recognize the need to reconsider the land side implications of disaster planning and the impacts on corporate functions. I’ve joked with Valerie that the development of a formal business continuity plan used to be at the bottom of a large stack of priority projects. Well, after September 11, the needs flow out of your head a lot quicker because you’re experiencing them right now. You don’t have to predict the response plan; you’re living the response plan.
John Ball, Standard Chartered Bank
Based in Singapore, John Ball is head of markets operations in the global markets division of Standard Chartered Bank. With facilities in 57 countries, U.K.-based Standard Chartered provides risk management services to local and multinational companies, including investment and financial institutions and central banks. It has 150 years’ experience in emerging markets, especially in the currencies of Asia, the Middle East, and Africa. Ball talks here about how new technology is enabling improvements in disaster recovery and business continuity.12
How does your organization approach the issue of business continuity?
Simplistically, we view disaster recovery (DR) as loss of software and systems and business continuity (BC) as loss of premises. Each of our locations has to have its own standalone DR and BC plans that they test on a regular basis. For locations in our core markets, where we operate mainframe applications, we outsource the maintenance to a third-party service provider that guarantees DR. Each location is also required to have identified an alternative BC plan (BCP) site. In London, for example, that site is in a separate building altogether.
Within markets operations we are centralizing the transaction processing for 10 sites, including U.K., U.S., and our Asian core markets, into two processing centers. In addition, we are replacing the mainframe applications with scalable solutions that we are moving into a single global data center in the United Kingdom. The 10 sites and two processing centers access the applications in the data center remotely. There are considerable risks associated with such a facility. Consequently, for DR purposes we operate a second data center in a separate building that is a real-time image of the first and connected via a fiber-optic cable. Each data center has been designed to operate the complete set of markets’ applications at full operating capacity. In practice, the applications are split between the two data centers with data mirrored to the non-active site for fail-over operations. Therefore, if the primary center goes down, the failure will not be apparent to the users because its mirror image will already be operating.
What are the business drivers of your approach?
We decided in 1999 that we needed to reduce our unit operating cost. We operate in some fairly expensive places—Hong Kong, London, Singapore—and we wanted to create economies of scale. Our mainframes weren’t scalable, and they had no Internet technology, which is going to be a number-one requisite going forward. To minimize cost, we realized that we needed to move off the mainframe, use common applications in all locations, create economies of scale for processing, and apply standardized procedures and controls. That meant consolidating all our software into a single location, thereby reducing not only mainframe charges but also support costs and future upgrade costs. We also addressed the issue of DR through the creation of the split data center concept. Obviously, there is a risk with operating the data centers in the same city, if both buildings go, but that is a business risk we are comfortable with.
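The split data center concept Ball describes (applications divided between two centers, each mirroring its data to the other so that either can take over) can be illustrated with a minimal Python sketch. This is not Standard Chartered's implementation; the class names, the in-memory key-value store, and the synchronous mirroring are assumptions made purely for illustration.

```python
class DataCenter:
    """A hypothetical data center holding a copy of application data."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class MirroredPair:
    """Routes each application to its active center and mirrors writes to the peer."""

    def __init__(self, a, b, assignments):
        self.centers = {a.name: a, b.name: b}
        # assignments maps application name -> name of its active center
        self.active = dict(assignments)

    def _peer(self, name):
        return next(c for n, c in self.centers.items() if n != name)

    def _route(self, app):
        center = self.centers[self.active[app]]
        if not center.up:
            # Fail over: the mirror already holds the data, so just switch.
            peer = self._peer(center.name)
            if not peer.up:
                raise RuntimeError("both data centers are down")
            self.active[app] = peer.name
            center = peer
        return center

    def write(self, app, key, value):
        center = self._route(app)
        center.data[(app, key)] = value
        peer = self._peer(center.name)
        if peer.up:
            peer.data[(app, key)] = value  # assumed synchronous mirror

    def read(self, app, key):
        return self._route(app).data[(app, key)]
```

The sketch deliberately omits resynchronizing a center that comes back after an outage, and, as Ball notes, it offers no protection if both buildings are lost at once.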
12 Telephone interview with John Ball, November 7, 2001.
How do these business issues affect your business continuity program?
We operate from the time the markets open in Japan until they close in New York. In creating our BCP, we knew we would need processing capabilities that could enable us to support locations in all time zones. We decided on a two-center strategy. We built our first processing center in Singapore in November 2000, and it is now handling transactions for Singapore, Hong Kong, and Japan. In May 2002, our second processing center will be live in India, where we will process transactions generated from Mumbai through to New York close. The implementation of two processing centers gives us built-in business continuity. However, until both centers are live, Singapore operates a standalone BCP. In the longer term, we intend to roll out this model in Africa as well as in the Middle East/South Asia (MESA) region. The Africa project is well advanced in creating two processing centers to support the region. We hope to start the MESA project next year, the objective of which will be to consolidate applications into a single data center and a single processing center to support the region. The ultimate goal will then be to consolidate all the regional processing centers so that there are just two supporting the whole of global markets.
What are the most worrisome threats to the continuity of your enterprise?
The answer depends on the country. In the United Kingdom, our biggest threat has been due to terrorism. In Indonesia, where we are a foreign bank in a country that’s had considerable political and economic troubles in recent years, our problems arise due to rioting and changing attitudes. In Singapore and Hong Kong, on the other hand, our main threats are fire or natural disasters. We would have said the same of New York prior to September 11. We housed our payments systems and cash management operations in 7 World Trade Center, where we were tenants. When we evacuated 7 World Trade Center, some of the IS and IT support teams were able to get to New Jersey and initiate the critical back-up applications there. Like everyone else, we had problems getting the right people to the right location, but fortunately, our U.S.-dollar (USD) clearing was not significantly affected due to the ability of staff to invoke the DR at our BCP site. The fact that we made 98 percent of our payments on September 11 is testament to the quality and commitment of the people and the fact that it is critical to have a viable DR and BC plan.
How will organizations change their business continuity practices in the future?
I think we will all be better prepared. Like everyone else, we have become much more aware of the importance of BCP. People used to pay a lot of lip service to BCP and consider it a necessary evil. It is time consuming and can be expensive. But our experience on September 11 showed us just how beneficial having properly prepared BC and DR plans can be. I know some institutions struggled after September 11, institutions I would have expected to have been better prepared. I think to a large extent complacency had set in. As a U.K. bank, we have seen the impact on business of terrorist activity in London, and that made us much more aware of the associated issues. I think people’s awareness has certainly changed now, especially with regard to areas previously considered reasonably safe.
Although business continuity plans have been regarded as a cost, they have obviously provided value. How do you foresee the implementation of BCP changing in the future?
In the short term, people have chosen temporary BCP solutions—one in which, for example, you rent space in an empty building that you hope never to use. But I think they will eventually move toward in-built business continuity, as we are doing, and new technology makes that possible. There is no longer a requirement for big mainframes that are expensive to operate; with new technology you can mirror data remotely, operate live back-up sites, and add processing power fairly easily. Take us, for example. We operate a single data center in London that essentially functions as our DR site. So, whilst we may have spent a little bit more on putting the necessary hardware in place, it’s not that much more of an incremental cost over what it has been to upgrade our applications. Changes in technology enable people to change the way they view DR and BCP. The new technologies will be incorporated into the organizational infrastructure as companies upgrade from mainframes to the next generation of technology.
KPMG’s Risk and Advisory Services
KPMG’s Risk and Advisory Services (RAS) practice focuses on the fundamental business issues (managing risk, increasing revenues, controlling costs) that all organizations, in all industries, must address in order to flourish. RAS encompasses a wide array of advisory services designed to help companies deal with these issues and to better manage the financial and operational functions of their organization. RAS professionals help companies identify and manage risks, including the risks inherent in the technology systems used to support business objectives, and provide them with information to help them meet their strategic and financial goals.
KPMG’s Assurance and Advisory Services Center
KPMG’s Assurance and Advisory Services Center (AASC) provides assistance to KPMG member firms in creating, enhancing, and supporting KPMG member firms’ assurance products worldwide. Staffed by client service and technical professionals recruited from KPMG member firms around the world, the AASC is a center for assurance research and innovation, product development and support, knowledge management, and technology tool integration.
Major KPMG Contributors
Stuart Campbell, Felipe Alonso, Charles McKinney, Rick Cudworth, Jeanne Edwards, Sally Hales, David DiCristofaro, John Boucher, Robert A. Litt, Sarah Wise, Colleen Drummond, Diane K. Nardin
Visit us on the World Wide Web at www.kpmg.com.
The information contained herein is of a general nature and is not intended to address the circumstances of any particular individual or entity. Although we endeavor to provide accurate and timely information, there can be no guarantee that such information is accurate as of the date it is received or that it will continue to be accurate in the future. No one should act upon such information without appropriate professional advice after a thorough examination of the particular situation.
©2001 KPMG LLP, the U.S. member firm of KPMG International, a Swiss association. All rights reserved. Printed in the U.S.A. on recycled paper. 23522atl