IBM Global Services
ITIL Foundation Course V1.0
Introduction to the IT Infrastructure Library
Availability Management
ITIL® is a Registered Trade Mark, and a Registered Community Trade Mark of the Office of Government Commerce, and is Registered in the U.S. Patent and Trademark Office.
© 2004 IBM Corporation
Module 10
Availability Management
Content:
Availability Management – objective and overview
Responsibilities and obligations
Some definitions
Important aspects:
– Uptime, downtime, and availability
– Availability measurement
– Availability reporting
Benefits and risks
Best practices
Summary
ITIL Foundation Course | Student material v1.0
Availability Management
Integration into the IPW Model
Implementing your Service Desk infrastructure
IPW Model is a trade mark of Quint Wellington and KPN Telecoms
Mission Statement
Availability Management ensures that IT delivers the right levels of availability required by the business to satisfy its business objectives and to deliver the quality of service demanded by its customers.
Availability Management should ensure that the required level of availability is provided. The measurement and monitoring of IT availability is a key activity to ensure that availability levels are being met consistently.
Availability Management should look continuously to optimize the availability of the IT infrastructure, services, and supporting organisation, in order to provide cost-effective availability improvements that can deliver proven business enhancements to customers.
Goal of Availability Management:
Forecasting, planning, and management of service availability, to ensure that:
– All services are based on appropriate and up-to-date CIs (Configuration Items)
– For CIs not supported internally, appropriate agreements exist with third-party suppliers
– Changes are suggested in order to avoid future service downtime
Ensures that SLA-agreed availability is met
Definitions (1)
Availability: measured by Mean Time Between Failures (MTBF)
– Ability of an IT service or component to perform its required function at a stated instant, or over a stated period of time
– Underpinned by reliability, maintainability, serviceability, and resilience of the IT infrastructure
Reliability: measured by Mean Time Between System Incidents (MTBSI)
– Ability to work without operational failure
– Depends on the probability of failure of each component, the resilience built into the IT infrastructure, and the preventive maintenance applied to prevent a failure from occurring
Maintainability: measured by Mean Time To Repair (MTTR)
– Ability to be retained in, or restored to, an operational state
– Depends on anticipation, detection, diagnosis, resolution, recovery from failures, and restoration of data and IT service
Definitions (2)
Serviceability: cannot be measured as a specific metric
– Ability to maintain the availability, reliability, and maintainability provided under the contractual agreements with the IT service providers
Resilience (or fault tolerance):
– Ability of an IT service to remain operational in spite of the malfunction of one or more subcomponents
Tasks
Figure: tasks of Availability Management
– Identification of availability requirements
– Availability planning (Availability Plan)
– Monitoring, review, and assessment
– Availability improvement
Inputs and Outputs
Inputs:
– Availability, reliability, and maintainability requirements of the business for new or enhanced IT services
– Business Impact Assessment (BIA) for each vital business function underpinned by the IT infrastructure
– Information on IT service and component failures coming from incidents and problems
– Configuration and monitoring data
– SLA achievements
Outputs:
– Availability and recovery design criteria for new or enhanced IT services
– Availability techniques that will be deployed to provide additional infrastructure resilience
– Availability reporting to reflect the business, IT support, and user perspectives
– Monitoring requirements for IT components that allow the detection of deviations in availability
– Availability Plan for the proactive improvement of IT infrastructure availability
Uptime, Downtime, and Availability
Figure: timeline between two incidents. Each incident is followed by recognition, diagnosis, repair, and recovery; the time axis is marked with MTTR, MTBF, and MTBSI.
MTTR - Mean Time To Repair: downtime (maintainability)
MTBF - Mean Time Between Failures: uptime (availability, serviceability)
MTBSI - Mean Time Between System Incidents: average reliability
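As a rough illustration of how the three measures on the timeline relate, the sketch below derives MTTR, MTBF, and MTBSI from a list of incidents. The `Incident` record, its field names, and the sample times are hypothetical and not part of the course material. It assumes every incident has a known failure time and a known time at which service was restored, both in hours, and that at least two incidents are available.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    start: float  # hour at which the service failed (hypothetical field)
    end: float    # hour at which the service was restored (hypothetical field)


def mean(values):
    return sum(values) / len(values)


def availability_measures(incidents):
    """Return (MTTR, MTBF, MTBSI) in hours for incidents sorted by start time."""
    pairs = list(zip(incidents, incidents[1:]))
    # MTTR: average downtime, from failure to restored service.
    mttr = mean([i.end - i.start for i in incidents])
    # MTBF: average uptime, from one recovery to the next failure.
    mtbf = mean([nxt.start - cur.end for cur, nxt in pairs])
    # MTBSI: average time from the start of one incident to the start of the next.
    mtbsi = mean([nxt.start - cur.start for cur, nxt in pairs])
    return mttr, mtbf, mtbsi


# Illustrative data only: three incidents over roughly 500 hours.
incidents = [Incident(100, 102), Incident(300, 304), Incident(500, 503)]
mttr, mtbf, mtbsi = availability_measures(incidents)
print(f"MTTR={mttr:.1f}h MTBF={mtbf:.1f}h MTBSI={mtbsi:.1f}h")
```

With this sample data MTBSI equals MTBF plus MTTR, which matches the picture on the timeline: the span from one incident to the next consists of a period of downtime followed by a period of uptime.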
Availability Measurement (1)
When is a service not available?
"A service is not available to a customer if the locally required functions cannot be used, although the agreed conditions for the provision of the service are fulfilled."
A simple calculation of availability in %:
Availability (%) = (Agreed service time – Downtime) / Agreed service time x 100
But what does 98% availability mean?
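To make the percentage more concrete, the following sketch applies the simple calculation above and then turns an availability target back into a downtime budget. The function names and the sample agreement (10 hours a day, 5 days a week, over 4 weeks) are assumptions chosen for illustration, not figures from the course.

```python
def availability_percent(agreed_service_hours, downtime_hours):
    """Availability (%) = (agreed service time - downtime) / agreed service time * 100."""
    return (agreed_service_hours - downtime_hours) / agreed_service_hours * 100


def allowed_downtime_hours(agreed_service_hours, target_percent):
    """Downtime budget implied by an availability target over the agreed service time."""
    return agreed_service_hours * (1 - target_percent / 100)


# Hypothetical agreement: 10 hours a day, 5 days a week, over 4 weeks = 200 hours.
agreed = 10 * 5 * 4
print(round(availability_percent(agreed, downtime_hours=4), 2))
print(round(allowed_downtime_hours(agreed, target_percent=98), 2))
```

Under these assumptions, 98% availability corresponds to roughly 4 hours of downtime in the period. The percentage alone does not say whether that was one long outage or many short ones, which is one reason the figure needs interpretation.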
Availability Measurement (2)
Serial:
Disk A: availability = 90%
Disk B: availability = 90%
The service is available only if both disks are in operation =>
Availability = A x B = 0.9 * 0.9 = 0.81 or 81%
Parallel:
Disk A: availability = 90%
Disk B: availability = 90%
The service is unavailable only if both disks are unavailable =>
Availability = 1 – (both are not available) = 1 – (A not available) x (B not available) = 1 – 0.1 * 0.1 = 0.99 or 99%
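The serial and parallel calculations generalize to any number of components. The sketch below is one way to express them; the function names are illustrative, and it assumes that component failures are independent of each other, which the slide's arithmetic also implies.

```python
from math import prod


def serial_availability(availabilities):
    """All components must be up: multiply the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """At least one component must be up: 1 minus the chance that all are down."""
    return 1 - prod(1 - a for a in availabilities)


disks = [0.9, 0.9]  # Disk A and Disk B, 90% each
print(f"serial:   {serial_availability(disks):.2f}")    # 0.81
print(f"parallel: {parallel_availability(disks):.2f}")  # 0.99
```

Adding components in series can only lower the overall availability, while adding redundant components in parallel raises it.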
Availability Measurement (3)
Example of availability in a parallel or a serial architecture
Risk Management is also an aspect of availability
Figure: risk analysis and risk management
Risk analysis:
– Assets
– Threats
– Weaknesses
– Risks
Risk management:
– Countermeasures
– Planning for possible downtimes
– Management of downtimes
Availability Reporting
Classical reporting measures:
– % available
– % unavailable
– Duration of unavailability in hours
– Frequency of failure
– Impact of failure
Problems with classical measures:
– Fail to reflect IT availability as experienced by the business and users
– Conceal "hot spots": availability can look generally good from the point of view of the IT organization
– Do not support continuous improvement
Future measured variables (CCTA acceptance):
– Impact by user minutes lost (user productivity)
– Impact by business transaction
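A measure such as user minutes lost can be sketched as outage duration multiplied by the number of users affected, summed per service. The `Outage` record, its fields, and the sample figures below are hypothetical. The course names the measure but does not prescribe a calculation, so this is one plausible reading rather than a defined method.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    service: str         # hypothetical service name
    minutes: int         # duration of the outage in minutes
    users_affected: int  # users unable to work during the outage


def user_minutes_lost(outages):
    """Total user minutes lost per service: outage duration times users affected."""
    totals = {}
    for o in outages:
        totals[o.service] = totals.get(o.service, 0) + o.minutes * o.users_affected
    return totals


# Illustrative data only.
outages = [
    Outage("email", minutes=30, users_affected=200),
    Outage("payroll", minutes=120, users_affected=5),
    Outage("email", minutes=10, users_affected=200),
]
print(user_minutes_lost(outages))  # {'email': 8000, 'payroll': 600}
```

In this made-up data the payroll service was down three times longer than email in total, yet email accounts for far more lost user time. A report of duration alone would hide that difference.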
Benefits
IT services with an availability requirement are designed, implemented, and managed to consistently meet that target
Improvement of the capability of the IT infrastructure to attain the required levels of availability to support the critical business processes
Improvement of customer satisfaction and recognition that availability is the prime IT deliverable
Reduction in frequency and duration of incidents that impact IT availability
Single point for availability is established within the IT organization (process owner)
Levels of IT availability provided are cost-justified and support SLAs fully
Shortcomings in provision of availability are recognized and coped with in a formal way
Mindset moves from error correction to service enhancement: from reactive to proactive attitude
Risks
Potential problem areas:
Costs of availability management are seen as overhead and are too high
It is difficult to quantify the availability demands of the user and to determine their costs
Lack of available resources with the required skills
Gathering of availability data requires many tools to underpin and support the process
Vendor dependency
Broad knowledge of the IT infrastructure is required
Best Practices
Separation of design and measurement
Usage in connection with Capacity Management, Financial Management for IT Services, and IT Service Continuity Management
Determination of metrics using this process
Summary
The goal of the Availability Management process is to optimize the capability of the IT infrastructure, services, and supporting organization to deliver a cost-effective and sustained level of availability that enables the business to satisfy its business objectives.
Aspects:
– Availability, Maintainability, Reliability, Serviceability
– Risk Management
Measures of Availability:
– MTBSI (Mean Time Between System Incidents)
– MTTR (Mean Time To Repair)
– MTBF (Mean Time Between Failures)
– Calculation of Availability