The Service Desk (Function)
Overview
The service desk is the window of the IT community and as such should be viewed as the single or central point of contact for all IT products and services. This course will answer the question "Why do we need a Service Desk?" The service desk will provide the user with a single point of contact. A single point of contact provides the user with one number to call for all their IT service concerns and issues. The Service Desk will:
Represent "IT" to the customer and the user
Offer the ultimate in customer satisfaction
Prioritize highest perception of quality to its customers
Co-ordinate people, processes and technology to deliver business services
To deliver high quality support to achieve business goals In order to be successful, the service desk must give the customer the highest quality of support possible while staying within budget and delivering business goals directed by management. This can be achieved through Service Level Management.
To aid in user retention and satisfaction Good quality service will keep customers coming back for more! As mentioned earlier, positive perception is key to the success of the Service Desk. When a customer's needs are met, there is no need to look beyond the Service Desk. Satisfaction can be measured not only by the speed at which the phone is answered but also by the level at which the organization understands their customer's business and needs.
To improve the service but reduce the cost This can be achieved by improving the response time to the customer. Time is money, and the faster you can move forward with a customer's concern the more cost efficient the resolution will be. Collecting the vital information needed to resolve the customer's request the first time (that is, via 1st level support) will also reduce the amount of time the customer and 2nd level support have to wait. If wrong or insufficient information is collected at 1st level then 2nd level support has to get back to the customer or Service Desk for data collection. The Service Desk is also responsible for ensuring that the right 2nd level support group is working on the customer's issue. Again, making efficient use of resources will reduce costs.
To highlight customer training and education needs. As the single point of contact, Service Desk will understand the customer's needs. All requests for both incidents and service are filtered through the Service Desk. The Service Desk will identify trends that highlight a possible IT need, for example, training. The Service Desk will also be aware of new technology that will be introduced to the business and will make recommendations on training needs based on historical data from previous software or hardware releases.
To close incidents and confirm customer satisfaction Before closing records/tickets, the service desk will consult the customer, giving them the authority to ensure that their problem was resolved to their satisfaction. This is an excellent way of confirming satisfaction with the customer. This type of service is not very common and is very well received by customers.
Contributing to problem identification The Service Desk staff will have a combination of both superior technical and soft skills, which will enable them to be more involved in the problem identification process. The Service Desk will use both experience and a knowledge base to collect data for both a work-around and resolution. Since it is the responsibility of the Problem Management Process to resolve problems, the Service Desk can point Problem Management to previous solutions. Tools play a vital role in this area. A call tracking device or knowledge base can be considered a tool.
Roles and Responsibilities
The Service Desk is responsible for accepting and recording all calls, without exception. This information, no matter how insignificant, will be vital for statistics. All calls are important, even if they are wrong numbers. A large number of wrong numbers coming into the Service Desk may identify a problem with the lines or trunks coming into the Service Desk. The directive of the Service Desk, which owns the Incident Management Process, is to resolve as many calls as possible at 1st level. World class 1st level resolution is set at 85%. This can be achieved by close co-operation with 2nd level support. As calls are being forwarded to 2nd level support for problem resolution, the Service Desk has to ensure that information regarding resolution is placed in the call-tracking tool. This information can be used for similar calls received in the future. If the skill set is adjusted to allow 1st level (Service Desk, Incident Management) to resolve the issue then the percentage of calls resolved at 1st level will go up. In any case the fact that the information is being collected will reduce the waiting time for resolution to the customer and the research and diagnosis for 2nd level support, resulting in a cost saving and efficient use of resources.
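The 85% first-level resolution figure is a simple ratio that can be computed from the call-tracking tool. The sketch below is only an illustration: the function name and the sample call counts are assumptions, not figures from the course.

```python
def first_level_resolution_rate(resolved_at_first_level: int, total_calls: int) -> float:
    """Percentage of recorded calls resolved by 1st level without escalation."""
    if total_calls == 0:
        return 0.0  # no calls recorded yet, so there is nothing to measure
    return 100 * resolved_at_first_level / total_calls


WORLD_CLASS_TARGET = 85.0  # the target quoted in the text

# Hypothetical month: 200 recorded calls, 170 of them resolved at 1st level.
rate = first_level_resolution_rate(170, 200)
print(f"{rate:.1f}% resolved at 1st level, target met: {rate >= WORLD_CLASS_TARGET}")
```

Because every call is recorded, including wrong numbers, the denominator in a real report would need an agreed definition of which calls count.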
The Service Desk is responsible for monitoring and escalating according to all SLAs. It is a given that on all Service Level Agreements the Service Desk is highlighted as the point of escalation. Close monitoring of all tickets being escalated will ensure that 2nd level support does not miss their SLA targets to resolution.
The Service Desk will keep users informed on the status and progress of their requests. Management of the customer's expectations and informing the customer of any status changes or progress with their requests is vital to customer satisfaction. If close monitoring of requests is adhered to, the customer's expectations will be exceeded. In addition to keeping customers informed the Service Desk is responsible for closure and verification of every request.
Communication of planned and short-term outages will be done when information is gathered from either the Change Advisory Board or Service Level Management. The Service Desk will broadcast these messages to the users using whatever means are available at the time, be they phone messages, e-mail, broadcast messages, bulletin boards, intranet, and so on.
Co-ordination of 2nd level and 3rd party support for all customers' needs is another responsibility. The Service Desk owns all problems being escalated. They will ensure that the right 2nd level group is handling the customer's request. They will also ensure that the proper 3rd party vendor is involved and providing the service within the underpinning contract.
The Service Desk has the ability to inform management on all recommendations for service improvements. Being the single point of contact, the Service Desk will have insight into or input from customers' needs.
Benefits:
Improved user service, perception, and satisfaction
Increased user accessibility via SPOC (Single Point of Contact)
Improved quality and faster response to user requests
Improved team work and communications
Management of infrastructure and control
Reduced Cost (Efficient use of resources)
Exercises The Service Desk (Function)
You are the new ITIL process owner for the ABC COOKIE COMPANY. You are assigned to set up a service desk. How will you apply the benefits of the service desk to the ABC COOKIE COMPANY?
1. True or false? A call to the service desk to place an order for cookies is an incident?
2. True or false? The service desk makes recommendations on service improvements?
3. Which of the following are true?
(A) The service desk has to give the highest quality of support possible within budget
(B) The service desk has to deliver business goals directed by management
A. Both A and B
B. Just A
C. Just B
D. None
Answers
1. Any call to the Service Desk is considered an incident (True)
2. The Service Desk does make recommendations for improvements (True)
3. Both A and B
Configuration Management
Why Configuration Management? The responsibility for businesses to deliver quality IT services economically, efficiently and effectively is what drives Configuration Management. Configuration Management addresses the need to control IT assets and services.
Terms and Definitions
Term - Definition
CI - Configuration Item. Anything within IT that is decided to be within scope and can be changed should be considered a "CI". This could be hardware, software, service level agreements, job descriptions, and so on.
CMDB - Configuration Management Database. The CMDB holds all detail and relationship information of all CIs associated with the IT infrastructure.
SCOPE
The activities of configuration management include identification, control, Status accounting and auditing.
In order to be able to do this, the company has to decide what will be within scope of configuration management. If the scope is too big then the CMDB might have integrity issues. If the scope is too small then the CMDB might be useless. Configuration management, working with asset management, will ensure that all configuration items (Software, Hardware and Documentation) are identified, recorded and tracked. All ITIL processes rely on information supplied by Configuration Management. The information that will be available is:
CI history (from the point of being ordered to retired).
Information on all assets (what type of equipment, attributes and location, and so on).
Information on how many assets are available when a release is being planned.
The relationship between assets (PCs connected to server for instance).
Call tracking data used in incident, problem and change management.
Information on suppliers that are related to either service or asset providers.
Information on lease management on all assets.
Information used for auditing company assets to ensure asset management numbers.
Information for IT Business Continuity Management Process (Base Line).
Planning Configuration Management
Planning the Configuration Management database will need a great deal of input from all IT departments. The size of the database will depend on the information that the IT departments need to provide and where it is coming from. The question that will pop up will be "How big or how small will this database be?". If the database is too small the information contained in it won't be of much use. There will be an inability to track down certain CIs or do any trending. If it has too much information it will be too cumbersome to maintain and will soon become riddled with integrity issues. No one will want to maintain a large database. Also, the cost of managing a large database may outweigh its value. It really all depends on the scope and the resources to manage it. The pertinent information may be found in a number of different databases, which need to be assembled into one logical format within the incident control system. In the diagram below you will see that the service desk has access to information from five different databases. Having access to this information depends on the incident control system tool you are using.
Processes, procedures and activities have to be defined. Who will manage it? There is going to be a need for processes and procedures that will clearly define accountability and authority for managing the information. Only the Configuration Management Owner can authorize people such as the Service Desk Analyst to update or change the Configuration Management Database. The service desk would be the likely group to do this since they are in contact with between 5 and 10 percent of the employees of a company daily. Planning the relationship processes between Configuration Management, Change Management and Release Management as well as third-party vendors is important. Configuration Management contains information on all CIs within the infrastructure. This information can help with Release Rollouts, which are directed by Change Management. For example, if your company decides to release Windows XP into your infrastructure there are certain elements that need to be considered:
How many licenses do we need?
Is the equipment on which the software will be loaded capable of handling the operating system?
Will any training need to take place to make the transition smoother?
If your configuration management database is planned and built properly, this information is obtainable. A tool that can do this will need to be selected. All of the larger Help Desk / Call Centre solutions companies are ITIL certified and will be able to supply a tool that can do this for you.
Identification
You now have a configuration management database! So what do you need to do now? The next step is to decide how you enter the information into your system. You have to decide on an identifying and labeling procedure. When you decide what CIs are in the scope of your process you either have to identify and label them yourself or have your third-party providers do that for you. It is very important that you take care in this step. The Service Desk will need to ask this information of a user when they call for support. If the information is not clear or understandable then the issue will take much longer to resolve. Identifying equipment can be very straightforward. For example, a laptop in company "ABC" located in building "10" on the "3rd" floor can be identified as follows:
ABCLT1003
ABC - Company
LT - Laptop
10 - Building 10
03 - 3rd Floor
This identification will now be associated with information related to that CI. The identification will also include information regarding who owns the equipment and what version of operating system is on it, as well as what server and department it is associated with. All this information is vital to understanding the impact that each piece of equipment has on the infrastructure. The business will also have to decide what level of detail regarding each CI is appropriate: will having information on the PC, monitor and keyboard be enough, or should the detail be at a deeper level?
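The ABCLT1003 labeling example can be decoded mechanically, which is what a Service Desk tool would do when a user reads the label over the phone. The following is a minimal sketch under stated assumptions: the fixed-width layout (3-letter company, 2-letter equipment code, 2-digit building, 2-digit floor) is inferred from the single example, and the names parse_ci_label, CILabel and EQUIPMENT_TYPES are hypothetical.

```python
import re
from typing import NamedTuple

# Hypothetical lookup of equipment codes; only "LT" appears in the text.
EQUIPMENT_TYPES = {"LT": "Laptop"}

# Assumed layout: 3-letter company, 2-letter type, 2-digit building, 2-digit floor.
LABEL_PATTERN = re.compile(r"^([A-Z]{3})([A-Z]{2})(\d{2})(\d{2})$")


class CILabel(NamedTuple):
    company: str
    equipment: str
    building: int
    floor: int


def parse_ci_label(label: str) -> CILabel:
    """Split a CI label such as 'ABCLT1003' into its components."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized CI label: {label!r}")
    company, type_code, building, floor = match.groups()
    # Fall back to the raw code when the equipment type is not in the lookup.
    equipment = EQUIPMENT_TYPES.get(type_code, type_code)
    return CILabel(company, equipment, int(building), int(floor))


print(parse_ci_label("ABCLT1003"))
```

Rejecting malformed labels early supports the point made above: unclear identification makes an issue take longer to resolve. Note that the example scheme would give two laptops on the same floor the same label, so a real scheme would presumably carry an additional distinguishing element.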
Control
The person who owns the Configuration Management Database is called the "Configuration Librarian". This person decides who has update authority. As mentioned earlier, the Service Desk would be very helpful in this area considering the amount of contact they have with the users. Only authorized CIs can be included. This means that if your standard desktop is an IBM machine then equipment from other manufacturers that are not recognized should not be included in the database. From the moment that a purchase order for equipment is signed an entry into the database should be made. This is very important for management to understand what their infrastructure looks like at the earliest point. Upper management should also have access to this database, in order to find the answers to simple questions about CIs throughout the company (that is, to determine how many computers are owned by the company). This information should include all recent purchases. It is also important for this database to track equipment to the point at which it is retired or disposed of. This information will be used by a number of different ITIL processes, including IT Financial Management.
Status Accounting
Status accounting takes into consideration the state of every CI within the company. If a computer is ordered then its status is "ordered". When the CI arrives it is in "received" status. When it is deployed, it is in "active" status, and so on. Every CI will have historical data associated with it. This will be beneficial when deciding on a desktop that you want to use in your company. Historical status information will let you know the success or failure rate for all CIs. For example, if you determine from the data that the current desktops had a higher than 10% failure rate then you should consider other options for future acquisitions. Status accounting information can be used for audit purposes to determine the current equipment actively being used by the company.
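Status accounting as described above amounts to a small state machine plus a history per CI. The sketch below is one possible illustration, not a prescribed design: the "ordered", "received" and "active" statuses come from the text, while the "failed" and "retired" statuses, the allowed transitions and all class and function names are assumptions.

```python
from dataclasses import dataclass, field

# "ordered", "received" and "active" come from the text;
# "failed" and "retired" are assumed extra states for illustration.
ALLOWED_TRANSITIONS = {
    "ordered": {"received"},
    "received": {"active"},
    "active": {"failed", "retired"},
    "failed": {"active", "retired"},
    "retired": set(),
}


@dataclass
class ConfigurationItem:
    ci_id: str
    # Every CI starts its life when the purchase order is signed.
    history: list = field(default_factory=lambda: ["ordered"])

    @property
    def status(self) -> str:
        return self.history[-1]

    def move_to(self, new_status: str) -> None:
        """Record a status change, rejecting transitions that are not allowed."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"{self.ci_id}: cannot go from {self.status} to {new_status}")
        self.history.append(new_status)


def failure_rate(items) -> float:
    """Share of deployed CIs (ever 'active') whose history includes a failure."""
    deployed = [ci for ci in items if "active" in ci.history]
    if not deployed:
        return 0.0
    failed = [ci for ci in deployed if "failed" in ci.history]
    return len(failed) / len(deployed)


# Hypothetical fleet of ten desktops, two of which have failed.
desktops = [ConfigurationItem(f"PC{n}") for n in range(10)]
for pc in desktops:
    pc.move_to("received")
    pc.move_to("active")
desktops[0].move_to("failed")
desktops[1].move_to("failed")

print(f"failure rate {failure_rate(desktops):.0%}, above 10%: {failure_rate(desktops) > 0.10}")
```

Keeping the full history rather than only the current status is what makes the audit and failure-rate questions answerable later.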
Verification and Audit Verification and Audits have to be done prior to any major changes and releases to CIs. This will give your company an idea as to what equipment or technologies are out there and compare it with the information within the database. On average, the Service Desk comes into contact with between 5 - 10 % of company users daily. This means that by asking a few selected questions, they can verify the accuracy of the information in the database.
Other Definitions
CI Attributes (unique identifiers)
o Serial or copy number
o Model number
o License number
o Type
o Version
Relationships
o Connected to
o Part of
o Copy of
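The relationship types listed above are what make impact analysis possible: given one CI, the CMDB can report which other CIs depend on it. The following sketch shows the idea with an in-memory list. The CI names are made up, and reading "connected to" and "part of" as "the first CI depends on the second" is an assumption made for this illustration.

```python
from collections import defaultdict

# (source CI, relationship, target CI); all CI names are hypothetical.
RELATIONSHIPS = [
    ("PC-001", "connected to", "SRV-01"),
    ("PC-002", "connected to", "SRV-01"),
    ("SRV-01", "part of", "RACK-A"),
]


def impacted_by(ci, relationships):
    """Return every CI that directly or indirectly depends on `ci`."""
    # Assumption: the source of each relationship depends on its target.
    dependents = defaultdict(set)
    for source, _relation, target in relationships:
        dependents[target].add(source)

    seen = set()
    stack = [ci]
    while stack:
        current = stack.pop()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen


print(sorted(impacted_by("RACK-A", RELATIONSHIPS)))
```

A change or failure on RACK-A would therefore be assessed against the server and both PCs, not just the rack itself.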
Benefits
Configuration Management information that supports all other processes
Information for impact and trend analysis for problems and changes
Assists in adherence to legal and contractual obligations
Reduces risk of unauthorized software
Helps with financial planning
Exercises Configuration Management (Process)
The ABC Cookie Company needs to set up a configuration management process. How will you apply the benefits of Configuration Management to the ABC Cookie Company?
1. True or false? An Oven is a Configuration Item (CI)
2. What would be an attribute of an OVEN?
3. Which of the following is true?
(A) The Configuration Management Database (CMDB) contains physical copies of hardware, software and documentation
(B) A job description is considered to be a CI
A. Both A and B
B. Just A
C. Just B
D. None
Answers
1. An oven is a CI (True)
2. An attribute for an oven could be a rack or thermostat
3. Just B. No physical copies of hardware can be stored in the CMDB. A job description is a CI because it can be changed.
Incident Management The Service Desk (function) owns the incident management (process). The goal for incident management is to restore service as soon as possible, minimizing the disruption to the business.
Terms and Definitions An incident is defined as any event that is not part of the expected operation of a service that causes (or may cause) an interruption to (or a reduction in) the quality of that service. This is true of all incidents involving hardware, applications, service requests or documentation. Most incidents are related to either hardware or software. Documentation is an integral part of the IT department. If documented procedures for an operating system are changed and the supporting staff is not informed, the procedure will likely fail due to the use of an incorrect or out of date procedure - that would be considered an incident. Applications, hardware and service requests are within the scope of incident management support.
Incident Management Activities: Detecting and recording All communications with the service desk function regarding an incident or request should be recorded regardless of whether the service desk detected the incident or if a user reported it. The capturing of incidents is beneficial for a number of other processes. Problem management can use the information from all tickets to do trend analysis. Service level management can use the information for metrics in service management. Change management can use the information to detect if incidents occurred due to the lack of proper change management.
Classification and initial support The service desk, through the incident management process, will classify each incident or service request coming into the service desk. It will be determined early whether it is an incident or service request.
If it is an incident, it will then be determined if the incident is routine (such as printer start), known error (lack of space on server) or a known problem (blue screen of death for instance). Each possibility will have handling processes around them. Part of classification is to determine the priority of the incident. This is done through an evaluation of impact, urgency and expected effort. Once the classification is determined, a priority is assigned.
Impact - The degree to which the business is affected by the incident
Urgency - The speed with which the incident needs to be resolved
Expected effort - The resources needed to rectify the incident
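One way to turn impact, urgency and expected effort into a working queue is sketched below. The course does not give a formula, so everything here is an assumption for illustration: the three-level scale, the additive priority score (1 is the most pressing) and the use of expected effort as a tie-breaker.

```python
# Assumed three-level scale; 1 is the most severe.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> int:
    """Return a priority from 1 (most pressing) to 5 (least pressing)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


# Hypothetical tickets; effort_hours stands in for "expected effort".
tickets = [
    {"id": "INC-1", "impact": "low", "urgency": "medium", "effort_hours": 1},
    {"id": "INC-2", "impact": "high", "urgency": "high", "effort_hours": 4},
    {"id": "INC-3", "impact": "high", "urgency": "high", "effort_hours": 2},
]

# Sort by priority first, then handle the quicker fix first among equals.
queue = sorted(
    tickets,
    key=lambda t: (priority(t["impact"], t["urgency"]), t["effort_hours"]),
)
print([t["id"] for t in queue])
```

An organization would normally agree the actual scale and weighting with the business as part of its SLAs rather than hard-code them.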
Part of initial support is to inform problem management of any new or unmatched incidents as quickly as possible. The service desk will either resolve the incident itself or quickly find a route to resolution using the incident management processes.
Investigation and diagnosis Understanding the details of the incident by collecting any pertinent information related to the incident or if available consulting a knowledge base is an important part of the investigation and diagnosis stage. This information will determine any resolution activity, workarounds or escalation process. There are 2 forms of recognized escalation:
Functional - usually from 1st line upward due to lack of skill set to resolve the incident
Hierarchical - usually a manual escalation through authority and can be done at any time
When the right resolving group has accepted the incident and evaluated the type of work that is needed, both the service desk and the customer must be informed of the time to resolution.
Resolution and Recovery Incident management will use a workaround solution to resolve an incident or initiate a request for change (RFC).
Incident closure The incident can only be closed with the user's permission. Confirmation of incident resolution will be made with the user and details of the resolution must be placed in the Incident Control System.
1st, 2nd and 3rd line support Within the incident management process there are multiple lines of support. The service desk is usually considered 1st line support. 2nd, 3rd line and up escalation is based on what skill set is needed, determined by the service desk. The service desk (1st line) accepts an incident. If after recording and some initial support it is determined that the resolution of the incident is not one that is presently known or the skill set is not available at 1st line, the service desk will escalate based on functional escalation (skill set) to 2nd line. If 2nd line can't resolve the incident, they will escalate to 3rd line and so on. Regardless of which line of support resolves the incident, it is the responsibility of the service desk (1st line) to confirm closure with the customer.
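The functional escalation path described above can be sketched as a walk through the support lines until one has the required skill set. The skill sets and names below are hypothetical placeholders; only the ordering of the lines and the rule that the service desk confirms closure come from the text.

```python
# Hypothetical skill sets per support line; real ones depend on the organization.
SUPPORT_LINES = [
    ("1st line (Service Desk)", {"password reset", "printer"}),
    ("2nd line", {"server", "network"}),
    ("3rd line", {"vendor hardware"}),
]


def escalate(required_skill: str):
    """Walk the lines in order; return (resolving line, escalation path)."""
    path = []
    for name, skills in SUPPORT_LINES:
        path.append(name)
        if required_skill in skills:
            return name, path
    return None, path  # nobody has the skill set; needs a management decision


resolver, path = escalate("network")
# Whoever resolves the incident, closure is confirmed by the service desk.
closed_by = SUPPORT_LINES[0][0]
print(f"resolved by {resolver} via {path}, closure confirmed by {closed_by}")
```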
Relationship between Incident, Problem and Change Management Processes The relationship between processes highlights the importance of ITIL within the IT environment. The Incident Management (process) through the Service Desk (function) collects information in the incident control system. Incident Management can resolve about 85% at 1st line. Anything the Incident Management process cannot resolve has to be escalated to Problem Management. Problem Management relies on Incident Management to provide enough information to create a work-around. The work-around is for incident management to provide an interim solution to satisfy the user. To permanently remove the incident it has to be determined what the known error is. After the known error is determined and the CI (configuration item) at fault is identified, a request for change (RFC) is presented to the change management process for handling.
Benefits
Reduced business impact of incidents By having proper incident management processes in place, the duration of each incident will be reduced and the customer's expectations about resolution can be managed.
Proactive Identification of beneficial system enhancements and amendments The ability to use the Incident Control System to analyze incidents, what is causing them and how they can be resolved (Knowledge base).
Availability of business-focused management information related to the SLA The incident control system being used by the service desk contains information on all incidents. There will be metrics indicating when the incident was logged, resolved and closed. This information will be compared with the SLA requirements, which will indicate if the incidents were resolved within SLA targets.
Improved monitoring of SLA
The service desk using the incident management process will monitor all Incident Control Tickets being escalated to 2nd level support and up to ensure that SLA targets are met.
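Checking a ticket against its SLA target comes down to comparing timestamps held in the incident control system. The sketch below is a minimal illustration; the 4-hour target, the status labels and the function name are assumptions rather than values from the course.

```python
from datetime import datetime, timedelta
from typing import Optional


def sla_status(logged: datetime, resolved: Optional[datetime],
               target: timedelta, now: datetime) -> str:
    """Classify a ticket against its SLA resolution target."""
    deadline = logged + target
    if resolved is not None:
        return "met" if resolved <= deadline else "breached"
    # Still open: flag it once the deadline has passed so it can be escalated.
    return "breached" if now > deadline else "open"


# Hypothetical tickets with an assumed 4-hour resolution target.
target = timedelta(hours=4)
logged = datetime(2024, 1, 8, 9, 0)
now = datetime(2024, 1, 8, 15, 0)

print(sla_status(logged, datetime(2024, 1, 8, 12, 30), target, now))  # resolved after 3.5 hours
print(sla_status(logged, datetime(2024, 1, 8, 14, 0), target, now))   # resolved after 5 hours
print(sla_status(logged, None, target, now))                          # still open after 6 hours
```

Running such a check on open tickets, not only closed ones, is what lets the service desk escalate before a target is missed rather than report the miss afterwards.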
Improved management information Incident management will capture all information related to incidents, both the symptoms and the resolution. This will give management a better insight into the IT infrastructure. It includes information on how many incidents of each classification are created, which areas were affected and how long resolution took.
Better staff utilization and efficiency Incident management is responsible for ensuring that the right resources are working on the right problems. By doing this, the resolving groups will be used more efficiently and at the correct times.
More accurate CMDB information Incident management will ensure that all information going into the Incident Control System ticket is correct. The information that populates the ticket comes from the CMDB. Verification of the information will always be done and any discrepancies will be corrected.
Improved user and customer satisfaction The faster incidents are resolved correctly the first time, the more satisfied the user/customer. The Incident management process ensures that speed and accuracy are incorporated into the process of incident resolution.
Exercises Incident Management (Process)
The ABC Cookie Company has set up the Incident Management process. A call came into the Service Desk regarding an issue with one of the ovens. What activities would incident management do to try and resolve the incident?
1. Who will close the ticket with the baker, incident management or the Service Desk? _______________
2. If incident management could not resolve the oven issue at 1st level what kind of escalation would be used to get the right skills working on it: Hierarchical or Functional? _______________
3. Which of the following is not a benefit of Incident Management? ______
A. Better staff utilization and efficiency
B. Elimination of lost Incidents
C. More accurate CMDB information
D. Improved user and customer satisfaction
E. Reduced risk of unauthorized software
Answers
1. The Service Desk is the function and it owns the Incident Management Process so the Service Desk will close the ticket with the baker
2. Functional escalation to 2nd or 3rd level support
3. E. Reduced risk of unauthorized software is not a benefit of Incident Management (it is a benefit of Configuration Management)
Problem Management
The goal of problem management is to minimize the adverse effect on the business of incidents and of problems caused by unknown errors in the infrastructure, and to prevent the recurrence of incidents related to those errors.
Terms and Definitions
Getting from problem to known error is a major part of the problem management process.
Problem - the unknown underlying cause of one or more incidents.
Known Error - when the root cause is known and a temporary work-around or permanent alternative has been identified.
Problem Management Activities
Problem Control Problem management handles incidents that cannot be resolved by the incident management process at first contact and directs its resources toward the root cause. Problem management will identify the problem and record it in the knowledge base for future reference. The problem will be classified based on the SLA. After the problem is classified, problem management will then perform a structured investigation and diagnosis to determine the CI (configuration item) at fault. When the CI has been identified, an RFC (request for change) will be submitted. Problem management will then direct incident management on the best work-around and the time to resolution of the problem.
Error Control Now that the problem has been identified, error control eliminates known errors by working with change management. Error control has to be aware of, monitor and eliminate known errors where possible in a cost justifiable way.
Assistance with the handling of major incidents Problem management will assist incident management with major incidents by alerting them of known errors and workarounds.
Proactive Problem Management Proactive prevention focuses on identifying and resolving problems before incidents start to occur. This can be done in 3 ways:
Trend Analysis
o Using metrics and incoming data to identify patterns related to incidents or CIs indicative of a problem
Targeting Preventive Action
o Be aware of current incidents
o Volume of incidents
o The number of customers affected
o Duration and related costs of resolving those incidents
o The cost to the business of the outage
Providing information to the organization
o Information on the IT infrastructure that is found within the CMDB can be very useful to the organization when they go through the exercise of selecting new hardware
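Trend analysis of the kind listed above can start as something very simple: count the incidents recorded against each CI and flag the CIs that keep recurring. The sketch below assumes incidents are available as (CI, symptom) pairs and uses an arbitrary threshold of three; both the data and the threshold are hypothetical.

```python
from collections import Counter


def problem_candidates(incidents, threshold=3):
    """Return CIs with at least `threshold` incidents, as candidates for a problem record."""
    counts = Counter(ci for ci, _symptom in incidents)
    return sorted(ci for ci, n in counts.items() if n >= threshold)


# Hypothetical extract from the incident control system.
incidents = [
    ("OVEN-1", "breaker tripped"),
    ("OVEN-1", "breaker tripped"),
    ("OVEN-2", "door stuck"),
    ("OVEN-1", "breaker tripped"),
]

print(problem_candidates(incidents))
```

A flagged CI is only a starting point for investigation; deciding whether there is a genuine underlying problem still falls to problem control.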
Completing Major Problem Reviews Major problem reviews occur when the IT infrastructure has stabilized following a major problem. In this review there will be a description of the known error and what caused it. An example could be that the known error was a scheduled job that didn't run, which was caused by a change to the production schedule. The major problem review will identify the actions that were taken to resolve it and what future considerations will be taken to prevent this from happening again.
Benefits
Increased IT organizational service quality:
o More effective and efficient handling of incidents, problems and known errors will result in the increase of IT service quality
o Improved IT and Business relationship
o Improved management information
Reduction in the number of incidents and problems will improve user productivity.
Permanent solutions:
o Solutions that are put in place by problem management will become part of the infrastructure and will be used by 1st level support to increase the likelihood of resolution at 1st call.
Exercises Problem Management (Process)
The ABC Cookie Company has set up the Problem Management process. A number of incidents have been reported to the Service Desk regarding oven 1 shutting off due to the breaker tripping in the power room. Incident management, as a work-around, instructed maintenance to go into the power room and reset the breaker. After problem management used the "problem control" activity it was determined that the new oven installed last week was put on the same breaker.
1. What would you consider the incident to be? _______________________________________
2. What would you consider the known error to be? ___________________________________
3. What would you consider the request for change to be? _____________________________
4. Match the following two columns:
First column:
Incident
Problem
Known Error
Second column:
Someone changed the start time to a job
Same job failing every day at the same time on the same code
Reports aren't being produced
5. Give another example of incident, problem and known error
6. What information does Problem management give to the Service Desk?
Answers
1. Incidents are oven 1 shutting off by itself
2. Known Errors are 2 ovens on the same breaker
3. Possible RFC would be to add another breaker for the new oven
4. Match the columns:
Incidents = Same Job keeps failing with same code
Problem = Reports are not being produced
Known Error = Someone changed the start time of a job
5. Give your own examples of incident, problem and known error
6. Problem Management gives the Service Desk "Known Error" Information
Change Management Change management exists to ensure that all changes introduced into the IT infrastructure do not negatively affect service levels. The changes have to be done using standardized methods and procedures in an efficient and prompt manner to minimize impact.
Terms and Definitions
Term - Definition
RFC (Request for Change) - A request for change is submitted for any adds, moves or changes to any CI in the infrastructure.
RFC LOG (Request for Change Log) - All requests for change are given a call tracking number so that each RFC can be monitored and referenced. The information contained in the log will be used at post change meetings and any future requests showing similar requirements.
FSC (Forward Schedule of Change) - Information obtained from change management about any scheduled upcoming changes. This information is vital to Projected Service Availability (PSA).
CAB (Change Advisory Board) - A membership of people ranging from the Change Manager to support staff and Selected Subject Matter Experts (SME). These people meet regularly to approve and assess change requests. Members can change from meeting to meeting depending on the type of RFC.
CAB/EC (Change Advisory Board Executive Committee) - Selected members of the CAB committee that will be contacted in extreme or urgent situations to assess and approve changes that are outside of the time lines agreed to for changes.
Board (Senior Management Board Level) - Senior level management that makes strategic decisions resulting in possible major changes to the infrastructure. When approved, these changes will be passed to the CAB for authorization and scheduling.
CM (Change Manager) - A role that has defined responsibilities:
o "Cradle to grave" ownership of all RFCs, who sees them
o Owner and chair of the CAB meetings and selects the attendees that will be required for each meeting
o Issues FSCs via the Service Desk
PSA (Projected Service Availability) - This report is a result of the FSC. It contains details of changes to availability in relation to SLAs. This information is also used in the development of new SLAs.
What is a Change? A change is an action that alters the status of a CI that is found within the IT Infrastructure (CMDB).
Change Management Activities Filtering The change manager has to assess all RFCs that are presented. The assessment is based on the content of the RFC document; that is, is there enough information on the RFC to move forward with the change? Sufficient information will ensure the change can be implemented with limited disruption to service levels.
The following questions will have to be answered by the Change Manager:
Has this type of request been submitted before? The change manager will check to see if an RFC has been submitted in the past. Was the previous submission rejected? Why? Was it successful? Why or why not? The change manager needs to understand the direction of the proposed change.
This is only possible if a standard RFC document is created and used by all IT departments. If the RFC is complete and is accepted by the change manager, it will be put forward to the CAB. However, acceptance does not equal authorization. If the change is small enough, the change manager can authorize it himself or herself; in that case the change manager should inform the CAB of the changes that were authorized.
Allocating priority and classification
Now that the change manager has accepted the RFC, it has to be classified and prioritized. Every company will establish its own "change model scope" that will identify classification. As in incident management, classification will be based on the impact and urgency of the change request. Changes can range in size from minuscule to mega.
Change request priorities are based on the impact of the problem and the urgency of the remedy:
Immediate - Mission critical system affected, or a large number of users affected. Immediate action required (CAB/EC)
High - Severely affecting critical business functions and must be given highest priority
Medium - No severe impact but correction needs to be completed prior to next release or upgrade
Low - Change is justified but urgency for completion is low
Authorizing
Change managers can authorize minor changes but should inform the CAB of changes that were authorized.
The CAB will authorize medium or significant changes.
Senior Level (Board of Directors) will authorize major changes on a strategic level. After they authorize the major changes, authorization will then go to the CAB for review.
The authorizing bodies will make an impact assessment on all change requests. The assessment will be based on the following criteria:
Impact upon Business
Impact upon SLA Targets
Other Services
Impact of 'not doing'
Resources and Costs (Time and People)
Current FSC and PSA
Release Management (What effect change has on release)
Ongoing Maintenance of Resources
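The authorization rules in this section can be sketched as a small routing function. This is only an illustration: the names ChangeCategory and authorization_path are hypothetical and do not come from any change management tool, and the urgent flag stands in for whatever guidelines the company has agreed for urgent or emergency changes.

```python
from enum import Enum
from typing import List


class ChangeCategory(Enum):
    """Change categories named in this section (hypothetical enum)."""
    MINOR = "minor"
    SIGNIFICANT = "significant"
    MAJOR = "major"


def authorization_path(category: ChangeCategory, urgent: bool = False) -> List[str]:
    """Return the bodies that handle an RFC, in the order they act."""
    if urgent:
        # Urgent or emergency changes are assessed by the CAB/EC.
        return ["CAB/EC"]
    if category is ChangeCategory.MINOR:
        # The change manager authorizes and then informs the CAB.
        return ["Change Manager", "CAB (informed)"]
    if category is ChangeCategory.SIGNIFICANT:
        return ["CAB"]
    # Major changes: senior management authorizes, then the CAB reviews.
    return ["Senior Management (Board)", "CAB (review)"]


if __name__ == "__main__":
    for cat in ChangeCategory:
        print(cat.value, "->", authorization_path(cat))
```

Keeping the rules in one function makes it easy to compare them with the company's own change model when that model differs from the one described here.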
Change Co-Ordination Throughout the whole change co-ordination process the configuration management owner is kept up to date through updates to the RFC log. The change manager directs any updates to the RFC log to the configuration management process owner who owns this RFC log.
BUILD The change manager will ensure that the correct resources are assigned to build the change. This information will be contained within the RFC and will either identify certain individuals or groups.
TEST
Testing will be carried out by the requester after the change build is complete. The testing specifics should be contained within the RFC. The builder will only test emergency or urgent changes if time permits.
IMPLEMENT After the testing has been completed to the satisfaction of the requester/builder, the change manager arranges the implementation of the change into production.
Post Implementation Review After the change has been put into production, the change manager will do a post implementation review to see how the change process worked with regard to the implemented change. Some questions to be answered in the review are:
What caused the need for the change? If the change was a result of a problem then the problem management will get involved in the implementation review process. What can be done to avoid the problem that caused this change?
Dealing with Urgent or Emergency Changes There will be times when a change will be considered urgent or an emergency. The change manager, when confronted with this type of change, will initiate the process in which the CAB/EC (Change Advisory Board Emergency Committee) will convene to assess the change. The CAB/EC will consist of selected members from the CAB who have the accountability and authority to make higher-level decisions. Clear guidelines with an associated process need to be established as to what is considered an urgent or emergency change. The CAB/EC will meet when alerted. They will quickly assess the situation, then authorize and co-ordinate the change. After the change has been completed and implemented, the CAB/EC will do a post implementation review.
Impact and Resource Assessment
Impact change has upon business. Careful consideration is necessary regarding the change and the business units affected by the change.
Change impact upon Infrastructure SLA targets.
Impact on other services.
Impact on non-IT services.
Effect of not doing change.
Resources and costs (time and people).
Current FSC and PSA.
Benefits
Alignment - the change manager will align all IT services with business needs based on the changes being put forward. The change manager will need to understand the impact of every change on the business.
Increased productivity of both users and IT personnel:
  o Users - Better quality changes with fewer disruptions
  o Personnel - The change manager will ensure that the right support personnel are working on the changes, resulting in better use of resources
Risk - since change management assesses and filters all RFCs, the risk of the implementation of bad changes is reduced.
Quality reports on information related to changes. Change management will create reports on all changes, which will include:
  o The number of changes
  o Category/type
  o Successful/non-successful
Increased volume of changes - with a strong change management process, there will be the ability to absorb a high volume of changes.
Exercises Change Management (Process)
The ABC Cookie Company has set up the change management process. A number of requests for changes have been submitted to the change manager.
1. What activity involves assessing the RFC? _____________
2. Who authorizes significant changes? _____________
3. Can the change manager make changes to the CMDB? ______
4. Put the following activities in the order of occurrence:
A. Change co-ordination
B. Allocating priority and classification
C. Filtering
D. Authorizing
E. Post implementation review
1) __, 2)__, 3) __, 4) __, 5) __
5. Can changes take place without testing? ________
6. A minor change is a change category. Name the other three. A. ______________ B. ______________ C. ______________
Answers
1. Assessing an RFC is a Filtering activity.
2. Significant changes are authorized by the CAB. Major changes are authorized by Senior Level (Board).
3. A change manager can authorize changes but not make changes.
4. 1C, 2B, 3D, 4A, 5E.
5. Testing prior to a change should always occur. If there is no time then a change can be done followed by the testing.
6. Change Categories (Minor, Major, Significant and Emergency).
Release Management The goal of release management is to ensure that all requests for change take into consideration both the technical and non-technical aspects. For example, when an RFC involves upgrading the operating system from Windows 98 to Windows XP, release management will ensure that every workstation will receive software (Windows XP), hardware requirement updates (RAM or hard drive space), documentation (Windows XP manual) and training. All four of these change components need to be coordinated so that the user service levels aren't interrupted.
Terms and Definitions
Term
Definition
DSL (Definitive Software Library)
All authorized versions of software CIs must be kept in a protected and secure area. This could be a vault or locked cabinet for vendor software that is contained on tapes, CDs or cartridges. Internally developed software would be kept on a secure server, which only authorized personnel can access. All information about the software must be kept in the CMDB.
DHS (Definitive Hardware Store)
Surplus hardware must be stored in a protected, secure area. All information related to the hardware must be kept in the CMDB.
SCI (Software Configuration Item)
A configuration item that is software based and either developed internally or purchased externally.
Release Categories: There are three main types of release categories:
Major software and hardware upgrades
Minor software and hardware upgrades
Emergency software and hardware fixes
Release Types
Delta Release - a delta or partial release will contain CIs that are individual components or modules, which are part of a larger suite of applications. For example, Microsoft Word is a component of the Microsoft Office Suite.
Full Release - all components/modules or a complete suite of applications.
Package Release - two delta releases or a delta and a full or any combination of both.
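The three release types can be told apart mechanically once the components of a suite are known. The sketch below is an illustration only; the function names classify_release and is_package_release are hypothetical, and the component list for the suite is an example rather than an inventory.

```python
def classify_release(released, suite):
    """Return "full" when every component of the suite is released,
    or "delta" when only some of them are."""
    released, suite = set(released), set(suite)
    if not released or not released <= suite:
        raise ValueError("released components must be a non-empty subset of the suite")
    return "full" if released == suite else "delta"


def is_package_release(releases):
    """A package release bundles two or more individual releases
    (delta, full, or a mix of both)."""
    return len(releases) >= 2


if __name__ == "__main__":
    office_suite = {"Word", "Excel", "PowerPoint"}  # example component list
    print(classify_release({"Word"}, office_suite))      # one module only
    print(classify_release(office_suite, office_suite))  # the whole suite
    print(is_package_release([{"Word"}, office_suite]))  # a delta plus a full
```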
Activities
Release policy and planning
This policy will cover the number and frequency of releases in certain business units. All releases must have a unique identifier so that configuration management can maintain information on them. At this point, people's roles and their levels of authority and responsibility have to be clarified.
Designing, building and configuring a release
Planned and documented release procedures should be used for all software releases. If possible, reusing a previously implemented release procedure is recommended. Points to consider:
Release definition
Release plan
Fit for purpose testing and release acceptance
This is testing that should be done prior to any release. A test group could be used to test the actual release and to perform any stress testing that is needed.
Rollout planning
Time lines, identifying the CIs being affected, documentation, training and communication planning.
Communication, preparation and training
Initiating the communication campaign, preparing departments and affected CIs for the release, and initiating the training program per the rollout plan.
Distribution and installation
Initiating the phased distribution and installation of software into the infrastructure.
Release Policy
Should include:
Clarification of roles and responsibilities for release management.
Could be a single document or a number of documents, one for each supported system or IT service.
Should be used for:
Large or official hardware roll-outs.
Major software roll-outs.
Bundling or batching related sets of changes.
CMDB and DSL
CMDB (Configuration Management Database)
DSL (Definitive Software Library)
DSL Considerations
Media
Naming convention
Environments supported
Security arrangements
Scope
Retention period
Audit procedures
Release Lifecycle Management
Development Environment
Release policy
Release planning
Design and develop or order and purchase software
Controlled Test Environment
Build and configure release
Fit for purpose testing
Release acceptance
Rollout planning
Communication, preparation and training
Live Environment
Distribution and installation
Benefits
Improved quality of services due to successful releases
Successful releases improve the quality of service by improving the hardware or software within the infrastructure. They also reduce the possibility of outages created by releases that were not properly managed and that would have had to be backed out.
Planned use of resources
With the ability to track all software and hardware via the CMDB, DSL and DHS, all resource usage is maximized. The correct amounts of hardware and licenses are accounted for and are at the disposal of authorized change management requests.
Managed expectation levels
Using release management, the user will receive hardware, software, documentation and training at a scheduled time (of which they are informed).
Improved historical data concerning releases
Historical data will be kept on all releases as a knowledge base. When releases are requested in the future, there will be data to show how past releases went.
Improved control of software and hardware assets
Release management, via the DSL and DHS, will know exactly where all hardware and software CIs are. This applies to assets that are owned or leased.
Reduced risk of unauthorized or illegal software
Release management will only authorize licensed software, so pirated or illegal software copies will not be a concern.
Exercises Release Management (Process)
The ABC Cookie Company has a release management process. A request came in from Change Management that there is going to be a new release of the chocolate chip module of the chocolate chip recipe.
1. What type of release is this? (delta, full or package) _____________
2. Where would the physical change be kept? _____________
3. Where would information about the recipe be kept? _____________
4. Which of the following is not part of the development environment?
A. Release policy
B. Roll out planning
C. Release planning
D. Design and development
5. What is the relationship between the DSL and the CMDB? _________
6. What are some of the benefits of having release management in your environment? a. ______________ b. ______________ c. ______________
Answers
1. Delta
2. DSL
3. CMDB
4. Rollout planning
5. CMDB - Identifies the location of an asset. DSL - Is the physical location.
6. See Benefits in the Release Management section
Service Level Management
Overview The goal of Service Level Management is to:
Establish a better relationship between IT and the customer Maintain and improve customer service
Establish a true partnership in which both the customer and the IT provider have accountabilities
Terms and Definitions
Term
Definition
SLM (Service Level Management)
The monitoring of required service levels.
SLA (Service Level Agreement)
Specific targets identified by SLM for each unit within the IT organization.
SLC (Service Level Contract)
Specific targets identified by SLM for each unit within an external IT supplier.
OLA (Operation Level Agreement)
Specific targets for the service being supplied by internal service providers (Network services, LAN services, and so on).
UC (Underpinning Contract)
Specific targets for the service being supplied by an external service provider (such as, G.E. Capital, Decision One).
Service Catalogue
A collection of all the services being provided and the customers of each.
SLR (Service Level Requirements)
SLM will ask each IT customer what his/her requirements are. This will be embedded into the SLA.
SIP (Service Improvement Program)
After the review of an SLA, service improvements may be necessary. A service improvement plan will be designed and actioned.
Service Level Management Process Activities
Establish the function
Initial planning activities:
  o Mission statement
  o SLM scope
  o Objectives
Plan monitoring capabilities:
  o Tools
  o Incident Control System
Planning and implementation of the service catalogue.
Establish customer perception of current service levels by surveying if no other method of monitoring is in place.
Implementing the SLA
Produce the service catalogue of all services being provided by IT.
Draft an initial service level agreement with the customer based on the service catalogue and the customer's service level requirements.
Expectation management (satisfaction = perception - expectations).
Negotiate the service level agreement based on the contents of the service catalogue.
Review all the UCs and OLAs to ensure they support the SLA. For example: if the service level agreement requires 7x24 coverage, ensure that the OLA for Operations is 7x24.
Produce an agreement between the customer and the service provider.
Define reporting and reviewing procedures (duration and frequency of review meetings).
Ensure that the SLA's existence is publicized.
Managing the ongoing process
Monitor the statistics based on the SLA for the user. If it can't be measured then it can't be monitored.
On a monthly basis, report the results of how the IT infrastructure is performing against the SLA.
Review the results. If improvement is necessary, implement a SIP.
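The monthly monitor, report and review cycle described above can be sketched as a simple availability check. This is a minimal illustration under stated assumptions: the function names, the 99.9% target and the 120 minutes of downtime are all hypothetical, and a real SLA would usually track several other targets as well.

```python
def availability_percent(agreed_minutes: float, downtime_minutes: float) -> float:
    """Percentage of the agreed service time during which the service was up."""
    if agreed_minutes <= 0:
        raise ValueError("agreed service time must be positive")
    return 100.0 * (agreed_minutes - downtime_minutes) / agreed_minutes


def monthly_review(target_percent: float, agreed_minutes: float,
                   downtime_minutes: float) -> dict:
    """Compare achieved availability with the SLA target and flag
    whether a SIP should be considered."""
    achieved = availability_percent(agreed_minutes, downtime_minutes)
    return {
        "achieved": round(achieved, 2),
        "target": target_percent,
        "sip_needed": achieved < target_percent,
    }


if __name__ == "__main__":
    # Assumed figures: 7x24 coverage over a 30-day month, 120 minutes of outage.
    minutes_in_month = 30 * 24 * 60
    print(monthly_review(99.9, minutes_in_month, 120))
```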
Periodic reviews
Reviews of all SLAs need to be scheduled with the customers at regularly occurring intervals (monthly, for instance).
Reviews of all OLAs and UCs are needed to ensure that they are still valid and support the SLA.
The Service Catalogue
Introduction
List of all standard services provided
List of all specialized business services (if any)
Service support matrix
Contact persons
Information on IT department
SLA Contents
Introduction
Service hours
Availability
Reliability
Support
Throughput
Transaction response times
Batch turnaround times
Change
IT service continuity and security
Charging
Service reporting and reviewing
Performance incentives/penalties
Benefits
Both the customer and IT provider clearly understand their roles and responsibilities:
  o Both parties agree to levels of service and information requirements from each side
  o Communications will be scheduled
Service level targets will be set:
  o The agreed targets will be documented and measured
Service levels will be monitored and reviewed:
  o All metrics regarding service levels will be supplied to both the service level manager and the customer. This information will be reviewed and a SIP will take effect if necessary.
SLR will drive IT services.
OLAs and UCs will be better aligned with the business requirements.
Exercises Service Level Management (Process)
The ABC Cookie Company has a service level management process in place. An agreement has been established with the service desk regarding the type of support that will be given to the bakers. The bakers have contracts with a delivery company to ensure the pick up and delivery of the cookies. The marketing people have an agreement with the ABC Cookie Company's IT department for connectivity.
1. Which part of the above case study is an example of:
A. Operational level agreement _______________________
B. Underpinning contract ___________________________
C. Service level agreement ___________________________
2. Which of the following is not a service level management activity?
A. Establish a function
B. Implementing the SLA
C. Periodic reviews
D. Detecting and recording
3. What are some of the benefits of service level management? a. ______________ b. ______________ c. ______________
Answers
1. Examples of...
A. The agreement between the Service Desk and the cookie company
B. The bakers have contracts with a delivery company
C. The marketing people have an agreement with the ABC Cookie Company's IT department for connectivity
2. Detecting and Recording is a Service Desk activity
3. See Benefits in Service Level Management
IT Service Continuity Management
ITSCM IT Service Continuity Management supports Business Continuity. By establishing an IT supporting infrastructure following a disaster or major service disruption, continuity of services can be assured.
BCM Business Continuity Management concerns itself with all aspects of the business service recovery. ITSCM is a part of this. The goal of IT Service Continuity Management (ITSCM) is to support business continuity and, in the event of a crisis, recover the required technical services and facilities within an agreed time frame.
Why ITSCM? The main reason to have ITSCM in place is to ensure business survival by reducing the impact of a disaster or major failure. A plan will analyze risks and vulnerability and put in place measures to reduce them. ITSCM will produce an IT recovery plan, which will prevent the loss of customers and users due to lack of confidence.
How to make ITSCM successful:
Review the configuration and change management processes to ensure they are current.
Ensure the whole organization is aware of the ITSCM.
Use the latest technology and software that supports the process.
Train individuals with specific responsibilities associated with the ITSCM.
Test the plan to ensure that it will work when needed.
Sources of Crisis Companies may face a crisis (a disruption of service that exceeds limits set out within an SLA) due to hardware or software malfunction, human error, natural disaster or viruses.
Hardware malfunction Hardware malfunctions have been the leading source of crisis over the past 10 years. There are a number of reasons this may occur, ranging from the quality of the product to the technology being used to incorrect specifications.
Human error Human error is a large cause of disasters. This can be attributed to lack of training, mistakes or malicious behavior.
Software malfunction Software malfunctions can be caused by coding errors, improper versions being introduced into a production environment or changes being introduced without going through change management.
Virus Software code specifically designed to hinder or destroy systems. Usually introduced via:
Floppy diskette
CD ROM
Internet
Natural disasters
Earthquakes
Floods
Tornadoes
Volcanoes
Responsibilities The business needs to understand the options of ITSCM, as the "buy in" from the business is vital to the success of ITSCM. All possible solutions should be clearly documented and explained, including costs and time lines. It is necessary to explain to the business what the consequences are of either delaying the implementation of a plan or not having a plan in place at all. The roles and responsibilities have to be documented and explained to management and individuals involved.
BCM Stage One - Initiation
Establish a policy The owner of the BCM process will need to ensure all members of the BCM team are aware of their roles, responsibilities, intentions and objectives
Identify terms of reference and scope Clearly define the roles of management and staff through a risk assessment and business impact analysis of their respective departments and how the departments should be run following a disaster.
Allocation of resources Both financial (costs) and human resources (skill set) need to be considered to be able to do the analysis needed in stage two.
Definition of project organization and control structure Both ITSCM and BCM projects are complex in nature and need to be structured using a software tool (such as MS Project) to track time lines, accomplishments and tasks/responsibilities. Agreement of project and quality plans will need to be completed by all necessary parties.
BCM Stage Two - Requirements and Strategy
Business impact analysis
  o Identify critical business processes and what damage or loss to the organization would occur if that business process was disrupted
  o Identify the type of loss (for example, financial, increased cost of running the business and/or intangible)
  o Identify if the outage would cause more outages if not corrected soon
  o Identify the minimum requirements for staff, technical ability and facilities needed to restore services
  o Time lines to recovery need to be identified
Risk Assessment
Identify risks (assets that are a vital business function)
Assess threats and vulnerabilities
  o Security (from terrorism)
  o IT Services being below ground near a large water source
  o Locations close to potential dangers, like mountains
Levels of risk based on vulnerability and threat, defined by the BCM owner
  o Low
  o Medium
  o High
Business Continuity Strategy (countermeasures)
Using the information collected from the impact analysis and assessment, a strategy is put in place for risk reduction. Both a risk reduction and a recovery strategy are required and should complement each other. A cost versus benefits analysis should also be completed. Would it cost more to put a plan in place than the recovery itself?
Counter measures
  o Manual Backup - Usually help desks, call centers or order takers can use this where paper alternatives can be initiated until the system is back up
  o Reciprocal Arrangements - A situation where two similar organizations share the cost of an off-site solution
  o Gradual Recovery (Cold Standby - Greater than 72 hours)
  o Intermediate Recovery (Warm Standby - Between 24 - 72 hours)
  o Immediate Recovery (Hot Standby - Within 24 hours)
  o Do Nothing (When the cost of having a recovery solution outweighs the risks)
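The three standby options differ mainly in how quickly service must be restored, so the choice can be sketched as a lookup on the required recovery time. This is an illustration only: the function name recovery_option is hypothetical, and treating exactly 24 hours as hot standby is an assumption, since the time bands above do not say which side the boundary falls on.

```python
def recovery_option(required_recovery_hours: float) -> str:
    """Suggest a standby option from the time within which the
    business needs the service restored."""
    if required_recovery_hours <= 0:
        raise ValueError("required recovery time must be positive")
    if required_recovery_hours <= 24:
        # Assumption: exactly 24 hours counts as "within 24 hours".
        return "Immediate Recovery (Hot Standby)"
    if required_recovery_hours <= 72:
        return "Intermediate Recovery (Warm Standby)"
    return "Gradual Recovery (Cold Standby)"


if __name__ == "__main__":
    for hours in (4, 48, 96):
        print(hours, "hours ->", recovery_option(hours))
```

The cost versus benefits analysis still applies on top of this: an option that fits the time line may be rejected in favour of "Do Nothing" if the solution costs more than the risk it covers.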
BCM Stage Three - Implementation
Organization planning
Executive - Authority level (senior management) responsible for crisis management
Co-ordination - Level below executives, responsible for the recovery effort co-ordination
Recovery - Teams within critical business functions responsible for executing the plans
Implementation planning
Co-ordination of the following plans: Emergency response, Damage assessment, Salvage, Vital Records and Crisis Management/Public relations
Implement
Risk reduction measures
  o UPS, fault tolerant systems for critical applications, off-site storage and archiving, disk mirroring and spare equipment
Stand-by arrangements
  o 3rd party recovery sites like Comdisco or SunGard
  o Fully equipped company owned stand-by location
  o Installing a stand-by computer system
Develop
ITSCM plans
  o Administration (who, what, when and where). The right people need to be identified as to who will be involved in the ITSCM plan
  o IT infrastructure (hardware, software, and documentation to support the business). The documentation needs to be written so that anyone who is not familiar with the system can use it for system recovery
  o Personnel (People related to staff and accommodation)
  o Security (Instructions for fire, explosions, first aid, retrieving backups, and so on)
  o Alternative location (Information - people, address, facilities, and so on)
  o The plan will include actions to bring the IT infrastructure back to its operational state
Procedures should include:
  o Hardware and network installations and testing
  o Reference points to guide the restoration of software and data
  o Procedures to consider time zones and multinational organizations
  o What is considered a business cut off point
BCM Stage Four - Operational Management
Education and awareness
  o All staff within the IT organization should be aware of the ITSCM and BCM and their roles
Training
  o In recovery procedures and responsibilities
Review
  o The review of the ITSCM and BCM to ensure they are current and valid
Testing
Testing should be done:
  o When the plan is complete
  o Once a year
  o After every major change to the Infrastructure
Change control
  o All changes to the ITSCM or BCM must follow Change Management Process rules. The ITSCM and BCM are considered CIs within the Configuration Management Database
Benefits of ITSCM
Potential for lower insurance premiums due to proactively managing the business risks.
Business relationships improve due to closer working relationships and increased understanding of risks and dependencies.
Positive marketing of contingency capabilities results in higher levels of non-interrupted service.
Organizational credibility improves with the addition of ITSCM and BCM.
Competitive advantage due to the ability to reduce risk and incorporate business safeguards.
Security Management The company needs to identify a person or team that will be responsible for either the long or short-term security goals. This person or team will be responsible for the following:
The creation of information security processes and procedures
Identifying potential security incidents and an action plan to deal with them
Establishing security awareness with regard to the impact of lost information
Security levels
Security Processes and Procedures The person or team responsible for Security Management needs to work with the Business Management Team and IT to determine what policies should be put in place to prevent security breaches, what processes need to be developed to avoid such breaches, and what actions need to be taken if a security incident occurs.
Security Awareness Sharing of information with the users of IT regarding the consequences of lost data due to hardware failure or malicious actions. A user will only be aware of the impact of a lost hard drive after they either experience it themselves or are informed.
Security Incidents A security incident is an alert to the possibility that a breach of security may be taking, or may have taken, place. All security incidents need to be logged so that they can be evaluated later when needed.
Security Levels:
Top Secret - Documents that are of an extremely sensitive nature. The availability of such documents is kept to key company individuals.
Highly Confidential - Documents that are of a sensitive nature and that contain information on company process or on the competition.
Proprietary - Information that should be used by specific individuals within an organization, such as Project Plans.
Internal Use Only - All information used between companies or departments, such as Operation Level Agreements.
Public Domain - All information approved for public viewing.
Exercises IT Service Continuity Management (Process)
The ABC Cookie Company has invoked an ITSCM process.
1. Match the following 2 columns:
A. Stage 1 of BCM    Operational management
B. Stage 2 of BCM    Initiation
C. Stage 3 of BCM    Requirements
D. Stage 4 of BCM    Implementation
2. Which of the following is not a counter measure?
A. Manual backup
B. Gradual recovery
C. Security
D. Do nothing
3. What are some of the benefits of IT Service Continuity Management? a. ______________ b. ______________ c. ______________
Answers
1. Match the columns: Stage 1 Initiation, Stage 2 Requirements, Stage 3 Implementation, Stage 4 Operational Management
2. Security is part of Availability Management
3. See Benefits of ITSCM
Financial Management for IT Services The goal of financial management is to provide formal cost effective stewardship of the IT assets and the financial resources being used to provide IT services. This will have to be done in an efficient, economical and cost effective way.
Financial Management Activities Budgeting Financial management supports the planning and execution of business objectives by predicting and controlling the spending of money. Budgeting usually goes through a period of negotiating cycles (usually yearly) while being monitored daily.
Predicting the finances needed to operate IT service for a period of time. Using historical data and industry standard predictions, expenses need to be established in order to set the budget.
Budget planning is usually done once a year and takes into account tabled projects and the current cost of service levels.
Budgeting is used to monitor performance against pre-defined targets to ensure service levels can be maintained all year. These reviews are on-going to ensure that adjustments are possible prior to year end.
Review between Business and IT, calculating the cost of IT service provisions: Discussions are necessary to determine service level requirements, the costs associated with them, anticipated availability requirements and asset costs needed.
IT accounting
Making cost-effective decisions on service provision requires assessing the service being provided against its cost. For example, "Is it more effective to outsource a service or keep it internal?"
Putting financial accountability into the hands of managers ensures that every cost decision is justified with a business case, like any other business investment.
IT accounting provides the ability to measure under- or over-usage in financial terms. For example, "We were $10,000 over budget on IT expenditures this year".
IT accounting also provides an understanding of the cost of changes and the implications of not taking advantage of business expenditure. "What is the cost if we do it? What is the cost if we don't do it?"
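The measurement of under- and over-spending described above can be illustrated with a short calculation. The following is only a minimal sketch in Python: the cost types and all figures are hypothetical, chosen so that the total matches the $10,000 overspend quoted in the example.

```python
# Hypothetical budgeted and actual spending per cost type (illustrative only).
budget = {"Hardware": 120_000, "Software": 80_000, "People": 300_000}
actual = {"Hardware": 126_000, "Software": 78_000, "People": 306_000}


def variance_report(budget, actual):
    """Return {cost type: actual - budget}; a positive value means over budget."""
    return {name: actual[name] - budget[name] for name in budget}


report = variance_report(budget, actual)
total_variance = sum(report.values())

for name, diff in report.items():
    status = "over" if diff > 0 else "under" if diff < 0 else "on"
    print(f"{name}: {abs(diff):,} {status} budget")
print(f"Total variance: {total_variance:+,}")
```

A per-type breakdown like this shows where the overspend arose, which the single total does not.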
Charging
Charging is a means of recovering costs for services rendered. Since customers pay for services, they have the right to influence decisions on the service. Therefore, IT services should be operated as a business unit. The price can be set in several ways:
Cost - the price equals the cost of the services rendered
Cost-plus - the cost of the services rendered plus a percentage of profit
Going rate - a price comparable to that of other internal departments or similar external organizations
Market rate - a price that matches what external suppliers charge
Fixed price - a set price for a period of time based on anticipated usage
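The pricing methods listed above can be sketched as a single function. This is an illustration under stated assumptions rather than a prescribed calculation: the function name, its parameters and the figures in the usage lines are hypothetical. The going rate, market rate and fixed price are treated as externally supplied reference prices, because the text does not say how they are derived.

```python
def price_for_service(cost, method, markup_pct=0.0, reference_price=None):
    """Return the charge for a service under one of the pricing methods.

    cost            -- what it costs IT to provide the service
    markup_pct      -- profit percentage, used only by "cost-plus"
    reference_price -- externally supplied price, used by "going rate",
                       "market rate" and "fixed price" (an assumption here)
    """
    if method == "cost":
        return cost
    if method == "cost-plus":
        return cost + cost * markup_pct / 100
    if method in ("going rate", "market rate", "fixed price"):
        if reference_price is None:
            raise ValueError(f"{method} needs a reference_price")
        return reference_price
    raise ValueError(f"unknown pricing method: {method}")


# Hypothetical figures: a service costing 1,000 to provide.
at_cost = price_for_service(1000, "cost")
with_profit = price_for_service(1000, "cost-plus", markup_pct=10)
at_market = price_for_service(1000, "market rate", reference_price=1250)
```

Here `with_profit` is the cost plus a 10% profit, while `at_market` ignores the internal cost entirely and follows the external price.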
All IT costs fall into the following types:
External services
Outsourced Help Desk, DRP providers, contract companies
Software
Databases, operating systems, incident control systems
People
Payroll, benefits, fees associated with staff
Transfer
Internal charges from other cost centers: Help Desk, LAN Services, and Operations
Hardware
Mainframes, servers, PCs
Accommodations
Building, Offices, Utilities, storage areas
Cost Elements
Costs can be broken into two categories: capital or operational.
Capital
The outright purchase of a software package or piece of hardware would constitute a capital cost.
Operational
The day-to-day running of the software or hardware, including staffing, utilities and so on, would constitute operational costs.
Each cost is also part of one of three elements: Direct Costs, Indirect Costs and Unabsorbed Indirect Costs.
Direct costs
Costs that can be traced to a single cost centre or department. For example:
Servers
Applications used exclusively by a single cost centre
Indirect costs
Costs of providing a service that is shared, so they have to be apportioned across customers. For example:
Operations staff
Network or technical services
Unabsorbed indirect costs
Costs that cannot be assigned to a particular customer. For example:
IT management
Building or facilities
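The three cost elements above can be illustrated with a small apportionment calculation. This is a minimal sketch: the customer names, figures and the choice of usage share as the apportionment key are all hypothetical, and the text does not prescribe how indirect costs are divided or how unabsorbed costs are recovered.

```python
def charge_per_customer(direct, indirect_total, usage_share):
    """Direct costs go to the customer that incurred them; indirect costs
    are apportioned by each customer's share of usage (an assumed key)."""
    return {
        customer: direct.get(customer, 0) + indirect_total * share
        for customer, share in usage_share.items()
    }


# Hypothetical figures for two cost centres.
direct = {"Sales": 40_000, "Finance": 25_000}     # e.g. dedicated servers
usage_share = {"Sales": 0.75, "Finance": 0.25}    # shares must sum to 1
indirect_total = 20_000                           # e.g. operations staff
unabsorbed = 10_000                               # e.g. IT management

charges = charge_per_customer(direct, indirect_total, usage_share)
recovered = sum(charges.values())
total_cost = sum(direct.values()) + indirect_total + unabsorbed
shortfall = total_cost - recovered  # the unabsorbed costs remain unrecovered
```

In this sketch the shortfall equals the unabsorbed indirect costs, which is why a charging policy has to decide separately how those costs are covered.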
Developing a Charging System
Scope
The scope includes informing the business of the need for putting the system into production. The charging system should:
Determine the right charging policy for the organization.
Recover the agreed costs of services fairly and accurately.
Shape the behavior of customers to ensure the best return on IT investments.
Charging policies will:
Force the business units to control their own user demands.
Reduce overall costs and highlight areas that are not cost effective.
Match internal services to justifiable business needs through direct funding.
Recovery of costs
The recovery of costs should be simple, fair and realistic.
Simple
Less bureaucracy with improved overall cost-effectiveness.
Fair
Ensure that the charge for services provided reflects market value.
Realistic
The charging policy must be designed to adjust the behavior of the business. If it is perceived that the cost of the service is too low then it will be exploited by the business. Therefore, realistic charges have to be implemented.
Customer behavior
Informing customers of costs can adjust customer and user behavior so that better use is made of IT resources. Informing customers and users of the costs associated with the services being rendered also makes them better corporate citizens. This serves two purposes:
Reducing the inefficient use of IT resources.
Better controlling the peaks and valleys of IT usage by adjusting charges based on usage and times of usage.
Costing, Charging and Budgeting Cycle
An IT operational plan contains information from the business IT requirements and the financial targets dictated by senior management. The outcome of this sub-process is fed into a cost model, where the costs are analyzed to determine which charging policies should be administered. The proposed charges are then fed back to the business.
Benefits
Reduced long-term costs through budgeting, monitoring and controlling expenditures.
Increased confidence in setting and managing budgets. The controls implemented by financial management improve accuracy and professionalism.
Cost justification for all expenditures, because IT is run as a business unit.
Accurate cost information, which makes business decisions on investments clearer. For example, "How much did it cost to run that software, hardware or help desk? Can it be done cheaper by using a different technology or service provider?"
Better and more efficient use of IT resources, because the organization is made aware of the costs of doing business. This increases the professionalism of each IT business unit and makes it easier to perform market comparisons with alternate service providers.
Exercises
Financial Management for IT Services (Process)
The ABC Cookie Company has invoked financial management for its IT services. One of its goals is to provide cost-effective stewardship of IT assets.
1. Which of the following activities will best achieve the above goal?
A. Budgeting
B. IT Accounting
C. Charging
2. Match the following two columns:
Direct Cost    Building
Indirect Cost    Operational staff
Unabsorbed Indirect Cost    Server requirements
3. Which of the following is not part of IT Accounting?
A. Cost effective service provisions
B. Justification of expenditure
C. Cost Plus
D. Measuring under/over
4. True or false? Developing a charging system will shape customer behavior.
Answers
1. IT Accounting
2. Match the columns:
a. Direct Costs - Server requirements
b. Indirect Costs - Operational staff
c. Unabsorbed Indirect Costs - Building
3. Cost Plus is part of Charging
4. True
Capacity Management
Overview
The goal of capacity management is to ensure that current and future capacity and performance needs for business requirements are provided in a cost-effective way. This can be done by:
Monitoring the performance and throughput of all IT services and the supporting infrastructure components.
Tuning resources to make them more efficient.
Understanding the present and future demands on IT resources.
Using financial management to influence how customers use resources.
Producing a capacity plan that enables IT service providers to support the services required within the SLA.
Capacity Management Sub Processes
Inputs to Capacity Management:
Technology
SLAs, SLRs and Service Catalogue
Business Plans and Strategy
IS, IT Plans and Strategy
Business requirements and volumes
Operational schedules
Deployment and development plans and programs
Forward schedule of change
Incidents and problems
Service reviews
SLA breaches
Financial plans
Budgets
Business Capacity Management
Business Capacity Management is the process responsible for ensuring that future business requirements for IT services are considered, planned and implemented. Existing data describing how current resources are being used by all departments helps in forecasting future requirements. Business capacity management must be:
Responsive to change and changing requirements.
Aware of customers' SLRs.
Involved in both Change Management and Project Management.
Service Capacity Management
Service Capacity Management is the process responsible for managing the performance of operational IT services used by the customer. IT services are monitored and measured against the targets within service level agreements and requirements. These measurements are recorded and analyzed and, if necessary, action is taken on IT resources to improve performance. Drawing on the advice of specialists within resource capacity management, and making staff accountable for and knowledgeable in capacity management, will help improve resource performance.
Resource Capacity Management
Resource Capacity Management is the process that manages the individual CIs within the IT infrastructure. All resources with finite capacity are monitored and measured. The data collected is analyzed and, if necessary, action is taken to ensure that all business requirements are met.
Output from Capacity Management
Capacity plan
Capacity database
Baselines and profiles
Thresholds and alarms
Capacity reports (regular, ad-hoc and exception)
SLA and SLR recommendations
Costing and charging recommendations
Proactive changes and service improvements
Revised operational schedule
Effectiveness reviews
Audit reports
Capacity Management Iterative Activities
Monitoring of all CIs.
Data should be analyzed, using expert systems if possible, to compare usage against thresholds.
Recommendations made from the analysis:
o Balancing services
o Changing concurrency levels
o Adding or removing resources
Monitoring then starts again in a continuous cycle.
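The comparison of usage against thresholds in the cycle above can be sketched in a few lines. This is a minimal illustration, not a monitoring tool: the CI names, utilization figures and threshold values are hypothetical.

```python
def check_thresholds(utilization, thresholds):
    """Return, in sorted order, the CIs whose measured utilization
    meets or exceeds the threshold set for them."""
    return sorted(
        ci for ci, used in utilization.items()
        if ci in thresholds and used >= thresholds[ci]
    )


# Hypothetical measurements, expressed as a fraction of total capacity.
utilization = {
    "server-01 CPU": 0.92,
    "server-01 disk": 0.55,
    "LAN segment A": 0.81,
}
thresholds = {
    "server-01 CPU": 0.85,
    "server-01 disk": 0.80,
    "LAN segment A": 0.80,
}

breaches = check_thresholds(utilization, thresholds)
for ci in breaches:
    print(f"Threshold reached: {ci}")
```

Each breach would then feed the analysis step, where a recommendation such as adding resources or balancing services is made before monitoring resumes.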
Capacity Management Activities
Demand Management
Demand Management is an influencing activity intended to change the capacity usage habits of customers. It needs to understand which business units are utilizing the resources, to what level and how often. Once this information is collected, a demand management plan can be put in place to change the level of usage or the times at which resources are used, provided the business units can be influenced to do so.
Modeling
Modeling helps in predicting the behavior of IT services under certain conditions. Techniques include:
Trend analysis (historical performance of resources).
Analytical modeling (through use of a mathematical software program).
Simulation modeling (through use of a program that simulates computer processing; it helps in accurately sizing new applications but is time consuming).
Baseline models (reflecting the performance that is being achieved).
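Of the modeling techniques above, trend analysis is the simplest to illustrate. The sketch below fits a straight line to historical usage by least squares and extends it forward. The straight-line assumption, the function names and the monthly disk figures are all hypothetical choices for illustration; real usage may not grow linearly.

```python
def linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...
    Returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(values, periods_ahead):
    """Extend the fitted line periods_ahead periods past the last observation."""
    slope, intercept = linear_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


# Hypothetical disk usage in GB for the last six months.
monthly_disk_gb = [400, 420, 440, 460, 480, 500]
growth_per_month, _ = linear_trend(monthly_disk_gb)
in_six_months = forecast(monthly_disk_gb, 6)
```

With these figures the fitted growth is 20 GB per month, so the projection six months out is 620 GB, which can be compared with installed capacity when producing the capacity plan.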
Application Sizing
Application sizing is carried out at the project initiation and design stage for a new application. It takes into account what resources are needed for applications being built internally or purchased from a vendor. At the initiation stage, care has to be taken to ensure that enough capacity is allowed for growth (SLR).
Production of a Capacity Plan
The capacity plan documents the current levels of resource utilization and service performance. Forecasts of future resource requirements are made after the business strategy and plans have been analyzed. The plan should include:
Assumptions
Recommendations on resources required, cost, benefits and impacts
Benefits
Increased efficiency and cost savings, including:
Deferring costs of new equipment:
If the current technology is used more efficiently within the infrastructure then the need to purchase new equipment can be deferred. This could either free up financial resources to purchase other equipment, or the funds could be saved for future purchases.
Economic provisions of service:
Matching capacity with business needs properly would avoid the maintenance of unnecessary capacity, which will result in a cost saving.
Planned buying:
Planned buying would reduce the risk of panic buying.
Reduced Risks
A significant benefit of capacity management is reduced risk. When capacity is managed effectively, the risk of failure is reduced:
With current applications the risk is reduced by managing the resources and service performance.
With new applications the risk is managed through application sizing.
The change advisory board should include capacity management, and should assess the impact of changes.
Effective capacity planning will reduce the number of emergency changes needed to increase capacity.
More confident forecasts
Over time, capacity planning improves due to the data collected. For example, normal operating baselines and monitoring data will help in capacity planning. Using application sizing and modeling when introducing new services will help forecast capacity needs more accurately.
Value to applications lifecycle:
Capacity needs identified early in the development of applications should be put into the capacity plan. This will reduce the risk of running into capacity issues when applications are in production.
Exercises
Capacity Management (Process)
The ABC Cookie Company has invoked capacity management.
1. Which of the following is not an activity?
A. Application sizing
B. Modeling
C. Demand management
D. Authorizing
2. Match the following two columns:
a. Business Capacity Management    Understanding each CI required
b. Service Capacity Management    Understanding future requirements
c. Resource Capacity Management    Understanding each service required
3. Which of the following is an influencing activity?
A. Modeling
B. Demand management
C. Application sizing
4. True or false? Application sizing is only for internally developed applications.
Answers
1. Authorizing is a Change Management activity
2. Match the columns:
a. Business Capacity Management = Understanding future requirements
b. Service Capacity Management = Understanding each service required
c. Resource Capacity Management = Understanding each CI required
3. Demand management
4. False
Availability Management
Overview
The goal of Availability Management is to:
Ensure IT services have the availability designed to meet the business requirements.
Optimize availability to satisfy business objectives.
Deliver to the business a level of availability in a cost-effective way.
Offer sustained levels of availability, reduce outages and always strive for improvement.
Terms and Definitions
Term
Definition
Availability
The ability of a component within the IT infrastructure to function at a certain level for a certain period of time.
Reliability
Freedom from outage or operational failure of CIs within the IT infrastructure. The levels of built-in redundancy, fault tolerance or resilience put into each CI will make the service provided by that CI more reliable.
Maintainability
Maintainability is the responsibility of internal service providers to support the availability of IT infrastructure CIs. There are certain CIs that are supported internally, often because internal employees built these CIs or there are job functions within the company specific to maintaining this equipment.
Serviceability
Serviceability is the responsibility of external, 3rd party, IT service providers who are contracted to service equipment. This normally occurs in companies with large numbers of leased equipment or in companies who lack the certified skill set to maintain certain CIs.
Security
With the help of a Security Management Process, Security measures are put in place to guarantee the confidentiality, integrity and availability of certain data associated with a service.
Availability Management Activities
Determining the availability requirements of the business.
The availability that the business requires is determined together with the business and is an input into availability management.
Determining the VBF and reporting it to ITSCM.
Availability management will inform the IT Service Continuity Manager which are the Vital Business Functions so that the Contingency Plan will have documented actions to support the business.
Compiling availability plan.
Specific information is needed to develop an availability plan. This includes:
Availability requirements
The impact assessment for all the vital business functions
Incident and Problem records, used as information on IT service failures
Configuration and monitoring data pertaining to each IT service and component
Service level measurements against agreed targets
Monitoring, reporting and improving availability.
The availability manager is accountable for all IT availability. That person will monitor availability, create reports and, if necessary, make adjustments before issues arise, to ensure that availability problems do not affect service levels. Availability trending will be used to improve availability in the future.
Monitoring maintenance and service obligations.
Maintenance and service levels with both internal and third-party vendors are maintained through Operational Level Agreements or underpinning contracts, which must be kept up to date and checked for validity. If there are any changes to the technology then the third-party vendor needs to be informed to ensure the technology is covered in the event of an outage.
Availability Plan Inputs and Outputs
Create an objective for the availability plan. This objective should include what you want to achieve, what the deliverables are and what other factors should be considered that might influence availability (for instance, people, processes and tools). The plan should cover:
Levels of availability (actual versus agreed).
What will be done in the event of an availability shortfall.
Information on how to change availability requirements for existing IT services.
Information on how to adjust availability for new IT services.
Information on how new technology could benefit availability in the future.
Indicators of Availability
Mean Time to Repair
Incident
The actual time the outage started
Detection
The point at which the user discovered the outage
Response Time
The time in which the outage was escalated for support
Diagnosis
The time the outage was investigated
Repair
The time it took to repair or replace the CI at fault
Recovery time
The time it took to bring the machine back up to its original state
Restoration
The time the business resumed normal operations
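The indicators above can be turned into measurable durations once each point in the outage is timestamped. The sketch below is an illustration under stated assumptions: it treats every indicator as the moment that stage was reached, takes downtime as the span from incident to restoration, and averages downtime across outages as a mean time to repair. The function names and the outage record are hypothetical.

```python
from datetime import datetime

# Lifecycle points in the order listed above.
STAGES = ["incident", "detection", "response", "diagnosis",
          "repair", "recovery", "restoration"]


def stage_durations(outage):
    """Minutes elapsed between each consecutive pair of lifecycle points."""
    return {
        f"{start} -> {end}": (outage[end] - outage[start]).total_seconds() / 60
        for start, end in zip(STAGES, STAGES[1:])
    }


def downtime_minutes(outage):
    """Total minutes from the start of the outage to restoration."""
    return (outage["restoration"] - outage["incident"]).total_seconds() / 60


def mean_time_to_repair(outages):
    """Average downtime in minutes across a list of outage records."""
    return sum(downtime_minutes(o) for o in outages) / len(outages)


# Hypothetical outage record.
outage = {
    "incident": datetime(2024, 1, 8, 9, 0),
    "detection": datetime(2024, 1, 8, 9, 10),
    "response": datetime(2024, 1, 8, 9, 15),
    "diagnosis": datetime(2024, 1, 8, 9, 40),
    "repair": datetime(2024, 1, 8, 10, 20),
    "recovery": datetime(2024, 1, 8, 10, 45),
    "restoration": datetime(2024, 1, 8, 11, 0),
}

durations = stage_durations(outage)
total = downtime_minutes(outage)
```

Breaking the two hours of downtime into stages shows where time was lost, for example between diagnosis and repair, which is the kind of information availability trending relies on.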
Benefits
Single point of accountability: a process owner is designated for Availability Management and will ensure availability benefits the business by:
Addressing business availability needs.
Designing availability services that meet business needs cost effectively.
Ensuring the levels of availability are sustained and measured to support SLM.
Addressing any issues with availability and taking corrective action.
Ensuring outages due to lack of availability are reduced.
Helping the mindset of IT move from reactive to proactive.
Ensuring IT support 'adds value' to the business.
Exercises
Availability Management (Process)
The ABC Cookie Company has invoked availability management.
1. Which of the following is not an activity?
A. Realization of availability requirements
B. Compiling an availability plan
C. Monitoring maintenance obligations
D. Filtering
2. Indicators of availability are:
A. Detection
B. Repair
C. Diagnosis
D. Restoration
E. Recovery
Which of the following is the right sequence for the above indicators?
b,a,c,e,d
a,c,b,e,d
a,b,c,d,e
3. A cookie vending machine has 3 functions. Which is considered a Vital Business Function?
A. Interact function
B. Receipt Printing
C. Cookie Dispenser
Answers
1. Filtering is a Change Management activity
2. a,c,b,e,d
3. Cookie Dispenser
Study Notes
Configuration Management Goal
To identify, record and report on all IT components that are under the control and scope of Configuration Management.
Activities
Planning Identification Control Status Accounting Verification and Audit Management Information
Definitions
Infrastructure
o Includes hardware, software and any associated documentation
CMDB
o A database which holds a record of all Configuration Items (CI) associated with the IT Infrastructure
CI
o A component of an infrastructure or an item, such as a request for change, associated with an infrastructure, that is (or is to be) under the control of Configuration Management
Baseline
o A product or system established at a specific point in time which captures both the structure and the details of that product or system and can be rebuilt at a later date
Scope
o The range of responsibility covered by Configuration Management
CI Levels
o The degree of detail selected to describe each CI
Relationships
o A description of the interfaces which exist between CIs
Benefits
Configuration Management information supports all other processes Information for impact and trend analysis for problems and changes Assist in adherence to legal and contractual obligations which will help in detecting and deterring crime
Reduced risk of unauthorized software Helping with financial planning
Problems
CI definition: too much or too little detail Lack of commitment Process perceived as bureaucratic Circumvention of process
Service Desk Goals/Purpose
To provide a central/single point of contact To deliver high quality support to achieve business goals To aid in user retention and satisfaction To improve the service but reduce the cost To highlight customer training and education needs To close incidents and confirm satisfaction with the customer To contribute to problem identification
Roles and Responsibilities
To accept and record all calls To do initial assessment of all incidents, attempt to resolve then escalate to 2nd level based on SLA To monitor and escalate according to SLA To keep users informed on status and progress To produce management reports Managing the request lifecycle, including closure and verification Communicating planned and short-term changes of service levels to customers Coordinating second-line and third party support groups Providing management information and recommendations for service improvements Single point of contact (liaison) Customers and business service improvements Service restoration Skill sets and tools Deliver support critical for achieving business goals Identification of business opportunities
Definitions
No definitions to be explained
Benefits
Improved quality and faster response to user requests, cost of ownership, user service, perception, satisfaction, team work and communication Better management of change by better management of infrastructure and control (better managed CMDB) Increased user accessibility via (SPOC) Long term customer retention and Identification of business opportunities
Problems
Lack of understanding of the business needs Not defining service objectives, goals and deliverables Not understanding customer requirements People issues (lack of training, overload/ burnout, and so on)
Incident Management Goals
To restore normal service operation as quickly as possible and minimize the adverse impact on business operation
Activities
Detection and recording Classification and initial support Investigation and diagnosis Resolution and recovery Closure Ownership, monitoring, tracking and communications Management information
Definitions
Incident - any event which is not part of the standard operation of a service, and which causes or may cause an interruption to, or a reduction in the quality of that service
Benefits
Reduced business impact and interruptions from incidents by resolving them as quickly as possible Proactive identification of beneficial system enhancements Availability of business-focused management information related to the SLA
Improved monitoring of performance based on the requirements of Service Level Management Improved management information Better staff utilization Elimination of lost incidents More accurate CMDB information resulting from constant audits from daily user interaction Improved user and customer satisfaction
Problems
Management or staff commitment Lack of knowledge for resolving incidents Inadequate training for staff Lack of integration with other processes No provision of agreed service levels Lack of clarity about business needs Review of working practices
Problem Management Goals
To minimize the adverse impacts of incidents and problems on the business that are caused by errors in the IT Infrastructure and to prevent recurrence of incidents related to these errors. Problem Management seeks to get to the root cause
Activities
Problem control Error control Assistance with the handling of major incidents Proactive prevention Management information and problem reviews
Terms and Definitions
Problem - a condition identified by multiple incidents exhibiting common symptoms, or one single significant incident, indicative of a single error for which the cause is unknown
Known Error - a condition identified by successful diagnosis of the root cause of a problem, when it is confirmed which CI is at fault
Benefits
More effective and efficient handling of incidents, problems and known errors Increased service quality Improved user productivity Reduction in the number of incidents and problems Reduced chance of invoking contingency plan Improved management information Higher 1st level resolution rate and IT technical awareness Improved relationships between IT and the business
Problems
Incident control procedure Staffing issues Knowledge based system Dealing with known errors
Change Management Goals
To ensure that standardized methods and procedures are used for efficient and prompt handling of all changes to minimize the impact of change-related incidents and improve day-to-day operations.
Activities
Filtering (assessing) RFCs Classification (prioritize and categorize) Change approval (assess and authorize) Coordination (build, test, implement, Post Implementation Review) Management information
Terms and Definitions
Change - an action that results in a new status for one or more IT infrastructure CIs
Standard Change (Pre-approved) - a change to the infrastructure that follows an established path
Request for Change (RFC) - form or screen used to record details of a request for a change
Forward Schedule of Changes (FSC) - a schedule that contains details of all changes approved and their proposed implementation dates
Projected Service Availability (PSA) - a document containing details of changes to agreed SLAs/SLCs and agreed serviceability due to the currently planned FSC
Change Advisory Board (CAB) - a group of people who have decision authority for significant changes
Change Advisory Board Emergency Committee (CAB/EC) - a subset of the full CAB to handle emergency change requests
Benefits
Alignment - between the services provided and business needs Productivity - right personnel working on changes Risk - reduced impact by eliminating bad changes Volume - ability to absorb a high volume of changes Reports - quality reports on information related to changes
Problems
Perceived bureaucracy Scope Dependencies Synchronization
Release Management Goals
Release Management takes a holistic view of a change to an IT service and should ensure that all aspects of a release, both technical and non-technical, are considered together
Activities
Release planning and overseeing Designing, building, configuring a release Release acceptance Rollout planning Communication, preparation and training Distribution and installation Management information
Definitions
Definitive Software Library (DSL) - a secure software compound where all authorized versions of software are kept
Definitive Hardware Store (DHS) - an area set aside for the secure storage of hardware spares
Delta Release - only the CIs within the release unit that have changed
Full Release - all the components of the release unit are built, tested, distributed and implemented together
Package Release - includes at least two releases, delta/full grouped together
Benefits
Improved Quality of Services due to successful releases Consistency and improvement of releases Planned use of resources (testing and quality control) Managed expectation levels Improved historical data concerning releases Improved control of software and hardware assets Reduced risk of unauthorized or illegal software
Problems
Initial resistance to change Circumvention of procedures Disparity of hardware and software versions at distributed locations Perceived bureaucracy and expense Unclear ownership and responsibilities Pressure to move forward
Service Level Management Goals
To maintain and improve IT service quality Better relationship between IT and customers True partnership Mutually beneficial
Activities
Establish function Implement the process Manage ongoing process Periodic reviews Management information
Definitions
Service Level Agreement (SLA) - agreement between the IT provider and the IT customer (internal)
Operational Level Agreement (OLA) - agreement between an internal IT department (such as Network) and the IT Service Provider
Underpinning Contract (UC) - contract with an external supplier providing services to IT
Service Catalogue - defines the services available to the customer from IT
Service Improvement Program (SIP) - formal project undertaken within SLM, focusing on customer and staff satisfaction
Service Level Requirements (SLR) - client's service requirements that form the basis for an SLA
Benefits
Both the customer and IT provider clearly understand their roles and responsibilities Service level targets will be set Service levels will be monitored and reviewed SLR will drive IT services OLAs and UCs will be better aligned with the business requirements
Problems
Verifying targets prior to agreement Inadequate supporting agreements IT based rather than business aligned SLAs not communicated
Availability Management Goals
To optimize the capability of the IT infrastructure, services and supporting organization to deliver a cost effective and sustained level of availability enabling the business to meet their objectives
Activities
Determining availability requirements Compiling the availability Plan Monitoring availability Monitoring maintenance obligations Management information
Definitions
Availability - ability of a component to perform its required function
Reliability - freedom from operational failure
Maintainability (Internal) - maintenance done on systems from an internal focus
Serviceability (External) - maintenance done on systems from an external focus (outsourced)
Resilience - redundancy, to allow IT's continued operation despite the incorrect functioning of one or more of its sub-systems
Security - confidentiality, integrity and availability
Vital Business Function (VBF) - the element of a business process considered business critical
Benefits
Single point of accountability Design for high level of availability Availability designed to fully support service level management Reduction of failures, duration and frequency Error correction to service enhancement Add value to the business by addressing business requirements
Problems
Organizational Management responsibility "We already have Problem and Change Management, so why this?" Appropriate authority levels Lack of resources Support tools
Capacity Management Goals
To ensure that all the current and future capacity and performance aspects of the business requirements are provided cost effectively
Iterative Activities
Monitoring, tuning, analysis of IT throughput Implementation of changes Storage of data
Activities
Demand management Modeling Application sizing Production of Capacity Plan Management Information
Definitions
Capacity Database (CDB) - database used to record activity within Capacity Management
Demand Management - trying to influence clients to use systems at different times so that workloads can be better controlled
Resource Management - evaluates new technology, provides insight into the infrastructure and calculates the investment
Modeling - techniques that can determine how to achieve optimal capacity
Application sizing - calculates new software requirements and can determine new hardware for applications that are being developed
Benefits
Increased efficiency and cost savings Reduced risk More confident forecasts Value to applications lifecycle
Problems
Over expectation Vendor influence Lack of info The distributed environment Monitoring levels
IT Service Continuity Management Goals
To ensure that the required IT Technical and Services facilities can be recovered within required and agreed timescales.
Activities
Risk analysis Risk management Management of the Contingency Plan Testing the Contingency Plan Initiation Requirements analysis and strategy Implementation Operational management Management information
Definitions
ITSCM - part of the overall BCM, for continuity of IT Services only
BCM - managing risks to ensure that the business can operate at all times
Crisis - an event that exceeds the threshold values agreed to within the SLA/SLC
Benefits
Management of risk and the reduction of consequences due to the impact of a failure Potential lower insurance premiums Business relationship improvement due to better business focus and awareness of impacts and priorities Positive marketing of contingency capabilities both internal and external Organizational credibility by reduced outages Competitive advantage by customer confidence
Problems
Will not remove all sources of risk Might not address organizational facilities Not all damage is financially quantifiable Contingency plans may be required by law
Financial Management for IT Services Goals
To provide cost-effective stewardship of the IT assets and resources used in providing IT Services
Activities
Budgeting IT Accounting Charging Management information
Definitions
Direct Costs - costs which are clearly attributable to a single customer
Indirect Costs - costs incurred on behalf of all, which have to be apportioned
Variable Costs - costs which vary with some factor such as usage/time
Unabsorbed Overhead - any indirect costs which cannot be apportioned
Cost Elements (TCU, ECU, OCU, ACU, SCU) - hardware, software, people, accommodations and transfer
Benefits
Confidence in setting and managing budgets Accurate cost information to support IT investments, determining cost of ownership for ongoing services More efficient use of IT resources throughout the organization Increased professionalism of staff within the IT organization
Problems
New Disciplines IT/Accountancy skills Poor strategic objectives Costs and charges Cost outweighs benefit