Evidence-Based Maintenance Decisions
Andrew K S Jardine
CBM Laboratory, Department of Mechanical & Industrial Engineering, University of Toronto
[email protected]
Abstract
The purpose of this paper is to alert readers to the availability of tools (mathematical models) that maintenance and reliability specialists can use to make data-driven decisions. The decision areas addressed are: component preventive replacement; inspection decisions, including predictive maintenance; capital equipment replacement; and resource requirements. To eliminate the tedium of performing the analysis manually, software that implements many of the procedures and models discussed in the paper has been developed and is referenced.
Introduction
In the context of maintenance decision-making it is often found that very little factual knowledge is available. Although abundant data may have been captured in the organisation’s CMMS, EAM, or ERP systems, asset managers may not know the data mining techniques needed to extract useful knowledge from such data. This type of knowledge is absolutely necessary for the development of optimal maintenance procedures. Weibull analysis (Abernethy, 1996), for example, is one such data mining technique; it turns failure data into knowledge of the risk of failure of various assets. There is keen interest in evidence-based maintenance decisions rather than the use of ‘gut-feel’ or indiscriminately following manufacturers’ recommendations. It is hoped that this paper will go some way towards reducing the proportion of subjective judgement in maintenance decision-making.
Component Replacement Decisions
This topic covers determination of: replacement intervals for equipment the operating costs of which increase with use; the interval between preventive replacements of items subject to breakdown (also known as the group or block policy); and the preventive replacement age of items subject to breakdown.
The interest in this decision area arises because a common approach to improving the reliability of a system, or complex equipment, is through preventive replacement of critical components within the system. Thus, it is necessary to be able to identify which components should be considered for preventive replacement, and which should be left to run until they fail. While the methodology of Reliability Centered Maintenance (Moubray, 1997) determines the type of maintenance tactics to be applied to an asset, the issue of when to perform the recommended maintenance action so that it produces the best possible results remains to be addressed. As Nowlan and Heap (1978) said: “Once the equipment enters service a whole new set of information will come to light, and from this point on the maintenance program will evolve on the basis of data from actual operating experience. This process will continue throughout the service life of the equipment, so that at every stage maintenance decisions are based, not on an estimate of what reliability is likely to be, but on the specific reliability characteristics that can be determined at the time.”
In a private communication the following statement was made: “At the TTC, I have been able to analyze several components (which we were overhauling periodically) to justify if it is worthwhile doing the overhaul. It was possible to use Weibull analysis since when these components failed, they failed due to a dominant failure mode, and the defective component was replaced with a new one or one that is just like new. I found that most times the hazard rates obtained were decreasing. This I later found was due to poor quality components and questionable maintenance practices. Overhaul on these components has been suspended and we are only changing them on failure. Quality issues are also being addressed.” Data had been available for many years, yet it was not until a new reliability engineer was appointed to the team, one who had been trained in the use of Weibull analysis, that a major change in maintenance practice was made.
In the October 2003 issue of Maintenance Technology the Viewpoint had the following heading: “Using MTBF to Determine Maintenance Interval Frequency is Wrong”. It went on to say, inter alia, “Random failures make up the vast majority of failures on complex equipment as research has shown. For example, consider the failure of a component. Assume that each time the component failed we tracked the length of time it was in service. The first time the component is put into service it fails after 4 years, the second time after 6 years, and the third time after only 2 years (4 + 6 + 2 = 12/3 = 4). We know that the average lifespan of the component is 4 years (its MTBF is 4 years). However, we do not know when the next component will fail. Therefore we cannot successfully manage this failure by traditional time-based maintenance (scheduled overhaul or replacement).” If the author had been aware of Weibull analysis and the tools for establishing the best changeout time for a component, he would have realized that:
Fact 1. Preventive replacement at the MTBF could be the best answer, but that depends on additional evidence.
Fact 2. If a reliability engineer trained in the statistical analysis of failures analyzed the 3 failure times, they would obtain a "best estimate" that significant wear-out is occurring, and that time-based replacement could be appropriate.
This conclusion is obtained by examining the evidence (3 failure times) and doing a simple Weibull analysis. Using regression analysis, the shape parameter, beta, is estimated as 1.74. A shape parameter greater than 1 corresponds to an increasing hazard function, so the “best estimate” is that the risk of bearing failure could be reduced through preventive replacement of the bearing based on time.
The moral of the above: make evidence-based decisions, using appropriate tools.
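The Weibull estimate quoted above can be reproduced with a short script. The following is only a sketch under stated assumptions: the paper does not say which regression convention was used, so the code assumes Benard's approximation for the median ranks and a least-squares fit of ln(time) on the transformed rank. The function name median_rank_regression is illustrative and does not come from any of the software referenced in this paper.

```python
import math


def median_rank_regression(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) from complete failure data.

    Assumptions (not stated in the paper): Benard's median-rank approximation
    and least-squares regression of ln(t) on ln(-ln(1 - F)).
    """
    times = sorted(failure_times)
    n = len(times)
    # Benard's approximation to the median rank of the i-th ordered failure.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    x = [math.log(t) for t in times]
    y = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    # Linearized Weibull CDF: ln(t) = ln(eta) + (1 / beta) * y
    slope = s_xy / s_yy
    beta = 1.0 / slope
    eta = math.exp(x_mean - slope * y_mean)
    return beta, eta


# The three failure times (in years) from the Viewpoint example.
beta, eta = median_rank_regression([4, 6, 2])
print(f"beta = {beta:.2f}, eta = {eta:.2f} years")
```

Under these assumptions the fitted shape parameter comes out at about 1.74, in line with the value quoted above. Other rank or regression conventions give slightly different estimates, and with only three failures the estimate carries wide uncertainty, which is why it is described as a "best estimate".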
Inspection Decisions
This topic covers determination of: inspection frequencies for complex equipment used continuously; fault-finding intervals for protective devices; and condition-based maintenance (CBM) decisions. In this brief paper only CBM decisions will be considered. The classical approach to CBM is the trending of condition monitoring measurements and the use of limits, such as normal, warning and alarm. Such an approach is simple to understand, but it has limitations. For example: Which measurements are correlated with failure? What are the optimal limits? What is the effect of an item’s age on failure? What is the probability of a failure occurring between the current and next inspections? CBM optimization (Jardine and Banjevic, 2005) using the EXAKT tool (www.omdec.com) extends and enhances the classical control chart approach. (See the illustration in the PowerPoint presentation associated with this paper.)
Capital Equipment Replacement Decisions
This topic is concerned with determining: replacement intervals for capital equipment the utilization pattern of which is fixed; replacement intervals for capital equipment the utilization pattern of which is variable; and the replacement policy for capital equipment taking into account technological improvement.
The author recently received an e-mail that said, inter alia: “We are one of the largest marine cargo handling firms in the U.S. We have approx 2400 pieces of rolling stock, mostly powered lift equipment (stationary cranes, mobile cranes, side & top handlers, forklifts, etc). We have no corporate strategy on equipment repair/replacement, lease vs. buy, economic service life, etc. These decisions are based often on strength of personalities and # of mechanics complaints, not objective analysis. I'm looking to change that. On the plus side, we do have a CMMS (Maximo) and 4 years of "pretty good" equipment information and cost history. So we have some data to analyze. I'll be back in my office Sept 18-19, perhaps we could connect then. I'm on U.S. west coast time (based in Los Angeles). Look forward to learning more”. Here is an organization that is data rich, but unaware of tools that can help it make evidence-based asset replacement decisions.
The outcome of this message was a one-day visit to the company. In the morning a procedure to establish the economic life of their mobile equipment was discussed, using the software AGE/CON (Jardine and Tsang, 2006) to optimize the life cycle cost of assets (www.banak-inc.com). In the afternoon the information technology specialist joined the discussion and explained how to access the company database. Data from the database was then entered into a standard economic life model to establish the economic life of a sample asset: a Hustler truck costing about USD 60,000 new. The company's present policy was to replace its Hustlers at about 18 years of age. The economic life established by the model was about 10 years, giving a cost saving of USD 3,340 per vehicle per year. There were 449 similar vehicles in the fleet, so the total annual saving was estimated at USD 3,340 x 449 = USD 1,499,660, or approximately USD 1.5 million per year. Clearly a very significant improvement was obtained through using an appropriate tool to establish the replacement policy for their mobile equipment. The company wanted fact-based decision-making, and now they knew how to achieve it.
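To make the economic life calculation concrete, the sketch below uses a generic equivalent annual cost (EAC) formulation: for each candidate replacement age, the purchase price plus discounted operating and maintenance costs, less the discounted resale value, is converted to a constant annual amount, and the age with the lowest annual amount is taken as the economic life. This is not the AGE/CON implementation, and the function names, the cost trend, the resale values and the 8% interest rate are all hypothetical placeholders. Only the USD 60,000 purchase price, the USD 3,340 annual saving per vehicle and the fleet size of 449 come from the text.

```python
def equivalent_annual_costs(purchase_price, op_costs, resale_values, rate):
    """Return the EAC of replacing at age 1, 2, ..., len(op_costs) years.

    op_costs[k] is the operating and maintenance cost in year k + 1 (paid at
    year end); resale_values[k] is the resale value at age k + 1.
    """
    eac = []
    pv_op_costs = 0.0
    for n, (cost, resale) in enumerate(zip(op_costs, resale_values), start=1):
        discount = (1.0 + rate) ** -n
        pv_op_costs += cost * discount
        present_value = purchase_price + pv_op_costs - resale * discount
        # Capital recovery factor spreads the present value over n years.
        crf = rate / (1.0 - discount) if rate > 0 else 1.0 / n
        eac.append(present_value * crf)
    return eac


def economic_life(purchase_price, op_costs, resale_values, rate):
    """Return (best replacement age in years, EAC at that age)."""
    eac = equivalent_annual_costs(purchase_price, op_costs, resale_values, rate)
    best = min(range(len(eac)), key=eac.__getitem__)
    return best + 1, eac[best]


def fleet_annual_saving(saving_per_vehicle, fleet_size):
    """Scale the per-vehicle annual saving to the whole fleet."""
    return saving_per_vehicle * fleet_size


# Hypothetical cost history for illustration only (not the company's data).
op_costs = [4000 + 900 * k for k in range(18)]
resale_values = [60000 * 0.8 ** (k + 1) for k in range(18)]
age, cost = economic_life(60000, op_costs, resale_values, rate=0.08)
print(f"Economic life: {age} years at an EAC of USD {cost:,.0f}")

# Figures quoted in the text: USD 3,340 per vehicle per year, 449 vehicles.
print(f"Fleet saving: USD {fleet_annual_saving(3340, 449):,} per year")
```

The fleet figure is the only number here that reflects the case study; the economic life printed for the placeholder data should not be read as the result obtained for the Hustler trucks.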
Resource Requirements
This section is associated with problems relating to the determination of: the mix of equipment to be installed in a maintenance workshop; the right size and composition of a maintenance crew; the extent of use of subcontracting opportunities; and lease or buy decisions. Two interrelated problem areas concerning what type of maintenance organization should be created are often considered by an organization: (i) determination of what facilities (e.g. staffing and equipment) there should be within the organization; and (ii) determination of how these facilities should be used, taking into account the possible use of subcontractors (i.e. outside resources). In this paper we will only consider the optimization of the use of contractors.
Of course, there are many factors that an organization will take into account in deciding whether to contract out all of its maintenance work, undertake it all internally, or have a mix. We will assume that it is acceptable to have a mix of doing work internally and contracting out if it can be justified economically. Specific assumptions are: The workload for the maintenance crew is specified at the beginning of a period, say a week. By the end of the week all the workload must be completed. The size of the work force is fixed, thus there is a fixed number of staff available per week. If demand at the beginning of the week requires fewer staff than the fixed capacity then no subcontracting takes place. However, if the demand is greater than the capacity, the excess workload will be subcontracted to an alternative service provider, to be returned by the end of the week. Two sorts of costs are incurred: (a) a fixed cost depending on the size of the work force; and (b) a variable cost depending on the mix of internal/external workload. As the fixed cost is increased through increasing the size of the work force, there is less chance of subcontracting being necessary. However, there may frequently be occasions when fixed costs are incurred yet demand is low, i.e. there is considerable under-utilization of the work force. The problem is to determine the optimal size of the work force to meet a fluctuating demand so as to minimize expected total cost per unit time. A graphical representation of the optimization is provided in the figure in the PowerPoint presentation associated with this paper, and the background mathematical model is developed in Chapter 5 of Jardine and Tsang (2006).
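The trade-off described above can be illustrated with a simplified discrete sketch. It is not the model developed in Chapter 5 of Jardine and Tsang (2006); it simply evaluates the expected weekly cost for each candidate crew size under the stated assumptions (workload known at the start of the week, excess workload subcontracted) and picks the cheapest. The demand distribution, cost figures, jobs-per-worker capacity and function names are all hypothetical.

```python
def expected_weekly_cost(crew_size, demand_pmf, jobs_per_worker,
                         fixed_cost_per_worker, internal_cost_per_job,
                         contractor_cost_per_job):
    """Expected total cost per week for a given crew size.

    demand_pmf maps a weekly workload (number of jobs) to its probability.
    """
    capacity = crew_size * jobs_per_worker
    # Fixed cost is incurred whether or not the crew is fully utilized.
    expected = crew_size * fixed_cost_per_worker
    for demand, prob in demand_pmf.items():
        internal = min(demand, capacity)
        external = demand - internal  # excess workload goes to the contractor
        expected += prob * (internal * internal_cost_per_job
                            + external * contractor_cost_per_job)
    return expected


def optimal_crew_size(max_crew, **params):
    """Return the crew size in 0..max_crew with the lowest expected cost."""
    return min(range(max_crew + 1),
               key=lambda n: expected_weekly_cost(n, **params))


# Hypothetical inputs for illustration only.
params = dict(
    demand_pmf={10: 0.5, 20: 0.5},   # jobs per week and their probabilities
    jobs_per_worker=10,              # weekly capacity of one worker
    fixed_cost_per_worker=100.0,     # wages and overhead per worker per week
    internal_cost_per_job=5.0,       # variable cost of a job done in-house
    contractor_cost_per_job=30.0,    # cost of a subcontracted job
)
best = optimal_crew_size(5, **params)
print(f"Optimal crew size: {best}, "
      f"expected weekly cost: {expected_weekly_cost(best, **params):.2f}")
```

With these placeholder numbers a crew of two is cheapest: a smaller crew pays heavily for subcontracting in busy weeks, while a larger crew carries fixed cost that is under-utilized in quiet weeks.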
Concluding Remarks
In physical asset management we seek fact-based arguments (data-driven decisions), not intuition-based pronouncements (such as those based on the strength of personalities or the number of mechanics’ complaints). It is to be hoped that this brief paper has demonstrated that, while we want tools to deliver fact-based arguments, many such tools are already available. As we go forward let us develop our evidence-based maintenance tool box: a collection of tools for identifying, assessing and applying relevant evidence for better asset management decision-making.
It is important to have evidence to support asset management programs and not simply accept “expert opinion.”
References
Abernethy, R.B. (1996), The New Weibull Handbook, 2nd edition, TX: Gulf Publishing Company.
Jardine, A.K.S. and Banjevic, D. (2005), Interpretation of inspection data emanating from equipment condition monitoring tools: Method and software, in Mathematical and Statistical Methods in Reliability, Armijo, Y.M. (Editor), World Scientific Publishing Company.
Jardine, A.K.S. and Tsang, A.H.C. (2006), Maintenance, Replacement and Reliability: Theory and Applications, CRC Press, Taylor and Francis Group.
Moubray, J. (1997), Reliability Centred Maintenance, 2nd edition, Butterworth-Heinemann.
Nowlan, F.S. and Heap, H. (1978), Reliability Centered Maintenance, U.S. Department of Defense.