3 Risk Reliability And Availability 2009

University Of Western Australia Subsea Technology module OENA8589

RISK, RELIABILITY AND AVAILABILITY Kevin Mullen

Risk


What is Risk?

• “The chance of something happening that will have an impact on the objective”
• Frequency x Consequence
• “Expected value of an unwanted outcome measured in dollars”
  – Injury or death of personnel
  – Damage or destruction of the environment
  – Excessive production costs
  – Reduction or loss of production
  – Project delays

Typical Risk Matrix (Consequence rows, Likelihood columns)

Consequence       | Never heard of in industry | Has occurred in industry | Has occurred in company | Occurs often in company | Occurs often at site
No injury         | LOW                        | LOW                      | LOW                     | LOW                     | LOW
Slight injury     | LOW                        | LOW                      | MED                     | MED                     | MED
Minor injury      | LOW                        | MED                      | MED                     | HIGH                    | HIGH
Major injury      | MED                        | MED                      | HIGH                    | HIGH                    | VERY HIGH
Fatality          | MED                        | HIGH                     | HIGH                    | VERY HIGH               | VERY HIGH
Multiple fatality | HIGH                       | HIGH                     | VERY HIGH               | VERY HIGH               | VERY HIGH

Required response:
• VERY HIGH: Rectify immediately
• HIGH: Rectify with urgency, unless clearly impracticable
• MED: Reduce risk as far as practicable
• LOW: Accept, but manage through competency and awareness
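The risk matrix above is a lookup table, so it can be sketched directly in code. The following is a minimal illustration only; the names `MATRIX`, `ACTIONS` and `rank_risk` are hypothetical and not part of the course material.

```python
# Likelihood categories, ordered from least to most likely (matrix columns).
LIKELIHOODS = [
    "never heard of in industry",
    "has occurred in industry",
    "has occurred in company",
    "occurs often in company",
    "occurs often at site",
]

# One row per consequence category, one rating per likelihood column.
MATRIX = {
    "no injury":         ["LOW", "LOW", "LOW", "LOW", "LOW"],
    "slight injury":     ["LOW", "LOW", "MED", "MED", "MED"],
    "minor injury":      ["LOW", "MED", "MED", "HIGH", "HIGH"],
    "major injury":      ["MED", "MED", "HIGH", "HIGH", "VERY HIGH"],
    "fatality":          ["MED", "HIGH", "HIGH", "VERY HIGH", "VERY HIGH"],
    "multiple fatality": ["HIGH", "HIGH", "VERY HIGH", "VERY HIGH", "VERY HIGH"],
}

# Required response for each rating, as listed under the matrix.
ACTIONS = {
    "VERY HIGH": "Rectify immediately",
    "HIGH": "Rectify with urgency, unless clearly impracticable",
    "MED": "Reduce risk as far as practicable",
    "LOW": "Accept, but manage through competency and awareness",
}


def rank_risk(consequence, likelihood):
    """Return (rating, required response) for one matrix cell."""
    column = LIKELIHOODS.index(likelihood.lower())
    rating = MATRIX[consequence.lower()][column]
    return rating, ACTIONS[rating]


print(rank_risk("Major injury", "Occurs often at site"))
```

A qualitative matrix like this is only a screening step; the deck later recommends assessing qualitatively first, then quantitatively.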


Enterprise-Wide Risk Ranking Matrix (Appendix A)

SEVERITY OF CONSEQUENCES

Threat to Enterprise (Catastrophic) (1)
• PERSONNEL – Multiple (five or more) fatalities.
• COMMUNITY – Widespread impact to nearby communities.
• ENVIRONMENTAL – Long term environmental impact, and/or adverse, worldwide publicity.
• FACILITY – Total destruction to installation(s) estimated at a cost greater than $100,000,000; extended facility shutdown, and/or potential for permanent closure. For floating production systems, loss of floating structure.

Major (2)
• PERSONNEL – One or several fatalities, limited to immediate area of incident.
• COMMUNITY – One or more severe injuries.
• ENVIRONMENTAL – Significant release with serious off-site impact and more likely than not to cause immediate or long-term health effects.
• FACILITY – Damage to installation(s) estimated at a cost greater than $10,000,000 but less than $100,000,000; downtime in excess of 90 days.

Serious (3)
• PERSONNEL – One or more severe injuries, including permanently disabling injuries.
• COMMUNITY – One or more minor injuries.
• ENVIRONMENTAL – Significant release with serious off-site impact.
• FACILITY – Damage to process area(s) at an estimated cost greater than $1,000,000 but less than $10,000,000; 10 to 90 days of downtime.

Minor (4)
• PERSONNEL – Single injury, not severe, possible lost time.
• COMMUNITY – Odor or noise complaint from the public.
• ENVIRONMENTAL – Release which results in Agency notification or Permit violation.
• FACILITY – Some equipment damage at an estimated cost greater than $100,000 but less than $1,000,000; 1 to 10 days of downtime.

Incidental (5)
• PERSONNEL – Minor or no injury, no lost time.
• COMMUNITY – No injury, hazard, or annoyance to the public.
• ENVIRONMENTAL – Environmentally recordable event with no Agency notification or Permit violation.
• FACILITY – Minimal equipment damage at an estimated cost less than $100,000; negligible downtime.

LIKELIHOOD OF OCCURRENCE

• Frequent (1): Incident is very likely to occur at this facility, possibly several times during its lifetime. Statistical probability: P > 10^-2
• Occasional (2): Incident may occur at this facility some time during its lifetime. Statistical probability: 10^-2 > P > 10^-3
• Seldom (3): Incident has occurred at a similar facility and may reasonably occur at this facility. Statistical probability: 10^-3 > P > 10^-4
• Unlikely (4): Given current practices and procedures, this incident is not likely to occur at this facility. Statistical probability: 10^-4 > P > 10^-6
• Remote (5): Highly unlikely, although statistics show that a similar event has happened. Statistical probability: P < 10^-6

RISK RANKING

Likelihood     | Catastrophic (1) | Major (2) | Serious (3) | Minor (4) | Incidental (5)
Frequent (1)   | 1                | 2         | 2           | 3         | 5
Occasional (2) | 2                | 2         | 3           | 4         | 6
Seldom (3)     | 3                | 3         | 4           | 5         | 6
Unlikely (4)   | 4                | 4         | 6           | 6         | 6
Remote (5)     | 4                | 5         | 6           | 6         | 6

Primary driver (matrix legend): Enterprise Risk Management driven; Safety Management System driven; Occupational Health and Safety driven.

Risk Assessments

QRA – Quantitative Risk Assessment

1. Identifying what could go wrong
2. Estimating the likelihood of these events occurring
3. Examining the possible consequences of these events
4. Deciding which risks are tolerable and which aren’t
5. Modifying the activity so the intolerable risks are reduced or eliminated

Steps 1 to 3 make up Risk Analysis; adding step 4 gives Risk Assessment; adding step 5 (changes to design and operational practice) gives Risk Management.


Fatal Accident Rates

Implied Cost of Averting a Fatality (ICAF)

58. In making an assessment of reasonable practicability, there is a need to set criteria on the value of a life or implied cost of averting a statistical fatality (ICAF). HSE’s ‘Reducing Risks Protecting People’ document sets the value of a life at £1,000,000 and by implication therefore the level at which the costs are disproportionate to the benefits gained. In simplistic terms, a measure that costs less than £1,000,000 and saves a life over the lifetime of an installation is reasonably practicable, while one that costs significantly more than £1,000,000, is disproportionate and therefore is not justified. However case law indicates that costs should be grossly disproportionate and therefore costs in excess of this figure (usually multiples) are used in the offshore industry. In reality of course there is no simple cut-off and a whole range of factors, including uncertainty need to be taken account of in the decision making process.

59. In the offshore industry there is a need to take account of the increased focus on societal (or group) risk, i.e. the risk of multiple fatalities in a single event, as a result of society's perceptions of these types of accident. Therefore the offshore industry typically addresses this by using a high proportion factor for the maximum level of sacrifice that can be borne without it being judged ‘grossly disproportionate’; this has the effect of increasing the ICAF value used for decision-making. The typical ICAF value used by the offshore industry is around £6,000,000, i.e. a proportion factor of 6. HSE considers this to be the minimum level for the application of Cost Benefit Analysis (CBA) in the offshore industry.

60. Use of a proportion factor of 6 ensures that any CBA tends towards the conservative end of the spectrum and therefore takes account of the potential for multiple fatalities and uncertainty. Although a proportion factor of 6 tends to be used, there are no agreed standards and it is for each duty holder to apply higher levels if appropriate, for example in very novel designs.

Extract from Assessment Principles for Offshore Safety Cases (APOSC), issued March 2006, UK Health and Safety Executive
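The cost-benefit test described in the extract can be sketched as a simple comparison. This is an illustrative simplification, not HSE's method: the function names and the example costs are hypothetical, and the extract itself stresses that there is no simple cut-off in practice.

```python
VALUE_OF_LIFE_GBP = 1_000_000   # value of a life quoted in the extract
PROPORTION_FACTOR = 6           # typical offshore factor quoted in the extract


def max_justifiable_spend(fatalities_averted,
                          value_of_life=VALUE_OF_LIFE_GBP,
                          proportion_factor=PROPORTION_FACTOR):
    """Spend above which a measure would be judged grossly disproportionate."""
    return fatalities_averted * value_of_life * proportion_factor


def is_grossly_disproportionate(cost, fatalities_averted, **kwargs):
    """True if the cost exceeds the ICAF-based threshold (simplified screening test)."""
    return cost > max_justifiable_spend(fatalities_averted, **kwargs)


# Hypothetical measure costing GBP 4,000,000 over the installation lifetime.
print(is_grossly_disproportionate(4_000_000, fatalities_averted=1.0))  # threshold is 6,000,000
print(is_grossly_disproportionate(4_000_000, fatalities_averted=0.5))  # threshold is 3,000,000
```

Here `fatalities_averted` is the expected number of statistical fatalities averted over the installation lifetime; a duty holder may apply a higher proportion factor, for example for very novel designs.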


Safety Terminology

• Risk Assessment - a subjective evaluation, involving judgment, intuition and experience, where the level of risk is classified in four levels, with associated measures in Fatalities/Person/Year:
  – 1) Tolerable Risk - level prepared to accept but will continue to seek reduction. 10^-3 to 10^-5
  – 2) Acceptable Risk - level prepared to accept without seeking further reduction. 10^-5
  – 3) Unacceptable Risk - level prepared to reject for oneself and others. 10^-3
  – 4) ALARP - As low as reasonably practicable.
• The usual measure of risk at a global level is Fatalities/Person/Year, but for the local view, i.e., for your immediate corporate mission, risk can be viewed as simply the “failure of your product.”
• The usual format for the analysis of Risk Assessment is a “Cost-Benefit” Analysis, lives saved versus monetary costs.

What is Risk Management?

Risk Management is the effective identification, assessment and control of Risk

• Establish Context and Scope
• Identify the Hazards
• Assess the Risk
  – frequency
  – consequences
  – safeguards
• Rank the Risks
• Eliminate / Minimise the Risk
• Ongoing review and monitoring

How is Risk Managed?

• Useful Tools:
  – QRA
  – RAM studies
  – FMECA
  – HAZID / HAZOP
  – Audits
• Best implemented during design
• Qualitatively first, then quantitatively

Why is Risk Management needed?

• Legislation / Standards
  – Control of Major Hazard Facilities
  – Pipeline Acts
  – OS&H Regulations 1984
  – AS/NZS 4360 Risk Management
• Necessary for business optimisation ($)
• Increase value by:
  – minimising loss ($)
  – maximising opportunity ($)
• Optimises the performance of the facility
• Reduces probability of becoming:
  – Piper Alpha
  – Longford
  – Exxon Valdez


History of Major Hazards Control

1960’s: Prescriptive
• Recommendations for design and operation
• (USA) style statutory provisions
• Consideration of the operation of safety procedures

1970’s: The “Safety Report” approach
• Operator has to describe safety management to the Regulator

1980’s:
• Concept Safety Evaluations based on Quantified Risk Analysis (QRA) techniques
• Aims to identify and quantify risks to an acceptable level

1990’s: The “Safety Case” approach
• Operator has to convince Regulator on safety management
• Companies now responsible for their actions; must assess and determine the level of Risk

2000’s: Control of Major Hazards
• Safety SILs

Major accidents marked on the timeline:
• Flixborough UK (explosion and fire)
• Alexander L. Kielland (accommodation platform capsize)
• Bhopal India (toxic release)
• Piper Alpha oil platform (explosion and fire)
• Bombay High North platform (explosion and fire)

Bowtie Diagram

[Figure: bowtie diagram. Events leading to critical event, then the Critical Event, then events following critical event.]

The process of risk analysis: a sequence of events leads to a hazardous situation (the critical event), followed by a series of events leading to a variety of possible consequences.


Identify the Control Measures

[Figure: bowtie of control measures, running from Hazards and Causes through Incidents to Outcomes.]
• Proactive Controls: Elimination measures, Prevention measures, Reduction measures
• Reactive Controls: Mitigation measures, Prevention of escalation, Emergency Response

Safety Case

“A documented body of evidence that provides a convincing and valid argument that a system is adequately safe for a given application in a given environment”

To implement a safety case we need to:
• make an explicit set of claims about the system
• produce the supporting evidence
• provide a set of safety arguments that link the claims to the evidence
• make clear the assumptions and judgements underlying the arguments

The Safety Case must demonstrate that the control measures are adequate to eliminate or reduce as far as practicable risks associated with Major Incidents.

Demonstration is typically achieved through:
• Reference to Codes of Practice, Standards, Guidance, etc.
• Risk assessment (qualitative or quantitative)

The safety case is a “living document” which evolves over the safety life-cycle.


Reliability

RAM DEFINITIONS

• RAM – Reliability, Availability, Maintainability
• Reliability - The ability of an item to perform a required function under stated conditions for a stated period of time (BS4778) – UPTIME
• Failure – The termination of the ability of an item to perform a required function (BS4778) - FAILURE EVENT
• Maintainability - The ability of an item, under stated conditions of use, to be retained in, or restored to, a state in which it can perform its required functions, when maintenance is performed under stated conditions and using prescribed procedures and resources (BS4778) - DOWNTIME
• Availability - The ability of an item (under combined aspects of its reliability, maintainability and maintenance support) to perform a required function at a stated instant of time or over a stated period of time (BS4778) - UPTIME / (UPTIME + DOWNTIME) or MTTF / (MTTF + MTTR)
• Deliverability – The ability of a system to deliver gas to the LNG plant (under combined aspects of availability and capacity) under stated conditions and at a stated instant of time or over a stated period of time – (AVAILABILITY * CAPACITY)


Reliability: Key Design Requirement

• Reliability is as fundamental a design requirement as function and performance
• For every Functional requirement a Reliability requirement can (in principle) be specified
  – Function: Seal A must not leak
  – Reliability: P(seal A does not leak) > 0.99
• For every Performance requirement a Reliability requirement can (in principle) be specified
  – Performance: Valve must close in less than 10 seconds
  – Reliability: P(time to close < 10 s) > 0.99

Failure Characteristics

• Different components fail in different patterns
  – Flow components, chokes & valves: wear out
  – Mechanical components, wellheads: long life
  – Electronic components: fail early or last a long time
  – Pressure containment, pipes: system fails pressure test, or long life
  – Environmental influences (CO2, H2S, chlorides, overprotective CP and H2 build-up): corrode progressively or induce rapid cracking failures
• These create various distributions: Normal, Exponential, Weibull, etc.
• Simple prediction uses the exponential distribution, R(t) = e^(-t/MTTF), as an approximation that assumes a constant failure rate
• Complex simulation programs use distributions matched to components


Factors influencing failure rate

In general the failure rate of a component or element depends on four main factors:
(a) Quality
(b) Temperature
(c) Environment
(d) Stress

These factors are influenced by:
• the design process
• manufacture
• the way the system is operated

Probabilistic Design

Probability Distribution Function of Load and Resistance


Stress and Strength

Overlapping of stress and strength distributions

Failure Rate and Mean Time To Failure

Example: Constant Failure Rate
• Set $h(t) = \lambda$, a constant failure rate.
• Integrate to find the reliability $R(t)$: $R(t) = \exp(-\lambda t)$

This is often used in reliability analysis of systems.

Mean Time To Failure (MTTF) - average time a device or system will operate, without repair, before failure.

From the Expected Value Theorem:
• $E(x) = \int x\, f(x)\, dx$
• Introducing an integration by parts, it follows that the MTTF can be determined as: $\mathrm{MTTF} = \int_0^\infty t\, f(t)\, dt = \int_0^\infty R(t)\, dt$

For the special case of a constant failure rate:
• $\mathrm{MTTF} = 1 / \lambda$
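The constant failure rate result can be checked numerically. The sketch below integrates R(t) with the trapezoidal rule and compares the result with 1/λ; the function names, the failure rate of 0.02 per year and the integration settings are illustrative assumptions.

```python
import math


def reliability(t, failure_rate):
    """R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate * t)


def mttf_numeric(failure_rate, horizon_factor=50.0, steps=200_000):
    """Trapezoidal estimate of MTTF = integral of R(t) dt from 0 to a long horizon.

    The horizon is horizon_factor mean lives, so the truncated tail is negligible.
    """
    horizon = horizon_factor / failure_rate
    dt = horizon / steps
    total = 0.5 * (reliability(0.0, failure_rate) + reliability(horizon, failure_rate))
    for i in range(1, steps):
        total += reliability(i * dt, failure_rate)
    return total * dt


lam = 0.02  # failures per year (illustrative), so MTTF = 1 / lam = 50 years
print(mttf_numeric(lam), 1.0 / lam)
print(reliability(1.0 / lam, lam))  # survival probability at t = MTTF
```

Note that at t = MTTF the reliability is exp(-1), roughly 0.37, so an item is more likely than not to have failed before it reaches its mean time to failure.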


Availability

Availability Improvement

• Availability = MTTF / (MTTF + MTTR)
• It is expressed as a fixed ratio, NOT time dependent
• Availability can be achieved in 2 ways:
  – Extend failure free operating period (reliability)
  – Reduce time to restore system (maintainability)
• Subsea time to repair must include: Detection, Location, Analysis of repair, Spares / repair kit, Qualification, Mobilisation, Deployment, Repair execution, Commissioning
• Increased value in driving for Reliability rather than Maintainability to achieve Availability


Reliability & Repair Data

Reliability / Availability of Repairable Items (Hydraulic System Elements)
Assessment period t = 30 years

Item | Repairable item                               | MTTF (years) | Failure rate λ (years^-1) | Quantity | Reliability over period, Re = exp(-λt) | Unreliability, 1-Re | MTTR (days) | Repair rate μ (years^-1) | Availability, A = μ / (λ + μ) | Unavailability, 1-A
1    | Production Piping                             | 10000        | 0.0001                    | 1        | 0.99700                                | 0.0030              | 100         | 3.650                    | 0.999973                      | 0.000027
2    | Test / Vent Piping                            | 5000         | 0.0002                    | 1        | 0.99402                                | 0.0060              | 100         | 3.650                    | 0.999945                      | 0.000055
3    | 10 inch 10 kpsi gate valve, Isolation function | 1000        | 0.0010                    | 1        | 0.97045                                | 0.0296              | 70          | 5.214                    | 0.999808                      | 0.000192
4    | 10 inch 10 kpsi gate valve, HIPPS function    | 250          | 0.0040                    | 1        | 0.88692                                | 0.1131              | 20          | 18.250                   | 0.999781                      | 0.000219
5    | 1/2" Test Valve                               | 250          | 0.0040                    | 1        | 0.88692                                | 0.1131              | 20          | 18.250                   | 0.999781                      | 0.000219
6    | 1/2" Vent Valve                               | 250          | 0.0040                    | 1        | 0.88692                                | 0.1131              | 20          | 18.250                   | 0.999781                      | 0.000219
7    | PZT Sensor                                    | 50           | 0.0200                    | 1        | 0.54881                                | 0.4512              | 20          | 18.250                   | 0.998905                      | 0.001095
8    | HIPPS Hydraulic Module                        | 210          | 0.0048                    | 1        | 0.86688                                | 0.1331              | 20          | 18.250                   | 0.999739                      | 0.000261
9    | Check valve                                   | 500          | 0.0020                    | 1        | 0.94176                                | 0.0582              | 20          | 18.250                   | 0.999890                      | 0.000110
10   | HIPPS SEM                                     | 42           | 0.0238                    | 1        | 0.48954                                | 0.5105              | 20          | 18.250                   | 0.998697                      | 0.001303
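The columns of the table above follow from two inputs per item, MTTF and MTTR, plus the 30 year assessment period. A minimal sketch of the arithmetic is given below; the function name `item_metrics` is hypothetical, and a 365 day year is assumed because it reproduces the tabulated repair rates.

```python
import math

DAYS_PER_YEAR = 365.0  # assumption: reproduces the tabulated repair rates


def item_metrics(mttf_years, mttr_days, period_years=30.0):
    """Reliability over the period and steady-state availability for one item."""
    failure_rate = 1.0 / mttf_years               # lambda, per year
    repair_rate = DAYS_PER_YEAR / mttr_days       # mu, per year
    reliability = math.exp(-failure_rate * period_years)
    availability = repair_rate / (failure_rate + repair_rate)
    return {
        "failure_rate": failure_rate,
        "repair_rate": repair_rate,
        "reliability": reliability,
        "unreliability": 1.0 - reliability,
        "availability": availability,
        "unavailability": 1.0 - availability,
    }


# Row with MTTF = 50 years and MTTR = 20 days.
row = item_metrics(50.0, 20.0)
print(round(row["reliability"], 5), round(row["availability"], 6))
```

The expression μ / (λ + μ) is the same quantity as MTTF / (MTTF + MTTR) once both times are in the same units. The table also shows why the two measures should not be confused: an item with only about a 55% chance of surviving 30 years without failure can still be available about 99.9% of the time if it is repaired within 20 days.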

Types of Redundancy

• Classified on how the redundant elements are introduced into the circuit
• Active or Static Redundancy – External components are not required to perform the function of detection, decision and switching when an element or path in the structure fails.
• Standby or Dynamic Redundancy – External elements are required to detect, make a decision and switch to another element or path as a replacement for a failed element or path.
• Generally subsea systems (e.g. umbilicals, the MCS) use active redundancy – hot standby
• As an alternative to redundancy, consider Diversity – using alternative arrangements of a different kind – e.g. the Back-Up Intervention Control system (BUICS) available on Snohvit, in case the umbilical fails


Simple Parallel Redundancy Active - Type 1

In its simplest form, redundancy consists of a simple parallel combination of elements. If any element fails open, identical paths exist through parallel redundant elements.

Bimodal Parallel Redundancy Active - Type 3

(a) Bimodal Parallel/ Series Redundancy

(b) Bimodal Series/ Parallel Redundancy

A series connection of parallel redundant elements provides protection against shorts and opens. Direct short across the network due to a single element shorting is prevented by a redundant element in series. An open across the network is prevented by the parallel element. Network (a) is useful when the primary element failure mode is open. Network (b) is useful when the primary element failure mode is short.


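Series and Parallel Availability Calculations

Series: the combined availability is the product of the element availabilities
(diagram labels: SAP, Umbilical, PUP, Subsea)

  Element 1: Av 90.000%, UnAv 10.000%
  Element 2: Av 80.000%, UnAv 20.000%
  Combined:  Availability 72.000%, UnAvail 28.000%

Parallel: the combined unavailability is the product of the element unavailabilities

  SCM A: Re 90.000%, UnRe 10.000%, MTTF 4.5 years, MTTR 0.5 years
  SCM B: Re 90.000%, UnRe 10.000%, MTTF 4.5 years, MTTR 0.5 years
  SCM A OR SCM B: Re 99.000%, UnRe 1.000%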
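The two product rules on this slide can be sketched in a few lines. The function names are hypothetical, and both rules assume the elements fail independently.

```python
import math


def series_availability(availabilities):
    """All elements must work: multiply the availabilities."""
    return math.prod(availabilities)


def parallel_availability(availabilities):
    """Any one element is enough: multiply the unavailabilities."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


# Series example from the slide: 90% and 80% in series.
print(series_availability([0.90, 0.80]))

# Parallel example from the slide: SCM A OR SCM B, each 90%.
print(parallel_availability([0.90, 0.90]))

# Each SCM figure is consistent with MTTF / (MTTF + MTTR) = 4.5 / (4.5 + 0.5).
print(4.5 / (4.5 + 0.5))
```

The series case gives 72% and the parallel case gives 99%, which is why redundancy is applied to the least available elements of a series chain.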

Maintainability


Maintainability

• Philosophy - preventative, corrective, opportunistic
• Actions to demonstrate function is in good condition
  – In service monitoring, testing and footprinting
  – Corrosion monitoring
  – Noise / vibration monitoring
  – Fluid monitoring, sand detection, SRBs, chlorides, scale
• Repair planning and contingencies, pipeline repair systems, spares stock holding, stand-by or call-off intervention contracts, alternative temporary systems
• Access systems and tooling
• All aim to reduce MTTR
• Reliability Centred Maintenance
• Historic records, Trends, Predictive capability & feed back loops

Maintenance Philosophy

• Subsea: Excess Capacity (typical)
• Subsea: High Redundancy (typical)
  – spare wells
  – valves
  – spare control systems
• Mobilise maintenance when…?


Maintaining the Gorgon Field

Deliverability


Deliverability

• Deliverability = Availability * Capacity
• Useful terms
  – DCQ, Daily Contract Quantity
  – Shortfall, Quantity not supplied
• Security of supply
• Contract shape and style
• Business Risk and Exposure
• Best Programs focus on the issue
• Used to understand and quantify risk, and contract accordingly
• Shapes contract terms, DCQ to rolling 24 hour average quantity
• “It's about the money, stupid”
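As a rough sketch of how the deliverability definition relates to contract terms, the snippet below computes an average deliverability and the resulting shortfall against a DCQ. All numbers and function names are hypothetical, and treating deliverability as a long-run average is a simplifying assumption; it hides the frequency and duration of individual outages.

```python
def deliverability(availability, capacity):
    """Deliverability = Availability * Capacity (same units as capacity)."""
    return availability * capacity


def shortfall(dcq, availability, capacity):
    """Average quantity not supplied against the Daily Contract Quantity."""
    return max(0.0, dcq - deliverability(availability, capacity))


# Hypothetical system: 95% available, capacity of 1000 units per day.
print(deliverability(0.95, 1000.0))      # average deliverable quantity per day
print(shortfall(900.0, 0.95, 1000.0))    # DCQ below average deliverability
print(shortfall(980.0, 0.95, 1000.0))    # DCQ above average deliverability
```

Excess capacity is what allows a system that is less than 100% available to meet its DCQ on average, which connects to the line pack, storage and turn-up measures the deck lists under high deliverability.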

Deliverability

• How to get high deliverability
  – System analysis & engineering
  – Understanding frequency & duration of failures
  – Standard sizes and component rating at no extra cost
  – De-bottlenecking & tuning capacity of system
  – Line pack and storage
  – Ability of downstream to respond to peak turn-up rates
  – Capacity and ullage as pressure drops due to well failure
  – Temporary increase of flow velocity / erosion limits with respect to life
  – N out of M philosophy and sparing insurance
• Operability studies & modelling
• Supply chain models based on “Just In Time” logistics
• Define the value of reliability (Re), availability (Av) and deliverability (De) in relation to the project


Safety Integrity Levels

What is a Safety Integrity Level?

Safety Integrity Level is the required “reliability” of a safety function

Safety Integrity Level | Low demand mode of operation (average probability of failure to perform its design function on demand) | High demand or continuous mode of operation (probability of a dangerous failure per annum)
4                      | ≥ 10^-5 to < 10^-4                                                                                      | ≥ 10^-5 to < 10^-4
3                      | ≥ 10^-4 to < 10^-3                                                                                      | ≥ 10^-4 to < 10^-3
2                      | ≥ 10^-3 to < 10^-2                                                                                      | ≥ 10^-3 to < 10^-2
1                      | ≥ 10^-2 to < 10^-1                                                                                      | ≥ 10^-2 to < 10^-1
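The low demand bands above amount to a simple range lookup. A minimal sketch follows; the function name `sil_for_low_demand` is hypothetical, and the band edges are taken directly from the table.

```python
# (SIL, lower bound inclusive, upper bound exclusive) for average PFD.
LOW_DEMAND_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_for_low_demand(pfd_avg):
    """Return the SIL band containing pfd_avg, or None if it falls outside all bands."""
    for sil, lower, upper in LOW_DEMAND_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    return None


for pfd in (1.1e-2, 5.5e-3, 1.2e-4, 0.5):
    print(pfd, sil_for_low_demand(pfd))
```

A PFD of 0.5 returns None because it is worse than the SIL 1 band, so such a function would not qualify for any SIL.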


PFD



Risk reduction requiring a SIL 4 function should not be implemented. Rather, this should prompt a redistribution of required risk reduction across other measures.

Classic HIPPS Configuration


SIL 3 HIPPS example

[Figure: risk reduction diagram on an axis of increasing risk.]
• Residual risk: 1.87 x 10^-6 pa
• Tolerable risk: 10^-5 pa (acceptable failure rate per DNV)
• Initial risk of high pressure getting past the tree production choke (Pressure Regulating System): 10^0 (once per annum)
• Necessary risk reduction: from the initial risk down to the tolerable risk
• Actual risk reduction: from the initial risk down to the residual risk
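The figures in this example can be turned into risk reduction factors, which makes the margin between necessary and actual risk reduction explicit. This is a sketch of the arithmetic only; the function name is hypothetical.

```python
def risk_reduction_factor(initial_frequency, target_frequency):
    """Ratio by which the event frequency must be (or has been) reduced."""
    return initial_frequency / target_frequency


INITIAL = 1.0        # 10^0, once per annum
TOLERABLE = 1e-5     # per annum
RESIDUAL = 1.87e-6   # per annum

necessary = risk_reduction_factor(INITIAL, TOLERABLE)
actual = risk_reduction_factor(INITIAL, RESIDUAL)

print(f"necessary risk reduction factor: {necessary:,.0f}")
print(f"actual risk reduction factor:    {actual:,.0f}")
print("residual risk is tolerable:", RESIDUAL <= TOLERABLE)
```

The actual reduction is a little over five times the necessary reduction, so the residual risk sits below the tolerable level with some margin.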


Risk Reduction

Layers of Protection: Pressure Protection System for Pipeline

[Figure: risk reduction diagram on an axis of increasing risk.]
• Initial risk of hydrate blockage, and overpressuring the pipeline: 10^0 (once per annum)
• Tolerable risk: 10^-5 (acceptable failure rate per DNV)
• Residual risk: below the tolerable risk
• Necessary risk reduction: from the initial risk down to the tolerable risk
• Actual risk reduction: from the initial risk down to the residual risk, made up of:
  – Partial risk covered by other systems, e.g. manual shutdown, Pipeline Simulator etc.
  – Risk Reduction by Pressure Safety System, SIL 3
  – Risk Reduction by Pressure Regulating System, SIL 2
• Risk reduction achieved by all safety-related systems and external risk reduction facilities


Equipment Failure Rates

Equipment PFDs


PFD as a function of Test Interval

Probability of Failure on Demand: $\mathrm{PFD}_{AVG} = \tfrac{1}{2}\, \lambda\, \tau_i$

[Figure: PFD against time, showing the test interval $\tau_i$, PFDavg and the TIF (Test Independent Failure) level.]

PFD for a simple system

Proof test interval = 1 yr

  Pressure transmitter:   PFD_SE  = 0.44 x 10^-3
  Logic solving element:  PFD_LS  = 7.0 x 10^-3
  Final element:          PFD_FE  = 3.5 x 10^-3
  Safety function:        PFD_AVG = 0.44 x 10^-3 + 7.0 x 10^-3 + 3.5 x 10^-3 = 1.1 x 10^-2 ≡ Safety Integrity Level 1

Change proof test interval to 6 months

  PFD_SE  = 0.22 x 10^-3
  PFD_LS  = 3.5 x 10^-3
  PFD_FE  = 1.75 x 10^-3
  PFD_AVG = 5.5 x 10^-3 ≡ Safety Integrity Level 2
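The effect of the proof test interval in this example can be reproduced from the relation PFD_AVG = ½ λ τ. In the sketch below the failure rates are not given in the slides; they are back-calculated from the 1 year PFD values, which is an assumption, as are the function and dictionary names.

```python
def pfd_avg(failure_rate, test_interval_years):
    """Average PFD of a periodically proof-tested element: 0.5 * lambda * tau."""
    return 0.5 * failure_rate * test_interval_years


def failure_rate_from_pfd(pfd, test_interval_years):
    """Invert the relation above to recover the implied failure rate."""
    return 2.0 * pfd / test_interval_years


# Element PFDs quoted for a 1 year proof test interval.
ANNUAL_PFD = {
    "pressure transmitter": 0.44e-3,
    "logic solver": 7.0e-3,
    "final element": 3.5e-3,
}


def function_pfd(test_interval_years):
    """PFD of the whole safety function: the sum of the element PFDs."""
    total = 0.0
    for pfd_one_year in ANNUAL_PFD.values():
        rate = failure_rate_from_pfd(pfd_one_year, 1.0)
        total += pfd_avg(rate, test_interval_years)
    return total


print(function_pfd(1.0))   # annual testing
print(function_pfd(0.5))   # six-monthly testing
```

Halving the test interval halves each element's PFD, which moves the function from about 1.1 x 10^-2 to about 5.5 x 10^-3 and across the boundary between the SIL 1 and SIL 2 bands. Summing element PFDs is itself an approximation that holds when each PFD is small.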


Layered Protection System

[Figure labels: Subsea Control Module, Dump Valve, Subsea Electronics Module, Gas Plant DCS PPS card]

• Single layer: PFD_AVG = 1.1 x 10^-2 ≡ Safety Integrity Level 1 (annual testing)
• Dual layers: PFD_AVG = (1.1 x 10^-2) x (1.1 x 10^-2) = 1.2 x 10^-4 ≡ “Safety Integrity Level 3” (annual testing, assuming no common mode failure)
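The dual layer figure is the product of the two single layer PFDs, which is only valid if the layers fail independently. A minimal sketch under that assumption follows; the function name is hypothetical.

```python
import math


def combined_pfd(layer_pfds):
    """PFD of independent protection layers: all layers must fail on demand."""
    return math.prod(layer_pfds)


single = 1.1e-2
dual = combined_pfd([single, single])

print(f"single layer PFD: {single:.2e}, risk reduction factor {1 / single:,.0f}")
print(f"dual layer PFD:   {dual:.2e}, risk reduction factor {1 / dual:,.0f}")
print("dual layer within the SIL 3 band:", 1e-4 <= dual < 1e-3)
```

Any common mode failure shared by the two layers would make the real combined PFD worse than this product, which is presumably why the slide places “Safety Integrity Level 3” in quotation marks.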

Conclusion


The cost of failure - BP experience

These are the direct costs only. Foinaven also incurred:
• FPSO demurrage charges
• NPV of production (20% * 80,000 bbl/d * 300 days * 25 USD/bbl) = 120 MUSD
• Share value erosion and significantly lower dividends for period
• Loss of public / shareholder confidence in BP's ability to manage technology
• Reputation damage
• Tangible losses > 250 MUSD, measurable losses at least the same again
• Changed BP contracting philosophy, EPC to EPCM Managed Engineering
• Schiehallion SCMs were run at a single high pressure, but DCV pilots were not requalified and were subsequently overstressed and leaked.

The BP Bathtub Curve


Value of Performance An interesting echo from the 1970’s

or SAFETY
