When it all goes wrong… Who you’re gonna call – the DR Team!
Clive Longbottom, Service Director, Quocirca Ltd
Different Strokes
• Individual disaster – My PC/laptop/smartphone doesn’t work
  – Swap out
  – Image management
  – Information restore
• Corporate item disaster – The server/network switch/disk drive isn’t working
  – Hot swap where possible
  – Cold swap where necessary
• Large-scale disaster – We had a data centre there, once
  – Better have a plan, then
© 2008 Quocirca Ltd
Which is a bigger disaster?
• This VoIP handset isn’t working
• This server isn’t working
• The electricity to this building isn’t working
It depends on the context:
• The VoIP handset belongs to the CEO
• The server is a print server for a group with access to another print server
• The building was decommissioned anyway
What the business expects…
Disaster Happens → IT to the Rescue! → Business Resumes
What IT hopes for…
Disaster Happens → A Miracle Happens → Business Resumes
A Common Approach…
Or…
• Relocation plan
• eCommerce disaster plan
• IT disaster plan
• Business disaster plan
• Physical disaster plan
• Emergency services disaster plan
Chaos
A More Considered Approach
Disaster Happens
• DR Plan is initiated
• Fall back plan takes over
Disaster is evaluated
• Scale and Scope of disaster is defined
• Approach to remediation – time and cost
Business is advised
• Decisions are made against ongoing information
• Fall back and DR plans are synchronised
What should be in the DR plan?
• The immediate issues
  – Gaining a modicum of capability
  – Who does what, where and how?
  – Ongoing steps
• The short term
  – The impact on the business
  – Prioritising need to bring capabilities back on line
  – Time to gain access to capabilities
• The long term
  – Re-synchronising live and historic data
  – Learning from experience
  – Revisiting the DR plan
Creating a DR plan
• What is likely to happen?
• What has a reasonable probability of happening?
• What can possibly happen?
• Create a list of priorities
  – By likely incidence
  – By cost to the business of the disaster happening
  – By time and cost to resume capability
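One way the priority list above might be sketched in Python. The `Scenario` fields, the weighting formula and all figures are assumptions made for illustration; the slides do not prescribe how the three criteria should be combined.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """A disaster scenario with rough estimates (all fields are hypothetical)."""
    name: str
    likelihood_per_year: float     # likely incidence: expected events per year
    business_cost_per_hour: float  # cost to the business while capability is lost
    hours_to_resume: float         # time to resume basic capability
    recovery_cost: float           # cost to resume capability

    def expected_annual_exposure(self) -> float:
        # Assumed weighting: likelihood x (downtime cost + recovery cost).
        downtime_cost = self.business_cost_per_hour * self.hours_to_resume
        return self.likelihood_per_year * (downtime_cost + self.recovery_cost)


def prioritise(scenarios):
    """Order scenarios from highest to lowest expected annual exposure."""
    return sorted(scenarios, key=lambda s: s.expected_annual_exposure(), reverse=True)


# Made-up figures, for illustration only.
scenarios = [
    Scenario("Laptop failure", 20.0, 50.0, 4.0, 200.0),
    Scenario("Core switch failure", 0.5, 5000.0, 8.0, 10000.0),
    Scenario("Data centre loss", 0.01, 20000.0, 72.0, 500000.0),
]

ranked = prioritise(scenarios)
for s in ranked:
    print(f"{s.name}: {s.expected_annual_exposure():,.0f}")
```

With these invented numbers the core switch outranks the data centre loss, because it is far more likely even though each incident costs less. A different weighting would give a different order, which is why the estimates need to come from the business rather than from IT alone.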
But what about the BC plan?
• All should be seen in the light of the BC plan
  – What is likely to happen has to be covered in the BC plan
  – What is highly probable is more likely to be covered by the BC plan
  – Any holes in the BC plan have to be covered by the DR plan
Timeliness
• How much is the disaster costing us per hour?
• (Cost to business)
• How much would it cost to get back to capability:
  – In an hour?
  – In four working hours?
  – In a working day?
  – In a working week?
• (Cost of recovery by time)
Back to the See-Saw
• It’s all down to equations…
• Where Σ(Cost to business) becomes ≥ (Cost of recovery by time), that is the solution
Or…
[Chart: two series, “Cost to Business” and “Cost to Recovery”, with three annotated regions: “You’re losing money!”, “DR costs too much” and “This is the DR sweet spot”]
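The see-saw comparison can be sketched in Python as follows. This is a minimal illustration, assuming a constant cost to the business per hour and four recovery options matching the Timeliness slide; the `RecoveryOption` type, the `sweet_spot` function and every figure are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryOption:
    label: str
    hours: float          # time to get back to capability
    recovery_cost: float  # cost of recovering within that time


def sweet_spot(cost_per_hour: float, options) -> Optional[RecoveryOption]:
    """Return the fastest option whose recovery cost is already matched or
    exceeded by the accumulated cost to the business at that point in time."""
    for option in sorted(options, key=lambda o: o.hours):
        accumulated_loss = cost_per_hour * option.hours
        if accumulated_loss >= option.recovery_cost:
            return option
    return None  # recovery costs more than the outage at every listed time


# Made-up figures: faster recovery is assumed to cost more.
options = [
    RecoveryOption("one hour", 1, 50000),
    RecoveryOption("four working hours", 4, 20000),
    RecoveryOption("working day", 8, 12000),
    RecoveryOption("working week", 40, 5000),
]

choice = sweet_spot(2000, options)
print(choice.label if choice else "no option pays for itself")
```

Options faster than the returned one fall in the “DR costs too much” region; waiting longer than the returned one means the business is losing more than the recovery would have cost.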
DR Role Playing
• Let’s have fun…
  – A DR plan can only be created if effort is put in
  – Basic paper/web research will only get so far
  – Role playing and external feeds enable focus
    • BC is an essential external feed
  – DR planning needs time
Putting the team to work
• The DR team needs to be able to think outside of the box – it needs some “plants” in it – at least to start with
  – Put them in a room
  – Feed them with disaster scenarios
  – Let them come up with recovery scenarios
  – Evaluate viability
  – Cost out
Diversion
• Not that sort of “plant”
• Belbin profiling provides people with a sense of their strong and weak points
• A “plant” is an ideas person
  – Great at thinking out of the box
  – Great at visualisation
  – Weak at full realities
  – Hopeless at crossing “t”s and dotting “i”s
Plan B
• Any DR “Plan B” has to be technology light
  – It’s likely that it will be the IT that is the problem
  – No problem with using manual systems
    • Manned call centres
    • Paper-based systems
    • Faxing, telephone calls through to suppliers/customers
• Keep it simple, but keep it working!
Technical areas
• Basic platform
  – Network
    • Redundancy – LAN/WAN
  – Servers
    • Full stacks
  – Access devices
    • PCs, laptops, smartphones, PDAs, specialist devices
Supplier Agreements
• Service Value Management
  – It’s only a small problem, it should be “cheap”
  – It’s a bigger problem, more expensive
  – Defined by time, possible alternative kit, etc…
Continued…
• Applications
  – Latest images
    • Fully patched
  – Interdependencies
    • If this breaks, what else breaks?
• Data
  – Where is it?
  – How current is it?
  – How rapidly can it be restored?
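The interdependency question, “if this breaks, what else breaks?”, can be answered mechanically once dependencies are written down. The sketch below is one possible approach in Python; the `impacted_by` function and the component names in the map are invented for illustration.

```python
from collections import deque


def impacted_by(failed, depends_on):
    """Return every component that directly or indirectly depends on `failed`.

    `depends_on` maps each component to the components it requires.
    """
    # Invert the map: for each component, who depends on it?
    dependants = {}
    for component, requirements in depends_on.items():
        for requirement in requirements:
            dependants.setdefault(requirement, set()).add(component)

    # Breadth-first walk outwards from the failed component.
    broken = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependant in dependants.get(current, ()):
            if dependant not in broken:
                broken.add(dependant)
                queue.append(dependant)
    broken.discard(failed)  # guards against circular dependencies
    return broken


# Hypothetical dependency map.
depends_on = {
    "web shop": ["app server"],
    "app server": ["database", "SAN"],
    "database": ["SAN"],
    "reporting": ["database"],
    "email": ["mail server"],
}

print(sorted(impacted_by("SAN", depends_on)))
```

The value of such a map depends on keeping it current, in the same way that application images have to be kept fully patched.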
Maximising data restore capability
• Always centralise storage wherever possible
  – Use shared folders and synchronisation
  – Use vaulting for email
  – Use SAN and NAS technologies
  – Use storage virtualisation
  – Use virtual tape to optimise backup windows
  – Ensure that full images/incremental backups are managed and stored correctly
  – Check at least one tape in five!
Data synchronisation
• Where were we? Where are we?
  – What happened in between?
    • Were any transactions left in limbo?
    • Can we recover these?
    • Can we identify what is not recoverable for manual intervention?
• Pulling it all together
  – Last in wins
  – Last known good
  – Polling
• It’s risk management again…
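As an illustration of the “last in wins” option, the following Python sketch merges records from a primary system with records captured on a fall-back system during the outage. The record layout (a key mapped to a timestamp and a value), the `last_in_wins` function and the sample data are all assumptions; records that differ between the two sides are also listed so they can be reviewed manually.

```python
from datetime import datetime


def last_in_wins(primary, fallback):
    """Merge two record sets, keeping the most recently written version.

    Each argument maps a record key to a (timestamp, value) tuple.
    Returns the merged records plus the keys whose values differed.
    """
    merged = {}
    conflicts = []
    for key in primary.keys() | fallback.keys():
        a = primary.get(key)
        b = fallback.get(key)
        if a is None or b is None:
            # Present on one side only: nothing to reconcile.
            merged[key] = a if a is not None else b
            continue
        if a[1] != b[1]:
            conflicts.append(key)
        # On a timestamp tie the primary copy is kept (an arbitrary choice).
        merged[key] = a if a[0] >= b[0] else b
    return merged, sorted(conflicts)


# Hypothetical data: order-2 was updated on the fall-back system during the outage.
primary = {
    "order-1": (datetime(2008, 5, 1, 9, 0), "paid"),
    "order-2": (datetime(2008, 5, 1, 9, 5), "pending"),
}
fallback = {
    "order-2": (datetime(2008, 5, 1, 10, 30), "shipped"),
    "order-3": (datetime(2008, 5, 1, 10, 45), "new"),
}

merged, conflicts = last_in_wins(primary, fallback)
print(merged["order-2"][1], conflicts)
```

Last in wins is simple but silently discards the older write, so the conflict list is where the risk management decision has to be made.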
Conclusions
• DR is the safety net for the organisation
• Where BC is too expensive, DR is the answer
• Where BC fails, DR has to take over
• The aim is “time to basic capability”, not “time to equal capability”
• Plans must be kept up to date and tested regularly