Seven strategies for keeping disaster recovery ON TARGET
- 12 May, 2008 08:14
It was a normal Monday batch process at a well-respected global bank - until, that is, a critical back-office system failed. At first, IT administrators took it in stride. This wasn't the first time they'd had to recover lost data. But soon it became clear something more ominous was occurring: the bank's multi-terabyte database had become corrupted.
The administrators tried to switch to the hot offsite backup. No luck: it had mirrored the corruption. In the IT world, the situation was beginning to spell 'crisis'. Applications teams and anyone else who could help had to suspend all priorities to focus on the failure. Despite best efforts, the target recovery time - four hours - came and went without a clue as to the problem's root cause or fix.
It began to look like an episode of 'House', with IT managers anxiously brainstorming for more than a day, trying to diagnose the mysterious disorder in their dying patient. They knew a premature move could make matters worse.
To the outside world, the bank showed no sign of its grave condition. Customers continued trading, unaware that this high-profile institution was on the verge of losing millions, being investigated by regulators, and spoiling its good reputation.
Out of view of customers, the IT teams struggled to keep the patient alive. They scrambled to find a clean backup. They found out the corruption had happened two days before the crash; it would take 36 hours to run a check on earlier copies of the data to see if they were clean. They worked on updating the production system, rerunning transaction log files to catch up to the crash point, and processing the days of transactions that had since accumulated. Senior managers burned the midnight oil to decide which processes to give priority. By the end of the day on Friday, the bank was uncertain it could open for business on Monday. It might be too risky to go more than five days without accurate settlement reconciliation. The bank alerted regulators. The team plugged away on catch-up processing over the weekend. Fortunately, they completed it in time. By Monday the patient was out of danger and the bank was able to open its doors.
A Matter Of When, Not If
This bank is not alone. Indeed, similar near misses are increasingly common. One global retailer had its point-of-sale transactions freeze for 18 hours during the holiday shopping season. The cause: a storage-network software bug that was never precisely identified. Despite the happy ending at the global bank, its senior managers and IT teams were left troubled. Losses had been modest, but had the failure struck at year-end instead - when trading was running at full tilt as investors tidied their portfolios - the outcome could have been disastrous.
It turned out that a tiny conflict between a bug in a packaged software product and the server-management software - something nobody could have foreseen - had caused the potentially monumental disaster. This was a problem not addressed in any standard operations manual. For the bank's leadership, the unsettling truth was this: despite the bank's full compliance with internal policies and external regulations, despite its readiness for loss of a site or failure of a major hardware component, it remained ill-prepared for disaster recovery.
The bank is one of many enterprises and public institutions for which a combination of complacency, complexity and strained legacy systems is raising the risk of IT disasters to an alarming level. This is despite the fact that in recent years, disaster recovery and business continuity have gained visibility and significant funding. Improved as practices are, they are no longer enough.
Many large businesses are now so dependent on the flawless operation of their systems that they are dangerously vulnerable to substantial, even irreparable, business damage. The likelihood of disaster is becoming more a matter of when than if.