Seven strategies for keeping disaster recovery ON TARGET
- 12 May, 2008 08:14
It was a normal Monday batch process at a well-respected global bank - until, that is, a critical back-office system failed. At first, IT administrators took it in stride. This wasn't the first time they'd had to recover lost data. But soon it became clear something more ominous was occurring: the bank's multi-terabyte database had become corrupted.
The administrators tried to switch to the hot offsite backup. No luck: it had mirrored the corruption. In the IT world, the situation was beginning to spell 'crisis'. Applications teams and anyone else who could help had to suspend all priorities to focus on the failure. Despite best efforts, the target recovery time - four hours - came and went without a clue as to the problem's root cause or fix.
It began to look like an episode of 'House', with IT managers anxiously brainstorming for more than a day, trying to diagnose the mysterious disorder in their dying patient. They knew a premature move could make matters worse.
To the outside world, the bank showed no sign of its grave condition. Customers continued trading, unaware that this high-profile institution was on the verge of losing millions, being investigated by regulators, and spoiling its good reputation.
Out of view of customers, the IT teams struggled to keep the patient alive. They scrambled to find a clean backup. They found out the corruption had happened two days before the crash; it would take 36 hours to run a check on earlier copies of the data to see whether they were clean. They worked on updating the production system, rerunning transaction log files to catch up to the crash point, and processing days of transactions that had since accumulated. Senior managers burned the midnight oil to decide which processes to give priority.
By the end of the day on Friday, the bank was uncertain it could open for business on Monday. It might be too risky to go more than five days without accurate settlement reconciliation. The bank alerted regulators. The team plugged away on catch-up processing over the weekend. Fortunately, they completed it in time. By Monday the patient was out of danger and the bank was able to open its doors.
A Matter Of When, Not If
This bank is not alone. Indeed, similar near misses are increasingly common. One global retailer had its point-of-sale transactions freeze for 18 hours during the holiday shopping season. The cause: a storage-network software bug that was never precisely identified. Despite the happy ending at the global bank, its senior managers and IT teams were left troubled. Losses had been modest, but had the failure struck at year-end instead - when trading was running at full tilt as investors tidied their portfolios - the outcome could have been disastrous.
It turned out that a tiny conflict between a bug in packaged software and the server-management software - something nobody could have foreseen - had caused the potentially monumental disaster. This was a problem not addressed in any standard operations manual. For the bank's leadership, the unsettling truth was this: despite the bank's full compliance with internal policies and external regulations, despite its readiness for loss of a site or failure of a major hardware component, it remained ill-prepared for disaster recovery.
The bank is one of many enterprises and public institutions for which a combination of complacency, complexity and strained legacy systems is raising the risk of IT disasters to an alarming level. This is despite the fact that in recent years, disaster recovery and business continuity have gained visibility and significant funding. Improved as practices are, they are no longer enough.
Many large businesses are now so dependent on the flawless operation of their systems that they are dangerously vulnerable to substantial, even irreparable, business damage. The likelihood of disaster is becoming more a matter of when than if.