Because companies can't immediately see the ROI of an IT/business continuity plan, too many put developing one on the back burner. But when a disaster hits home, planning for the possibility of IT destruction suddenly seems anything but wasteful.
Until recently, most organisations could accommodate planned system outages for maintenance or rollout of new software. Unplanned outages, while potentially calamitous, could usually at least be concealed from the public gaze. But in the high-velocity, high-pressure e-business world, users demand access to sites 24 hours a day, seven days a week. Downtime is highly visible, and customers won't wait around for downed sites to come back up - most will just flit to the competition.
Consider eBay. When the US online auctioneer's site went down for 22 hours in June 1999, it lost between $3 and $5 million in revenues and watched its stock plummet 26 per cent. Or AOL. When it went down for 24 hours in June 1996, it had to cough up $3 million in customer rebates.
Yet in Australia, it seems, you'd never know just how much the stakes have been raised. According to a recently published survey, Managing for Failure, by Macquarie Graduate School of Management researchers Professor Ernest Jordan and David Musson, most Australian organisations daily risk their businesses, not to mention the jobs and well-being of staff, while doing nothing to avert the problems.
It's nearly impossible to return to operation within 24 hours after a disaster without an agreed and tested business continuity plan (BCP). Most Australian companies have no such plan, tested or untested, for either their business or IT operations. "You insure your house, your car and your life, so what about your business?" Resume IT business continuity management consultant Phillip Kinch told Citec and Sun Microsystems clients in May. "Business continuity is the insurance policy for your business."
Emergencies such as fire, flood, malicious damage and failures of gas, electricity or water can close down an organisation for days, weeks or even months. Most organisations are so heavily dependent upon their IT that the loss of their IT services would be a disaster.
Daewoo Electronics in Sydney escaped by the skin of its teeth when much of its computer equipment was damaged beyond repair after a malicious attack in January 1999. If not for the skills of the recovery engineers who sprang to its rescue, the company could have been forced out of business. Even with their help the disaster led to an insurance claim of more than $2 million after backup tapes (which had been kept on site) were also destroyed.
In the light of such incidents, the failure of business to take business continuity planning seriously is remarkable. It is especially remarkable given that 50 per cent of those organisations with significant computer configurations had experienced unexpected stoppages in the past two years. Nonetheless, 91 per cent had no tested plans to deal with any disruption of their operations.
"It is nearly impossible to return to operation within 24 hours after a major disaster without an agreed and tested plan, so most Australian organisations would be seriously or terminally affected by such a stoppage," the Macquarie researchers say.
Even more remarkable is that in 1997 many respondents told the same researchers they had no contingency plans because they had never had any operational problems. Yet in 1999 half of all respondents said they had to stop or reduce operations for more than four hours during the previous two years due to an unforeseen event. Responding to the same survey, most Australian organisations cited 24 hours as the longest tolerable interruption to their operations; 30 per cent said interruptions of up to eight hours would be intolerable.
In addition, many organisations are required to plan for continuity as part of their audit, legal and shareholder obligations. The only way to achieve assurance in the face of such threats is to develop a solution based on a good business continuity plan.
"Companies unwilling to spend money on continuity services might view the situation in a different light if they considered the risk to their revenues," wrote Bruce Hoard in US CIO magazine recently. Hoard has reported on high technology and its implications for more than 20 years. "These companies should ask themselves if it makes sense to spend one per cent of their revenues to ensure the safety of the other 99 per cent. When the question is phrased that way, it is hard to come up with 'no' as an answer."
Kinch believes that too many organisations still consider business continuity too difficult and expensive to achieve and therefore assign it a low business priority.
"Executive management acknowledges the importance of having such plans, but most consider the problem to be an IT issue and fail to relate it to the overall business strategy," he says. However, not only can business continuity plans be cost-effective, he says, they can also improve existing service delivery to an organisation's clients and demonstrate an ongoing commitment to the welfare of clients and staff.
Also seeking an explanation for the apparent apathy of Australian business, the Macquarie University report posits that Y2K may have swallowed funds that might otherwise have been available for more general contingency planning. That kind of thinking makes no sense at all to Citec general manager Greg McCallum.
"People say [business continuity planning] costs a lot," McCallum says. "The answer to that is if you want to stay in business you've got to pay for your losses. Do you want to pay for them in advance, say by insurance or business continuity programs; or do you want to pay for them in one big hit when they emerge - if you can afford it?
"And while the IT&T element of it does cost, technology is advancing all the time. You can have interoperable disk farms these days in separate physical environments, which just wasn't possible all that long ago. So there's lots of ways you can reduce the IT&T element of a business continuity program by clever use of the technology that's available," he says.
But McCallum points out that it's a fallacy to think that business continuity is only about disaster recovery in the event of computer failure. "It's more about creating opportunities and avoiding problems, and solving problems. And, in fact, the whole disaster recovery thing is only one element. You get up to a dozen elements in a proper business continuity program."
As much as technology issues, both business and computer continuity are management and cultural issues. Computer contingency planning won't in itself stop Sydney-based businesses coming to a standstill during the Sydney Olympics. And as McCallum points out, BCP that focused entirely on IT&T would do nothing to discourage people from opening mystery e-mails called "I Love You" or "Joke". People regard the e-mail system culturally as sort of a gossip network, and tend to be totally undiscriminating about what they open and what they send.
"That's an issue that won't be solved overnight and you can't prevent that just by technology. Technology is not the number-one way to stop that sort of problem," McCallum says. "When top management and the boardroom level park BCP in their minds as being about technology and IT&T, they're committing a fundamental error."
Paradoxically, however, the CIO is often precisely the right man or woman for the job. McCallum finds in organisations where BCP is well supported it's usually because the CIO or the person in charge of finance and IT is leading it. They are the ones who have raised BCP to a business issue and educated top management. "If the organisation is reactive, the CIO is best placed of all to educate upwards and raise it up to a business level - in other words, to make sure people understand it is costly but it's a cost of doing business," he says.
But while the buck should stop with the CIO, the focus must be much wider than just the IT side of continuity management or worst-case scenario planning. The needs of the entire organisation must be taken into account, and it may well be up to the CIO to get the board on side.
Ultimately, the objective must be to balance the exposure to risk against the treatment of that risk. Every organisation should invest in risk control focused both on preparation for crisis management and planning for the recovery of business operations.
Assessing a Risk
Continuity planning has evolved considerably since its inception in the 1980s when disaster recovery planning was the focus and the glasshouse the centre of activity. Disaster recovery planning became business recovery planning (BRP) once organisations realised that not only the systems themselves, but also every business area reliant on those systems had to be capable of rapid recovery in the event of a disaster.
Nowadays the aim is to minimise the need for any recovery effort at all in the event of a disaster. Business continuity, Kinch says, ensures the continuation of your business after a disastrous event. The objective is to minimise the inevitable disruption after such an event.
"Business continuity management is about identifying, assessing and treating the risks that can disrupt key business functions. It requires a proactive approach and is an ongoing process."
Elements of an effective BCP include a business impact analysis, an assessment of identified risks, determinations of appropriate risk management strategies, creation of an effective recovery plan, and regular testing of the plan.
Yet 45 per cent of Australian businesses and 60 per cent of local councils have no contingency plans at all, not even undocumented or draft ones, for their business operations. Moreover, 12 per cent of Australian businesses and 39 per cent of councils have no contingency plans at all for either IT or business operations.
And while 13.6 per cent of businesses do have fully documented and authorised plans for both their entire business operations and IT, 36 per cent of those plans have never been tested. That leaves less than 9 per cent of businesses with authorised and tested plans for their entire operations.
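The arithmetic behind that last figure can be checked in a few lines. The Python sketch below is illustrative only: the variable names are ours rather than the survey's, and it assumes the 36 per cent untested share applies to the 13.6 per cent of businesses with fully authorised plans.

```python
# Illustrative check of the survey arithmetic; variable names are hypothetical.
authorised_pct = 13.6   # businesses with fully documented, authorised plans
untested_share = 0.36   # fraction of those plans that have never been tested

# Share of all businesses whose authorised plans have also been tested.
tested_pct = authorised_pct * (1 - untested_share)

# Everyone else has no plan that is both authorised and tested.
without_tested_pct = 100 - tested_pct

print(f"Authorised and tested: {tested_pct:.1f} per cent")
print(f"No authorised, tested plan: {without_tested_pct:.1f} per cent")
```

On those assumptions roughly 8.7 per cent of businesses have authorised and tested plans, which leaves about 91 per cent without one.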
More than 50 per cent either couldn't say whether they could recreate their data in an acceptable period in the event of a major disaster loss, or they knew all too well it couldn't be done. In addition, 40 per cent of councils and almost a quarter of businesses have no arrangements for obtaining replacement equipment or services, should their current IT equipment become unusable.
BCP experts say the key is to invest heavily in up-front analysis and then to try to manage those risks as efficiently as possible. That means accepting that there is no zero-risk environment and that BCP may mean having to eliminate as much risk as possible with the resources available.
Before developing a plan, a good starting point is to consider the impact extended downtime would have on the organisation. According to one US study, up to 20 per cent of Fortune 500 companies could be put out of business by a 48-hour system or network outage, and it takes an average of four to five days to recover from such an outage.
Mobile messaging company Pocketmail helps insure its business through a disaster recovery site and live snap-mirroring of the operations centre. Chief technology officer David Shearer says Pocketmail puts a premium on delivering reliable services to its corporate clients and is increasingly trying to differentiate itself with non-corporate customers by offering 24x7 availability.
"Having at least a complete operating backup site means we can minimise any disruption if we do have a problem in our operation centre," he says. And Shearer says that premium also extends to being able to provide high levels of security to clients. "For Web-based access for people using the service it's all secure sockets connection. I guess we implement firewalls and protection schemes that we think give us a high level of security against external attacks, which again might affect the delivery of service that we are trying to provide."
BCP is also about recognising and challenging dangerous assumptions. McCallum says too many organisations that outsource their computer environment assume the outsourcer has the entire disaster recovery issue covered.
"As an outsourcer, I'd be putting our business continuity at risk if I had to include in that an unspecified and uncosted and open-ended commitment for business continuity [for clients]," says McCallum. "It's an important issue for people when they are outsourcing not to assume that they're covered because they aren't unless it's explicitly negotiated with an outsourcer."
McCallum says the issue highlights how important it is for BCP practitioners to know that they don't know everything. "For example, Citec - and I'm sure other outsourcers are like this as well - doesn't necessarily try to do it all, although we're offering a service.
"We have a group of business analysts with whom we join, and they understand all the strategy and management alignment issues. They understand the total environment, and they're really good people to get in if you really want to pick up what it is you don't know," he says. "What we have found is that clients we believe have a very solid understanding of business continuity have usually been to one of these people."
As organisations turn increasingly to electronic supply chains to help manage supplier relationships in order to minimise both inventories and costs, business continuity becomes a far slipperier issue. The danger is that should one supplier network go down, others could also get dragged down in a kind of domino effect. That makes an end-to-end architecture look increasingly desirable.
And there are other ways external sources of supply can prove the weak point in the supply chain. Infrastructure and emergency service agencies are critical to business continuity. The BCP should consider a wide range of potential scenarios covering the risks to those agencies as well.
Look at ways to reduce the risk. Even simple things like smoke detectors, sprinkler systems and devices that prevent car entry into the building can reduce business risk.
Experts say companies should also exploit the efficiencies to be gained from bringing BCP under the same umbrella as other risk management activities like security, fraud and occupational health and safety risk planning.
And when it comes to computer contingency, caution should be exercised in relying too heavily on the ability of backups to get you out of trouble. The Macquarie study found 21 per cent of businesses and 14 per cent of councils had experienced failures in backup restoration. The main reason was media failure, underlining the need for proper handling and storage of magnetic and other media.
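One way to avoid discovering a bad backup in the middle of a crisis is to restore it periodically and compare the result with checksums recorded when the backup was made. The Python sketch below is a minimal illustration under assumptions of our own: the manifest format (file name mapped to a SHA-256 digest) and the function names sha256_of and verify_restore are hypothetical, and do not come from the Macquarie study or from any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(manifest: dict[str, str], restore_dir: Path) -> list[str]:
    """Return names of files that are missing or differ from the manifest.

    `manifest` maps each file name to the digest recorded at backup time
    (a hypothetical format chosen for this example).
    """
    failures = []
    for name, expected_digest in manifest.items():
        restored = restore_dir / name
        if not restored.is_file() or sha256_of(restored) != expected_digest:
            failures.append(name)
    return failures
```

An empty list from verify_restore means every file in the manifest was restored intact. Anything else names the files to investigate before the backup is needed in earnest.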
In the IT area, a good first step is to gauge the availability of your applications and the extent of your exposure, and then use that insight to brainstorm an availability architecture. Hoard says IT managers should ideally identify the top three to five business processes and develop an architecture that provides the necessary capacity, access, performance and availability, given their company's cost constraints. And he warns the first two steps should be done as quickly as possible because the Internet economy will not tolerate "analysis paralysis".
Twelve steps to recovery
1. Enlist the cooperation of upper management
2. Seek help from qualified experts
3. Conduct a business impact analysis to identify key business functions and IT resources
4. Assess the risk of particular disasters based on company profile and location
5. Devise a detailed, flexible plan that outlines staff responsibilities
6. Select a company- or vendor-based recovery option (that is, a redundant system, hot site, mobile data unit or quick shipment solution)
7. Cover all IT resources, including telecommunications networks and LANs
8. Select IT equipment vendors that can provide prompt service
9. Maintain updated vendor information
10. Test your plan at least once a year
11. Don't underestimate IT needs: maintain a strong technical support staff and plan to replace lost equipment with more powerful equipment
12. Structure the workload to address top priorities first
Source: Research Company of America