Despite an increase in natural disasters in Australia over the last few years, disaster recovery (DR) planning still does not rank highly on the agenda of most organisations.
Most organisations view DR as an ‘expensive insurance policy’ and invest solely in protecting their mission-critical systems and applications. However, when a disaster or outage strikes, organisations often struggle to recover the systems and applications they did not initially plan for, resulting in a loss of information and valuable revenue.
A 2011 IDC survey of Asia-Pacific markets showed that in the event of a disaster, most corporations would be left with less than half of their systems running. According to the IDC figures, fewer than a third of the organisations interviewed would be capable of restoring more than 50 per cent of their applications.
Having a DR plan in place simply means that your organisation will be back up and running with minimal disruption to your customers, staff and partners.
So what are the key considerations of a DR plan?
Elements of a good DR plan
A DR plan is a documented set of processes to help your organisation minimise disruption to business services in the event of an outage. The plan should include detailed procedures to be followed before, during and after a disaster. Its purpose is to ensure a certain level of stability and systematic recovery after a disaster.
The DR plan should detail what employees need to do in the event of a disaster, what communication between employees is required, and the time frame within which critical IT services need to be reinstated. It is also a good idea to include a description of key roles and responsibilities, so that anyone assigned to a particular role understands what is required of them when an outage occurs. There must also be a procedure for maintaining and updating the DR plan to reflect any significant internal, external or systems changes in the organisation.
Effective DR planning is about constant interaction and communication among different stakeholders and employees within the organisation. Failure to communicate will only slow down the recovery process.
Threats and outcomes
Threats can be natural, technical or human in origin, and any of them can lead to a disaster. To develop a realistic DR plan, it pays to review all of these threats and the impact an outage can have on your business services.
You will need to carry out a risk analysis and a business impact analysis that cover the full range of potential disasters your organisation might face. For each one, look at how you would respond from a day-to-day standpoint and what long-term consequences it might have for your organisation. Each potential disaster should be ranked and analysed to determine the possible impact and outcome associated with that scenario. This will give you a framework of issues that need to be covered in your DR plan.
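As a rough illustration of the ranking step, the sketch below scores each threat as likelihood multiplied by impact and sorts the results. The 1 to 5 scales, the scoring formula and the example threats are all assumptions made for illustration; your own risk analysis may weigh things quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    category: str    # "natural", "technical" or "human"
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # A simple likelihood x impact score; one common convention,
        # not the only way to rank risks.
        return self.likelihood * self.impact


def rank_threats(threats):
    """Return the threats ordered from highest to lowest score."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


# Hypothetical entries; replace with the output of your own analysis.
EXAMPLE_THREATS = [
    Threat("Flood at primary site", "natural", likelihood=2, impact=5),
    Threat("Storage hardware failure", "technical", likelihood=3, impact=4),
    Threat("Accidental file deletion", "human", likelihood=4, impact=2),
]

if __name__ == "__main__":
    for threat in rank_threats(EXAMPLE_THREATS):
        print(f"{threat.score:>2}  {threat.category:<9}  {threat.name}")
```

The ranked list is only a starting point: a low-scoring threat may still deserve a place in the plan if its long-term consequences are severe.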
Remember, DR plans vary from organisation to organisation based on the company’s size, location and industry.
Traditional DR versus cloud-based DR
Many organisations have been slow to embrace proper DR planning because traditional DR solutions struggle to balance cost, performance and risk. Most are too expensive, and cheaper alternatives such as traditional tape backup provide inadequate protection.
The popularity of cloud computing has brought about a whole range of cloud-hosted DR solutions, known more commonly as DR in the cloud or recovery-as-a-service (RaaS). Instead of requiring one-for-one disaster recovery infrastructure to protect your production data centre, DR in the cloud allows you to protect many physical production servers with low-cost virtual infrastructure. This allows an organisation to minimise downtime and recover more quickly after disaster strikes.
The most obvious advantage of having your DR solution in the cloud is that it is more cost-effective than traditional DR solutions. This makes cloud-based DR a much more viable alternative, especially for small to medium organisations where IT budgets are limited.
DR in the cloud also enables an organisation to trust its DR plans by removing risk and achieving better predictability. It does this by allowing an organisation to conduct simple and frequent testing without affecting business services. This ensures that in the event of a disaster, business services will come online as expected.
Coping with data loss
While there are numerous benefits to cloud-based DR solutions, it’s important that your protection and recovery strategy deals with the different potential data loss scenarios. These scenarios include deleted virtual machines, internal virtual disk corruption, storage and server hardware failure, file system corruption, and deleted or corrupt files.
To understand and deal with data loss, remember two key DR concepts that will help you determine your protection and recovery strategy. The recovery point objective (RPO) is the maximum acceptable data loss, measured in time (minutes, hours, days). The recovery time objective (RTO) is the target maximum allowable time to recover from an outage.
RPO and RTO relate to the downtime and availability of business services in the event of an outage. By classifying all your business services according to their RPO and RTO requirements, you’ll be able to select the appropriate protection and recovery technologies. This exercise will also reveal whether your business can take advantage of the benefits of DR in the cloud by allowing you to match your requirements to a service that fits.
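The classification exercise can be sketched in a few lines of code. In the example below, the tier names, the 15-minute and four-hour thresholds and the sample services are hypothetical; they simply show how RPO and RTO requirements might be mapped to a class of protection. The sketch also checks whether a given backup interval can meet an RPO, on the assumption that the worst-case data loss is roughly the time since the last recovery point.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Service:
    name: str
    rpo: timedelta  # maximum acceptable data loss, expressed as time
    rto: timedelta  # target maximum time to recover from an outage


def protection_tier(service: Service) -> str:
    """Map a service to an illustrative protection tier.

    Simplification: the tighter of the two objectives drives the tier.
    The thresholds and tier names are assumptions, not a standard.
    """
    tightest = min(service.rpo, service.rto)
    if tightest <= timedelta(minutes=15):
        return "continuous replication"
    if tightest <= timedelta(hours=4):
        return "frequent snapshots"
    return "daily backup"


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss is about one backup interval, so it must not exceed the RPO."""
    return backup_interval <= rpo


# Hypothetical services and objectives, for illustration only.
EXAMPLE_SERVICES = [
    Service("payments", rpo=timedelta(minutes=5), rto=timedelta(hours=1)),
    Service("email", rpo=timedelta(hours=1), rto=timedelta(hours=4)),
    Service("intranet", rpo=timedelta(hours=24), rto=timedelta(hours=48)),
]

if __name__ == "__main__":
    for service in EXAMPLE_SERVICES:
        print(f"{service.name:<10} {protection_tier(service)}")
```

A nightly tape backup, for instance, gives a backup interval of about 24 hours, so it could not satisfy a one-hour RPO however quickly the tape can be restored.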
Test your DR plan
Finally, to ensure that your DR plan will work effectively, you will need to test it regularly and resolve any issues that arise. Employees also need to be trained on their roles in implementing the plan in the event of a disaster. Virtual and remote access during a disaster is also crucial if employees are to execute the plan well.
It’s important to revise and update your DR plan as business and IT environments change in order to ensure smooth implementation of the plan in the event of a disaster.
When a disaster hits an organisation, it can be a stressful time for everyone involved. Having a DR plan in place can help ease that stress and minimise the risk to your organisation. A DR plan provides you with the tools and guidance to get your organisation up and running again following a disaster.
Steve Stavridis is NetIQ’s disaster recovery expert for Asia-Pacific.