Challenges of Cloud for Disaster Recovery
Disaster recovery seems an obvious first use case for cloud services. The relatively low level of adoption signals that there are significant technology issues. Using the cloud for disaster recovery requires close integration of a number of technologies. The technology integration challenges include:
- The on-ramp to the cloud must include technology for seeding the initial copy of the disaster data in the cloud, and for replacing it if necessary. Seeding over a network can take a very long time (days or months), and the bandwidth it demands is needed only a few times a year. Other methods of seeding include taking a magnetic or solid-state copy of the data and shipping it to the cloud data center (hybrid approaches).
- Incremental backups of changed data are the normal method of backup. The incremental data has to be applied to the original seed data either on a continuous basis in the cloud (which increases cost), or at recovery time (which can result in significant delays, especially if the data is held on low-cost slow media).
- Technologies such as de-duplication and compression can significantly reduce the amount of data transferred and transfer time, especially for seed data. The amount of data reduction is significantly less for incremental data, but not so important if the amount of change is small.
- Encryption needs to be applied after data reduction techniques have been applied. However,
- The off-ramp from the cloud to the disaster recovery site has the same technical problems as the on-ramp, but with greater urgency. The ability to move a physical copy is a business necessity in most cases. Solid-state media is preferable, because the time to restore from the media once it has arrived is also very important.
- The urgency of the off-ramp problem can be reduced if the cloud data can be used to run the disaster site in the cloud. There is still an issue of how the data is moved back to the production site, but there is time to address that issue.
- Solutions should include the identification of critical data that should be recovered first, where possible and practical. Identifying this data is not easy, as many systems are interrelated as part of business processes.
- There is often a regulatory necessity to test disaster recovery procedures at least once a year. It is also a sound risk reduction strategy to test procedures at least once a year.
- The backup software needs to be optimized for a remote cloud environment, with capabilities that ensure integrity of the data in the cloud.
- Having all business data in the cloud may increase risk of legal discovery being done directly on data by opposing counsel.
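The seeding and off-ramp timing issues above come down to simple arithmetic. The sketch below is a rough illustration, not a sizing tool: the function name `transfer_days`, the 50 TB data set, the 100 Mbps link, the 80% link utilization, and the 4:1 reduction ratio are all assumptions chosen for the example, and real transfers will also be affected by protocol overhead and competing traffic.

```python
def transfer_days(data_tb: float, link_mbps: float,
                  reduction_ratio: float = 1.0,
                  utilization: float = 0.8) -> float:
    """Estimate the days needed to move data over a network link.

    data_tb         -- raw data size in terabytes (decimal, 10**12 bytes)
    link_mbps       -- nominal link speed in megabits per second
    reduction_ratio -- assumed de-duplication/compression ratio (4.0 means 4:1)
    utilization     -- assumed fraction of the link usable for the transfer
    """
    if data_tb < 0 or link_mbps <= 0 or reduction_ratio <= 0:
        raise ValueError("sizes, speeds and ratios must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    bits_to_send = data_tb * 1e12 * 8 / reduction_ratio
    effective_bits_per_second = link_mbps * 1e6 * utilization
    seconds = bits_to_send / effective_bits_per_second
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # Hypothetical 50 TB seed over a 100 Mbps link, no data reduction.
    print(f"seed, raw:        {transfer_days(50, 100):.1f} days")
    # Same seed with an assumed 4:1 de-duplication/compression ratio.
    print(f"seed, 4:1 reduced: {transfer_days(50, 100, reduction_ratio=4.0):.1f} days")
    # Daily incremental, assuming 1% of the data (0.5 TB) changes.
    print(f"daily incremental: {transfer_days(0.5, 100) * 24:.1f} hours")
```

Under these assumptions the raw seed takes roughly 58 days, falling to about 14 days with 4:1 reduction, while a 1% daily incremental takes about 14 hours. The same calculation applies in reverse to the off-ramp, which is why shipping physical media is often the practical choice for a full recovery.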
Examples of Disaster Recovery in the Cloud
For file-based systems there are many services, such as Mozy from EMC, that support client systems. They work well for the recovery of lost files, but all the issues discussed above come into play for full disaster recovery, especially the seed time and recovery time.
In the enterprise space there are many vendors offering cloud backup solutions. Wikibon has picked two examples to illustrate:
Asigra has provided an on-site backup appliance that integrates the backup software, a local copy for fast recovery, and a remote copy in the cloud for disaster recovery. The remote cloud copy is often held at a local managed service provider's site in the same geographical area. Backing up to IBM’s SmartCloud is also an option. The data in the cloud can only be used for backup and recovery purposes.
NetApp has just announced a partnership with Amazon for Private Storage on Amazon Web Services, shown in Figure 1.
Figure 1 shows NetApp on-premise storage arrays connecting with NetApp storage and data management residing in an Amazon-qualified co-location facility. The data transfer between the two NetApp arrays is integrated using NetApp replication and/or archive products. The integration between NetApp storage in the remote location and AWS EC2 and/or S3 uses AWS Direct Connect. The figure also highlights potential multiple uses of the data, including disaster recovery, development and test, and big data analytics.
NetApp does not offer a total end-to-end disaster recovery solution that includes the backup/disaster recovery software (as Asigra does above), but leverages its data storage software tools in conjunction with its backup and DR partners. The attractiveness of the NetApp approach is that the data in the cloud can be used for multiple purposes.
Conclusions
The use of integrated cloud services for backup and restore is still in the early stages. The vision is of a completely integrated solution that allows rapid recovery of application services either in the cloud or at a backup location, with the data in the cloud available for other purposes such as archiving, interrogation and blending with other cloud data. Delivering this vision economically compared with traditional solutions remains elusive.
Action Item: The technology integration aspects of implementing disaster recovery in the cloud are complex. CIOs should clearly spell out the business implications of alternative strategies to the business. There is significant innovation still coming to market - waiting for the right solution is preferable to force-fitting an incomplete solution.