Backup is one of the most highly tuned applications, meaning backup code is very efficient. The elapsed time to complete a backup is critical and backup windows are becoming shorter or disappearing altogether. Placing backup processes from the physical world into a virtualized environment often wreaks havoc with performance.
Users have three broad choices for backing up virtual environments:
- Traditional “bare-metal” backup of the entire ESX host. This is not effective because recovering a single VM requires recovering the entire host, which is time-consuming and inefficient.
- Use VMware Consolidated Backup (VCB), which enables third-party backup software to protect each VM. The problem with this approach is that third-party backup software is difficult to optimize for virtual machines because it lacks the bare-metal knowledge that lets vendors optimize in the physical world. As a result, few organizations are using VCB.
- Deploy a backup agent inside each virtual machine guest OS. Backup using agents in each VM is a viable approach, but it can be complex to set up, especially if vApps are migrated from one machine to another. In addition, agents are notorious for failing and often need to be restarted.
New methods of backup utilizing source-side deduplication are very efficient, but they 1) require users to rip and replace existing backup software and 2) are not a complete solution, especially for high-change-rate databases. In addition, when recovery time is critical for large volumes, architecting a grid to ensure fast recovery becomes cost-prohibitive.
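To make the source-side deduplication idea concrete, here is a minimal sketch in Python. It assumes fixed-size chunks and an in-memory store; the `DedupStore` class, the `backup` and `restore` functions, and the 4 KB chunk size are illustrative names and values, not any vendor's implementation.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products often use variable-size chunks


class DedupStore:
    """Hypothetical backup target that keeps each unique chunk once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, chunk):
        self.chunks[digest] = chunk


def backup(data, store):
    """Back up `data` and return (recipe, bytes_sent).

    The recipe is the ordered list of digests needed to rebuild the data.
    Only chunks the store has not already seen are sent.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not store.has(digest):
            store.put(digest, chunk)
            bytes_sent += len(chunk)
        recipe.append(digest)
    return recipe, bytes_sent


def restore(recipe, store):
    """Rebuild the original data from its recipe."""
    return b"".join(store.chunks[digest] for digest in recipe)
```

In this sketch, a second backup of mostly unchanged data sends only the new chunks. A database that rewrites most of its chunks between backups gains little, which is the high-change-rate limitation noted above.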
Finally, VCB and agent-based backups ensure volume consistency but not application consistency, because a vApp can span multiple VMs and because many application-integrated backup and restore scenarios need guest API-level integration. To put it bluntly, backup on virtualization needs serious attention!
In the near term, VMware has announced the vStorage API for Data Protection (VADP). VADP supports two major operations:
- Changed block tracking (CBT), which allows the backup software to see which blocks have changed since the last backup. This enables backup software to directly read and write the contents of a vDisk without being a guest (i.e., direct bare-metal restore at native speed), and
- A Virtual Disk Development Kit (VDDK), which allows the backup software to directly manage files within guests (currently Windows guests only) and index the content of backups without needing an in-guest agent.
Both traditional backup software such as Symantec NetBackup and source-side deduplication software such as Avamar will be able to exploit CBT to improve backup and recovery performance.
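The way changed block tracking shortens a backup can be sketched with a toy model. The Python below is not the VADP or VDDK API; `TrackedDisk`, `query_changed_blocks`, and the 512-byte block size are assumptions made for illustration only.

```python
BLOCK_SIZE = 512  # assumed block size for the toy disk


class TrackedDisk:
    """Toy virtual disk that records which blocks were written (not the VADP API)."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.changed = set()  # indices written since the last backup

    def write(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.blocks[index] = data
        self.changed.add(index)

    def query_changed_blocks(self):
        """Return the changed block indices and reset the tracking set."""
        changed = sorted(self.changed)
        self.changed.clear()
        return changed


def full_backup(disk):
    """Copy every block and reset tracking so later incrementals start clean."""
    disk.query_changed_blocks()
    return list(disk.blocks)


def incremental_backup(disk):
    """Copy only the blocks reported as changed since the previous backup."""
    return {index: disk.blocks[index] for index in disk.query_changed_blocks()}


def restore(full, incrementals):
    """Rebuild the disk image by replaying incrementals, oldest first, over the full backup."""
    image = list(full)
    for incremental in incrementals:
        for index, data in incremental.items():
            image[index] = data
    return image
```

The point of the sketch is that the backup software never scans the whole disk to find changes: it asks the tracking layer for the changed set and reads only those blocks, so backup time scales with the change rate rather than with disk size.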
The roadmap for integrating backup into the virtualized infrastructure journey has to include the following:
- Resource Virtualization
  - To help backup software, the virtual machine needs to provide information on which “blocks” of information have been changed by which virtual machines or VM clusters.
  - Backup software integrates with the vCenter APIs, identifying new virtual machines and automatically applying the backup policy, as well as reporting on recent backup state.
  - Backup software needs “special permissions” to directly read and write data on storage volumes and directly access the changed block data.
- Application Encapsulation
  - Full exploitation of changed block tracking
  - Ability to define the backup and recovery requirements at the vApp level
  - Integration of hypervisor APIs for data protection into backup and recovery software
  - Integration of system recovery metadata (RPO & RTO) at the vApp level with array-based remote replication services
- Internal Location Independence
  - Multi-location data recovery services
  - Backup software is aware of virtual machine moves and the mapping of real and virtual resources
  - Backup software can provide rapid point-in-time recovery
  - Ability to move applications and data across the infrastructure non-disruptively and at high speed
  - Ability to create and exploit an active/active topology for remote and local high-availability replication services
- External Location Independence
  - Enhanced security and monitoring for backup software, as it has special privileges
  - Ability to hold backup data outside the organization with full security and monitoring
  - Multi-location data recovery including external resources
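Two of the roadmap items above, automatically applying a backup policy to newly discovered virtual machines and defining recovery requirements at the vApp level, can be illustrated with a small Python sketch. It does not call the vCenter APIs; the `VM` and `Policy` records, the inventory list, and both functions are hypothetical stand-ins.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Policy:
    name: str
    rpo: timedelta  # maximum tolerated age of the latest backup


@dataclass
class VM:
    name: str
    vapp: str
    last_backup: Optional[datetime] = None
    policy: Optional[Policy] = None


def apply_policies(inventory, vapp_policies, default):
    """Give every VM without a policy the policy of its vApp, or the default."""
    newly_protected = []
    for vm in inventory:
        if vm.policy is None:
            vm.policy = vapp_policies.get(vm.vapp, default)
            newly_protected.append(vm.name)
    return newly_protected


def out_of_policy(inventory, now):
    """Report VMs never backed up or whose last backup is older than the RPO."""
    return [
        vm.name
        for vm in inventory
        if vm.policy is not None
        and (vm.last_backup is None or now - vm.last_backup > vm.policy.rpo)
    ]
```

Keying the policy on the vApp rather than on the individual VM is the design choice being illustrated: every VM that joins a vApp inherits its recovery objective without manual configuration, and the same record can drive reporting on recent backup state.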
Footnotes: This research is an expansion of a section of research looking into The Value of the VMware Integration Journey