Storage Peer Incite: Notes from Wikibon’s September 4, 2007 Research Meeting
This week Wikibon presents Virtual tape libraries: Journey or destination. About 20 vendors now are pushing disk in the form of virtual tape libraries (VTL) as an alternative to tape for backup/restore (B/R). These solutions use low-cost disk technologies, including drives that power down when not in use to save power, decrease heat generation, and extend the life of the media, while providing faster access than tape.
These technologies have advantages for fast restoration to replace corrupted data. Surveys show that a huge proportion of restores are of single files, rather than whole disks, and are of data less than 24 hours old. For these restorations VTLs offer a much faster, less expensive solution than finding the right tape, recovering it from the vault, mounting it, and searching it sequentially for the needed file.
However, for most users, today's disk technology is unsuited for long-term, off-site data archiving, which provides protection from larger disasters and is needed to meet compliance requirements. As a result, the Wikibon community favors hybrid solutions that combine disk with tape. These also allow advanced de-duplication technology to reduce the tremendous waste of backup time and media caused by today's brute-force techniques, which resave huge amounts of unchanged data every night on that day's backup tape. Few such hybrid solutions exist today, and we believe users should urge VTL vendors to integrate tape into their solutions.
Based on the Peer Incite weekly research meetings moderated by Peter Burris, we've tried to make this newsletter about your business. Each week we summarize the community's input from the meeting and document specific advice for users (IT), organizational considerations, technology integration issues, and vendor actions. We also address the all-important 'getting rid of stuff' (GRS). Bert Latamore
New compliance requirements for data preservation plus the explosion of data volumes have focused attention on backup/recovery (B/R) and archiving applications in the past few years. Inevitably, tape becomes part of that discussion, specifically the trade-off between its financial benefits and its drawbacks for B/R and archiving: access delays (e.g., finding, mounting and spinning up the right tape reel), robot library failure rates, and the medium’s sequential read restrictions.
Tape’s portability and low cost of operation compared to disk have made it the near-universal choice, particularly for secure, off-site disaster recovery (DR) solutions. In the past few years, however, the cost of drive technology has plummeted, and disk systems have appeared that keep their drives unpowered until needed, reducing power consumption and heat generation while increasing disk lifetimes. As a result, systems using disk as a primary B/R and archiving solution have become more visible in the market. Disk, however, still must overcome two major challenges before it can supplant tape for DR:
- Disk drives are too fragile to support their transportation to unpowered storage in a secure, remote site, while network bandwidth costs still make backup over the network a very expensive alternative to tape.
- Getting full benefit from disk solutions requires a painful, expensive migration to a new, nonstandard B/R software architecture that adds minimal revenue potential to the business.
As a result, in the last decade we have seen the emergence of a variety of virtual tape library (VTL) technologies that attempt to address these issues in part by making the disk device look like a tape drive to legacy B/R subsystems. This has three compelling advantages:
- Access times are a fraction of those for tape, so a shop can recover newly written data much faster. This is important because a large percentage of restores are made from data that is less than 24 hours old.
- VTLs can use advanced data de-duplication technology to dramatically reduce the need to rewrite unchanged data multiple times in the B/R run.
- Despite improvements in mean time between failures (MTBF) for tape drives, the reliability differential between low-cost disk and high-end tape drives is still large enough to merit use of VTL technologies in reliability-sensitive B/R situations.
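To make the de-duplication point concrete, here is a minimal Python sketch. It is not a description of any vendor's product: it assumes fixed-size chunks identified by SHA-256 digests, and the names `CHUNK_SIZE`, `backup`, `restore` and the in-memory `store` are hypothetical. Commercial systems may well use variable-size chunking and persistent indexes instead.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (illustrative only)


def backup(data: bytes, store: dict) -> tuple:
    """Write `data` into `store`, skipping chunks already present.

    Returns (recipe, new_bytes): the ordered list of chunk digests needed
    to rebuild `data`, and the number of bytes actually written this run.
    """
    recipe = []
    new_bytes = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:  # only unseen content consumes media
            store[digest] = chunk
            new_bytes += len(chunk)
        recipe.append(digest)
    return recipe, new_bytes


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its chunk recipe."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical two-night run: ten distinct chunks, one of which changes.
store = {}
monday = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
tuesday = monday[:CHUNK_SIZE] + b"\xff" * CHUNK_SIZE + monday[2 * CHUNK_SIZE:]

monday_recipe, monday_written = backup(monday, store)
tuesday_recipe, tuesday_written = backup(tuesday, store)
print(monday_written, tuesday_written)
```

In this toy run the second night writes only the one changed chunk, while a brute-force full backup would rewrite all ten. The cost is that a restore now depends on the chunk store and the recipe, which is the added recovery complexity the community raises elsewhere in this newsletter.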
While some suppliers aggressively push all-disk VTL products, a hybrid VTL technology combining disk and tape is a superior approach for larger organizations. It can exploit the benefits of VTL on the disk side while still using low-cost, easily transferred tape to meet long-term archiving and disaster recovery (DR) needs. However, decisions about hybrid solutions are complicated by the limited number of products in this category (only a few of the 20-odd VTL vendors offer hybrid solutions off the mainframe) and by the emergence of linear tape-open (LTO) based tape systems for large as well as medium enterprise use.
The long-term question is whether tape will remain part of an overall B/R and DR solution. Ultimately we can be sure that low-cost ruggedized disk technologies that support physical removal, truck transport and long-term storage in secure remote sites will appear, while sequential data access will remain a basic limiting factor of tape. Whether those advances will happen this year, or indeed this decade, is an open question. However, until the question of data movement can be satisfactorily answered, tape will maintain an important place in the enterprise. In this environment, hybrid technologies should be getting greater vendor and user attention.
Action Item: Organizations should not overlook the role of tape in B/R and DR applications because of historic concerns about performance, availability and administrative costs. While most VTL vendors are pushing all-disk alternatives to tape, hybrid VTL technologies are the better approach to solving enterprise B/R and archiving requirements in an integrated, flexible and operationally simple package.
For decades, tape users have struggled with shrinking backup windows, escalating backup costs, poor reliability of backup systems, backup/restore (B/R) performance issues, cumbersome recovery and high labor costs associated with backup and restore. Disk-to-disk (D2D) backup technologies such as virtual tape promise to address many of these problems. However, in a storm of acronyms, overlapping functionality and vendor marketing, the right strategy is not always obvious.
On the spectrum of data protection choices, virtual tape fits somewhere between conventional tape and snapshot copy/replication. This spectrum is large and growing, spanning out through continuous data protection (CDP), asynchronous and synchronous replication all the way to 3-node disaster recovery. Most solutions in this spectrum still largely rely on tape for the final archiving solution.
In the near term, successful disk-to-disk backup approaches, including virtual tape, will leverage existing B/R practices that have been hardened over the years, mainly using sequential tape approaches. Applying D2D technologies means evaluating the following:
- What are the RPO/RTO requirements of an application and how much data loss is acceptable?
- What is the backup window?
- How will existing tape/backup processes be leveraged?
- What is the reduction in expected loss?
- What is the budget?
While the answers to these questions will begin to help formulate a strategy to implement D2D backup and choose the right solution, trade-offs must be understood and fully vetted. For example, if implemented in virtual tape, how will de-dupe and/or encryption impact recovery times? What are the compliance implications of reducing tape usage and how should tape media and device strategies change as a result of using D2D solutions? Simply throwing tape out won't cut it with the auditors.
Action Item: Users must carefully evaluate which B/R practices are and are not effective, and use virtual tape (and other D2D solutions) to make adjustments at the margin in how backup and restore technologies are introduced into the labor pool.
Wikibon has often stated that application RPO/RTO requirements should be the primary drivers for backup, restore and disaster recovery technology. VTLs address the local backup and restore requirements extremely well, and are usually implemented so that the B/R software is not even aware of the VTL’s existence; the VTL looks like a set of tape drives.
Users should be aware that this technology implementation can mean that the disaster recovery component of the RPO/RTO application requirement is not adequately addressed; the increased elapsed time for the data to be taken offsite and the manual procedures required to ensure that tapes are created and shipped can severely compromise disaster recovery capabilities.
Hybrid technologies, where the VTLs are equipped with tape drives, provide a potential solution to this problem. However, to be fully effective, this technology will have to be better integrated into the B/R software and procedures.
Action Item: Nassim Nicholas Taleb’s book “The Black Swan: The Impact of the Highly Improbable” (2007) graphically illustrates how certain short-term rewards (in this case easier local restores) will prevent people from guarding against lower-probability disasters (in this case loss of a large amount of data). Organizations need to explicitly factor in the impact of delaying getting data offsite, and integrate technologies to minimize these risks.
Despite the growing appeal of hybrid virtual tape solutions that integrate disk as a 'cache buffer' to tape and introduce technologies such as data de-duplication and encryption, today’s virtual tape is unlikely to be the archiving technology of choice on its own. This is because today's virtual tape solutions are mostly designed to address backup and restore (B/R) problems and don't typically include the requisite hierarchical storage management (HSM), data classification and data movement technologies needed to satisfy true archive requirements.
For offline deep archiving, virtual tape is probably not suitable in most cases. The compliance, legal, risk management and records management functions in organizations will require continued off-site archiving and tape practices, perhaps as part of, but more likely outside, the virtual tape infrastructure.
The growing schism between structured and unstructured data and the continued pressure to secure information and ensure these assets do not become liabilities means that storage administrators will be interfacing with and explaining to various oversight functions where virtual tape fits and how it affects archiving capabilities and ultimately corporate risk.
Action Item: Organizations implementing virtual tape must consider archiving-like capabilities as part of their data protection strategies that will satisfy the legal, compliance, risk and records management functions within organizations. Tape and associated archiving software capabilities will be a continued part of these solutions.
Changing software and changing procedures are the biggest inhibitors to adoption of new technologies. VTL technology has had its initial success by minimizing the barrier to adoption. It pretends to be tape.
New technologies such as MAID, LTO-4 tapes, VTL hybrids, encryption, data de-duplication, continuous data protection and many others are the young pretenders. There comes a point when users will want them to be recognized for what they are and what they can contribute.
Many different scenarios could lead to this recognition. B/R vendors can re-architect to allow new technology objects to be integrated and enable end-to-end management of the whole process. Open standards could emerge. Appliance-based solutions could add a new control point, especially in the archiving T2/T3 space.
Action Item: Vendors need to address how these new technologies can be integrated into the backup, restore, disaster recovery and archiving requirements of organizations. Minimizing the adoption barriers is likely to determine vendor success.
Virtual tape libraries (VTL) can dramatically improve the performance of tape-based backup/restore (B/R) applications (e.g., improved RTO, de-duplication of backup data). Nonetheless, physical tape systems generally will still have to be employed for archiving applications and moving copies of production data to disaster-safe secondary sites (unless shops secure enormous and costly bandwidth for moving backup data between primary and secondary sites).
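The bandwidth caveat can be checked with back-of-the-envelope arithmetic. The sketch below is only that: the function name `transfer_hours`, the default `efficiency` factor, and the example figures are assumptions chosen for illustration, not measurements from any shop or link.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move `data_gb` (decimal gigabytes) over a
    link rated at `link_mbps` megabits per second.

    `efficiency` is an assumed fraction of the rated speed that is actually
    usable after protocol overhead and contention.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = data_gb * 1000 * 8  # GB -> MB -> megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical example: a 1 TB nightly backup over a 100 Mbps link.
print(f"{transfer_hours(1000, 100):.1f} hours")
```

Under these assumed figures a single night's backup occupies the link for more than a day, which is why shipping tape remains attractive unless the volume is cut sharply or the link is much larger.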
VTLs, then, are unlikely to obsolete tape. However, VTLs will reduce the count of tape drives required in a shop. Since restore (time to read data from tape) requirements typically demand far more tape drives in operation than backup requirements (time to write data to tape, which often can be performed asynchronously), fewer tape drives will be demanded in VTL shops, where the VTL rather than physical tape serves most restores. Fewer drives, though, do not necessarily translate into less media, despite VTL de-duplication. The Wikibon community believes that data written to tape for movement to secondary sites is best written in full form, with duplicates, to minimize the degree to which VTL or de-duplication technologies increase the complexities of recovery efforts at the second site in the event of a disaster.
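The drive-count argument is also simple arithmetic: the tighter the deadline for moving a given volume, the more drives must run in parallel. The following sketch assumes a hypothetical helper `drives_needed`, a constant per-drive streaming rate, and invented example figures; real sizing would also account for mount times, compression and drive contention.

```python
import math


def drives_needed(data_gb: float, window_hours: float, drive_mb_per_s: float) -> int:
    """Minimum number of tape drives needed to stream `data_gb` (decimal
    gigabytes) within `window_hours`, assuming every drive sustains
    `drive_mb_per_s` megabytes per second for the whole window.
    """
    if window_hours <= 0 or drive_mb_per_s <= 0:
        raise ValueError("window_hours and drive_mb_per_s must be positive")
    gb_per_drive = drive_mb_per_s * window_hours * 3600 / 1000
    return math.ceil(data_gb / gb_per_drive)


# Hypothetical shop: 10 TB must be restored within 4 hours, but the same
# volume can be written to tape asynchronously over 12 hours.
restore_drives = drives_needed(10_000, window_hours=4, drive_mb_per_s=80)
backup_drives = drives_needed(10_000, window_hours=12, drive_mb_per_s=80)
print(restore_drives, backup_drives)
```

With these assumed numbers the restore deadline calls for three times as many drives as the backup schedule. Once a VTL serves most restores from disk, the physical drive count can be sized for the more relaxed tape-writing schedule, although the volume of media written is unchanged.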
Action Item: VTLs will reduce the number of tape drives required to handle backup, restore, disaster management, and archiving applications, but will not obsolete tape for the foreseeable future, nor reduce the volume of tape media required.