Introduction
In 1987, on the brink of extinction, StorageTek introduced a radical concept to the world: a new and improved automated, robotic tape library. Many observers thought the notion was crazy because IBM had introduced a similar idea in 1974, the 3851 Mass Storage Facility, which saw limited acceptance and was not especially reliable. Moreover, microprocessor economics were driving the PC industry, and miniaturization in the form of smaller, cheaper disk drives was the big trend, not large robotic tape silos. Nonetheless, STK's visionary and determined CEO, Ryal Poppa, persevered and pitched the idea to Wall Street bankers, creditors and customers, helping STK emerge from Chapter 11 in one of the greatest turnaround stories in the history of the tech business.
This bit of computer industry lore is often lost in the sea of talk on cloud, big data and hyper-convergence, but the reality is that these automated systems have been storing and securing data for decades in some of the most demanding environments, from financial services, energy and manufacturing to virtually anywhere high-end systems reside. Automated tape has even found its way into the sexy world of hyperscale, finding homes in data centers at firms such as Google and Yahoo. Tape usage is changing dramatically and extending into many new use cases. Despite efforts to kill tape, it remains the most cost-effective, durable and (in some applications) highest-performance medium. Tape's economics, longevity and removability make alternatives less attractive, and it appears that tape is coming back. Moreover, cloud service providers see automated tape as a viable alternative to Amazon's Glacier and are beginning to offer similar services that monetize the technology.
Major R&D investments from HP, IBM, Oracle, Quantum and Spectra Logic have driven a resurgence of announcements and tape activity in 2012 and 2013. While industry revenues have been under pressure for a decade, tape capacity shipped actually grew by 12% in 2012, according to the Santa Clara Consulting Group, and is expected to hit 26% growth in 2013. Why the surge? The role of tape is changing dramatically, from its historical position as a pure backup medium to an increasing prevalence in large unstructured content files, high-resolution video, deep cloud retention, and compliance and regulatory applications.
This newfound tape momentum has four key elements:
- The introduction of the Linear Tape File System (LTFS) in April 2010, with general support from all the major tape vendors.
  - LTFS allows the file content data and the metadata to be stored together. The metadata is held in an index on the tape.
  - LTFS avoids the problem of only an application knowing what data is stored on which tape. The previous model led to complexity and high cost in managing the relationship between the physical media and previous versions of, for example, a backup application. As a system it was an unreliable and expensive way of recovering data, and it became even less reliable over time.
  - Traditional backup software and HSM packages typically keep their database within the application, away from the tape system and media. These systems work well for today’s backup and HSM workflows but are not designed for interoperability. In addition, many backup/HSM systems become the de facto archive systems. Restoring any file, for example, is a major exercise that absorbs manpower and system resources, and it is essentially impossible to extract value from the data in these systems. LTFS is a way of accessing the business value within data.
  - Tape within the current non-LTFS environment has a bad name because of the operational overheads of managing tape farms with poor application support. However, applications using LTFS with automated tape systems have obtained very high marks from users for ease of use and cost.
- The rapid increase in tape density and transfer rates, with tape holding much more data and transferring it much faster than disk drives.
  - The development cycle of so-called high-performance hard disk drives (HDD) is over. The vast majority of R&D dollars are being targeted at lower spin speed (e.g. 7200 RPM), high-density drives. There is virtually no IOPS or bandwidth increase available from disk technology. The only disk metric that is increasing is capacity, with 10TB disks on the horizon. Tapes can transfer sequential data faster, and if necessary the same techniques of wide striping can be deployed over tape drives.
- The introduction of specific, increasingly vertical solutions that integrate tape and other storage technologies.
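To make the self-describing idea behind LTFS concrete, the sketch below models a cartridge that carries its own index alongside the file data, so a different system can list and read files without the catalog of the application that wrote them. It is a simplified illustration, not the LTFS on-tape format: the `Cartridge` class, its methods, the `mount` helper and the JSON index layout are all hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Cartridge:
    """Hypothetical stand-in for a tape: a data area plus an on-media index."""
    label: str
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)  # path -> {"offset", "length"}

    def write_file(self, path: str, content: bytes) -> None:
        # Append-only in this model: new data always goes at the end,
        # and the index records where it landed.
        self.index[path] = {"offset": len(self.data), "length": len(content)}
        self.data.extend(content)

    def read_file(self, path: str) -> bytes:
        entry = self.index[path]
        start = entry["offset"]
        return bytes(self.data[start:start + entry["length"]])

    def export_index(self) -> str:
        # The index travels with the media, so it can be serialized here
        # and parsed by any other reader (JSON is an assumption of this sketch).
        return json.dumps({"label": self.label, "files": self.index}, sort_keys=True)


def mount(data: bytes, index_json: str) -> Cartridge:
    """Rebuild a view of the cartridge from what is stored on the media alone."""
    parsed = json.loads(index_json)
    return Cartridge(label=parsed["label"], data=bytearray(data), index=parsed["files"])


if __name__ == "__main__":
    tape = Cartridge("A00001")
    tape.write_file("video/clip.mov", b"frame-data")
    tape.write_file("docs/report.txt", b"quarterly numbers")

    # A second system sees only the bytes and the index, not the writer's database.
    other = mount(bytes(tape.data), tape.export_index())
    print(sorted(other.index))
    print(other.read_file("docs/report.txt"))
```

The point of the design is that `mount` needs nothing beyond the cartridge contents. In the traditional backup/HSM model described above, the equivalent of `index` lives in the application's own database, which is why restores depend on that application and its version.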
Wikibon has projected that by the end of 2015 the majority of new active data and metadata will be held in flash storage. Semi-active data that requires millisecond response times is projected to live on traditional disks, mainly high capacity low cost disks (see disk/tape technology discussion below). For inactive data and archive data, automated tape systems are emerging as the lowest cost alternative. Google and other cloud service providers have large tape libraries because they understand the economics are compelling and deep archive on disk is too expensive. While Amazon uses spun down disks for Glacier, competitive cloud service providers have indicated to Wikibon that they believe automated tape provides better economics and longevity for deep archive applications. Moreover, applications are being rewritten around the LTFS model and tape remains the last source of record for deep archiving, compliance and other "boring but important" use cases. Tape is becoming a lucrative substrate for cloud services in the compliance space.
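The tiering described here can be expressed as a simple placement rule. The sketch below is only an illustration of that split (active data on flash, semi-active data needing millisecond response on disk, inactive and archive data on tape); the `Tier` enum, the `choose_tier` function and the one-second cutoff are assumptions made for the example, not figures from Wikibon.

```python
from enum import Enum


class Tier(Enum):
    FLASH = "flash"
    DISK = "high-capacity disk"
    TAPE = "automated tape"


# Assumed boundary between "millisecond" and "seconds" response requirements.
SUBSECOND_CUTOFF_SECONDS = 1.0


def choose_tier(is_active: bool, max_latency_seconds: float) -> Tier:
    """Pick a storage tier from activity level and tolerable access latency."""
    if is_active:
        # Active data and metadata are projected to live on flash.
        return Tier.FLASH
    if max_latency_seconds < SUBSECOND_CUTOFF_SECONDS:
        # Semi-active data that still needs sub-second response stays on disk.
        return Tier.DISK
    # Inactive and archive data that can wait seconds goes to tape.
    return Tier.TAPE


if __name__ == "__main__":
    print(choose_tier(True, 0.001))   # active working set
    print(choose_tier(False, 0.01))   # semi-active, millisecond response
    print(choose_tier(False, 30.0))   # deep archive
```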
The “good thing” for tape library manufacturers is that, because of the high-precision robotic components involved, it is not easy for a Google to build its own software-led version of a tape library, as it has done with other storage and networking infrastructure. Because Web giants can't “componentize, commoditize and software-ize” tape machines the way they can storage, networks and servers, they will buy rather than build tape solutions. Even though these systems have higher CAPEX costs than full commodity-based storage systems (e.g. JBODs), better integration and superior tape economics yield lower OPEX costs for customers.
State of the Tape Art
Today's major manufacturers all support interchangeable LTO-6 tape cartridges. Oracle and Spectra Logic also have their own formats, which allow more data per tape and higher data transfer speeds. LTO-7 is a new technology scheduled to be available in late 2014 or early 2015.
Figure 1 compares the capacities of the current technologies and the future LTO-7 technology. The Oracle StorageTek T10000D stands out in being able to deliver a cartridge capacity of 8.5 terabytes uncompressed, and over 21 terabytes compressed (at a 2.5:1 compression ratio). Compared with the current 4TB HDD, the compressed cartridge can hold about five times as much data (roughly twice as much uncompressed) in a much smaller space and with much lower environmental impact. The number of tape drives can be tailored to the expected access rate. Tape offers two orders of magnitude better protection against data corruption than disk technology. The T10000D can check the validity of data by reading the data and checksums from within the tape drive. On balance, when it comes to long-term retention of enterprise data, tape is more cost-effective and reliable. And because it is a removable medium, tape is often the cheapest and fastest way to move data when needed (e.g. load tapes on a truck and drive them somewhere), especially as network costs can become onerous and moving data across the network can be time-consuming.
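The capacity figures above, and the "tapes on a truck" argument, reduce to simple arithmetic. The sketch below works through both. The 8.5TB, 2.5 compression ratio and 4TB values come from the text; the truckload of 1,000 cartridges and the 24-hour transit time are purely hypothetical inputs, and the calculation ignores the time needed to write and read the tapes at either end.

```python
T10000D_NATIVE_TB = 8.5   # uncompressed cartridge capacity (from the text)
COMPRESSION_RATIO = 2.5   # compression ratio quoted in the text
HDD_TB = 4.0              # "current 4TB HDD" (from the text)


def compressed_capacity_tb(native_tb: float, ratio: float) -> float:
    """Effective cartridge capacity at a given compression ratio."""
    return native_tb * ratio


def truck_bandwidth_gbps(cartridges: int, tb_per_cartridge: float,
                         transit_hours: float) -> float:
    """Effective bandwidth in Gbit/s of physically shipping cartridges.

    Uses decimal units (1 TB = 1e12 bytes) and counts transit time only.
    """
    total_bits = cartridges * tb_per_cartridge * 1e12 * 8
    return total_bits / (transit_hours * 3600) / 1e9


if __name__ == "__main__":
    compressed = compressed_capacity_tb(T10000D_NATIVE_TB, COMPRESSION_RATIO)
    print(f"compressed capacity: {compressed} TB")                    # 21.25 TB
    print(f"vs 4TB HDD, uncompressed: {T10000D_NATIVE_TB / HDD_TB}x")  # 2.125x
    print(f"vs 4TB HDD, compressed: {compressed / HDD_TB}x")           # 5.3125x

    # Hypothetical shipment: 1,000 uncompressed cartridges, one day in transit.
    gbps = truck_bandwidth_gbps(1000, T10000D_NATIVE_TB, 24)
    print(f"truck bandwidth: {gbps:.0f} Gbit/s")                       # 787 Gbit/s
```

Under these assumptions the shipment moves 8.5 petabytes in a day, which is the sense in which a truck of cartridges can outrun a network link for bulk transfers.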
Figure 2 compares the bandwidth of different tape technologies. Of the technologies available today, the Oracle T10000D has about the same bandwidth as the TS1140. LTO-7, which is expected to appear in late 2014 or early 2015, has even higher bandwidth, but future enhancements will keep the maximum bandwidth about the same across all tape drive technologies.
All of these tape systems can also use LTO-6 tapes for interoperability between systems. Oracle StorageTek has the largest installed base of tape systems and is in a good position to integrate tape effectively into the "Red Stack." Observers should expect Oracle to continue to push tighter integration of automated tape functionality to improve performance and recovery speeds.
Emerging LTFS Applications
Virtually all tape system vendors have focused on LTFS applications. Oracle has announced the LTFS Library Edition, which allows drag-and-drop movement between NAS systems spread over disk and tape. As an example of service providers monetizing tape, Oracle has inked an OEM pact with Front Porch Digital, which provides digital asset management solutions for the media & entertainment industry. Specifically, Front Porch Digital is now offering Oracle’s latest T10000D tape drive with its DIVA digital asset management software suite and LYNX cloud offering. As Front Porch Digital is a prominent provider to the media & entertainment sector, this partnership will bring Oracle tape solutions to customers in need of cost-effective storage for their digital film content and unstructured file archives.
Quantum has recently announced a joint object storage solution with CommVault to provide a converged backup and archive system in the enterprise data center. It combines Quantum's Lattus™ object storage technology (based on Amplidata technology) and CommVault Simpana 10 software to optimize backup performance in multi-petabyte environments. Quantum has also announced a digital asset and end-to-end media flow management system using its Lattus-M object storage and StorNext® appliances. The London-based Ark Post Production recently implemented this system across multiple petabytes.
In general, tape has been selected as the ultimate storage medium for media, and LTFS is the foundation on which all tape vendors are building their solutions. Wikibon expects significant amounts of ISV integration with tape systems that provide support for a multi-tier storage system. While market revenues are not likely to sustain dramatic growth, we expect that tape, like the mainframe market of the 1990s, will reach an equilibrium and, as its uses expand, will continue to provide good profit opportunities for a select group of suppliers. This is critical as it ensures a continued R&D pipeline.
Tape vs. Disk Technology & Cost Projections
Flash has had a major impact on investment in disk systems. Investment in high-performance HDDs (10K and 15K disks) has been very significantly reduced, and this type of disk is being replaced by SSDs in both PCs and enterprise systems. The disk technologies that still attract investment dollars are high-density, slower disks. These will provide the large market for data "tubs" in PCs and enterprise data centers as active data migrates to semi-active data, with most of the metadata of such systems still remaining on flash storage.
Wikibon expects that tape cartridge storage density will grow significantly faster than disk storage density, and that bandwidth for ingestion and retrieval will grow much faster for tape drives than for disk drives. As semi-active data migrates to archive and non-active data, LTFS tape systems offer the lowest cost of long-term and short-term retention. Over the next decade, many installations will eventually migrate to a flash/tape integrated system with a minimum of disk drive technologies (a concept Wikibon has dubbed "Flape").
Action Item: Modern LTFS tape systems are more reliable than disk, have demonstrated and will continue to demonstrate higher read/write bandwidth than disk, and have delivered and will continue to deliver much higher density than disk. When properly compared with disk systems, modern tape systems are lower in CAPEX and OPEX for low and very low activity data. Tape is, and increasingly will be, the natural technology for such data, where access time is not critical and can be measured in seconds rather than milliseconds. A truck-full of tape cartridges is the black hole of storage technology, with by far the highest density of data, the highest bandwidth and the lowest cost of data movement. Applications that exploit LTFS are rolling out across different business segments, with media, archive and oil/gas in the lead. Tape technologies will exploit flash for metadata, and will integrate with flash and disk systems as a natural component of enterprise and cloud storage infrastructure. The bottom line is that IT organizations and cloud service providers should integrate tape as a fundamental component of IT-as-a-Service and cloud strategies to offer the most cost-effective and reliable deep archive, compliance and last-resort offsite disaster recovery capabilities.