David Vellante with Eric Peterson and Nathan Thompson
SaskEnergy, a corporation controlled by the Saskatchewan government, delivers natural gas to more than 90% of the province, serving more than 327,000 customers. A few years ago, SaskEnergy had a problem. Its data center was divided into AIX and Wintel server islands. Each server had its own directly attached tape drive that was used nightly for full disk backups. This approach was expensive and time-consuming, with backups requiring an eight-hour window to complete.
Compounding the challenge, SaskEnergy policy dictates that data be retained for five years, which increased the costs of daily full disk backups. The key business drivers for the project were to reduce costs, improve backup efficiency, and provide a more scalable and dynamic backup and restore infrastructure that could evolve over time.
The Solution
SaskEnergy required an enterprise solution that would allow it to reduce costs, speed backups, and maintain fast recovery. After reviewing several vendor offerings, it developed the following solution:
- Use Tivoli Storage Manager to provide an ‘incremental forever’ methodology, backing up only changed data;
- Perform backups of large disk pools to a temporary disk-based staging area, migrate data from disk to tape (locally), then move the tapes offsite for long-term retention;
- Use a SpectraLogic library with LTO4 and SAIT (Super Advanced Intelligent Tape) tape technology onsite, and SpectraLogic’s T120 library with LTO4 and SAIT technology for offsite archiving, to take advantage of cartridge capacities in excess of 500GB uncompressed.
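The 'incremental forever' idea in the list above can be illustrated with a minimal sketch. This is not how Tivoli Storage Manager is implemented; it only assumes that each night's file catalog can be compared with the previous one, and the catalog format, paths, and the `select_changed` name are hypothetical.

```python
def select_changed(previous, current):
    """Return paths that are new or modified since the last backup.

    previous, current: dicts mapping path -> (size_bytes, mtime) signature.
    Unchanged files are skipped, which is the point of incremental backups.
    """
    return sorted(
        path for path, signature in current.items()
        if previous.get(path) != signature
    )


# Hypothetical catalogs, for illustration only.
last_night = {"/db/a.dbf": (100, 1), "/db/b.dbf": (200, 1), "/etc/hosts": (1, 1)}
tonight = {"/db/a.dbf": (100, 1), "/db/b.dbf": (250, 2), "/home/new.txt": (5, 2)}

changed = select_changed(last_night, tonight)
print(changed)  # ['/db/b.dbf', '/home/new.txt']
```

Only the modified and newly created files are selected for the nightly run; everything else is already held in an earlier backup.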
SaskEnergy’s five-year retention requirement called for an extremely high-density, long-lived solution that maximized capacity within a single rack. From a cost perspective, disk was not an option for long-term retention, and the T950 combined with LTO4 (800GB uncompressed) and SAIT tape technology fit the requirement.
The tradeoff of this approach is that while perpetual incremental backups are efficient, simple, and fast when restoring recent files, restoring full server images requires 'touching' large numbers of tapes containing incremental backups and therefore is significantly more time-consuming and expensive. To reduce the risk and hassle of recovering a full system image from incremental backups, SaskEnergy performs a full image archive each month, limiting the maximum incremental window to 31 days.
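A rough sketch of why the monthly full image caps restore effort follows. It assumes, as a simplification, that each nightly incremental is a separate backup set that must be read during a full server restore; the function name and the dates are hypothetical.

```python
from datetime import date


def backup_sets_to_restore(last_full: date, target: date) -> int:
    """Count the backup sets read for a full-image restore as of `target`:
    the most recent full image plus one incremental per day since then."""
    if target < last_full:
        raise ValueError("target predates the full image")
    return 1 + (target - last_full).days


# Hypothetical dates: a full image taken on the first of the month.
print(backup_sets_to_restore(date(2009, 3, 1), date(2009, 3, 2)))   # 2
print(backup_sets_to_restore(date(2009, 3, 1), date(2009, 3, 31)))  # 31
```

Without the monthly image, the count would keep growing with every night since the first backup; with it, the worst case stays bounded by the length of the month.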
Users should also note that systems like email (e.g. Notes) require a different methodology to be able, for instance, to restore a mailbox of a specific user quickly. SaskEnergy uses TSM for Notes, making a full backup once a week with daily incrementals, to solve this problem.
The Implementation
SaskEnergy started the implementation with AIX, using a methodology that attaches directly to the AIX Fibre Channel SAN. This enables the backup of large disk pools in a few hours with minimal application shutdown. Over time, this approach was applied to SaskEnergy’s Intel infrastructure.
Daily backups average 900GB/day (up from 500GB as recently as late last year). A specialized Notes backup drives more than 1TB for each weekly full backup, which fits on one or two cartridges, supplemented by daily incremental backups. To date, most recoveries have been done from the previous night’s changed files, and SaskEnergy has thus far not had to perform a full server recovery.
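The "one or two cartridges" figure for the weekly Notes full backup can be checked with simple arithmetic. The sketch below takes the backup as 1000GB (the text says "more than 1TB", so this is a lower bound) and uses the 800GB uncompressed LTO4 capacity cited earlier; the 2:1 compression ratio is an assumption, and actual ratios depend on the data.

```python
import math


def cartridges_needed(data_gb: float, native_gb: float, compression: float = 1.0) -> int:
    """Cartridges required to hold data_gb, given native cartridge capacity
    and an assumed compression ratio (1.0 means no compression)."""
    return math.ceil(data_gb / (native_gb * compression))


NOTES_FULL_GB = 1000   # lower bound taken from "more than 1TB"
LTO4_NATIVE_GB = 800   # uncompressed LTO4 capacity cited in the text

print(cartridges_needed(NOTES_FULL_GB, LTO4_NATIVE_GB))       # 2 (uncompressed)
print(cartridges_needed(NOTES_FULL_GB, LTO4_NATIVE_GB, 2.0))  # 1 (assumed 2:1 compression)
```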
The Tivoli learning curve was a major implementation challenge. Tivoli Storage Manager is extremely powerful with lots of options. The tradeoff is that effectively implementing the solution requires a focused effort to understand the environment and set the numerous ‘knobs and dials’ to optimize and automate the system. In such a project, organizations are advised to budget for and/or train a domain expert in the area of automated storage management.
Going Forward
In recent years, the tape industry has seen disk encroach on its traditional backup and recovery domain, limiting industry revenues, including media revenue, to around $4B worldwide. However, LTO, higher tape cartridge capacities, longer media life and better reliability will allow the technology to compete effectively for tier 3 archiving and recovery applications. Indeed, estimates based on access and capacity requirements suggest that 60% of digital data is a candidate for placement on tier 3 storage.
The perpetual incremental backup approach used by SaskEnergy provided the following benefits to the organization:
- It reduced backup windows from eight hours to two hours for some applications and to minutes for many others.
- It supported a 24 x 7 strategy by eliminating the need to quiesce Oracle and other applications during backups.
- It reduced costs by consolidating backup devices across the province.
- It reduced media costs by moving to cartridges with capacities of 50GB to 1TB+.
- It enabled full onsite and offsite data retention, compliant with SaskEnergy’s policies.
SaskEnergy made a sound business decision given the retention requirements imposed on the IT department. It chose to use tape as a long-term archival technology, which proved to be vastly less expensive and more efficient than disk-based alternatives. The bottom line is that the experience of SaskEnergy and others underscores that tape is not dying; rather, its role is changing to that of a premier long-term archival technology for data that will hopefully never be accessed.
Action Item: Incremental forever backups may sound crazy. However, organizations with very long retention requirements should consider this philosophy. The perpetual incremental approach is best for storing data where the likelihood of ever having to recover older data is very low. In this instance, for the next ten years tape will be the most cost-effective and efficient technology.