Forward-looking storage administrators face the following realities:
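- Bit error rates for disk drives will soon reach parity with drive capacities: once a drive holds about as many bits as the interval between expected errors, reading it end to end is likely to hit at least one error, meaning that virtually every disk drive will lose some data.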
- The negative impact of a bit error is magnified by the use of encryption, compression, and de-duplication techniques, all of which are on the increase.
- RAID-rebuild times are lengthening to the point where the probability of a second drive failure during the RAID-rebuild process is becoming unacceptably high, and the impact on application service levels is becoming too great.
- Capacity growth and budget limitations will force companies to use more cost-effective, high-capacity drives for most data. Much of this data cannot be protected from a site loss using traditional methods, because the high cost of bandwidth and the limited size of the data pipes prevent companies from replicating it to multiple sites in a timely fashion.
- RAID will not sustain the protection of data in the areas of greatest data growth: unstructured data and massive, structured-data repositories.
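The first and third points above can be made concrete with a little arithmetic. The following is a minimal sketch, not a vendor model: the error rate of one unrecoverable error per 10^14 bits read, the 1,000,000-hour MTBF, the independence of errors and failures, and the function names are all illustrative assumptions rather than figures from this note.

```python
import math

def p_unrecoverable_read_error(capacity_tb: float, bits_per_error: float = 1e14) -> float:
    """Probability of at least one unrecoverable read error when the whole
    drive is read once, assuming independent errors at an assumed rate of
    one error per `bits_per_error` bits."""
    bits_read = capacity_tb * 1e12 * 8  # decimal terabytes to bits
    # 1 - (1 - 1/bits_per_error) ** bits_read, computed in a numerically stable way
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))

def p_second_failure_during_rebuild(surviving_drives: int,
                                    rebuild_hours: float,
                                    mtbf_hours: float = 1_000_000) -> float:
    """Probability that at least one surviving drive fails before the rebuild
    completes, assuming independent, exponentially distributed lifetimes."""
    return -math.expm1(-surviving_drives * rebuild_hours / mtbf_hours)

if __name__ == "__main__":
    for tb in (1, 4, 12.5):
        print(f"{tb} TB full read: {p_unrecoverable_read_error(tb):.1%} chance of an error")
    for hours in (8, 24, 72):
        p = p_second_failure_during_rebuild(surviving_drives=7, rebuild_hours=hours)
        print(f"{hours} h rebuild, 7 surviving drives: {p:.4%} chance of a second failure")
```

Under these assumptions, "parity" is the point at which the drive holds as many bits as the error interval (12.5 TB for 10^14 bits), where a full read has roughly a 63% chance of hitting an error. A rebuild must read every surviving drive in full, so both probabilities grow together as capacities and rebuild times increase.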
A variety of approaches will be applied over the coming years to enable companies to continue using the current combination of RAID, synchronous metro-area replication, and remote asynchronous replication to protect data in latency-sensitive transaction databases. These may include using double-parity RAID protection, moving to smaller-form-factor, lower-capacity drives, increasing the ratio of controllers to drives, and de-duplicating or compressing data before transmitting it to the remote site. For most organizations, this is not an area of significant concern. The real challenge for organizations is how to cost-effectively protect unstructured data and massive structured-data repositories from the near-certainty of bit errors on drives and from the less-probable but catastrophic impact of a data-center loss.
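To see why bandwidth limits remote replication of large repositories, and how much de-duplication or compression before transmission can help, a rough transfer-time estimate is useful. This is only a sketch: the function name, the 80% usable-link efficiency, and the example data size, link speed, and reduction ratio are hypothetical values chosen for illustration.

```python
def replication_hours(data_tb: float,
                      link_mbps: float,
                      reduction_ratio: float = 1.0,
                      link_efficiency: float = 0.8) -> float:
    """Estimated hours to send `data_tb` decimal terabytes over a link of
    `link_mbps` megabits per second.

    reduction_ratio: assumed de-duplication/compression ratio (4.0 means 4:1).
    link_efficiency: assumed fraction of raw bandwidth usable for payload.
    """
    if reduction_ratio <= 0 or link_mbps <= 0 or not 0 < link_efficiency <= 1:
        raise ValueError("ratio and link speed must be positive; efficiency must be in (0, 1]")
    bits_to_send = data_tb * 1e12 * 8 / reduction_ratio
    usable_bits_per_second = link_mbps * 1e6 * link_efficiency
    return bits_to_send / usable_bits_per_second / 3600

if __name__ == "__main__":
    raw = replication_hours(data_tb=100, link_mbps=1000)
    reduced = replication_hours(data_tb=100, link_mbps=1000, reduction_ratio=4.0)
    print(f"100 TB over 1 Gb/s, no reduction: {raw:.0f} hours")
    print(f"100 TB over 1 Gb/s, 4:1 reduction: {reduced:.0f} hours")
```

With these example numbers, an initial copy takes well over a week even on a dedicated gigabit link, which is the kind of gap the text describes between data growth and what replication pipes can carry in a timely fashion.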
Many organizations have gone through massive data-center consolidation initiatives and reduced the total number of data centers to a few regional super-centers. For the data centers that support an organization’s transaction systems, a case can be made for consolidating down to as few as two. For the applications experiencing the greatest data growth, however, data-dispersal methods that leverage the organization’s existing, dispersed infrastructure together with cloud-based offerings may be used to advantage.
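One way to reason about data dispersal is to treat it as a k-of-n scheme: data is encoded into n slices placed at different sites, and any k slices are enough to reconstruct it. The sketch below assumes that interpretation, assumes sites fail independently, and uses hypothetical function names and example parameters; it is not a description of any particular product.

```python
from math import comb

def dispersal_availability(n: int, k: int, site_availability: float) -> float:
    """Probability that at least k of n slices are reachable, assuming each
    slice sits at a different site and sites are independently available
    with probability `site_availability`."""
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    p = site_availability
    # Binomial tail: sum over every count of reachable slices from k up to n
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def storage_overhead(n: int, k: int) -> float:
    """Raw capacity consumed per unit of usable data in a k-of-n scheme."""
    return n / k

if __name__ == "__main__":
    # Three full copies behave like a 1-of-3 scheme; 10-of-16 is an example dispersal layout.
    for label, n, k in (("3 full copies", 3, 1), ("10-of-16 dispersal", 16, 10)):
        avail = dispersal_availability(n, k, site_availability=0.99)
        print(f"{label}: overhead {storage_overhead(n, k):.1f}x, "
              f"unavailability {1 - avail:.2e}")
```

The comparison worth drawing from such a model is the trade between raw capacity consumed and the number of simultaneous site or component losses that can be tolerated, evaluated both when everything is working and when some slices are already unreachable.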
Action Item: Before consolidating unstructured data and structured-data repositories into fewer, larger super-centers, organizations should establish a group to evaluate new data-dispersal approaches to ensure data reliability and integrity. This group should be prepared to evaluate the data-protection approaches that can be applied not only to their own organization and data centers but also to cloud-storage providers. Ultimately, the data-protection approach needs to match the service level agreement (SLA) for the application, and the ability of a storage system or service to meet an SLA should be evaluated both when all components of the system are working and during the inevitable periods of component failure.