Storage bits fail, and as a result storage protection systems need the ability to recover data from other storage. There are currently two main methods of storage protection:
- Replication,
- RAID.
Erasure coding is the new kid on the block for storage protection. It started life in RAID 6, and is now poised to be the underpinning of future storage, including cloud storage.
An erasure code provides redundancy by breaking objects up into smaller fragments and storing the fragments in different places. The key is that the data can be recovered from any sufficiently large subset of those fragments; not all of them are needed.
- Number of fragments the data is divided into: m
- Number of fragments the data is recoded into: n (n > m)
- The key property of erasure codes is that an object stored as n fragments can be reconstructed from any m of them.
- Encoding rate: r = m/n (r < 1)
- Storage required: 1/r times the size of the original data
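To make the "any m of n" property concrete, here is a minimal sketch using the simplest possible erasure code: m data fragments plus one XOR parity fragment, so n = m + 1. This is an illustration only. The function names `encode` and `decode` are hypothetical, and production systems with larger n - m typically use stronger codes such as Reed-Solomon.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, m):
    """Split data into m equal fragments and append one XOR parity fragment (n = m + 1)."""
    size = -(-len(data) // m)                 # ceiling division
    padded = data.ljust(size * m, b"\0")      # pad so all fragments have equal length
    fragments = [padded[i * size:(i + 1) * size] for i in range(m)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def decode(fragments, m, length):
    """Rebuild the data from n = m + 1 slots, at most one of which is None (lost)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost fragment")
    fragments = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        # XOR of the m surviving fragments regenerates the lost one.
        fragments[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(fragments[:m])[:length]


data = b"erasure coding"
stored = encode(data, m=4)      # n = 5 fragments
stored[2] = None                # lose any one fragment
assert decode(stored, m=4, length=len(data)) == data
```

Whichever single fragment is lost, data or parity, the remaining m fragments are enough to rebuild the object.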
Replication and RAID can be described in erasure coding terms:
- Replication with two additional copies: m = 1, n = 3, r = 1/3
- RAID 5 (for example, with five drives): m = 4, n = 5, r = 4/5
- RAID 6: m = n - 2, r = (n - 2)/n
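The arithmetic behind these figures can be checked with a few lines of Python. This is a small sketch; the helper names `encoding_rate` and `storage_required` are hypothetical, and the ten-drive RAID 6 group is an assumed example rather than a figure from the text.

```python
from fractions import Fraction


def encoding_rate(m, n):
    """Encoding rate r = m/n for a scheme that rebuilds data from any m of n fragments."""
    if not 0 < m < n:
        raise ValueError("expected 0 < m < n")
    return Fraction(m, n)


def storage_required(m, n):
    """Storage consumed as a multiple of the original data size (1/r)."""
    return 1 / encoding_rate(m, n)


# (label, m, n); the RAID 6 row assumes a hypothetical ten-drive group.
schemes = [
    ("Replication, two additional copies", 1, 3),
    ("RAID 5, five drives", 4, 5),
    ("RAID 6, ten drives (assumed)", 10 - 2, 10),
]
for label, m, n in schemes:
    print(f"{label}: r = {encoding_rate(m, n)}, storage = {storage_required(m, n)}x")
```

Triple replication therefore stores three times the original data, while the two RAID examples store only 1.25 times as much.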
When recovering data, it is important to know whether any fragment is corrupted. Here m is the number of verified fragments required to reconstruct the original data. It is also important to identify the data to ensure immutability. A secure verification hashing scheme is required to both verify and identify data fragments.
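One way to get both verification and identification from a single hash is to name each fragment by a cryptographic digest of its contents. The sketch below assumes SHA-256 and a simple dictionary as the fragment store. Both are illustrative choices, not something the scheme described here prescribes, and the function names are hypothetical.

```python
import hashlib


def fragment_id(fragment):
    """Identify a fragment by the SHA-256 digest of its contents."""
    return hashlib.sha256(fragment).hexdigest()


def select_verified(stored, m):
    """Return the fragments whose contents still match their identifier.

    `stored` maps fragment id -> fragment bytes as read back from storage.
    Raises ValueError if fewer than m fragments verify, since reconstruction
    needs m verified fragments.
    """
    good = {fid: frag for fid, frag in stored.items() if fragment_id(frag) == fid}
    if len(good) < m:
        raise ValueError(f"only {len(good)} verified fragments, need {m}")
    return good


# Five fragments stored under their own hashes (n = 5, m = 4 assumed).
fragments = [b"frag-0", b"frag-1", b"frag-2", b"frag-3", b"parity"]
stored = {fragment_id(f): f for f in fragments}

# Simulate silent corruption of one fragment.
first_id = fragment_id(fragments[0])
stored[first_id] = b"frag-X"

usable = select_verified(stored, m=4)
assert first_id not in usable and len(usable) == 4
```

Because the identifier is derived from the contents, a fragment cannot change without its name ceasing to match, which is what makes the stored data effectively immutable.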
It can be shown mathematically that the greater the number of fragments, the greater the availability of the system for the same storage cost. However, the greater the number of fragments that blocks of data are broken into, the higher the compute power required.
For example, if two copies (r = ½) provided 99% availability, 32 fragments with the same r (16/32) and therefore the same amount of storage would provide an availability of 99.9999998%. You can find the math in a paper by Hakim Weatherspoon and John D. Kubiatowicz.
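The order of magnitude of this improvement can be reproduced with a simple binomial model. The sketch below assumes that each fragment (or copy) is independently available with probability p = 0.9, which is the value that gives two copies 99% availability. This independence model is an assumption made for illustration and may not match the paper's exact model, so the last digits can differ from the figure quoted above.

```python
from math import comb


def unavailability(m, n, p):
    """Probability that fewer than m of n fragments are available.

    Assumes each fragment is independently available with probability p.
    Summing the failing tail directly avoids the precision loss of
    computing 1 - availability for values very close to 1.
    """
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(m))


p = 0.9  # assumed per-fragment availability
two_copies = unavailability(1, 2, p)       # replication: any 1 of 2
thirty_two = unavailability(16, 32, p)     # erasure code: any 16 of 32

print(f"two copies:   unavailable {two_copies:.3g} of the time")
print(f"32 fragments: unavailable {thirty_two:.3g} of the time")
```

Under these assumptions the two-copy scheme is unavailable about 1% of the time, while the 16-of-32 scheme is unavailable on the order of one part in a billion, for the same storage cost.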
Figure 1 below shows the topology of a storage system using erasure coding and RAIN, a Redundant Array of Inexpensive Nodes.
Wikibon believes that storage using erasure coding with a large number of fragments will be particularly important for cloud storage but will also become used within the data center. Archiving will be an early adopter of these techniques.
Action Item: All storage professionals will need to be familiar with erasure coding and the trade-offs for data center and cloud storage.
Footnotes: The ideas in this post are used in two other posts on erasure coding:
- Erasure Coding Revolutionizes Cloud Storage, Wikibon 2011;
- Reducing the Cost of Secure Cloud Archive Storage by an Order of Magnitude, Wikibon 2011.