The Wikibon Project continues to identify strategies, technologies, and best practices for managing the business and environmental impact of rising energy consumption, as well as the consequences of not taking proactive steps to address these challenges.
Reducing storage energy consumption, and possibly even energy costs, is usually thought of in terms of improving the hardware infrastructure by shrinking footprint or by replacing aging UPS, chillers, and air-conditioning gear with newer, more efficient technologies. While these steps will help address the mounting energy challenges facing today's data centers, other strategies also need to be considered.
Beyond the issues of hardware infrastructure, focus has been shifting to a strategy of “getting rid of stuff.” Server virtualization and storage consolidation have clearly received much attention in the past year. But what about getting rid of duplicate or redundant data? What about improving poor storage device utilization? What about optimizing the location of data within the tiered storage hierarchy? It is a matter of taking proactive rather than reactive steps to minimize the amount of stored data.
Businesses are re-architecting their storage systems with a variety of new capabilities that improve storage efficiency by reducing the number of devices, improving I/O performance, increasing allocation levels, and adding stronger security measures. Each of these gets rid of stuff in a different way, directly advancing the goal of making the data center greener. Expected ranges of storage savings are shown in the table below.
Storage Issue | Solution | Avg. Savings
---|---|---
Too many devices | Storage virtualization/consolidation | Varies
Over-allocation of storage | Thin provisioning | 25%-30%
Removing redundancy | Compression | 50%-66%
Eliminating duplicate data | Deduplication | >80%
Optimizing data placement | HSM software | 10%-50%
Short-stroked disks (I/O performance) | Flash disk drives | 5%-10%
Although available on mainframes since 1965, thin provisioning is now a staple for non-mainframe systems. It allows physical disk storage to be reserved only when data is written, not when the application is first configured. In traditional storage provisioning, application teams guess at how much storage they might consume and reserve that full amount on day one. The amount reserved can be used only by that application and is not available to others. This means that within a data center, large portions of costly storage sit unused yet still consume power and cooling resources, even though no data has actually been written to the disk.
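To make the difference concrete, here is a minimal sketch of thin-provisioning accounting under a simplified pool model; the extent size, class name, and method names are illustrative assumptions, not any vendor's API. Physical extents are consumed only when a volume is actually written, so the gap between requested and allocated capacity stays visible.

```python
# Illustrative sketch only: a hypothetical thin pool where physical extents
# are backed on first write rather than at volume creation.

EXTENT_MB = 64  # physical allocation granularity assumed for this example

class ThinPool:
    def __init__(self, physical_mb):
        self.physical_mb = physical_mb
        self.allocated_mb = 0   # physical capacity actually backed by disk
        self.volumes = {}       # volume name -> virtual size and written extents

    def create_volume(self, name, virtual_size_mb):
        # Thin: only the virtual size is recorded; no physical space is reserved yet.
        self.volumes[name] = {"virtual_mb": virtual_size_mb, "extents": set()}

    def write(self, name, offset_mb, length_mb):
        vol = self.volumes[name]
        first = offset_mb // EXTENT_MB
        last = (offset_mb + length_mb - 1) // EXTENT_MB
        for ext in range(first, last + 1):
            if ext not in vol["extents"]:
                if self.allocated_mb + EXTENT_MB > self.physical_mb:
                    raise RuntimeError("pool exhausted: add physical capacity")
                vol["extents"].add(ext)
                self.allocated_mb += EXTENT_MB  # backed only on first write

pool = ThinPool(physical_mb=1_000)
pool.create_volume("app1", virtual_size_mb=5_000)  # application "reserves" 5 GB virtually
pool.write("app1", offset_mb=0, length_mb=200)     # only ~256 MB of extents get backed
print(pool.allocated_mb, "MB physically backed of",
      pool.volumes["app1"]["virtual_mb"], "MB requested")
```

With traditional (thick) provisioning, the full 5,000 MB would be set aside, and would draw power and cooling, on day one.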
Compression and deduplication reduce data in different ways. Compression uses an algorithm to reduce the size of a particular file by eliminating redundant bits. If the exact same file is stored multiple times, then no matter how good the compression method is, the backup storage will still end up with multiple copies of the compressed file. Typical compression ratios range from 2x to 4x. File deduplication eliminates redundant copies of data, storing only one, and so reduces data for many applications beyond what compression can accomplish alone. A future generation of deduplication technology is expected to move the process from a reactive to a proactive data reduction approach that serves a wider range of file types.
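The contrast can be shown in a few lines. The sketch below assumes SHA-256 content hashing for duplicate detection and zlib for compression; the file names and contents are made up, and real products work at the block or segment level rather than whole files.

```python
import hashlib
import zlib

# Illustrative only: compression shrinks every copy independently,
# while deduplication stores identical content once.
files = {
    "backup_mon/report.doc": b"quarterly results " * 500,
    "backup_tue/report.doc": b"quarterly results " * 500,   # exact duplicate
    "backup_wed/notes.txt":  b"meeting notes " * 200,
}

# Compression alone: each copy is compressed and stored separately.
compressed_total = sum(len(zlib.compress(data)) for data in files.values())

# Deduplication: identical content (same SHA-256 digest) is kept only once.
unique = {hashlib.sha256(data).hexdigest(): data for data in files.values()}
dedup_total = sum(len(data) for data in unique.values())

# The two techniques combine: deduplicate first, then compress the unique copies.
combined_total = sum(len(zlib.compress(data)) for data in unique.values())

raw_total = sum(len(data) for data in files.values())
print(f"raw: {raw_total}  compressed: {compressed_total}  "
      f"deduped: {dedup_total}  dedup+compress: {combined_total}")
```

The duplicate report is what compression cannot remove on its own and what deduplication eliminates entirely.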
Flash disk drives consume little energy and are becoming a viable alternative to traditional short-stroked or partially allocated disks, a practice that improves performance by reducing disk-arm contention. The high-performance applications suited to flash memory can represent as much as 10% of disk storage at current flash pricing levels. Flash SSDs are non-volatile, consume little power, deliver much higher read performance than magnetic disk, give off little heat, and come in a small form factor, with pricing moving into the disk drive realm.
Classifying data has immense value in identifying which needless data to eliminate. Not all data is created equal, and the value of data can change throughout its lifetime. For non-mainframe systems, data classification can be a big step, but it is quickly becoming an essential one for most data management functions. Data classification encompasses aligning data with the most efficient, cost-effective storage architectures and services based on the changing value of that data. Defining policies that map application requirements to storage tiers is assisted by existing HSM-type software as well as emerging data classification and policy-based management software from a variety of storage software companies.
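As a rough illustration of policy-based tier placement, the sketch below maps a data set's age and access frequency to a storage tier. The tier names, thresholds, and function are hypothetical assumptions for illustration, not the behavior of any particular HSM or classification product.

```python
from datetime import datetime, timedelta

# Hypothetical tiering policy, ordered from most to least expensive tier.
# Thresholds and tier names are assumptions, not a product's defaults.
POLICIES = [
    {"tier": "flash",   "max_idle_days": 7,    "min_accesses_per_week": 50},
    {"tier": "fc_disk", "max_idle_days": 90,   "min_accesses_per_week": 1},
    {"tier": "sata",    "max_idle_days": 365,  "min_accesses_per_week": 0},
    {"tier": "archive", "max_idle_days": None, "min_accesses_per_week": 0},
]

def classify(last_access, accesses_per_week, now=None):
    """Return the first tier whose policy the data set satisfies."""
    now = now or datetime.now()
    idle_days = (now - last_access).days
    for rule in POLICIES:
        young_enough = rule["max_idle_days"] is None or idle_days <= rule["max_idle_days"]
        if young_enough and accesses_per_week >= rule["min_accesses_per_week"]:
            return rule["tier"]
    return "archive"

now = datetime.now()
print(classify(now - timedelta(days=2),   accesses_per_week=200, now=now))  # -> flash
print(classify(now - timedelta(days=400), accesses_per_week=0,   now=now))  # -> archive
```

In practice the classification inputs (ownership, retention requirements, application tier) are richer than two attributes, but the principle is the same: the policy, not a person, decides where each data set lives.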
Action Item: Though challenging, and becoming increasingly complex as storage requirements grow, re-architecting the data center yields much improved operational efficiency and cost savings. Technologies including HSM (hierarchical storage management), deduplication, thin provisioning, virtualization, and snapshot copies can all play a role, but at the end of the day the root issue is the ability to defensibly get rid of useless data. That means being able to make smart decisions about your information, and that starts with the ability to classify it. Data classification will soon become a requirement for most enterprise and SMB data centers if they are to have any hope of managing their ever-growing data. The time to start is now.