Contributing author: Bill Mottram
Introduction
Wikibon Energy Lab Validation Reports are designed to assist customers in understanding the degree to which a product contributes to energy efficiency. The four main goals of these studies are to:
- Validate the hardware energy efficiency of a particular technology as compared to an established baseline.
- Assess the potential contribution of software technologies to power savings, and validate the actual contribution in real-world installations.
- Quantify the contribution of the hardware and software technologies to a green data center.
- Educate business, technology, and utility industry professionals on the impact of technologies on reducing energy consumption.
Our objective is to identify not only the hardware energy consumption but also the often overlooked and hard-to-quantify green software aspects of technologies. Wikibon Energy Lab Validation Reports are submitted to utilities such as Pacific Gas & Electric Company as part of an energy incentive qualification process.
Wikibon Energy Lab defines and validates the hardware testing procedures to determine the energy consumed by specific products in various configurations. As well, Wikibon reviews actual customer results achieved in the field to validate the effectiveness of these technologies based on real-world field-data analysis. These proof points are mandatory for the utility company to qualify a specific vendor's technology for energy incentives.
Wikibon Energy Lab Reports are not sponsored. Rather they are deliverables required by PG&E and other utilities as part of an incentive qualification process. As part of its Conserve IT Program, Wikibon is paid by the vendor to perform services associated with securing incentive rebates from utilities for end customers that acquire the vendor's technologies. To ensure this process is completely independent, Wikibon lab and field results are sometimes vetted by a third party engineering firm hired by PG&E or other utilities.
Wikibon only produces Lab Validation Reports for technologies that have been qualified for rebate incentives by PG&E or other utilities and have passed strict utility company guidelines. By adhering to this criterion, Wikibon assures its community of the independence of these results.
Executive Summary
Virtualization and thin provisioning are two powerful storage optimization techniques that significantly improve effective storage utilization. Compellent has leveraged these technologies to implement thin copying and automated tiered storage management to deliver what Wikibon believes is one of the most advanced and energy-efficient systems in the storage industry. The operational efficiencies of its architecture have enabled storage administrators to store more data on less physical hardware while delivering substantial operational productivity and improved energy efficiencies.
Figure 1 below illustrates the impact of virtualization and thin provisioning on storage utilization.
Figure 1: Compellent Thin Provisioning and Virtualization Technologies
The purpose of this report is to look at how this technology impacts the operational costs of power and cooling and present a detailed analysis of these factors for customers.
Description of Virtualization and Thin Provisioning
The fundamental building block of storage systems is a storage volume. This can be (and usually is) smaller than a disk drive, but can also be bigger and span multiple storage devices. The operating system “mounts” a storage volume to be able to access the data on it. For example, a PC has one or more disk drives. A number of logical drives (each equivalent to a storage volume) can be made from these disk drives (C: drive, D: drive, etc.), with space allocated to them when they are created. External volumes can be accessed from a PC over a LAN or WAN.
This approach has several problems:
- Because the space is allocated when the storage volume is created, a large amount of space is initially allocated but not used.
- Over time, the data becomes fragmented, for example when data is deleted but the space is not released.
- It is difficult to move volumes around to optimize storage, as the application has to be stopped while the volume is moved. This limits the ability to reorganize the data and take advantage of spare capacity on the disk drives.
- When a copy of a volume is required, the whole storage volume has to be copied.
Virtualization breaks the connection between what is written, what is allocated, and what is unused. This concept was applied first by IBM in the 1960s to the storage on the CP/67 server operating system, which later became VM with virtual disks.
Compellent has integrated virtualization into the storage operating system. Virtualization allows a number of techniques to be used to reduce the amount of data that is actually stored:
Thin Copying
Traditionally, the application using the volume must be stopped while the volume is copied, because updates written to the volume during the copy process would not be included in the copy. When the copy is complete (this can take a significant time for large volumes), the application can be restarted.
A more modern way of copying an active volume is to take a “snap copy.” The storage controller starts to copy the data from one volume to another and caches all the writes. It then ensures that the data is updated appropriately in both volumes until the volume copy is complete, at which point the target (new) volume is automatically disconnected. This allows the application to continue running, but it still requires multiple copies of the volume to be stored on disk and a large number of disk actuator movements to accomplish.
Thin copying is a different technique which creates a virtual copy that is only updated with the original data if the source (original) volume is updated. The average amount of data that is updated on a volume during a 24-hour period is usually small (<5%), which means that multiple copies can be made over time with a small overall overhead and in a much shorter time.
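The copy-on-write mechanism behind thin copying can be sketched in a few lines. This is an illustrative model only, not Compellent's implementation: the thin copy starts empty, and a block's original contents are preserved only when the source volume overwrites that block, so a copy of a volume with a <5% daily change rate consumes only a few percent of the volume's space.

```python
# Illustrative copy-on-write thin copy (hypothetical names, not a real API).
class ThinCopy:
    def __init__(self, source_blocks):
        self.source = source_blocks      # live source volume: block number -> data
        self.preserved = {}              # blocks saved as they existed at copy time

    def write_source(self, block, data):
        # Before the source overwrites a block, preserve its old contents
        # in the thin copy; only changed blocks consume extra space.
        if block not in self.preserved:
            self.preserved[block] = self.source.get(block)
        self.source[block] = data

    def read_copy(self, block):
        # Unchanged blocks are read straight from the source volume.
        if block in self.preserved:
            return self.preserved[block]
        return self.source.get(block)

vol = {0: "a", 1: "b", 2: "c"}
snap = ThinCopy(vol)
snap.write_source(1, "B")                # only block 1 is preserved
print(len(snap.preserved), snap.read_copy(1), vol[1])   # 1 b B
```

Because only overwritten blocks are preserved, many point-in-time copies can coexist with a small overall overhead, which is what makes frequent thin copies practical.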
Thin Provisioning
Figure 2 below diagrams how thin provisioning works. The server believes that it has the storage allocated to it in a storage volume, just as it would in a traditional (fat) provisioning environment. The allocation is virtual, however, and real physical space is dedicated only when data is written to the volume.
Compellent implements this capability using a Dynamic Block Architecture, which records and tracks specific information about every block of data (variable in size), giving the system intelligence about how each block is being used. Information about the blocks includes the time written, the type of disk drive used, the type of data stored, RAID level, etc. All of this metadata, or “data about the data,” enables Storage Center to take a more sophisticated approach to minimizing the amount of data stored and ensuring that the data is held on the most efficient device. These dynamic blocks are held in a common storage pool and dedicated to a volume only when written. As the pool serves a large number of volumes, the probability that all the storage will be requested at once is vanishingly small.
Compellent also has a thin import feature that allows data stored without thin provisioning to be migrated with thin provisioning onto newly installed Compellent drives. It also has a feature called "free space recovery" that allows automatic recovery of deleted storage space on Windows® FAT drives; it runs as a background task in the array.
One unique feature of Compellent’s approach, compared with most other thin provisioning providers, is that data within a volume can be automatically spread across many different types of drive, according to the performance requirements of the blocks of data. This enables automatic optimal placement of data within a volume with no manual intervention (Figure 2).
Figure 2 - Diagram of Thin Provisioning. Source: Compellent Thin-Provisioning Software Products, downloaded December 2008
Thin provisioning does not work with all operating systems and file systems; care has to be taken in setting the correct parameters to avoid allocating a large amount of space to a server and then finding that the file system or database has written to all the space available. As storage virtualization and thin provisioning have matured, software developers have also ensured that their software can take advantage of thin provisioning.
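The allocate-on-write behaviour described above can be illustrated with a minimal sketch. All names (StoragePool, ThinVolume) are hypothetical: the volume reports its full logical size to the server, but physical blocks are taken from the shared pool only on first write, which is also where the over-allocation risk noted above surfaces.

```python
# Illustrative thin provisioning: logical size is promised up front,
# physical blocks are dedicated from a shared pool only on first write.
class StoragePool:
    def __init__(self, physical_blocks):
        self.free = physical_blocks

    def allocate(self):
        # This is the over-allocation risk: the pool can run out even
        # though every volume believes it still has logical space left.
        if self.free == 0:
            raise RuntimeError("pool exhausted")
        self.free -= 1

class ThinVolume:
    def __init__(self, pool, logical_blocks):
        self.pool = pool
        self.logical = logical_blocks    # what the server sees
        self.written = set()             # blocks actually backed by disk

    def write(self, block):
        if block not in self.written:
            self.pool.allocate()         # dedicate physical space on first write
            self.written.add(block)

pool = StoragePool(physical_blocks=100)
vol = ThinVolume(pool, logical_blocks=1000)   # 10:1 over-allocation
for b in range(30):
    vol.write(b)
print(len(vol.written), pool.free)            # 30 70
```

The pool serves many volumes in practice, so the aggregate physical demand stays well below the sum of the logical sizes, which is what the utilization figures later in this report measure.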
Measurement Methodology for Power and Results
Baseline The baseline for the power savings is the storage array configuration required if virtualization and thin provisioning software is not installed on the storage array. The power savings are calculated from the power requirements of the baseline array less the power requirements of the installed array.
Equipment Measured: Wikibon reviewed the power measurements made on Compellent’s Series 20 and 30 storage arrays. Table 1 below gives the summary and detailed results of the measurements.
Power Measurements Wikibon believes that the benchmarks used and the power measurements made were done professionally and in good faith, and are accurate to within 5%. In the opinion of Wikibon, this measurement represents a good and reasonable estimate of the power consumed in real-world applications across a number of drives for “typical” combinations of applications found in a data center.
Measuring Equipment Used: The measurements were made with a Kill-A-Watt EZ Model P4460 and were completed in October 2008.
Location of Testing The testing was done in Compellent’s development facility at 7625 Smetana Lane, Eden Prairie, MN 55344-3712 (tel: 952-294-3300). The testing was overseen by Lawrence E. Aszmann, CTO (tel: 952-294-3303).
Methodology The key component of a storage array that varies power consumption with different workloads is the drive enclosure, because changes in I/O rate govern the actuator movement on the drive. The Compellent 20 and 30 series contain up to 16 drives per enclosure. Most of the measurement effort was dedicated to the drive enclosure to measure power consumption with different drive types and workloads. The relevant properties of the disk drives are:
- Capacity - The amount of data held, measured in gigabytes (GB), where one GB is equivalent to 1,000 million bytes.
- Spin Speed - The rotational speed of the disk, measured in rpm (e.g., 15K = 15,000 revolutions per minute).
- Transfer Rate - The maximum transfer rate from the disk, measured in gigabits (Gb)/sec, where one gigabit = 1,000 million bits.
- Interface – The means by which the drive connects and exchanges data with a host computer. Compellent arrays support either FC (Fibre Channel) or SATA (Serial Advanced Technology Attachment) interfaces.
The measurements were made with three different workloads:
- Sequential read,
- Random read and write,
- Idle (no data being transferred).
The three benchmarks were averaged to produce an overall figure of power consumption for the Compellent arrays. In Wikibon’s opinion, this measurement represents a good and reasonable estimate of the power consumed in real-world applications across a number of drives for “typical” combinations of applications found in a data center, and is within a 5% accuracy margin.
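The averaging step is simple arithmetic, shown here for clarity; the per-workload wattages below are invented for illustration and are not the measured Compellent figures.

```python
# Illustrative only: average the three workload measurements into one
# overall per-drive power figure, with the stated +/-5% accuracy band.
watts = {
    "sequential_read": 14.0,     # hypothetical watts
    "random_read_write": 15.5,   # hypothetical watts
    "idle": 10.5,                # hypothetical watts
}

overall = sum(watts.values()) / len(watts)
low, high = overall * 0.95, overall * 1.05
print(round(overall, 2), round(low, 2), round(high, 2))
```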
Verification Compellent’s storage arrays have software running in the array that tracks the benefits that thin provisioning and virtualization deliver. Details are given in the section below called “Confirmation of Energy Savings with Virtualization and Thin Provisioning”.
Power Measurements Summary Wikibon reviewed the power measurements made on Compellent’s Series 20 and 30 storage arrays using a 60-cycle 120vac single-phase power source. These arrays are built from standard components, which enables the power of different configurations to be calculated. The measurements were made at the component level. Table 1 opposite gives a summary of the results of the measurements.
Drive Measurement Results
The power results from the SATA drive measurements taken by Compellent are detailed in Table 2. The results of the analysis allow the power consumption of each of the drive components (drive enclosure and each disk drive type) to be calculated.
Table 3 summarizes the power results from the Fibre Channel (FC) drive measurements taken by Compellent. The power source was 60-cycle 120vac single-phase. The results of the analysis allow the power consumption of each drive component (drive enclosure and each disk drive type) to be calculated. The 146GB 15K drive was not directly measured but interpolated from the other FC measurements.
Controller Measurement Results
The power characteristics of the Compellent Series 20 & 30 controller nodes, measured using a 60-cycle 120vac single-phase power source, are given in Table 4. Seventy-two-hour battery backup for cache is included in the controller measurements. The final averages are used in Table 1.
Confirmation of Energy Savings with Virtualization and Thin Provisioning Technology
Compellent has a report which shows the savings of disk space made by virtualization and thin provisioning. A sample output is shown in Figure 3 below.
Figure 3 - Output from Compellent Array Management Software that Confirms the Savings from Virtualization and Thin Provisioning
The report can be run to confirm the disk savings that have been achieved from the installation of virtualization and thin provisioning.
Virtualization & Thin Provisioning Measurements
Virtualization and thin provisioning are relatively new technologies that significantly reduce the amount of storage that is required, and improve the utilization of that storage. A detailed description of the technology is given in Appendix I. The purpose of this section of the Energy Lab report is to estimate the level of saving that can be achieved with virtualization and thin provisioning.
Compellent offers a “Call-Home” program that is used to help predict a component failure. A large percentage of its customer base subscribes. As part of the prevention analysis, the system captures metadata about disk usage and function usage from the service processor in each array connected to the system and each volume created on the arrays. The database allows this to be looked at on a customer basis, taking an average for all the volumes at each customer. This is a good way to look at the data, as the way a customer organizes volumes is usually consistent within an organization.
One key metric is the improvement in utilization that virtualization and thin provisioning achieve. Each volume has a logical size (what the server sees) and a physical size (what has actually been written). In traditional storage arrays, the physical and logical sizes are very similar. With virtualization and thin provisioning, the physical size is usually significantly smaller than the logical size.
Wikibon performed a statistical analysis of the data from 115 customer installations and found that the actual physical disk required to hold the data was only 33% of the logical volume size. From an energy point of view, 67% of the disks were not required. However, a closer analysis of the data showed that a few of the customers had very large gains. These customers are finding new ways of exploiting virtualization and thin provisioning. For example, customers were making multiple thin copies of data many times an hour, so that they could quickly recover back to an earlier copy if data was corrupted. This is very efficient with thin copying, because only changed data has to be written. The amount of disk space saved can be very high indeed.
However, from an energy rebate point-of-view, this is not a saving because the customer was not making physical copies before. To estimate the real power savings, Wikibon assumed that customers with savings in disk space of 50% or more had changed the procedures on how they were storing data. Wikibon took the subset of customers (n=42) that had less than 50% savings. The weighted average saving was 35%. Wikibon believes that this figure is a good predictor of the direct reduction in disk capacity that will be found in customers.
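The screening and weighting steps above can be sketched as follows. The sample figures are invented for illustration; the real analysis covered 115 installations, of which 42 fell under the 50% threshold.

```python
# Illustrative version of the screening: exclude customers whose space
# saving is 50% or more (assumed procedure change), then take a
# capacity-weighted average saving over the remainder.
customers = [
    # (logical_GB, physical_GB) -- hypothetical sample data
    (1000, 700),    # 30% saving -> kept
    (2000, 1200),   # 40% saving -> kept
    (1500, 300),    # 80% saving -> excluded as a procedure change
]

kept = [(lg, ph) for lg, ph in customers if (lg - ph) / lg < 0.50]
total_logical = sum(lg for lg, _ in kept)
total_physical = sum(ph for _, ph in kept)
weighted_saving = 1 - total_physical / total_logical
print(round(weighted_saving, 3))   # 0.367
```

Weighting by capacity rather than averaging per-customer percentages keeps large installations from being under-represented in the final figure.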
Wikibon did a detailed analysis of two customer installations that were using Compellent’s storage arrays with virtualization and thin provisioning. The results of this analysis are shown in Table 5 below.
The average in Table 5 is 37%, which is in line with previous findings. The default figure of 35% has been used in the Wikibon Energy Lab Power Calculator. This calculator can be used to determine the direct power savings from virtualization and thin provisioning as well as the indirect savings from power distribution and air conditioning. The overall business case is calculated.
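A back-of-envelope version of the calculator's arithmetic is sketched below. All inputs are assumptions for illustration (the array wattage and the distribution/cooling multiplier are not figures from this report); only the 35% capacity reduction comes from the field analysis above.

```python
# Illustrative savings arithmetic: direct savings scale the array's drive
# power by the 35% capacity reduction; indirect savings apply a
# facility multiplier for power distribution and air conditioning.
drive_power_w = 2000.0      # hypothetical measured drive power for the array
capacity_saving = 0.35      # default figure from the field analysis above
facility_multiplier = 2.0   # assumed total facility watts per IT watt

direct_w = drive_power_w * capacity_saving       # saving at the array
total_w = direct_w * facility_multiplier         # including cooling/distribution
kwh_per_year = total_w * 24 * 365 / 1000
print(direct_w, total_w, round(kwh_per_year))    # 700.0 1400.0 12264
```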
The Compellent technology is qualified for incentives by PG&E, and qualification by other power utilities is in progress.