Contributing author: Bill Mottram
Introduction
Wikibon Energy Lab Validation Reports are designed to assist customers in understanding the degree to which a product contributes to energy efficiency. The four main goals of these studies are to:
- Validate the hardware energy efficiency of a particular technology as compared to an established baseline.
- Assess the potential contribution of software technologies to power savings, and validate the actual contribution in real world installations.
- Quantify the contribution of the hardware and software technologies to a green data center.
- Educate business, technology, and utility industry professionals on the impact of technologies on reducing energy consumption.
Our objective is to identify not only the hardware energy consumption but also the often overlooked and hard-to-quantify green software aspects of technologies. Wikibon Energy Lab Validation Reports are submitted to utilities such as Pacific Gas & Electric Company as part of an energy incentive qualification process.
Wikibon Energy Lab defines and validates the hardware testing procedures to determine the energy consumed by specific products in various configurations. In addition, Wikibon reviews actual customer results achieved in the field to validate the effectiveness of these technologies based on real-world field-data analysis. These proof points are mandatory for the utility company to qualify a specific vendor's technology for energy incentives.
Wikibon Energy Lab Reports are not sponsored. Rather, they are deliverables required by PG&E and other utilities as part of an incentive qualification process. As part of its Conserve IT Program, Wikibon is paid by the vendor to perform services associated with securing incentive rebates from utilities for end customers that acquire the vendor's technologies. To ensure this process is completely independent, Wikibon lab and field results are sometimes vetted by a third-party engineering firm hired by PG&E or other utilities.
Wikibon only produces Lab Validation Reports for technologies that have been qualified for rebate incentives by PG&E or other utilities and have passed strict utility company guidelines. By adhering to this criterion, Wikibon assures its community of the independence of these results.
Executive Summary
Virtualization and Thin Provisioning Technology
Hitachi has implemented virtualization and thin provisioning in its USP V and USP VM series of storage arrays. Virtualization and thin provisioning technologies help solve the problem of low storage utilization by allowing capacity on different disk drives to be logically joined together into a storage pool. This capacity is shared by many servers and applications. In addition, only the data that is actually written is stored on disk; data that is allocated but not written does not use storage capacity. The power savings from this technology are realized because the number of spinning hard disks required is significantly reduced; for example, increasing utilization from 35% to 70% halves the number of drives required, as shown in Figure 1 below. The expected figure for incentive applications is a 35% reduction in spinning drives, as discussed in the section “Calculation of Energy Savings and Incentives” below.
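The arithmetic behind Figure 1 can be sketched as follows; the 50 TB of written data and the 146 GB drive size are illustrative assumptions, not figures from the Hitachi study:

```python
import math

def drives_required(written_tb, drive_tb, utilization):
    """Number of drives needed to hold written_tb of data when each
    drive of drive_tb capacity is, on average, filled to 'utilization'."""
    return math.ceil(written_tb / (drive_tb * utilization))

# Hypothetical example: 50 TB of written data on 146 GB (0.146 TB) drives.
fat_drives = drives_required(50, 0.146, 0.35)    # traditional provisioning, 35% utilization
thin_drives = drives_required(50, 0.146, 0.70)   # thin provisioning, 70% utilization
print(fat_drives, thin_drives, 1 - thin_drives / fat_drives)  # roughly 50% fewer spinning drives
```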
Hitachi has introduced one additional function on the USP V/VM series; it allows almost any storage array to be attached to it and provides the same virtualization and thin provisioning functionality to this “external” storage. This means that the same benefits of better storage utilization can be applied both to the storage inside the Hitachi USP V/VM arrays and to any storage attached to these arrays.
Baseline
The baseline for the power savings is the storage array configuration that would be required if the virtualization and thin provisioning software were not installed on the storage array. The power savings are calculated as the power requirements of the baseline array less the power requirements of the installed array.
Power Measurements
Wikibon reviewed the power measurements made on Hitachi’s USP V and USP VM storage arrays. These arrays are built from a series of common components, which enables the power consumption of different configurations to be calculated. Hitachi has made a large number of measurements and put them into an Excel weight and power calculator, which is updated regularly and is now available on the Hitachi web site. Under Wikibon’s guidance, Hitachi ran tests on Hitachi equipment to verify the accuracy of calculator V8.6rb. As a result of this exercise, Wikibon is confident that the power calculator will estimate the actual power used by the Hitachi storage arrays to within 5%. The details are in the section “Measurement Results” below.
Power Savings from Virtualization and Thin Provisioning
Wikibon looked at typical savings from Hitachi customers using virtualization and thin provisioning, as well as extensive data collected from a broad group of end users who have implemented thin provisioning and submitted results to PG&E to demonstrate the energy-efficiency effectiveness of this technology. The overall conclusion is that virtualization and thin provisioning deliver, on average, about a one-third reduction in the number of disks required. This leads to an approximately one-third reduction in the power required. Full details are given in the section “Virtualization & Thin Provisioning Measurements” below.
Verification
Hitachi’s storage arrays run software in the array that tracks the storage reduction benefits delivered by thin provisioning and virtualization. Details are given in the section “Confirmation of Energy Savings with Virtualization and Thin Provisioning Technology” below.
Conclusion
These results, used in combination with the power measurements and power calculator for Hitachi’s equipment, allow a quick and accurate calculation of the actual power that would be consumed by Hitachi’s storage array products in a data center environment, and of the likely power savings that virtualization and thin provisioning will achieve. This estimate can be verified if necessary at any time after the installation. Wikibon has created a calculator that allows the incentives to be quickly calculated. The output for a sample customer is shown in Table 2 in the section “Calculation of Energy Savings and Incentives”.
Measurement Methodology for Power
The key component of a storage array whose power consumption varies with workload is the drive bay, because changes in I/O rate change the actuator movement on the drive. Most of the measurement effort was therefore dedicated to measuring the power consumption of the drive bay under different workloads. The relevant properties of the disk drives are:
- Capacity - The amount of data held, measured in gigabytes (GB), where one GB is equivalent to 1,000 million bytes
- Spin Speed - The rotational speed of the disk, measured in rpm (e.g., 15K = 15,000 revolutions per minute)
- Transfer Rate - The maximum transfer rate from the disk, measured in gigabits (Gb)/sec, where one gigabit = 1,000 million bits
- Interface – The means by which the drive connects and exchanges data with the host computer. Hitachi arrays support either FC (Fibre Channel) or SATA (Serial Advanced Technology Attachment) interfaces.
The measurements were made with three different workloads:
- Sequential Read, with a 256K block size
- Random Write, with a 16K block size
- Idle (no data being transferred)
Sequential read (rather than sequential write) benchmarks were used for practical reasons: it is much quicker to reset the array after a read benchmark is run, which accelerates the preparation for the next run. The main power consumed by disk devices comes from actuator movement and media rotation, which are similar for reads and writes.
The three benchmarks were averaged to produce an overall figure of power consumption for the Hitachi arrays. In Wikibon’s opinion, this measurement represents a good and reasonable estimate of the power consumed, across a number of drives, by the “typical” combinations of applications found in a data center.
Hitachi has done a large number of measurements, which are included in the Hitachi Weight and Power Calculator. The objective of the measurements was to validate the power calculator. Using the power calculator will simplify the processes for Hitachi & PG&E customers when incentive applications are made. The results of the comparisons are given in the Measurement Results section below.
Measurement Results
Equipment Measured
The Hitachi storage array measured was the Hitachi USP V storage array with 1,024 drives.
Location of Testing
The testing was done in Hitachi’s facility at 750 Central Expressway, Santa Clara, CA 95050. The testing was overseen by Kint Lodewijk, Operations Manager.
Measurements
Figure 2 below shows the results of measurements of a 1,024-drive configuration (146 GB drives) with 16 spares under idle and random write workloads. The kVA is 25.31 at idle and 26.90 under random writes for 1,024 drives.
Figure 3 below shows the results of measurements of a 1,024-drive configuration (146 GB drives) with 16 spares under idle and sequential read workloads. The kVA is 25.31 at idle and 25.59 under sequential reads for 1,024 drives.
A comparison with the Hitachi Weight and Power Tool is given in Table 1 below. It shows that the difference is 1.1%, which supports the statement that any difference between the tool and an actual measurement is less than 5%.
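As a rough sketch of how such a comparison is computed, the measured kVA values below are taken from Figures 2 and 3, while the calculator figure is a hypothetical placeholder for the V8.6rb output reported in Table 1:

```python
# Measured kVA for the 1,024-drive USP V configuration (Figures 2 and 3).
measured_kva = {"idle": 25.31, "random_write": 26.90, "sequential_read": 25.59}

# Average across the three workloads, per "Measurement Methodology for Power".
measured_avg = sum(measured_kva.values()) / len(measured_kva)

# Hypothetical placeholder for the Hitachi Weight and Power Calculator (V8.6rb) estimate.
calculator_kva = 26.2

difference_pct = abs(measured_avg - calculator_kva) / calculator_kva * 100
print(f"measured average = {measured_avg:.2f} kVA, difference = {difference_pct:.1f}%")
```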
The version of the tool used to compare the results for the equipment used was V8.6rb. This tool is being constantly updated as new equipment and components become available, and the newest version will be used to evaluate PG&E incentive applications.
Wikibon concludes the Hitachi Power Calculator is very accurate, and can be used as a basis for estimating the power consumption for the equipment to be installed under the PG&E incentive programs, and for calculating the power saved from virtualization and thin provisioning from the reduction in the number of disk drives required.
Calculation of Energy Savings and Incentives
The Hitachi Weight and Power Calculator is an accurate tool which allows the power consumption of different configurations to be compared with each other. In the previous measurement section, Wikibon concluded that the accuracy was within 5%.
The availability of this tool makes the calculation of incentives much easier. The power consumption is first calculated for the storage array configuration as it will be installed. Figures 4 & 5 show the first two input screens.
The output from the model is shown in Table 2 below. The key figure is the power requirement (kVA).
The tool can then be re-run with the additional drives that would have to be deployed if virtualization and thin provisioning were not available. The default saving in the number of drives required would be 35%. The difference between the two power figures would be entered into the Wikibon Energy Lab Power Calculator, together with the software costs for virtualization and thin provisioning. The Wikibon Power Calculator would then take the power savings, add the HVAC power saving calculated according to the PG&E methodology, and calculate the recommended incentive payments. The output would be attached to a submission to PG&E for an incentive.
Wikibon created the Wikibon Energy Lab Power Calculator to calculate the benefits and incentives of virtualization and thin provisioning. A sample customer input was the installation of a Hitachi USP V controller with 300 drives. Table 3 shows the output of the Wikibon Energy Lab Power Calculator (Hitachi V&TP V1). The inputs were the power usage of the proposed system and of the alternative system without the Hitachi virtualization and thin provisioning software suite, with additional drives to compensate. Using the power consumption figures from Figures 1 and 2 as inputs, and using an overall reduction in disk drives of 35% as discussed in the previous section, the output of the Wikibon Power Calculator shows energy savings of over $28,962 over the life of the project.
Note: Assumes 3% inflation rate over 5 years
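A simplified sketch of the savings arithmetic behind Table 3 is shown below. Only the 35% drive reduction and the 3% inflation rate over 5 years come from the text above; the kVA figures, power factor, HVAC uplift, and electricity rate are illustrative assumptions, not values from the actual calculator:

```python
# Hypothetical inputs for illustration; real values come from the Hitachi
# Weight and Power Calculator runs with and without thin provisioning.
baseline_kva = 12.0      # configuration without virtualization/thin provisioning
installed_kva = 8.5      # proposed configuration (roughly 35% fewer drives)
power_factor = 0.9       # assumed kW per kVA
hvac_uplift = 0.5        # assumed extra cooling load per kW of IT load (PG&E methodology placeholder)
rate_per_kwh = 0.12      # assumed electricity rate, $/kWh
inflation = 0.03         # 3% per year, per the note above
years = 5

kw_saved = (baseline_kva - installed_kva) * power_factor * (1 + hvac_uplift)
kwh_saved_per_year = kw_saved * 24 * 365

total_savings = sum(
    kwh_saved_per_year * rate_per_kwh * (1 + inflation) ** year
    for year in range(years)
)
print(f"{kwh_saved_per_year:,.0f} kWh/year saved, ${total_savings:,.0f} over {years} years")
```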
Confirmation of Energy Savings with Virtualization and Thin Provisioning Technology
Hitachi provides a report as part of the thin provisioning software that shows the disk space savings delivered by virtualization and thin provisioning. A sample output is shown in Figure 6 below.
Table 4 takes the data from Figure 6 above and calculates the overall disk savings, which in this customer example is 34%. This is in line with the 35% assumption made in the executive summary.
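The overall savings figure in Table 4 can be derived directly from the logical and physical capacity columns of the thin provisioning report; the per-pool figures below are hypothetical stand-ins for the Figure 6 data:

```python
# Hypothetical per-pool figures (TB) standing in for the Figure 6 report output.
pools = [
    {"logical": 40.0, "physical": 26.8},
    {"logical": 25.0, "physical": 16.1},
]

logical_total = sum(p["logical"] for p in pools)
physical_total = sum(p["physical"] for p in pools)
savings = 1 - physical_total / logical_total
print(f"overall disk savings = {savings:.0%}")
```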
The report can be run at the request of PG&E to confirm the disk savings that have been achieved from the installation of virtualization and thin provisioning. It is expected to be used to verify the early installations, and then again only for PG&E audit purposes.
Virtualization & Thin Provisioning Measurements
Virtualization and thin provisioning are relatively new technologies that significantly reduce the amount of storage that is required, and improve the utilization of that storage. A detailed description of the technology is given in Appendix I. The purpose of this section of the Energy Lab report is to estimate the level of saving that can be achieved with virtualization and thin provisioning.
An extensive analysis of 3PAR’s thin provisioning was given in an earlier report to PG&E. 3PAR (and Hitachi) offer a “Call-Home” program that is used to help predict component failures, and it is subscribed to by a large percentage of the customer base. As part of the failure-prediction analysis, the system captures metadata about disk usage and function usage from the service processor in each array connected to the system, including metadata about each volume that has been created on the arrays. This metadata database allows the data to be analyzed on a per-customer basis, taking an average across all the volumes at each customer. This is a good way to look at the data, as the way volumes are organized is usually consistent within a customer.
One key metric is the improvement in utilization that virtualization and thin provisioning achieve. For each volume there is a logical size (what the server sees) and a physical size (what has actually been written). In traditional storage arrays, the physical and logical sizes are very similar. With virtualization and thin provisioning, the physical size is significantly smaller than the logical size.
Wikibon performed a statistical analysis of this database by installation and found that the physical disk actually required to hold the data was only 33% of the logical data (n = 115). From an energy point of view, 67% of the disks were not required. However, a closer analysis of the data showed that a few of the customers had very large gains. What is happening is that customers are finding new ways of exploiting virtualization and thin provisioning. For example, some customers were making thin copies of data many times an hour, so that they could recover very quickly to an earlier copy if data was corrupted. This is very efficient with thin copying, because only changed data has to be written, and the amount of capacity saved can be very high indeed.
However, from an energy rebate point of view, this is not a saving in energy, because the customer was not actually making physical copies before. To estimate the real power savings, Wikibon assumed that customers with savings in disk space of 50% or more had changed their procedures for storing data. Wikibon took the subset of customers (n=42) that had less than 50% savings. The weighted average saving was 35%. Wikibon believes that this figure is a good predictor of the direct reduction in disk capacity that will be found at customers. This is the default figure that has been supplied in the PG&E spreadsheet for both 3PAR and Hitachi.
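A sketch of the filtering and weighted-average step is shown below; the customer records are hypothetical stand-ins for the per-customer averages extracted from the Call-Home metadata, and weighting by logical capacity is one plausible choice, not necessarily the weighting Wikibon used:

```python
# Hypothetical (customer_id, logical_tb, physical_tb) records standing in for
# the per-customer averages extracted from the Call-Home metadata.
customers = [
    ("a", 100.0, 70.0),   # 30% savings
    ("b", 200.0, 120.0),  # 40% savings
    ("c", 150.0, 30.0),   # 80% savings -- likely changed procedures (thin copies)
]

def savings(logical, physical):
    return 1 - physical / logical

# Exclude customers whose savings are 50% or more, on the assumption that these
# reflect new thin-copy procedures rather than avoided physical drives.
subset = [(c, l, p) for c, l, p in customers if savings(l, p) < 0.5]

# Weighted average savings, weighting each customer by logical capacity.
weighted_avg = 1 - sum(p for _, _, p in subset) / sum(l for _, l, _ in subset)
print(f"weighted average savings = {weighted_avg:.0%}")
```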
Wikibon believes, based on an analysis of Hitachi and 3PAR customers and technologies, that while there may be differences between the specific results gained by individual users of 3PAR’s virtualization/thin provisioning versus Hitachi’s virtualization/Dynamic Provisioning, from an energy savings standpoint both vendors' technologies will deliver substantial average utilization improvements over arrays without such technologies.
Appendix I – Description of Virtualization and Thin Provisioning Storage Technologies
The fundamental building block of storage systems is the storage volume. A storage volume can be (and usually is) smaller than a disk drive, but can also be bigger and span multiple storage devices. The operating system “mounts” a storage volume to be able to read, or read and write, that volume. For example, a PC has one or more disk drives. A number of logical drives (each equivalent to a storage volume) can be made from these disk drives (C: drive, D: drive, etc.), and the space on the disk drives is formatted and allocated to these logical drives when they are created. External volumes can also be accessed from a PC over a LAN or WAN.
There are a number of problems with this approach.
- Space is allocated when the storage volume is created; a large amount of space is initially allocated but not used
- Over time, the data becomes fragmented; for example, when data is deleted, the space is not released
- It is difficult to move volumes around to optimize storage, as the application has to be stopped while the volume is moved. This limits the ability to reorganize the data and take advantage of spare capacity on the disk drives
- When a copy of a volume is required, the whole storage volume has to be copied
Virtualization breaks the connection between what is written, what is allocated, and what is unused. This concept was first applied by IBM in the 1960s to storage on the CP/67 server operating system, which later became VM with its virtual disks.
Hitachi (and other vendors) has integrated virtualization into the storage operating system. Virtualization allows a number of techniques to be used to reduce the amount of data that is actually stored, the most important of which is thin provisioning.
Thin Provisioning
Hitachi markets thin provisioning as “Dynamic Provisioning”. The figure below gives a diagram of how thin provisioning works. As discussed above, the server believes that it has the storage allocated to it in a storage volume, just as it would in the traditional (fat) provisioning environment. The allocation is a virtual allocation, and real physical space is dedicated only when data is actually written to that volume. The data written is held in a common storage pool and dedicated to a volume when it is written. Because the pool serves a large number of volumes, the probability that all the storage will be requested at the same time is vanishingly small.
(Source: http://www.hds.com/assets/img/storage-software/hdp-diagram.gif, downloaded December 10, 2008)
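The allocate-on-write behavior described above can be modeled as a simple mapping from a volume's virtual pages to pages in the shared pool. This is an illustrative sketch only, not Hitachi's Dynamic Provisioning implementation:

```python
class ThinVolume:
    """Illustrative thin-provisioned volume: physical pool pages are
    assigned only when a virtual page is first written."""

    def __init__(self, virtual_pages, pool):
        self.virtual_pages = virtual_pages  # size the server sees
        self.pool = pool                    # shared list of free pool page ids
        self.page_map = {}                  # virtual page -> pool page

    def write(self, virtual_page, data):
        if virtual_page >= self.virtual_pages:
            raise IndexError("write beyond provisioned size")
        if virtual_page not in self.page_map:
            if not self.pool:
                raise RuntimeError("storage pool exhausted")
            self.page_map[virtual_page] = self.pool.pop()  # allocate on first write
        # ... store data at self.page_map[virtual_page] ...

    def physical_pages_used(self):
        return len(self.page_map)

# A 1,000-page pool shared by two volumes that each *appear* to be 10,000 pages.
pool = list(range(1000))
vol_a, vol_b = ThinVolume(10_000, pool), ThinVolume(10_000, pool)
vol_a.write(0, b"data")
print(vol_a.physical_pages_used(), len(pool))  # 1 page used, 999 left in the shared pool
```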
Thin provisioning does not work with all operating systems and file systems; care has to be taken in setting the correct parameters to avoid allocating a large amount of space to a server and then finding that the file system or database has written to all of the available capacity. As storage virtualization and thin provisioning have matured, software developers have also ensured that their software can take advantage of thin provisioning.
Disclaimer
This report was prepared by Wikibon. Reproduction or distribution of the whole, or any part, of the contents of this document without written permission of Wikibon is prohibited. Neither Wikibon nor any of its employees makes any warranty or representation, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any data, information, method, product, or process disclosed in this document, or represents that its use will not infringe any privately-owned rights, including, but not limited to, patents, trademarks, or copyrights. This report uses preliminary information from vendor data and technical references. The report, by itself, is not intended as a basis for the engineering required to adopt any of the recommendations. Its intent is to inform the customer of the potential cost savings. The purpose of the recommendations and calculations is to determine whether measures warrant further investment of time and/or resources.