Prior to the adoption of heterogeneous virtualization engines such as IBM's SAN Volume Controller (SVC), Hitachi's USP V and LSI's SVM, a main impediment to storage virtualization for block-based storage was the lack of multiple storage vendor (heterogeneous) support within available virtualization technologies. This inhibited deployment across a data center.
Despite the availability of these technologies, some customers still prefer not to add an additional layer of complexity into the shared storage infrastructure. If so, the only practical approach is either to implement a single vendor solution across the whole of the data center (practical only for small and some medium size data centers) or to implement virtualization in one or more of the largest storage pools within a data center.
This article is a how-to guide on designing and implementing virtualization in storage pools.
Storage virtualization design and deployment capability
One of the most popular storage virtualization techniques is the pooling of physical storage from multiple network storage devices into what appears to be a single logical storage device that can be managed from a central point of control (console). Storage virtualization techniques are commonly used in a storage area network (SAN), but are also applicable to large-scale NAS environments where there are multiple NAS filers.
The management of storage devices takes significant storage administrator time and is error-prone. By hiding the complexity of the storage network, storage virtualization helps the storage administrator to perform the tasks of copy services, backup, archiving, and recovery with less effort, reduced elapsed time, and with fewer errors.
Users of SAN based storage networks can implement the virtualization of volumes with software applications, by using hardware and software hybrid appliances, and by using storage controllers with virtualization engines built in. The technology can be placed on different levels of a storage area network. For smaller configurations, the use of virtualization appliances in the fabric of the storage network infrastructure is feasible. Candidates are IBM’s SAN Volume Controller (SVC), which has been shipping for some time, and EMC’s Invista, which is still in limited use. Hitachi (USP-VM, either with storage or diskless) and HP have array-based solutions. For larger and/or performance-critical environments, most installations use virtualization engines built into the storage controller, where they have the least impact on I/O performance and can recover from failures with higher certainty and in less elapsed time. Some full virtualization arrays from Xiotech, 3PAR, IBM's XIV, and Compellent are built using virtualization engines which break the data into small pages (1-256MB) which are distributed across the disk drives. Only Hitachi’s USP V line (also resold by HP and Sun) offers a heterogeneous virtualization solution which allows virtualization within the array and virtualization of external storage by the array. This may be particularly useful for migrating storage from one generation to the next or to different tiers.
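The page-based approach described above can be sketched in a few lines. This is a hypothetical illustration, not any vendor's actual implementation: a virtual volume is divided into fixed-size pages, and each page is mapped to a (drive, offset) pair so that consecutive pages are striped across all drives in the pool.

```python
# Minimal sketch of page-based volume virtualization (illustrative only).
PAGE_SIZE = 16 * 2**20  # 16 MB pages; real engines use sizes from 1 MB to 256 MB


class VirtualVolume:
    def __init__(self, size_bytes, drives):
        self.drives = drives                      # list of drive identifiers
        self.next_free = {d: 0 for d in drives}   # next free page slot per drive
        self.page_map = {}                        # virtual page -> (drive, slot)
        for page in range(size_bytes // PAGE_SIZE):
            drive = drives[page % len(drives)]    # round-robin distribution
            self.page_map[page] = (drive, self.next_free[drive])
            self.next_free[drive] += 1

    def resolve(self, virtual_offset):
        """Translate a virtual byte offset to (drive, physical byte offset)."""
        page, within = divmod(virtual_offset, PAGE_SIZE)
        drive, slot = self.page_map[page]
        return drive, slot * PAGE_SIZE + within


vol = VirtualVolume(size_bytes=8 * PAGE_SIZE, drives=["d0", "d1", "d2"])
print(vol.resolve(0))                  # ('d0', 0)
print(vol.resolve(PAGE_SIZE + 42))     # ('d1', 42)
```

Because the mapping table, not the physical layout, defines the volume, pages can be moved between drives or arrays without the host noticing, which is what makes non-disruptive migration and tiering possible.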
Users of NAS based storage networks can implement virtualization of files in a number of different ways. Some vendors such as NetApp (with Data ONTAP GX), BlueArc, and HDS offer a homogeneous solution of consolidating the file system directories across multiple filers. There are recent additions from HP Extreme storage using Polyserve, and IBM's SoFS offering. There are two types of heterogeneous NAS virtualization solutions, which consolidate at either the file level or the directory level. Directory level consolidation is a partial solution, but simpler to implement. File level consolidation is more complete with greater potential benefits, but is complex and in our opinion is not yet ready for mission-critical production systems. Today the most practical approach is to implement a homogeneous NAS virtualization solution. The choice is between NetApp and a number of other smaller vendors at the low to mid-range of performance, and BlueArc, HP, IBM and HDS at the high end of performance and capacity.
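The distinction between directory-level and file-level consolidation can be made concrete with a sketch. In this hypothetical illustration (not a specific product's design), a global namespace maps each top-level directory to the filer export that actually hosts it, so clients see a single tree; file-level consolidation would instead have to track every individual file's location, which is why it is far more complex.

```python
# Hypothetical sketch of directory-level NAS consolidation across filers.
class GlobalNamespace:
    def __init__(self):
        self.mounts = {}   # top-level directory -> filer export hosting it

    def add_directory(self, directory, filer_export):
        self.mounts[directory] = filer_export

    def resolve(self, path):
        """Map a global path like '/finance/q3.xls' to (filer export, relative path)."""
        top, _, rest = path.lstrip("/").partition("/")
        export = self.mounts["/" + top]
        return export, rest


ns = GlobalNamespace()
ns.add_directory("/finance", "filerA:/vol/fin")
ns.add_directory("/engineering", "filerB:/vol/eng")
print(ns.resolve("/finance/q3.xls"))   # ('filerA:/vol/fin', 'q3.xls')
```

Note that a whole directory still lives on one filer, so this approach redirects rather than rebalances; it is the smaller mapping table (directories, not files) that keeps the implementation simple.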
Virtualization is now a rapidly maturing technology, and is a very useful building block on which additional storage services can be provided to applications in an on demand way. Technologies such as tiered storage can be significantly enhanced by being based on a virtualization environment. New extensions to virtualization such as the coupling of filesystem growth/shrinkage with volume (LUN) sizing, and overallocation techniques such as thin provisioning depend on virtualization being available.
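Thin provisioning, mentioned above, is a good example of a service that depends on the virtualization layer. The sketch below is an assumption-laden illustration, not a specific product's design: the volume advertises its full virtual size, but physical pages are drawn from the shared pool only when a page is first written.

```python
# Minimal sketch of thin provisioning on top of a virtualized pool.
PAGE_SIZE = 16 * 2**20  # 16 MB allocation unit (illustrative)


class ThinVolume:
    def __init__(self, virtual_size, pool_pages):
        self.virtual_pages = virtual_size // PAGE_SIZE
        self.free_pool = list(range(pool_pages))  # unallocated physical pages
        self.page_map = {}                        # virtual page -> physical page

    def write(self, virtual_offset):
        page = virtual_offset // PAGE_SIZE
        if page not in self.page_map:             # allocate on first write only
            if not self.free_pool:
                raise RuntimeError("pool exhausted: overallocation limit hit")
            self.page_map[page] = self.free_pool.pop(0)

    def allocated_bytes(self):
        return len(self.page_map) * PAGE_SIZE


# A 100-page virtual volume backed by only 10 physical pages (10x overallocation).
vol = ThinVolume(virtual_size=100 * PAGE_SIZE, pool_pages=10)
vol.write(0)
vol.write(5 * PAGE_SIZE)
print(vol.allocated_bytes() // PAGE_SIZE)   # 2 pages actually consumed
```

The RuntimeError branch is the operational risk of overallocation: a thin pool must be monitored so that physical capacity is added before writes to advertised-but-unbacked space start failing.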
Specific operational goals of Storage virtualization design and deployment
A successful storage virtualization design and deployment initiative will:
- Improve the management efficiency and utilization of this part of the storage infrastructure
- Reduce the effort required to manage storage
- Improve the overall availability of the storage network
- Improve the flexibility (time to provide or change storage to meet application requirements) of the storage infrastructure
- Be complementary with other storage network initiatives that may be in progress
- Be implemented without significant impact or risk to production systems
- Provide a system that can scale to meet future storage needs and allows for secure migration of existing data
- Enable earlier implementation of projects such as tiered storage and thin provisioning
Using the Standard wikibon business model, overall benefits for the IT department should total in the range of $0.5 million to $1 million over a three year period. Potential benefits to the business from improved productivity should total in the range of $0.3 million to $0.5 million over a three year period.
The major expected effects on the IT budget will normally include:
- Cost of storage virtualization products (software and additional hardware)
- Cost of implementation of new processes and procedures to implement virtualization
Using the Standard wikibon business model, this should be in the range of an additional $200-300K for a 40 terabyte implementation.
Other potential impacts on organization include:
- Improved productivity from users of large-scale NAS file-sharing systems, as much of the cost of managing the files falls on the end-user
- Potential delay of other initiatives (negative impact)
Risks of implementing Storage virtualization design and deployment initiative
The major risks to the success of a storage virtualization design and deployment initiative are:
- The overhead from the complexity of the virtualization implementation exceeds the benefits that can be achieved
- Errors in implementation that result in reduced service to users and/or loss of data
- Not controlling future storage virtualization and management software costs. A common practice for providers of virtualization software is to base software charges on the terabytes managed. As the cost of terabytes continues to drop by over 30%/year, users can purchase over 50% more storage each year with the same budget. Great care has to be taken in negotiating contracts to ensure that the virtualization software costs do not escalate out of control. Basing virtualization software licenses on a flat fee or on the number of physical storage arrays is a more practical method of metering storage virtualization software costs.
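The licensing risk in the last bullet is worth putting into numbers. In this back-of-envelope sketch, the per-TB price and growth rate are illustrative assumptions: if $/TB of disk drops roughly 30% per year, a flat hardware budget buys about 1/0.7 ≈ 1.43x the capacity each year, so per-terabyte software charges compound even though the hardware spend stays flat.

```python
# Back-of-envelope model of per-terabyte license escalation (illustrative rates).
capacity_tb = 40.0          # starting managed capacity (TB)
per_tb_license = 5.0        # $K per TB per year (hypothetical list price)
capacity_growth = 1.43      # capacity bought per flat budget at ~30%/yr price drop

for year in range(1, 4):
    print(f"year {year}: {capacity_tb:.0f} TB -> "
          f"per-TB license ${capacity_tb * per_tb_license:.0f}K/yr")
    capacity_tb *= capacity_growth
```

Under these assumptions the annual per-TB license bill roughly doubles within two to three years on an unchanged budget, which is the arithmetic behind preferring flat-fee or per-array licensing.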
The Storage virtualization design and deployment initiative
The Storage virtualization design and deployment initiative will be complete when the designed systems and arrays are installed, attached, and tested, and then approved by the project’s key stakeholders.
The following factors, although necessary for the storage virtualization design and implementation initiative to be successful, are not within the scope of this initiative:
- A project sponsor has been identified with the time and authority to ensure success
- Storage has been significantly consolidated, and good common storage management processes and procedures are in place for the storage pools identified
- A storage virtualization initiative budget has been defined.
- There is management comfort with the trade-off between efficiency and loss of vendor choice for the storage pools identified
This is comprehensively covered in the “Analyzing storage virtualization requirements” article.
Acceptance Test Considerations:
The design phase will be completed when the storage virtualization design has been accepted by the sponsor and agreed to by the key stakeholders, and agreement has been reached to proceed to the deploy phase or kill the initiative.
Key design milestones:
This phase should take about 8-16 weeks, and should cost between $30K and $50K.
- Virtualization architecture decided
- For volume virtualization environments, this will be a choice between array-based virtualization and storage-network-based virtualization appliances.
- Unless there is a compelling functional reason for choosing an appliance, array-based solutions are more predictable from a performance perspective and will in general have better availability and expansion characteristics
- In NAS environments, homogeneous solutions for directory integration will significantly lower the project risk. There would need to be compelling functionality and additional testing before large-scale heterogeneous NAS virtualization solutions are implemented
- Primary vendor decided
- Decide on vendor hardware and software technologies available and issue RFP/solicit bids
- BlueArc, EMC, Hitachi, HP, IBM, Xiotech, and NetApp would be the primary vendors to consider in addition to other less well-established virtualization companies such as 3PAR & Compellent.
- Storage virtualization procedures designed
- Design virtualization procedures around the hardware and software chosen, and integrate the design with the current procedures
- Pay particular attention to auditing procedures for ensuring that data cannot be deleted or tampered with, and to disaster recovery procedures
- Determine training requirements for operations
- Design test procedures and scripts
Acceptance Test Considerations:
The Storage virtualization design and deployment initiative will be considered deployed when the designed systems and devices are installed, tested, documented, and approved by the project’s key sponsor and stakeholders, and the systems have run without project staff involvement for six months.
This phase should take about 3-5 months and should cost an additional $150-250K over 3 years.
- Storage virtualization system built
- Installation of hardware and software functionality
- Installation of any changes required to current storage system in the pool
- Installation of any changes required in the storage network (switches and ports)
- Update and creation of new process and procedures, with full documentation
- Storage virtualization tested
- Testing of equipment, software, and procedures on historical data
- Performance testing
- Testing of recovery on historical data
- Testing of procedures for migration, backup, recovery, and disaster recovery
- Migration & Cut-over to storage virtualization completed
- User training and documentation completed
- Help desk operatives trained and documentation updated
- Storage virtualization integrated into the storage compliance processes
- Storage audit processes identified in the new environment
- Full compliance agreed with storage compliance group
- Storage virtualization initiative wrapped up
- Procedures set up for monitoring performance, reliability, and recovery characteristics
- Procedures set up for adding additional storage, storage functionality, and storage management applications
- Final review of documentation
- All project staff released and full hand-over to IT operations
Storage virtualization is now a rapidly maturing technology, but is far from being able to manage heterogeneous arrays of storage in a transparent way. However, the volume based offerings from IBM, HP and Hitachi can make solid contributions to improving utilization and reducing storage management headcount. NAS virtualization is less developed, but homogeneous solutions from NetApp, BlueArc, Hitachi and others are solid. Storage virtualization is an enabling technology for implementing other significant storage improvements, including tiered storage and thin provisioning.