The hardest challenge in server virtualization is storage. VMware and the storage vendors have addressed this challenge with an array of features and integration points that add considerable value to end-users. In this article we summarize what is presently available beginning with data protection, which users cite as their number one priority.
Note that many of these integration points became available with vSphere 4 and those that are new with vSphere 5 are italicized. Also italicized are further integration points added since our previous study. A previous article discussing vSphere 4 integration points can be found here.
VADP – vStorage APIs for Data Protection
VADP is a set of APIs focused on local backup/recovery use cases. It replaces VCB (VMware Consolidated Backup) and provides significant performance improvements. With VADP, virtual machine data is not copied to a proxy server during backups before moving to the storage media. Rather, vSphere creates a snapshot of the VM’s data directly in the storage array itself. This snapshot is then moved to the backup media directly without requiring any dedicated disk cache on a proxy server. This capability significantly improves backup speed, frees up substantial resources on the proxy server, and allows many more virtual machines to be backed up simultaneously.
Also, incremental backups are enhanced with Change Block Tracking (CBT), which quickly identifies only the data blocks on the virtual machine that have changed since the last backup. This replaces the need to scan the entire virtual machine with complex checksum calculations to determine what data has changed, which results in faster backup time.
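The difference between tracking changes at write time and scanning for them afterwards can be illustrated with a small sketch. This is a toy model, not VMware's CBT implementation: the `TrackedDisk` class, the 4-byte block size and the `checksum_scan` helper are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # bytes per block; tiny so the example is easy to follow


class TrackedDisk:
    """Toy virtual disk that records which blocks change between backups."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.changed = set()  # block indexes written since the last backup

    def write(self, block, payload):
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        start = block * BLOCK_SIZE
        self.data[start:start + BLOCK_SIZE] = payload
        self.changed.add(block)  # tracking happens at write time

    def read(self, block):
        start = block * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def incremental_backup(self):
        """Return only the changed blocks, then reset the tracking set."""
        delta = {b: self.read(b) for b in sorted(self.changed)}
        self.changed.clear()
        return delta


def checksum_scan(disk, previous):
    """The slower alternative: hash every block and compare with the last run."""
    return [
        b for b in range(disk.num_blocks)
        if hashlib.sha256(disk.read(b)).hexdigest() != previous.get(b)
    ]


disk = TrackedDisk(num_blocks=8)
baseline = {b: hashlib.sha256(disk.read(b)).hexdigest() for b in range(8)}
disk.incremental_backup()                # pretend a full backup just completed
disk.write(2, b"abcd")
disk.write(5, b"wxyz")
scanned = checksum_scan(disk, baseline)  # has to read and hash all 8 blocks
delta = disk.incremental_backup()        # reads only the 2 tracked blocks
```

Both approaches find the same two blocks, but the scan has to read and hash the whole disk while the tracked approach only consults a set that was maintained as the writes happened.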
Moreover, VMs can be restored directly to the appropriate ESX Server and data store without staging on the proxy server, resulting in much faster restores. Many more vendors have integrated Snapshot management with their data protection packages.
VAAI – vStorage APIs for Array Integration
These APIs take advantage of storage array features to offload, speed up or improve common vSphere operations. They were developed in conjunction with the ANSI/INCITS T10 (SCSI) standards body. Initially, these APIs covered only block storage arrays, but VMware has since added several new ones for NFS NAS arrays.
Hardware Assisted Locking
During VMFS file system metadata updates, only storage blocks related to a particular VM are locked, as opposed to locking an entire LUN. This is very important for increasing VM-to-data-store density and performance. One example is VDI boot storms. If only the blocks relevant to the VM being powered on are locked, then more VMs can start per data store. The same applies in a dynamic VDI environment to reduce the impact of busy cloning periods where images are being cloned and then spun up.
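A rough way to see why finer-grained locking raises density is to count how many hosts can make progress in one round of metadata updates. The sketch below is a single-threaded toy under stated assumptions: the `Datastore` class, the `compare_and_set` method and the host names are hypothetical, and the model ignores everything about real VMFS on-disk structures.

```python
class Datastore:
    """Toy datastore whose metadata blocks each carry an owner field."""

    def __init__(self, num_blocks):
        self.owner = [None] * num_blocks  # None means the block is unlocked

    def compare_and_set(self, block, expected, new):
        """Swap the owner only if it still matches what the caller expects."""
        if self.owner[block] == expected:
            self.owner[block] = new
            return True
        return False


def round_with_lun_lock(requests):
    """Whole-LUN locking: one host proceeds per round, the rest retry."""
    hosts = [host for host, _block in requests]
    return hosts[:1], hosts[1:]


def round_with_block_locks(store, requests):
    """Per-block locking: every host whose block is free proceeds."""
    started, retry = [], []
    for host, block in requests:
        if store.compare_and_set(block, None, host):
            started.append(host)
        else:
            retry.append(host)
    return started, retry


# Four hosts power on VMs at once; esx3 and esx4 contend for the same block.
requests = [("esx1", 0), ("esx2", 1), ("esx3", 2), ("esx4", 2)]
lun_started, lun_retry = round_with_lun_lock(requests)
blk_started, blk_retry = round_with_block_locks(Datastore(4), requests)
```

In this toy round, whole-LUN locking lets one host through and makes three retry, while per-block locking makes only the host that genuinely collides retry.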
Full Copy
Similar to xcopy, Full Copy allows the block storage array to make a full copy of a data store without hauling the data from disk, through the ESX host, and back to disk. This dramatically reduces IO traffic and CPU loads. VM cloning takes a fraction of the time, and Storage vMotion can be done in the storage array without much ESX host involvement.
Block Zeroing
This array function offloads the writing of repeated content, such as the zeroing of data stores, and results in significant reductions in IO and CPU loads.
Thin Provisioning Suspend/Stun
In previous versions of ESX, when a VM ran out of thinly provisioned disk space, write IO requests from the VMs would stack up and result in Blue Screens of Death or Kernel Panics. With Thin Provisioning Stun, the array notifies ESX of an out-of-space condition, and the ESX host will “stun” or suspend the affected VMs. Then an administrator can add additional capacity and resume the affected VMs. This sequence is essentially the same as running out of space on a real disk, but now it works for thinly provisioned space as well.
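The stun-and-resume sequence can be sketched as a small state machine. This is an illustrative model only: the `ThinPool` and `VirtualMachine` classes, their methods and the block counts are assumptions, not the ESX or array interfaces.

```python
class ThinPool:
    """Toy thin pool: logical space is promised, physical blocks are finite."""

    def __init__(self, physical_blocks):
        self.free = physical_blocks

    def allocate(self, blocks):
        """Return False (an out-of-space signal) instead of failing the write."""
        if blocks > self.free:
            return False
        self.free -= blocks
        return True

    def grow(self, blocks):
        self.free += blocks


class VirtualMachine:
    def __init__(self, name):
        self.name = name
        self.state = "running"
        self.pending = 0  # blocks held back while stunned

    def write(self, pool, blocks):
        if self.state != "running":
            raise RuntimeError(f"{self.name} is {self.state}")
        if pool.allocate(blocks):
            return True
        # Out of space: pause the VM and keep the write for later
        self.state, self.pending = "stunned", blocks
        return False

    def resume(self, pool):
        """Retry the held write; the VM runs again only if it now fits."""
        if self.state == "stunned" and pool.allocate(self.pending):
            self.state, self.pending = "running", 0
        return self.state


pool = ThinPool(physical_blocks=10)
vm = VirtualMachine("vm01")
first = vm.write(pool, 8)           # fits
second = vm.write(pool, 5)          # does not fit, so the VM is stunned
state_when_full = vm.state
still_full = vm.resume(pool)        # no capacity added yet
pool.grow(10)                       # administrator adds capacity
state_after_grow = vm.resume(pool)  # held write now succeeds
```

The point of the design is that the guest never sees a failed write: the VM is paused with the write held back, and it continues from the same point once capacity is added.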
VAAI - Thin Provisioning Dead Space Reclaim (SCSI UNMAP)
Thin provisioning in a storage array can save a lot of real disk space, but until this API became available with vSphere 5 there was no way to reclaim space that was used by a deleted file. This function tells the storage array which blocks can be reclaimed.
However, the use of this function caused some problems with Storage vMotion, and VMware issued a patch to disable it. The market is still awaiting a fix from VMware that resolves these problems and re-enables this useful function.
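Why a thin LUN cannot reclaim space on its own, and what an unmap-style hint changes, can be shown with a toy model. The `ThinLun` and `FileSystem` classes, the file names and the block ranges below are hypothetical; the sketch says nothing about how any real array or VMFS handles the SCSI UNMAP command.

```python
class ThinLun:
    """Toy thin LUN that tracks which logical blocks have physical backing."""

    def __init__(self):
        self.allocated = set()

    def write(self, lba):
        self.allocated.add(lba)

    def unmap(self, lbas):
        """Release backing for blocks the host says it no longer needs."""
        self.allocated.difference_update(lbas)


class FileSystem:
    """Toy host file system that maps file names to block addresses."""

    def __init__(self, lun, send_unmap):
        self.lun = lun
        self.send_unmap = send_unmap
        self.files = {}

    def create(self, name, lbas):
        self.files[name] = list(lbas)
        for lba in self.files[name]:
            self.lun.write(lba)

    def delete(self, name):
        lbas = self.files.pop(name)
        if self.send_unmap:
            self.lun.unmap(lbas)  # tell the array which blocks are dead


def allocated_after_delete(send_unmap):
    lun = ThinLun()
    fs = FileSystem(lun, send_unmap)
    fs.create("vm.vmdk", range(0, 100))
    fs.create("iso.img", range(100, 150))
    fs.delete("vm.vmdk")
    return len(lun.allocated)


without_unmap = allocated_after_delete(send_unmap=False)
with_unmap = allocated_after_delete(send_unmap=True)
```

Without the hint, the array still backs all 150 blocks after the delete because it only ever sees writes. With the hint, it drops to the 50 blocks that are still in use.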
VAAI - Hardware Acceleration for NAS - Full File Clone
This provides the same function as VAAI - Full Copy, but for NFS. The NAS array creates a full copy of the file.
VAAI – Hardware Acceleration for NAS - NFS Fast File Clone
The NFS NAS array creates a snapshot copy of a file.
VAAI – NFS Clone/Copy Status or Abort
This provides a way for vSphere to interrogate an NFS array about the status of a file copy or to abort it altogether.
VAAI - Hardware Acceleration for NAS - Reserve Space
This reserves the full amount of space for a specified file.
VAAI - Hardware Acceleration - Certification
Although the hardware acceleration APIs were supposed to offload many functions to the storage array to make things faster, in some cases the solution was slower. So now VMware requires certification that hardware offload did indeed make things faster.
vStorage APIs for Multi-Pathing – aka PSA
vSphere’s PSA or Pluggable Storage Architecture is an open modular framework that enables third-party storage multi-pathing solutions for workload balancing and high availability. vSphere by default offers its own native multi-pathing, but if array-specific functionality is required a third-party plug-in using this API can be configured. This allows storage partners to create multi-pathing extensions to deliver storage path failover and storage IO activity optimized for partners' storage arrays. Storage partners certify their modules for support with VMware vSphere through the VMware Ready certification program. Most customers find native multi-pathing sufficient, and only a few vendors have implemented third-party plug-ins.
Several vendors have now also provided a vCenter plug-in to dynamically change the multi-pathing policies or automated equivalents. One vendor has also provided a way to dynamically adjust IO queue depth (Adaptive Queue Depth).
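As a rough illustration of what a path selection policy does, the sketch below rotates IO across paths and skips any path marked as failed. It is a minimal sketch under stated assumptions: the `RoundRobinSelector` class and the path names are hypothetical, and it does not use or represent the actual PSA plug-in interfaces.

```python
class RoundRobinSelector:
    """Toy path selection policy: rotate across paths, skipping dead ones."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.alive = {path: True for path in self.paths}
        self._next = 0

    def mark_failed(self, path):
        self.alive[path] = False

    def mark_restored(self, path):
        self.alive[path] = True

    def select(self):
        """Return the next live path, or raise if every path is down."""
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if self.alive[path]:
                return path
        raise RuntimeError("all paths down")


selector = RoundRobinSelector(["hba0:portA", "hba0:portB", "hba1:portA"])
before_failure = [selector.select() for _ in range(3)]
selector.mark_failed("hba0:portB")
after_failure = [selector.select() for _ in range(4)]
```

An array-specific plug-in would typically replace the simple rotation with knowledge of the array, for example preferring paths to the controller that owns a LUN.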
SIOC – Storage I/O Control
SIOC is VMware’s approach to providing QoS for block storage. There are no APIs, but one vendor, 3PAR, figured out how to read what SIOC was attempting to do and tries to help it along from the storage array perspective. This makes sense, and more vendors may attempt this. SIOC for NFS is now available with vSphere 5, but no storage vendor has yet exploited this feature.
VASRM – vStorage APIs for Site Recovery Manager (SRM)
Not new with vSphere, this set of APIs is focused on storage array vendors’ remote replication features such that they can be orchestrated by vSphere’s Site Recovery Manager (SRM). Vendors provide Storage Replication Adapter (SRA) software that SRM then uses in a standardized and automated fashion to fail over to a recovery site.
SRM in vSphere 4 did not provide automated failback, but several vendors essentially automated the process through scripts and lots of testing. Automated failback with SRM became available with vSphere 5, and most vendors now support it. One key differentiator among vendors’ offerings is whether they transmit only changed blocks/files upon failback.
Another handy function integrated by some vendors is the ability for the VM administrator to provision replication (for SRM) from vCenter rather than waiting for the storage administrator to do it.
Many vendors have taken advantage of the vCenter plug-in capability to provide storage-related functions from the vCenter console, thereby offering a single pane of glass for administering both storage and VMware servers. These include:
- Discovery and mapping of storage arrays,
- End-to-end discovery of VMs, ESX servers and their storage with mapping both ways,
- Storage Provisioning and management for VMFS, RDM and NFS,
- Monitoring storage,
- Fast clones of VMs with or without VMware View integration,
- Backup reporting,
- Automated virtual infrastructure reporting,
- Mass replication of VMs at a data store level,
- Integration with security software,
- Managing primary storage de-duplication.
Clearly, being aware of VMs and their data stores is handy in a VMware vSphere environment, and the storage vendors have responded by ensuring they can do things like:
- Provide VMware consistent snapshots,
- Granularly restore an individual VM or its data store,
- Perform per-VM data compression,
- Offer linked clones.
VMware Metro Storage Cluster Certification (vMSC): vMotion Over Metro Distances with Active/Active Data Stores
Prior to vSphere 5, to address user requirements for a very short recovery time objective (RTO), several vendors provided specialized hardware and software that essentially keeps two copies of the data stores in two different locations as up-to-date as possible. Failover is near instantaneous.
With vSphere 5, VMware has added certification processes for metro distances. There are separate certifications for iSCSI, Fibre Channel and NFS.
vMotion Over Geo Distances with Active/Active Data Stores
For users wanting short RTOs at greater distances, several vendors provide storage clusters (Geoclusters) leveraging asynchronous replication and fast caching. There are no certifications yet.
vStorage APIs for Storage Awareness - VASA
VASA is a key pillar in VMware’s storage journey towards policy- and profile-based storage management. The storage array returns a parsable string that describes a LUN’s or array’s functionality such as RAID, replicated RAID, high-performance, lower performance, etc.
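To show how a parsable capability string could feed profile-based placement, here is a minimal sketch. The `key=value;key=value` format, the capability names and the LUN names are all invented for illustration; the source does not specify the string format, and this is not the VASA provider interface.

```python
def parse_capabilities(text):
    """Parse 'key=value; key=value' into a dict (the format is hypothetical)."""
    caps = {}
    for field in text.split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, value = field.partition("=")
        caps[key.strip().lower()] = value.strip().lower()
    return caps


def matches(profile, caps):
    """True when every requirement in the profile is met by the capabilities."""
    return all(caps.get(key) == value for key, value in profile.items())


def compliant_luns(profile, advertised):
    """Return the names of LUNs whose advertised string satisfies the profile."""
    return sorted(
        name for name, text in advertised.items()
        if matches(profile, parse_capabilities(text))
    )


# Hypothetical capability strings, one per LUN
advertised = {
    "lun-gold": "RAID=10; Replicated=yes; Tier=high-performance",
    "lun-silver": "RAID=5; Replicated=yes; Tier=lower-performance",
    "lun-bronze": "RAID=5; Replicated=no; Tier=lower-performance",
}
profile = {"replicated": "yes", "tier": "high-performance"}
placement = compliant_luns(profile, advertised)
replicated_only = compliant_luns({"replicated": "yes"}, advertised)
```

A VM administrator would then pick a profile rather than a LUN, and the set of compliant data stores follows from what the arrays report.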
Additional Storage Integrations/Features
The following are features provided by some vendors that are quite useful in vSphere environments:
- Simultaneous support of multiple storage network protocols (FC, iSCSI, FCoE, CIFS, NFS, SAS, InfiniBand).
- RBAC (role-based access control) multi-tenant storage management capabilities that hide or expose certain storage infrastructure, thereby providing a kind of secure multi-tenancy.
- Write Zeros detect and avoid – The storage array detects that a block or set of blocks is being zeroed (SCSI WRITE SAME) and merely notes that the block(s) are zeroed without actually writing to the blocks. When the block is accessed, zeros are returned. Another approach is to zero every block in the array in the background.
- Cure misaligned VM partitions non-disruptively. Note that with some vendors’ arrays, partitions cannot be misaligned.
- Provision, grow and shrink NFS volumes non-disruptively.
- Auto-registration of vSphere hosts to the storage array for NPIV – For those arrays that require hosts be registered, this function can save a lot of clicks. Some vendors don’t require this registration or provide presentation without (auto) registration.
- PowerShell Command List vSphere Integration – a library of handy scripts.
- Secure Multi-tenancy - isolation/dedication of array resources.
- Per-VM Data Compression.
- Storage Virtual Appliance (SVA) certification.
- Role-based access control (RBAC) based storage management from vSphere.
- Network RAID -- HA/FT cluster.
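The "Write Zeros detect and avoid" item in the list above can be sketched as follows. This is a toy model under stated assumptions: the `ZeroAwareArray` class, the 512-byte block size and the write counter are hypothetical and stand in for whatever metadata a real array keeps.

```python
BLOCK_SIZE = 512


class ZeroAwareArray:
    """Toy array that records zeroed blocks in metadata instead of writing them."""

    def __init__(self):
        self.stored = {}          # lba -> data, for non-zero blocks only
        self.zeroed = set()       # lbas known to be all zeros
        self.physical_writes = 0  # how many blocks actually hit the media

    def write(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        if not any(data):
            # All-zero block: note it in metadata and skip the media write
            self.stored.pop(lba, None)
            self.zeroed.add(lba)
            return
        self.zeroed.discard(lba)
        self.stored[lba] = bytes(data)
        self.physical_writes += 1

    def read(self, lba):
        """Zeroed or never-written blocks read back as zeros."""
        return self.stored.get(lba, bytes(BLOCK_SIZE))


array = ZeroAwareArray()
for lba in range(1000):
    array.write(lba, bytes(BLOCK_SIZE))  # host zeroes a new virtual disk
array.write(7, b"\x01" * BLOCK_SIZE)     # one real data write
```

Zeroing a thousand blocks costs no media writes in this model, and reads of those blocks still return zeros, which is the behavior the list item describes.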
Action Item: Not every vendor has addressed all these integration points. Some are waiting for the next release of vSphere. Some pressed ahead of VMware with users getting earlier benefits. The vSphere landscape remains in flux and both users and vendors will have to keep up or even stay ahead.