This is the first article in a series of two exploring the impact of server virtualization on DAS storage, particularly when a DAS RAID engine is involved. The terms DAS and DAS RAID will be used interchangeably unless otherwise noted.
This article focuses on the state of the art of server virtualization: how we got here, how storage I/O issues are resolved, and what the current advantages and limitations are. The second article will cover how new and innovative solutions -- specifically aimed at virtualizing I/O -- promise to solve most of these issues and complete the transition from independent servers to fully virtualized platforms.
What is server virtualization, and why do we have it? The virtualization story focuses on increasing server deployment and management efficiency while reducing TCO (Total Cost of Ownership).
From a holistic point of view, we can see two stages in server virtualization: the first addressing server consolidation and the second, usage efficiency.
During the consolidation phase, the problem addressed was the general underutilization of servers (7% average utilization is a widely accepted IDC number from a study in early 2007). Underutilized or not, servers are bought whole, deployed whole, and consume power whole. The first and easiest step in addressing this problem is to squeeze the OSes and applications running on all these different servers onto a single platform. The main benefit is acquisition cost (CapEx) reduction, although consolidation also cuts power consumption.
The second phase is more focused on efficiency, enabling multiple virtualized servers to dynamically manage resources, move virtual machines across servers, schedule downtime, upgrades, and maintenance transparently, and handle failover automatically. This phase is largely aimed at reducing the management cost and OpEx of the platform.

Technology has marched along a similar path in both hardware and software. The first implementations ran on standard hardware, using software tricks like binary translation to create the virtualization layer. Hardware then added hooks, such as Intel's VT-x and VT-i technologies and AMD-V, to help abstract CPU and memory complexity from each guest OS.
The next step, again taken in software first, was to move the hypervisor infrastructure from binary translation to emulation and para-virtualization, improving hypervisor efficiency and reducing system overhead.
The last step was to reduce the footprint of the virtualization environment from gigabytes to tens of megabytes, making the software footprint less taxing on resources.
The forgotten I/O

Since the initial focus was mostly on software infrastructure, all the effort went into the usual software bottlenecks, largely inside the CPU and memory complex.
I/O was, and usually still is, treated as a commodity resource: the hypervisor abstracts it from the OSes (each called a Guest OS, or GOS) and makes each act as if it still owned all the I/O. In all honesty, this solution still works fairly well, particularly given the most recent improvements in emulation and para-virtualization techniques.
But this means that the I/O hardware is still a single element with its own inherent limitations, fooled by the hypervisor into acting as if it were serving one single OS rather than a collection of virtualized GOSes.
This mutual ignorance between the I/O and the GOSes simplifies deployment (no change is needed in either the OS or the I/O, since neither knows of the other) but poses its own efficiency problems and, in addition, security concerns.
Basically, the hypervisor is the gatekeeper of everything: every I/O request from every GOS has to be processed by the hypervisor, which provides abstraction services, handles GOS queues, services interrupts, manages exceptions, and so on.
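To make this mediation concrete, here is a minimal Python sketch of the trap-and-service path -- all names (PhysicalDisk, Hypervisor, guest_io) are hypothetical, not any real hypervisor's API: each guest request is translated, executed against the single physical device, and completed through an emulated interrupt.

```python
from collections import deque

class PhysicalDisk:
    """Stand-in for the single, shared physical I/O device."""
    def __init__(self, num_blocks=1024):
        self.blocks = [b""] * num_blocks

    def read(self, block):
        return self.blocks[block]

    def write(self, block, data):
        self.blocks[block] = data
        return len(data)

class Hypervisor:
    def __init__(self, disk):
        self.disk = disk
        self.completions = {}            # per-GOS completion queues

    def register_guest(self, gos_id):
        self.completions[gos_id] = deque()

    def translate(self, gos_id, block):
        # Real hypervisors keep per-guest mapping tables; identity mapping here.
        return block

    def guest_io(self, gos_id, op, block, data=b""):
        # Every request traps here: address translation, device access,
        # then an emulated interrupt (modeled as a queued completion).
        phys = self.translate(gos_id, block)
        result = self.disk.read(phys) if op == "read" else self.disk.write(phys, data)
        self.completions[gos_id].append((op, block, result))

hv = Hypervisor(PhysicalDisk())
hv.register_guest("gos1")
hv.guest_io("gos1", "write", 7, b"payload")
hv.guest_io("gos1", "read", 7)
print(hv.completions["gos1"])    # both I/Os were mediated by the hypervisor
```

Note that the GOS never touches the device: every request costs a round trip through hypervisor code, which is exactly the overhead discussed below.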
On the positive side, the current model has created new opportunities to improve I/O and storage functionality. The first and easiest example is that the hypervisor may have an internal file system (for example, VMFS in VMware) that can be used as an aggregation point for virtualizing all storage at the software level and then reassigning it as needed to the GOSes.
This has enabled several features that are historically SAN related: one can thin provision storage, create snapshots, and abstract storage through pooling or other technologies. What used to be a set of server connections to a SAN becomes a GOS storage connection to the hypervisor file system, even if that storage is internal to the physical box.
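As an illustration of why this enables SAN-style features on local disks, here is a minimal sketch of a thin-provisioned volume -- hypothetical names, not how VMFS actually works: the volume promises a large virtual size, consumes backing space only for blocks actually written, and treats a snapshot as a copy of the block map.

```python
class ThinVolume:
    """A crude thin-provisioned volume: promises virtual_size blocks,
    but only written blocks consume backing space."""
    def __init__(self, virtual_size):
        self.virtual_size = virtual_size
        self.backing = {}                        # block number -> data

    def write(self, block, data):
        if not 0 <= block < self.virtual_size:
            raise ValueError("write past end of volume")
        self.backing[block] = data

    def read(self, block):
        return self.backing.get(block, b"\x00")  # unwritten blocks read as zeros

    def snapshot(self):
        # A copy of the block map is enough for a crude point-in-time snapshot.
        return dict(self.backing)

vol = ThinVolume(virtual_size=1_000_000)   # promised to the GOS: ~1M blocks
vol.write(42, b"data")
snap = vol.snapshot()
print(len(vol.backing), "block(s) actually backed")   # -> 1
```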
However, this also creates several limitations:
- Performance: This goes beyond the usual metrics of non-virtualized servers, which largely focus on MB/sec or IO/sec. The hypervisor both consumes system CPU to handle the abstraction (reducing the CPU cycles available to productivity apps) and forces data to transition through it before being delivered to the requesting GOS. This adds latency for each GOS, decreasing GOS productivity; it is especially bad in transactional applications that count on fast response times (see the rough calculation after this list).
- Security: As all the data merges into the hypervisor, it can only be as secure as the hypervisor itself. Bugs in the hypervisor, operator error, viruses, etc., impact all data and OS/applications running on the platform, not just one. Security is enforced only through the hypervisor software, and while this can be more than acceptable in some applications, it may be a major threat in others.
- Quality of Service (QoS): Again, this has to work through the hypervisor, which means more code is running to enforce QoS (using system resources), and that code is limited to the functionality the hypervisor provides. Because the I/O is not aware of being virtualized, QoS tuning at the I/O controller level is impossible, making this a far less than optimal implementation.
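On the performance point, a back-of-the-envelope calculation illustrates the cost of the extra hop; the latency figures below are assumptions chosen for illustration, not measurements:

```python
# Assumed, illustrative latencies (microseconds) -- not measurements.
device_latency_us = 100       # raw device service time per I/O
hypervisor_overhead_us = 30   # trap, copy, and interrupt-emulation cost

direct_iops = 1_000_000 / device_latency_us
mediated_iops = 1_000_000 / (device_latency_us + hypervisor_overhead_us)
print(f"direct: {direct_iops:.0f} IOPS, mediated: {mediated_iops:.0f} IOPS "
      f"({100 * (1 - mediated_iops / direct_iops):.0f}% lower)")
```

Under these assumptions, per-GOS synchronous throughput drops by roughly a quarter, before counting the CPU cycles the hypervisor steals from the applications themselves.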
Virtualization also provides SAN-like advantages to DAS RAID controllers, even though they are unaware of the virtualization layer. The most evident is storage consolidation, historically one of the distinguishing features of SAN-based RAID. Users often have to over-provision storage for non-virtualized servers because of performance and reliability concerns, resulting in underutilized and expensive deployments. A virtualized server can be provisioned like a SAN: aggregating storage as needed, creating LUNs to be assigned to each GOS (or to the hypervisor's own file system), and provisioning it with all the usual SAN technologies.
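To picture that consolidation step, here is a minimal sketch -- hypothetical names and simple capacity accounting: local RAID capacity is pooled once, then carved into LUNs assigned to individual GOSes or to the hypervisor's own file system.

```python
class StoragePool:
    """Aggregated internal RAID capacity, carved into LUNs."""
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.luns = {}                       # lun name -> (owner, size_gb)

    def create_lun(self, name, size_gb, owner):
        used = sum(size for _, size in self.luns.values())
        if used + size_gb > self.capacity_gb:
            raise ValueError("pool exhausted")
        self.luns[name] = (owner, size_gb)

pool = StoragePool(capacity_gb=2000)         # e.g. the local RAID capacity
pool.create_lun("lun0", 500, owner="gos1")   # per-GOS carve-out
pool.create_lun("lun1", 250, owner="hypervisor-fs")
print(pool.luns)
```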
One last item to note is that I/O, and DAS storage most of all, is local to the platform. You can abstract storage to the GOSes running on that server, but not to a GOS running on another server, both because no efficient communication path exists and because, if the server fails, all of its local storage fails with it and is unavailable to any other server. For this reason, current implementations of VM mobility require the storage to be attached to a SAN.
But these disadvantages are simply the result of the complete focus on CPU and memory management in the first stage of server virtualization, which leaves the hypervisor to solve the abstraction problems unaided by hardware.
Action Item: As virtualization becomes pervasive, the virtualization of I/O is becoming more important. In response, new technologies are on the horizon that promise to remove the I/O limitations of virtualized systems while preserving all of the advantages. The most promising technologies in this space are Single-Root I/O Virtualization (SR-IOV) and Multi-Root I/O Virtualization (MR-IOV). They will be the subject of the next article.