This is the second part of a two-part series addressing the impact of server virtualization on IO devices. In part one, we looked at the current state of the industry: how virtualization is deployed on current server architectures, and how most of the effort so far has focused on the CPU/memory complex alone, while little attention has been given to IO.
Focusing first on the CPU and memory complex, where server virtualization's impact would be greatest, was a reasonable approach. However, the situation is now reversed: while the computing element, built on the new virtualized CPU/memory infrastructure, is starting to deliver what was promised, IO is lagging behind and will be the object of the next wave of evolution. The storage subsystem, the subject of this article because of its central role in the system, is particularly affected by these limitations.
Storage IO suffers in three areas on current virtualized servers:
- Performance: not necessarily in MB/sec or IO/sec, which may still be at acceptable rates, but in latency (the time for a command to travel from the OS to the storage device and for the response to travel back to the originating OS) and in system CPU usage (how much processing is needed to handle each IO); a simple way to measure per-IO latency is sketched after this list
- Security: as discussed in part one, all of the OSes (running in Virtual Machines, or VMs) dispatch their IO calls through a hypervisor layer, creating hazards and opportunities for data integrity and security breaches
- Quality of Service (QoS): limited by the hypervisor's functionality and removed from the storage controller
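To make the latency point concrete, here is a minimal measurement sketch in Python, assuming a Linux host with Python 3.7 or later; the file path, block size, and sample count are illustrative choices, not from any particular benchmark. It times synchronous direct reads, so each sample covers the full OS-to-device round trip:

```python
import mmap
import os
import statistics
import time

# Illustrative scratch file; point this at a filesystem that supports
# O_DIRECT (tmpfs does not) on the device under test.
PATH = "/var/tmp/io_latency_test.bin"
BLOCK = 4096      # one 4 KiB IO per sample
SAMPLES = 1000

# Create a file large enough to read SAMPLES distinct blocks from.
with open(PATH, "wb") as f:
    f.write(os.urandom(BLOCK * SAMPLES))

# O_DIRECT bypasses the page cache, so every read travels the full
# OS-to-device path; it requires a block-aligned buffer, which an
# anonymous mmap provides (it is always page-aligned).
fd = os.open(PATH, os.O_RDONLY | os.O_DIRECT)
buf = mmap.mmap(-1, BLOCK)

latencies_us = []
for i in range(SAMPLES):
    start = time.perf_counter()
    os.preadv(fd, [buf], i * BLOCK)   # one synchronous 4 KiB read
    latencies_us.append((time.perf_counter() - start) * 1e6)
os.close(fd)

print(f"median latency: {statistics.median(latencies_us):.1f} us")
```

Run the same script on bare metal and inside a VM with hypervisor-emulated IO, and the gap between the two medians is precisely the overhead discussed above.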
Ultimately, all of the above share a single root cause: in current server virtualization architectures, IO is not virtualized at all, except by the hypervisor, which provides a simple but inefficient emulation of IO virtualization.
More specifically, on one side the hypervisor “fools” each VM into believing it fully owns every IO resource, while in reality that resource is shared with the other VMs; on the other side, it “fools” the IO controller into believing it is receiving commands from one single VM, while in reality it is receiving requests from all the VMs installed in the system. The value of this solution is that it is simple and does not require any changes to the IO subsystem, but the price to pay is the inefficiency and the limitations above.
To address these issues, an IO Virtualization (IOV) model has recently been proposed and incorporated into the PCI Express standard, with two main variants: Single Root IO Virtualization (SR-IOV) and Multi Root IO Virtualization (MR-IOV).
The concept is pretty simple: instead of having the hypervisor act as a middleman between the VMs and the IO controllers, the IOV proposals create an environment where each VM can directly address the IO controller without any intervention by the hypervisor. On the opposite side, the IO controller can provide a specific path for communicating with each VM, keeping all the IO threads separate.
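On Linux, for example, an SR-IOV capable controller exposes its virtual functions (VFs) through sysfs; each VF then appears as its own PCI function that the hypervisor can hand directly to a VM. Here is a minimal sketch (the PCI address below is hypothetical; `lspci` will show real candidates on your system, and writing the VF count requires root):

```python
from pathlib import Path

# Hypothetical PCI address of an SR-IOV capable controller.
DEVICE = Path("/sys/bus/pci/devices/0000:03:00.0")

def total_vfs(dev: Path) -> int:
    """Maximum number of virtual functions the device supports."""
    return int((dev / "sriov_totalvfs").read_text())

def enable_vfs(dev: Path, count: int) -> None:
    """Ask the driver to instantiate `count` virtual functions.

    Each VF then shows up as its own PCI function that can be assigned
    directly to a VM, taking the hypervisor out of the data path.
    """
    (dev / "sriov_numvfs").write_text(str(count))

if __name__ == "__main__":
    supported = total_vfs(DEVICE)
    print(f"{DEVICE.name} supports up to {supported} VFs")
    enable_vfs(DEVICE, min(4, supported))  # e.g., one VF per VM
```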
SR-IOV defines this protocol for controllers fully contained in a single server platform (a single “PCIe root”), while MR-IOV extends the same solution to storage controllers physically removed from the server and serving multiple server platforms. While MR-IOV is an interesting solution for blade and brick computers, its applications sit on a longer-term horizon.
Although this implementation seems simple and straightforward, in reality it hides a great deal of complexity, especially for storage controllers, and it creates a completely new class of opportunities (and challenges) for the storage subsystem and related services.
Here is how IO Virtualization addresses the three concerns raised earlier:
- Performance: the hypervisor is entirely bypassed, creating a direct link whose performance, in both latency and CPU usage, is entirely equivalent to the optimal case of a non-virtualized server
- Security: each VM has its own IO thread, and these threads never merge nor use common resources until they are inside the storage controller. As such, there is no risk of data contamination, corruption, or security breach; at least no more than among independent, non-virtualized servers sharing a common storage controller
- QoS: the storage controller is now aware of each VM and can apply QoS policies per VM, or even per IO, creating an environment that can support any level of QoS (a sketch of one such per-VM policy follows this list)
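To illustrate what per-VM QoS could look like once the controller sees each VM, here is a hedged sketch of a token-bucket admission check keyed by virtual function (and therefore by VM); the class, the rates, and the VF names are our own illustration, not part of the SR-IOV specification:

```python
import time

class TokenBucket:
    """Token-bucket limiter: admits `rate` IOs per second, bursts to `burst`."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed interval, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# One bucket per virtual function, i.e., per VM: the controller can throttle
# a noisy neighbor without touching the other VMs' IO streams.
qos_policy = {
    "vf0": TokenBucket(rate=5000, burst=64),  # production VM: 5,000 IOPS
    "vf1": TokenBucket(rate=500, burst=16),   # test VM: 500 IOPS
}

def admit(vf: str) -> bool:
    """Decide whether an incoming IO from `vf` is serviced now or queued."""
    return qos_policy[vf].allow()
```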
However, things are not always as simple as they first appear, especially for storage controllers, where resources are shared. To illustrate this, let's look at the other prevalent IO subsystem: the LAN controller. A LAN controller creates a connection between each local VM and a remote system, using well-known protocols such as TCP/IP. The big advantage these connections have is that the threads are separate and isolated and, most of all, do not share any common target element (i.e., they connect different VMs to different targets over a shared transport). In this environment, applying IOV is rather simple: several LAN controllers already support it, and virtualized servers can already fully benefit from it.
Storage controllers are a different story, because the “target” device is generally not entirely owned by the connecting VM. For example, a RAID controller may project two virtual disks to two virtual machines, each of which can use its disk as if it were the sole owner; in reality, these virtual disks are RAID abstractions of the same physical disks, as the toy model below makes explicit.
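The disk names and the RAID layout in this sketch are invented purely for illustration: two virtual disks that look private to their VMs can map onto exactly the same physical spindles.

```python
# Invented RAID mapping: each "private" virtual disk is striped across the
# same four physical disks, so the two VMs contend on every spindle.
PHYSICAL_DISKS = ["pd0", "pd1", "pd2", "pd3"]

VIRTUAL_DISKS = {
    "vdisk_vm1": PHYSICAL_DISKS,  # stripe over all four disks
    "vdisk_vm2": PHYSICAL_DISKS,  # a second stripe over the same disks
}

def shared_spindles(vd_a: str, vd_b: str) -> set[str]:
    """Physical disks on which two 'independent' virtual disks contend."""
    return set(VIRTUAL_DISKS[vd_a]) & set(VIRTUAL_DISKS[vd_b])

print(shared_spindles("vdisk_vm1", "vdisk_vm2"))
# all four disks are shared: every IO from one VM can disturb the other
```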
This creates a major management headache and demands that techniques generally limited to the SAN be ported inside the server domain; consequently, the configuration of these subsystems is going to look much more like an external RAID controller than like conventional DAS.
If you look at this from a higher level of abstraction, what SR-IOV is doing for the server (and MR-IOV possibly for blades) is creating a “SAN inside the server,” where the user gets both the small footprint (and cost!) of server-based storage and the configurability and flexibility of a SAN controller; the price to pay is the investment in a new architecture for IO subsystems. Call it “evolution”…
As a last comment, these evolutionary solutions will, as always happens, challenge the status quo of related technologies, but at the same time they will create new opportunities for further improvement. For example, today the hypervisor not only provides a “funnel” function for IOs but in some cases also provides a local file system (such as VMware VMFS), which can in turn be used to build services like snapshots and thin provisioning. SR-IOV, however, allows the hypervisor to be bypassed completely, so there is no longer any opportunity for that kind of data manipulation. There could be several solutions to this, but the most likely at this time is the migration of such services into the storage controller, which will then resemble more and more a “SAN in the server”: SAN-like configuration, SAN-like connectivity, and SAN-like services such as snapshots and thin provisioning.
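As a sketch of one such service moving into the controller, here is a minimal allocate-on-write model of thin provisioning; all names and capacities are illustrative. The volume advertises far more capacity than it consumes, mapping a virtual block to physical storage only on its first write:

```python
class ThinVolume:
    """Thin-provisioned volume: physical blocks are allocated on first write.

    The volume advertises `virtual_blocks` of capacity but consumes
    physical space only for blocks that have actually been written.
    """

    def __init__(self, virtual_blocks: int):
        self.virtual_blocks = virtual_blocks
        self.mapping: dict[int, int] = {}  # virtual block -> physical block
        self.next_physical = 0

    def write(self, vblock: int, data: bytes) -> None:
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("write past advertised capacity")
        if vblock not in self.mapping:     # allocate on first write only
            self.mapping[vblock] = self.next_physical
            self.next_physical += 1
        # ... store `data` at physical block self.mapping[vblock] ...

    def used_blocks(self) -> int:
        return len(self.mapping)

vol = ThinVolume(virtual_blocks=1_000_000)  # advertises ~4 GB of 4K blocks
vol.write(0, b"x")
vol.write(999_999, b"y")
print(vol.used_blocks())  # 2: only written blocks consume physical space
```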
Looking forward, we should mention that the key to multi-server virtualization is VM mobility, which today requires a SAN to guarantee storage availability to each server; it is a short stretch to see that MR-IOV will provide the same capability to DAS storage and thus enable VM mobility for DAS as well.
Action Item: We have provided a lot of detail on why server virtualization is needed, what its impact on the storage controller is, and how virtualizing IO controllers may enable a new class of solutions. Where we end up, when this transition is complete, is that DAS-based storage controllers will closely resemble SAN storage controllers. The only difference remaining between the two is that DAS uses PCIe for transport while SAN has an entire class of storage protocols (FC, GbE, SAS, IB, FCoE, …). But even that distinction may blur in the future, as more storage controllers offer services through virtualization techniques and servers embed many more storage services.
Quite possibly, a few years down the road, the discussion won't focus so much on DAS-based server storage versus SAN-based platforms as on general computing platforms that can be configured either as general-purpose servers or as storage servers. And server virtualization technologies, including the IOV technologies, are the foundation of this convergence.