May 2, 2006
This article was contributed by Rami Rosen
Virtualization addresses the problem of making more efficient use of
available computer resources. This is done by providing an abstraction
layer which maps real resources to virtual resources.
Virtualization solutions have existed for about forty years.
For example,
IBM's virtual machine work in the mid-sixties led to VM/370, announced
in 1972, which used virtualization to expose a virtual System/370
machine to the user.
There is a wealth of virtualization technologies for the Linux platform:
QEMU, BOCHS, OpenVZ, coLinux, Xen, and many more.
In this article we will focus on Xen and the Virtualization Extensions
found in new processors.
On x86 processors, when running in protected mode, there are four privilege
levels. The operating system kernel executes in privilege level 0
(also called "supervisor mode") while applications execute
in privilege level 3. Privilege levels 1 and 2
are generally left unused by mainstream operating systems. When the
processor detects a privilege level violation, it raises a
general-protection exception (#GP).
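The privilege check can be pictured with a small model. This is only a sketch: the `execute` function and the `GeneralProtectionFault` class are names invented for illustration, and the real processor performs far more checks than this. The three instructions listed are real x86 instructions that may only be executed at privilege level 0.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the general-protection fault the processor raises."""


# A few real x86 instructions that may only execute at privilege level 0.
PRIVILEGED = {"HLT", "LGDT", "WRMSR"}


def execute(instruction: str, cpl: int) -> str:
    """Model one instruction issued at current privilege level `cpl`."""
    if cpl not in (0, 1, 2, 3):
        raise ValueError("protected mode has privilege levels 0 to 3")
    if instruction in PRIVILEGED and cpl != 0:
        raise GeneralProtectionFault(f"{instruction} at privilege level {cpl}")
    return f"{instruction} ok at privilege level {cpl}"


print(execute("HLT", 0))          # kernel: allowed
try:
    execute("HLT", 3)             # application: refused
except GeneralProtectionFault as fault:
    print("fault:", fault)
```

In this model anything other than level 0 is refused, which is also why a kernel moved out of level 0 cannot simply keep issuing such instructions.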
When using virtual machine extensions, there are two classes
of software: VMM (Virtual Machine Monitor), also known as "hypervisor",
and Guests, which are virtual machines.
The VMM acts as a host and has full access to the hardware.
Each Guest virtual machine operates independently of the others.
In the Xen project, running on x86 processors,
the guest operating systems run in privilege level 1.
The guest operating system code has been modified to support
virtualization.
There is no need to modify applications and they run in privilege
level 3 as in the usual case.
Naturally, many will prefer a situation where the guest operating
system code does not need to be modified.
As a result, hardware manufacturers like Intel and AMD have begun
to develop processors with built-in virtualization extensions.
With these processors, the guest operating system code stays unmodified.
Intel has developed the VT-x technology for x86 processors. This
technology provides hardware virtualization extensions. Some
VT-x processors are already available on the market.
For more details, see the Intel virtualization specification
for IA-32 [PDF].
With Intel's VT-x, the VMM runs in "VMX root operation mode" while the
guests (which are unmodified OSes) run in "VMX non-root
operation mode". While running in this mode, the guests are
more restricted; some instructions, like RDMSR, WRMSR and CPUID,
will cause a "VM exit" to the VMM. A VM exit is a transition
from non-root operation to root operation. Other instructions and
exceptions cause a "VM exit" only when conditions configured by the
VMM are met.
Xen handles each VM exit in a manner that is specific to the
particular exit reason.
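The shape of such exit handling can be sketched as a dispatch on the exit reason. The code below is a toy model and not Xen's code: the `ExitReason` values, the handler names and the dictionary standing in for guest state are all invented for illustration, and MSR values are squeezed into a single register to keep it short.

```python
from enum import Enum, auto


class ExitReason(Enum):
    # Symbolic reasons only; the real encodings are defined by Intel.
    CPUID = auto()
    RDMSR = auto()
    WRMSR = auto()


def handle_cpuid(guest: dict) -> None:
    # The VMM decides what the guest sees; 0x1234 is a made-up value.
    guest["eax"] = 0x1234


def handle_rdmsr(guest: dict) -> None:
    # Return the guest's private copy of the MSR, never the host's.
    guest["eax"] = guest["msrs"].get(guest["ecx"], 0)


def handle_wrmsr(guest: dict) -> None:
    guest["msrs"][guest["ecx"]] = guest["eax"]


HANDLERS = {
    ExitReason.CPUID: handle_cpuid,
    ExitReason.RDMSR: handle_rdmsr,
    ExitReason.WRMSR: handle_wrmsr,
}


def on_vm_exit(reason: ExitReason, guest: dict) -> None:
    """Root-operation side of a VM exit: dispatch, then resume the guest."""
    HANDLERS[reason](guest)
    guest["exits"] += 1


guest = {"eax": 7, "ecx": 0x10, "msrs": {}, "exits": 0}
on_vm_exit(ExitReason.WRMSR, guest)   # guest wrote 7 to MSR 0x10
guest["eax"] = 0
on_vm_exit(ExitReason.RDMSR, guest)   # guest reads it back
print(guest["eax"], guest["exits"])
```

The point of the table of handlers is that the guest never touches the real resource; each exit lets the VMM substitute a per-guest view before the guest continues.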
To implement this hardware virtualization, Intel added a new structure
called VMCS (Virtual Machine Control Structure), which handles
much of the virtualization management functionality. This structure
contains the exit reason in the case of a VM exit.
Also, ten new instruction opcodes were added in VT-x.
These new opcodes manage the VT-x virtualization behavior.
For example, the VMXON instruction starts VMX operation, the VMREAD
instruction reads a specified field from the VMCS, and the
VMWRITE instruction writes a specified field to the VMCS.
When a processor operates in "VMX root operation mode" its behavior
is much like when it operates in normal operating mode. However,
in normal operating mode these ten new opcodes are not available.
Intel recently published the specification for VT-d (Intel(r)
Virtualization Technology for Directed I/O).
VT-d enables I/O devices to be directly assigned to virtual machines.
It also defines DMA remapping logic that can be configured for an
individual device.
There is also a cache, called an IOTLB, which improves performance.
For more details see Intel's
documentation [PDF].
In AMD's SVM ("Secure Virtual Machine"), there is something quite similar,
but the terminology is a bit different: there is a Host Mode and a Guest Mode.
The VMM runs in Host Mode and the guests run in Guest Mode.
In Guest Mode, some instructions cause a VM EXIT, which is handled
in a manner that is specific to the way Guest Mode is entered.
AMD added a new structure called the VMCB (Virtual Machine Control Block) which handles much of the virtualization management functionality.
The VMCB includes an exit reason field which is read when a VM EXIT
occurs. AMD added eight new instruction opcodes to support SVM.
For example, the VMRUN instruction starts the operation of a guest OS,
the VMLOAD instruction loads the processor state from the VMCB and
the VMSAVE instruction saves the processor state to the VMCB.
For more details see the AMD64
Architecture Programmer's Manual [PDF]: Vol 2, System
Programming,
chapter 15, "Secure Virtual Machine".
AMD is expected to release its first processors with virtualization
support in June 2006.
AMD has published its I/O virtualization technology specification (IOMMU);
AMD CPUs with this IOMMU support should be available in 2007.
The AMD IOMMU technology intercepts device accesses to memory. It finds
out to which guest a particular device is assigned, decides whether
the access is permitted, and translates the address used by the device
into the actual address in system memory
(page protection and address translation).
You can think of the AMD IOMMU as combining and generalizing two facilities
already found in AMD processors:
the Graphics Aperture Remapping Table (GART) and the Device Exclusion Vector (DEV).
In the AMD IOMMU there is optional support for IOTLBs.
For more details see:
AMD
I/O virtualization technology (IOMMU) specification Rev 1.00 [PDF].
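The protection and translation steps performed by an IOMMU can be sketched as follows. This is a toy model under loud assumptions: the `ToyIommu` class, its method names and the 4096-byte page size are invented for illustration and do not reflect the table formats in the AMD specification.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class IommuFault(Exception):
    """Raised when a device access is refused."""


class ToyIommu:
    def __init__(self):
        self.device_to_guest = {}   # device id -> guest id
        self.page_tables = {}       # guest id -> {device page: (system page, writable)}

    def assign(self, device: str, guest: str) -> None:
        self.device_to_guest[device] = guest

    def map_page(self, guest: str, dev_page: int, sys_page: int, writable: bool) -> None:
        self.page_tables.setdefault(guest, {})[dev_page] = (sys_page, writable)

    def translate(self, device: str, addr: int, write: bool) -> int:
        """Check a device access and return the system memory address."""
        guest = self.device_to_guest.get(device)
        if guest is None:
            raise IommuFault(f"{device} is not assigned to any guest")
        entry = self.page_tables.get(guest, {}).get(addr // PAGE_SIZE)
        if entry is None:
            raise IommuFault(f"no mapping for {addr:#x}")
        sys_page, writable = entry
        if write and not writable:
            raise IommuFault(f"write to read-only page at {addr:#x}")
        return sys_page * PAGE_SIZE + addr % PAGE_SIZE


iommu = ToyIommu()
iommu.assign("nic0", "guest1")
iommu.map_page("guest1", dev_page=1, sys_page=5, writable=False)
print(hex(iommu.translate("nic0", 0x1010, write=False)))
```

A read by `nic0` at address 0x1010 lands in system page 5 at the same offset, while a write to the same page, or any access by an unassigned device, is refused.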
Starting at the end of January 2006, the Xen unstable repository has
offered support for both Intel and AMD processors with virtualization
extensions.
Since there is much in common between the AMD and Intel extensions, a
common API, termed HVM (Hardware Virtual Machine), was developed.
For example, HVM defines a table called hvm_function_table, which is a
structure containing functions that are common to both Intel VT-x and
AMD SVM. These functions are implemented differently in the VT-x and SVM
trees. Another example of a common method for VT-x and SVM is the domain
builder method, xc_hvm_build(). (A domain is a guest.)
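The idea behind a table such as hvm_function_table can be sketched as one set of entry points with two implementations. The model below is only an illustration: the member names `exit_reason` and `start_guest`, the dictionaries standing in for the VMCS and VMCB, and the `common_exit_path` function are all assumptions and are not taken from the Xen source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HvmFunctionTable:
    """Toy stand-in for hvm_function_table; the member names are invented."""
    name: str
    exit_reason: Callable[[dict], int]
    start_guest: Callable[[dict], str]


def vmx_exit_reason(vcpu: dict) -> int:
    return vcpu["vmcs"]["exit_reason"]      # VT-x keeps it in the VMCS


def vmx_start_guest(vcpu: dict) -> str:
    return "VT-x: enter VMX non-root operation"


def svm_exit_reason(vcpu: dict) -> int:
    return vcpu["vmcb"]["exit_reason"]      # SVM keeps it in the VMCB


def svm_start_guest(vcpu: dict) -> str:
    return "SVM: VMRUN"


VMX = HvmFunctionTable("vmx", vmx_exit_reason, vmx_start_guest)
SVM = HvmFunctionTable("svm", svm_exit_reason, svm_start_guest)


def common_exit_path(hvm: HvmFunctionTable, vcpu: dict) -> str:
    """Vendor-neutral code: it only ever goes through the table."""
    return f"{hvm.name}: exit reason {hvm.exit_reason(vcpu)}"


print(common_exit_path(VMX, {"vmcs": {"exit_reason": 10}}))
print(common_exit_path(SVM, {"vmcb": {"exit_reason": 114}}))
```

The exit reason numbers passed in are arbitrary sample values. The caller picks the table once, at startup, according to the processor it finds, and the rest of the common code does not need to know which vendor it is running on.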
With Xen running on processors without virtualization extensions, there is a device model
which is based on backend/frontend virtual drivers (also called
"split drivers"). The backend is in domain 0, while the frontend is in the
unprivileged domains. They communicate via an interdomain event channel
and a shared memory area which is allocated from grant tables.
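The split-driver exchange can be pictured with a toy model. Everything here is an assumption made for illustration: `SharedRing` and `EventChannel` are invented classes standing in for the granted shared memory area and the interdomain event channel, and the dictionary `disk` stands in for real hardware that only the backend can reach.

```python
from collections import deque


class SharedRing:
    """Stands in for the shared memory area both domains can access."""
    def __init__(self):
        self.requests = deque()
        self.responses = deque()


class EventChannel:
    """Stands in for an event channel: a notification with no payload."""
    def __init__(self):
        self.pending = False

    def notify(self) -> None:
        self.pending = True

    def consume(self) -> bool:
        was_pending, self.pending = self.pending, False
        return was_pending


def frontend_read(ring: SharedRing, to_backend: EventChannel, sector: int) -> None:
    """Unprivileged domain: queue a request, then kick the backend."""
    ring.requests.append(("read", sector))
    to_backend.notify()


def backend_poll(ring, to_backend, to_frontend, disk) -> int:
    """Domain 0: serve queued requests using the real driver (here, a dict)."""
    if not to_backend.consume():
        return 0
    handled = 0
    while ring.requests:
        _op, sector = ring.requests.popleft()
        ring.responses.append(disk[sector])
        handled += 1
    to_frontend.notify()
    return handled


ring, to_backend, to_frontend = SharedRing(), EventChannel(), EventChannel()
disk = {0: b"boot", 1: b"data"}
frontend_read(ring, to_backend, 1)
print(backend_poll(ring, to_backend, to_frontend, disk), ring.responses[0])
```

The data travels through the shared ring, while the event channels carry only the "something is waiting" signal in each direction.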
Only domain 0 has access to the hardware through the unmodified Linux
drivers. When running on VT-x or SVM, we cannot use this I/O model,
because the guests run unmodified kernels, which contain no frontend
drivers. So both VT-x and SVM guests use the emulated device subsystem
of QEMU for their I/O. QEMU runs in Xen as a userspace process. Using
QEMU has a performance cost, so it is possible that QEMU will be
replaced in the future by a better-performing solution. It is, however, important to
understand that an IOMMU layer, even one which is built according to the
new AMD or Intel specs, cannot in itself be a replacement for QEMU,
because the same device may need to be shared between multiple domains.
As was mentioned above, Intel VT-x and AMD SVM have much in common
(like the usage of QEMU and the common API
which HVM abstracts).
However, there are some differences; for example:
- The AMD SVM uses a tagged TLB; this means
that it uses an ASID (Address Space Identifier) to distinguish
host-space entries from guest-space entries.
By using this identifier, we don't have to perform a TLB flush when
there is a context switch between guest and host.
This significantly reduces the number of TLB flushes.
A TLB flush slows the system because after a TLB flush occurs,
subsequent accesses to memory will require a full page table lookup.
- In order to boot an Intel VT-x machine you need an hvmloader
(which was called vmxloader in the past).
According to the VT-x spec, guest OSes cannot operate in real mode.
Using a Linux loader to load a guest OS is impossible because it starts in
real mode. To solve this problem, a vmxloader was written for VT-x guests.
This loader uses the VM86 mode of the processor to run the OS boot loader.
AMD SVM, on the other hand, supports real mode for guests, so
it does not need the VM86 mode of the hvmloader.
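The effect of the tagged TLB described in the first item above can be sketched with a toy model that counts flushes and page table walks. The `ToyTlb` class, the choice of ASID 0 for the host and ASID 1 for the guest, and the made-up translation formula are assumptions for illustration only.

```python
class ToyTlb:
    def __init__(self, tagged: bool):
        self.tagged = tagged
        self.entries = {}        # (asid, virtual page) -> physical page
        self.current_asid = 0    # 0 = host in this sketch, by assumption
        self.flushes = 0
        self.walks = 0           # full page table lookups

    def switch_to(self, asid: int) -> None:
        if asid != self.current_asid and not self.tagged:
            self.entries.clear()     # untagged: stale entries must go
            self.flushes += 1
        self.current_asid = asid

    def lookup(self, vpage: int) -> int:
        key = (self.current_asid if self.tagged else 0, vpage)
        if key not in self.entries:
            self.walks += 1
            # Made-up translation standing in for a page table walk.
            self.entries[key] = self.current_asid * 1000 + vpage
        return self.entries[key]


def world_switches(tlb: ToyTlb, rounds: int) -> tuple:
    """Alternate between host and guest, touching the same page each time."""
    for _ in range(rounds):
        for asid in (0, 1):          # host, then guest
            tlb.switch_to(asid)
            tlb.lookup(1)
    return tlb.flushes, tlb.walks


print("tagged:  ", world_switches(ToyTlb(tagged=True), 3))
print("untagged:", world_switches(ToyTlb(tagged=False), 3))
```

With tags, the host and guest entries for the same virtual page coexist, so only the first access from each side needs a walk. Without tags, every switch between host and guest empties the TLB and the next access pays for a walk again.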
In conclusion, we can see that there are many similarities
between Intel VT-x and AMD SVM when running Xen; sometimes the terms
are even similar (like VM Entry/VM Exit); and the
performance slowdown because of the use of QEMU is common to both.
Thanks to Mat Petersson from AMD for reviewing this article.