May 5, 2010
This article was contributed by Koen Vervloesem
The Polish security researcher Joanna Rutkowska specializes in
low-level security, including hardware-based attacks, kernel exploits,
rootkits, and virtualization malware. Among other things, she has
discovered vulnerabilities in the Windows Vista kernel, the Xen hypervisor, and
Intel's Trusted Execution Technology (TXT). In 2007, Joanna founded Invisible Things Lab;
her team has since changed strategy, deciding to use the knowledge gained from breaking systems to create a new operating system that improves security for users.
Last month, Invisible Things Lab presented the first result of this effort:
an alpha version of a new secure open source operating system, Qubes. The project aims to build a secure
operating system for desktop users. The main idea is that different
applications are isolated from each other, but without any big impediments to
usability. To implement this idea, Qubes uses the isolation capabilities of
the Xen hypervisor, together with modern hardware technologies such as
Intel VT-d (Virtualization
Technology for Directed I/O) and TXT.
Virtualization is the cornerstone of the Qubes security architecture
because it allows creating containers that are much better isolated than
the standard processes in typical operating systems. If the user's web browser gets compromised in a typical operating system, it's difficult to prevent other processes or the user's data from being compromised as well. If the compromised process is a core system component such as a WiFi driver or network stack, the security of the whole system is at stake.
Of course this architecture means that the choice of the hypervisor is
critical for the security of the whole system. The Qubes developers have
chosen Xen for a clear reason: the hypervisor itself is very simple, and it
doesn't provide services like a network stack or filesystems that could be
an attack vector. A security audit of the Xen hypervisor is therefore much
easier to perform than for other solutions like KVM. A more thorough explanation of why the Xen hypervisor architecture better suits the needs of Qubes can be found in the Qubes OS Architecture [PDF] document.
Isolating domains
Users can divide their tasks and resources into several virtual machines, called AppVMs (the "cubes"). Which AppVMs they choose depends on the user's work environment, but there are some typical examples. A "bank" VM could be set up exclusively for access to the user's bank web site, only allowing HTTPS access to the web site and nothing else. Work and personal stuff can be isolated in their own virtual machines. And a "random" VM could be used for watching YouTube movies and playing games.
Qubes provides some virtual machines for system-wide services by default, called SystemVMs. For example, all networking code (network stack and drivers) is sandboxed in an unprivileged "network" VM. The unprivileged code gets safe direct access to specific PCI devices (the network cards) using VT-d technology. The privileged Dom0 (the "host" operating system of Xen which runs the management stack) doesn't contain any networking code. As only the network VM is granted direct access to the networking hardware, each AppVM uses a virtual network interface created by the Xen network frontend. The other side of this virtual interface, in the network VM, is connected to the physical interface via the Linux packet filter, which also blocks any direct inter-VM traffic. This setup prevents the scenario where a lesser-privileged VM can compromise more-privileged VMs by exploiting a bug in privileged driver code.
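The forwarding rule described above can be sketched in a few lines. This is only an illustrative model, not the packet-filter configuration that Qubes actually ships; the interface names ("eth0" for the physical card, "vif" prefixes for the virtual interfaces) and the function names are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed naming: "eth0" is the physical NIC owned by the network VM,
# and every AppVM-facing virtual interface starts with "vif".
PHYSICAL = "eth0"


@dataclass(frozen=True)
class Packet:
    in_iface: str   # interface the packet arrived on
    out_iface: str  # interface the packet would be forwarded to


def is_vm_iface(name: str) -> bool:
    """True for a virtual interface that leads to an AppVM."""
    return name.startswith("vif")


def forward_allowed(pkt: Packet) -> bool:
    """Allow traffic between an AppVM and the physical NIC only."""
    if is_vm_iface(pkt.in_iface) and is_vm_iface(pkt.out_iface):
        return False  # direct inter-VM traffic is dropped
    return PHYSICAL in (pkt.in_iface, pkt.out_iface)
```

In this model an AppVM can reach the outside world through the physical interface, but a packet arriving from one AppVM can never be forwarded straight into another.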
Another possible attack vector is Dom0, which is almost as privileged as
the hypervisor: although it cannot modify the hypervisor's memory, it has
access to the memory of all the other virtual machines. So if a certain
AppVM can attack Dom0, it can also modify other AppVMs. However, by placing
the network code in an unprivileged domain, the likelihood of such an
attack is minimal. The only really security-sensitive code in Dom0 that is
accessible by the AppVMs is the XenStore daemon (which contains information
about where various storage devices are located) and the GUI. If a
malicious program can mimic starting and operating AppVMs, it can trick
the user into thinking they are running their application securely,
much like a phishing scam on a web site.
Secure storage
If all user applications are hosted in AppVMs, it could require a lot of
memory and storage: each virtual machine requires an operating system
(e.g. a Linux distribution) and one or more applications. However, Qubes
makes a special effort to save disk space. Instead of replicating the full
OS image for each VM, all AppVMs based on the same distribution share the
same read-only root filesystem (/boot, /bin,
/etc, /lib, /usr, and so on). The AppVM
distribution in Qubes is a lightweight Linux distribution (with a roughly 400 MB footprint) without a desktop environment (as the user's desktop environment is run in the Dom0 operating system), and it only uses a minimal X server.
Because read-only access is not enough, Qubes uses the device mapper to create a copy-on-write device on top of this. This device is discarded when the AppVM shuts down, so (possibly malicious) changes to the root filesystem will not be preserved: even if a virtual machine is compromised, it will boot the next time with a clean state.
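The copy-on-write behavior can be illustrated with a toy block device. This is a conceptual sketch only: Qubes relies on the kernel's device mapper, not on code like this, and the class and method names are hypothetical.

```python
class CowDevice:
    """Toy copy-on-write layer over a shared, read-only list of blocks."""

    def __init__(self, base: tuple):
        self._base = base    # shared root image, never modified
        self._overlay = {}   # block index -> this VM's private copy

    def _check(self, index: int) -> None:
        if not 0 <= index < len(self._base):
            raise IndexError("block index out of range")

    def read(self, index: int) -> bytes:
        """Return the private copy if one exists, else the shared block."""
        self._check(index)
        return self._overlay.get(index, self._base[index])

    def write(self, index: int, data: bytes) -> None:
        """Writes land in the overlay; the shared image is untouched."""
        self._check(index)
        self._overlay[index] = data

    def discard(self) -> None:
        """Models AppVM shutdown: all changes are thrown away."""
        self._overlay.clear()
```

A write made by a compromised AppVM is visible to that AppVM until shutdown, but after `discard()` every read returns the original shared block again.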
For VM-specific data, a separate writable block device is used, containing directories such as /home, /usr/local, and /var. Executable files on this disk, such as browser plugins in the user's home directory or manually installed programs in /usr/local/bin, are a risk because this device is not discarded after use. However, a security audit becomes much easier because exploitable files are limited to this device.
The VM-specific devices (both the copy-on-write image and the private
data image) are encrypted with an AppVM-specific key, known only to the
AppVM and Dom0. This encryption is done by LUKS (Linux Unified Key
Setup). The read-only device used for the root filesystems is signed,
and each AppVM verifies this signature when using the device. To prevent an
attacker that compromised the storage domain from providing a modified kernel or initrd, the kernel and initrd files are explicitly specified in Dom0 to ensure that the initrd verifies the signature of the root filesystem before mounting it.
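The verify-before-mount ordering can be sketched as follows. Note the substitution: a real deployment would use a public-key signature, while this example uses a keyed HMAC from Python's standard library as a stand-in, purely to stay self-contained. The function names are hypothetical and do not come from the Qubes source.

```python
import hashlib
import hmac


def sign_image(key: bytes, image: bytes) -> bytes:
    """Stand-in for signing: compute an HMAC-SHA256 tag over the image."""
    return hmac.new(key, image, hashlib.sha256).digest()


def mount_root(image: bytes, tag: bytes, key: bytes) -> str:
    """Refuse to mount the root filesystem unless its tag verifies."""
    expected = sign_image(key, image)
    # compare_digest avoids leaking information through timing
    if not hmac.compare_digest(expected, tag):
        raise ValueError("root filesystem image failed verification")
    return "mounted read-only"
```

The point of the design is that the check runs in code the storage domain cannot replace, so a tampered image is rejected before anything in it is executed.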
Centralized updates of all AppVMs are possible because they share the same root filesystem: the only thing that's needed is a special UpdateVM virtual machine with read-write access to the root filesystem and the signing key to re-sign the device. This obviously makes UpdateVM a weak spot, so it should be secured with great care.
Marrying isolation with usability
This all sounds nice in theory, but if the system is too cumbersome, users will avoid it and fall back on a less secure setup. Fortunately, Qubes integrates the AppVMs seamlessly on the desktop: the various applications are shown on the same desktop, although they are hosted in different virtual machines. Copying and pasting text between virtual machines also works, but Qubes has taken care that AppVMs have no direct access to the inter-VM clipboard: the user has to initiate the copy/paste operation. Of course this could still lead to some data leaks, but it is up to the user to enforce a policy on inter-VM data flows.
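As a rough model of that clipboard design, the sketch below keeps the shared clipboard in an object that only user-triggered actions can touch. The class and method names are invented for illustration, and the per-VM clipboards are represented as a plain dictionary.

```python
class Dom0Clipboard:
    """Toy model: the inter-VM clipboard lives outside every AppVM."""

    def __init__(self):
        self._content = None

    def user_copy(self, vm_clipboards: dict, source: str) -> None:
        """Called only on an explicit user action in the trusted GUI."""
        self._content = vm_clipboards[source]

    def user_paste(self, vm_clipboards: dict, target: str) -> None:
        """Deliver the content to one VM, again only on user request."""
        if self._content is not None:
            vm_clipboards[target] = self._content
```

Nothing in this model lets one AppVM read another's clipboard by itself; data moves only when the user performs both steps.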
Transferring files between virtual machines is a bit more
cumbersome. The user has to open the Dolphin file manager in one VM, open
the context menu for the file, choose "Send to VM", enter the name of the
destination VM and then authorize the file transfer in the destination
VM. The files are never automatically copied into the destination's
filesystem, but made available in a virtual "pen drive" that is mounted in
the destination. The last step is copying the files from the virtual pen
drive to the right location in the VM's filesystem. As cumbersome as this
procedure is, this prevents an AppVM from forcing another AppVM to
automatically accept some files, which could lead to attacks.
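The transfer procedure can be modeled as three explicit steps. This is a simplified sketch with hypothetical names, not the mechanism Qubes implements; it only shows why a sender cannot place a file in the destination's filesystem.

```python
class AppVM:
    def __init__(self, name: str):
        self.name = name
        self.files = {}      # the VM's own filesystem: path -> content
        self.pen_drive = {}  # virtual pen drive holding received files
        self.pending = []    # transfers awaiting user authorization


def send_to_vm(src: AppVM, filename: str, dst: AppVM) -> None:
    """Step 1: the sender only queues a request at the destination."""
    dst.pending.append((src.name, filename, src.files[filename]))


def authorize(dst: AppVM, accept: bool) -> bool:
    """Step 2: the user accepts or rejects the oldest pending transfer."""
    _sender, filename, content = dst.pending.pop(0)
    if accept:
        dst.pen_drive[filename] = content  # staged, not yet installed
    return accept


def copy_from_pen_drive(dst: AppVM, filename: str, target: str) -> None:
    """Step 3: the user copies the file to its final location."""
    dst.files[target] = dst.pen_drive[filename]
```

Until the user completes the last two steps, the destination's own filesystem is unchanged, which is the property the procedure is designed to guarantee.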
The Qubes project is currently in alpha, and is not suitable for production use, although Joanna is using Qubes now as her main operating system. A stable version is expected to appear towards the end of this year. In the meantime, intrepid users can follow the installation guide, which covers the installation of Qubes on top of a Fedora 12 system with KDE.
After installing a template image that will be used for all the AppVMs, as well as the image for the network service VM, the user creates AppVMs with the qvm-create command. Icons for the AppVMs are then created in the KDE start menu of Dom0. When the user starts an application from an AppVM for the first time, Qubes automatically starts the AppVM before starting the application, which introduces a delay, but this delay disappears when the user starts a second application in the same AppVM. Obviously, Qubes needs a lot of RAM: 4 GB is recommended.
Each application gets a label, which is the name of the virtual machine, such as "work" or "shopping". Moreover, the window manager shows a colored frame around the application's window to show which AppVM it is part of. Applications are not allowed to maximize to full screen, which prevents a malicious application from spoofing the decorations of other AppVMs.
Most of the documentation about the Qubes project can be found in the wiki. The architecture document linked
above has a thorough explanation of the inner workings of Qubes (including an analysis of potential attack vectors), and there's also some practical information in a presentation by Joanna [PDF]. The source code is available in a Git repository and the project welcomes contributions.
The future
Qubes is still under development, and a lot of additions are planned. For example, there
will be an unprivileged storage domain — similar to the network
domain — that holds all storage drivers and filesystem code, and will get safe direct access to the disk controller. So even if a low-level storage driver or protocol stack gets compromised, it won't result in a full system compromise.
Another feature that is planned is support for Intel's Trusted Execution Technology. This will prevent modification of the system's boot code. So if the storage domain is compromised and a backdoor or rootkit is installed in the boot code, the Qubes system will become unbootable to protect itself.
Currently, the Qubes prototype is using Linux as the operating system running in the AppVMs, but there is nothing that would prevent support for other guest operating systems, such as Windows, as long as they support running as a Xen DomU. Of course Qubes must be adapted then, for example to support the shared root filesystem, but this should be possible. According to the FAQ, support for Windows-based AppVMs might become a commercial extension. In the same way, the general architecture could be used with any hypervisor, as long as it supports the features that the Qubes architecture requires, such as unprivileged driver domains. The developers are also thinking about a slimmed-down version of Xen for more security.
It's interesting to see that one of the best security breakers in the
world has now become a builder. The architecture of Qubes is
well-thought-out and based on years of system-level security research. The
concept of virtualization to isolate potentially unsafe processes is
certainly not new (look at FreeBSD jails, OpenSolaris zones, or Linux
containers), but it's refreshing to see it implemented in a (relatively)
user-friendly way. When the project reaches version 1 later this year,
security-conscious Linux users should definitely give it a try.