
Libvirt: what went wrong (and right)

September 16, 2015

This article was contributed by Paolo Bonzini



At KVM Forum 2015, Michal Prívozník presented a talk entitled "Libvirt: what did we do wrong?" (YouTube video and slides [PDF]). Libvirt is a toolkit to interact with hypervisors and to set up virtual machines and containers. It is used by oVirt and OpenStack, among others. It supports about a dozen hypervisors, either acting as a bridge to native management tools or (as is the case for KVM) launching and managing guests directly. The purpose of the talk was not only to detail the things that went wrong in libvirt, but also to explain why they happened and to hint at libvirt's strongest design points.

Libvirt is a complex project. It's a C library with a stable API and bindings for multiple other languages. It has a plugin interface to support multiple hypervisors and container solutions. It also handles the configuration of the host for functionality such as network bridges (with ebtables/iptables), iSCSI connections, and virtual Fibre Channel host bus adapters (with N_Port ID Virtualization or NPIV). Libvirt also supports remote access; API calls can transparently refer to remote machines and the library takes care of establishing a clear text socket connection, a TLS-protected one, or an SSH tunnel to the remote host.

The first thing that went wrong will be familiar to readers of the LWN kernel page: some APIs have no flags argument and thus are not extensible. For example, the virDomainShutdown() function takes a single argument, a pointer to a domain. However, there are multiple ways to shut down a virtual machine. At the very least there are two: through a system interface such as ACPI, and through a remote procedure call (RPC) to a program running in the guest. Therefore, a new API function, virDomainShutdownFlags(), had to be introduced.
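The pattern can be sketched in a few lines of C. This is only an illustration under stated assumptions: every `demo_` name below is hypothetical and none of it is libvirt's actual code. It shows a flagless entry point kept as a thin wrapper around a newer variant that takes a flags bitmask and rejects bits it does not know about, so that those bits can be given a meaning later.

```c
#include <stddef.h>

/* Hypothetical types and names: this mimics the pattern, not libvirt itself. */
typedef struct demo_domain {
    const char *name;
    unsigned int last_method;   /* records which shutdown method was chosen */
} demo_domain;

enum {
    DEMO_SHUTDOWN_DEFAULT     = 0,       /* let the implementation choose */
    DEMO_SHUTDOWN_ACPI        = 1 << 0,  /* system interface such as ACPI */
    DEMO_SHUTDOWN_GUEST_AGENT = 1 << 1   /* RPC to a program in the guest */
};

#define DEMO_SHUTDOWN_KNOWN \
    ((unsigned int)(DEMO_SHUTDOWN_ACPI | DEMO_SHUTDOWN_GUEST_AGENT))

/* Extensible variant: unknown bits are an error today, so they remain
 * available for new shutdown methods tomorrow. */
int demo_domain_shutdown_flags(demo_domain *dom, unsigned int flags)
{
    if (dom == NULL)
        return -1;
    if (flags & ~DEMO_SHUTDOWN_KNOWN)
        return -1;
    if (flags == DEMO_SHUTDOWN_DEFAULT)
        flags = DEMO_SHUTDOWN_ACPI;      /* assumed default for this sketch */
    dom->last_method = flags;
    return 0;
}

/* The original flagless function must stay for ABI stability; it can only
 * forward to the newer one with the default behavior. */
int demo_domain_shutdown(demo_domain *dom)
{
    return demo_domain_shutdown_flags(dom, DEMO_SHUTDOWN_DEFAULT);
}
```

Had the first function taken a flags argument from the start, the second entry point would not have been needed at all.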

Even when a flags argument was included, new arguments would turn out to be necessary in the future because the APIs were tied to specific hypervisors. For example, virDomainCreateWithFiles() had to be introduced to add the ability to pass file descriptors from the host into a Linux container. One might say that the use case is rather specific and that the new API is also tied to a specific virtualization mechanism.

API naming is not a strong point of libvirt in general. virDomainShutdownFlags() was mentioned already; by contrast, virDomainCreateWithFiles() has an extra "With". It turns out there's also a virDomainCreateWithFlags() function, and virDomainCreateWithFiles() has both a files and a flags argument.

Inconsistency in error conditions was another thing that went unnoticed until it was too late. free(NULL) in C is a no-operation, as is virDomainInterfaceFree(NULL) in libvirt. However, virDomainFree(NULL) is an error. Changing this is difficult at this point, because it is unknown whether any user relies on the current behavior.
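The two conventions are easy to put side by side. The following is a minimal sketch with hypothetical `demo_` types and functions, not libvirt's own implementation; it only shows why callers cannot treat the two cleanup paths the same way.

```c
#include <stdlib.h>

/* Hypothetical types: illustration only, not libvirt code. */
typedef struct demo_iface  { int index; } demo_iface;
typedef struct demo_object { int id; }    demo_object;

/* Lenient convention, like free(NULL): NULL is silently accepted. */
void demo_iface_free(demo_iface *iface)
{
    if (iface == NULL)
        return;
    free(iface);
}

/* Strict convention: NULL is reported to the caller as an error. */
int demo_object_free(demo_object *obj)
{
    if (obj == NULL)
        return -1;
    free(obj);
    return 0;
}
```

With the lenient form, cleanup code can call the function unconditionally on every exit path. With the strict form, each caller needs its own NULL check, and any program that inspects the return value now depends on the error being reported.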

A different category of mistake is APIs that are inherently subject to time-of-check-to-time-of-use races. Again, this is not a new mistake: the *at() family of system calls was introduced for the same reason in the Linux kernel. In libvirt, the "list active guests" operation was split in two parts: first, it would collect a list of integer IDs and, second, the user would request information for each ID. This API does not make it possible to obtain a consistent view at some point in time.
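A small model makes the race visible. The sketch below assumes a hypothetical in-memory `demo_host` table and invented function names; it is not how libvirt stores or lists guests. The two-step functions mirror the "IDs first, details later" design, while the third copies complete records in one call (a real implementation would hold a lock while copying).

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_GUESTS 8

/* Hypothetical data model: a fixed table of guests; id < 0 marks a free slot. */
typedef struct demo_guest {
    int  id;
    char name[16];
} demo_guest;

typedef struct demo_host {
    demo_guest guests[DEMO_MAX_GUESTS];
} demo_host;

/* Step 1 of the racy design: return only the IDs of the active guests. */
size_t demo_list_ids(const demo_host *host, int *ids, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; i < DEMO_MAX_GUESTS && n < max; i++)
        if (host->guests[i].id >= 0)
            ids[n++] = host->guests[i].id;
    return n;
}

/* Step 2: look a guest up by ID. Returns NULL if the guest has gone away
 * since step 1, which is exactly the time-of-check-to-time-of-use window. */
const demo_guest *demo_lookup(const demo_host *host, int id)
{
    if (id < 0)
        return NULL;
    for (size_t i = 0; i < DEMO_MAX_GUESTS; i++)
        if (host->guests[i].id == id)
            return &host->guests[i];
    return NULL;
}

/* Alternative: copy full records in a single call, so the caller gets one
 * consistent snapshot instead of IDs that may already be stale. */
size_t demo_list_guests(const demo_host *host, demo_guest *out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; i < DEMO_MAX_GUESTS && n < max; i++)
        if (host->guests[i].id >= 0)
            out[n++] = host->guests[i];
    return n;
}
```

If a guest stops between `demo_list_ids()` and `demo_lookup()`, the lookup fails for an ID the caller was just given; the snapshot from `demo_list_guests()` stays internally consistent even though it may be out of date.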

Some right choices

That said, many other things were done right. In particular, the external API was limited to the minimum necessary for consumers of the library. Libvirt has a large number of internal APIs, which really helps when developing a system library in C. One rather useful subsystem, virCommand, was built to spawn subprocesses with support for file descriptor passing, dropping privileges (UID/GID and capabilities), and more. The developers receive many requests for turning it into a public library. However, the API is good exactly because it was never part of the libvirt ABI, and could grow into a better design over time. Keeping these parts internal, thus, was a good idea, he said.

In a related area, libvirt keeps the internal structure of the objects (domains, networks, storage pools, etc.) hidden from libvirt clients. The clients have to use XML whenever they have to communicate information about the objects. The XML is part of the libvirt API and needs to be kept stable; however, it is much more flexible than C structures and it makes it easy to add new features without having to design tons of APIs. Again, strong separation between internal APIs (which use C structures) and external XML-based APIs simplifies growth.
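As a rough illustration of that separation, consider the sketch below. It assumes a hypothetical `demo_pool` object and an invented XML layout; neither the names nor the XML correspond to libvirt's real API or schema. Clients see only an opaque type plus a function that returns an XML description, so fields can be added to the private structure without changing any function signature.

```c
#include <stdio.h>
#include <stdlib.h>

/* ---- public side (hypothetical): clients see only an opaque type ---- */
typedef struct demo_pool demo_pool;

demo_pool *demo_pool_new(const char *name, unsigned long capacity_mib);
char *demo_pool_get_xml(const demo_pool *pool);   /* caller frees result */
void demo_pool_free(demo_pool *pool);

/* ---- private side: the layout can change without breaking clients ---- */
struct demo_pool {
    char name[32];
    unsigned long capacity_mib;
};

demo_pool *demo_pool_new(const char *name, unsigned long capacity_mib)
{
    demo_pool *pool = calloc(1, sizeof *pool);
    if (pool == NULL)
        return NULL;
    snprintf(pool->name, sizeof pool->name, "%s", name);  /* truncates */
    pool->capacity_mib = capacity_mib;
    return pool;
}

/* Serialize the object to XML. A new feature becomes a new element here
 * rather than a new C function. Note: no XML escaping in this sketch. */
char *demo_pool_get_xml(const demo_pool *pool)
{
    static const char fmt[] =
        "<pool><name>%s</name><capacity unit='MiB'>%lu</capacity></pool>";
    int len = snprintf(NULL, 0, fmt, pool->name, pool->capacity_mib);
    if (len < 0)
        return NULL;
    char *xml = malloc((size_t)len + 1);
    if (xml == NULL)
        return NULL;
    snprintf(xml, (size_t)len + 1, fmt, pool->name, pool->capacity_mib);
    return xml;
}

void demo_pool_free(demo_pool *pool)
{
    free(pool);
}
```

The cost of this design is that every client has to parse and generate XML, in exchange for an ABI that does not depend on structure layout.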

Some of the XML elements are admittedly badly named and needlessly hypervisor-specific; these issues come up often enough that Prívozník "could talk for a really long time" about them. In addition, for a long time libvirt did not even check the XML against a schema, which prevented the API from failing fast. However, the use of XML has become a signature feature of the libvirt API and Prívozník considers it one of its strong points.

What was the source of the mistakes? Mostly, the lack of a global view of what was happening with multiple hypervisors. Some of the mistakes above can be explained by Libvirt's beginnings as a C wrapper for the Xen RPC. Even today, communication is insufficient about the requirements of different hypervisors and plans for future development. The same problems are solved at different times for different hypervisors; if whoever comes second has additional or different requirements, the resulting API is often suboptimal.

To some extent, this is a consequence of the open-source model, he said. Collaborative development doesn't encourage traditional approaches to requirements design; "agile methodologies" are not easy to apply, either, for a project with a large number of individual contributors scattered around the globe.

Some of the principles of agile methodologies and extreme programming are indeed applied to Libvirt, however. The project developers are not scared of refactoring internal APIs whenever it becomes useful. They write a large number of unit and integration tests and have a monthly release cycle so that consumers do not have to use the Git repository.

A lot of this can be achieved simply by using some discipline and, especially, some common sense, he said. And even if you get some things wrong, it is much more important to get the important points right. In the case of libvirt, the strict boundary between internal and external interfaces, the separation of concerns between its C and XML APIs, and the focus on tests from the beginning were enough for the library to grow at a fast pace and to remain maintainable after almost ten years.


Libvirt: what went wrong (and right)

Posted Sep 17, 2015 16:51 UTC (Thu) by danpb (subscriber, #4831)

> In a related area, libvirt keeps the internal structure of the objects (domains, networks, storage pools, etc.)
> hidden to libvirt clients. The clients have to use XML whenever they have to communicate information about the
> objects. The XML is part of the libvirt API and needs to be kept stable; however, it is much more flexible than
> C structures and it makes it easy to add new features without having to design tons of APIs. Again, strong
> separation between internal APIs (which use C structures) and external XML-based APIs simplifies growth.

While this is pretty accurate wrt the libvirt core library design philosophy, we do also realize that almost everyone using libvirt hates having to parse & format XML :-) As such, in recent times we have worked on a separate add-on library libvirt-gconfig.so (part of the libvirt-glib package) that provides a formal object oriented model for the XML documents. So applications can in fact use libvirt and entirely avoid using XML if they so desire. One might ask why this is a separate library, rather than being part of the core libvirt.so. The reason for this is that libvirt.so intends to have a 100% stable ABI forever. We wanted to have a looser ABI guarantee for the libvirt-gconfig library, so if it becomes desirable/necessary, we can change the API for the OO XML modelling in a non-back compat manner.


Copyright © 2015, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.