Development
Libvirt: what went wrong (and right)
At KVM Forum 2015, Michal Prívozník presented a talk entitled "Libvirt: what did we do wrong?" (YouTube video and slides [PDF]). Libvirt is a toolkit to interact with hypervisors and to set up virtual machines and containers. It is used by oVirt and OpenStack, among others. It supports about a dozen hypervisors, either as a bridge to native management tools or (as is the case for KVM) launching and managing guests directly. The purpose of the talk was not only to detail the things that went wrong in libvirt, but also to explain why they happened and to hint at libvirt's strongest design points.
Libvirt is a complex project. It's a C library with a stable API and bindings for multiple other languages. It has a plugin interface to support multiple hypervisors and container solutions. It also handles the configuration of the host for functionality such as network bridges (with ebtables/iptables), iSCSI connections, and virtual Fibre Channel host bus adapters (with N_Port ID Virtualization or NPIV). Libvirt also supports remote access; API calls can transparently refer to remote machines and the library takes care of establishing a clear text socket connection, a TLS-protected one, or an SSH tunnel to the remote host.
The first thing that went wrong will be familiar to readers of the LWN kernel page: some APIs have no flags argument and thus are not extensible. For example, the virDomainShutdown() function takes a single argument, a domain pointer. However, there are multiple ways to shut down a virtual machine. At the very least there are two: through a system interface such as ACPI, and through a remote procedure call (RPC) to a program running in a guest. Therefore a new API function was introduced called virDomainShutdownFlags().
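The extensibility problem can be illustrated with a minimal C sketch. All of the names here (demo_domain, demo_domain_shutdown_flags(), the DEMO_SHUTDOWN_* constants) are hypothetical and are not libvirt's API; the sketch also assumes a policy of rejecting unknown flag bits so that they can be given a meaning later without changing behavior for existing callers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not libvirt's API. */
#define DEMO_SHUTDOWN_DEFAULT      0u         /* let the implementation choose */
#define DEMO_SHUTDOWN_ACPI         (1u << 0)  /* system interface such as ACPI */
#define DEMO_SHUTDOWN_GUEST_AGENT  (1u << 1)  /* RPC to a program in the guest */
#define DEMO_SHUTDOWN_KNOWN  (DEMO_SHUTDOWN_ACPI | DEMO_SHUTDOWN_GUEST_AGENT)

enum demo_method { DEMO_METHOD_NONE, DEMO_METHOD_ACPI, DEMO_METHOD_AGENT };

struct demo_domain {
    const char *name;
    enum demo_method last_method;  /* records the path taken, for the demo */
};

/* Stand-in for the real work: just remember which method was used. */
static int demo_do_shutdown(struct demo_domain *dom, enum demo_method method)
{
    assert(method != DEMO_METHOD_NONE);  /* callers must pick a method first */
    dom->last_method = method;
    return 0;
}

/* Extensible variant: the flags word selects the shutdown method. */
int demo_domain_shutdown_flags(struct demo_domain *dom, unsigned int flags)
{
    if (dom == NULL)
        return -1;
    if (flags & ~DEMO_SHUTDOWN_KNOWN)
        return -1;  /* unknown bits are an error, so they stay available */
    if (flags & DEMO_SHUTDOWN_GUEST_AGENT)
        return demo_do_shutdown(dom, DEMO_METHOD_AGENT);
    return demo_do_shutdown(dom, DEMO_METHOD_ACPI);  /* default and explicit ACPI */
}

/* Original, non-extensible entry point, kept as a thin wrapper. */
int demo_domain_shutdown(struct demo_domain *dom)
{
    return demo_domain_shutdown_flags(dom, DEMO_SHUTDOWN_DEFAULT);
}
```

Had the first function taken a flags argument from the start, the second entry point would not have been needed; a new shutdown method would only have required a new flag bit.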
Even when a flags argument was included, new arguments would turn out to be necessary in the future because the APIs were tied to specific hypervisors. For example, virDomainCreateWithFiles() had to be introduced to add the ability to pass file descriptors from the host into a Linux container. One might say that the use case is rather specific and that the new API is also tied to a specific virtualization mechanism.
In general, API names are not a strong point of libvirt. virDomainShutdownFlags() was mentioned already; virDomainCreateWithFiles() has an extra "With". It turns out there's also a virDomainCreateWithFlags() function, and virDomainCreateWithFiles() has both a files and a flags argument.
Inconsistency in error conditions was another thing that went unnoticed until it was too late. free(NULL) in C is a no-operation, as is virDomainInterfaceFree(NULL) in libvirt. However, virDomainFree(NULL) is an error. Changing this is difficult at this point, because it is unknown if any user is relying on this.
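A small C sketch shows why this kind of inconsistency is awkward for callers. The types and functions here (demo_iface, demo_dom, demo_cleanup() and so on) are hypothetical stand-ins, not libvirt code: one free function follows the free() convention and accepts NULL, while the other reports NULL as an error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object types; not libvirt's. */
struct demo_iface { char *name; };
struct demo_dom   { char *name; };

/* Allocate a demo domain; returns NULL on allocation failure. */
struct demo_dom *demo_dom_new(const char *name)
{
    assert(name != NULL);
    struct demo_dom *dom = malloc(sizeof *dom);
    if (dom == NULL)
        return NULL;
    size_t len = strlen(name) + 1;
    dom->name = malloc(len);
    if (dom->name == NULL) {
        free(dom);
        return NULL;
    }
    memcpy(dom->name, name, len);
    return dom;
}

/* Lenient style, like free(): NULL is silently accepted. */
void demo_iface_free(struct demo_iface *iface)
{
    if (iface == NULL)
        return;
    free(iface->name);
    free(iface);
}

/* Strict style: NULL is reported as an error. */
int demo_dom_free(struct demo_dom *dom)
{
    if (dom == NULL)
        return -1;
    free(dom->name);
    free(dom);
    return 0;
}

/* A shared cleanup path has to remember which function is which. */
int demo_cleanup(struct demo_iface *iface, struct demo_dom *dom)
{
    demo_iface_free(iface);                 /* fine even when iface is NULL */
    if (dom != NULL && demo_dom_free(dom) < 0)
        return -1;                          /* guard needed only for this type */
    return 0;
}
```

Making the strict function lenient later would look harmless, but any caller that checks the return value to detect a NULL handle would silently change behavior, which is the compatibility concern described above.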
A different category of mistake is APIs that are inherently subject to time-of-check-to-time-of-use races. Again, this is not a new mistake: the *at() family of system calls was introduced for the same reason in the Linux kernel. In libvirt, the "list active guests" operation was split in two parts: first, it would collect a list of integer IDs and, second, the user would request information for each ID. This API does not make it possible to obtain a consistent view at some point in time.
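The race can be sketched in C with a toy, single-threaded registry; everything here (demo_registry, demo_list_ids(), demo_lookup_by_id(), demo_list_all()) is hypothetical and only mimics the shape of the two-step API described above. A guest that stops between the two steps makes the second step fail, whereas a single call that returns full records gives the caller one consistent view.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX 8

/* Hypothetical record and registry types; not libvirt's. */
struct demo_info { int id; char name[16]; int active; };
struct demo_registry { struct demo_info doms[DEMO_MAX]; size_t count; };

/* Step 1 of the racy API: return only the IDs of active guests. */
size_t demo_list_ids(const struct demo_registry *reg, int *ids, size_t max)
{
    assert(reg != NULL && ids != NULL);
    size_t n = 0;
    for (size_t i = 0; i < reg->count && n < max; i++)
        if (reg->doms[i].active)
            ids[n++] = reg->doms[i].id;
    return n;
}

/* Step 2: look one ID up. Fails if the guest stopped in the meantime. */
int demo_lookup_by_id(const struct demo_registry *reg, int id,
                      struct demo_info *out)
{
    assert(reg != NULL && out != NULL);
    for (size_t i = 0; i < reg->count; i++) {
        if (reg->doms[i].active && reg->doms[i].id == id) {
            *out = reg->doms[i];
            return 0;
        }
    }
    return -1;
}

/* Single-call alternative: copy the full records in one pass. A real,
 * multi-threaded implementation would hold a lock for the whole loop. */
size_t demo_list_all(const struct demo_registry *reg,
                     struct demo_info *out, size_t max)
{
    assert(reg != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < reg->count && n < max; i++)
        if (reg->doms[i].active)
            out[n++] = reg->doms[i];
    return n;
}
```

With the two-step functions, a caller has to treat every lookup failure as "the guest may simply be gone"; with the single call, the returned array is already complete.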
Some right choices
That said, many other things were done right. In particular, the external API was limited to the minimum necessary for consumers of the library. Libvirt has a large number of internal APIs, which really helps when developing a system library in C. One rather useful subsystem, virCommand, was built to spawn subprocesses with support for file descriptor passing, dropping privileges (UID/GID and capabilities), and more. The developers receive many requests for turning it into a public library. However, the API is good exactly because it was never part of the libvirt ABI, and could grow into a better design over time. Keeping these parts internal, thus, was a good idea, he said.
In a related area, libvirt keeps the internal structure of the objects (domains, networks, storage pools, etc.) hidden from libvirt clients. The clients have to use XML whenever they have to communicate information about the objects. The XML is part of the libvirt API and needs to be kept stable; however, it is much more flexible than C structures and it makes it easy to add new features without having to design tons of APIs. Again, strong separation between internal APIs (which use C structures) and external XML-based APIs simplifies growth.
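The flexibility argument can be made concrete with a rough C sketch. The structures and functions here (demo_desc_v1, demo_desc_v2, demo_define()) are hypothetical, and demo_get_uint() is a naive substring lookup rather than a real XML parser; it only handles plain elements without attributes. The point is that growing a public structure changes its size, while a document-based entry point keeps the same signature and can treat new elements as optional.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Struct-based API: adding a field changes sizeof(), and thus the ABI. */
struct demo_desc_v1 { char name[32]; unsigned long memory_kib; };
struct demo_desc_v2 { char name[32]; unsigned long memory_kib; unsigned int vcpus; };

/* Internal representation, free to change because clients never see it. */
struct demo_config { unsigned int memory_kib; unsigned int vcpus; };

/* Naive lookup of <tag>NUMBER; NOT a real XML parser, just a sketch. */
static int demo_get_uint(const char *doc, const char *tag, unsigned int *out)
{
    char open[40];
    int len = snprintf(open, sizeof open, "<%s>", tag);
    if (len < 0 || (size_t)len >= sizeof open)
        return -1;
    const char *p = strstr(doc, open);
    if (p == NULL)
        return -1;
    return sscanf(p + len, "%u", out) == 1 ? 0 : -1;
}

/* Document-based API: the signature stays the same as features are added. */
int demo_define(const char *doc, struct demo_config *cfg)
{
    assert(doc != NULL && cfg != NULL);
    if (demo_get_uint(doc, "memory", &cfg->memory_kib) < 0)
        return -1;       /* required element */
    if (demo_get_uint(doc, "vcpu", &cfg->vcpus) < 0)
        cfg->vcpus = 1;  /* element added later; older documents still work */
    return 0;
}
```

A client that was written before the second element existed keeps working unchanged, whereas a client compiled against the smaller structure would pass a buffer of the wrong size to a function expecting the larger one.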
Some of the XML elements are surely badly named and uselessly hypervisor-specific; these issues come up often enough that Prívozník "could talk for a really long time" about them. In addition, for a long time libvirt was not even checking the XML against a schema, preventing the API from failing fast. However, the use of XML has become a signature feature of the libvirt API and Prívozník considers it one of its strong points.
What was the source of the mistakes? Mostly, the lack of a global view of what was happening with multiple hypervisors. Some of the mistakes above can be explained by libvirt's beginnings as a C wrapper for the Xen RPC. Even today, there is insufficient communication about the requirements of different hypervisors and plans for future development. The same problems are solved at different times for different hypervisors; if whoever comes second has additional or different requirements, the resulting API is often suboptimal.
To some extent, this is a consequence of the open-source model, he said. Collaborative development doesn't encourage traditional approaches to requirements design; "agile methodologies" are not easy to apply, either, for a project with a large number of individual contributors scattered around the globe.
Some of the principles of agile methodologies and extreme programming are indeed applied to libvirt, however. The project developers are not scared of refactoring internal APIs whenever it becomes useful. They write a large number of unit and integration tests and have a monthly release cycle so that consumers do not have to use the Git repository.
A lot of this can be achieved simply by using some discipline and, especially, some common sense, he said. And even if you get some things wrong, it is much more important to get the important points right. In the case of libvirt, the strict boundary between internal and external interfaces, the separation of concerns between its C and XML APIs, and the focus on tests from the beginning were enough for the library to grow at a fast pace and to remain maintainable after almost ten years.
Brief items
Quotes of the week
Discover how to create and use variables that aren't inside of an object hierarchy. Learn about "functions," which are like methods but more generally useful. Prerequisite: Any course that used the term "abstract base class."
Mailpile .95, Mailpile .98, Mailpile NT, Mailpile/X ...
KDE Frameworks 5.14.0 released
Version 5.14 of the KDE Frameworks libraries is available. Changes in this release include a number of renamed private libraries, the refactoring of many settings in KActivities, and numerous fixes in the Plasma and KIO libraries.
Python 3.5.0 released
The Python 3.5.0 release is out. "Python 3.5.0 is the newest version of the Python language, and it contains many exciting new features and optimizations." See the what's new page and this LWN article for details on the new features in this release.
Bassi: Who wrote GTK+ (Reprise)
At his blog, GNOME's Emmanuele Bassi has published some statistics on developer contributions to GTK+, dating back to the 2.0 release cycle. He also provides some context for interpreting the raw numbers. "Disparity in the length of the development cycles explains why the 2.12 and 2.14 cycles, which lasted a year, represent an anomaly in terms of contributors (148 and 140, respectively) and in terms of absolute lines changed. The reduced activity between 2.20 and 2.24.0 is easily attributable to the fact that people were working hard on the 2.90 branch that would become 3.0." This historical analysis is a follow-up to Bassi's development statistics about GTK+ 3.18, also published this week.
Newsletters and articles
Development newsletters from the past week
- What's cooking in git.git (September 14)
- LLVM Weekly (September 14)
- OCaml Weekly News (September 15)
- Perl Weekly (September 14)
- PostgreSQL Weekly News (September 13)
- This Week in Rust (September 14)
- Wikimedia Tech News (September 14)
Reavy: WebRTC privacy
At her personal blog, Mozilla's Maire Reavy addresses some common privacy concerns users have brought up regarding WebRTC. The most basic issue is that engaging in a WebRTC chat session can reveal one user's public IP address to the other user in the chat session. As it turns out, there are ways to mitigate the risk. "We’ve added several new privacy controls for WebRTC in Firefox 42. These controls allow add-on developers to build features that give users the ability to selectively disable all or part of WebRTC, and which allow finer control over what information is exposed to JS applications, especially your IP address or addresses. None of these features are enabled by default due to the considerable cost of enabling them to most users (most of them can be also enabled via about:config)."
Page editor: Nathan Willis