July 2, 2007
This article was contributed by Donnie Berkholz
Package management is one of the key defining characteristics of a
distribution. The question of where package management is going should be of
interest to anyone involved with a distribution or administering a
Unix-based box of any sort. In many distributions, package management
appears to have reached a near standstill. For example, the RPM format has
hardly changed in years. In Gentoo, however, ongoing development of package
management is so popular that three separate, actively developed package
managers exist.
Over the past couple of years, many developers have grown increasingly
dissatisfied with Gentoo's default package manager, Portage. Portage is a
high-level interface to Gentoo's package format, a series of scripts called
ebuilds. Unfortunately, Portage wasn't planned out in the first place, and
features have been added ad hoc over the course of many years. Today, it's
extremely difficult to add features to Portage or interface with it because
its internals are tightly interdependent and it has virtually no
API. Consequently, two groups of developers decided to start fresh with two
separate projects: Paludis and Pkgcore.
Paludis is implemented in C++ and bash, with a C++ API and an optional
Ruby scripting API. One of the
biggest features that Portage lacks but Paludis supports is the ability to
remove all unused dependencies of a package when removing that
package. Also, it has a much more flexible configuration system,
user-definable hooks into the build process, user-defined sets of packages,
and clean support for multiple repositories. In Portage, secondary
repositories (called "overlays") are second-class citizens. Furthermore, Paludis
added a number of features Gentoo developers have been requesting for years
that add flexibility to how dependencies can be specified. Paludis contains
a number of modules, including:
- paludis—package installation, removal, and queries
- contrarius—a client for building cross-compiling toolchains
- inquisitio—a package searching client
- qualudis—a quality assurance tool for ebuilds
- adjutrix—a tool for architecture teams
Paludis has included experimental support for Portage's configuration
format since the end of March 2007. This means you can try it out without
wasting time migrating config files over, which significantly lowers its
barrier to adoption.
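To make the unused-dependency removal described above concrete, here is a minimal Python sketch of the idea. It is not Paludis code (Paludis is written in C++); the package names, the `DEPENDS` mapping, and the `removable_with` function are all hypothetical. It assumes that anything still reachable from the remaining explicitly installed packages must be kept.

```python
# Hypothetical installed-package graph: package -> its direct dependencies.
DEPENDS = {
    "editor": {"libgui", "libspell"},
    "browser": {"libgui", "libnet"},
    "libgui": {"libdraw"},
    "libspell": set(),
    "libnet": set(),
    "libdraw": set(),
}


def reachable(roots, depends):
    """Return every package reachable from roots by following dependencies."""
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(depends.get(pkg, ()))
    return seen


def removable_with(target, wanted, depends):
    """Return target plus the dependencies nothing else still needs.

    `wanted` is the set of packages the user installed explicitly.
    """
    still_wanted = set(wanted) - {target}
    still_needed = reachable(still_wanted, depends)
    return reachable({target}, depends) - still_needed


if __name__ == "__main__":
    wanted = {"editor", "browser"}
    # libgui and libdraw stay because browser still needs them.
    print(sorted(removable_with("editor", wanted, DEPENDS)))
```

Removing "editor" in this toy graph also removes "libspell", while "libgui" and "libdraw" survive because "browser" still depends on them. A real package manager would also have to account for build-time versus run-time dependencies, which this sketch ignores.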
Pkgcore is implemented in Python, the same language as Portage, with a few
time-critical modules in C. It was designed so that there's no reason it has
to be Gentoo-specific—it could easily support other package
formats. Its philosophy is to maintain complete backwards compatibility with
Portage while recoding it in a clean, maintainable, extensible fashion. Some
of the code written for Pkgcore has been pulled back into Portage, such as
the cache-handling code. Pkgcore's 0.3 release finally reached a point of
usability because it added frontends with comprehensible output: one
that mirrors Portage and another that mirrors Paludis. Despite being in
Python, it runs shockingly fast—it is a good example that not all
programs written in high-level languages need be slow. The Pkgcore API is also viewable online. Some
of the utilities Pkgcore includes are:
- pmerge—package merging and unmerging
- pmaint—repository maintenance: syncing, etc.
- pquery—package searching
- pcheck—QA checker for ebuilds
A couple of interesting features Pkgcore has are N-parent inheritance of
eclasses (a Portage feature that allows inheritance to be used in bash code)
and an ebuild daemon. The daemon has a number of benefits including
near-linear scaling to multiple processors for some tasks—Pkgcore's
home page cites ~90% scaling on a quad Pentium 3. And of course, one benefit
over Paludis is that you don't need to use the occasionally less-than-speedy
g++ to compile it.
Pkgcore and Paludis seem fairly well-matched in the features
department. They both support sets, the additional dependency flexibility,
integrated checking for security vulnerabilities, and Portage's on-disk
format. Another useful feature they both support is the ability to restrict
which packages may be installed based on their licenses. This gives users
the choice of how free they want their installations to be, from
FSF-compliant to packed with proprietary software. Both projects have active
teams of between 5 and 10 developers. In comparison, Portage is primarily
maintained by
potential masochist Zac Medico—a glance through the ChangeLog showed
that he was the only committer since January.
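The license-based restriction mentioned above amounts to a simple filter. The Python sketch below is only an illustration, not how Paludis or Pkgcore implement it. The `PACKAGES` table, its license labels, and the `installable` function are hypothetical, and it assumes a package is acceptable only if every one of its licenses has been accepted by the user.

```python
# Hypothetical metadata: package name -> set of licenses it is shipped under.
PACKAGES = {
    "text-tools": {"GPL-2"},
    "compress-lib": {"ZLIB"},
    "media-plugin": {"Proprietary-EULA"},
    "mixed-app": {"GPL-2", "Proprietary-EULA"},
}


def installable(packages, accepted_licenses):
    """Return the sorted names of packages whose licenses are all accepted."""
    return sorted(
        name
        for name, licenses in packages.items()
        if licenses <= accepted_licenses  # subset test: every license accepted
    )


if __name__ == "__main__":
    free_only = {"GPL-2", "ZLIB"}
    print(installable(PACKAGES, free_only))
```

With only the two free licenses accepted, "media-plugin" and "mixed-app" are filtered out; adding "Proprietary-EULA" to the accepted set would allow all four.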
The advent of multiple package managers accelerated Gentoo's need to adopt a
formal Package Manager Specification. In the past, new features or breaks in
backwards compatibility in Portage simply forced a wait of roughly 6 months,
at which point it was assumed that nobody was using those old Portage
versions anymore. Problems with that should be readily apparent. When new
package managers came along, additional questions came up of which aspects
of ebuild behavior were intrinsic behavior and which were Portage-specific
details. With only one implementation and no spec, it's hard to draw a line.
Together, these two developments motivated the creation of an Ebuild API, or
EAPI. The current generation will be EAPI=0, which is being documented in a
formal specification. Once this spec is done, Gentoo will have a process in
place for handling ebuilds that use new features or break compatibility:
each ebuild will declare the EAPI it supports. This will enable near-instant
use of new features that Gentoo developers have been awaiting for years, as
well as agreement on how all these package managers must behave in common
and where they have flexibility to differ.
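As a rough illustration of how per-ebuild EAPI declarations could let a package manager cope with new features, consider the Python sketch below. It is a simplification under stated assumptions: the function names and the `SUPPORTED_EAPIS` set are hypothetical, the EAPI is read with plain string matching (a real package manager evaluates the ebuild with bash), and an ebuild that declares no EAPI is assumed to belong to the current generation, EAPI 0.

```python
# Hypothetical: the EAPIs this imaginary package manager understands.
SUPPORTED_EAPIS = {"0"}


def declared_eapi(ebuild_text):
    """Return the EAPI an ebuild declares, or "0" if it declares none."""
    for line in ebuild_text.splitlines():
        line = line.strip()
        if line.startswith("EAPI="):
            # Drop any trailing comment, then surrounding quotes.
            value = line[len("EAPI="):].split("#", 1)[0].strip()
            return value.strip("\"'")
    return "0"  # assumption: no declaration means the current generation


def can_handle(ebuild_text, supported=SUPPORTED_EAPIS):
    """True if the package manager understands the ebuild's declared EAPI."""
    return declared_eapi(ebuild_text) in supported


if __name__ == "__main__":
    old_style = 'DESCRIPTION="An ebuild with no EAPI line"\n'
    new_style = 'EAPI="1"\nDESCRIPTION="Uses newer features"\n'
    for text in (old_style, new_style):
        print(declared_eapi(text), can_handle(text))
```

The point of the design is that an older package manager can skip an ebuild whose EAPI it does not recognize instead of misinterpreting it, so new features need not wait for old versions to fall out of use.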