Specifix, a company founded by a
number of early Red Hat developers, recently came out of hiding. At the
2004 Ottawa Linux Symposium, Erik Troan gave a presentation on Conary, the
company's system for package, repository, and distribution management. It
was a technical talk from beginning to end; Erik would not talk
about Specifix's business model even when asked (though he offered to do so
in private). If nothing else, he understands what the OLS crowd wants
to hear.
Package management systems have come into use in almost every distribution
out there. They are a clear step up from what came before, but, as Erik
pointed out, significant problems have been building for years. These
include:
- Repositories are an afterthought. A typical repository is a simple
collection of files in whatever package format is being used, perhaps
with a bit of metadata.
- The version scheme used by most package managers follows a straight
line model; there is no provision for branches. That makes it hard,
for example, to determine which version of a package is appropriate
for a specific release of a given distribution.
- Packages contain scripts that handle the parts of the installation and
removal process that go beyond the simple management of files. These
scripts tend to contain a lot of boilerplate, and are replicated in
every package file. Bugs, too, are replicated, and there is no one
place to go to fix them. The scripts are also not portable across
distributions (even those using the same package format) and cannot be
customized for an individual site's needs.
Conary was developed to address the above limitations and to make it
easy for users to create their own customized distributions.
In the simplest sense, one can think of Conary as a
package management system with a more consistent view of objects from the
repository level down to individual files, combined with a version
management scheme.
Conary treats files as "first class objects," which are managed by the
framework as a whole. Files have a unique ID and a version history; they
also have a set of attributes. One of those attributes is the file's
location in the filesystem; moving a file is a simple matter of changing
that attribute.
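
To make the idea concrete, here is a minimal sketch of a file treated as a first-class object. It is not Conary's actual data model; the class name `TrackedFile`, its fields, and the sample ID and paths are all invented for illustration. The point is only that identity lives in the ID, so a move is just another recorded attribute change.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedFile:
    """Hypothetical stand-in for a file managed as a first-class object."""
    file_id: str                                   # stable identity, independent of path
    attrs: dict                                    # current attributes, including "path"
    history: list = field(default_factory=list)    # earlier attribute snapshots

    def update(self, **changes):
        """Save a snapshot of the current attributes, then apply the changes."""
        self.history.append(dict(self.attrs))
        self.attrs.update(changes)


f = TrackedFile("a1b2", {"path": "/usr/bin/bzip2", "mode": 0o755})
f.update(path="/bin/bzip2")   # a "move" only changes the path attribute
```

After the update, the file keeps its ID and its history holds the old location, which is what lets a tool follow a file across renames.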
A "trove" is a container holding one or more files and other troves. Files
are contained by reference. A "component" is a collection of files, by
reference. Example components listed by Erik for the bzip2 package might
be bzip2:runtime (binary files to run the program),
bzip2:lib, bzip2:doc, and, of course,
bzip2:source. Components can be aggregated
together into packages. Both components and packages are considered to be
"troves," for what it's worth.
Version strings are hung onto everything; Specifix has added some
complexity to the versioning system, though. Each version string includes
the repository name, a namespace (think of it as a distribution name), a
branch name (for the creation of trees in the version space), the upstream
package version, and a two-part local revision number. Needless to say,
the version strings get long, but the system hides the full string most of
the time. Creating versions in this way allows the system to easily
determine which version of a package is the newest, which version of which
distribution it was built for, and so on.
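
As a rough illustration of why structured version strings help, the sketch below models the listed parts as a tuple and picks the newest version on a given branch. Everything here is an assumption made for the example: the field names (including `source_rev` and `build_rev` for the two-part local revision), the sample repository name, and the rule that upstream versions are dotted integers. Conary's real string format and ordering rules may differ.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical decomposition of a full version string."""
    repository: str
    namespace: str     # roughly, the distribution name
    branch: str
    upstream: str      # assumed here to be dotted integers, e.g. "1.0.2"
    source_rev: int    # first half of the local revision (assumed name)
    build_rev: int     # second half of the local revision (assumed name)

    def sort_key(self):
        # Compare upstream versions numerically, then the local revision.
        upstream = tuple(int(part) for part in self.upstream.split("."))
        return (upstream, self.source_rev, self.build_rev)


def newest_on_branch(versions, namespace, branch):
    """Return the newest version on one branch, or None if there is none."""
    candidates = [v for v in versions
                  if v.namespace == namespace and v.branch == branch]
    return max(candidates, key=Version.sort_key, default=None)


versions = [
    Version("repo.example.com", "spx", "trunk", "1.0.2", 1, 1),
    Version("repo.example.com", "spx", "trunk", "1.0.10", 1, 2),
    Version("repo.example.com", "spx", "release1", "1.0.12", 3, 1),
]
latest = newest_on_branch(versions, "spx", "trunk")
```

Because the branch is part of the version, the `release1` entry never competes with trunk versions, even though its upstream number is higher.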
Branching is done by adding a branch name to the version string. Branching
allows the tracking of versions of packages which were shipped with a
specific distribution, along with updates to those packages. There is also
a special type of branch called a "shadow" which tracks changes to the
trunk it was branched from. Essentially, the shadow is automatically
merged with each new version of the trunk it is following. This feature
would be useful for somebody maintaining a derivative distribution; they
want to keep up with what the source distribution is doing without losing
track of their own changes. The only problem with shadows is that, like a
number of other Conary features, they are not actually implemented yet.
"Flavors" are another Conary feature; they seem to be patterned after
Gentoo's "USE flags." A flavor is a set of configuration options
describing how all packages are to be built. This feature is used for
multiple architecture support, or for building versions of distributions
with different feature sets (e.g. creating a distribution without PAM
support). Multiple flavors of a package can be installed on a system if
they don't conflict with each other; this allows, for example, the
installation of 32-bit libraries on x86-64 systems.
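
The coexistence rule can be sketched as follows. The talk did not spell out what counts as a conflict, so this example assumes the simplest plausible rule: two builds conflict when they would install a file at the same path. The `Build` class, the flavor flag names, and the library paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    """Hypothetical build of a package under one flavor."""
    package: str
    flavor: frozenset   # configuration flags, e.g. {"x86_64"}
    paths: frozenset    # files this build would install


def can_install(new, installed):
    """Assumed rule: a build may be added if it shares no path with the others."""
    return all(new.paths.isdisjoint(other.paths) for other in installed)


lib64 = Build("zlib", frozenset({"x86_64"}), frozenset({"/usr/lib64/libz.so.1"}))
lib32 = Build("zlib", frozenset({"x86"}), frozenset({"/usr/lib/libz.so.1"}))
lib32_nopam = Build("zlib", frozenset({"x86", "nopam"}),
                    frozenset({"/usr/lib/libz.so.1"}))

side_by_side = can_install(lib32, [lib64])             # different paths
clash = can_install(lib32_nopam, [lib64, lib32])       # same path as lib32
```

Under this rule the 32-bit and 64-bit libraries sit side by side, while two 32-bit builds that differ only in a feature flag cannot.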
Then, there is the concept of "changesets." A changeset is a collection of
modifications to files (including attribute changes) and the troves which
contain them. A changeset is, essentially, a patch to a package or a
distribution. Changesets, which track only changes, can be much smaller
than the packages they describe, and can thus be an efficient way of
distributing updates. Changesets describe changes to configuration files
in diff format, which often allows them to be merged automatically
with local changes. A system administrator can also create a changeset
describing his or her local changes to the system; that changeset can then
be used for merging with updates, or replicating the system elsewhere.
Local changesets can also be used for version control and the tracking of
system changes.
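
A minimal sketch of the changeset idea, restricted to file attributes, might look like this. It is not Conary's changeset format: the dictionary layout, function names, and sample files are invented. Content diffs for configuration files are left out, and both versions are assumed to use the same attribute keys.

```python
def make_changeset(old, new):
    """Record only what differs between two {file_id: attrs} snapshots."""
    changeset = {"added": {}, "removed": [], "changed": {}}
    for file_id, attrs in new.items():
        if file_id not in old:
            changeset["added"][file_id] = dict(attrs)
        else:
            delta = {k: v for k, v in attrs.items() if old[file_id].get(k) != v}
            if delta:
                changeset["changed"][file_id] = delta
    changeset["removed"] = sorted(fid for fid in old if fid not in new)
    return changeset


def apply_changeset(state, changeset):
    """Apply a changeset to a snapshot, returning a new snapshot."""
    result = {fid: dict(attrs) for fid, attrs in state.items()
              if fid not in changeset["removed"]}
    for file_id, attrs in changeset["added"].items():
        result[file_id] = dict(attrs)
    for file_id, delta in changeset["changed"].items():
        result[file_id].update(delta)
    return result


old = {"f1": {"path": "/etc/foo.conf", "mode": 0o644},
       "f2": {"path": "/usr/bin/foo", "mode": 0o755}}
new = {"f1": {"path": "/etc/foo/foo.conf", "mode": 0o644},
       "f3": {"path": "/usr/share/doc/foo", "mode": 0o644}}
update = make_changeset(old, new)

# A site that tightened the mode of f1 still picks up the upstream move.
local = {"f1": {"path": "/etc/foo.conf", "mode": 0o600},
         "f2": {"path": "/usr/bin/foo", "mode": 0o755}}
merged = apply_changeset(local, update)
```

Since the changeset carries only the path change for `f1`, the locally modified mode survives the update; that is the kind of merge that shipping whole packages cannot do.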
"Tags" are Conary's answer to the package script problem (and, also, to the
complex set of interactions represented by the RPM "trigger" mechanism). A
tag is a file attribute describing the type of the file, be it "shared
library," "info file," or any of a long list of alternatives. Most files
can be tagged automatically by Conary. Tags have scripts associated with
them; there is, for example, a script which handles the installation of an
info file and updating the relevant directory. These scripts are
distributed separately; there is only one copy of them on the system. The
scripts are thus easily fixed when bugs turn up, and they can be customized
by the local administrator if need be. Separating out the management
scripts in this way should also make it easier to install packages from
other distributions.
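
The tag mechanism amounts to a single registry of handlers keyed by file type, rather than a script copied into every package. The sketch below shows that shape only; the tag names, the `handler` decorator, and the returned action strings are made up, and a real handler would run commands instead of describing them.

```python
TAG_HANDLERS = {}   # one system-wide table: tag name -> handler function


def handler(tag):
    """Register a function as the single handler for a tag."""
    def register(func):
        TAG_HANDLERS[tag] = func
        return func
    return register


@handler("info-file")
def install_info(path):
    return f"update info directory for {path}"


@handler("shlib")
def install_shlib(path):
    return f"refresh linker cache for {path}"


def run_handlers(files):
    """Collect the actions for a list of (path, tags) pairs."""
    actions = []
    for path, tags in files:
        for tag in tags:
            if tag in TAG_HANDLERS:
                actions.append(TAG_HANDLERS[tag](path))
    return actions


actions = run_handlers([
    ("/usr/share/info/bzip2.info", ["info-file"]),
    ("/usr/lib/libbz2.so.1", ["shlib"]),
    ("/usr/bin/bzip2", []),
])
```

Fixing a bug in `install_info` fixes it for every package at once, and a local administrator could replace an entry in the table without touching any package.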
A "fileset" is an arbitrary collection of files built from components in
the repository. Filesets seem to be intended to help in the creation of
small system images for embedded systems; they allow an easy picking and
choosing of an exact set of desired files. "Groups" are, instead, the
analog of the Debian "task" or Anaconda "component." They allow the
management of several packages as a unit, but they come with their own
local changesets so that local changes to the group are tracked properly.
The paper from the OLS proceedings (PDF format) is worthwhile reading for
anybody wanting more details on how Conary works.
Interested parties can download an early Conary release from the Specifix web site.
Be warned, however, that a few features are still missing; they include
shadows, dependencies (an important issue that they "think" they know how
to implement), flavors, package signatures, and more. "Release early" is an
important part of the free software development process, however, and the
Specifix founders understand that process well. Conary's vaporware
features will, beyond doubt, be filled in soon. As that happens, expect
interest in this tool to increase; it truly does have the potential to
change the way we set up and manage our projects, distributions, and
systems.