By Jonathan Corbet
July 16, 2008
Fedora's new version of RPM,
announced on July 9, has hit the
Rawhide repositories; after inspiring some initial cries of pain, it
would appear to be
settling in well. It is good to see activity on Red Hat's version of RPM
after a long period where nothing much was happening. In the process of
bringing this new code to Rawhide, the RPM developers have also inspired
some interesting side discussions on topics like whether such a major
change should have gone through the official "features" process first. But
the most extended (and arguably most interesting) discussion came from an
unexpected direction.
Doug Ledford is known in kernel development circles, but, being an RHEL
engineer, he has not been
seen much in the Fedora camp. He joined the
RPM discussion with a feature request of
his own: he would like a set of tags which would facilitate the location of
a package's source code in a distributed version control system (DVCS).
These tags would indicate which DVCS is in use (git, Mercurial, etc.),
where the repository is to be found, the tag corresponding to the source
code for a specific version of a package, and so on. And, Doug let it be
known, it would be nice if he could have those tags soon: tomorrow would be
ideal, but in any case before the Fedora 10 release.
Once this information exists for a package, interesting things can be
done. For example, source RPM packages could become much smaller; rather
than containing a tarball and a set of distributor-applied patches, each
could just hold the DVCS information. An "installation" of such a package
would then just go to the source repository and check out the sources from
there. If the source repository is managed carefully, it could help the
cooperation between Fedora and the upstream projects; patches could be
pushed and pulled between repositories with ease. This kind of mechanism
could also make it easier for the Fedora project to distribute "spins"
created by outsiders by reducing the resources required to make the
associated source code available. See this
lengthy pitch from Doug for more discussion of the advantages of the
distributed source package approach.
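As a rough illustration of the idea, the sketch below models the three pieces of information described above (which DVCS, where the repository lives, and which tag matches the packaged version) and turns them into the checkout commands an "installation" might run. The field names, the function name, and the example URL are invented for this illustration; the discussion did not settle on actual RPM tag names, and nothing here reflects how RPM itself works.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceLocation:
    """Hypothetical stand-in for the proposed DVCS tags (names are assumptions)."""
    vcs: str         # which DVCS is in use, e.g. "git" or "hg"
    repository: str  # where the repository is to be found
    revision: str    # tag corresponding to this version of the package


def checkout_commands(src: SourceLocation, dest: str) -> List[List[str]]:
    """Return the commands that would fetch the tagged sources into dest.

    The commands are only built, not executed, so the sketch has no
    side effects.
    """
    if src.vcs == "git":
        return [
            ["git", "clone", src.repository, dest],
            ["git", "-C", dest, "checkout", src.revision],
        ]
    if src.vcs == "hg":
        # Mercurial can clone and update to a revision in one step.
        return [["hg", "clone", "--updaterev", src.revision, src.repository, dest]]
    raise ValueError("unsupported DVCS: " + src.vcs)


if __name__ == "__main__":
    # Placeholder repository URL and tag, for illustration only.
    example = SourceLocation("git", "https://example.org/foo.git", "foo-1.2-3")
    for command in checkout_commands(example, "foo"):
        print(" ".join(command))
```

A source package reduced to this much data would be a few hundred bytes rather than a tarball plus patches, which is where the size savings described above would come from.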
Of course, there are some obstacles too. Not all projects are using a
DVCS, so integration with those projects would be more difficult. Quite a
few projects have material in their repositories which, for legal reasons,
cannot be distributed by Fedora. Finding a way to excise that material
without breaking the connections between repositories could be
challenging. The tarballs distributed by many upstream projects - which
are the starting place for Fedora packages now - often contain changes
which are not reflected in their source repositories. Those changes can
include the removal of non-distributable material, or simply the generation
of the configure file.
These challenges are real, and some of them will take a fair amount of work
to resolve. But it seems clear that things eventually need to go in this
direction. Tighter integration between projects and distributors can only
help the whole free software ecosystem work in a more efficient manner.
Tarballs reflect a form of frozen state which is entirely divorced from the
code's history - and from its future. Or, as
Doug put it:
It's all about the repo. A tarball is something you hand off to
poor saps that haven't joined the 21st century, all the while
snickering at their inability to get with the times. It is nothing
more than a middle man step that interferes with efficiency of
operation and that should be cut out of the loop.
A source package format that can maintain its
connections wherever it goes can only make the whole system work better. So it
is good that the Fedora folks (including those beyond Doug who have been
thinking about this issue for a while now) are working on this problem.
There was, however, an interesting omission from the discussion; as far as
your editor can tell, nobody ever mentioned the work being done by the vcs-pkg project, which is aimed toward this goal:
Our goal is to integrate version control with distro package
maintenance. We want to recognise all involved in the process, from
upstream, the package maintainers of the various distributions,
their security and release teams, and power users, who aren't
afraid to fix their own bugs, and give maximum flexibility to
them.
This group is mostly Debian-based, but its members are making a concerted
effort to create solutions which are independent of any given distribution
(or DVCS). It can only make sense for Fedora to work with this project -
or at least have a look at what vcs-pkg is doing and come up with a good
excuse why a different solution has to be invented for Fedora.
The integration of distributed version control and packaging can only reach
its full potential if, among other things, it facilitates cooperation
between distributors and their upstream providers, their users, and,
importantly, other distributors. If each distributor brews up its own
solution (again), they'll have a hard time sharing their work with each
other. Few upstream projects will have the patience to integrate with
several disparate distributor systems, so that integration will be much
less likely to happen. All of this can be avoided, though, if the
distributors decide now to work toward some common standards for the use of
distributed version control in packaging.