By Jonathan Corbet
May 14, 2008
It is fair to say that distributed source code management systems are
taking over the world. There are plenty of centralized systems still in
use, but it is a rare project which would choose to adopt a centralized SCM
in 2008. Developers have gotten too used to the idea that they can carry
the entire history of their project on their laptop, make their changes,
and merge with others at their leisure.
But, while any developer can now commit changes to a project while strapped
into a seat in a tin can flying over the Pacific Ocean, that developer
generally cannot simultaneously work with the project's bug database.
Committing changes and making bug tracker changes are activities which
often go together, but bug tracking systems remain strongly in the
centralized mode. Our ocean-hopping developer can commit a dozen fixes,
but updating the related bug entries must wait until the plane has landed
and network connectivity has been found.
There are a number of projects out there which are trying to change this
situation through the creation of distributed bug tracking systems. These
developments are all in a relatively early state, but their potential
- and limitations - can be seen.
One of the leading projects in this area is Bugs Everywhere, which has recently
moved to a new home with Chris Ball as its new maintainer. Bugs
Everywhere, like the other systems investigated by your editor, tries to
work with an underlying distributed source code management system to manage
the creation and tracking of bug entries. In particular, Bugs Everywhere
creates a new directory (called .be) in the top level of the
project's directory. Bugs are stored as directories full of text files
within that directory, and the whole collection is managed with the
underlying SCM.
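As a rough illustration of that layout, the following Python sketch creates a bug as a directory of small text files under .be, with each comment in its own randomly named subdirectory. The function names and the per-field files (summary, severity, status) are assumptions made for the example; only the .be directory and the comments/<id>/body path shape come from the description here, and the real on-disk format may well differ.

```python
import tempfile
import uuid
from pathlib import Path


def create_bug(project_root: Path, summary: str, severity: str = "minor") -> Path:
    """Create one bug as a directory of small text files under .be/."""
    bug_dir = project_root / ".be" / str(uuid.uuid4())
    bug_dir.mkdir(parents=True)
    # Hypothetical field files, one value per file; not the tool's real format.
    (bug_dir / "summary").write_text(summary + "\n")
    (bug_dir / "severity").write_text(severity + "\n")
    (bug_dir / "status").write_text("open\n")
    return bug_dir


def add_comment(bug_dir: Path, body: str) -> Path:
    """Store a comment in its own randomly named directory as a 'body' file."""
    comment_dir = bug_dir / "comments" / str(uuid.uuid4())
    comment_dir.mkdir(parents=True)
    body_file = comment_dir / "body"
    body_file.write_text(body + "\n")
    return body_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        bug = create_bug(root, "Crash on startup", "serious")
        body = add_comment(bug, "Reproduced on my laptop.")
        # Prints a path of the form <bug-uuid>/comments/<comment-uuid>/body
        print(body.relative_to(root / ".be"))
```

Because everything is an ordinary file inside the working tree, the SCM can version, branch, and merge it like any other project content.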
The advantages to an approach like this are clear. The bug database can
now be downloaded along with the project's code itself. It can be branched
along with the code; if a particular branch contains a fix for a bug, it
can also contain the updated bug tracker entry. That, in turn, ensures
that the current bug tracking information will be merged upstream at
exactly the same time as the fix itself. Contemporary projects are
characterized by large numbers of repositories and branches, each of which
can contain a different set of bugs and fixes; distributing the bug
database into these repositories can only help to keep the code and its bug
information consistent everywhere.
There are also some disadvantages to this scheme, at least in its current
form. Changes to bug entries don't become real until they are committed
into the SCM. If a bug is fixed, committing the fix and the bug tracker
update at the same time makes sense; in cases where one is trying to add
comments to a bug as part of an ongoing conversation the required commit is
just more work to do. The fact that, in git at least, one must explicitly
add any new files created by the bug tracker (which have names like
12968ab9-5344-4f08-9985-ef31153e504f/comments/97f56c43-4cf2-4569-9ef4-3e8f2d9eb1fe/body)
does not help the situation.
Beyond that, tracking bugs this way creates two independent sets of
metadata - the bug information itself, and whatever the developer added
when committing changes. There is currently no way of tying those two
metadata streams together. Then, there is the issue of merging. Bugs
Everywhere appears to reflect some thought about this problem; most changes
involve the creation of new, seemingly randomly named files which will not
create conflicts at merge time. It did not take long, however, for your
editor to prove that changing the severity of a bug in two branches and
merging the result creates a conflict which can only be resolved by
hand-editing the bug tracker's files. Said files are plain text, but that
is less comforting than one might think.
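The difference between the two cases can be seen in a small model. The sketch below is a simplified, file-level three-way merge over {path: content} snapshots, written in Python; it is not how git or any other SCM actually merges, and the paths and severity values are made up for the example. It only aims to show why new, uniquely named comment files merge cleanly while a single severity field edited on both sides does not.

```python
def three_way_merge(base, ours, theirs):
    """File-level three-way merge of {path: content} snapshots.

    Returns (merged, conflicts). A real SCM also merges within files, but a
    one-line file changed differently on both sides conflicts either way.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # identical on both sides
        elif o == b:
            result = t          # only their side changed it
        elif t == b:
            result = o          # only our side changed it
        else:
            conflicts.append(path)  # both sides changed it differently
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Hypothetical bug entry shared by two branches.
base = {"bug1/severity": "minor"}
ours = {"bug1/severity": "serious", "bug1/comments/aaa/body": "Seen on x86."}
theirs = {"bug1/severity": "critical", "bug1/comments/bbb/body": "Seen on ARM."}

merged, conflicts = three_way_merge(base, ours, theirs)
print(sorted(merged))   # both comment files survive the merge
print(conflicts)        # ['bug1/severity']
```

A tracker that understood its own data could resolve the severity case with a policy (take the higher severity, say, or the most recent change), but a generic SCM merge has no basis for choosing.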
All of this can make distributed bug tracking look like a source of more
work for developers, which is not the path to world domination. What is
needed, it seems, is a combination of more advanced tools and better
integration with the underlying SCM. Bugs Everywhere, by trying to work
with any SCM, risks not being easily usable with any of them.
A project which is trying for closer integration is ticgit, which, as one
might expect, is based on git. Ticgit takes a different approach, in that
there are no files added to the project's source tree, at least not
directly; instead, ticgit adds a new branch to the SCM and stores the bug
information there. That allows the bug database to travel with the source
(as long as one is careful to push or pull the ticgit branch!) while keeping the
associated files out of the way. Ticgit operations work on the git object
database directly, so there is no need for separate commit operations. On
the other hand, this approach loses the ability to have a separate view of
the bug database in each branch; the connection between bug fixes and bug
tracker changes has been made weaker. This is something which can be
fixed, and it would appear (from comments in the source) that dealing with
branches is on the author's agenda.
Ticgit clearly has potential, but even closer integration would be
worthwhile. Wouldn't it be nice if a git commit command would
also, in a single operation, update the associated entry in the bug
database? Interested developers could view a commit which is alleged to
fix a bug without the need for anybody to copy commit IDs back and forth.
Reverting a bugfix commit could automatically reopen the bug. And so on.
In the long run, it is hard to see how a truly integrated, distributed bug
tracker can be implemented independently of the source code management
system.
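One way such integration might look is sketched below in Python. It is purely hypothetical: the "Fixes-bug:" trailer, the in-memory bug dictionary, and both function names are inventions for the example, not features of git, ticgit, or any other tool mentioned here. The idea is that a commit naming a bug marks it fixed and records the commit ID, and that reverting the commit reopens whatever it had fixed.

```python
import re

# Hypothetical commit-message trailer; no tool in this article defines it.
FIXES = re.compile(r"^Fixes-bug:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def apply_commit(bugs, commit_id, message):
    """Mark every known bug named in the commit message as fixed."""
    touched = []
    for bug_id in FIXES.findall(message):
        bug = bugs.get(bug_id)
        if bug is None:
            continue  # the message names a bug this database does not have
        bug["status"] = "fixed"
        bug["fixed_by"] = commit_id
        touched.append(bug_id)
    return touched


def revert_commit(bugs, commit_id):
    """Reopen every bug that was recorded as fixed by the reverted commit."""
    reopened = []
    for bug_id, bug in bugs.items():
        if bug.get("fixed_by") == commit_id:
            bug["status"] = "open"
            bug["fixed_by"] = None
            reopened.append(bug_id)
    return reopened


bugs = {"12968ab9": {"status": "open", "fixed_by": None}}
message = "Avoid crash on empty input\n\nFixes-bug: 12968ab9\n"

print(apply_commit(bugs, "c0ffee1", message))  # ['12968ab9']
print(bugs["12968ab9"]["status"])              # fixed
print(revert_commit(bugs, "c0ffee1"))          # ['12968ab9']
print(bugs["12968ab9"]["status"])              # open
```

In a real system this logic would have to live inside the SCM, or in hooks that it runs on every commit, which is the point being made above.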
There are some other development projects in this area, including:
- Scmbug is a relatively advanced project which aims "to solve the
integration problem once and for all." It is not truly a distributed bug
tracker, though; it depends on hooks into the SCM which talk to a central
server. Regardless, this project has done a significant amount of thinking
about how bug trackers and source code management systems should work
together.
- DisTract is a distributed bug tracker which works through a web
interface. To that end, it uses a bunch of Firefox-specific JavaScript
code to run local programs, written in Haskell, which manipulate bug
entries stored in a Monotone repository. Your editor confesses that he did
not pull together all of the pieces needed to make this tool work.
- DITrack is a set of Python scripts for manipulating bug information
within a Subversion repository. It is meant to be distributed (and,
eventually, "backend-agnostic"), but its use of Subversion limits how
distributed it can be for now.
- Ditz is a set of Ruby scripts for manipulating bug information within a
source code management system; it has no knowledge of the SCM itself.
As can be seen, there is no shortage of work being done in this area,
though few of these projects have achieved a high level of usability. Only
Scmbug has been widely deployed so far. A few of these projects have the
potential to change the way development is done, though, once various
integration and user interface issues are addressed.
There is one remaining problem, though, which has not been touched upon
yet. A bug tracker serves as a sort of to-do list for developers, but
there is more to it than that. It is also a focal point for a conversation
between developers and users. Most users are unlikely to be impressed by a
message like "set up a git repository and run these commands to file or
comment on a bug." There is, in other words, value in a central system
with a web interface which makes the issue tracking system accessible to a
wider community. Any distributed bug tracking system which does not
facilitate this wider conversation will, in the end, not be successful.
Creating a distributed tracker which also works well for users could be the
biggest challenge of them all.