
Debian gets dgit

By Nathan Willis
September 11, 2013

It is pretty easy to get a new project up and running with Git, but integrating Git (or any other new version control system) can be painful for an existing project with an established code base. Such is the case for Debian, with its tens of thousands of packages spread across multiple versions of the distribution. Migrating Debian to a Git-based version-control system would be a herculean ordeal, if that were even a task that the project was interested in undertaking. But Ian Jackson recently unveiled a new tool that serves as a bridge between the official Debian archives and a Git repository, thus allowing developers to use a Git workflow while remaining fully integrated with the archive.

The tool is called dgit; Jackson announced version 0.7, the first "suitable for alpha and beta testers", on August 22. The concept behind it was hashed out during DebConf13 in mid-August. As Jackson explained it, the goal was to allow package maintainers and developers to use "a gitish workflow" if they so desired, including working with upstream Git repositories and preserving Git histories, but without forcing a Git-based workflow on anyone who was happier using the status quo.

The bird's-eye view of dgit is that it treats the Debian archive (which contains all of the packages that make up a Debian release) as if it were a remote Git repository. A developer can clone or fetch a package from the archive, commit and merge changes, and push updates to the package, all using dgit commands that mirror, in most ways, the offerings of Git itself. But this functionality is on demand; if no developer dgit clones a particular package, there is no Git view of it created—doing so automatically for every Debian package would consume far too many resources.

Thus, there is quite a bit of work going on behind the scenes to keep the archive and the dgit view of the package in sync. When a developer uses:

    dgit clone foopackage sid

for example, dgit initializes a Git repository on Debian's Alioth server, pulls in the contents of foopackage from the sid distribution, then constructs the local repository on the developer's machine. The developer can then use normal Git tools (raw command-line or otherwise) as desired. When it is time to upload changes, dgit sbuild constructs the source package. Then, a dgit push both pushes the current HEAD to the remote Git repository on Alioth and uploads the source package to the Debian archive.
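As a rough sketch of that cycle, the shell function below strings together the dgit subcommands mentioned here. It is illustrative only: the function name dgit_cycle, the run helper, and the DRY_RUN variable are made up for this example, and the assumption that the clone lands in a directory named after the package should be checked against the dgit man page. By default the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the clone / edit / build / push cycle.
# DRY_RUN (hypothetical variable) defaults to 1, so commands are
# printed instead of executed; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# dgit_cycle PACKAGE SUITE (hypothetical helper name)
dgit_cycle() {
    pkg="$1"     # source package name, e.g. foopackage
    suite="$2"   # suite to clone from, e.g. sid

    # Fetch the package from the archive into a local Git repository.
    run dgit clone "$pkg" "$suite"

    # Assumption: the working tree is created in ./PACKAGE.
    run cd "$pkg"

    # ... edit files and commit with ordinary Git tools here ...

    # Build the source package, then push HEAD and upload.
    run dgit sbuild
    run dgit push
}
```

Running dgit_cycle foopackage sid with the default settings lists the four steps in order, which makes it easy to review the sequence before trying it against a real package.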

Things get more difficult when a package is modified by some means other than commits made directly on the local dgit branch, such as with a set of patches. The tool includes a dgit quilt-fixup command to integrate with the quilt patch manager (which lets maintainers keep track of a set of patches that need to be applied before each upload). The quilt-fixup command creates a "synthetic commit" which is then added to the Git history before the package is pushed. However, as Jackson noted in the man page and on the debian-devel mailing list, this is an imperfect solution.

Jackson pointed out some peculiarities of quilt that make it incompatible (at least for the time being) with dgit. For example, when one uses dpkg-source to build a source package in Debian's quilt-compatible format, if the result is then extracted (again using dpkg-source), the contents are not identical to the original—specifically, there are extra metadata files generated. This makes it difficult to use quilt to apply a set of patches and push the results with dgit, so Jackson recommended steering clear of quilt-formatted source packages altogether.
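One way to see that discrepancy is to compare the file lists of the original tree and the re-extracted one. The helper below is a minimal sketch, not part of dgit or dpkg; the name extra_files is hypothetical, and it assumes both trees are already on disk (for instance, a source package built with dpkg-source -b and then unpacked into a separate directory with dpkg-source -x).

```shell
#!/bin/sh
# extra_files ORIGINAL EXTRACTED (hypothetical helper)
# Prints the regular files that exist under EXTRACTED but not
# under ORIGINAL, as paths relative to the tree root.
extra_files() {
    orig="$1"
    extracted="$2"
    list_a="$(mktemp)"
    list_b="$(mktemp)"

    # Collect sorted, relative file lists for both trees.
    (cd "$orig" && find . -type f | LC_ALL=C sort) > "$list_a"
    (cd "$extracted" && find . -type f | LC_ALL=C sort) > "$list_b"

    # comm -13 keeps only the lines unique to the second list.
    LC_ALL=C comm -13 "$list_a" "$list_b"

    rm -f "$list_a" "$list_b"
}
```

Calling extra_files original-tree extracted-tree prints one path per line for each file that appears only after the round trip; an empty result means the file lists match (file contents are not compared).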

On the mailing list, Raphael Hertzog took some umbrage at Jackson's description of this issue as "brain damage" on quilt's part. In the ensuing discussion, Hertzog and Jackson eventually reached an impasse. The disagreement boils down to what is considered the "normal" workflow: specifically, how a developer should manage both local changes and a set of quilt-managed patches. Hertzog contends that developers should record their own local changes as a separate patch in quilt, while Jackson believes local changes should be orthogonal to those patches managed in quilt. But when using Jackson's workflow, quilt copes with the local changes by adding additional metadata, in the form of the extra files seen by dpkg-source.

In any case, Jackson eventually decided to simply work around the oddities that result from trying to use quilt and dgit together. It is certainly possible for a developer to use dgit without worrying about the issue, merely by not bringing quilt into the mix. Of course, asking a developer to start using a different workflow is rarely a welcome suggestion, but there is hope that the distinctions will eventually be smoothed over.

There are some other limitations, however. For now, dgit is only usable by official Debian Developers (DDs); non-DDs cannot even create a read-only view of a dgit repository. This is due to the access control setup deployed on the Debian servers; it may be resolved in the future when Jackson and the system administrators have sufficient time.

Hertzog also inquired whether there might be any lessons to learn from Ubuntu's Distributed Development (UDD) project, which automatically imported all packages in the Ubuntu and Debian archives into repositories for use with Bazaar. "Automatic" import is in many ways wishful thinking; as several reported, Ubuntu found that there are a variety of special cases that dictate manual intervention to repair an imported package, and it can be problematic to get the full commit history of each package—which can involve upstream changes, patches, and commits made by individual developers. Ubuntu had it easier than Debian because UDD was limited to a single, Bazaar-based workflow. Since Debian is (at least for the foreseeable future) committed to giving its developers and maintainers the freedom to use any workflow they wish, deploying something like dgit for the entire Debian package archive would probably require more people-power than the project has.

No doubt many interesting things could be done with a Git repository that contained the entire Debian archive and was accessible to the world. Dgit is not likely to reach that stage any time soon, but, as Jackson pointed out, he wanted something that he could deploy and use immediately. And it is clearly good news that Debian developers can begin using dgit now; Git has proven itself to be the version-control system of choice in free software at large, so integrating it with one of the premier free software distributions is sure to reap benefits for developers and Debian users alike.



Debian gets dgit

Posted Sep 14, 2013 0:39 UTC (Sat) by zuki (subscriber, #41808) [Link]

For me, this seems completely backward. Instead of designing wrappers which provide git functionality, but will break all the time when doing something more advanced or unexpected, let's instead convert everything to git, store package sources as one git branch, debian/ directory as another branch, and forget about quilt, applying patches during build, and other remnants of the 90's. Fedora does this slightly better, because at least all packaging information is stored as git trees, but doesn't go all the way, and patches are still stored as patches. Exporting patches to files, adding names, and checking them back into version control as files is just busywork, imho.

Debian gets dgit

Posted Sep 16, 2013 8:02 UTC (Mon) by jezuch (subscriber, #52988) [Link]

> let's instead convert everything to git, store package sources as one git branch, debian/ directory as another branch, and forget about quilt

https://wiki.debian.org/GitSrc

But I don't think it's used that much.

Debian gets dgit

Posted Sep 16, 2013 16:52 UTC (Mon) by ballombe (subscriber, #9523) [Link]

(3.0) "git" packages are not accepted in the Debian archive.

Debian gets dgit

Posted Sep 15, 2013 12:44 UTC (Sun) by martin.langhoff (subscriber, #61417) [Link]

Active DDs should be forced to spend a week with fedpkg. If dgit evolves to match or improve fedpkg's usability, Debian will make a leap ahead.

To keep the universe in balance, Fedorans will be locked in a room with apt-get for the same duration.

Jokes aside -- with dgit in place, perhaps Debian will consolidate on a common "exploded source package" storage backend and format. This would lower barriers to cross-archive changes, NMUs and such.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds