
FOSDEM'10: distributions and downstream-upstream collaboration

February 17, 2010

This article was contributed by Koen Vervloesem

For the first time in its ten-year history, FOSDEM didn't organize individual developer rooms per distribution, opting instead for a joint 'mini-conference' in two distribution developer rooms, with talks specifically targeting interoperability between distributions, governance, common issues that distributions face, and working with upstream projects. A couple of them piqued your author's interest.

Debian and Ubuntu

As a Debian and Ubuntu developer since 2006, Lucas Nussbaum knows the relationship between these two distributions inside and out. He has attended DebConf and Ubuntu Developer Summit, has friends in both communities, and is involved in improving collaboration between both projects. In his talk, he discussed the current state of affairs from his point of view and what could be done to improve matters.

Ubuntu has a lot of upstream projects, like Linux, X.org, GNOME, KDE, and so on, but it has one special upstream: Debian. Integrating a new project or a new release into Ubuntu regularly requires changes, such as toolchain changes and bug fixes. It is often not possible to do this work in Debian first. Lucas gave some statistics from Ubuntu Karmic Koala (9.10): 74% of the packages come directly and unmodified from Debian, 15% come from Debian but are modified with Ubuntu patches, and 7% come directly from upstream projects. (For those who are puzzled about why these numbers don't add up: the missing 4% covers packages where Ubuntu ships a newer upstream release than the one Debian has; such a package can be based on the Debian version or fully repackaged.)

Managing this divergence is not trivial. Keeping local changes in Ubuntu requires a lot of manpower, and the changes need to be merged again when Debian updates the package. This is already a strong incentive to push changes to Debian. Bug reports are the main vehicle for pushing changes, but this is where problems can start. Lucas summarized it neatly:

Ubuntu users who want to file a bug have a choice between three options. They can file a bug upstream, where they might get flamed; they can file a bug in Debian, where they are very likely to get flamed; or they can file a bug in Ubuntu's Launchpad, where they are very likely to get ignored.

There is already some collaboration on bugs today. For example, some bugs get filed in Debian by Ubuntu developers: about 250 to 400 bugs per Ubuntu release cycle, mostly upstreaming of Ubuntu patches. There is also a link to Ubuntu patches and bugs in Launchpad on the Debian Package Tracking System (PTS), although Lucas admitted that at the moment the data is imported using a fragile hack.

The second part of Lucas's talk was about his view on the current state of the relationship between Debian and Ubuntu. Historically, many Debian developers have been unhappy about Ubuntu, because of the feeling that the distribution was being "stolen" and because of problems with Canonical employees that tend to reflect on Ubuntu as a whole. However, according to Lucas things have improved considerably and many Debian developers see some good points in Ubuntu: Ubuntu brings a lot of new users to Linux (and to Debian derivatives), it also brings new developers to Debian, and it serves as a technological playground. Examples of the latter are dash as the default /bin/sh, boot improvements, GCC hardening flags, and so on. Having these things tested first in Ubuntu makes it much easier to import them into Debian later.

On the Ubuntu side, Lucas sees a culture where contributing to Debian is considered the right thing to do, and as a result many Ubuntu developers contribute to Debian. However, there is often not a lot to contribute back at the package level: many bug fixes are just workarounds. Also, Canonical is a company that contributes back when there are benefits for it, so Debian shouldn't expect many free gifts.

However, Ubuntu's rise also causes some problems for Debian, and Lucas called the most important one "the loss of relevance of Debian". Not only has the Debian user base (or at least its market share) decreased, but, for many new users, Linux equals Ubuntu. Recent innovations have usually happened in Ubuntu, so even though Debian is now the basis of a major distribution, it is becoming less relevant. Still, Lucas thinks collaborating with Ubuntu is the right thing to do for free software world domination, because Debian fights for important values and takes positions on technical and political issues such as the Firefox trademark issue.

Lucas's proposal to make Debian relevant again, while still helping Ubuntu, is to behave like a good upstream and to communicate why Debian is better. The former means that collaboration with Ubuntu should be improved. Not only should there be more cross-distribution packaging teams, Lucas also maintains that Debian could help Ubuntu maintain Debian's packages, e.g. by notifying Ubuntu of important transitions and by triaging and fixing bugs directly in Launchpad if time permits. He also stressed that Debian should acknowledge high-quality work that is done in Ubuntu. That work could then be imported into Debian: "Importing packages doesn't have to be one-way only."

According to Lucas, Debian fails at communicating that it is better. It sometimes even needs external people like Bradley Kuhn to do that communicating. Debian is a volunteer-based project where decisions are made in the open, and it has advocated the free software philosophy since 1993. In contrast, Ubuntu is a project controlled by Canonical, where decisions like Ubuntu One or the switch to Yahoo! as the default search engine are imposed. Moreover, Ubuntu advocates proprietary web services such as Ubuntu One, the installer recommends proprietary software, and there is the controversial copyright assignment required to contribute to Canonical projects.

Lucas went further and claimed that Debian is a better distribution because many package maintainers (e.g. of scientific software) are experts in their field, and the emphasis is on quality. In contrast, most of Ubuntu's packages (the 74% he mentioned before) are just synchronized from Debian, and many of their maintainers have no real knowledge of the packages. The conclusion of his talk was that Ubuntu is a chance for Debian to get back into the center of the FLOSS ecosystem, but that the distribution should be more vocal about Ubuntu's issues.

How to be a good upstream

Petteri Räty, a member of the Gentoo Council, talked about how to be a good upstream. Or rather, he gave some "dos and don'ts" to bootstrap a discussion with the audience. The bottom line was: if a project is a good upstream, distributions need a minimal number of iterations to package the software. One ground rule that prevents a lot of problems is: "never change a once-released file", at least not without releasing a new version of the software. If an upstream project violates this rule, bug reports don't make any sense, because upstream and downstream no longer know which version of the file the user is referring to. Another thing to watch for is that a release should not build in debug mode by default. There should also be no -Werror in CFLAGS, as the project should respect the user's choice of compiler flags. Moreover, changes that are relevant for distributions should be documented in changelogs, Petteri stressed: "For example, say it explicitly when there's a security fix."
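
A rough illustration of the -Werror point, as a Makefile fragment; the WERROR switch is made up for this sketch rather than taken from any particular project, but the idea is that warnings stay on by default while making them fatal is an explicit developer choice:

    # Makefile fragment (sketch): respect the user's CFLAGS, keep warnings
    # on, and make -Werror an explicit opt-in for developers.
    CFLAGS ?= -O2 -g
    WARNINGS := -Wall -Wextra
    # Hypothetical switch: "make WERROR=1" turns warnings into errors.
    ifeq ($(WERROR),1)
    WARNINGS += -Werror
    endif
    CFLAGS += $(WARNINGS)

A developer can build with "make WERROR=1" locally, while a distribution's plain "make" (with its own CFLAGS) won't fail just because a newer compiler emits new warnings.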

How the upstream project handles dependencies also has consequences for distributions. For example, it should link dependencies dynamically and use libtool in autoconf. It also should never bundle dependencies: if a project shipped its own copy of zlib, for example, security problems in that copy wouldn't get fixed when the distribution updates the "global" zlib. And last but not least, the upstream project should allow downstreams to configure build support for optional dependencies.
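
What "configurable optional dependencies" can look like in practice, sketched with autoconf (the --with-zlib switch and the HAVE_ZLIB define are illustrative, not from any specific project):

    # configure.ac fragment (sketch): optional zlib support that downstreams
    # can force on, force off, or leave to auto-detection.
    AC_ARG_WITH([zlib],
      [AS_HELP_STRING([--with-zlib], [build with zlib support (default: check)])],
      [], [with_zlib=check])
    AS_IF([test "x$with_zlib" != "xno"],
      [AC_CHECK_LIB([z], [inflate],
        [AC_DEFINE([HAVE_ZLIB], [1], [Define if zlib is available])
         LIBS="-lz $LIBS"],
        [AS_IF([test "x$with_zlib" = "xyes"],
          [AC_MSG_ERROR([zlib support requested but zlib not found])])])])

With something like this in place, a distribution can pass --with-zlib or --without-zlib explicitly and get a predictable build, instead of one that silently depends on what happened to be installed in the build environment.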

Petteri also emphasized that the "Release early, release often" software development philosophy is really important: that way, end users get the project's code faster, which means that the code is tested more quickly, by more people. Not all releases will end up in distributions, but the best ones (tested by distributions' developers and maintainers) will. It's also important to have consistent version numbers: going from 0.10 to 10.0 and then to 11.1 is not the way to go. The audience understandably grinned when Petteri mentioned that a 4.0 version number should only be given to a stable release.

Working with GNOME upstream

As a GNOME and openSUSE contributor, Vincent Untz was the perfect speaker to lead a session about collaboration between GNOME upstream and downstream. This was a truly interactive session in which the audience gave a lot of suggestions. For example, one downstream packager said that it would be nice to have the same predictable six-month schedule for GNOME applications, like Rhythmbox, as for the GNOME desktop environment. There is already a precedent: at the end of January, the Banshee music player developers announced that they will align their release schedule with GNOME's, so Banshee 1.6 will be released together with GNOME 2.30. Another audience member found it inconvenient that new GNOME features sometimes depend on arbitrary upstream choices, such as X running on the first virtual terminal.

Big changes in GNOME upstream also have an impact on stability. Vincent pointed to the migration from GnomeVFS to GVFS in GNOME 2.22, which perhaps happened too early. GDM 2.24 also had too many changes, to the point where many distributions still use GDM 2.20 now. The change from AT-SPI (Assistive Technology Service Provider Interface), built upon CORBA (Common Object Request Broker Architecture), to the D-Bus-based AT-SPI2 is another big change that will be a challenge for distributions. An example that, according to Vincent, merits following is the move from HAL to DeviceKit in GNOME Power Manager: Richard Hughes maintained both branches, which was good for distributions that didn't want to migrate immediately. GNOME 3 will obviously also have a big impact. Vincent invites downstream distributions to tell GNOME which "old" libraries they would like to keep using for a while, so that upstream can keep maintaining them and give distributions some time to adapt.

With respect to patches, Vincent applauded Debian, which is working on a format where a patch carries information in its comments about where the patch has been sent upstream, whether it has been accepted but not yet released, and so on. Distributions that have an online patch tracker also help upstream maintainers; here again, it is Debian (with patch-tracker.debian.org) and Ubuntu (with patches.ubuntu.com) that make life easier for upstream.
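
Debian's draft patch-tagging guidelines (DEP-3) describe such a format, with the metadata at the top of the patch itself; the header of a tagged patch looks roughly like the sketch below (all values invented for the example):

    Description: Fix crash when the configuration file is empty
     Longer explanation of the change, wrapped as a continuation line.
    Author: Jane Packager <jane@example.org>
    Bug-Debian: http://bugs.debian.org/nnnnnn
    Forwarded: http://bugzilla.example.org/show_bug.cgi?id=nnnnn
    Applied-Upstream: 1.2.3
    Last-Update: 2010-02-17
    (the patch itself follows)

An upstream maintainer looking at such a patch can see at a glance whether it has already been forwarded and in which upstream release, if any, it landed.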

Packaging Perl modules

Gabor Szabo, who has been involved in the Perl community for 10 years, talked about packaging Perl and CPAN (Comprehensive Perl Archive Network) modules for distributions. The problem from the end-user's perspective is that many Perl applications are not packaged in the user's distribution, but are only available as a CPAN module that has to be built and installed in another way. The problem becomes even uglier when users want to install a Perl script that needs several CPAN modules that are not in the distribution's repository.
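
To make the mismatch concrete, here is what the two installation paths look like on a Debian-style system (the module name is just an example; the first command only works if the distribution actually packages the module):

    # Installed as a regular distribution package, if one exists:
    $ sudo apt-get install libxml-simple-perl

    # Otherwise, built and installed through the CPAN toolchain:
    $ sudo cpan XML::Simple

The second path bypasses the package manager entirely, which is why the problem gets uglier as soon as several such modules are involved.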

Gabor called this a major issue, and he gave some numbers to put it into perspective: CPAN has around 20,000 packages, while Ubuntu 9.10 has about 1,900 of them, which is roughly 10%. The numbers are even worse for Ubuntu 8.04, a Long Term Support (LTS) version that is used on many web servers: that release has about 1,200 CPAN packages, or roughly 6%. "Other distributions have roughly the same numbers, with FreeBSD having quite a bit more in their ports collection. Of course we don't need all CPAN modules packaged in distributions, but we definitely need a lot more than we have now."

So why do most distributions have such a low percentage of Perl packages? According to Gabor, the number one reason is that users just don't ask for more modules, maybe because many users are not used to talking to their distribution. Another obvious reason is that it's time consuming to package and maintain a Perl module. This issue could be solved by further automation and better integration of the packaging tools and the CPAN toolchain. "We should also catch license or dependency issues earlier. As far as I know, only Fedora and Debian have a dedicated Perl packaging mailing list: Fedora-perl-devel-list and debian-perl, respectively." But on the other hand, it's not the quantity that counts, and many modules are not worth packaging: "Therefore, the Perl community needs a better way to indicate the quality and the importance of a CPAN module, as a guideline for distributions. Importance can be measured in different ways: by popularity, by the number of external (non-CPAN) dependencies, by the number of packages depending on this package, and so on."

Apart from solving the usual communication problems (upstream CPAN authors who don't respond or even disappear, patches and bugs reported downstream that don't always reach upstream, and so on), Gabor has some suggestions for both upstream and downstream. For example, distributions could supply more data directly to CPAN, such as the names and versions of the CPAN modules they include as packages, the list of bugs reported to the distribution, and the list of downstream patches. The Perl community could then gather this data and display it on one web site. CPAN itself could also be improved for better downstream integration. For example, there could be a "pass-through CPAN installation" where CPAN.pm would use the distribution's native package manager where possible. Using this installation type, CPAN could also report missing packages and gather statistics about these modules for distributions.

RPM packaging collaboration

Pavol Rusnak of the openSUSE Boosters Team shared his view on RPM packaging collaboration. The biggest issue here is that many RPM-based distributions use their own distribution-specific macros in RPM spec files. However, a couple of them have already been unified between distributions. For example, Fedora and openSUSE originally defined completely different macros for Python packaging, but the Fedora macros have now been adopted in upstream RPM, and openSUSE 11.2 is using them as well. The make install differences have also been resolved: while openSUSE, Mandriva, and Fedora initially used different macros, upstream RPM introduced the %make_install macro, which is equivalent to make install DESTDIR=%{?buildroot}.
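
In a spec file, that unification means the %install section no longer needs distribution-specific incantations; a minimal sketch:

    # Spec file fragment (sketch): works the same on Fedora, openSUSE,
    # and Mandriva once the unified macro is available.
    %install
    %make_install
    # ...which expands to roughly: make install DESTDIR=%{?buildroot}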

However, there are still a lot of differences that make porting RPM spec files a challenge. For example, handling desktop files is done differently in Fedora and Mandriva than in openSUSE: they have different BuildRequires dependencies and the content of the install macro differs. Pavol suggested unifying this procedure and pushing things to upstream RPM if some macro is still needed. Ruby packaging also uses different macros, and here too, Pavol suggests creating common macros upstream and using them consistently in distributions. "However, different distributions have a different mindset, so it's not always easy to find a solution that everyone likes." The same goes for parallel builds: Fedora starts them as make %{?_smp_mflags}, while openSUSE uses make %{?jobs:-j%jobs}. And so on.
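
The divergence is easy to see by putting the two %build sections side by side; this sketch just places the inline snippets from above in context:

    # Fedora-style %build section:
    %build
    %configure
    make %{?_smp_mflags}

    # openSUSE-style %build section (at the time of the talk):
    %build
    %configure
    make %{?jobs:-j%jobs}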

Pavol ended his presentation with some ideas for future RPM development. For example, Panu Matilainen is working on file triggers, which will help packagers get rid of a large number of scriptlets in spec files, such as calls to ldconfig, gtk-update-icon-cache, and so on. File triggers, which allow running a script when a file is added or removed, are already implemented in Mandriva. Pavol also suggests introducing two new scriptlets, %preup and %postup, which would be called when updating a package; the %preun, %postun, %pre, and %post scriptlets would not be run in that case. That way, package writers no longer have to write awkward code such as if [ "$1" -eq "0" ] to detect whether an operation is an upgrade or an install. In the future, scriptlets will also get more information about the running transaction, which will make it possible to detect more precisely what is happening so that, for example, a package could convert a configuration file when upgrading from Apache 1.x to Apache 2.x.
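
The "awkward code" in question is the argument that RPM passes to scriptlets: it is the number of instances of the package that remain installed after the operation, so spec files have to compare it against 0 or 1 to guess what is going on. A sketch of today's situation (the comments are placeholders for real install, upgrade, or removal work):

    %post
    if [ "$1" -eq 1 ]; then
        : # first installation of the package
    else
        : # upgrade; upgrade-only work has to be guarded like this today
    fi

    %postun
    if [ "$1" -eq 0 ]; then
        : # the package is being removed for good, not upgraded
    fi

With dedicated %preup and %postup scriptlets, the upgrade case would get its own hooks and these checks could simply go away.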

Learning from each other

When it was announced that distributions wouldn't get their own developer room at FOSDEM 2010 but would meet together in the cross-distribution rooms, many people were skeptical about the idea. Some of them liked the idea of cross-distribution rooms, but were disappointed to see the separate distribution-specific rooms go away. But all in all, many visitors seemed to find the new concept a great idea, and there were good cross-distribution discussions at FOSDEM 2010, although in many of the talks there were not enough people from enough distributions to really get the ball rolling. Packaging and working with upstream are important activities where all distributions are confronted with the same issues and where they can learn from each other and share good ideas. Your author has heard rumors that a lot of developers didn't come to FOSDEM this year because their favorite distribution didn't get a dedicated room, which is a pity: being able to learn from other distributions would have made their own distribution even better.



'Upstream'

Posted Feb 18, 2010 11:50 UTC (Thu) by epa (subscriber, #39769) [Link]

The word 'upstream' appears many times in this article and with seemingly different meanings. GNOME is an upstream for Ubuntu, and so is Debian; but then users have a choice of filing a bug either 'upstream' or in Debian. Apparently, 'new GNOME features sometimes depend on arbitrary upstream choices such as X running on the first virtual terminal', but that to me seems like a choice made 'downstream' by whoever is packaging GNOME, and this meaning of 'downstream' is used later on in the article.

Perhaps a brief explanation of what exactly these terms mean would have helped, or in some cases, write them out in longhand as 'the Linux distribution building packages' and so on.

FOSDEM'10: distributions and downstream-upstream collaboration

Posted Feb 18, 2010 12:59 UTC (Thu) by nim-nim (subscriber, #34454) [Link]

I didn't get the impression that Debian and Ubuntu were leading the way on patch tracking. Vincent insisted heavily on them because they finally put in place a system that gave outsiders a view on their patches, while other distros have had a public VCS for years (http://cvs.fedoraproject.org/viewvc/ http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/ etc)

FOSDEM'10: distributions and downstream-upstream collaboration

Posted Feb 19, 2010 1:22 UTC (Fri) by kfogel (guest, #20531) [Link]

Hi, nim-nim. Patch tracking is different from just having the distro in a version control system. Patch tracking is about discovering what patches have been contributed to solve particular bugs, and about getting those patches to the right places -- the "right place" often being upstream (where "upstream" means the point of origin for the code in question).

For example, one result of a recent Ubuntu Developers Summit was the Launchpad Upstream Improvements spec [1], whose rationale is: "Upstreams frequently complain that it is difficult for them to find Ubuntu patches. patches.ubuntu.com is a diff between Ubuntu and Debian, so it's not really useful to some upstreams."

In other words, upstreams want to more easily find patches attached to Ubuntu bugs, since sometimes those patches are upstreamable even if the bug report itself has not been linked (formally or informally) to the upstream tracker. One result of this is the "patch tracking" feature coming in Launchpad [2]. I think after reading [1] and [2] you'll understand how patch tracking is about more than putting the distribution in a version control system. ([2] links to some screenshots of the feature, by the way.)

Best,
-Karl

[1] https://wiki.ubuntu.com/Specs/LaunchpadUpstreamImprovements
[2] https://dev.launchpad.net/Bugs/PatchTracking

FOSDEM'10: distributions and downstream-upstream collaboration

Posted Feb 19, 2010 8:31 UTC (Fri) by nim-nim (subscriber, #34454) [Link]

It may be but that was not what was told @fosdem. The article is distorting the Fosdem presentation

FOSDEM'10: distributions and downstream-upstream collaboration

Posted Feb 18, 2010 23:35 UTC (Thu) by jengelh (subscriber, #33263) [Link]

>Ubuntu users who want to file a bug have a choice between three options. They can file a bug upstream, where they might get flamed; they can file a bug in Debian, where they are very likely to get flamed; or they can file a bug in Ubuntu's Launchpad, where they are very likely to get ignored.

Upstream reaction is not just flames.

(a) the D/U shipped version is numerically outside of the support scope (including, but not limited to, the linux kernel)

(b) the shipped version is considered too patch-ed-y

(c) reject option. ("We are sorry we do not officially support $DISTRO && $THEIR_OLD_VERSION")

>Fedora starts them as make %{?_smp_mflags}, while openSUSE uses make %{?jobs:-j%jobs}. And so on.

Use of %jobs is deprecated — http://en.opensuse.org/openSUSE:Packaging_Guidelines — as it can carry less tuning knobs than %_smp_mflags.

No -Werror in CFLAGS

Posted Feb 20, 2010 1:30 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

There should also be no -Werror in CFLAGS, as the project should respect the user's choice

I don't see where -Werror is a user's choice issue.

I learned early in my GNU-using career not to put -Werror in distributed code because it causes lots of needlessly failing builds because of insignificant differences between the user's and the distributor's systems (like compiler version) and causes virtually no productively failing builds.

I do use -Werror unfailingly on my own builds of things that I distribute, though, and make sure there are no warnings on my own system. Maybe that's the user's choice the statement means. But I see no point whatsoever to choosing to add -Werror to a build of someone else's code -- that's just daring the build to fail.

No -Werror in CFLAGS

Posted Feb 21, 2010 13:43 UTC (Sun) by Darkmere (subscriber, #53695) [Link]

What he means ( From my own experience, as an old time Gentoo developer ) is that if a project ships a default Makefile (configure.ac or whatever build system they like) that includes -Werror, they are inviting for a huge headache downstream.

There has been plenty of times where I've been forced to mangle Makefiles because the developer had -Wall -Werror in their Makefiles, and built with an older/different/odder version of GCC/compiler-of-choice than what we did, and thereby causing a ton of failures.

Compilers become pickier. When you release a piece of software to the wild, -Wall and -Werror do not belong in the sources, as two months later, your tarball will fail to build against the current cvs version of GCC, because it suddenly warns in -Wall that you don't have your commas aligned vertically with your parentheses, so the smileys turn the wrong way.
( Or similar . )

The point is, be strict when debugging and developing, be lenient to downstream, don't force downstream to patch Makefiles, configure files and similar just because you ship with -Wall -Werror -D_FAIL_ON_GLIBC22
