Fedora looks to prevent upgrade disasters

January 28, 2009

This article was contributed by Bruce Byfield

The Fedora project is getting creative about ways to ensure that updates cause fewer problems for users. In the past six weeks, project members have floated over half a dozen ideas about how to achieve this goal on the fedora-devel-list alone — and, no doubt, other, unrecorded ones on chat channels, private emails, and at FUDCon, the project's user and developer conference, held in mid-January. Which of these ideas will be implemented is still undecided, but the discussion is a treasury of ideas, as well as a vivid glimpse into the considerations involved with running one of the largest GNU/Linux distributions.

The discussion began in early December 2008, when an update to D-Bus, a core package that carries messages between applications, broke numerous packages when applied to Fedora 10. Users were particularly concerned because installing the update left them unable to use PackageKit, Fedora's desktop tool for package upgrades. Fedora was quick to issue instructions for fixing the problem, but the project's developers appear to have been galvanized by the incident and are determined to avoid a repeat.

Very likely, the response was affected by Fedora's problems in the last six months, including the still-mysterious security crisis that lasted 26 days last August and September, and the need to adjust release schedules because of the security problems. With these events fresh in everybody's minds, Fedora members may well have felt pressure to prove themselves by responding effectively. In fact, the quickness of the responses might suggest that the Fedora community was still in crisis mode from the earlier crises.

It was also worrying that, early in the response to the D-Bus crisis, Fedora developers were openly admitting that they lacked a complete understanding of what was affected. "Does anyone have an understanding of exactly what is broken [and] what isn't?" developer Ian Amess asked, and, in the following discussion, it appeared that nobody did. At times, developers were reduced to anecdotal reporting, such as Arjan van de Ven's report that "I have a strong suspicion that the kerneloops applet is broken (based on a sharp drop of incoming reports since a few days)." Without thorough information, Fedora troubleshooters were unable to say whether the fastest way to offer repairs was to issue an update, or to regress to an earlier version of D-Bus.

With so much uncertain, plans to avoid a recurrence began to be suggested even before the immediate problem was solved. One of the first suggestions on fedora-devel-list came from Kevin Kofler, who advocated reverting to the previous version and changing the version of D-Bus only with new Fedora releases. Similarly, a simultaneous thread discussed creating a list of key packages that should receive priority in Fedora quality assurance, with Will Woods suggesting that the list include yum, NetworkManager, GRUB, and the kernel, along with all of their dependencies.

Yet another discussion centered on the karma system, in which developers vote on the readiness of packages in quality assurance. As summarized by Michael Schwendt, the consensus in this discussion was that several communication problems existed: maintainers could choose the urgency of the bug notifications for their packages, responses to bugs were left to maintainers' judgment, and so were efforts to coordinate testing between maintainers whose packages shared dependencies. In other words, responses to problems were not uniform, and neither quality standards nor expectations of cooperation existed. Instead, the response was left to the conscientiousness of each maintainer.

In addition, submitters could vote on the packages they submitted themselves, potentially reducing the scrutiny of others. Nor did the Fedora system have any minimum level of karma that signaled when a package was ready to be added to the stable repositories, leaving the decision once again to the standards of the maintainers. Further insight into Fedora quality assurance was given by Luke Macken on his blog, where he calculated that the majority of packages were released for general use in as little as six days, often simply at the maintainer's request. These statistics might suggest that quality assurance is less rigorous than it could be.
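
The two gaps described here, self-votes and the lack of a karma threshold, can be illustrated with a small Python sketch. It does not describe how Fedora's update system actually works: the Vote record, the function names, the account names, and the threshold of 3 are all hypothetical, chosen only to show what a stricter policy might look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str  # account name of the person voting (hypothetical)
    value: int  # +1 for "works for me", -1 for "broken"


def net_karma(votes, submitter):
    """Sum the votes, ignoring any cast by the update's own submitter."""
    return sum(v.value for v in votes if v.voter != submitter)


def ready_for_stable(votes, submitter, min_karma=3):
    """Return True only when independent karma reaches the threshold.

    min_karma=3 is an assumed value, not a Fedora policy.
    """
    return net_karma(votes, submitter) >= min_karma


# The maintainer's own +1 is discarded, so two independent testers
# are not enough to reach the assumed threshold of 3.
example_votes = [
    Vote("maintainer", +1),
    Vote("tester1", +1),
    Vote("tester2", +1),
]
print(ready_for_stable(example_votes, submitter="maintainer"))
```

Under these assumptions, a package would stay in testing until enough people other than its submitter had confirmed that it works, instead of moving to stable at the maintainer's request alone.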

As discussion continued over the weeks, other threads discussed innovations that might prevent recurrences. Arthur Pemberton advocated what he called a "Fedora Com System," a kind of hot line on the desktop that would allow Fedora leaders to communicate directly with users. However, others maintained that fedora-announce-list already provided a similar service, especially if users subscribed to it via an RSS feed.

Other comments raised additional possibilities. Steven Moix suggested creating an alias for yum, the basic command used by Fedora for package management, so that it would always use the --skip-broken option. In this way, problematic packages would not be installed or added as updates, and users would be left with intact systems. Others, though, rejected this idea because it could still leave users without the functionality they needed. Moreover, if broken packages were not installed, they might easily go unreported unless users paid close attention to the output of PackageKit or yum.
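
The alias idea amounts to adding one option to every yum invocation. As a minimal sketch, and not anything proposed on the list, the same effect can be expressed in Python; the function name with_skip_broken is hypothetical, and the code only builds the command line rather than running yum.

```python
def with_skip_broken(args):
    """Return a yum command line that always includes --skip-broken.

    Only the argument list is built; actually running it (for example
    with subprocess.run) is left out of this sketch.
    """
    args = list(args)
    if "--skip-broken" not in args:
        # Add the option once, ahead of the subcommand.
        args.insert(0, "--skip-broken")
    return ["yum", *args]


print(with_skip_broken(["update"]))
```

The objection raised in the discussion still applies to a wrapper like this one: packages that are skipped do not cause a failure, so nothing here brings them to the user's attention.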

In much the same way, another contributor suggested that every second or so Fedora release be a stable version, so users could choose whether they wanted a bleeding edge operating system or a reliable one. This solution might help to compensate for Fedora's relatively short life cycle for each version, a choice that some users perceive as undesirable compared to the policies of other distributions. Others, though, shot down the idea as not only overly-ambitious but unnecessary, on the grounds that the Red Hat Enterprise Linux and CentOS distributions already provided stable versions of the same code as Fedora.

As discussion continued over Christmas and into the New Year, one of the most interesting proposals was Jesse Keating's idea of appointing what he called "proven packagers." In Keating's view, proven packagers would be experienced, well-respected experts in package management, the kind whom "you would trust fully with any of the packages you either maintain or even just use." Proven packagers would have a roving brief, and be ready to mentor or intervene as needed, "always with a desire to improve the quality of Fedora." Expressing misgivings that the status might be too easy to attain, Robert Scheck emphasized that proven packagers should not be appointed by a single person, and "should be persons well known to the community and having some presence" so that they could operate more effectively.

This is only a summary of a dozen threads and hundreds of responses. Still, it gives some sense of how the Fedora community is analyzing itself in the aftermath of the D-Bus disaster. At least on the evidence found in fedora-devel-list, Fedora members might be criticized for not looking to other distributions for solutions, and for the fact that, so far, only the proven packagers suggestion is visibly moving forward. All the same, the creative open-mindedness and the general politeness in the discussions might still provide Fedora with the solutions it needs to weather its latest engineering and marketing disaster and prevent similar problems in the future.



Posted Jan 29, 2009 8:38 UTC (Thu) by michaeljt (subscriber, #39183)

Proven packagers... have they been looking at Debian by any chance?

Posted Jan 29, 2009 15:17 UTC (Thu) by rahulsundaram (subscriber, #21946)

Sure. Communities constantly learn from each other. Note that there are a few key differences between them, as explained in

http://andrewprice.me.uk/weblog/entry/to-sponsor-the-pack...

Posted Jan 29, 2009 11:12 UTC (Thu) by motk (subscriber, #51120)

What are these 'disasters' you keep speaking of? You keep using that word.

Posted Jan 29, 2009 14:17 UTC (Thu) by skvidal (subscriber, #3094)

Just for clarity: I'm not sure who Ian Amess is, but he doesn't appear to be registered in the Fedora Account System, which means that while he might be a developer somewhere, he's not a developer working on Fedora directly.

Posted Jan 29, 2009 16:20 UTC (Thu) by michaeljt (subscriber, #39183)

Is there any way of making DBus slightly less system critical, so that apps and tools that depend on it have some sort of fallback if it doesn't work? That might also help. This is not just Fedora of course - I tried stopping DBus on an Ubuntu system just to see, and everything came crashing around my ears :)

Posted Jan 29, 2009 22:50 UTC (Thu) by zooko (subscriber, #2589)

Apparently Nexenta has this tool named "apt-clone" which does a ZFS snapshot of your system before installing the .deb. Apparently it can even recover from a bad libc or whatever, since apt-clone adds an item to your grub menu to boot from the ZFS snapshot. Is that not awesome? It sounds great. I should try it sometime.

Posted Jan 30, 2009 1:15 UTC (Fri) by walters (subscriber, #7396)

There were a lot of ideas tossed about, but ultimately there's little tools can do if developers do completely stupid things. And I did a very stupid thing in a moment of thoughtlessness.

Should the difference between testing and stable be more than just a drop box? Should there be peer review of updates? Should we have different policies for core vs addon packages? (to this one, yes, obviously). There's lots of things we could do and should do, but again fundamentally it's just hard to protect against idiocy.

But yes, things are happening. I think we'll see future updates tested more consistently. Jesse has been doing good work trying to reduce the number of things we update at all, etc.

Posted Feb 2, 2009 0:18 UTC (Mon) by louie (subscriber, #3285)

It's actually not that hard to protect against idiocy. This is the kind of thing a well-tended updates-testing should have caught, but as I've bitched about in the past, Fedora doesn't do that very well. (This may be a special case if it was a security release, but even that should still have gotten 6-12 hours of testing from someone, which would have caught this.)

Posted Feb 1, 2009 4:25 UTC (Sun) by andyp (subscriber, #48701)

Having gained experience with packaging in Ubuntu, Debian and Fedora my perception is that Debian and Ubuntu's policy of disallowing stable release updates for anything other than security and major disasters is their strength.

However, they can only maintain this strict policy without grinding to a halt because they have the mid-way point of Debian Unstable to filter out many of the bad eggs. For those not familiar with the paths packages take into stable releases for these distros here's a simple overview:

Debian: (Experimental)-> Unstable-> Testing (Frozen for releases)

Ubuntu: (Experimental)-> Unstable-> [Ubuntu+1] (Frozen for releases)

Fedora: Rawhide (Frozen for releases)

(I've parenthesised Experimental because it's not a necessary step, but is handy to use as a packaging sandbox. [Ubuntu+1] is named whatever crazy alliterative animal name Mark dreams up next.)

In conclusion, my intuition tells me that, due to the paths and QA points packages go through to reach stable in each distro, I should expect Fedora stable releases to be as stable as Debian Unstable. Maybe an ex-cosmonaut millionaire should come and base a new distro on stabilised snapshots of Fedora (I jest, I jest...).

Posted Feb 5, 2009 9:48 UTC (Thu) by Tet (subscriber, #5433)

Others, though, shot down the idea as not only overly-ambitious but unnecessary, on the grounds that the Red Hat Enterprise Linux and CentOS distributions already provided stable versions of the same code as Fedora

I won't comment on the necessity of the original proposal. However, claiming RHEL/CentOS provide stable versions of the same code as Fedora is nonsense. They provide stable versions of the same code as obsolete versions of Fedora. For example, the current release of RHEL is based on Fedora 6. The oldest supported Fedora release is 9, and that only has a brief time left to live.

I know that this is by design. But the gap between RHEL5 and RHEL6 has been too large, and it's now difficult to recommend RHEL for pretty much anything. For many things, it's now virtually impossible to develop something on a Fedora desktop and deploy it on a RHEL server because the difference between the two is simply too great.

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds