Fedora looks to prevent upgrade disasters
The Fedora project is getting creative about ways to ensure that updates cause fewer problems for users. In the past six weeks, project members have floated over half a dozen ideas about how to achieve this goal on the fedora-devel-list alone — and, no doubt, other, unrecorded ones on chat channels, private emails, and at FUDCon, the project's user and developer conference, held in mid-January. Which of these ideas will be implemented is still undecided, but the discussion is a treasury of ideas, as well as a vivid glimpse into the considerations involved with running one of the largest GNU/Linux distributions.
The discussion began in early December 2008, when an update to D-Bus, a core package that carries messages between applications, broke numerous packages on Fedora 10. Users were particularly concerned because installing the update left them unable to use PackageKit, Fedora's desktop tool for package upgrades. Fedora was quick to issue instructions for fixing the problem, but the project's developers appear to have been galvanized by the incident, and are determined to avoid a repeat.
Very likely, the response was affected by Fedora's problems in the last six months, including the still-mysterious security crisis that lasted 26 days last August and September, and the need to adjust release schedules because of the security problems. With these events fresh in everybody's minds, Fedora members may well have felt pressure to prove themselves by responding effectively. In fact, the quickness of the responses might suggest that the Fedora community was still in crisis mode after those earlier events.
It was also worrying that, early in the response to the D-Bus crisis, Fedora
developers were openly admitting that they lacked a complete understanding
of what was affected. "Does anyone have an understanding of exactly
what is broken [and] what isn't?" developer Ian
Amess asked, and, in the following discussion, it appeared that nobody
did. At times, developers were reduced to anecdotal reporting, such as Arjan
van de Ven's report that "I have a strong suspicion that the
kerneloops applet is broken (based on a sharp drop of incoming reports
since a few days)." Without thorough information, Fedora
troubleshooters were unable to say whether the fastest way to offer repairs
was to issue an update, or to regress to an earlier version of D-Bus.
In this situation, proposals for avoiding a recurrence began to appear even before the immediate problem was solved. One of the first on the fedora-devel-list came from Kevin Kofler, who advocated reverting to the previous version of D-Bus and changing the version only with new Fedora releases. A simultaneous thread discussed the possibility of creating a list of key packages that should receive priority in Fedora quality assurance, with Will Woods suggesting that the list should include yum, NetworkManager, GRUB, and the kernel, along with all of their dependencies.
Yet another discussion centered on the karma system, in which developers vote on the readiness of packages in quality assurance. As summarized by Michael Schwendt, the consensus in this discussion was that several communication problems existed: maintainers could choose the urgency of the notifications of bugs in their packages, responses to bugs were left to maintainers' judgment, and so were efforts to coordinate testing between maintainers whose packages shared dependencies. In other words, responses to problems were not uniform, and neither quality standards nor expectations of cooperation existed. Instead, the response was left to the conscientiousness of each maintainer.
In addition, submitters could vote on the packages they submitted themselves, potentially reducing the scrutiny of others. Nor did the Fedora system have any minimum level of karma that signaled when a package was ready to be added to the stable repositories, leaving the decision once again to the standards of the maintainers. Further insight into Fedora quality assurance was given by Luke Macken on his blog, where he calculated that the majority of packages were released for general use in as little as six days, often simply at the maintainer's request. These statistics might suggest that quality assurance is less rigorous than it could be.
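To make the two gaps described above concrete, here is a minimal Python sketch of a karma check that ignores the submitter's own vote and requires a minimum score before a package is considered ready for the stable repositories. It is not Fedora's actual tooling: the `Vote` type, the function names, and the threshold of 3 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int  # +1 for "works", -1 for "broken" (hypothetical convention)


def net_karma(votes, submitter):
    """Sum the votes, ignoring any cast by the package's own submitter."""
    return sum(v.value for v in votes if v.voter != submitter)


def ready_for_stable(votes, submitter, min_karma=3):
    """Return True once independent votes reach the assumed threshold."""
    return net_karma(votes, submitter) >= min_karma


votes = [Vote("alice", 1), Vote("bob", 1), Vote("maintainer", 1)]
# The maintainer's own +1 is not counted, so only two votes apply here.
print(ready_for_stable(votes, submitter="maintainer"))
```

With the threshold left at its assumed default, the package in this example would stay in testing until a third independent tester approved it.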
As discussion continued over the weeks, other threads discussed innovations
that might prevent recurrences. Arthur
Pemberton advocated what he called a "Fedora Com System,"
a kind of hot line on the desktop that would allow Fedora leaders
to communicate directly with users. However, others maintained that
fedora-announce-list already provided a similar service, especially if
users subscribed to it via an RSS feed.
Other comments raised additional possibilities. Steven
Moix suggested creating an alias for yum, the
basic command used by Fedora for package management, so that it would always
use the --skip-broken option. In this way, problematic
packages would not be installed or added as updates, and users would be
left with intact systems. Others, though, rejected this idea because it
could still leave users without the functionality they needed. Moreover, if
broken packages were not installed, they might easily go unreported unless
users paid close attention to the output of PackageKit or yum.
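The proposal above was for a shell alias; as a rough illustration of the same idea, the following Python sketch builds a yum command line that always carries `--skip-broken`. The function name `with_skip_broken` is hypothetical, and the sketch only constructs the argument list without running yum.

```python
def with_skip_broken(args):
    """Return a yum command line that always includes --skip-broken."""
    args = list(args)  # copy so the caller's list is left untouched
    if "--skip-broken" not in args:
        args.insert(0, "--skip-broken")
    return ["yum", *args]


# Build (but do not execute) the command for a routine update.
print(with_skip_broken(["update"]))
```

As the objections in the thread point out, a wrapper like this only hides the broken packages; someone still has to read the output to notice what was skipped.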
In much the same way, another contributor suggested that every second Fedora release or so be a stable version, so that users could choose whether they wanted a bleeding-edge operating system or a reliable one. This solution might help to compensate for Fedora's relatively short life cycle for each version, a choice that some users perceive as undesirable compared to the policies of other distributions. Others, though, shot down the idea as not only overly ambitious but unnecessary, on the grounds that the Red Hat Enterprise Linux and CentOS distributions already provided stable versions of the same code as Fedora.
As discussion continued over Christmas and into the New Year, one of the
most interesting proposals was Jesse
Keating's idea of appointing what he called "proven packagers." In
Keating's view, proven packagers would be experienced, well-respected
experts in package management, the kind whom "you would trust
fully with any of the packages you either maintain or even just
use." Proven packagers would have a roving brief, and be ready to
mentor or intervene as needed, "always with a desire to improve the
quality of Fedora." Expressing misgivings that the status might be
too easy to attain, Robert
Scheck emphasized that proven packagers should not be appointed by a
single person, and "should be persons well known to the community and
having some presence" in the community so that they could operate
more effectively.
This is only a summary of a dozen threads and hundreds of responses. Still, it gives some sense of how the Fedora community is analyzing itself in the aftermath of the D-Bus disaster. At least on the evidence found in fedora-devel-list, Fedora members might be criticized for not looking to other distributions for solutions, and for the fact that, so far, only the proven packagers suggestion is visibly moving forward. All the same, the creative open-mindedness and the general politeness in the discussions might still provide Fedora with the solutions it needs to weather its latest engineering and marketing disaster and prevent similar problems in the future.
| Index entries for this article | |
|---|---|
| GuestArticles | Byfield, Bruce |
