Ubuntu introduces phased updates
Updates to existing packages can occasionally introduce regression bugs, which cause considerable turmoil when they hit all of a large distribution's users at the same time. Ubuntu quietly introduced a new mechanism in its 13.04 release that progressively rolls out package updates, pushing each update to a small subset of the total user base first, then steadily scaling up, rather than publishing the update for everyone simultaneously. "Phased updates" (as they are known) are designed to catch and revert buggy package updates before they are propagated out to the entire user community. On the server side, the distribution monitors crash reports in order to decide whether each roll-out should continue or be stopped for repair. The client-side framework has been in place since the release of 13.04, but updates themselves only started phasing in August when all of the server-side components were ready.
Canonical's Brian Murray wrote an introduction to the new roll-out mechanism on his blog shortly after the system went live. The system applies to stable release updates (SRUs) only. SRUs are updates from the main Ubuntu repositories that by definition are supposed to ship with a "high degree of stability" and fix critical bugs. Backport updates, in contrast, can introduce new features from upstream releases and are not supported by Canonical.
On the client end, phased updates are implemented in the update-manager tool, which is Ubuntu's graphical update installation application. The other methods for updating a package, such as apt-get, are not affected by the phased update plan. The rationale is that a user using apt-get to update a package is expressing a conscious intent to install the new version. update-manager, in contrast, periodically checks the Ubuntu package repositories in the background for new updates, so it is a passive tool.
update-manager generates a random number between zero and one for each package, then compares it to the Phased-Update-Percentage value published on the server for that package. If update-manager's generated number is less than the published percentage, then the package will be added to the list of available updates that the user can install. Dependencies for a package are pulled in automatically; if users are in the update group for foo, they do not also have to "re-roll the dice" (so to speak) and wait for libfoo-common as well.
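The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not update-manager's actual code: the function name `is_offered`, the package names, and the percentages are all hypothetical, and the sketch assumes the published percentage is scaled to a fraction before it is compared with a draw between zero and one. It also ignores the dependency handling mentioned above.

```python
import random


def is_offered(phased_percentage, rng):
    """Return True if a draw in [0, 1) falls below the published percentage.

    phased_percentage stands in for the Phased-Update-Percentage value,
    expressed as a number from 0 to 100.
    """
    return rng.random() < phased_percentage / 100.0


# Hypothetical published percentages; these are not real repository data.
published = {"foo": 10, "bar": 50, "baz": 100}

rng = random.Random(2013)  # fixed seed so the sketch is repeatable
offered = [name for name, pct in published.items() if is_offered(pct, rng)]
print(offered)
```

A package at 100% is always offered and a package at 0% never is, which is what lets the server side widen or halt a roll-out by changing a single number.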
As is probably obvious, controlling the value of Phased-Update-Percentage throttles the speed at which an update rolls out. Currently, whenever a new package update is published, the Phased-Update-Percentage begins at 10%. The update percentage is incremented by 10% every six hours if nothing goes wrong, so a complete roll-out takes 54 hours to ramp up to 100% availability.
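The 54-hour figure follows from nine increments of six hours each. The short sketch below reproduces that arithmetic; the function name and its parameters are hypothetical, and the defaults simply encode the values quoted above.

```python
def rollout_schedule(start=10, step=10, interval_hours=6):
    """Yield (hours_since_publication, percentage) pairs until 100% is reached."""
    hours, pct = 0, start
    while True:
        yield hours, pct
        if pct >= 100:
            break
        pct = min(100, pct + step)
        hours += interval_hours


schedule = list(rollout_schedule())
for hours, pct in schedule:
    print(f"{hours:2d} h: {pct}%")

# Hours elapsed when the final (100%) step is reached.
hours_to_full = schedule[-1][0]
```

Changing `step` or `interval_hours` shows how the same mechanism could be tuned for a faster or slower roll-out.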
Alternatively, if something does go wrong with an update, the percentage can be dialed back all the way to zero, at which point the update can be pulled from the repository then debugged to catch and repair whatever regressions it introduced. Regressions are counted based on the number of reports generated by Ubuntu's crash reporter Apport. Apport gathers system data for each crash (stack traces, core dumps, environment variables, system metadata, etc.) and after getting the user's consent, sends a report in to Launchpad. All reports are logged on the Ubuntu error tracker; when a newly released update triggers error reports that were not present with the previous version of the package, the Ubuntu bug squad will pull the update. When an update is pulled, both the package signer and the package uploader (who may, of course, be the same person) are notified via email.
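The core test described here, whether an update produces error reports that the previous version did not, amounts to a set difference. The sketch below is a simplified illustration under stated assumptions: the crash signatures are made-up strings, the function names are hypothetical, and the automatic drop to zero stands in for a decision that, according to the description above, is made by the Ubuntu bug squad.

```python
def new_crash_signatures(previous_reports, current_reports):
    """Return signatures reported for the new version but never for the old one."""
    return set(current_reports) - set(previous_reports)


def next_percentage(current_pct, previous_reports, current_reports, step=10):
    """Drop to 0 if the update brought new crash signatures, else step up."""
    if new_crash_signatures(previous_reports, current_reports):
        return 0
    return min(100, current_pct + step)


# Hypothetical crash signatures for two versions of a package "foo".
old = ["foo:SIGSEGV:render", "foo:AssertionError:parse"]
new = ["foo:SIGSEGV:render", "foo:SIGABRT:init"]

print(sorted(new_crash_signatures(old, new)))
print(next_percentage(30, old, new))   # new signature seen, so the update is pulled
print(next_percentage(30, old, old))   # nothing new, so the roll-out continues
```

Crashes that were already being reported against the previous version do not count against the update in this sketch; only signatures that first appear with the new version do.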
In addition to the error tracker, the phased update process is exposed through several other Ubuntu services. The current update percentage is tracked on the publishing-history page for each package (a page that was already used to show publication data and status information for each package). There is also a phased update overview page where one can see the current status of every SRU in the phasing process.
At the moment, the overview page only has data going back until August 7 (two weeks ago as of press time), so naturally there are only a handful of SRUs included. There are currently three updates at the 90% level, five at 80%, and two at 0%—indicating that they have been pulled. Those packages are the BAMF support library for Unity and—perhaps ironically—Apport. Ironic or not, the "Problems" column of the overview page links to the error reports for the package in question. For privacy reasons, the individual reports are only visible to approved members of the bug-triaging team. In an email, Murray said that the phased update system has caught five distinct regressions since its launch on August 7, and that nine package updates have progressed completely to the 100% distribution phase.
Five regressions caught may not seem like many, but in the context of Ubuntu's large installed user base, catching them before they are distributed to the entire community is likely to have averted several thousand application crashes. In his blog post on phased updates, Murray commented that the system supports some corner cases, such as not stopping an update if the team knows that the crashes it sees were not introduced by the update itself. He also pointed out that the system is new, so the team is still experimenting with the various parameters (such as the speed of roll-out itself and the utilities used to detect regressions introduced by a package).
The other interesting dimension of the system is that the subset of users who get access to the updated package at each phase is a random sample. That should ensure that the error reports come from a more statistically valid set of machines than, say, a self-selected "early adopter" group or a set of customers paying for first access.
The notion of being the first person to test out an update may make
some users uncomfortable (at least some in the comments on Murray's
blog post suggested as much), but it is important to remember that the
updates being phased in are the SRUs, not experimental updates. SRUs are
already required to go through a testing and sign-off process, so they
should be stable; the fact that there are sometimes still errors and
regressions is simply a fact of life in the software world.
Nevertheless, Murray's post says it is possible to opt out of the
phased update system entirely by adding a directive to
/etc/apt/apt.conf. Opting out means that
update-manager will only report updates as available when
they reach the 100% phase, by which point they should be more
error-free. Alternatively, the impatient can simply use
apt-get and install all updates immediately.
Posted Aug 22, 2013 6:27 UTC (Thu)
by brooksmoses (guest, #88422)
[Link] (2 responses)
Posted Aug 22, 2013 18:03 UTC (Thu)
by Baylink (guest, #755)
[Link]
That rule is itself a special case of "get the glue right".
Posted Aug 22, 2013 19:13 UTC (Thu)
by wtanksleyjr (subscriber, #74601)
[Link]
(That was sarcasm. I'm dealing with the last two months of Microsoft's security patches.)
Posted Aug 22, 2013 18:02 UTC (Thu)
by Baylink (guest, #755)
[Link] (1 responses)
Yup; that's system administrator practice going all the way back to the Yellow book. Nice to see it showing up in repo updaters.
Now, if I could only decide myself which of my machines go in which group.
Posted Aug 22, 2013 21:55 UTC (Thu)
by dlang (guest, #313)
[Link]
so you have a dev copy that's a plain clone of upstream, then after testing, you copy the new packages to your QA repo and update all your QA boxes from that, then after things pass in QA, you copy the packages to your production repo and update your production boxes.
if you care about which boxes are in which category, then you don't want the random behavior of this anyway, but you don't need it.
One benefit of this being server-side per-package....