What's happening at Ubuntu: from X.org updates to upstart

Last week a number of Ubuntu users saw something they never expected to see, a "Linux Blue Screen of Death". A patch to the xorg-server package inadvertently broke the windowing environment on some Ubuntu 6.06 LTS systems. The faulty patch was available for download for about 17 hours beginning Monday August 21 and ending on August 22 at 10:00 UTC. After that time the patch was removed and the mirrors temporarily disabled to prevent others from downloading the faulty package.

The problem did not corrupt or lose any data and affected users still had access to the system console. There were no security vulnerabilities associated with this problem. All in all it was not terribly serious, but for many users unused to the command line it may have seemed serious. More information can be found on this page. Instructions for fixing affected systems are also available.

Mark Shuttleworth had this to say:

An incident report is being compiled by the team and we will publish that for our broader community and users as soon as it is complete. My apologies to those who have been affected, I know that a blue screen of death is the very last thing anybody ever wants to see on Linux desktops and that any downtime caused by mistakes on our part, even measured in minutes, is unacceptable....

If there is a silver lining to the error, it is that it happened during the one week in six months when we have the core distribution development team together in one place. This gave us the opportunity not just to analyse and fix the issue, and to talk about the sequence of events that led to the problem, but also to discuss the processes we must improve to further reduce the likelihood of a repeat. The team is now more aware than ever of the responsibility we assume given the extraordinary rate of adoption of Ubuntu.

Some more exciting news from Ubuntu is that of an Upstart in Universe. Upstart is an event-based init daemon, designed to replace sysvinit and other startup daemons.

Modern computers are more flexible: USB devices and network devices can be plugged in and removed at any point; some devices may need to load firmware after detection but before use by the system; mounting a partition listed in /etc/fstab may require tools that live on a network filesystem such as /usr, which in turn requires networking to be brought up first; and so on. Upstart is designed to dynamically order the startup sequence based on the configuration and hardware found as it goes along.
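
The idea of ordering startup by events rather than by a fixed sequence can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the class `EventInit`, the event names and the `started/<job>` naming are all hypothetical, and none of it reflects Upstart's actual code or job syntax.

```python
from collections import defaultdict, deque


class EventInit:
    """Toy dispatcher: a job starts when the event it waits for is emitted."""

    def __init__(self):
        self.jobs = defaultdict(list)  # event name -> jobs waiting on it
        self.started = []              # jobs in the order they were started

    def add_job(self, name, start_on):
        self.jobs[start_on].append(name)

    def emit(self, event):
        # Starting a job emits a follow-up event, which may start more jobs.
        queue = deque([event])
        while queue:
            current = queue.popleft()
            for job in self.jobs.get(current, []):
                if job not in self.started:
                    self.started.append(job)
                    queue.append(f"started/{job}")


def demo():
    init = EventInit()
    # Jobs are registered in an arbitrary order; events decide the real order.
    init.add_job("mount-usr", start_on="started/networking")
    init.add_job("networking", start_on="net-device-added")
    init.add_job("load-firmware", start_on="net-device-detected")
    init.emit("net-device-detected")
    init.emit("net-device-added")
    return init.started


if __name__ == "__main__":
    print(demo())
```

In this sketch the network filesystem is mounted only after networking has started, regardless of the order in which the jobs were declared.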

The current plan is to introduce upstart in stages:

  1. Principal development; implement a daemon that can manage jobs as described.
  2. Replace /sbin/init while running the existing sysv-rc scripts.
  3. Replace /etc/rcS.d scripts with upstart jobs.
  4. Replace other daemons' scripts on a package-by-package basis.
  5. Replace cron, atd, anacron and inetd with the end result of having a single place to configure system jobs.
  6. Modification of other daemons and processes to send events to init instead of trying to run things themselves.

According to the current plan, upstart will be at least part way into stage 3 by the time edgy is released. "From the start of development of edgy+2, no new packages will be accepted unless they provide upstart jobs instead of init scripts and init scripts will be considered deprecated."

The upstart package is available in the Ubuntu universe and experienced edgy users are invited to test it. Install the package and follow the instructions in /usr/share/doc/upstart/README.Debian to add a boot option that will use upstart instead of init. "If your system boots and shut downs normally (other than a slightly more verbose boot without usplash running) then it is working correctly." They don't mention it, but, should the system respond with a blue screen of death, it is not working correctly.



upstart as cron replacement

Posted Aug 31, 2006 14:47 UTC (Thu) by kh (subscriber, #19413) [Link]

I'm not sure I understand the reasoning behind having the new init daemon also replace cron. I'm not saying cron should not be replaced; there are improvements I would like to see. But especially on a multiuser system, I like the separation of the two daemons. (..."do just one job"....)

upstart as cron replacement

Posted Aug 31, 2006 15:31 UTC (Thu) by shane (subscriber, #3335) [Link]

I think the idea makes sense. From the upstart blog:
In fact, the goal is that upstart should also replace the "run event scripts" functionality of any daemon on the system. Daemons such as acpid, apmd and Network Manager would send events to init instead of running scripts themselves with their own peculiar configuration and semantics.
It is actually simpler to have a single syntax for any "event" than it is to have a different one for init, cron, and so on. The Unix philosophy of "do one thing" only works if you have a simple, elegant way to combine the tiny, sharp programs. Upstart actually seems like a chance to extend this philosophy to more parts of a modern system.

upstart as cron replacement

Posted Aug 31, 2006 16:57 UTC (Thu) by smurf (subscriber, #17840) [Link]

Two reasons.

  • cron is time-based. It recently acquired the ability to run scripts after startup, but it can't run scripts at shutdown time, it doesn't know that running updatedb et al. on battery may not be a good idea, it doesn't know not to run my database sync job until *after* mysql has started up, ...

    It makes sense to put all of that into one configuration file.

  • cron doesn't know whether your job is still running. If so, presumably it should not start another. Right now, most people blithely assume that their scripts get done fast enough, and the others have had to clean up after 25 identical cronjobs and are now using some sort of insecure ad-hoc locking. ;-)
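
For readers who have not met the problem, the kind of ad-hoc locking a cron job typically resorts to looks roughly like the Python sketch below. It is an illustration only, assuming a Unix system where `fcntl.flock` is available; the helper `try_lock` and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def demo():
    lock_path = os.path.join(tempfile.mkdtemp(), "sync-job.lock")
    first = try_lock(lock_path)    # first run of the job gets the lock
    second = try_lock(lock_path)   # overlapping second run is refused
    first_ok, second_ok = first is not None, second is not None
    first.close()                  # closing the file releases the lock
    third = try_lock(lock_path)    # a later run can proceed again
    third_ok = third is not None
    third.close()
    return first_ok, second_ok, third_ok


if __name__ == "__main__":
    print(demo())
```

Every job has to carry this boilerplate itself, which is the duplication that a job manager aware of running jobs could remove.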

More info is available here.

Note that replacing cron will happen after Edgy is released.

upstart as cron replacement

Posted Aug 31, 2006 19:41 UTC (Thu) by vmole (subscriber, #111) [Link]

You've given two good reasons why cron needs to be replaced. I'm still not convinced that it makes sense to combine it with init. Yes, init is conceptually the right place to do stuff on startup and shutdown, and providing users a good way to do this would be good. But for periodic jobs, I don't see why init is a better fit.

Now I'll actually go read about Upstart and learn why I'm wrong :-)

upstart as cron replacement

Posted Aug 31, 2006 21:05 UTC (Thu) by smurf (subscriber, #17840) [Link]

init has the unique property that it gets SIGCHLD for all processes which happen to die after their parent. Thus it is able to keep track of exactly when a job is finished, even if that job daemonizes or forks itself. (For the most part.)
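
The reparenting behaviour described here can be observed from user space. The following Python sketch is an illustration under assumptions: it needs a Unix system with `os.fork`, the function name is hypothetical, and on a modern Linux system the new parent may be a designated "subreaper" process rather than PID 1.

```python
import os


def orphan_new_parent():
    """Fork a child that forks a grandchild, let the child exit, and return
    (child_pid, the grandchild's parent PID once it has been orphaned)."""
    report_r, report_w = os.pipe()  # grandchild -> caller
    gate_r, gate_w = os.pipe()      # caller -> grandchild
    child = os.fork()
    if child == 0:
        if os.fork() == 0:
            # Grandchild: wait until the caller says our parent is gone.
            os.read(gate_r, 1)
            os.write(report_w, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)  # intermediate child exits at once
    os.close(report_w)
    os.close(gate_r)
    os.waitpid(child, 0)            # child has exited; grandchild is orphaned
    os.write(gate_w, b"x")          # release the grandchild
    os.close(gate_w)
    new_parent = int(os.read(report_r, 32))
    os.close(report_r)
    return child, new_parent


if __name__ == "__main__":
    child, new_parent = orphan_new_parent()
    print(f"original parent was {child}, orphan now belongs to {new_parent}")
```

The caller never learns when the grandchild exits; only the process that adopted it is notified.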

This means that you don't have to futz around with either putting your jobs into the background (standard /etc/init.d/* scripts) or preventing them from doing so (if you use start-stop-daemon, or some other way to catch their PID). init can keep track of the job's state and do the Right Thing, which is presumably what you've told it to do in your config file(s).

This also applies to cron jobs and the locking one might want to apply to their execution.

I don't know if you have looked at the cron code. The thing is ugly, and not just because 19-year-old code shows its age. The twenty #ifdef DEBIANs it has acquired don't exactly help either; the Debian patch, when you run it through wc -l, is larger than the original sources.

In principle, hacking support for the features Upstart offers into cron is certainly possible -- it could monitor the events Upstart generates, and extending the crontab syntax doesn't look too difficult -- but (not only IMHO) a rewrite would be a lot cleaner. The task is too important to do it as a dirty hack.

And if you rewrite it anyway, you might as well put the task into init where it can profit from init's aforementioned special status.

upstart as cron replacement

Posted Aug 31, 2006 22:48 UTC (Thu) by vmole (subscriber, #111) [Link]

I don't know if you have looked at the cron code. The thing is ugly, and not just because 19-year-old code shows its age.

Have I looked at the cron code? Heh. Far too long, far too much. I was the Debian cron maintainer for ~10 years (Javier Fernandez-Sanguino took over last year, bless his soul). You don't have to convince me the code is not pretty. However, a package maintainer can only do so much before the codebase is no longer compatible with patches from others.

The twenty #ifdef DEBIANs it has acquired don't exactly help either; the Debian patch, when you run it through wc -l, is larger than the original sources.

Well, I thought keeping the Debian-specific functionality (as opposed to general fixes) separated was worth the small amount of visual noise; opinions may vary. If I'd realized how big it was going to grow, maybe I would have chosen differently. Regarding the patch size, you can't compare the line count to the original: you're counting both lines removed and lines added (one each for changed lines), plus all the new stuff in the ./debian directory. And there is *a lot* of functionality in the diff: support for cron.d, some support for DST and clock changes, big security fixes.

Defensiveness aside, I certainly wouldn't argue that you should hack Vixie cron - it needs to be re-written from the ground up, designed around whatever features are desired. I'm just not convinced that combining init, cron/at/etc., and inetd is the correct choice, despite (IMO) superficial similarities. In particular, I think maintaining appropriate security domains may be more complicated than one would like.

upstart as cron replacement

Posted Aug 31, 2006 23:13 UTC (Thu) by smurf (subscriber, #17840) [Link]

I'm just not convinced that combining init, cron/at/etc., and inetd is the correct choice, despite (IMO) superficial similarities. In particular, I think maintaining appropriate security domains may be more complicated than one would like.

I'm not entirely sure about that myself; certainly the requirements for a safe /sbin/init replacement are a lot higher than those for anything else in the system, and there's something to be said for keeping the thing as simple as possible.

Personally, I wouldn't be entirely happy writing something that complex, but that's simply because I do like to run my code under strace or gdb in order to figure out why it misbehaves, and I'd have to adopt a couple of different debugging techniques. ;-)
You can't trace the process with PID one (for good reasons, too).

upstart as cron replacement

Posted Sep 3, 2006 8:20 UTC (Sun) by bockman (subscriber, #3650) [Link]

init has the unique property that it gets SIGCHLD for all processes which happen to die after their parent. Thus it is able to keep track of exactly when a job is finished, even if that job daemonizes or forks itself. (For the most part.)

It could export this feature: when an orphan child dies, upstart could notify other interested processes of the event. You would need a registration API and a way to transmit the information. Then an independent cron replacement could use this service.

Maybe. Just an idea. :-)

Ciao
------
FB

upstart as cron replacement

Posted Sep 12, 2006 12:02 UTC (Tue) by robbe (guest, #16131) [Link]

> init has the unique property that it gets SIGCHLD for all processes which
> happen to die after their parent. Thus it is able to keep track of exactly
> when a job is finished, even if that job daemonizes or forks itself. (For
> the most part.)

Let's say init starts a daemon (e.g. apache as pid 100) directly and it daemonizes itself (i.e. forks to pid 105, and pid 100 terminates). Init will then get a SIGCHLD for pid 100 and consider the daemon job "finished". It doesn't know that pid 100 and pid 105 are connected. If the daemon exits or crashes later, init will notice that pid 105 has terminated, but does not connect this fact with the apache job it started sometime in the past. (One can usually suppress daemonization via a command-line argument, though.)

Right now most daemons are started via scripts that set up the environment etc. If init runs this script, it knows its pid and when it finishes. Again, it does not know the pid of the real worker process that the script forks off.

Some of your other comments on this article rely in part on this assumption, which I argue to be wrong...

upstart as cron replacement

Posted Sep 12, 2006 12:28 UTC (Tue) by smurf (subscriber, #17840) [Link]

It's true that per se, init doesn't associate job 105 with the process 100 that it started. But that is easily fixed.

The simplest way is to actually tell init which pid the daemon has -- ideally in the same call which tells it that the daemon in question is now up and running, instead of merely starting up.
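
One way this could look, sketched in Python: the worker that survives daemonization reports its own PID over a pipe it inherited from the supervisor. This is purely an assumption about how such a notification might work, not a description of any interface Upstart provides; the function name and the pipe-based protocol are hypothetical.

```python
import os


def start_daemon_and_learn_pid():
    """Launch a process that daemonizes (forks, then exits) and have the
    surviving worker report its own PID back over an inherited pipe."""
    ready_r, ready_w = os.pipe()
    launcher = os.fork()
    if launcher == 0:
        os.close(ready_r)
        if os.fork() == 0:
            # Worker: announce "I am up, and this is my PID".
            os.write(ready_w, str(os.getpid()).encode())
            os._exit(0)  # a real daemon would keep running here
        os._exit(0)      # launcher exits, as a daemonizing program does
    os.close(ready_w)
    os.waitpid(launcher, 0)  # the process we started is already gone
    worker = int(os.read(ready_r, 32))
    os.close(ready_r)
    return launcher, worker


if __name__ == "__main__":
    launcher, worker = start_daemon_and_learn_pid()
    print(f"started pid {launcher}, but the job to track is pid {worker}")
```

With the reported PID in hand, the supervisor can match a later exit notification for that process to the job it started.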

upstart as cron replacement

Posted Aug 31, 2006 20:50 UTC (Thu) by kh (subscriber, #19413) [Link]

it doesn't know not to run my database sync job until *after* mysql has started up

This seems to me to be a perfect example of what not to put in a cron replacement daemon (because of feature creep). Currently, I believe you could have the database sync job test if mysql is running (and either start it or exit on error status if not - but I at least would want to control that in my script, not have cron decide). I don't want a super daemon constantly monitoring all the other daemons and deciding what cron jobs to run and which not to.

upstart as cron replacement

Posted Aug 31, 2006 22:59 UTC (Thu) by smurf (subscriber, #17840) [Link]

Currently, I believe you could have the database sync job test if mysql is running (and either start it or exit on error status if not - but I at least would want to control that in my script, not have cron decide). I don't want a super daemon constantly monitoring all the other daemons and deciding what cron jobs to run and which not to.

?? The decision which jobs to run would lie with the author of the job, not with upstart: you decide on the rules for your script, not cron or anybody else.

Besides: What do you mean by "constantly monitoring"? If it's an init replacement, it doesn't need to monitor anything, because it knows: it gets the information from the kernel for free, no need to poll anything, much less /proc.

The idea behind integrating all of this in some way is that you can just add the "requires foobar" idea to your cron or at job (remember this will be started after Edgy, so nobody has any concrete ideas about syntax yet).
Then, Upstart will either auto-start the mysql / foobar / whatever service for Joe User when necessary, or delay the cron job.

Joe can't do the former himself, he doesn't have the privileges; the latter is a waste of resources -- if you know beforehand that /usr/sbin/foobard is not running, there's no point in starting Joe's job.

upstart as cron replacement

Posted Sep 1, 2006 1:27 UTC (Fri) by kh (subscriber, #19413) [Link]

?? The decision which jobs to run would lie with the author of the job, not with upstart: you decide on the rules for your script, not cron or anybody else.

You did not understand what I was trying to say. I am not comfortable with the idea of moving the logic for tests (such as "is the database server running?") from the sync (or whatever) script, where it currently resides, to the cron replacement (i.e. the upstart config (crontab?) file). And I do not understand how it is possible to do so with a terse language for the config file, and yet have upstart be aware of every different possible daemon, their states, the system states, etc. That seems like a LOT of new features for cron.

No service-specific tests

Posted Sep 1, 2006 7:28 UTC (Fri) by smurf (subscriber, #17840) [Link]

The test upstart will do, using the mysql example, is not "can I connect to port 3306 and get a sensible reply", but "I have forked off /usr/sbin/mysqld and it hasn't died yet".
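
The difference between the two kinds of test is easy to show. The Python sketch below checks only whether a child that was started has exited, without probing any network port; it is an illustration with hypothetical names, and a sleeping Python process stands in for /usr/sbin/mysqld.

```python
import subprocess
import sys


def is_still_running(proc):
    """True while the child we started has not exited (no service probe)."""
    return proc.poll() is None


def demo():
    # Stand-in for a real daemon: a child process that simply sleeps.
    service = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    alive_before = is_still_running(service)
    service.terminate()
    service.wait()
    alive_after = is_still_running(service)
    return alive_before, alive_after


if __name__ == "__main__":
    print(demo())
```

Nothing here knows what the child does or which protocol it speaks; the parent only learns from the kernel whether the process is still alive.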

Testing for actual services isn't upstart's job; I agree that that makes no sense whatsoever.

NB: I don't know how upstart is going to decide that a service has finished starting up. I assume it'll be something along the lines of "the startup script has exited but some of its children are still running" by default, but I haven't checked.

upstart as cron replacement

Posted Aug 31, 2006 18:12 UTC (Thu) by iabervon (subscriber, #722) [Link]

I think it fits more with the "everything is a file" aspect; cron jobs and init tasks are not sufficiently different that it makes sense to have them be fundamentally different things, like they currently are.

upstart as cron replacement

Posted Aug 31, 2006 19:58 UTC (Thu) by cdmiller (subscriber, #2813) [Link]

I tend to agree with "do just one job...". While the Replacementinit wiki page mentions the potential to replace cron, at, and inetd, at least the initial scope is targeted at init and initscripts. Dynamically ordering them based on hardware detection sounds questionable; most init scripts do not do hardware-specific tasks.

I find it suspect that chkconfig is called "under-implemented" when every RedHat derived Linux and FreeBSD include it.

Attempting to morph a new init/initscripts program designed to be better at system start up, shutdown, and hotplug tasks into a do all multi user cron, at, and an inetd replacement should be fun to watch. I smell security problems. Anybody remember the years of alerts that followed the introduction of xinetd?

My first critical thoughts aside, I hope it does a good job addressing the use cases described. I'll kick back in the armchair and watch, maybe my cynicism will be unfounded :)

Upstart

Posted Aug 31, 2006 17:13 UTC (Thu) by beagnach (guest, #32987) [Link]

So how is upstart going to compare with launchd on OS X?

Upstart

Posted Aug 31, 2006 17:45 UTC (Thu) by musicon (subscriber, #4739) [Link]

That's mentioned on the Upstart page, with additional comparisons to initng and SMF.

Upstart

Posted Sep 5, 2006 3:45 UTC (Tue) by lovelace (subscriber, #278) [Link]

So, out of curiosity, just what exactly is wrong with the Apache 2.0 license of launchd, I wonder?

What's happening at Ubuntu: from X.org updates to upstart

Posted Aug 31, 2006 19:46 UTC (Thu) by stuart (subscriber, #623) [Link]

The best thing about the article is clearly that one should read README.Debian :-)

What's happening at Ubuntu: from X.org updates to upstart

Posted Sep 1, 2006 2:37 UTC (Fri) by sbergman27 (subscriber, #10767) [Link]

"""If there is a silver lining to the error, it is that it happened during the one week in six months when we have the core distribution development team together in one place. This gave us the opportunity not just to analyse and fix the issue, and to talk about the sequence of events that led to the problem, but also to discuss the processes we must improve to further reduce the likelihood of a repeat."""

I'd certainly love to have seen this! A room full of developers seriously discussing, after 2 years of releasing patches, whether it might not be a good idea to do some sort of testing on those patches before releasing them to a user-base of newbies.

It's pretty obvious that no serious testing was done on the xorg patches before they were loosed upon the world.

I like and use Ubuntu. But for shame, Ubuntu devs! For shame!

What's happening at Ubuntu: from X.org updates to upstart

Posted Sep 3, 2006 13:42 UTC (Sun) by cjwatson (subscriber, #7322) [Link]

Obviously, we deserve and are content to receive criticism that this upgrade broke a lot of people's systems. All the same, while it's true that the testing of the X.org patches in question was clearly not *sufficient*, I think you're maligning the Ubuntu team somewhat by skipping over the fact that it only failed on certain classes of machines. The developer responsible for the upload tested it on a number of his own systems with substantially different hardware configurations before release, and had a number of interested users try out the changes as well, and no problems were found in that process. The problem was that it wasn't tried out on a wide enough range of hardware, and that we didn't flag the patch as potentially risky during approval for dapper-updates, not that it was uploaded blindly with no testing.

Much of the discussion within the development team was more about best practices for dealing with emergencies once they arise, rather than "huh, you think we should do some testing, then?". Of course, there are changes that we can and will make to the testing process for stable release updates to make this sort of thing much less likely in the future; there will be details of that in the post-mortem report.

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds