disk I/O (reading files, searching the 50 places a config file could be before you find the one where it actually is, etc.)
bus access (you can't query a PCI bus for multiple things at the same time)
timeouts (waiting after sending a command to see if something responds).
timeouts frequently combine with bus access, as it may not be safe to do something else until you get a response from the device you just probed for (or decide that it's not really there)
and sometimes you do really have number crunching CPU tasks to do.
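
a rough sketch of the timeout-plus-bus point above, in Python. The names probe and probe_all, the lock standing in for the bus, and the timeout value are all made up for illustration; this is not how any real kernel or driver does it. The point is only that when each probe has to hold the bus until it gets a reply or gives up, the waits add up one after another, and nearly all of the elapsed time is spent sleeping rather than computing.

```python
import threading
import time


def probe(bus_lock, timeout, device_present):
    """Hold the (hypothetical) bus for the whole probe and wait for a reply."""
    with bus_lock:  # nothing else may use the bus until this probe finishes
        reply = threading.Event()
        if device_present:
            reply.set()  # pretend the device answered right away
        # Returns True on a reply, False once the timeout has expired.
        return reply.wait(timeout)


def probe_all(devices, timeout):
    """Probe every device from its own thread; the shared bus serializes them."""
    bus_lock = threading.Lock()
    results = {}

    def worker(name, present):
        results[name] = probe(bus_lock, timeout, present)

    threads = [threading.Thread(target=worker, args=item)
               for item in devices.items()]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.monotonic() - start


if __name__ == "__main__":
    found, elapsed = probe_all({"a": True, "b": False, "c": False}, 0.05)
    print(found, f"{elapsed:.2f}s")
```

even with one thread per device, the two absent devices each cost a full timeout, back to back, because the bus can only be used for one probe at a time. More cores or a faster CPU would not shorten that.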
multi-core systems make a big difference if CPU really is the limiting factor, but that's usually not the case (and IMHO software that has to do a lot of CPU work just to start up probably needs fixing)
yes, when you get down into the low single-digit-second bootup range on a relatively slow CPU (like the eeepc from the talk), you do have to pay attention to the CPU load, but between otherwise similar systems, a faster CPU doesn't make that big a difference on a normal distro bootup