OLS: On how user space sucks
Dave set out to reduce the time it took his Fedora system to boot. In an attempt to figure out what was taking so long, he instrumented the kernel to log certain basic file operations. As it turned out, the boot process involved calling stat() 79,000 times, opening 27,000 files, and running 1,382 programs. That struck him as being just a little excessive; getting a system running shouldn't require that much work. So he looked further. Here are a few of the things he found:
- HAL was responsible for opening almost 2000 files. It will read
various XML files, then happily reopen and reread them multiple
times. The bulk of these files describe hardware which has never been
anywhere near the system in question. Clearly, this is an application
which could be a little smarter about how it does things.
- Similar issues were found with cups, which feels the need to open the
PPD files for every known printer. The result: 2500 stat()
calls and 400 opens. On a system with no attached printer.
- X.org, says Dave, is "awesome." It attempts to figure out where a
graphics adapter might be connected by attempting to open almost any
possible PCI device, including many which are clearly not present on
the system. X also is guilty of reopening library files many times.
- Gamin, which was written to get poll() loops out of
applications, spends its time sitting in a high-frequency
poll() loop. Evidently the real offender is in a lower-level
library, but it is the gamin executable which suffers. As Dave points
out, it can occasionally be worthwhile to run a utility like
strace on a program, even if there are no apparent bugs. One
might be surprised by the resulting output.
- Nautilus polls files related to the desktop menus every few seconds,
rather than using the inotify API which was added for just this
purpose.
- Font files are a problem in many applications - several applications
open them by the hundred. Some of those applications never present
any text on the screen.
- There were also various issues with excessive timer use. The kernel blinks the virtual console cursor, even if X is running and nobody will ever see it. X is a big offender, apparently because the gettimeofday() call is still too slow and maintaining time stamps with interval timers is faster.
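The inotify interface mentioned in the Nautilus item can be exercised directly. Here is a minimal, Linux-only sketch using ctypes; the watched directory and file name are invented for the example, and the IN_CREATE constant is copied from <sys/inotify.h>:

```python
# Watch a directory with inotify instead of polling it: the kernel
# queues an event when something changes, so the process can sleep.
import ctypes
import os
import select
import tempfile

IN_CREATE = 0x00000100  # value from <sys/inotify.h>

libc = ctypes.CDLL(None, use_errno=True)

watched = tempfile.mkdtemp()          # stand-in for a menu directory
fd = libc.inotify_init()
assert fd >= 0
wd = libc.inotify_add_watch(fd, watched.encode(), IN_CREATE)
assert wd >= 0

# Create a file; an IN_CREATE event is queued immediately, with no
# need for the watcher to re-read anything every few seconds.
open(os.path.join(watched, "new.desktop"), "w").close()

r, _, _ = select.select([fd], [], [], 1.0)
print("event ready:", bool(r))
os.close(fd)
```

The select() call returns as soon as the event is queued; a process doing this sits idle until something actually changes.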
There were more examples, and members of the audience had several more of their own. It was all great fun; Dave says he takes joy in collecting train wrecks.
The point of the session was not (just) to bash on particular applications, however. The real issue is that our systems are slower than they need to be because they are doing vast amounts of pointless work. This situation comes about in a number of ways; as applications become more complex and rely on more levels of libraries, it can be hard for a programmer to know just what is really going on. And, as has been understood for many years, programmers are very bad at guessing where the hot spots will be in their creations. That is why profiling tools so often yield surprising results.
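The point about profilers yielding surprises is easy to try at home; here is a small sketch with Python's cProfile (the toy functions are invented for the example, and the relative cost of the two phases is only visible in the profile):

```python
# Profile a toy two-phase program and print the summary line; which
# phase dominates is exactly the kind of thing programmers guess wrong.
import cProfile
import io
import pstats

def build():
    return [x * x for x in range(50_000)]

def format_rows(rows):
    return [f"{i:08d}: {r}" for i, r in enumerate(rows)]

pr = cProfile.Profile()
pr.enable()
formatted = format_rows(build())
pr.disable()

out = io.StringIO()
pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue().strip().splitlines()[0])
```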
Programs (and kernels) which do stupid things will always be with us. We
cannot fix them, however, if we do not go in and actually look for the
problems. Too many programmers, it seems, check in their changes once they
appear to work and do not take the time to watch how their programs
work. A bit more time spent watching our applications in operation might
lead to faster, less resource-hungry systems for all of us.
Index entries for this article
Conference: Linux Symposium/2006
Posted Jul 20, 2006 22:37 UTC (Thu)
by cventers (guest, #31465)
[Link] (7 responses)
I guess I'm not at all surprised that some applications behave so poorly... this is what tends to happen when you stack layers and layers of abstraction on each other.
One of the reasons I find programming _so_ entertaining is because I am challenged by the task of creating the most stable and incredibly efficient solution I can think of. I guess the same is not true of everyone :)
What this does tell me is that there is tons of low hanging fruit if we want to improve power efficiency and performance. So, great hackers, let's get to work!
Posted Jul 20, 2006 23:03 UTC (Thu)
by tomsi (subscriber, #2306)
[Link] (2 responses)
I'm glad we have free systems -- not only is it possible to spot these
problems, it's possible for people not originally involved in the
programs to step in and fix them.
Posted Jul 21, 2006 1:17 UTC (Fri)
by sepreece (guest, #19270)
[Link] (1 responses)
I don't know about Windows, but you could do similar analysis on Unix systems, since 1990-ish, despite their being closed source. Dave's report (which was, indeed, hilarious, and a great lesson for us all) was based on observing the apps from outside (watching their file operations in particular).
Posted Jul 21, 2006 3:40 UTC (Fri)
by cventers (guest, #31465)
[Link]
Right. The difference with free software, though, is that you don't have to wait for the vendor to fix it (or hope that they care enough). If something is really itching, you're 100% empowered to scratch it.
:)
Posted Jul 21, 2006 5:27 UTC (Fri)
by flewellyn (subscriber, #5047)
[Link] (3 responses)
With respect, I'm not at all sure that layers of abstraction are really necessary to create such
a situation. In fact, in my experience, well-abstracted systems where each conceptual layer is
well-defined and efficient in and of itself, tend to have less of these sorts of problems going on.
I've found that inefficiencies and bogosities of the sort described tend to crop up more in poorly
abstracted systems, or where the abstractions are "leaky", exposing too many internals to the
next level up. I know this because I have built some. As a custom web application developer, I have created a number of large, complex systems
that do very complicated things; in the process, I have had to reimplement and refactor many of
the earlier generations of these systems, because as an application developer in general, I am
prone to the very human desire to just get it working. While I try to make sure the applications
are properly abstracted, oftentimes the combination of deadlines and the exploratory nature of
the process means that I don't do this as well as I should. This results in system components, libraries, and abstraction layers which at times can be
brilliant coups of engineering, and at other (far too frequent) times turn out to be astounding
feats of perverse stupidity. In rereading old code, I think for every time I have said "I did that?
Damn, that's cool!", I have also said "What the HELL was I thinking?" Of course, sometimes the design goals are poorly specified, or very broad, or else there are
no efficient methods of doing something at the moment, and you just have to wing it with an
inferior, but at least functional, solution. That may account for some of the cases above. And
then there is the issue of programs whose very purpose is misguided. That may account for one
of the other cases given above.
Posted Jul 21, 2006 9:21 UTC (Fri)
by nix (subscriber, #2304)
[Link] (2 responses)
If you're not told how expensive some function call is, the only way to tell is to profile the hell out of it *on a system where n happens to be large* (so fontconfig sloth might not be obvious unless, like davej, you have many thousands of fonts), or to look deeply into the internal implementations of every function you ever use (no chance).
What this is really telling us is that we need better docs, I think.
That's one thing the C++ standard library got right, which I wish other interface designers would follow: treat the time and space complexities as *part of the interface*, document them, and *do not increase them*.
Posted Jul 21, 2006 13:10 UTC (Fri)
by oak (guest, #2786)
[Link]
> If you're not told how expensive some function call is, the only way to
> tell is to profile the hell out of it *on a system where n happens to be
> large* (so fontconfig sloth might not be obvious unless, like davej, you
> have many thousands of fonts)
Actually, when I installed Breezy on a P166 half a year ago, I found that simple Xlib based programs (xeyes, xclock...) actually took twice as long (10 secs) to start as e.g. Gnome calculator or Abiword, which were using fontconfig through Xft through Pango. So I would say that it's an improvement over what was before. :-)
When I straced the Xlib programs, most of the time seemed to be going to loading information about Asian bitmap based X fonts. It would have been nice if the other bitmap fonts (for example everything besides the cursor and fixed fonts) had been in a separate package that is installed only
Posted Jul 21, 2006 17:40 UTC (Fri)
by Tet (guest, #5433)
[Link]
Well, yes. But also better tools. SystemTap is a perfect example of a
"better tool" in this case. It lets you look at problems at a system-wide
level, rather than on a per-process basis, and you'd be amazed at some of
the things that show up, in places you're least expecting. It turns out that
on my desktop, ioctl() and poll() are called more than any other system
call, by an order of magnitude -- where intuitively (and based on experience
on previous systems), I'd have expected
gettimeofday(). SystemTap provides an easy way to track
down the culprit, too (in this case, the java_vm, even when no applet is
running -- but being closed source, there's sadly nothing that can be done
to fix it).
Posted Jul 21, 2006 0:40 UTC (Fri)
by Tara_Li (guest, #26706)
[Link] (2 responses)
Posted Jul 21, 2006 9:11 UTC (Fri)
by pebolle (guest, #35204)
[Link] (1 responses)
Posted Jul 27, 2006 17:15 UTC (Thu)
by lockhart (guest, #31615)
[Link]
Or see the individual paper at http://ols2006.108.redhat.com/
Posted Jul 21, 2006 1:11 UTC (Fri)
by yusufg (guest, #407)
[Link] (3 responses)
Dave, if you are reading this and you've filed bug reports, can you link to them please?
Posted Jul 21, 2006 3:44 UTC (Fri)
by cventers (guest, #31465)
[Link]
Having a super-robust and massively efficient kernel is only half the
Asking "where's the bug reports" has always seemed to me like a defensive
Posted Jul 21, 2006 12:32 UTC (Fri)
by arjan (subscriber, #36785)
[Link]
Posted Jul 21, 2006 21:39 UTC (Fri)
by davej (subscriber, #354)
[Link]
Lots of the examples I covered are already fixed, but there are still a number of outstanding issues. I'll be rerunning the tests some time soon, and see what's left that sticks out, but based on data I collected not so long back, we're now doing a *lot* better than we used to.
The initial tests the paper was based on were done on Fedora Core 5 test1, IIRC; I did some quick stats on an FC5 final release, and the number of reads/stats/execs was pretty much halved, even though we had added more functionality, a few extra daemons, etc.
I'll be looking at this stuff again as we get closer to FC6.
Posted Jul 21, 2006 6:50 UTC (Fri)
by ekj (guest, #1524)
[Link] (7 responses)
Case in point: Fedora, Mandriva and Ubuntu all install pcmcia by default, even when the computer is a stationary machine that does not even have a pcmcia-slot. They also enable the pcmcia startup-script in the normal runlevels. The script is over 300 lines long. It contains 57 conditionals (ifs, elses, cases) and, among other things, calls a script named laptop-detect that tries to guesstimate if we're on a laptop (by messing around in proc looking for a batteries-file, among other things).
Now, most computers don't turn into laptops overnight. It would be perfectly possible to do this detection *once* on installation, and thereafter simply not install pcmcia-stuff if we don't actually have that hardware. It would even be reasonable.
Yes, it's "convenient" to have new/changed hardware autodetected and auto-working on first boot after installation. I'm not sure it's worth it though. You could skip a *LOT* of startup if you simply assumed this boot was going to be exactly like the last one. There could be a big fat option in the boot-menu saying: "Configure new hardware" which would do what the bootup-scripts do *every* time now.
Posted Jul 21, 2006 8:41 UTC (Fri)
by Thalience (subscriber, #4217)
[Link] (2 responses)
I'd take reliable hardware auto-detection over a static configuration at any reasonable cost, just as a matter of personal preference.
Posted Jul 27, 2006 7:54 UTC (Thu)
by ekj (guest, #1524)
[Link] (1 responses)
But bootup-time (measured from when GRUB loads the kernel until the last startup-script finishes) improved from 1:48 to 1:32 simply by disabling kudzu (which does detection of new hardware).
Uninstalling packages that were auto-installed without question, and that support hardware I don't have, shaved another 10 seconds off that, to approximately 1:23.
1:23 and 1:48 ain't that hugely different, but it *does* mean a 30% increase in bootup-time for trying to detect hardware I don't have on every bootup.
Some hardware is regularly plugged in and out. It's reasonable (and good) to try to detect such hardware. But that should happen on the fly, and not as part of a bootup-script. After all, the user may very well plug in a usb-stick or whatever *after* logging in.
Other hardware (like for example a pcmcia-slot) is rather unlikely to suddenly appear. I'm guessing that 99.9% of the computers that don't have it when the distro is installed, will *never* have it.
Posted Jul 27, 2006 17:26 UTC (Thu)
by Thalience (subscriber, #4217)
[Link]
I still think, however, that ditching auto-detection would be to throw the baby out with the bathwater. By focusing more attention on hot-plug style auto-detection, we can have both fast startup and reliable, effortless hardware support.
As an aside, I've always thought that Kudzu was a very appropriate name for that particular piece of software. :)
Posted Jul 21, 2006 10:06 UTC (Fri)
by kleptog (subscriber, #1183)
[Link]
Nowadays most of the pcmcia stuff has been subsumed into udev, so this may not even be relevant anymore...
Posted Jul 24, 2006 23:27 UTC (Mon)
by cjwatson (subscriber, #7322)
[Link] (2 responses)
Per Olofsson simplified that init script a fair bit in Debian; Edgy has that simplification, and e.g. no longer calls laptop-detect.
It's also worth noting that 'case' in shell scripts doesn't spawn a subprocess, unlike 'if [ ... ]', so simply counting conditionals doesn't always give you a fair picture.
Posted Jul 25, 2006 3:59 UTC (Tue)
by Richard_J_Neill (subscriber, #23093)
[Link] (1 responses)
$ date; for ((i=0; i<10000; i++)) ; do if [ a = b ] ; then c=d ; fi ; done ;date
$ date; for ((i=0; i<10000; i++)) ; do if test a = b ; then c=d ; fi ; done ;date
$ date; for ((i=0; i<10000; i++)) ; do if /usr/bin/test a = b ; then c=d ; fi ; done ;date
It's a huge difference! The script with the builtins runs 60 times faster.
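The gap being measured here is mostly fork()+exec() overhead. A rough in-process analogue of the same comparison (iteration counts are arbitrary, and /usr/bin/test is assumed to be the coreutils binary):

```python
# Compare an in-process comparison (what a shell builtin amounts to)
# against spawning /usr/bin/test for every check.
import subprocess
import timeit

in_process = timeit.timeit(lambda: "a" == "b", number=10_000)
forked = timeit.timeit(
    lambda: subprocess.call(["/usr/bin/test", "a", "=", "b"]),
    number=100,
)

per_builtin = in_process / 10_000
per_fork = forked / 100
print(f"per comparison: {per_builtin:.2e}s in-process, {per_fork:.2e}s forked")
```

The forked version loses by several orders of magnitude per comparison, which is the same effect the shell loops above demonstrate.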
Posted Jul 29, 2006 15:43 UTC (Sat)
by kreutzm (guest, #4700)
[Link]
Posted Jul 21, 2006 7:37 UTC (Fri)
by nix (subscriber, #2304)
[Link] (1 responses)
The tradeoff might well have changed since then.
Posted Jul 21, 2006 9:43 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Jul 21, 2006 10:22 UTC (Fri)
by tcabot (subscriber, #6656)
[Link] (3 responses)
http://mail.gnome.org/archives/nautilus-list/2001-August/...
I find that developers (at least here in the US) tend to have very fast machines and upgrade them frequently. Do you think that this might be one of the causes, i.e. the developers might not notice these issues since their HW is so fast? If that's the case then I'd expect OLPC to provide a lot of useful feedback. Running in a resource-constrained environment tends to make efficiency problems more "itchy" than they are on monster HW.
Posted Jul 21, 2006 11:20 UTC (Fri)
by Viddy (guest, #33288)
[Link] (1 responses)
Wow. I'm not a particularly good coder, but my understanding is that if it polls like a large chunk of the desktop programs do, generally, you've done it wrong. I know of the sleep() and usleep() functions, and I'm pretty sure I could figure out callbacks.
It seems that the cups daemon responds with http headers several times per second to none other than... gnome-cups-icon. I mean, it's nice to see printers appearing when the cups daemon finds them on the network, but checking multiple times per second?
The update-notifier spits out craploads of file reads per second. I'm pretty sure that my ubuntu repositories don't change _that_ often :)
The upshot of this is that I'm now irritated enough to start downloading source and submitting patches.
The part that gets me is that neither of my machines are that slow, and yet, under a linux desktop, it feels like I'm trying to push mud around with my mouse. I want my desktop to feel snappy.
Posted Jul 28, 2006 17:45 UTC (Fri)
by sandmann (subscriber, #473)
[Link]
Uh, are you saying that using poll() is wrong?
Posted Jul 21, 2006 15:10 UTC (Fri)
by wilck (guest, #29844)
[Link]
I guess the same will happen to me with Dave's findings, unless LWN some time publishes a follow-up titled "user space doesn't suck no more" ...
Posted Jul 21, 2006 13:00 UTC (Fri)
by NAR (subscriber, #1313)
[Link] (24 responses)
Exactly when was the inotify API added? 2.6.x or 2.4.x?
Posted Jul 21, 2006 13:20 UTC (Fri)
by corbet (editor, #1)
[Link] (23 responses)
Posted Jul 21, 2006 13:44 UTC (Fri)
by pizza (subscriber, #46)
[Link] (22 responses)
Now what about the other platforms that Gnome has to run on? How long have they had inotify? Have they ever? Will it work the same way?
When one of your project goals is to be portable, you really do need to code to the least-common-denominator APIs. Special-case code paths add greatly to software complexity and make debugging more difficult.
Yes, userspace often does a lot of dumb things, but "not taking advantage of bleeding-edge kernel features" isn't usually one of them.
Posted Jul 21, 2006 13:53 UTC (Fri)
by arjan (subscriber, #36785)
[Link] (3 responses)
So there is quite reasonable infrastructure for this in gnome.. just it's not being used consistently
Posted Jul 21, 2006 17:02 UTC (Fri)
by sepreece (guest, #19270)
[Link] (2 responses)
Posted Jul 22, 2006 9:51 UTC (Sat)
by drag (guest, #31333)
[Link] (1 responses)
For instance with famd, if I had a mount point or something like that in my home directory then it would crap out if I tried to go more than 2 directories deep, and basically cause anything to do with gnome that concerns files (nautilus mostly) to break.
With gamin there is no problem.
I think that a huge part of the problem we have with performance on Linux desktop nowadays is that everybody was scrambling to get just the basics in place and everything more or less working.
Hal/Dbus/X.org/inotify(and it's userspace helpers)/desktop search stuff/udev.. etc etc. All of it is thrown together and made to 'make it just work'.
Now it seems that the push is going towards making 'make it work well'. Filling out the blanks, improving performance. That sort of thing.
Posted Jul 27, 2006 9:35 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Apparently its inability to send notifications to other copies of itself over the network is a *feature*, but given that you're using NFS or a similar fs in any case, I can't imagine what extra security threats could be opened by sending notifications around. (FAM could do this.)
Posted Jul 21, 2006 15:18 UTC (Fri)
by cventers (guest, #31465)
[Link] (17 responses)
It's totally possible to build platform-independent code (hell, the
We depend on the huge mess of scripts known as "autom4te" so much these
Posted Jul 21, 2006 16:04 UTC (Fri)
by nix (subscriber, #2304)
[Link] (4 responses)
However, this is perfectly doable.
Posted Jul 21, 2006 16:45 UTC (Fri)
by cventers (guest, #31465)
[Link] (1 responses)
But yes, just attempt inotify at startup. -ENOSYS? Ok, we'll try this
Posted Jul 21, 2006 18:31 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Hm. Looking at the sources, gamin has had an inotify backend since v0.0.8, Aug 26 2004, *long* before inotify hit the kernel proper. It is enabled by default.
Looks like this might be an out-and-out bug. I'll have a look this weekend and see if I can reproduce and fix it.
Posted Jul 23, 2006 17:51 UTC (Sun)
by NAR (subscriber, #1313)
[Link] (1 responses)
Yes, but I'm afraid this is way above the average application programmer's level.
Posted Jul 23, 2006 17:55 UTC (Sun)
by cventers (guest, #31465)
[Link]
Posted Jul 21, 2006 16:08 UTC (Fri)
by pizza (subscriber, #46)
[Link] (11 responses)
* Software outlives hardware, by several orders of magnitude. You really weaken your argument by trying to draw parallels there -- especially when modern distros *still* build userland for a stock i386.
* "least common denominator" gives you the greatest coverage with the least effort. Additional effort should be focused on where it does the most good, and that call is (hopefully) made by those who know the bigger picture and/or do the actual work. (I'd agree that inotify support is a promising candidate, but I'm just an armchair general)
* Different APIs can require radically different software architectures; it's not a matter of "writing an autom4te test"; someone has to actually write a non-trivial pile of non-trivial code, while leaving the existing path intact as a run-time fallback and maintaining complete backwards compatibility (source, binary, and behavioral) for the APIs that Gnome exports.
So while yes, the "least common denominator" argument sucks, it's not the suckiness of the argument itself, but rather the suckiness of the *reality* that the argument represents.
"Optimization without instrumentation is just mental masturbation"
Posted Jul 21, 2006 16:42 UTC (Fri)
by cventers (guest, #31465)
[Link] (8 responses)
The problem with the least common denominator argument isn't really the
Furthermore, the fact that different systems require different code to be
When you choose to support multiple systems, you should be ready to write
> "Optimization without instrumentation is just mental masturbation"
I've never much been a fan of that argument either, because it's often
Put another way: I would like to think that any reasonably talented
These arguments (the least common denominator and the no optimization
It seems like a perfectly acceptable bargain, and on some level it is. (I
I'm sure not all of Dave's identified misbehaviors were even apparent to
So I propose a new quote:
"Sensible optimizations give pleasure by default"
Posted Jul 21, 2006 19:18 UTC (Fri)
by pizza (subscriber, #46)
[Link] (7 responses)
Here's the bottom line -- we're not all "above average" programmers. Even when we know what "the right way" is, we usually don't have that luxury due to externally-imposed constraints.
"Cheap, fast, good. Pick two"
Posted Jul 21, 2006 20:14 UTC (Fri)
by cventers (guest, #31465)
[Link] (6 responses)
Really? I'm not sure I see how. It seems to me like you were listing
> Here's the bottom line -- we're not all "above average" programmers.
What does "average" have to do with it? It doesn't take oodles of talent
You allude to constraints but never mention what some of them might be.
> "Cheap, fast, good. Pick two"
Why pick just two? One of the greatest things about free software
This stuff isn't actually all that complicated. The problem is either
*A) No one had pointed out ways in which apps misbehave, so no one knew
So I think Dave's paper was spot-on. We should skip the 'apologizing'
Posted Jul 23, 2006 13:08 UTC (Sun)
by pizza (subscriber, #46)
[Link] (5 responses)
If you want your software to be developed "good and fast", then it's not going to be cheap. If you want it "fast and cheap" then it's not going to be all that good. If you want it "good and cheap" then it won't happen particularly quickly.
"Fast and cheap" is usually where software ends up when someone is directly footing the bill (and hence there is an upper bound on cost, aka budgets/deadlines, and "good" tends to suffer). "Good and cheap" is where F/OSS software traditionally lies, where the "it'll be done when it's done" attitude is the norm. Then we end up with the likes of NASA (or other life-critical situations), where the requirement of "good" is so important that it happens neither quickly nor cheaply.
The problem with the above generalization is that many larger F/OSS projects (including Gnome) actually fall into the first category, as the majority of the "work" is done by people required to do so, with formal goals, deadlines and budgets. F/OSS has gone up and been corporatized!
(Another glaring hole in this generalization is that "good" means different things to everyone -- In the end, only the one who is footing the bill gets to make that call -- but that is the nature of generalizations..)
Dave's ("spot on", as you put it) paper was a direct result of the idea embodied by the "no premature optimization" blurb that you took so much of an issue with. Without that instrumentation, this handful of bugs/mistakes wouldn't have likely come to light, and we wouldn't have been able to learn from them.
Posted Jul 23, 2006 17:10 UTC (Sun)
by cventers (guest, #31465)
[Link] (4 responses)
You could twist the definition of fast, cheap, and good enough to make
F/OSS is getting more and more industrialized, but depending on the
This is free software. The traditional rules of corporate development
Posted Jul 23, 2006 17:55 UTC (Sun)
by NAR (subscriber, #1313)
[Link] (3 responses)
I wouldn't call the 2.6 process "fast, cheap and good". It might be fast, but it's certainly not good (the last usable kernel for me was 2.6.14) and definitely not cheap - I'd like to know how many kernel developers are funded for their work on the kernel. I think it's not a particularly low number.
Posted Jul 23, 2006 18:00 UTC (Sun)
by cventers (guest, #31465)
[Link] (2 responses)
It's unfortunate that you've had problems since 2.6.14. What sort of
After having seen the survey conducted here on kernel quality, it would
Posted Jul 24, 2006 6:42 UTC (Mon)
by drag (guest, #31333)
[Link] (1 responses)
Lower latencies, more usable desktop. Better responsiveness. My hardware is supported out of the box on new kernels, which it wasn't for older ones. ALSA sound drivers are a huge improvement over OSS for me. With dmix I can have, get this, more than _one_ sound at a time and it doesn't sound like crap. Multimedia performance has improved.
(Of course I am still talking about the kernel here... its desktop scheduling options make life better.)
Stability has improved. Wireless support has improved. Udev makes things easier for me now that I just tell the computer what /dev files I want vs having to dig around and find the stupid major/minor numbers for everything.
Maybe if the other person were to post WHY the 2.6.15, 2.6.16 and 2.6.17 series kernels are unusable, they would have received more sympathy.
Posted Jul 24, 2006 12:27 UTC (Mon)
by NAR (subscriber, #1313)
[Link]
Your mileage may vary, but I never managed to boot my old 486 with a 2.6 kernel - fortunately it worked with 2.4. It didn't work well, the TCP connection tracking code kept tracking connections that were long gone, so the system ran out of memory, but it still worked. On the other hand, one of the two reasons I use 2.6 on my other computer is that with 2.6 I don't have to reboot between watching a DVD and burning a CD-R.
Stability has improved. Wireless support has improved. Udev makes things easier for me
Again your mileage may vary, but my computer locks up hard with every single 2.6 if I make a larger I/O operation while watching TV with xawtv - and this wouldn't make a useful bug report. I don't have wireless cards and never felt the need for dynamic /dev, so these features do not make me happy.
WHY 2.6.15, 2.6.16, 2.6.17 series kernels are unusable
Recording audio from TV doesn't work with mplayer. I've reported the bug and it's supposed to be in mplayer and supposed to be fixed, yet it still didn't work when I tried last time. So I stick with 2.6.14.
Posted Jul 21, 2006 17:42 UTC (Fri)
by vonbrand (subscriber, #4458)
[Link]
The userland is normally compiled for i386 instructions only, but scheduled (instruction selection and ordering) for i686. Code where full i686 (or whatever) makes a real difference is few and far between (and there you do get i686 packages).
Distributions (and their users!) do pay a hefty price if there are zillions of package versions by CPU type.
Posted Jul 22, 2006 6:07 UTC (Sat)
by dvdeug (guest, #10998)
[Link]
Posted Jul 21, 2006 13:31 UTC (Fri)
by oak (guest, #2786)
[Link] (1 responses)
A good example of this is following crash:
ATI driver crashed trying to probe something that didn't exist on ISA
Posted Jul 21, 2006 18:25 UTC (Fri)
by daniels (subscriber, #16193)
[Link]
(For what it's worth, I fixed the library-stat()ing in Ubuntu quite a while ago, but the patch only got half-merged into upstream because it broke a couple of things. But I fixed it again in Dave's talk, after realising that it was still sucking.)
Posted Jul 23, 2006 0:09 UTC (Sun)
by hein.zelle (guest, #33324)
[Link]
xfce4-panel: average 2% cpu usage. A strace of approximately 10 seconds shows 267 function calls to gettimeofday, and 141 to ioctl and poll.
firefox: average just over 2% cpu usage. A strace of approximately 10 seconds shows 594 calls to gettimeofday, 151 calls to poll, read and ioctl, 282 calls to futex. I have no idea why it calls gettimeofday 4 times in a row in every cycle.
Apparently cpu-unintensive polling is a more common problem than I thought it would be. Is using gettimeofday, ioctl and poll the common way to do this?
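For what it's worth, the usual alternative to a high-frequency poll() loop is to block on a real file descriptor until an event arrives. A small sketch with Python's selectors module (the socketpair stands in for whatever event source an application actually has):

```python
# Block on an fd instead of waking every few milliseconds: select()
# sleeps until data is actually available, so an idle app burns no CPU.
import selectors
import socket

sel = selectors.DefaultSelector()
rsock, wsock = socket.socketpair()
sel.register(rsock, selectors.EVENT_READ)

wsock.send(b"ping")                 # some other component produces an event
events = sel.select(timeout=5.0)    # returns as soon as data arrives
for key, _mask in events:
    data = key.fileobj.recv(4)
    print("woke up with", data)

sel.close()
rsock.close()
wsock.close()
```

With nothing pending, the same select() would simply sleep for the full timeout rather than spinning through gettimeofday()/poll() cycles.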
Posted Jul 24, 2006 21:01 UTC (Mon)
by aegl (subscriber, #37581)
[Link] (5 responses)
$ ls -l /bin/true
Infinite bloat!
But it is worse. Run strace on /bin/true, and you'll see it open and mmap a dozen locale files (and try and fail to open a dozen more).
The problems here seem to have arisen because someone decided to add "--version" and "--help" arguments. Aaargggghhhh!
Posted Jul 25, 2006 1:00 UTC (Tue)
by jonabbey (guest, #2736)
[Link] (2 responses)
Posted Jul 25, 2006 1:11 UTC (Tue)
by zlynx (guest, #2285)
[Link]
Here you go :)
Posted Aug 9, 2006 1:28 UTC (Wed)
by barrygould (guest, #4774)
[Link]
Barry
Posted Jul 25, 2006 1:40 UTC (Tue)
by joey (guest, #328)
[Link] (1 responses)
(true is also a builtin in bash and dash, but I prefer ":" for space-efficiency also.)
However, your version of gnu true doesn't seem to match mine, which opens only /usr/lib/locale/locale-archive, and which is faster than the zero-byte version.
Posted Jul 25, 2006 12:05 UTC (Tue)
by nix (subscriber, #2304)
[Link]
Posted Jul 24, 2006 23:18 UTC (Mon)
by bluefoxicy (guest, #25366)
[Link]
You know all you people are gathering nice profiling data and all, I'd like to know all the stuff that normally gets scanned for and why it's relevant. Seems to me the lot of you are running strace and finding a djillion gettimeofday()s et al; wouldn't it be nice to have a small shell script that does this and spits out timing information?
Unfortunately oprofile tends to suck and eat up gigs of hard disk space in a few days, else it'd be interesting to have a method of automatic profiling and reporting. Of course we then sacrifice cycles to do runtime profiling, which is no good in a production environment; but I'd volunteer some CPU time to gather data like that, it's not that much of a drag for light usage like typical desktop stuff. I'm not sure of the security implications here ... runtime timing data may be useful for getting parts of passwords or encryption keys, if it's detailed enough; but that's black magic to me so far.
Posted Jul 25, 2006 1:49 UTC (Tue)
by joey (guest, #328)
[Link] (6 responses)
If you're unfortunate enough to navigate to /usr/bin using it, it will probably lock up for a good 2 minutes as it reads 10+ mb of data and makes tens of thousands of system calls. Compare with ls /usr/bin which takes about 0.2 seconds even in inefficient sort and colorise mode.
Posted Jul 27, 2006 11:13 UTC (Thu)
by jbh (guest, #494)
[Link] (1 responses)
I guess it wouldn't be too hard to work around either, but I really don't want to wade into the flame fest that surrounds the gtk file chooser. So... for now I mostly use opera, at least on slow machines.
Posted Jul 28, 2006 15:03 UTC (Fri)
by bjornen (guest, #38874)
[Link]
Ouch! opera 9.0-20060616.6 is one of the worst offenders on my computer (a
top's TIME+ column says it used 24 CPU seconds by the time it's done
And this on a fresh install, only 9 tabs opened, the window minimised and
There is some surprising blame and constructive criticism in Federico Mena
Posted Jul 27, 2006 14:21 UTC (Thu)
by emj (guest, #14307)
[Link] (2 responses)
Posted Jul 27, 2006 14:33 UTC (Thu)
by emj (guest, #14307)
[Link] (1 responses)
Posted Aug 6, 2006 16:07 UTC (Sun)
by Duncan (guest, #6647)
[Link]
KDE (3.5.4) does. It pops up a 3-section window, textbox file chooser on
Duncan
Posted Aug 3, 2006 9:17 UTC (Thu)
by kelvin (guest, #6694)
[Link]
This was all too true on older versions of GTK+, but the file selector (among other GTK-things) has been heavily profiled and improved in the latest versions. On this P4/3GHz Ubuntu install with GTK+ 2.8.20, a cold-cache opening of /usr/bin takes roughly 2 seconds (including display of the "undecipherable icons").
Posted Jul 27, 2006 16:34 UTC (Thu)
by gdt (subscriber, #6284)
[Link]
linux-2.6.17/Documentation/feature-removal-schedule.txt says:
What: mount/umount uevents
Also, try and detect if an unrelated process is running without polling (hint: *notify doesn't work on procfs).
Posted Nov 10, 2010 10:11 UTC (Wed)
by Randakar (guest, #27808)
[Link] (1 responses)
How much of this has been changed and/or fixed today?
Posted Nov 11, 2010 9:57 UTC (Thu)
by MKesper (subscriber, #38539)
[Link]
I'm glad we have free systems -- not only is it possible to spot these
problems, it's possible for people not originally involved in the
programs to step in and fix them.
poorly... this is what tends to happen when you stack layers and layers
of abstraction on each other.
challenged by the task of creating the most stable and incredibly
efficient solution I can think of. I guess the same is not true of
everyone :)
want to improve power efficiency and performance. So, great hackers,
let's get to work!
I don't know about Windows, but you could do similar analysis on Unix systems, since 1990-ish, despite their being closed source. Dave's report (which was, indeed, hilarious, and a great lesson for us all) was based on observing the apps from outside (watching their file operations in particular).
Right. The difference with free software, though, is that you don't have
to wait for the vendor to fix it (or hope that they care enough). If
something is really itching, you're 100% empowered to scratch it.
That's one thing the C++ standard library got right, which I wish other interface designers would follow: treat the time and space complexities as *part of the interface*, document them, and *do not increase them*.
> If you're not told how expensive some function call is, the only way to
> tell is to profile the hell out of it *on a system where n happens to be
> large* (so fontconfig sloth might not be obvious unless, like davej, you
> have many thousands of fonts),
that simple Xlib-based programs (xeyes, xclock...) actually took twice
as long (10 secs) to start as e.g. Gnome Calculator or AbiWord,
which were using fontconfig through Xft through Pango. So, I would
say that it's an improvement over what was before. :-)
loading information about Asian bitmap-based X fonts. It would have been
nice if the other bitmap fonts (for example, everything besides the cursor
and fixed fonts) had been in a separate package that is installed only
when needed.
What this is really telling us is that we need better docs, I think.
Any chance of a transcript popping up somewhere?
http://www.linuxsymposium.org/2006/linuxsymposium_procv1.pdf (pages 441 through 449)
Pointers to the relevant bugs filed
I'm assuming Dave filed bug reports in the appropriate places -- or was this talk the 'bug report', and he's expecting others to file bugs?
Pardon, but I'm guessing from reading your comment that you might have
missed Dave's point. I don't think his point was to point out specific
problems with specific applications -- his point was to make it clear that
we should be paying more attention to such things in general.
battle. Without making similar improvements in user-space (or worse,
losing ground because of the sort of problems Dave brings up), the free
software desktop as a whole won't improve.
response. The correct response is to acknowledge that there is a problem
and fix it.
Most of the guilty people were actually in the room and took this bug report in public after admitting the humiliating fact that we write sucky code...
I don't have pointers to bugs, because in a lot of cases, I just mailed/IRC'd the relevant developers.
The startup scripts also have tons of stupidity. Many of them are literally hundreds of lines of shell script containing dozens of checks, even in cases where they ultimately do nothing.
Profiling before you optimize
Do you have profiles to support the idea that hardware autodetection is a big component of the startup time? It could be, but watch out for optimizing blind.
Not scientifically valid profiles, no.
Ok. Those are reasonable numbers to work with. Thanks for making the effort!
Last time I installed Debian, at the end of the installation it noticed I didn't have a laptop and offered to remove pcmcia-cs for me.
On Ubuntu, pcmciautils is installed by default because - aside from its init script - it's pretty small and lightweight (compared to the monster that was pcmcia-cs/cardmgr) and it simplifies the installer, debugging the resulting system, etc. if we just install it all the time.
"if [ a = b ] " and "if test a = b " are both shell-builtins. If you want the other one, you have to call /usr/bin/testTest, with "test" vs /usr/bin/test
Tue Jul 25 04:51:16 BST 2006
Tue Jul 25 04:51:16 BST 2006
Tue Jul 25 04:51:27 BST 2006
Tue Jul 25 04:51:27 BST 2006
Tue Jul 25 04:51:33 BST 2006
Tue Jul 25 04:51:50 BST 2006
Where do you see that "60x"? Also, I'd advise you to use time(1) next time ;-)
I think it's more that *when the smart scheduler was implemented* (XFree86 4.1?) gettimeofday() was considered too slow (for tight inner loops when processing multiple requests from clients all at once).
(Ah, I see davej mentioned this. Thanks for the transcript!)
This reminds me of the time a few years ago when Alan Cox ran Nautilus under strace. He didn't like what he saw:
Just for sh*ts and giggles, and because my laptop makes lots of fan noise when it uses lots of processor power and gets hot (it's a Compaq Evo n1020v P4 2.4GHz), I thought I might run strace on various processes running on a stock Ubuntu install to see why it didn't idle at 300MHz and buggerall CPU usage.
> Wow. I'm not a particularly good coder, but my understanding is that if it polls like a large chunk of the desktop programs do, generally, you've done it wrong. I know of the sleep() and usleep() functions, and I'm pretty sure I could figure out callbacks.
Ever since I read that article, I have wondered whether anybody ever cared to fix this.
Nautilus polls files related to the desktop menus every few seconds, rather than using the inotify API which was added for just this purpose.
inotify was merged for 2.6.13, one year ago.
inotify
So inotify has been in Linux 2.6 for about a year now; cool.
Gnome has a thing called "gamin" which abstracts the various inotify-like interfaces the different operating systems provide. At least IRIX provides a dnotify-like thing, as does Linux historically. And if the OS is unknown or has no such method, *gamin* falls back to polling.
Gamin was actually one of the things Dave complained about...
I am sure that it's worth complaining about... but it's a lot better than the 'famd' it replaced!
... except of course that if you're a poor sod whose home directory is mounted over the network (perhaps from a centralized RAIDed fileserver), then, oops, the damn thing falls back to polling (over a network!)
The "least common denominator" argument really sucks. I get that KDE, inotify
Gnome and X.org try to support as many of the UNIXes as they can. But I
refuse to accept that they should do so at the expense of the majority of
their users (who are using Linux).
toolkits both of our desktops run on are portable to operating
systems /without/ UNIX APIs), yet specialize on each platform. Take the
kernel as a great example -- we have a nice mechanism called
"alternatives" that detects processor model and counts, and then
re-writes parts of the kernel text on the fly in order to make it
maximally efficient. The developers could have instead shot for the
lowest common denominator (386) -- because the code would still certainly
work on everything else (provided that it's also built for SMP).
days in order to make our buildsystems work, but when I watch all the
crap flying by on every package I build, I realize that few of them
actually /need/ all those damn checks. Why don't we make better use of
the tools we have? autom4te can check for inotify; if it's present, don't
build a Gnome desktop that spams the kernel, CPU and memory bus every
second when there's no activity at all.
No, in practice you must build something which tests for inotify at runtime and falls back to dnotify or even polling. The reason: distributors won't want to build programs which fail to work when run on kernels as recent as 2.6.12 -- at least, not non-system-level programs.
Ah, good point. Well, at the very least, having build-time inotify
support would assist some of us (crazy Gentoo users that spend half our
lives watching a compiler) immediately and others later ;)
another way.
fam 2.7.0 uses dnotify in any case, if it's available. It may not be as nice as inotify but it's a hell of a lot better than polling.
> in practice you must build something which tests for inotify at runtime and falls back to dnotify or even polling. [...]
However, this is perfectly doable.
True, but this is something that should be part of the desktop
infrastructure, not a part of every application. Most application
programmers wouldn't have a lot of fun with Xlib either, but someone's got
to do it...
A few points --
There is a difference between a vendor choosing to make i386 releases and
programmers refusing to use the features of any more modern chip simply
because a few i386 boxes are still out there clocking their ops. One of
the great things about having open source code is that you can download
and build your own packages optimized just how you choose (indeed,
distributions like Gentoo even make it easy). You're doing well if your
code will build for old hardware but otherwise make use of new features.
suckiness of the reality that the argument represents, it's the fact that
it ever gets used as an excuse to write code in which "sub-optimal" is a
gross understatement.
optimal is a fact of life. It's why we have abstraction layers at all. If
every system was the same, operating systems either wouldn't exist or
they'd be a hell of a lot more simple, and that goes for everything from
the bottom of the stack up. It's very much a reality, as you put it.
multiple implementations of the same function. Writing to the least
common denominator -- and not ever specializing -- is a cop-out.
used to justify incredibly sloppy / inefficient code. The quote as it
stands is simply imprecise. There are /some/ optimizations which are
questionable enough that you very much want instrumentation before you
write large chunks of code, but the world just isn't black and white.
systems programmer would know that polling files several times a second
for something like menu entries, or assembling entire HTTP queries and
responses several times a second to communicate with a system tray icon,
is a bad idea -- something that could be optimized. No need for
instrumentation at all.
without instrumentation) really irritate me, because I started on a 386
and many common operations take more wall-clock time today than they did
back then. I'm now on a Pentium 4, for chrissakes, with a gigabyte of DDR
RAM. What has happened is that as the generations go on, some of us seem
to be trading in programmer time for CPU time (read: being lazy).
don't think any sane person expects you to write desktop apps in
assembler, even though if you somehow had the dedication and
concentration required you'd make something at least slightly faster).
The problem is that programmers are _being lazy_ and choosing points on
the "diminishing returns" curve that are well before returns start to
diminish.
the programmers in question. Many of them are probably 'bugs'. But when I
hear about applications hammering the filesystem many times per second,
or using HTTP as an IPC mechanism between a system tray icon and another
program, I worry that we've all gone just a little bit crazy.
Most of your response is tangential to the argument I submitted.
> Most of your response is tangential to the argument I submitted.
counterpoints to my complaint about programming to the least common
denominator, and I was systematically addressing them (including your
quote about optimization).
> Even when we know what "the right way" is, we usually don't have that
> luxury due to externally-imposed constraints.
to build a model capable of using different implementations. Sometimes,
it's even more trouble to try and come up with something generic!
development is that it's usually not the requirements-driven,
oh-my-the-deadline-is-yesterday-and-the-customer-is-complaining-style
development uncomfortably familiar to programmers working in the
corporate world. And if our projects are being run that way (which I
don't think they are), we should move further up the chain and ask why
we're adopting policies and procedures that impose external constraints
on our code quality.
there was a problem (glad we have this paper to enumerate some examples!)
*B) Developers did what they thought was 'good enough' and just didn't
realize that their implementation didn't meet their expectations
*C) We're less-than-average programmers and we can't figure this stuff
out for the life of us (doubt that, there's oodles of awesome free
software from all of the major projects out there, which demonstrates
competency)
step and move on to 'making it better'.
"Fast, cheap, good. Pick two" is a reflection of the reality that nothing is without cost.inotify
And finally, I would agree with you and chalk up the problems that Dave raised to (A) and (B), although they both are symptoms of (C) -- which is usually due to inexperience, not idiocy. Subsequently, with better awareness of (A) and (B), (C) is lessened as the programmer presumably will learn from their mistakes.
Most of what you say about "Fast, cheap, good. Pick two" is fine and good.
But all I'm really trying to say is that we, the F/OSS community, have the
capacity to do better. Look at the Linux 2.6 process - I would call
that "Fast, cheap, good". It's not perfect, but it's damn fast, it's still
F/OSS and it's still /very/ good.
the "Pick two" argument apply to any project. The problem I have
with "Pick two" and the earlier optimization quote is simply that most of
the time I've heard an engineer saying one, it's being invoked as an
excuse for shoddy design. And I've personally witnessed that when you
simply let a passion for your art drive your work, and sprinkle on a
little bit of experience in the environment you're working in, you can
deliver "fast, cheap, good" all at once.
project, the majority of the code still comes from people with that
passion -- people just scratching their itch. I hope our projects don't
erode into the same corporately-managed disasters as are so commonplace to
the proprietary software engineer. But since engineers have the power in
F/OSS, I think if we focus on passion and rejecting ideas like "fast,
cheap, good -- pick two," we'll be entirely successful in breaking the
traditional rules of development once again.
don't apply; please leave them at the door.
> Look at the Linux 2.6 process - I would call that "Fast, cheap, good". It's not perfect, but it's damn fast, it's still F/OSS and it's still /very/ good.
If 'cheap' is a function including n (the rate of change) rather than a
constant, then I think the kernel is about as 'cheap' as you can get.
problems are you having?
seem like most users are pleased (I'm one of them).
Each kernel gets better for me. The 2.4 series was better than 2.2, and 2.6 is better for me than 2.4.
My hardware is supported out of the box on new kernels, which it wasn't for older ones.
i386 userland?
As far as I know, none of the modern distros build for a stock i386. You can't; the C++ libraries depend on i486 opcodes to properly implement threading, and while there are i386 alternatives, they're slow and unreliable.
> X.org, says Dave, is "awesome." It attempts to figure out where a graphics
> adapter might be connected by attempting to open almost any possible PCI
> device, including many which are clearly not present on the system.
https://launchpad.net/distros/ubuntu/+source/xserver-xorg...
bus after finding the correct card from PCI bus. I think the problem
is the drivers, not the X server itself.
No, he was talking about how the code in hw/xfree86/os-support/bus/Pci.c and hw/xfree86/os-support/linux/lnx_pci.c walks the tree (it's really bad, and only very recently got better). The ATI situation is due to the driver being braindead (use Driver "radeon", not Driver "ati"), and has nothing to do with the code Dave is talking about.
Some more observations: xfce4-panel, firefox
Inspired by the above article, I went to check the first programs in the CPU-ordered top list on my system. Although I realize the point of the article was more generic, it may be an interesting exercise to try this on your own system. My system is running a recently updated Debian unstable with kernel 2.6.17-1-k7.
Infinite bloat
A long time ago, in a Unix version far, far away, /bin/true was an empty shell script (since it was executable, the shell would run it as a shell script when the kernel failed the exec(2); with nothing in the script, the shell returned a 0 exit code).
-rwxr-xr-x 1 root root 21969 2004-04-05 21:32 /bin/true
But invoking a complete /bin/sh process to evaluate the empty shell script file would have been worse, surely?
Not mine, but I remembered seeing it before.
http://www.muppetlabs.com/~breadbox/software/tiny/true.as...
True was changed from a shell script to an executable due to the fact that running a shell for a user that wasn't supposed to be able to log in (e.g. an ftp-only user) created a security hole (ctrl-c could get you a shell).
Shell coders who are really interested in being efficient don't use external commands like /bin/true anyway, when the shell builtin ":" will do the same thing.
That is all GNU libc-version-dependent, and happens before main() is entered.
gtk file selector dialog
My favorite example at the moment is the gtk file selector dialog, as seen in Firefox. Each time you change directories, it first uses getdents to get all the files in the directory, then stats each individual file, then _reads_ 4k of each file to determine the file type. That information is used to put undecipherable tiny useless icons next to the files indicating their file type.
Yes! 2 minutes to choose an application sounds about right (even if I know the exact path and write it in the Ctrl-L location dialog).
> So... for now I mostly use opera, at least on slow machines.
750MHz P4 laptop (dell i8k)).
loading. After this it continues to steal ~3% CPU by itself, increasing
XFree86's CPU usage to ~7%.
javascript, plugins, et al turned off.
Quintero's blog -
http://primates.ximian.com/~federico/news-2005-11.html#mo...
The best part is you can't just enter a command to execute and have Firefox look it up in the path... So this bug of the GTK file selector only shows up because of a misfeature in Firefox. ;-(
Here is the bug report in bugzilla; all you have to do is fix it.
Does anyone know if gnome/kde has an "open with application" dialog like the one Windows uses?
> Does [...] gnome/kde has a "open with application"
> dialog like the one windows uses
top (browse button to the right), a tree-view copy of the K-Menu in the
center, and two checkbox options on the bottom, run in terminal (with a
don't close after exit suboption that's dimmed out until run in terminal
is selected), and remember application association (similar to the
MSWormOS dialog option). Below that are the usual OK/Cancel buttons.
> If you're unfortunate enough to navigate to /usr/bin using it, it will probably lock up for a good 2 minutes as it reads 10+ MB of data and makes tens of thousands of system calls.
Sometimes userspace is made to suck, I mean poll
When: February 2007
Why: These events are not correct, and do not properly let userspace know
when a file system has been mounted or unmounted. Userspace should
poll the /proc/mounts file instead to detect this properly.
Yes, this would be something worthy of a follow-up!
Besides, here's the updated link to the slides of Dave Jones' presentation "Why userspace sucks" (Magicpoint format).