Poettering: The Biggest Myths

Posted Jan 30, 2013 17:12 UTC (Wed) by rgmoore (✭ supporter ✭, #75)
In reply to: Poettering: The Biggest Myths by pspinler
Parent article: Poettering: The Biggest Myths

> On the concern side, I worry about the extra time it's going to take to debug issues in large chunks of compiled code v. script

This seems like a misplaced worry to me. Consider your example of the Perl scripts you debugged. You assumed correctly that any performance problems would be in the scripts rather than the compiled code (i.e. Perl itself). That's a reasonable assumption because your scripts were something you apparently put together in house, so they hadn't had much performance debugging effort. In contrast, Perl is used by millions of people all over the world, and any critical performance issues in it would have been noticed long before you got there. The same thing is true of the shell and standard shell functions when you're writing scripts in sh; you assume that the compiled code has been thoroughly debugged and had its performance problems worked out, so any residual problem must be in your script.

I would argue that systemd is going to be much closer to Perl, bash, etc. than it is to your hand-rolled monitoring scripts. Systemd is going to be doing performance-critical tasks on tens or hundreds of millions of machines, so any critical problems are going to be found and fixed quickly. The equivalents of your scripts will be the systemd unit files, which can be much simpler and easier to debug than traditional scripts because they don't require a lot of implementation details.
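
To make that concrete, here is a minimal sketch of what such a unit file might look like. The daemon path, its `--foreground` flag, and the description are hypothetical, invented purely for illustration; the section names and directives are standard systemd ones.

```ini
# example-monitord.service (hypothetical service, for illustration only)
[Unit]
Description=Example monitoring daemon (hypothetical)
# Order this unit after basic network setup.
After=network.target

[Service]
# systemd tracks the process itself, so no pidfile or daemonizing
# logic is needed here. The binary and its flag are assumptions.
ExecStart=/usr/local/bin/example-monitord --foreground
# Restart the daemon if it exits with an error.
Restart=on-failure

[Install]
# Start as part of the normal multi-user boot.
WantedBy=multi-user.target
```

Everything a traditional init script would spell out by hand (forking, pidfile handling, start/stop/restart cases) is left to systemd, so what remains to debug is a handful of declarative lines.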


Poettering: The Biggest Myths

Posted Jan 30, 2013 19:22 UTC (Wed) by pspinler (subscriber, #2922)

Well, actually, the Perl scripts I'm referring to were themselves vendor written, and presumably in wide use (there's a pretty big customer base for HA clusters running Oracle, after all). Ergo, wide deployment is no guarantee of non-bugginess.

Also, the analogy you make isn't quite getting my point, which is: it's easier and faster to debug script code than compiled code. It's also easier to debug small amounts of code than large amounts of code.

Systemd is both large and compiled, and thus harder to debug.

To your point: I'm skeptical that systemd will remain as bug-free as you imply. Perhaps after 5-6 years of bake-in; but now? Uh, sure. It's a large, complicated body of code, which has taken on and rewritten a lot of functionality (initd, inetd, logging, scheduling, timezone, yadda yadda yadda).

That amount of functionality is going to be quite hard to get right in any short amount of time, no matter how good your software engineering is. In fact, to quote Mr Poettering from earlier in this very discussion:

> Heck, we have so much more bad code in our stack, Upstart totally stands out in quality.

I'm going to refer to an interesting blog post on software engineering by Joel Spolsky. He's a Windows dev, and not a big open source guy, but I think he has some pretty good insights here:

http://www.joelonsoftware.com/articles/fog0000000069.html

Joel S. argues that rewriting from scratch is one of the worst software project choices that you can make. Why? Here's the crux of it:

> The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they've been fixed.
...snip...
> that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes.
...snip...
> When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.

There's more, but I think the above is telling. We've spent huge amounts of time making sure that the stuff we have works -- it'll take years more to make sure that the new stuff works (and here's the key point) _no matter how much better an architecture it has_.

So yeah, when systemd actually reaches that level of stability, then at that point I'm unlikely to have to debug it in an enterprise environment. Right now, I'm skeptical that it's there.

-- Pat

Poettering: The Biggest Myths

Posted Jan 30, 2013 19:43 UTC (Wed) by raven667 (subscriber, #5198)

I think you are right to be skeptical, but I think that the systemd team is using appropriate software engineering standards, leading to decent initial code quality. While you say the code is complex, it is nowhere near as complicated as, say, a filesystem, so it will take less time to stabilize. It has been worked on for several years now and shipped on several systems for at least one release cycle. It represents a net reduction in code compared to the previous systems (sysvinit, startup scripts, shell functions commonly sourced by scripts, ancillary tools used by scripts to daemonize and track pidfiles, etc.), which is another factor that should lead to the core systemd stabilizing quickly.

There will be bugs and problems in the future, but I don't expect them to be common or widespread, and I'd be surprised if they were in the core functionality, because the core has such a limited and well-defined scope (starting and killing processes).

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds