
Missed some rows

Posted Apr 29, 2011 18:17 UTC (Fri) by martinfick (subscriber, #4455)
In reply to: Missed some rows by dlang
Parent article: Poettering: Why systemd?

Then why are all the startup scripts different, even across Linux distributions?



Missed some rows

Posted Apr 29, 2011 18:55 UTC (Fri) by dlang (guest, #313) [Link] (29 responses)

because the distro maintainers each want to do slightly different things and don't take the time to see what the other distros are doing

there are a few things based on distros choosing to put files in different places (which systemd won't fix), but most of it is just different ways of solving the same problem.

there's a lot of third-party software available that includes the init files, and that software has no problems using the same scripts on many different distros (they just decide where to put their stuff and don't care if it's where the distro maintainers would put it)

I expect the same type of thing to start happening with the systemd config within a year or two of it becoming the default.

Missed some rows

Posted Apr 29, 2011 19:08 UTC (Fri) by martinfick (subscriber, #4455) [Link] (19 responses)

> because the distro maintainers each want to do slightly different things and don't take the time to see what the other distros are doing

And this is an advantage of not using systemd?

> there are a few things based on distros choosing to put files in different places (which systemd won't fix), but most of it is just different ways of solving the same problem.

So tell me again where is the advantage of having "different ways of solving the same problem"?

> I expect the same type of thing to start happening with the systemd config within a year or two of it becoming the default.

If the differences are the only thing that will likely cause them to have different config files, then I would say that systemd has done an excellent job! The differences, the policies, are exactly the things that should be configured in these files, not the mechanisms!

To paraphrase your statement, "the shell scripts are a small amount of different policy and large amount of mechanism (all implemented differently)." Finally, with systemd, the policy is abstracted and obvious and the mechanism can be the same, (more common code and likely way better tested in the long run)!

Missed some rows

Posted Apr 29, 2011 21:20 UTC (Fri) by dlang (guest, #313) [Link] (18 responses)

>> because the distro maintainers each want to do slightly different things and don't take the time to see what the other distros are doing

>And this is an advantage of not using systemd?

it's not an advantage, it's reality, and it will be reality with systemd as well, just give it a little bit of time

> So tell me again where is the advantage of having "different ways of solving the same problem"?

if we didn't allow for different ways of solving the same problem, nobody would ever be able to find a better way (and systemd for example could not even be attempted because the OS wouldn't work with it)

you are taking advantage of the fact that the system allows you to solve things in a new way to create systemd, but then claiming that the freedom to do so isn't important for anyone else.

Missed some rows

Posted Apr 29, 2011 21:46 UTC (Fri) by martinfick (subscriber, #4455) [Link] (17 responses)

> if we didn't allow for different ways of solving the same problem, nobody would ever be able to find a better way (and systemd for example could not even be attempted because the OS wouldn't work with it)

Who said anything about not allowing different ways? The freedom to do so is great. Does that mean it should be done differently without a good reason? Systemd attempts to unify different ways when it makes sense and when there is no good reason for them to be different. In most of these cases it makes sense; if not, why would the different distros agree to switch? Likely because they agree with the proposed way that systemd solves these problems.

Not to mention that most of these old ways are likely broken! Many of them are nasty old hacks because there isn't a good common solution implemented anywhere. For example, do you really think that keeping track of PIDs in files is a better way to kill a daemon than using cgroups? Such a method is fraught with potential mis-kills; I am shocked that enterprise distros allow such behavior... "oops, just killed the company database server when I meant to kill my homebrewed monitoring script". Today, most distro scripts are likely knowingly broken. Do you think the average homebrewed script even has a chance of being only half as broken as a distro script? I suspect that systemd makes writing an unbroken homebrewed startup config possible, even likely.

Missed some rows

Posted Apr 30, 2011 0:22 UTC (Sat) by nicooo (guest, #69134) [Link] (16 responses)

> if not, why would the different distros agree to switch?
Which ones? I only know of F15.

> I am shocked that enterprise distros allow such behavior
It worked for enterprise unix.

Missed some rows

Posted Apr 30, 2011 8:16 UTC (Sat) by rahulsundaram (subscriber, #21946) [Link]

Meego has switched in their devel branch. OpenSUSE next version. Mandriva has already switched as well. Keep looking.

Oh, yeah.

Posted Apr 30, 2011 9:51 UTC (Sat) by khim (subscriber, #9252) [Link]

It worked for enterprise unix.

Wow. Talk about ignorance! There are many things which worked for Unix till it got serious competition. Then it lost first the desktop, then the server, and now the battle for the enterprise is in full swing.

I fail to see how "it worked for enterprise unix" can be used as justification. Sure it did - because there was no alternative. Some things "enterprise unix" does are still better than what Linux does, but the list shrinks over time... and the usage of PID files to manage daemons instead of cgroups is not one of them.

Missed some rows

Posted Apr 30, 2011 11:29 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (13 responses)

Actually, no. The problems with mis-kills due to PID wraparound are very well known.

Various 'enterprise' Unixes have had workarounds since forever. Like the ability to 'lock' the PID of a process (so it won't be reused). Or locking a PID for several minutes after getpid() calls (so "ps | grep ... | xargs kill" won't kill some innocent process).

Missed some rows

Posted May 3, 2011 1:21 UTC (Tue) by wahern (subscriber, #37304) [Link] (11 responses)

That all seems so convoluted. The whole problem boils down to the size of the namespace and the familiar TOCTOU race condition. The cgroups solution works because it uses a different namespace with well-crafted rules, and really only works in the context of systemd, which is taking on a role--maintaining a persistent, unique, global namespace--part of which should be done in the kernel.

The easiest and cleanest general purpose solution would be to extend the PID namespace to 64-bits, or maybe even 128-bits. Problem solved. This is a common solution for when maintaining and communicating a consistent global state is not practically feasible, which is the case with the historical paradigm of process management on Unix.

I don't know why this has never been done. The existing 16-bit namespace is ridiculous. There should be a kernel compile-time option to increase the pid_t width. Then over the course of several years broken applications that make unwarranted assumptions about pid_t could be fixed. The vast majority of issues are probably with printf formatting; people usually cast pid_t to (int). If PIDs were chosen at random (as on OpenBSD) then the 31 or 32 bits shown would actually be useful, much like Git's truncated hash identifiers. So even most broken apps would only be half broken.

I realize it's a *huge* change, but it's simple and straightforward, the consequences are mostly foreseeable, and with open source software they are readily addressed by even casual C programmers. GCC could be instrumented to track pid_t conversions, and in a matter of weeks I bet Debian's build system would uncover the vast majority of issues. All of a sudden one of the ugliest Unix warts--one that is fundamentally broken in the context of common usage--disappears.

Missed some rows

Posted May 3, 2011 9:10 UTC (Tue) by leighbb (subscriber, #1205) [Link] (1 responses)

Just so that you are aware, you can actually enable a 22-bit PID space by doing:

sysctl -w kernel.pid_max=4194304

Not as much as you were after but bigger than you thought :-)
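For reference, the current ceiling can be read back from procfs; a quick sketch (assumes a Linux system with /proc mounted):

```python
# Read the kernel's current PID ceiling from procfs (Linux-only).
with open("/proc/sys/kernel/pid_max") as f:
    pid_max = int(f.read().strip())

print(pid_max)  # typically 32768 by default; 4194304 after the sysctl above
```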

Missed some rows

Posted May 3, 2011 13:13 UTC (Tue) by wahern (subscriber, #37304) [Link]

Thanks. I was completely unaware.

Missed some rows

Posted May 3, 2011 15:03 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

Not really. Going to 32 bits for the PID namespace still won't solve this problem, it will just make it harder to trigger.

And larger PID lengths are way too clumsy for humans. That's definitely NOT good engineering.

Besides, even with a 128-bit PID length you'll still have problems with double-forked processes (which are reparented to init).

systemd nicely solves these problems.

Missed some rows

Posted May 3, 2011 18:17 UTC (Tue) by wahern (subscriber, #37304) [Link] (7 responses)

A larger PID wouldn't do everything that systemd does with cgroups. cgroups does two things: (1) provides a larger namespace (roughly 2^(8 * 255) names, AFAIU) to identify processes, and (2) handles inheritance. But a larger PID would solve, in a backwards-compatible fashion, the one clear issue in Unix process management, the signal-PID race, which is more-or-less the same as the first thing above. Although I'm not familiar with cgroup usage, I think that there's still a race in adding a fresh process to a cgroup, so even systemd could benefit from a larger PID space.

It's really only an unresolvable issue when you have errant, buggy processes. Otherwise, a sophisticated daemon should have a domain socket which takes control messages. But I'm presuming that process management means being able to handle processes that aren't well behaved.

Missed some rows

Posted May 3, 2011 18:21 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

Signals must die; they are a relic of ancient times.

>Although I'm not familiar with cgroup usage, I think that there's still a race in adding a fresh process to a cgroup, so even systemd could benefit from a larger PID space.

Nope. cgroups work at the kernel level and use proper locking, so PIDs won't be able to leak. Also, one can easily protect processes in a cgroup from an accidental kill (in fact, cgroups can be used as a complete lightweight virtualization solution).

Missed some rows

Posted May 4, 2011 5:03 UTC (Wed) by wahern (subscriber, #37304) [Link] (5 responses)

I'm confused then. Say I have a new process which I want to add to a cgroup. How do I assign the process to a cgroup? All the documentation I can find says to echo the PID to a cgroup control file. But if I'm using a PID--and I'm not the process with that PID--then I'm still subject to a race--the PID can become stale between acquiring the value and communicating it to the cgroup subsystem.

cgroup inheritance I can understand. A process forked from a process already assigned to a particular cgroup atomically inherits membership in the cgroup, just as it would atomically inherit a session id and process group id. But now, say, I want to reassign that process to a different cgroup PID. It seems like there's the same problem as above. What am I missing?

Missed some rows

Posted May 4, 2011 5:44 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

That's a trick question.

You need to somehow have a unique process handle, which a PID is definitely not. On Linux it can be done using the /proc/PID/ directory. The sequence would be:
1) Change current directory to /proc/PID
2) Look around and check that this PID is still the correct one. That's safe because if the process exits, its /proc/PID directory becomes empty - and stays that way.
3) Write to /proc/PID/cgroup.

Of course, it's better to create the process directly in the required cgroup in the first place.
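A rough sketch of that check-then-act sequence, assuming a Linux /proc; the `expected_comm` identity check, the helper name, and the target path are illustrative placeholders, not an actual implementation from systemd:

```python
import os

def write_pid_checked(pid, expected_comm, target_path):
    """Pin /proc/PID open, re-verify the process identity through that
    pinned handle, and only then write the PID out.  If the process has
    exited, reads through the pinned directory go stale rather than
    silently pointing at a recycled PID."""
    dirfd = os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Step 2: "look around" via the pinned directory, not a fresh lookup.
        fd = os.open("comm", os.O_RDONLY, dir_fd=dirfd)
        try:
            comm = os.read(fd, 64).decode().strip()
        finally:
            os.close(fd)
        if comm != expected_comm:
            raise RuntimeError(f"PID {pid} is no longer {expected_comm}")
        # Step 3: write the verified PID to the control file.
        with open(target_path, "w") as out:
            out.write(str(pid))
    finally:
        os.close(dirfd)
```

Even this only narrows the window; as the comment notes, creating the process inside the right cgroup from the start avoids the problem entirely.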

Missed some rows

Posted May 4, 2011 8:26 UTC (Wed) by wahern (subscriber, #37304) [Link]

I thought /proc/$PID/cgroup was read-only; to add a process to a group you needed to write to /dev/cgroup/$TASK/tasks. In that case, you're left with a race condition. (I tried confirming or disproving this, but can't even get the example in cgroups.txt to work.)

My proposal was to make PID a unique quasi-handle the same way random UUIDs are unique.

Missed some rows

Posted May 4, 2011 19:31 UTC (Wed) by njs (subscriber, #40338) [Link] (2 responses)

> Say I have a new process which I want to add to a cgroup. How do I assign the process to a cgroup? All the documentation I can find says to echo the PID to a cgroup control file. But if I'm using a PID--and I'm not the process with that PID--then I'm still subject to a race--the PID can become stale between acquiring the value and communicating it to the cgroup subsystem.

In the above scheme, if you're the one who's spawning this new process that you want to end up in a cgroup, then you can do
1) fork
2) the child adds itself to the desired cgroup
3) the child calls exec()

That's race-free.
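A minimal sketch of those three steps (the tasks-file path is a hypothetical placeholder; on a real system it would live somewhere under the cgroup filesystem):

```python
import os

def spawn_in_cgroup(tasks_path, argv):
    """Fork; the child writes its *own* PID into the cgroup's tasks file
    and only then exec()s the real program.  Because the child assigns
    itself, no other process ever holds a PID that could go stale."""
    pid = os.fork()
    if pid == 0:                            # child
        try:
            with open(tasks_path, "w") as f:
                f.write(str(os.getpid()))   # step 2: join the cgroup
            os.execvp(argv[0], argv)        # step 3: become the service
        finally:
            os._exit(127)                   # reached only if write/exec failed
    return pid                              # parent: child PID, already grouped
```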

Missed some rows

Posted May 4, 2011 21:08 UTC (Wed) by wahern (subscriber, #37304) [Link] (1 responses)

Sure. But the issue is handling arbitrary, non-well behaving processes. And AFAICT there's still no provably safe way to handle that on Unix systems. With only a 16-bit (or 15-bit, or 22-bit) PID space, it's trivial to write a program to sit around and wait to take advantage of a race. (I don't have an attacker mindset, but I wouldn't bet against the proposition that it could be a useful vector.)

Of course, "who cares" is a valid reply; we've been living with it for 40 years. But that response challenges the value added by systemd's reliance on esoteric Linux subsystems. For example, when we talk about how a service manager is so much better than a race-prone PID file, nobody ever considers that the race condition is easily avoided by not using root. If you create a user per daemon--_www, _ftp, etc--then even if you read a stale PID and signal the wrong process, as long as you're sending the signal with a service-delegated UID then it will never be delivered.

I never brought it up before because it's arguably not very elegant. I'm loath to defend PID files. But if we're going to replace them with something, I'd like it to be generic and tailored to the specific issue, rather than lauding some supposed panacean init replacement.

The past decade in Linux-land has seen a parade of sophisticated daemon services intended to patch over some clunky Unix interface (device management, process management, etc, etc). They each require application developers to change from portable POSIX patterns to using some new API or library or protocol. But they come and go like the wind. Worthy solutions tend to be so obviously beneficial that all the free unices eagerly adopt or mimic them.

Missed some rows

Posted May 5, 2011 1:06 UTC (Thu) by njs (subscriber, #40338) [Link]

I guess I don't understand what you mean by "managing arbitrary, non-well behaving processes".

IIUC, when systemd starts a service, that service gets stuck (reliably, and race-freely) into its own cgroup, from which it cannot escape. Then you can kill it or whatever reliably, even if it's badly behaved (spawning children that double-fork and end up as orphans, forking to a new PID every 100 ms, whatever you like).

If you're trying to go after a process that was started outside of a cgroup, then this doesn't work so well, but not much does. That process that keeps switching PIDs as quickly as possible can't easily be killed even if you have a collision-free PID space.

Missed some rows

Posted May 4, 2011 21:28 UTC (Wed) by mjthayer (guest, #39183) [Link]

> Actually, no. The problem with miskills due to PID wraparound are very well-known.

> Various 'enterprise' Unixes had workarounds since forever.

A workaround I implemented a while ago for "normal" Unixes was for the daemon to place an advisory lock on its pidfile. It only works on filesystems with that feature of course, but by checking that the file is locked before issuing your kill command you greatly reduce the race window.
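A sketch of that locked-pidfile scheme using flock(2); the helper names are my own illustration, not the poster's actual code, and note that a (much smaller) window between the lock probe and the kill still remains:

```python
import fcntl
import os

def claim_pidfile(path):
    """Daemon side: write our PID and hold an exclusive advisory lock.
    The returned fd must stay open for the daemon's lifetime, since the
    kernel drops the lock once every copy of the fd is closed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)  # raises if already claimed
    os.ftruncate(fd, 0)
    os.write(fd, str(os.getpid()).encode())
    return fd

def pidfile_is_live(path):
    """Killer side: before trusting the PID in the file, probe the lock.
    If we can take it, the writer is gone and the PID may have been reused."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        return False   # lock was free: daemon is dead, PID is stale
    except BlockingIOError:
        return True    # still exclusively locked: daemon is alive
    finally:
        os.close(fd)
```

As the comment says, this needs a filesystem that supports advisory locks; it shrinks the race window rather than eliminating it.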

Missed some rows

Posted Apr 29, 2011 19:57 UTC (Fri) by mezcalero (subscriber, #45103) [Link] (8 responses)

Actually, systemd -- if successful -- will standardize a lot of configuration and the places it is stored.

I think you have little experience with actually writing sysv init scripts. Making them portable between distributions is irrationally hard, and even LSB headers, which were supposed to improve the situation, just made it worse. Most (all?) upstream software I have seen which ships init scripts has different init script implementations for the different distributions.

And no, I see no reason that distros will differ too much on the systemd unit files. Why do I come to that conclusion? Well, we have stuff like .desktop files or D-Bus service activation files, and they do not differ between distributions. They are the same everywhere. And they have been around for quite some time now. With systemd we make service descriptions as easy and portable as .desktop files and D-Bus service files, and thus I see little reason to believe that the distros will try to diverge too much from upstream.

Missed some rows

Posted Apr 29, 2011 21:22 UTC (Fri) by dlang (guest, #313) [Link] (1 responses)

you are right, I've only been making my living administering unix systems of various flavors for 14 years, so I can't possibly understand what it takes.

someone who comes along and states that we should ignore anything that's not a linux box (and then in practice limits it further by eliminating the entire embedded space) understands the issues better.

Missed some rows

Posted Apr 30, 2011 13:47 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

systemd works just fine on embedded devices. In fact, it works BETTER than SysV-based scripts.

Missed some rows

Posted Apr 29, 2011 22:44 UTC (Fri) by dskoll (subscriber, #1630) [Link] (5 responses)

Most (all?) upstream software I have seen which ships init scripts has different init script implementations for the different distributions.

We use the same init script across all our Linux distros (and even on the BSDs) for our commercial product.

That being said, our product is largely written in Perl, so we wrote the init script in Perl too. That might make some people recoil in horror, but it simplifies our life quite a bit, and since you need Perl to run our product anyway... :)

Missed some rows

Posted Apr 29, 2011 23:06 UTC (Fri) by jspaleta (subscriber, #50639) [Link] (4 responses)

Do you believe that the care and effort you put into the portability of your init script is typical or atypical?

Do you have any conditional logic in your perl based init script which fires depending on the environment your script ends up running under?

I don't think he was claiming that it is impossible; I believe the claim is that the pragmatic reality of accumulating differences across distributions makes a strawman of any argument that assumes perfectly portable init scripts are commonly found in the wild.

-jef


Missed some rows

Posted Apr 29, 2011 23:19 UTC (Fri) by dlang (guest, #313) [Link]

I have seen several perfectly portable init scripts on third-party software.

the distro maintainers don't like them, but the software is unconditionally put in one place, no matter what OS it is (frequently under /opt) and the init scripts start and stop the application without any problems.

this isn't just for trivial apps either, all it takes is a decision that you don't care what the 'normal' thing for this particular platform is, (frequently on the basis that the platform is just there to support the application, which is commonly valid) and you don't have to worry very much about the variations between platforms.

Missed some rows

Posted Apr 30, 2011 2:59 UTC (Sat) by dskoll (subscriber, #1630) [Link] (2 responses)

Do you believe that the care and effort you put into the portability of your init script is typical or atypical?

Oh, hey, I'm all for systemd. I think it's a great idea. It won't help us on non-Linux platforms, though. (At least not for a while...)

Do you have any conditional logic in your perl based init script which fires depending on the environment your script ends up running under?

Nope. The Perl stuff is used so we can have pluggable components to start different executables depending on what's installed and on how the particular node is configured. It also rationalizes PATH handling, but there's no platform-specific code.

Managing programs startup using a perl script

Posted Apr 30, 2011 19:49 UTC (Sat) by rvfh (guest, #31018) [Link] (1 responses)

The question is: how will this work with systemd? A service calling your script (which looks to me like a kind of init system in itself)?

Managing programs startup using a perl script

Posted May 1, 2011 6:40 UTC (Sun) by dskoll (subscriber, #1630) [Link]

Question is: how will this work with systemd?

I can't imagine there will be any problems. To the outside world, our script looks like a normal sysvinit script:

script start

script stop

etc...


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds