
Systemd 254 released

Systemd 254 released

Posted Aug 1, 2023 15:33 UTC (Tue) by paulj (subscriber, #341)
In reply to: Systemd 254 released by Wol
Parent article: Systemd 254 released

Easiest way to convert, where your script is doing lots of prep work (say, generating arguments, creating files, checking stuff, whatever), is to just change it so it does /not/ daemonise the process it is launching. So your script shouldn't exit till the process does. Sometimes this is as simple as removing some kind of "--daemon" argument from the command line.

Systemd does /not/ want your process to daemonise basically. Just let systemd manage the process.



Systemd 254 released

Posted Aug 1, 2023 21:12 UTC (Tue) by Wol (subscriber, #4433) [Link] (34 responses)

As I said, there's no script. The binary is invoked directly, and I don't know what it does. I suspect it daemonises, but that will involve digging into a load of C code.

Cheers,
Wol

Systemd 254 released

Posted Aug 1, 2023 23:29 UTC (Tue) by mjg59 (subscriber, #23239) [Link] (33 responses)

It's a binary that's directly symlinked into /etc/rc?.d? How do you restart it or shut it down?

Systemd 254 released

Posted Aug 2, 2023 7:39 UTC (Wed) by Wol (subscriber, #4433) [Link] (32 responses)

> It's a binary that's directly symlinked into /etc/rc?.d? How do you restart it or shut it down?

As I said, I use a systemd unit file. In response to a comment that said writing such files was easy.

Except as soon as I put an execstop in there, it triggers a mad killing spree at boot that kills loads of unrelated services.

I don't know why (and haven't got round to trying to debug it).

Cheers,
Wol

Systemd 254 released

Posted Aug 2, 2023 7:58 UTC (Wed) by mjg59 (subscriber, #23239) [Link] (28 responses)

The suggestion was that you write a systemd unit that runs the old sysv script, not that the program you were trying to run was a script.

Systemd 254 released

Posted Aug 2, 2023 10:39 UTC (Wed) by Wol (subscriber, #4433) [Link] (27 responses)

> The suggestion was that you write a systemd unit that runs the old sysv script, not that the program you were trying to run was a script.

And where did that come from? Not from me. Yes I think I can see the confusion, but at no point did *I* mention scripts at all. My comment was

"Because writing a trivial systemd unit file is not, in fact, trivial?"

Which it isn't. Writing my first unit file, and getting it to work with a simple ExecStart, wasn't easy. Then someone else added that ExecStop and the killing sprees started.

At some point I need to dig into the code to find out why this perfectly functional daemon does not function correctly with a very simple unit file :-(

Too many people seem to think that *their* normal applies to everyone else ...

Cheers,
Wol

Systemd 254 released

Posted Aug 2, 2023 11:37 UTC (Wed) by mb (subscriber, #50428) [Link] (7 responses)

> "Because writing a trivial systemd unit file is not, in fact, trivial?"
> Which it isn't.

Well, it actually is trivial for many many cases.

I really think you are hitting a corner case here and your application is doing something absolutely crazy.
We obviously can't debug that here on LWN.
But please stop saying that writing systemd unit files was hard, just because you have *one* case that might be a bit harder. In general it is easy.

Systemd 254 released

Posted Aug 2, 2023 14:21 UTC (Wed) by Wol (subscriber, #4433) [Link] (6 responses)

"not easy", or "hard".

Not necessarily the same thing.

But you can't expect a complete novice at writing them to churn out several in the first hour, as the OP implied. My first attempt did nothing. I scrabbled around in the documentation, emailed the mailing list, and got exec start to work. It wasn't hard, but it was quite a lot of frustration trying to find out information.

Then as I say someone else added the exec stop and all hell broke loose.

It probably is true that writing unit files is pretty easy. But not to a novice. If I stuck TCL in front of you, even with excellent documentation you'd struggle, and it really is easy.

And it may also seem odd for someone on LWN so much, but I'm not that familiar with (or a fan of) "the Unix way". What little I know is what I've had to learn (and no, I wouldn't put Windows in my "favourite OS" list, either).

If you're a Unix fan, unit files probably felt familiar to you, even when meeting them for the first time. They still feel alien to me.

Cheers,
Wol

Systemd 254 released

Posted Aug 2, 2023 15:09 UTC (Wed) by mb (subscriber, #50428) [Link]

>If you're a Unix fan, unit files probably felt familiar to you

I'm sorry. That doesn't really make any sense. At all.

And you're also running in circles. Multiple times. We all understand that you do have trouble writing a unit file and didn't succeed so far. But that's far from being the norm.

Systemd 254 released

Posted Aug 2, 2023 23:01 UTC (Wed) by rschroev (subscriber, #4164) [Link] (4 responses)

If I understood everything correctly, you're trying to manage third-party software of which you don't really know how it expects to be managed properly. That's always going to be an uphill battle, regardless of which supervisor you use. Writing code, or config files in this case, doesn't work very well when you don't have a specification for what the code is supposed to do.

Systemd 254 released

Posted Aug 3, 2023 10:00 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)

Bingo.

This thread all started because the OP to whom I replied said that someone else could port 30 or so SysV init scripts to systemd unit files in a few hours. If you don't know what that script is doing, or what the daemon it's starting is doing, no, it's NOT that trivial. And you don't stand a hope in hell of knocking them out that quick.

And I just gave my experience as an example, where an attempt to start a binary with a systemd unit file blew up in my face spectacularly, precisely because I didn't know what exactly that binary was doing. I know there are landmines. I know I clearly stepped on one. I just haven't debugged which one, yet :-)

Cheers,
Wol

Systemd 254 released

Posted Aug 3, 2023 10:45 UTC (Thu) by paulj (subscriber, #341) [Link]

In my experience it _is_ pretty simple:

1. Open the init script
2. Find where it launches your daemon
3. Remove the argument telling the daemon to daemonise
4a. If this is the first one you're doing: Write the trivial systemd unit file to ExecStart that script
4b. If not the first, copy the trivial systemd unit file you've already got and change the ExecStart line.
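
The "trivial systemd unit file" in steps 4a/4b could look roughly like the sketch below. The unit name, description, and wrapper path are hypothetical placeholders, not taken from any real package; the only assumption about the wrapper is that it now runs the daemon in the foreground.

```ini
# /etc/systemd/system/mydaemon.service  (hypothetical name and path)
[Unit]
Description=My daemon, started via its old wrapper script

[Service]
# No Type= line: with ExecStart= set, the default is Type=simple, so
# systemd treats the process it starts as the main service process.
# The wrapper (hypothetical) must keep the daemon in the foreground,
# i.e. with its "--daemon"-style argument removed.
ExecStart=/usr/local/sbin/mydaemon-wrapper.sh

[Install]
WantedBy=multi-user.target
```

For step 4b, copying this file under a new name and changing the Description= and ExecStart= lines should be all that is needed.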

Pretty much every daemon I've ever used has an argument to enable or disable daemonisation, because a) developers want to be able to debug daemons (run them usefully under GDB); and b) there already are other init systems that want processes not to daemonise (including the process managers in various SysV Unixes of the olden days, like AIX, and other hacky homebrew and vendor process managers). So this is nearly always easy to figure out and set.

Systemd 254 released

Posted Aug 3, 2023 10:57 UTC (Thu) by bluca (subscriber, #118303) [Link]

There's 2 cases here: either you have nothing at all, and then this change to the generator doesn't affect you in any way, or you have an init script used through the deprecated generator. In the latter case, it is as simple as taking the generated unit and initially just shipping that as-is. That _is_ trivial, it's just a copy and a git add!
Why is it better? Because you can then iteratively improve on that, pick up recommended patterns, add sandboxing, etc etc, so that it can evolve and improve over time, rather than being ossified to the lowest common denominator of whatever silliness was happening back in the 80s.

Systemd 254 released

Posted Aug 3, 2023 11:17 UTC (Thu) by anselm (subscriber, #2796) [Link]

> If you don't know what that script is doing, or what the daemon it's starting is doing, no, it's NOT that trivial.

And that should help convince anyone that allowing arbitrary shell code for daemon startup is, with hindsight, not the greatest of ideas. At least with systemd you know where you stand.

> And you don't stand a hope in hell of knocking them out that quick.

For most if not all SysV init scripts it's not a huge problem to come up with simple service units that call them (nobody said you had to get rid of the init script altogether, after all). Systemd even does that automatically, for now anyway. That way you're not taking advantage of many of the helpful and convenient things systemd can do, but it's a start. If you're only interested in keeping the service working as before once the automatic support for SysV init scripts is removed from systemd, it may be all you need to do. You could even take a peek in /run/systemd/generator.late to see what you can find there.
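
A hand-written service unit that simply calls the init script might look like the following sketch. The service name "foo" and the settings chosen are assumptions for illustration; the unit that systemd actually generated for your script (under /run/systemd/generator.late) is the better reference and may set more options than this.

```ini
# /etc/systemd/system/foo.service  (hypothetical; wraps /etc/init.d/foo)
[Unit]
Description=foo (via its SysV init script)

[Service]
# Init scripts normally start the daemon in the background and return.
Type=forking
# Keep the unit "active" after the script itself has exited.
RemainAfterExit=yes
ExecStart=/etc/init.d/foo start
ExecStop=/etc/init.d/foo stop

[Install]
WantedBy=multi-user.target
```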

It's when you want to replace the init script completely with a service unit that you need to look at the init script to see what exactly it does, and that of course takes time (especially with some of the more gnarly init scripts out there). I don't think anyone has seriously doubted that.

Systemd 254 released

Posted Aug 2, 2023 12:28 UTC (Wed) by jem (subscriber, #24231) [Link] (17 responses)

We need more details of what the program does. Does it daemonise? If it does, you need to add Type=forking to the unit file, if not, don't add a Type= line. (What happens if the program is started from an interactive shell? Does it go into the background?)

If the program daemonises, it probably writes its PID to a file. From man systemd.service: "If this setting is used, it is recommended to also use the PIDFile= option, so that systemd can reliably identify the main process of the service."

If the program expects to shut down in some special way, like running the program binary with a special command line parameter, add this to the ExecStop= option. The ExecStop= option is not mandatory; the default is for systemd to send a SIGTERM signal to the process, followed by a SIGKILL (if needed), with the assumption that the program catches SIGTERM and does a graceful shutdown.
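
Put together, a unit for a daemonising program could look like this sketch. The binary name, its "--shutdown" flag, and the PID file path are all hypothetical; substitute whatever your program really uses.

```ini
# Hypothetical unit for a program that forks into the background.
[Unit]
Description=Example forking daemon

[Service]
Type=forking
# Must match the file the daemon itself writes its PID to.
PIDFile=/run/exampled.pid
ExecStart=/usr/local/sbin/exampled
# Optional: only needed if the program has its own shutdown command.
# Without it, systemd signals the process itself.
ExecStop=/usr/local/sbin/exampled --shutdown
# How long to wait for a clean stop before systemd forcibly
# terminates whatever is left.
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target
```

If the program does not daemonise, dropping the Type= and PIDFile= lines gives the non-forking variant.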

Systemd 254 released

Posted Aug 2, 2023 14:37 UTC (Wed) by Wol (subscriber, #4433) [Link] (16 responses)

Thanks. I think you're pretty spot on.

I believe I've tried "forking = yes".

You do start and stop it with special command line arguments (--start and --stop would you believe :-), and yes with --start it does go into the background.

Beyond that, I need to investigate what's going on. I suspect the fact that it's backgrounding makes systemd think it's stopped and triggers the exec stop. And when it gets that, I suspect it gets confused as to what services are its own and what are not, and sends kills to the wrong processes. But that's a debugging session I need to get into when I have time. At present I just don't use exec stop.

Because if exec stop is enabled, I can guarantee a bunch of random services will fail to start, with systemd reporting they've been killed on startup :-(

Cheers,
Wol

Systemd 254 released

Posted Aug 2, 2023 23:14 UTC (Wed) by rschroev (subscriber, #4164) [Link] (15 responses)

> I suspect the fact that it's backgrounding makes systemd think it's stopped and triggers the exec stop.

If that's what you see then that's what's happening, but it goes completely against my understanding of how systemd behaves. It is my understanding that systemd runs the ExecStop commands when it wants to stop the service, not when it detects that the service is stopped.

Systemd 254 released

Posted Aug 3, 2023 6:28 UTC (Thu) by zdzichu (guest, #17118) [Link] (14 responses)

Your understanding is incomplete. Quoting man systemd.service:
"Also note that the stop operation is always performed if the service started successfully, even if the processes in the service terminated on their own or were killed."
Nevertheless, Wol's stories about one service stop killing unrelated services are hard to believe. Unless he wrote ExecStop=/usr/bin/killall…

Systemd 254 released

Posted Aug 3, 2023 6:41 UTC (Thu) by jem (subscriber, #24231) [Link] (13 responses)

The key thing here is what *triggers* the invocation of the command in ExecStop. The command is run as a result of systemd actively wanting to stop the service, not because it has suddenly detected that the process has exited. The sentence from "man systemd.service" you quoted means that systemd does *not* check whether the program is alive or not, it runs it anyway.

Systemd 254 released

Posted Aug 3, 2023 7:35 UTC (Thu) by zdzichu (guest, #17118) [Link] (12 responses)

I disagree, ExecStop can be invoked if it is defined and

  1. when sysadmin runs systemctl stop unit;
  2. when systemd decides to stop unit;
  3. when unit exited/failed by itself.

By coincidence, systemd is open source so we don't have to guess!

Function service_enter_stop() runs ExecStop= if the setting is defined. service_enter_stop() is invoked in 10 cases:

  1. line 2264 – service did not start successfully and is not RemainAfterExit=true
  2. line 2294 – ExecStartPost failed
  3. line 2663 – command_next failed? (I don't know this mechanism)
  4. line 2692 – failed to start main task
  5. line 2801 – stop command is invoked
  6. line 3632 – PIDFile was specified but never written by the service
  7. line 3688 – Out of Memory condition manifested
  8. line 3815 – both main and control processes exited
  9. line 4004 – another failure of PIDFile
  10. line 4128 – RuntimeMaxSec= was reached

In summary, when ExecStop= is defined, it is run in a multitude of cases, including service failure to start*. Not only when the administrator requests the service to stop.

* - I suspect failures to start are most often caused by wrong Type=. I once wrote a blog note with a table explaining the symptoms of a mismatch.

Systemd 254 released

Posted Aug 3, 2023 8:33 UTC (Thu) by rschroev (subscriber, #4164) [Link] (11 responses)

I feel some of these conflict with the documentation. From https://www.freedesktop.org/software/systemd/man/systemd....

"Note that the commands specified in ExecStop= are only executed when the service started successfully first. They are not invoked if the service was never started at all, or in case its start-up failed, for example because any of the commands specified in ExecStart=, ExecStartPre= or ExecStartPost= failed (and weren't prefixed with "-", see above) or timed out."

That directly conflicts with your items 1, 2, and 4, and I feel it also conflicts with 6 and 9. And I don't see where it is documented that ExecStop= commands are called when systemd detects that processes have stopped (I haven't read *all* the documentation though, there's quite a lot of it); I feel generally while the documentation does explain what ExecStop does, it doesn't say enough about if and when ExecStop is triggered.

If the documentation is wrong, incomplete or unclear, I don't see how we're supposed to write correct unit files that work in all cases including edge cases. We shouldn't have to read the code to find out.

Systemd 254 released

Posted Aug 3, 2023 10:07 UTC (Thu) by Wol (subscriber, #4433) [Link]

I believe the systemd documentation itself warns of the consequences of "double daemonisation" or whatever it is in SysV scripts. There are various behaviours common (indeed, almost mandatory) under other init systems that are pathological to systemd. It would not surprise me if this binary assumes SysV (or indeed no init system at all) and behaves in a manner systemd neither likes nor expects.

What then triggers the mass killing I don't know. All I know is (1) it only happens if the systemd unit file contains an ExecStop. And (2) iirc the systemd logs actually point the finger straight at this binary!

At some point I need to fix it, but it's a load of reverse engineering I don't have time for :-(

Cheers,
Wol

Systemd 254 released

Posted Aug 3, 2023 10:35 UTC (Thu) by bluca (subscriber, #118303) [Link] (9 responses)

Writing documentation is hard - what is obvious to me as a developer can be entirely opaque to a given user, and it's difficult to tell which is which. In this case, to me it's perfectly obvious that ExecStop is run regardless of _how_ a unit went away, because we don't track commands being sent by admins/programs, we track cgroups. So, to me, "Note that the commands specified in ExecStop= are only executed when the service started successfully first." is clear. But to a user, this might be totally confusing and unexpected.

So, long story short, that manpage is here: https://github.com/systemd/systemd/blob/main/man/systemd.... please send a PR to reword so that it becomes clear to you as a user, and I'll happily review and merge it

Systemd 254 released

Posted Aug 3, 2023 11:11 UTC (Thu) by rschroev (subscriber, #4164) [Link] (8 responses)

I can't reword it because I don't have a good enough grasp on how it works. Every new piece of information seems to contradict the previous one. And when the documentation doesn't match the code, I can't know which one of them (if any) is correct.

> to me it's perfectly obvious that ExecStop is run regardless of _how_ a unit went away

But *when*? Is it triggered e.g. at the time you do 'systemctl stop', regardless of what happened to the service in the meantime? Or is it triggered at the time systemd notices that the service went away? That's a big difference.

> To me, "Note that the commands specified in ExecStop= are only executed when the service started successfully first." is clear.

It seems clear to me too, but my interpretation is contradicted by the list in zdzichu's comment (https://lwn.net/Articles/940224/), which is correct as far as I can see. According to that, the commands in ExecStop= *are* executed even if the service did *not* start successfully, at the moment systemd detects that.

Systemd 254 released

Posted Aug 3, 2023 12:28 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)

I'm guessing that at least one pathological behaviour here is

(1) systemd fires off a process
(2) this process fires off the daemon and exits
(3) systemd sees the process has terminated, and runs ExecStop

That certainly is the sort of behaviour I assumed was behind the double-daemonisation, and why this fork option was added to the unit file - to prevent exactly this mis-understanding by systemd. I must admit that wasn't obvious from said documentation but it was all in there ...

And that's what's probably behind ExecStop being executed in my case (still doesn't explain the killing spree ...)

Cheers,
Wol

Systemd 254 released

Posted Aug 3, 2023 13:05 UTC (Thu) by gioele (subscriber, #61675) [Link] (1 responses)

> And that's what's probably behind ExecStop being executed in my case (still doesn't explain the killing spree ...)

Maybe the service was also launching other services or calling other init scripts?

In that case these newly spawned processes will live inside the cgroup of the service and are going to be killed by systemd once the main service is stopped.
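
For reference, which processes get killed on stop is governed by the KillMode= setting. A minimal sketch follows, with a hypothetical binary path; it shows the default explicitly rather than recommending a change.

```ini
[Service]
ExecStart=/usr/local/sbin/exampled
# control-group is the default: on stop, every process still left in
# the unit's cgroup is killed, including anything the service spawned.
KillMode=control-group
# KillMode=process would kill only the main process and leave the
# rest running; the systemd documentation advises against it.
```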

Systemd 254 released

Posted Aug 3, 2023 15:37 UTC (Thu) by Wol (subscriber, #4433) [Link]

> In that case these newly spawned processes will live inside the cgroup of the service and are going to be killed by systemd once the main service is stopped.

Oh the joys of people not reading the thread. The killing spree is OF OTHER SERVICES which have nothing whatsoever to do with the service causing the problem ...

Ie something is seriously wrong somewhere. I just need to debug it.

Cheers,
Wol

Systemd 254 released

Posted Aug 3, 2023 21:14 UTC (Thu) by malmedal (subscriber, #56172) [Link]

> And that's what's probably behind ExecStop being executed in my case (still doesn't explain the killing spree ...)

Wild guess. Some init-scripts kill process-groups instead of pids, so if it hit the wrong one...

Systemd 254 released

Posted Aug 3, 2023 16:55 UTC (Thu) by jem (subscriber, #24231) [Link] (3 responses)

The linked man page contains the following text for ExecStop: "Commands to execute to stop the service started via ExecStart". This hints that the purpose of ExecStop is to provide the commands to explicitly stop the service, triggered by some external event like systemctl stop.

> But *when*? Is it triggered e.g. at the time you do 'systemctl stop', regardless of what happened to the service in the meantime? Or is it triggered at the time systemd notices that the service went away? That's a big difference.

Looking at the code, it is called as a direct result of systemctl stop, which calls service_stop. If the service state is SERVICE_RUNNING, the service_stop function unconditionally calls service_enter_stop, which in turn executes the command specified in ExecStop (if any).

I don't see why it would be "totally confusing and unexpected" to a user that the commands in ExecStop are not run if the service fails to start. If the service failed to start, what's the point in trying to stop it? You don't try to close a file that you failed to open, either.

Systemd 254 released

Posted Aug 3, 2023 18:09 UTC (Thu) by rschroev (subscriber, #4164) [Link] (2 responses)

> I don't see why it would be "totally confusing and unexpected" to a user that the commands in ExecStop are not run if the service fails to start. If the service failed to start, what's the point in trying to stop it? You don't try to close a file that you failed to open, either.

But according to the source code, the ExecStop commands *are* run even if the service fails to start. Referring to zdzichu's comment somewhere in this thread (see https://lwn.net/Articles/940224/), line 2264 in service.c (in service_enter_running()) calls service_enter_stop() when a service fails to start. service_enter_stop() in turn executes the ExecStop commands.

I agree with you: I don't expect ExecStop to be triggered if a service fails to start. The documentation agrees, if I interpret it correctly. But unless both zdzichu and I are misreading the code, the code does trigger it in that case.

Systemd 254 released

Posted Aug 3, 2023 18:56 UTC (Thu) by bluca (subscriber, #118303) [Link] (1 responses)

They are not:

$ sudo systemd-run --quiet -t -p ExecStop="echo hello" false
$ sudo systemd-run --quiet -t -p ExecStop="echo hello" true
hello

Systemd 254 released

Posted Aug 4, 2023 7:47 UTC (Fri) by rschroev (subscriber, #4164) [Link]

And neither are the ExecStop= commands executed when ExecStartPost failed:

$ sudo systemd-run --quiet -t -p ExecStop="echo hello" -p ExecStartPost="false" true
-> gives no output

So it seems we both did misread the code. Good, that solves my worries.

Systemd 254 released

Posted Aug 2, 2023 17:18 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

It came from paulj in https://lwn.net/Articles/939956/ - he's suggesting that you write a systemd unit that just wraps the existing sysv script. You appear to have misinterpreted that as a suggestion that your program was a script.

Systemd 254 released

Posted Aug 2, 2023 9:15 UTC (Wed) by anselm (subscriber, #2796) [Link] (2 responses)

> Except as soon as I put an execstop in there, it triggers a mad killing spree at boot that kills loads of unrelated services.

The approach where in your systemd unit file you use ExecStart= and ExecStop= to call your existing init script is generally pretty safe. In effect it's what systemd's SysV init compatibility layer does, too.

But of course whatever the init script does is outside systemd's control, and some init scripts can be pretty wild. What I'm wondering is why the behaviour you're seeing would in any way, shape, or form be systemd's fault. At its heart systemd is a fairly straightforward piece of software, certainly as far as launching services is concerned.

Systemd 254 released

Posted Aug 2, 2023 10:43 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

I'm not blaming systemd at all. I am merely observing that THAT IS WHAT HAPPENS. Why people can't tell the difference between an objective description of reality and a subjective allocation of blame I really don't understand! Sadly the latter seems to be the norm.

Which can make fixing problems EXTREMELY difficult - one only has to look at society to see the trouble this causes :-(

Cheers,
Wol

Systemd 254 released

Posted Aug 2, 2023 11:46 UTC (Wed) by anselm (subscriber, #2796) [Link]

> I am merely observing that THAT IS WHAT HAPPENS.

To be precise, this is what happens to you, or at any rate what seems to have happened to you in one particular instance. It is certainly not the norm.

As I said, the usual way for systemd to support services which only have SysV init scripts is to construct basic systemd service units on the fly that essentially use ExecStart= and ExecStop= to invoke the init script, and that seems to work without a hitch in the vast majority of cases. Taking such an init script as a first approximation to/starting point for a free-standing service unit file would not be the dumbest of ideas (even though systemd prefers services that don't double-fork).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds