
SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 9, 2016 6:24 UTC (Thu) by matthias (subscriber, #94967)
In reply to: SIGHUP for "session has gone away", not SIGTERM/SIGKILL by ras
Parent article: Distributors ponder a systemd change

>> If 3 goes wrong, the process survives.
> True. But if you're doing cleanup on exit, you're also doing it for a SIGTERM at the very least. It will be identical code, and if a SIGHUP freezes it's very likely a SIGTERM will do the same thing.

Therefore systemd sends a SIGKILL after some time. This is meant to bring down the processes where SIGTERM did not work. This is the same mechanism that shutdown has used for decades.
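The SIGTERM-then-SIGKILL escalation described here can be sketched in a few lines. This is only an illustration of the general pattern, not systemd's or shutdown's actual code; the helper names `stop_gracefully` and `spawn_stubborn_child` and the grace period are assumptions made for the example.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask a child to exit with SIGTERM; force it with SIGKILL if it does not."""
    proc.send_signal(signal.SIGTERM)      # polite request, can be caught or ignored
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL cannot be caught or ignored
        return proc.wait()


def spawn_stubborn_child() -> subprocess.Popen:
    """Start a demo child that ignores SIGTERM, standing in for a buggy program."""
    code = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()                # wait until SIGTERM is actually ignored
    return proc
```

Calling `stop_gracefully(spawn_stubborn_child(), grace_seconds=0.5)` returns a negative exit status, which is how `subprocess` reports that the child was ended by a signal rather than exiting on its own.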

>> The old behaviour is not really working this way, as SIGHUP is not successful in terminating all these processes.
> Then there is a bug.

Obviously, but it is not only a GNOME bug. I have seen this bug on KDE many years before systemd even existed. Of course it is nice to fix the bugs, but it is also obvious that there will always be some bugs.

>> - We need some version of nohup that also tells systemd to not kill the process (systemd-run should work, users need to get used to this).
> Yep, we have it. It's called signal(SIGHUP, SIG_IGN).

Unfortunately, SIGHUP is also sent in some situations when the login session is not terminating. So there is software that ignores SIGHUP for reasons other than wanting the process to survive the session. Also, any software with a bug in its SIGHUP signal handler could be a problem. In my experience, problems with the SIGHUP handler are the usual reason for processes lingering around that should have exited.
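For readers who have not looked at how the signal(SIGHUP, SIG_IGN) approach behaves, here is a minimal sketch of the signal-related part of what nohup does: set SIGHUP to "ignore" in the child before exec, relying on the POSIX rule that ignored signals stay ignored across exec. The function name `run_ignoring_hangup` is made up for this example. Note that nothing in it records why the signal is ignored, which is the ambiguity described above.

```python
import signal
import subprocess
import sys


def _ignore_sighup() -> None:
    # Runs in the child between fork() and exec(); an ignored signal
    # remains ignored in the program that exec() starts.
    signal.signal(signal.SIGHUP, signal.SIG_IGN)


def run_ignoring_hangup(argv: list) -> subprocess.Popen:
    """Start argv with SIGHUP ignored (the signal part of what nohup does)."""
    return subprocess.Popen(argv, preexec_fn=_ignore_sighup, stdout=subprocess.PIPE)
```

A child started this way sees SIGHUP already set to ignore at startup, whether or not it ever asked for that itself.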

> But most people won't use it because it's not a serious issue on a headless server or a personal laptop, ...
I expect distros to accept the change once the few problematic programs have been fixed. Most users will not change it back once screen and tmux work and the manual says to use systemd-run for background processes instead of nohup, as they will not encounter any problems.



SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 9, 2016 7:08 UTC (Thu) by ras (subscriber, #33059) [Link] (12 responses)

>> SIGTERM at the very least. It will be identical code and if a SIGHUP freezes it's very likely a SIGTERM will do the same thing.
>
> Therefore systemd sends a SIGKILL after some time. This is meant to bring down the processes where SIGTERM did not work. This is the same mechanism that shutdown has used for decades.

I think you missed the point. The point is that if there is a bug in the SIGHUP handling, there is also most likely a bug in the SIGTERM handling. Sending a SIGKILL does not fix the problem. It hides it. Assuming the application is trapping both of these for a reason such as saving data, then fixing the bug is the correct path - not hiding it.
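The "identical code" argument is easier to see with a sketch. Assuming a program that traps both signals in order to save data, a common arrangement is a single handler registered for SIGHUP and SIGTERM, so a bug in the cleanup path shows up for either signal. All names below (`shutdown_requested`, `main_loop`, `save_state`) are hypothetical.

```python
import signal

shutdown_requested = []      # signal numbers received so far


def _request_shutdown(signum, frame):
    # Keep the handler tiny: record the request, let the main loop do the work.
    shutdown_requested.append(signum)


def install_exit_handlers():
    """Route SIGHUP and SIGTERM to the same handler, so both share one cleanup path."""
    for sig in (signal.SIGHUP, signal.SIGTERM):
        signal.signal(sig, _request_shutdown)


def main_loop(do_work, save_state):
    """Run do_work() until a shutdown signal arrives, then save once and return."""
    install_exit_handlers()
    while not shutdown_requested:
        do_work()
    save_state()
```

If `save_state` hangs, it hangs no matter which of the two signals triggered it; only an uncatchable SIGKILL would get rid of the process, and it would do so without the data being saved.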

That said, it's been a long while since I've seen either problem. Until now, when GNOME introduced it as a "feature".

> Obviously, but it is not only a bug of gnome. I have seen this bug on KDE many years before systemd even existed.

Yep, and it was fixed by KDE long long ago. The difference is that GNOME / systemd don't consider what they have done to be a bug - it's a new feature. Then, to fix the bugs their new feature introduced, they want to break backward compatibility with systems that don't use GNOME. KDE had the decency to fix their bugs without using it as an excuse to inflict their version of Utopia on everyone else.

> Unfortunately, SIGHUP is also sent in some situations when the login session is not terminating.

Only when it's been co-opted for other purposes - like reloading the configuration in system daemons. And if a particular program does that, either it is uninterested in knowing when the user logged out, or it has introduced a bug, because there is no other way to know short of polling. This won't change under the proposed new regime, as SIGHUP will remain the way a process learns the login session has ended.

> I expect distros to accept the change, once the few problematic programs have fixes.

We will see. As the article points out, the tmux people don't see their program as the problematic one in this case, and from what I can tell a fairly large cohort of people agree with them.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 9, 2016 9:09 UTC (Thu) by matthias (subscriber, #94967) [Link] (11 responses)

>> Therefore systemd sends a SIGKILL after some time. This is meant to bring down the processes where SIGTERM did not work. This is the same mechanism that shutdown has used for decades.
> I think you missed the point. The point is that if there is a bug in the SIGHUP handling, there is also most likely a bug in the SIGTERM handling. Sending a SIGKILL does not fix the problem. It hides it. Assuming the application is trapping both of these for a reason such as saving data, then fixing the bug is the correct path - not hiding it.

I am arguing that session shutdown is like system shutdown. For decades we have used SIGTERM/SIGKILL when shutting down the system. Would you argue that when I type shutdown -r now and some application does not terminate cleanly, the system should hang forever because sending a SIGKILL after SIGTERM hides bugs?

I fully agree that bugs should be fixed, but on the other hand fundamental things such as session management should handle application bugs gracefully.

>> Unfortunately, SIGHUP is also sent in some situations when the login session is not terminating.
> Only when it's been co-opted for other purposes - like reloading the configuration in system daemons.
Or a pty going away because an X terminal is closed. Not every X terminal is a session on its own. Semantics have changed over time.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 10, 2016 2:49 UTC (Fri) by ras (subscriber, #33059) [Link] (10 responses)

> I am arguing that session shutdown is like system shutdown.

Yes, I knew that and should have addressed it, I guess. But it did sound to me like "I use a hammer to crack nuts, so why not an egg?"

The way I see it is: if a process hangs around after logout when it shouldn't, about the only harm done is a little lost RAM, or at worst a pinned CPU if it's gone into an infinite loop. If that happens and it bothers you, the fix is also simple: kill it. On the other hand, automatically killing a process when it hasn't shut down properly delays getting the bug fixed. And there is a bug that needs to be fixed: either it doesn't matter, in which case why is it trapping SIGHUP at all, or it does matter and tears will follow one day.

If a process doesn't stop on shutdown the implications are much more severe. I've lost control of remote servers because of it. Plane flights cost time and money. It's not that the consequences of killing the process aren't the same: both result in loss of information. It's that the consequences of the shutdown not happening are very different.

> Or a pty going away because an X terminal is closed. Not every X terminal is a session on its own. Semantics have changed in time.

Yes they have. The session id the kernel used has been co-opted for all sorts of purposes now. This is the real problem you are grappling with. We used all sorts of kludges to get around it, but apparently these GNOME changes were the straw that broke the camel's back. It seems session tracking is now far too hard, so rather than track sessions you've decided killing all processes belonging to a user is the way to go.

Obviously it's a kludge. It's racy (what if a person logs in between the 0->1 test and you lot starting to kill processes), and it won't always work (what about processes started as a different user), and it isn't backward compatible with what has worked for 30 years now.

I'd have more sympathy if you were trying to get something simple done and stumbled onto this mess. Instead, you lot with your multiple systemd process trees are responsible for the worst aspects of it. And all this so you can optimise multiple GNOME sessions on the one machine. Does that even happen?

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 10, 2016 3:23 UTC (Fri) by pizza (subscriber, #46) [Link]

> And all this so you can optimise multiple GNOME sessions on the one machine. Does that even happen?

Yes.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 10, 2016 8:55 UTC (Fri) by matthias (subscriber, #94967) [Link] (8 responses)

OK, I agree that the consequences in case of a shutdown/reboot are usually more severe. I was just bitten by this during my studies, as we constantly had some PCs in the pool with processes that, instead of exiting cleanly, started using 100% CPU. And yes, these were bugs, not students trying to help SETI with university computing power. Killing them was often not possible, as the admins had different working hours than the students and obviously a normal user cannot kill processes of other users. The systemd KillUserProcesses setting would have been very welcome.

> Obviously it's a kludge. It's racy (what if a person logs in between the 0->1 test and you lot starting to kill processes),

In contrast to solutions with pkill, systemd should only kill processes belonging to closed sessions (and not of the new session), as it tracks sessions with cgroups. There might be a race condition if the new session decides to use some process of an old session which gets killed. I am not sure whether systemd removes this race by delaying the start of the new session while on a killing spree. This should be possible.
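To make the cgroup-based tracking concrete: on a systemd machine, the contents of /proc/<pid>/cgroup show which cgroup a process sits in. The sketch below parses that text for a session scope. It assumes the usual logind path layout of the form /user.slice/user-UID.slice/session-ID.scope; the function names are invented for the example, and this is not how systemd itself performs the lookup.

```python
import re
from typing import Optional

# Assumed layout: each login session lives in a scope unit such as
# /user.slice/user-1000.slice/session-3.scope
_SESSION_SCOPE = re.compile(r"/session-([^/]+)\.scope(?:/|$)")


def session_of(cgroup_text: str) -> Optional[str]:
    """Return the session id found in the contents of /proc/<pid>/cgroup, if any."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = _SESSION_SCOPE.search(parts[2])
        if match:
            return match.group(1)
    return None


def read_session_of(pid: int) -> Optional[str]:
    """Look up the session of a live process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return session_of(handle.read())
```

Because membership is decided by the cgroup rather than by user id, a process from an old session and a process from a freshly opened session of the same user end up with different answers, which is what a pkill-by-user approach cannot distinguish.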

> and it won't always work (what about processes started as a different user),
I just tested this starting some process with su as a different user (I temporarily added my test user to the wheel group). The process was terminated, because it was in the same cgroup. This case should not be that important anyway, as normal users are not allowed to start processes as different users.

> and it isn't backward compatible with what worked for 30 years now.
I agree, but the programs that need changes are very few. In most cases, background processes are started as daemons anyway. I always see screen and tmux mentioned, and their session management is broken anyway. Helper programs like ssh-agent get terminated when the user logs out, even when they are needed inside the screen. Registering a session with PAM would be cleaner anyway.

Obviously this change should only hit stable distributions once screen and tmux are fixed (either upstream or by distribution patches).

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 11, 2016 0:30 UTC (Sat) by ras (subscriber, #33059) [Link] (7 responses)

> The systemd KillUserProcesses would have been very welcome.

Yes, I can well imagine it would be.

But you did have another option: http://lwn.net/Articles/690555/

Well maybe not that precisely, but with a few tweaks you could have made it kill all processes owned by a student when his last session was gone and, unlike the current proposal, made it very selective so only the students were affected and not sysadmins or others squawking here. This is exactly the sort of problem PAM's session management is well suited to.
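A rough sketch of what such a selective PAM hook might decide is shown below. It is not the script from the linked article. It assumes the hook is run through pam_exec, which exports PAM_TYPE and PAM_USER in the environment; the function name `reap_command`, the list of users to reap, and the way remaining sessions are counted are all hypothetical and supplied by the caller.

```python
from typing import Iterable, List, Mapping, Optional


def reap_command(
    env: Mapping[str, str],
    reaped_users: Iterable[str],
    remaining_sessions: int,
) -> Optional[List[str]]:
    """Decide what a session-close hook should run; None means leave everything alone."""
    if env.get("PAM_TYPE") != "close_session":
        return None                  # only act when a session is closing
    user = env.get("PAM_USER", "")
    if not user or user not in set(reaped_users):
        return None                  # sysadmins and everyone else are untouched
    if remaining_sessions > 0:
        return None                  # the user is still logged in somewhere
    # Assumption: a plain pkill by user name is an acceptable way to clean up.
    return ["pkill", "-TERM", "-u", user]
```

In a real deployment `env` would be `os.environ` and `reaped_users` might come from something like the member list of a students group, so the policy lives in whatever directory or group data the site already maintains.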

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 11, 2016 2:06 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

So why is it any better than a setting in user config file?

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 11, 2016 8:42 UTC (Sat) by ras (subscriber, #33059) [Link] (5 responses)

> So why is it any better than a setting in user config file?

It was just a work-around, and I don't doubt there are many people who think getting upstream to provide a config option for their particular problem is a better solution. Maybe you are one of them. I'm not.

There is a 50 line solution to the problem. It isn't a patch to upstream I have to carry, the API is stable, a compile isn't required, it doesn't require me to monitor upstream security problems and rebuild it with every fix - it's just drop a file into a directory and go. If I were the sysadmin being given grief by miscreant students, I know I would have invested the hour needed to write it. If, as claimed, there are a lot of other sysadmins with the same problem, I am somewhat puzzled that it isn't packaged and available on the major distros already, because if it had been it would have been just a setting in PAM's config file.

Which brings us to the real point. I don't use Linux because it has a setting in a config file for my every need - that's an impossible ask after all. (If I believed it was possible, I would be using Windows. Obviously it's not there yet, but given it's possible it must be just around the corner ...) I use 'nix because it's a Swiss army knife that is so flexible that for most problems there is a 50 line solution.

The KillUserProcesses setting looks nothing like that. Elsewhere you said it can be controlled per user. What if I don't find per-user control particularly useful? Maybe I'm a sysadmin with a miscreant student population in a large educational institution that turns over staff regularly, and with every change of staff I have to change the systemd configuration on hundreds of machines. I don't think so. Give me a system that provides the flexibility to configure it in a way that suits me. Maybe I put all students in the one group, or maybe I look up payroll, or read a flag out of FreeIPA.

I'd take that flexibility over a specialised "config option" any day. Quite apart from anything else, I could not be as productive in my professional life without it.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 11, 2016 11:25 UTC (Sat) by pizza (subscriber, #46) [Link] (4 responses)

> I'd take that flexibility over a specialised "config option" any day. Quite apart from anything else, I could not be as productive in my professional life without it.

Then it's a good thing that you're not forced to choose between those two, eh?

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 11, 2016 11:55 UTC (Sat) by ras (subscriber, #33059) [Link] (3 responses)

> Then it's a good thing that you're not forced to choose between those two, eh?

If we could leave it turned off with no repercussions other than our tmux sessions continue to run, there wouldn't be almost 300 posts on LWN about this. The reality is, if we want GNOME to clean up properly, we have to enable KillUserProcesses. Frankly I'd even accept that, albeit for purely selfish reasons as I'm not a fan of GNOME 3. Unfortunately many of the other window managers rely on GNOME to fill the gaps in their own efforts, including the one I use on my laptop.

This doesn't feel like we are being offered a choice.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 11, 2016 14:55 UTC (Sat) by pizza (subscriber, #46) [Link]

> If we could leave it turned off with no repercussions other than our tmux sessions continue to run, there wouldn't be almost 300 posts on LWN about this.

You are, in a word, incorrect.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 12, 2016 10:20 UTC (Sun) by micka (subscriber, #38720) [Link] (1 responses)

> there wouldn't be almost 300 posts on LWN about this

Slowly reading through them. From what I've read up until now, two thirds are from 3 or 4 people. I'm not sure what you can deduce from the number of comments except that there are some very talkative commenters.

SIGHUP for "session has gone away", not SIGTERM/SIGKILL

Posted Jun 17, 2016 16:01 UTC (Fri) by Wol (subscriber, #4433) [Link]

:-)

Certain topics press certain buttons. GNOME brings out one set of posters. Systemd brings out another (and I've noticed systemd tends to attract troll accounts I've never seen before ...)

And databases? Well that tends to get me going :-) It's all about what matters to people. And some people just enjoy sitting in the peanut gallery lobbing rotten tomatoes ... :-)

Cheers,
Wol


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds