Distributors ponder a systemd change
Killing processes on logout
Systemd makes extensive use of control groups to manage processes on the system; because control groups contain processes and any subprocesses they may create, systemd has a clear view of the full set of processes that belong to each service or session. That allows it to, for example, kill all of the processes belonging to a given service, regardless of whether those processes were started directly by systemd. When it comes to user sessions, systemd (or, rather, the logind component) has, for some time, had the ability to kill all of a user's running processes when that user logs out.
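To see which control group a process belongs to, one can read /proc/PID/cgroup, where each line has the form "hierarchy-ID:controller-list:cgroup-path". The following is a minimal sketch of a parser for that format; the function name and the sample path are illustrative assumptions, not something taken from systemd itself.

```python
def parse_cgroup_lines(text):
    """Parse the contents of /proc/<pid>/cgroup into a list of tuples.

    Each returned tuple is (hierarchy_id, controllers, path).
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice: the cgroup path itself may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        controller_list = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), controller_list, path))
    return entries


# Hypothetical sample line for a process inside a login session scope.
sample = "0::/user.slice/user-1000.slice/session-3.scope\n"
for entry in parse_cgroup_lines(sample):
    print(entry)
```

On a live system, the same function could be fed the contents of /proc/self/cgroup to show where the current process sits in the hierarchy.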
That capability is controlled by the KillUserProcesses configuration option to logind; by default in older systemd releases, that option was set to "no". The systemd 230 release announcement included, among many other changes, the news that the default had been changed to "yes," so that, absent other measures, users cannot leave processes running after they log out of the system. This change quickly found its way into the faster-moving development distributions; the unhappy cries from users were not far behind.
The problem, from the point of view of these users, is that systemd 230's behavior represents a significant change in how Linux systems work, and that this change will surprise people in unpleasant ways. Prior to this change, processes could easily be made to persist after the user logs out; it was just a matter of blocking the SIGHUP signal that is delivered when the controlling terminal goes away. That could be done by running the process in the background on C shell variants, or with the nohup command for Bourne shell variants. A program could also take control of its own fate by managing SIGHUP directly.
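A program that manages SIGHUP itself typically just asks the kernel to discard the signal, which is also roughly what nohup arranges before running its command. A minimal sketch, with hypothetical function names, might look like this:

```python
import os
import signal


def ignore_hangup():
    """Tell the kernel to discard SIGHUP for this process."""
    signal.signal(signal.SIGHUP, signal.SIG_IGN)


def survives_hangup():
    """Send SIGHUP to ourselves after ignoring it.

    With the default disposition the process would be terminated here;
    with SIGHUP ignored, execution simply continues.
    """
    ignore_hangup()
    os.kill(os.getpid(), signal.SIGHUP)
    return True  # only reached because the signal was ignored


if __name__ == "__main__":
    print("still running:", survives_hangup())
```

The disposition is inherited by child processes and preserved across exec, which is why a small wrapper program can protect an arbitrary command this way.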
In the new scheme of things, life is not so simple. Blocking SIGHUP will no longer work, since systemd does not use that signal; instead, it sends SIGTERM followed by an unblockable SIGKILL. Thus, programs that have traditionally set themselves up to survive a logout (screen and tmux are often-cited examples) can no longer do so and will not be able to perform their intended function.
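The difference between the two signals can be demonstrated from user space: a process may install a handler for SIGTERM, but the kernel refuses any attempt to handle SIGKILL. The helper below is an illustrative sketch (its name is made up for this example) that probes whether a handler can be installed.

```python
import signal


def can_install_handler(signum):
    """Return True if this process is allowed to handle the given signal."""
    try:
        previous = signal.signal(signum, lambda number, frame: None)
    except (OSError, ValueError):
        # The kernel rejects handlers for SIGKILL (and SIGSTOP).
        return False
    if previous is not None:
        # Put the original disposition back so the probe has no side effects.
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    print("SIGTERM can be handled:", can_install_handler(signal.SIGTERM))
    print("SIGKILL can be handled:", can_install_handler(signal.SIGKILL))
```

So a program can use SIGTERM as a chance to clean up, but it has no way to refuse the SIGKILL that follows.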
There are ways around this problem, of course. One is for the program to tell systemd directly that it needs to persist. That involves accepting the systemd library as a dependency. Not all projects are willing to add this kind of systemd-specific code; see this tmux tracker entry, for example. So it may be some time before even the programs that are explicitly intended to run after logout are able to work transparently in this manner.
Beyond that, most programs are not designed for this kind of persistence, but can be used that way anyway; it is not uncommon to place a long-running task in the background and expect it to complete after logging out. In the new systemd world, these programs need to be run with the systemd-run command. Creating a version of nohup that invokes systemd-run should not be a hard thing for a distributor to do, so there should be relatively little disruption in cases where users explicitly ask for persistence already.
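A nohup-like wrapper of the kind described here could do little more than prefix the user's command with systemd-run. The sketch below only builds the command line and does not execute it; the function name is hypothetical, and the choice of the --user and --scope options is an assumption about how a distributor might set things up rather than a description of any shipped tool.

```python
import shlex


def persistent_command(argv):
    """Build a systemd-run command line that runs argv in its own scope.

    Assumption: the command is placed in a scope unit of the user's own
    systemd instance, outside the login session's control group.
    """
    if not argv:
        raise ValueError("no command given")
    return ["systemd-run", "--user", "--scope"] + list(argv)


if __name__ == "__main__":
    # Show the command a wrapper would execute for a long-running build.
    print(shlex.join(persistent_command(["make", "-j4", "all"])))
```

A real wrapper would pass the resulting list to something like os.execvp(), and would still depend on the account being allowed to keep processes running after logout.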
The only remaining snag is that the user account must have "lingering" enabled for persistence requests to actually be honored. This is done with the "loginctl enable-linger" command, but is turned off by default. Enabling lingering can be configured as an unprivileged operation, or it can be restricted to system administrators. In the end, the ability to run processes that persist after logout has not gone away, but the way things behave by default has changed.
Arguments pro and con
It has been suggested that this change has been made to cope with GNOME applications that won't go away without a SIGKILL-sized hint. There may be some truth to that, but it also appears that all desktop environments have at least some problems with unwanted persistent processes. Such processes don't just clog the system; they can also delay the availability of the console for another login or slow down the shutdown process. It is easy to say that such programs should simply be fixed, but, in the real world, sometimes one has to stop playing whack-a-mole and just pave the field instead.
Beyond that, systemd creator Lennart Poettering sees process persistence as a security issue. Allowing somebody to log into a machine should not imply allowing that user to run code when they are not logged in; that is a separate decision that should be reserved for the administrator. Even if the change creates some trouble, he believes that somebody needs to do it:
One need not look far to find criticism of Lennart's code, attitude, ancestry, or hair style, but one would be hard put to find accusations that he (or the systemd development community as a whole) is unwilling to dare to make disruptive changes. The community was certainly up to the challenge this time, and so the change was made.
This change has clearly already surprised a number of users; see this Debian bug or this fedora-devel thread for examples. Some of the discontent can be seen as the sort of "but that's not how we've always done it" screaming that seems to follow everything that happens in the systemd space, but that is not the whole of it.
Most people who have looked at how systemd works in this area seem to think that the ability to ensure that no user processes persist after logout is useful. But many object to turning it on by default, especially in a setting where most programs that need to be persistent have not been updated to work in the new regime. As Chris Adams put it in the Fedora thread:
It seems fairly clear that there would be less resistance to this change if common patterns still worked without the need for users to notice the new rules and make explicit changes. On the other hand, without the pressure represented by this change, the work to make programs operate properly might never actually get done. On yet another hand, there is little evidence of the systemd developers submitting patches to fix popular programs before making this change.
Distributor response
In the Linux world, upstream projects do not normally have the last word when it comes to system behavior; that privilege (and responsibility) falls to distributors. So the logout behavior seen by users may differ from that chosen by the systemd development community. The default as shipped by that community can be changed by distributors in a couple of ways. One is to build logind with the --without-kill-user-processes flag to restore the default to "no"; that is what Arch Linux and Gentoo have chosen to do, for example. The other is to set the KillUserProcesses option to "no" in a distribution-supplied logind.conf file; that would appear to be the outcome of the Debian discussion. [Correction: in truth, Debian seems to have taken the build-time flag approach as well.]
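The interaction between the two mechanisms can be sketched in a few lines: the build-time choice supplies the default, and an explicit KillUserProcesses line in the [Login] section of logind.conf overrides it. The code below is an illustrative approximation using Python's configparser; it is not how logind itself parses its configuration, and the function name and the accepted boolean spellings are assumptions.

```python
import configparser


def kill_user_processes(conf_text, compiled_default=True):
    """Return the effective KillUserProcesses setting.

    conf_text is the content of a logind.conf-style file;
    compiled_default stands in for the value chosen at build time.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    value = parser.get("Login", "KillUserProcesses", fallback=None)
    if value is None:
        # No explicit setting (or it is commented out): build default wins.
        return compiled_default
    return value.strip().lower() in ("1", "yes", "y", "true", "t", "on")


if __name__ == "__main__":
    distro_conf = "[Login]\nKillUserProcesses=no\n"
    print(kill_user_processes(distro_conf))            # explicit "no"
    print(kill_user_processes("[Login]\n#KillUserProcesses=no\n"))  # default
```

Either approach gives users the old behavior; the difference is mainly in where an administrator has to look to find out why.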
Other distributions that use systemd will still need to make a decision on when — and if — they will pass this change on to their users. The Fedora discussion has had few clear outcomes so far, with one exception: this change certainly will not appear in Fedora 24, which, having slipped three times, is now due on June 14. It would appear that the chances of this change showing up in Fedora 25 are fairly high, though. Some of the other community-oriented distributions have yet to hold their discussion on this issue, but, if past history is any guide, the lengthy threads will eventually appear.
Enterprise-oriented distributions have more time to think about it, of course; by the time they have to make a decision, it may well be that most of the known problems have been fixed. This may be the setting where the default behavior matters the most, though. In 2016, logging out of a desktop system is relatively rare; users are far more likely to lock the screen, power down entirely, or suspend the system. Logging into a server, starting a long-running process, and logging out is a much more common pattern. So the enterprise Linux user base, which is arguably more focused on servers, may care more about this issue.
For all the fuss, one might well argue that the development community is working as it should. Upstream projects should push toward their view of a better world, making changes where they can even if those changes sometimes go against longstanding tradition. Distributors, instead, are responsible for delivering stable, working systems to their users; most of them would prefer to avoid shipping unpleasant surprises. So it is fitting that systemd pushes things in a direction that its developers see as being cleaner and more secure, but it is also fitting that distributors hold off on shipping changes until they work well. The process is noisy, but the results are often good.
Posted Jun 7, 2016 23:20 UTC (Tue)
by GhePeU (subscriber, #56133)
[Link] (19 responses)
A change of this magnitude can't just be introduced like this, with no discussion at all and the expectation that users must suddenly launch commands differently, downstream developers must suddenly depend on systemd and system administrators everywhere must start fiddling with configuration files just to revert to the normal, useful behaviour of any Linux system pre-this idiocy after users start seeing things breaking (with negligible consequences like launching a process overnight before leaving and discovering in the morning that no, the results you were expecting aren't there because three different people didn't each do something to prevent the issue).
Insane, just insane, and now I'm really starting to believe that maybe the people who complained so much about previous "innovations" weren't so wrong.
Posted Jun 7, 2016 23:31 UTC (Tue)
by GhePeU (subscriber, #56133)
[Link] (9 responses)
Also, I'm not sold on this pretended difference between "servers" and "workstations" or "desktop systems." A Linux system is a Linux system is a Linux system; I remotely log in to so-called "workstations" and "desktop systems" regularly, at work and at home, and I don't see why suddenly screen should stop working as intended on my desktop PC just because I'm running Fedora instead of Red Hat.
Posted Jun 8, 2016 3:54 UTC (Wed)
by pizza (subscriber, #46)
[Link] (6 responses)
It was never guaranteed to work unless you took explicit steps to ensure as such when you launched the process. The fact that it sorta mostly did (except when it randomly didn't) was more luck than any sort of explicit design decision.
Posted Jun 9, 2016 16:58 UTC (Thu)
by ksandstr (guest, #60862)
[Link] (5 responses)
So how do you justify systemd's new default explicitly breaking what even you recognize, above, as having worked before?
Posted Jun 9, 2016 17:20 UTC (Thu)
by pizza (subscriber, #46)
[Link] (4 responses)
FYI, on my personal systems I explicitly turned KillUserProcesses *on*, because I actually want that behavior. A couple of years ago I replaced my regular uses of nohup and screen with native systemd units or timers or whatever was appropriate, and haven't looked back since.
On the old-school shell server I administer, I've left that feature off, and I will do so until at least screen and tmux are shimmed to request proper login sessions. Once that's done, I'll flip the switch there too, and then I can finally get rid of my periodic process reapers that have to clean up after misbehaving crap.
This is a change, yes. But it's a change that, after a very minor amount of learning, leaves me with a more robust system that requires less ongoing attention than before. (Call me strange, but I believe in using the best tools for the task at hand)
Posted Jun 12, 2016 13:31 UTC (Sun)
by jspaleta (subscriber, #50639)
[Link]
I'm trying to keep my development system and even my workstation as locked down as my production environment now...and tracking the configuration differences..so I know exactly why I'm relaxing constraints on the dev system. I want to push against production constraints using non-production workloads in unexpected ways and see what breaks. The amount of relearning isn't that bad. I mean it's not like relearning to jump to python3... this is minor.
Posted Jun 19, 2016 3:53 UTC (Sun)
by zblaxell (subscriber, #26385)
[Link] (2 responses)
Since then I've copied the behavior for myself, in the form of a half dozen five-line shell scripts that replicate systemd's cgroup behavior. Every aspect of the cgroups' lives--how much RAM, CPU, and IO they can use, and making sure processes run, live, and die when they're told to--can be handled this way. It's _awesome_, and it's definitely one of the better ideas coming out of the systemd project.
The other thing I realized was that sysvinit had been ruining my days for years, and systemd was going to continue that pattern. To isolate myself from upstreams that should know better, but make breaking behavior changes anyway, I replaced init with a shell script. It's a little longer than five lines--ranging from 55 to 155 lines of code depending on whether it's a desktop, embedded, or server workload--but I haven't looked back since.
It was a painful transition with a bit of learning curve, but it needs much less attention than before. Apparently the best tools for the task at hand were the Unix shell, the & operator, and some small syscall wrapper programs.
Posted Jun 20, 2016 7:56 UTC (Mon)
by zlynx (guest, #2285)
[Link] (1 responses)
You can get away with it for system rescue, but long term?
Posted Jun 20, 2016 14:01 UTC (Mon)
by zblaxell (subscriber, #26385)
[Link]
I'm not sure if _any_ shell works, but bash and dash do. Any shell that can trap signals (i.e. all of them) and that uses PID 0 as the argument to waitpid (all of them written after 1987) do this just fine. The kernel blocks most of the fatal signals anyway. If /bin/sh is segfaulting you have big problems and you should probably panic the kernel to stop them from getting worse.
Posted Jun 8, 2016 12:56 UTC (Wed)
by amarao (guest, #87073)
[Link] (1 responses)
If you are sitting on the bleeding edge your butt is bleeding. But you are on the edge.
Posted Jun 10, 2016 9:56 UTC (Fri)
by linuxrocks123 (subscriber, #34648)
[Link]
Like this?
Posted Jun 8, 2016 3:28 UTC (Wed)
by pizza (subscriber, #46)
[Link] (6 responses)
If all of this is "no discussion at all" one wonders what you consider to be actual "discussion".
BTW, the only people for whom this change in defaults broke anything were those who deliberately opted to use fairly bleeding-edge, rolling-release distributions (eg Fedora Rawhide or Debian Unstable) that are, by their very nature, intended to flag potential issues due to [un-]anticipated changes in behavior.
I for one agree with our esteemed editor when he says "For all the fuss, one might well argue that the development community is working as it should."
Posted Jun 8, 2016 7:41 UTC (Wed)
by rschroev (subscriber, #4164)
[Link] (5 responses)
> If all of this is "no discussion at all" one wonders what you consider to be actual "discussion".
An actual discussion happens before the act, not after. In an actual discussion, one listens and considers arguments from other parties. None of that happened.
Posted Jun 8, 2016 8:43 UTC (Wed)
by ovitters (guest, #27950)
[Link] (3 responses)
Posted Jun 8, 2016 8:54 UTC (Wed)
by jubal (subscriber, #67202)
[Link] (2 responses)
Posted Jun 8, 2016 10:02 UTC (Wed)
by pboddie (guest, #50784)
[Link]
"But the plans were on display..."
Posted Jun 12, 2016 13:14 UTC (Sun)
by jspaleta (subscriber, #50639)
[Link]
There are a hell of a lot of communication channels... is everyone on the same page as to where they expect discussion to happen with regard to upstream changes for any project?
-jef
Posted Jun 8, 2016 21:39 UTC (Wed)
by HenrikH (subscriber, #31152)
[Link]
Posted Jun 8, 2016 12:17 UTC (Wed)
by vadim (subscriber, #35271)
[Link]
It's up to distributions to set their policy and to configure the packages however they wish. If a distro blindly ships software without checking whether the changes in the new version will result in something undesirable, then that's a problem with the release process.
That's the whole point of a distribution after all, and the reason besides convenience why we don't just tar -xvf ; make; make install everything.
Posted Jun 8, 2016 15:11 UTC (Wed)
by johannbg (guest, #65743)
[Link]
Upstream should always reflect how things should be (from their own perspective) while downstream reflects how things are, or at least how things are as relevant to them, since these things can deviate by a factor of how many different downstream sources there are.
It falls directly under the downstream package maintainers' responsibility to monitor upstream changes and, if they deem that some upstream change warrants discussion in the downstream community, to engage with their own respective community based on its guidelines, procedures and processes and discuss it. Then act according to the consensus that has been reached in said community.
Posted Jun 7, 2016 23:29 UTC (Tue)
by acollins (guest, #94471)
[Link] (33 responses)
Breaking an extremely common usecase and then telling everyone to fix their packages after the fact is not acceptable.
This should have followed the kernel process, where a change like this is clearly communicated months/years ahead of time and the community given time
Posted Jun 8, 2016 3:31 UTC (Wed)
by pizza (subscriber, #46)
[Link] (32 responses)
Assuming your distribution doesn't do it for you before you even get the update, the immediate solution is "flip the default back".
> This should have followed the kernel process, where a change like this is clearly communicated months/years ahead of time and the community given time
This feature was in the very first public systemd release. How much longer should they have waited?
Posted Jun 8, 2016 4:37 UTC (Wed)
by drag (guest, #31333)
[Link] (1 responses)
Systemd releases are not intended for end users. They are low-level Linux plumbing software. The intended audience are distributions and system integrators. Nothing systemd is doing here is forcing anybody to do anything at all.
If distributions don't like it they just switch it back. It's their job as distributions to be aware of these sorts of things and make these sorts of decisions. If the distributions are so clueless to miss this sort of thing or they abdicate their rights to make their own choices and just accept code carte blanche from upstream... then what is the point of distributions at all? Users would be better off just cutting out the middle man and just pull directly from upstream in some sort of automated 'linux from scratch' approach to installing Linux.
Posted Jun 9, 2016 6:17 UTC (Thu)
by NightMonkey (subscriber, #23051)
[Link]
God, I love Gentoo. :)
Posted Jun 8, 2016 4:43 UTC (Wed)
by error27 (subscriber, #8346)
[Link] (4 responses)
Posted Jun 8, 2016 5:50 UTC (Wed)
by rengolin (guest, #48414)
[Link] (3 responses)
Any stable distro can protect you from this change (I use Arch), so I don't see this as a big deal at all.
The only slight problem I see is that they should have worked directly with those more affected (e.g. tmux and screen) *before* defining what the interface looks like, to avoid long-lasting design problems. Though this is, again, chicken and egg, as defining who you talk to first is not easy.
All in all, noisy and a tad inefficient, but nothing out of the ordinary.
Posted Jun 9, 2016 9:33 UTC (Thu)
by Seegras (guest, #20463)
[Link] (2 responses)
And this means, if you want to change the default there, to PROVIDE A FUCKING FIX for afflicted programs such as nohup, screen and tmux. Even if it's your forked version of it, and furthermore ANNOUNCE IN ADVANCE to your (biggest) downstreams when you're changing the default.
I can see the reasoning behind what systemd did. But I absolutely object to the way they did and communicated it.
Posted Jun 9, 2016 14:10 UTC (Thu)
by johannbg (guest, #65743)
[Link] (1 responses)
And with regard to announcements, it's expected that downstream package maintainers are in good relation with upstream (which all the major distributions, Arch/CoreOS/Debian/Fedora/Gentoo/OpenSuse/Ubuntu etc., are with systemd) and follow upstream changes closely (which they do) to be prepared for changes like these (which they were) and start dialogs with the downstream community should they feel it necessary to do so (which as far as I know all of them did, some before the official release of systemd version 230, others after).
So the question here is how come you missed that discussion in your community (which usually indicates that people who miss such discussions aren't active contributors in their communities, or aren't part of them at all, as in just end users).
Posted Jun 9, 2016 18:59 UTC (Thu)
by flussence (guest, #85566)
[Link]
If.
Posted Jun 8, 2016 7:44 UTC (Wed)
by rschroev (subscriber, #4164)
[Link] (23 responses)
Until there is a consensus in the Linux community (not just in Poettering's mind or the systemd dev community) that this is the right thing to do.
Posted Jun 8, 2016 8:21 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (19 responses)
Posted Jun 8, 2016 8:28 UTC (Wed)
by rschroev (subscriber, #4164)
[Link] (18 responses)
Posted Jun 8, 2016 8:32 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (17 responses)
Posted Jun 8, 2016 21:19 UTC (Wed)
by xtifr (guest, #143)
[Link] (15 responses)
But the kernel developers at least claim to care about breaking working code. And in this case, we're talking about a change that has the potential to render a system completely unusable! If I background (as I frequently do) a system update, and then log out, this change means the update will be killed at some unknown point. If this happens while certain key files are being updated, the result can be catastrophic. And if I later log back in and use ps to see if the update is still running, and find it isn't, I may decide it's ok to shut down, destroying any remote chance I might have had at a simple recovery. (Assuming I can still log back in in the first place....)
If it's a remote system, staying logged in during the entire update process may not even be an option. My laptop and I may have places we need to be.
Compared to the potential for catastrophe this change brings, the (admittedly real and very annoying) problem of programs which fail to shut themselves down properly when asked to do so seems minor.
Now, I admit that sometimes, a potentially system-destroying change is unavoidable, for various reasons. But in a case like that, there really have to be bright, shining, neon lights that say "if you do X at this point, your system may be corrupted!" Quietly making such a change without warning is simply not acceptable!
Posted Jun 8, 2016 22:43 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (13 responses)
It's the same with systemd - its interface is stable.
Posted Jun 9, 2016 4:35 UTC (Thu)
by xtifr (guest, #143)
[Link] (12 responses)
Still, that's the only recent example I can think of where the kernel folks did something with as much potential for catastrophic breakage as this has. If you have some other examples, which affected programs as widely used as screen, tmux, emacs-daemon, and nohup, I'd be curious to hear them.
Posted Jun 9, 2016 7:21 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (10 responses)
If you want a comparable change in Linux, then remember the /dev/hda -> /dev/sda switch. That broke a _lot_ of stuff that was doing stupid things like detecting hard disks by checking for /dev/hd? devices.
Posted Jun 9, 2016 10:59 UTC (Thu)
by xtifr (guest, #143)
[Link]
On the other hand, neither this nor the fsync thing actually bit me, while the /dev changes did, so, good example. :)
Posted Jun 9, 2016 13:24 UTC (Thu)
by johannbg (guest, #65743)
[Link] (8 responses)
Which issues in systemd do you feel should have higher development priority?
Posted Jun 9, 2016 17:19 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (7 responses)
Posted Jun 9, 2016 19:17 UTC (Thu)
by viro (subscriber, #7872)
[Link] (6 responses)
If distros decide, en masse, that reverting the change of default is less headache than doing urgent fixups to screen/tmux/whatnot, then the whole thing had been handled wrong. By definition. And responsibility for the choice of tactics that happened to backfire is upon those who chose it. Especially since the headache for distros had been easy to anticipate, along with the likely areas where that headache would come from. FWIW, a simple search shows reports of screen(1) breakage *5* *years* *ago* with that very thing turned on on the reporter's box. In Fedora. With Johann directly involved in handling of that report, amusingly enough, so if systemd developers were unaware of the likely sources of trouble, a part of the blame was actually his...
Posted Jun 9, 2016 19:26 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
If they fail to take off after many years of trying then the idea was obviously bad (see: Python 3) and probably should be adjusted and/or abandoned.
Posted Jun 10, 2016 13:12 UTC (Fri)
by vonbrand (subscriber, #4458)
[Link] (2 responses)
Python 3 is slowly making inroads, held back by the still not complete fixup of key libraries. Usage stands at around 50% Python 3 and 70% Python 2 (the 20% overlap is code used with both).
Posted Jun 10, 2016 18:25 UTC (Fri)
by drag (guest, #31333)
[Link] (1 responses)
There are two ways of screwing something up, or 'making mistakes'
1. No-fault. Based on information available at the time it seemed like it was a good idea. Unfortunately it turned out to have jacked everything up.
2. Fault. Based on information available at the time I knew it was a bad decision, but I thought I could get away with it. Too bad I got caught jacking everything up.
The first one is fine. It can't be avoided. It's part of how technology progresses and dealing with mistakes is just something we have to do. The second one is where you deserve to be removed from a position of trust.
Even if you have people disagreeing with you about choices you make it doesn't mean you fall into category 2, even if they are ultimately right. They just now have proof of their correct decisions. Doesn't mean they will be correct next time, though.
A lot of people see mistakes type 1 and then assign malicious intent in their minds to transform them to type 2, then go cry on the internet. A lot of people see people make mistakes type 1 and then try to erase the mistake because of a confusion that only type 2 mistakes exist, or they fail to realize the distinction.
Personally I like python3. I don't know what they could have done to improve the transition. The alternative seems to be what perl has done.
Posted Jun 13, 2016 15:32 UTC (Mon)
by niner (subscriber, #26151)
[Link]
Posted Jun 9, 2016 22:20 UTC (Thu)
by johannbg (guest, #65743)
[Link] (1 responses)
Posted Jun 10, 2016 0:23 UTC (Fri)
by viro (subscriber, #7872)
[Link]
a) left the project (fedora, I take it?) != lost all ability to contact systemd developers. Their list isn't closed, AFAIK.
b) whether you've failed to inform them about screen(1) breaking in such situation back then or not, they certainly could've looked themselves. Searching for bug reports mentioning that setting is not a rocket science.
c) I've no comment on the state of Fedora (before or after systemd transition - it's not something I use other than for testing and that only when I can't reproduce a bug on something saner; I sure as hell do not watch the politics in it), and your choice of standards is up to you, but that amount of drama has got to be counterproductive whatever those standards might be. I really, honestly have no fucking idea how you've parted ways with that project; judging by your postings years later it had to have been messy as hell and at a guess hadn't been any calmer than said postings (BTW, as an aside - what the hell _is_ in phoenix? You keep referring to it, and by the context it sounds like a center of some Evil Corporate Cabal(tm), presumably an RH-related one. I'm fairly sure that RH headquarters are still in Raleigh and AFAIK there's no office in Phoenix - AZ or otherwise...)
Posted Jun 18, 2016 16:29 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
Not recent, but it was a deliberate decision by Linus and it caused chaos ...
Who remembers him ripping the swap optimisation code out of kernel 2.4? Leading to a spate of linux systems crashing as sysadmins discovered "swap should be twice ram" was NOT an old wives' tale ...
Cheers,
Posted Jun 9, 2016 8:49 UTC (Thu)
by pflykt (subscriber, #2757)
[Link]
...and which is the correct action for a responsible distribution to take, considering what its users think is the intended behavior, if I may add. This, and figuring out how to address the initial problem, is where the distributor really adds value.
Posted Jun 13, 2016 8:23 UTC (Mon)
by Jluis (guest, #28564)
[Link]
Posted Jun 8, 2016 15:24 UTC (Wed)
by johannbg (guest, #65743)
[Link]
You do realize that there exist no such things as "consensus in the Linux community".
If someone as much as farts in the wrong direction he can create 10 forks, 5 standards, 3 foundations and one distribution as a result of that because someone in some community did not like how that fart smelled.
Posted Jun 8, 2016 15:48 UTC (Wed)
by anselm (subscriber, #2796)
[Link] (1 responses)
“Consensus.” You keep using that word. I don't think it means what you think it means.
See RFC 7282:
Posted Jun 9, 2016 21:21 UTC (Thu)
by mstone_ (subscriber, #66309)
[Link]
Posted Jun 8, 2016 16:06 UTC (Wed)
by lsl (subscriber, #86508)
[Link]
The feature itself is pretty useful and I assume no one has any issues with it. It's the flipping of the default setting that's the disruptive (and therefore controversial) change.
Posted Jun 7, 2016 23:54 UTC (Tue)
by darwish (guest, #102479)
[Link] (47 responses)
Now people will complain how this will break tmux, screen, nohup, etc. But really, if systemd did not take this bold first step, no one would. I've always admired systemd's boldness: it's that boldness that forced a well-engineered layer above the kernel, no matter how much the UNIX administrators and server folks cried to keep everything set in stone. This boldness will make traditional Linux builds still maintain their competitive edge wrt other operating systems over the long term.
Posted Jun 8, 2016 0:19 UTC (Wed)
by pikhq (subscriber, #98351)
[Link] (1 responses)
Posted Jun 11, 2016 19:01 UTC (Sat)
by micka (subscriber, #38720)
[Link]
Posted Jun 8, 2016 0:49 UTC (Wed)
by khim (subscriber, #9252)
[Link] (1 responses)
It's Ok to introduce “bold”, backward-incompatible changes when you are doing experimental work. Once your creation is in use by millions “bold steps” are no longer allowed. Think Windows Phone (Windows Mobile had 12% market share, Windows Phone 7 broke everything and as a result Windows Phone will, most likely, die) and compare it to Windows itself (at introduction it also was pretty much incompatible with previous setups but that just meant that people ignored it… only when Windows 3.0 made it possible to use MS DOS programs did it take off). Note that you could eventually remove stuff (Windows x64 no longer supports MS DOS programs) but there must be a transition period. The sane decision here would be a dialog which asks the user about it (similarly to how the user is asked if they want to keep their “Documents” directory name after locale change), then add code to screen/tmux/etc (if people just flat out refuse to cooperate then Ok, you could just include a link to the appropriate bug in “release notes”), etc. IOW: this may be a good default, but it still breaks user's expectations without warnings. This is really bad: the lack of warnings, that is. The change itself may be good, but “bold” moves like this are how you create not Linux, but Plan/9: a good (as in: really good, no quotes) OS which nobody uses (not even its creators).
Posted Jun 20, 2016 19:04 UTC (Mon)
by ThinkRob (guest, #64513)
[Link]
But that's the strength of the distro model, isn't it? Upstream projects can experiment, do cool stuff, etc., and the distros make sure that the parts they ship are configured to suit whatever the goals of the distro are (ease of use, niche-specific stuff, whatever).
And that's exactly what's happening here. systemd switched the default in their upstream repo, and it's up to the distros to determine whether to follow that change immediately, give people some lead time, or say "fuck it" and choose to ignore the change forever. All three are valid options depending on the distro's goals.
Posted Jun 8, 2016 1:05 UTC (Wed)
by viro (subscriber, #7872)
[Link]
Bravo. One rarely sees such a superb example of sarcasm these days. Mind if I steal that for the next time I need to suggest (politely) that such-and-such proposal is crap, as is obvious to anyone with even a modicum of experience?
Posted Jun 8, 2016 6:38 UTC (Wed)
by ras (subscriber, #33059)
[Link] (15 responses)
Dragging us back? That would be reasonable if the majority of Linux deployments were now multi-user machines whose users were miscreants prone to starting rogue processes with nohup, screen or whatever. In that case sysadmins and developers insisting the defaults suit them would indeed be holding things back.
But that's not the case, is it? The machines running systemd are dominated by embedded computers, servers and developer laptops. On these machines, if the user knows enough to use nohup, screen or tmux, it's almost certain any process they want left running after logout should remain running. So surely that should be the default. In the very rare case that's wrong (I'm struggling to think of one - public use machines in a Uni - but do they still exist?) the sysadmin in charge of those boxes can set the KillUserProcesses option.
That aside, it looks to me like this change wasn't introduced for the philosophical reasons you mention. It is a kludge to work around a Gnome design bug: https://github.com/systemd/systemd/issues/2900 In fact kludge is far too kind a word. It's a horrible hack.
Posted Jun 8, 2016 8:13 UTC (Wed)
by ovitters (guest, #27950)
[Link] (14 responses)
Please clarify.
Posted Jun 8, 2016 11:12 UTC (Wed)
by ras (subscriber, #33059)
[Link] (13 responses)
I have trouble distinguishing between GNOME and systemd. To the extent they are separate projects, you may be right.
Roughly what happened was:
1. In the beginning there was login. Every process started after login was a child of it, the kernel used a very simple process to track those children, and so it was easy to clean up on logout.
2. Then X and xdm replaced login, but every process was a child of xdm, and cleaning up on logout remained simple.
3. Then there was GNOME, and gdm, and later gdm spawned CORBA. Things were rapidly getting more complex, but nonetheless everything was a child process of gdm and so cleaning up on logout was still simple.
4. GNOME moves to dbus.
5. systemd takes over dbus.
6. systemd takes over session management - primarily via logind.
7. GNOME immediately adopts logind, causing much angst on Debian because it meant the default desktop required you to use systemd.
8. GNOME starts using dbus to lazily start services.
9. systemd starts dbus under a separate process tree (the one under systemd --user, as opposed to the one started by gdm).
10. GNOME notices if the user logs in twice, they start services such as the evolution-address-book twice. Seems inefficient. They share services between two login sessions. For some services.
11. Consequently keeping track of what session owns what process becomes hard. Some things aren't killed properly when the sessions logout. Since logind is tracking the sessions, seems like a good idea to make it the systemd mob's problem. KillUserProcesses is implemented, and GNOME's problem is solved.
12. But no one is turning KillUserProcesses on, so GNOME sessions are still leaving services running. So systemd-230 changes it to be on by default.
And so here we are. If you tell me it's really systemd that's at fault then so be it - for me it's like picking between two peas in a pod.
Posted Jun 9, 2016 2:10 UTC (Thu)
by BradReed (subscriber, #5917)
[Link]
Posted Jun 9, 2016 23:47 UTC (Thu)
by xtifr (guest, #143)
[Link] (4 responses)
Interesting. That definitely makes sense.
However, that suggests another possible approach to this problem. Instead of having evolution-address-book ignore sighup (which is what I assume it's doing now, so that it can survive one session's ending), have it respond to sighup by checking how many sessions it's attached to! If the number is one (or less), it can shut down gracefully, but if the number is higher, then it knows it should ignore the signal!
Of course, this would mean that the programs would have to know how many sessions they were attached to, but isn't that exactly the sort of thing dbus was created to handle?
This would place the problem where it belongs (on Gnome processes which want this session-sharing feature), rather than on unrelated software (screen/tmux, etc., anything launched via nohup, and an unknown number of "little" home-rolled background persistent thingies).
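The proposal above can be sketched in a few lines. This is a hypothetical toy model only: the class name SharedService and its attach/on_hangup methods are invented for this illustration, and a real GNOME service would have to learn about sessions from something like dbus or logind rather than from direct method calls.

```python
class SharedService:
    """Toy model of a service shared between several login sessions."""

    def __init__(self):
        self.sessions = set()   # identifiers of sessions currently attached
        self.running = True

    def attach(self, session_id):
        """Record that a session has started using this service."""
        self.sessions.add(session_id)

    def on_hangup(self, session_id):
        """React to a hangup caused by one session going away.

        Returns True if the service decided to shut down.
        """
        if len(self.sessions) <= 1:
            # one session (or fewer) attached: nobody else needs us
            self.sessions.clear()
            self.running = False
        else:
            # other sessions remain: ignore the hangup, forget this session
            self.sessions.discard(session_id)
        return not self.running


if __name__ == "__main__":
    svc = SharedService()
    svc.attach("session-1")
    svc.attach("session-2")
    print(svc.on_hangup("session-1"))  # False: session-2 is still attached
    print(svc.on_hangup("session-2"))  # True: that was the last session
```

The hard part in practice is the bookkeeping, that is, finding out reliably which sessions are attached; the sketch assumes that information is already available.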
Posted Jun 10, 2016 0:50 UTC (Fri)
by johannbg (guest, #65743)
[Link] (2 responses)
What an excellent idea, we can go together: I for my chronic depression and you for your alter ego, which gets you off by judging, putting down or belittling other people in comment sections and on mailing lists. ;)
Posted Jun 10, 2016 9:31 UTC (Fri)
by xtifr (guest, #143)
[Link] (1 responses)
Oh, you replied to the wrong post. Never mind. It happens.
(For a second I thought you were suggesting that my proposal was extremely crazy. Which it might be, but a quick skim showed me the post you were trying to respond to.)
Posted Jun 10, 2016 11:08 UTC (Fri)
by johannbg (guest, #65743)
[Link]
I would have spotted it if the site's comment editor showed what's being responded to above, not below, itself (it arguably should do so at least above the comment preview, so the preview is in full context with what's being responded to).
Sorry for the mistake.
Posted Jun 11, 2016 15:13 UTC (Sat)
by krake (guest, #55996)
[Link]
My guess is that the problem is that the program is only attached to one session, or more specifically one bus.
But again, this is just a guess.
Posted Jun 10, 2016 11:27 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
This looks like a plausible theory but it's most definitely a wrong one. I've seen lingering GNOME processes left behind years before systemd was even imagined. I'm not really sure what spawned them or when (gvfs with its use of FUSE was the main culprit, but I think there were others, too), but that happened way earlier than systemd's introduction. IOW: systemd solves a real and sizable problem which predates it, although it's not clear if that's the best solution available (well, it's certainly the best available solution in the sense that it's the only one that works, but that does not mean it couldn't be better).
Posted Jun 11, 2016 5:46 UTC (Sat)
by ras (subscriber, #33059)
[Link]
So have I. Yes, it's definitely the same symptom. But the same symptom does not mean the underlying cause is the same - particularly in software.
Posted Jun 12, 2016 15:37 UTC (Sun)
by davidstrauss (guest, #85867)
[Link] (1 responses)
systemd *uses* DBus and provides its own DBus client library. It's basically what cURL's relationship to Apache is.
Just in case you're thinking about kdbus, it's not part of systemd, hasn't happened yet, and may or may not happen in the future.
Posted Jun 12, 2016 19:43 UTC (Sun)
by johannbg (guest, #65743)
[Link]
Posted Jun 13, 2016 11:34 UTC (Mon)
by gb (subscriber, #58328)
[Link] (1 responses)
System-wide change is clearly not necessary here!
Posted Jun 13, 2016 13:33 UTC (Mon)
by micka (subscriber, #38720)
[Link]
Posted Jun 15, 2016 2:57 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
The flag was implemented way long ago (before 2011). This is about changing the default.
Posted Jun 8, 2016 8:48 UTC (Wed)
by diegor (subscriber, #1967)
[Link] (7 responses)
And there is already a "clean the process after logout" mechanism; it is called SIGHUP. And it works. When I work on a remote server with ssh, I never see a process survive the logout unless it is meant to.
Now, the desktop is a different beast. Most of the problems come from programs launching new processes with nohup. That happens because they don't want the process to be killed when the parent process is terminated. For example, when you open an attachment from your email client, the viewer is a child of the email client. So when you close the client, unix, as expected, cleans up all its child processes. Because that is cleaner, right? Do you see the problem now? Cleaning up processes is a desirable thing in theory, but users complain, so everyone starts to mark the child process as one not to be cleaned up.
A better solution would be to have a "launcher" process for the GUI. Instead of forking a new process, your email client can talk to the launcher and let it fork the new process. Then, when you log out, the "launcher" can clean up its processes, and kill them with fire if that is really needed (without breaking everything else).
Posted Jun 8, 2016 9:21 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (4 responses)
Also, what is the difference whether systemd kills your process or the launcher process of the GUI does? There has to be an opt-out for screen, tmux and nohup in both cases.
Posted Jun 8, 2016 11:56 UTC (Wed)
by ras (subscriber, #33059)
[Link] (3 responses)
I remember I have had a process going infinite on logout, but can't actually recall when - it was a long time ago. Infinite loops after getting a sighup must be easy to track down.
What I can recall happening recently is a kernel driver misbehaving. I go through the rmmod, modprobe dance. But instead of fixing it, the processes using the driver hang on some unbreakable kernel lock, and the eventual kill -9 is a complete waste of time. The only way out is a reboot of a production machine. I gather race conditions during module removal are unavoidable, and regardless there always seem to be more of them. If the systemd or gnome guys have a fix for this, no matter how bad, I promise to call it a thing of beauty.
But using this sledge hammer to cure a simple infinite loop bug, and break backward compatibility - sorry, no.
Posted Jun 8, 2016 13:51 UTC (Wed)
by bronson (subscriber, #4806)
[Link] (1 responses)
Posted Jun 8, 2016 22:03 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Jun 8, 2016 14:51 UTC (Wed)
by diegor (subscriber, #1967)
[Link]
Usually a process for which the kernel is serving a system call cannot be interrupted until the kernel has finished.
Posted Jun 8, 2016 11:23 UTC (Wed)
by niner (subscriber, #26151)
[Link] (1 responses)
Posted Jun 10, 2016 22:23 UTC (Fri)
by xtifr (guest, #143)
[Link]
So you have a bunch of programs with bugs (don't respond properly to SIGHUP), and this will merely hide those bugs!
Wouldn't it be better in the long run to have those bugs fixed? And wouldn't it be easier to fix them if the bugs weren't hidden?
In the case of the various GNOME-specific processes which are supposed to be shared between sessions and thus shouldn't necessarily just die whenever one session shuts down (so they can't just die on SIGHUP), they should, instead, keep track of how many sessions they've been attached to (probably via dbus or something), and kill themselves when the last session goes away.
And this kill-everything option could remain an option for people like you who have buggy processes which don't shut themselves down when they should (either when receiving SIGHUP or when their last session goes away). I'm fine with that. But making it a default for everyone, and forcing every program that might be intended to survive beyond logout to be modified just for systemd-based systems seems like the wrong choice.
Posted Jun 8, 2016 21:55 UTC (Wed)
by JoeF (guest, #4486)
[Link] (17 responses)
Posted Jun 8, 2016 23:37 UTC (Wed)
by nybble41 (subscriber, #55106)
[Link] (15 responses)
What makes you think they aren't? If you start a process with nohup then it won't be sent a SIGHUP when the shell's controlling terminal is closed, just like it says in the manual. Nothing about that has changed. It was always a rather imprecise way to manage process lifetimes, though.
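To make the point concrete, here is a small sketch of what nohup-style protection does and does not cover. It is only an illustration under some assumptions: a Linux-like system with a `sleep` binary on the PATH, and helper names (ignore_sighup, demo) made up for this example. The child ignores SIGHUP the way a nohup'ed program does, survives a hangup, and is then ended by SIGTERM.

```python
import signal
import subprocess
import time


def ignore_sighup():
    # Runs in the child between fork() and exec(). An ignored signal stays
    # ignored across exec(), which is the mechanism nohup relies on.
    signal.signal(signal.SIGHUP, signal.SIG_IGN)
    # Make sure SIGTERM has its default action, whatever we inherited.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)


def demo():
    """Return (survived_sighup, exit_status_after_sigterm)."""
    child = subprocess.Popen(["sleep", "30"], preexec_fn=ignore_sighup)
    try:
        child.send_signal(signal.SIGHUP)
        time.sleep(0.2)
        survived = child.poll() is None      # still running after the hangup?
        child.send_signal(signal.SIGTERM)    # not covered by ignoring SIGHUP
        status = child.wait(timeout=5)       # negative value: killed by signal
        return survived, status
    finally:
        if child.poll() is None:             # clean up if anything went wrong
            child.kill()
            child.wait()


if __name__ == "__main__":
    print(demo())
```

A negative exit status from `wait()` means the child was terminated by the signal with that number, so the second value shows which signal actually ended the process.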
Killing a user's processes on logout is hardly a new idea. This was the policy on the shared Linux servers at my university over a decade ago, for example. The ability to do this reliably with logind rather than ad-hoc scripts would be a welcome change.
Should this be the default? I'd say that's for distributions to decide. But it's a nice feature to have available, and IMHO any programs (like screen and tmux) which are meant to provide long-running services independent of the session they were started from should register themselves as separate user sessions, regardless of whether killing processes on logout becomes the default.
Posted Jun 9, 2016 18:12 UTC (Thu)
by JoeF (guest, #4486)
[Link] (14 responses)
Posted Jun 10, 2016 15:55 UTC (Fri)
by nybble41 (subscriber, #55106)
[Link] (13 responses)
Systemd sends both SIGHUP and SIGTERM, followed some time later by SIGKILL. Your objection is mainly to the SIGKILL, but this is the same process that is used to terminate programs on system shutdown, and the correct way to end a user's session after the user logs out.
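The ordering described above (SIGHUP and SIGTERM, a grace period, then SIGKILL) can be sketched for a single child process. This is only an illustration of the escalation pattern, not how logind is implemented: logind acts on everything in the session's control group, while the hypothetical helper below (stop_politely, a name invented here) handles one process and assumes a `sleep` binary is available.

```python
import signal
import subprocess


def stop_politely(child, grace_seconds=2.0):
    """Send SIGHUP and SIGTERM, wait a while, then fall back to SIGKILL."""
    child.send_signal(signal.SIGHUP)
    child.send_signal(signal.SIGTERM)
    try:
        return child.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        child.kill()              # SIGKILL cannot be caught or ignored
        return child.wait()


def ignore_hup_and_term():
    # Runs in the child before exec(): simulate a program that refuses
    # to go away when asked nicely.
    signal.signal(signal.SIGHUP, signal.SIG_IGN)
    signal.signal(signal.SIGTERM, signal.SIG_IGN)


if __name__ == "__main__":
    stubborn = subprocess.Popen(["sleep", "30"], preexec_fn=ignore_hup_and_term)
    print(stop_politely(stubborn, grace_seconds=0.5))
```

A process that ignores both polite signals still ends once the grace period runs out, which is why blocking SIGHUP alone no longer helps when this kind of cleanup is in effect.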
This setting is primarily about how to detect when the user's session has ended. That matters because various per-user (not per-session) background processes, most notably dbus-daemon running in user-session mode, are meant to terminate when the user's last session ends. But is the end of the session when *all* the processes started in that session exit, or when the *main* process exits? In the former case (the previous default) you end up with sessions that never terminate because they contain processes which are waiting for those per-user background processes to exit. In the latter case a handful of programs need to be more specific about the fact that they are really per-user services and not part of a particular login session—which is the right thing to do in any case.
Note that assembling a screen.service file that runs "screen -dm" as a background service outside of the session is really rather trivial, and requires no modification to the screen binary. It can be started automatically, or manually with "systemctl --user start screen@session.service".
bash$ cat ~/.config/systemd/user/screen@.service
[Service]
[Install]
bash$ systemctl --user enable --now screen@.service
The screen@.service file can also be installed system-wide as part of a system package rather than in one user's ~/.config directory. Note that you still need to run "loginctl enable-linger" in order to have the screen service (or rather, the parent user@.service) survive without any active login sessions, and you need to attach to the existing screen session rather than letting screen start a new one.
Posted Jun 12, 2016 5:59 UTC (Sun)
by elvis_ (guest, #63935)
[Link] (12 responses)
Not everyone has the time to relearn things they shouldn't have to.
Posted Jun 12, 2016 10:02 UTC (Sun)
by nybble41 (subscriber, #55106)
[Link] (11 responses)
From the point of view of either upstream or a distribution, it _is_ trivial. The template unit file can be provided as part of the screen package, and with a minor tweak to screen to do "systemctl --user start screen@$id.service" instead of running new servers directly, end-users wouldn't even need to be aware of the difference.
Even without support from the distribution or upstream there's not much to learn: run a few one-time commands to set things up—you don't need to understand unit file syntax, just copy the provided template into the configuration directory—and remember the one systemctl command needed to start a new session. Or take the quick route and (after running "loginctl enable-linger" once) alias screen='systemd-run --user --scope /usr/bin/screen'. If you can't handle that much you should probably avoid mainstream operating systems in general, as they all change at least this much on a regular basis. You might be more comfortable with one of the BSD variants, though even they aren't _completely_ static.
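For anyone who would rather script this than use a shell alias, the same wrapping can be sketched in Python. The function name scoped_command and the "-S build" arguments are made up for illustration; the block only builds and prints the command line, since actually running it needs a systemd user instance (and "loginctl enable-linger", as noted above).

```python
import shlex


def scoped_command(argv):
    """Prefix a command so that it would run in its own systemd user scope."""
    return ["systemd-run", "--user", "--scope"] + list(argv)


if __name__ == "__main__":
    # "-S build" (a named screen session) is just an example argument.
    cmd = scoped_command(["/usr/bin/screen", "-S", "build"])
    print(shlex.join(cmd))
```

Passing the resulting list to something like subprocess.run() would be the obvious next step on a machine where systemd-run is available.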
Posted Jun 12, 2016 23:27 UTC (Sun)
by elvis_ (guest, #63935)
[Link] (10 responses)
Posted Jun 13, 2016 6:18 UTC (Mon)
by anselm (subscriber, #2796)
[Link] (9 responses)
No, he suggested that you should use a distribution that provides the required hand-holding if you can't be bothered to learn some pretty cool new stuff like systemd (which would incidentally come in useful in a few other places, too). Not quite the same thing.
Seriously, if you're that worked up about this, we're talking about one switch that you need to flip to make the issue at hand go away, and that's only if your distribution doesn't do it for you already. We can quibble endlessly about whether changing that default was a great idea, and there are reasonable arguments on both sides – but as far as I'm concerned, “That's not how we used to do it in 1980” is one of the less reasonable arguments.
Posted Jun 13, 2016 7:14 UTC (Mon)
by jrigg (guest, #30848)
[Link] (8 responses)
There's a big difference between "can't be bothered" and "don't have time".
Posted Jun 13, 2016 7:36 UTC (Mon)
by anselm (subscriber, #2796)
[Link] (7 responses)
That's a lame excuse if ever I heard one.
Learning the basics of systemd takes one or two hours, tops. It's not exactly rocket science. That would cover the various types of unit files, what they contain and where they're located, how service activation works, the systemctl command and its more important subcommands, and an overview of ancillary software such as journalctl or systemd-logind. It should certainly give one enough knowledge to be dangerous and to build upon incrementally as required. There's what I would consider a reasonable primer on systemd in this manual from the tuxcademy project (although I'm biased because I wrote it myself), and Lennart Poettering's blog and the documentation on freedesktop.org are also worth a peek.
Given the importance of systemd in current and future Linux systems, one would be more than justified in considering these two hours a reasonable investment (for some people it would also be worth it just to learn enough about systemd to not appear ignorant in discussions on LWN.net). Think of it as alternative entertainment on an evening when there's nothing interesting on TV.
Posted Jun 13, 2016 10:50 UTC (Mon)
by johannbg (guest, #65743)
[Link] (2 responses)
People who approach and view systemd as a new technology with new concepts adapt to it quicker than those with a background in a legacy init system. The latter, more often than not, approach systemd as a legacy init system and apply legacy concepts that are not applicable to it (and expect the same or similar outcome or behavior); for example, the concept of "run levels" does not exist in systemd, but the concept of boot targets does.
Posted Jun 13, 2016 12:22 UTC (Mon)
by anselm (subscriber, #2796)
[Link] (1 responses)
In my experience as a Linux instructor, one or two hours of systemd instruction is adequate to provide the basics for people who would otherwise be using System-V init as system administrators. Building on that, it is certainly more feasible to spend another couple of hours teaching somebody how to write a systemd service unit file for a new service and to integrate that into an existing setup, than it is to spend a couple of days teaching them enough shell scripting and distribution-specific minutiae to be able to write a robust System-V init script for a new service and to integrate that into an existing setup, on one single distribution. (The next distribution is going to be subtly, or not so subtly, different.)
For an upstream project, it is reasonable to invest the time to produce a good systemd-based configuration, which by now is likely to be applicable with few if any changes to a large number of platforms, because the effort for that is going to be smaller, in the long run, than the effort required to test and tweak new versions of their application (or application stack) on a huge number of subtly different legacy environments that all require some degree of individual adaptation.
Posted Jun 13, 2016 16:01 UTC (Mon)
by johannbg (guest, #65743)
[Link]
Daemon vs. socket activation, the paths used in units, and simply the name of the component (apache vs. httpd, for example) still differ between distributions. That problem will never be solved unless unification in the core/base OS can be achieved, so those upstreams that actually care and ship initscripts of any kind are still dealing with that issue.
Posted Jun 13, 2016 12:38 UTC (Mon)
by jrigg (guest, #30848)
[Link] (2 responses)
It's easy to say that's purely a problem for the distros, but system upgrades are painful enough without having to learn several different versions of things in preparation for the next one.
Posted Jun 13, 2016 20:58 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Posted Jun 14, 2016 13:05 UTC (Tue)
by jrigg (guest, #30848)
[Link]
According to this, https://www.freedesktop.org/wiki/Software/systemd/MyServi... the way to allow realtime scheduling for users for a specific service is to add ControlGroup=cpu:/ to its [Service] section. The ControlGroup= option was removed in systemd 205 (July 2013) but the document hasn't been changed. That's an example of what I was referring to.
To be fair it's only one specific example, but it did contribute to my decision to stick with sysvinit-core on my Debian systems for the time being.
Posted Jun 13, 2016 13:23 UTC (Mon)
by paulj (subscriber, #341)
[Link]
That's the basics of good system maintenance.
Posted Jun 15, 2016 3:54 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
Most of my "nohup" processes are things like image and PDF viewers for opening attachments from mutt so that the viewers don't close when I close the tmux window. I don't think `nohup` means "should persist the session" at all; something stronger needs to be done (I have it on my TODO list to do some experimentation with a "new-session" wrapper application to start a PAM session).
Posted Jun 7, 2016 23:58 UTC (Tue)
by JoeBuck (subscriber, #2330)
[Link] (3 responses)
At minimum, GNU screen has to be able to persist after logout; as a software developer I rely on that feature to deal with loss of network connectivity. Likewise, background compute jobs have to work.
I don't think it is "insane" to consider this change, or even to try to push it to create pressure for other packages to live in a world where there are restrictions on background processes after logout. But this is why it is good that the distros sit between the upstream and the end users; the change can only be delivered after the use cases are worked out in detail.
Posted Jun 9, 2016 22:44 UTC (Thu)
by jeff@uclinux.org (guest, #8024)
[Link] (2 responses)
No, this breaks the expectations of the POSIX C runtime environment. If I write code to run on a supposedly compliant system by managing signals myself, I absolutely require that to work.
Someone with quite a bit of experience is noted for saying something like "UNIX doesn't have all the good ideas, just most of them". I rely on correct behaviour of signals, they were a good idea, and linking with some dumbass library to get behaviour I had before does not qualify as rational, let alone a good idea.
Fail.
Posted Jun 9, 2016 23:09 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Jun 16, 2016 16:12 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
Well, as someone with quite a bit of non-Unix experience (and no, that's not just Microsoft), ime Unix/linux has quite a lot of stupid ideas too. And given Linus' attitude of "We're posix compliant if posix makes sense", he probably thinks the same.
Simple example - "cp a b". What's that going to do? Oh, and I don't want a long winded answer with a load of "if"s in it :-)
Whereas on Pr1mos, "copy a b" gives me an exact copy of a (security permissions permitting) called b.
And ime, most of these comments seem to be bikeshedding between two camps - the "sooner the better" camp, and the "the right time is never" camp. If it's going to happen, then now is as good a time as any. How long has this change been in the works? Since the dawn of systemd? And if all these programs - screen, tmux, nohup, haven't done anything about it yet, then they're not going to unless something gives them a kick up the bum.
Cheers,
Posted Jun 8, 2016 0:10 UTC (Wed)
by Nahor (subscriber, #51583)
[Link] (24 responses)
Posted Jun 8, 2016 0:49 UTC (Wed)
by TMM (subscriber, #79398)
[Link] (15 responses)
It's possible, but hard. (The screen thing is just an example, actually making it hard requires more than screen, but this systemd change takes care of this entire class of problem)
Posted Jun 8, 2016 1:06 UTC (Wed)
by smoogen (subscriber, #97)
[Link] (1 responses)
I understand the security item that Lennart sees, but I think that this is a bandaid where the 'fix' he wants will require him to write his own distribution from the 'ground-up' and find the users and use cases to use it. He gets angry about the amount of band-aids he is already carrying around, but this is in many ways the fact that the users already have too many of the old around and can not just fork lift fix their infrastructure at his urging.
Posted Jun 16, 2016 16:18 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
The problem is that your "band-aid" is Lennart's security hole. All these band aids are unnecessary code, that is more likely than average to harbour bugs (and hide bugs in the other program too), and are dangerous things to leave lying around. And this example is classic - leaving processes lying around because the system can't/won't get rid of them by default is exactly that! If they're buggy enough not to shut down, how many other bugs do they harbour?
Cheers,
Posted Jun 8, 2016 1:42 UTC (Wed)
by dskoll (subscriber, #1630)
[Link] (3 responses)
The security issue is a red herring. If you want to make sure that all of a user's processes have been terminated, you use pkill -U and pkill -u. Distros could even make their user-deletion tools ask if you want to do this and then do it for you if you say yes.
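As a rough illustration of what pkill matches on, the sketch below lists the processes owned by a given UID by reading /proc. It assumes a Linux system with /proc mounted; the function name pids_by_uid is made up for this example, and the code only lists processes, it does not kill anything. pkill -U matches the real UID and pkill -u the effective UID, which correspond to the first two numbers on the "Uid:" line of /proc/<pid>/status.

```python
import os


def pids_by_uid(uid, effective=False):
    """Return the PIDs whose real (or effective) UID equals uid."""
    field = 1 if effective else 0    # "Uid:" lists real, effective, saved, fs
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue                 # not a process directory
        try:
            with open(f"/proc/{entry}/status") as status:
                for line in status:
                    if line.startswith("Uid:"):
                        uids = line.split()[1:]
                        if int(uids[field]) == uid:
                            found.append(int(entry))
                        break
        except OSError:
            continue                 # the process exited while we were looking
    return sorted(found)


if __name__ == "__main__":
    print(pids_by_uid(os.getuid()))
```

Any scan like this is a snapshot: a process can fork between the scan and whatever is done with the result, so the list is a starting point rather than a guarantee.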
Posted Jun 8, 2016 2:23 UTC (Wed)
by hmh (subscriber, #3838)
[Link] (2 responses)
I seem to recall a full fix for the "desktop" *security* case requires something like a revoke() syscall, and a proper SAK implementation. I hope I am wrong, because those can be quite hard to implement.
I mean, what is the point of killing every job started in a user session in the name of security, when all the user has to do is to simulate a login screen without ending his session at all in the first place?
There are other ways to ensure no desktop environment components are left behind when the session is closed, especially because these are actually a bounded set *and* because there would be little reason for them to refuse the patches.
Posted Jun 8, 2016 8:27 UTC (Wed)
by NAR (subscriber, #1313)
[Link]
Well, if I want a user gone from the system, I'd want to do a decent job and purge it. Remove user id, cron jobs, all files, kill all processes, whatnot. I admit it might be complicated in a corporate multi-computer system when a user can have access to dozens of computers with some kind of single sign on - but systemd wouldn't help with the cronjobs either. Malicious users can also easily avoid the systemd reaping procedure. So I do think that the "security" issue is totally BS.
Posted Jun 8, 2016 12:03 UTC (Wed)
by dskoll (subscriber, #1630)
[Link]
Even that is too little.
Yes, I know, but it is the equivalent (actually, a superset of) what systemd does when it kills processes started in the login session.
Posted Jun 8, 2016 2:08 UTC (Wed)
by Nahor (subscriber, #51583)
[Link] (8 responses)
To me, it seems that if you can't trust your users while they are logged out, you shouldn't be trusting them while they are logged in.
Moreover, there is still the "systemd-run" function to do the same thing as nohup/screen/... so it's not that they are removing the feature anyway, it's just used differently.
Posted Jun 8, 2016 4:34 UTC (Wed)
by rahvin (guest, #16953)
[Link] (7 responses)
What if you had an orphaned ssl process when the heartbleed vulnerability was disclosed? Even if you patch the binary, if you don't shut down the zombies you have just exposed your key to the world. IMO this is a good change to enable default security, but I also totally agree that it should have been announced very publicly that they were going to turn it on with release X. By not doing this they sabotaged their own effort, because all the distributions will just disable it and, boom, it will never get implemented as the default. And programs like tmux and screen, whose primary purpose is to daemonize a process (and which have been around for more than 30 years), should not only have been warned; the systemd developers should also have found a good way to retain that functionality that wouldn't require a dependency (in an upstream that's agnostic to the OS) to fix, even if that is a default exception for those specific binaries.
tldr It's a good change but it was handled terribly.
Posted Jun 8, 2016 4:41 UTC (Wed)
by drag (guest, #31333)
[Link] (4 responses)
There is no threat from them. They are not running anymore. They may be using up some memory in a process table or something, but that is about it. All it really means is that if you end up with a bunch of zombie processes you have a bug in the kernel or init or something.
Posted Jun 8, 2016 8:57 UTC (Wed)
by sbakker (subscriber, #58443)
[Link] (3 responses)
I suspect rahvin meant "orphaned" rather than "zombie". Nevertheless, the term "zombie" is confusing, implying that the entity is somehow still active. It isn't. In fact:
A better name for zombie processes would be "corpse" or "carcass", but that just looks wrong in "top" listings. "Dead parrot" has a nice ring to it though.. :-)
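For anyone who wants to see the difference for themselves, here is a small sketch that creates a zombie on purpose and then reaps it. It assumes Linux with /proc mounted and a `true` binary on the PATH; the helper names (process_state, make_zombie) are invented for this example.

```python
import subprocess
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, e.g. 'S' or 'Z'."""
    with open(f"/proc/{pid}/stat") as stat:
        data = stat.read()
    # The command name sits in parentheses and may contain spaces,
    # so take the first field after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def make_zombie(timeout=5.0):
    """Start a child that exits at once, and leave its status unclaimed."""
    child = subprocess.Popen(["true"])
    deadline = time.monotonic() + timeout
    # Deliberately no wait()/poll() here: until the parent collects the
    # exit status, the kernel keeps only the process-table entry.
    while process_state(child.pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    return child


if __name__ == "__main__":
    child = make_zombie()
    print(process_state(child.pid))   # the child is dead but not yet reaped
    print(child.wait())               # reaping it removes the entry
```

The zombie runs no code and holds no memory of its own; all that is left is the exit status waiting for the parent to collect it. An orphaned process, by contrast, is still running and merely has a new parent.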
Posted Jun 8, 2016 15:44 UTC (Wed)
by drag (guest, #31333)
[Link] (2 responses)
An 'orphaned' process is essentially a daemon.
The problem we are running into now is that in Unix-ville every application was tied to a TTY. The 'getty' process was the 'session manager'. However, 'backgrounding processes' is complete shit for working in a modern environment. So using daemons for setting up user environments and such things is very common. So is using terminal multiplexers like tmux or screen.
These are really just bandages over a bigger issue. All these things are just workarounds for the 'tty' limitations. As a result it's a big mess. You can control when you start up all these processes, but they are no longer tied to anything. You have to have some process come back in when you log out to clean them up.
With systemd we now have the ability to have a true user session that is no longer tied to a particular tty. Systemd can 'daemonize' processes without actually going through the traditional daemonizing process. It makes sense that when the user session dies so do its programs. It's just the way things should have always worked.
Posted Jun 8, 2016 21:02 UTC (Wed)
by rahvin (guest, #16953)
[Link] (1 responses)
I completely agree this is a needed change. The problem is it breaks 30 years of experience, and those types of breaks need lots and lots of warning or people will put workarounds in place that negate the change. Every distribution disabled it, IMO that's a sign they handled the messaging wrong.
Posted Jun 8, 2016 23:35 UTC (Wed)
by johannbg (guest, #65743)
[Link]
If by "they" you mean downstream distribution package maintainers of systemd you might be right depending on which community you reside in.
Unlike in many upstream projects, there is also always a last call [2] open for a couple of days before each upcoming release (in case someone might have missed something or wants to discuss some change before release), and for some facts (since some people might find those relevant): the change was committed in early April, and it was Zbyszek who was pushing for this change, not Lennart (but apparently the pattern is to blame Lennart or Gnome for all the world's problems).
1. https://lwn.net/Articles/688640/
Posted Jun 8, 2016 12:03 UTC (Wed)
by itvirta (guest, #49997)
[Link]
That's a problem with all binary / library upgrades (libc and static binaries too). You need a way to find if a process is still using the old binary, and for long-running services
Posted Jun 8, 2016 16:07 UTC (Wed)
by Nahor (subscriber, #51583)
[Link]
Posted Jun 8, 2016 1:59 UTC (Wed)
by droundy (subscriber, #4559)
[Link]
Posted Jun 8, 2016 3:24 UTC (Wed)
by smurf (subscriber, #17840)
[Link] (6 responses)
Posted Jun 8, 2016 6:39 UTC (Wed)
by dd9jn (✭ supporter ✭, #4459)
[Link] (3 responses)
Posted Jun 8, 2016 8:31 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (2 responses)
Posted Jun 8, 2016 10:46 UTC (Wed)
by hmh (subscriber, #3838)
[Link] (1 responses)
I should add that pkill/killall can actually implement the "no user processes left" behavior, unlike the new systemd functionality, which is about "no processes started by this session are left when the session ends". Two very different things, but people seem to want to claim the new systemd behavior is actually useful for security, when it is *completely useless* for that, so it looks like we need to point out the utterly obvious...
The new behavior is a best-effort house cleanup thing, nothing more. And an unwelcome one *as implemented right now* at that, because it causes too much collateral damage for very little gain. The old behavior, where one would explicitly enable the functionality where useful, was a lot better.
Posted Jun 8, 2016 11:30 UTC (Wed)
by matthias (subscriber, #94967)
[Link]
Having really no processes survive when a session ends makes no sense at all. If I open two sessions and log out of one of them, the second session would be killed. If every session takes care to kill its own processes, no process should survive.
For myself, I like the new behaviour. Not because of security, but because I think that it is the job of session management to do some clean-up. Of course this means that screen/tmux/nohup should get changed to work again. Once these few programs are fixed, there should not be much collateral damage. Before that, I do not expect this change to hit stable distributions, anyway.
Posted Jun 8, 2016 16:13 UTC (Wed)
by Nahor (subscriber, #51583)
[Link] (1 responses)
Posted Jun 8, 2016 16:26 UTC (Wed)
by anselm (subscriber, #2796)
[Link]
Which is presumably why systemd very sensibly does not force that situation, but gives the administrator a variety of tools to specify whether users get to keep processes running after they log out or not, and if so which users.
Posted Jun 8, 2016 4:56 UTC (Wed)
by drag (guest, #31333)
[Link] (38 responses)
What it actually accomplishes is that it helps systemd be a better session manager for users. When users log out they don't usually want to have a bunch of processes lingering unless they explicitly expect it. Systemd brings the session into the world and it should be the one to take it out.
What I do NOT like is the fact that you have to link to a new dependency to have a program indicate to systemd that it needs to keep running. I especially don't like the idea of tying it into PAM or any such thing. And tmux/screen are not the only types of programs that I may want to linger around. I may want IRC bots, for example. Or have a program that collects and indexes my mail. Or some protein folding application, or whatever.
I should be able to tell systemd that I want a program to last past logout. I should be able to do this with an entry in a service file or a systemctl command.
This way I can take full advantage of systemd to handle 'daemonizing' and logging and all that happy stuff without any effort, which will make things much simpler for me. I can program in whatever language I feel without having to figure out how to link it to a new library. And I still will be able to retain the benefits of having systemd kill any lingering process automatically that I don't explicitly tell systemd to leave alone.
To conclude:
Posted Jun 8, 2016 7:34 UTC (Wed)
by ras (subscriber, #33059)
[Link] (37 responses)
'nix has always fulfilled this expectation, since at least V7. When the user logged out every process in the login session was sent a SIGHUP. If the process wants to hang around it had to intercept the signal, since the default was to kill it. (And yes, I know you know this.)
If the user wanted it to hang around he had to use nohup or an equivalent. As far as I can tell that won't change under this regime, albeit nohup will have to jump through different hoops.
Inexplicably, they look to be worse hoops. Before, a program could distinguish between the user logging out (SIGHUP) and the user asking the program to exit (SIGTERM). So for example if I deliberately left a process running after logging out by masking SIGHUP, I could ask it nicely to exit by sending it a SIGTERM later. Now that distinction is gone.
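The distinction described above can be sketched in a few lines of Python. This is only an illustration of the traditional convention, not code from any of the programs discussed; the `received` list and the `on_sigterm` handler are names invented for the example.

```python
import signal

received = []  # hypothetical log of the signals this process acted on


def on_sigterm(signum, frame):
    # A real program would start a clean shutdown here; the sketch only
    # records that an explicit request to exit arrived.
    received.append(signal.Signals(signum).name)


# Ignore hangups, so the process outlives its terminal or login session...
signal.signal(signal.SIGHUP, signal.SIG_IGN)
# ...but still honour an explicit, polite request to exit.
signal.signal(signal.SIGTERM, on_sigterm)

signal.raise_signal(signal.SIGHUP)   # "the user logged out": ignored
signal.raise_signal(signal.SIGTERM)  # "please exit": handled by on_sigterm
```

Under the scheme the article describes, logout is signalled with SIGTERM as well, so a handler like `on_sigterm` can no longer tell the two situations apart.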
There is a second change: the option to send a SIGKILL if the process doesn't respond to SIGHUP. I can think of situations that might be useful, although it is a stretch - it isn't in any systemd installation I'm familiar with.
A PAM session management plugin would be the cleanest way to implement it. For example, I don't recall ever wanting to leave something running when I exit a GUI session on my laptop, but if I ssh into a server and start a tmux session, it had better damned well still be running when I log out. PAM can already distinguish between these cases - no extra code required. Sending the SIGKILL to wayward processes after a grace period would only be a few lines of code.
And there is a third change: to flip the default from not sending the SIGKILL to sending a SIGKILL for everything. To re-iterate what I said above, that would be perfectly reasonable if most installations had policy-abusing users, and so sysadmins found themselves having to change the default on most machines they configured. But given no one has bothered to write the PAM plugin in the last decade I doubt rogue processes running after logout are a serious problem. On the other hand, I can tell you because the current implementation can't distinguish between session types I personally would have to turn it off on every install I do. And I don't know anybody that doesn't apply to.
Finally, I am sure someone will argue that SIGHUP clearly doesn't work because there are occasionally rogue processes left around on logout. But they are only hanging around because someone has screwed up the session tracking (in which case this new solution won't work either), or because they are deliberately ignoring SIGHUP for some reason. Presumably the reason will remain after this change, so they too will alter their programs to jump through the new hoops. And so, after the few it takes everyone to adapt, we will be back where we started.
Posted Jun 8, 2016 8:42 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (5 responses)
The session management has to know whether a process should survive. If a process shall not survive, it has to be SIGKILLed if the usual shutdown procedure fails for whatever reason. We cannot expect all programs to be bugfree.
Posted Jun 8, 2016 16:13 UTC (Wed)
by ballombe (subscriber, #9523)
[Link]
Posted Jun 10, 2016 16:17 UTC (Fri)
by azumanga (subscriber, #90158)
[Link] (3 responses)
Posted Jun 10, 2016 17:49 UTC (Fri)
by matthias (subscriber, #94967)
[Link] (2 responses)
Which processes do you have in mind that would be changed? Obviously tmux and screen. Anything else? Processes that are actually daemons do not count (they are out of scope anyway as they do not belong to the session). Neither do processes count that are explicitly backgrounded by the user. For these processes the user will make sure that session management does not kill them (starting with a wrapper like systemd-run).
Posted Jun 10, 2016 18:05 UTC (Fri)
by viro (subscriber, #7872)
[Link]
Posted Jun 11, 2016 10:53 UTC (Sat)
by ras (subscriber, #33059)
[Link]
All accurate.
> With this change systemd will clean up processes where this did not work.
Sadly systemd isn't psychic and so can't currently distinguish between when it didn't work and when it didn't matter. But as you observe, there are only a few known programs this affects. So they could be patched. And I am willing to concede the in-house programs broken by this change don't matter. That seems to be business as usual in open source - I get screwed over by non-backward-compatible API changes on a fairly regular basis.
But this always cleaning up processes where "it didn't work": not a good idea. The default should be to not hide the problem. Just in case you don't know: "didn't work" is a bad thing. It's caused by a bug. It is better for all of us if we get that bug fixed ASAP. If processes hanging around caused most of us a lot of pain, then maybe you would have a point. But we lived with it for 30 years, so the pain can't be that great.
Nonetheless as you have pointed out most is not all, and in particular this behaviour has caused you real pain. Fair enough. I hope nobody would argue with it providing a solution for you and anybody else that has it.
My issue: that solution has existed for 15 years.
Posted Jun 8, 2016 17:41 UTC (Wed)
by drag (guest, #31333)
[Link] (7 responses)
What if you consider a 'user session' to be system-wide instead of tied to a specific TTY?
I want to start processes that I can connect to and use regardless of how I log into a system, but when I am gone from the system completely I want them to be gone as well. Except in specific cases when I don't.
An example of this is Emacs.
Right now I use Emacs in 'daemon mode'. I would like to be able to connect and use it from different logins. I should be able to launch new client windows from another system over SSH and I would like to be able to use it from local login as well as X. When X dies I don't want it necessarily to be killed along with X if I happen to be using it from SSH for example.
But I also use Emacs for sensitive things. I have gpg-encrypted files that I open in Emacs for old passwords and things like that. So I _REALLY_ don't want Emacs sitting there running forever when I log out completely. It's far better for me to have to re-start emacs than have it sitting there with all my passwords decrypted in memory if I get forcefully disconnected from it.
I also want the behavior to be the same regardless how I happen to launch Emacs. If I launch it from a terminal emulator I want its session behavior to be the same as if I launched it from ssh or from an X application menu.
> A PAM session management plugin would be the cleanest way to implement it.
How is a PAM session management plugin going to know which programs I want dead and which ones I don't?
Posted Jun 8, 2016 23:17 UTC (Wed)
by ras (subscriber, #33059)
[Link] (6 responses)
It can't easily know. (Easily being the operative word. It's just code, after all. It could, say, read ~/.kill-on-logout.lua, and use some function in there to decide.)
But I'm missing something here. How can any other solution easily know? Or perhaps you are focusing on the 1 -> 0 sessions trigger I gather the KillUserProcesses solution uses. If so, obviously the PAM plugin could use the same trick.
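The "1 -> 0 sessions" trigger mentioned here is ordinary reference counting. A minimal Python sketch follows, assuming a hypothetical `SessionTracker` class and a caller-supplied cleanup callback; it does not describe how logind or any PAM module is actually implemented.

```python
from collections import Counter


class SessionTracker:
    """Count open sessions per user; fire a callback when the last one closes."""

    def __init__(self, on_last_logout):
        self._open = Counter()
        self._on_last_logout = on_last_logout

    def login(self, user):
        self._open[user] += 1

    def logout(self, user):
        if self._open[user] == 0:
            raise ValueError(f"no open session for {user!r}")
        self._open[user] -= 1
        if self._open[user] == 0:
            # The 1 -> 0 transition: only now is it safe to clean up.
            del self._open[user]
            self._on_last_logout(user)


cleaned = []
tracker = SessionTracker(cleaned.append)
tracker.login("alice")
tracker.login("alice")
tracker.logout("alice")       # one session still open: nothing happens
after_first = list(cleaned)
tracker.logout("alice")       # last session gone: callback fires
```

Whether the callback sends SIGHUP, SIGTERM, or anything at all is a separate policy question, which is most of what the thread is arguing about.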
Sorry, I don't really understand your question.
Posted Jun 9, 2016 4:49 UTC (Thu)
by drag (guest, #31333)
[Link] (5 responses)
Well now I have the feeling that I was missing something.
You just tell systemd you want the process to live via 'systemd-run --user' or launch the program via a '--user' service file with a 'linger' option. But I suspect that is what the behavior is now for '--user' for 230 anyways. I'll need to play around with it I guess.
If it is true you can cause a process to linger just with an invocation of systemd-run or by defining a service file, then my concerns/issues/questions are already all addressed.
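For reference, the command meant here is presumably systemd-run. The sketch below only builds the command line that wraps a program in a transient user scope, so that it is no longer part of the login session's own scope. The helper name `wrap_in_user_scope` is invented for the example, and lingering may also need to be enabled for the user (for instance with `loginctl enable-linger`) before such a process actually survives logout.

```python
def wrap_in_user_scope(argv):
    """Return a command line that starts argv in its own user scope.

    Hypothetical helper: it only constructs the argument list and does
    not run anything, so it is safe on machines without systemd.
    """
    return ["systemd-run", "--user", "--scope", *argv]


command = wrap_in_user_scope(["tmux", "new-session", "-d"])
# To launch it for real one could pass `command` to subprocess.run().
```

A wrapper script for screen or tmux would amount to little more than this one line.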
Yeah, so I don't know.
When you can strip out most of the daemonization portions of a program and replace it with a simple <progname>.service text file or wrapper shell script that calls 'systemd-run'.... and get superior results than was possible before I suspect that is the way to go rather than screwing around with pam, calls to dbus, or anything else. The simple solution is usually the better one.
I am starting to suspect that this is all just one huge non-issue as far as the technical issues are concerned, the problems are solved and the work needed by the developers is actually simplified. It's the social aspect of it that is the problem. People don't like to see change. I am not trying to downplay the issue, though. The social aspects are important.
Posted Jun 9, 2016 5:43 UTC (Thu)
by ras (subscriber, #33059)
[Link] (4 responses)
Now I'm lost. The work required by developers under the old scenario was one of:
- None: if the user wants it to run after logout, he uses "nohup command" or uses tmux or something.
- signal(SIGHUP, SIG_IGN)
How on earth could it get any easier than that?
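As a rough illustration of the two options listed above, the following Python sketch starts a child with SIGHUP ignored, which is approximately what nohup arranges (nohup also redirects output, which is omitted here). The helper name `spawn_ignoring_hangup` is invented for the example.

```python
import signal
import subprocess
import sys
import time


def spawn_ignoring_hangup(argv):
    """Start argv with SIGHUP ignored, roughly what nohup(1) arranges."""
    return subprocess.Popen(
        argv,
        # Runs in the child between fork() and exec(); a signal that is
        # ignored stays ignored across exec().
        preexec_fn=lambda: signal.signal(signal.SIGHUP, signal.SIG_IGN),
    )


child = spawn_ignoring_hangup(
    [sys.executable, "-c", "import time; time.sleep(30)"]
)
child.send_signal(signal.SIGHUP)   # the traditional "logout" signal
time.sleep(0.5)
survived = child.poll() is None    # still running: the hangup was ignored
child.kill()                       # SIGKILL, by contrast, cannot be ignored
child.wait()
```

The last two lines are the crux of the disagreement: a process prepared this way survives a hangup, but nothing it does can make it survive the SIGKILL that follows SIGTERM under the new default.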
Sure, it doesn't handle the emacs scenario you mentioned, but that seems to be pretty esoteric. It's very rare for me to have two logins on one machine, let alone a burning desire to share an editor session that I want killed when I log out of all sessions. Breaking backward compatibility, and adding all this complexity for the sake of something so specialised seems like a very odd design decision. It only gets odder when you know a few lines of a Python PAM module could do it.
Posted Jun 9, 2016 7:33 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
The scenario is dead simple - you log out, you log in and get two copies of a process that should exist in only one instance.
Posted Jun 10, 2016 0:57 UTC (Fri)
by ras (subscriber, #33059)
[Link]
> What if a user _does_ NOT want a process to survive a logout?
Two scenarios:
And so now you say but this is affecting a large cluster of machines the unwashed masses use - and they aren't going to do the kill. So the sysadmin that looks after them has to interrupt his tea break on occasion - until the bug fix arrives. As a part time sysadmin myself, I have to concede this is indeed a very serious situation. Fortunately, there is a workaround. A short PAM module:
import os, signal, sys, time
Job done! Well maybe. Comparing sessions works fine for ssh, but GNOME creates many of them now. To fix that the easiest kludge would be to kill all the user's processes when all of his sessions are gone. It would be hacky and racy - but this is just a kludge until the real bug fix comes in. It's a pity that's exactly what the real bug fix does.
Posted Jun 9, 2016 16:31 UTC (Thu)
by drag (guest, #31333)
[Link] (1 responses)
These sorts of command-control programs are becoming more common. Tmux is an example. Emacs is an example. But there are others. Gnome-terminal uses a terminal daemon to manage things. Urxvt has the ability to use Urxvtd as well. With X they are limited to a particular login, but is that same limitation going to exist for Wayland? Is it going to be possible to run the program independent of the display and connect to it?
Then there are things like pulseaudio, mpd (music player daemon), irc bots, irc clients, IM bots, email weirdness, etc etc. These are programs you launch, you leave them floating around, and then connect to using a different process. In the future you'll start running into more AI stuff like Mycroft, where you have 'helper' programs that the user interacts with, open source versions of Siri, 'hello google' or whatever.
They would all work just a bit better if they were tied to a user being logged into a machine, but not to a specific login.
What I think is going to happen is that we are running into a 'long tail' situation. Each of these things is esoteric, but there are a whole lot of people wanting to do their own esoteric thing.
At this point I really don't know. I'll have to play around with it.
Posted Jun 10, 2016 4:01 UTC (Fri)
by ras (subscriber, #33059)
[Link]
Lots of good questions. The answer to all of them is probably something along the lines of "session tracking is broken, let's fix it".
It's not like the problem of keeping something around only while there are references to it hasn't been stumbled over, cursed at and solved a million times already. The answer being proffered here is "session tracking is broken, so we're abandoning it".
That aside, rather than solving the issue at hand in a minimally intrusive way (maybe by sending a SIGHUP to all processes owned by the user when his login session count drops to 0?), they pair it with "followed by an unconditional SIGKILL" because in their opinion the way we have been doing it for the last 30 years is wrong, and we need to be forced down their enlightened path.
If they had decoupled the two, they probably would have got the first one through without much fuss and if the second one was a good idea it would have become the default in due course.
Posted Jun 9, 2016 3:33 UTC (Thu)
by ewen (subscriber, #4772)
[Link] (22 responses)
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394#221
which points out that sending SIGHUP at "session termination" time would have been the compatible thing to do. screen/tmux/nohup, etc, all know how to ignore SIGHUP, and SIGHUP is precisely intended as a "end of user session" indicator, ie, the controlling terminal has gone away. (Now we don't have controlling terminals that much, but we have a more sophisticated idea of "session" -- so "session has gone away" makes more sense as the meaning of SIGHUP.)
The choice to send SIGTERM (ie user initiated termination) rather than SIGHUP (external action initiated, ie session gone) -- and particularly to default to following that up with SIGKILL -- seems to be the root cause of the pain experienced. By contrast, turning on a default of sending SIGHUP at the "end of session", when that's a "GUI session" without a controlling terminal, seems fairly likely to produce the right, backwards compatible, results (since all the "intended to stay running" programs know how to handle SIGHUP, and have for decades; and all the "intended just during login session" do something useful on SIGHUP, even if it's just the default behaviour of exiting).
FWIW, it does seem reasonable to have a "sysadmin enabled, off by default" session manager policy option to also send SIGTERM/SIGKILL at "end of session" if the site policy is "no persistent processes at all". But I don't think that's the common case at all. Particularly for what seems to be the original cause for the change (ie, login session processes persisting "too long" because they didn't get HUP'd on last logout due to not having a controlling terminal).
Ewen
Posted Jun 9, 2016 5:39 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (21 responses)
If 3 goes wrong, the process survives. A cleaner implementation would have been to send a SIGHUP followed by some SIGHUP2 with the meaning that SIGHUP2 is only intercepted by daemons, i.e., with the SIGHUP a process does a clean exit and SIGHUP2 will terminate all non-daemons that failed for whatever reason to exit. Unfortunately it will be hard to change these semantics.
On top of that we have the problem of processes ignoring SIGHUP for other reasons, as SIGHUP also gets sent without the login session ending (e.g., closure of X terminals). After all, the semantics of SIGHUP have changed over time.
For most of the processes the correct choice is they should not survive the login session. The old behaviour is not really working this way, as SIGHUP is not successful in terminating all these processes. Having the session manager do a SIGTERM/SIGKILL at the end of the session is reasonable. However it needs to know which processes should survive. Therefore we need small changes to very few applications.
- We need some version of nohup that also tells systemd to not kill the process (systemd-run should work; users need to get used to this. Of course one could install a version of nohup that takes care of this).
This way, session management would be much cleaner, independent of a no-persisting-processes policy. Such a policy would then be implemented by not allowing lingering processes and not allowing access to cron, at, batch (and possibly screen, tmux). With screen and tmux employing proper session management, a user using these programs would still be listed as logged in. So it is harder to hide some processes. Of course, depending on policy, someone might want to restrict the access to these programs, too.
Posted Jun 9, 2016 6:00 UTC (Thu)
by ras (subscriber, #33059)
[Link] (14 responses)
True. But if you're doing cleanup on exit, you're also doing it for a SIGTERM at the very least. It will be identical code and if a SIGHUP freezes it's very likely a SIGTERM will do the same thing.
> The old behaviour is not really working this way, as SIGHUP is not successful in terminating all these processes.
Then there is a bug. The situation this change is trying to address is a bug that was introduced deliberately by Gnome / Systemd. They wanted to see some user services (eg, address book), between login sessions. This means when the user logged out, it had to ignore SIGHUP.
I've seen several versions of the address book service running on my own laptop, so I've been hit by it myself. I can't say I felt a huge impact beyond thinking "gee, that's untidy". It certainly was not worthy of anything more than raising the energy to file a bug report. I confess it seemed so minor I didn't even bother doing that.
> - We need some version of nohup that also tells systemd to not kill the process
Yep, we have it. It's called signal(SIGHUP, SIG_IGN). Then the process won't die, and no one has to learn anything. It ain't rocket science. If you want to enforce a policy of all processes being forced to exit on logout, add a PAM plugin. But most people won't use it because it's not a serious issue on a headless server or a personal laptop, and Gnome will have to find some other way of fixing their bug.
Posted Jun 9, 2016 6:24 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (13 responses)
Therefore systemd sends a SIGKILL after some time. This is meant to bring down the processes where SIGTERM did not work. This is the same mechanism that shutdown has used for decades.
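The SIGTERM-then-SIGKILL sequence described here can be sketched as follows. This is a generic illustration of the pattern, not systemd's code; the helper `stop_with_grace` and the one-second grace period are assumptions made for the example.

```python
import signal
import subprocess
import sys


def stop_with_grace(proc, grace_seconds):
    """Ask proc to exit with SIGTERM; force it with SIGKILL after a timeout."""
    proc.terminate()                      # polite request: SIGTERM
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                       # cannot be caught or ignored
        proc.wait()
    return proc.returncode


# A deliberately stubborn child that ignores SIGTERM, standing in for a
# program with a broken or hung shutdown path.
stubborn_code = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)
stubborn = subprocess.Popen(
    [sys.executable, "-c", stubborn_code], stdout=subprocess.PIPE
)
stubborn.stdout.readline()                # wait until SIGTERM is being ignored
result = stop_with_grace(stubborn, grace_seconds=1.0)
stubborn.stdout.close()
```

A negative return code from `Popen` means the child was ended by that signal number, so the stubborn child reports that it was killed rather than that it exited on its own.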
>> The old behaviour is not really working this way, as SIGHUP is not successful in terminating all these processes.
Obviously, but it is not only a bug of gnome. I have seen this bug on KDE many years before systemd even existed. Of course it is nice to fix the bugs, but it is also obvious that there always will be some bugs.
>> - We need some version of nohup that also tells systemd to not kill the process (systemd-run should work, users need to get used to this).
Unfortunately, SIGHUP is also sent in some situations when the login session is not terminating. So there is software ignoring SIGHUP for reasons other than that the process should survive the session. Also every software with a bug in the SIGHUP signal handler could be a problem. From my experience, problems with the SIGHUP handler are the usual reason for processes lingering around that should have exited.
> But most people won't use it because it's not a serious issue on a headless server or a personal laptop, ...
Posted Jun 9, 2016 7:08 UTC (Thu)
by ras (subscriber, #33059)
[Link] (12 responses)
I think you missed the point. The point is: if there is a bug in the SIGHUP handling, there is also most likely a bug in the SIGTERM handling. Sending a SIGKILL does not fix the problem. It hides it. Assuming the application is trapping both of these for a reason such as saving data, then fixing the bug is the correct path - not hiding it.
That said, it's been a long while since I've seen either problem. Until now, when GNOME introduced it as a "feature".
> Obviously, but it is not only a bug of gnome. I have seen this bug on KDE many years before systemd even existed.
Yep, and it was fixed by KDE long long ago. The difference is GNOME / Systemd doesn't consider what they have done to be a bug - it's a new feature. Then to fix the bugs their new feature introduced they want to break backward compatibility with systems that don't use GNOME. KDE had the decency to fix their bugs without using it as an excuse to inflict their version of Utopia on everyone else.
> Unfortunately, SIGHUP is also sent in some situations when the login session is not terminating.
Only when it's been co-opted for other purposes - like reloading the configuration in system daemons. And if the particular program does, either they are uninterested in knowing when the user logged out, or they have introduced a bug because there is no other way to know short of polling. This won't change under the proposed new regime, as SIGHUP will remain the way a process learns the login session has ended.
> I expect distros to accept the change, once the few problematic programs have fixes.
We will see. As the article points out, the tmux people don't see their program as the problematic one in this case, and from what I can tell a fairly large cohort of people agree with them.
Posted Jun 9, 2016 9:09 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (11 responses)
I am arguing that session shutdown is like system shutdown. For decades we have used SIGTERM/SIGKILL when shutting down the system. Would you argue that when I type shutdown -r now and some application is not terminating cleanly, then the system should hang forever because sending a SIGKILL after SIGTERM is hiding bugs?
I fully agree that bugs should be fixed, but on the other hand some fundamental things as session management should handle bugs of applications gracefully.
>> Unfortunately, SIGHUP is also sent in some situations when the login session is not terminating.
Posted Jun 10, 2016 2:49 UTC (Fri)
by ras (subscriber, #33059)
[Link] (10 responses)
Yes, I knew that and should have addressed it, I guess. But it did sound to me like "I use a hammer to crack nuts, so why not an egg?"
The way I see it is: a process hangs around after logout when it shouldn't, about the only harm done is a little lost RAM, or at worst a pinned CPU if it's gone infinite. If that happens and it bothers you, the fix is also simple: kill it. On the other hand, automatically killing a process when it hasn't shut down properly delays getting the bug fixed. And there is a bug that needs to be fixed: either it doesn't matter, in which case why is it trapping SIGHUP at all, or it does matter and tears will follow one day.
If a process doesn't stop on shutdown the implications are much more severe. I've lost control of remote servers because of it. Plane flights cost time and money. It's not that the consequences of killing the process aren't the same: both result in loss of information. It's that the consequences of shutdown not happening are very different.
> Or a pty going away because an X terminal is closed. Not every X terminal is a session on its own. Semantics have changed in time.
Yes they have. The session id the kernel used has been co-opted for all sorts of purposes now. This is the real problem you are grappling with. We used all sorts of kludges to get around it, but apparently these GNOME changes were the straw that broke the camel's back. It seems session tracking is now far too hard, so rather than track sessions you've decided killing all processes belonging to a user is the way to go.
Obviously it's a kludge. It's racy (what if a person logs in between the 0->1 test and you lot starting to kill processes), and it won't always work (what about processes started as a different user), and it isn't backward compatible with what worked for 30 years now.
I'd have more sympathy if you were trying to get something simple done and stumbled onto this mess. Instead, you lot with your multiple systemd process trees are responsible for the worst aspects of it. And all this so you can optimise multiple GNOME sessions on the one machine. Does that even happen?
Posted Jun 10, 2016 3:23 UTC (Fri)
by pizza (subscriber, #46)
[Link]
Yes.
Posted Jun 10, 2016 8:55 UTC (Fri)
by matthias (subscriber, #94967)
[Link] (8 responses)
> Obviously it's a kludge. It's racy (what if a person is logs between the 0->1 test and you lot starting to kill processes),
In contrast to solutions with pkill, systemd should only kill processes belonging to closed sessions (and not of the new session), as it tracks sessions with cgroups. There might be a race condition if the new session decides to use some process of an old session which gets killed. I am not sure whether systemd removes this race by delaying the start of the new session while on a killing spree. This should be possible.
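Tracking sessions with cgroups means that a process's session can be read from its cgroup path rather than from its owner. The sketch below parses text in the format of /proc/<pid>/cgroup and extracts a session id from a `session-N.scope` path component. The sample paths are illustrative only; the real layout depends on the systemd version and cgroup hierarchy in use, and `session_of` is a name invented for the example.

```python
import re

# Matches a logind session scope such as "session-3.scope".
SESSION_SCOPE = re.compile(r"session-([^./]+)\.scope")


def session_of(cgroup_text):
    """Return the session id named in /proc/<pid>/cgroup text, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form hierarchy-ID:controller-list:cgroup-path
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = SESSION_SCOPE.search(parts[2])
        if match:
            return match.group(1)
    return None


# Illustrative samples, not captured from a real system.
in_session = "1:name=systemd:/user.slice/user-1000.slice/session-3.scope\n"
user_service = "1:name=systemd:/user.slice/user-1000.slice/user@1000.service\n"
```

A process in the second kind of cgroup belongs to the user's service manager rather than to any one login session, which is why a pkill by user name and a kill by session scope hit different sets of processes.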
> and it won't always work (what about processes started as a different user),
> and it isn't backward compatible with what worked for 30 years now.
Obviously this change should only hit stable distributions, once screen and tmux are fixed (either upstream or by distribution patches).
Posted Jun 11, 2016 0:30 UTC (Sat)
by ras (subscriber, #33059)
[Link] (7 responses)
Yes, I can well imagine it would be.
But you did have another option: http://lwn.net/Articles/690555/
Well maybe not that precisely, but with a few tweaks you could have made it kill all processes owned by a student when his last session was gone, and unlike the current proposal made it very selective so only the students were affected and not sysadmins or others squawking here. This is exactly the sort of problem PAM's session management is well suited to.
Posted Jun 11, 2016 2:06 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (6 responses)
Posted Jun 11, 2016 8:42 UTC (Sat)
by ras (subscriber, #33059)
[Link] (5 responses)
It was just a work-around, and I don't doubt there are many people who think getting upstream to provide a config option for their particular problem is a better solution. Maybe you are one of them. I'm not.
There is a 50 line solution to the problem. It isn't a patch to upstream I have to carry, the API is stable, a compile isn't required, it doesn't require me to monitor upstream security problems and rebuild it with every fix - it's just drop a file into a directory and go. If I was the sysadmin being given grief by miscreant students, I know I would have invested the hour needed to write it. If as claimed there are a lot of other sysadmins with the same problem, I am somewhat puzzled that it isn't packaged and available on the major distros already, because if it had been it would have been just a setting in PAM's config file.
Which brings us to the real point. I don't use Linux because it has a setting in a config file for my every need - that's an impossible ask after all. (If I believed it was possible, I would be using Windows. Obviously it's not there yet, but given it's possible it must be just around the corner ...) I use 'nix because it's a Swiss army knife that is so flexible that for most problems there is a 50 line solution.
The KillUserProcesses setting looks nothing like that. Elsewhere you said it can be controlled per user. What if I don't think per user is particularly useful? Maybe I'm a sysadmin with a miscreant student population in a large educational institution that turns over staff regularly, and with every change of staff I have to change the systemd configuration on 100's of machines. I don't think so. Give me a system that provides the flexibility to configure in a way that suits me. Maybe I put all students in the one group, or maybe I lookup payroll, or read a flag out of FreeIPA.
I'd take that flexibility over a specialised "config option" any day. Quite apart from anything else, I could not be as productive in my professional life without it.
Posted Jun 11, 2016 11:25 UTC (Sat)
by pizza (subscriber, #46)
[Link] (4 responses)
Then it's a good thing that you're not forced to choose between those two, eh?
Posted Jun 11, 2016 11:55 UTC (Sat)
by ras (subscriber, #33059)
[Link] (3 responses)
If we could leave it turned off with no repercussions other than our tmux sessions continue to run, there wouldn't be almost 300 posts on LWN about this. The reality is, if we want GNOME to clean up properly, we have to enable KillUserProcesses. Frankly I'd even accept that, albeit for purely selfish reasons as I'm not a fan of GNOME 3. Unfortunately many of the other window managers rely on GNOME to fill the gaps in their own efforts, including the one I use on my laptop.
This doesn't feel like we are being offered a choice.
Posted Jun 11, 2016 14:55 UTC (Sat)
by pizza (subscriber, #46)
[Link]
You are, in a word, incorrect.
Posted Jun 12, 2016 10:20 UTC (Sun)
by micka (subscriber, #38720)
[Link] (1 responses)
Slowly reading through them. From what I've read up until now two thirds are from 3 or 4 persons. I'm not sure what you can deduce from the number of comments except that there are very talkative commenters.
Posted Jun 17, 2016 16:01 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
Certain topics press certain buttons. Gnome brings out one set of posters. Systemd brings out another (and I've noticed systemd tends to attract troll accounts I've never seen before ...)
And databases? Well that tends to get me going :-) It's all about what matters to people. And some people just enjoy sitting in the peanut gallery lobbing rotten tomatoes ... :-)
Cheers,
Posted Jun 9, 2016 6:47 UTC (Thu)
by ewen (subscriber, #4772)
[Link] (5 responses)
Your case 2 is either (a) "long running daemon", which these days are typically launched by some sort of "init" process (directly or indirectly) so have their own session (and thus okay) or (b) is a "long running user process" (screen, tmux, nohup background process, etc) which are detached from the controlling terminal and (in systemd land) have a "user session". In both cases (apart from any "site policy") it's intended, by the user/process that started them, that they survive. (And as someone else suggests, PAM seems a good place to put such "all processes must go away on user GUI logout" site policy.)
Your case 3 is a "smarter" background process that wants to, eg, save state and *then* exit. In which case on receiving SIGHUP it should exit very soon afterwards, no problem. If there's a bug that prevents it from getting to exiting in a timely fashion... then that's a bug and should be fixed in the program. Having a "nuke it from orbit, it's the only way to be sure" (SIGTERM/SIGKILL) approach that affects all processes "just in case" of bugs in the occasional program seems... excessive.
As you allude to there is an issue with the "background process that wants to exit cleanly" if they are currently attempting to use SIGHUP for something else (eg, "reread config"). The obvious solution to that problem is to hook their "reread config" option up to another signal -- SIGUSR1 is common. ("SIGHUP to reread config" makes sense as a convention only for running daemons, because it's a "soft exit" -- ie, act as if you exited and started again loading the new config, but without actually exiting. But even there it's at best a kludge. Just a really long standing convention. SIGUSR1 is another common convention for "reread config" or "soft restart", also used for decades.)
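To make the convention concrete, here is a minimal sketch of a long-running shell job that reloads on SIGUSR1 and treats SIGHUP (or SIGTERM) as "save state and exit". The function and variable names are illustrative assumptions, not taken from any particular program.

```shell
#!/bin/sh
# Sketch: SIGUSR1 means "reread config", SIGHUP/SIGTERM mean
# "save state and exit soon". All names here are illustrative.

reloads=0
stopping=0

reload_config() {
    # A real program would re-read its configuration file here.
    reloads=$((reloads + 1))
}

request_stop() {
    # Only set a flag; the main loop saves state and exits promptly.
    stopping=1
}

trap reload_config USR1
trap request_stop HUP TERM
```

A main loop would then check `$stopping` between units of work, save its state, and exit, which is the "exit very soon afterwards" behaviour described for case 3.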
It still seems to me there's no need to force every long running program to be rewritten to be "systemd/session" aware, since there's a long standing convention (SIGHUP when the session is going away) that can be used again here and all the relevant programs already understand that so it would be elegantly backwards compatible.
Ewen
PS: If one were starting again from scratch, without 30+ years of history, it's arguable that having screen/tmux/nohup being "session aware" makes sense. But in historical terms they are: they start new (1990s) sessions by detaching from the controlling terminal, which is how "sessions" worked for decades (see, eg, Stevens "Advanced Programming in the Unix Environment"). The semantics/needs haven't changed in the last 30 years, only what indicates "the session" and "end of session".
Posted Jun 9, 2016 8:52 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (4 responses)
The problem is that there are many cases where a program has to intercept SIGHUP even though it does not want to survive the session.
I fully agree that not every long running program should be systemd/session aware. The overwhelming majority of daemons just work, as they are started as daemons. If a user wants a special program to survive, the user should use systemd-run instead of nohup in a systemd setting. No need to change the program. The only programs needing changes are programs starting new sessions themselves (screen/tmux; if you have other examples, I am still waiting to hear of them). These programs should register a PAM session (not only to avoid being killed but also for proper management of associated processes like ssh-agent).
Posted Jun 9, 2016 9:22 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Now, I'm quite partial to doing ctrl-Z/bg/disown dance to background unexpectedly long running jobs. E.g. I've started a build and it suddenly decided to download a new version of the docker image. Very slowly.
What would be the best equivalent in this case?
Posted Jun 9, 2016 16:10 UTC (Thu)
by cortana (subscriber, #24596)
[Link] (1 responses)
Posted Jun 9, 2016 16:50 UTC (Thu)
by viro (subscriber, #7872)
[Link]
The point is that back when you had launched that sucker you had no idea that it might need to be left to run - otherwise you would've used nohup to start it in the first place. And yes, there's a bunch of real-world situations where you really don't want to kill the damn thing and restart it from scratch, this time with nohup. Consider a case when what you expected to be a couple of hours of calculations you've started in ssh session on a big fast box at 1pm, only to discover at 11pm, when you get around to checking what it has produced that it's only ~2/3 way through. And you really have to disconnect the laptop you'd been using and leave. Killing that sucker and starting it with nohup means that results won't be there until 2pm tomorrow instead of waiting for you when you get back there in the morning. Sure, you ought to have added checkpointing, etc., but the whole problem is that it was supposed to be a one-off thing, and a reasonably quick one. I've no idea whether that's the scenario original poster had in mind, but it definitely does happen. disown can save you a lot of PITA in such case.
Posted Jun 9, 2016 9:33 UTC (Thu)
by ewen (subscriber, #4772)
[Link]
In the unlikely event that the *foreground* process still isn't behaving properly, and is blocking the session exit as a result, most desktops provide the user with tools to... persuade the process to exit. ("Force Quit" and the like -- which can do the SIGTERM/SIGKILL dance, at the user's explicit request, for the single buggy *foreground* process.)
AFAICT foreground processes were already doing the right thing; the change seems to have been caused solely by background "session-wide" processes (user dbus and the like) and the changes in how they worked.
FWIW, as I said earlier programs like screen/tmux/etc *are* starting new sessions, as that's been historically defined (by detaching from the controlling terminal, calling daemon(), etc). The discussion (in the tmux issue) about putting this "one init/session manager" specific behaviour in a commonly used place (eg, daemon()), so a simple recompile/relink picks it up seems more appropriate than requiring each "working for years" tool to suddenly have to add special code just to avoid being killed in one context. (Even PAM isn't necessarily used on all platforms that tools like screen, tmux, etc, use.)
And IMHO, breaking nohup and then saying "well you should use this other tool (systemd-run) instead just on systems that have this init system" is also unfortunate. It'd probably be better for distributions to install a nohup that continued to "do the right thing" so the background process was allowed to continue to run, keeping the long standing (25+ years I know about) semantics of "nohup".
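A distribution-provided wrapper of that kind might look roughly like the sketch below. It is an assumption-laden illustration, not an existing tool: `detach_prefix` and `run_detached` are hypothetical names, and the only systemd invocation used is the `systemd-run --user --scope` form mentioned in this discussion. Depending on configuration, lingering (`loginctl enable-linger`) may also need to be enabled for the user.

```shell
#!/bin/sh
# Hypothetical "nohup-like" wrapper: prefer a systemd user scope when
# systemd-run is installed, otherwise fall back to plain nohup.

# Print the launcher prefix to use on this system.
detach_prefix() {
    if command -v systemd-run >/dev/null 2>&1; then
        echo "systemd-run --user --scope"
    else
        echo "nohup"
    fi
}

# Run "$@" in the background so that it can outlive the login session.
run_detached() {
    # Word-splitting of the prefix is intentional here.
    $(detach_prefix) "$@" >/dev/null 2>&1 &
}
```

Called as `run_detached ./long-job --flag`, this keeps the familiar "prefix the command" usage while hiding which mechanism the local init system needs.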
Ewen
Posted Jun 8, 2016 5:35 UTC (Wed)
by iamsrp (subscriber, #84011)
[Link] (16 responses)
srp
Posted Jun 8, 2016 5:59 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (2 responses)
The Linux desktop is coming!
Posted Jun 8, 2016 7:25 UTC (Wed)
by SimonO (guest, #56318)
[Link] (1 responses)
On servers, cluster nodes in a scientific environment, shared desktops, etc. I'd expect this to be disabled by default. With Linux that is the majority of use-cases, as single user linux-desktop is a small niche of the whole spectrum I guess...
The whole thing appears to be in the hands of distributors. Systemd is a raw tool (like a swiss army knife with a BFG 9000 included ;-) and they need to expose only the safe parts for normal use and include some safety precautions for the BFG.
If all distros have to put in work to make every new systemd release palatable, it should become an upstream problem to reduce the redundant work in all distros.
Posted Jun 8, 2016 8:34 UTC (Wed)
by ovitters (guest, #27950)
[Link]
It's not that much work to integrate new systemd releases. It often aligns behaviour and reduces differences across distribution, making things easier. It is usually the differences causing the interesting bugs. Theoretically you should be able to modify anything you want, but it does lead to bugs.
Posted Jun 8, 2016 6:36 UTC (Wed)
by oldnpastit (guest, #95303)
[Link] (2 responses)
> The commit that made the change:
And if you go look at those bugreports, the first of them is complaining that latest dbus leaves processes lying around.
And if you go investigate why dbus is now leaving processes around, it's because gnome leaves special magical gnome per-user processes running after the user's session has terminated. And that in turn is relying on some new systemd features to create a user session which persists (or something, I don't really understand systemd).
Posted Jun 8, 2016 18:56 UTC (Wed)
by drag (guest, #31333)
[Link]
Remember how it was common in Gnome 2-land to have a session manager process? So you could go into there and say 'I want FOO started when I log in'? People would use it for all sorts of fun things. Starting up browsers, launching their terminals, blocking daemons they didn't want launched.
Then along came XDG and the various *.desktop things to make app menus cross-platform. So you could put *.desktop files in ~/.config/autostart/ and have them autostart when you logged into X (provided you used a desktop environment that was compliant).
Systemd wants to do that same sort of thing for your user, but have it be system-wide. That is one of the reasons they want Kdbus.. so you wouldn't have to run all the dbus-launch stuff for each login. You could have a dbus for a user account and have it 'just work'. Thus you have 'user sessions'.
So a lot of the commands you use for managing the init system for your system can be used to manage your user session for your user by using the '--user' option.
So for example I have a ~/.config/systemd/user/synergys.service on my desktop. When I start up my laptop I have a corresponding synergyc service that launches and they try to talk over a ssh session. I can automate the management of all of this through systemd and '--user'.
systemctl enable --user synergys
systemctl start --user synergys
systemctl status --user synergys
journalctl --user -f
etc etc. These things work as expected, but just for your user account.
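For readers who have not written one, a user unit like the one mentioned above might look roughly as follows. The unit contents are an assumption (the actual file is not shown in this comment), as are the `/usr/bin/synergys` path and the helper name `write_synergys_unit`.

```shell
#!/bin/sh
# Write a minimal user unit into the directory given as $1.
# The unit contents and the binary path are assumptions.
write_synergys_unit() {
    unit_dir="$1"
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/synergys.service" <<'EOF'
[Unit]
Description=Synergy server (user session)

[Service]
ExecStart=/usr/bin/synergys --no-daemon
Restart=on-failure

[Install]
WantedBy=default.target
EOF
}
```

Running `write_synergys_unit "${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"` followed by `systemctl --user daemon-reload` would make the unit visible to the commands listed above.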
Posted Jun 9, 2016 16:06 UTC (Thu)
by cortana (subscriber, #24596)
[Link]
Posted Jun 8, 2016 8:49 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (8 responses)
I have seen processes consuming 100% CPU just because shutdown did not work (the SIGHUP handler was there but failed to terminate the process). Having the session management kill such processes seems reasonable. Therefore it has to know which processes should survive.
Posted Jun 8, 2016 10:25 UTC (Wed)
by NAR (subscriber, #1313)
[Link] (7 responses)
As to the technical choice that every program that wants to avoid getting killed after logout has to link against libsystemd - throwing out binary compatibility again. They could have provided some kind of wrapper, so screen could be aliased to something like "wrapper screen".
Posted Jun 8, 2016 11:21 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (1 responses)
And asking that the administrator should do the job of session management seems like a joke to me. It is enough work when the administrator has to react to a user doing malicious things on purpose.
I expect that the usual tools for having background processes are changed before this change hits stable distributions. Before that, disabling the feature is a reasonable choice. tmux and screen could use proper session management (using PAM) anyway (e.g., to avoid the ssh-agent being terminated when the user logs out). The current behaviour looks broken even without the systemd change. PAM will do the necessary things to avoid systemd killing the tmux/screen session in this case. I do not see a problem when a distribution that uses systemd as PID 1 needs a version of nohup that is linked against libsystemd.
Once these changes are in place, the distributions can enable the feature again.
Posted Jun 8, 2016 13:18 UTC (Wed)
by NAR (subscriber, #1313)
[Link]
Posted Jun 8, 2016 21:36 UTC (Wed)
by xtifr (guest, #143)
[Link] (4 responses)
Like a lot of programmers (and other technical people), you're ignoring the very common case of the *family* computer! Dad may log out so the kid can work on a school paper. Doesn't mean that dad wants his emacs "daemon" to die. In fact, he may be relying on it not dying!
Posted Jun 8, 2016 23:50 UTC (Wed)
by johannbg (guest, #65743)
[Link] (3 responses)
Posted Jun 9, 2016 4:26 UTC (Thu)
by xtifr (guest, #143)
[Link] (2 responses)
Not everyone has a screaming, top-of-the-line, latest model machine with all the trimmings. Especially those who are trying to feed three or more children! :)
Posted Jun 9, 2016 10:37 UTC (Thu)
by NAR (subscriber, #1313)
[Link] (1 responses)
Posted Jun 9, 2016 21:45 UTC (Thu)
by mstone_ (subscriber, #66309)
[Link]
^^^ lol, yeah, the machine's unusable for several minutes every time someone "fast" user switches.
Posted Jun 9, 2016 2:51 UTC (Thu)
by kokada (guest, #92849)
[Link]
The workaround was to reduce the time that systemd waits for a process to be killed; however, this is a more definitive and less hacky solution, at the cost of changing how you think about *nix logins.
And yeah, this is a problem with Gnome that should be fixed. However, it is the kind of problem that is intermittent, does not occur for everyone, and comes back after a while (I didn't have this problem in some point update of Gnome 3.18, and it came back in 3.20). And I remember having this problem in KDE too, even before systemd existed. So yeah, an old and annoying problem.
Posted Jun 8, 2016 5:56 UTC (Wed)
by marcH (subscriber, #57642)
[Link]
Posted Jun 8, 2016 7:44 UTC (Wed)
by jaromil (guest, #97970)
[Link] (5 responses)
If we include and respect the forks and the freedom some of us have taken to opt out (https://lwn.net/Articles/685521/) then yes, I agree with you. But please keep such reasonable and calm tones also when you evaluate disagreement in action. Hooligans do not help anyone really.
Many of us do not trust how systemd reasons and takes decisions, for us this new tmux story is just the tip of an iceberg and, as professionals, we cannot double up our work at the discretion of such disruptive upstream changes and polarized visions of how GNU/Linux systems should work.
So, thanks for this fine round-up, may the dust in your camp settle and good luck trusting the systemd developer team. I simply don't trust them and believe that fixing a default now and then won't solve the root of the problem.
ciao
Posted Jun 8, 2016 8:22 UTC (Wed)
by ovitters (guest, #27950)
[Link] (4 responses)
It is humorous that you're using your message to complain about how Devuan is not respect, while above quote is pretty lacking respect IMO. In the last article I pointed out how aggressive the Devuan mailing list is.
> If we include and respect the forks and the freedom some of us have taken to opt out
So you want to be included in on a systemd discussion, while you've opted out of systemd?!?
Posted Jun 8, 2016 8:35 UTC (Wed)
by jaromil (guest, #97970)
[Link] (3 responses)
> It is humorous that you're using your message to complain about how Devuan is not respect, while above quote is pretty lacking respect IMO.
I'm not entirely sure what you are trying to say in English language, however I am not lacking respect. I sincerely wish everyone going the systemd way good luck. Further in my message I state that I do not trust the systemd developers. I'm free to do so and that is not lack of respect, that is basic communication. You are free to not trust me. That's how networks of trust are made, actually.
I take the occasion to state that my message was not directed at Corbet, who has been very correct in including Devuan in the picture with his past article, for which he has even received immediate and very disrespectful criticism in response.
>> If we include and respect the forks and the freedom some of us have taken to opt out
I have the right to be in this forum as much as you do. I have opinions about systemd, I like to debate them and I'm even open to criticism.
Please note that our project, Devuan, has been among the first to struggle to avoid personal attacks and create a platform for civil debate and constructive action in the middle of a very unpleasant escalation of views about systemd. Please help me perceive that systemd is not made of young bullies and a crowd of hooligan supporters. Your approach now does nurture this perception in me.
ciao
Posted Jun 8, 2016 8:51 UTC (Wed)
by jaromil (guest, #97970)
[Link]
Posted Jun 8, 2016 13:04 UTC (Wed)
by corbet (editor, #1)
[Link] (1 responses)
Posted Jun 8, 2016 15:48 UTC (Wed)
by jaromil (guest, #97970)
[Link]
Posted Jun 8, 2016 8:01 UTC (Wed)
by peter-b (subscriber, #66996)
[Link] (7 responses)
Posted Jun 8, 2016 8:28 UTC (Wed)
by ovitters (guest, #27950)
[Link] (6 responses)
So current status is that people would notice. Then I can understand. Unfortunately various times maintainers reject anything to do with systemd. Leading to no other solution than to force things. Systemd developers seem to be a bit aggressive to start pushing things, but on the other hand, there's not been too many changes.
This is part of the user sessions, which Lennart talked+blogged about for many years.
Posted Jun 8, 2016 8:52 UTC (Wed)
by paulj (subscriber, #341)
[Link] (3 responses)
It's much easier to update the programmes you know want the new behaviour, than find all the programmes that don't want it. If you miss programmes using the first approach, they just continue with the behaviour they're already running with anyway. For the latter approach, you don't even know if you can find all such programmes - people and organisations have private and internal code. You can't just scan code in free software distro repositories and hope to catch everything.
Posted Jun 8, 2016 9:02 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (2 responses)
The old behaviour is only useful to very few programs (I always see screen, tmux, nohup mentioned), which could easily be fixed.
Posted Jun 8, 2016 14:29 UTC (Wed)
by paulj (subscriber, #341)
[Link] (1 responses)
Pray tell, how does the systemd change to just kill them outright help with that?
Posted Jun 8, 2016 15:33 UTC (Wed)
by anselm (subscriber, #2796)
[Link]
Programs that want to clean up after themselves must already trap SIGTERM (which is the signal that kill(1) and friends send by default). If their cleanup process fails such that they hang or loop rather than exit, or otherwise takes longer than systemd – or for that matter shutdown(8), which follows exactly the same approach as systemd – lets them have, they get bopped on the head with a SIGKILL. This is not exactly breaking new ground.
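The sequence described here can be sketched in a few lines of shell. This is a simplified illustration, not how systemd or shutdown(8) actually implement it; `stop_with_grace` is a hypothetical helper name and the grace period is counted in whole seconds.

```shell
#!/bin/sh
# Sketch of the "SIGTERM, wait, then SIGKILL" sequence.
# stop_with_grace PID SECONDS
#   returns 0 if the process went away after SIGTERM,
#   returns 1 if it had to be sent SIGKILL.
stop_with_grace() {
    pid="$1"
    grace="$2"
    kill -TERM "$pid" 2>/dev/null || return 0   # already gone
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # exited in time
        sleep 1
        waited=$((waited + 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0      # exited at the deadline
    kill -KILL "$pid" 2>/dev/null               # cannot be trapped
    return 1
}
```

A program that traps SIGTERM and finishes its cleanup within the grace period never sees the second signal; only one that hangs, loops, or ignores SIGTERM does.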
Posted Jun 8, 2016 11:20 UTC (Wed)
by nowster (subscriber, #67)
[Link]
Posted Jun 8, 2016 22:21 UTC (Wed)
by error27 (subscriber, #8346)
[Link]
That's not true. They made the change first then tried to fix tmux after everyone got enraged already. Read the tmux RFE from the article and look at the date on it. The tmux RFE was from May 27 and the Debian bug was filed on May 26.
That's the way it should have been done, but it's not the way it happened.
Posted Jun 8, 2016 8:29 UTC (Wed)
by NAR (subscriber, #1313)
[Link] (2 responses)
I wish they had the same attitude with pulseaudio and the ALSA bugs, I wish...
Posted Jun 8, 2016 17:42 UTC (Wed)
by kyrias (guest, #101770)
[Link]
Posted Jun 12, 2016 10:54 UTC (Sun)
by micka (subscriber, #38720)
[Link]
Posted Jun 8, 2016 8:35 UTC (Wed)
by ju3Ceemi (subscriber, #102464)
[Link] (2 responses)
You don't have many real people behind a single PC, most of the time.
That systemd feature is only "useful" for multiuser systems, but is evil for multi-uid systems.
Posted Jun 8, 2016 22:52 UTC (Wed)
by droundy (subscriber, #4559)
[Link] (1 responses)
Buggy code doesn't have to be malicious in order to cause problems, and you don't need a multi-user system to benefit from this change. True, you always reboot every time you log out, and maybe that is your habit if you're a Windows user. But I don't have that habit, and prefer for my computer to behave reliably and predictably, where I consider "doing the same thing each time I log in" as an aspect of reliable behavior.
Posted Jun 9, 2016 11:16 UTC (Thu)
by HenrikH (subscriber, #31152)
[Link]
Posted Jun 8, 2016 10:00 UTC (Wed)
by szbalint (guest, #95343)
[Link] (3 responses)
We've had ancient unix plumbing that was really starting to show its age and while there were some attempts to bring contemporary code into the whole lower layer between the kernel and userland (and system startup), it was not really a coordinated push towards something new. systemd entered that vacuum and pushed hard, solved some problems, made some things convenient to do and won market share with this approach.
The whole problem is that systemd is the php/mysql equivalent of init systems combined with a tendency to borg more and more components into its sphere of influence. Sure, plenty of people use php and mysql too and I don't mean to ignite old flamewars but it's still important to realise that people use software like that _despite_ their technological shortcomings, not because of their technological merits. PHP and MySQL both got bootstrapped into widespread use because they entered a market vacuum, were widely available, easy to start using and were heavily geared towards "keep going" no matter what. The fact that they belong to the "gentleman's C" or "barely adequate" level of software didn't really matter and still doesn't to some extent.
On the macro level I guess it doesn't matter* that much that systemd is making questionable design decisions, it's something that distributions and developers can sort of work around and mitigate for now, people can keep passing --without-stupid-decision-x and their ilk like with OpenSSL and gcc to some extent. What systemd is, is a lost opportunity / opportunity cost. It's not that it worsened the status quo much, it's that we're not building something that's clearly superior to both ancient unix plumbing and systemd, that would actually matter on the large scale and improve things.
*with the caveat that as systemd keeps borging stuff with a willingness to force change on software orthogonal to it and increasing the monolithic complexity, things might start breaking pretty badly
Lennart Poettering is I think the perfect example of someone just smart enough to dig himself into a deep well and with surefire convictions making him incapable of stopping. This current change about killing processes on logout cannot be reasonably justified from a security perspective, killing processes on logout doesn't make things more secure at all. That's a nonexistent security barrier even to the stupidest malicious actor. The fact is that if a user has access to a system, it means control over a huge range of things on that system regardless of whether that user is currently logged in or not. The security barrier is at the point when a user gets deleted, then it makes sense to kill their processes and audit as many resources as the user had access to.
Security is just an excuse in this case, misbehaving gnome processes that do not handle SIGHUP are the real reason for this change and this change is the direct equivalent of MySQL silently inserting '0000-00-00 00:00:00' as a date on invalid data, an attempt to brush problems under the carpet where it inevitably leads to more problems, instead of fixing it by rejecting that kind of thing at the source of the problem.
Posted Jun 8, 2016 11:22 UTC (Wed)
by niner (subscriber, #26151)
[Link] (2 responses)
However, what really will keep a superior solution from appearing, is if you never start building it.
Posted Jun 8, 2016 11:35 UTC (Wed)
by szbalint (guest, #95343)
[Link] (1 responses)
(sidenote: I don't think the "why don't you do it better, then?" response is reasonable. Recognising that a situation is bad and being able to fix it are two different things.)
Posted Jun 8, 2016 11:39 UTC (Wed)
by niner (subscriber, #26151)
[Link]
Posted Jun 8, 2016 12:07 UTC (Wed)
by rsidd (subscriber, #2582)
[Link] (14 responses)
Posted Jun 8, 2016 12:52 UTC (Wed)
by pizza (subscriber, #46)
[Link] (13 responses)
Exactly -- you don't expect processes to just sit around because you merely backgrounded them; you had to explicitly request this behavior, in advance (ie nohup/screen/etc) or you had no expectation of it remaining active when you logged out.
Heck, this was one of the first things I learned when I was but a wee lad some twenty years ago with my first exposure to Unix (SunOS) in an educational/scientific setting.
What will happen is that these traditional mechanisms will be extended to speak the right incantations to make sure nothing changes from an end-user's perspective -- except that if you don't explicitly request a process linger past logout, it *will* get killed, instead of proceeding in a schrodinger-ish state. Until then distros won't enable this feature by default.
Posted Jun 8, 2016 14:28 UTC (Wed)
by rsidd (subscriber, #2582)
[Link] (12 responses)
Like it or not (and I know Lennart P. doesn't like it), other unixen do exist, and many of the programs concerned are cross-platform. The most one can hope for is distro-level patching for some of the most popular ones.
Posted Jun 8, 2016 16:06 UTC (Wed)
by pizza (subscriber, #46)
[Link] (1 responses)
Oh, please. A proposal was rejected after determining there was a better way to achieve the desired goals.
> Which distro has promised this, so far?
From TFA, Debian, Arch, and Gentoo?
And, although the official decision is still pending, Fedora will likely follow.
Posted Jun 8, 2016 20:13 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
Posted Jun 8, 2016 18:48 UTC (Wed)
by alankila (guest, #47141)
[Link]
Posted Jun 10, 2016 14:48 UTC (Fri)
by paulj (subscriber, #341)
[Link] (8 responses)
This is a fairly major Unix environment break. I'd have thought it should be via opt-in APIs - not opt-out.
Posted Jun 10, 2016 15:29 UTC (Fri)
by rsidd (subscriber, #2582)
[Link] (7 responses)
I believe the answer is "screw that, we care about the Gnome desktop".
Posted Jun 10, 2016 15:55 UTC (Fri)
by pizza (subscriber, #46)
[Link] (6 responses)
No, the answer is "no matter how long we wait for folks to voluntarily update their stuff, it will never be long enough."
Meanwhile the rest of the world moves on.
Posted Jun 10, 2016 16:12 UTC (Fri)
by rsidd (subscriber, #2582)
[Link] (5 responses)
I'd be more sympathetic if Gnome had produced a desktop that was widely regarded as awesome. Instead, in something like 17 years now, Gnome has produced Gnome1 which was a bad knockoff of KDE/CDE; Gnome2 that was a quite useful thing for some people; Gnome3 that threw out Gnome2 in favour of chasing some mythical unicorn. In that time, diehard Unix users moved to Linux and the BSDs, but many, eventually, to OS X. And desktop users of all persuasions (including Windows) became a minority in the face of the mobile onslaught. Where Linux dominates, but in a form (Android) that has little to do with Unix.
If you want to chase the market, just adopt Android on the desktop already. If you want to go after people who like the Unix way, stop pulling stunts like this that make decades-old habits suddenly stop working. Really.
Posted Jun 10, 2016 17:32 UTC (Fri)
by pizza (subscriber, #46)
[Link] (4 responses)
That only demonstrates that said "diehard unix users" actually prioritize "JustWorks" or even "OOooshiny" far higher than "respecting unix conventions".
Posted Jun 10, 2016 19:30 UTC (Fri)
by flussence (guest, #85566)
[Link] (3 responses)
Posted Jun 10, 2016 20:51 UTC (Fri)
by pizza (subscriber, #46)
[Link] (2 responses)
Oh? By any quantifiable quality, things are better now than they've ever been.
As for touchy-feely stuff like "respecting UNIX conventions". Twenty years ago in my academic days, I was taught that the overriding principle for UNIX was the simplicity of implementation. That was prioritized over everything else, including correctness, performance, and ease-of-use (for both developers and end-users)
Oh, for sake of discussion, I'll refer to this list:
http://c2.com/cgi/wiki?UnixDesignPhilosophy
I'll note that even UNIXen at the time that list was published didn't really adhere to UNIX conventions all that well. Neither does the entire notion of GUIs.
Posted Jun 11, 2016 21:25 UTC (Sat)
by flussence (guest, #85566)
[Link]
Unix (the abstract ideal people usually talk about) shouldn't be confused with UNIX (sometimes spelled with an ®). I think most here would see ignoring the latter's conventions as a feature, on the other hand I agree with and often design systems according to that list. There are even a few points in there I didn't know about, but was doing anyway...
Posted Jun 13, 2016 13:17 UTC (Mon)
by paulj (subscriber, #341)
[Link]
If I really wanted to shake up my desktop, I'd probably just go for Android as my desktop, and use some rootless Xserver for whatever older apps I needed. I'd have tried that already, but I can't just 'yum install android-desktop' on my Fedora desktop and laptop.
Posted Jun 8, 2016 12:19 UTC (Wed)
by itvirta (guest, #49997)
[Link] (1 responses)
Dear Editor, is this on purpose?
Posted Jun 8, 2016 13:06 UTC (Wed)
by corbet (editor, #1)
[Link]
The working title was "the systemd process-killing apocalypse," but I decided it needed to be a bit more restrained when the article went out.
Posted Jun 8, 2016 13:52 UTC (Wed)
by kh (guest, #19413)
[Link] (1 responses)
Which seems fine, unless you suspect they will unilaterally remove those two knobs shortly after the enterprise distros adopt this.
Posted Jun 8, 2016 14:02 UTC (Wed)
by pizza (subscriber, #46)
[Link]
So, on what are you basing this suspicion? When has systemd ever unilaterally removed something that was part of an external API or configuration option?
There are many things reasonable people can dislike about systemd, but if you're going to claim they're operating in bad faith, you're going to have to back up your assertion with *something*
Posted Jun 8, 2016 15:35 UTC (Wed)
by jberkus (guest, #55561)
[Link] (6 responses)
The problem I have with this change ... as with a lot of systemd's foibles ... is that it's motivated by the desktop use-case, primarily Gnome. Thing is, there are a lot more Linux systems running on servers than on desktops, and a lot of the existing Linux desktops don't run Gnome. Yet all Linux users are being asked to change how their applications work because Gnome can't control its processes. That's equivalent to forcing everyone to wear face masks because a few people have bad breath. Get those people some Scope and stop bothering the rest of us.
I don't buy the security argument for two reasons. First, this is a system-wide parameter which doesn't let admins discriminate among users, making it practically useless. Second, if it was really a security issue, it belongs in SELinux, where we can have real policies for persistent sessions instead of this ad-hoc BS. So I believe that the security argument is just a smokescreen for "we can't fix Gnome so here's a big rug to cover the issues."
Posted Jun 8, 2016 16:41 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link] (2 responses)
This is definitely not true. You can find plenty of examples, even in this comment thread, of programs that don't exit cleanly. I do think distributions are doing the right thing in disabling this feature for now, but it is definitely not a desktop-only problem.
>Second, if it was really a security issue, it belongs in SELinux
That might be true if SELinux was more widely adopted. Unfortunately, that isn't the case.
Posted Jun 8, 2016 17:05 UTC (Wed)
by jberkus (guest, #55561)
[Link] (1 responses)
> That might be true if SELinux was more widely adopted. Unfortunately, that isn't the case.
That's a "but the light's better over here" argument. Poettering is pushing this because it's "the right thing to do". But the *right* thing to do is for it to be in SELinux, where there can be actual admin policies around process-killing instead of just an on/off switch. So we should either do the expedient thing (which is to leave the defaults where they are) or the right thing (which is to put this in SELinux with hooks in systemd to support it). This change is neither right nor expedient.
Posted Jun 8, 2016 17:12 UTC (Wed)
by rahulsundaram (subscriber, #21946)
[Link]
> But the *right* thing to do is for it to be in SELinux
I don't see why SELinux is the obviously right place to do it.
> where there can be actual admin policies around process-killing instead of just an on/off switch
It isn't just a switch in systemd. You can have admin policies in polkit.
Posted Jun 8, 2016 17:11 UTC (Wed)
by mjg59 (subscriber, #23239)
[Link]
Posted Jun 8, 2016 20:43 UTC (Wed)
by barryascott (subscriber, #80640)
[Link]
KillOnlyUsers=, KillExcludeUsers=
These settings take space-separated lists of usernames that override the KillUserProcesses= setting. A user name may be added to KillExcludeUsers= to exclude the processes in the session scopes of that user from being killed even if KillUserProcesses=yes is set. If KillExcludeUsers= is not set, the "root" user is excluded by default. KillExcludeUsers= may be set to an empty value to override this default. If a user is not excluded, KillOnlyUsers= is checked next. If this setting is specified, only the session scopes of those users will be killed. Otherwise, users are subject to the KillUserProcesses=yes setting.
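The precedence described in that manual excerpt can be sketched in a few lines of Python. This is only one reading of the quoted text, not logind's actual code; the function name `shall_kill` and its parameters are made up for illustration, and `kill_exclude_users=None` is an assumed way to model "the setting is absent".

```python
def shall_kill(user, kill_user_processes, kill_only_users=(), kill_exclude_users=None):
    """Decide whether a user's session scopes would be killed on logout.

    Hypothetical model of the quoted manual text. kill_exclude_users=None
    stands for "KillExcludeUsers= is not set", in which case root is
    excluded by default; an empty list stands for an explicit empty value.
    """
    if kill_exclude_users is None:
        kill_exclude_users = ["root"]
    # Excluded users are never killed, whatever the other settings say.
    if user in kill_exclude_users:
        return False
    # If KillOnlyUsers= is specified, it alone decides.
    if kill_only_users:
        return user in kill_only_users
    # Otherwise the global KillUserProcesses= setting applies.
    return kill_user_processes
```

For example, `shall_kill("root", True)` is false under this reading, while `shall_kill("root", True, kill_exclude_users=[])` is true because the default exclusion has been overridden.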
Posted Jun 9, 2016 4:36 UTC (Thu)
by marcH (subscriber, #57642)
[Link]
Nice one, thank you.
In an ideal world, upstream would come pre-configured to ease the work of the majority of its users. In the real-world... thank God we're spoilt with our choice of Linux distributions and all the hard and ungrateful work they're doing.
Posted Jun 8, 2016 15:41 UTC (Wed)
by ccchips (subscriber, #3222)
[Link]
I am sorry to have to say this, but this change doesn't sit well with me. If it goes through, it should be EXTREMELY EASY for a person who has installed Linux to fix, and the instructions should be written in big bold letters by the distributor.
If you Linux developers don't want to lose users because of frustrating problems caused by one guy, I suggest you remember that some of us are still out here boosting Linux, and often those who try it aren't all that happy when they get frustrated. There are a lot of packages that don't "just work" on Linux, and that ain't good. The distributors had better make sure software continues to work as expected if you don't want things to get worse.
Posted Jun 8, 2016 16:02 UTC (Wed)
by joey (guest, #328)
[Link] (1 responses)
I hope that at least one of screen or tmux gets support for systemd's API, because I'd like to re-enable KillUserProcesses on my servers eventually.
Posted Jun 8, 2016 16:08 UTC (Wed)
by joey (guest, #328)
[Link]
Posted Jun 8, 2016 16:05 UTC (Wed)
by mezcalero (subscriber, #45103)
[Link] (8 responses)
Lennart
Posted Jun 8, 2016 20:46 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link] (5 responses)
Posted Jun 9, 2016 0:54 UTC (Thu)
by anselm (subscriber, #2796)
[Link] (4 responses)
Daemons, by definition, have no business running in user sessions. SIGHUP is used to prod daemons to reread their configuration exactly because they don't have a controlling terminal and are therefore immune against the original use of SIGHUP, namely their session going away – this means that, for a daemon, the signal is available to be used for this.
Posted Jun 9, 2016 10:04 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link] (3 responses)
Posted Jun 9, 2016 11:21 UTC (Thu)
by HenrikH (subscriber, #31152)
[Link] (2 responses)
Posted Jun 9, 2016 11:24 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link] (1 responses)
Posted Jun 9, 2016 19:52 UTC (Thu)
by HenrikH (subscriber, #31152)
[Link]
Posted Jun 9, 2016 10:48 UTC (Thu)
by diegor (subscriber, #1967)
[Link] (1 responses)
Another question: what happens if the process just ignores SIGTERM?
Posted Jun 9, 2016 11:22 UTC (Thu)
by HenrikH (subscriber, #31152)
[Link]
Posted Jun 8, 2016 16:15 UTC (Wed)
by pizza (subscriber, #46)
[Link]
Ironically, this creates a functional regression for those that had already chosen to utilize this feature. It makes far more sense to just flip the default back, instead of disabling this feature altogether.
Posted Jun 8, 2016 18:13 UTC (Wed)
by cyperpunks (subscriber, #39406)
[Link] (1 responses)
will systemd changes destroy such usage?
Posted Jun 8, 2016 19:37 UTC (Wed)
by dtlin (subscriber, #36537)
[Link]
Posted Jun 8, 2016 19:36 UTC (Wed)
by flewellyn (subscriber, #5047)
[Link] (2 responses)
Perhaps a program that lets you start a process in its own control group? Managing control groups to make processes that are explicitly "persistent" by definition?
Posted Jun 8, 2016 19:56 UTC (Wed)
by smcv (subscriber, #53363)
[Link] (1 responses)
That would be systemd-run(1), added in 2013.
Posted Jun 8, 2016 21:35 UTC (Wed)
by flewellyn (subscriber, #5047)
[Link]
Posted Jun 8, 2016 22:31 UTC (Wed)
by flussence (guest, #85566)
[Link] (5 responses)
This to me looks like a workaround for Someone Else's badly-engineered software not exiting in a timely manner. I'm not even sure who's to blame for that software, but it must be pretty awful — and widespread — to elicit this kind of nuclear response. It sets an unpleasant precedent in that people writing long-lived processes, who've done nothing wrong, now have an extra non-standard codepath to deal with alongside Android, Windows and OS X.
I thought systemd's whole mission was about pulling up the weeds, not making excuses to keep them there. If the upstream at fault won't cooperate and fix their bugs, why not replace them with code owned by someone less obstinate? That seems objectively better than fracturing the Linux ecosystem to appease them.
Posted Jun 9, 2016 6:05 UTC (Thu)
by matthias (subscriber, #94967)
[Link]
Most software will not need an extra codepath.
I have not seen any other program so far that is affected. I do not claim that there are none, but there seem to be only very few. All the people calling this a big problem have only mentioned these few examples. If this really is a problem, then I would like to see a few more examples.
Posted Jun 9, 2016 10:11 UTC (Thu)
by dunlapg (guest, #57764)
[Link]
http://lwn.net/Articles/690299/
If it's accurate, it makes it much less justifiable.
Posted Jun 9, 2016 14:43 UTC (Thu)
by ksandstr (guest, #60862)
[Link] (2 responses)
Per Hanlon's razor, either systemd is again breaking well-working, well-defined userspace (i.e. screen, tmux, nohup'd processes, etc.) just for the dependency-tree influence it yields, or they simply don't know any better than to do a stupid thing and then imperiously double down on it.
Posted Jun 9, 2016 15:18 UTC (Thu)
by johannbg (guest, #65743)
[Link] (1 responses)
Well defined by who?
It's always the case that the misbehaving programs are the ones that need fixing, so first establish whether the programs you listed are technically correctly implemented or are technically misbehaving.
People's "feelings", "history", "workflows", "religion" and whatnot are entirely irrelevant in these types of matters.
So answer this: is systemd technically doing the correct thing?
If the answer is yes, it's doing the technically right thing, then what breaks should be fixed elsewhere.
It's as simple as that.
Posted Jun 9, 2016 17:00 UTC (Thu)
by ksandstr (guest, #60862)
[Link]
By practice across multiple independent implementations of Unix. Call it a silent standard.
Posted Jun 9, 2016 4:38 UTC (Thu)
by jcm (subscriber, #18262)
[Link]
Posted Jun 10, 2016 16:42 UTC (Fri)
by rsidd (subscriber, #2582)
[Link] (8 responses)
This is an utter myth.
Yes, the modern Linux desktop "just works" but that's entirely due to hardware support.
Winmodems were a nightmare. But nobody uses modems anymore.
CD/RW were a nightmare and were the only rewriteable mass storage media. For over 10 years now everyone uses USB mass storage, all devices follow the same protocol, that problem simply went away.
Similarly with webcams, when they all started following a standard USB interface.
Similarly with pretty much all the hardware that one used to fight with. The progress is because of standardisation on hardware interfaces (across various versions of Windows and Mac).
I am not at all discounting the efforts of the kernel developers (and, in some cases, hardware manufacturers) in making sure all these things work perfectly. It is a huge and impressive effort. If it weren't, the BSDs would be competitive with Linux on the desktop and laptop. That they aren't is due entirely to hardware support. (And, ironically, the BSDs supported USB before Linux did. Yet, from my experience, FreeBSD continued to crash reliably when using USB media, well into the 2000s.)
But: NONE OF THIS PROGRESS CAN BE CREDITED TO BREAKING LONGSTANDING UNIX CONVENTIONS.
Many of us were used to Ctrl-Alt-Bksp killing the X session. It was convenient, it is an unlikely key combination to hit accidentally. But some people decided it was dangerous. Well, ok.
Many of us were used to Ctrl-Alt-Function keys giving you consoles. That got disabled too for reasons I don't understand. Well, ok.
But the acquiescence is, I think, more from "this is not worth fighting over" or "too few people really care about this", not so much from "this is an intelligent decision and we should go with it".
When you start killing processes on logout, it's a whole different matter. It affects huge numbers of users who have been brought up on the "Unix way". Not just convenience, but actual work. You risk losing days or weeks of work because you forgot that the powers-that-be changed how basic practices work.
Demanding that commands like nohup, screen, tmux and numerous in-house applications adapt to the new reality, because this is what Gnome developers have decided that Gnome users want, is unbelievable arrogance.
Arguing that all these programs can be trivially fixed (how trivially?) misses the point.
The BSDs have a term, POLA ("principle of least astonishment"), that serves as a policy principle for this sort of thing. Linus has something similar for API breakage in the kernel. Even Microsoft works incredibly hard to ensure compatibility across Windows versions.
Destroying decades-old practice in this manner is complete disrespect for the longest-standing and most loyal users. There is no other way to put it.
Posted Jun 10, 2016 17:28 UTC (Fri)
by pizza (subscriber, #46)
[Link]
And a great deal of software written on top of that hardware support to automagically configure and utilize said hardware.
> CD/RW were a nightmare and were the only rewriteable mass storage media.
Nightmare how? At worst they were about the same as using them under Windows.
> Similarly with webcams, when they all started following a standard USB interface.
Webcams remain an ongoing source of joy, because even within that "standard" there's plenty of rope for manufacturers to hang themselves with, and the list of workarounds is a mile long at this point.
> But: NONE OF THIS PROGRESS CAN BE CREDITED TO BREAKING LONGSTANDING UNIX CONVENTIONS.
No, when it came to actual hardware, drivers, and low-level OS manipulation, there were never any UNIX conventions. Every UNIX had its own way of doing those things. And still does. (Even the "everything is a file" abstraction wasn't ever true)
Meanwhile, beyond POSIX, there weren't any meaningful conventions for building higher-order systems. Sessions? IPC? Everyone had their own mechanisms, none compatible. Even from a GUI perspective, beyond raw xlib, you had nothing you could count on being universal.
Posted Jun 15, 2016 3:00 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (6 responses)
Huh? When was this disabled? Have a reference?
Posted Jun 15, 2016 6:35 UTC (Wed)
by jrigg (guest, #30848)
[Link] (5 responses)
Posted Jun 15, 2016 6:53 UTC (Wed)
by jrigg (guest, #30848)
[Link]
Looks like you can re-enable additional ttys by changing NAutoVTs= in logind.conf .
Posted Jun 15, 2016 8:16 UTC (Wed)
by micka (subscriber, #38720)
[Link] (3 responses)
Posted Jun 15, 2016 8:57 UTC (Wed)
by jrigg (guest, #30848)
[Link] (2 responses)
Posted Jun 16, 2016 7:30 UTC (Thu)
by geek (guest, #45074)
[Link]
Posted Jun 28, 2016 23:14 UTC (Tue)
by mcortese (guest, #52099)
[Link] (1 responses)
While I don't have strong feelings for or against this change in systemd, I have something to say about SIGHUP, nohup & disown that many keep promoting in the comments.
In the glory days of UNIX, I would log in to a shell. When starting a user process, the choices were either run until I shut down the shell (i.e. I log out) or keep running forever. The shell would send a SIGHUP on exit, so the choices were actually either obey or ignore such signal. Since 'obey' was the default, 'ignore' had to be specified: nohup was all I needed, back then.
Today I can have several shells open inside one graphical session, and several (graphical or textual) sessions open at once. The behavior I might require from a user process varies from 'run until I shut down the shell', to 'run until I close this session', to 'run until I close all sessions' to 'run forever'. I can't express this variety with nohup. What I need is a replacement for nohup where I can specify the exact 'scope' I want to give to a process. (Ideally, that would be paralleled by new signals besides SIGHUP that express the variety of possible events with the same granularity, but I'm pragmatic enough to understand this will never happen!)
Now, I don't know if systemd-run is the "nohup on steroids" I envision, nor if KillUserProcesses can make up for the missing signals, but pretending that nohup is still the best tool we can hope for is disingenuous!
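The traditional mechanism this comment describes is small enough to sketch. The Python below is not the real nohup (which also redirects output); it only shows the core idea of setting SIGHUP to "ignore" in the child before it executes the command, a disposition that is inherited across exec. The helper name `start_ignoring_sighup` is hypothetical.

```python
import signal
import subprocess

def start_ignoring_sighup(argv):
    """Start argv as a child process that ignores SIGHUP.

    The preexec_fn hook runs in the child between fork and exec, so the
    "ignore" disposition is in place before the command starts.
    """
    return subprocess.Popen(
        argv,
        preexec_fn=lambda: signal.signal(signal.SIGHUP, signal.SIG_IGN),
    )
```

Note what this does not cover: it expresses only "obey" versus "ignore" for one signal, which is the comment's point. It does nothing against the SIGTERM and SIGKILL that the article says logind sends.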
Posted Jun 29, 2016 13:57 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
And I really love not having to run bolt-on process reapers.
a.k.a. “it was discussed, but not with those who will be most affected”.
to make the transition. It's very disappointing that they've ignored the lessons learned over the past decades on how things like this should be handled.
to make the transition.
So, like, until 2056?
Kernel developers do a lot of stuff that allows breaking existing code. Kernel build option defaults are routinely flipped to change some parts of user-visible behavior.
Wol
Engineering always involves a set of tradeoffs. It is almost certain
that any time engineering choices need to be made, there will be
options that appeal to some people, but are not appealing to some
others. In determining consensus, the key is to separate those
choices that are simply unappealing from those that are truly
problematic. If at the end of the discussion some people have not
gotten the choice that they prefer, but they have become convinced
that the chosen solution is acceptable, albeit less appealing, they
have still come to consensus. Consensus doesn't require that
everyone is happy and agrees that the chosen solution is the best
one.
I.e. previously each login had its own D-Bus session (session bus), but now there is one D-Bus "session" per user (user bus).
And I suppose not all such processes use dbus.
And the system damn well has to follow my wishes.
A user-mode process then going and quietly killing the process against my explicit wish is a really bad thing!
They need to honor things like nohup.
They should send SIGHUP, or not send anything at all.
[Unit]
Description=screen

[Service]
Type=forking
Restart=always
ExecStart=/usr/bin/screen -dmS %I
ExecStop=/usr/bin/screen -S %I -X quit

[Install]
WantedBy=default.target
DefaultInstance=autoscreen
bash$ systemd-cgls # Amended for brevity
Control group /:
-.slice
└─user.slice
..├─user-XXXX.slice
..│..├─user@XXXX.service
..│..│..└─screen.slice
..│..│......└─screen@autoscreen.service
..│..│..........├─27015 /usr/bin/SCREEN -dmS autoscreen
..│..│..........└─27016 /bin/bash
..│..├─session-2.scope
..│..│..├─ 2216 /bin/sh /usr/bin/startkde
bash$ screen -ls
There is a screen on:
........27015.autoscreen........(06/10/2016 10:31:12 AM)........(Detached)
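The point of the listing above is where screen ends up in the control-group tree: under user@XXXX.service rather than inside session-2.scope. A rough way to tell the two apart from a cgroup path (as found in /proc/PID/cgroup) is sketched below in Python. The function name and the example UID 1000 are assumptions for illustration, and the path layout is taken only from the tree shown here.

```python
import re

def classify_cgroup(path):
    """Classify a systemd cgroup path.

    Returns ("session", id) for a login session scope,
    ("user-service", unit) for a unit under the per-user manager,
    or ("other", None) for anything else.
    """
    m = re.search(r"/session-([^/]+)\.scope(/|$)", path)
    if m:
        return ("session", m.group(1))
    m = re.search(r"/user@[^/]+\.service/(?:.*/)?([^/]+\.service)$", path)
    if m:
        return ("user-service", m.group(1))
    return ("other", None)
```

With the layout above, a path ending in `session-2.scope` is classified as session "2", while one ending in `user@1000.service/screen.slice/screen@autoscreen.service` is classified as the user service `screen@autoscreen.service`.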
Your awareness is incorrect. Systemd is committed to backwards compatibility, so my units from 2012 still work just fine.
Distros can (and will) change the default back to "don't kill" until other packages are updated as needed.
Wol
Wol
unless you are actually purging the user from the system ("userdel" style).
If the server runs under its own account, then it doesn't matter if the user is logged in or not. The user could start it while staying logged in, then pretend nothing nefarious is happening (ssh is idle, terminal is idle, ...) while doing his evil things using the server.
“This process is no more. It has ceased to be. It's expired and gone to meet its maker. This is a late process. It's a stiff. Bereft of life, it rests in peace. If its parent had reaped its exit status, it wouldn't be occupying a process slot, but pushing up the daisies instead. It's rung down the curtain and joined the choir invisible. This is an ex-process.”
Tasks: 304 total, 1 running, 303 sleeping, 0 stopped, 0 dead_parrot
If by "they" you mean the upstream systemd community, it followed the standard announcement procedure, as most projects do, and that even got picked up and announced here on LWN [1].
2. https://lists.freedesktop.org/archives/systemd-devel/2016...
> Even if you patch the binary if you don't shutdown the [running processes] you just exposed your key to the world.
that needs to be done during the upgrade, not during some arbitrary point in the future, like someone's logout.
There are also side-channel attacks that can break (some) cryptography. Or your background process can keep the microphone open and thus recover what's being typed.
Lots of possibilities for the enterprising miscreant.
I figure just A) add an option to systemctl/service files, and B) make it possible for admins to disable the linger feature completely, and then you have the problem largely solved. Distributions can wrap 'tmux' or 'screen' in a shell script to retain the old behavior if they want. Nobody will have to make significant changes to their programs to fit into systemd either.
import os
import signal
import time

def pam_sm_open_session(pamh, flags, argv):
    return pamh.PAM_SUCCESS

def might_fail(func, default=None):
    # Call func(); a process may vanish between listing /proc and reading it.
    try:
        return func()
    except EnvironmentError:
        return default

def get_proc(pid, name):
    # Read /proc/<pid>/<name> as text.
    with open("/proc/%s/%s" % (pid, name)) as f:
        return f.read()

def session_pids():
    # PIDs other than our own that share our session ID, skipping zombies.
    my_pid = str(os.getpid())
    my_session = get_proc(my_pid, "sessionid")
    return (
        int(pid) for pid in os.listdir("/proc")
        if pid[0] >= '0' and pid[0] <= '9' and pid != my_pid
        if might_fail(lambda: get_proc(pid, "sessionid")) == my_session
        if "Z (zombie)" not in might_fail(lambda: get_proc(pid, "status"), ""))

def kill_all(sig):
    for pid in session_pids():
        might_fail(lambda: os.kill(pid, sig))

def pam_sm_close_session(pamh, flags, argv):
    # Send SIGTERM, wait up to five seconds, then SIGKILL whatever is left.
    kill_all(signal.SIGTERM)
    for i in range(50):
        if not any(session_pids()):
            return pamh.PAM_SUCCESS
        time.sleep(0.1)
    kill_all(signal.SIGKILL)
    return pamh.PAM_SUCCESS
SIGHUP for "session has gone away", not SIGTERM/SIGKILL
1. do nothing. kernel will clean up
2. intercept because process wants to survive
3. intercept to do a clean exit (save some data, etc.)
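The third option in the list above (intercepting the signal to exit cleanly) might look like the following Python sketch. It registers the same handler for SIGHUP and SIGTERM, so that either one triggers the same cleanup. The helper name `install_cleanup` is made up, and exiting with 128 plus the signal number is just a borrowed shell convention, not something the thread prescribes.

```python
import signal

def install_cleanup(cleanup):
    """Run cleanup() and exit when either SIGHUP or SIGTERM arrives.

    Returns the installed handler so callers can inspect or invoke it.
    """
    def handler(signum, frame):
        cleanup()  # e.g. save data before going away
        raise SystemExit(128 + signum)

    for sig in (signal.SIGHUP, signal.SIGTERM):
        signal.signal(sig, handler)
    return handler
```

A handler like this cannot help against SIGKILL, which is never delivered to the process's own code.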
- programs that start some sort of long-living sessions (e.g., screen, tmux) should really start sessions on their own. Starting a PAM session for screen has also the advantage that the session management can take care of not terminating processes like ssh-agent while they are still needed for the program inside the screen.
process (systemd-run should work, users need to get used to this).
> True. But if you're doing cleanup on exit, you're also doing it for a SIGTERM at the very least. It will be identical code and if a SIGHUP freezes it's very likely a SIGTERM will do the same thing.
> Then there is a bug.
> Yep, we have it. It's called signal(SIGHUP, SIG_IGN).
I expect distros to accept the change, once the few problematic programs have fixes. Most users will not change it back, once screen and tmux work and the manual says use systemd-run for background processes instead of nohup, as they will not encounter any problems.
>
> Therefore systemd sends a SIGKILL after some time. This is meant to bring down the processes where SIGTERM did not work. This is the same mechanism that shutdown has used for decades.
>I think you missed the point. The point is that if there is a bug in the SIGHUP handling, there is also most likely a bug in the SIGTERM handling. Sending a SIGKILL does not fix the problem. It hides it. Assuming the application is trapping both of these for a reason such as saving data, then fixing the bug is the correct path - not hiding it.
> Only when it's been co-opted for other purposes - like reloading the configuration in system daemons.
Or a pty going away because an X terminal is closed. Not every X terminal is a session on its own. Semantics have changed in time.
I just tested this starting some process with su as a different user (I temporarily added my test user to the wheel group). The process was terminated, because it was in the same cgroup. This case should not be that important anyway, as normal users are not allowed to start processes as different users.
I agree, but the programs that need changes are very few. For most cases, background processes are started as daemons anyway. I always see screen and tmux mentioned and their session management is broken anyway. Helper programs like ssh-agent get terminated when the user logs out, even when they are needed inside the screen. Registering a session with PAM would be cleaner anyway.
Wol
Why? Just wrap/replace nohup with a script that simply invokes systemd-run.
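As a rough illustration of that suggestion, a wrapper could simply prepend a systemd-run invocation to the command it is given. The Python sketch below only builds the argument list. The function name is hypothetical, and the choice of `--user --scope` is an assumption about what such a wrapper would pass; whether the resulting scope actually survives logout depends on local configuration such as lingering.

```python
def systemd_run_argv(command):
    """Build a systemd-run command line that starts `command` in its own
    transient scope under the per-user service manager.

    Hypothetical helper: it only constructs the argument list and leaves
    execution (for example via os.execvp) to the caller.
    """
    return ["systemd-run", "--user", "--scope"] + list(command)
```

For example, `systemd_run_argv(["sleep", "60"])` yields `["systemd-run", "--user", "--scope", "sleep", "60"]`, which a nohup replacement script could then exec.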
What breakage does this actually fix?
> https://github.com/systemd/systemd/commit/97e5530cf20
> referenced two bugreports:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=94508
> https://github.com/systemd/systemd/issues/2900
sshd orphans processes when no pty allocated https://bugzilla.mindrot.org/show_bug.cgi?id=396
My message is directed to you and all the other hooligans. I hope you understand it, but I'm noticing it may just be inflammatory, so I'll ignore further manipulations in this thread.
>So you want to be included in on a systemd discussion, while you've opted out of systemd?!?
You did so while disregarding the answers that official Devuan developers have taken the time to give to your questions, but went quoting answers from other random posters who are not affiliated to the mailinglist.
Reference: https://lwn.net/Articles/685750/
MVG
If you are going to talk about respect and avoiding personal attacks, calling others "hooligans" is probably just not the best way to go about it. Can we please keep the name-calling out of the discussion?
Please
Please note in my first reply I call 'hooligans' people acting as such on both camps.
ciao
is easy to say that such programs should simply be fixed, but, in the real world, sometimes one has to stop playing whack-a-mole and just pave the field instead.
Paper over bugs in other tools
Multi-user system vs multi-uid system
However, in the real world, Linux is multi-uid: on my PC, I am the one true user.
On many servers, it is the same: you have one real user (say: adminsys).
As the only user on my PC, that PC is mine. When I run some code in the background, I expect it to keep running. Why on earth should my PC do anything against me? It is mine; you shall not do anything against my will.
Distributors ponder a systemd change
Please don't shaft scientific computing users!
What will happen is that these traditional mechanisms will be extended to speak the right incantations to make sure nothing changes from an end-user's perspective
Looking at the tmux bug report inspires little confidence in this prediction. Nor does the quote from the article: "So it may be some time before even the programs that are explicitly intended to run after logout are able to work transparently in this manner." Knowing corbet's style of writing, this is likely an understatement.
Until then distros won't enable this feature by default.
Which distro has promised this, so far?
But what change is it
say anything about what kind of a change it is that is being pondered and discussed.
I ask, because it reminds me of the very bad habits I've seen in some mainstream news sources.
Not on purpose, but I'm not sure it was a bad thing either? The article explains the situation quickly enough, I think.
For now
Distributors ponder a systemd change
So no, it's not just some broken gnome thing, it's things like alpine that have had decades to get this right and have instead gotten it wrong.
Distributors ponder a systemd change
$ at -m -f run.sh now
I don't even have at on either my personal Arch Linux or the CentOS 7 I use for work.
Distributors ponder a systemd change
But as long as atd is running, it should be fine. Looks like upstream has a systemd service file for atd too.
Distributors ponder a systemd change
- Daemons should be started as daemons. Systemd will not kill them.
- screen/tmux and the like should register their sessions with PAM. PAM will take care that systemd does not kill them. This should be done anyway to manage processes such as ssh-agent. Without proper session management they will vanish when the user is logged out even if they are still needed from inside screen.
- Instead of calling nohup you will need to use systemd-run.
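To make the last point concrete, here is a rough sketch of what a nohup replacement might look like. It assumes a systemd user instance is running for the account, and that systemd-run's --user and --scope options place the command in its own transient scope outside the login session. The names run_detached and DRY_RUN are hypothetical, invented here only for illustration. Depending on the configuration, lingering may also need to be enabled for the user (loginctl enable-linger) so the user manager outlives the last session.

```shell
#!/bin/sh
# Hypothetical helper, not part of systemd: wraps systemd-run so a command
# lands in its own transient scope instead of the login session's scope.
# Assumption: a per-user systemd instance is available for the caller.
run_detached() {
    if [ -n "${DRY_RUN:-}" ]; then
        # Dry run: only print the command line that would be executed.
        printf '%s\n' "systemd-run --user --scope $*"
    else
        # --user  : talk to the calling user's service manager
        # --scope : run the command directly, tracked as a transient scope unit
        systemd-run --user --scope "$@"
    fi
}
```

With that in place, "run_detached ./run.sh &" would stand in, roughly, for "nohup ./run.sh &"; setting DRY_RUN=1 shows the command line without executing anything.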
Distributors ponder a systemd change
If the answer is no, it's not doing the technically correct thing, then what breaks should be fixed in systemd.
Distributors ponder a systemd change
Rant
All have consoles on Ctrl+Alt+Fn. I don't remember having changed a config setting.
Distributors ponder a systemd change