Developers split over split-lock detection
Split locks are, in essence, a type of misaligned memory access — something that the x86 architecture has always been relatively willing to forgive. But there is a difference: while a normal unaligned operation will slow down the process performing that operation, split locks will slow down the entire system. The 1,000-cycle delay may be particularly painful on realtime systems, but split locks can be used as an effective denial-of-service attack on any system. There is little disagreement among developers that their use should not be allowed on most systems.
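As a purely illustrative sketch (nothing here comes from the patch set; the buffer layout and names are invented), this is the kind of user-space code that produces a split lock on x86: an atomic read-modify-write whose operand straddles a cache-line boundary, assumed here to be 64 bytes:

    /* Build with: gcc -std=c11 -O2 splitlock.c */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* A 64-byte-aligned buffer; point two bytes before the end of the
           first cache line so that the 4-byte value spans two lines. */
        char *buf = aligned_alloc(64, 128);
        uint32_t *split = (uint32_t *)(buf + 62);

        *split = 0;
        /* The compiler emits a "lock add" here; at run time the operand is
           misaligned and crosses the line, so the processor falls back to a
           bus lock that stalls every core in the system. */
        __atomic_fetch_add(split, 1, __ATOMIC_SEQ_CST);

        printf("value: %u\n", *split);
        free(buf);
        return 0;
    }

With split-lock detection enabled, the __atomic_fetch_add() line above is what would draw a trap rather than a silent, system-wide stall.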
Recent Intel processors can be configured to generate a trap when a split lock is attempted, allowing the operating system to decide whether to allow the operation to continue. Fenghua Yu first posted a patch set enabling this trap in May 2018; LWN covered this work one year later. While many things have changed in this patch set, the basic idea has remained constant: when this feature is enabled, user-space processes that attempt a split lock will receive a SIGBUS signal. What happens when split locks are created in other contexts has varied over time; in the version-10 patch set posted in late November, split locks in the kernel or system firmware will cause a kernel panic.
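For developers hunting for split locks in their own code, that means the offending operation shows up as a fatal SIGBUS. A small, hypothetical harness (the handler and messages below are invented for illustration) can at least make the kill visible before the process exits:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Under the proposed patches, a split lock in user space arrives as
       SIGBUS; report it and exit.  (Async-signal-safety is ignored here
       for the sake of brevity.) */
    static void bus_handler(int sig)
    {
        (void)sig;
        fprintf(stderr, "caught SIGBUS - suspect a split lock\n");
        _exit(1);
    }

    int main(void)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = bus_handler;
        sigaction(SIGBUS, &sa, NULL);

        /* ... run the workload suspected of creating split locks ... */
        return 0;
    }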
A kernel panic is a severe response, but it should result in problems being fixed quickly; Yu cites a couple of kernel fixes for split locks detected by this work in the patch posting. That will only happen, though, if this feature is enabled on systems where the code tries to create a split lock, but that may not happen as often as developers would like: in the current patches, split-lock detection is disabled by default.
Disabling the feature by default guarantees that it will not be widely used; that has led to some complaints. Ingo Molnar asserted that "for this feature to be useful it must be default-enabled", and Peter Zijlstra agreed, arguing that, if it is left off by default, the problems it can catch will simply remain unfixed.
Zijlstra is particularly concerned about split locks created by code at the
firmware level, something he sees as being likely: "from long and
painful experience we all know that if a BIOS can be wrong, it will
be". Enabling split-lock detection by default will hopefully cause
firmware-created split locks to be fixed. Otherwise, he fears, those split
locks will lurk in the firmware indefinitely, emerging occasionally to burn
users who enable split-lock detection in the hope of finding problems in
their own code.
Forcing problems to be fixed by enabling split-lock detection by default has some obvious appeal. But the reasoning behind leaving it disabled also makes some sense: killing processes that create split locks is an ABI change that may create problems for users. As Tony Luck put it, the likely result would be:

    #include <linus/all-caps-rant-about-backwards-compatability.h>

and the patches being reverted.
Zijlstra is not worried, though; he feels that the kernel issues have mostly been fixed and that problems in user-space code will be rare because other architectures have never allowed split locks. For those who are worried, though, he posted a follow-up patch allowing split-lock detection to be controlled at boot time and adding a "warn-only" mode that doesn't actually kill any processes.
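The exact interface has shifted between revisions; purely as an illustration (the parameter name and values below follow what later versions of the work converged on and should not be taken as the final ABI), the boot-time control looks like a kernel command-line option:

    # Illustrative kernel command-line settings:
    split_lock_detect=off     # detection disabled
    split_lock_detect=warn    # "warn-only": log offenders, kill nothing
    split_lock_detect=fatal   # offending user-space tasks get SIGBUS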
In that patch set, he noted that "it requires we get the kernel and firmware clean", and said that fixing up the kernel should be "pretty simple". But it turns out to be perhaps not quite that simple after all. In particular, there is the problem pointed
out by David Laight: the kernel's atomic
bitmask functions can easily create split-lock operations. The core
problem here is that the type of a bitmask is defined as unsigned
long, but it's natural for developers to use a simple int
instead. In such cases, the creation of misaligned accesses is easy, and
those accesses may occasionally span a cache-line boundary and lead to
split locks.
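Laight's concern can be made concrete with a small user-space sketch; the structure, field names, and the set_bit() stand-in below are invented for illustration rather than taken from kernel code:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for the kernel's set_bit(): an atomic OR on an unsigned
       long, which is an 8-byte locked operation on x86-64. */
    static void set_bit(long nr, unsigned long *addr)
    {
        __atomic_fetch_or(addr, 1UL << nr, __ATOMIC_SEQ_CST);
    }

    struct my_device {
        char name[60];
        int  flags;        /* 4-byte flag word; "int" felt natural here */
    };

    int main(void)
    {
        /* Place the structure at the start of a cache line; 'flags' then
           sits at offset 60. */
        struct my_device *dev = aligned_alloc(64, 128);
        dev->flags = 0;

        /* The 8-byte atomic access starts at offset 60, spans bytes 60-67,
           and therefore crosses the 64-byte line boundary: a split lock.
           It also touches four bytes beyond 'flags', a bug in itself. */
        set_bit(0, (unsigned long *)&dev->flags);

        printf("flags: %#x\n", (unsigned int)dev->flags);
        free(dev);
        return 0;
    }

Declaring the flag word as unsigned long (or using a properly sized bitmap) avoids both problems; the harder question is how to keep developers from falling into the trap in the first place.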
Opinions on how to solve this problem globally vary. Yu posted a
complex series of cases meant to make atomic bit operations work for
almost all usage patterns, but that strikes some as too much complexity;
Zijlstra said
simply that this solution is "never going to happen". An
alternative, suggested
by Andy Lutomirski, is to change the atomic bit operations to work on
32-bit values. That would, obviously, limit the number of bits that could
be manipulated to 32. Zijlstra noted
that some architectures (alpha, ia64) already implement atomic bit
operations that way, so it may well be that 32 bits is all that the
kernel needs.
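As a rough sketch of what that alternative might look like (illustrative only, using the GCC atomic builtins rather than real kernel code), bit operations built on naturally aligned 32-bit words can never span two cache lines and therefore can never produce a split lock:

    #include <stdint.h>

    /* A 32-bit-based atomic set_bit(): the word index and the bit within
       it are derived from a uint32_t array, so every access is a naturally
       aligned 4-byte operation. */
    static void set_bit32(unsigned long nr, uint32_t *addr)
    {
        uint32_t *word = addr + (nr / 32);
        uint32_t  mask = UINT32_C(1) << (nr % 32);

        __atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST);
    }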
There is one other "wrinkle", according to Sean Christopherson, getting in the way of merging split-lock detection: the processor bit that controls split-lock detection affects the entire core on which it's set, not just the specific CPU that sets it. So toggling split-lock detection will affect all hyperthreaded siblings together. This particular wrinkle was only discovered in September, after the split-lock patch set had already been through nine revisions, leaving the development community less than fully impressed. But now that this behavior is known, it must be dealt with in the kernel.
If split-lock detection is enabled globally, there is no problem. But if there is a desire to enable and disable it at specific times, things may well not work as expected. Things get particularly difficult when virtualization is involved; guest systems may differ with each other — or with the host — about whether split-lock detection should be enabled. Potential solutions to this problem include disallowing control of split-lock detection in guests (the current solution) or only allowing it when hyperthreading is disabled. Nobody has yet suggested using core scheduling to ensure that all processes running on a given core are in agreement about split-lock detection, but it's only a matter of time.
All of this adds up to a useful feature that is apparently not yet ready
for prime time. The question at this point is whether it should be merged
soon and improved in place, or whether it needs more work out of tree.
Perhaps both things could happen, since 5.6 is the earliest time it could
be pulled into the mainline at this point. Split-lock detection exists to
eliminate unwanted delays, but it still seems subject to some delays
itself.
Index entries for this article:
Kernel: Architectures/x86
Posted Dec 8, 2019 1:53 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (7 responses)
There are two statements too simplistic to be true in this single sentence:
- "Everything" implies the subset of users who would explicitly turn the feature on is exactly zero.
The ability to predict the future is impressive, the ability to predict it with so much precision is even more.
Nuances maybe?
Posted Dec 9, 2019 11:43 UTC (Mon)
by xophos (subscriber, #75267)
[Link]
Or this is a simple hyperbole to make a point.
Posted Dec 9, 2019 12:07 UTC (Mon)
by cesarb (subscriber, #6266)
[Link] (3 responses)
That is, in the scenario where this feature is default-enabled, broken software will end up being fixed; in the scenario where this feature is default-disabled, not only will broken software not be fixed (there being no pressure to do so), but also there will be pressure to never enable this feature.
Personally, I think this feature should be enabled by default, since it's a security fix against a local DoS issue, with a compatibility knob for those running systems with trusted but buggy software.
Posted Dec 9, 2019 16:34 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
Why write "everyone" if/when you mean "most"? Considering the main question is "how many?", it gives the impression that one hasn't really thought about it.
There are cases where the difference between "all" and "most" doesn't matter. But here it does because it takes very few early adopters to find bugs. Even fewer if they're running a data center.
> (2) there's no pressure on fixing the bug if most people aren't affected, [...] not only will broken software not be fixed (there being no pressure to do so), but also there will be pressure to never enable this feature.
How's any of that worse than not merging the code at all?
Is blocking the code itself merely used as leverage in the discussion about the default setting?
Posted Dec 9, 2019 23:46 UTC (Mon)
by nix (subscriber, #2304)
[Link] (1 responses)
Isn't a SIGBUS also a local DoS? Rather more of one than a 1000-clock stall?
Posted Dec 9, 2019 23:50 UTC (Mon)
by corbet (editor, #1)
[Link]
The signal only affects the process creating the split lock. The slowdown affects the entire system...
Posted Dec 9, 2019 16:54 UTC (Mon)
by naptastic (guest, #60139)
[Link] (1 responses)
The nuances are not lost on Zijlstra; they've been covered elsewhere in the thread. His point is correct though: split-locks cause huge latencies, which are only going to get worse, and nobody's gonna fix anything until the plug's been pulled.
(ATI's graphics drivers still relied on the BKL despite YEARS of advanced warning that it was going away. When the BKL was fully removed, I couldn't upgrade my kernel; I think I waited 3 kernel releases before saying "screw it" and selling all my ATI/AMD gear in favor of Nvidia. Pull the plug. Break the bad things. It's the only way to make progress happen.)
Posted Dec 9, 2019 20:27 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
...
Posted Dec 8, 2019 5:20 UTC (Sun)
by flussence (guest, #85566)
[Link] (9 responses)
So maybe there's something else going on here that Intel's not telling us.
Posted Dec 8, 2019 13:42 UTC (Sun)
by tux3 (subscriber, #101245)
[Link] (8 responses)
Posted Dec 9, 2019 0:31 UTC (Mon)
by Tov (subscriber, #61080)
[Link] (4 responses)
Furthermore, people will have little sympathy for their trusty old applications suddenly being killed with SIGBUS errors due to some new standards of performance and "correctness".
Hopefully I am misunderstanding how this is supposed to work...
Posted Dec 9, 2019 3:20 UTC (Mon)
by hmh (subscriber, #3838)
[Link] (3 responses)
Not to mention truly ancient stuff that runs under DOSEMU and friends...
So, I bet one will be able to disable this "feature" in its final form during boot, if not at any time...
Posted Dec 9, 2019 11:37 UTC (Mon)
by pbonzini (subscriber, #60935)
[Link]
Posted Dec 9, 2019 12:07 UTC (Mon)
by smooth1x (guest, #25322)
[Link] (1 responses)
It would have to be optional, as not every environment will be capable of being 100% clean.
I would expect distributions to either have this turned off on Desktop installs or have their installer have a blacklist for firmware versions.
However for commercial environments would they not test first?
Surely all commercial environments are agile, pipeline-driven environments that break early and can fix things instantly? :->>
Posted Dec 12, 2019 16:03 UTC (Thu)
by dps (guest, #5725)
[Link]
Tasks like fixing spelling mistakes, especially those in comments, become almost impossible.
The only way agile environments will be made to fix split locks is for them to cause something very bad to happen somewhere they care about. A demonstration that an allegedly safe interface to root allowed anybody to read /etc/shadow was required to get time to fix it.
As you might expect, no customers were told anything and no fix was issued. I will neither confirm nor deny any suggestions about the company or product involved.
Posted Dec 13, 2019 6:51 UTC (Fri)
by GoodMirek (guest, #101902)
[Link] (2 responses)
Posted Dec 13, 2019 8:26 UTC (Fri)
by bof (subscriber, #110741)
[Link] (1 responses)
Posted Dec 13, 2019 8:47 UTC (Fri)
by GoodMirek (guest, #101902)
[Link]
E.g. if I run 100 faulty worker nodes, spread over 100 random hosts, each host having 48 CPU cores (96 vCPUs due to hyperthreading/SMT), that looks like a potentially significant issue.
Posted Dec 20, 2019 14:16 UTC (Fri)
by oldtomas (guest, #72579)
[Link]
Cloud providers -- lead the pack!
After that, it may be time to declare it the default!
IOW: why should everyone else play guinea pigs to the benefit of the cloud providers?