Meltdown and Spectre mitigations — a February update

By Jonathan Corbet
February 5, 2018

The initial panic over the Meltdown and Spectre processor vulnerabilities has faded, and work on mitigations in the kernel has slowed since our mid-January report. That work has not stopped, though. Fully equipping the kernel to protect systems from these vulnerabilities is a task that may well require years. Read on for an update on the current status of that work.

Variant 1

Perhaps the biggest piece of unfinished business in January was a proper response to Spectre variant 1 — the bounds-check bypass vulnerability. Variant 1 is likely to be difficult to fix at the hardware level, and so may be with us for a long time. Unfortunately, it is also difficult to address at the software level.

The seemingly final form of the patches for variant 1 has changed the interface yet again. Consider a simple code fragment that might be vulnerable to speculation that bypasses a bounds check:

    if (index < ARRAY_SIZE)
        return array[index];

The way to protect this kind of reference to ensure that no out-of-bounds references to array occur would be:

    if (index >= ARRAY_SIZE)
        return 0;  /* Or whatever error value makes sense */
    else {
        index = array_index_nospec(index, ARRAY_SIZE);
        return array[index];
    }

The protective macro array_index_nospec() no longer actually accesses the array; instead, it just manipulates the index value in a way that blocks speculation. It uses the masking technique described in the mid-January article, avoiding the rather more expensive barrier operations entirely. For cases where the operation that needs to be protected is more complex than a simple array access, there is another macro called barrier_nospec() that does use a barrier to block all speculative activity. It is rather more expensive to use than array_index_nospec(), on the x86 architecture at least, but sometimes there is no alternative.
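
As a rough illustration of the masking idea, the following user-space sketch computes a branch-free mask from the index and the array size. The helper names (index_mask_nospec(), index_nospec(), table_lookup()) are invented for this example and are not the kernel's macros. The sketch assumes that the size fits in a signed long and that the compiler performs an arithmetic right shift on negative values, as GCC and Clang do. A real implementation must also keep the compiler from optimizing the mask away, which this sketch does not attempt.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in an unsigned long on this platform. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return all ones when index < size and zero otherwise, without a branch.
 * Assumption: size <= LONG_MAX, and >> on a negative long is arithmetic.
 */
static inline unsigned long index_mask_nospec(unsigned long index,
                                              unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (BITS_PER_ULONG - 1));
}

/* Leave an in-bounds index alone; force an out-of-bounds index to zero. */
static inline unsigned long index_nospec(unsigned long index,
                                         unsigned long size)
{
    return index & index_mask_nospec(index, size);
}

/* Hypothetical caller following the bounds-check pattern shown above. */
static inline int table_lookup(const int *table, unsigned long size,
                               unsigned long index)
{
    if (index >= size)
        return 0;  /* Or whatever error value makes sense */
    index = index_nospec(index, size);
    return table[index];
}
```

Because the clamped index depends on the mask arithmetic rather than on the branch outcome, a mispredicted bounds check in this sketch can reach element zero at worst. Note that a size of zero also yields a zero mask, so the index is clamped to zero in that case as well.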

Actual uses of these new macros are relatively scarce at the moment. The get_user() function in the kernel is one area of concern, since it can be used to attempt an access to an arbitrary address in the kernel. Because get_user() already has the bounds check needed to ensure that the given address points into user space, adding a call to array_index_nospec() (more correctly, an optimized assembly version of it for x86) is enough to prevent problems. The __get_user() variant, though, lacks those checks and is invoked at well over 1,000 call sites in the kernel. Protecting __get_user() requires tossing in a barrier_nospec() invocation.

Another area of concern is the system-call table, which is indexed using an integer value (the system-call number) from user space. A call to array_index_nospec() is used to prevent out-of-bounds access to that table. Protections have also been added for file-descriptor lookup, in the KVM code, and in the low-level wireless networking code. Nobody believes that all of the potentially exploitable places have been found, though.
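
The shape of that system-call-table protection can be sketched in user space as follows. Everything here is hypothetical: handler_table, dispatch(), and clamp_index_nospec() are stand-ins invented for the example, and the clamp is a simplified version of the masking idea rather than the kernel's array_index_nospec(). It assumes an arithmetic right shift on negative values, as GCC and Clang provide.

```c
#include <limits.h>
#include <stddef.h>

typedef long (*handler_fn)(long arg);

/* Two toy handlers standing in for system-call implementations. */
static long handler_double(long arg) { return arg * 2; }
static long handler_negate(long arg) { return -arg; }

static const handler_fn handler_table[] = { handler_double, handler_negate };
#define NR_HANDLERS (sizeof(handler_table) / sizeof(handler_table[0]))

/* Branch-free clamp: returns index if index < size, otherwise zero. */
static inline unsigned long clamp_index_nospec(unsigned long index,
                                               unsigned long size)
{
    unsigned long mask =
        (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                        (sizeof(long) * CHAR_BIT - 1));
    return index & mask;
}

/* Dispatch on a number supplied by an untrusted caller. */
static inline long dispatch(unsigned long nr, long arg)
{
    if (nr >= NR_HANDLERS)
        return -1;  /* stand-in for an error such as -ENOSYS */
    nr = clamp_index_nospec(nr, NR_HANDLERS);
    return handler_table[nr](arg);
}
```

The bounds check still decides the architectural result; the clamp only limits which table entry can be reached if that check is mispredicted.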

Meanwhile, there is an arm64 patch set in circulation with mitigations similar to the x86 patches. It doesn't repeat the non-architecture-specific mitigations found in the x86 tree, but does add protections to the futex() system call that are not currently present (and maybe not needed) for x86.

Finding the remaining locations where variant-1 protection is needed is likely to require fairly advanced static-analysis tools. The work done so far has relied on the proprietary Coverity tool, and has had to contend with a high false-positive rate. Everybody involved would like to see a free tool that could do this work, but nobody appears to be working on such a thing. That is certain to slow the rate at which vulnerable code is found and increase the rate at which new vulnerabilities are introduced.

Arjan van de Ven has suggested that what really needs to happen is a centralization of many of the security checks that are currently dispersed throughout the kernel. He recommends the creation of a utility function described as:

    copy_from_user_struct(*dst, *src, type, validation_function);

Here, validation_function() would be automatically generated from the UAPI headers that describe the interface to the kernel. Widespread use of a function like this would free most developers from the need to worry about Spectre variant-1 vulnerabilities; it might also improve the (not always great) state of argument validation in general.
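
One way to picture such a helper is the user-space sketch below. It is only a guess at the shape of the idea: struct demo_request, its limits, and demo_request_valid() are invented for illustration, memcpy() stands in for the kernel's copy_from_user(), and the validation function is written by hand rather than generated from UAPI headers.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical UAPI-style structure passed in from user space. */
struct demo_request {
    unsigned int slot;   /* must be below DEMO_MAX_SLOTS */
    unsigned int flags;  /* only bits in DEMO_FLAG_MASK are allowed */
};

#define DEMO_MAX_SLOTS 16u
#define DEMO_FLAG_MASK 0x3u

/* Hand-written stand-in for a generated validation function. */
static inline bool demo_request_valid(const struct demo_request *req)
{
    return req->slot < DEMO_MAX_SLOTS &&
           (req->flags & ~DEMO_FLAG_MASK) == 0;
}

/*
 * Copy the structure, then validate the copy in one central place.
 * Evaluates to 0 on success and -1 (a stand-in for -EINVAL) on failure.
 */
#define copy_from_user_struct(dst, src, type, validate) \
    (memcpy((dst), (src), sizeof(type)), (validate)(dst) ? 0 : -1)
```

Centralizing the check this way would give a single place to add speculation protection for every validated field, instead of relying on each caller to remember it.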

Variant 2

Spectre variant 2 (branch prediction buffer poisoning) has been mostly mitigated by way of the "retpoline" mechanism that was merged for the 4.15 kernel release. With the GCC 7.3 release, a retpoline-capable compiler is finally available. There are, however, a number of loose ends that are slowly being dealt with.

There is still a fair amount of uncertainty around the question of when retpolines provide sufficient protection. The "indirect branch prediction barrier" (IBPB) operation provided by Intel in recent microcode updates will protect against poisoning, but its use is expensive, so there is a desire to avoid it whenever possible. There are cases, though, such as switching into and out of a virtualized guest, where IBPB is needed.

There is also the inconvenient fact that Intel released a number of microcode versions with implementations of IBPB that, to put it politely, did not function as well as users would have liked. Dealing with that last problem requires avoiding IBPB entirely on the affected microcode versions. There was some discussion over whether the kernel should blacklist known-bad versions or use a whitelist of known-good alternatives; the latter approach was somewhat driven by worries that Intel was never going to get things right. In the end, though, the blacklist approach won out, on the theory that the problems have finally been fixed.
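
The practical difference between the two policies is how a microcode version that appears on neither list is treated. The sketch below is purely illustrative: struct ucode_id, both tables, and the function names are invented, and the table entries are placeholder numbers rather than real processor models or microcode revisions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Identifies one microcode build; the fields are illustrative. */
struct ucode_id {
    unsigned int model;
    unsigned int stepping;
    unsigned int revision;
};

/* Placeholder entries only: these are NOT real model or revision numbers. */
static const struct ucode_id known_bad[] = {
    { 0x01, 0x02, 0x10 },
    { 0x01, 0x03, 0x22 },
};

static const struct ucode_id known_good[] = {
    { 0x01, 0x02, 0x11 },
};

static inline bool ucode_in_list(const struct ucode_id *list, size_t count,
                                 struct ucode_id id)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i].model == id.model &&
            list[i].stepping == id.stepping &&
            list[i].revision == id.revision)
            return true;
    }
    return false;
}

/* Blacklist policy: anything not known to be bad is trusted. */
static inline bool ibpb_usable_blacklist(struct ucode_id id)
{
    return !ucode_in_list(known_bad,
                          sizeof(known_bad) / sizeof(known_bad[0]), id);
}

/* Whitelist policy: only versions known to be good are trusted. */
static inline bool ibpb_usable_whitelist(struct ucode_id id)
{
    return ucode_in_list(known_good,
                         sizeof(known_good) / sizeof(known_good[0]), id);
}
```

Under the blacklist policy a future, unlisted microcode release is used right away; under the whitelist policy it would be ignored until somebody added it to the table.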

Similar concerns relate to the "indirect branch restricted speculation" (IBRS) barrier-like operation that, by some accounts, is needed to get full protection on Intel Skylake-generation processors. That, too, has had issues with some microcode versions. Those too, with luck, have been fixed; if not, David Woodhouse warned: "then I think Mr Shouty is going to come for another visit."

There is still some resistance to using IBRS at all, though. It also is an expensive operation, and nobody has demonstrated an exploit on Skylake processors when it is not used. Meanwhile, Ingo Molnar has proposed a different approach: use the ftrace machinery to keep track of the number of entries in the return-stack buffer (RSB) and force the use of a retpoline when it gets too deep. It is not yet clear that this idea can be implemented in a practical way; Thomas Gleixner has played with it but he ran into some complications and set it aside for now.

One concern about variant 2 is that it might lend itself to attacks by one user-space process (or thread) against another. JavaScript code running in a browser appears to be the most likely vector for such an attack, but it's not the only one. This patch, for example, is an attempt to protect high-value processes by issuing an IBPB barrier prior to switching into a process that has marked itself as being non-dumpable. The idea is to provide some protection for programs like GnuPG while avoiding the overhead of IBPB on every context switch.
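
For reference, a process marks itself non-dumpable with the existing Linux prctl() interface, as in the minimal sketch below. The wrapper names are invented for this example. Whether a given kernel reacts to the flag by issuing IBPB depends on the patch described above being applied, which this sketch simply assumes.

```c
#include <sys/prctl.h>

/* Clear the "dumpable" flag; returns 0 on success, -1 on failure. */
static inline int mark_self_non_dumpable(void)
{
    return prctl(PR_SET_DUMPABLE, 0, 0, 0, 0);
}

/* Query the flag; returns 0 when the process is non-dumpable. */
static inline int query_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

A program handling secrets would typically make this call early in startup, before any key material is loaded.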

Other odds and ends

Protection against Meltdown ("variant 3") was mostly in place when the embargo fell in January; its basic form has not changed much since then. There are numerous bugs to fix, of course, and that work has been ongoing. The arm64 architecture gained kernel page-table isolation during the 4.16 merge window. There has also been some work on the x86 side to avoid using kernel page-table isolation on systems that do not need it — AMD processors and ancient Intel processors, for example. The whitelist of safe processors is slowly growing.

Systems with Meltdown and Spectre mitigations also have a new sysfs directory (/sys/devices/system/cpu/vulnerabilities) listing known CPU vulnerabilities and their mitigation status. On your editor's laptop, they currently read:

    meltdown:	Mitigation: PTI
    spectre_v1:	Vulnerable
    spectre_v2:	Mitigation: Full generic retpoline

There have been some concerns that these files, which are world-readable, provide useful information to attackers and should be restricted. On the other hand, Alan Cox responded that this information is already readily available and that it can be useful to utilities like just-in-time compilers, which might change their output when certain vulnerabilities are present. As of this writing, no patches changing the protections on those files have been merged.
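
A utility that wants to consult these files might do something like the following sketch. The enum and function names are invented for the example. It assumes that each file holds a single status line such as "Mitigation: PTI" or "Vulnerable" (the name prefixes in the listing above come from the display, not the file contents), and it also recognizes "Not affected", a value that is not shown in this article.

```c
#include <stdio.h>
#include <string.h>

enum vuln_status {
    VULN_UNKNOWN,
    VULN_MITIGATED,
    VULN_VULNERABLE,
    VULN_NOT_AFFECTED
};

/* Classify one status line by its leading keyword. */
static inline enum vuln_status classify_status(const char *line)
{
    if (strncmp(line, "Mitigation:", 11) == 0)
        return VULN_MITIGATED;
    if (strncmp(line, "Vulnerable", 10) == 0)
        return VULN_VULNERABLE;
    if (strncmp(line, "Not affected", 12) == 0)
        return VULN_NOT_AFFECTED;
    return VULN_UNKNOWN;
}

/* Read and classify one entry, for example "spectre_v2". */
static inline enum vuln_status read_vuln_status(const char *name)
{
    char path[256];
    char line[256];
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/vulnerabilities/%s", name);
    f = fopen(path, "r");
    if (!f)
        return VULN_UNKNOWN;  /* older kernel or no such entry */
    if (!fgets(line, sizeof(line), f))
        line[0] = '\0';
    fclose(f);
    return classify_status(line);
}
```

A just-in-time compiler could call read_vuln_status("spectre_v2") once at startup and pick its code-generation strategy from the result, treating VULN_UNKNOWN conservatively.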

Other than that, though, everything described here has been merged for 4.16 and is quickly headed toward the stable kernel updates as well. There are a number of smaller issues not described here that were also addressed for 4.16; see this pull request for the full list. Clearly, even if things have slowed a bit to allow the developers involved to get some sleep, a lot is still happening to deal with the fallout from Meltdown and Spectre.

Good one, David

Posted Feb 6, 2018 0:04 UTC (Tue) by Frogging101 (guest, #113180) [Link]

> David Woodhouse warned: "then I think Mr Shouty is going to come for another visit."

I laughed. Thanks for that :D

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 1:03 UTC (Tue) by karkhaz (subscriber, #99844) [Link] (4 responses)

Two things about static analysis to detect where Variant 1 mitigations need to be applied.

1. It seems to me that the Clang Static Analyzer would be an excellent tool to do this, if only it could parse the kernel. There has been excellent work on compiling the kernel with Clang/LLVM (as described in a talk at the last LLVM Developers' Meeting [1] and here on LWN [2]), but I'm not aware of efforts to do analysis on the kernel (would love to hear if anybody is working on this!).

In particular, the Clang Static Analyzer preserves the AST of the language, whereas many comparably powerful tools first convert the program into some lower-level intermediate representation. The IR is usually better for implementing proper static analyses, but for "pattern-matching" cases like this one, you really do want the original AST.

I'm not sure if GCC has softened their position on allowing the AST to be exported to help static analysis tools, text editor autocompletion, and other use cases; but if they have not, then here is one more argument that they really should. We're compiling the kernel with GCC, and it would be really excellent to dump the kernel's AST for cases like this.

2. I contribute to a static analyzer which does convert program code into IR. However, the particular use case in the article does seem to be fairly easy to recover from IR. Is it really just: "flag the locations where the access of an array is guarded by a bounds check for that access"? I would be interested in exactly what pattern is needed (the article does not elaborate), and what trouble Coverity has had (Coverity isn't very good, but this seems quite simple, so I wonder about the nature of the false positives they're reporting and whether I could do any better). Does anybody have pointers to discussion about this or other relevant information?

[1] https://www.youtube.com/watch?v=6l4DtR5exwo
[2] https://lwn.net/Articles/734071/

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 15:30 UTC (Tue) by Koral (guest, #115236) [Link]

I think it would really be possible to create a GCC plugin that does some pattern matching on the AST and raises a warning in cases of Spectre variant 1.

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 18:57 UTC (Tue) by amaranth (subscriber, #57456) [Link]

It's not just "flag the locations where the access of an array is guarded by a bounds check for that access"; that would have an insanely high false-positive rate. What is needed is to track inputs to the kernel and find places where they are used (however indirectly) to index an array. If those places have bounds checks, they need the Spectre v1 mitigation applied.

Meltdown and Spectre mitigations — a February update

Posted Feb 9, 2018 9:19 UTC (Fri) by jezuch (subscriber, #52988) [Link]

If it was just that then you wouldn't need powerful static analysis, Coccinelle would be enough.

Meltdown and Spectre mitigations — a February update

Posted Feb 16, 2018 11:53 UTC (Fri) by oldtomas (guest, #72579) [Link]

> I'm not sure if GCC has softened their position on allowing the AST to be exported [...]

It has, since a while ago (version 4.5, somewhere 2011):
https://old.lwn.net/Articles/457543/

People are using that for static analysis:
https://old.lwn.net/Articles/370717/

Grsecurity is actually using that in the Linux kernel context:
https://old.lwn.net/Articles/691102/

And for those who want to play comfortably, there's a kind of "meta-plugin" which you can program in some kind of Lisp (and which ironically is named MELT):
http://gcc-melt.org/

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 1:38 UTC (Tue) by jcm (subscriber, #18262) [Link]

A scanner that audits binaries for variant 1 should be available soon from a fantastic tools person who has been busily working on it. There's a lot more tooling to create and hopefully folks can collaborate on that over the coming months - this isn't done. Btw, there are several ways you could address variant 1 in hardware. They aren't inexpensive to implement but they exist.

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 1:38 UTC (Tue) by roc (subscriber, #30627) [Link] (1 responses)

Has anyone discussed how array_index_nospec behaves when the array is empty? Seems to me that it allows speculation based on what would be at the first element of the array.

I guess in common cases the array is either statically known to be non-empty or the data pointer for an empty array would be null, but it may be a potential footgun. I can't find any discussion of this.

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 4:02 UTC (Tue) by willy (subscriber, #9762) [Link]

If you kmalloc 0 bytes, you get a special ZERO pointer back, different from NULL, but still in the unmapped first 4k page.
I suppose there might be a danger if you have struct { int count; void *entries[]; } and kmalloc four bytes to store the int ...

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 7:04 UTC (Tue) by mjthayer (guest, #39183) [Link] (6 responses)

I asked this in a previous comment section, but - is any work going into expanding the KPTI stub kernel which is still mapped into all processes? There must be enough code and data in the kernel which doesn't leak security-critical information which could safely live there, thereby reducing the performance hit for processes which need to use that code and data. Maybe it would even be possible to map page cache pages which a process has access to in that process's address space.

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 14:16 UTC (Tue) by corbet (editor, #1) [Link] (5 responses)

I am not aware of such work. The focus is still very much on dealing with corner cases and ensuring that everybody is protected. In truth, I'd be surprised to see work on expanding the amount of the kernel exposed when running in user mode. The gain would be small (I believe), the security risks would be real, and any benefits would be fleeting since future processors will not, one hopes, have this vulnerability in the first place.

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 16:07 UTC (Tue) by Paf (subscriber, #91811) [Link] (4 responses)

John,

That seems to run counter to something I read - in one of the many Meltdown/Spectre articles here - suggesting that KPTI is a "big hammer" that will deal with many potential speculative execution derived security holes. This seems to imply that the speaker (I believe it was a kernel developer) thinks it's likely KPTI will be needed in the future, even once Meltdown is fixed (as of course it should be in future hardware).

Any thoughts?

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 16:13 UTC (Tue) by corbet (editor, #1) [Link] (3 responses)

KPTI does have the advantage of providing SMEP emulation for free on x86, so that might be a reason to keep it around. I'd still be surprised to see anybody working to move kernel-space memory back into the user-space page tables, though. The ratio of risk to performance gain seems way too high.

Meltdown and Spectre mitigations — a February update

Posted Feb 6, 2018 16:53 UTC (Tue) by corsac (subscriber, #49696) [Link] (2 responses)

But future processors with RDCL_NO (protection against Meltdown) are likely to have SMEP/SMAP support.

And past processors without SMEP are unlikely to have PCID, so the cost of KPTI is huge there.

Meltdown and Spectre mitigations — a February update

Posted Feb 7, 2018 15:19 UTC (Wed) by MarcB (subscriber, #101804) [Link] (1 responses)

The cost is fixed per syscall. So it is marginal for many workloads, even on old CPUs. And, of course, horrible for others, even on modern CPUs.

I am wondering, if it would be possible to lower the cost significantly. There are architectures - like SPARC - where separate address spaces are mandatory, after all. Did they accept the performance penalty or have they put thought - and silicone - into optimizing this? (Likely: both).

From a layman's view, separate address spaces seem like the "cleaner" solution.

Meltdown and Spectre mitigations — a February update

Posted Feb 7, 2018 15:21 UTC (Wed) by MarcB (subscriber, #101804) [Link]

I'd like to retract an 'e' :-)

Meltdown and Spectre mitigations — a February update

Posted Feb 19, 2018 20:26 UTC (Mon) by garloff (subscriber, #319) [Link]

Maybe interesting benchmark results for the readers:
https://imagefactory.otc.t-systems.com/Blog-Review/SpecEx...
The difference between IBRS and retpolines is HUGE.