Finding Spectre vulnerabilities with smatch

By Jonathan Corbet
April 20, 2018

The furor over the Meltdown and Spectre vulnerabilities has calmed a bit — for now, at least — but that does not mean that developers have stopped worrying about them. Spectre variant 1 (the bounds-check bypass vulnerability) has been of particular concern because, while the kernel is thought to contain numerous vulnerable spots, nobody really knows how to find them all. As a result, the defenses that have been developed for variant 1 have only been deployed in a few places. Recently, though, Dan Carpenter has enhanced the smatch tool to enable it to find possibly vulnerable code in the kernel.

Spectre variant 1 is the result of the processor incorrectly predicting the results of a bounds check; it then speculatively executes code with a parameter (such as an array index) that falls outside of its allowed range. This problem can be mitigated by disabling speculative execution in situations where an array index is under the control of a potential attacker. In the kernel, that is done by replacing code like:

    value = array[index];

with:

    index = array_index_nospec(index, ARRAY_SIZE);
    value = array[index];

That's the easy part; the hard part is finding the places in the kernel where the array_index_nospec() macro should be used. Until now, the only tool available has been the proprietary Coverity checker, which is not accessible to everybody and produces a fair number of false positives. As a result, there are only a handful of array_index_nospec() calls in current kernels.

Carpenter's addition to smatch changes that situation by providing a free tool that can search for potential Spectre variant-1 vulnerabilities. The algorithm is simple enough in concept:

What the test does is it looks at array accesses where the user controls the offset. It asks "is this a read?" and have we used the array_index_nospec() macro? If the answers are yes, and no respectively then print a warning.

This test returns a list of about 800 places where array_index_nospec() should be used. Carpenter assumes that a large percentage of these are false positives, and has asked for suggestions on how the test could be made more accurate. Instead of offering suggestions, though, both Thomas Gleixner and Peter Zijlstra confirmed that a number of the reports were accurate; Zijlstra said "I fear that many are actually things we want to fix". He followed up with a patch series fixing seven of them — nearly doubling the number of array_index_nospec() calls in the kernel.

Once the low-hanging fruit has been tackled, there probably will be a focus on improving the tests in smatch to filter out the inevitable false positives and to be sure that vulnerable sites are not slipping through. But, now that there is a free tool to do this checking, progress in this area can be expected to accelerate. Perhaps it will be possible to find — and fix — many of the existing Spectre vulnerabilities before the attackers get there.

Index entries for this article
Kernel	Security/Meltdown and Spectre
Security	Meltdown and Spectre
Security	Static analysis

Finding Spectre vulnerabilities with smatch

Posted Apr 20, 2018 20:24 UTC (Fri) by adamg (guest, #42260) [Link] (3 responses)

Slightly offtopic - I followed a link to Peter Zijlstra's patch series and noticed LWN has a nice interface to threads on lkml:
https://lwn.net/ml/linux-kernel/20180420131407.721875616@infradead.org/

Is this something new or did I live my life without nowing it is there? :)

Great work - thanks!

Mail archive

Posted Apr 20, 2018 20:30 UTC (Fri) by corbet (editor, #1) [Link] (1 responses)

It's brand new and highly experimental; it's essentially a layer built on top of public-inbox. We'll see how it works out.

Mail archive

Posted Apr 21, 2018 2:53 UTC (Sat) by nysan (guest, #81015) [Link]

This is good stuff.
Makes it so much more readable in a webbrowser.

Finding Spectre vulnerabilities with smatch

Posted Apr 25, 2018 6:06 UTC (Wed) by pabs (subscriber, #43278) [Link]

Is there cross-list Message-ID lookup?

Finding Spectre vulnerabilities with smatch

Posted Apr 20, 2018 21:37 UTC (Fri) by neilbrown (subscriber, #359) [Link] (2 responses)

> there probably will be a focus on improving the tests in smatch to filter out the inevitable false positives

I wonder why we think there will inevitably be false positives. That would only be the case if we humans understood the problem better than smatch.
I suspect that for the vast majority of developer, the description given in red in the article is very close to all we really know - and smatch already knows exactly that.
Even then, it may be a lot closer to "accepting" that it is correct rather than "understanding" that it is. This really is something outside our experience. I went to look at a couple of nfsd examples and thought "no, that code is fine"... then I realized that I was missing the whole point and in reality they are quite possibly susceptible to spectre.

I think it is entirely possible that smatch will achieve a much lower error rate than developers would - certainly than I would.

Finding Spectre vulnerabilities with smatch

Posted Apr 20, 2018 21:59 UTC (Fri) by admalledd (subscriber, #95347) [Link] (1 responses)

The difficulty likely comes mostly from the first bit:

> What the test does is it looks at array accesses **where the user controls the offset**.

Knowing if the user can control the offset may be rather hard to know so easily.

(Disclaimer, I have never used smatch and it has been years now since I even compiled a custom kernel, and my knowledge of static analysis tools is certainly not up to speed either. All to say, I could be completely wrong and smatch knows perfectly if something is user controlled. Or not. don't ask me!)

Finding Spectre vulnerabilities with smatch

Posted Apr 26, 2018 10:25 UTC (Thu) by error27 (subscriber, #8346) [Link]

I am the author of Smatch.

Smatch tries to track "the user can set this variable to 0,20-30.". It's pretty important information so I've put a lot of effort into that.

Sometimes it's tricky when you are dealing with function pointers. Sometimes you function pointers like ->set_foo() and the caller always passes user data. That's pretty straight forward.

But maybe the caller looks like "if (user_var >= 0 && user_var <= dev->max) dev->set_foo(user_var);" Then that's trickier and I just record that "user_var" has been capped. This isn't very granular and I have a few ideas about how to improve it.

But then there are things like timer functions where the data and function are closely tied. If there is one timer function that takes user data, the default behavior in Smatch would be to mark every timer function as taking user data. What I do there is use a script to tweak the cross function database to remove that. http://repo.or.cz/smatch.git/blob/HEAD:/smatch_data/db/fi... The kernel is small enough that you can often manually hack around false positives.

Finding Spectre vulnerabilities with smatch

Posted Apr 23, 2018 10:11 UTC (Mon) by jcm (subscriber, #18262) [Link]

A little plug for one way to solve the problem in hardware (doesn't have all the details yet):

https://medium.com/@jonmasters_84473/speculative-data-loa...

The idea has evolved quite a lot since I came up with this last fall, and at some point I'll have some followups. I got the blog out just to force this into the public domain and get people talking.

Finding Spectre vulnerabilities with smatch

Posted Apr 23, 2018 19:53 UTC (Mon) by mchehab (subscriber, #41156) [Link] (5 responses)

I'm a big fan of having testing tools that would allow us to get errors sooner. Yet, my first impressions with regards to this new feature is that such warnings should be taken very carefully.

For example, at the media subsystem, I received this patch: [PATCH 01/11] media: tm6000: fix potential Spectre variant 1

Basically, it tries to "fix" prediction of reading the content of a this static table (with is not modified by the Kernel):

static struct tm6000_fmt format[] = {
	{
		.name     = "4:2:2, packed, YVY2",
		.fourcc   = V4L2_PIX_FMT_YUYV,
		.depth    = 16,
	}, {
		.name     = "4:2:2, packed, UYVY",
		.fourcc   = V4L2_PIX_FMT_UYVY,
		.depth    = 16,
	}, {
		.name     = "A/V + VBI mux packet",
		.fourcc   = V4L2_PIX_FMT_TM6000,
		.depth    = 16,
	}
};

E. g. all that a "spectre" branch prediction hacking tool would be doing is to guess that someting like:

$ v4l2-ctl --list-formats

would be returning:

$ v4l2-ctl --list-formats
ioctl: VIDIOC_ENUM_FMT
	Index       : 0
	Type        : Video Capture
	Pixel Format: 'YUYV'
	Name        : YUYV 4:2:2

	Index       : 1
	Type        : Video Capture
	Pixel Format: 'UYVY'
	Name        : UYVY 4:2:2

	Index       : 2
	Type        : Video Capture
	Pixel Format: 'TM60'
	Name        : A/V + VBI Mux Packet

With is quite obvious by just looking at the code (with it would be required anyway to write such spectre-exploitation tool).

And just knowing that the tm6000 was loaded and was bound to a hardware is enough to get exactly the same data, without needing to do some complex code to exploit Spectre. In other words, just a cat /proc/modules is usually a good hint.

The thing is: I can't foresee any way where the above could actually be exploited by some hacker. The same applies to several other similar "fixes" that, without a real threat scenario, seems just a false positive.

My point is: just blindly test the Kernel without a real threat scenario and start flooding the Kernel with patches that will just add extra latency to syscalls seems just plain wrong on my eyes.

Finding Spectre vulnerabilities with smatch

Posted Apr 23, 2018 20:06 UTC (Mon) by sfeam (subscriber, #2841) [Link] (4 responses)

I may misunderstand what Spectre is capable of, but isn't the concern that normal bounds-checking in the original code is irrelevant? So Spectre could be used to read anything sufficiently near to that table in kernel memory space. You may be correct that nothing interesting is near the table, but the table contents per se are not the concern.

Finding Spectre vulnerabilities with smatch

Posted Apr 23, 2018 20:43 UTC (Mon) by mchehab (subscriber, #41156) [Link] (3 responses)

> I may misunderstand what Spectre is capable of, but isn't the concern that normal bounds-checking in the original code is irrelevant? So Spectre could be used to read anything sufficiently near to that table in kernel memory space. You may be correct that nothing interesting is near the table, but the table contents per se are not the concern.

Specifically at tm6000-core, the only static vars there are the ones that either contain structs with const data or modprobe parameters (and the table itself). Granted, it could try to read memory elsewhere, but, as far as I understand, "elsewhere" is actually limited to L1 cache size, with, in practice, should very likely be restricted to what's there at tm6000 module.

Finding Spectre vulnerabilities with smatch

Posted Apr 23, 2018 23:34 UTC (Mon) by excors (subscriber, #95769) [Link] (2 responses)

> as far as I understand, "elsewhere" is actually limited to L1 cache size

I don't see why that would be the case. An attacker could set e.g. f->index = 0x80000000. The CPU may (incorrectly) speculatively predict the bounds check will pass, then speculatively read format[f->index].name (which is about 32GB after 'format'), then speculatively execute the strlcpy and read characters from that name string. Any read can leak information about the address that was accessed (via its effect on the caches), and in this case the address is the value at an attacker-controlled location in a ~64GB region after 'format', so the attacker could use it to leak the contents of potentially sensitive memory.

Finding Spectre vulnerabilities with smatch

Posted Apr 28, 2018 23:25 UTC (Sat) by dvdeug (guest, #10998) [Link] (1 responses)

According to https://gist.github.com/jboner/2841832 , a main memory access is 20 times as slow as a mispredicted branch, 100 ns versus 5 ns. I can imagine designs that would do speculative main memory accesses, but even before Spectre, the cost of loading something that won't be used into a cache (and ejecting what's actually going to be used) and clogging the memory bus to turn a 100 ns load into a 95 ns load seems unproductive.

Finding Spectre vulnerabilities with smatch

Posted Apr 29, 2018 14:06 UTC (Sun) by excors (subscriber, #95769) [Link]

The benefit can be much bigger than that. E.g. in code like "struct { int n; bool last; char pad[56]; } *p; while (!p->last) { sum += p->n; ++p; }", if you did all the loads and branches sequentially based on their dependencies, it would take ~100ns per iteration (since you can't start the next load of p until you've checked the result of the the previous load). But if you predict the branches then you can queue up dozens of (speculative) loads at once, and complete dozens of iterations per 100ns (limited only by memory bandwidth and queue sizes), which is a massive improvement. That extra parallelism is worth a tiny reduction in cache efficiency.

(In practice you'd need slightly more complicated code to avoid simply being optimised by the cache prefetcher etc, but presumably that kind of code comes up enough in benchmarks and/or real applications to be a worthwhile optimisation, given that Intel has been doing it for two decades.)