The bumpy road to reference-count protection in the kernel
When reference-count hardening was covered here last July, most of the attention was on a PaX-derived patch set posted by David Windsor. More recently, this patch set has been taken over by Elena Reshetova, who posted a new revision on November 10. The basic approach taken by the patch set has not changed: the kernel's atomic_t type, which is the usual choice for reference-count implementations, is instrumented to detect potential overflows. When an overflow happens, warnings are issued, the offending process is killed, and the affected counter is frozen at a high value so that it will never return to zero. That turns a potential use-after-free vulnerability into a memory leak, hopefully closing off an avenue of attack.
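The "freeze at a high value" behavior can be illustrated with a small user-space sketch in C11. This is not the PaX or kernel code; `struct sketch_ref`, `sketch_ref_get()`, and `sketch_ref_put()` are hypothetical names, and the choice of `INT_MAX` as the frozen value is an assumption made for illustration.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is not the kernel's or PaX's code. */
struct sketch_ref {
    atomic_int count;
};

/* Take a reference. Returns false if the increment overflowed. */
static bool sketch_ref_get(struct sketch_ref *r)
{
    /* Atomic signed arithmetic wraps silently: INT_MAX + 1 becomes INT_MIN. */
    int old = atomic_fetch_add(&r->count, 1);

    if (old == INT_MAX) {
        /* Overflow noticed after the fact: pin the counter at a high value. */
        atomic_store(&r->count, INT_MAX);
        /* The patches described above also warn and kill the process here. */
        return false;
    }
    return true;
}

/* Drop a reference. Returns true when the caller should free the object. */
static bool sketch_ref_put(struct sketch_ref *r)
{
    if (atomic_load(&r->count) == INT_MAX)
        return false;   /* frozen: never reaches zero, so the object leaks */
    return atomic_fetch_sub(&r->count, 1) == 1;
}
```

Once the counter is pinned, `sketch_ref_put()` can never report that the last reference is gone, so the object is leaked rather than freed while still in use.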
This time around, the patches ran into some stronger opposition, much of which came from core developer Peter Zijlstra. He had two fundamental objections to the approach taken with these patches; the first of those is that they do not preserve the atomic nature of atomic_t, leaving code open to certain kinds of race conditions. This race condition, which was known to Kees Cook and others, could allow an attacker to bypass the reference-count protection. The conclusion that had been reached was that the risk was acceptable and that, in particular, the bypass could still be detected, even if it could not be prevented.
In a sense, the fact that this vulnerability has not been fixed in the hardening patches can be seen as a result of the pressure that developers of security-related patches are under. The vulnerability is easy to close by using a compare-and-swap instruction for reference-count changes, but that would have an adverse effect on performance. Security-related code is hard enough to merge even without performance regressions; in this case, the developers decided to stick with a less-than-perfect implementation to avoid slowing the kernel down. But Zijlstra was adamant that atomic operations must be atomic, even if there is a cost to be paid by users who want the reference-count protection.
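As a rough illustration of the compare-and-swap alternative, the following C11 sketch folds the overflow check and the update into a single atomic step, so no other thread can observe a wrapped value between the two. The helper name `sketch_inc_checked()` and the use of `INT_MAX` as the saturation point are assumptions for this example; the retry loop is where the performance cost mentioned above comes from.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: increment *count unless it is already saturated. */
static bool sketch_inc_checked(atomic_int *count)
{
    int old = atomic_load(count);

    do {
        if (old == INT_MAX)
            return false;   /* saturated: leave the value untouched */
        /*
         * If another thread changed *count in the meantime, the exchange
         * fails, old is reloaded with the current value, and we retry.
         */
    } while (!atomic_compare_exchange_weak(count, &old, old + 1));

    return true;
}
```

The counter is only ever written with a value that has already passed the check, which is the property that a separate increment-then-check sequence cannot guarantee.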
The harder problem to solve, though, is tied to the fundamental approach used by this patch set. It changes the atomic_t implementation on the assumption that most users are implementing reference counts. It then becomes necessary to go through the kernel, find all non-reference-count uses of atomic_t, and switch them to an unprotected variable type. This approach is necessary, Cook said, to ensure that all reference-count vulnerabilities have been closed off: "We need a hardened infrastructure, not just 'stuff people can maybe remember to use'". The only way to get there, he said, is with an opt-out implementation.
The problem with this approach, in the eyes of the core kernel developers, is that it requires an audit of the entire kernel to find the non-reference-count users, and that is an error-prone process at best. Beyond that, atomic_t offers a wide range of operations that are not relevant to reference counts; making them available to developers implementing reference counts is just asking for trouble. In this view, it is far better to create a new type for reference counts, implement overflow protection there, and switch reference-count users over.
Back in June, Jann Horn suggested this approach, using the existing kref type for reference counts. That work didn't get much further at that time, but the approach has returned in the form of a new patch set from Zijlstra. Therein, he creates a new, protected refcount_t type; it is implemented using atomic_t and provides a restricted set of operations. The kref implementation is then reworked to use refcount_t, cleaning up some of the interfaces and users along the way. The intended end result is a well-defined way to implement reference counts in the kernel that is difficult for developers to abuse and which can be protected from overflow vulnerabilities.
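The idea of a dedicated type with a restricted set of operations can be sketched in user-space C11 as follows. This is not Zijlstra's implementation or the kernel's API; `sketch_refcount_t` and its functions are hypothetical names, and saturating at `UINT_MAX` is an assumption made for the example. The point is that callers get only initialize, increment, and decrement-and-test, with no arbitrary add, subtract, or set operations to misuse.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dedicated reference-count type. */
typedef struct {
    atomic_uint refs;
} sketch_refcount_t;

static void sketch_refcount_init(sketch_refcount_t *r, unsigned int n)
{
    atomic_init(&r->refs, n);
}

/* Increment, saturating at UINT_MAX instead of wrapping around to zero. */
static void sketch_refcount_inc(sketch_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == UINT_MAX)
            return;         /* saturated: stay there */
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
}

/* Decrement; returns true only when the last reference was dropped. */
static bool sketch_refcount_dec_and_test(sketch_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        /* A saturated counter stays put; a zero counter cannot go lower. */
        if (old == UINT_MAX || old == 0)
            return false;
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

    return old == 1;        /* old holds the value before the decrement */
}
```

A kref-like wrapper could then embed `sketch_refcount_t` and call a release function when `sketch_refcount_dec_and_test()` returns true.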
The current reference-count hardening patch set from Reshetova touches nearly 400 files; Zijlstra's patch set is far smaller. To a great extent, that is because its ambitions are far lower: it adds an infrastructure for protecting reference counts and implements it for code that was already using the kref type, but does nothing about the vast number of reference-count implementations built directly on atomic_t; that is an exercise left for others to do later. The exercise is straightforward, but it does involve understanding the code in question to be sure that the switch to the new type will not introduce bugs.
Assuming that the kernel adopts Zijlstra's approach (a reasonably safe assumption), it will end up with a reference-count protection mechanism that runs more slowly and, initially, protects far less code than the PaX-derived approach. But it will also get a solution that is free of race-condition worries and that doesn't have the same potential to introduce bugs into code using atomic_t for purposes other than reference counting. Over time, assuming developers devote some time to the task (not always a good assumption, alas), vulnerable code should be switched over and the end result, from a protection point of view, should be the same. For security-related patches, that sort of outcome is often the best-case scenario, even if the developers who put much of their time into the PaX-derived code find it less than fully gratifying.
Posted Nov 17, 2016 11:57 UTC (Thu) by tao (subscriber, #17563)
Your mileage may vary, obviously.
If you can find a better language that:
a.) the entire current developer base feels comfortable switching to--so not C++
and
b.) put in the effort to do the rewrite on your own (because while you might be able to convince people to stick to a new language once it's been rewritten in this language, I doubt you can convince them to do the rewrite on their own) without introducing new bugs
then your idea might have merit. If not, it is, as I already said, moot.
And if using C extensions, those extensions should at least be such that they are likely to be supported both in llvm and gcc; the safest bet is to go with features that have already been made part of the latest C standards, thus you'll have to submit your ideas to the C working group and wait at least 5-10 years, assuming your ideas are good.
Posted Nov 17, 2016 12:45 UTC (Thu) by excors (subscriber, #95769)
> a.) the entire current developer base feels comfortable switching to--so not C++
That seems an unhelpfully strict requirement. Today, not the whole developer base is comfortable with the kernel being written in C - some would be much happier with a language that had features to make it easier to write correct code and harder to write buggy code. Of course there are also people who would be much less happy with anything other than C. Any choice of action or inaction will attract some people and alienate others; it'd be best to consider whether, on balance (accounting for the effects on current developers and potential future developers and the objective benefits of the language itself), that would improve the long-term prospects of the kernel.
But I expect the people with the power to make those decisions are nearly all people who'd be less happy with any other language, and (since they're human) they're obviously not going to choose to alienate themselves for some hard-to-quantify long-term benefits to the project, so in practice nothing is going to change.