Better kernels with GCC plugins
- Structures containing only function pointers are made const,
regardless of whether they are declared that way. Of course, it turns
out that this is the wrong thing to do in a number of cases, so the
developers had to create a no_const attribute and use it some
180 places in their patch.
- A histogram of the distribution of sizes passed to kmalloc()
is generated; it's not clear (to your editor) what use is made of that
information.
- Some fairly sophisticated tweaks to the generated assembly are made for
AMD processors to improve the prevention of the execution of kernel
data.
- Instrumentation is inserted to track kernel stack usage.
Use of plugins in this way allows significant changes to be made to the kernel without actually having to change the code.
On the other hand, plugins of this type can increase the distance between
the code one sees and what is actually run in the kernel; it is easy to
imagine that leading to some real developer confusion at some point.
Still, says PaXTeam, "the cost/benefit ratio of the plugin approach
is excellent and there's a lot more in the pipeline". It is not too
hard to imagine other uses that are not necessarily tied to security.
(Amusingly, the plugins are licensed under GPLv2, meaning that they do not
qualify for the GCC runtime library exemption. The kernel does not need
that library, though, so all is well.)
Posted Oct 6, 2011 8:41 UTC (Thu) by liljencrantz (guest, #28458)
Posted Oct 6, 2011 10:54 UTC (Thu) by PaXTeam (guest, #24616)
Posted Oct 6, 2011 9:24 UTC (Thu) by epa (subscriber, #39769)
Posted Oct 6, 2011 11:11 UTC (Thu) by PaXTeam (guest, #24616)
Posted Oct 6, 2011 14:07 UTC (Thu) by epa (subscriber, #39769)
I guess the answer would be to generate the patch automatically against a given source tree. Then even if only a fraction of the total patch can be applied at a time, it is possible to keep trying. Clearly if it takes a week of manual work to create the patch it's never going to be practical to get it in.
Posted Oct 6, 2011 19:41 UTC (Thu) by PaXTeam (guest, #24616)
the first problem you'll hit with manual patching is that it takes a lot of time to check every variable instance and their use to determine whether the given type can be made read-only or not (or you'll have to settle on individual variables only), the plugin approach is already a godsend for that reason (in the past another route was tried with coccinelle but that didn't work out too well).
the second problem is patching a given tree *and* forward porting the changes to a newer kernel, including redoing the analysis in the first step since new uses of a given type can come and go any time, meaning that the constifiable property of a type can change either way over time. the plugin approach is again the most efficient way of tracking these kinds of changes.
last but not least, once you have such a plugin that can determine when to do constification, it's a very natural step to actually do it from the same plugin (it's like one extra line of code to set TREE_READONLY on the type) at which point one's enthusiasm to maintain a huge patch quickly evaporates ;).
now this was the producer side, but the consumer side is not any better unfortunately. if you read those lkml threads when constification patches were submitted, you'll realize that different developers expected them in different (and conflicting) ways (broken up by subsystem or maintainer or directory or structure type, etc), putting a huge burden on the producer side as if creating the patches wasn't consuming enough time already. then there's an issue with enforcing policy as well, seemingly no one takes constification or checkpatch.pl seriously enough as i keep seeing patches go in all the time that simply don't bother to constify structures.
all in all, while patching the source code is surely a noble goal, my and other people's lives are too short for it...
Posted Oct 7, 2011 6:26 UTC (Fri) by Lionel_Debroux (subscriber, #30014)
I myself gave up trying to push upstream the constification of snd_pcm_ops, which represented ~10% of the size of the constification part of the PaX patch, because there were so many instances to check...
Could the plugin be modified to print warnings about pointers to non-const instances being passed to functions that take pointers to const instances ? (if enabled by some argument to the plugin, because upon introduction of that feature and for quite some time afterwards, there will be thousands of occurrences...)
Finding functions that take pointers to mutable instances while they could take pointers to const instances would be harder, wouldn't it be ?
Posted Oct 7, 2011 8:22 UTC (Fri) by PaXTeam (guest, #24616)
it's surely possible but i'm not sure this is what you really want as such typecasts are allowed by C and extensively used everywhere. now i assume you'd really want this detection for ops structures only in which case the problem is how the plugin would know of them (i assume you wouldn't want to use the current 'constify by default' approach). you could make use of the do_const attribute (without calling constify_type) then in the FINISH_TYPE callback you can check whether the pointed to structure is already const or not (and has the do_const attr, although on second thought, i forget now whether attrs are carried with types or have to be acted upon in the attr callback).
with all that said, i would not go the 'check the pointer' route but rather i'd constify the variable instances instead (but not function arguments) and then rely on gcc's existing warning system (if you take the address of a const object to initialize a ptr to the non-const type then you'll get a warning).
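The "constify the instances and lean on the compiler" route described above can be sketched in plain C. This is only an illustration: `struct demo_ops`, `demo_run()` and the other names are hypothetical and do not come from the kernel or from PaX.

```c
#include <stddef.h>

/* Hypothetical ops structure: nothing but function pointers. */
struct demo_ops {
    int (*open)(int id);
    int (*close)(int id);
};

static int demo_open(int id)  { return id + 1; }
static int demo_close(int id) { return id - 1; }

/* The instance itself is const, so it can be placed in read-only data. */
static const struct demo_ops demo_default_ops = {
    .open  = demo_open,
    .close = demo_close,
};

/* Code that only calls through the table takes a pointer to const. */
int demo_run(const struct demo_ops *ops, int id)
{
    return ops->close(ops->open(id));
}

const struct demo_ops *demo_get_ops(void)
{
    return &demo_default_ops;
}

/*
 * Uncommenting the next line makes GCC warn that the initialization
 * discards the 'const' qualifier from the pointer target type:
 *
 *     static struct demo_ops *writable = &demo_default_ops;
 */
```

Because the diagnostic comes from the compiler's ordinary qualifier checking, no extra pointer analysis is needed to find code that wants a writable view of a constified instance.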
Posted Oct 7, 2011 15:24 UTC (Fri) by vonbrand (subscriber, #4458)
That there are lots of places that would require patching, and that the patch covering all of the different variables is large, isn't reason enough not to send patches fixing it piecewise upstream. Presumably there are reasons for not accepting said patches (probably much lessened today by using git).
Posted Oct 6, 2011 10:46 UTC (Thu) by PaXTeam (guest, #24616)
1. the canonical source code is in PaX itself so you can get the plugins from where you'd get PaX normally (grsec gets updated too eventually, but given how new this route is for all of us, spender's more conservative and doesn't immediately take everything into grsec). note that the plugins are not readily usable outside of the PaX kernel as they rely on both the build infrastructure i added to the kernel and some configuration as well.
2. as of last night, there're actually 5 plugins released (must be the late wednesday release syndrome ;) and a few more in the making.
3. the plugins in order of appearance:
3.1 the 'stackleak' plugin augments a PaX feature of the same name in that it provides an estimate about how deep the current syscall used the kernel stack (so that before returning to userland the used portion can be cleared). this feature (along with a few other changes) was written in response to a particular exploit technique published earlier this year.
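The idea of tracking how deep the stack went and clearing only that portion can be modelled in user space. The sketch below is not the PaX implementation; `struct fake_stack`, `track_stack()` and `erase_used()` are hypothetical names, and the "stack" is just a byte array that grows downward.

```c
#include <stddef.h>
#include <string.h>

#define STACK_SIZE 256

/* Toy model of a kernel stack that grows toward lower indices. */
struct fake_stack {
    unsigned char bytes[STACK_SIZE];
    size_t lowest_used;   /* deepest stack position seen so far */
};

void stack_init(struct fake_stack *s)
{
    memset(s->bytes, 0, sizeof(s->bytes));
    s->lowest_used = STACK_SIZE;   /* nothing used yet */
}

/* What compiler-inserted instrumentation might call with the current depth. */
void track_stack(struct fake_stack *s, size_t sp)
{
    if (sp < s->lowest_used)
        s->lowest_used = sp;
}

/* Before "returning to userland": clear only the used part, then reset. */
size_t erase_used(struct fake_stack *s)
{
    size_t used = STACK_SIZE - s->lowest_used;

    memset(s->bytes + s->lowest_used, 0, used);
    s->lowest_used = STACK_SIZE;
    return used;   /* number of bytes cleared */
}
```

Clearing only the recorded range, rather than the whole stack, is what makes a cheap depth estimate worth collecting.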
3.2 the constify plugin makes ops structures and those marked with the (newly introduced) do_const attribute read-only at compile time and consequently at runtime as well (non-static allocations will be flagged by the compiler and the source has to be patched to use a writable type marked with the (newly introduced, again) no_const attribute).
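To make the effect concrete, here is a sketch of what the source looks like from the programmer's side. It assumes a plain compiler without the plugin, so `__no_const` is defined away as an empty macro and the `const` that the plugin would add implicitly is written by hand; all type and function names are hypothetical.

```c
#include <stddef.h>

/* Assumption: outside a PaX tree the attribute does not exist, so stub it. */
#ifndef __no_const
#define __no_const
#endif

/* Only function pointers: the kind of type the plugin constifies by itself. */
struct file_like_ops {
    long (*read)(long pos);
    long (*write)(long pos);
};

/* Function pointers that really are reassigned at run time: opt out. */
struct hook_table {
    long (*hook)(long arg);
} __no_const;

static long ro_read(long pos)   { return pos; }
static long ro_write(long pos)  { return -pos; }
static long double_it(long arg) { return 2 * arg; }

/* Hand-written 'const' stands in for what the plugin would apply. */
static const struct file_like_ops ro_ops = {
    .read  = ro_read,
    .write = ro_write,
};

static struct hook_table hooks;   /* stays writable */

long install_and_call(long arg)
{
    hooks.hook = double_it;       /* fine: the type is marked __no_const */
    /* ro_ops.read = ro_write;       would be rejected: read-only object */
    return hooks.hook(ro_ops.read(arg)) + ro_ops.write(arg);
}
```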
3.3 the kallocstat plugin (must be enabled with 'make CONFIG_KALLOCSTAT_PLUGIN=y') will emit statistics about the size argument of *alloc* functions (see the plugin for the whole list, it's not just kmalloc) if the given size is a compile time constant. the reason i wrote this plugin is because i was interested in the actual allocation size distribution vs. that of the kmalloc-* slab sizes. this in turn would enable one to adjust both the slab sizes and to fix some allocation sites to reduce internal fragmentation and waste. some excerpts from the histogram of a 3.0.4-i386-allyesconfig-nodebug kernel ('allocation size' 'call site count'):
16 503
as you can see, there seems to be room for improvement.
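To see why sizes just above a power of two matter, the rounding can be sketched as follows. This assumes, for illustration only, that allocations are served from power-of-two caches starting at 8 bytes; the real kmalloc-* caches differ in detail, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: smallest cache is 8 bytes and every cache is a power of two. */
#define MIN_CACHE_SIZE 8u

/* Smallest power-of-two cache that can hold 'request' bytes. */
size_t cache_size_for(size_t request)
{
    size_t cache = MIN_CACHE_SIZE;

    /* The second condition keeps the shift from overflowing. */
    while (cache < request && cache <= SIZE_MAX / 2)
        cache <<= 1;
    return cache;
}

/* Bytes lost to internal fragmentation for one allocation of 'request'. */
size_t wasted_bytes(size_t request)
{
    return cache_size_for(request) - request;
}
```

Under these assumptions a 1025-byte request lands in a 2048-byte cache and wastes 1023 bytes, while a 1024-byte request wastes nothing, which is why call sites asking for sizes such as 33, 65 or 1025 stand out in a histogram like this.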
3.4 the kernexec plugin augments an old feature of the same name in PaX. it's not AMD (CPU) specific, rather it's amd64 (arch) specific. the short story is that on the i386 arch KERNEXEC does not only enforce non-executable pages for the kernel's side of the address space but also for userland (as in, the kernel won't be able to execute code from executable userland pages, SMEP in future CPUs will achieve the same). this was possible due to the use of segmentation, something not available in 64 bit mode so in this regard the amd64 version of KERNEXEC was always weaker, at least until i implemented this missing sub-feature as part of UDEREF/amd64. the problem with that approach is its non-negligible performance impact so for those not wanting the whole UDEREF experience, i wrote this plugin. it forces function pointers to point into the kernel's part of the address space, therefore effectively preventing executable userland pages from actually being executed from kernel code (and all this at a much lower performance impact than UDEREF of course). btw, the 'fairly sophisticated tweak' is a simple 'btsq $63,(%rsp)' before every 'retq' and something equivalent before every indirect call ;).
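The effect of setting bit 63 can be checked with a few lines of arithmetic. The sketch below assumes 48-bit virtual addresses, where an address is canonical only if bits 63 through 47 are all equal; the function names are hypothetical and this models the address math only, not the plugin.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same effect on the value as 'btsq $63' has on the stored address. */
uint64_t force_kernel_half(uint64_t target)
{
    return target | (UINT64_C(1) << 63);
}

/* Assumption: 48-bit virtual addresses, so bits 63..47 must all match. */
bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* the upper 17 bits */

    return top == 0 || top == 0x1FFFF;
}
```

A kernel address already has bit 63 set and passes through unchanged. A userland address such as 0x400000 becomes 0x8000000000400000, which is non-canonical under this assumption, so transferring control to it faults instead of executing userland code.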
3.5 the checker plugin is the latest addition and as its name says, it may one day cover some/many things that sparse and checkpatch do. for now it's a PoC to demonstrate the use of the new address space support in gcc 4.6+ (as with the kallocstat plugin, it must be explicitly enabled with 'make CONFIG_CHECKER_PLUGIN=y'). note that i didn't actually patch the kernel to add all the missing annotations (and only __user is enabled for now, __iomem and the rest are put into the generic address space), so expect the compiler to error out frequently until someone fixes everything properly.
3.6 the soon-to-be-released intoverflow plugin will instrument all call sites where one or more argument is used as some kind of size (think *alloc and copy*user, but since the plugin adds a new function attribute, anything can be instrumented) and whose computation could have suffered from integer overflow/truncation - the runtime checks will detect such issues and prevent the incorrectly computed size from being used. you can find a beta (and for now, somewhat buggy ;) version of it at http://grsecurity.net/~ephox/overflow_plugin/ (note its license, since we are aware of how this one could actually be useful in userland as well, the license is not GPLv2 only but v2+).
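For comparison, this is roughly what a hand-written check of a size computation looks like. It is a sketch of the manual equivalent, not of the plugin's instrumentation: it relies on `__builtin_mul_overflow`, which needs GCC 5 or later (or Clang), and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Compute count * elem_size, reporting failure instead of wrapping around. */
bool checked_array_bytes(size_t count, size_t elem_size, size_t *bytes)
{
    /* The builtin returns true when the multiplication overflowed. */
    return !__builtin_mul_overflow(count, elem_size, bytes);
}

/* Refuse to allocate when the size computation cannot be trusted. */
void *alloc_array(size_t count, size_t elem_size)
{
    size_t bytes;

    if (!checked_array_bytes(count, elem_size, &bytes))
        return NULL;
    return malloc(bytes);
}
```

Writing this at every allocation and copy site by hand is exactly the kind of repetitive work that automatic instrumentation is meant to take over.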
3.7 the to-do list of future plugins just keeps growing, so without detailed explanations: generic ret2libc prevention (think of an actually sophisticated version of the kernexec plugin ;), free'd ptr sanitization to detect use-after-free problems and also infoleaks, forced structure gap initialization to eliminate infoleaks to userland, etc.
4. about licensing: i'm not sure if the situation is as amusing as Jake seems to think :). consider that the GPLv2 and GPLv3 are not compatible licenses and the kernel as a whole is GPLv2 so it cannot have GPLv3 parts in it (and even if the plugins are userland code, they can't really be argued to be mere aggregation for distribution purposes, they and their build system actively integrate with the kernel). that leaves us with licenses that are compatible with both, including GPLv2+ except the plugin versions distributed with the kernel would have to choose the GPLv2 part of the license and therefore changes made outside of the kernel (whose authors may choose the GPLv3) could not be reincorporated into the kernel's versions of the plugin (at least not without going through the explicit relicensing dance). the few other options left are some versions of the LGPL and BSD/MIT, none of which seems appealing to the kernel itself, and certainly not to me so i went with GPLv2 for now (note that not all the above mentioned plugins are mine, i'm just speaking of my code here). with all that said, if/when compiler plugins become part of upstream, the kernel devs will have to make some policy decisions regarding the acceptable licenses.
Posted Oct 6, 2011 11:20 UTC (Thu) by PaXTeam (guest, #24616)
Posted Oct 6, 2011 13:13 UTC (Thu) by SEJeff (guest, #51588)
Posted Oct 6, 2011 17:34 UTC (Thu) by mjw (subscriber, #16740)
What is the canonical download/repository URL of PaX? I found http://pax.grsecurity.net/ but that seems to not have been updated for some years.
Posted Oct 6, 2011 18:49 UTC (Thu) by Lionel_Debroux (subscriber, #30014)
Posted Oct 6, 2011 19:06 UTC (Thu) by PaXTeam (guest, #24616)
Posted Oct 6, 2011 13:24 UTC (Thu) by eliezert (subscriber, #35757)
Posted Oct 6, 2011 19:01 UTC (Thu) by ortalo (guest, #4654)
Posted Oct 13, 2011 10:24 UTC (Thu) by callegar (guest, #16148)
This may have different implications depending on whether kernel compilation with PaX plugins gets widespread or not.
If it gets widespread, the kernel might end up depending on PaX semantics, leading to a kernel written in PaX_C rather than in C.
If it does not get widespread, I guess there are going to be issues in managing bug reports about kernels compiled with PaX.
Posted Oct 13, 2011 18:55 UTC (Thu) by vonbrand (subscriber, #4458)
What I worry about is that PaX_C ends up being an opaque language, in which all sort of magic happens behind the scenes ("a struct full of function pointers gets to be const, unless..."). The central virtue of C (instead of, say, C++) is its transparency, which has been derided as "high-level assembly language": What you write is translated rather straightforwardly into what gets executed. Sure, C has its grave faults, extra tools for checking what you write are a must (thankfully the days of
Posted Oct 14, 2011 21:51 UTC (Fri) by PaXTeam (guest, #24616)
as for C being a "high-level assembly language" and "is translated rather straightforwardly into what gets executed", it's never been true, one needs to look no further than all the undefined behaviours the current linux code triggers.
Posted Oct 16, 2011 4:03 UTC (Sun) by quotemstr (subscriber, #45331)
The compiler plugin source being available is no defense: INTERCAL sources are available too, but this fact doesn't make INTERCAL comprehensible.
Posted Oct 16, 2011 11:06 UTC (Sun) by PaXTeam (guest, #24616)
given that i don't even try (lately that is, in previous years i got many bugfixes in without much hassle), i can't see how you managed to draw that conclusion ;).
> The OP is in no way spreading FUD, and claiming that he is reflects more on you than it does him.
and claiming that he doesn't reflects more on you than it does on me? can we skip the silly rhetoric please?
> The OP's point, which is that semantics differing markedly from ordinary C make code less readable, is perfectly valid.
which is what i called FUD. the only semantical change we can talk about in this context is the forced constification of certain types and variables, the exact details of which i described above (did you read them?) and can also be learned from the source code of the constify plugin. this change in semantics, surprise surprise, does not make *any* change to the kernel source code, therefore it is as readable as it is without using the plugin.
> When code becomes opaque, it's not "your own problem":
you're misquoting me. i made that comment on his unwillingness to take a look at the plugin source code where he could have learned what it does. similarly how you take a look at the kernel source code if you want to learn what it does. IOW, laziness doesn't an argument make.
> it's a problem for the entire community, increasing barriers to entry
why does a constification plugin increase the barriers to entry?
> and ongoing maintenance costs for everyone.
and what's the maintenance cost for everyone by using a constification plugin? i know its cost on me (having to check the code regularly for violations of the assumptions the plugin's based on) but a cost for *everyone*? what would that be?
also when evaluating a change one has to look at both sides of the coin, a.k.a. cost/benefit analysis. why did neither of you address the benefit side? don't see any? too little to be worth? any reasoning one can discuss?
> It would behoove you to actually address this point.
the ball's in your court now ;).
Posted Oct 16, 2011 19:14 UTC (Sun) by vonbrand (subscriber, #4458)
There is a semantics change, and you say yourself you have to check regularly if any of the assumptions of your behind-the-scenes changes get violated and fix the plugin accordingly. See, that is exactly the kinds of changes that the random kernel hacker won't be able to do by herself (and she will probably be left scratching her head when perfectly sane looking C doesn't compile, or Oopses inexplicably, or changes plainly stated in the source just don't work). The cost for you is probably much larger than for everybody else, but that doesn't mean that the cost for others doesn't exist. As also stated, this extends the source of the kernel from GCC-C to PaX-C + GCC-plugins, and that means there have to be people familiar enough with that combination to keep it running (What has been called "the bus test": What happens to the kernel if you get run over by a bus?). A cost/benefit analysis would say that the benefit is slim to none (if there was a huge benefit, the changes would presumably have been done or accepted by the regular kernel hackers; please do spare me your conspiracy theories); costs include a specially spiked compiler (always a risk) plus the increasingly complex and opaque PaX-C language and GCC plugins as source; run-of-the-mill competent C programming skills aren't enough to pick up kernel hacking anymore. Sounds like a net loss today, and getting worse as time passes. And at least for constification there are perfectly sane solutions using vanilla C, so there is no real need for this circuitous route, and not much of a justification either.
Posted Oct 17, 2011 21:45 UTC (Mon) by PaXTeam (guest, #24616)
actually no, we don't have to fix the plugin, we have to 'fix' the kernel source rather. we made a conscious design decision to not bury non-constifiable type/variable names in the plugin code and implemented two attributes to give explicit control instead (with the default being 'constify unless marked' in PaX, but it's easy to flip it around to become 'constify if marked').
> See, that is exactly the kinds of changes that the random kernel hacker won't be able to do by herself [...]
you mean, upon seeing a clear compiler error about assigning to a read-only variable one won't be able to add a simple __no_const attribute to the affected type? this is the same effort when one has to remove a const qualifier for the same error message or to add annotations for sparse, etc. it's not a particularly hard to learn skill, maybe you want to give it a try just to see it for yourself instead of taking my word for it?
> and she will probably be left scratching her head when perfectly sane looking C doesn't compile,
the compiler error message is very clear about what goes wrong ;), i never had a problem with figuring out what to 'fix', this gcc behaviour induced by the plugin was a conscious choice actually so that we can let the compiler's existing infrastructure figure out where the 'bad' code is (actually, the quotes are not exactly justified, it did find bad code).
> or Oopses inexplicably,
we're getting into FUD territory. how would the constify plugin cause an oops?
> or changes plainly stated in the source just don't work
what do you mean here?
> The cost for you is probably much larger than for everybody else [...]
yes, it's an insane amount of time. 15 mins of compiling an allyesconfig/allmodconfig kernel and say another hour to filter out and fix the newly introduced problems. every 3 months. i wish all other kernel changes cost this much to fix/work around ;). and if the plugin was part of the normal development process then this cost would be a minute per problem per developer (which is seemingly an order of magnitude more effort than people invest into running and acting on checkpatch). and if the default behaviour were flipped, it'd be back to the kernel janitor folks to add do_const every now and then to the deserving types.
> As also stated, this extends the source of the kernel from GCC-C to PaX-C + GCC-plugins,
if only there was such a thing as GCC-C ;). but there isn't, the kernel isn't written to any particular C standard, not even any GNU C specific extension or implementation (ever tried -O0?), it's some mix of these.
constifying ops structures is janitorial work, it's something that a programmer should have done ("if you don't mean this variable to be actually writable then don't leave it writable") but didn't, nor did anyone else bother with the checkpatch output, etc.
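> and that means there have to be people familiar enough with that combination to keep it running (What has been called "the bus test": What happens to the kernel if you get run over by a bus?).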
as with any new thing, one needs to learn the details of how gcc plugins work, not unlike how people had to learn the intricacies of SMP, etc along the way. i trust that more and more people will find useful things to do with gcc plugins and this lack of knowledge will disappear as it did with many other things.
Posted Oct 26, 2011 23:54 UTC (Wed) by nix (subscriber, #2304)
Posted Oct 14, 2011 21:43 UTC (Fri) by PaXTeam (guest, #24616)
now obviously not all flavours of C are created equal, so one has to carefully evaluate what's worth using and what isn't. for the PaX flavours the defining goal is always something to do with "make the generated code more secure, even if the original source code wasn't written with security in mind". whether it's something you or anyone else values is of course not up to me to decide, but i did decide that it was worth for me (and many others) to pursue this route.
as for bugreports, we've always handled PaX/grsec bugreports ourselves and directed users upstream only when we could determine that the given problem existed there as well.
solving the "let's make all constifiable variables const" problem is not possible with patching due to the sheer amount of patching needed.
I remember reading about this on LWN but I never quite believed it. A patch to add the 'const' keyword to some code is not hard to merge manually, even if the code has diverged a lot in the meantime. But then, the sheer size of the Linux codebase may make even a simple change unmanageable - I will defer to your experience.
Some constifications performed by the PaX patch were possible only due to other changes in PaX - for example, ata_port_operations, which is one of the most widely used function pointer structs in the kernel.
For instance, this could have helped prevent mutable instances of backlight_ops from being added after backlight_device_register() was modified to take a "const struct backlight_ops *ops" argument...
17 2
18 10
32 351
33 4
34 1
36 293
64 255
65 3
66 12
68 123
128 123
129 4
130 4
132 62
133 1
256 211
257 1
258 2
259 3
260 9
512 200
513 4
514 2
516 16
517 4
1024 157
1025 13
1026 8
1028 13
1032 13
2048 122
2049 2
2052 6
2056 2
4096 295
4098 2
4100 3
4104 1
8192 44
8195 2
8196 2
8200 1
16384 32
16392 1
https://grsecurity.net/test/pax-linux-3.0.4-test32.patch
https://grsecurity.net/test/pax-linux-2.6.32.46-test124.p...
lint(1) are past, but the kernel has its own sparse checker now). That doesn't make an almost-but-not-quite-C language a good idea, least of all for the kernel.
While nonstandard C mechanics brings benefits, source-to-source transformation and other techniques can yield equivalent results without, in effect, forking the C language.
The OP's point, which is that semantics differing markedly from ordinary
C make code less readable, is perfectly valid.