
Please educate a curious cat

Posted Aug 16, 2007 5:27 UTC (Thu) by felixfix (subscriber, #242)
Parent article: Exploiting races in system call wrappers

I understand what is going on; a pointer or some other piece of user data is changed by a user program, in a different thread probably, between validation and use.

I haven't written this kind of code; the last OS work I did passed all syscall parameters in registers. But I am a bit confused. Wouldn't it be very simple to avoid these race conditions by copying the user data to kernel memory before validating? Obviously this wouldn't work with the infamous setuid switcheroo, but for syscall parameters, it would seem to work very well. The only case I can think of to make it difficult would be where the user data in question is too large for easy copying to kernel memory.
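The copy-before-validate idea can be sketched as a toy user-space model, in Python rather than kernel C. Every name here (`check_then_use`, `copy_then_check`, the `between` hook) is hypothetical; the hook stands in for a second thread that rewrites the buffer at the worst possible moment, which makes the race deterministic.

```python
# Toy model: a bytearray plays the role of user memory that another
# thread can rewrite at any time; bytes() plays the role of a private
# kernel copy. All names are illustrative, not real kernel interfaces.

def is_allowed(path: bytes) -> bool:
    """Stand-in policy check: only paths under /tmp/ are permitted."""
    return path.startswith(b"/tmp/")


def check_then_use(user_buf: bytearray, between=lambda: None) -> bytes:
    """Racy: validates user memory, then reads it again to use it."""
    if not is_allowed(bytes(user_buf)):   # first fetch (validation)
        raise PermissionError("denied")
    between()                             # window where the attacker runs
    return bytes(user_buf)                # second fetch (use)


def copy_then_check(user_buf: bytearray, between=lambda: None) -> bytes:
    """Safer: one fetch into a private copy, then validate and use the copy."""
    kernel_copy = bytes(user_buf)         # the only fetch
    if not is_allowed(kernel_copy):
        raise PermissionError("denied")
    between()                             # attacker can still write user_buf
    return kernel_copy                    # but the copy is unaffected


def make_attack(buf: bytearray):
    """Return a hook that swaps the path after validation has passed."""
    def swap():
        buf[:] = b"/etc/shadow"
    return swap
```

Starting from a buffer holding `/tmp/scratch`, `check_then_use(buf, make_attack(buf))` hands back `/etc/shadow` even though validation only ever saw the `/tmp/` path, while `copy_then_check` with the same hook returns the path that was validated.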



Please educate a curious cat

Posted Aug 16, 2007 13:27 UTC (Thu) by kleptog (subscriber, #1183)

I think the point is that system call wrapping was supposed to be cheap and quick, hence the wish to avoid copying the data twice. The wrapper gets the data exactly the same way as the system call does.

What you suggest (copying the data, then checking it) is, I think, pretty much what the LSMs do. Rather than just wrapping the system call, an LSM hook gets called *after* the kernel has copied the data to kernel space. This is safer, but not as easy to write...

Please educate a curious cat

Posted Aug 16, 2007 13:52 UTC (Thu) by kilpatds (subscriber, #29339)

As one of the authors of GSWTK.... (We knew about the issue. GSWTK was
a research project, not intended for production use)

We stopped working on it before the vsyscall method was adopted, so
please limit these comments to the software interrupt syscall method.

Someone has to copy all "complex" data into kernel space before
operating on it. In Linux, that is done by the sys_* functions that
implement the system calls.

GSWTK wrappers replace entries in the system call vector, so they are
called before the sys_* function. The wrapper therefore has to copy in
the data to analyze it, but it still has to hand the original system
call a pointer that the call can copy from: that is, a pointer into
user space.
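
That constraint can be illustrated with a toy user-space model (Python, with hypothetical names throughout; `sys_open` here is a stand-in, not the kernel function, and this is not GSWTK code). The wrapper validates its own copy, but the wrapped call performs its own copy-in from the same user buffer, so the value that was checked is not necessarily the value that gets used.

```python
# Toy model of a wrapper placed in front of a system call.
# user_mem maps an address to a bytearray, standing in for user space.

def copy_from_user(user_mem: dict, addr: int) -> bytes:
    """Fetch a private snapshot of the user buffer at addr."""
    return bytes(user_mem[addr])


def sys_open(user_mem: dict, addr: int) -> bytes:
    """Stand-in system call: does its own copy-in from user memory."""
    path = copy_from_user(user_mem, addr)
    return b"opened " + path


def wrapped_open(user_mem: dict, addr: int, racer=lambda: None) -> bytes:
    """Wrapper: checks its own copy, then must pass the user address on."""
    checked = copy_from_user(user_mem, addr)      # fetch 1: wrapper's copy
    if not checked.startswith(b"/tmp/"):
        raise PermissionError("denied by wrapper")
    racer()                                       # another thread runs here
    return sys_open(user_mem, addr)               # fetch 2: syscall's copy
```

If `racer` rewrites the buffer between the two fetches, the wrapper approves one path and `sys_open` opens another. Nothing the wrapper does with its own copy helps, because the copy is never what the system call consumes.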

We could copy the data to somewhere else in user space (a page we
allocate in the process's address space), make that page not writable
by the program, and pass that pointer in. But this is just a band-aid.
The process could reset the protection flags on the page and change the
data. It just shrinks the race window. It doesn't fix the fundamental
race.

If you were a kernel developer who wanted to support wrapper-like
interposition, you could add a layer to enable it. The base system call
would copy the data in, then call the function that actually implements
the logic. This would provide an alternate interposition point. But it
would slow everyone else down. I can't imagine such a feature making it
in.
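
A rough sketch of that layering, as a user-space Python model with hypothetical names (`syscall_entry`, `do_open`, `hooks`); it is not an interface that any kernel is known to expose in this form. The entry point does the single copy-in, interposition hooks see the kernel copy, and the logic function never touches user memory.

```python
# Toy model: split a system call into copy-in, hooks, and logic.
from typing import Callable, List

Hook = Callable[[bytes], None]   # a hook raises an exception to veto the call

hooks: List[Hook] = []           # interposition point (hypothetical)


def do_open(path: bytes) -> bytes:
    """Logic layer: operates only on data already in kernel memory."""
    return b"opened " + path


def syscall_entry(user_buf: bytearray, racer=lambda: None) -> bytes:
    """Base layer: copy in once, run hooks on the copy, then the logic."""
    kernel_path = bytes(user_buf)        # single fetch from user memory
    for hook in hooks:
        hook(kernel_path)                # hooks inspect the kernel copy
    racer()                              # late user-space writes are harmless
    return do_open(kernel_path)


def tmp_only(path: bytes) -> None:
    """Example policy hook: reject anything outside /tmp/."""
    if not path.startswith(b"/tmp/"):
        raise PermissionError("denied by hook")


hooks.append(tmp_only)
```

Here a racing write to `user_buf` after the copy-in changes nothing that the hook or `do_open` sees. The loop over `hooks` runs on every call, hooked or not, which is the kind of overhead the comment worries about.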

Doug

Correct - the approaches work fine when race conditions are eliminated

Posted Aug 16, 2007 14:35 UTC (Thu) by dwheeler (guest, #1216)

Correct; the attacks ONLY work if the design permits race conditions. The notion that user-space data will stay unchanged during a kernel call is untrue in practically all of today's OSes, and this attack worked in the 1960s and 1970s too (it's well-documented). The solutions are well-documented, too: eliminate the race condition. The "easy" way is to copy all data into the kernel, and then use only that protected copy. The trick is to get good performance as well.

Correct - the approaches work fine when race conditions are eliminated

Posted Aug 23, 2007 7:13 UTC (Thu) by Cato (subscriber, #7643)

Indeed - this model of 'copy first, then check' was known as 'touch once programming' over 20 years ago, so there's little excuse for repeating this mistake. Perhaps what's needed is smarter static analysis tools that can point out this sort of error?

Getting good performance is a challenge, but with the speed of modern CPUs I'd rather spend some CPU cycles on copying than spend many administrator hours responding to a security breach.
