What NaCl seems to do is restrict the instruction set so that code can be verified, at the expense of requiring a special compiler.
However, honestly, that seems an absolutely unnecessary complication, since it's possible to just run any arbitrary binary code in a separate process, isolated by OS functionality that only allows a very limited set of system calls (e.g. seccomp in Linux).
Windows might not provide a similar system call limitation feature, but it should be very easy to implement in a Windows kernel driver.
That would increase performance, avoid the need for special compilers, and allow trivial portability to any CPU with an MMU and built-in protection features.
In other words, IMHO this technology is horribly complicated for no real reason (hardware already offers the same security, and does it much better), although it's somewhat academically interesting to see how instruction sets can be restricted to make them verifiable.
Posted Aug 5, 2011 19:56 UTC (Fri) by elanthis (guest, #6227)
> However, honestly, that seems an absolutely unnecessary complication, since it's possible to just run any arbitrary binary code in a separate process, isolated by OS functionality that only allows a very limited set of system calls (e.g. seccomp in Linux).
Which is exactly what Chrome already does for its individual tab processes, so the Google folks are quite aware of this facility. They're going for more than just sandboxing of an individual process with NaCl, though.
> Windows might not provide a similar system call limitation feature, but it should be very easy to implement in a Windows kernel driver.
From all reports I've heard, Windows' built-in sandboxing mechanisms are actually better than Linux's. (That is, at least when comparing Vista/7 to the least common denominator of what the popular Linux distributions provide; SELinux and such likely blow Windows out of the water in this area.)
> That would increase performance, avoid the need for special compilers, and allow trivial portability to any CPU with an MMU and built-in protection features.
I'm not sure at all that it would help performance compared to what Google is trying to do. One of the goals of what NaCl does with its opcode sanity checking is to allow trusted and untrusted code to exist inside the same process. That allows for trusted library code to be called by the untrusted application code without context switches, without IPC, without anything other than a simple function call.
For example, for something like a game, you probably don't want your D3D/GL calls to be going over an IPC mechanism, especially when dealing with streaming buffer uploads and the like, as that would make NaCl useless for high-end games (or other graphically rich real-time interactive applications). So you want the actual NaCl process to communicate directly with the GPU driver, but you don't want the untrusted code to be able to do that itself. You also don't want the untrusted code to be able to access or corrupt the memory owned by trusted parts of the same process, as that would subvert the sandbox. These goals require the opcode verification (and the accompanying x86 machine code restrictions) that NaCl enforces.
Google's Native Client forges ahead
Posted Aug 6, 2011 14:05 UTC (Sat) by slashdot (guest, #22014)
Why not just use shared memory for anything performance critical, such as data uploads to the GPU?
As for context switches, most modern CPUs are multicore, so you might not need any actual context switches at all (just some cacheline bouncing).
Hardware 3D already usually communicates to a remote GPU via a DMA-based FIFO and uploads, so having an additional mechanism (faster due to using shared memory instead of DMA) shouldn't be the end of the world.
I'm not sure whether this additional IPC overhead would be actually higher than the performance degradation imposed by limiting the instruction set (for example, memory accesses seem to have extra overhead due to that).
Of course, you could also in principle trust the OS to be secure, and run arbitrary code in a security context with limited privileges, but with access to the GPU and other useful stuff; unfortunately, the history of local root holes on all OSes (not to mention the graphics drivers...) makes this probably an unwise choice.
Posted Aug 6, 2011 21:38 UTC (Sat) by elanthis (guest, #6227)
> Why not just use shared memory for anything performance critical, such as data uploads to the GPU?
> Hardware 3D already usually communicates to a remote GPU via a DMA-based FIFO and uploads, so having an additional mechanism (faster due to using shared memory instead of DMA) shouldn't be the end of the world.
The FIFO is for the command queue, not large chunks of data like VBO uploads. There is no 'additional shared memory' mechanism, because such a thing doesn't make sense, nor would it be remotely safe if it did exist. The kernel DRI/DRM interfaces exist for a reason.
> As for context switches, most modern CPUs are multicore, so you might not need any actual context switches at all (just some cacheline bouncing).
You don't appear to understand how multi-core CPUs or multi-tasking operating systems work. Of course there is going to be a context-switch involved. What you're suggesting implies that the other core will have a process sitting there busy-waiting on an atomic, eating up 100% of the processing time on that core, just in case the sandboxed process possibly maybe wants to do something. That would be a ridiculously bad idea.
Any privileged process -- on another core or not -- is going to be blocked in a syscall waiting for an IPC message of some form, and calling a remote method on that privileged process from the sandboxed one will require OS context switches. A minimum of four of them in total, in fact. It would actually be faster to _not_ have the privileged process on another core due to the additional overhead of sharing data between cores, and if such a scheme were used the processor affinity facilities should be used to coerce both processes to be on the same core.
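To get a rough feel for the cost being argued about here, the following is a minimal sketch (POSIX only, and not how Chrome or NaCl actually do IPC) that compares a plain function call with the same "call" made to a second process blocked in read() on a pipe. The names `trusted_add_one`, `time_direct` and `time_ipc` are made up for illustration, and the numbers it prints depend entirely on the machine and the Python interpreter's own overhead.

```python
import os
import time


def trusted_add_one(x):
    # Stand-in for a "trusted" service routine.
    return (x + 1) & 0xFF


def time_direct(n):
    """Average seconds per call when the routine is an ordinary function call."""
    start = time.perf_counter()
    for i in range(n):
        trusted_add_one(i & 0xFF)
    return (time.perf_counter() - start) / n


def time_ipc(n):
    """Average seconds per call when the routine lives in another process.

    Returns (seconds_per_call, last_reply_byte).
    """
    req_r, req_w = os.pipe()   # requests: parent -> child
    rep_r, rep_w = os.pipe()   # replies:  child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: the "privileged" process, blocked in read() until asked.
        os.close(req_w)
        os.close(rep_r)
        while True:
            data = os.read(req_r, 1)
            if not data:       # parent closed the request pipe
                break
            os.write(rep_w, bytes([trusted_add_one(data[0])]))
        os._exit(0)
    # Parent: the "sandboxed" caller.
    os.close(req_r)
    os.close(rep_w)
    reply = b""
    start = time.perf_counter()
    for i in range(n):
        os.write(req_w, bytes([i & 0xFF]))
        reply = os.read(rep_r, 1)   # block until the other process answers
    per_call = (time.perf_counter() - start) / n
    os.close(req_w)
    os.waitpid(pid, 0)
    os.close(rep_r)
    return per_call, reply[0]


if __name__ == "__main__":
    n = 2000
    ipc, _ = time_ipc(n)
    print("direct call: %.2f us" % (time_direct(n) * 1e6))
    print("pipe IPC:    %.2f us" % (ipc * 1e6))
```

Each iteration of the IPC loop makes both sides block and wake up once, which is where the scheduler gets involved; the direct call never leaves the process.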
> I'm not sure whether this additional IPC overhead would be actually higher than the performance degradation imposed by limiting the instruction set (for example, memory accesses seem to have extra overhead due to that).
Memory accesses do not have extra overhead in the NaCl implementation. The segmented memory model is a core part of the x86 instruction set and is always active, even if generally all segments are set to 'contain' all of system memory. Using it to isolate memory is effectively free. The only reason it isn't normally used to isolate processes is that the CPU by itself doesn't stop a process from changing the segmentation configuration, so without a software arbiter that rejects programs using those instructions before they even start, it would not be effective protection.
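For readers unfamiliar with segmentation, here is a toy model of the check being discussed: every access is an offset into a segment, the hardware compares it against the segment limit, and only then adds the base. The `Segment` class and the base/limit values are invented for illustration and are not NaCl's actual layout; the sketch also says nothing about what the check costs on real hardware.

```python
class Segment:
    def __init__(self, base, limit):
        self.base = base      # linear address where the segment starts
        self.limit = limit    # highest valid offset within the segment

    def translate(self, offset, size=1):
        """Return the linear address for an access of `size` bytes at `offset`."""
        if offset < 0 or offset + size - 1 > self.limit:
            # Real hardware would raise a protection fault here.
            raise MemoryError("segment limit violation")
        return self.base + offset


# The usual "flat" setup: one segment covering the whole 32-bit space.
flat = Segment(base=0x00000000, limit=0xFFFFFFFF)

# A hypothetical sandbox segment: 256 MiB starting at 0x10000000.
sandbox = Segment(base=0x10000000, limit=0x0FFFFFFF)

print(hex(flat.translate(0xDEADBEEF)))   # any offset is accepted
print(hex(sandbox.translate(0x1000)))    # lands inside the sandbox region
try:
    sandbox.translate(0x10000000)        # one byte past the limit
except MemoryError as exc:
    print("rejected:", exc)
```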
> Of course, you could also in principle trust the OS to be secure, and run arbitrary code in a security context with limited privileges, but with access to the GPU and other useful stuff
Most operating systems do not actually allow you to set up a sandbox like this, Linux included (unless you make something like SELinux mandatory for your browser to work, which won't fly well with anyone but Fedora/RHEL users). Sandboxing processes is a relatively recent addition to the security toolbox (despite how obviously powerful it is) and most OSes haven't caught up to the needs of these techniques, yet, making frameworks like NaCl mandatory for now.
Again, Google's engineers know what they're talking about, and you seem to have some holes in your knowledge of these topics. Please just go read their documentation. It's very easy to find and quite easy to understand.
> the history of local root holes on all OSes (not to mention the graphics drivers...) makes this probably an unwise choice.
That logic implies that all security is worthless and we should just stop trying to protect anything, because all OSes have local root holes and hence cannot be protected at all. A more useful way to look at things would be that holes are likely going to be found, and they will get fixed, and life will move on and people will still be more secure (no, not absolutely secure, but 'more' is still better than 'less') by having sandboxed processes than they were without.
Posted Aug 7, 2011 7:30 UTC (Sun) by viro (subscriber, #7872)
a) there is a very good reason why everyone sets segments to maximal size and it's exactly the fact that this crap is *not* free. It's turned off as an optimisation when processor sees that limit is set to maximum.
b) on amd64 segment limits are not verified in 64bit mode. End of story.
c) segments can be changed only when you are running in ring 0, at which point the game is really over. You can switch between the segments present in GDT + your LDT, but that's it. That said, on anything that runs the Linux kernel you will have segments spanning the entire user address space in GDT, making the segment-based protection only as good as your code sanitizer. And x86 instruction set is not well-suited for analysis, to put it mildly; it's not RISC. Prohibiting jumps into the middle of instruction is nice, but how do you prohibit return into the same? And with that added into the mix, you can construct far ret as part of the immediate constant, bugger the stack frame, hit normal ret (which is going to be in the allowed set), "return" to that far ret and there you are - %cs:%eip is set to your data. Arbitrary jump to other code segment... You are still within the same process, of course, but the sandbox boundary is broken through. At the very least you can read any data anywhere in your process' address space, segmentation be damned.
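The "far ret as part of the immediate constant" trick can be shown with a toy decoder. This is only a sketch that knows four one-byte opcodes (the `OPCODES` table and `decode` helper are made up for illustration; a real x86 decoder is vastly larger), but it shows how the same bytes decode differently depending on where execution enters them.

```python
# Toy decoder for a handful of x86 opcodes: opcode byte -> (mnemonic, length).
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xCB: ("retf", 1),            # far return: pops both %eip and %cs
    0xB8: ("mov eax, imm32", 5),  # opcode byte plus a 4-byte immediate
}


def decode(code, start=0):
    """Decode `code` linearly from `start`, returning (offset, mnemonic) pairs."""
    out = []
    pos = start
    while pos < len(code):
        name, length = OPCODES[code[pos]]
        if pos + length > len(code):
            raise ValueError("truncated instruction at offset %d" % pos)
        out.append((pos, name))
        pos += length
    return out


# mov eax, 0x909090CB  (the immediate is stored little-endian: CB 90 90 90)
code = bytes([0xB8, 0xCB, 0x90, 0x90, 0x90])

print(decode(code, 0))  # the stream as the compiler meant it: a single mov
print(decode(code, 1))  # entered one byte later: retf followed by three nops
```

A verifier that only walks the stream from offset 0 never sees the `retf`, which is why control over where execution can land matters as much as the list of allowed opcodes.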
Posted Aug 8, 2011 3:04 UTC (Mon) by elanthis (guest, #6227)
> a) there is a very good reason why everyone sets segments to maximal size and it's exactly the fact that this crap is *not* free. It's turned off as an optimisation when processor sees that limit is set to maximum.
Have any references? All I can find when searching for performance of segmented memory in protected mode are a few papers on using it for efficient array bounds checking. :/ Not saying you're wrong, I'd just like to read more about it and I can't find anything useful.
> b) on amd64 segment limits are not verified in 64bit mode. End of story.
NaCl is 32-bit only, even on OSes/machines that support 64-bit mode, in no small part because the tricks employed on x86 depend on such details. The ARM port uses a different set of tricks, naturally.
> And x86 instruction set is not well-suited for analysis, to put it mildly; it's not RISC. Prohibiting jumps into the middle of instruction is nice, but how do you prohibit return into the same?
Since even kernel developers are apparently too lazy to even try to look this stuff up, let me answer your particular attack scenario: the RET instruction is also banned by the NaCl verifier (you are more than free to read the paper on how returns from functions are implemented, if you're wondering how it works). This is one of the reasons why a modified compiler is needed to produce binaries that work inside the NaCl sandbox.
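As a rough sketch of the idea, assuming the scheme described in the NaCl x86-32 paper (32-byte instruction bundles, and returns replaced by a pop followed by a masked indirect jump), the effect on a corrupted return address looks like this. The function name and the addresses are illustrative only, and the register named in the docstring is just an example.

```python
BUNDLE_SIZE = 32                          # bytes per bundle (assumed, per the paper)
MASK = 0xFFFFFFFF & ~(BUNDLE_SIZE - 1)    # 0xFFFFFFE0


def masked_return(stack):
    """Model of `pop %ecx; and $0xffffffe0, %ecx; jmp *%ecx`.

    `stack` is a list standing in for the untrusted stack; the popped value
    may be anything the untrusted code managed to write there.
    """
    target = stack.pop()
    return target & MASK


# An honest return address that is already bundle-aligned is unchanged.
print(hex(masked_return([0x00020040])))   # 0x20040

# A forged address pointing into the middle of an instruction gets
# rounded down to the start of its bundle.
print(hex(masked_return([0x0002005B])))   # 0x20040
```

The masking only helps because the verifier separately guarantees that every bundle boundary is the start of a validated instruction, so wherever a forged address lands, it lands on code that was checked.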
Here is their original paper on their x86 sandboxing; there is more information available to anyone who can bother to spend 30 seconds looking for it:
NaCl isn't for regular desktop apps. It's for smaller, more contained apps. It's for the kinds of things you can already do on the Web or in Flash, except that it allows native speed (or very very close to native, depending on whether you consider hardware-executed but notably non-optimal instructions to be "native", I suppose) and allows for the use of C/C++ code and libraries (I can have a 3D math library that doesn't suck donkey nuts like every last single vector library in every single language other than C, C++, and D does due to the overwhelming limitations of the academia-designed high-level languages; yay!). NaCl isn't intended to be used outside of a browser or for complex applications that couldn't reasonably be implemented and deployed on top of something like Flash (save for the speed).
Posted Aug 5, 2011 22:23 UTC (Fri) by dlang (✭ supporter ✭, #313)
You can't easily analyze a binary to make sure there are no invalid opcodes in it. The problem is that you can jump into the middle of data, or jump to an address that is not the start of an instruction, and the chip will start executing from there.
Posted Aug 6, 2011 10:28 UTC (Sat) by elanthis (guest, #6227)
That is why they only allow certain opcode patterns and disallow any that cannot be easily verified, and why they require a special compiler: it generates machine code that passes the verifier's requirements and implements the tricks needed to actually work in the sandboxed environment. Doing that takes significantly more knowledge of the various hardware architectures than you apparently think a team of Google's top engineers is capable of applying. They have written several in-depth papers on these tricks, implemented them fully in completely open source code, and had them working in real environments for quite a while now.
In particular on x86, they are using several different features of the architecture. One is the segmented memory model of x86, another is the ability to ban any code that calls the instructions to change segments, and yet another is a very tight control on where branches can be and where they can target. Non-writable code pages along with non-executable data pages ensure that the untrusted code cannot subvert the machine code verifier by modifying or creating machine code. Simple trampolines handle the code segment changes and stack pointer swaps necessary to call into and return from the trusted code.
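To make the "tight control on where branches can be and where they can target" concrete, here is a toy verifier over an already-decoded instruction list. It is a sketch under stated assumptions, not NaCl's real validator: the 32-byte bundle size follows the x86-32 paper, while the `BANNED` set, the tuple layout and the `verify` function are invented for illustration.

```python
BUNDLE_SIZE = 32
# Illustrative subset of instructions a sandbox verifier would refuse.
BANNED = {"ret", "retf", "int", "syscall", "mov_to_segment_register"}


def verify(instructions):
    """Check a pre-decoded instruction list against three toy rules.

    Each instruction is (offset, length, mnemonic, direct_target), where
    direct_target is None for anything that is not a direct branch.
    Returns a list of human-readable violations (empty means accepted).
    """
    starts = {offset for offset, _, _, _ in instructions}
    errors = []
    for offset, length, mnemonic, target in instructions:
        if mnemonic in BANNED:
            errors.append(f"{offset:#x}: banned instruction {mnemonic}")
        # No instruction may cross a bundle boundary, so every boundary
        # is guaranteed to be the start of an instruction.
        if offset // BUNDLE_SIZE != (offset + length - 1) // BUNDLE_SIZE:
            errors.append(f"{offset:#x}: instruction straddles a bundle boundary")
        # Direct branches may only land on known instruction starts.
        if target is not None and target not in starts:
            errors.append(f"{offset:#x}: branch target {target:#x} is not an instruction start")
    return errors


good = [(0, 5, "mov", None), (5, 2, "jmp", 0), (7, 1, "nop", None)]
bad = [(0, 5, "mov", None), (5, 2, "jmp", 1), (30, 5, "mov", None), (35, 1, "ret", None)]

print(verify(good))
for problem in verify(bad):
    print(problem)
```

The point of the design is that the checks are purely static and linear in the size of the code, which is what makes rejecting a binary before it ever runs practical.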
If you want more information, just go read their documentation and papers. It's all very accessible and easy to grok.