
Security in the 20-teens

By Jonathan Corbet
February 1, 2010
Recently, Google announced that its operations in China (and beyond) had been subject to sophisticated attacks, some of which were successful; a number of other companies have been attacked as well. The source of these attacks may never be proved, but it is widely assumed that they were carried out by government agencies. There are also allegations that the East Anglia email leak was a government-sponsored operation. While at LCA, your editor talked with a developer who has recently found himself at Google; according to this developer, incidents like these demonstrate that the security game has changed in significant ways, with implications that the community can ignore only at its peril.

Whenever one talks about security, one must do so in the context of a specific threat model: what are we trying to defend ourselves against? Different threat models lead to very different conclusions. For years, one of the most pressing threats has been script kiddies and others using well-known vulnerabilities to break into systems; initially these break-ins were mostly for fun, but, over time, these attackers have increasingly had commercial motivations. In response, Linux distributors have created reasonably secure-by-default installations and effective mechanisms for the distribution of updates. As a result, we are, by default, quite well defended against this class of attack when carried out remotely, and moderately well defended against canned local attacks.

Attackers with more determination and focus are harder to defend against; somebody who intends to break into a specific system in pursuit of a well-defined goal has a better chance of success. Chances are, only the most hardened of systems can stand up against focused attackers with local access. When these attackers are at the far end of a network connection, we still stand a reasonable chance of keeping them out.

Often, those concerned with security simply throw up their hands when confronted with the problem of defending a system against an attacker who is working with the resources available to national governments. Most of us assume that we'll not be confronted with such an attack, and that there's little that we could do about one if we were. When governmental attackers can obtain physical access, there probably is little to be done, but remote (foreign) governmental attackers may not be able to gain that sort of access.

What the attacks on Google (and others) tell us is that we've now entered an era where we need to be concerned about attacks from national governments. We have probably been in such an era for a while now, but the situation has become increasingly clear. It makes sense to think through the implications.

A look at updates from distributors shows that we still have a steady stream of vulnerabilities in image processing libraries, PDF viewers, Flash players, and more. Some of these problems (yet another PNG buffer overflow, say) appear to be relatively low-priority, but they shouldn't be. Media-based attacks can only become more common over time; it's easy to get a victim to look at a file or go to a specific web page. Properly targeted phishing (easily done by a national government) may be the method of choice for compromising specific systems for some time to come. Browsers, file viewers, and media players will play an unfortunate role in the compromise of many systems.

What may be even more worrisome, though, is the threat of back doors, trojan horses, or (perhaps most likely) subtle vulnerabilities inserted into our software development and distribution channels. This could happen at just about any stage in the chain.

On the development side, we like to think that code review would find deliberately coded security weaknesses. But consider this: kernel code tends to be reviewed more heavily than code in many other widely-used programs, and core kernel code gets more review than driver code. Yet none of that was able to prevent the vmsplice() vulnerability - caused by a beginner-level programming error - from getting into the mainline kernel. Many more subtle bugs are merged in every development cycle. We can't ever catch them all; what are our chances against a deliberately-inserted, carefully-hidden hole?

Source code management has gotten more robust in recent years; the widespread use of tools like git and mercurial effectively guarantees that an attempt to corrupt a repository somewhere will be detected. But that nice assumption only holds true for as long as one assumes that the hash algorithms used to identify commits are not subject to brute-force collisions. One should be careful about such assumptions when the computing resources of a national government can be brought to bear. We might still detect an attempt to exploit a hash collision - but our chances are not as good.

In any case, the software that ends up on our systems does not come directly from the source repositories; distributors apply changes of their own and build binary packages from that source. The building of packages is, one hopes, relatively robust; distributors have invested some significant resources into package signing and verification mechanisms. The Fedora and Red Hat intrusions show that this link in the chain is indeed subject to attack, but it is probably not one of the weakest links.

A weaker point may be the source trees found on developer laptops and the patches that those developers apply. A compromise of the right developer's system could render the entire signing mechanism moot; it will just sign code which has already been corrupted. Community distributions, which (presumably) have weaker controls, could be especially vulnerable to this attack vector. In that context, it's worth bearing in mind that distributions like Debian and Gentoo - at least - are extensively used in a number of sensitive environments. Enterprise distributions might be better defended against the injection of unwanted code, but the payback for the insertion of a hole into an enterprise distribution could be high. Users of community rebuilds of enterprise distributions (LWN being one of those) should bear in mind that they have added one more link to the chain of security that they depend on.

Then again, all of that may be unnecessary; perhaps ordinary bugs are enough to open our systems to sufficiently determined attackers. We certainly have no shortage of them. One assumes that no self-respecting, well-funded governmental operation would be without a list of undisclosed vulnerabilities close at hand. They have the resources to look for unknown bugs, to purchase the information from black-hat crackers, and to develop better static analysis tools than we have.

All told, it is a scary situation, one which requires that we rethink the security of our systems and processes from one end to the other. Otherwise we risk becoming increasingly vulnerable to well-funded attackers. We also risk misguided and destructive attempts to secure the net through heavy-handed regulation; see this ZDNet article for a somewhat confusing view of how that could come about.

The challenge is daunting, and it may be insurmountable. But, then, we as a community have overcome many other challenges that the world thought we would never get past, and the attacks seem destined to happen regardless of whether we try to improve our defenses. If we could achieve a higher level of security while preserving the openness of our community and the vitality of our development process, Linux would be even closer to World Domination than it is now. Even in the absence of other minor concerns - freedom, the preservation of fundamental civil rights, and the preservation of an open network, for example - this goal would be worth pursuing.



Security in the 20-teens

Posted Feb 1, 2010 18:28 UTC (Mon) by joey (subscriber, #328) [Link]

But that nice assumption only holds true for as long as one assumes that the hash algorithms used to identify commits are not subject to brute-force collisions. One should be careful about such assumptions when the computing resources of a national government can be brought to bear. We might still detect an attempt to exploit a hash collision - but our chances are not as good.
Of course there is at least one VCS that does not rely on hashing for security, and instead relies on gpg signatures. The question then becomes: is cracking a typical-length gpg key within the means of a government? Hmm... Hashing is in some ways *better*, because at least with a hash collision, some random colliding data is nearly certain to be needed, whereas if a gpg key is cracked, completely plausible commits could be made.

I outlined some ways that sha1 collisions could be used against git repositories here. The second attack mentioned there is not very useful to a government; it's useful for project members who want to attack a project and cover their tracks. The first attack could be more useful for a government. Perhaps a second git repo is not even needed; instead their great firewall could replace a file with a colliding version in passing.

Also, sha1 collisions don't need a government to exploit them. They're about at the level where a university can muster the equipment to generate a collision.

Bwahaha...

Posted Feb 2, 2010 15:50 UTC (Tue) by khim (subscriber, #9252) [Link]

Of course there is at least one VCS that does not rely on hashing for security, and instead relies on gpg signatures.

...which rely on hashing for speed: a typical GPG signature signs not the message itself but a hash of the message! This makes it potentially more vulnerable, not less. The rest of the message is a moot point.

Sure, it may be a good idea to use GPG signatures as a defense against other attack vectors, but to say that GPG signatures can be used as a defense against hash collisions... that is the height of folly.

Security in the 20-teens

Posted Feb 2, 2010 22:43 UTC (Tue) by djao (subscriber, #4263) [Link]

Of course there is at least one VCS that does not rely on hashing for security, and instead relies on gpg signatures.

Uh, what? GPG signatures themselves rely on hash functions. From the GPG manual:

A document's digital signature is the result of applying a hash function to the document.

Security in the 20-teens

Posted Feb 2, 2010 23:38 UTC (Tue) by njs (guest, #40338) [Link]

Even if GPG were signing the source code itself, rather than a hash of it, it would be unusable for the DVCS case, because the important feature of DVCS chained hashing is that the hash covers *the entire history*. No-one's going to hand GPG the entire history of their project (which easily reaches the terabyte range) on every commit.

Security in the 20-teens

Posted Feb 7, 2010 1:26 UTC (Sun) by vonbrand (subscriber, #4458) [Link]

You are mistaken. Git, for example, doesn't hash the whole repo each time I commit something; what is hashed as a commit is just the contents of an object containing pointers (as SHA-1 hashes) to its parents and to any file contents referenced. You can also GPG-sign a tag for added security.
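To make the chaining concrete, here is a rough Python sketch of git's object hashing. The "<type> <size>\0<body>" scheme is the one git really uses to name objects, but the tree contents, parent ID, and commit message below are invented for illustration, and real commits also carry author and committer lines.

    import hashlib

    def git_object_id(obj_type, body):
        # git hashes "<type> <size>\0<body>"; the SHA-1 result names the object
        header = ("%s %d\0" % (obj_type, len(body))).encode()
        return hashlib.sha1(header + body).hexdigest()

    tree_id = git_object_id("tree", b"...serialized directory listing...")
    parent_id = "a" * 40   # hypothetical ID of the previous commit

    commit_body = ("tree %s\nparent %s\n\nfix a bug\n" % (tree_id, parent_id)).encode()
    print(git_object_id("commit", commit_body))

Because each commit's ID is a hash over its parents' IDs (which are themselves hashes over their own parents and trees), changing any earlier object changes every later commit ID - unless an attacker can produce a SHA-1 collision, which is the scenario this subthread worries about.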

Security in the 20-teens

Posted Feb 7, 2010 3:09 UTC (Sun) by njs (guest, #40338) [Link]

You misread :-). Certainly git doesn't hash the whole repo; it uses the chained hashing trick (the "pointers" you mentioned). This subthread is about what happens if you don't trust hashes -- in that case you certainly can't use the chained hashing trick.

Security in the 20-teens

Posted Feb 1, 2010 19:04 UTC (Mon) by raven667 (subscriber, #5198) [Link]

While organizations are generally good about installing network firewalls, we have generally forgotten the basic purpose of a firewall: to segment an organization's data and provide access control between bits of important data. Most organizations don't want to spend the time and money it takes to actually understand their data flow to the point where they could segment it. SCADA systems connected in any fashion to the internet are the perfect example of this.

Many organizations really should have separate infrastructure for their important, business-critical operations and another set of systems for their internet-accessible operations, including having two separate machines on people's desktops. At a minimum, there would be terminal servers with access controls to restrict the flow of data into and out of the company environment.

This is never going to happen. There are too many efficiencies gained by having everything easily accessible, and it is so much simpler not to segment, so only rare and critical environments are ever treated this way.

Personally, what I think has some merit is sandboxing technologies like SELinux, which aim to firewall applications from one another. The goal, which may not be reached yet, is that the kind of data-driven exploit that is so common these days would not have enough access, once it got onto a system, to steal important data or leverage local exploits to grant itself further access. Once you can access the local kernel enough to exploit one of its bugs, the game is over and any security hardening or structure in place is totally meaningless.

I don't know whether sandboxing applications will raise the bar enough. It is probably a safe bet that we are in a world where the attackers are just going to press the "override security" button like in some 80s hacker movie; while there may be islands of hardness out there, it will never be the norm.

Security in the 20-teens

Posted Feb 1, 2010 20:14 UTC (Mon) by dlang (✭ supporter ✭, #313) [Link]

if people are not willing to understand their data flow between machines well enough to implement real security with firewalls, what makes you think that they will analyze the data flow between processes on the same machine well enough to implement real security with something like SELinux?

Security in the 20-teens

Posted Feb 1, 2010 20:46 UTC (Mon) by epa (subscriber, #39769) [Link]

I think the idea is that it doesn't depend on 'people' to sit down with the SELinux user guide and a large pot of tea to write the correct policy for their system; rather, the default settings will be somewhat more restrictive and more secure than we have now (where the Flash plugin, for example, can read any file in your home directory). This would close off some, though not all, exploits.

Security in the 20-teens

Posted Feb 1, 2010 21:34 UTC (Mon) by dlang (✭ supporter ✭, #313) [Link]

every time you try to create such 'obvious' rules you will break something for someone. And unless the rules are easy enough for that person (or their sysadmin) to understand and modify, all that will happen is that they will learn that the way to make their system work is to disable SELinux, and they will.

This is what is routinely happening with SELinux today, even for professional sysadmins and security people.

getting this stuff right is hard, significantly harder than isolating systems from each other.

It gets even worse, because what you want isn't the binary 'communication via this port / access to this file is allowed or blocked'; what you really want is 'you are allowed to do these types of things'. On the firewall side, the retreat from proxy firewalls to packet filters is a wonderful win for the manufacturers of packet filters, but a huge loss for everyone else. Things are a little better on the SELinux side (it separates read/write/execute), but there's no control over what is read from or written to a file, and you don't know what the impact of a write is unless you understand how every other program that reads the file interprets it.

Security in the 20-teens

Posted Feb 1, 2010 21:54 UTC (Mon) by jamesmrh (guest, #31622) [Link]

There is control over what's written in SELinux, and you can determine who can read a file -- these are fundamental design characteristics of the security model. In the default case with Fedora, we're focused mainly on containment of network facing apps, so most people don't see or use this, although information flow control is a big part of govt deployments (see e.g. the CLIP tool for managing this).

Security in the 20-teens

Posted Feb 1, 2010 23:03 UTC (Mon) by dlang (✭ supporter ✭, #313) [Link]

SELinux allows you to control if an application can write to a file, but if you are allowed to write to a file, it doesn't give you any way of controlling _what_ you write.

Security in the 20-teens

Posted Feb 2, 2010 0:59 UTC (Tue) by jamesmrh (guest, #31622) [Link]

Well, it depends.

All of the objects on the system have security labels, and the policy will determine whether information can flow from one place to another via a certain application running as a certain user.

e.g. you may not be able to open a 'secret' file for read and an 'unclassified' file for write.

You also then need to label information as it enters the system.

In the case of, say, someone typing 'secret' information from memory into a text editor which has an 'unclassified' file open for write, it's impossible to prevent. You can try and detect that it's happened after the fact (e.g. file scanning), and perhaps add some deterrence via audit.

For the general case, what we're likely to encounter in this area is inadvertent disclosure, e.g. phishing attacks. Window labeling (XACE) and trusted path may be useful here.

Security in the 20-teens

Posted Feb 2, 2010 1:30 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

This is not what I was saying is missing; what I am saying is missing are controls that would prevent you from inserting 8-bit data into a file of type 'ascii text'.

In a network firewall, this would be an application proxy that checks that what you send over port 80 is really valid HTTP (and a really good one would check that it is one of the requests that it has been configured to allow).
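As a purely illustrative sketch of the "no 8-bit data in an ascii text file" control described above - not any particular product - the type check such a proxy could apply is tiny:

    def is_plain_ascii_text(data):
        # accept printable 7-bit characters plus tab/LF/CR; reject everything else
        return all(32 <= b < 127 or b in (9, 10, 13) for b in data)

    print(is_plain_ascii_text(b"hello, world\n"))       # True
    print(is_plain_ascii_text(b"\x7fELF\x02\x01\x01"))  # False: binary/8-bit data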

Security in the 20-teens

Posted Feb 4, 2010 10:12 UTC (Thu) by dgm (subscriber, #49227) [Link]

Exactly what purpose would this be useful for?

Security in the 20-teens

Posted Feb 4, 2010 20:39 UTC (Thu) by dlang (✭ supporter ✭, #313) [Link]

Way back up the thread, the statement was made that using SELinux for many processes on one machine was as secure as having the processes on separate machines separated by firewalls.

This is an example of a capability that you could have for filtering communication between apps on different machines, but that you do not get with SELinux securing things on one machine.

As for what this would be useful for:

If you have apps that expect text files and you throw arbitrary binary data at them, you may find a flaw in them and be able to do things as the owner of that process. If you make sure that such bad data cannot get to the app, you eliminate an entire class of exploits.

Security in the 20-teens

Posted Feb 2, 2010 2:30 UTC (Tue) by smoogen (subscriber, #97) [Link]

No, it can't... but then again, that is pretty much impossible to do in any OS that isn't written from the ground up to be super secure, and even then it usually comes with caveats like: no network; have multiple users watch people using input/output devices; make sure every file written is checked multiple times by multiple people and programs; etc.

If that is the level of security you want, then you are basically going to need a large budget for every computer. I remember a security policy back in 1995 that had that in its rules for every computer system (Mac, PC, Unix, etc.); the site would have needed about 8x more people just to make sure the computers were being used appropriately. And then it would probably only be 99% effective.

Security in the 20-teens

Posted Feb 2, 2010 2:47 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

but you can get that sort of security between machines. It's not cheap (it requires that you buy real application firewalls, not just cisco, checkpoint, linux, or *bsd stateful packet filters) and it requires that you take care about what your software is doing, but it is possible.

you do not get the same security by putting everything on one box and waving the SELinux magic wand.

Security in the 20-teens

Posted Feb 3, 2010 13:37 UTC (Wed) by foom (subscriber, #14868) [Link]

But, once the firewall is parsing application traffic, who's to say it doesn't have security holes just like the application does? (Wireshark certainly has its fair share of remote exploits, for instance.)

Security in the 20-teens

Posted Feb 3, 2010 16:41 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

Yes, any checking the firewall does opens the firewall up to the possibility of errors (this includes the checking done in stateful packet filters).

However, for all relatively sane protocols, there is checking that can be done that doesn't require as much code (and therefore doesn't carry as much risk) as the application code that will be processing the request. Properly done, the code for the firewall is relatively static and can be well tested. It doesn't need to change every time you add a new function to the application (or change its behavior); it only needs to be configurable to do different checking.

Usually this can be things like (in order of increasing complexity):

checking that the message is well formed by the definition of the protocol

checking that the message follows the protocol syntax

checking that specific fields in the message are in a whitelist

Yes, Wireshark has a horrible track record in security, but this sort of checking is happening in many firewalls (under names like 'deep packet inspection') for some protocols. There are also separate 'application firewall' products you can get for some protocols. The better IDS/IPS systems do this sort of thing (as opposed to merely blacklisting known exploits).
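To make the whitelist idea concrete, here is a toy Python sketch of the kind of request-line check such a proxy could apply. The allowed methods and path prefixes are invented for illustration; a real application firewall would also have to handle headers, encodings, and protocol state, which is where the code (and the risk) starts to grow.

    import re

    # well-formedness and syntax: only a simple GET/HEAD request line is accepted
    REQUEST_LINE = re.compile(r"^(GET|HEAD) (/[A-Za-z0-9._/-]*) HTTP/1\.[01]$")
    # field whitelist: only these paths are allowed through
    ALLOWED_PATH_PREFIXES = ("/index.html", "/static/")

    def allow_request_line(line):
        m = REQUEST_LINE.match(line)
        if m is None:
            return False                      # not even syntactically valid
        return m.group(2).startswith(ALLOWED_PATH_PREFIXES)

    print(allow_request_line("GET /index.html HTTP/1.1"))    # True
    print(allow_request_line("POST /cgi-bin/x HTTP/1.1"))     # False: method not allowed
    print(allow_request_line("GET /../etc/passwd HTTP/1.0"))  # False: path not whitelisted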

Security in the 20-teens - Default security policies

Posted Feb 2, 2010 14:18 UTC (Tue) by eparis123 (guest, #59739) [Link]

Every time you try to create such 'obvious' rules you will break something for someone.

I completely agree.

Around a month ago, I was running very late on a college project that involved loading binary files into a MySQL database. Using Ubuntu, the queries always filled NULL into the binary-file columns, without any visible error messages.

After around 40 minutes of Googling, I found that the reason was an AppArmor policy enabled by default in Ubuntu. I even found it on the very last comment of a MySQL bugzilla entry.

Needless to say, I was very frustrated that I had consumed all that time on this trivial matter while having very limited time until the deadline. I guess this is a pet example of user frustration with security; Casey Schaufler (author of SMACK) had a great quote about this on the kernel quotes page of one of the previous weekly editions.

Quote candidate

Posted Feb 3, 2010 21:09 UTC (Wed) by man_ls (subscriber, #15091) [Link]

Maybe this one? I don't see how it relates to AppArmor though.

Quote candidate

Posted Feb 4, 2010 19:12 UTC (Thu) by eparis123 (guest, #59739) [Link]

Yes, this was the one I meant. The relation I find is that an application developer (me, innocently working on a MySQL program) got bitten heavily at the worst of times.

Maybe I did not understand the quote's context very well either.

Accurate quote

Posted Feb 4, 2010 21:11 UTC (Thu) by man_ls (subscriber, #15091) [Link]

He said: "Application developers have historically been intolerant of systems that change their security policy on the fly." It was me who was missing some context; in fact it was some silly grammar mistake on my part. I thought "their" referred to "systems", not to "application developers", and didn't see how AppArmor changes its own security policy on the fly. It doesn't; it changes application developer's security policy. And yes, it is annoying when that happens.

Hash collisions

Posted Feb 1, 2010 20:41 UTC (Mon) by epa (subscriber, #39769) [Link]

But that nice assumption only holds true for as long as one assumes that the hash algorithms used to identify commits are not subject to brute-force collisions.
It would be quite a task to generate a hash collision that also compiles as valid C code. And doubly impossible to generate one which is valid C code and inserts the backdoor you want. (This could be easier if you can generate both sides of the collision - so you'd somehow generate an innocuous-looking git tree and an evil one that has the same checksum - but then you'd have to somehow convince Linus to bless your innocuous-looking code absolutely untouched as an official release.)

Of course if a hashing function is shown to have weaknesses, you migrate to a better one. That's just common sense. But I don't think we need be too worried about this particular attack - not when there are far easier ways to insert backdoors.

Hash collisions

Posted Feb 1, 2010 23:53 UTC (Mon) by otaylor (subscriber, #4190) [Link]

Any particular patch is only a tiny change away from generating a hash collision with any other patch. The only question is knowing *what* change to make. You could do it by inserting 128 bits of data in any one place, but you could also do it by distributing changes around in smaller places, like added spaces, or alternate wording in a comment. Given the ability to generate hash collisions, it may be only a small step to generating innocuous looking hash collisions based on a desired end-goal. But yes, introducing evil code through collisions is the hard way around. (Linus had a mail analyzing this at one point.)

Hash collisions

Posted Feb 2, 2010 0:54 UTC (Tue) by dlang (✭ supporter ✭, #313) [Link]

In theory, two patches could be one character away from being a collision.

The thing is that none of the methods known to generate collisions can do so to this extent; they all depend on creating a large, randomish blob in one or both files.

please educate me here.

I was under the impression that what people had succeeded in doing was to create two files with the same hash, not take a file that someone else generated and create a new file with the same hash as the original. I was further under the impression that to make this match, both files end up with large chunks of randomish data in them.

I think that everyone is in agreement that if a hash is broken to the point where someone can take an existing file and create a new file that looks reasonable with the same hash there is a serious problem.

Where there is disagreement is whether the current state of affairs - where someone looking at one or both of the files will see that something is weird (they do not look like normal C source code) - is a serious problem.

I know there are a lot of people doing research on hash collisions. If someone can dig up two source code files that have the same hash, from any source, it would go a long way toward making your case.

Even then there is the question of whether one could plausibly be a replacement for the other, but just finding two such source files would be a better start than all the theoretical arguing that you have been doing.

Hash collisions

Posted Feb 2, 2010 10:27 UTC (Tue) by copsewood (subscriber, #199) [Link]

MD5 was broken totally when it became possible to have 2 SSL certs that hashed to the same value.

Not sure how much more or less difficult doing something similar with 'C' source code would be. But I doubt that SHA1 is anywhere close to this level of brokenness.

Hash collisions

Posted Feb 2, 2010 14:09 UTC (Tue) by otaylor (subscriber, #4190) [Link]

It's hard for me to see how my few-sentence comment could possibly be considered "all the theoretical arguing that you have been doing." My point was not that I know of any way of generating dangerous collisions, or that I am losing a single second of sleep over the security of my GIT repositories, but rather that I found the argument "It would be quite a task to generate a hash collision that also compiles as valid C code" weak.

The current collision-generating attacks I'm aware of (not specifically talking about SHA1) don't require generating a new file from scratch, but rather inserting random-looking data into a padding section of a file format. It doesn't seem a huge step from there to inserting "steganographered" random data. But even restricting to the simplest case of random-looking data at the end of the file, one out of every 65536 random-looking data blocks ends with '*/'...

Anyways, I'm not an academic or even amateur cryptographer, and have no intention of becoming such, so while I try to avoid talking total nonsense, if you find posts based on general considerations offensive, please feel free to ignore what I write.

Hash collisions

Posted Feb 2, 2010 15:18 UTC (Tue) by jsatchell (guest, #6236) [Link]

Important point about hashes: finding a collision is not the same as solving the pre-image problem.

The acknowledged weakness in SHA1 means that it is possible to find a pair of texts, given control over both texts, that hash to the same value much faster than the expected 2^80 steps.

But most attacks on a VCS involve taking a known text and finding another text that hashes to the same value; that is the second pre-image problem, which is much harder. There is no suggestion in the open literature that the collision weakness of SHA1 is matched by a comparable pre-image one.
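For reference, the textbook brute-force costs for an n-bit hash, with SHA-1's n = 160 filled in, are roughly:

    \[
      \text{collision (birthday bound)} \approx 2^{n/2} = 2^{80},
      \qquad
      \text{second pre-image} \approx 2^{n} = 2^{160}
    \]

That gap is why a collision weakness, even a serious one, does not by itself imply a practical (second) pre-image attack.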

Hash collisions

Posted Feb 2, 2010 4:28 UTC (Tue) by nevyn (subscriber, #33129) [Link]

It would be quite a task to generate a hash collision that also compiles as valid C code. And doubly impossible to generate one which is valid C code and inserts the backdoor you want.
It might be "harder" but it's far from impossible. Consider the md5 CA attack from last year, they had to do:
Complying with the X.509 standard [HPFS], each of the two certificates consists of:

    * a header of 4 bytes,
    * a so called "to-be-signed" part of 927 bytes,
    * a so called "signature algorithm" field of 15 bytes,
    * a "signature" field of 131 bytes. 
...which included predicting bits of the above data that the victim generated - and they succeeded.

Security in the 20-teens

Posted Feb 1, 2010 21:09 UTC (Mon) by cmccabe (subscriber, #60281) [Link]

I read an interesting paper recently by the author of qmail. It's here:
http://cr.yp.to/qmail/qmailsec-20071101.pdf

Basically, he claims that our existing approaches to security have failed. "Chasing attackers" by closing security bugs once they're published will never really result in a secure system; programmers just keep adding new bugs as the old ones are found. Firewalls, virus scanners, and other "band aids" don't really fix the underlying security problems, and just add a new layer of complexity for system administrators.

More controversially, he claims that restricting privilege is not as worthwhile as is generally thought. I'm not sure I completely agree with this, but I will say one thing. In the absence of an LSM, the privileges of a process running as you are pretty high! It can read everything in your home directory, add code to your .bashrc, send messages over DBus, etc. From the perspective of an attacker, /home is where the good stuff is, and not having root may not really be a big deal.

Bernstein's proposed solution is to minimize the amount of "trusted code" by putting most of the program in some kind of sandbox. Using seccomp or running software in a virtual machine are two ways to sandbox code. He also wants to minimize the overall amount of code, to make it more auditable.

Generally, sandboxing code involves restructuring a program in terms of multiple processes that communicate over some IPC channel. This has some other advantages, since we're soon going to be living in a world of 64 or 128-core consumer CPUs. Programmers who really care about performance need to start thinking about parallelism anyway.
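A minimal sketch of that pattern - my own illustration, not qmail's or Chrome's actual code: the risky parsing runs in a forked child that sheds what it easily can, and the only way back to the parent is a pipe. The parser, the limits, and the data are all hypothetical.

    import json, os, resource

    def parse_untrusted(blob):
        # stand-in for the risky, complex code (image decoder, PDF parser, ...)
        return {"length": len(blob), "lines": blob.count(b"\n")}

    def parse_in_child(blob):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:                                            # child: the worker
            os.close(r)
            os.chdir("/")                                       # drop the current directory
            resource.setrlimit(resource.RLIMIT_NOFILE, (0, 0))  # no new fds or sockets
            resource.setrlimit(resource.RLIMIT_CPU, (2, 2))     # cap CPU time
            os.write(w, json.dumps(parse_untrusted(blob)).encode())
            os._exit(0)
        os.close(w)                                             # parent: read the result
        chunks = []
        while True:
            chunk = os.read(r, 4096)
            if not chunk:
                break
            chunks.append(chunk)
        os.waitpid(pid, 0)
        return json.loads(b"".join(chunks).decode())

    print(parse_in_child(b"hello\nworld\n"))

On Linux the child could additionally enable seccomp or run in a confined LSM domain; the rlimit calls above are just the simplest unprivileged restrictions available, and the parent must still treat whatever comes back over the pipe as untrusted data.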

Sandboxing

Posted Feb 1, 2010 22:09 UTC (Mon) by jamesmrh (guest, #31622) [Link]

It seems we can accomplish a lot with sandboxing.

With the SELinux sandbox, the default rules for any app running inside are essentially to deny all accesses (e.g. no access to the filesystem, except to load shared libraries, no networking etc.), and we then pass an open file descriptor to the sandbox, over which all communication operates.

This means that the calling program assigns all authority to the sandbox via the open fd, and the sandbox has no "ambient" authority. It's quite a powerful abstraction and we can build more around it (e.g. sandbox X runs graphical apps via a nested X server, communicating over an fd).

See:

- http://video.linuxfoundation.org/video/1565
- http://namei.org/presentations/selinux-sandboxing-fossmy2...

These principles can be applied to other distros/security models.

There's an emerging area of research around the concept of removing ambient authority, see:

http://en.wikipedia.org/wiki/Object-capability_model

We're limited somewhat in Linux by the underlying design of the OS, but as above, we can apply some of the principles.

Sandboxing

Posted Feb 1, 2010 23:15 UTC (Mon) by cmccabe (subscriber, #60281) [Link]

The SELinux sandbox looks promising. For some reason, policycoreutils doesn't include the "sandbox" program for me in Fedora Core 11. It must have been added after the distro was released.

Maybe this is a dumb question, but are there any plans to sandbox apps "by default" in the future? Or is the goal to ship SELinux policies that are restrictive enough to contain misbehaving processes running as the local user? One of the points that was advanced in favor of seccomp was that there's no "off switch" like there is for SELinux.

Sandboxing

Posted Feb 2, 2010 0:35 UTC (Tue) by jamesmrh (guest, #31622) [Link]

It's a Fedora 12 feature.

I think it'd be useful to transparently sandbox some applications, and then perhaps break the sandbox if the user initiates an action which requires access outside.

e.g. all pdf viewing is sandboxed by default, but if the user wants to save the file, the sandbox is disabled for that access (need to ensure that the user clicked save w/ trusted path). Complex apps like firefox are more difficult, but not impossible.

One of the points that was advanced in favor of seccomp was that there's no "off switch" like there is for SELinux

Disabling SELinux can be prevented (modulo kernel bugs).

Sandboxing

Posted Feb 2, 2010 10:02 UTC (Tue) by nix (subscriber, #2304) [Link]

But about half the security holes on a Linux system *are* kernel bugs, and they're particularly nasty to fix because they require a reboot (which almost no other security fix does). So all an attacker waiting to own a system has to do is wait until a vulnerability window opens - a kernel hole becomes known but you haven't yet rebooted into a fixed kernel - and then attack. Brad Spengler has demonstrated just how fast an exploit can be whipped up in that situation by someone with sufficient skill (and I'm quite certain major governments employ a good few such people).

Sandboxing

Posted Feb 2, 2010 16:10 UTC (Tue) by michaeljt (subscriber, #39183) [Link]

A dumb follow-up question, but one that has been on my mind for a while: are there any (more or
less) simple ways a *user* process can drop its privileges and enter a sandbox voluntarily without
using something as heavy duty as SELinux? Like setting the RLIMIT_NOFILE hard limit to one after it
has opened all files and sockets it needs? I am assuming of course that it is a true user process,
not setuid root or whatever.
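For reference, the rlimit approach described in the question looks roughly like the following sketch. It only limits further resource acquisition - it is not a complete sandbox - and the file path is hypothetical.

    import resource

    log = open("/tmp/demo.log", "w")    # acquire needed files and sockets first

    # lower the hard limits; an unprivileged process can always reduce them
    resource.setrlimit(resource.RLIMIT_NOFILE, (0, 0))  # no new descriptors
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))   # no new child processes

    log.write("already-open descriptors keep working\n")
    try:
        open("/etc/passwd")             # any new open() now fails
    except OSError as e:
        print("open blocked:", e)

Whether that raises the bar enough is another question; the process can still read and write everything its open descriptors and its address space already give it.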

Sandboxing

Posted Feb 2, 2010 18:52 UTC (Tue) by drag (subscriber, #31333) [Link]

Use LXC.

I can set up an LXC container as root that can then be safely used by users. This is done through Linux file capabilities and does not require any setuid programs.

It's as simple as running 'debootstrap' in a directory, installing firefox into it, and then setting up an LXC configuration.

From then on, users can execute firefox from that environment, using their own UIDs and such, and have the output passed to Xephyr or to their own X server.

I've done it. It works, it is fast, and unlike chroot it does not require root rights and is designed for security. It has various levels of isolation you can set up.

Unlike SELinux, it's easy for mere mortals to understand.

Security in the 20-teens

Posted Feb 2, 2010 16:12 UTC (Tue) by dgm (subscriber, #49227) [Link]

I think that the most important (and also most overlooked) point made by Bernstein is that complexity is the enemy of security. Only simple enough systems can be made secure. Sandboxing may help sometimes, but it is just a band-aid. What we need is simpler systems that we can write without bugs.

Security in the 20-teens

Posted Feb 11, 2010 9:22 UTC (Thu) by renox (subscriber, #23785) [Link]

>What we need is simpler systems that we can write without bugs.

Need? For security, perhaps, but history has shown that, as time goes by, we use systems with more and more features, which is hard to reconcile with the need for simpler systems.

Security in the 20-teens

Posted Feb 4, 2010 8:55 UTC (Thu) by eric.rannaud (guest, #44292) [Link]

> Bernstein's proposed solution is to minimize the amount of "trusted code"
> by putting most of the program in some kind of sandbox. Using seccomp or
> running software in a virtual machine are two ways to sandbox code. He
> also wants to minimize the overall amount of code, to make it more
> auditable.

I would like to remind everyone that this is exactly how Google Chrome behaves (or Chromium, the open source version that runs on Linux), using seccomp.

All the HTML parsing, JavaScript interpretation, image rendering, and page rendering happens in a very tight sandbox. A vulnerability in a PNG library will not result in a breach of the system. Firefox does nothing of the sort, quite sadly.

Chrome is the web browser the OpenBSD project would have designed. It relies on privilege separation everywhere (and a sandbox on top of that, to limit the impact of OS-level security flaws, like a buggy syscall). Its design is similar to OpenSSH.

This is the right model. A PDF viewer should be designed that way, as well as an email client. In this context, so-called webapps become counter-intuitively *more* secure than local apps that run with $USER privileges. And remember that with HTML5 localStorage, so-called webapps don't actually have to store your data with a remote server. Webapps are not usually designed that way, but they could be. And there is of course NaCl, a Google browser plugin that can run native applications in a sandbox.

It is certainly quite ironic that Google was apparently attacked through either an IE flaw or an Acrobat Reader flaw. By design, Google Chrome is more secure against the first class of attacks, and there has been talk of adding a sandboxed native PDF renderer to Chrome, but that hasn't been done yet...

See the http://dev.chromium.org/chromium-os/chromiumos-design-doc... overview and LWN's http://lwn.net/Articles/347547/

NB: Google Chrome is now available on Linux. For yum users, follow the instructions at http://www.google.com/linuxrepositories/yum.html and:
yum install google-chrome-unstable

Security in the 20-teens

Posted Feb 2, 2010 5:55 UTC (Tue) by Baylink (guest, #755) [Link]

I sort of suspect that everyone's analyzing the wrong layer, here.

I'll call everyone's attention (back) to "Reflections On Trusting Trust", Ken Thompson's seminal ACM paper on an *actual* attack, albeit an internal, corporate one. (Yes, it really was; someone quotes here http://groups.google.com/group/sci.crypt/msg/9305502fd7d4... my message quoting Thompson saying so, 15 years ago.)

The underlying point is: it doesn't make any sense to have the degree of security of the various layers of your stack *out of sync*; the weakest one is the one people will attack. Well, at least "successfully". You always find the keys (pun entirely intentional) in the last place you look... because you stop looking, then.

Security in the 20-teens

Posted Feb 2, 2010 7:57 UTC (Tue) by eru (subscriber, #2753) [Link]

I'll call everyone's attention (back) to "Reflections On Trusting Trust", [...]

David A. Wheeler has described a way to thwart that attack: http://www.dwheeler.com/trusting-trust/

Security in the 20-teens

Posted Feb 2, 2010 11:13 UTC (Tue) by paulj (subscriber, #341) [Link]

Hmm, how does that thwart the attack exactly? It assumes you have access to an unsubvertable oracle (a set of one or more other compilers, of which at least one has not been subverted). Given that, then yes, obviously, you can spot a discrepancy.

Isn't that missing the point of "Reflections On Trusting Trust", somewhat? Or have I skimmed over some key point in this new paper?

Security in the 20-teens

Posted Feb 2, 2010 12:35 UTC (Tue) by eru (subscriber, #2753) [Link]

If I understood the idea correctly, you don't need any unsubverted compiler, just a completely different implementation (e.g., to verify GCC you might use PCC). Suppose GCC has been subverted so that it propagates the backdoor when compiling GCC, and PCC does the same when compiling PCC. But compiling GCC with PCC does not propagate a backdoor that would affect GCC when GCC compiles itself.

Security in the 20-teens

Posted Feb 2, 2010 13:19 UTC (Tue) by paulj (subscriber, #341) [Link]

Right, i.e. unsubverted oracle = {set of compilers where at least one compiler is
not subverted}, where 'subverted compiler' here means 'subverted
consistently with the others'.

Basically, he's inventing an oracle against the original problem by limiting the
scope of the attacker. That may be quite fair in practice, but I somehow feel it
still misses the point of the original "Reflections on Trusting Trust" to say that
he's solved the problem posed by it. To make it crystal clear, let me quote from
Ken Thompson's conclusion:

"The moral is obvious. You can't trust code that you did not totally create
yourself."

The DDC technique does not solve that problem in principle, it seems clear to
me.

Security in the 20-teens

Posted Feb 2, 2010 16:18 UTC (Tue) by nix (subscriber, #2304) [Link]

It reduces the problem to "you can't trust compilers produced by a cooperating malevolent group, nor code compiled with those compilers". But if you have several compilers, some of which are trustworthy *or are produced by malevolent groups that are not in communication*, then those compilers will not introduce the Thompson hack into the *other* compilers when compiling them, and the attack falls apart.

This is a much harder bar for attackers to leap over: from subverting one compiler, they have to subvert every compiler you might possibly use targeting that architecture if they are to go undetected.

Security in the 20-teens

Posted Feb 2, 2010 16:26 UTC (Tue) by Baylink (guest, #755) [Link]

Or, you just have to not be thinking about the problem.

Be honest: how often do *you* evaluate your systems for the Reflections attack? That was my Usenet posting, and *I* don't do it most of the time...

Security in the 20-teens

Posted Feb 2, 2010 18:10 UTC (Tue) by droundy (subscriber, #4559) [Link]

It's not enough that the attackers aren't in communication; rather, they must be mutually ignorant, which seems highly unlikely. I suspect even benevolent compiler writers pay reasonably close attention to the work of other compiler writers, and our malevolent compiler subverters seem likely to pay even more attention to other compilers.

The attacker who inserted a back door into your gcc may have been smart enough to make it also able to insert the same back door into pcc, or any other compiler you can imagine. Which means that if you start out with only one binary compiler, gcc, then you are out of luck, since you won't be able to get an unsubverted compiler. Yes, this is harder, but we're already talking about attackers who are creating very, very tricky code...

I suppose anything to raise the bar on the attack would seem worthwhile. But it seems like it'd be a more effective approach to write a C compiler in Forth or something else that is simple enough that you could write a compiler for *it* in assembler (or machine code, if you don't trust the assembler...).

Countering the trusting trust attack

Posted Feb 3, 2010 4:40 UTC (Wed) by dwheeler (guest, #1216) [Link]

> The attacker who inserted a back door into your gcc may have been smart enough to make it also able to insert the same back door into pcc, or any other compiler you can imagine...

Fair enough, but it's harder to subvert multiple organizations, and the defender gets to choose which compiler to use as the second compiler (call it the "trusted" or the "check" compiler). So, choose the one that's unlikely to be subverted the same way. If you don't believe in any, then:

> But it seems like it'd be a more effective approach to write a C compiler in Forth or something else that is simple enough that you could write a compiler for *it* in assembler (or machine code, if you don't trust the assembler...)...

Okay, go ahead and write another compiler yourself. That doesn't conflict with the DDC approach.

Now, you could just use that compiler instead of GCC, but everyone else still has the same problem... how can they trust YOUR compiler? And if you use it to compile another compiler (say GCC) in one step, again, how can anyone else trust the results of that GCC executable?

One answer is to use your C-in-Forth compiler to compile the original compiler source code (say GCC), then use THAT compiler executable to compile the original compiler source code again. Given certain assumptions described in the dissertation, the resulting executable should be exactly the same as your original executable. Once you've shown that they are equal, then that means either both were subverted in the same way, OR that the original executable isn't subverted.
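Schematically, that check looks something like the sketch below. The compiler paths, source file name, and gcc-style command line are hypothetical, and real DDC also needs the builds to be byte-for-byte reproducible; see the dissertation for the actual conditions.

    import hashlib, subprocess

    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def build(compiler, source, output):
        subprocess.check_call([compiler, "-o", output, source])

    # cT: the second ("check") compiler, e.g. the C-in-Forth one
    # compilerA.c: source of the compiler under test; ./cA: its distributed binary
    build("./cT", "compilerA.c", "stage1")      # check compiler builds A from source
    build("./stage1", "compilerA.c", "stage2")  # that result rebuilds A from source

    if digest("stage2") == digest("./cA"):
        print("match: cA is clean, or cT and cA are subverted in exactly the same way")
    else:
        print("mismatch: investigate (non-determinism, or a subverted compiler)")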

Countering the trusting trust attack

Posted Feb 3, 2010 13:25 UTC (Wed) by hppnq (guest, #14462) [Link]

how can they trust YOUR compiler?

They can't, that's the principle of the Thompson attack.

One answer is to use your C-in-Forth compiler to compile the original compiler source code (say GCC), then use THAT compiler executable to compile the original compiler source code again.

The suggestion was -- and I think it is the only correct one -- that the compiler used to compile the compiler-compiler does not need to be compiled itself. If it does need to be compiled, the question remains: what compiler will you use to do that?

the resulting executable should be exactly the same as your original executable. Once you've shown that they are equal, then that means either both were subverted in the same way, OR that the original executable isn't subverted.

But can you tell which conclusion is the right one without having to assume that the original executable was not subverted in the first place? It seems to me that a meaningful conclusion can be drawn only when the two executables are not the same, so you can positively identify a subverted compiler.

Countering the trusting trust attack

Posted Feb 3, 2010 23:36 UTC (Wed) by dwheeler (guest, #1216) [Link]

> The suggestion was -- and I think it is the only correct one -- that the compiler used to compile the compiler-compiler does not need to be compiled itself. If it does need to be compiled, the question remains: what compiler will you use to do that?

As I discuss in the dissertation, malicious compilers must have triggers and payloads to produce subverted results. If you avoid their triggers and payloads, then it won't matter if they're malicious. For example, a malicious compiler cM may have triggers that affect compilations of its source code, but not for another compiler cQ. So you can use cM to compile the source code of cQ, even though cM is malicious, and have a clean result.

(It's a little more complicated than that; see the dissertation for the gory details.)

Countering the trusting trust attack

Posted Feb 4, 2010 7:51 UTC (Thu) by hppnq (guest, #14462) [Link]

For example, a malicious compiler cM may have triggers that affect compilations of its source code, but not for another compiler cQ. So you can use cM to compile the source code of cQ, even though cM is malicious, and have a clean result.

Exactly. But any of the N program-handling components of the build system may be subverted (and not necessarily the same one at each compilation, I suppose), so in order to make a reasonable assumption you have to make sure that none of the N components harbours a payload or trigger.

So you have to verify the linker, loader, assembler, kernel, firmware -- i.e., you have to be on completely independent platforms, for both the compilation and verification. I can't see how you can reasonably assure that this is indeed the case, unless you make the assumption that enough components can be trusted.

Which you can't, unless you literally assemble everything yourself. ;-)

Obviously, practically there is a lot you can do to minimize the chance that someone unleashes the Thompson attack on you. But you can't reduce this chance to zero, so the question is the same as always: is an attacker motivated enough to break through your defense? I am quite sure there are compilers that are not public, to make this particular barrier more difficult. But those are not used to build global financial or even governmental infrastructures.

Anyway, I'll shut up now and read the dissertation, it is an interesting topic. Thanks David, and belated congratulations! :-)

Countering the trusting trust attack

Posted Feb 3, 2010 4:22 UTC (Wed) by dwheeler (guest, #1216) [Link]

My web page on countering trusting trust through diverse double-compiling (DDC) has all the details on my DDC approach. DDC uses a second compiler to detect the trusting trust attack, and it's perfectly fine if the second compiler is also subverted; DDC merely presumes that the second compiler isn't subverted in exactly the same way.

Nix's posting is a nice summary of its implications. As nix says, DDC 'reduces the problem to "you can't trust compilers produced by a cooperating malevolent group, nor code compiled with those compilers". But if you have several compilers, some of which are trustworthy *or are produced by malevolent groups that are not in communication*, then those compilers will not introduce the Thompson hack into the *other* compilers when compiling them, and the attack falls apart. This is a much harder bar for attackers to leap over: from subverting one compiler, they have to subvert every compiler you might possibly use targeting that architecture if they are to go undetected.'

There are lots of details on that website, including the entire dissertation. The dissertation includes mathematical proofs and demonstrations with several open source software compilers (including GCC).

By the way, the DDC approach can only be applied if you have the source code. So DDC gives an advantage to compilers whose source code is publicly available, including OSS compilers.

Countering the trusting trust attack

Posted Feb 3, 2010 8:53 UTC (Wed) by paulj (subscriber, #341) [Link]

Hi,

I've replied to Nix. The work is nice, no doubt, but it still requires one absolutely trusted compiler, which would have to be written (or verified/assumed), as I think you note. No doubt the work could be extended such that Ct is a set of compilers.

Do you think the "Fully" in the title of your thesis is perhaps unfortunate, though? Your work seems to reinforce Thompson's result rather than fully counter it, surely?

Countering the trusting trust attack

Posted Feb 3, 2010 23:24 UTC (Wed) by dwheeler (guest, #1216) [Link]

> The work is nice, no doubt, but it still requires 1 absolutely trusted compiler, which would have to be written (or verified/assumed)...

It does not have to be absolutely trusted, in the sense of being perfect on all possible inputs. It can be subverted, and/or have bugs, as long as it will compile the compiler-under-test without triggering a subversion or bug.

> Do you think the "Fully" in the title of your thesis is perhaps unfortunate though? Your work seems to re-enforce Thompson's result rather than fully counter it, surely?

No, it's not unfortunate. It's intentional.

Thompson's "trusting trust" attack is dead. Thompson correctly points out a problem with compilers and other lower-level components, but his attack presumes that you can't easily use some other system that acts as a *check* on the first. It's not just that you can recompile something with a different compiler; people noted that in the 1980s.

A key is that DDC lets you *accumulate* evidence. If you want, you can use DDC 10 times, with 10 different trusted compilers; an attacker would have to subvert ALL TEN trusted compilers *AND* the original compiler-under-test executable to avoid detection. Fat chance.

Countering the trusting trust attack

Posted Feb 3, 2010 23:40 UTC (Wed) by paulj (subscriber, #341) [Link]

Thanks for your reply. Again, I stress that I appreciate the practical benefits
of your approach.

I saw the caveat in the thesis about the trusted compiler-compiler only
needing to be trusted to compile the 1st full compiler. However, I am at a
loss to see how this trusted compiler (i.e. you inspected all possible relevant
source, or you wrote it) is different from Thompson's trusted compiler ("write
it yourself", see quote above).

Your approach still rests in complete trust in one compiler, according to your
own proofs.

See my other comment about how viruses have advanced from Thompson's
original attack, meaning that a subverted original compiler-compiler could
surely infect all other binaries ever touched by that code through, say, ELF
infections and hooking library calls.

Anyway, I'll leave it there.

Countering the trusting trust attack

Posted Feb 4, 2010 23:11 UTC (Thu) by bronson (subscriber, #4806) [Link]

> Your approach still rests in complete trust in one compiler

No, it doesn't. David described this in an ancestor post. It just rests on the assumption that a single group of attackers can't subvert every single one of your compilers.

Countering the trusting trust attack

Posted Feb 4, 2010 23:54 UTC (Thu) by nix (subscriber, #2304) [Link]

David also described in a recent post how you can ensure that your
compiler groups weren't maliciously cooperating: make sure your compilers
are very different ages. This will only get *better* as the years roll
past, especially once Moore's Law grinds to a halt: if one compiler is a
hundred years older than the other, unless there's an immortal on the
development team there's no *way* they share members. (These days of
course this gap is impractical because computers are changing too fast.)

Countering the trusting trust attack

Posted Feb 5, 2010 0:30 UTC (Fri) by Baylink (guest, #755) [Link]

In particular, this works very well if your check-compiler was shipped *when your target compiler/platform did not even exist yet*.

It would be hard to have hot-wired an early-90s IRIX compiler to break GCC4/Linux.

Countering the trusting trust attack

Posted Feb 5, 2010 19:33 UTC (Fri) by paulj (subscriber, #341) [Link]

How does it get better exactly? Old software doesn't come sandwiched, ossified between rock strata that can further attest to its obvious age.

You're still going to have to determine whether or not the bag of bits you have before you really is the same as that old compiler you want to put your faith in. You'll have to trust your md5sum binary (oops) and you'll have to trust MD5. Oops. And you're still trusting the original compiler author.

The "the old author can't have thought of future compilers" argument seems weak. Viruses are much more sophisticated these days - there's no need for the attack to be limited to specific implementations of software.

I know David's paper frames the problem so that the attack in fact does have that limitation, but that seems an unjustified restriction of Thompson's attack.

Countering the trusting trust attack

Posted Feb 5, 2010 19:44 UTC (Fri) by Baylink (guest, #755) [Link]

> How does it get better exactly? Old software doesn't come sandwiched, ossified between rock strata that can further attest to its obvious age.

Sure it does. :-)

There are lots of things which make it difficult to run really old software on newer platforms, and the more obstacles you place in the way of a notional IRIX Trusting-attack implementor, the less likely you make a positive outcome for him.

> You're still going to have to determine whether or not the bag of bits you have before you really is the same as that old compiler you want to put your faith in. You'll have to trust your md5sum binary (oops) and you'll have to trust MD5. Oops. And you're still trusting the original compiler author.

Yes, but what you're trusting him to do *now* is to have written a compiler which could properly identify and mangle a compiler which did not even exist at that time. And compilers are sufficiently different from each other syntactically that I don't think that attack is possible even in theory, though clearly, "I don't think" isn't good enough for our purposes here. :-).

> The "the old author can't have thought of future compilers" argument seems weak. Viruses are much more sophisticated these days - there's no need the attack has to be limited to specific implementations of software.

Well, I think that depends on which attack we're actually talking about here, and "virus" doesn't really qualify. The Trusting attack was a compiler-propagated Trojan Horse, a much more limited category of attack than "viruses these days", and therefore even harder to implement.

I'm not sure why failing to expect clairvoyance from an earlier decade's attack author is a weak approach, either. :-)

Countering the trusting trust attack

Posted Feb 5, 2010 21:42 UTC (Fri) by paulj (subscriber, #341) [Link]

Thompson implementing his attack as a compiler attack is a detail: source
code was the normal form of software interchange at the time, but the basic
compiler toolchain still required passing around binaries, so the compiler was
effectively the *only* place he could have implemented an attack by subverting
binaries. His paper is explicit that the compiler attack is merely a
demonstration of a more fundamental problem of having to place trust in
computer systems. In particular, he mentions microcode as a possible level of
attack - clearly a completely different thing from the compiler level, and an
indication that Thompson was making a very general point.

To think that Thompson's attack is only about compilers is surely to miss the
point of a classic paper.

Also, I don't expect clairvoyance. Indeed, you miss my point about which
direction the attacker is going.

I think perhaps I should properly write up my criticism...

Countering the trusting trust attack

Posted Feb 5, 2010 21:52 UTC (Fri) by Baylink (guest, #755) [Link]

Because I am a believer in the traditions of science, yes, I think it would be an excellent idea if you wrote up formally your problems with his paper...

which I *promise* I'm going to read, tonight while I wait for a server upgrade to finish. :-)

And certainly any level of the stack can be attacked, and I understand that was his point. But one either has to say "there's no practical way for me to validate the microcode of the CPU, and thus there's a practical limit to what I can verify", or one has to -- in fact -- do that validation.

If one can.

As we note on RISKS regularly, there are two issues at hand here: "pick your own low-hanging fruit", i.e. make sure you apply extra security balm equally to all layers of your problem (as adjusted by your threat estimates at each layer), and "know your CBA" (cost/benefit analysis): the amount of security you apply at all levels has to be in keeping not only with your threat estimate, but with what the bad guys can *get*.

This is, in particular, the part of the issue that terrorists throw monkey wrenches into: trying to inspire asymmetrical responses to what are, objectively, low-level threats. Your opponent wears himself out on the cape and never sees the sword. Bruce Schneier likes to address this issue.

Countering the trusting trust attack

Posted Sep 20, 2010 14:53 UTC (Mon) by paulj (subscriber, #341) [Link]

Took a while, but I wrote up those views on "Diverse Double-Compiling" and stuck them online here.

Countering the trusting trust attack

Posted Feb 5, 2010 23:05 UTC (Fri) by nix (subscriber, #2304) [Link]

Of course paulj's attack is possible in theory. We just need strong AI
first.

Countering the trusting trust attack

Posted Feb 5, 2010 23:03 UTC (Fri) by nix (subscriber, #2304) [Link]

You misunderstand. I'm not saying 'if you prove that this compiler is old
then you will invariably detect the Thompson hack' I'm saying 'if it is
likely that this compiler is old then your chances of detecting the
Thompson hack go way up'.

(And the Thompson hack *was* specifically relating to quined attacks on
compilers and other code generators. Viruses are a much larger field, with
Thompson hacks as a small subset. It is possible they are converging, but
I see little sign of it: attacking compilers isn't profitable because
they're relatively uncommon on the man in the street's machine.)
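For readers who haven't met the term: a quine is a program that prints its own source code, and that self-reproduction trick is what lets a trusting-trust compiler keep re-inserting its payload every time it is used to compile itself. A minimal C quine, purely as an illustration (it carries no comments, since a quine has to reproduce its own text byte for byte):

    #include <stdio.h>
    const char *s = "#include <stdio.h>%cconst char *s = %c%s%c;%cint main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }%c";
    int main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }

Compiling and running it reproduces exactly those three lines; a Thompson-style compiler does the same thing, carrying a self-reproducing copy of its malicious patch rather than its whole source.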

Countering the trusting trust attack

Posted Feb 10, 2010 9:45 UTC (Wed) by hppnq (guest, #14462) [Link]

So, how about "yum update"? ;-)

Countering the trusting trust attack

Posted Feb 3, 2010 17:50 UTC (Wed) by Baylink (guest, #755) [Link]

I will admit up front to not having yet checked out your site, I'm at work just now. But if your test is "both compilers produce the same object code", then even both compilers *not* being subverted will not guarantee that.

If I use compilers A and B to build G(cc), the A-G and B-G objects will not necessarily be byte-identical, and it doesn't *matter* what object they each in turn produce, because checking that would require an exhaustive search, which is impossible.

Or are you suggesting that A-G and B-G then be used to again compile Gcc, and *those* binaries be compared? That would tell you that either A and B were not subverted, or were subverted in exactly the same way...

but how are you authenticating your GCC sources?

(If the answer is "read the damn paper, idiot", BTW, just say that. :-)

Countering the trusting trust attack

Posted Feb 3, 2010 23:29 UTC (Wed) by dwheeler (guest, #1216) [Link]

Ummm... let me just say "read the paper, please" :-). I'm fully aware that compiling the same source with different compilers will (normally) produce different executables.

> Or are you suggesting that A-G and B-G then be used to again compile Gcc, and *those* binaries be compared? That would tell you that either A and B were not subverted, or were subverted in exactly the same way...

That's the basic idea, sort of. Given certain preconditions, you can even recreate the original executable with a different starting compiler.
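For the curious, here is a rough sketch of the check being described in this thread, written in C only to make it concrete. The names cc-A, cc-B and mycc.c are invented, and a single-command, deterministic build is assumed; a real compiler such as GCC needs its full build system, but the shape of the test is the same. This illustrates the idea discussed above, not Wheeler's formal DDC procedure:

    /* Sketch of the two-stage "compile it twice and compare" check.
     * Assumptions (invented for illustration): two unrelated trusted
     * compilers invoked as "cc-A" and "cc-B", and a toy one-file
     * compiler source "mycc.c" whose build is deterministic. */
    #include <stdio.h>
    #include <stdlib.h>

    /* Return 1 if the two files are byte-for-byte identical, else 0. */
    static int files_identical(const char *a, const char *b)
    {
        FILE *fa = fopen(a, "rb"), *fb = fopen(b, "rb");
        int ca, cb, same = 1;
        if (!fa || !fb) { perror("fopen"); exit(1); }
        do {
            ca = fgetc(fa);
            cb = fgetc(fb);
            if (ca != cb) { same = 0; break; }
        } while (ca != EOF);
        fclose(fa);
        fclose(fb);
        return same;
    }

    static void run(const char *cmd)
    {
        if (system(cmd) != 0) { fprintf(stderr, "failed: %s\n", cmd); exit(1); }
    }

    int main(void)
    {
        /* Stage 1: build the compiler-under-test's source with each trusted
         * compiler.  These two binaries will normally differ; that's fine. */
        run("cc-A mycc.c -o stage1-A");
        run("cc-B mycc.c -o stage1-B");

        /* Stage 2: use each stage-1 binary to compile the same source again.
         * Both stage-2 outputs come from the same source compiled by what is
         * functionally the same compiler, so they should be bit-for-bit
         * identical -- unless A or B slipped something in, in which case the
         * mismatch gives the game away (or both were subverted identically). */
        run("./stage1-A mycc.c -o stage2-A");
        run("./stage1-B mycc.c -o stage2-B");

        if (files_identical("stage2-A", "stage2-B"))
            puts("stage-2 binaries match");
        else
            puts("stage-2 binaries differ: investigate");
        return 0;
    }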

Security in the 20-teens

Posted Feb 3, 2010 8:46 UTC (Wed) by paulj (subscriber, #341) [Link]

You've restated what I wrote, hence I don't disagree with you.

Note that we don't know how high this bar would be. There are things like
'clumping' of expertise, such that in any specialised area of a technical
field the people working in it tend to be drawn from a much smaller group than
the set of all people qualified in the field. I.e. the set of people who
*write* compiler A is less independent from the set who author compiler B than
you might hope. Hence your assumption that the attacker would have to *hack*
into the other compiler is unsafe. Rather, they could simply transition from
working on A to B, either as part of their normal career progression or at
least seemingly so.

Next, as dwheeler also notes in his paper, it may be hard to obtain another
unsubverted compiler. Indeed, looking carefully at his work, it seems his
proofs specifically require one compiler-compiler that can be absolutely
trusted to compile the general compiler correctly, as the starting point of
the process. (I thought at first that perhaps a set was sufficient, such that
you didn't have to know which compiler was trustworthy, as long as you could
be confident that at least one of them was.) See the first sentence of 8.3 in
his thesis, and the multiple discussions of the role of a trusted compiler in
the DDC process.

So this still seems to boil down to "you have to write (or verify all the
source of) your compiler in order to really be able to trust it".

I'm not pooh-poohing the work per se, just saying this good work is slightly
marred by the overly grand claim made in its title.

Security in the 20-teens

Posted Feb 3, 2010 12:33 UTC (Wed) by paulj (subscriber, #341) [Link]

Another thing to consider:

There's nothing to stop the author of a compiler subverting its binaries so
that it *generally* infects all binaries it touches, such that those binaries
then infect all other binaries they touch (e.g. by hooking open), and this
infection could also introduce system-binary-specific attacks as and when it
detects it is running as part of those programmes.

Thinking in terms of a compiler specifically looking for login is ignoring the huge
advances made in virus design since Thompson wrote his.

I.e. in this discussion we're assuming DDC means you need to subvert 2
compilers. However that's not the case, nor is it even supported by the
thesis being discussed.

Anyway.

Security in the 20-teens

Posted Feb 4, 2010 22:39 UTC (Thu) by dwheeler (guest, #1216) [Link]

> There's nothing to stop the author of a compiler subverting its binaries so that it *generally* infects all binaries it touches, such that those binaries then infect all other binaries they touch (e.g. by hooking open), and this infection could also introduce system-binary-specific attacks as and when it detects it is running as part of those programmes.

An author can do that, but such an author risks instantaneous detection. The more general the triggers and payloads, the more programs that include corrupted code... and thus the more opportunities for detection.

For example, if compiling "hello world" causes a corrupted executable to be emitted, then you can actually detect it via inspection of the generated executable. Even if the system shrouds this, examining the bits at rest would expose this ruse.

Besides, as I talk about in the dissertation, the "compiler" you use does NOT need to be simply a compiler as it is usually understood. You can include the OS, run-time, and compiler as part of the compiler under test. You need the source code for them, but there are systems where this is available :-).

I have an old SGI IRIX machine that I hope to someday use as a test on a Linux distro with glibc and gcc. In this case, I have high confidence that the IRIX is as-delivered. I can feed it the source code, and produce a set of executables such as OS kernel, C run-time, and compiler as traditionally understood. If I show that they are bit-for-bit identical, then either (1) the SGI IRIX system executable suite when used as a compiler has attacks that work the same way against the Linux distro written many years later, or (2) the Linux distro is clean.

I talk about expanding the scope of the term "compiler" in the dissertation.

> I.e. in this discussion we're assuming DDC means you need to subvert 2 compilers. However that's not the case, nor is it even supported by the thesis being discussed.

Sure it is, and the thesis proves it. However, be aware that I very carefully define the term "compiler". In the dissertation, a compiler is ANY process that produces an executable; it may or may not do other things. For example, a compiler may or may not include the OS kernel, runtime, etc. Anything NOT included in the compiler-under-test is, by definition, not tested. If you want to be sure that (for example) the OS kernel doesn't subvert the compilation process, then you include it as part of the compiler-under-test during the DDC process.

Security in the 20-teens

Posted Feb 5, 2010 19:19 UTC (Fri) by paulj (subscriber, #341) [Link]

Did SGI publish secure hashes of your IRIX software?

If yes, I bet it's using MD5 at best. Hashes seem to have quite limited lifetimes.

If no, how can you know the system today is as it was before? If you say "cause
it's been sitting in my garage", then how can I repeat your result? Perhaps you
will offer a compiler verification service, but then we're still back to Thompson's
point, surely?
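For what it's worth, one partial mitigation is to check the same bag of bits against more than one digest algorithm: producing a file that collides under MD5 *and* SHA-256 simultaneously is much harder than defeating either alone, though it still doesn't answer the question of who vouches for the original bits. A minimal sketch using OpenSSL's EVP interface (link with -lcrypto); the file name is whatever binary you want to check:

    /* Print MD5 and SHA-256 digests of a file, so the same binary can be
     * checked against independently published hashes. */
    #include <stdio.h>
    #include <openssl/evp.h>

    static void print_digest(const char *path, const EVP_MD *md, const char *label)
    {
        unsigned char buf[4096], out[EVP_MAX_MD_SIZE];
        unsigned int outlen = 0;
        size_t n;
        FILE *f = fopen(path, "rb");
        EVP_MD_CTX *ctx = EVP_MD_CTX_new();

        if (!f || !ctx) {
            perror(path);
            if (f) fclose(f);
            EVP_MD_CTX_free(ctx);   /* EVP_MD_CTX_free(NULL) is a no-op */
            return;
        }

        EVP_DigestInit_ex(ctx, md, NULL);
        while ((n = fread(buf, 1, sizeof buf, f)) > 0)
            EVP_DigestUpdate(ctx, buf, n);
        EVP_DigestFinal_ex(ctx, out, &outlen);

        printf("%s  ", label);
        for (unsigned int i = 0; i < outlen; i++)
            printf("%02x", out[i]);
        printf("  %s\n", path);

        EVP_MD_CTX_free(ctx);
        fclose(f);
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }
        print_digest(argv[1], EVP_md5(),    "MD5   ");
        print_digest(argv[1], EVP_sha256(), "SHA256");
        return 0;
    }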

Security in the 20-teens

Posted Feb 2, 2010 15:35 UTC (Tue) by ortalo (subscriber, #4654) [Link]

Yes, threat models are useful of course. So is risk analysis, by the way, i.e. knowing what has critical value in the system (apart from the security mechanisms themselves).
However, threat models are also an area of arcane black magic or, if you prefer, of a lot of subjectivity.
IMHO, even though the attacker's point of view is interesting, many of us fall into the trap of concentrating on that point of view. (The dark side is tempting, remember...? ;-)

We need to focus more (possibly much more) on the defensive stance. We need to provide more (objective or subjective) guarantees about the security of our systems and, if possible, about the properties we achieve.
One of our cousins, OpenBSD, has taken such a stance for more than a decade, and that has brought them (what I see as) serious advantages in the security field, even against "well-funded" attackers.
The overall difference between OpenBSD and the rest of the family is that they deliberately raised the priority of security (possibly as much as a differentiation feature as an actual objective, but then, why should that matter?). It led them to put actual guarantees on the table: they have something you can argue about. We need the same approach: raise the priority and do the actual work (whether it be SELinux or something else).

Btw, a recent exchange of comments on LWN also led me to a similar exchange with Ingo Molnar, and he rightly pointed out the lack of a useful security metric as a possible technical impediment to progress in this area. He also pointed out that users' motivation for security features is not very high. Personally, I am a "security guy" (for a living), so of course I am heavily biased. But my answer to the lack of end-user enthusiasm for security mechanisms is simple: "So what?"

(End users will never desire or fund security features, because what they really want is security guarantees.)

Security in the 20-teens

Posted Feb 2, 2010 15:44 UTC (Tue) by ortalo (subscriber, #4654) [Link]

Oops. Hit the publish button too fast... Reference to former comment around here http://lwn.net/Articles/342783/

Security in the 20-teens

Posted Feb 2, 2010 19:30 UTC (Tue) by michaeljt (subscriber, #39183) [Link]

Google might also reduce the likelihood of attacks against them if they didn't save data about certain "interesting" queries.

Security in the 20-teens

Posted Feb 4, 2010 6:06 UTC (Thu) by happyaron (subscriber, #62220) [Link]

The problem here is in the development process of most open source software, but I don't think it should only be framed in terms of governmental attacks. Governments are not the only actors who can fund an attack well, so we should not act as if such threats come only from governments.

It is clear that we cannot guarantee that every piece of code is clean and free of deliberately injected harmful code; that shortcoming comes from our current development process, which gives everybody the freedom to contribute to their favourite projects. Talking about hash attacks on DVCSes may not be very useful here: as a previous comment pointed out, generating code that still works while keeping the same hash is not easy, and there is also code review, which makes producing working code with injected security holes even harder than producing a meaningless blob that merely has the same hash. Nobody can really claim that any currently widely used hash algorithm (e.g. MD5 or SHA-1) is so weak that it can easily be broken in this way. As for GPG, it also depends on a hash algorithm, so talking about GPG separately from hashes may be meaningless.

I don't quite agree with the view that Google reporting a problem in China is an alarm that national governmental attacks are just getting started. Google has not claimed that it is suffering attacks from the local government; all of that is the result of our guessing. But don't we agree that countries with that kind of power, or even more capable ones, may already be cracking their citizens' data and monitoring their information? A problem is always a problem until it is fixed or shown not to be one, but making a fuss over trifles is not needed either.

Security in the 20-teens

Posted Feb 5, 2010 5:56 UTC (Fri) by Ford_Prefect (subscriber, #36934) [Link]

"Some of these problems (yet another PNG buffer overflow, say) appear to have a relatively low priority, but they shouldn't."

I wonder why such attacks are still relevant - just about every modern processor now allows you to mark data pages as non-executable and code pages as read-only (the NX bit and the like).
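As a concrete illustration of what NX does and does not buy you, here is a contrived C sketch; the cast of a data pointer to a function pointer is deliberate and only for demonstration:

    /* Try to execute bytes sitting in an ordinary heap allocation.  On a
     * pre-NX system this "works"; with NX the heap page lacks PROT_EXEC and
     * the call faults with SIGSEGV.  Note that NX says nothing about what
     * corrupted *data* can make existing, legitimate code do. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        unsigned char *payload = malloc(16);
        if (!payload)
            return 1;

        memset(payload, 0xc3, 16);            /* 0xc3 is "ret" on x86 */
        void (*fn)(void) = (void (*)(void))payload;
        fn();                                 /* expected: SIGSEGV under NX */

        puts("payload ran: no NX protection on this heap page");
        return 0;
    }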

Security in the 20-teens

Posted Feb 5, 2010 13:30 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

Read the Prologue to "A Fire Upon The Deep". Ultimately the difference between acting on some untrusted data and executing untrusted code is only a slight matter of degree.

Suppose the buffer that you overflow is next to a variable named 'fd'. You replace the file descriptor of a file being written with that of an open network connection, and suddenly data intended to stay local pours uncontrollably out onto the Internet...
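A deliberately contrived sketch of that failure mode (all names are invented, and the exact memory layout is compiler- and padding-dependent, but it shows the shape of the problem):

    #include <string.h>
    #include <unistd.h>

    struct decoder_state {
        char chunk_buf[16];   /* undersized buffer for untrusted chunk data */
        int  out_fd;          /* descriptor the decoded output is written to */
    };

    static void handle_chunk(struct decoder_state *st, const char *data, size_t len)
    {
        /* BUG: no bounds check.  A chunk longer than 16 bytes runs off the end
         * of chunk_buf and can overwrite out_fd.  If the attacker arranges for
         * the new value to be the number of an already-open socket, every later
         * write() quietly goes to the network instead of the local file -- no
         * machine code is ever modified. */
        memcpy(st->chunk_buf, data, len);
        write(st->out_fd, st->chunk_buf, len);
    }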

The moment program behaviour deviates from what was intended by the programmer / user, you have a potential security hole. If you're lucky it amounts to nothing, and you can invent countermeasures to make that more likely, but it's not safe to bet on it; the more resourceful and determined the attacker, the more certain they are to find a way to make it work.

And inside a web browser (the most obvious thing to attack) the idea of "non-executable" is laughable. So what if I can't change the machine code? I can scribble on the "mere data", like the trusted JavaScript, Flash or Java byte code, which will get executed for me by a virtual machine and has the advantage of being portable.

Security in the 20-teens

Posted Feb 8, 2010 22:55 UTC (Mon) by mrdoghead (guest, #61360) [Link]

So what if I can't change the machine code, indeed. With a mere browser-level deception and injection you can have the user change the system for you at the next restart, and then be silently alerted, by whatever protocol you prefer, the next time the machine comes within range of a radio connection or wire, that your new machine awaits. And while developers and their tools are a prime target for people who want access to our machines, and every system I know of has purposely committed "flaws" (as they're described when exposed), the machine code is, and has been, where the real action is. Hardware is a cesspool of backdoors and security defeaters, some legally imposed and others not. There's much money and interest riding on machines being indefensible. Do people have the stomach to advocate against governments, corporations, and criminals too? When law enforcement is on the other side, requiring indefensibility? And remember, a piece of working, innocuous code is just a context shift and a reparsing away from being quite malicious, no recompiling required.

Security in the 20-teens

Posted Feb 11, 2010 9:36 UTC (Thu) by renox (subscriber, #23785) [Link]

>You replace the file descriptor of a file being written with that of an open network connection,

From a security perspective, the PNG decoder shouldn't have access to network sockets.

>And inside a web browser (the most obvious thing to attack) the idea of "non-executable" is laughable.

Agreed, that's why Chrome's design is really a nice change here, even if it doesn't go far enough: AFAIK Flash isn't properly 'shielded' from the rest of the system.

Security in the 20-teens

Posted Feb 11, 2010 14:32 UTC (Thu) by anselm (subscriber, #2796) [Link]

From a security perspective, the PNG decoder shouldn't have access to network sockets.

The PNG decoder shouldn't be allowed to open new network sockets. However, a file descriptor open for reading is a file descriptor open for reading. It doesn't matter much whether there is a disk or a web server at the other end.
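For what it's worth, Linux's original seccomp "strict" mode implements almost exactly this policy, and it also illustrates the caveat above: a confined helper can still use descriptors it already holds. A minimal sketch, where decode_png() is a hypothetical stand-in for the real decoder entry point:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <linux/seccomp.h>

    /* Hypothetical decoder entry point, standing in for libpng or similar. */
    extern void decode_png(int in_fd, int out_fd);

    void decode_in_sandbox(int in_fd, int out_fd)
    {
        /* After this prctl() the thread may only read(), write(), make the raw
         * exit() system call and sigreturn(); open(), socket(), connect() and
         * everything else gets the process killed.  So a compromised decoder
         * cannot open *new* connections -- but any descriptor that was already
         * open before this point, including a socket, remains usable. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) {
            perror("prctl(PR_SET_SECCOMP)");
            _exit(1);
        }

        decode_png(in_fd, out_fd);

        /* glibc's _exit() uses exit_group(), which strict mode forbids, so
         * invoke the plain exit system call directly. */
        syscall(SYS_exit, 0);
    }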

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds