
Wheeler: Fully Countering Trusting Trust through Diverse Double-Compiling

David Wheeler announces the defense of his PhD dissertation on countering the classic "Reflections on Trusting Trust" attack, which Ken Thompson spoke about in 1984. That attack subverts compilers to continuously re-infect binaries produced by that compiler (including the compiler binary itself) with some kind of malicious payload (a login back door was Thompson's example). The attack is impossible to detect, except by using Wheeler's technique, which was originally described in a 2005 Annual Computer Security Applications Conference (ACSAC) paper [PDF]. His dissertation expands on that work, and the defense of it is open to the public on November 23 at George Mason University in Fairfax, Virginia. "This 2009 dissertation significantly extends my previous 2005 ACSAC paper. For example, I now have a formal proof that DDC is effective (the ACSAC paper only had an informal justification). I also have additional demonstrations, including one with GCC (to show that it scales up) and one with a maliciously corrupted compiler (to show that it really does detect them in the real world). The dissertation is also more general; the ACSAC paper only considered the special case of a 'self-parenting' compiler, while the dissertation eliminates that assumption."

Open Source Software connection

Posted Nov 3, 2009 4:55 UTC (Tue) by dwheeler (guest, #1216) [Link] (3 responses)

I should quickly make the connection to free/libre/open source software (FLOSS), for those who aren't familiar with this problem. After a successful "trusting trust" attack, the source code no longer corresponds with the executable, which renders moot the "many eyes" claim for FLOSS. Thankfully, there's a technique which can detect the attack, and thus, source code review can still work. Even more interestingly, the technique is primarily useful only for those who have access to the source code... which means that against the trusting trust attack, open source software has a decided advantage.

Open Source Software connection

Posted Nov 3, 2009 11:10 UTC (Tue) by sourcejedi (guest, #45153) [Link]

Good point.

I was skeptical about this at first, having been seduced by the perfection of the original "Trusting trust" paper.

"Trusting trust" says you can't trust a single compiler, even if you have the source code. This work shows you can test a pair of compilers for trustworthiness - *provided* they are independent. (They may both be malicious, but you can tell if they are malicious in different ways)...

From where I'm standing, this doesn't automatically rule out that Ken Thompson has recursively back-doored every single compiler in modern use. It would require amazing foresight, but I don't like to rule it out. But what it does say is that if I can bootstrap a hack of a compiler by myself, however slow and sub-optimal it may be, I can then use it to test for back-doors in the current whiz-bang generation of compilers.
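For anyone who wants the shape of that test spelled out, here is a minimal sketch of the diverse double-compiling check as I understand it: build the compiler's source with an independent compiler, use the result to build the same source again, and compare that second stage bit-for-bit with the binary under suspicion. Everything named here is hypothetical: `ddc_matches`, the `run` callback, and especially `toy_run`, which is a toy stand-in for "execute this compiler binary on this source" and not a real compiler. The toy assumes fully deterministic output, which real builds only achieve with some care.

```python
import hashlib
from typing import Callable

# Hypothetical callback type: run(compiler_binary, source) -> output binary.
RunCompiler = Callable[[bytes, bytes], bytes]


def ddc_matches(run: RunCompiler, independent_compiler: bytes,
                compiler_source: bytes, suspect_binary: bytes) -> bool:
    """Return True if the suspect binary is what its source regenerates."""
    # Stage 1: build the compiler's source with the independent compiler.
    stage1 = run(independent_compiler, compiler_source)
    # Stage 2: let that freshly built compiler rebuild the same source.
    stage2 = run(stage1, compiler_source)
    # An honest, deterministic suspect must be bit-for-bit identical.
    return stage2 == suspect_binary


def toy_run(compiler: bytes, source: bytes) -> bytes:
    """Toy model only: honest output is a digest of the source; a
    subverted compiler re-inserts its marker into whatever it builds."""
    out = b"BIN:" + hashlib.sha256(source).digest()
    if compiler.startswith(b"EVIL"):
        out = b"EVIL" + out  # the self-propagating payload
    return out


SOURCE = b"/* pretend this is the compiler's own source code */"
CLEAN = toy_run(b"BIN:honest-parent", SOURCE)
INFECTED = toy_run(b"EVIL:seed", SOURCE)

if __name__ == "__main__":
    print(ddc_matches(toy_run, b"BIN:independent", SOURCE, CLEAN))
    print(ddc_matches(toy_run, b"BIN:independent", SOURCE, INFECTED))
```

In the toy model the infected binary happily regenerates itself from clean source, which is why rebuilding with the suspect compiler alone proves nothing; the mismatch only shows up once an independent compiler is brought in.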

Open Source Software connection

Posted Nov 7, 2009 2:15 UTC (Sat) by roelofs (guest, #2599) [Link] (1 responses)

Is the lack of access to your dissertation due to an intentional embargo period (e.g., until the public defense), or is it an ooper?

Forbidden

You don't have permission to access /trusting-trust/dissertation/wheeler-trusting-trust-ddc.pdf on this server.
Apache/2.2.3 (CentOS) Server at www.dwheeler.com Port 80

It seems slightly contradictory to the (very generous) license terms, so I'm guessing the latter...

Greg

P.S. Congrats! Kind of a nice feeling to be done with the whole thing, eh?

whoops...

Posted Nov 7, 2009 2:16 UTC (Sat) by roelofs (guest, #2599) [Link]

Doh! Just screwed up the page-width. Sorry 'bout that.

Don't embed timestamps!

Posted Nov 3, 2009 9:10 UTC (Tue) by edmundo (guest, #616) [Link] (11 responses)

Comparing your two binaries, expecting them to be identical, only works if your build process does not embed timestamps. This is something I've complained about before, for example in the context of Debian's process for building packages, but apparently other people don't think it matters (maybe they're just less pedantic than me).

Also (in my opinion) build processes should not embed: the name of the user doing the building, the name of the machine, the names of various temporary files, etc. Also, if you tar up a directory tree with the ordinary "tar" command then you get the files in "random" order, which is another source of nondeterminism that can be eliminated with some care.

The advantage of doing all this is that several people can build the same package and get identical binaries, which you can compare to detect trojans. In the case of Debian, it would also let you make sure that the released system is capable of building the packages in the released system. The Debian process didn't guarantee that the last time I checked, because the release might contain packages that were built with a prerelease version of the system, and in unusual cases such a package might turn out to behave differently from the same package built with the released system.
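The tar point can be made concrete. Below is a minimal sketch, assuming Python's standard `tarfile` module, of archiving a tree so that two people with the same file contents get the same bytes: entries are added in sorted order, and the modification time and owner fields are overwritten with fixed values. The names `deterministic_tar` and `_normalize` are made up for illustration, and zeroing those particular fields is my assumption about what "some care" would involve; file modes are left alone, so they still have to agree between builders.

```python
import io
import os
import tarfile


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Overwrite metadata that varies between builders and build times."""
    info.mtime = 0                 # no embedded timestamp
    info.uid = info.gid = 0        # no numeric owner
    info.uname = info.gname = ""   # no user or group name
    return info


def deterministic_tar(root: str) -> bytes:
    """Archive everything below `root` in sorted order, metadata fixed."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries.append(os.path.relpath(full, root))

    buf = io.BytesIO()
    # Pin the archive format so the default cannot change the output.
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.GNU_FORMAT) as tar:
        for rel in sorted(entries):
            # recursive=False: each entry is added exactly once, in the
            # order chosen here rather than in directory-listing order.
            tar.add(os.path.join(root, rel), arcname=rel,
                    recursive=False, filter=_normalize)
    return buf.getvalue()
```

Two builders can then compare a hash of `deterministic_tar(path)` rather than the raw output of whatever archiver they happened to run.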

Don't embed timestamps!

Posted Nov 3, 2009 9:45 UTC (Tue) by mjthayer (guest, #39183) [Link]

Add in the word "optionally" in the right places and I quite agree. Some of the things you mention are useful to have, and not necessarily things you want to ban outright, but they don't belong in a "release version" of any software.

Don't embed timestamps!

Posted Nov 3, 2009 9:46 UTC (Tue) by nix (subscriber, #2304) [Link] (7 responses)

GCC has a bit of a problem right now when bootstrapping itself using C++, because anonymous namespaces have random names rather than static linkage (I have no idea why the standard saw fit to require non-static linkage for such things), which leads to each bootstrap stage differing.

Don't embed timestamps!

Posted Nov 3, 2009 10:52 UTC (Tue) by jwakely (subscriber, #60262) [Link] (6 responses)

I think the problem is fixed, or soon will be.

Some uses of C++ templates require names with external linkage, so anonymous namespaces can be used to get similar effects to static linkage, without actually having static linkage.

Don't embed timestamps!

Posted Nov 3, 2009 11:17 UTC (Tue) by nix (subscriber, #2304) [Link] (5 responses)

Ah, a random seed option or something?

(I've been wondering what uses could possibly require external linkage for some time. It's not as if you can put things in anonymous namespaces in two different translation units and have them refer to each other, and references via function pointers don't care if you're using static linkage or not. So why is this done?)

Don't embed timestamps!

Posted Nov 3, 2009 11:47 UTC (Tue) by quotemstr (subscriber, #45331) [Link] (2 responses)

For starters, pointers used as non-type template arguments must point to things with external linkage.

Don't embed timestamps!

Posted Nov 3, 2009 14:58 UTC (Tue) by nix (subscriber, #2304) [Link] (1 responses)

But if they're in an anonymous namespace the name can't leak outside the translation unit anyway (at least not in a way which could be used at compile time to instantiate a template; only as a pointer at runtime), so this seems like a pointless requirement.

C++ templates and static functions

Posted Nov 4, 2009 17:21 UTC (Wed) by jwakely (subscriber, #60262) [Link]

See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.ht... and http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19092

Apparently the restriction on calling static functions from templates was put in place to allow template instantiation to be done at link-time, using only the information available to the linker (which might not include static functions).

At the Santa Cruz meeting of the C++ committee this issue was just closed, with a change allowing static functions to be found (no one does instantiation at link time).

I'll stop derailing this article with off-topic comments about C++ trivia now.

Don't embed timestamps!

Posted Nov 3, 2009 16:58 UTC (Tue) by jwakely (subscriber, #60262) [Link] (1 responses)

> Ah, a random seed option or something?

That option already exists, see -frandom-seed, but shouldn't be needed to bootstrap gcc. (It is useful for users who want to compare binaries though.)

Instead, the anon namespace will still cause a random string to be part of the mangled name, but that name will no longer be used in the context that was causing bootstrap comparison failures.

Also, N.B. since GCC 4.2 "Members of the anonymous namespace are now local to a particular translation unit, along with any other declarations which use them, though they are still treated as having external linkage for language semantics." (from http://gcc.gnu.org/gcc-4.2/changes.html)
That means you get most of the benefits of static linkage, without actually having static linkage as far as the language is concerned.

Don't embed timestamps!

Posted Nov 3, 2009 20:00 UTC (Tue) by nix (subscriber, #2304) [Link]

Ah. That's what surprised me: I noticed that NEWS item going past and
assumed it meant random names were no longer used. Foolish me didn't
bother to actually check the source code though.

Don't embed timestamps!

Posted Nov 3, 2009 13:50 UTC (Tue) by arafel (subscriber, #18557) [Link]

I realise your comment probably wasn't directed at the paper, but for the benefit of others, the paper does actually discuss how to handle embedded timestamps. It's not an ignored problem. :-)

Don't embed timestamps!

Posted Nov 3, 2009 19:42 UTC (Tue) by marcH (subscriber, #57642) [Link]

> Also (in my opinion) build processes should not embed: the name of the user doing the building, the name of the machine,...

But this is useful information. I think the right trade-off is to embed it, but only at the highest packaging level, where it can easily be separated from the actual binaries.


Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds