
Pondering the X client vulnerabilities

Posted May 28, 2013 15:27 UTC (Tue) by tjc (guest, #137)
In reply to: Pondering the X client vulnerabilities by epa
Parent article: Pondering the X client vulnerabilities

> But why is the software written in such an unsafe language to start with?

Because the project started in 1984. What else would they have used -- Ada?



Pondering the X client vulnerabilities

Posted May 29, 2013 20:12 UTC (Wed) by epa (subscriber, #39769) [Link] (17 responses)

I went along with the article's statement that this code was from the early 90s (though the first versions of X were much older). But really my comment is not so much a criticism of the developers as of the glacial pace at which programming environments improve for general use. You'd think that safer C variants such as Cyclone would have displaced plain C by now, or, better, that the standards committees would have incorporated most of their safety features into newer C standards - with a way to turn them off on a case-by-case basis.

Pondering the X client vulnerabilities

Posted May 29, 2013 21:05 UTC (Wed) by jg (guest, #17537) [Link]

Well, there are still a few lines of code from W (in a few header files): those date from 1982 and 1983.

Bob Scheifler and I started working on X1 in June of 1984.

X11 (the protocol version still used today) shipped in 1987, IIRC.

There were precious few options available to us.

One of the early X Window managers and a terminal emulator were written in Clu, however. http://en.wikipedia.org/wiki/CLU_(programming_language) Bob Scheifler worked for Barbara Liskov.

But Clu wasn't available on other platforms. (Project Athena, where I worked at the time as a Digital employee, was a joint project with IBM, and what became the IBM RT/PC was known to be in our future; Suns were also widespread.) So we needed a language that was cross-platform. C++ wasn't yet an option either.

So we "knew better", but there just wasn't any actually viable alternative to C in that era.

Pondering the X client vulnerabilities

Posted May 29, 2013 21:08 UTC (Wed) by smoogen (subscriber, #97) [Link] (15 responses)

Those things are controlled more by social pressure than by code advances. It doesn't matter if you have figured out how much better Esperanto is than every other European language out there... if no one else around you uses it, then you are pretty much talking to yourself. And while you can code yourself a nice program in Cyclone... you aren't going to get a lot of help when you want to make it big.

That social inertia is what keeps certain languages in place even after you would think they should have been put away 20-30 years ago.

Pondering the X client vulnerabilities

Posted May 29, 2013 21:29 UTC (Wed) by epa (subscriber, #39769) [Link] (14 responses)

Yes, absolutely, I was referring to the social environment that makes C a common language despite it being far too easy to make dangerous mistakes. It's not as if safer ways of programming didn't exist even back in 1984 - but then, the choice of C was more understandable since alternatives were equally flawed in various ways. Why, though, doesn't the C language become a bit safer with each new version (C89, C99, C11) with code gradually being moved across? Unchecked 'thin pointers' ought to be only for performance-critical code by now.

Pondering the X client vulnerabilities

Posted May 30, 2013 9:27 UTC (Thu) by mpr22 (subscriber, #60784) [Link] (6 responses)

If someone is coding in C (or C++, for that matter), there's a decent chance they already believe all their code is performance-critical.

Pondering the X client vulnerabilities

Posted May 30, 2013 11:22 UTC (Thu) by hummassa (subscriber, #307) [Link] (5 responses)

In the late 1990s and early 2000s it was somewhat true that we had lots of room for non-performance-critical code... But these days, what we have are lots of mobile platforms and high-power-efficiency server platforms, where performance and cost-to-benefit ratio are measured not only in "operations per second" but also in "operations per milliwatt-hour"... So, unless the other languages have made great strides in things like "not wasting a lot of RAM space and CPU cycles to achieve the same end result" (and yes, some did...), C/C++ (esp. the latter) seems like a good solution in many cases.

Pondering the X client vulnerabilities

Posted May 30, 2013 21:17 UTC (Thu) by dvdeug (guest, #10998) [Link] (3 responses)

Android runs Java. The first Power Mac G4 in 1999 had export problems because it was classified as a supercomputer; every smartphone on the planet is faster than those systems running at 400 MHz.

I've never understood why an increase in speed is worth having to tell your customers that their credit card numbers are in the hands of hackers, or having your website hosting child porn. Taking the website down for a week while your IT department (getting massive overtime) tries to recover what it can from your compromised systems is bloody expensive. Safety checks are a classic case of a surcharge on day-to-day operations to prevent possible disaster.

Android doesn't really run Java

Posted May 31, 2013 11:39 UTC (Fri) by alex (subscriber, #1355) [Link] (2 responses)

Well obviously I'm being a little trollish ;-)

However, an awful lot of the performance-critical areas of the code are written in lower-level code and exported to Java via JNI (not to mention key components like the browser, which is pretty much all C++). There is a debate to be had as to how much you can ameliorate some of the worst areas of C with C++, but once you start up the road to interpreted/JITed higher-level languages you will pay a performance penalty. It's fine when they are making fairly simple decisions that result in work for lower down the stack, but when you are moving lots of stuff around you really don't want to be in even the most efficient JIT code.

Android doesn't really run Java

Posted May 31, 2013 14:01 UTC (Fri) by hummassa (subscriber, #307) [Link]

You are right. And, of course, the performance penalty would, theoretically at least, grow still larger when we are talking about performance as battery/energy economy. BTW -- THAT is what I'd like to see measured in this context.

On the same topic, people demonize C mainly because of three things: null pointers, buffer overflows and integer overflows. You can write beautiful, as-efficient-as-C C++ code without any of those. But PHP/Java-like vulns (mainly data scrubbing/encoding/quoting errors) would persist, and those are a little bit more difficult to deal with.

Android doesn't really run Java

Posted May 31, 2013 18:12 UTC (Fri) by dvdeug (guest, #10998) [Link]

Java doesn't need to be interpreted; I presume it makes it easier to limit Android code and it certainly makes MIPS and ARM Androids both possible. In any case, it's not really relevant to the question of use-after-frees and buffer overruns.

The low-level stuff can't be ignored, but I'm a lot more comfortable having the base libraries be written in C, since they're written by someone who presumably knows what they're doing, and having the programs running on that code written in Java.

As for the browser, you're talking about a tool that deals with a sustained barrage of untrusted material, and it's not hard to feed a web browser compromised material. Looking at Chrome's recent vulnerabilities, in May we had 8 use-after-frees and one out-of-bounds read (and four others), and in March we had 6 use-after-frees and 1 out-of-bounds read (and 15 others, including some "memory corruptions"). (Counts of CVEs listed in Debian's changelog.)

As hummassa says, there are other standard security errors that Python and Java, as well as C and C++, get up to. But 45% of Chrome's recent CVEs wouldn't have existed in Java or similar systems. Pick the low-hanging fruit first.

Pondering the X client vulnerabilities

Posted Jun 5, 2013 11:03 UTC (Wed) by epa (subscriber, #39769) [Link]

I'm not going to pretend that because computers are faster, there is no longer performance-critical code. (Even though it is worth bearing in mind that today's mobile platforms or ARM-based servers are still far more powerful than the high-end workstations for which X was written.) My point is that some code is performance-critical, but not *all of it*. The 80-20 rule applies - actually more like 99-1, where a small part of the code is the inner loop where micro-optimizations like skipping bounds checking make a difference.

By all means profile the code, find the hot sections, and after careful review turn off safety for those. That does not justify running without bounds checking or overflow checking for the rest of the code.

Pondering the X client vulnerabilities

Posted May 30, 2013 11:18 UTC (Thu) by etienne (guest, #25256) [Link] (2 responses)

> Unchecked 'thin pointers' ought to be only for performance-critical code by now.

Please define 'thick pointers', the opposite of 'thin pointers'.
An address plus a size? The size of what?
OK, for a structure, a thick pointer to that structure is {pointer + size of structure}.
For an array of structures, a thick pointer to that array is {pointer + size of complete array}. To get a thick pointer to an element of that array, you now need the size of that element - possible.
Now you want to scan through an array: you do not want a thick pointer to an element of the array (because you would not be able to increment it to point to the next element); you want a thick pointer to the array of elements, and each time you increment the thick pointer you decrement its size by the size of one element.
Now someone wants to search the array from the end: each time you decrement the thick pointer you increment its size by the size of one element. But then what stops you from going below the start of the array? You need to include a minimum base address in the thick pointer?
Now imagine all this is about chars and arrays of chars - your thick pointer is bigger than the string you are manipulating, and your code initially only prints a hexadecimal value into an array of 256 bytes, so you know it will not overflow.
Do you think the compiler will be able to optimise away a field of a structure (the thick pointer's size) when it is obvious there will not be any problem? It is not going to happen if the function is not static inside the C file.
The "thick pointer size" can also be a variable inside the compiler (range analysis) instead of inside the compiled function, but that has its limits.

Pondering the X client vulnerabilities

Posted May 30, 2013 12:20 UTC (Thu) by mpr22 (subscriber, #60784) [Link]

I say the following as someone who loves C and codes in it for a hobby and for a living: As footguns go, the C-style (let's not use the word "thin", since it appears to have caused confusion) pointer is a Single Action Army with six in the cylinder.

Safer-than-C pointers have stricter semantics. You can't just casually toss them into, and fish them back out of, the semantics-free abyss that is void *. You can't use safe_pointer_to_foo as if it was safe_pointer_to_array_of_foo; if you want a mutable handle to an entry in an array_of_foo, you create an iterator_on_array_of_foo, which consists of a safe_pointer_to_array_of_foo and an index.

And the principle of defensive design indicates that since language and library safety features exist to protect the user from the effects of the programmer's idiocy, the syntax that is quick and easy to use should be the one that has all the safeties enabled, and if the compiler can't sufficiently intelligently optimize away redundant safeties then there should be an unsafe syntax carrying a significant dose of syntactic salt. Unfortunately, because people tend to resist even those changes from which they would get real benefits, getting such a change of outlook past the standards committee is very hard.

Pondering the X client vulnerabilities

Posted May 30, 2013 17:15 UTC (Thu) by tjc (guest, #137) [Link]

The "thick pointer" size problem could be solved by defining a thick pointer with just one additional member, an index (generated by the compiler or run-time code) into a global run-time data structure that includes the remaining information about the pointer. This way all thick pointers have the same size, and the remaining information doesn't have to be passed around on the stack.

Pondering the X client vulnerabilities

Posted May 30, 2013 20:48 UTC (Thu) by kleptog (subscriber, #1183) [Link] (3 responses)

The amount of code written in C is so vast that trying to fiddle with basic features of the language (like pointer arithmetic) will simply lead to vast swathes of C code failing to compile. To get people to advance to newer versions of the language requires not leaving a huge percentage of people behind. Witness Python 2 vs 3.

In any case, C doesn't need to become a safe language. There are plenty of safe languages to code in, but you need at least one unsafe language to implement all the safe languages in. It may as well be C.

What I think would really make a difference is if someone actually implemented a working C-like language (you couldn't call it C) that had a few extra restrictions, like: 'a pointer assigned an address in array X cannot point outside that array by pointer arithmetic'. And then demonstrated that it (a) performed ok and (b) ran >99% of existing C code.

I think the easiest way to achieve this would be to code a C interpreter in PyPy, where you could implement the restrictions, and you get a JIT to make it perform. If you could make this work, my guess is you could convince a lot of people to run most C programs under such an interpreter, and you would have solved the problem for a significant portion of the codebase.

But whether it's actually achievable...

Pondering the X client vulnerabilities

Posted Jun 5, 2013 10:10 UTC (Wed) by epa (subscriber, #39769) [Link] (1 response)

Have you looked at Cyclone?

Pondering the X client vulnerabilities

Posted Jun 5, 2013 20:31 UTC (Wed) by kleptog (subscriber, #1183) [Link]

Interesting, but it looks like it punted and just created a new language that looks like C but isn't. Which means it's never going to catch on, because it would require changing the code.

I mean, valgrind manages to catch all sorts of memory errors, without any source access and with not many false positives. ISTM a compiler should be able to do a similar job, but more efficiently (valgrind is slow as molasses, but it points out the error to you right away, which is what makes it worth it).

Pondering the X client vulnerabilities

Posted Jun 6, 2013 8:55 UTC (Thu) by kevinm (guest, #69913) [Link]

Actually, for the restriction you gave ("a pointer assigned an address in array X cannot point outside that array by pointer arithmetic"), you *could* implement pretty much that restriction and still call it C. In C, pointer arithmetic is only defined if it results in a pointer within the original object, or one element past the end of an array.

The major obstacle would be that you'd need an ABI change to pass around the necessary object extent data to enforce this everywhere.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds