> Does the C standard say anywhere that if you read an arbitrary
buffer before ever writing it ...
It says that if you do that, the compiler can do whatever it likes, including removing all of
your files exactly two hours later. Because of this use of uninitialized memory is a serious
bug, and it is good that they were trying to fix it. No compiler will intentionally do
something destructive, but agressive optimizations can make the bug spread in very unexpected
ways to other parts of your program.
In practical terms though, there was not a huge problem - the error had only minor effects on
the program. The fix was not correct and was far worse than the original bug.
> even the nastiest C compiler will be obligated to do the right
thing.
Don't be so sure about that. The compiler is not obligated to do anything reasonable once you
trigger undefined behavior. There is certainly no "right" behavior in those cases.
Posted May 14, 2008 8:56 UTC (Wed) by tialaramex (subscriber, #21167)
[Link]
I'm really not sure than an uninitialised (note, it is properly allocated so there's no chance
of a segfault or similar, it has just not been initialised by the C program) buffer has this
effect in the C standard.
How would the standard distinguish this case from the case where the buffer simply doesn't
belong to the C program ? If my C program looks at the MMIO registers of a PCI card, it
certainly isn't permissible for me to just arbitrarily zero them so that I'm sure they're
initialised...
There is a big difference between behaviour which is implementation defined (and your
implementation should document its behavior) such as the bit patterns of floating point
numbers, and behaviour which the C standard says is altogether undefined and thus you might
reformat your hard disk or start World War III. Clearly
{ char x[40]; x[105] = 'q'; }
is in the latter category, but I'd really want to see chapter and verse quoted before I
believed that the same applies to
{ char m, x[40]; m = x[20]; }
which is the situation we're looking at in OpenSSL.
Cryptographic weakness on Debian systems
Posted May 14, 2008 10:50 UTC (Wed) by nix (subscriber, #2304)
[Link]
`If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate' combines with a definition of undefined behaviour which includes behaviour
`upon use of... indeterminately valued objects'.
It's pretty clearly undefined.
Cryptographic weakness on Debian systems
Posted May 14, 2008 13:17 UTC (Wed) by endecotp (guest, #36428)
[Link]
Note the "has automatic storage duration" bit in nix's quote. This is why e.g. reading your
mmap()'d PCI device registers is not undefined.
Cryptographic weakness on Debian systems
Posted May 14, 2008 22:50 UTC (Wed) by dvdeug (subscriber, #10998)
[Link]
> I'd really want to see chapter and verse quoted before I believed that the same applies to
> { char m, x[40]; m = x[20]; }
On real hardware, without any compiler optimizations, both
{float m, x[40]; m = x[20];}
and
{char *m, (*x)[40]; m = x[20];}
can cause the program to crash, as the mere copy of an invalid float or invalid pointer can
generate fatal errors.
Cryptographic weakness on Debian systems
Posted May 15, 2008 6:06 UTC (Thu) by lysse (guest, #3190)
[Link]
A float I can buy, if the FPU trips over an invalid bit pattern - but an invalid pointer?
Educate me, please - cite an instance!
Cryptographic weakness on Debian systems
Posted May 15, 2008 21:38 UTC (Thu) by Ross (subscriber, #4065)
[Link]
Not all memory spaces are flat, so some pointer values as stored in memory might not be
loadable into the registers that implement them. (Due to segmentation, typed memory, etc.)
Obviously that doesn't apply to most of the systems in use today (with the exception of
function pointers).
Cryptographic weakness on Debian systems
Posted May 19, 2008 4:38 UTC (Mon) by donwaugaman (subscriber, #4214)
[Link]
On a segmented architecture, load an invalid descriptor into a segment register. Boom! Not
on all architectures, of course - I think the 386 family would be safe with this until you
tried to access the segment - but some machines would generate an exception on the load.
Cryptographic weakness on Debian systems
Posted May 15, 2008 2:28 UTC (Thu) by Ross (subscriber, #4065)
[Link]
I didn't mean that it wasn't properly allocated.
The program is reading data returned by malloc().
Not only can you not trust it to have any specific value (or to be "random"), but it invokes
undefined behavior just like reading an uninitialized local variable. The only stuff which is
exempt from this is global data (static), or data guaranteed to be initialized by your
environment before your program starts.
> is in the latter category, but I'd really want to see chapter and verse quoted before I
believed that the same applies to
> { char m, x[40]; m = x[20]; }
> which is the situation we're looking at in OpenSSL.
I thought we were talking about malloc()ed memory...
char *x = malloc(20);
printf("%c\n", x[10]); /* undefined behavior */
If it's a global struct or array, then it should contain all zero bytes -- not uninitialized
at all.
I don't have a copy of the standard, but check in the list of actions which invoke undefined
behavior, in the library section. A draft I have contains:
"- The value of the object allocated by the malloc function is used"
-Ross
Cryptographic weakness on Debian systems
Posted May 14, 2008 10:46 UTC (Wed) by nix (subscriber, #2304)
[Link]
It's even worse than that. Because the Standard does not define the behaviour of a translator
running over code that invokes undefined behaviour, the program could legally remove all your
files *as soon as it started up*. The compiler itself could do it when compiling said code.
In practice, a compiler that did that, or that responded to code invoking undefined behaviour
by buffer-overrunning or imploding in flames, would be considered to have rather big QoI
problems: but it's not unheard of by any means. Remember the GCC `bailing out' message from
days of yore? That was emitted when the compiler *segfaulted*, but was prettied up somewhat in
the vague hope that people wouldn't complain about it. (These days the same problems yield an
'internal compiler error', which *is* considered a bug: but if it happens on invalid code, not
a very important bug.)
Cryptographic weakness on Debian systems
Posted May 14, 2008 14:28 UTC (Wed) by welinder (guest, #4699)
[Link]
> It's even worse than that. Because the Standard does not
> define the behaviour of a translator running over code that
> invokes undefined behaviour, the program could legally remove
> all your files *as soon as it started up*. The compiler itself
> could do it when compiling said code.
In this case, no. That would require that the translator could
determine that undefined behaviour would happen. That is not
possible in any, but the most trivial, of programs. For example,
link with a malloc that always returns NULL -- boring, but ok
for the C standard -- and the problematic code will not be reached.
Cryptographic weakness on Debian systems
Posted May 14, 2008 19:38 UTC (Wed) by nix (subscriber, #2304)
[Link]
There's no requirement for the code incurring undefined behaviour to be
executed. The Standard imposes requirements on *translators*, and the
translator sees that code. However, compilation must succeed `unless [the
translator] can determine that every possible execution of that program
would result in undefined behavior' (or there's a syntax error or
constraint violation, neither of which is true here), which could be read
to imply that if the translator can determine that there are several
possible ways to interpret some part of a program, some undefined and some
not, the translator must assume that the not-undefined interpretation
holds.
This sometimes has incredibly counterintuitive consequences, so as a QoI
issue it is not always followed. (e.g. in this case I suspect it would be
permitted for a compiler to spot that the problematic code is not executed
if malloc() always returns NULL, and expand the appropriate malloc() calls
inline to a constant NULL on the basis that returning anything else would
incur undefined behaviour! In practice, this wouldn't get done.)