OpenSSH and the dangers of unused code

Posted Jan 21, 2016 10:20 UTC (Thu) by mjthayer (guest, #39183)
In reply to: OpenSSH and the dangers of unused code by andresfreund
Parent article: OpenSSH and the dangers of unused code

> I think you're overstating the problem here. You can easily prevent dead stores from being eliminated by adding a volatile specifier to the passed type. That says the store has a side effect besides saving a value for the next read.

I thought that the problem was that the code authors forgot to mark the store as having side effects, not that they could not have easily done it if they had thought of it.
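
For reference, a minimal sketch of what "adding a volatile specifier to the passed type" could look like. The name wipe_buffer is hypothetical and not taken from OpenSSH. Note that the standard's guarantee is worded in terms of volatile objects; treating a write through a volatile-qualified pointer the same way is what compilers do in practice, so take this as an illustration rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical helper: zero a buffer through a volatile-qualified pointer
 * so that each byte store is treated as a volatile access. */
void wipe_buffer(volatile void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--) {
        *p++ = 0;   /* one volatile byte store per iteration */
    }
}
```

A caller would write wipe_buffer(key, sizeof key) where it would otherwise have called memset, which is exactly the step that is easy to forget.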

Posted Jan 28, 2016 13:09 UTC (Thu) by Wol (subscriber, #4433)

Why is this the code authors' problem? THEY TOLD THE CODE TO CLEAR THE BUFFER!

It's down to the compiler to warn them that it's ignoring their instructions!

Okay, maybe one doesn't want too many optimisation warnings, but for the compiler to effectively DELETE LINES OF CODE without warning the programmers that the resulting program doesn't actually do what they asked it to is a compiler bug - optimisation or no optimisation.

Put it another way - THE CODE THE PROGRAMMERS WROTE is either wrong, or meaningful, so for the compiler to just lose it without warning is a compiler error. Saying that the code authors need to add extra instructions to force the compiler to "do as I say" is, as soon as you put it like that, obviously wrong. And something that even an EXPERIENCED programmer is likely to get wrong.

Cheers,
Wol

Posted Jan 28, 2016 16:16 UTC (Thu) by nybble41 (subscriber, #55106)

> Put it another way - THE CODE THE PROGRAMMERS WROTE is either wrong, or meaningful, so for the compiler to just lose it without warning is a compiler error.

The compiler doesn't have any way to see whether the code was actually what the programmer wrote, or just something embedded in a macro and placed there by the preprocessor. In the latter case the line may make perfect sense in some situations, but have no effect in others. For inlined functions the compiler has a bit more information available (at least during the initial stages, well before the dead store elimination pass) but even there the programmer may just have been writing the code generically—which is simply good coding practice, even if the store is not meaningful in the context of the current compilation unit.

The current behavior is better for the vast majority of real-world programs, where these warnings about standard and expected optimizations would just add noise and perhaps mask more serious issues. For those rare cases where you absolutely need to make sure data isn't retained in memory, there are well-known techniques available to force the store to take place. If you don't know about those techniques and the need to use them, you probably shouldn't be writing such code in the first place.
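
One such technique, sketched here as an illustration only, is to call memset through a volatile function pointer. The names memset_fn and secure_zero are made up for this example. Because the pointer is volatile, the compiler has to load it at the call site and cannot assume it still points at memset, so it has no basis for treating the call as a removable dead store. Where the platform provides them, explicit_bzero or the optional C11 Annex K memset_s serve the same purpose.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: a volatile pointer to memset. The compiler must read the
 * pointer at run time, so it cannot reason about what the callee does. */
static void *(*const volatile memset_fn)(void *, int, size_t) = memset;

/* Hypothetical wrapper: zero len bytes of buf via the volatile pointer. */
void secure_zero(void *buf, size_t len)
{
    memset_fn(buf, 0, len);
}
```

This keeps the library's memset implementation instead of a byte-at-a-time loop, at the cost of an indirect call.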

Posted Jan 29, 2016 17:20 UTC (Fri) by warrax (subscriber, #103205)

> Why is this the code authors' problem? THEY TOLD THE CODE TO CLEAR THE BUFFER!

See, programming languages are partly defined by this thing called "semantics" (operational or otherwise). The semantics of C say that a 'store' which cannot be observed is irrelevant[1]. Therefore the compiler is free to elide it -- as long as it can prove that nobody can observe it.

(Hence the existence of the "volatile" keyword.)

[1] Well, I'm not actually 100% sure that that's what the standard says, but if it didn't then compiler vendors *wouldn't* be doing this optimization in the first place.
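
To make that concrete, here is a small sketch with two made-up functions. Both return the same value for the same input. The only difference is whether the final store to tmp is something the compiler has to keep.

```c
/* Hypothetical example: the last store to tmp is never read again, so no
 * conforming program can observe it and the compiler may drop it. */
int plain_scratch(int x)
{
    int tmp = x * 2;
    int result = tmp + 1;

    tmp = 0;            /* dead store: candidate for elimination */
    return result;
}

/* Same logic, but tmp is a volatile object, so every access to it,
 * including the final store, has to be performed. */
int volatile_scratch(int x)
{
    volatile int tmp = x * 2;
    int result = tmp + 1;

    tmp = 0;            /* volatile access: must not be elided */
    return result;
}
```

Comparing the generated assembly for the two at a high optimisation level is the usual way to see whether a given compiler actually drops the store in the first one.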

Posted Jan 29, 2016 21:05 UTC (Fri) by wahern (subscriber, #37304)

Here's an excerpt from C11 (draft n1570). Elsewhere the standard says that an object's lifetime ends after a call to free (or realloc). See 6.2.4p2. You then have to rely on the definition (3.4.3p1) and application of undefined behavior to say that any side-effect is unobservable after the object's lifetime has ceased.

It's noteworthy that the last line, "[t]his is the observable behavior of the program", was absent from C99. You're still left to connect the dots about what observable behavior means in relationship to side-effects and, thus, allowable optimizations. Nowhere else is "observable behavior" explicitly mentioned except in the section on atomics, though that fact alone is suggestive of both the intended meaning as well as reasons why the language is so terse and circumspect in this respect.

5.1.2.3 Program execution

1 The semantic descriptions in this International Standard describe the behavior of an
abstract machine in which issues of optimization are irrelevant.

2 Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects, which are changes in the state of
the execution environment. Evaluation of an expression in general includes both value
computations and initiation of side effects. Value computation for an lvalue expression
includes determining the identity of the designated object.

...

4 In the abstract machine, all expressions are evaluated as specified by the semantics. An
actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).

...

6 The least requirements on a conforming implementation are:

-- Accesses to volatile objects are evaluated strictly according to the rules of the abstract
machine.
-- At program termination, all data written into files shall be identical to the result that
execution of the program according to the abstract semantics would have produced.
-- The input and output dynamics of interactive devices shall take place as specified in
7.21.3. The intent of these requirements is that unbuffered or line-buffered output
appear as soon as possible, to ensure that prompting messages actually appear prior to
a program waiting for input.

This is the observable behavior of the program.
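
Read against the excerpt above, here is a sketch of the heap case. The function use_secret is hypothetical and only illustrates the pattern. The memset modifies a non-volatile object, and the object's lifetime ends at free, so none of the three items listed in paragraph 6 depends on whether the store happened.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: copy a secret to the heap, use it, then try to
 * clear the copy before freeing it. Returns -1 if allocation fails. */
int use_secret(const char *secret)
{
    size_t len = strlen(secret);
    char *copy = malloc(len + 1);
    int checksum = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, secret, len + 1);

    for (size_t i = 0; i < len; i++)
        checksum += (unsigned char)copy[i];

    memset(copy, 0, len + 1);   /* store to a non-volatile object */
    free(copy);                 /* lifetime of *copy ends here */
    return checksum;
}
```

The return value is identical whether or not the memset is kept, which is how I read the "as if" latitude in paragraph 4.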