User: Password:
|
|
Subscribe / Log in / New account

A longstanding GnuTLS certificate validation botch

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 12:00 UTC (Sun) by ibukanov (subscriber, #3942)
In reply to: A longstanding GnuTLS certificate validation botch by paulj
Parent article: A longstanding GnuTLS certificate validation botch

The technique described in your blog effectively changes the code pattern:

int example() {
    ...
    int errors = f1();
    if (errors) {
         error_cleanup1();
         return errors;
    }
    int errors = f2();
    if (errors) {
         error_cleanup2();
         return errors;
    }
    int errors = f3();
    if (errors) {
         error_cleanup3();
         return errors;
    }
    ...
    normal_cleanup();
    return 0;
}
into
void example(int *errorp) {
    if (*errorp)
        return;
    ...
    f1(errorp);
    f2(errorp);
    f3(errorp);
    ...
    cleanup();
    return 0;
}
That is, this moves the error check into the callee freeing the caller from doing the repeated error checks. This is possible as long as the error state idempotent and indeed minimizes the number of branches in the code.

This pattern does not require using finite state machines for the whole code. I have read about it in a book published over 30 years ago about embedded systems programming and always wondered why this have never became mainstream.


(Log in to post comments)

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 15:31 UTC (Sun) by vonbrand (guest, #4458) [Link]

This complicates the callee for no benefit, it turns out to have to check for errors before doing anything. The error testing is hidden (how do I know all of the callees are doing their job?), and moreover handling cases where an error in the first part allows to do (part of) the rest are a mess.

Just not calling the other tasks if they make no sense trades a call and a branch inside for a simple branch. But who's worrying about performance in error paths... it does cut down on code size if the tasks are used repeatedly (only the branch inside, not each time it is called upon). Perhaps that was the real reason for this coding pattern?

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 16:39 UTC (Sun) by ibukanov (subscriber, #3942) [Link]

> This complicates the callee for no benefit, it turns out to have to check for errors before doing anything.

In that model the callee has to check for errors only as a debugging tool so the origin of the error and its propagation through the code can be logged. In general the idea is to treat errors similar to NaN that does not affects the data flow but only the result. That minimizes the amount of rarely taken branches potentially allowing to test all code paths.

> how do I know all of the callees are doing their job?

There is no need for that as long as the code that detected the error properly taints the evaluation with error condition.

> handling cases where an error in the first part allows to do (part of) the rest are a mess.

I do not see the source of supposed extra mess. If one can recover from a particular error, the error state can be cleared.

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 19:48 UTC (Sun) by ibukanov (subscriber, #3942) [Link]

> handling cases where an error in the first part allows to do (part of) the rest are a mess.

Consider what a good parser does when it detects a syntax error. First it reports it. Second it tries to recover from it guessing if necessary to allow to report *other* errors during single pass. The end result is that it produces a valid parsed tree reflecting its guesses but that tree is tainted with errors so the code generation would never be performed.

Effectively this replaces all the error checks in all callers in the parser implementation by a single check in the code generator for presence of errors.

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 22:12 UTC (Sun) by vonbrand (guest, #4458) [Link]

If you'd like to find all errors that is right. But if an error means curtains, better bail out early. In security-sensitive code, you'd better make sure there is no way the tainted state gets somehow untainted (by mistake or by nefarious intent).

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 22:34 UTC (Sun) by ibukanov (subscriber, #3942) [Link]

> But if an error means curtains, better bail out early.

in a security-sensitive code it is critical to be able to test all the branches if a formal verification is not feasible. An early bailout not only hinders that via exponential growth of branch space but also subjects the software for timing attacks.

As for accidental untainting, consider that GnuTLS and Apple bugs are exactly the examples of this occurring in the code that follows established practice of error handling in C. Yet I have not heard of bugs caused by untainting errors reported by a parser - typically an infrastructure to support reporting of multiple errors naturally minimizes the number of places in the code where the error state can be cleared. It is just harder to wipe out an error array than to clear or ignore a return value flag.

A longstanding GnuTLS certificate validation botch

Posted Mar 10, 2014 0:37 UTC (Mon) by vonbrand (guest, #4458) [Link]

Bailing out early is what cuts down exponential growth. Finding out that after the first sanity test fails the thirtieth does (or doesn't) adds no useful data (state is tainted, i.e., known broken anyway).

A longstanding GnuTLS certificate validation botch

Posted Mar 10, 2014 0:46 UTC (Mon) by vonbrand (guest, #4458) [Link]

In a compiler "bad untainting" leads to an error cascade, a sadly well-known phenomenon. But people just fix the obvious errors and compile again, they will rarely report that as a compiler bug.

A longstanding GnuTLS certificate validation botch

Posted Mar 9, 2014 16:04 UTC (Sun) by paulj (subscriber, #341) [Link]

I don't know why it isn't used more. I've only really seen it in network protocol code, where the code is implementing an RFC that explicitly specifies a state machine. Which makes it seem the authors didn't think too much about the other applications.

I wonder if maybe many programmers just aren't aware of this approach.

Especially with security sensitive parsers of external input (e.g. network message) I absolutely cringe when I see hand-crafted, irregular parsers, twiddling pointers directly into buffers in C. It's 2014....

I've seen a PhD that extended the Java language with support for specifying the allowed state transitions on objects. This allowed the compiler to do extra checks - sanity checks on the FSM and on users of the code. I think the author has moved on to other things though, I think.

I'd really like to see languages provide more support for more restricted abstractions, like FSMs, that can be layered over the code and be more easily checked for problems than that code.

A longstanding GnuTLS certificate validation botch

Posted Mar 10, 2014 18:59 UTC (Mon) by zblaxell (subscriber, #26385) [Link]

The second form is better than the first, but the first was bad to start with. C code with gotos is better than the first one.

Even then, the transformation doesn't reduce complexity. It just moves complexity around the program in a way that may make practical risks worse.

In the first case, there are lots of branches in the top-level function, but the preconditions of each function are stronger. This means f10() doesn't have to recheck f1()'s work, and neither do f9(), f8(), etc. A bug in f3() that fails to detect a violation of f4()'s preconditions could ruin our day, but f4() is simpler in this case, and if we get f3() wrong we'll probably get f4() wrong at the same time anyway.

In the second case, there are fewer branches at the top level, but the invariants, preconditions, and postconditions of the functions are weaker. f10() has to cope with everything f1() let through, and so does f9(), f8(), etc. This can breed code duplication and maintenance mistakes, and increases the attack surface considerably. Imagine fixing six slightly different string-quoting bugs because every function after f3() has to check for conditions that f3() already identified as part of an error state. We can avoid this failure mode by having every function check *errors and return early, but then we've just created a variation on the first case that the compiler can't optimize as easily, and that requires wrapper functions around libraries that don't use this convention.

Either one has a linearly expanding set of states inside example(), assuming you don't make a dumb semantic mistake. There's pretty much no difference from a formal coverage testing point of view.

A longstanding GnuTLS certificate validation botch

Posted Mar 10, 2014 19:45 UTC (Mon) by ibukanov (subscriber, #3942) [Link]

> In the second case, there are fewer branches at the top level, but the invariants, preconditions, and postconditions of the functions are weaker.

Preconditions and postconditions stay the same. What is different is a reaction to a violated precondition. In the first case that follows a typical C pattern the reaction is to bailout. In the second case the reaction is to recover via guessing what would be the right data while tainting the results.

> Imagine fixing six slightly different string-quoting bugs because every function after f3() has to check for conditions that f3() already identified as part of an error state.

There is no error state besides a pointer to error indicator and logging facilities. If f3() is responsible for enforcing the quotation, then in case of errors it should just change the data while tainting the calculation so the data as seen by the rest of code contain the proper quote. This is similar to what some parsers do to report multiple errors when they rewrite lookahead buffer as a part of recovery.

A longstanding GnuTLS certificate validation botch

Posted Mar 10, 2014 20:44 UTC (Mon) by vonbrand (guest, #4458) [Link]

It might well be dangerous to have e.g. f7() working on data known bad from before, some checks will have to be repeated in such a case (and f7() becomes more complex and fragile). And using this to be able to say that tests covered x% of f7() is nonsense, what is interesting isn't coverage in cases that don't do anything (because it failed before).

A longstanding GnuTLS certificate validation botch

Posted Mar 10, 2014 21:14 UTC (Mon) by zblaxell (subscriber, #26385) [Link]

I don't see how early bailout is a "C pattern." Touching tainted data from the Internet is risky in every language. We want to stop doing it as soon as we can determine a negative result to reduce our attack surface (unless we are defending against a timing attack). In a language that isn't C we might throw an exception or use some other idiom instead of gotos or cleanup functions, but we'd still stop processing early to avoid exposing further code to attack.

Yes, we can do all sorts of wonderful analysis of invalid certificates if we keep going through all the parsing stages; however, at the end of the day a malformed certificate is still invalid, and needs only to be rejected.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds