GMP and assert()

By Jake Edge
February 27, 2019

A report of a potential security problem in the GNU Multiple Precision Arithmetic (GMP) library was met with a mixed reaction, from skepticism to responses verging on hostility, but the report ultimately raised a question worth pondering. What role should assertions (i.e. calls to the POSIX assert() macro) play in error handling? An assertion that fails leads to a process exit, which may not be what a developer calling into a library expects. Unexpected behavior is, of course, one step on a path that can lead to security holes.

Jeffrey Walton posted a message with "Asserts considered harmful (or GMP spills its sensitive information)" as its subject line to the oss-security mailing list right at the end of 2018. He noted that GMP uses assert() for error checking, which could lead to a number of undesirable effects. Software calling GMP might well be handling encryption keys or other sensitive information, which could be exposed via core dumps or other means (e.g. error-reporting services) due to the abort() call made in response to the failing assertion.

As the subject of the mail might indicate, the tone of the message was a tad combative and, perhaps, a bit condescending. That may help explain some of the reactions to it. Walton's message begins as follows:

The GMP library uses asserts to crash a program at runtime when presented with data it did not expect. The library also ignores user requests to remove asserts using Posix's -DNDEBUG. Posix asserts are a [debugging] aide intended for [development], and using them in production software ranges from questionable to insecure.

Many programs can safely use assert to crash a program at runtime. However, the [prerequisite] is, the program cannot handle sensitive information like user passwords, user keys or sensitive documents.

High integrity software, like GMP and Nettle, cannot safely use an assert to crash a program.

He goes on to describe what happens when an assertion fails and how that can lead to sensitive information leaking. He also includes an example that uses the Nettle cryptographic library (which calls into GMP) to trigger an assertion, crash, and core dump. It turns out that the assertion is actually caused by a bug in Nettle that has since been fixed. Walton built Nettle and GMP with -DNDEBUG, which is meant to disable assert(). As noted above, that did not disable assert() in GMP, but did work for Nettle. However, that tickled a bug in one of the Nettle test cases, which caused the GMP assertion to fail. Discovering that led to some dismissive responses, but the larger point still stands: should applications expect libraries to abort()?

Most in the thread argued that the danger of trying to continue past a failed assertion could be worse than the information leak. Security-sensitive programs should already be ensuring that no core dumps are produced and that any abort()-reporting services are disabled. In addition, as Vincent Lefèvre pointed out, there are plenty of other ways for a program to crash (e.g. segmentation fault) that could lead to the same information leak. So programs handling sensitive information need to be prepared for that in any case.

The crux of Walton's argument is that the assertions would be better handled by returning an error rather than crashing (or attempting to crash, since programs could be set up to handle the SIGABRT). That may be true but it would change the API and, perhaps, slow down the library, Lefèvre said.

Moreover, some asserts may come from the detection of an inconsistent state. In this case, it is better to abort. Otherwise letting the program continue may have worse consequences.

The developer of Nettle, Niels Möller, agreed: "Crashing in a controlled fashion may also be *more* secure that continuing execution with undefined results. Depending on circumstances, of course." He went on to say that Walton's perspective differed from his own:

I read the general statement "asserts considered harmful" as your personal [opinion], likely based on experience with very different development projects than I'm involved with.

In the final analysis, developers of programs handling sensitive information need to be more careful than other developers. One could argue that those developers should be familiar with the behavior of any libraries that they use, so that assertion failures are not unexpected. Beyond that, as GMP developer Torbjörn Granlund pointed out, the API of a library is determined by its developers; other projects are free to use the library—or not—at their discretion.

Index entries for this article
Security	Vulnerabilities/Library

GMP and assert()

Posted Feb 27, 2019 22:04 UTC (Wed) by BenHutchings (subscriber, #37955) [Link] (5 responses)

Applications that shouldn't ever core-dump should call prctl(PR_SET_DUMPABLE, 0, 0, 0, 0). It doesn't make any sense to say that library writers should avoid triggering a core-dump; that's like saying libraries shouldn't have bugs.

GMP and assert()

Posted Feb 27, 2019 22:14 UTC (Wed) by roc (subscriber, #30627) [Link] (4 responses)

Applications can also use MADV_DONTDUMP to mark specific regions of memory as non-dumpable. Probably secrets in memory should be protected with both MADV_DONTDUMP and mlock().

GMP and assert()

Posted Feb 27, 2019 23:44 UTC (Wed) by gdt (subscriber, #6284) [Link] (3 responses)

The problem is a lack of more general support for secrets-keeping in CPUs and operating systems.

Even if the secret is protected with both MADV_DONTDUMP and mlock() the library may copy that memory. That copying might not be unreasonable -- to access data in modern CPUs is to copy it, at the very least into CPU caches and registers.

We saw the same thing recently with password databases, where graphics libraries retained the contents of previously-displayed text and so passwords were still in RAM even if the password database had erased its own storage. In turn those graphics libraries may feel that free()ing data means it has passed from their responsibility.

GMP and assert()

Posted Feb 28, 2019 6:14 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

I think it's worse than that. You don't know that your OS is running on the bare metal. It might be in a container or hypervisor, either of which might reasonably fail to know (or care) about prctl() and mlock() calls inside the guest. As an extreme case, if you're running a VM in a managed cloud-like environment, and the host decides to live-migrate you to a new machine, then it's going to take a complete memory dump, and that complete memory dump is going to go over the (local) network, like it or not. At least it'll be encrypted (probably).

GMP and assert()

Posted Feb 28, 2019 7:33 UTC (Thu) by mjthayer (guest, #39183) [Link] (1 responses)

> We saw the same thing recently with password databases, where graphics libraries retained the contents of previously-displayed text and so passwords were still in RAM even if the password database had erased its own storage. In turn those graphics libraries may feel that free()ing data means it has passed from their responsibility.

That is an interesting new twist in the display-password-or-display-dots question. Accessing graphics memory is probably more of a threat than shoulder surfing these days. (I would be interested to know if it really is a realistic threat, as in worth the effort for an attacker.) A flag marking a graphics surface as sensitive or something on those lines might help.

GMP and assert()

Posted Feb 28, 2019 22:23 UTC (Thu) by flussence (guest, #85566) [Link]

It seems like a very real threat for shared-memory GPUs - most laptops and some desktops, maybe less so for discrete graphics cards.

It's absolutely trivial to demonstrate pulling unsanitized chunks of VRAM out of a GPU too: just connect one tiny screen and one huge one, switch VT (linux fbcon shrinks to fit the smaller screen, leaving the bottom right of the larger one uninitialised) and dump the framebuffer using whatever means available. The empty space is black on kernel 4.20 so I assume it's fixed now, but for many years this wasn't the case.

GMP and assert()

Posted Feb 28, 2019 3:59 UTC (Thu) by mm7323 (subscriber, #87386) [Link]

abort() is pretty handy for asserts in libraries. Unlike return codes it's hard to ignore, and the resulting core file makes tracking down any error really easy. This must increase the quality of software using such libraries.

As already said, applications can use prctl() if too sensitive to core-dump, and that covers all cases not just abort () triggers

GMP and assert()

Posted Feb 28, 2019 9:51 UTC (Thu) by smurf (subscriber, #17840) [Link]

There probably should be an option whether to abort(), or return a NaN (or an object that behaves like one but can encapsulate the error code+message) instead.

GMP and assert()

Posted Feb 28, 2019 14:00 UTC (Thu) by devkev (subscriber, #74096) [Link] (34 responses)

I'm pretty amazed at all the people who think this is about leaking info, and that it's acceptable behaviour for a library. As others have pointed out, there are ways to guard against the info leaks, and more ways to leak info than just crashing. But the real issue here is whether it's appropriate for a library to end the process.

The whole point of a library is to be embedded inside some larger process. When you create a process, it's not by running a library, it's by running a program. So it should be the program that decides when that process ends. The lifecycle of that process is none of the library's business, and the library should not do anything to interfere with it. It is entirely inappropriate for a library to unilaterally pull the rug out from under the entire process by early terminating it.

If the library has entered a state where it can no longer perform useful work, then it can just tell the application that. It can return a special "fatal" error response, and if it's completely hosed, then it could set a flag so that all other library calls immediately return the same fatal error. This flag would only be cleared when the problematic state has been fixed (eg. by a special function call which re-inits the entire library, clearing out all its internal state).

For example, the program might be using the library for some optional functionality. In this case, the program could be happy to continue on without the library, using some fallback/alternate instead. Only the program can make that decision, and the library should not even try to second-guess it.

It is completely reasonable for developers to expect that libraries won't go and call exit(), or kill(getpid(), SIGKILL), or similar. Just like I expect random root daemons won't go and reboot my whole system just because they've encountered some problem that they consider severe.

Whatever happened to the "catch low, handle high" error handling philosophy? Libraries have no knowledge of the surrounding context in which they are running - ie, the rest of the program. But this is precisely why they should not be handling such errors, only reporting them. The caller can bubble the error up to a level that is best placed to handle it appropriately.

GMP and assert()

Posted Feb 28, 2019 14:59 UTC (Thu) by pizza (subscriber, #46) [Link] (7 responses)

> The caller can bubble the error up to a level that is best placed to handle it appropriately.

We both know that "can" actually means "will ignore completely" 99.7% of the time.

GMP and assert()

Posted Feb 28, 2019 15:28 UTC (Thu) by NAR (subscriber, #1313) [Link] (6 responses)

Then it's clearly the application's fault. I also feel like calling assert() in a library is just cop-out from a proper error handling (which, I admit, might be difficult due to the deficiencies of the C language or backward compatibility).

GMP and assert()

Posted Feb 28, 2019 15:50 UTC (Thu) by intgr (subscriber, #39733) [Link] (5 responses)

> calling assert() in a library is just cop-out from a proper error handling

There are some frameworks that handle assertion failures by logging and returning: GNOME things like glib, GTK, etc.

Have you ever tried running a GNOME application in a terminal? The majority of them spew errors and warnings left and right; developers seem to just ignore the errors because they are so easy to ignore. No-one bothers reporting them as bugs, although many of them probably are.

As far as I can tell, the whole industry has been slowly moving towards a more "fail fast, fail loud" approach. I'm surprised that it has suddenly became controversial.

GMP and assert()

Posted Feb 28, 2019 16:41 UTC (Thu) by NAR (subscriber, #1313) [Link] (3 responses)

If the glib, gtk would instead crash, nobody bothered to use them in the first place.

GMP and assert()

Posted Feb 28, 2019 20:15 UTC (Thu) by mageta (subscriber, #89696) [Link] (1 responses)

But they do, errors in glib always crash the entire process.

GMP and assert()

Posted Mar 3, 2019 13:41 UTC (Sun) by hadess (subscriber, #24252) [Link]

> But they do, errors in glib always crash the entire process.

There are 2 type of "errors" that crash the entire process in a GLib program: a call to g_error(), and memory allocation failure. Both of those are documented.

GMP and assert()

Posted Mar 1, 2019 18:31 UTC (Fri) by notriddle (subscriber, #130608) [Link]

Bull. If the application crashed instead of launching, it would never have been released. It probably would never have been uploaded to version control.

GMP and assert()

Posted Mar 3, 2019 13:45 UTC (Sun) by hadess (subscriber, #24252) [Link]

> Have you ever tried running a GNOME application in a terminal? The majority of them spew errors and warnings left and right; developers seem to just ignore the errors because they are so easy to ignore. No-one bothers reporting them as bugs, although many of them probably are.

They're easy to ignore because most "GNOME applications" aren't launched from a terminal and people rarely read all the errors in the journal when apps otherwise function.

And application developers are thankful for the bug reports, when they happen.

Note that if you feel courageous, you could set "G_DEBUG=fatal_warnings" and all the warnings would make the app crash. Send backtraces!

GMP and assert()

Posted Feb 28, 2019 16:00 UTC (Thu) by sorokin (guest, #88478) [Link] (19 responses)

> The caller can bubble the error up to a level that is best placed to handle it appropriately.

Although technically correct and this is what should be done for most errors by default, this behavior is not appropriate for some errors. Let me present an example.

Example 1. Let's say we write a function sqrt. sqrt is not applicable for negative numbers. OK let's make this function return error code in case the argument is negative (EDOMAIN). Now, let's write function hypot using this sqrt. OK, mathematically hypot always calls sqrt with non-negative argument. What should hypot do when sqrt returns EDOMAIN? Should it propogate the error to its caller? If the error is propagated to the caller of hypot, what will be the name of the error? "my internal call to sqrt said that its argument is negative"?

I claim that hypot should not propagate this error to the caller. hypot should assert that sqrt never returns EDOMAIN. That's the example why asserts are useful and not all errors should be propagated to callers.

Example 2. We can go further. If we look at typical usages of sqrt we'll see that most of the time (in good code base perhaps always) people use it with only non-negative arguments. As the result most users just assert that sqrt returns EDOMAIN. Can we move this assert into sqrt. Yes, we can and as the result the signature of sqrt simplifies:

error_code sqrt(number in, number& out); // old: wide contract (sqrt applicable for all arguments)
number sqrt(number in); // new: narrow contract (sqrt applicable only for non-negative arguments)

With new signature function is much easier to use and can be more efficient (as it has fewer checks).

That's the second example why asserts can be useful (to make code simpler).

GMP and assert()

Posted Feb 28, 2019 16:10 UTC (Thu) by sorokin (guest, #88478) [Link] (17 responses)

I would like to add that asserts and "propagate error to caller policy" are not interchangeable.

There are two types of errors:
1. Rare, but perfectly possible conditions e.g. "file not found".
2. Violation of internal invariants e.g. "my example with hypot above".

It is a pity that people use same word "error" for both of these conditions.

1 can and must be handled using "propagate error to caller policy".
"propagate error to caller policy" should not be used for 2, because it will cause all sorts of "this must not be possible here" issues. 2 must be checked with asserts.

GMP and assert()

Posted Feb 28, 2019 20:30 UTC (Thu) by HenrikH (subscriber, #31152) [Link] (1 responses)

And there is #3 when any of the errors are triggered due to memory being randomly being overwritten by a buffer overrun, random hardware error and so on, and in this case recovering is completely out of the picture and process termination is IMHO the only way to go.

Or when malloc() returns NULL on a kernel with overcommit (the default), even logging the error would require more memory to be allocation and thus also fail.

GMP and assert()

Posted Feb 28, 2019 22:42 UTC (Thu) by devkev (subscriber, #74096) [Link]

> when malloc() returns NULL ... even logging the error would require more memory to be allocation and thus also fail

This is not true. First of all, maybe the condition which caused malloc() to fail was transient, and subsequent mallocs will succeed. But even if not, the proper behaviour in this case is to pre-allocate (or use static allocation) a reasonable sized buffer to use for this logging. This is why things like sigaltstack() exist, for example.

GMP and assert()

Posted Feb 28, 2019 22:34 UTC (Thu) by devkev (subscriber, #74096) [Link] (7 responses)

> it will cause all sorts of "this must not be possible here" issues

This is basically saying "Don't want to have to check this, because it cannot possibly have failed" - famous last words.

> 2 must be checked with asserts.

Of course libraries should check for internal consistency, and that invariants are actually true. But such a condition can only be (at most) fatal for the library, it absolutely does not follow that it necessarily has to be fatal for the whole program. A library which presumes otherwise is making unnecessarily selfishly assumptions about the program in which it's running. This means that for libraries, at least, assert(3) is simply the wrong tool for this job.

hypot should not propagate sqrt's EDOMAIN, because the fact that hypot calls sqrt is an implementation detail. hypot returning EDOMAIN is for situations where hypot has been called with invalid or improper arguments. Instead, when hypot wants to check its internal consistency, namely that the value it got back from sqrt is sensible and can be safely used, if it finds that this is not the case it should just immediately return EUNKNOWN or EINTERNAL.

Here's an analogy. Someone comes up to you on the street and asks if you have the time. You say "Yes" and look at your wrist to see your watch, and are surprised that it's not there. You don't know why, maybe you left it at home, maybe the strap broke and it fell off, or maybe someone stole it. Do you punch the person in the face and run away? Or do you say "Sorry, I expected to be able to tell you the time, but it turns out that I can't and I don't know why"?

GMP and assert()

Posted Feb 28, 2019 23:51 UTC (Thu) by mpr22 (subscriber, #60784) [Link]

> But such a condition can only be (at most) fatal for the library

In memory-unsafe land (which is where a lot of libraries exist), that depends on what the invariant is and what the implications of the invariant being wrong are.

> You say "Yes" and look at your wrist to see your watch, and are surprised that it's not there.

If I'm surprised at that point that my watch isn't there, I want to see a neurologist pronto, because it means the sense of touch in my left arm isn't working properly :P

(100g of close-fitting stainless steel bracelet tends to make its presence felt.)

GMP and assert()

Posted Mar 6, 2019 10:27 UTC (Wed) by renox (guest, #23785) [Link] (5 responses)

This analogy is biased: while this is true that library assert suck for the applications which can recover from an error, removing the assert adds work for the applications which must now check the error return code.
Ideally the language should know about error return values and add an assert if the caller doesn't handle the error return value himself..
But I don't know of a language which allow this..

GMP and assert()

Posted Mar 6, 2019 12:09 UTC (Wed) by excors (subscriber, #95769) [Link]

> Ideally the language should know about error return values and add an assert if the caller doesn't handle the error return value himself..
> But I don't know of a language which allow this..

That sounds a lot like exceptions. Errors get propagated up the call stack until someone handles the error, and if nobody handles it then the application crashes. (Of course exceptions introduce a whole new set of problems so they're just as bad as every other approach to error handling.)

GMP and assert()

Posted Mar 7, 2019 16:53 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (3 responses)

> But I don't know of a language which allow this.

Rust and other languages with algebraic data types has this. You get a `Result<T, E>` where in order to access the `T` (the wanted result), you either need to use `.unwrap()` which turns into a panic if an error did occur or you have to pattern match and then there's an obvious error codepath that exists in the code.

GMP and assert()

Posted Mar 8, 2019 8:51 UTC (Fri) by gracinet (guest, #89400) [Link]

My thoughts exactly.

To be complete, that leaves the case where one does not care about the result (type T in your example), and simply ignores it.

Of course the Rust compiler has a warning for that.

GMP and assert()

Posted Mar 14, 2019 9:26 UTC (Thu) by jezuch (subscriber, #52988) [Link] (1 responses)

Or you do a ? on it to propagate it up the call stack, which makes it pretty much like exceptions but with the code that potentially throws explicitly marked as such. (The ? operator is actually syntactic sugar for the try! macro which hides pattern matching underneath, but still.)

GMP and assert()

Posted Mar 14, 2019 13:32 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

I'm aware :) . I prefer using error_chain's .chain_err() or failure's .context() before using ? to tag a reason for "what was attempted that caused this error" tracing. Just passing the buck to the parent is an easy way to get errors that are absolutely confounding to end users like git-lfs:

git push
fatal error: no such file: path/to/some/lfs/file/on/another/branch

Note that this particular error is now better worded in that it mentions the object id that it is missing instead.

GMP and assert()

Posted Feb 28, 2019 22:50 UTC (Thu) by mgb (guest, #3226) [Link] (6 responses)

> Violation of internal invariants ... must be checked with asserts.

What if the application prefers to finish landing the plane before figuring out how the library managed to violate its own invariant?

GMP and assert()

Posted Mar 1, 2019 2:59 UTC (Fri) by pizza (subscriber, #46) [Link] (5 responses)

> What if the application prefers to finish landing the plane before figuring out how the library managed to violate its own invariant?

If application in question is flight-critical avionics, then every line of source code (including all libraries and compile-time settings) will have been fully audited a-priori, rendering your question moot.

That isn't to say that unanticipated issues can't happen.. indeed, safety-critical systems are typically designed under the assumption that unexpected issues _will_ occur. Consequently there are requirements that the overall system must be able to fail (and recover) gracefully.

GMP and assert()

Posted Mar 1, 2019 7:23 UTC (Fri) by mgb (guest, #3226) [Link] (4 responses)

> If application in question is flight-critical avionics, then every line of source code (including all libraries and compile-time settings) will have been fully audited a-priori, rendering your question moot.

Unmanned space missions have been lost due to typos. A navy cruiser (Yorktown) lost its computer network and propulsion because of an unexpected zero. And at least one F-22 crashed due to a software error.

GMP and assert()

Posted Mar 1, 2019 14:27 UTC (Fri) by pizza (subscriber, #46) [Link] (3 responses)

The Yorktown was being used as a testbed for a brand-new "smart ship" system architecture and its highly publicized propulsion failure was ultimately due to unsanitized user input. Meanwhile, the Mars Lander was lost was due to metric/imperial unit confusion -- essentially a failure of specification. And the only F-22 cash due to an outright software error was on an early flight test of the second prototype -- and even that was a failure to compensate for pilot-induced oscillation.

So if your point is auditing doesn't catch everything (especially in light of incomplete or incorrect specifications) then sure, we're in complete agreement.

If it's not, then I have no idea what point you are trying to make.

GMP and assert()

Posted Mar 1, 2019 16:01 UTC (Fri) by mgb (guest, #3226) [Link] (2 responses)

> So if your point is auditing doesn't catch everything (especially in light of incomplete or incorrect specifications) then sure, we're in complete agreement.

Good. Then let's allow the application do it's best to finish landing the plane before we stop to figure out how the library managed to violate its own invariant.

GMP and assert()

Posted Mar 1, 2019 18:56 UTC (Fri) by pizza (subscriber, #46) [Link] (1 responses)

> Then let's allow the application do it's best to finish landing the plane before we stop to figure out how the library managed to violate its own invariant.

Do I have to point out that when it comes to avionics, there is no distinction made between "the library" and "the application"?

Meanwhile, there have been quite a few situations where "the application tried to finish landing the plane" straight into the ground.

GMP and assert()

Posted Mar 2, 2019 10:02 UTC (Sat) by devkev (subscriber, #74096) [Link]

> Do I have to point out that when it comes to avionics, there is no distinction made between "the library" and "the application"?

No. It wasn't meant literally. It was meant to illustrate the idea that the application might have things that it would prefer to do on error conditions, and those things can't be done if the library's narrow view causes the whole application to be killed.

> Meanwhile, there have been quite a few situations where "the application tried to finish landing the plane" straight into the ground.

And innumerable more where the remaining systems - affected by the problem as well as unaffected - were able to be used to land the plane safely. A faulty altimeter doesn't shut down the entire plane, because the rest of the plane is _probably_ fine. But in this analogy, that is exactly what happens when a library calls abort/assert.

GMP and assert()

Posted Mar 13, 2019 22:05 UTC (Wed) by Wol (subscriber, #4433) [Link]

> Example 1. Let's say we write a function sqrt. sqrt is not applicable for negative numbers.

Dangerous statement :-) What about i?

Cheers,
Wol

GMP and assert()

Posted Mar 1, 2019 22:17 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link] (5 responses)

This is really the only sensible choice: The library has zero information about anything but itself, hence, it cannot safely terminate the process using it, *no matter what*. Especially not via assert as the process could have installed a SIGABRT handler which would end up dumping core from a completely different context than the library authors thought it would. For certain errors, especially errors involving a corrupted stack, the outcome of an abort is nothing but unusable gibberish. Further, assert prints a message to stderr which might just vanish, leading to programs which suddenly terminate for no reason anyone can ever determine.

At the very least, any assert in a library out to be part of the documented interface. Otherwise, there's no way anyone can safely use the library for anything where "sudden failure for an unknown reason" is not a sensible option.

GMP and assert()

Posted Mar 2, 2019 3:35 UTC (Sat) by rra (subscriber, #99804) [Link] (4 responses)

Regardless of whether one agrees with this (I think it's a bit more arguable than you do, I suspect, but I can certainly see your perspective), there's an additional wrinkle: what if the library's API doesn't currently return errors?

You can argue that was bad design, but if the library is in widespread use already, the mistake has already been made, and one still has to decide how to move forward from here. If the library is currently asserting on internal consistency errors or API violations, and you change it to to return an error from functions that previously never returned an error, that's quite dangerous, because you've now broken the API contract with all existing callers, which may proceed on entirely erroneous assumptions and do something even more dangerous than crashing. You can introduce new APIs and deprecate the existing APIs, but that's potentially a monumental amount of work, with no guarantee of practical success (and has to be prioritized against everything else the library developers could be doing).

One of the issues with the original report is that it completely failed to engage with this problem. Even if you agree that libraries should never assert, the reality is that GMP already does assert and does not define error returns in some of the places that it asserts. One can argue that it never should have gotten into that position, but now that it is, it's very inobvious that changing GMP to return errors is correct.

GMP and assert()

Posted Mar 2, 2019 11:10 UTC (Sat) by devkev (subscriber, #74096) [Link] (3 responses)

mfuzzey explains at https://lwn.net/Articles/781129/ a simple way of avoiding all of these problems (not sure why it wasn't a reply to your comment).

Basically, the library adds a new function called libfoo_register_fatal_handler, which takes a function pointer that's called whenever an assertion fails. The default handler just calls abort.

This approach preserves existing legacy API and behaviour, while also allowing programs to opt-in to arbitrary error handling (including abort, noop, or setting a flag to know that the returned value will be garbage and inhibit future calls into the library). Changing shouldn't be overly hard, since if you call your new macro assert() then you can avoid needing to touch code all over the library.

GMP and assert()

Posted Mar 2, 2019 17:29 UTC (Sat) by rra (subscriber, #99804) [Link]

Agreed, that seems like an elegant solution.

GMP and assert()

Posted Mar 3, 2019 6:20 UTC (Sun) by songmaster (subscriber, #1748) [Link] (1 responses)

A global fatal handler function might be fine for relatively simple cases in single-threaded code, but how do you handle multiple users of the library living in the same process (such as other libraries) that all want their own fatal handlers calling? Is the fatal handler supposed to return, or to do something like a longjmp() to restore control back to a safe place in the original caller? The latter opens up questions about cleaning up resources, whereas the former requires the caller to add checks of the library fatality flag after every call. Allowing multiple threads complicates the issue even more.

GMP and assert()

Posted Mar 4, 2019 20:16 UTC (Mon) by xtifr (guest, #143) [Link]

At that point, you may simply have to face up to the fact that the original API was never adequate in the first place. (Remember, we're already discussing a possible to a workaround to the API's limitations.) Look at functions like strtok_r(3) in libc. Or all the other "_r" functions that were added, because the classic C api simply wasn't able to cope with the requirements of re-entrant and/or multithreaded code.

GMP and assert()

Posted Mar 2, 2019 10:01 UTC (Sat) by mfuzzey (subscriber, #57966) [Link]

The library could have a "fatal error handler" callback hook set by the application.

That way if, the application *wants* to assert rather than checking return values for internal error it can.

If the application doesn't set the handler or the handler does nothing the library should return "internal error"..

error codes and asserts have different purposes.

Posted Mar 8, 2019 2:19 UTC (Fri) by john.carter (guest, #123615) [Link] (1 responses)

Sigh. When will people learn.

You cannot make a library fool proof, fools are too ingenious.

You can only tell the fool that he has been foolish.

Returning error codes are for telling the caller that something they couldn't possibly know about, has occurred. eg. Some external process has deleted the file, so open cannot succeed, so return ENOENT.

Precondition asserts are for telling the caller that there is a bug in their program that can only be fixed by a new version of their program.

Of course any part of a process can kill a process, this is a fact of programming life.

Divide by zero, dereference null.... do you want the CPU to ignore those errors and just womble on doing stupider and stupider things.... sometimes working, sometimes doing rubbish, sometimes dying?

Attempting to detect and unwind every possible flavour of stupid on the part of the caller merely creates an untested and untestable spaghetti of error handling code.

I get infuriated by people who write such spaghetti... and then never even check the error code, and just cast it to void. Experience tells me whenever I see that pattern... I will find bugs.

Far far more opportunities for security flaws and leaks to hide in that stuff.

People get horribly confused between "something unexpected came in over the wire from an external system" vs "your program has a bug and will do wrong _every time_ until you get off your butt and fix it".

An assert should _only_ be used for second thing, and should _always_ die as there is no other sane and reliable way of recovering.

error codes and asserts have different purposes.

Posted Mar 22, 2019 0:40 UTC (Fri) by mcortese (guest, #52099) [Link]

> Divide by zero, dereference null.... do you want the CPU to ignore those errors and just womble on doing stupider and stupider things.... sometimes working, sometimes doing rubbish, sometimes dying?

Ideally I'd like the application to degrade in a localized and controlled way. For example, if the battery applet on my panel, which calculates the % of charge dividing the current charge by the max capacity, faces a divide-by-zero for whatever reason (say, a hardware failure of the battery control), it could simply show 0% or 100% or even a meaningless value, and yet this would be ways better than killing the whole panel[1].

So yes, sometimes wombling on doing stupider and stupider things is the desirable approach.

[1] In fact, this is not a made-up example but a real case with LXDE panel.