Better handling of integer wraparound in the kernel
Posted Jan 27, 2024 2:31 UTC (Sat)
by tialaramex (subscriber, #21167)
Parent article: Better handling of integer wraparound in the kernel
Posted Jan 27, 2024 4:00 UTC (Sat)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
Nowadays we have the "abstract machine" and such affordances, so it's no longer seen as surprising that + might compile into something other than an add instruction. So you could also say this is K&R's fault for designing C too close to the metal. But hey, hindsight is 20/20.
Posted Jan 27, 2024 5:49 UTC (Sat)
by willy (subscriber, #9762)
[Link]
So I'd blame the CDC-6600 although you can blame the PDP-1 or LINC if you're determined to make DEC the bad guy.
Posted Jan 27, 2024 5:45 UTC (Sat)
by adobriyan (subscriber, #30858)
[Link] (12 responses)
Wrapping signed integers is a funny thing by itself.
I'd argue that the Algebraic Types Party doesn't do _that_ much better, because overflowing/saturating is a property of an operation, not of a type.
In a perfect world Rust would invent new syntax for "unsafe" additions/multiplications. Cryptographers would use it and everyone else would use regular + which traps.
Posted Jan 27, 2024 8:30 UTC (Sat)
by NYKevin (subscriber, #129325)
[Link] (11 responses)
Rust already did that. They're simple method calls on the primitive integer types (e.g. wrapping_add, saturating_sub, checked_div, etc.). Wrapping<T> etc. types are for cases where you want to override what + does (because writing wrapping_add all the time is obnoxious). If you want to manually spell out the operation every time, you can do that.
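For concreteness, a small sketch of both styles (standard-library methods on the primitive integer types, plus the Wrapping<T> wrapper from std::num; all values are arbitrary):

use std::num::Wrapping;

fn main() {
    // Explicit per-operation behavior via method calls:
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN); // wraps around
    assert_eq!(0u8.saturating_sub(10), 0);          // clamps at the boundary
    assert_eq!(10i32.checked_div(0), None);         // Option instead of a panic

    // Or override what + means by picking a wrapper type once:
    let x = Wrapping(u8::MAX);
    let y = x + Wrapping(1); // plain + now wraps
    assert_eq!(y.0, 0);
}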
Posted Jan 27, 2024 9:10 UTC (Sat)
by adobriyan (subscriber, #30858)
[Link] (10 responses)
I'm thinking more about "a [+] b" for unsafe addition, etc.
Posted Jan 27, 2024 15:36 UTC (Sat)
by Paf (subscriber, #91811)
[Link]
Posted Jan 27, 2024 22:50 UTC (Sat)
by NYKevin (subscriber, #129325)
[Link] (8 responses)
There is a whole family of addition methods, one per overflow policy:
* carrying_add(): A helper method for implementing bignum addition. There's also borrowing_sub() for subtraction. Probably not useful for general-purpose addition.
* checked_add(): Returns an Option<T> which is None if overflow would happen. Nicely compatible with the question mark operator, match, if let, etc. constructs.
* overflowing_add(): Returns the result of wrapping_add() as well as a bool indicating whether we overflowed.
* saturating_add(): Returns (the Rust equivalent of) INT_MAX when we overflow.
* unchecked_add(): Overflow is undefined behavior. Can only be called in an unsafe block (or unsafe function).
* wrapping_add(): Wraps around using two's complement (signed) or modular arithmetic (unsigned).
You could, hypothetically, have a menagerie of different symbols for all of these different use cases, but it's going to look like Perl or APL very quickly. You could, I suppose, pick one of them as the "best" overflow mode and just give that a symbol, but I guarantee you it won't be unchecked_add() as your comment seems to suggest (they are never adding a binary operator that can only be used in unsafe blocks). The more flexible option is picking which mode you "usually" want to use, and using the Wrapping<T> etc. types for that.
One thing that does bother me is the fact that overflowing_add() returns a tuple instead of some kind of ADT wrapper. An ADT would force you to explicitly destructure it and handle both the wrapping and non-wrapping cases; with a tuple, you can forget to check the bool in some code paths. It's not really a disaster, because you probably won't have overly elaborate code surrounding the callsite, but it's still an opportunity for things to go wrong.
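To make the differences concrete, a compilable sketch of the checked/overflowing/saturating variants, including the ignore-the-bool pitfall described above (values chosen arbitrarily):

fn main() {
    assert_eq!(i8::MAX.checked_add(1), None);    // overflow reported as None
    assert_eq!(100i8.checked_add(7), Some(107));

    // overflowing_add() returns (wrapped_result, overflowed) as a plain tuple...
    assert_eq!(i8::MAX.overflowing_add(1), (i8::MIN, true));

    // ...so nothing stops you from taking the value and ignoring the flag:
    let (value, _) = i8::MAX.overflowing_add(1); // compiles fine, bug slips through
    let _ = value;

    assert_eq!(i8::MAX.saturating_add(1), i8::MAX); // clamps instead of wrapping
}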
Posted Jan 28, 2024 8:13 UTC (Sun)
by epa (subscriber, #39769)
[Link] (6 responses)
Posted Jan 28, 2024 10:35 UTC (Sun)
by khim (subscriber, #9252)
[Link] (5 responses)
> It would need bounded integer types like those in Ada.

Useless, pointless and not at all helpful in real code. That's what Wuffs does. Turning that into a general-purpose language with pervasive dependent typing is surprisingly hard; maybe someone will manage to do it 10 or 20 years down the road.

No, you can not. The only thing you may achieve with templates is some statically-defined checks, and those are not much use for real programs with dynamically allocated, externally-specified objects. And the kernel is full of these. You may also reject C++ entirely and build an entirely separate language where every type is your own template type and everything is done using those, but at that point you are firmly in “pseudo-language written in C++ templates” territory, and it's better to just create a separate language than to pretend you are still dealing with C++.
Posted Jan 28, 2024 11:26 UTC (Sun)
by epa (subscriber, #39769)
[Link] (4 responses)
Useful in practice? Perhaps not much. My concern was more about eliminating logic errors (and making explicit the places where overflow can occur) rather than memory safety. The common style nowadays is to avoid arbitrary limits: users would not be happy if grep had a maximum line length of 2000 characters. But in the real world you can usually put a bound on things: no aeroplane carries more than ten thousand passengers and no person is aged over two hundred. Ada does not have bounded integers purely as a result of design-by-committee. In the end your integer types will have a bound, like it or not, and forcing the programmer to consider it explicitly can be helpful.
So yes, I do agree with your comment. You can’t realistically retrofit bounded integers into C++ given the amount of existing code; and with everything dynamically allocated (rather than static fixed size buffers) they are not that useful for memory safety. Carrying a full proof of allowed bounds alongside each variable requires more cleverness than compilers have (although even a system that required numerous “trust me” annotations might have its uses). I was blue-skying about my ideal language rather than proposing a practical change to an existing language or codebase.
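As a rough illustration of the blue-sky idea, a hypothetical BoundedU32 newtype sketched in Rust; unlike real Ada range types, which the compiler and runtime enforce throughout, this one only checks at construction time:

#[derive(Debug, Clone, Copy)]
struct BoundedU32<const MIN: u32, const MAX: u32>(u32);

impl<const MIN: u32, const MAX: u32> BoundedU32<MIN, MAX> {
    fn new(value: u32) -> Option<Self> {
        // The bound is explicit in the type; out-of-range values are
        // rejected at the only place a value can be constructed.
        if (MIN..=MAX).contains(&value) { Some(Self(value)) } else { None }
    }
    fn get(self) -> u32 { self.0 }
}

// "No aeroplane carries more than ten thousand passengers":
type PassengerCount = BoundedU32<0, 10_000>;

fn main() {
    assert!(PassengerCount::new(350).is_some());
    assert!(PassengerCount::new(20_000).is_none()); // rejected, not wrapped
}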
Posted Jan 28, 2024 12:01 UTC (Sun)
by khim (subscriber, #9252)
[Link] (3 responses)
> But in the real world you can usually put a bound on things

No, you couldn't. Even if a limit sounds sane today it would, most definitely, be too small tomorrow. Maybe embedded may work with precomputed limits, but every time I saw such limits in practice they were eventually outgrown: even with TeX there was always hugeTeX for people who need more, or something. Beyond a certain program complexity using arbitrary limits is not useful, and below that complexity using them is not too helpful. That's why I said such types are useless and pointless: where they are useful (small programs, limited scale) they are pointless; where they could be beneficial (large programs, complicated logic, data that comes from outside) they are not useful. And then someone tries to use your program for a cruise ship, or tries to enter data about a three-hundred-year-old juridical person, and everything explodes. Thanks, but no, thanks.

> Ada does not have bounded integers purely as a result of design-by-committee.

It does have them because ALGOL and Pascal have them. But they are not much use in any of these languages. Perhaps they were useful when they were invented, when programs were written once, run once and then discarded. But today we are not writing programs in that fashion, and arbitrary static limits cause more harm than good.

> I was blue-skying about my ideal language rather than proposing a practical change to an existing language or codebase.

And I was thinking about practical applicability instead. I'm pretty sure that with enough automation proof-carrying code may be pretty usable, but for now we are stuck with Rust, and that was the right choice: lifetime tracking causes significantly more grief than out-of-bounds access. Perhaps after Rust becomes the “new C” we may start thinking about dependent types. They need to have lifetime tracking as their basis anyway, or else they wouldn't be much use in practice, thus Rust made the right choice.
Posted Jan 28, 2024 12:15 UTC (Sun)
by epa (subscriber, #39769)
[Link]
Airline passenger management is a kind of middle ground between these two. It's not in itself safety-critical and doesn't need to run on tiny systems. But then, it does need to interact with parts of the real world that have fixed limits. Perhaps the passenger number is at most three digits long, or the dot-matrix printer only has 80 columns, or there is a database system with fixed storage size. In those cases I would prefer to be explicit about the bounds of a number rather than write code that purports to work with any number but in fact does have an upper bound somewhere, just nobody knows quite what it is.
Posted Jan 28, 2024 18:00 UTC (Sun)
by Wol (subscriber, #4433)
[Link] (1 responses)
> No, you couldn't. Even if a limit sounds sane today it would, most definitely, be too small tomorrow.
Stop being an arrogant idiot!
Okay, I can't think of an example offhand, and I guess you haven't even bothered to look, but I'm sure other people will be able to find examples where a sufficiently large positive number indicates an error. Certainly I'm sure there are examples where the mere EXISTENCE of a negative value is an error (in other words, 0 is an absolute lower bound).
(Actually, as a chemist, I've just thought of a whole bunch of examples: electron counts in atomic subshells. An s subshell holds either 1 or 2 electrons (or can be empty, aka 0). Likewise, p holds 1 to 6, d holds 1 to 10, and f holds 1 to 14. ANY OTHER VALUE IS AN ERROR.) To have the compiler make sure I can't screw up would be a wise choice.
And please, WHY ON EARTH would you want to store an entry about a company in a genealogical database? While it's possible the limit will change (extremely unlikely), any value for age outside 0-126 is pretty much impossible. In fact, surely you DO want to limit that value, precisely in order to trigger an error if someone is stupid enough to try to enter a juridical person into the database!
Cheers,
Wol
Posted Jan 28, 2024 22:23 UTC (Sun)
by corbet (editor, #1)
[Link]
Enough
Ok let's stop this here please. Seriously.
Posted Jan 28, 2024 10:53 UTC (Sun)
by tialaramex (subscriber, #21167)
[Link]
(I'm sure you know this but) note that saturation occurs at _both_ ends of the range of an integer type: (-100i8).saturating_add(-100) is the 8-bit signed integer -128, aka i8::MIN, not i8::MAX.
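In compilable form (illustrative only):

fn main() {
    assert_eq!((-100i8).saturating_add(-100), i8::MIN); // clamps at -128, the low end
    assert_eq!(100i8.saturating_add(100), i8::MAX);     // and at 127 on the high end
}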
Posted Jan 27, 2024 21:14 UTC (Sat)
by donald.buczek (subscriber, #112892)
[Link] (10 responses)
But they magically have different behavior depending on whether you compile in debug or release mode, because the "debug" and "release" profiles define `overflow-checks` differently. This is surprising, too, and I'm not sure that was a good design choice.
Posted Jan 28, 2024 10:35 UTC (Sun)
by tialaramex (subscriber, #21167)
[Link] (5 responses)
Posted Feb 7, 2024 8:24 UTC (Wed)
by milesrout (subscriber, #126894)
[Link] (4 responses)
"You can only shoot yourself in the foot if you hold it wrong"
"Only incorrect programs have this problem" is the exact issue Rust is meant to prevent. What's the point of Rust if it doesn't prevent one of the most common sources of vulnerabilities?
Posted Feb 7, 2024 10:48 UTC (Wed)
by atnot (guest, #124910)
[Link]
> What's the point of Rust if it doesn't prevent one of the most common sources of vulnerabilities?
The main reason overflows lead to vulnerabilities is by breaking bounds checks. Since bounds checks are still applied by the compiler in safe code after whatever math you did, there's not really any way to turn it into a memory vulnerability, unless you're manually writing bounds checks in unsafe Rust, which is honestly pretty rare.
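A small sketch of the point (hypothetical values; the wrapped index is caught at the point of use):

fn main() {
    let data = [10u8, 20, 30, 40];

    // Suppose some buggy arithmetic wrapped around to a huge value:
    let idx = data.len().wrapping_sub(5); // 4 - 5 wraps to usize::MAX

    // Safe Rust still bounds-checks the access, so a wrapped index becomes
    // a clean panic (with []) or a None (with .get()), never an
    // out-of-bounds read:
    match data.get(idx) {
        Some(v) => println!("value: {v}"),
        None => println!("index {idx} is out of bounds, caught"),
    }
}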
Posted Feb 7, 2024 10:56 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (2 responses)
The different fully-defined behaviours are, in practice, not a common source of vulnerabilities; by default, in production you get wrapping (two's-complement wrapping if signed), while in development you get panics.
This differs from the issue in C, where overflow is a common source of vulnerabilities since it's not defined for signed integers, and thus the compiler is allowed to surprise a developer.
Posted Feb 7, 2024 11:36 UTC (Wed)
by Wol (subscriber, #4433)
[Link] (1 responses)
That would make a huge difference.
Okay, okay, that's being optimistic, but that is the likely course of events ...
Cheers,
Wol
Posted Feb 7, 2024 12:07 UTC (Wed)
by farnz (subscriber, #17727)
[Link]
> That would make a huge difference.

That's one component of it; the other is that if it does wrap, you're not going to be surprised by the resulting behaviour, whereas in C, you can get weirdness where overflowing signed arithmetic does unexpected things. Taking the following code as an example:

int a = 5;
int b = INT_MIN;
int c = b - a;
if (c < b) {
	printf("c less than b\n");
}
if (b < c) {
	printf("b less than c\n");
}
if (c == 0) {
	printf("c is zero\n");
}
if (b == c) {
	printf("b equals c\n");
}
In standard C semantics, because arithmetic overflow is undefined, it is legal for all four comparisons to evaluate to true, and thus for all four printfs to print. In Rust semantics, because of the wraparound rule, c will be equal to (the equivalent of) INT_MAX - 4, and thus only the b < c condition is true.
This, in turn, means that you're less likely to be surprised by the result in Rust, since it's at least internally consistent, even if it's not what you intended. And thus, if you use the result of the arithmetic operation to do something like indexing an array, instead of the compiler being able to go "well, clearly c should be zero here, so I can elide the bounds check", the compiler does the bounds check and panics for a different reason (index out of bounds). You've now got an "impossible" error, and can debug it.
Note that this relies on Rust's memory safety guarantees around indexing; if you used pointer arithmetic with c instead, you could get an out-of-bounds read, and hence have trouble. The only thing that helps here is that pointer dereferencing is unsafe, and hence if you get unexpected SIGSEGVs or similar, that's one of the first places you'll look.
Posted Jan 29, 2024 2:18 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
Are there situations where data integrity might dictate failing rather than producing an incorrect value? Of course, but those are properly handled with checked_add() and friends, not with crashing the entire service. If there is any input that can cause your service to crash, it is an outage waiting to happen.
Posted Jan 29, 2024 7:23 UTC (Mon)
by donald.buczek (subscriber, #112892)
[Link] (1 responses)
Considering your example, isn't there a concern that your internal or external user might have other requirements and would prefer no data over wrong data? Overlooking unexpected overflows might result in serious incidents like Cloudbleed [1]. As a customer, I'd rather see a temporary service disruption than having to learn that my private data has been silently exposed and spilled into search engine caches.
[1]: https://blog.cloudflare.com/quantifying-the-impact-of-clo...
Posted Jan 29, 2024 9:25 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link]
This is fine if the program is running on the user's computer. If it is not, then a crash will affect everyone whose requests are being processed by that machine, and in the case of cascade failure, fallout will be wider still. There may be situations where failing rather than producing wrong data is preferred, but it is the responsibility of the programmer to understand that requirement and use checked_add etc. instead of regular arithmetic operators. This is no different to any other functional requirement of a software system - if you write the code wrong, it will not work.
> Overlooking unexpected overflows might result in serious incidents like Cloudbleed [1].
Security is hard. I'm not going to solve the general problem of malicious inputs in an LWN comment, but in general, defensive programming is necessary (not sufficient) to deal with problems of this nature. Most languages do not provide integers that always crash on overflow, so handling overflow correctly is something that serious programmers will find themselves having to deal with on a regular basis regardless. Rust provides the necessary methods for doing this front and center, which is more than you can say for most languages (C is only standardizing the equivalent functions in C23!). If you want stronger guarantees than that, I would suggest rewriting the parser (and perhaps a small part of the parser-adjacent application logic) in wuffs.
Besides, if you allow malicious inputs to cause a crash, you make yourself far more vulnerable to denial of service attacks, which are also a security concern (albeit a less serious one).
Posted Jan 29, 2024 11:33 UTC (Mon)
by danielthompson (subscriber, #97243)
[Link]
I think here you are describing a choice of *defaults* for overflow-checks rather than a design choice.
In other words, a developer who doesn't care can leave the defaults as they are. If a developer did, for example, want their panic handler run for any overflow, they can change the defaults for the release build. Cargo allows them to change it both for their own code and, with a wildcard override, for any crates they depend on!
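For example, something along these lines in Cargo.toml should do it (a sketch based on Cargo's profile settings; adjust to taste):

# Cargo.toml
[profile.release]
overflow-checks = true      # panic on overflow even in release builds

# And, with the wildcard override, for every crate you depend on too:
[profile.release.package."*"]
overflow-checks = true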
Posted Feb 5, 2024 16:34 UTC (Mon)
by plugwash (subscriber, #29694)
[Link] (1 responses)
Posted Feb 5, 2024 17:00 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
Well, it's configurable behaviour with different defaults in debug and release mode - you can panic on overflow in release, and you can wrap on overflow in debug. Plus there are types like Wrapping which fully define it as wrapping in both modes, and functions like checked_add if you want to change behaviour on overflow.
That said, this is a potential footgun if you're unaware that overflow is potentially problematic; the reason the default is panic in debug builds is to increase the chances of you noticing that you depend on overflow behaviour, so that you can ensure it doesn't happen.
Posted Feb 13, 2024 19:54 UTC (Tue)
by DanilaBerezin (guest, #168271)
[Link] (2 responses)
Posted Feb 13, 2024 20:11 UTC (Tue)
by mb (subscriber, #50428)
[Link] (1 responses)
Posted Feb 13, 2024 22:18 UTC (Tue)
by andresfreund (subscriber, #69562)
[Link]
Or at the very least they could have provided a sane way to check if overflow occurs. Introducing that decades after making signed overflow UB is insane. A correct implementation of checking whether the widest integer type overflows is quite painful, particularly for multiplication.
Here's postgres' fallback implementation for checking if signed 64bit multiplication overflows:
/*
* Overflow can only happen if at least one value is outside the range
* sqrt(min)..sqrt(max) so check that first as the division can be quite a
* bit more expensive than the multiplication.
*
* Multiplying by 0 or 1 can't overflow of course and checking for 0
* separately avoids any risk of dividing by 0. Be careful about dividing
* INT_MIN by -1 also, note reversing the a and b to ensure we're always
* dividing it by a positive value.
*
*/
if ((a > PG_INT32_MAX || a < PG_INT32_MIN ||
     b > PG_INT32_MAX || b < PG_INT32_MIN) &&
    a != 0 && a != 1 && b != 0 && b != 1 &&
    ((a > 0 && b > 0 && a > PG_INT64_MAX / b) ||
     (a > 0 && b < 0 && b < PG_INT64_MIN / a) ||
     (a < 0 && b > 0 && a < PG_INT64_MIN / b) ||
     (a < 0 && b < 0 && a < PG_INT64_MAX / b)))
{
	*result = 0x5EED;	/* to avoid spurious warnings */
	return true;
}
*result = a * b;
return false;
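For contrast, a sketch of what the same check looks like when the language provides it directly (Rust's i64::checked_mul; the wrapper function name is mine):

fn mul_s64_overflow(a: i64, b: i64) -> Option<i64> {
    // One method call replaces the division-based dance above;
    // None signals overflow, Some carries the product.
    a.checked_mul(b)
}

fn main() {
    assert_eq!(mul_s64_overflow(3, 4), Some(12));
    assert_eq!(mul_s64_overflow(i64::MAX, 2), None);  // overflow detected
    assert_eq!(mul_s64_overflow(i64::MIN, -1), None); // the nasty edge case
}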
