Ownership and lifetimes

Posted Jul 12, 2021 18:52 UTC (Mon) by JanC_ (guest, #34940)
In reply to: Ownership and lifetimes by khim
Parent article: Announcing Arti, a pure-Rust Tor implementation (Tor blog)

The reason why signed overflow results are “undefined behavior” in C is that overflow behaved differently on various hardware, and the C language didn't want to force compiler makers to introduce slow workarounds on any of that hardware.



Ownership and lifetimes

Posted Jul 13, 2021 10:34 UTC (Tue) by khim (subscriber, #9252) [Link] (8 responses)

This was logical back in 1989. Today only two's complement representation is in use and all relevant CPUs handle overflow just fine.

And if compiler developers worry about benchmarks then they can always use an appropriate switch to restore obsolete behavior.

The problem with C/C++ is not certain peculiarities or misunderstandings about some complex corner cases (those are inevitable when complex systems are involved) but the absolute conviction of compiler developers that “mere users” don't even deserve a discussion.

An “it's my way or the highway” attitude permeates the discussion. Well… Rust looks like a nice place on the highway, so maybe it's time to pick the second option.

Ownership and lifetimes

Posted Jul 13, 2021 20:51 UTC (Tue) by mrugiero (guest, #153040) [Link] (7 responses)

No, it was not 100% logical in 1989 and it is not 100% logical in 2021. C doesn't just have defined and undefined as categories for behavior. There's also implementation defined. That was really the best choice in this case, I think. This is because implementation defined still means the platform+compiler must make a promise about it. What the promise is is up to them, but they have to be consistent about it. They can also offer extra guarantees, just like `int`'s size and representation are implementation defined but it is guaranteed to be able to represent at least the range [-32767, 32767]. So, if I know I'm building for ARM64 on Linux with GCC I could have *some* guarantees about integer overflow, even if they are not the same as with MSVC on x86 for Windows. This would suffice to avoid those slow workarounds, while not making programmers bend over backwards to avoid overflow altogether.
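
In fact GCC and Clang already expose exactly such a promise as an opt-in switch: with -fwrapv, signed overflow is defined to wrap in two's complement. A minimal sketch of what that buys you (the function name is mine, and the example assumes GCC or Clang):

    // overflow_wrap.cpp — compile with `g++ -O2 -fwrapv`.
    // Under -fwrapv signed overflow is defined to wrap, so this check is
    // honoured; with plain `g++ -O2` the same expression is undefined
    // behaviour on overflow and the check may be optimised away.
    #include <climits>
    #include <cstdio>

    bool would_overflow(int x) {
        return x + 100 < x;   // well-defined only under -fwrapv
    }

    int main() {
        std::printf("%d\n", would_overflow(INT_MAX));  // 1 under -fwrapv
    }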

Ownership and lifetimes

Posted Jul 14, 2021 9:24 UTC (Wed) by khim (subscriber, #9252) [Link] (6 responses)

If I understand correctly they wanted to support systems where signed overflow causes a trap without introducing a way to intercept that trap. By declaring it “undefined behavior” they achieved that goal: programs which don't cause overflow have no need to intercept that trap, while programs which do cause it are not valid C.

Not sure how much sense that made even back in 1989, but in 2021 it doesn't make any sense at all: there are a few CPUs which may still generate signals on overflow, but I don't know of any where that's not optional.

And now, when two's complement is mandatory, the workaround is, actually, “very simple”. Instead of if (x + 100 < x) you just have to write something like the following: if (std::is_signed_v<decltype(x)> ? static_cast<decltype(x)>(static_cast<std::make_unsigned_t<decltype(x)>>(x) + 100u) < x : x + 100 < x)

And yes, of course you stuff that into a simple templated function (see the sketch below). Except since there are no standard modules, no central repo and so on… everyone who wants/needs it would probably need to implement it separately. Which would, most likely, lead to bugs like uncaught attempts to use that function to check overflow of float or double… or it would be miscompiled because someone used -std=c++20 instead of the proper -std=gnu++20… As one of the members of the pilot Rust program for Android said: I don't feel as satisfied when reviewing Rust code: I always find something to comment on in C++ changes, usually some kind of C++ quirk or a memory management problem. When reviewing Rust changes it makes me uncomfortable if I don't find anything to comment on ("Am I being thorough?").
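
Such a templated helper might look like this; a minimal sketch (the name add_would_overflow is mine, and it assumes the amount being added is positive):

    #include <type_traits>

    // A sketch of the helper alluded to above; the name is hypothetical.
    // Assumes n > 0. The static_assert catches accidental use with
    // float or double, one of the bugs mentioned above.
    template <typename T>
    bool add_would_overflow(T x, T n) {
        static_assert(std::is_integral_v<T>, "integers only");
        using U = std::make_unsigned_t<T>;
        // Unsigned arithmetic wraps modulo 2^N and is never UB; the
        // conversion back to T is implementation-defined before C++20
        // and defined as modular arithmetic since C++20.
        return static_cast<T>(static_cast<U>(x) + static_cast<U>(n)) < x;
    }

    // Usage: if (add_would_overflow(x, 100)) { /* handle overflow */ }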

And these guys are talking about artificial puzzles the language imposes on top of each programming task? Really?

Ownership and lifetimes

Posted Jul 15, 2021 2:16 UTC (Thu) by mrugiero (guest, #153040) [Link] (5 responses)

I find your post very educational; however, it doesn't address my point: all the problems in making signed integer overflow defined behavior could have been fixed by making it implementation defined rather than undefined. Implementation defined means it's still hard to write *portable* code, but that when you do know where it'll run, you know there exists one behavior that is appropriate. That is, if it's implementation defined, the compiler you use needs to define a given behavior for the platform it'll run on and never violate it. So, gcc would be mandated to tell you what value will result from overflowing, even if such a value is one for SPARC and a different one for x86. I fail to see why UB made more sense to them than that at the time.

But yeah, talking about artificial puzzles when you have all those kinds of behavior and you don't even mandate warnings when you hit them is a bit hypocritical.

Ownership and lifetimes

Posted Jul 15, 2021 14:12 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (3 responses)

> That is, if it's implementation defined, the compiler you use needs to define a given behavior for the platform it'll run on and never violate it. So, gcc would be mandated to tell you what value will result from overflowing, even if such a value is one for SPARC and a different one for x86. I fail to see why UB made more sense to them than that at the time.

I may be misremembering, but didn't some platforms not provide a way to tell? That would require the compiler to emit manual checks for every possibly-overflowing operation in order to guarantee *a* behavior. This would also mean that optimizing to, e.g., FMA instructions is basically impossible, because if the intermediate ops overflow, that needs to be handled as well.

But if there are no platforms that lack an "overflow happened" flag…meh.

Ownership and lifetimes

Posted Jul 15, 2021 15:56 UTC (Thu) by matthias (subscriber, #94967) [Link]

Even if there is a platform that lacks an "overflow happened" flag, then on this platform the implementation defined behavior is overflow with whatever semantics overflow has on this platform. Implementation defined means you can choose different semantics for every combination of compiler/platform.

If you do not know which platform your code will run on, you still can't be sure what the result of x+100 is in case of overflow, but you can at least be sure what (x+100)&0 is. And you can be sure that if you use an assertion x+100>x to test for overflow, the assertion is not optimized away. OK, it can be optimized away if the compiler can prove that there is no overflow, but that is no problem.
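
For contrast, here is a sketch of how such a check disappears under the current undefined-behaviour rules (the function name is mine):

    // Compile with `g++ -O2`. Because signed overflow is undefined, the
    // compiler may assume x + 100 cannot wrap, fold x + 100 > x to true,
    // and delete the "overflow" branch entirely.
    int add_100_checked(int x) {
        if (!(x + 100 > x))   // intended overflow check; UB on overflow
            return -1;        // typically optimised out at -O2
        return x + 100;
    }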

Ownership and lifetimes

Posted Jul 15, 2021 16:26 UTC (Thu) by farnz (subscriber, #17727) [Link] (1 responses)

If such platforms exist and need to be cared about (like you, I'm not sure if they did, or did not), an easy solution would be to make the behaviour of signed integer overflow unspecified, rather than undefined.

In the C language spec, there are four groups of behaviour (from strongest definition to weakest):

  1. Defined. To be a conforming implementation, you must behave in the way the Standard says you should, and there are no choices here. For example, the behaviour of the free(void * ptr) standard library function in C is defined if ptr is NULL or a pointer returned by malloc.
  2. Implementation defined. The Standard sets out restrictions on what this can be defined as, but the implementation gets to choose one concrete behaviour and stick to it. For example, the number of bits in int is implementation defined in C - any value greater than or equal to 16 is permissible, and the implementation must document which value it has chosen.
  3. Unspecified. The Standard sets out restrictions on what behaviour is acceptable (e.g. "abc" == "abc" must be either 1 for true or 0 for false depending on whether the compiler merges identical string literals, but it cannot be 42 under any circumstances). The compiler can choose any behaviour it likes from the set the standard allows, and can choose different behaviour every time it sees the unspecified construct (e.g. printf("%d %d\n", "abc" == "abc", "abc" == "abc"); can print "0 0", "0 1", "1 0", or "1 1", and can print a different one each time the program is run); it does not have to stick to one choice, as it does for implementation-defined behaviour.
  4. Undefined. If a program run would follow a path with undefined behaviour, the meaning of the program as a whole is not set out by the Standard; it can do anything the compiler implementer wishes, even if the user of the compiler thinks that's a crazy stupid outcome.
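
A compilable version of the string-literal example from item 3, as a minimal sketch:

    #include <cstdio>

    // Whether identical string literals share one storage location is
    // unspecified, so each comparison may independently print 0 or 1,
    // but never anything else.
    int main() {
        std::printf("%d %d\n", "abc" == "abc", "abc" == "abc");
    }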

If signed integer overflow became unspecified, such that the result of a signed integer overflow could be any integer value, then we're in a much better place. The compiler can just use the machine instruction (assuming it doesn't trap), and we don't have the pain where the compiler goes "well, if this is signed overflow, then behaviour is undefined, ergo I can assume it's not, ergo I can optimise out a check"; instead, signed overflow has to produce an integer, but which integer is not known by the code author.

Ownership and lifetimes

Posted Jul 15, 2021 16:42 UTC (Thu) by khim (subscriber, #9252) [Link]

This is all well and good, but this requires some form of consensus.

And in the C/C++ world those discussions are just not happening. Why have the coveted “pointer provenance” proposals not been incorporated into the standard in 15 years (and counting)?

Because not even C/C++ compiler writers can agree with each other. The -std=friendly-c proposal had the exact same fate.

And when someone tries to resolve the issue by “starting from scratch”… the end result is D, Java, C# or Rust… never a “friendly C”…

Rust is just the first such “new C” language which is low-level enough to actually be usable for all kinds of tasks, starting from the lowest level just above assembler.

Ownership and lifetimes

Posted Jul 20, 2021 10:25 UTC (Tue) by anton (subscriber, #25547) [Link]

Concerning integer overflow, the result with a given bit width is the same for two's-complement arithmetic across all hardware (they can all perform modulo arithmetic, aka wrapping on overflow). And all architectures introduced since 1970 use two's-complement arithmetic exclusively. So for integer overflow, the behaviour could simply be standardized; no need for cop-outs like implementation-defined or unspecified. As an example, modulo arithmetic has been standardized in Java from the start.
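
For reference, the wrapping semantics Java mandates can be spelled out in C++ via unsigned arithmetic, which the standard does define as modular; a minimal sketch (the function name is mine):

    #include <cstdint>
    #include <cstdio>

    // Two's-complement wrapping addition, as Java defines for int:
    // add modulo 2^32 in unsigned arithmetic, then convert back.
    // The conversion back is implementation-defined before C++20 and
    // defined as modular since C++20.
    std::int32_t wrapping_add(std::int32_t a, std::int32_t b) {
        return static_cast<std::int32_t>(
            static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b));
    }

    int main() {
        // INT32_MAX + 1 wraps to INT32_MIN, exactly as in Java.
        std::printf("%d\n", wrapping_add(INT32_MAX, 1));
    }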

