GCC 12.1 Released

Posted May 9, 2022 19:24 UTC (Mon) by wtarreau (subscriber, #51152)
In reply to: GCC 12.1 Released by excors
Parent article: GCC 12.1 Released

> You could remove the length check and do "if (snprintf(...) >= sizeof(fullpath)) return -1;", because -Wformat-truncation=1 only warns if it heuristically estimates that truncation is likely *and* the return value is unused. That would make the code simpler and more robust, since it no longer relies on you manually replicating snprintf's length calculation, and would eliminate the warning.

Sorry, but no. There are sufficiently bogus snprintf() implementations in the wild, I'm not going to remove a security check in my code just to silence a bogus warning in gcc. Instead I added the condition to snprintf() in addition to the existing one, making the code even uglier, and I even managed to fail it once by forgetting to add "> sizeof()" at the end. Fortunately it broke in the right direction and stopped working. A similar bug in the other direction can cause an introduction of a vulnerability, as quite often when playing dirty length tricks to shut up a compiler.

> I suspect the compiler is converting the check into "strlen(dir) + strlen(file) > 4096/2-2", and both values are unsigned so it can deduce strlen(dir) <= 2046 and strlen(file) <= 2046, but it forgets the relationship between them because it doesn't support multi-variable constraints on string lengths - it just has an integer upper/lower bound for each string independently

That was exactly my feeling as well, which proves that the warning is totally bogus and should be reverted. But they never revert warnings, they just add tons more until the code becomes unreadable in ifdefs and convoluted tests that become totally insecure.

> > This enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros.
> which is also behaving as advertised, because C string functions are always questionable, and it's easy to avoid the warning by checking snprintf's return value.

I get your point but here we're reaching the point that many of us have been seriously questioning for a while: "how long before we have to definitely remove -Wall projects built with gcc". That's sad because it used to catch many programmers' bugs in the past and has become useless and unusable over time. Reminds me of the 90s when compilers could almost compile /etc/passwd without sweating...

GCC 12.1 Released

Posted May 9, 2022 19:35 UTC (Mon) by mpr22 (subscriber, #60784) [Link] (6 responses)

This whole discussion does a very good job of convincing me that the real problem in this particular scenario is C's string model being profoundly Worng.

GCC 12.1 Released

Posted May 9, 2022 20:17 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (5 responses)

It's not wrong, merely inadequate. Quoth James Mickens:

> You might ask, “Why would someone write code in a grotesque
> language that exposes raw memory addresses? Why not use
> a modern language with garbage collection and functional
> programming and free massages after lunch?” Here’s the
> answer: Pointers are real. They’re what the hardware understands.
> Somebody has to deal with them. You can’t just place
> a LISP book on top of an x86 chip and hope that the hardware
> learns about lambda calculus by osmosis.

https://www.usenix.org/system/files/1311_05-08_mickens.pdf

(I'm sure that Mr. Mickens is/was aware that LISP machines existed, once upon a time, but they're hardly relevant to the modern era.)

GCC 12.1 Released

Posted May 10, 2022 3:43 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (4 responses)

That's exactly the way I see it. Nowadays people using C need it for low-level stuff because someone has to do it, and the places where C is needed want to have a good trust on the code translation to machine code because where it's used, it matters. Usually it's a mix of relying on hardware (e.g. hope the compiler will produce a ROL when using both a left and right shifts), the OS (e.g. cause a segfault when writing at address zero), and the libc (e.g; memcpy() does what the standard says it does).

That's why for me it's important that a C compiler tries less to guess about improbable mistakes that are relevant to absolutely zero use cases for this language, but instead focuses on real mistakes that are easy to solve (e.g; operators precedence, asking for braces, undefined use of self-increment in arguments, etc).

I'm fine with having such unlikely analysis but only on developer's request (e.g. -Wsuspicious). It could then go further and report some uncommon constructs that are inefficient and are suspicious because of that without annoying users when -Wall is used to detect likely incompatibilities on their platforms (because that's why most of us use -Wall -Werror, it's to catch problems at build time on other platforms).

GCC 12.1 Released

Posted May 10, 2022 15:51 UTC (Tue) by dvdeug (guest, #10998) [Link] (3 responses)

If you want all warnings, use Wall. If you want a specific set of warnings, turn just those warnings on. Turning arbitrary warnings on and turning warnings into errors turns the language into an ever-evolving, compiler-specific language, which is your choice, but something terribly silly to complain about.

GCC 12.1 Released

Posted May 10, 2022 17:22 UTC (Tue) by mpr22 (subscriber, #60784) [Link] (2 responses)

GCC's -Wall command-line option does not turn on all warnings, hasn't done for years, and quite possibly has never done so in the quarter-century I've been using GCC.

GCC 12.1 Released

Posted May 10, 2022 18:43 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (1 responses)

Exact. Plus "enabling specific warnings" would only work if there was a portable way to enable (or silence) warnings across all compilers without having to perform a discovery pass first. Enabling a fixed set everywhere is trivial when you use *your* compiler for *your* project. When you distribute your code and it builds on a wide range of compilers that's a totally different story.

GCC 12.1 Released

Posted May 11, 2022 21:42 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

The C standard doesn't really give compilers a whole lot to go on here. For certain issues, it directs the compiler to "emit a diagnostic," and for a superset of those issues, it says "the program is ill-formed," but that's it.

* Warnings vs. errors? Unspecified. The compiler is entirely within its rights to produce a binary even if the program is ill-formed and a diagnostic is required, so long as the compiler at least prints some sort of message diagnosing the issue.
* -Wall vs. -Wextra vs. -Wsome-random-thing? Nope. Most compilers emit all standardized warnings with no flags, so the additional warnings you can enable are all nonstandard, and it's purely an issue of implementation quality how they work, which flags toggle which warnings, and so on.
* Formatting of messages? No. The standard simply directs the compiler to "emit a diagnostic," and compiler writers are responsible for figuring out what that means and how to implement it. This is arguably a good thing, because it makes it possible to display warnings graphically or in an IDE (rather than e.g. requiring the use of stderr and then having the IDE parse the output from a separate process, which might be a good design but should not be mandatory), but it also means that different compilers can print totally different messages for the same problem.

About the best you can do is pick a set of warnings that you think is appropriate for your codebase (e.g. start with -Wall and add/subtract warnings as necessary), fix all of those warnings, and then aggressively WONTFIX any bugs that people file about warnings that are not on the list (unless it looks like the warning may have identified a real bug, in which case you might want to consider adding it to your list). If people don't like that, they can fork it.

GCC 12.1 Released

Posted May 10, 2022 20:39 UTC (Tue) by dvdeug (guest, #10998) [Link] (2 responses)

> There are sufficiently bogus snprintf() implementations in the wild,

So working implementations can't be improved because you have to be backwardly compatible with systems that can't properly implement functions released with BSD4.4 and standardized in C99. Where are these snprintf implementations? Especially "in the wild"? It's not sounding like something that most GCC users or developers would care about.

GCC 12.1 Released

Posted May 11, 2022 8:11 UTC (Wed) by geert (subscriber, #98403) [Link] (1 responses)

Like the snprintf() you had to roll yourself, because VxWorks didn't provide one?

GCC 12.1 Released

Posted May 11, 2022 20:44 UTC (Wed) by wtarreau (subscriber, #51152) [Link]

Or an old Solaris one that returned 0 or -1 when the operation failed (I don't remember, sorry), or the one in dietlibc that used to do something similar, etc. Even here the snprintf() doc doesn't match what we do on most modern systems:

https://pubs.opengroup.org/onlinepubs/7908799/xsh/snprint...

RETURN VALUE
Upon successful completion, these functions return the number of bytes
transmitted excluding the terminating null in the case of sprintf() or snprintf()
or a negative value if an output error was encountered.

On Linux+glibc:
The functions snprintf() and vsnprintf() do not write more than size
bytes (including the terminating null byte ('\0')). If the output was
truncated due to this limit, then the return value is the number of
characters (excluding the terminating null byte) which would have been
written to the final string if enough space had been available.

That's what most modern systems do, allowing you to realloc() the area and try
again. Some do not support being passed size zero, others do.

snprintf() is one of the most important and least portable functions when it comes
to good security practices. There's also %z (size_t) that's not much portable, and
"%.*s" that often does fun things like shifting all args by one since %.* is not
understood as consuming an extra argument, so usually you segfault by trying to
print the string from a pointer that's in fact its max length.

GCC 12.1 Released

Posted May 16, 2022 19:50 UTC (Mon) by jpfrancois (subscriber, #65948) [Link]

But the check the return value is much more robuste in à lot of case :
If you change the format strings it still works.
You do not need to implement your security check across all call site.
What if you have à slightly more complex format string ? You have to implement correctly the size calculation everywhere ?