|
|
Log in / Subscribe / Register

GCC 12.1 Released

GCC 12.1 Released

Posted May 9, 2022 7:45 UTC (Mon) by wtarreau (subscriber, #51152)
In reply to: GCC 12.1 Released by wtarreau
Parent article: GCC 12.1 Released

... and that started already with a new awesome warning, it didn't take long! Note, this one is implified, it instead complains at plenty of places where controls were already in place.

$ cat thankyougcc12.c
#include <sys/param.h>
#include <stdio.h>
#include <string.h>

char dir[MAXPATHLEN];
char file[MAXPATHLEN];
char fullpath[MAXPATHLEN];

/* returns -1 in case of error */
int makefullpath()
{
	if ((strlen(dir) + 1 + strlen(file) + 1) > sizeof(fullpath))
		return -1;

	snprintf(fullpath, sizeof(fullpath), "%s/%s", dir, file);
	return 0;
}

$ x86_64-linux-gcc -O2 -Wall-c thankyougcc12.c 
thankyougcc12.c: In function 'makefullpath':
thankyougcc12.c:15:50: warning: '%s' directive output may be truncated writing up to 4094 bytes into a region of size between 1 and 4095 [-Wformat-truncation=]
   15 |         snprintf(fullpath, sizeof(fullpath), "%s/%s", dir, file);
      |                                                  ^~        ~~~~
thankyougcc12.c:15:9: note: 'snprintf' output between 2 and 8190 bytes into a destination of size 4096
   15 |         snprintf(fullpath, sizeof(fullpath), "%s/%s", dir, file);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sure... I just performed the length check before calling snprintf() and it believes I'm trying to stuff the sum of these in this string. So I have two options, either I conclude that I can remove all my now useless length checks (since gcc12 doesn't trust them, so possibly it optimised them away, not checked) or I'll simply disable that warning that became stupid.

And it's really the control fro the previous check that is wrong, because if I lower the limit on the sump of strlen() in the first check to sizeof/2, it accepts to pass! So it looks like they've implemented a string length test for snprintf() that didn't consider that two strings could be concatenated by a single call (yes we can do that!). It would be nice if they only enabled warnings after they tested that they actually work on real code.

It's sad that each and every new version forces you to disable useful warnings that once used to be valid and became useless over time, it does render the code less secure by letting stupid bugs slip through. Because of this, in the long term I'll probably end up writing my own function and stop calling it snprintf() directly so that it stops being smart. Too bad if I introduce new bugs in this action.

What would be needed would be a diagnostic mode where you ask for suggestions or "are you sure" only as a developer, but not stuff like this that prove the compiler didn't understand the code but will cause build breakage at users', and it completely discourages programmers from putting error checks in their code since regardless of what was done, the compiler complains anyway.

Ah, GNU Complainers Collection, I really love you :-(


to post comments

GCC 12.1 Released

Posted May 9, 2022 10:08 UTC (Mon) by excors (subscriber, #95769) [Link] (18 responses)

> So I have two options, either I conclude that I can remove all my now useless length checks (since gcc12 doesn't trust them, so possibly it optimised them away, not checked) or I'll simply disable that warning that became stupid.

You could remove the length check and do "if (snprintf(...) >= sizeof(fullpath)) return -1;", because -Wformat-truncation=1 only warns if it heuristically estimates that truncation is likely *and* the return value is unused. That would make the code simpler and more robust, since it no longer relies on you manually replicating snprintf's length calculation, and would eliminate the warning.

> if I lower the limit on the sump of strlen() in the first check to sizeof/2, it accepts to pass!

I suspect the compiler is converting the check into "strlen(dir) + strlen(file) > 4096/2-2", and both values are unsigned so it can deduce strlen(dir) <= 2046 and strlen(file) <= 2046, but it forgets the relationship between them because it doesn't support multi-variable constraints on string lengths - it just has an integer upper/lower bound for each string independently (I think?). Then it knows the snprintf won't need more than 4094 bytes and can't overflow. In the original code, all it can deduce is strlen(dir) <= 4096 etc, which isn't sufficient to prove it won't overflow.

It appears this only fixes the warning at -O2, not -O1, seemingly because -O1 doesn't deduce string length constraints from strlen comparisons and it just uses the declared length instead.

The GCC documentation says:

> When the exact number of bytes written by a format directive cannot be determined at compile-time it is estimated based on heuristics that depend on the level argument and on optimization. While enabling optimization will in most cases improve the accuracy of the warning, it may also result in false positives.

so it's behaving as advertised (i.e. not stable or precise). And -Wall says:

> This enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros.

which is also behaving as advertised, because C string functions are always questionable, and it's easy to avoid the warning by checking snprintf's return value.

GCC 12.1 Released

Posted May 9, 2022 10:44 UTC (Mon) by atnot (guest, #124910) [Link] (5 responses)

> It appears this only fixes the warning at -O2, not -O1

Wait wait what, the warnings change depending on optimization level? Am I the only one for whom this is surprising news?

GCC 12.1 Released

Posted May 9, 2022 11:40 UTC (Mon) by anselm (subscriber, #2796) [Link]

Wait wait what, the warnings change depending on optimization level?

That's not new. I seem to remember from back when I was programming in C more that some GCC warnings about unreachable code or uninitialised variables were only output under optimisation, because otherwise the analysis on which these warnings were based would not have been performed.

GCC 12.1 Released

Posted May 9, 2022 11:43 UTC (Mon) by pizza (subscriber, #46) [Link]

> Wait wait what, the warnings change depending on optimization level? Am I the only one for whom this is surprising news?

This would appear to be an obvious conclusion from different optimization levels producing different sets of warnings.

I don't know when I first became aware of this, but it's been at least a decade.

GCC 12.1 Released

Posted May 9, 2022 11:49 UTC (Mon) by excors (subscriber, #95769) [Link]

That's true for lots of compiler warnings. The optimisation passes provide a lot of information about control flow and data flow, especially when they remove function call boundaries by inlining, which helps determine whether code is probably buggy (and should be warned about) or probably safe (no warning). Without that information, the compiler can't be confident either way and will usually err on the side of not warning (because programmers get really annoyed by false positives, especially if there's no easy way to make the compiler shut up). So it will usually find and report more bugs when you turn on optimisation.

In this case, if the variables are declared as char* then the compiler has no idea of their probable length and doesn't warn. It's only because they're declared as char[MAXPATHLEN] that it becomes reasonably confident in its guess that the string might actually be MAXPATHLEN-1 in length, which is enough confidence to emit the (incorrect) warning. More sophisticated optimisation passes let it make a better guess of the string's length, reducing the false positives.

GCC 12.1 Released

Posted May 9, 2022 12:22 UTC (Mon) by tzafrir (subscriber, #11501) [Link] (1 responses)

On GCC 10 you get the same warning even without any -O flag. So this did somewhat improve in later GCC versions.

GCC 12.1 Released

Posted May 9, 2022 19:25 UTC (Mon) by wtarreau (subscriber, #51152) [Link]

> On GCC 10 you get the same warning even without any -O flag. So this did somewhat improve in later GCC versions.

In a sense, that's a way to see it... But 4.7 never got it wrong at all and used to provide meaningful warnings if you go in that direction :-) Plus it was 3 times faster.

GCC 12.1 Released

Posted May 9, 2022 19:24 UTC (Mon) by wtarreau (subscriber, #51152) [Link] (11 responses)

> You could remove the length check and do "if (snprintf(...) >= sizeof(fullpath)) return -1;", because -Wformat-truncation=1 only warns if it heuristically estimates that truncation is likely *and* the return value is unused. That would make the code simpler and more robust, since it no longer relies on you manually replicating snprintf's length calculation, and would eliminate the warning.

Sorry, but no. There are sufficiently bogus snprintf() implementations in the wild, I'm not going to remove a security check in my code just to silence a bogus warning in gcc. Instead I added the condition to snprintf() in addition to the existing one, making the code even uglier, and I even managed to fail it once by forgetting to add "> sizeof()" at the end. Fortunately it broke in the right direction and stopped working. A similar bug in the other direction can cause an introduction of a vulnerability, as quite often when playing dirty length tricks to shut up a compiler.

> I suspect the compiler is converting the check into "strlen(dir) + strlen(file) > 4096/2-2", and both values are unsigned so it can deduce strlen(dir) <= 2046 and strlen(file) <= 2046, but it forgets the relationship between them because it doesn't support multi-variable constraints on string lengths - it just has an integer upper/lower bound for each string independently

That was exactly my feeling as well, which proves that the warning is totally bogus and should be reverted. But they never revert warnings, they just add tons more until the code becomes unreadable in ifdefs and convoluted tests that become totally insecure.

> > This enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros.
> which is also behaving as advertised, because C string functions are always questionable, and it's easy to avoid the warning by checking snprintf's return value.

I get your point but here we're reaching the point that many of us have been seriously questioning for a while: "how long before we have to definitely remove -Wall projects built with gcc". That's sad because it used to catch many programmers' bugs in the past and has become useless and unusable over time. Reminds me of the 90s when compilers could almost compile /etc/passwd without sweating...

GCC 12.1 Released

Posted May 9, 2022 19:35 UTC (Mon) by mpr22 (subscriber, #60784) [Link] (6 responses)

This whole discussion does a very good job of convincing me that the real problem in this particular scenario is C's string model being profoundly Worng.

GCC 12.1 Released

Posted May 9, 2022 20:17 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (5 responses)

It's not wrong, merely inadequate. Quoth James Mickens:

> You might ask, “Why would someone write code in a grotesque
> language that exposes raw memory addresses? Why not use
> a modern language with garbage collection and functional
> programming and free massages after lunch?” Here’s the
> answer: Pointers are real. They’re what the hardware understands.
> Somebody has to deal with them. You can’t just place
> a LISP book on top of an x86 chip and hope that the hardware
> learns about lambda calculus by osmosis.

https://www.usenix.org/system/files/1311_05-08_mickens.pdf

(I'm sure that Mr. Mickens is/was aware that LISP machines existed, once upon a time, but they're hardly relevant to the modern era.)

GCC 12.1 Released

Posted May 10, 2022 3:43 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (4 responses)

That's exactly the way I see it. Nowadays people using C need it for low-level stuff because someone has to do it, and the places where C is needed want to have a good trust on the code translation to machine code because where it's used, it matters. Usually it's a mix of relying on hardware (e.g. hope the compiler will produce a ROL when using both a left and right shifts), the OS (e.g. cause a segfault when writing at address zero), and the libc (e.g; memcpy() does what the standard says it does).

That's why for me it's important that a C compiler tries less to guess about improbable mistakes that are relevant to absolutely zero use cases for this language, but instead focuses on real mistakes that are easy to solve (e.g; operators precedence, asking for braces, undefined use of self-increment in arguments, etc).

I'm fine with having such unlikely analysis but only on developer's request (e.g. -Wsuspicious). It could then go further and report some uncommon constructs that are inefficient and are suspicious because of that without annoying users when -Wall is used to detect likely incompatibilities on their platforms (because that's why most of us use -Wall -Werror, it's to catch problems at build time on other platforms).

GCC 12.1 Released

Posted May 10, 2022 15:51 UTC (Tue) by dvdeug (subscriber, #10998) [Link] (3 responses)

If you want all warnings, use Wall. If you want a specific set of warnings, turn just those warnings on. Turning arbitrary warnings on and turning warnings into errors turns the language into an ever-evolving, compiler-specific language, which is your choice, but something terribly silly to complain about.

GCC 12.1 Released

Posted May 10, 2022 17:22 UTC (Tue) by mpr22 (subscriber, #60784) [Link] (2 responses)

GCC's -Wall command-line option does not turn on all warnings, hasn't done for years, and quite possibly has never done so in the quarter-century I've been using GCC.

GCC 12.1 Released

Posted May 10, 2022 18:43 UTC (Tue) by wtarreau (subscriber, #51152) [Link] (1 responses)

Exact. Plus "enabling specific warnings" would only work if there was a portable way to enable (or silence) warnings across all compilers without having to perform a discovery pass first. Enabling a fixed set everywhere is trivial when you use *your* compiler for *your* project. When you distribute your code and it builds on a wide range of compilers that's a totally different story.

GCC 12.1 Released

Posted May 11, 2022 21:42 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

The C standard doesn't really give compilers a whole lot to go on here. For certain issues, it directs the compiler to "emit a diagnostic," and for a superset of those issues, it says "the program is ill-formed," but that's it.

* Warnings vs. errors? Unspecified. The compiler is entirely within its rights to produce a binary even if the program is ill-formed and a diagnostic is required, so long as the compiler at least prints some sort of message diagnosing the issue.
* -Wall vs. -Wextra vs. -Wsome-random-thing? Nope. Most compilers emit all standardized warnings with no flags, so the additional warnings you can enable are all nonstandard, and it's purely an issue of implementation quality how they work, which flags toggle which warnings, and so on.
* Formatting of messages? No. The standard simply directs the compiler to "emit a diagnostic," and compiler writers are responsible for figuring out what that means and how to implement it. This is arguably a good thing, because it makes it possible to display warnings graphically or in an IDE (rather than e.g. requiring the use of stderr and then having the IDE parse the output from a separate process, which might be a good design but should not be mandatory), but it also means that different compilers can print totally different messages for the same problem.

About the best you can do is pick a set of warnings that you think is appropriate for your codebase (e.g. start with -Wall and add/subtract warnings as necessary), fix all of those warnings, and then aggressively WONTFIX any bugs that people file about warnings that are not on the list (unless it looks like the warning may have identified a real bug, in which case you might want to consider adding it to your list). If people don't like that, they can fork it.

GCC 12.1 Released

Posted May 10, 2022 20:39 UTC (Tue) by dvdeug (subscriber, #10998) [Link] (2 responses)

> There are sufficiently bogus snprintf() implementations in the wild,

So working implementations can't be improved because you have to be backwardly compatible with systems that can't properly implement functions released with BSD4.4 and standardized in C99. Where are these snprintf implementations? Especially "in the wild"? It's not sounding like something that most GCC users or developers would care about.

GCC 12.1 Released

Posted May 11, 2022 8:11 UTC (Wed) by geert (subscriber, #98403) [Link] (1 responses)

Like the snprintf() you had to roll yourself, because VxWorks didn't provide one?

GCC 12.1 Released

Posted May 11, 2022 20:44 UTC (Wed) by wtarreau (subscriber, #51152) [Link]

Or an old Solaris one that returned 0 or -1 when the operation failed (I don't remember, sorry), or the one in dietlibc that used to do something similar, etc. Even here the snprintf() doc doesn't match what we do on most modern systems:

https://pubs.opengroup.org/onlinepubs/7908799/xsh/snprint...

RETURN VALUE
Upon successful completion, these functions return the number of bytes
transmitted excluding the terminating null in the case of sprintf() or snprintf()
or a negative value if an output error was encountered.

On Linux+glibc:
The functions snprintf() and vsnprintf() do not write more than size
bytes (including the terminating null byte ('\0')). If the output was
truncated due to this limit, then the return value is the number of
characters (excluding the terminating null byte) which would have been
written to the final string if enough space had been available.

That's what most modern systems do, allowing you to realloc() the area and try
again. Some do not support being passed size zero, others do.

snprintf() is one of the most important and least portable functions when it comes
to good security practices. There's also %z (size_t) that's not much portable, and
"%.*s" that often does fun things like shifting all args by one since %.* is not
understood as consuming an extra argument, so usually you segfault by trying to
print the string from a pointer that's in fact its max length.

GCC 12.1 Released

Posted May 16, 2022 19:50 UTC (Mon) by jpfrancois (subscriber, #65948) [Link]

But the check the return value is much more robuste in à lot of case :
If you change the format strings it still works.
You do not need to implement your security check across all call site.
What if you have à slightly more complex format string ? You have to implement correctly the size calculation everywhere ?


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds