Modern C for Fedora (and the world)
There are a number of constructs that were normal in 1980s C, but which are now seen as a breeding ground for bugs. These include the following (a short sketch illustrating several of them appears after the list):
- Implicit function declarations: if code calls function() without having first included a declaration for that function, the compiler implicitly declares it as returning an int value and taking unspecified parameters. That may not be how the function is actually defined, opening up possibilities for all kinds of confusion.
- Implicit integer declarations: a variable declared with just a storage class (static, for example) is implicitly deemed to be an int. C++ has already adopted type inference, where the compiler figures out the appropriate type for such a variable from its initializer. There are schemes afoot to add a similar feature to C, but type inference is incompatible with implicit int.
- Conversions between pointers and integers: original C played fast and loose with pointer values, allowing them to be converted to and from int values at will. Whether such constructions actually work on current architectures (where a pointer is likely to be 64 bits and an int 32 bits) is a matter of chance.
- Inconsistent return statements: old-style C paid little attention to whether a function returned a value or not; a function declared int could do a bare return (or just fall off the end with no return statement at all), and void functions could attempt to return a value without complaint. Good things will not happen if a function fails to return a value that the caller is expecting.
- Missing parameter types in function definitions: C would accept such definitions, assigning no type to the parameter at all. That means that typos in a function prototype (such as "f(ant)" instead of "f(int)") can give surprising results.
- Assignments between incompatible pointer types: continuing in the "fast and loose with pointers" theme, early C had no objections to assigning a pointer value to an incompatible type without even a cast. Sometimes a developer writing such an assignment knew what they were doing; other times not.
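For concreteness, here is a small, deliberately bad file (a sketch written for this list, not taken from any real package) that triggers several of these diagnostics; current GCC releases still compile it as a translation unit, and options such as -Werror=implicit-function-declaration, -Werror=implicit-int, -Werror=int-conversion, and -Werror=incompatible-pointer-types turn the individual warnings into the hard errors discussed below:

/* legacy.c -- exercises several of the constructs listed above */
static count;                    /* implicit int: no type given */

int use_it(flag)                 /* old-style definition, parameter type missing */
{
    long *lp = &count;           /* assignment between incompatible pointer types */
    int addr = lp;               /* pointer silently converted to int */

    do_work(flag);               /* implicit declaration: do_work() was never declared */

    if (flag)
        return addr;
    /* otherwise falls off the end of a non-void function */
}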
Current GCC versions will issue warnings for the above constructs, but will proceed to compile the code anyway. Florian Weimer, though, would like to change that situation; at the end of November, he posted an update on work toward turning the warnings for obsolete C constructs into hard errors instead. This would seem like a sensible thing to do; those constructs have been deprecated for ages; they can hide bugs or prevent the adoption of new language features and should not be appearing in modern code.
There is only one little problem: a lot of code in the free-software world is not modern. Simply turning all of those warnings into errors has the potential to break the compilation of numerous packages — an outcome that is not likely to be universally welcomed. To address this problem, the Fedora project has been working on a "porting to modern C" project since at least late 2022. The idea is to find the packages in Fedora that fail to build with the new errors and fix them, sending those fixes upstream whenever possible. Once Fedora builds correctly, chances are that the amount of old code that remains will be relatively small.
Weimer has also posted an update on the Fedora work. There are, it seems, still a number of packages (out of about 15,000 tested) that generate errors indicating the presence of old code:
Error type                        Packages
Implicit function declaration           53
Implicit integer declaration             2
Integer conversion                      99
Return mismatch                         13
Missing parameter type                   0
Pointer assignment                     374
While quite a bit of progress has been made toward the goal of building Fedora with the new errors, Weimer points out that the job is not yet done:
As you can see, the incompatible-pointer-types issues are a bit of a problem. We fixed over 800 packages during the first round, and now it looks like we are only two thirds done. It is unlikely that I will be able to work on all these issues myself or with help from the people around me. I just suggested to GCC upstream that we may have to reconsider including this change in the GCC 14 release.
Weimer included a separate column for programs that may be miscompiled because autoconf may be confused by the new errors. For example, many of its checks don't bother to declare exit(); they will fail to compile if the error for implicit function declarations is enabled, causing autoconf to conclude that the feature it is checking for is absent. There also seem to be problems with the Vala language, which compiles to obsolete C; Vala has not been under active development for some time, so this problem seems unlikely to be fixed.
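For illustration, a generated autoconf test program of the problematic kind looks roughly like this (a sketch of the general pattern, not a check copied from any particular package):

char some_function();            /* autoconf's usual dummy declaration for the probed symbol */

int main(void)
{
    some_function();
    exit(0);                     /* exit() used without including <stdlib.h> */
}

If implicit function declarations are a hard error, this snippet no longer compiles, and configure concludes that some_function() is unavailable regardless of whether it actually is.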
The current plan is to continue this work, focusing mostly on the Fedora Rawhide development distribution. Efforts will be made to deal with the autoconf problem and to put some sort of hack into Vala, but that still leaves hundreds of packages needing further attention. If they cannot be fixed in time, it may not be possible to enable all of those errors in the GCC 14 release.
Part of the problem, perhaps, is that it appears to have fallen on Fedora
and GCC developers to make these fixes. In many cases, this may be the
result of the lack of a viable upstream for many packages; we are probably
all using more unmaintained code than we like to think. At its best, this
work might shine a light on some of those packages and, in a truly
optimistic world, bring out developers who can pick up that maintenance and
modernize the code. In many cases, it should be a relatively
straightforward task and a reasonable entry point into maintainership.
With enough help, perhaps we can finally leave archaic C code behind.
Posted Dec 8, 2023 17:08 UTC (Fri)
by willy (subscriber, #9762)
[Link] (61 responses)
Posted Dec 8, 2023 17:21 UTC (Fri)
by cjwatson (subscriber, #7322)
[Link] (2 responses)
Posted Dec 9, 2023 10:59 UTC (Sat)
by fw (subscriber, #26023)
[Link] (1 responses)
But it turns out that on x86-64, without PIE, global data, constants, and the heap all are in the first 32 bits of the address space. Even today, only the stack is outside that range. So you get surprisingly far with 32-bit pointers only. It really shouldn't work, but it does in many cases. But of course PIE changes that.
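To make that concrete, here is a minimal sketch (deliberately incorrect code, written for this comment) of a pointer being squeezed through an int and back:

#include <stdio.h>
#include <stdlib.h>

static int global;

int main(void)
{
    void *heap = malloc(16);
    int stored = (int)(long)heap;        /* keeps only the low 32 bits */
    void *back = (void *)(long)stored;   /* sign-extends back to a pointer */

    printf("heap   %p\nback   %p\nglobal %p\nstack  %p\n",
           heap, back, (void *)&global, (void *)&stored);
    free(heap);
    return 0;
}

On a non-PIE x86-64 build, heap, back, and global typically all print values below 0x100000000, so the round trip appears to work; the stack address does not fit in 32 bits, and with PIE none of them need to.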
Posted Dec 16, 2023 8:16 UTC (Sat)
by mpr22 (subscriber, #60784)
[Link]
surely that depends on the size of your heap
Posted Dec 8, 2023 17:32 UTC (Fri)
by Paf (subscriber, #91811)
[Link] (28 responses)
So the assumption that made most of that code work didn’t become wrong with the advent of 64 bit.
Fwiw, I work in a project that has long done Wall and Werror and so all of these constructs terrify me :)
Posted Dec 8, 2023 19:34 UTC (Fri)
by roc (subscriber, #30627)
[Link] (27 responses)
Though, maybe compilers have stopped adding warnings to -Wall and now only add to -Wextra instead? I wish I knew.
Posted Dec 8, 2023 22:23 UTC (Fri)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
Eh, it depends what you want to get out of Wall/Werror. If you're a distro, of course you don't want to use it, it will break all the packages all the time. If you're an upstream, and you also require zero lint errors (for whatever linter your project is using), then this is much less problematic. By the time something makes it into -Wall, the linters have probably been complaining about it for years, and so in practice, the amount of breakage when you upgrade to a new compiler is rather limited. And you always have the option of (temporarily) doing -Wall -Wno-foo if a particular warning causes issues.
Posted Dec 9, 2023 7:56 UTC (Sat)
by wtarreau (subscriber, #51152)
[Link]
Posted Dec 9, 2023 11:09 UTC (Sat)
by Sesse (subscriber, #53779)
[Link]
Posted Dec 8, 2023 22:40 UTC (Fri)
by fwiesweg (guest, #116364)
[Link] (19 responses)
On the other hand, if you are able to keep up with the load, it's about the best thing you can do. With each modernization push, enforced by making warnings fail hard, the amount of runtime errors can be brought down considerably, and by now nearly all issues we have are caused by missing or disabled static checks.
Of course, updating was gruesome, tedious work, but it makes life afterward much more relaxed and enjoyable. I even ran a Friday deployment today without being overly worried, something I'd never have done just five years ago. In the long run, -Wall was really worth it.
Posted Dec 8, 2023 23:45 UTC (Fri)
by roc (subscriber, #30627)
[Link] (15 responses)
Posted Dec 9, 2023 0:30 UTC (Sat)
by pbonzini (subscriber, #60935)
[Link] (13 responses)
Posted Dec 9, 2023 1:02 UTC (Sat)
by roc (subscriber, #30627)
[Link] (6 responses)
Currently we build with -Werror -Wall for CMAKE_BUILD_TYPE=DEBUG, and not for CMAKE_BUILD_TYPE=RELEASE. That's assuming developers build regularly with DEBUG and people who just want a working upstream build don't. It works out OK in practice. It doesn't seem ideal but maybe it's about as good as it can be.
Posted Dec 9, 2023 8:15 UTC (Sat)
by pm215 (subscriber, #98099)
[Link] (4 responses)
Posted Dec 9, 2023 11:11 UTC (Sat)
by smcv (subscriber, #53363)
[Link] (3 responses)
Posted Dec 9, 2023 16:41 UTC (Sat)
by pm215 (subscriber, #98099)
[Link] (2 responses)
Unless the compiler authors commit to "-Og will never lose debug info that is present in -O0" I personally use and advise others to use -O0.
Posted May 12, 2024 4:41 UTC (Sun)
by koh (subscriber, #101482)
[Link] (1 responses)
We have -Werror on also for optimized debug builds (those with -O2 -UNDEBUG) in the CI. The only way they differ from release builds is NDEBUG. Locally, by default, I run -Wno-error, because I frequently switch between compilers/versions.
Posted May 12, 2024 14:00 UTC (Sun)
by pizza (subscriber, #46)
[Link]
Over the years I've had to work with codebases that simply wouldn't *fit* in the available space without a combination of -Os and LTO. -Og has proven to be quite useful in a context where -O0 simply isn't feasible.
Posted Dec 9, 2023 12:11 UTC (Sat)
by kreijack (guest, #43513)
[Link]
People building from upstream should be able to deal with this kind of issue; usually this means reading a README file.
> Currently we build with -Werror -Wall for CMAKE_BUILD_TYPE=DEBUG, and not for CMAKE_BUILD_TYPE=RELEASE. That's assuming developers build regularly with DEBUG and people who just want a working upstream build don't. It works out OK in practice. It doesn't seem ideal but maybe it's about as good as it can be.
This is a sane principle.
Posted Dec 9, 2023 8:35 UTC (Sat)
by marcH (subscriber, #57642)
[Link] (5 responses)
What I found to work well is to have -Werror added only in pre-merge CI. Not having it by default makes prototyping more convenient.
This is consistent with running linters in pre-merge CI while not forcing developers to run them all the time.
None of this is specific to C.
Of course you need to have some pre-merge CI in the first place. If you don't even have that minimal level of CI then the project is basically unmaintained.
Posted Dec 9, 2023 14:46 UTC (Sat)
by mathstuf (subscriber, #69389)
[Link] (4 responses)
Posted Dec 9, 2023 17:54 UTC (Sat)
by marcH (subscriber, #57642)
[Link] (3 responses)
This being said, the simplest and best solution is to compile twice: once without -Werror and once with -Werror. This can be in two separate (and clearly labeled) runs or even consecutively. The first run shows all warnings and the second blocks the merge.
This is a bit similar to the `make || make -j1` technique that avoids (real) errors being drowned by many threads and confusing developers.
Posted Dec 9, 2023 21:43 UTC (Sat)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
I'll do an initial run on all of the CI configurations to get a survey of what is broken and then focus on what is broken after that (I don't build all of the configurations locally to know anyways).
Posted Dec 10, 2023 0:53 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (1 responses)
If you have a good test framework that does all that for you then you should absolutely ignore my previous post. Not everyone is that lucky. I mean many projects don't even have any pre-merge CI at all (yet?). Remember that the main article is about Fedora and others stepping up to rescue orphaned projects coded in ancient C. In such a context my simple advice above definitely holds because it's just one extra line in your CI configuration. Super cheap and very high value and something people not familiar with CI may think about.
> Developers aren't looking through build logs.
They don't by default (assuming of course you have developers in the first place...)
They definitely do when there's a CI red light somewhere that threatens the merge of their code and maybe their deadline. In such a case I know from first-hand experience that they really enjoy the simple "tricks" I recommended above.
> and uploads it to CDash for viewing.
I don't know anything about CDash but I know neither GitHub nor Jenkins nor Gitlab has any "yellow light"/warning concept, it's either green/pass or red/fail. Running twice with and without -Werror also solves that display limitation problem extremely cheaply. Again: if you have a smarter and better viewer then by all means ignore my tricks.
> We also don't need the second `-Werror` run (which pollutes the build cache)
Curious what you mean here.
Posted Dec 10, 2023 4:10 UTC (Sun)
by mathstuf (subscriber, #69389)
[Link]
GitLab-CI does have a "warning" mode with the `allow_failure` key[1]. We use exit code 47 to indicate "warnings happened" so that the testing can proceed even though the build made warning noise. There are issues with PowerShell exit code extraction and that always hard-fails, but that seems to be a gitlab-runner issue (it worked before we upgraded for other reasons). It's actually nifty because it still reports as a `failed` *state* and the `allow_failure` key on the job just changes the render and "can dependent jobs proceed" logic, so our merge robot just sees that state and says "no" to merging.
> > We also don't need the second `-Werror` run (which pollutes the build cache)
> Curious what you mean here.
We have a shared cache for CI (`sccache`-based; `buildcache` on Windows). Adding another set of same-object output for a different set of flags just removes space otherwise ideally suited for storing other build results (*maybe* the object is deduplicated, but it doesn't seem necessary to me; probably backend-dependent anyways).
Posted Dec 9, 2023 15:25 UTC (Sat)
by Paf (subscriber, #91811)
[Link]
So we have circumstances that are a bit different, I think.
Posted Dec 9, 2023 1:24 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
That's what I did with a code base. Just worked through the codebase adding -W3 to each module in turn, and cleared all the errors. It took time, but the quality of the code base shot up, and loads of unexplained errors just disappeared :-)
Cheers,
Posted Dec 11, 2023 14:48 UTC (Mon)
by rgmoore (✭ supporter ✭, #75)
[Link] (1 responses)
A reasonable way to think about this is to treat all those compiler warnings as technical debt. Paying off that technical debt will be painful, especially if you have a lot of it, but it's probably worth it in the long run. The big cost will be when you take a project that has allowed the warnings to pile up and suddenly force everyone to spend time fixing those warnings rather than develop anything new. Dealing with new warnings as compilers change their mind about what deserves a warning will be more manageable. The main problem in that case is letting the compiler writers dictate when you pay off your technical debt rather than making the decision yourself.
Posted Dec 11, 2023 16:46 UTC (Mon)
by Wol (subscriber, #4433)
[Link]
We had a project where we couldn't suppress a particular warning (MSC v6, -W4, bought in library, unused arguments. Catch 22, we could fix warning A, but the fix triggered warning B, cue endless loop).
Anyways, our standards said "All warnings must be explained and understood". So that one we just ignored. There's no reason a project can't say "it's an old warning, we haven't got round to fixing it". But any new warning in modified code is an instant QA failure.
Cheers,
Posted Dec 9, 2023 16:24 UTC (Sat)
by jwarnica (subscriber, #27492)
[Link]
Introducing a new compiler version is a significant step. Perhaps you will need a development branch to work through that, but you should either never change compiler versions, or actually do all that is needed when you do....
Which could well be disabling particular checks in the build process. But if you said Wall, then you have implicitly deferred to the compiler people's taste.
Posted Dec 9, 2023 22:42 UTC (Sat)
by quotemstr (subscriber, #45331)
[Link] (2 responses)
Posted Dec 10, 2023 11:25 UTC (Sun)
by joib (subscriber, #8541)
[Link]
Posted Dec 14, 2023 12:04 UTC (Thu)
by spacefrogg (subscriber, #119608)
[Link]
Posted Dec 8, 2023 17:36 UTC (Fri)
by Hello71 (subscriber, #103412)
[Link] (2 responses)
Posted Dec 11, 2023 8:10 UTC (Mon)
by jengelh (guest, #33263)
[Link] (1 responses)
Posted Dec 13, 2023 1:52 UTC (Wed)
by marcH (subscriber, #57642)
[Link]
Big endian is more "human-friendly" because you can read hexdumps "as is" (because humans use big endian too)
Little endian is more "computer-friendly" because of what you just explained.
In other words, Gulliver is wrong here.
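To make the hexdump point concrete, a small sketch (written for this comment, assuming a 32-bit value):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t v = 0x01020304;
    const unsigned char *p = (const unsigned char *)&v;

    /* dump the bytes in the order they sit in memory */
    for (unsigned i = 0; i < sizeof v; i++)
        printf("%02x ", p[i]);
    printf("\n");
    return 0;
}

A big-endian machine prints "01 02 03 04", the order a human reads the constant; a little-endian machine prints "04 03 02 01".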
Posted Dec 10, 2023 8:57 UTC (Sun)
by swilmet (subscriber, #98424)
[Link] (23 responses)
In my opinion, type inference for variable declarations should be used only sparingly, when the type of the variable is already visible (and quite long to write) on the right-hand side of the assignment. Writing the types of variables explicitly enhances code comprehension.
See this article that I wrote this night after reading this LWN article:
About type inference
(the article is 2 pages long, a bit too long to copy here as a comment, I suppose).
Posted Dec 10, 2023 9:00 UTC (Sun)
by swilmet (subscriber, #98424)
[Link]
Posted Dec 10, 2023 11:52 UTC (Sun)
by excors (subscriber, #95769)
[Link] (20 responses)
Like, using `auto` instead of `const char*` or `ArrayList<String>` isn't a huge benefit, because those are pretty simple types. But when you're regularly writing code like:
for (std::map<std::string, std::string>::iterator it = m.begin(); it != m.end(); ++it) { ... }
then it gets quite annoying, since the type name makes up half the line, and it obscures the high-level intent of the code (which is simply to iterate over `m`). (And that's not the real type anyway; `std::string` is the templated `std::basic_string<char>`, and the `iterator` is a typedef which is documented to be a LegacyBidirectionalIterator which is a LegacyForwardIterator which is a LegacyIterator which specifies the `++it` operation etc, so in practice you're not going to figure out how the type behaves from the documentation - you're really going to need a type-aware text editor or IDE, at least until you've memorised enough of the typical library usage patterns. That's just an obligatory part of modern programming.)
Or in Rust you might rely on type inference like:
let v = line.split_ascii_whitespace().map(|s| s.parse().unwrap());
where you can see the important information (that it ends up with a vector of ints), and you can assume `v` is some sort of iterable thing but you don't care exactly what. Writing it explicitly would be something terrible like:
let v: std::iter::Map<std::str::SplitAsciiWhitespace<'_>, impl Fn(&str) -> i32> = line.split_ascii_whitespace().map(|s| s.parse().unwrap());
except that won't actually work because the `'_` is referring to a lifetime which I don't think there is any way to express in code; and the closure is actually an anonymous type (constructed by the compiler to contain any captured variables) which implements the `Fn` trait, and you can only use the `impl Trait` syntax in argument types (where it's a form of generics) and return types (where it's a kind of information hiding), not in variable bindings, so there's no way to name the closure type. Rust's statically-checked lifetimes and non-heap-allocated closures are useful features that simply can't work without type inference.
Posted Dec 10, 2023 21:52 UTC (Sun)
by tialaramex (subscriber, #21167)
[Link] (5 responses)
Sure, I have no idea what "type" chars actually is, but it's clearly some sort of Iterator, and somebody named it chars, I feel entitled to assume it impl Iterator<Item = char> unless it's obvious in context that it doesn't.
If anything I think I more often resent needing to spell out types for e.g. constants where I'm obliged to specify that const MAX_EXPIRY: MyDayType = 398; rather than let the compiler figure out that's the only correct type. I don't hate that enough to think it should be changed, it makes sense, but I definitely run into it more often than I regret not knowing the type of chars in a construction like let chars = foo.bar().into_iter()
However, of course C has lots of footguns which I can imagine would be worsened with inference, so just because it was all rainbows and puppies in Rust doesn't mean the same will be true in C.
Posted Dec 10, 2023 22:07 UTC (Sun)
by mb (subscriber, #50428)
[Link]
Yes, that is true.
Type inference works well in Rust due to its strict type system.
Posted Dec 10, 2023 23:00 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link] (3 responses)
I would agree with this. The main concern I can think of is how C handles numeric conversions. They are messy, complicated, and I always have to look them up.[1] They can mostly be summarized as "promote everything to the narrowest type that can represent all values of both argument types, and if an integer, is at least as wide as int," but that summary is wrong (float usually *can't* represent all values of int, but C will just promote int to float anyway). Throwing type inference on top of that mess is probably just going to make things worse.
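A small example of the float case mentioned above (a sketch assuming 32-bit int and IEEE-754 single-precision float):

#include <stdio.h>

int main(void)
{
    int big = 16777217;           /* 2^24 + 1: not exactly representable in float */
    float f = 0.0f;

    /* The usual arithmetic conversions turn the int operand into a float,
     * silently dropping the low bit. */
    printf("%.1f\n", big + f);    /* prints 16777216.0 */
    return 0;
}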
By contrast, Rust has no such logic. If you add i32 + i16, or any other situation where the types do not match, you just get a flat compiler error.
I do wish Rust would let me write this:
let x: i32 = 1;
(Presumably this is because you can also add i32 + &i32, and the compiler isn't quite smart enough to rule out that override.)
The compiler suggests writing this abomination, which does work:
let z: i32 = x + <i16 as Into<i32>>::into(y);
But at least you can write this:
let x: i32 = 1;
Posted Dec 11, 2023 3:08 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
let z: i32 = x + i32::from(y);
Obviously I need to spend more time studying Rust, or maybe actually sit down and write a toy program in it.
Finally, I should note that you can write "y as i32", but that's less safe because it will silently do a narrowing conversion. from() and into() can only do conversions that never lose data, and there's also try_from()/try_into() if you want to handle overflow explicitly.
Posted Dec 11, 2023 13:08 UTC (Mon)
by gspr (guest, #91542)
[Link] (1 responses)
And there's try_from().expect("Conversion failure") for those cases where you wanna say "man, I don't really wanna think about this, and I'm sure the one type converts to the other without loss in all cases my program experiences – but if I did overlook something, then at least abort with an error message instead of introducing silent errors".
Posted May 8, 2024 15:41 UTC (Wed)
by adobriyan (subscriber, #30858)
[Link]
The "messy numeric conversions" are largely due to rubber types and the fact that there are lots of them
If all you have is what Rust has, C is not _that_ bad.
The kernel has a certain number of min(x, 1UL) expressions just because x is "unsigned long", but it is clear that the programmer wants typeof(x).
Posted Dec 11, 2023 4:43 UTC (Mon)
by swilmet (subscriber, #98424)
[Link] (13 responses)
Both C++ and Rust have a large core language, while C has a small core language.
I see Rust more as a successor to C++. C programmers in general - I think - like the fact that C has a small core language. So in C the types remain small to write, and there are more function calls instead of using sophisticated core language features. C is thus more verbose, and verbosity can be seen as an advantage.
Maybe the solution is to create a SubC language: a subset of C that is safe (or at least safer). That's already partly the case with the compiler options, hardening efforts etc.
Posted Dec 11, 2023 8:39 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link] (12 responses)
I disagree with this, assuming that "safe" means "cannot cause UB outside of an unsafe block." A safe version of C needs at least the following:
* Lifetimes and borrow checking, which implies a type annotation similar to generics.
I just don't see how you provide all of that flexibility without doing monomorphization, at which point you're already 80% of the way to reinventing Rust.
Posted Dec 11, 2023 11:10 UTC (Mon)
by Sesse (subscriber, #53779)
[Link] (3 responses)
Posted Dec 11, 2023 13:52 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (2 responses)
If you're not careful, you end up with something like Wuffs. A perfectly useful language in some domains, but deliberately limited in scope to stop you writing many classes of bug.
Posted Dec 14, 2023 10:55 UTC (Thu)
by swilmet (subscriber, #98424)
[Link] (1 responses)
Posted Dec 14, 2023 10:57 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
You're not going to get very far when you can't access arguments, or do I/O. Wuffs is deliberately limited to not doing that, because it's dangerous to mix I/O with file format parsing.
Posted Dec 11, 2023 11:35 UTC (Mon)
by swilmet (subscriber, #98424)
[Link] (7 responses)
But why not try a C-to-Rust transpiler? (random idea).
By keeping a small core language with the C syntax, and having a new standard library that looks like Rust but uses more function calls instead.
The transpiler would "take" the new stdlib as part of the language, for performance reasons, and translates the function calls to Rust idioms.
A source-to-source compiler is of course not ideal, but that's how C++ was created ("C with classes" was initially translated to C code).
Posted Dec 11, 2023 12:09 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (6 responses)
You might want to look at the C2Rust project; the issue is that a clean transpiler to Rust has to use unsafe liberally, since C constructs translate to something that can't be represented in purely Safe Rust.
The challenge then becomes adding something like lifetimes (so that you can translate pointers to Rust references instead of Rust raw pointers) without "bloating" C. I suspect that it's impossible to have a tiny core language without pushing many problems into the domain of "the programmer simply must not make any relevant mistakes"; note, though, that this is not bi-directional, since a language with a big core can still push many problems into that domain.
Posted Dec 12, 2023 10:32 UTC (Tue)
by swilmet (subscriber, #98424)
[Link] (5 responses)
But I had the idea to convert (a subset of) C to _safe_ Rust, of course. Instead of some Rust keywords, operators etc (the core language), have C functions instead.
Actually the GLib/GObject project is looking to have Rust-like way of handling things, see:
Anyway, that's an interesting topic for researchers. Then making it useful and consumable for real-world C projects is yet another task.
Posted Dec 12, 2023 10:43 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
The hard part is not the keywords and operators - it's the lifetime annotation system. Lifetimes are a check on what the programmer intended, so have to be possible to write as an annotation to pointer types in the C derived language, but then to be usable force you to have a generics system (since you want many things to be generic over a lifetime) with (at least) covariance and invariance possible to express.
And once you have a generics system that can express covariance and invariance for each item in a set of generic parameters, why wouldn't you allow that to be used for types as well as lifetimes? At which point, you have Rust traits and structs, and most of the complexity of Rust.
Posted Dec 12, 2023 11:34 UTC (Tue)
by mb (subscriber, #50428)
[Link] (3 responses)
That is not possible, except for very trivial cases.
The C code neither includes enough information (e.g. lifetimes) for that to work, nor is it usually structured in a way that would allow it.
Programming in Rust requires a different way of thinking and a different way of structuring your code. An automatic translation of the usual idiomatic C programs will fail so hard that it would be easier to rewrite it from scratch instead of translating it and then fixing the compile failures.
Posted Dec 13, 2023 23:59 UTC (Wed)
by swilmet (subscriber, #98424)
[Link] (2 responses)
I started to learn Rust but dislike the fact that it has many core features ("high-level ergonomics"). It's probably possible to use Rust in a simplistic way though, except maybe if a library forces to use the fancy features.
Posted Dec 14, 2023 9:37 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (1 responses)
You could avoid using those libraries, and limit yourself to libraries that have a "simple" enough interface for you (no_std libraries are a good thing to look for here, since they're designed with just core and maybe alloc in mind, not the whole of std) - bearing in mind that you don't need to care how those libraries are implemented if it's just about personal preference.
In general, though, I wouldn't be scared of a complex core language - all of that complexity has to be handled somewhere, and a complex core language can mean that complexity is being compiler-checked instead of human-checked.
Posted Dec 14, 2023 11:07 UTC (Thu)
by swilmet (subscriber, #98424)
[Link]
"Soft"ware, they said :-)
Posted Dec 10, 2023 12:03 UTC (Sun)
by Wol (subscriber, #4433)
[Link]
Have a variable type of "infer"? That way, an undeclared variable is still an error, but you can explicitly tell the compiler to decide for itself :-)
Cheers,
Posted Dec 10, 2023 19:59 UTC (Sun)
by geert (subscriber, #98403)
[Link]
Posted Dec 22, 2023 6:06 UTC (Fri)
by glandium (guest, #46059)
[Link]
Posted Dec 8, 2023 17:32 UTC (Fri)
by Hello71 (subscriber, #103412)
[Link] (2 responses)
Posted Dec 9, 2023 10:50 UTC (Sat)
by fw (subscriber, #26023)
[Link] (1 responses)
There must have been similar efforts going on for Homebrew and Macports and the various BSDs, to increase Clang compatibility, but compared to the Gentoo effort, I have seen fewer upstream contributions. Personally, I found that rather disappointing. I'm not aware of the Clang upstream project making similar assessments like we did for GCC regarding overall impact, and taking active steps to manage that. Or Apple when they switched Xcode to more errors well before upstream Clang, as I understand it.
The Clang change, along with Fedora's express desire to offer Clang as a fully supported compiler to package maintainers, certainly provided some justification for tackling these issues, and opened up even some limited additional resources (and every bit helps for this). But were it not for Gentoo's contributions, I think the practical impact of the earlier Clang change would have been pretty limited unfortunately.
Posted Dec 12, 2023 7:18 UTC (Tue)
by areilly (subscriber, #87829)
[Link]
Posted Dec 8, 2023 21:38 UTC (Fri)
by saladin (subscriber, #161355)
[Link] (105 responses)
I know the standard response is to either fix the software or stick with old compilers, but why? Current compilers work exactly as intended, already warn about the dangers of these constructs, and if Fedora wants to eliminate use of these constructs, then they can use -Werror or patch their build of GCC. Enabling new C language features should not have to break old C code; the compilers already treat different versions differently wrt. keywords and especially the 'auto' keyword in C23.
Also, these tricks are very fun to exploit when it comes to golfing.
Posted Dec 8, 2023 21:47 UTC (Fri)
by tshow (subscriber, #6411)
[Link] (1 responses)
Posted Dec 9, 2023 10:55 UTC (Sat)
by fw (subscriber, #26023)
[Link]
There are a couple of problematic projects out there which explicitly rely on C99 and later features which are not available as GNU extensions with
Posted Dec 8, 2023 21:54 UTC (Fri)
by tshow (subscriber, #6411)
[Link] (26 responses)
As for `auto`, sure, it used to have a different meaning ("not register"), but it was a useless keyword from what I could tell; everything was `auto` by default, and I don't believe you could do `auto register int i;` to cancel the `register` markup. Any existing code using it will be doing something like `auto i = 3;` which in old money implied `auto int i = 3;` and in new money will type infer to `int i = 3;`. I strongly suspect it would be hard to intentionally craft an example where the change in meaning of `auto` actually breaks, and if it's possible at all I'd guess it would involve some fairly hairy macro magic.
Posted Dec 9, 2023 3:26 UTC (Sat)
by zev (subscriber, #88455)
[Link] (25 responses)
Would something with a float literal be a simple example? Certainly not something one would normally expect to see, but with the mountains of legacy code I wouldn't be surprised to find such instances floating around...
I don't happen to have a compiler with support for the new (C23) meaning of 'auto' lying around, but if it behaves like it does in C++:
$ cat foo.c
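Something along these lines, perhaps (a hypothetical sketch, assuming a compiler that implements the C23 meaning of 'auto'):

#include <stdio.h>

int main(void)
{
    auto x = 1.5;
    /* Old implicit-int reading: 'auto int x = 1.5;', so x is an int
     * holding 1.  C23 (and C++) reading: the type is inferred from the
     * initializer, so x is a double holding 1.5. */
    printf("sizeof(x) = %zu, x = %f\n", sizeof(x), (double)x);
    return 0;
}

With type inference this prints a size of 8 and 1.500000 on typical systems; under the old rules it would print 4 and 1.000000.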
Posted Dec 9, 2023 4:23 UTC (Sat)
by tshow (subscriber, #6411)
[Link] (24 responses)
Posted Dec 9, 2023 14:16 UTC (Sat)
by willy (subscriber, #9762)
[Link] (23 responses)
Posted Dec 9, 2023 15:24 UTC (Sat)
by Wol (subscriber, #4433)
[Link] (2 responses)
Intel before the (was it) 486? Or more likely the 386. Which iirc we're talking early 90s. I'm sure I was using a load of 286 computers when I started that new job in 1989 ...
Cheers,
Posted Dec 11, 2023 10:23 UTC (Mon)
by taladar (subscriber, #68407)
[Link] (1 responses)
Posted Dec 11, 2023 11:02 UTC (Mon)
by mjg59 (subscriber, #23239)
[Link]
Posted Dec 9, 2023 15:31 UTC (Sat)
by Paf (subscriber, #91811)
[Link] (7 responses)
Posted Dec 9, 2023 15:35 UTC (Sat)
by willy (subscriber, #9762)
[Link] (6 responses)
Posted Dec 9, 2023 15:50 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
(That was the program(s) I set -W3 / -W4 on.)
Cheers,
Posted Dec 10, 2023 1:30 UTC (Sun)
by Paf (subscriber, #91811)
[Link] (1 responses)
The point is just that with and without hardware FP both existed, I guess.
Posted Dec 10, 2023 11:11 UTC (Sun)
by Wol (subscriber, #4433)
[Link]
Cheers,
Posted Dec 10, 2023 3:11 UTC (Sun)
by mjg59 (subscriber, #23239)
[Link] (2 responses)
Posted Dec 10, 2023 3:19 UTC (Sun)
by jake (editor, #205)
[Link] (1 responses)
hmm, i wrote the code for my 3D graphics grad school class in C on an Amiga 1000 in 1986 or 7 ... i suppose it is possible that it was all software floating-point, but i certainly did not encounter any problems in that regard ...
jake
Posted Dec 10, 2023 3:26 UTC (Sun)
by mjg59 (subscriber, #23239)
[Link]
Posted Dec 9, 2023 15:51 UTC (Sat)
by pizza (subscriber, #46)
[Link] (11 responses)
"Has been around" and "available on many models" is a _long_ way from "can assume it's generally/routinely available" especially in the 1970s and 1980s.
Indeed, it wasn't until the early 1990s that personal computers of any sort could be expected to have a built-in FPU (eg i486 in 1989, 68040 in 1990). ARM didn't have an _architectural_ FP spec until the v7 family (ie Cortex-A) which didn't launch until the early 2000s.
Even in the UNIX realm, SPARCv7 didn't have an architecturally defined FPU, and many different ones were bolted onto the side. SPARCv8 (~1990) formally added an architectural FPU [1], but it was still technically optional and many implementations (SPARCv8a) lacked all or part of the FPU instructions. DEC Alpha launched in 1992 with a built-in FPU, but its predecessors (ie VAX, PDP) didn't necessarily come with FP hardware either, as you yourself mentioned. MIPS defined an FP spec, but it was an external/optional component until the R4000/MIPS-III in 1991. Unusually, PA-RISC appears to have had an FPU in all of its implementations, which started arriving in 1988.
So, no, you couldn't generally rely on having an FP unit until the early 1990s, and even then you had to be using fairly new equipment. Prior to that, FPUs were an (expensive!) option that you only opted for if you needed one. Everyone else had to make do with (very slow) software-based FP emulation, or rewrite their algorithms to use fixed-point mathematics. The latter approach is _still_ used wherever possible when performance is critical.
Heck, even today, the overwhelming majority of the CPU cores shipped still lack any sort of FPU, and I'd bet good money most of those are running predominately C or C++ codebases. (Yes, I'm referring to microcontrollers...)
[1] Incidentally, SPARCv8 was the basis for the IEEE754 floating point specification.
Posted Dec 9, 2023 16:27 UTC (Sat)
by pizza (subscriber, #46)
[Link] (1 responses)
Correction -- Like so many other ARM things, they have a wide variety of floating point units that operated using different instructions; it wasn't until armv7 that you could expect/rely on a consistent FP baseline that worked the same way.
(The first ARM processor with the option of FP support was the ARM6 (armv3) in 1991)
Posted Dec 9, 2023 16:44 UTC (Sat)
by willy (subscriber, #9762)
[Link]
Posted Dec 9, 2023 16:37 UTC (Sat)
by willy (subscriber, #9762)
[Link] (8 responses)
"This standard was significantly based on a proposal from Intel, which was designing the i8087 numerical coprocessor; Motorola, which was designing the 68000 around the same time, gave significant input as well."
And yes, I'm aware that personal computers didn't have much hardware FP available, but my contention is that there wasn't much C being written on PCs of that era.
Also, I don't think an "architectural spec" is particularly meaningful. I was active in the ARM scene and I remember the Weitek coprocessor, the FPA10, FPA11 and the infamous mixed endian FP format. People used floating point with or without hardware, and with or without an architectural spec.
Posted Dec 9, 2023 17:15 UTC (Sat)
by Wol (subscriber, #4433)
[Link] (1 responses)
Well, I can think of at least one major program from that era ... the linux kernel ... (which was originally written for one of the early 386's, no?)
Cheers,
Posted Dec 9, 2023 21:34 UTC (Sat)
by mathstuf (subscriber, #69389)
[Link]
Posted Dec 9, 2023 19:08 UTC (Sat)
by pizza (subscriber, #46)
[Link] (5 responses)
During the 70s and early 80s, sure, not a lot of C on "personal" (ie non-UNIX) computers. but by the late 80s, that had changed.
Lattice C was released for DOS in 1982. Microsoft repackaged it as Microsoft C 1.0 in 1983. Borland released Turbo C in 1987, Watcom C was released in 1988 (and was the overwhelming favorite for game developers), and GCC's first releases also landed in 1987.
While the 8087 FPU has been part of the x86 family since its introduction in the late 70s, it was an expensive option, and as a result very little software was written that could directly take advantage of it. That had nothing to do with the choice of programming language.
Posted Dec 9, 2023 20:38 UTC (Sat)
by willy (subscriber, #9762)
[Link] (4 responses)
https://beebwiki.mdfs.net/Floating_Point_number_format
If you're from a games background then the program is never fast enough ;-)
As an aside, I think the fact that Unix doesn't use floating point is quite crippling. If the sleep() syscall took a floating point argument, it would have meant we didn't need to add msleep(), usleep() (and I guess soon nsleep()). The various timespec formats would still need to exist (because you can't lose precision just because a file was created more than 2^24 seconds after the epoch), but _relative_ time can usually be expressed as a float. Indeed, Linux will round the sleep() argument -- see https://lwn.net/Articles/369549/
Posted Dec 9, 2023 21:52 UTC (Sat)
by dskoll (subscriber, #1630)
[Link] (2 responses)
nanosleep has existed for quite some time, so no need for an nsleep.
I don't really see a need for supporting floating point in UNIX system calls like sleep. Seems like overkill to me.
difftime returns the difference between two time_t objects as a double. But seeing as time_t in UNIX has only one-second resolution, that seems a bit silly to me, unless it's to prevent overflow if you subtract a very large negative time from a very large positive time.
Posted Dec 9, 2023 21:58 UTC (Sat)
by willy (subscriber, #9762)
[Link] (1 responses)
https://www.infradead.org/~willy/linux/scan.c
and think how much more painful it would be to use some fixed point format (like, I don't know, a timespec)
I'm sure I could use a single precision float for this purpose, but that would definitely stray into the realm of premature optimization.
Posted Dec 9, 2023 23:09 UTC (Sat)
by dskoll (subscriber, #1630)
[Link]
Sure, yes, timespec has nanosecond precision. difftime takes arguments with only one-second precision.
Posted Dec 10, 2023 19:40 UTC (Sun)
by smoogen (subscriber, #97)
[Link]
Posted Dec 8, 2023 21:59 UTC (Fri)
by ErikF (subscriber, #118131)
[Link] (5 responses)
And I agree that K&R C is a wonderful golfing language.
Posted Dec 9, 2023 14:33 UTC (Sat)
by Karellen (subscriber, #67644)
[Link]
Huh. For some reason I always thought it did. But looking back at at a random selection of manuals, even going back to GCC 2.95, there's no such value. Maybe I just got confused about C89/C90 still permitting K&R-style syntax.
Posted Dec 10, 2023 18:42 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link] (3 responses)
Posted Dec 10, 2023 19:49 UTC (Sun)
by smoogen (subscriber, #97)
[Link] (2 responses)
Posted Dec 10, 2023 20:36 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
What I'm really getting at is, what is K&R C, from the compiler author's perspective? How do you know if a given compiler is a "valid" implementation of K&R? How do you know what optimizations are permitted? If the answer is "no optimizations are allowed, because there's no C abstract machine yet, so everything must be translated 1:1," then how do you decide what constitutes "1:1" output? Even the modern C standard does not define such a notion, and users would probably like to have some optimizations anyway.
I don't think there's any sensible answer to those questions that doesn't ultimately look like C89 with a bunch of -f options to enable K&R constructs, which is why there's no -std=kr option. The second edition explicitly acknowledges this limitation in the preface, and directs compiler authors to the C standard.
Posted Dec 11, 2023 16:03 UTC (Mon)
by smoogen (subscriber, #97)
[Link]
Posted Dec 8, 2023 22:12 UTC (Fri)
by mb (subscriber, #50428)
[Link] (69 responses)
Well.
If you want this legacy code to keep compiling, please use a 30-year-old compiler.
I am all for backwards compatibility.
But there are limits.
Also: https://xkcd.com/1172/
Posted Dec 9, 2023 7:09 UTC (Sat)
by rrolls (subscriber, #151126)
[Link] (35 responses)
I'll just go ahead and disagree with this.
Compilers should feel free to add as many new features as they like, including warnings/errors for usage of antiquated misfeatures. However, if compiler P turns source code Q into executable R when told to use some particular standard or version S, **it should continue to do so for all eternity**. Assuming Q is actually valid according to standard S, of course.
C compilers have been really good at this; a lot of other languages' compilers have not. This is one of the many reasons why, despite objectively not being a very good language these days compared to many others, C continues to be popular.
It's not just programming languages, either: you should **always** be able to use some program to turn some unchanged input into the same output, even if you have to update the program for one reason or another. TeX froze its design in 1990 for this exact reason, and it's still popular.
Imagine if when Python 3 came out, you had to write something like # python-standard: 3.0 at the top of every file to opt into all the new behavior. That would have avoided the entire Python 3 debacle. (And to be clear, I'm not at all a Python 2 stalwart; I'm a fan of modern Python. It's just the perfect example to use.)
I should not have to install a bunch of different versions of a compiler - all probably with their own conflicting system dependencies - just to compile a bunch of different programs from different eras, that aren't broken and don't need updating.
Heck, a few months ago I had a reason to use Python 2 to get some old Python 2 code running - and despite Python 2 not having been updated in years, it was incredibly simple to download the source code of Python 2.7.18, compile it and use it - because it was written in plain old C that compiles the same today, in gcc 12, as when it was written.
The "everything must be updated all the time because reasons, and it's fine for stuff to stop working once it's not been touched for even 2 years" concept is a modern concept, and not a very good one IMO.
Posted Dec 9, 2023 8:21 UTC (Sat)
by pm215 (subscriber, #98099)
[Link] (3 responses)
Posted Dec 12, 2023 13:33 UTC (Tue)
by rrolls (subscriber, #151126)
[Link] (2 responses)
However:
Did any actual C standard parse it as the latter?
If yes, then my point stands, meaning that if someone invokes a C compiler and explicitly tells it to use whatever old C standard that was, then that's the way it should be parsed. Does this cause a security issue? No! Because you have to explicitly specify that.
If no, then the point is moot, because if it's not specified behavior then I'd say the compiler is free to change its behavior as it pleases.
Posted Dec 12, 2023 14:08 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
This now comes down to the definition of "actual C standard". No ISO C standard parsed it as the latter, but some K&R versions did. However, K&R wasn't a formal standard - it was more or less defined in terms of what the K&R compiler did.
Posted Dec 12, 2023 14:13 UTC (Tue)
by excors (subscriber, #95769)
[Link]
According to https://www.bell-labs.com/usr/dmr/www/chist.html :
> B introduced generalized assignment operators, using x=+y to add y to x. The notation came from Algol 68 [Wijngaarden 75] via McIlroy, who had incorporated it into his version of TMG. (In B and early C, the operator was spelled =+ instead of += ; this mistake, repaired in 1976, was induced by a seductively easy way of handling the first form in B's lexical analyzer.)
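To sketch the ambiguity being discussed (hypothetical code; the old '=-' spelling is the one repaired in 1976):

#include <stdio.h>

int main(void)
{
    int x = 5;
    x=-1;                /* B/early C could read this as 'x =- 1', i.e. x becomes 4;
                          * every ISO C compiler reads it as 'x = -1' */
    printf("%d\n", x);   /* prints -1 today */
    return 0;
}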
Posted Dec 9, 2023 8:24 UTC (Sat)
by matthias (subscriber, #94967)
[Link] (2 responses)
Unfortunately this stability does not hold for all the LaTeX packages that are used today. Output significantly changes from version to version. And who is still using plain TeX today?
In fact, the differences are usually small. But TeX wonderfully demonstrates the butterfly effect. Even the tiniest change in spacing can have huge changes several pages further down the document.
Posted Dec 11, 2023 8:22 UTC (Mon)
by jengelh (guest, #33263)
[Link]
So basically how MSWord operated all those decades... :-p
Posted Dec 12, 2023 13:35 UTC (Tue)
by rrolls (subscriber, #151126)
[Link]
Posted Dec 9, 2023 8:46 UTC (Sat)
by marcH (subscriber, #57642)
[Link] (1 responses)
This is also why C is the least safe language in the world and the surest way to get hacked. Not the only way of course but the most likely by far.
I totally agree that software shouldn't be automatically bad just because it's old. But old, unmaintained C code is just bad and dangerous. This was basically the main point of the article but you seemed to have missed it.
Also, you seem to dismiss backwards compatibility in other languages a bit quickly.
Posted Dec 12, 2023 13:48 UTC (Tue)
by rrolls (subscriber, #151126)
[Link]
The work mentioned in the article is good work! Opting in to new checks that disallow bad patterns, and then updating code to fix all the errors, is almost always a good thing to do.
My comment wasn't responding to the article. My comment was responding to the statement "If you want this legacy code keep compiling, please use a 30 years old compiler."
> Also, you seem to dismiss backwards compatibility in other languages a bit quickly.
"Backwards compatibility" these days usually tends to mean "we'll make your code spam log files with warnings for a year and then stop working altogether until you fix it".
Real backwards compatibility means that the intended and documented behavior of an old version of something will be kept (at least, once any necessary opt-ins have been performed), even if it's deemed to have some flaws.
I'm not calling for _exact_ behavior, such as bugs, to be retained - just behavior that is documented and intended (or at least was intended at the time it was documented).
Posted Dec 9, 2023 8:51 UTC (Sat)
by mb (subscriber, #50428)
[Link] (18 responses)
I disagree. It should depend on how hard it is to fix the old code.
In the cases we are talking about it's actually trivial to make your old code compile again with a modern changed compiler.
Yes, I do understand that there are many packages and programs doing this, so over all this is a big amount of work. But it is trivial work. And it can easily be parallelized.
Posted Dec 9, 2023 14:20 UTC (Sat)
by willy (subscriber, #9762)
[Link] (17 responses)
Posted Dec 9, 2023 14:26 UTC (Sat)
by mb (subscriber, #50428)
[Link] (15 responses)
That is nontrivial for soooo many more reasons.
Posted Dec 11, 2023 14:51 UTC (Mon)
by wahern (subscriber, #37304)
[Link] (14 responses)
A year ago or so I was pleasantly surprised when I attempted to compile the latest release of lrzsz, 0.12.20, from 1998 (https://www.ohse.de/uwe/software/lrzsz.html), and it almost compiled out-of-the-box[1] in a modern, stock macOS environment. It only took a few minutes to identify and fix the bitrot. Most of the handful of issues were missing header includes or header includes gated behind a bad feature test. Once it compiled to completion the remaining issues caught by the compiler were related to the 64-bit transition: some wrong printf specifiers and conflation of socklen_t with size_t. There may be other bugs, but it seemed to work once it compiled cleanly.
Also interesting (but less surprising) was how well the ./configure script held up, which was generated by autoconf 2.12 from 1996.
[1] Several were fixable with build flags: CFLAGS="-Wno-format-security" CPPFLAGS="-DSTDC_HEADERS" ./configure
Posted Dec 11, 2023 16:15 UTC (Mon)
by pizza (subscriber, #46)
[Link]
I like the way you expressed this, and find myself in agreement.
Posted Dec 11, 2023 16:59 UTC (Mon)
by mb (subscriber, #50428)
[Link] (1 responses)
You are actually just saying that it didn't compile.
Removing the features this article is about from the compiler will not really worsen the situation by any meaningful amount.
Posted Dec 11, 2023 19:20 UTC (Mon)
by wahern (subscriber, #37304)
[Link]
The places where the code falters largely relate to 1) its support for pre-ANSI C library interfaces, 2) a newer hardware architecture exposing non-standard code, and 3) non-POSIX extension APIs (e.g. gettext). I think this says something positive regarding the value of standards and backward compatibility.
Notably, some of the code does use K&R parameter lists. (At least the getopt_long compat implementation does, but on OpenBSD it was properly excluded from the build.) I'm not advocating for continued support for K&R, just pushing back against the notion that old code, and support for old code, has little value. 20 years isn't even that long in the grand scheme of things, especially in the context of a systems language like C.
Posted Dec 11, 2023 19:26 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (10 responses)
So making it harder to compile is a feature, not a bug.
Posted Dec 11, 2023 19:52 UTC (Mon)
by pizza (subscriber, #46)
[Link] (9 responses)
That's a good way to ensure your code never gets used by anyone other than yourself. And yourself too, I might add.
(In which case, why bother publicly publishing anything at all?)
Posted Dec 11, 2023 19:58 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (8 responses)
That's not such a bad outcome. Old and insecure code should not be used for new projects.
An analogy: I love our railroad park, I'm helping to restore an old steam engine. We are even planning to run it through on a public railroad some time next year. It's fun! But I for sure don't want these engines running nearby every day, they don't have any kind of emissions control, they have terrible efficiency, and they're just plain dangerous.
> (In which case, why bother publicly publishing anything at all?)
Mostly for historical/archival purposes.
Posted Dec 11, 2023 20:04 UTC (Mon)
by pizza (subscriber, #46)
[Link] (7 responses)
Newer == automatically better, got it.
Posted Dec 11, 2023 20:36 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Like pretty much everything else in engineering. The old code might be more compact and less resource-intensive, but it will almost 100% be less safe and robust.
Posted Dec 11, 2023 20:52 UTC (Mon)
by pizza (subscriber, #46)
[Link] (5 responses)
I'm sorry, that was snarky and not what I meant to convey.
There's plenty of "newer" software that's grossly insecure or otherwise lacking in some way versus something that's been around a while. It usually takes a while to stabilize something into a generally usable form.
Meanwhile, when lrzsz was first published, it wasn't "old obsolete software" that should not be used for new projects. It was brand-new software, intended to be used by contemporary users. Saying that it shouldn't have been published, or should have been made more difficult to compile so as to discourage folks from using it, rather defeats the entire point of releasing it to begin with. And where would any of the F/OSS world be if that attitude was the norm?
What should this magic memory-hole/de-publishing cutoff point be? Months? Years? Decades?
One can't know in advance how long something will be actively developed or maintained. One can't know in advance how diligent users/integrators will be in ensuring stuff is kept up-to-date [1]. Meanwhile, its degree of maintenance tells you very little about its overall quality or suitability for a given purpose.
[1] Well, I suppose history shows us that the answer is "barely to never". How many routers are deployed running old, vulnerable versions of dnsmasq? How many are still shipping with these long-since-fixed vulnerabilities?
Posted Dec 11, 2023 21:21 UTC (Mon)
by mb (subscriber, #50428)
[Link]
Posted Dec 11, 2023 23:21 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
lrzsz had been maintained, up until some point in the past. And for software that is being maintained, migrating to new toolchains every now and then is not a huge burden. So in a hypothetical world where we are still using modems, lrzsz would be rewritten in hardened C (like ssh). But that's not what happened; at some point lrzsz lost its users and was abandoned. So this example is a success story of the full software lifecycle.
And that's why I'm fine with making lrzsz more complicated to compile. It hasn't been maintained for two decades, and using it as-is in production in current conditions is bordering on irresponsible, so it has to be thoroughly audited and fixed anyway.
> Meanwhile, its degree of maintenance tells you very little about its overall quality or suitability for a given purpose.
Honestly? It usually does. Unless you're looking into an area that naturally disappeared.
Posted Dec 12, 2023 9:42 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (2 responses)
Your comment about old, vulnerable versions of dnsmasq skirts the edge of the underlying problem; we treat software as an undecaying asset that can be built again and again from the source code and forever treated as "perfect", when it's closer in behaviour to the plans for a building. Nobody sensible takes plans from 5 years ago and expects them to be built unchanged with today's tools, techniques and materials; there's a constant shifting of requirements that means that we change things on the fly.
For example, I had recent construction work done; the plans were under 12 months old at the point I handed them over to the builder who did the work, and yet several changes were needed as we went along because the plans made assumptions that no longer held - the specific fire resistant material that had been specified was no longer available, and had to be substituted, the assumptions about what they'd find when digging for foundations turned out to be wrong (so we could use a smaller foundation), the locations of utilities as documented turned out to be wrong (including some that would have affected our neighbours if our builders hadn't changed what they did to match reality instead of the plans), and the price of insulating materials had changed so that we were able to change the plans to get much better insulation for slightly more money.
And this is closer to the reality of software; the source code is the plans, and (unlike construction), the build is done by automation, which means that whenever the plans turn out to be outdated, they need updating to keep up with modern requirements; at least with construction, when the plans turn out to be outdated, the workers turning plans into a building are capable of changing the plan on-the-fly to reflect reality on the ground, whereas a compiler simply isn't capable of doing that.
Posted Dec 12, 2023 21:22 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Or, if you've been relying on UB, the compiler turns out to be extra-capable of "changing the plan on-the-fly". But this is also because compilers take the source code as gospel and its "reality" is just some abstract fantasy machine.
And compilers do communicate through warnings and diagnostics all the time, but we're all too willing to ignore them at times.
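A tiny, made-up illustration of that "extra capability" (mine, not from any package discussed here): the function below is meant to detect overflow, but because signed overflow is undefined behaviour the optimizer is entitled to assume it never happens.
int will_overflow(int x)
{
    /* Intended as "does x + 1 wrap around?"; since signed overflow is UB,
       an optimizing compiler may legitimately fold this to "return 0;". */
    return x + 1 < x;
}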
Posted Dec 12, 2023 21:27 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
Oh, the compiler can change the plan all right - it just can't do so to reflect the reality on the ground (since that requires intelligence spotting that the programmer "meant" this, but wrote that), but instead to reflect its understanding of what you wrote, even if that's not what you intended to write.
Posted Dec 14, 2023 23:11 UTC (Thu)
by fw (subscriber, #26023)
[Link]
Posted Dec 9, 2023 17:22 UTC (Sat)
by smoogen (subscriber, #97)
[Link]
Having a single compiler which can compile all the backward-compatible code that is available might be seen as a 'fairly' modern concept as well. I had to maintain at least 4 different compiler sets for each of the Unix systems I maintained 25 years ago. For the previous 10 years, it had been quite common for particular C code to only compile with one specific compiler from Sun or HP, etc. Most of the time this was due to a 'compiler-defined side-effect' which the code needed (aka you could compile it with a different compiler from Sun, but the code might produce different results in certain runs). And for many languages you might end up having compilers which only compiled one specific version, aka a Fortran66, Fortran77, C77 or C90 (and the various Snobol, etc. ones). [While this was 25 years ago, I know many science, automotive and aeronautical systems tend to have to keep multiple versions of the same compiler because it is expected that some code relied on 'side-effects']
It was usually the gcc compiler set which could compile the different code bases in a backward-compatible way (within limits). You still might end up with code which acted differently between gcc-2.x and gcc-2.(x+1), but that was mainly due to one of the various 'vendor-defined' or similar areas where what the compiler and the hardware did could change the result. These sorts of code issues usually ended up with long emails between whatever scientist had coded the original to match a specific behaviour and the compiler developers pointing out that the standard could be interpreted in that region in any way they felt. [Or in some cases, any way the underlying hardware firmware decided... ]
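For what it's worth, here is a minimal made-up example of the kind of construct whose result could legitimately differ from compiler to compiler (nothing here is taken from the code bases being described): the evaluation order of the two operands of '+' is unspecified, so one conforming compiler may print "f g 3" and another "g f 3".
#include <stdio.h>
static int f(void) { printf("f "); return 1; }
static int g(void) { printf("g "); return 2; }
int main(void)
{
    /* Unspecified evaluation order: either call may happen first. */
    int sum = f() + g();
    printf("%d\n", sum);
    return 0;
}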
Posted Dec 11, 2023 11:57 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (5 responses)
The problem we have, however, is that lots of legacy code Q is not valid according to any version of standard S, but an old version of compiler P happened to turn it into executable R. We wouldn't be having this discussion if we didn't have lots of legacy code that fell into the pit of "the program is ill-formed; no diagnostic is required" from various standards, where the program is not valid according to standard S, but a compiler does not have to tell you that you wrote an invalid program and can produce an executable that may (or may not) do what you expect program Q to do.
Further, I disagree mildly with your premise - if Q is valid according to standard S, then there are many Rs that have the same behaviour on the real machine as program Q does on standard S's abstract machine. If, to choose an example, compiler P put a NOP in every delay slot as the easiest way to compile program Q to executable R, then I'd still like compiler P to be permitted to rearrange such that all delay slots do useful work if I compile program Q again. Unless, of course, by "compiler P", you mean a fixed version and set of flags to a compiler, and not later versions of compiler P, or changes to the compiler flags (e.g. switching from -O0 to -Os or -O3), in which case most compilers meet this requirement simply because they're deterministic.
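To make the "invalid, but no diagnostic required" category concrete, here is a small made-up two-file example (my own, not from any package mentioned in the article); the files disagree about f's type, which makes the call undefined behaviour, yet neither the compiler nor the linker has to say anything (building with -flto -Wlto-type-mismatch, mentioned elsewhere in this thread, usually will catch it):
/* file1.c */
int f(int x) { return 2 * x; }
/* file2.c: declares f with an incompatible type; calling it through this
   declaration is undefined behaviour, and no diagnostic is required because
   the mismatch is only visible across translation units. */
long f(long x);
long g(void) { return f(100000L); }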
Posted Dec 12, 2023 14:18 UTC (Tue)
by rrolls (subscriber, #151126)
[Link] (4 responses)
I'm not concerned about legacy code that wasn't valid in the first place. If someone's relying on a behavior that's undocumented, or indeed where the documentation says it shouldn't be relied on, there's no need to keep that behavior at all.
> if Q is valid according to standard S, then there are many Rs that have the same behaviour on the real machine as program Q does on standard S's abstract machine. [...] I'd still like compiler P to be permitted to rearrange such that all delay slots do useful work if I compile program Q again.
OK, I wasn't quite specific enough on this note. It had occurred to me that someone might make the optimisation argument but I didn't want to get lost in the details of that. :) What I meant by "executable R" was "an executable that does what it's supposed to according to the requested standard and any relevant compiler flags that affect behavior". So if some compiler provides a way to explicitly ask for a deterministic build then yes, it should always generate the exact same output bit-for-bit, if the same input is given, no matter how many updates to the compiler there have been. However, that's if you explicitly request a deterministic build. If you just specify certain compile flags that permit a range of possible behaviors, then of course the output should be allowed to vary within the permitted behaviors, such as allowing new optimisations.
My key point was that language maintainers should make provisions for people to be able to compile old code with new compiler versions without undue effort, so that you can use a single compiler version (preferably installed systemwide) to compile a wide range of code, rather than having to have lots of different compiler versions installed. The corollary is that any given source code would work on a wide range of compiler versions. (And I would call on library maintainers to do similar, so that projects that depend on them will work on a wide range of library versions.)
Posted Dec 12, 2023 14:56 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (3 responses)
But the issue here is entirely with code that wasn't valid in the first place - legacy code is full of places where people rely on behaviours that are undocumented, or documented as not to be relied upon, but where the compiler that was used at the time did what the programmer expected that construct to do, while a modern compiler does not.
If we were only dealing with code that was perfectly valid according to a documented standard, there'd be a lot less noise on this topic. But we're not - we're dealing with legacy code that's never been valid according to the documented form of the language, but where many previous implementations of the language did what the programmer expected anyway. It's just that current compilers now do something different, thus "breaking" the code (since they're still matching what's documented, just not the same interpretation as the previous compiler and the original programmer).
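The textbook case (my example, not one farnz cited) is calling malloc() without a declaration in scope:
/* Legacy style: no #include <stdlib.h>, so old compilers implicitly declared
   malloc as returning int. */
char *make_buffer(void)
{
    /* On 32-bit systems this happened to work; on 64-bit systems the returned
       pointer can be truncated through the implicit int, and the crash shows up
       far from the actual bug.  GCC 14 turns both the implicit declaration and
       the int-to-pointer conversion into hard errors rather than warnings. */
    return malloc(100);
}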
Posted Dec 13, 2023 8:22 UTC (Wed)
by rrolls (subscriber, #151126)
[Link] (2 responses)
Posted Dec 13, 2023 11:02 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (1 responses)
The underlying argument is about what defines "totally valid code"; one group (which I think you're part of) says that "totally valid code" is defined solely by reference to some form of standard, and another group that extends that definition to include custom-and-practice.
The second group argue that C compilers don't see it as a problem for "totally valid C" to break in a couple of years time, because they've got a construct that's been interpreted in a specific way by every compiler they've used in the last 50 years, up until the latest GCC or Clang version. This is mostly about a difference in definition; if you define "totally valid C" as "my compilers from 1995 to 2020 accepted this construct without diagnostics and interpreted it consistently", then you're going to view a 2023 compiler interpreting that code differently (with or without a diagnostic) as "new compiler can't handle totally valid code". Whereas the first group would say "this code was never valid, because C90 says it's not valid, but no diagnostic is required, and thus it's on you".
Posted Dec 13, 2023 12:36 UTC (Wed)
by Wol (subscriber, #4433)
[Link]
The problem then comes when the standard declaring the code invalid (C90) postdates the code itself ... :-) (Like a lot of the C code I worked on)
Cheers,
Posted Dec 9, 2023 11:52 UTC (Sat)
by ballombe (subscriber, #9523)
[Link] (32 responses)
Posted Dec 9, 2023 12:04 UTC (Sat)
by mb (subscriber, #50428)
[Link] (31 responses)
That's not true. There are good reasons.
Posted Dec 9, 2023 13:40 UTC (Sat)
by pizza (subscriber, #46)
[Link] (30 responses)
that should be "not _enough_ good reasons"
This scenario is describing a very real problem that folks with existing codebases have to deal with.
Posted Dec 9, 2023 14:23 UTC (Sat)
by mb (subscriber, #50428)
[Link] (29 responses)
Can you name a project that
- was under active development in the last 5 years so that to-be-bisected bugs have been introduced and
- uses these legacy C features?
I think that projects only fall into three categories:
1) They are unmaintained since decades. No need to bisect.
2) They are maintained and already build with -Wall so that these legacy problems don't exist.
3) A very small group of projects that have very poor code quality and are actively maintained.
No "very real" problem for 1) and 2).
3) should not be used anyway. The fix is to rewrite them.
Posted Dec 9, 2023 15:10 UTC (Sat)
by makendo (guest, #168314)
[Link] (2 responses)
NetHack is known to use legacy function definitions as late as 2021:
boolean
is_edible(obj)
register struct obj *obj;
{
/* ... */
}
The next revision of C is deprecating legacy function definitions. The development branch has since switched to modern function definitions, but the switch wasn't backported to the 3.6.x releases and Gentoo maintainers have forced -std=gnu89 as a result.
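For comparison, the prototype-style form that the development branch presumably switched to would look roughly like this (my sketch, not NetHack's actual code):
boolean
is_edible(struct obj *obj)
{
    /* ... */
}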
Posted Dec 9, 2023 15:30 UTC (Sat)
by fw (subscriber, #26023)
[Link]
Posted Dec 10, 2023 18:50 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link]
Posted Dec 9, 2023 16:10 UTC (Sat)
by pizza (subscriber, #46)
[Link] (23 responses)
I can't name something that uses these specific legacy C features, but I help maintain one project [1] that is extremely sensitive to the toolchain used [2], making bisecting quite challenging when you have to cross a toolchain boundary and the old toolchain can't even be compiled on more modern systems.
> 1) They are unmaintained since decades. No need to bisect.
You make the same mistake as so many others by equating "unmaintained" with "unused" -- Disabusing folks of this notion is the entire point of this article.
[1] Rockbox, replacement firmware for a wide variety of MP3 players. Currently supporting a couple dozen platforms representing four major CPU architectures. It runs bare-metal, under SDL, and as a native Linux application that has to run on both ancient and modern userspaces.
[2] To the point we supply our own that needs to be built from sources
Posted Dec 9, 2023 16:26 UTC (Sat)
by mb (subscriber, #50428)
[Link] (22 responses)
No, I didn't say that.
I said that if there are no changes then there is no need to bisect.
Posted Dec 9, 2023 16:31 UTC (Sat)
by pizza (subscriber, #46)
[Link] (21 responses)
So you fix the failures in your tree so you can continue building it with modern toolchains, but when you need to go back and bisect *your own code*, this vendored code no longer builds, forcing you to backport those changes at each bisection step.
This sort of thing can be _very_ common.
Posted Dec 9, 2023 16:42 UTC (Sat)
by mb (subscriber, #50428)
[Link] (20 responses)
Vendoring unmaintained code, or even just depending on it, is a ticking time bomb for so many more reasons.
Such old code will often blow up in your face when compiled with modern optimizing compilers. Regardless of the proposed changes from the article.
In fact, I would actually *prefer* the build breakage over a subtle "miscompilation" due to decades-old code not playing by the rules of the C machine model or having implicit types and declarations.
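One made-up example of the kind of subtle "miscompilation" being referred to: under the strict-aliasing rules that modern optimizers rely on, the compiler may assume that the two pointers below never refer to the same object.
int update(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;
    /* The compiler may cache *ip and return 1 here, even if a legacy caller
       passed the same address for both parameters (which is undefined behaviour). */
    return *ip;
}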
Posted Dec 9, 2023 18:45 UTC (Sat)
by pizza (subscriber, #46)
[Link] (17 responses)
Why? It had been working just fine.
I'm using in production a bit of software that literally hasn't been updated in nearly three decades. Replacing it with anything else would require a nontrivial amount of effort, for no measurable gain.
(I had to do a little bit of work to make it compile on 64-bit targets but that's been the extent of its maintenance in the past 20 years)
Posted Dec 9, 2023 19:02 UTC (Sat)
by mb (subscriber, #50428)
[Link] (16 responses)
To avoid building up more and more technical debt and to prevent it from exploding.
>It had been working just fine.
Until it exploded the bomb was just fine.
Posted Dec 9, 2023 19:13 UTC (Sat)
by pizza (subscriber, #46)
[Link] (15 responses)
Are you volunteering to pay me to do this work?
Or is this just yet another example of someone demanding that I perform unpaid work on their behalf?
Posted Dec 9, 2023 19:43 UTC (Sat)
by mb (subscriber, #50428)
[Link] (14 responses)
Nope. It's your project.
Feel free to keep piling up as much technical debt as you like.
It's your decision.
> Or is this just yet another example of someone demanding that I perform unpaid work on their behalf?
No. Not at all. I am not demanding anything.
But please don't complain, if it explodes.
Posted Dec 9, 2023 22:56 UTC (Sat)
by pizza (subscriber, #46)
[Link] (13 responses)
What you call "piling up technical debt" everyone else calls "priorities"
> But please don't complain, if it explodes.
Um, I'm not. Once again, you presume something not in evidence.
None of this stuff "explodes" on its own. Indeed, it works just fine in the environments it's been used in for (as you put it) "decades". However, the article, and this discussion, is about how a _new_ environment has come along that causes (usually trivially-fixed) compilation failures in code that hasn't needed significant maintenance for "decades". How is that property not a _good_ thing? What is this modern fascination with constantly reinventing the wheel just to stay in place?
Posted Dec 10, 2023 0:59 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (12 responses)
While that fascination is real, it's absolutely not what this article and discussion is about. You're angry and not listening.
Posted Dec 10, 2023 2:39 UTC (Sun)
by pizza (subscriber, #46)
[Link] (11 responses)
Seriously?
I'm being scolded for using ancient software that does exactly what I need it to do, solely because it's just a matter of time before it "explodes" causing me all manners of problems. Instead, I should switch to something actively developed.
That presumes that there is (1) an alternative with the necessary functionality, and (2) a transition cost that is low to nonexistent. It also exaggerates the scope and effect of the actual problem (i.e., a compile-time problem that is, most of the time, pretty trivial to resolve).
I've also pointed out, multiple times, that this article shows that "not actively developed" does not mean "not actively used", and the right-now cost of incrementally fixing this old software is far less than replacing it entirely.
(Anecdotally, those calling for wholesale replacements/rewrites/etc or otherwise telling F/OSS authors/maintainers/distributors/users/etc what they "should" be doing never seem to be the ones doing the actual work or helping cover its cost. I won't apologize for calling out that abusive, entitled behaviour)
Posted Dec 10, 2023 5:17 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (1 responses)
Posted Dec 10, 2023 15:43 UTC (Sun)
by pizza (subscriber, #46)
[Link]
....You injected yourself into the tail end of a sub-thread that was about just that.
(Meanwhile, I agree that the article _wasn't_ about that, a point I've repeatedly made)
Posted Dec 10, 2023 15:58 UTC (Sun)
by mb (subscriber, #50428)
[Link] (8 responses)
That's not true. It's your choice and I respect that choice.
>for using ancient software that does exactly what I need it to do, solely because it's just a matter of time before it "explodes" causing me all manners of problems.
Yes. It's called bit-rot.
>Instead, I should switch to something actively developed.
That is *one* of the possible options that have been pointed out here.
>"not actively developed" does not mean "not actively used"
Yes. But nobody claimed that it would mean that.
>telling F/OSS authors/maintainers/distributors/users/etc what they "should" be doing
Nobody is telling you what you should do. That's a misinterpretation on your side.
But I'm not fine with it if you want to prevent certain developments of the C language itself just to keep your ancient and trivially fixable code working.
Posted Dec 10, 2023 18:03 UTC (Sun)
by pizza (subscriber, #46)
[Link] (6 responses)
I'm willing to bet that, on a daily basis, you trust your physical safety (if not your life) to "unmaintained software".
For example, the _newer_ of my two vehicles was manufactured 22 years ago. Any support/warranty/part supply/recall obligations its manufacturer had completely ceased seven years ago. If the software running in its ECU, ABS, and safety/airbag modules doesn't qualify as "unmaintained" then nothing else possibly could.
Meanwhile, *every* computer I have Linux installed upon is running completely unmaintained firmware -- The newest one fell out of support about a year ago. Does this mean I should just scrap the lot?
My point? "unmaintained" doesn't mean that it's automatically bad, untrustable, or incapable of fulfilling its purpose. Secondly, "maintained" in and of itself tells you very little. Indeed, the Fedora folks' efforts with these old packages are themselves a form of maintenance!
Going back a few posts, the "unmaintained production" software I mentioned earlier that you chided me for relying upon? It's a glorified data logger in a closed environment. It's been in production for approximately two decades, and it's "unmaintained" because *it hasn't needed any maintenance* in the past three years. It does what's needed, so what is to be gained by messing with it? What exactly is supposed to "explode" in this context? This particular bit of ancient software is actually the most reliable portion of the entire system!
Posted Dec 10, 2023 18:45 UTC (Sun)
by mb (subscriber, #50428)
[Link] (1 responses)
Well, yes. I know. In my day job I write this software.
Probably the majority of the software in such a thing is "frozen". It will not be developed any further to add new features.
>Meanwhile, *every* computer I have Linux installed upon is running completely unmaintained firmware
Nope. You completely missed my point again.
I am not at all talking about binary firmware sitting in devices. It's completely fine to keep using the same binary firmware for an infinite amount of time. It will not become worse with time.
I am talking about the legacy source code that is used in new compilations with modern compilers today.
> What exactly is supposed to "explode" in this context?
If you try to recompile it with a modern compiler many things will happen.
Posted Dec 10, 2023 19:35 UTC (Sun)
by marcH (subscriber, #57642)
[Link]
Only if it's really "airtight" (cause new attack techniques appear constantly) and use cases never ever change.
Even in such a case the company will likely want to re-use and evolve that source in some newer product. Then as you wrote, the binary is fine but the source is not.
"Zero maintenance" software can exist for sure but in many cases people who wish they don't have to pay for maintenance are just burying their head in the sand not to see technical debt.
Software maintenance has absolutely nothing to do with the fascination for shiny new things. It's actually the exact opposite. Confusing the two is not far from insulting the people performing thankless maintenance work. Unlike pseudo-"inventors"[*], they're never in the spotlight. Kudos to LWN for this article.
[*] search the Internet for "myth of the lone inventor"
Posted Dec 11, 2023 10:51 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (3 responses)
A twenty-year old binary built with Diab 5.0 either works or it doesn't; that's not going to change just because GCC 13.2 has a more thorough understanding of the C standard than GCC 3.1 (roughly contemporary to Diab 5.0). If you rebuild from the same sources today with Diab 5.0, you'll still get a working binary - nothing has changed, so nothing new fails.
Further, you can do your changes (if any are needed) with Diab 5.0 as the compiler, and it will interpret the code the same way it did 20 years ago. What you face trouble with is code that assumed that some underspecified behaviour would always be implemented the way the compiler of the day implemented it, and even then, only if you change the compiler. If you don't change the binary, it doesn't matter; if you change the source, but reuse the same compiler, it's (usually) fine.
The problem comes in when you change two things at once; both the compiler in use (maybe even as small a change as switching from PowerPC 440 to Arm Cortex-M7 backend in the same compiler binary) and the source code. At that point, you have the risk of a problem that could be anywhere, since most languages don't (arguably can't) tell you if the behaviour of the compiler has changed in a way the last programmer to touch the code would be surprised by. This applies to Rust, too; for example, integer overflow is underspecified in Rust by this standard (two possible outcomes, panic or 2s complement wrapping), and if the last programmer to touch the code didn't think about this, then you have room for a problem where only panic is acceptable behaviour, but instead you get silent wrapping.
Posted Dec 11, 2023 17:02 UTC (Mon)
by pizza (subscriber, #46)
[Link] (2 responses)
...I'd argue a switch from a (usually) BE CPU to a (nearly always) LE CPU is a pretty significant change, to say nothing of subtleties like the memory ordering model and how unaligned accesses are handled.
But yes, change out a major portion of the compile or runtime environment (and/or other fundamental requirements) and the code may need updating. Change multiple things at once... you're likely in for a world of pain.
Posted Dec 11, 2023 17:33 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (1 responses)
But then we come back round to the beginning - why are you rebuilding code with a new compiler if no requirements have changed, and expecting it to behave exactly as it did when built with the old compiler? This goes double if your code depends on specifics of how the old compiler interpreted the code, rather than being code whose meaning is unambiguous.
And that, of course, leads to the big problem with legacy code - much of it (in all languages) is written "knowing" that if it passes tests when built with a single compiler, then it's good enough. But change anything (compiler, inputs, other bits and pieces), and it stops working.
Posted Dec 11, 2023 19:48 UTC (Mon)
by pizza (subscriber, #46)
[Link]
Well, if nothing changes, then.. you don't need to do anything. (That was kinda my point with respect to my using "unmaintained" software in a production environment)
But more typically, requirements do change... eventually. You rarely know what those will be in advance, or what effort will be needed to handle it.
Posted Dec 10, 2023 19:20 UTC (Sun)
by marcH (subscriber, #57642)
[Link]
Small digression sorry.
Posted Dec 9, 2023 18:52 UTC (Sat)
by ballombe (subscriber, #9523)
[Link] (1 responses)
Which miscompilation are you talking about?
Posted Dec 9, 2023 19:01 UTC (Sat)
by mb (subscriber, #50428)
[Link]
Well, I said what I was talking about:
>subtle "miscompilation" due to decades-old code not playing by the rules of the C machine model or having implicit types and declarations.
Just try to compile a 30-40 year old C program. Chances are good that it just won't work.
Posted Dec 9, 2023 16:57 UTC (Sat)
by andresfreund (subscriber, #69562)
[Link]
Postgres had ~two instances of "Missing parameter types in function definitions" until somewhat recently. Mainly because it made the code look worse to replace them.
Posted Dec 9, 2023 17:37 UTC (Sat)
by ballombe (subscriber, #9523)
[Link]
Why ?
> 2) They are maintained and already build with -Wall so that these legacy problems don't exist.
False: -Wall does not prevent commits that generate a warning from being pushed to a Git repository. Especially if the CI system only tests the tip of the branch, not all the intermediary commits. In any large code base there will always be some small percentage of commits that generate warnings.
Besides, maintainers come and go, and making the lives of new maintainers miserable by pretending they are responsible for the state of the repository before they took over maintenance serves no one's purpose.
Most current C projects have their own memory management system (if only to deal with running out of memory), which will likely need to do conversions between pointers of different types. It is quite easy to miss a cast (especially when the rules for C++ and C are different).
Posted Dec 9, 2023 3:41 UTC (Sat)
by ebiederm (subscriber, #35028)
[Link] (1 responses)
That does not seem inactive to me.
Posted Dec 9, 2023 10:31 UTC (Sat)
by fw (subscriber, #26023)
[Link]
Posted Dec 9, 2023 4:01 UTC (Sat)
by makendo (guest, #168314)
[Link] (1 responses)
I definitely welcome these to be turned into errors for C99 and above. Any code still using it should turn on -std=c89 or -ansi in their Makefiles. My main complaint is that assignments between incompatible pointers shouldn't always error when the target is declared in the same statement (i.e. initialization), as otherwise you would have to specify the pointer type twice in a single statement, which I see as unnecessary verbosity.
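A minimal sketch of the verbosity being objected to, with throwaway variable names:
void example(void)
{
    int x = 0;
    long *lp = &x;            /* incompatible pointer types: a warning today, an error under the new defaults */
    long *lp2 = (long *)&x;   /* accepted, but the pointer type is now spelled twice in one statement */
    (void)lp;
    (void)lp2;
}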
Posted Dec 9, 2023 19:18 UTC (Sat)
by ianmcc (subscriber, #88379)
[Link]
Posted Dec 9, 2023 16:50 UTC (Sat)
by NightMonkey (subscriber, #23051)
[Link] (2 responses)
Posted Dec 10, 2023 0:27 UTC (Sun)
by makendo (guest, #168314)
[Link] (1 responses)
Posted Dec 10, 2023 9:05 UTC (Sun)
by swilmet (subscriber, #98424)
[Link]
Posted Dec 10, 2023 3:51 UTC (Sun)
by david.a.wheeler (subscriber, #72896)
[Link] (1 responses)
https://github.com/ossf/wg-best-practices-os-developers/i...
Posted Dec 10, 2023 3:57 UTC (Sun)
by david.a.wheeler (subscriber, #72896)
[Link]
"Compiler Options Hardening Guide for C and C++": https://best.openssf.org/Compiler-Hardening-Guides/Compil...
Posted Dec 11, 2023 17:43 UTC (Mon)
by eru (subscriber, #2753)
[Link]
A nit, but I think the description of the implicit function declaration is a bit off. It is not a function that takes no parameters, but one with an unknown parameter list. It also assumes external linkage. Equivalent to
int f();
If not, they should use a distro package.
let vals: Vec<i32> = v.collect();
But a subset of Rust's type inference will probably work well in C.
let y: i16 = 2;
let z: i32 = x + y.into(); // Compiler error!
let y: i16 = 2;
let y32: i32 = y.into();
let z: i32 = x + y32;
(5 main, __uint128, size_t, uintptr_t, intmax_t, ptrdiff_t). POSIX doesn't help with off_t.
* Type inference, or else you have to write lifetime annotations everywhere.
* Box<T> or something equivalent to Box<T>, or else you can't put big objects on the heap and move their ownership around.
* Arc<RwLock<T>> or some equivalent, or else you have no reasonable escape hatch from the borrow checker (other than unsafe blocks).
* Rc<RefCell<T>> or some equivalent, or else you have to use the multithreaded escape hatch even in single-threaded code.
* And then there are many other optimizations such as using Mutex<T> instead of RwLock<T>, or OnceCell<T> instead of RefCell<T>. All of these have valid equivalents in C, and should be possible to represent in our hypothesized "safe C" (without needing more than a minimal amount of unsafe, preferably buried somewhere in the stdlib so that "regular" code can be safe).
https://www.bassi.io/articles/2023/08/23/the-mirror/
(but a bit long to read, and one needs to know the GObject world to understand the blog post I think).
However, gcc still seems to add "move.l %a0,%d0" at the end of any function returning a pointer type.
More generally, you need to use -fpermissive, a new option (for the C front end) in GCC 14. It is preferable to -std=c90 or -std=gnu90 because it does not switch the rest of the language dialect back as well, particularly the new inlining semantics. If you just throw in -fpermissive, it also remains active if the build system automatically selects -std=c11 (for example), despite the use of language features that were removed in C99. (The relative order of -fpermissive and the C dialect options does not matter, unlike for the C dialect options themselves.)
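A quick sketch of the two approaches (legacy.c is just a placeholder name):
$ gcc -std=gnu90 -c legacy.c
$ gcc -std=c11 -fpermissive -c legacy.c
The first pins the whole dialect back to GNU C90; the second keeps the modern dialect but, with GCC 14's -fpermissive, the errors for the removed constructs are downgraded to warnings again.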
#include <stdio.h>
int main(void)
{
auto i = 3.9;
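/* In C (before C23), "auto" here is only a storage-class specifier, so i falls
   back to implicit int and holds 3; in C++, "auto" deduces double, and passing
   a double to %d below is undefined behaviour. */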
printf("%d\n", i);
}
$ gcc -w -o foo foo.c && ./foo
3
$ g++ -w -o foo foo.c && ./foo
-2146603272
if you could specify a `-std=kr` (or something like that) I would be content.
No. As the article says, no it will not be kept compiling.
Where it makes sense.
This fixes real world problems that are known for decades.
There is no excuse for using these legacy C features.
You have implicit function declarations? Just add explicit ones! It's trivial.
Implicit int? Just change it. It's trivial.
and so on...
Dependencies.
Which is exactly my point. It's not possible for a non-programmer to take a >20 year old program and just compile it with a modern compiler in a modern environment. There are so many reasons for this to fail. You listed a few of them.
What makes a huge difference is whether it is maintained or not.
That protects against bit rot.
If you think it's trivial, please try to fix the type safety issues in the cohomolo package for GAP. Even using -flto -Wlto-type-mismatch (which enables type checking across translation units), this one still looks really difficult to me. Of course, most packages are not like that.
> compile a bunch of different programs from different eras, that aren't broken and don't need updating.
...
> The "everything must be updated all the time because reasons, and it's fine for stuff to stop working once it's not been touched for even
> 2 years" concept is a modern concept, and not a very good one IMO.
Suddenly some commits that just missed a cast fail to build, breaking the bisection.
Those definitions were declared obsolescent in the second edition of the standard, in 1999. According to published drafts, the next revision of the standard will remove them from the language altogether. I don't know what compilers will do about it. C23 also introduces unnamed parameters. Curiously, the syntax is not ambiguous even for compilers which still support implicit int, but it is a rather close call.
It should have been priority number one to get rid of it decades ago before it exploded.
But you have to live with the consequences of your own decisions.
It was your decision to use C features that have been deprecated and throwing warnings for decades.
Environments change and perfectly good software becomes an ancient mess.
Feel free to keep depending on unmaintained software. That is your choice and I am fine with that.
> then nothing else possibly could.
But it's not at all "unmaintained", because if problems do come up, they will get fixed.
This is enforced by law.
>Does this mean I should just scrap the lot?
That is a completely different thing.
That is what this discussion is about.
Assignment between different pointer types is about impossible for the compiler to get wrong, whether or not casts are used.
> - was under active development in the last 5 years so that to-be-bisected bugs have been introduced and
> - uses these legacy C features?
"Compiler Options Hardening Guide for C and C++":