About type inference coming to the C language as well

Posted Dec 11, 2023 4:43 UTC (Mon) by swilmet (subscriber, #98424)
In reply to: About type inference coming to the C language as well by excors
Parent article: Modern C for Fedora (and the world)

It's true that in C++ and Rust, types can be quite long to write.

Both C++ and Rust have a large core language, while C has a small core language.

I see Rust more as a successor to C++. C programmers in general - I think - like the fact that C has a small core language. So in C the types remain small to write, and there are more function calls instead of using sophisticated core language features. C is thus more verbose, and verbosity can be seen as an advantage.

Maybe the solution is to create a SubC language: a subset of C that is safe (or at least safer). That's already partly the case with the compiler options, hardening efforts etc.

About type inference coming to the C language as well

Posted Dec 11, 2023 8:39 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (12 responses)

> Maybe the solution is to create a SubC language: a subset of C that is safe (or at least safer). That's already partly the case with the compiler options, hardening efforts etc.

I disagree with this, assuming that "safe" means "cannot cause UB outside of an unsafe block." A safe version of C needs at least the following:

* Lifetimes and borrow checking, which implies a type annotation similar to generics.
* Type inference, or else you have to write lifetime annotations everywhere.
* Box<T> or something equivalent to Box<T>, or else you can't put big objects on the heap and move their ownership around.
* Arc<RwLock<T>> or some equivalent, or else you have no reasonable escape hatch from the borrow checker (other than unsafe blocks).
* Rc<RefCell<T>> or some equivalent, or else you have to use the multithreaded escape hatch even in single-threaded code.
* And then there are many other optimizations such as using Mutex<T> instead of RwLock<T>, or OnceCell<T> instead of RefCell<T>. All of these have valid equivalents in C, and should be possible to represent in our hypothesized "safe C" (without needing more than a minimal amount of unsafe, preferably buried somewhere in the stdlib so that "regular" code can be safe).

I just don't see how you provide all of that flexibility without doing monomorphization, at which point you're already 80% of the way to reinventing Rust.

About type inference coming to the C language as well

Posted Dec 11, 2023 11:10 UTC (Mon) by Sesse (subscriber, #53779) [Link] (3 responses)

I guess that if you banned threads and pointers (presumably requiring lots of globals) and made all array access bounds-checked and all data zero-initialized, you could get a safe C subset without going there. How useful it would be would be a different thing...

About type inference coming to the C language as well

Posted Dec 11, 2023 13:52 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

If you're not careful, you end up with something like Wuffs. A perfectly useful language in some domains, but deliberately limited in scope to stop you writing many classes of bug.

About type inference coming to the C language as well

Posted Dec 14, 2023 10:55 UTC (Thu) by swilmet (subscriber, #98424) [Link] (1 responses)

Seems useful to write command-line programs, for example.

About type inference coming to the C language as well

Posted Dec 14, 2023 10:57 UTC (Thu) by farnz (subscriber, #17727) [Link]

You're not going to get very far when you can't access arguments, or do I/O. Wuffs is deliberately limited to not doing that, because it's dangerous to mix I/O with file format parsing.

About type inference coming to the C language as well

Posted Dec 11, 2023 11:35 UTC (Mon) by swilmet (subscriber, #98424) [Link] (7 responses)

I'm not an expert in programming languages design and security-related matters.

But why not trying a C-to-Rust transpiler? (random idea).

By keeping a small core language with the C syntax, and having a new standard library that looks like Rust but uses more function calls instead.

The transpiler would "take" the new stdlib as part of the language, for performance reasons, and translates the function calls to Rust idioms.

A source-to-source compiler is of course not ideal, but that's how C++ was created ("C with classes" was initially translated to C code).

About type inference coming to the C language as well

Posted Dec 11, 2023 12:09 UTC (Mon) by farnz (subscriber, #17727) [Link] (6 responses)

You might want to look at the C2Rust project; the issue is that a clean transpiler to Rust has to use unsafe liberally, since C constructs translate to something that can't be represented in purely Safe Rust.

The challenge then becomes adding something like lifetimes (so that you can translate pointers to Rust references instead of Rust raw pointers) without "bloating" C. I suspect that it's impossible to have a tiny core language without pushing many problems into the domain of "the programmer simply must not make any relevant mistakes"; note, though, that this is not bi-directional, since a language with a big core can still push many problems into that domain.

About type inference coming to the C language as well

Posted Dec 12, 2023 10:32 UTC (Tue) by swilmet (subscriber, #98424) [Link] (5 responses)

I didn't know C2Rust, it shows that my random idea is not stupid after all :)

But I had the idea to convert (a subset of) C to _safe_ Rust, of course. Instead of some Rust keywords, operators etc (the core language), have C functions instead.

Actually the GLib/GObject project is looking to have Rust-like way of handling things, see:
https://www.bassi.io/articles/2023/08/23/the-mirror/
(but a bit long to read, and one needs to know the GObject world to understand the blog post I think).

Anyway, that's an interesting topic for researchers. Then making it useful and consumable for real-world C projects is yet another task.

About type inference coming to the C language as well

Posted Dec 12, 2023 10:43 UTC (Tue) by farnz (subscriber, #17727) [Link]

The hard part is not the keywords and operators - it's the lifetime annotation system. Lifetimes are a check on what the programmer intended, so have to be possible to write as an annotation to pointer types in the C derived language, but then to be usable force you to have a generics system (since you want many things to be generic over a lifetime) with (at least) covariance and invariance possible to express.

And once you have a generics system that can express covariance and invariance for each item in a set of generic parameters, why wouldn't you allow that to be used for types as well as lifetimes? At which point, you have Rust traits and structs, and most of the complexity of Rust.

About type inference coming to the C language as well

Posted Dec 12, 2023 11:34 UTC (Tue) by mb (subscriber, #50428) [Link] (3 responses)

>But I had the idea to convert (a subset of) C to _safe_ Rust, of course.

That is not possible, except for very trivial cases.

The C code does neither include enough information (e.g. lifetimes) for that to work, nor is it usually structured in a way for this to work.

Programming in Rust requires a different way of thinking and a different way of structuring your code. An automatic translation of the usual ideomatic C programs will fail so hard that it would be easier to rewrite it from scratch instead of translating it and then fixing the compile failures.

About type inference coming to the C language as well

Posted Dec 13, 2023 23:59 UTC (Wed) by swilmet (subscriber, #98424) [Link] (2 responses)

The C syntax alone is not enough, but comments with annotations can be added, and become part of the language.

I started to learn Rust but dislike the fact that it has many core features ("high-level ergonomics"). It's probably possible to use Rust in a simplistic way though, except maybe if a library forces to use the fancy features.

About type inference coming to the C language as well

Posted Dec 14, 2023 9:37 UTC (Thu) by farnz (subscriber, #17727) [Link] (1 responses)

You could avoid using those libraries, and limit yourself to libraries that have a "simple" enough interface for you (no_std libraries are a good thing to look for here, since they're designed with just core and maybe alloc in mind, not the whole of std) - bearing in mind that you don't need to care how those libraries are implemented if it's just about personal preference.

In general, though, I wouldn't be scared of a complex core language - all of that complexity has to be handled somewhere, and a complex core language can mean that complexity is being compiler-checked instead of human-checked.

About type inference coming to the C language as well

Posted Dec 14, 2023 11:07 UTC (Thu) by swilmet (subscriber, #98424) [Link]

The codebases that I maintain already use between two and three/four main programming languages (welcome to GNOME, I should say). At some point I wanted to write new code in Rust, but it means adding more complexity and being less productive for some time while learning the language.

"Soft"ware, they said :-)