
Zig 2024 roadmap


Posted Feb 3, 2024 12:06 UTC (Sat) by ojeda (subscriber, #143370)
In reply to: Zig 2024 roadmap by rrolls
Parent article: Zig 2024 roadmap

When you're coming from C, Zig gets you 99% of the memory safety that Rust does.

Could you please explain what you mean by "99%"? To my understanding, 99% means catching the vast majority of bugs, but from their own documentation, it seems like memory safety is essentially like C's:

It is the Zig programmer's responsibility to ensure that a pointer is not accessed when the memory pointed to is no longer available. Note that a slice is a form of pointer, in that it references other memory.

In order to prevent bugs, there are some helpful conventions to follow when dealing with pointers. In general, when a function returns a pointer, the documentation for the function should explain who "owns" the pointer. This concept helps the programmer decide when it is appropriate, if ever, to free the pointer.

And from a quick try, I see that indeed trivial use-after-frees are not caught in 0.11/master with -OReleaseSafe/-ODebug: https://godbolt.org/z/EG1dKz5MK, https://godbolt.org/z/zaW9e13aG.

In Rust, the equivalent program does not even compile.



Zig 2024 roadmap

Posted Feb 3, 2024 14:36 UTC (Sat) by rrolls (subscriber, #151126) [Link] (19 responses)

I suppose it depends on your viewpoint.

Given my background, I'm used to manually thinking about and managing lifetimes anyway: using array indices rather than pointers, minimising the total number of allocations, documenting exactly how data is stored and in what order things should be "set up", "used" and "torn down" (in whatever way is appropriate for the language), passing responsibility for allocation towards the caller as much as is practical, and so on. I just don't write code in the first place that does "lots of little allocations and deallocations here, there and everywhere"*; I think carefully about every single allocation, so, effectively by necessity, "lifetimes" isn't a very complex topic in my code, with the consequence that Rust's advantages don't really apply to it. (A function that returns a pointer to part of its own stack frame, like your example, is something I learned never to write a long time ago.)

*unless I'm using a high-level language like Python, where allocations happen constantly, but the garbage collector prevents any memory-unsafety troubles at the cost of performance. However, even in Python, I'm still thinking about the "semantics" of setup and teardown in the same way, because although memory management isn't involved in it, you can still run into the same traps with higher-level resources such as files, sockets, threads, processes, async tasks, or the state of a database.

On the other hand, I constantly find myself running into "completely obvious and stupid once you find it" buffer overflows, off-by-one errors, and so on. This is a class of problems that Zig solves immediately with slices. In one particular example, I had a known bug (as in, the effect was known, but not the explanation or the solution) in some C code that lasted for years: each time the edge case that caused it arose, it would set a byte one beyond the end of a buffer to zero. That byte is usually zero anyway, but just occasionally it isn't supposed to be, so something completely unrelated would fail long after the code that wrongly set it had finished, making it impossible to figure out why. If that had been Zig code, using a slice for that buffer would have given me a panic with a relevant stack trace the very first time that write happened, and I would have been able to fix it straight away. (Even valgrind didn't detect this bug: although the write was out of bounds of the array, it was still within a single memory allocation, which is why it corrupted some unrelated behaviour later instead of causing an immediate crash.)

Of course it's not just slices. Zig also brings a bunch of other useful features that C doesn't have: tagged unions, comptime, RLS (result location semantics) and its unique approach to error handling, to name a few. The first and last of these are also places where memory safety bugs completely unrelated to lifetimes show up in C over and over again. Rust has some of these features too, but that's irrelevant to the fact that Zig is still far better than C.

Basically, my argument is that I feel Zig would solve the vast majority (if not all) of the "extremely hard" bugs I actually run into in practice, and while yes, Rust would not let code with those bugs compile, it would also make it a lot harder to write anything in the first place, without bringing me any further benefits beyond what I'd get from Zig. This is what I meant by "Zig is pragmatic." (The caveat, of course, being that I've only experimented a little so far with Zig, and almost not at all with Rust, so I can't exactly speak from experience with either language.)

So, yes, I was wrong to make such a sweeping generalisation with my original comment. But I hope I have at least demonstrated that the dismissive, rhetorical question of "Why are we, as an industry, pouring resources into non-memory-safe languages [like Zig]?" is equally misplaced :)

Zig 2024 roadmap

Posted Feb 4, 2024 1:47 UTC (Sun) by roc (subscriber, #30627) [Link] (11 responses)

Use-after-free (including dangling pointers into the stack) is a huge problem in large C and C++ applications and the source of a lot of CVEs. We've been trying "think carefully about every single allocation" for decades and it does not work at scale. ASAN, fuzzing, etc help but not enough; rare race and error conditions are too hard to hit.

Zig 2024 roadmap

Posted Feb 4, 2024 10:42 UTC (Sun) by b7j0c (guest, #27559) [Link] (7 responses)

It's not like Zig is absent on this:

The testing allocator tells you about mishandled/unmanaged memory.

Zig 2024 roadmap

Posted Feb 4, 2024 11:15 UTC (Sun) by mb (subscriber, #50428) [Link]

Does this also help if the code path is not actually hit?

Maybe you can catch a use-after-free with tests, if the use-after-free also happens often in normal program flow.

But many security problems come from code paths being entered with incorrect input data, paths that would never be executed in normal program flow.
In normal program flow with normal input data there often is no actual invalid memory access.

Can Zig find these problems?
Can Zig find multi thread data races?

Zig 2024 roadmap

Posted Feb 4, 2024 11:37 UTC (Sun) by atnot (subscriber, #124910) [Link] (5 responses)

As said above:
> ASAN, fuzzing, etc help but not enough; rare race and error conditions are too hard to hit.

The testing allocator is basically just built-in ASAN.

I should say Rust has this issue a bit too, thanks to the regrettable decision to disable overflow checks in release mode by default. I sort of get why they did it: it avoids the "why is my math microbenchmark 2% slower than the C version" reaction from people first trying out the language, which would inevitably create a public impression that Rust is slower than C. But the cost is usually barely measurable in real-world programs[1]. Granted, it's not as bad as in C, because bounds checks are still applied post-overflow, but it's still annoying to track down. And since the overflow checks won't show up in profilers by default, people never optimize for that configuration. But I digress.

[1] Safe custom allocators, ime, tend to be very measurable, although there's a lot of work by e.g. Google on making it better. Still, it means Zig is arguably slower than Rust at iso safety, if you wanted to look at it that way.

Zig 2024 roadmap

Posted Feb 5, 2024 13:49 UTC (Mon) by khim (subscriber, #9252) [Link] (4 responses)

> But it's usually barely measurable in real world programs.

10x slowdown is “barely measurable”? What world do you live in?

> I should say Rust has this issue a bit too, thanks to the regrettable decision to disable overflow checks in release mode by default.

If you are talking about arithmetic (which is wrapping in Rust when not in debug mode) then they really had no choice: vectorizing code with these checks is not impossible, but the infrastructure is just not there. And they really had no resources to add custom passes to LLVM that would make the whole thing usable from Rust with non-wrapping arithmetic.

10x or more slowdown is not that uncommon if you enable these checks and then process large arrays of integers, which is definitely easily measurable in real world programs.

And having different rules for integers in arrays and standalone integers would be just too weird (although it might be an interesting optional mode of compilation, now that I think about it).

Zig 2024 roadmap

Posted Feb 5, 2024 15:09 UTC (Mon) by atnot (subscriber, #124910) [Link] (2 responses)

> 10x slowdown is “barely measurable”? What world do you live in?

This is just based on testing my own programs, which are generally bottlenecked on memory, not integer math, as most programs are. Even then, 10x is a ridiculous number; I can't find anyone reporting anything even close to 2x in real-world programs.

But here, let's look at some actual data, someone running specint:

> On the other hand, signed integer overflow checking slows down SPEC CINT 2006 by 11.8% overall, with slowdown ranging from negligible (GCC, Perl, OMNeT++) to about 20% (Sjeng, H264Ref) to about 40% (HMMER). [...]
> Looking at HMMER, for example, we see that it spends >95% of its execution time in a function called P7Viterbi(). This function can be partially vectorized, but the version with integer overflow checks doesn’t get vectorized at all. [...]
> Sanjoy Das has a couple of patches that, together, solve [some missed optimizations]. Their overall effect on SPEC is to reduce the overhead of signed integer overflow checking to 8.7%.
https://blog.regehr.org/archives/1384

Specint is a bit biased towards HPC, but we see that even there most normal business-logic-style code doesn't lose out at all. The losses are dominated by a few very hot functions that are presumably heavily optimized already.

As you note, overflow checks interfere with vectorization, but so do millions of other things, it's notoriously finicky. Rust regularly misses autovectorization because of bounds checks too. It's very hard to write non-trivial code that vectorizes perfectly and reliably across platforms by accident.

Which gets me to the actual point I was making, which you ignored: in a hypothetical world where overflow checking was enabled by default, here's how this would have gone in a profiling session:

"why is hmmer::P7Viterbi() so slow now?"
"oh, it's not vectorizing because of overflow"
"let me replace it with a wrapping add, or trap outside of the loop body, or use iterators since I'm using Rust"
"that's better"

And millions of mysterious production bugs and a hundred CVEs would have been avoided, at barely any cost to most programmers and one extra profiling iteration of a thousand for a few people writing heavily integer math code.

Zig 2024 roadmap

Posted Feb 5, 2024 15:36 UTC (Mon) by khim (subscriber, #9252) [Link] (1 responses)

> In a hypothetical world where overflow checking was enabled by default

…Rust would have played the role of “new Haskell”: something which people talk about but don't use, except for a few eggheads and then rarely.

> And millions of mysterious production bugs and a hundred CVEs would have been avoided, at barely any cost to most programmers and one extra profiling iteration of a thousand for a few people writing heavily integer math code.

Nothing would have been avoided because Rust would simply have been ignored. Rust, quite consciously, used up its weirdness budget for other, more important, things.

Perhaps Rust with slow-integers-by-default would have saved someone from themselves, but chances are high that it would have hindered adoption of Rust too much: people are notoriously finicky about simple things, and seeing dozens of checks in a program which should, by their understanding, be two or three machine instructions long would have given Rust a bad reputation for sure.

> This is just based on just testing my own programs.

If you are happy with that mode then why couldn't you just enable it in your code? -Z force-overflow-checks exists precisely because some people like these overflow checks.

I'm not a big fan of them, because in my experience, for every hundred bugs where some buffer is too small and the range checks catch the issue, there are maybe one or two cases where a simple integer overflow check catches an issue that is not also caught by those range checks. Certainly not enough to warrant these tales about millions of mysterious production bugs (why not trillions, if you are going for imaginary unjustified numbers, BTW?)

Zig 2024 roadmap

Posted Feb 5, 2024 16:26 UTC (Mon) by atnot (subscriber, #124910) [Link]

> Nothing would have been avoided because Rust would have been just ignored.

You're just making my own arguments back at me now, but snarkier. I said this two messages ago.

Look, I like Rust too, it's my personal language of choice. There's no need for this level of aggressive defensiveness over someone on the internet thinking it would be useful to make some pretty marginal tradeoff differently. With the data we have, I'm convinced it makes sense today and I've explained why. You're welcome to disagree.

Zig 2024 roadmap

Posted Feb 5, 2024 16:14 UTC (Mon) by mb (subscriber, #50428) [Link]

>10x or more slowdown is not that uncommon if you enable these checks and then process
>large arrays of integers, which is definitely easily measurable in real world programs.

You do not have to decide globally whether you want overflow checks or not.

You can enable overflow checks in release builds and for the performance critical code you can use https://doc.rust-lang.org/std/num/struct.Wrapping.html
With that you get fast code and safe code where it matters.

Zig 2024 roadmap

Posted Feb 5, 2024 7:02 UTC (Mon) by rghetta (subscriber, #39444) [Link] (2 responses)

I can't speak for C, but I work on a very active 5MLOC C++ project, and in my experience memory errors are not that frequent, especially after C++11. Each year we have *some* memory bugs (and none in production) but hundreds of logic errors. RAII, smart pointers, references, even templates can make a huge difference in preventing resource bugs, imho.

Zig 2024 roadmap

Posted Feb 5, 2024 19:00 UTC (Mon) by roc (subscriber, #30627) [Link] (1 responses)

Is it multithreaded and being fuzzed by experts?

Zig 2024 roadmap

Posted Feb 7, 2024 12:59 UTC (Wed) by rghetta (subscriber, #39444) [Link]

Multithreaded yes. Fuzzed only on import interfaces, not on the complete codebase.

Zig 2024 roadmap

Posted Feb 4, 2024 19:39 UTC (Sun) by ojeda (subscriber, #143370) [Link] (6 responses)

So, you are saying we can avoid temporal memory safety mistakes by "thinking carefully". But somehow, the same argument would not apply to spatial memory safety mistakes, and in fact, we would "constantly" make them.

Well, according to the Chromium project (and other projects), some of those issues you say we would not "run into in practice" are, in fact, quite prevalent sources of vulnerabilities: "Around 70% of our high severity security bugs are memory unsafety problems (that is, mistakes with C/C++ pointers). Half of those are use-after-free bugs."

> If that was Zig code, using a slice for that buffer would have given me a panic

...unless the safety checks are disabled, e.g. via -OReleaseFast.

> Rust would not let code with those bugs compile, it would also make it a lot harder to write anything in the first place

I mean, you state this as a fact, but you also recognize you have almost no experience with Rust. If you already think in terms of lifetimes anyway as you said, then writing safe Rust should be a very nice experience.

> I hope I have at least demonstrated that the dismissive, rhetorical question of "Why are we, as an industry, pouring resources into non-memory-safe languages [like Zig]?" is equally misplaced :)

It is not misplaced. The point is that nowadays we know how to do better. The industry (and other entities) is interested in getting away from memory unsafety as much as possible. Thus introducing a new language that essentially works like C (especially if you consider existing tooling for C) is not a good proposition.

Zig 2024 roadmap

Posted Feb 4, 2024 20:28 UTC (Sun) by roc (subscriber, #30627) [Link] (5 responses)

I think the question is misplaced because we're not actually pouring resources into Zig. Even advocates agree that it's years away from stabilization, and it's not going to get used much until after that happens. In the meantime, there are some interesting ideas like comptime that will get some testing.

We might even discover at some point in the future that "Rust, but comptime instead of generics and macros" would be an improvement on Rust.

Zig 2024 roadmap

Posted Feb 5, 2024 10:22 UTC (Mon) by farnz (subscriber, #17727) [Link] (4 responses)

comptime instead of generics is a non-starter, I suspect, since generics in Rust lay out data differently, not just change the code. But comptime instead of macros and some uses of traits would be extremely interesting to see; I suspect that there's a lot of cases where people currently have to write procmacros in Rust where comptime would be a good fit.

Zig 2024 roadmap

Posted Feb 5, 2024 13:44 UTC (Mon) by atnot (subscriber, #124910) [Link] (3 responses)

> comptime instead of generics is a non-starter, I suspect, since generics in Rust lay out data differently, not just change the code

You totally can do that! Zig and other languages with dependent-ish types generally have a unified type system for all types, including types themselves. In practical Zig terms, this means that you can not only return an integer from a comptime function, but also the integer type itself, or a different type _depending_ on the arguments (that's where the term comes from), or an arbitrary struct type you just made, or a function, or anything really.

So in mathematical terms it's actually far more powerful than anything Rust has. Rust can do a few of these things as bespoke features, but it's not a generalized system in the way it is in dependently typed languages. This has its advantages and disadvantages, with dedicated syntax generally being more compact, readable and debuggable, but also leaving weird incongruities between various parts of the language that are hard to resolve, as Rust is experiencing.

It's been a pretty hip thing to experiment with somewhat recently (at least before effect systems really hit the scene) so I'm looking forward to seeing how Zig fares with it in a non-academic setting.

Zig 2024 roadmap

Posted Feb 5, 2024 19:02 UTC (Mon) by roc (subscriber, #30627) [Link] (2 responses)

What Zig does is not "dependent types" as in academia.

Zig 2024 roadmap

Posted Feb 5, 2024 20:16 UTC (Mon) by atnot (subscriber, #124910) [Link] (1 responses)

Yes, for anyone curious, one reason is that comptime isn't statically typed, it's dynamically typed but at compile time. For example, there is no way to write things like "comptime function that returns a function that returns either int or float", you can only write "comptime function that returns a function that returns some mystery surprise type". Like C++ templates, there's no type checking going on until after things have already been evaluated.

That said, you can do a lot of similar constructions, and I think it faces a lot of similar issues regarding ergonomics when used in a non-FP language. Plus I think it also demonstrates some of the benefits of having a single unified type system nicely, without having to teach someone to read Haskell-like syntax.

Zig 2024 roadmap

Posted Feb 5, 2024 20:35 UTC (Mon) by atnot (subscriber, #124910) [Link]

C++ templates are probably actually a good example in multiple ways here. Anyone really into their types would sneer at someone calling templates generics, they aren't really generics for similar reasons as comptime isn't dependent typing. But they're still extraordinarily popular and helpful, and they make it very easy to sell people on proper generics by pointing at them and saying "they're like that, but better".


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds