Proper Community Engagement

Posted Aug 31, 2024 1:04 UTC (Sat) by roc (subscriber, #30627)
In reply to: Proper Community Engagement by rc00
Parent article: Rust-for-Linux developer Wedson Almeida Filho drops out

> how a computer works, syscalls, filesystems

Filesystems have nothing to do with C. If you really want to learn about syscalls and how a computer works, you need to write or at least read assembly.

> And on top of that, learning C means it is generally easier to learn other languages after the fact. C's lack of rules and restrictions are a double-edged sword but it is typically much more direct to code.

C is full of rules and restrictions, they're just not enforced by the compiler. Try explaining undefined behavior to a new programmer! Of course, no-one teaches that, so students aren't really learning C, they're learning just enough to get the right output from one run of some toy code. And that's dangerous because they think they know how to write reliable code in C when they don't.

> Again, I communicate with professors and I would advise the same for you before making these kinds of comments

No need to be condescending. I have a PhD in computer science and a lot of my friends are professors at various universities in the USA and New Zealand. While we can have different opinions on what makes a good first language, what professors choose to teach is a matter of fact. Here's some actual data from 2019: https://research.leune.org/publications/ICCSE21-CS1CS2.pdf
See Tables III and IV. C has actually been minuscule for a long time; Python was rising at the expense of Java and C++. As far as I can tell from anecdotal data, Python has only accelerated since then due to the soaring interest in AI.

> 1. Python being a more approachable language for non-programmers meant that they were the perfect target audience. The mistakes are due to inexperience. But Rust is far from being the solution.

I agree! But Rust isn't the only option that's stricter than Python. That's why I highlighted Julia for scientific users, not Rust.

> You can have memory leaks in Rust, you can have poor logic in Rust, and you can have egregious mistakes in Rust.

I wouldn't push Rust for scientific users but this kind of argument is a common fallacy that's language-independent. You can make mistakes in any language, but there are large classes of mistakes that are prevented in some languages but not others, and that matters a lot in practice.

When you're publishing scientific papers, programming errors can have huge impact --- not just on the results, but on whoever depends on those results --- and using a language that catches more errors has huge value that in many cases would justify the extra effort required.

> Rust's first-mover "advantage" over "new-langs" like C3, Odin, and Zig will be erased

None of those languages can replace Rust because none of them provide the combination of performance and memory safety that Rust does (let alone the other kinds of safety Rust provides, like data-race freedom).

I actually think a simpler language than Rust that hits the same performance+safety target could be built and might be great --- try taking Rust and replacing its generics and macros with Zig-style comptime, but keeping the aliasing and borrowing model. But AFAIK no-one has even started trying to do that, and in the 10-20 years it would take to reach maturity if started now, Rust will likely be too entrenched.

Proper Community Engagement

Posted Aug 31, 2024 1:49 UTC (Sat) by rc00 (guest, #164740) [Link] (18 responses)

> None of those languages can replace Rust because none of them provide the combination of performance and memory safety that Rust does (let alone the other kinds of safety Rust provides, like data-race freedom).

Would you consider Unsafe Rust and Safe Rust to be two languages or two parts of the same language? I'm skipping over the fact that the argument with Rust proponents went from "no one writes unsafe Rust" to "unsafe Rust is required for the Linux kernel" (paraphrasing https://lwn.net/Articles/982868/). I'm also skipping over the recent report of the percentage of crates found to be using Unsafe Rust.

Specifically regarding the content of the article on this page, Unsafe Rust is inevitable in the Linux kernel and yet Rust was designed to make Unsafe Rust undesirable and difficult. Why write Unsafe Rust (which also means opting out of the borrow checker) instead of something like Zig that has around 80% of the memory safety that you get from Rust's borrow checker feature set but with 0% of the downside of the loss of productivity in fighting the borrow checker? (https://zackoverflow.dev/writing/unsafe-rust-vs-zig/)

Arguing for Safe Rust while dismissing that Unsafe Rust is almost inevitable is a different kind of logical fallacy. It sidesteps the reality that a language that isn't inherently memory-safe but instead makes memory-unsafe operations as safe as possible is the better solution for the assignment at hand. I believe both Zig and Odin also do not have undefined behavior where Unsafe Rust and C do. Why not pick the best tool for the job?

> I actually think a simpler language than Rust that hits the same performance+safety target could be built and might be great

This strikes me as Zig 1.0 if the current trajectory holds true.

Proper Community Engagement

Posted Aug 31, 2024 2:25 UTC (Sat) by roc (subscriber, #30627) [Link] (9 responses)

> Why write Unsafe Rust (which also means opting out of the borrow checker)

That's not the right way to think about it. "unsafe" lets you dereference raw pointers, which means you can bypass the borrow checker by using raw pointers instead of references, but you can use references in unsafe Rust and they still get borrow-checked. This matters because it means that entering an unsafe block because e.g. you want to call a C function does NOT mean that you are suddenly also susceptible to lifetime bugs.
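A minimal sketch of that point, with a hypothetical function name (`first_then_push`) invented for illustration: inside the `unsafe` block the reference `first` is still borrow-checked, and the only operation that actually needs `unsafe` is the raw-pointer dereference.

```rust
/// Copies the first element, then appends `copied + 1`.
/// Panics if `v` is empty (plain indexing, nothing to do with `unsafe`).
fn first_then_push(v: &mut Vec<i32>) -> i32 {
    unsafe {
        // An ordinary reference: still tracked by the borrow checker here.
        let first: &i32 = &v[0];
        // v.push(0); // uncommenting this is a compile error: `*v` is still
        //            // immutably borrowed by `first`, which is used below.

        // Creating a raw pointer is safe; it carries no lifetime.
        let raw: *const i32 = first;
        // Dereferencing the raw pointer is what requires `unsafe`.
        // SAFETY: `raw` was just derived from a live reference and the
        // vector has not been modified since.
        let copied = *raw;

        // Allowed: `first` is no longer in use, so the borrow has ended.
        v.push(copied + 1);
        copied
    }
}

fn main() {
    let mut v = vec![10, 20];
    let copied = first_then_push(&mut v);
    println!("copied {copied}, vector is now {v:?}");
}
```

The raw pointer is the escape hatch; everything else in the block is checked exactly as it would be outside it.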

> Unsafe Rust is inevitable in the Linux kernel

It is inevitable that unsafe Rust is present in the kernel, but it matters a great deal *how much* is present and where it is present. If you have unsafe Rust in the Rust wrappers around kernel APIs but you can write a lot of drivers with zero unsafe Rust (which is actually the goal), then that is likely to be a big win. You write the wrappers, you invest effort in reviewing that code and testing it, and then all your driver code (which is much more code than the wrappers, in the long run if not the short run) benefits from the safe Rust guarantees.

OTOH if every driver has to use copious amounts of "unsafe" then Rust in the kernel will have failed.
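To make the wrapper pattern concrete, here is a small sketch. Everything in it is hypothetical: `raw_sum` stands in for a low-level API with a pointer/length contract (it is not a real kernel function), `sum` is the reviewed wrapper, and `checksum_of_registers` plays the role of driver code that contains no `unsafe` at all.

```rust
/// Stand-in for a low-level API with a pointer/length contract.
///
/// # Safety
/// `ptr` must point to `len` readable, initialized `u32` values.
unsafe fn raw_sum(ptr: *const u32, len: usize) -> u32 {
    let mut total: u32 = 0;
    for i in 0..len {
        // SAFETY: the caller guarantees that `ptr..ptr + len` is valid.
        total = total.wrapping_add(unsafe { *ptr.add(i) });
    }
    total
}

/// Safe wrapper: the one place that has to be audited.
pub fn sum(data: &[u32]) -> u32 {
    // SAFETY: a slice always satisfies `raw_sum`'s contract.
    unsafe { raw_sum(data.as_ptr(), data.len()) }
}

/// "Driver" code: uses only the safe wrapper.
pub fn checksum_of_registers() -> u32 {
    let regs = [1u32, 2, 3, 4];
    sum(&regs)
}

fn main() {
    println!("checksum = {}", checksum_of_registers());
}
```

The review effort goes into `sum`; callers cannot violate the pointer/length contract because they never see the pointer.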

> instead of something like Zig that has around 80% of the memory safety that you get from Rust's borrow checker feature set but with 0% of the downside of the loss of productivity in fighting the borrow checker?

In release builds (i.e. builds with acceptable performance) you get no use-after-free checking with Zig, and those are most of the exploited memory safety issues these days. Plus of course Rust is giving you more safety than just memory safety.

> I believe both Zig and Odin also do not have undefined behavior where Unsafe Rust and C do

UAF is for sure undefined behaviour in Zig. There is no way to define it!

> This strikes me as Zig 1.0 if the current trajectory holds true.

Nope, no UAF protection in production code with Zig.

Proper Community Engagement

Posted Aug 31, 2024 3:10 UTC (Sat) by rc00 (guest, #164740) [Link] (8 responses)

> OTOH if every driver has to use copious amounts of "unsafe" then Rust in the kernel will have failed.

I've already placed my wager. 😉

> In release builds (i.e. builds with acceptable performance) you get no use-after-free checking with Zig, and those are most of the exploited memory safety issues these days. Plus of course Rust is giving you more safety than just memory safety.
> UAF is for sure undefined behaviour in Zig. There is no way to define it!
> Nope, no UAF protection in production code with Zig.

Let's debug more of your programming language knowledge:

* Zig's standard library offers a general-purpose allocator that can be used to prevent double-free, use-after-free, and can also detect memory leaks.

* Zig has a build mode named ReleaseSafe that is not only fit for production code but considered the main mode for releases. With ReleaseSafe, memory and safety checks are still enabled.

Let's try to avoid spreading misinformation.

Additional links and resources:
* https://zig.news/kristoff/how-to-release-your-zig-applica...
* https://medium.com/@shyamsundarb/memory-safety-in-c-vs-ru...
* https://ziglang.org/documentation/master/#Build-Mode

Proper Community Engagement

Posted Aug 31, 2024 3:17 UTC (Sat) by intelfx (subscriber, #130118) [Link] (2 responses)

And that allocator is going to be used if (for some inexplicable reason) Linux kernel gets to contain Zig code?

> Let’s try <…>

Let’s try to get our facts and logic straight before directing condescension at others, shall we?

Proper Community Engagement

Posted Aug 31, 2024 3:37 UTC (Sat) by rc00 (guest, #164740) [Link] (1 responses)

> And that allocator is going to be used if (for some inexplicable reason) Linux kernel gets to contain Zig code?

It's a general-purpose heap allocator. For all intents and purposes, it can be thought of as the default allocator. You can certainly use others for specialized cases but during development and debugging, why not use the general-purpose one out of good habit? How is this any different from kmalloc? (These are rhetorical questions. See the last sentence of this reply.)

> Let’s try to get our facts and logic straight before directing condescension at others, shall we?

Can you approach this thread with correct information? It's quite the take that you want to apply a negative connotation to "condescension" but yet you state that toxic comments that suggest brigading are "funny."

No, I'm not going to interact with you beyond this drivel.

Proper Community Engagement

Posted Aug 31, 2024 3:41 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

> It's a general-purpose heap allocator. For all intents and purposes, it can be thought of as the default allocator.

JFYI, Linux has a variety of ways pointers are allocated, kmalloc is only one of them. One of the problems with Rust adoption was that Rust's standard library was not ready for custom allocators and fallible allocations.

Proper Community Engagement

Posted Aug 31, 2024 4:05 UTC (Sat) by roc (subscriber, #30627) [Link] (3 responses)

> * Zig's standard library offers a general-purpose allocator that can be used to prevent double-free, use-after-free, and can also detect memory leaks.

Which leaks virtual memory (to catch heap UAF) and is not suitable for production use.

> Zig has a build mode named ReleaseSafe that is not only fit for production code but considered the main mode for releases. With ReleaseSafe, memory and safety checks are still enabled.

ReleaseSafe does not affect the choice of allocator and therefore does not detect heap UAF unless you opt into the general-purpose allocator, which is not suitable for production use.

And FWIW Zig has *nothing at all* to detect stack UAF.

Proper Community Engagement

Posted Aug 31, 2024 5:05 UTC (Sat) by rc00 (guest, #164740) [Link] (2 responses)

Your knowledge is either dated or you insist on spreading misinformation. I have provided numerous links and references. Please make use of them.

> Which leaks virtual memory (to catch heap UAF) and is not suitable for production use.

The general-purpose allocator is configured with never_unmap set to false by default. Only when set to true is there a possibility that every allocation can be leaked. Again, this allocator can be used for production release modes.

https://www.openmymind.net/learning_zig/heap_memory/

> ReleaseSafe does not affect the choice of allocator and therefore does not detect heap UAF unless you opt into the general-purpose allocator, which is not suitable for production use.

Please read the previous links. This was directly addressed.

> And FWIW Zig has *nothing at all* to detect stack UAF.

The language is still not 1.0 yet. Here is the issue where use-after-free stack allocations will be addressed:
https://github.com/ziglang/zig/issues/3180

I am sharing these links in good faith. Please take the appropriate time to process them.

Proper Community Engagement

Posted Aug 31, 2024 9:04 UTC (Sat) by roc (subscriber, #30627) [Link] (1 responses)

> I have provided numerous links and references. Please make use of them.

Yes, I checked all your links (and more) before I responded, and the information in those links is consistent with what I said. Your new link is also consistent with what I said.

> The general-purpose allocator is configured with never_unmap set to false by default. Only when set to true is there a possibility that every allocation can be leaked. Again, this allocator can be used for production release modes.
> https://www.openmymind.net/learning_zig/heap_memory/

This doesn't support your claims. It doesn't mention never_unmap or describe exactly what GeneralPurposeAllocator guarantees or say what the performance impact is. In fact those things are not documented anywhere outside the source code, as far as I can tell; certainly nowhere you have linked to. So I find your attitude irksome.

But in the spirit of https://xkcd.com/386, I went ahead and looked at the source code: https://github.com/ziglang/zig/blob/master/lib/std/heap/g.... The situation is pretty complicated, but as far as I can tell, the following things are true in the default configuration with ReleaseSafe build mode:
-- GeneralPurposeAllocator reuses addresses of allocated objects in many cases and does allow UAF bugs in those cases
-- GeneralPurposeAllocator avoids reusing addresses of allocated objects in other cases, and therefore must suffer from serious fragmentation problems leading to performance degradation
-- GeneralPurposeAllocator lacks a lot of the optimizations essential to modern high-performance allocators and cannot be expected to be competitive with those allocators

In more detail: the source code comments say:
> //! ### `OptimizationMode.debug` and `OptimizationMode.release_safe`:
...
> //! * Do not re-use memory slots, so that memory safety is upheld. For small
> //! allocations, this is handled here; for larger ones it is handled in the
> //! backing allocator (by default `std.heap.page_allocator`).

It's not entirely clear from the comment what "memory slots" means here, but looking at the code it's more clear: https://github.com/ziglang/zig/blob/37df6ba86e3f4e0f5d6a2...
GPA passes objects >= the page size through to the underlying page-level allocator (which defaults to std.heap.page_allocator). For smaller objects, it has a list of page-sized "buckets" per size class, with each bucket carved up into "slots" of that size class. To allocate an object, it finds a bucket for the right size class with empty space, and uses the *next slot in the bucket that has never been used*. I.e., even if some objects in the bucket have been freed, those slots won't get used. This is what inevitably leads to fragmentation and poor cache usage.

However, when freeing a sub-page-size object, if that means the bucket is *completely empty*, the entire bucket is freed: https://github.com/ziglang/zig/blob/37df6ba86e3f4e0f5d6a2...
With the default PageAllocator and GPA configuration, this memory is returned to the operating system. The next time GPA needs to allocate a page, PageAllocator will likely return memory at the same address. GPA will hand out references to that memory and so classic UAF bugs can then be triggered --- references through a pointer to the freed object will work but access memory in the freshly allocated object, perhaps of a different type.

Note that the comment "for larger ones it is handled in the backing allocator" is false for the default PageAllocator. There is no logic to avoid reusing virtual memory in that code. So virtual memory isn't leaked, but in exchange you get UAF bugs.

For allocations >= the page size, you enable UAF bugs immediately because PageAllocator can reuse addresses.
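The sequence described above can be illustrated with a toy model. To be clear about the assumptions: this is not Zig's code, all names (`ToyPages`, `ToyAllocator`, `demo`) are made up, there is a single size class with four slots per bucket, and the page allocator is assumed to hand back the most recently freed page first, standing in for "PageAllocator will likely return memory at the same address".

```rust
const SLOTS_PER_BUCKET: usize = 4;

/// An "address" in the toy model: (page number, slot index).
type Addr = (usize, usize);

/// Toy page allocator: reuses the most recently freed page first (assumption).
struct ToyPages {
    free: Vec<usize>,
    next_fresh: usize,
}

impl ToyPages {
    fn alloc(&mut self) -> usize {
        if let Some(page) = self.free.pop() {
            return page;
        }
        let page = self.next_fresh;
        self.next_fresh += 1;
        page
    }

    fn free(&mut self, page: usize) {
        self.free.push(page);
    }
}

struct Bucket {
    page: usize,
    next_unused: usize, // slots below this index have been handed out once
    live: usize,        // slots currently allocated
}

struct ToyAllocator {
    pages: ToyPages,
    buckets: Vec<Bucket>,
}

impl ToyAllocator {
    fn new() -> Self {
        ToyAllocator {
            pages: ToyPages { free: Vec::new(), next_fresh: 0 },
            buckets: Vec::new(),
        }
    }

    /// Hands out the next never-used slot; freed slots are not reused.
    fn alloc(&mut self) -> Addr {
        if let Some(b) = self
            .buckets
            .iter_mut()
            .find(|b| b.next_unused < SLOTS_PER_BUCKET)
        {
            let slot = b.next_unused;
            b.next_unused += 1;
            b.live += 1;
            return (b.page, slot);
        }
        let page = self.pages.alloc();
        self.buckets.push(Bucket { page, next_unused: 1, live: 1 });
        (page, 0)
    }

    /// Frees a slot; a completely empty bucket gives its page back.
    fn free(&mut self, addr: Addr) {
        let idx = self
            .buckets
            .iter()
            .position(|b| b.page == addr.0)
            .expect("address does not belong to a live bucket");
        self.buckets[idx].live -= 1;
        if self.buckets[idx].live == 0 {
            let bucket = self.buckets.remove(idx);
            self.pages.free(bucket.page);
        }
    }
}

/// Returns (stale address, third allocation, allocation made after the
/// bucket was released).
fn demo() -> (Addr, Addr, Addr) {
    let mut a = ToyAllocator::new();
    let stale = a.alloc();
    let second = a.alloc();
    a.free(stale);
    let third = a.alloc(); // skips the freed slot
    a.free(second);
    a.free(third); // bucket now empty: its page is released
    let fresh = a.alloc(); // lands on the released page
    (stale, third, fresh)
}

fn main() {
    let (stale, third, fresh) = demo();
    println!("stale = {stale:?}, third = {third:?}, fresh = {fresh:?}");
}
```

In this model the slot freed first is never reused while its bucket is alive, but once the bucket empties and the page is recycled, the stale address and the fresh allocation coincide, which is the classic UAF shape.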

This Reddit thread agrees with me that "UAF will not be reliably detected": https://www.reddit.com/r/Zig/comments/1eysv2k/general_pur...

andrewrk did write: https://github.com/ziglang/zig/issues/3180#issuecomment-6...
> Use after free is now solved as far as safety is concerned with heap allocations, if you use the std lib page_allocator or GeneralPurposeAllocator.
And maybe you believed him, but unfortunately that's not true, or at least it's not true now.

> I am sharing these links in good faith. Please take the appropriate time to process them.

Oh, I've spent way too much time processing this.

> Here is the issue where use-after-free stack allocations will be addressed: https://github.com/ziglang/zig/issues/3180

Will it really, though? It is not easy at all to detect and prevent stack UAF bugs. For example, if you take the address of a stack value and store it in a heap object, is that allowed? It's often useful to be able to do that, so I guess Zig would want to allow it, but if you allow it, it can be very difficult to prove that the pointer is not used after the function has returned. If they do solve this, there will be some overhead and/or some existing useful Zig code that is no longer legal.
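For comparison, a small sketch of how safe Rust treats that exact pattern (the `Holder` type and function name are hypothetical): a heap object may hold the address of a stack value, but only for as long as that value is alive, and this is checked at compile time with no runtime overhead.

```rust
/// A heap-allocatable object that stores the address of some other value.
struct Holder<'a> {
    value: &'a i32,
}

/// Allowed: the boxed `Holder` never outlives `local`.
fn use_within_scope() -> i32 {
    let local = 41;
    let boxed: Box<Holder<'_>> = Box::new(Holder { value: &local });
    *boxed.value + 1
}

// Rejected by the borrow checker if uncommented: the returned box would
// hold a pointer into a stack frame that no longer exists.
//
// fn escape() -> Box<Holder<'static>> {
//     let local = 41;
//     Box::new(Holder { value: &local })
// }

fn main() {
    println!("{}", use_within_scope());
}
```

The cost is the one mentioned above: some code that would be fine at runtime is not accepted, because the compiler cannot prove it.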

For now, safe Rust prevents UAF and all Zig has is partial, inefficient solutions and promises.

Proper Community Engagement

Posted Aug 31, 2024 13:34 UTC (Sat) by corbet (editor, #1) [Link]

This language-advocacy discussion has gone fairly far from the original topic and seems unlikely to resolve anything. This seems like a good time to let it go.

Proper Community Engagement

Posted Aug 31, 2024 5:32 UTC (Sat) by pbonzini (subscriber, #60935) [Link]

> I've already placed my wager

https://lwn.net/Articles/863459/ has two unsafe blocks, both at initialization time, and that was three years ago. It's a simple driver, sure, and some sources of unsafety are only apparent with e.g. devices that do DMA, but rest assured that the developers have done their homework.

Proper Community Engagement

Posted Aug 31, 2024 14:32 UTC (Sat) by kleptog (subscriber, #1183) [Link] (7 responses)

> Specifically regarding the content of the article on this page, Unsafe Rust is inevitable in the Linux kernel and yet Rust was designed to make Unsafe Rust undesirable and difficult.

I've never understood this argument. Saying "Rust is a bad choice for the kernel because you need unsafe sometimes" is the same as "C is a bad choice for the kernel because you need ASM blocks sometimes". Except nobody complains about the number of ASM blocks in the kernel.

It would be even worse if the option wasn't there. Then it would definitely have been a non-starter.

Proper Community Engagement

Posted Aug 31, 2024 15:08 UTC (Sat) by rc00 (guest, #164740) [Link] (4 responses)

You don't understand how using a language for its intended purpose makes sense?

Do you remember the earlier days of this years-long hype cycle for Rust? "No one should write Unsafe Rust, that defeats the entire purpose."

No responsible person writing C ever said that Assembly should be avoided at all costs. There was a time that C was considered a higher-level programming language, especially when viewed through the lens of something like Assembly. In the same way that Python is today's mainstream higher-level language, there are still plenty of internal components for it written in C. Pairing higher-level languages with lower-level languages is far from novel or frowned upon.

Rust is explicitly not the right tool for the job and this is based on what some of the largest Rust proponents have said in their own words. However, the argument against Unsafe Rust by Rust proponents was discarded to prioritize Rust being thrust into the Linux kernel project. "Rust by any means," so to speak. This strategic and constant shifting of the goalposts is the very reason there is a groundswell of intelligent people ready to move on from Rust, its related hype, and many of the related individuals. We went through this with Haskell, and Scala, and so on.

Proper Community Engagement

Posted Sep 1, 2024 1:57 UTC (Sun) by roc (subscriber, #30627) [Link]

> Do you remember the earlier days of this years-long hype cycle for Rust? "No one should write Unsafe Rust, that defeats the entire purpose."

An important part of Rust has always been building safe abstractions around unsafe code. See e.g. https://smallcultfollowing.com/babysteps/blog/2016/05/23/...
That's from 2016. The Rust-for-Linux project has been following that exact playbook. Nothing has changed.

If you heard "no-one should write unsafe Rust" then someone was very confused, and I suspect it's you.

Proper Community Engagement

Posted Sep 1, 2024 19:05 UTC (Sun) by kleptog (subscriber, #1183) [Link] (2 responses)

> "No one should write Unsafe Rust"

The irony is, if you type this phrase in Google with quotes, it returns exactly one match: your comment. So it's not surprising I've never heard of it before.

> this is based on what some of the largest Rust proponents have said in their own words.

to which I can only say [citation needed]. Because I spent some time looking but couldn't find any explicit instances, just more claims that these people did exist.

Proper Community Engagement

Posted Sep 1, 2024 20:18 UTC (Sun) by rc00 (guest, #164740) [Link] (1 responses)

The irony is that if you were alive during the late 2010s, you would already know. You spent time looking but couldn't find blogs or Reddit posts that were either scrubbed at the time or scrubbed since? Have you ever heard of Actix(-web)? Read what little is left online of the toxic community's discourse during that time.

You're surprised that what has become an inferior search engine can't find the hordes of toxic examples from an even more toxic community known to suppress and try to rewrite reality and then history, and then gaslighting whenever adverse topics are brought up, insinuating that they're fabricated. No, this is not a thread I'm going to continue with you or anyone else.

Good thinking

Posted Sep 1, 2024 21:17 UTC (Sun) by corbet (editor, #1) [Link]

Indeed — please do not continue this thread any further. We do not need this kind of personal attack here.

Proper Community Engagement

Posted Sep 2, 2024 21:55 UTC (Mon) by anthm (guest, #173221) [Link] (1 responses)

"Rust needs unsafe blocks, therefore it's no different than a language that is just unsafe everywhere" comes from the same place as "not all bugs are memory bugs, so memory safety is pointless".

Proper Community Engagement

Posted Sep 3, 2024 10:08 UTC (Tue) by mbunkus (subscriber, #87248) [Link]

Yeah, this is an argument I hear quite often. Here's a comment someone made to me a couple of days ago:

> Rust is no more secure than C/C++. Memory ownership is an important part of security BUT not the only one. That is simple problem with most folks who talk about rust being the magic bullet; it is not. Memory safety might feel like a hot issue right now but in 5 years there will be another one; even with moving to rust.

I find this to be completely ludicrous, given examples such as Microsoft's observation that 70% of their observed security vulnerabilities are, in fact, memory safety issues. I wonder if these types of comments come from malice or ignorance. I'm inclined to believe it's the latter.

Proper Community Engagement

Posted Aug 31, 2024 9:36 UTC (Sat) by Wol (subscriber, #4433) [Link]

> When you're publishing scientific papers, programming errors can have huge impact --- not just on the results, but on whoever depends on those results --- and using a language that catches more errors has huge value that in many cases would justify the extra effort required.

Two cases in point - the entire diet industry is founded on a paper that concluded "fat is bad for you". That has now been thoroughly debunked. The alternatives pushed by the diet industry are far worse (they replaced fat with sugar, which has pretty much *caused* the current obesity, cancer and diabetes epidemics. Then there's the margarine and trans-fats which is probably behind all the heart disease).

And read Feynman, where he spent ages trying to work out why his sub-atomic-particle charges didn't add up. Then he thought "if the charge on this particle is wrong ... AHA - I remember seeing the proof and thinking it was screwy!!!". So he looked at it, and if he deleted just one outlier result, the proof flipped the charge! So we have 20 years of atomic research possibly derailed by just one paper where the author didn't understand statistics!

That's also why I hate seeing newspaper headlines "A new study has proved ..." - anybody who knows science knows that ALL new studies PROVE NOTHING. It's the REPEAT studies that prove the first one was correct! Unfortunately, it's very difficult to get funding to repeat a study ... and getting back to topic - it's getting better but how many studies cannot be repeated because the software required cannot be found or reliably recreated?

Cheers,
Wol