Thoughts and clarifications

Posted Sep 4, 2024 16:46 UTC (Wed) by asahilina (subscriber, #166071)
In reply to: Thoughts and clarifications by wtarreau
Parent article: Whither the Apple AGX graphics driver?

> There *might* possibly be such people but quite frankly I doubt it.

Well, there was Ted Ts'o's rant at Wedson and the others that started with an accusation of wanting to "convert" people to the Rust "religion", followed by a pile of strawman arguments, followed by another person making jokes comparing Rust to Java and more strawmen.

It may not be something that is expressed in the open on the mailing lists daily, but at this point it has become quite clear to me that those people do in fact exist.

This should in fact not be surprising, because those anti-Rust people who cling to C absolutely do exist on the internet (you will meet many of them on Twitter where masks are off much of the time, or you can go to the Phoronix comment section for an extra dose of toxicity if you want). It is logical that at least some of these people would have ended up in Linux maintainer positions.

Thoughts and clarifications

Posted Sep 4, 2024 17:00 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (112 responses)

But can you at least put yourself in the shoes of someone suddenly pressured to change everything that they're trying hard to keep in good working condition ? Has it ever happened to you ? When you start to apply blind changes that you don't sense well, and that for the next 10 years, once a month you have to work several days on a bug that is a direct consequence of a rushed choice that you later regret. Some parts do require deep thinking and the thinking must be made with multiple people.

Telling people "ah you don't want my work, so you're against me and my preferences" is horrible. It simply denies people the right to *think*. When you see people that are overly confident in their choices and that are unable to step back and think forward about possible evolutions and consequences in 1 year, 3 years, 5 years etc, and you're still dealing regularly with the consequences of some 10-year old choices, what can you think except "that person pressuring me is lacking experience and refuses to listen to my concerns".

To be honest, when I read your long message, I was a bit scared by the extreme confidence you have in the quality of your own work and your ability to spot many other "older" areas that are wrong. I've worked with people doing that in the past. They'd stay 3 years for the time it takes to transform a forest into a desert, then they suddenly quit without warning because "it's impossible to work with a team that constantly rejects my art" yet the ones in place have to deal with the consequences.

It's important to be able to argue (and sometimes strongly) with others around technical arguments, concerns, difficulties, but beliefs and personal preferences may never be used to try to disprove the other party. That just creates a bias and puts a discussion to a stop, resulting in nothing good. The only possibility that remains then is to just make a lot of noise, saying "look what they do to me", but sometimes it's even quite visible and doesn't do good service to any of the parties.

And I agree with you on one point: mailing lists are not suitable for great explanations. They're fine to exchange points of views, ideas, patches, code reviews or suggestions, but when it becomes a matter of culture or approach, discussions in person are absolutely needed. And if possible with few people so that the discussion can heat a bit if that helps, without anyone having to witness that.

Changing something that works is extremely difficult. Changing it without breaking it is even more difficult. Changing it in a way that guarantees that it will still be possible to change it later is the most difficult. This requires experience and honest cooperation, not bold accusations nor criticisms of everything.

Thoughts and clarifications

Posted Sep 4, 2024 17:13 UTC (Wed) by asahilina (subscriber, #166071) [Link] (32 responses)

> But can you at least put yourself in the shoes of someone suddenly pressured to change everything that they're trying hard to keep in good working condition ? Has it ever happened to you ? When you start to apply blind changes that you don't sense well, and that for the next 10 years, once a month you have to work several days on a bug that is a direct consequence of a rushed choice that you later regret. Some parts do require deep thinking and the thinking must be made with multiple people.

Now we're back to the same strawman arguments... the Rust team aren't asking the C maintainers to "change everything" or practically anything. If anything I've been finding bugs in drm_sched and its users in C code as part of my work. Me not doing this work would have led to the C folks having to debug those things. And Wedson wasn't asking FS developers to change anything in C, he was asking FS developers to help understand the existing C API so it could be mapped to Rust properly.

Of all the Rust abstractions I've written drm_sched was the exception, the one case where the existing design was so bad I really needed to improve it. Those changes were completely non-intrusive, the patch wasn't even 50 lines of code and did not change the API for existing drivers in any way.

You're making it sound like us Rust folks are pushing through rushed, questionable changes to C code just because we "need" it and that just isn't true, at all.

> Changing something that works is extremely difficult. Changing it without breaking it is even more difficult. Changing it in a way that guarantees that it will still be possible to change it later is the most difficult. This requires experience and honest cooperation, not bold accusations nor criticisms of everything.

None of the changes I asked for negatively impacted existing correct code, that was trivial to see since they only affected conditions which supposedly were previously not allowed to happen. They definitely didn't break anything. They definitely didn't introduce some horrible maintenance burden or touch things deep in the architecture. It was just some cleanup code and cloning some debug strings so they weren't dangling pointers. That's it.

Every time people excuse the C maintainers' reaction to Rust it's always strawmen. Things that aren't true. Things they wish were true so they could discount and attack Rust, but which aren't, so they just pretend they are.

And I and the other Rust people are getting really, really, really tired of this nonsense.

> To be honest, when I read your long message, I was a bit scared by the extreme confidence you have in the quality of your own work and your ability to spot many other "older" areas that are wrong.

Well, I'm pretty confident that drm_sched is buggy because I got oops reports from users and had to debug it, multiple times. And so far I haven't had any oops reports in my own driver code from users. So there's that...

Thoughts and clarifications

Posted Sep 4, 2024 17:37 UTC (Wed) by Wol (subscriber, #4433) [Link] (8 responses)

> And I and the other Rust people are getting really, really, really tired of this nonsense.

Welcome to the club.

I'm known for being very anti-Relational. Well, why wouldn't I be, when I seem to spend most of my time waiting for Snoracle and the like to find data.

Relational is O(log(n)) which means searching 100K rows is O(5). Yes it's a guaranteed O(5), but I can provide a 5 nines guarantee my database will be faster (it's O(1) nearly all the time - IRREGARDLESS of the size of the data set).

It's a tradeoff. Do you want an "almost always 80% faster response", or a "guaranteed to get a result but all of them will be slow".

People can't understand why you want "it's impossible to specify a faulty memory operation", people can't understand why I want a database that returns data almost faster than I can hit "return" ...

Cheers,
Wol

Slow down a little

Posted Sep 4, 2024 17:54 UTC (Wed) by corbet (editor, #1) [Link] (1 responses)

Wol, slow down please? Perhaps we don't need to drag this stuff into every unrelated discussion? Please?

Slow down a little

Posted Sep 4, 2024 19:59 UTC (Wed) by Wol (subscriber, #4433) [Link]

Sorry Jon. I think I go in phases, the problem is I'm so frustrated at the moment fighting Excel I'm probably over-compensating elsewhere.

I do feel for Asahi, though, because I do my best to explain things and feel that people just don't want to understand.

I'll do my best to back off. Part of the trouble is that (especially as I'm in the wrong time zone for most people) that seriously gets in the way of a good discussion. Maybe that's a good thing. Stops me getting really out of hand :-)

Cheers,
Wol

Thoughts and clarifications

Posted Sep 4, 2024 17:57 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (1 responses)

> It's a tradeoff. Do you want an "almost always 80% faster response", or a "guaranteed to get a result but all of them will be slow".

I love this example because it's often a difficult choice that changes over time. For example those who pay for the CPU time will most of the time prefer "almost always 80% faster responses". And those who pay strong penalties on SLA will often prefer "guaranteed time but slower". And it happens that these choices are taken in a product at a moment, the product's area of popularity evolves over time, it's good, it moves to other areas and then the initial better choice is suddenly considered as an absurd one. Just because where it acts, the default preference is different.

Thoughts and clarifications

Posted Sep 22, 2024 17:18 UTC (Sun) by Rudd-O (guest, #61155) [Link]

At a certain nameless company that was very, very big and was running, at least in my team, a very, very big, private GPU compute cluster, they turned off, by mandate, all of the Spectre and Meltdown protections. Why? Well, they were paying for the compute time and the electricity, and it was a lot of money. And turning these off meant the workloads that would often take weeks or months would finish days earlier. And sometimes days is what you have before a product launch (or a tapeout, same thing from our vantage point).

I am not entirely sure, however, how this connects to the conversation about Rust in the kernel. Rust is focused on providing what they call zero-cost abstractions. That is, all the type safety and niceties that you see that some people like, some people hate, I myself hated them before... They don't cost anything. You don't pay for them in compute time. You don't pay for them in electricity. At least not during execution. They're all handled at compile time.

The cost you pay for this is not measured in compute or electricity. The cost you pay for this is that you develop somewhat (a lot!) slower at the beginning. The compiler is constantly fighting you, because you're doing things that you find natural and normal with your experience level in whatever language you were using before you started using Rust, where you naturally have less experience at the beginning. These are things that Rust makes impossible by means of the borrow checker in the compiler.

It's an important cost; you wouldn't want to prototype something rapidly in Rust, especially if you don't have a lot of experience with it, because it's just gonna take you much longer to finish the prototype. But it is a cost that is certainly worth paying when what you're doing is going to be used by potentially millions or hundreds of millions of people and some of those people are going to be very interested in seeing how they can exploit that so that they can harm others.

There are definitely trade-offs to be made, but I don't think the Linux kernel as a whole should be making the trade-off of "just keep doing what we were doing, mang, it's gonna be fine", because it really is not.

To be perfectly honest, one can write subpar, suboptimal, slow, and even crashy Rust code. Cough .clone().unwrap() cough. But that is an argument that applies equally to every language. In fact, more to other languages where it's actually easier to write all of those things. Because the language will help you do it.
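For readers who have not run into the pattern being coughed at, here is a minimal sketch of what it tends to look like next to a more careful version. The function names and data are made up for illustration; the only real APIs used are `Vec::clone`, `slice::first` and `Option::unwrap` from the standard library.

```rust
/// Prototype style: copies the whole vector just to read one element,
/// then panics if the vector happens to be empty.
fn first_len_crashy(items: &Vec<String>) -> usize {
    items.clone().first().unwrap().len()
}

/// More careful style: borrows instead of cloning, and reports the
/// empty case to the caller as `None` instead of panicking.
fn first_len(items: &[String]) -> Option<usize> {
    items.first().map(|s| s.len())
}

fn main() {
    let names = vec![String::from("ab"), String::from("cde")];
    let empty: Vec<String> = Vec::new();

    println!("{}", first_len_crashy(&names));
    println!("{:?}", first_len(&names));
    println!("{:?}", first_len(&empty));
    // first_len_crashy(&empty) would panic at run time: the compiler
    // accepts it, which is the point being made above.
}
```

Both versions compile; the language only makes the risky one visible (`unwrap`) and wasteful (`clone`), it does not forbid it.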

Thoughts and clarifications

Posted Sep 22, 2024 17:08 UTC (Sun) by Rudd-O (guest, #61155) [Link] (3 responses)

> People can't understand why you want "it's impossible to specify a faulty memory operation",

Not everyone can't understand that. But some can...

Rust (its "safe" part, at least) as a language is not a magical, can't crash your car, safety system. But it does provide certain niceties like, for example, seat belts. Or, you can't shift into park or reverse while you're rolling forward in drive. Or, it won't start until you've depressed the brake pedal. Or, backup cameras. Or, LATCH baby seat anchors. Or, airbags.

C's the ratrod.

Analogies aside, I am actually quite surprised, after only two years of programming in Rust, that it is almost impossible to deliberately cause an invalid memory access bug in Rust, unless you are using "unsafe" or you are using bindings to a C library which itself has a bug. Before that, I didn't even know that such a thing was possible. I thought the only way to get a memory-safe system is to have a garbage-collected system that just tracks pointers for you and you never have to think about that. And of course, never call into a C library.

I strongly suspect that the entire world of systems development, especially embedded and low-level programming, is going to experience a shift that in 20 years, or maybe less, will have most people wondering, why would anyone use C? With roughly the same amount of puzzlement as you would find yourself in, if you heard someone asking for a car with a steering wheel. Obviously every car has a steering wheel. That's just not something you ask.

The borrow checker as a significant advance

Posted Sep 23, 2024 9:03 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

Arguably, the biggest advance Rust brings is the borrow checker; it's hugely intrusive, but it allows Rust to formally verify that you're paying attention to whether or not a given pointer (in Rust's case, a class of pointers called "references") points to a valid place at time of use, without runtime assistance (e.g. a garbage collector).

Combine that with "either shared, or mutable, but not both", and you get a powerful tool for making it easier to reason about your code's behaviour.
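A small sketch of both rules, for anyone who has not seen them in action. Everything here (`longest_name`, the strings) is invented for the example; the commented-out lines are the ones the compiler would refuse.

```rust
/// The returned reference borrows from `names`, so the compiler will not
/// let it outlive the slice it points into.
fn longest_name(names: &[String]) -> Option<&String> {
    names.iter().max_by_key(|n| n.len())
}

fn demo() -> (usize, usize, usize) {
    let mut names = vec![String::from("drm"), String::from("sched")];

    // Any number of shared (read-only) borrows may coexist.
    let first = &names[0];
    let longest = longest_name(&names).expect("vector is not empty");
    let lens = (first.len(), longest.len());

    // Rejected at compile time if uncommented (error E0502): `push` needs
    // a mutable borrow while the shared borrow `first` is still in use.
    // names.push(String::from("fence"));
    // println!("{first}");

    // The shared borrows are no longer used, so mutation is allowed again.
    names.push(String::from("fence"));
    (lens.0, lens.1, names.len())
}

fn main() {
    println!("{:?}", demo());
}
```

In C the equivalent `push` could reallocate the array and leave `first` dangling; here the check happens entirely at compile time, with no garbage collector involved.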

The borrow checker as a significant advance

Posted Sep 23, 2024 11:24 UTC (Mon) by intelfx (subscriber, #130118) [Link]

Another hugely impactful thing is Rust's linear/affine type system. This one has actually changed the way I think about variables, objects and resources in _any_ language.
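As a rough illustration of the affine part ("a value can be used at most once"), consider the sketch below. `Token`, `Submitted` and `submit` are hypothetical names for this example, not taken from any real API.

```rust
/// A resource that must not be used twice.
struct Token {
    id: u32,
}

/// Proof that a token was consumed.
struct Submitted {
    id: u32,
}

/// Takes the token by value: the caller gives up ownership here.
fn submit(token: Token) -> Submitted {
    Submitted { id: token.id }
}

fn demo() -> u32 {
    let token = Token { id: 7 };
    let done = submit(token);

    // Rejected at compile time if uncommented (error E0382, use of moved
    // value): `token` was moved into the first call.
    // submit(token);

    done.id
}

fn main() {
    println!("{}", demo());
}
```

Once you get used to "passing it on means you no longer have it", the same discipline is easy to apply by convention to file handles, locks or buffers in other languages, even where no compiler enforces it.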

The borrow checker as a significant advance

Posted Sep 25, 2024 9:54 UTC (Wed) by Rudd-O (guest, #61155) [Link]

Oh, 100% right! Ownership checking and the prevention of pointer aliasing are awesome.

Thoughts and clarifications

Posted Sep 4, 2024 17:53 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (21 responses)

> Now we're back to the same strawman arguments...

I'm starting to sense how it can be difficult to have a technical-only discussion with you...

> the Rust team aren't asking the C maintainers to "change everything" or practically anything. If anything I've been finding bugs in drm_sched and its users in C code as part of my work. Me not doing this work would have led to the C folks having to debug those things.

I simply don't know because it's not my code. What I'm trying to explain is that it's very common when coming with a 2-line patch to reveal a deeper problem that needs more work, and *that* work can have serious long-term consequences, and as a result requires more thinking. Then the problem is that the patch author can feel like "they're rejecting my trivial patch, it's just because they hate me" while that trivial patch is in fact a brown paper bag over a deeper problem. This has happened to me many times and I have myself caused frustration many times to contributors who revealed such problems in my code. And believe me, a 2-line patch can end up with 6 months of work to redo lots of things differently. And in my project I don't have to deal with ABI compatibility issues nor stuff like this. That's why I'm saying that I understand to some extent when this can happen. I'm not saying this is the case but the way you seem to instantly map a patch rejection to "C vs Rust", I really don't like it because such a form of victimization has already hurt the Rust community a lot in my opinion. I mean, this is a perfect example that will make me more careful later about discussions around this language, for fear of entering a maelstrom of false accusations.

> And Wedson wasn't asking FS developers to change anything in C, he was asking FS developers to help understand the existing C API so it could be mapped to Rust properly.

Maybe, I honestly don't know. But similar things happened to me in the past where some people asked me levels of details about my code that I simply didn't have, because the code is the doc when it comes to APIs, so I couldn't do more than reading it again and it takes a lot of time, plus I feel like I'm just wearing my eyes so that the requester can spend his time playing candy crush, which is not really cool.

> Of all the Rust abstractions I've written drm_sched was the exception, the one case where the existing design was so bad I really needed to improve it. Those changes were completely non-intrusive, the patch wasn't even 50 lines of code and did not change the API for existing drivers in any way.

Possibly, I don't know. But it's not the size of the change that matters when you're getting close to the core, it's the impacts and what the change reveals. You said yourself that you've found deep problems with the current API and that it's totally unsafe even for C. Did you ever think that the maintainer himself probably doesn't trust this code at all anymore and is not willing to permit more code to get more intimate with it ?

> You're making it sound like us Rust folks are pushing through rushed, questionable changes to C code just because we "need" it and that just isn't true, at all.

No, I'm not saying that it's what happens, I'm saying that you need to understand that it may be perceived like this sometimes by the person whom you're asking to accept the change, regardless of any language. It would be the same from C to C as well. The language has nothing to do there, yet you're bringing it on the table all the time as the supposed reason for your work not being accepted.

> None of the changes I asked for negatively impacted existing correct code, that was trivial to see since they only affected conditions which supposedly were previously not allowed to happen. They definitely didn't break anything. They definitely didn't introduce some horrible maintenance burden or touch things deep in the architecture. It was just some cleanup code and cloning some debug strings so they weren't dangling pointers. That's it.

Possibly. At least that's your analysis and I totally trust you that it's the intent. Sometimes for a maintainer, opening the possibility that some unsafe code is easier to use means more problems in the future, especially when trying to replace it. I don't know the details, but sometimes that can be an explanation as well.

> Every time people excuse the C maintainers' reaction to Rust it's always strawmen. Things that aren't true. Things they wish were true so they could discount and attack Rust, but which aren't, so they just pretend they are.

Languages again again again... "it's not us who started first it's them!" Please! Maybe if you tried to reach out to people to fix generic API bugs without presenting yourself as representing a language team and putting them by default in the supposed other language one they would be more confident in your motivations ? What do you have against C that you want to put everything non-rust in it and present it as a perpetual enemy ? It looks like pure politics where there's no place for facts nor technical excellence. We could go on with all the other languages in the kernel at this game.

> Well, I'm pretty confident that drm_sched is buggy because I got oops reports from users and had to debug it, multiple times.

Oh I've never questioned that, I totally trust you on that one. What I'm saying is that some maintainers might prefer to keep (for some time) bugs that are understood by them and impossible to trigger under some conditions rather than risk more complicated ones. Many of us had to run through such choices over time. They're unpleasant, they make you look like the dirty coder in that corner over there, and you know that at some point you need to address them. Sometimes you just can't have enough time on the table to deal with them, and you figure it's even harder to bring someone up to speed on them, so you're waiting to find the best person for the task and it can take years. But such rare persons definitely don't start by placing people in boxes with a language name written on the side :-/

Thoughts and clarifications

Posted Sep 4, 2024 18:00 UTC (Wed) by daroc (editor, #160859) [Link]

> I'm starting to sense how it can be difficult to have a technical-only discussion with you...

[And several other parts on similar lines.]

Please do avoid personal attacks in the comments. I think you may be talking past each other a bit; it's easy for all the attention to end up focused on the times when things go wrong, instead of the times when things go right, and that can make it hard to be charitable. Consider leaving this thread of comments here.

Thoughts and clarifications

Posted Sep 4, 2024 19:43 UTC (Wed) by rywang014 (guest, #167182) [Link] (12 responses)

If a maintainer finds some seemingly trivial patch may lead to big consequences and therefore requires a second look, I guess the best response should not be "Well completely NAK" but some constructive discussions about the issue and the code the patch changes.

Thoughts and clarifications

Posted Sep 4, 2024 20:11 UTC (Wed) by wtarreau (subscriber, #51152) [Link]

I totally agree. There can be bad reasons why this was not done here (80th time asked the same thing, bad mood, tired etc) but that would have at least deserved a follow-up later to detail the reasons.

Thoughts and clarifications

Posted Sep 4, 2024 20:26 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (10 responses)

Indeed, NAK is not constructive and the maintainer acknowledged that. There was also a proposal (part tongue in cheek, part not) that replying with "NAK" would lead to complete blocking of your patches[1] until you document the reason.

[1] the patches of the guy who NAKs

Thoughts and clarifications

Posted Sep 5, 2024 0:59 UTC (Thu) by Ashton (guest, #158330) [Link] (6 responses)

Putting my management hat on, persistent NAKs is a red flag that something’s going on. Either the person in question is overloaded and cannot handle more, or they’re being uncooperative and need a stern talking to. Either way, it’s a clear sign that someone with more authority needs to step in and do something.

Thoughts and clarifications

Posted Sep 5, 2024 2:29 UTC (Thu) by viro (subscriber, #7872) [Link] (5 responses)

Charming. Could you explain who you are and where do you work? Just to make sure I never end up anywhere near you in your managerial role...

Thoughts and clarifications

Posted Sep 5, 2024 2:44 UTC (Thu) by viro (subscriber, #7872) [Link]

PS: ... and I'd love to hear that it's _not_ anywhere near airspace - not considering "no" even a theoretically valid answer, no matter what can be annoying in software, but there's annoyance and then there's what this kind of attitude had produced on that concall in January 1986...

Thoughts and clarifications

Posted Sep 5, 2024 7:23 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

I mean, he did provide a very charitable explanation (which I agree with). Even "You're doing the same thing that has already been NAKed. Please reread my comments from before and tell me what wasn't clear" is better than nacking without a comment.

I've gotten my share of stern NAKs from you, but you've always been extremely constructive and explained what I was doing wrong. But if there's no explanation whatsoever, it is not a win for anyone.

Thoughts and clarifications

Posted Sep 5, 2024 12:08 UTC (Thu) by Ashton (guest, #158330) [Link]

If you find the idea that a manager should step in and figure out why someone is NAKing without feedback offensive, then the feeling is extremely mutual.

Code rejections should come with clear actionable feedback, and it shouldn’t be the requesters job to extract that feedback from the reviewer. Even “I see this and need more time to review” is far more useful than an unexplained “no”.

This kinda behavior is totally fine if it’s an occasional thing, nobody is perfect. But if it’s common then it’s a sign something is up and someone with authority needs to step in and figure out why it’s happening.

Thoughts and clarifications

Posted Sep 5, 2024 12:41 UTC (Thu) by Wol (subscriber, #4433) [Link]

So you're quite happy with more and more straws being loaded on YOUR back?

I'm bitching slightly, I know, but this is the perfect example of people not stepping back and thinking. The GP said "something is going wrong". If your boss comes to you and asks - politely - "what's happening", and you respond "look at the size of my intray!", wouldn't you appreciate your boss going through it and saying "this is important, that's not, what do you think of the other?"

If that's what the GP meant - and I'm sure it is - he's exactly the sort of boss I would like! Yes, you're going to have to piss some people off, but at least you know your boss has got your back.

Cheers,
Wol

Thoughts and clarifications

Posted Sep 5, 2024 16:50 UTC (Thu) by rsidd (subscriber, #2582) [Link]

Based on that comment (and past history) I would focus on making sure viro is never in my orbit.

Sorry if this comment violates lwn standards. But I remember Linus stepping back for a bit to examine his behaviour and learning to be better. I don't think he was by any means the worst offender and if by publicly reassessing his interactions he was hoping to set an example that others would follow, well, nice try.

Thoughts and clarifications

Posted Sep 5, 2024 3:11 UTC (Thu) by viro (subscriber, #7872) [Link]

Depends. Usually some explanation of a NAK is called for, but e.g. if it's a large series that keeps reposted with objections quietly ignored, at some point plain NAK is the only possibly reply - some kinds of persistence really should not be rewarded. And then there's Markus and similar special cases. Or trivial and obviously _in_correct patch (that overlaps with the previous group, though).

Thoughts and clarifications

Posted Sep 12, 2024 3:39 UTC (Thu) by milesrout (subscriber, #126894) [Link] (1 responses)

He did give a reason! Just because Jon only quoted the NAK doesn't mean it is all he said.

Thoughts and clarifications

Posted Sep 13, 2024 14:32 UTC (Fri) by MrWim (subscriber, #47432) [Link]

I think this is what the GP is referring to:

https://lore.kernel.org/lkml/CAPM=9txcC9+ZePA5onJxtQr+nBe...

Quote Dave Airlie:

> The next NAK I see on the list will mean I block all patches from the
> sender until they write a documentation patch, because seriously this
> stuff is too hard for someone to just keep it in their head and expect
> everyone else to understand from reading the code.

So it's not about documenting the reasons for a given NAK, it's about adding documentation to the drm_sched code actually describing how it works and how it can be used safely.

Thoughts and clarifications

Posted Sep 5, 2024 0:22 UTC (Thu) by Ashton (guest, #158330) [Link] (3 responses)

> I'm starting to sense how it can be difficult to have a technical-only discussion with you...

No definition of politeness has ever required that someone not point out falsehoods. If having someone point out that what you’re saying is contrary to the written record is “difficult” then that’s your issue.

Thoughts and clarifications

Posted Sep 5, 2024 13:44 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

I have to point out though, that "truth" often depends on your viewpoint.

Case in point - journalists are ever eager to demand "scientific proof positive". THERE IS NO SUCH THING. And when two people are talking past each other, it's pretty obvious they have two different (and they could both be right!) definitions of truth.

When two logical, rational people disagree, it's a pretty safe bet they don't share the same set of facts. That appears especially true in this case.

Cheers,
Wol

Thoughts and clarifications

Posted Sep 5, 2024 13:57 UTC (Thu) by pizza (subscriber, #46) [Link]

> When two logical, rational people disagree, it's a pretty safe bet they don't share the same set of facts. That appears especially true in this case.

I disagree (heh).

Two logical, rational people can easily share the same set of underlying facts, but disagree about how much weight each individual fact should carry in any given decision.

(This is epitomized by the expression "Fast, cheap, good -- pick two")

Thoughts and clarifications

Posted Sep 22, 2024 17:40 UTC (Sun) by Rudd-O (guest, #61155) [Link]

Oftentimes yes, and you are completely correct about that.

However, the two controversies that I am seeing clearly here do not (at least to me) seem to be a matter of opinion.

The technical controversy arising from that patch that fixed the circularity issue in the ownership of objects under DRM seems quite clear cut to me. Circularity is almost always bad design; and having some object hold a reference to another which holds a reference to the first (for printk()'s sake), which then prevents the first object from freeing the other or the other from freeing the first safely? Bad. Asahi's fix should have gone in without question, rather than invoking a curt NAK.
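To make the shape of that kind of fix concrete, here is a toy model in Rust. It is only a sketch under my own assumptions: `Scheduler`, `Fence` and `describe` are invented names and this is not the real drm_sched code or the actual patch. The idea it shows is the one described upthread, keeping an owned copy of a debug string instead of a pointer back into an object that may already be gone.

```rust
/// Toy stand-in for a scheduler that hands out fences.
struct Scheduler {
    name: String,
}

/// Toy stand-in for a fence that may outlive its scheduler.
struct Fence {
    // Owned copy of the scheduler name, used only for debug output.
    // No reference back to the scheduler, so no cycle and no dangling
    // pointer once the scheduler is freed.
    sched_name: String,
    seqno: u64,
}

impl Scheduler {
    fn new_fence(&self, seqno: u64) -> Fence {
        Fence {
            sched_name: self.name.clone(),
            seqno,
        }
    }
}

impl Fence {
    fn describe(&self) -> String {
        format!("{}:{}", self.sched_name, self.seqno)
    }
}

fn demo() -> String {
    let sched = Scheduler {
        name: String::from("gpu0"),
    };
    let fence = sched.new_fence(1);
    drop(sched); // the scheduler goes away first
    fence.describe() // still fine: the fence owns its copy of the name
}

fn main() {
    println!("{}", demo());
}
```

The cost is one small string copy per fence in this toy version; in exchange, neither object needs to keep the other alive just so that a log message can be printed.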

The more social or human controversy about the animosity that Rust developers are getting from some kernel developers that do not want or do not appreciate some strictures that (they perceive) Rust is imposing on them? Well, you could argue it could go either way, but we have video. We have video of a rust developer explaining how the type system prevents certain classes of errors to an audience of C file system kernel developers. And one of the kernel developers interrupts him, doesn't let him finish and accuses him of trying to spread the Rust religion or impose it on the kernel space. This is comically easy to judge.

No one in the Rust for Linux project is trying to rewrite the kernel into Rust or trying to make everybody ditch C and learn Rust. (I would if I had a magic wand, but I don't). What I have seen indicates to me that Rust for Linux developers are discovering deficiencies in existing subsystems, and structures, and algorithms, and drivers, and APIs, and have been trying to fix those deficiencies with the best of intentions, and also have been trying to help the C developers see that there is a way that they can prevent having those deficiencies. And not everybody, not the majority, not many, but sadly a few kernel developers have reacted in a negative and destructive way to this effort.

A computer language that is more rigid than another has its drawbacks. And sometimes these drawbacks are very serious. But one thing that is not a drawback is when that language and its strictures help you find issues that were previously unseen. Folks who are exposed to these previously unseen issues should be thankful that these issues have been brought to light because, now that they can be seen and addressed, they can be fixed. Can you make bad APIs with Rust? Absolutely. Is it easier to make bad APIs with Rust than with C? struct void* struct {}.

Thoughts and clarifications

Posted Sep 6, 2024 5:36 UTC (Fri) by marcH (subscriber, #57642) [Link] (1 responses)

> > Of all the Rust abstractions I've written drm_sched was the exception, the one case where the existing design was so bad I really needed to improve it. Those changes were completely non-intrusive, the patch wasn't even 50 lines of code and did not change the API for existing drivers in any way.

> Possibly, I don't know. But it's not the size of the change that matters when you're getting close to the core, it's the impacts and what the change reveals. You said yourself that you've found deep problems with the current API and that it's totally unsafe even for C. Did you ever think that the maintainer himself probably doesn't trust this code at all anymore and is not willing to permit more code to get more intimate with it ?

I see this as the core issue and key contradiction here. On one hand the design is "very bad and incredibly complicated and brittle", but on the other hand you submitted a "small, non-intrusive patch that barely changes anything". Mmmmm... anyone with a little bit of experience knows how this song _usually_ ends; I won't repeat wtarreau.

You look like one of those exceptional developers who are indeed capable of playing that sort of Jenga successfully. But it's not enough to be correct: building up the corresponding _trust_ unfortunately requires a massive amount of explanations, demonstrations, demo / test code and generally: time that you said you don't have. While the Rust versus C "racisms" do not help, I also suspect it would not be so different if you removed Rust from the picture. We've seen that story many times before - in a single language.

If Linux gets a new, generic scheduler entirely in safer Rust code then we all win? A complete rewrite is also a typical next chapter after "it's very bad and too complicated" - again even with a single language. Cause it's... faster.

Thoughts and clarifications

Posted Sep 6, 2024 12:37 UTC (Fri) by daenzer (subscriber, #7050) [Link]

Very well put!

Thoughts and clarifications

Posted Sep 22, 2024 17:24 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> because the code is the doc when it comes to APIs,

Quite ironically, this is exactly what the Rust for Linux people are trying to do. And it is no exaggeration to say that exactly that, wanting the code to document the behaviors of the API, is what got Wedson interrupted by Ted.

Put the behavior of the API in the types. That is all the Rust people are asking for. Nothing more. Actually, they're asking for something that's even less than that. They're asking for cooperation from the C part of the kernel so that they can do that work of putting the meaning of the API and the behaviors of the API into the type system so that the system can be safe and sound.
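
As a rough illustration of what "behavior in the types" can look like, here is a small sketch. Every name in it (`NewInode`, `Inode`, `Lookup`, `lookup`, `get_size`) is hypothetical and chosen for the example; none of this is the kernel's real inode API. The idea is that a lookup returns either a cached object or a fresh one that must be initialised, and the type system makes it impossible to read from the fresh one before that happens.

```rust
// Hypothetical types for illustration; not the kernel's real inode API.

/// An inode that has been allocated but not yet filled in.
/// It deliberately has no `size()` method.
pub struct NewInode {
    ino: u64,
}

/// A fully initialised inode; the only type that exposes `size()`.
pub struct Inode {
    ino: u64,
    size: u64,
}

/// Result of a lookup: the type tells the caller which case they got.
pub enum Lookup {
    Cached(Inode),
    New(NewInode),
}

impl NewInode {
    /// Consumes the uninitialised inode, so it cannot be used afterwards.
    pub fn init(self, size: u64) -> Inode {
        Inode { ino: self.ino, size }
    }
}

impl Inode {
    pub fn ino(&self) -> u64 {
        self.ino
    }

    pub fn size(&self) -> u64 {
        self.size
    }
}

/// Looks up `ino` in a toy cache of `(ino, size)` pairs.
pub fn lookup(ino: u64, cache: &[(u64, u64)]) -> Lookup {
    match cache.iter().find(|(n, _)| *n == ino) {
        Some(&(n, size)) => Lookup::Cached(Inode { ino: n, size }),
        None => Lookup::New(NewInode { ino }),
    }
}

/// The caller is forced by the compiler to handle both cases.
pub fn get_size(ino: u64, cache: &[(u64, u64)], default_size: u64) -> u64 {
    match lookup(ino, cache) {
        Lookup::Cached(inode) => inode.size(),
        Lookup::New(fresh) => fresh.init(default_size).size(),
    }
}
```

In a C API the same rule would typically live in a comment or in a flag the caller has to remember to check; here forgetting the `New` arm is a compile error.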

To me this controversy does sound like a few kernel developers, not everybody, not the majority, but a few, are seriously looking a gift horse in the mouth. Maybe they just don't want the horse? And as a result, nobody else gets the pony we want.

Now I'm not a kernel developer, I'm simply a Rust programmer, a fairly new one, but I do know, having used Linux since 1996, that I want superior kernel code quality and stability and fewer security issues. We are never getting there unless we move on from C, at least for the critical parts of the system. And I frankly do not care if it is in Rust or in another language, but if we are going to jump out of a puddle and not fall into another, it has to be a type-safe and sound language that protects against the sort of problems that C causes. There is this precious opportunity right now to take that chance by using Rust. I hope we don't squander the opportunity and have to wait 10 more years until either everybody is using a different operating system, or someone invents a superhuman AI that can type-check C and its runtime properties properly, or Linux has finally decided that it's going to incorporate sound and type-safe programming tooling beyond C + glorified linters.

Thoughts and clarifications

Posted Sep 4, 2024 18:50 UTC (Wed) by mb (subscriber, #50428) [Link]

Lina, I just want to say thanks to you. Thanks for the work you did and hopefully still plan to do.

Be assured that there are many people on your side of the discussion.
Even if they are more silent than the verbose people who don't admit that they are wrong.

Thoughts and clarifications

Posted Sep 4, 2024 18:13 UTC (Wed) by Deleted user 129183 (guest, #129183) [Link] (20 responses)

> They'd stay 3 years for the time it takes to transform a forest into a desert, then they suddenly quit without warning because "it's impossible to work with a team that constantly rejects my art"

This observation is very on point, since this is what has recently *actually* happened in relation to the ’Rust for Linux’ project. If anyone does not remember:

https://lwn.net/Articles/987635/

Even the time frame is largely accurate, lol.

Thoughts and clarifications

Posted Sep 4, 2024 18:25 UTC (Wed) by daroc (editor, #160859) [Link] (19 responses)

I think that's somewhat unkind, given that one of more than thirty Rust-for-Linux developers (looking only at people with changes in the rust folder in the past year) chose to leave the project. Working on open source software can be hard for everyone.

Thoughts and clarifications

Posted Sep 4, 2024 18:45 UTC (Wed) by Deleted user 129183 (guest, #129183) [Link] (18 responses)

> I think that's somewhat unkind, given that one of more than thirty Rust-for-Linux developers (looking only at people with changes in the rust folder in the past year) chose to leave the project.

So far. But reading such articles like the one above, I think that more people will likewise resign in the near future. Especially since it seems that the Rust culture (that likes to ‘move fast and break things’) is a poor match for the Linux culture (where even the very important changes can take more than two years to be done).

Thoughts and clarifications

Posted Sep 4, 2024 18:48 UTC (Wed) by corbet (editor, #1) [Link] (12 responses)

The Rust folks have neither moved fast nor broken things. This kind of comment is not helpful.

Thoughts and clarifications

Posted Sep 4, 2024 19:45 UTC (Wed) by khim (subscriber, #9252) [Link] (10 responses)

Technically Rust culture is releasing new, incompatible, versions of crates very often.

That's what C++ folks call “move fast and break things”, but compared to Linux development that's actually “move slow”: breaking changes in Rust crates are still coordinated and pre-announced, while Linux-internal APIs are sometimes broken without warning, and it's not even possible to use code designed against the old API. In Rust, by contrast, linking two incompatible versions of a crate into one binary is an allowed and supported case.

Rust developers often like to use the latest version of the Rust compiler, while the Linux kernel is extremely conservative in regard to use of new features of gcc or clang, but that's a much less pressing concern: the kernel is known to include optional features that even require use of pre-release versions of gcc or clang if they are worth it.

Thoughts and clarifications

Posted Sep 4, 2024 19:58 UTC (Wed) by mb (subscriber, #50428) [Link] (8 responses)

>Technically Rust culture is releasing new, incompatible, versions of crates very often.

Well, you could not be more wrong than that.
The opposite of what you say is true.

Crate maintainers almost all care deeply about compatibility and use Semver to express that.
Breaking changes are not frequent in most crates and if breaking Semvers are released, it's often trivial to upgrade.

Yes, there are some crates that update and break often. But saying that this is "the Rust culture" is just plain wrong and shows more about your experience with the Rust community than the Rust community itself.

This is all the complete opposite of the C/C++ universe, where a commonly agreed versioning scheme does not exist, everybody does versioning differently.
The kernel is a prime example of not having a stable internal API and breaking things all the time.

Thoughts and clarifications

Posted Sep 4, 2024 20:29 UTC (Wed) by khim (subscriber, #9252) [Link] (7 responses)

> The opposite of what you say is true.

Seriously? Even rand, a very narrow crate that you first encounter in the Rust Book, has 9 versions. Clap (that you find in many other tutorials) has 13 versions, ndarray has 16 and so on.

That's quite a lot, compared to many other languages where you may find 3-4 versions released over decade instead of 10-15, and where every release is “a big deal”™.

> Breaking changes are not frequent in most crates and if breaking Semvers are released, it's often trivial to upgrade.

Sure, but that doesn't change the fact that changes are breaking and support for old version is, often, immediately dropped when new version is released.

As I have said: it's still better than how in-kernel APIs are treated, but that's unusual from POV of Java or even C++ developers.

> But saying that this is "the Rust culture" is just plain wrong and shows more about your experience with the Rust community than the Rust community itself.

Can you show me any Rust apps that doesn't rely on these crates that have dozen releases or more?

> This is all the complete opposite of the C/C++ universe, where a commonly agreed versioning scheme does not exist, everybody does versioning differently.

True, but how many C++ libraries that have more than dozen incompatible releases can you name? They exist, sure, but how common they are?

Qt had fewer incompatible releases in ⅓ century than rand in 10 years! And if you compare the size of the API that Qt offers to what rand offers… the difference is even more striking.

Thoughts and clarifications

Posted Sep 4, 2024 21:18 UTC (Wed) by mb (subscriber, #50428) [Link] (6 responses)

>Seriously? Even rand, a very narrow crate that you first encounter in a Rust Book have 9 versions

The latest version 0.8.x has been supported and compatible for more than three years.

>Clap (that you find in many other tutorials) have 13 versions,

Version 4 has been compatible for two years.

>That's quite a lot

No, it's not. The criteria for bumping the major are completely different compared to almost all major C libraries.
Even extremely small theoretical breakages cause a major bump.

>where every release is “a big deal”™.

It's not.

>Sure, but that doesn't change the fact that changes are breaking and support for old version is, often, immediately dropped when new version is released.

So? That's exactly the same for basically every Open Source software out there.
There are only very few projects providing long term support of old versions.

And nobody stops you from supporting your favorite old "rand".

You are asking for long term support that you get nowhere else.

>Can you show me any Rust apps that doesn't rely on these crates that have dozen releases or more?

The times any build broke in the whole time I used Rust is in the low single digits. I think it's two or three times.
Updates are extremely smooth.

>Qt had fewer incompatible releases in ⅓ century than rand in 10 years!

Oh. So you also don't have any experience with Qt major upgrades.

Great. Let me explain it to you: Most of the major Qt version bumps require massive changes in the project.
Whereas most of the crate major version bumps just work with little to no change.

The number of major versions does not mean anything, if you accumulate the changes until an extremely loud big bang release.
It can even be argued that a big release every 5 years is worse than small incremental changes every year.

Thoughts and clarifications

Posted Sep 4, 2024 22:26 UTC (Wed) by khim (subscriber, #9252) [Link]

> It can even be argued that a big release every 5 years is worse than small incremental changes every year.

That's different question, though. It's question of whether “move fast and break things” approach is better than alternatives.

> The latest version 0.8.x is supported and compatible since more than three years.

While the similar C++ facility has never had any breaking changes, though it was extended a few times.

> You are asking for long term support that you get nowhere else.

I had it going for ten years with Python2, Java8 and many other tools, sorry.

> Great. Let me explain it to you: Most of the major Qt version bumps require massive changes in the project.
Whereas most of the crate major version bumps just work with little to no change.

Sure, but that, again, discusses virtues of “move fast and break things” approach versus “keep it stable as long as you can, then do a massive break when you can not anymore” approach.

I think nowadays “move fast and break things” approach becomes more and more popular (and as I have pointed out and you repeated that's how kernel manages internal APIs, too).

But that doesn't change the fact that it's different approach from what many other languages, especially “enterprise” ones, practise (or practised).

Lots of projects, these days, go overboard with “move fast and break things”, though. At least temporarily. Although they eventually change their approach AFAICS: even the flagship of that approach, Abseil, these days offers a more Rust-like approach with compatible releases every half-year or so. They proudly proclaim them LTS, which, of course, sounds ridiculous since they are only supported for one year, but still… it's closer to what Rust does than to either “everyone should just live on HEAD” or “breaking changes should happen once per half-century” extremes.

Thoughts and clarifications

Posted Sep 5, 2024 3:08 UTC (Thu) by legoktm (subscriber, #111994) [Link] (1 responses)

I agree with mb, "move fast and break things" is not at all how I would describe the attitude of the Rust community. I think people are very intentional about not breaking things and as a result, take a very literal stance with what is a breaking change (see e.g. cargo-semver-checks).

I'd also say that people care a lot about good API design, and as a result iterate (with breaking changes) until they reach 1.0 and then intend to keep it stable forever like serde. If I had to complain about something it's probably that people are, in my opinion, too perfectionist, and don't declare the 1.0 despite their crate being stable. (Of course, I'm also guilty of this in my own crates.)

Thoughts and clarifications

Posted Sep 5, 2024 7:11 UTC (Thu) by khim (subscriber, #9252) [Link]

> If I had to complain about something it's probably that people are, in my opinion, too perfectionist, and don't declare the 1.0 despite their crate being stable.

Indeed, lots of very basic crates are stuck forever at version zero, even such basic crates as libc.

> I'd also say that people care a lot about good API design, and as a result iterate (with breaking changes) until they reach 1.0 and then intend to keep it stable forever like serde.

This may very well be their intent (and in some rare cases, like with the Rust compiler itself, even an actual accomplishment), but that's not what developers have to deal with. In the absence of that mythical version 1.0 crate people are forced to use what they have available. And what they have available is, very often, not that bad! For all practical purposes, in the Rust world, version 1.0 is just a number: if a crate is a version zero crate then the minor number works like the major number does for crates after version 1.0. And it's not as if breaks stop after version 1.0: syn is at version 2.0, clap is at version 4.5, etc.
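
The "minor works like major below 1.0" rule can be sketched in a few lines. This is a simplified model of Cargo's default (caret) compatibility rule, written for illustration only: the `Version` struct and both function names are made up here, pre-release and build metadata are ignored, and real projects should rely on Cargo or the `semver` crate rather than this.

```rust
/// A parsed `major.minor.patch` version (no pre-release or build metadata).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parses exactly three dot-separated numbers; anything else is `None`.
pub fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(v)
}

/// True if `new` may replace `old` under the caret rule: the leftmost
/// non-zero component must stay the same, and `new` must not be older.
pub fn is_compatible_upgrade(old: Version, new: Version) -> bool {
    let same_series = if old.major != 0 {
        new.major == old.major
    } else if old.minor != 0 {
        // 0.y.z: the minor number plays the role of the major number.
        new.major == 0 && new.minor == old.minor
    } else {
        // 0.0.z: every release is treated as its own series.
        new == old
    };
    same_series
        && (new.major, new.minor, new.patch) >= (old.major, old.minor, old.patch)
}
```

Under this model a bump from 0.8.0 to 0.8.5 is compatible while 0.8.5 to 0.9.0 is not, exactly as 1.9.3 to 2.0.0 is not.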

And cargo-semver-checks is certainly not unique, that's Rust version of abidiff, essentially.

And it may even be true that radical and drastic breakages every dozen years are harder to deal with than regular and frequent yet minor breakages, but that doesn't change the fundamental approach to how the Rust community operates: many developers dream of replicating the Rust compiler's feat of breaking things and moving fast in the beginning while reaching eventual stability, after which development still advances but at glacial speed, yet often they only manage to achieve the first part. That's still more honest and better than many C libraries that claim to release compatible versions which in reality break programs, but one can't claim you are not breaking things if you routinely release new, incompatible versions while simultaneously dropping support for old versions.

Thoughts and clarifications

Posted Sep 5, 2024 10:01 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (2 responses)

> Most of the major Qt version bumps require massive changes in the project.

I maintain a few Qt projects. Since when are massive changes needed to bump? That is not my experience at all.

Thoughts and clarifications

Posted Sep 5, 2024 10:09 UTC (Thu) by mb (subscriber, #50428) [Link] (1 responses)

>Since when are massive changes needed to bump?

2 to 3, 3 to 4 and 5 to 6 were pretty massive changes in my projects.
That only leaves 4 to 5 as a small upgrade with small changes for me.

Qt major version upgrades

Posted Sep 8, 2024 8:53 UTC (Sun) by chris_se (subscriber, #99706) [Link]

What in 5 to 6 was a massive change that actually caused pain? 5 to 6 was extremely painless in my experience, even more so than 4 to 5 (which was already fine). 3 to 4 was a huge pain though.

Thoughts and clarifications

Posted Sep 22, 2024 17:46 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> Technically Rust culture is releasing new, incompatible, versions of crates very often.

If you are a Rust programmer, you are not forced to upgrade to the latest and greatest crate. You could just keep using the old crate, it's still published, you can still download it, and unless you have a security issue, there's no issue for you. You can even use multiple versions of the same crate in the same project, and it just works. This is the opposite of move fast and break things. It is rather move fast, keep old things working the way they were.

Moreover, kernel developers don't gratuitously use crates like your comment would seem to imply. The vast majority of crates are simply unusable in the kernel because they depend on the standard library. And the standard library cannot be linked into the kernel because the standard library has some requirements regarding memory allocation that cannot be fulfilled by the kernel.

I find it absolutely amazing that you can add a few YAML lines to your GitHub project and there's an entire computer network that will automatically upgrade all of the crates in your Rust project that's on GitHub. And then subsequently your CI fires and everything is tested so that you know the upgrade didn't break anything. I use that all the time. But this is absolutely unrepresentative of how kernel development with Rust code is done. Maybe someday that will be the case. Maybe in 25 years. We're not even close to that. We need to get even a few crates going in the kernel before that's even a concern on anyone's radar.

If anything, Rust in the kernel is actually move slow. And if we are to conclude anything from the Rust for Linux developers' contributions to the Linux kernel, it has been move slow and fix other people's things.

Thoughts and clarifications

Posted Sep 4, 2024 19:48 UTC (Wed) by corbet (editor, #1) [Link]

I should clarify that I was talking about the behavior of the Rust developers in the kernel project. I'm taking no position on all proponents of any language.

Thoughts and clarifications

Posted Sep 5, 2024 0:51 UTC (Thu) by Ashton (guest, #158330) [Link]

Rust culture likes to “move fast and break things”? I am genuinely baffled how you came to this conclusion, it is the exact opposite of what I see.

The most recent drama was about some C developers asserting that they will break things and not even inform the Rust developers.

Thoughts and clarifications

Posted Sep 5, 2024 10:56 UTC (Thu) by agateau (subscriber, #57569) [Link] (1 responses)

> Rust culture (that likes to ‘move fast and break things’)

There is a difference between A) breaking things unannounced and B) breaking things by bumping the major version of your project.

In my experience it's much more common in the Rust ecosystem to go with B than with A. And B is usually not a problem in that dependent projects are unlikely to hit unexpected build breakages. My experience in other ecosystems is very different...

Thoughts and clarifications

Posted Sep 5, 2024 12:13 UTC (Thu) by Ashton (guest, #158330) [Link]

Also, the discussion should be about how the Rust for Linux people are behaving, not Rust developers in general. Different sub-groups of a language community can and do develop different attitudes and norms around things, especially stuff like versioning, dependencies, and backwards compatibility.

In the abstract if someone asserted that a major, risk sensitive project in a language took a much more conservative approach to dependencies and change than the average user of the same language I would be utterly unsurprised.

Thoughts and clarifications

Posted Sep 5, 2024 19:38 UTC (Thu) by MarcB (guest, #101804) [Link]

> So far. But reading such articles like the one above, I think that more people will likewise resign in the near future. Especially since it seems that the Rust culture (that likes to ‘move fast and break things’) ...

Where is this coming from?! "Moving fast and breaking things" is basically the least fitting description of "Rust culture" (whatever that may be).

Thoughts and clarifications

Posted Sep 22, 2024 17:41 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> Especially since it seems that the Rust culture (that likes to ‘move fast and break things’)

🤣 where would anyone get that opinion from? Honest question!

Thoughts and clarifications

Posted Sep 14, 2024 15:20 UTC (Sat) by sunshowers (guest, #170655) [Link] (57 responses)

This is a very C mindset, one that centers fear.

Thoughts and clarifications

Posted Sep 14, 2024 15:39 UTC (Sat) by Wol (subscriber, #4433) [Link] (56 responses)

> This is a very C mindset, one that centers fear.

It's called "technical debt".

Most people don't like maintenance, for PRECISELY that reason - it's a slog.

And idiots who aren't afraid of breaking things are people who cause thousands of flights to be cancelled, credit card payments to stop working, etc etc. It's not a C mindset, it's the natural mindset of older people who've seen (and been hurt by) the consequences of young inexperienced people cocking up.

I'm lucky - I messed up very early in my career, and ever since while I'm quite happy to plough ahead and break things, I've always been conscious of the fact that breakage needs to be avoided if possible.

Cheers,
Wol

Thoughts and clarifications

Posted Sep 14, 2024 23:16 UTC (Sat) by sunshowers (guest, #170655) [Link] (55 responses)

Right. The C mindset is that "I'm afraid to make changes" is the end of the conversation. I get it, having maintained C before. Every line of C you write or modify, your hairs are probably standing on end.

The Rust mindset is that "I'm afraid to make changes" is the start of the conversation. It's reasonable to be concerned of making changes, but how do you make it as easy as possible? Encoding lifetimes into the type system, having a separation between shared and mutable access, going all-in on encapsulation, etc.

Thoughts and clarifications

Posted Sep 14, 2024 23:46 UTC (Sat) by viro (subscriber, #7872) [Link] (54 responses)

>Right. The C mindset is that "I'm afraid to make changes" is the end of the conversation. I get it, having maintained C before. Every line of C you write or modify, your hairs are probably standing on end.

Not to discourage your noble efforts, but could you possibly aspire to somewhat higher quality of trolling? There are some standards to language holy wars, and your contribution is... falling short, to put it very mildly. There is a lot of examples of that genre available for study - search the comp.lang.* archives and you'll find really impressive ones. If you must use chatgpt, at least train it on good examples...

Overall: D-.

Thoughts and clarifications

Posted Sep 15, 2024 0:57 UTC (Sun) by intelfx (subscriber, #130118) [Link] (2 responses)

> could you possibly aspire to somewhat higher quality of trolling?

Well, could you?

Accusing people who happen to hold an opinion you disagree with of trolling to silence or discredit them is so <insert a timestamp well in the past>.

Thoughts and clarifications

Posted Sep 15, 2024 4:13 UTC (Sun) by viro (subscriber, #7872) [Link] (1 responses)

> Accusing people who happen to hold an opinion you disagree with of trolling to silence or discredit them [...]

Huh? Why would I want to silence them? And what does opinion being claimed (nevermind "held") have to do with anything?

Language holy war is an art form. When done right, it can be subtle and highly amusing to watch, but that kind of move is just plain wrong at this stage. Overwrought rhetoric in the first part would be about right for a retort deep in a subthread that has already devolved into a pure exchange of insults; here it's in the wrong place. And appending to that a paragraph of stock praises to $OBJECT_OF_WORSHIP is always a faux pas, especially when execution is so uninspiring - stylistic mismatch is awful.

Objections had been about the style, not the "contents"; I thought I made that very clear, but apparently that didn't come through well enough. As for the alleged contents... do we really need to discuss that, starting with the equivalent of "I ate a fruit and I know how awful do they taste"? Not to mention the expression "$LANGUAGE mindset", which is a shining example of the same fallacy...

Language is not an identity. It's a tool. "$X is written in C" covers a huge range of styles/degrees of cleanliness/etc. So does "$X is written in Rust"; those ranges overlap a whole lot and as for the factors in cost of modifications... C vs Rust is really, really minor compared to the variability among C programs and variability among Rust ones. I will not insult the poster by assuming they are too ignorant to realize that, and that's precisely what taking the first part at the face value would imply.

I'm all for taking the piss out of self-righteous cretins who blather about immense superiority/inferiority of languages; as I said, language holy wars can be highly amusing, especially if aforementioned cretins get maneuvered into exposing their ignorance in their $OBJECT_OF_WORSHIP. As long as editors' requests to stop a subthread that goes in direction unacceptable for lwn.net get promptly honoured, I see no problem with that. But for pity sake, do that in style...

Thoughts and clarifications

Posted Sep 15, 2024 12:32 UTC (Sun) by sunshowers (guest, #170655) [Link]

> C vs Rust is really, really minor compared to the variability among C programs and variability among Rust ones.

I understand where you're coming from — folks have been burned by the promise of so many languages in the past — but this is not true in the case of Rust specifically. Using a language which directly tackles mutability xor sharing makes code just fundamentally better and more correct. Rust programs are consistently higher quality than C ones.

Thoughts and clarifications

Posted Sep 15, 2024 2:17 UTC (Sun) by sunshowers (guest, #170655) [Link] (50 responses)

I very sincerely believe in everything I said, based on many years of C and Rust experience.

Thoughts and clarifications

Posted Sep 15, 2024 4:56 UTC (Sun) by viro (subscriber, #7872) [Link] (49 responses)

In my experience the costs of modification in C codebases vary so much that any universal statements regarding those costs are flat-out unbelievable.

Thoughts and clarifications

Posted Sep 15, 2024 5:52 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (48 responses)

While the costs vary, I have yet to see even a moderately complex C codebase where refactorings are _easy_.

Thoughts and clarifications

Posted Sep 15, 2024 8:41 UTC (Sun) by Wol (subscriber, #4433) [Link] (47 responses)

Have you ever seen a moderately complex *RUST* codebase that is easy to refactor? I would have thought the phrase "moderately complex" was enough to make it clear *any* codebase would be hard to refactor.

The thing is, how easy is it to "write what you mean"? A language that makes it easy to express complex requirements in simple language just pushes a "complex codebase" further down the road before you hit it. Rust comes over as that sort of language, but I've never used it so I wouldn't know.

Cheers,
Wol

Thoughts and clarifications

Posted Sep 15, 2024 11:37 UTC (Sun) by pizza (subscriber, #46) [Link] (44 responses)

> Have you ever seen a moderately complex *RUST* codebase that is easy to refactor? I would have thought the phrase "moderately complex" was enough to make it clear *any* codebase would be hard to refactor.

Rust (and codebases using it) haven't really matured (ie been around long enough) to reach this critical point.

All of this emphasis on "specifying it right" is all well and good, but... the definition of "right" changes over time (sometimes drastically) along with the requirements.

Linux has historically placed great emphasis on (and heavily leaned into) internal interfaces and structures being freely malleable, but the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

"Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

Thoughts and clarifications

Posted Sep 15, 2024 11:54 UTC (Sun) by Wol (subscriber, #4433) [Link]

> All of this emphasis on "specifying it right" is all well and good, but... the definition of "right" changes over time (sometimes drastically) along with the requirements.

Banging on again, but with a state table you can (and should) address all possible options. Some things are hard to specify that way, some things you personally don't need to address, but if you have three possible boolean state variables, then you have eight possible states. If you only recognise five, and your solution precludes solving one of the other three, then your code will need replacing. If you can't be bothered to address the other three, but your code is designed to make it easy for someone to come along later and add it, then that's good programming.
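
For what it's worth, the three-booleans case maps fairly directly onto an exhaustive `match`, sketched below. The state names (`open`, `dirty`, `locked`) and the `Action` enum are invented for the example and carry no meaning beyond it. The point is only that all eight rows of the state table are written out, and the compiler rejects the function if any row is missing.

```rust
/// Hypothetical outcomes for the example; the names carry no real meaning.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Idle,
    Start,
    Resume,
    Flush,
    Reject,
}

/// A state table over three booleans: 2 * 2 * 2 = 8 rows.
/// Deleting any row makes this function fail to compile, because a
/// `match` without a wildcard arm must cover every combination.
pub fn decide(open: bool, dirty: bool, locked: bool) -> Action {
    match (open, dirty, locked) {
        (false, false, false) => Action::Idle,
        (false, false, true) => Action::Reject,
        (false, true, false) => Action::Reject,
        (false, true, true) => Action::Reject,
        (true, false, false) => Action::Start,
        (true, false, true) => Action::Resume,
        (true, true, false) => Action::Flush,
        (true, true, true) => Action::Reject,
    }
}

/// Enumerates every combination, handy for table-driven tests.
pub fn all_states() -> Vec<(bool, bool, bool)> {
    let mut states = Vec::new();
    for open in [false, true] {
        for dirty in [false, true] {
            for locked in [false, true] {
                states.push((open, dirty, locked));
            }
        }
    }
    states
}
```

Rows that are currently "wrong" are still listed explicitly as `Reject`, so if one of them later becomes a valid path, the change is a one-line edit rather than a redesign.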

A good logical (mathematical?) spec/proof should point out all the possible "wrong" paths so if they become a right path they're easily fixed.

> "Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

The real world is like that :-)

Cheers,
Wol

Thoughts and clarifications

Posted Sep 15, 2024 12:19 UTC (Sun) by asahilina (subscriber, #166071) [Link] (10 responses)

> the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

My experience having gone through several major refactorings of the drm/asahi driver is that it is correct. Some of the bigger ones are:

- Going from a "demo" fully blocking implementation (the first one we shipped) to implementing queues and scheduling and asynchronous work
- Dropping in GPUVM and fully implementing VM_BIND, which along the way ended up changing how non-GPUVM kernel objects are managed including making changes to the heap allocators [1]

It is hard to explain just how liberating it is to be able to refactor and rearrange non-trivial things in the code, fix all the compiler errors, and then end up with working code more often than not. Sure, if you're unlucky you might run into a logic error or a deadlock or something... but with C it's almost impossible to escape adding a new handful of memory safety errors and new flakiness and bugs, every time you make any non-trivial change to the structure of the code.

This is true even when you're interfacing with C code, as long as its API is documented or can be easily understood. The GPUVM change involved first writing abstractions for the underlying C code. That API is reasonably nice and well documented, so it wasn't hard to get the Rust abstraction right (with some care) [2], and then when it came to dropping it into the Rust driver, everything just worked.

Most people don't believe this until they actually start working with Rust on larger projects. All the Rust evangelism isn't just zealotry. There really is something magical about it, even if it might be overstated sometimes.

[1] https://github.com/AsahiLinux/linux/commit/93b390cce8a303...
[2] https://github.com/AsahiLinux/linux/commit/e3012f87bf98c0...

Thoughts and clarifications

Posted Sep 16, 2024 9:05 UTC (Mon) by Wol (subscriber, #4433) [Link] (4 responses)

> > the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

> My experience having gone through several major refactorings of the drm/asahi driver is that it is correct. Some of the bigger ones are:

Out of curiosity, would you describe *your* codebase as complex? Or would you say "my code is simple because Rust handles the complexity for me"?

Or even "the driver problem itself is fairly simple, and Rust just makes it easy to express it"? (Put another way, "C makes the problem a lot more complicated than it should be"!)

Cheers,
Wol

Thoughts and clarifications

Posted Sep 16, 2024 9:48 UTC (Mon) by asahilina (subscriber, #166071) [Link] (3 responses)

Hmm... there are a few dimensions here.

I would say the driver has medium complexity for a GPU driver (in terms of line count it's almost 4x drm/panfrost and around the same as the GPU part of drm/msm). Rust doesn't directly reduce complexity (the driver has to do what it has to do), but it does handle a lot of error-prone boilerplate for you (for example enforced RAII) and it strongly encourages design that makes it easier to reason about the complexity (separation of concerns/encapsulation). So Rust makes it easier to maintain the complexity, understand it, and avoid bugs caused by it. I'm a lot more comfortable dealing with complex code in Rust than in C.

Then, there are some aspects where Rust is specifically a very good fit for this particular GPU driver. One of them is using Rust proc macro magic to describe multiple firmware version and GPU generation interfaces (the firmware interface is not stable) in a single implementation, as cleanly as possible. To do the same thing in C you either end up duplicating all the code, or using ugly preprocessor or build system tricks (drm/apple in our tree is a C driver that has to do this, and it's not pretty. Rust would be a good fit for a rewrite of that driver too for multiple reasons, but we need DRM KMS bindings first). The other one is (ab)using Rust lifetimes to represent GPU firmware interface lifetimes, which makes handling the firmware interface much less error-prone (and this is critical, because an error crashes the GPU firmware irrecoverably). So Rust helps with those more specific kinds of complexity.

At the end of the day it all really boils down to Rust benefiting from decades of programming experience and history in its design. C was designed at a time when programs were a tiny fraction of the size they are today. The entire 1983 UNIX kernel had around the same line count in C as my drm/asahi driver does in Rust. Linux is more than a thousand times more code today, and it shouldn't be a surprise that a programming language designed for codebases 1000x smaller might not be the best option these days. We have learned a lot since then about how to manage complexity, and Rust takes a lot of that and applies it to the kind of systems language that is suitable for writing kernels.

Thoughts and clarifications

Posted Sep 16, 2024 11:16 UTC (Mon) by Wol (subscriber, #4433) [Link] (2 responses)

> Rust doesn't directly reduce complexity (the driver has to do what it has to do), but it does handle a lot of error-prone boilerplate for you (for example enforced RAII) and it strongly encourages design that makes it easier to reason about the complexity (separation of concerns/encapsulation). So Rust makes it easier to maintain the complexity, understand it, and avoid bugs caused by it. I'm a lot more comfortable dealing with complex code in Rust than in C.

So in other words "It's a complex problem, but Rust makes it simple to express that complexity"?

I'm just trying to get a handle on where Rust lies on the problem-complexity / language-complexity graph. I'll upset Jon with this, but I hate Relational/SQL because imnsho Relational lies too far on the simplicity side of the graph, so SQL has to lie way over on the complexity side. So in terms of Einstein's "make things as simple as possible, but no simpler", Relational/SQL lies way above the local minimum. Rustaceans probably feel the same is true of C and modern hardware.

Do you feel Rust lies close to the sweet spot of minimal possible complexity? It certainly comes across that you find it easy to express the complexity of the hardware.

Cheers,
Wol

Thoughts and clarifications

Posted Sep 16, 2024 11:57 UTC (Mon) by jake (editor, #205) [Link] (1 responses)

> I'll upset Jon with this

Wol, it's more than just Jon who is tired of you bringing up this stuff in every thread, often multiple times, when it is not particularly relevant. Your comments are voluminous, people have complained to you about that and the content of your posts in comments here, and you are one of the most filtered commenters we have. I think you should perhaps reduce your volume of comments and try to ensure that the drum you are banging does not come up everywhere.

just fyi,

jake

Thoughts and clarifications

Posted Sep 16, 2024 12:21 UTC (Mon) by paulj (subscriber, #341) [Link]

Perhaps putting some stats on "ranking by volume of comments (over the site in last X period, for X in {a, b, c} time|this story)" for a user to that user somewhere would help softly nudge people, where needed, on a self-educating basis?

Thoughts and clarifications

Posted Sep 22, 2024 18:05 UTC (Sun) by Rudd-O (guest, #61155) [Link] (4 responses)

I have reason to believe that you are talking to people who have never seen a match statement in their lives. And so they're used to knowing that when they make a change somewhere in the code, somewhere else very, very far away an if or case/select statement no longer matches the condition that was just added to the code, and therefore things break spectacularly at runtime.

That lack of experience is why they continue issuing the truism that refactoring is "very difficult" and you don't really know when you're changing code if something else is going to break. They haven't gotten the compiler to yell at them, "you're missing this case", because they have never experienced it. Refactoring is super easy when the computer is doing the thinking about myriad otherwise irrelevant trivialities for you!

There really is something magical about it. And to try and explain it to people that haven't seen that "magic" is almost impossible. It's like trying to explain electricity to someone from the 1600s. And it is equally frustrating. In fact, it is doubly frustrating because unlike electricity in the 1600s, this is something that is very easy to witness: you just have to read a little code and push a button in a webpage and you can see it. And they just refuse. It is so oddly disconcerting.

Thoughts and clarifications

Posted Sep 22, 2024 18:29 UTC (Sun) by pizza (subscriber, #46) [Link] (3 responses)

> you just have to read a little code and push a button in a webpage and you can see it. And they just refuse. It is so oddly disconcerting.

$ sloccount `find projdir -name *.[ch]`
[...]
Total Physical Source Lines of Code (SLOC) = 2,278,858

Call me naive, but "read a little code and push a button on a web page" isn't going to cut it.

Thoughts and clarifications

Posted Sep 25, 2024 9:44 UTC (Wed) by Rudd-O (guest, #61155) [Link] (2 responses)

That's a lot of code.

To get back to the topic (common pitfalls of refactoring and how Rust helps avoid errors):

Can you articulate what a match statement does, and how it behaves when you add a new case somewhere very far away from the match statement? How is it different from, say, a chain of if/else or a select case? If your codebase was (hypothetically) Rust, what would the compiler say to such a change, versus what the C compiler says today?

My intention is to figure out if you have had a chance to compare both C and Rust in order to form an honest, informed opinion.

Thanks in advance.

Perhaps that is far enough

Posted Sep 25, 2024 9:48 UTC (Wed) by corbet (editor, #1) [Link] (1 responses)

It is my suspicion that this conversation will go nowhere useful after this point. Perhaps it's best to stop it here?

Perhaps that is far enough

Posted Sep 25, 2024 10:01 UTC (Wed) by Rudd-O (guest, #61155) [Link]

Sure. Have a nice day.

Thoughts and clarifications

Posted Sep 15, 2024 12:27 UTC (Sun) by sunshowers (guest, #170655) [Link]

> the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

It's almost completely true, though.

Think of the type system as a set of proofs you need to provide to the compiler for it to accept your code. To the extent that you can encode your program's properties into the type system, by the time your code compiles you have proven that those properties hold. Mathematically, rock-solid proven.

You can't encode 100% of your properties into the program (for example you can't totally prove linearity, since Rust has an affine system i.e. you can drop values on the floor), but you can get very far.
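One way to picture "encoding properties into the type system" is the typestate pattern. The sketch below is an assumption-laden toy (the `Closed`/`Open` types and their methods are hypothetical, not taken from any real crate): sending is only possible on a value of type `Open`, so "send before open" is not a runtime bug but a program that does not compile. It also shows the affine caveat: nothing forces the caller to ever call `close`.

```rust
/// Hypothetical connection states; each state is its own type.
struct Closed;
struct Open {
    sent: usize,
}

impl Closed {
    /// Consumes the closed handle and returns an open one.
    fn open(self) -> Open {
        Open { sent: 0 }
    }
}

impl Open {
    /// Only available in the `Open` state.
    fn send(mut self, bytes: &[u8]) -> Open {
        self.sent += bytes.len();
        self
    }

    /// Consumes the connection and reports how many bytes were sent.
    fn close(self) -> usize {
        self.sent
    }
}

fn main() {
    let conn = Closed.open().send(b"hello");
    // `Closed.send(b"x")` would not compile: `Closed` has no `send` method.
    // Affine, not linear: `conn` could simply be dropped here without `close`.
    println!("{}", conn.close());
}
```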

Thoughts and clarifications

Posted Sep 15, 2024 13:17 UTC (Sun) by mb (subscriber, #50428) [Link] (10 responses)

>but the assertion of "just change the definition and your job is done when the compiler stops complaining"
>is laughably naive.

Yes, it's extremely hard to believe.
And yes, it is an oversimplification.

But it is in fact true, to some extent.

Think of it as being the opposite of what Python does.
In Python code it is extremely difficult and sometimes practically impossible to do large scale refactorings, because almost all things are only checked at run time and almost nothing is checked at "build" time.

Rust is the exact opposite of that. And it also adds many more static checks than a Python or C++ program could ever do. The language lets you express properties of the program in the type system. At the core of all this is the lifetime system, the borrow checker and move semantics.

If new Rust code is written, it sometimes has logic bugs or similar bugs in it.
But if Rust code is refactored and the algorithms are not changed, you're 99% done when the compiler stops complaining.

In Python it is extremely scary to pull out part of some code and put it somewhere else. It's extremely easy to forget something that you will only notice years after. It's no fun at all.
In Rust such things are easy, fun and the feeling of "did I forget something?" is not present, because the compiler guides the developer throughout the process.

Rust compiler messages are helpful. If you hear "Rust compiler complaining" translate that to "Rust compiler trying to help the developer".
"To fight the compiler" is not what actually happens. It's not a fight. It's a friendly helping hand.
And that really shines in refactorings.

Thoughts and clarifications

Posted Sep 15, 2024 13:52 UTC (Sun) by pizza (subscriber, #46) [Link] (9 responses)

> But if Rust code is refactored and the algorithms are not changed, you're 99% done when the compiler stops complaining.

That same argument applies to any statically-typed language, even (*gasp*) C.

Meanwhile, if you're not changing the algorithms/semantics/whatever in some way, why are you refactoring anything to begin with?

Thoughts and clarifications

Posted Sep 15, 2024 14:02 UTC (Sun) by mb (subscriber, #50428) [Link]

> That same argument applies to any statically-typed language, even (*gasp*) C.

No. That is not true.
There's a very big difference between what you can encode in the C type system and what is possible with the Rust type system.

>if you're not changing the algorithms/semantics/whatever

That is not what I said.

Thoughts and clarifications

Posted Sep 15, 2024 14:51 UTC (Sun) by asahilina (subscriber, #166071) [Link]

> That same argument applies to any statically-typed language, even (*gasp*) C.

Not at all, not to the extent it does with Rust.

In C, if you change a structure to be reference-counted, the compiler does nothing to ensure you manage the reference counts correctly. In Rust it does.

In C, if you add a mutex to protect some data, the compiler does nothing to ensure you actually hold the mutex before accessing the data. In Rust it does.
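A small userspace sketch of the reference-counting and mutex points, using only the standard library (`Arc` and `Mutex`); the function name `count_in_parallel` is made up for illustration and this is not kernel code. The counter lives inside the `Mutex`, so the only path to it is through the guard returned by `lock()`, and `Arc` handles the reference count as clones are created and dropped.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each increment a shared counter `per_thread` times.
fn count_in_parallel(threads: usize, per_thread: u64) -> u64 {
    // The data is owned by the mutex; there is no way to touch it unlocked.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Cloning the Arc bumps the reference count; dropping it decrements.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard returned by lock() is the only access path.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```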

In C, if you change the width or signedness of an integer member or variable, the compiler does nothing to ensure you actually update the type of any variables it's copied to or passed into, and it will happily let you truncate or convert integers, even with -Wall. In Rust it won't compile until you change all the types or add explicit casts, and you probably won't even need to touch any code that just creates temporary bindings since Rust has type inference for that and C does not (without extensions like __auto_type or weird macros).
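To illustrate the integer-width point, here is a sketch with hypothetical helpers (`header_len` and `narrow` are invented names). Passing a `u32` where a `u16` is expected is rejected outright; a narrowing conversion has to be written explicitly, either as a lossy `as` cast or as a checked `try_from`.

```rust
use std::convert::TryFrom;

/// Hypothetical helper that only accepts 16-bit lengths.
fn header_len(len: u16) -> u16 {
    len.saturating_add(4)
}

/// Checked narrowing: returns None instead of silently truncating.
fn narrow(len: u32) -> Option<u16> {
    u16::try_from(len).ok()
}

fn main() {
    let len: u32 = 70_000;
    // header_len(len);          // does not compile: expected `u16`, found `u32`
    let truncated = len as u16; // explicit, lossy cast: keeps only the low 16 bits
    println!("{} {:?} {:?}", truncated, narrow(len), narrow(42).map(header_len));
}
```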

In C, if you need to add cleanup code to a structure that was previously just freed with free(), you need to find all the sites where it is freed and change them to call a helper that does the extra cleanup manually. In Rust none of this code exists to begin with since freeing is automatic, you just implement the `Drop` trait on the struct to add extra cleanup code and you're done, no need to refactor anything at all.
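A sketch of the `Drop` point, with a hypothetical `Buffer` type and a shared counter standing in for whatever extra cleanup is needed. The cleanup is added in one place, the `Drop` impl, and none of the code that creates or discards buffers has to change.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource; `released` stands in for the extra cleanup work.
struct Buffer {
    data: Vec<u8>,
    released: Rc<Cell<u32>>,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        // Runs automatically whenever a Buffer goes out of scope.
        self.released.set(self.released.get() + 1);
    }
}

/// Call sites never mention cleanup; both buffers are dropped on return.
fn use_buffers(released: &Rc<Cell<u32>>) -> usize {
    let a = Buffer { data: vec![0; 16], released: Rc::clone(released) };
    let b = Buffer { data: vec![0; 8], released: Rc::clone(released) };
    a.data.len() + b.data.len()
}

fn main() {
    let released = Rc::new(Cell::new(0));
    let total = use_buffers(&released);
    println!("{} bytes used, {} buffers released", total, released.get());
}
```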

Thoughts and clarifications

Posted Sep 16, 2024 10:36 UTC (Mon) by farnz (subscriber, #17727) [Link] (5 responses)

No; the degree to which you're done when things start compiling depends critically on how much is checked at compile time versus at run time. My experience over 8 years of doing Rust, and approximately 20 years doing C, is that Rust programs tend to have much more in the way of compile time checking than C programs, which in turn means that "it compiles" is a much stronger statement than in C (although not as strong as it tends to be in Idris or Agda).

A more interesting question is whether this will continue to hold as more people write Rust code - is this current behaviour an artifact of early Rust programmers tending to write more compiler-checked guarantees, or is this something that will continue to hold when the Rust programmer pool expands?

Thoughts and clarifications

Posted Sep 16, 2024 11:41 UTC (Mon) by pizza (subscriber, #46) [Link] (4 responses)

> A more interesting question is whether this will continue to hold as more people write Rust code - is this current behaviour an artifact of early Rust programmers tending to write more compiler-checked guarantees, or is this something that will continue to hold when the Rust programmer pool expands?

Personally, I strongly suspect the latter.

Current Rust programmers are self-selecting, in the upper echelon of skill/talent, and largely using Rust for Rust's sake. That is very much non-representative of the software development world as a whole. [1]

Rust will have its Eternal September, when relatively average-to-mediocre corporate body farms start cranking it out. At that point, "Rust Culture" goes out the window as the only "culture" that matters is what falls out of corporate metrics<->reward mappings.

[1] So is C, for that matter. If I were to pull out a bad analogy, if effectively coding in C represents the top 10th percentile, Rust is currently the top 1%.

Thoughts and clarifications

Posted Sep 16, 2024 11:50 UTC (Mon) by pizza (subscriber, #46) [Link]

> Personally, I strongly suspect the latter.

Gaah, make that 'the former'. (As I hope was clear from the rest of the post)

Thoughts and clarifications

Posted Sep 16, 2024 13:21 UTC (Mon) by farnz (subscriber, #17727) [Link]

I disagree in part; I think it'll get worse than it is today, for the reasons you outline, but that it'll still remain a lot more true of Rust than it is of C.

I have access to a decent body of code written by a contract house (one of the big names for cheap outsourcing), and the comments make it clear that they used their cheap people, not their best people, to write the code. Of the four most common causes of issues refactoring that code, three are things that are compiler-checked in Rust:

- Assumptions about the sizes of arrays passed as arguments; where in C, I can pass a 2 element array to a function that expects a 4 element array, Rust either makes this a compile-time error (if the argument type is an array) or a runtime panic (if the argument type is reference to slice).
- Assumptions about wrapping of unsigned computations. C promotes unsigned bytes to signed int for calculations, and then does the computation, but there's chunks of this code that assume that every intermediate in a long sequence without storing to a known type remains unsigned (otherwise there's UB lurking when the intermediate exceeds INT_MAX).
- Failure to check all possible values of an enum, in large part because it's clear that the value got added after a module was "code complete", and nobody thought to add a default: or a special handler for this value.

Those all become panics or compile failures in Rust, leaving the errors in business logic (of which there are remarkably few) to deal with during refactoring.
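The array-size case can be sketched as follows; the function names are hypothetical. With an array type in the signature the length is part of the type, so a 2-element argument is a compile error; with a slice the length is checked at run time, either as a panic or, if the code asks for it, as an `Option`.

```rust
/// Length is part of the type: only 4-element arrays are accepted.
fn sum4(values: &[u32; 4]) -> u32 {
    values.iter().sum()
}

/// Slice argument: too-short input panics at the `[..4]` index.
fn first4(values: &[u32]) -> u32 {
    values[..4].iter().sum()
}

/// Slice argument with the length check surfaced as an Option.
fn first4_checked(values: &[u32]) -> Option<u32> {
    values.get(..4).map(|s| s.iter().sum())
}

fn main() {
    println!("{}", sum4(&[1, 2, 3, 4]));
    // sum4(&[1, 2]);  // does not compile: expected an array with 4 elements
    println!("{}", first4(&[1, 2, 3, 4, 5]));
    println!("{:?}", first4_checked(&[1, 2]));
}
```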

And more generally, the biggest issue with cheap contractor C and C++ is the amount of code they write that depends on UB and friends being interpreted a particular way by the compiler, even in cases where there's no way to check that interpretation from code; Rust does seem to reduce this, even in beginner code, since unsafe is easy to find and be scared by.

Thoughts and clarifications

Posted Sep 22, 2024 18:24 UTC (Sun) by Rudd-O (guest, #61155) [Link] (1 responses)

> Rust will have its Eternal September, when relatively average-to-mediocre corporate body farms start cranking it out. At that point, "Rust Culture" goes out the window as the only "culture" that matters is what falls out of corporate metrics<->reward mappings.

I haven't seen so far, at least in decades of me working in the industry, that eternal September has arrived to Haskell.

And I don't think that's going to happen. At least in Haskell. Maybe in Rust it will. Maybe it won't.

There is a set of conceptual difficulties associated with learning any programming language, and it is not the same from one language to the next. Learning Atari BASIC is one thing (by the way, that's the first language I learned). Learning Python is another. Learning assembly is yet another. Learning Haskell is another.

To pull the conversation away from the realm of language and just talk about concepts, pretty much any programmer can program using a stringly typed interface (which we all know leads to shitty code). But not every programmer is capable of learning the Haskell type system (I know I can't, but I can understand how it leads to improved type safety and thus code quality).

All of this is to say that we're not all made equal. And because we're not all made equal, we are not all able to use the same tools. Just as we are not all able to wield a sledgehammer that weighs 30 pounds and break down a wall, so we are just as unequal to wield a specific programming language with skill and produce the results that one wants. Evolution does not stop at the neck.

But what do I know? Perhaps Haskell will get its eternal September? All I know is I can't learn it. Or at least I'm humble enough to admit that.

Thoughts and clarifications

Posted Sep 22, 2024 18:30 UTC (Sun) by Rudd-O (guest, #61155) [Link]

Addendum:

In case it's interesting for the readers here, the current firm I'm working at started the system that we are developing with Haskell. We had a lot of researchers that were super talented and were able to crank out what I consider pretty high quality code at the very beginning using nothing but Haskell.

The problem is that once you need to grow past 10 engineers, or in this case computer scientists, you can't. Finding 10 Haskell programmers in a Haskell conference is fairly easy. Finding the 11th to join your team when there's no conference going on is almost impossible. Haskellers are wicked smart, and because they're wicked smart, they're wicked rare.

So what did we do after that? We switched our system to Rust. Of course, the system continues to have the same type structure to the extent that it is possible, that it had back in the era when it started as Haskell. And all the Haskell programmers quickly adapted to using Rust because the type system in Rust is less complex than the type system in Haskell, so for them it was a downgrade. But we were able to quickly triple the amount of programmers that we had developing the system.

And the system continues to grow, and it has pretty high code quality by the standards of my career; I've been looking at code for maybe 25 years. I routinely refactor the innards of the system without fearing that I'm going to break something somewhere else, somewhere deep in the system. I don't think I've ever felt so free to actually change the code without having the dread inside of me that it's going to catastrophically explode in production. Two years, and I have yet to put a bug in the system. That is almost completely magical.

Thoughts and clarifications

Posted Sep 22, 2024 18:19 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> That same argument applies to any statically-typed language, even (*gasp*) C.

No. Not even close.

You can make a change in a C struct, such that you're missing a member of the struct, and maybe the compiler will complain that you missed initialization somewhere of that struct member. That is true.

But if you add to an enum somewhere, which represents a new state of the object that you are using the enum on, and that enum is used in an if, case or select statement somewhere else, C will happily ignore that you've done this, and compile just fine. Then your code doesn't work.

In Rust, when you do this, the equivalent of a select or case statement, which is called match, will generally not compile, because the compiler knows that you've added a different case that your code does not handle. Only after you have fixed every site where this enum is used, will it compile.

This simple thing prevents entire classes of logic errors in your code.
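A minimal illustration of that, with a hypothetical `LinkState` enum. The `match` below has no catch-all arm, so adding a fourth variant to the enum (say, `Suspended`) makes `describe` stop compiling with a non-exhaustive patterns error (E0004) until the new case is handled.

```rust
/// Hypothetical state enum, for illustration only.
#[derive(Debug, Clone, Copy)]
enum LinkState {
    Down,
    Negotiating,
    Up,
}

fn describe(state: LinkState) -> &'static str {
    // No `_` arm on purpose: every variant must be listed explicitly,
    // so a newly added variant is flagged here at compile time.
    match state {
        LinkState::Down => "down",
        LinkState::Negotiating => "negotiating",
        LinkState::Up => "up",
    }
}

fn main() {
    for state in [LinkState::Down, LinkState::Negotiating, LinkState::Up] {
        println!("{:?} -> {}", state, describe(state));
    }
}
```

The check only helps at sites that avoid a wildcard arm; a `_ => ...` arm would silently absorb the new variant, much like a C `default:`.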

There are quite a few other ergonomic changes that the language has over languages that existed before Rust, which work in a similar way. To give you just one other example:

You change a C struct to have a new member that is a pointer to another type. You painstakingly change every place where that struct needs to be initialized so that your program will compile and run. This program is multi-threaded, like the kernel is. You run your program, and it crashes. In this particular case, the new member pointing to the other structure was used at the wrong time. This is due to behavior that wasn't there before, when the first structure did not have a pointer to the second one.

This is not possible in Rust. The compiler, in fact, the borrow checker in the compiler, keeps track of where every pointer is going and where every pointer is used. And will not let you compile the program if you use a pointer or a reference when you are not supposed to, when it's supposed to be dead or not initialized, or if the lifetime of the object taking a reference to that thing is a different lifetime, incompatible with the lifetime of the object pointed to by the pointer. It even knows when you are using data that you're not supposed to be using because you will have forgotten to grab a lock. And it will tell you you need to change how this is done. Try this, try this other thing, try this other thing. It gives you options.

This is so far ahead of anything that the C language does that it could in fact be construed as magic by Ken, Dennis, and Donald. You need to see it with your own eyes to believe it, but it is amazing.

On a personal note, this conversation on this particular thread has exposed to me the wide difference of perspective that C developers and Rust developers have. Having developed years of my life with both languages, I have the uncomfortable advantage of having perspective from both sides. But to me, it really does feel like we're arguing about horses and carriages versus automobiles, or electric cars versus gas cars. I, too, thought Teslas were bullshit until I got on one, as a passenger, and the driver punched the pedal. Oh my god. It's a similar experience going from Python, or Go, or C, to Rust.

And I think that explains why a lot of people see what Rust developers say about the language, and then conclude, this must be a religion, or worse, a cult.

Thoughts and clarifications

Posted Sep 15, 2024 20:24 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

> Linux has historically placed great emphasis on (and heavily leaned into) internal interfaces and structures being freely malleable, but the assertion of "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

Others chimed in with examples to the contrary, I also had a similar experience. FWIW, for me, the best feature of Rust was not the lifetimes and borrows, but pattern matching and exhaustiveness checking. I always hated writing code that encodes state machines, but Rust makes that so much better.

To be clear, other languages with pattern matching have similar properties, and even C++ might get it soon.

Thoughts and clarifications

Posted Sep 16, 2024 8:51 UTC (Mon) by adobriyan (subscriber, #30858) [Link] (6 responses)

> and exhaustiveness checking

fn main() {
    let i: u32 = 41;
    match i % 2 {
        0 => println!("0"),
        1 => println!("1"),
    }
}

error[E0004]: non-exhaustive patterns: `2_u32..=u32::MAX` not covered

But rustc also exposed an embarrassing bug in a trivial C++ calculator program of mine, so I can't complain too much.

Thoughts and clarifications

Posted Sep 16, 2024 12:30 UTC (Mon) by excors (subscriber, #95769) [Link] (5 responses)

That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).
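Applied to the example above, the workaround might look like the sketch below (the function name `parity` is made up). The catch-all arm satisfies the exhaustiveness check, and if the assumption is ever wrong the result is a panic with a message rather than undefined behaviour.

```rust
fn parity(i: u32) -> &'static str {
    match i % 2 {
        0 => "even",
        1 => "odd",
        // `i % 2` on a u32 can only be 0 or 1, but the exhaustiveness
        // check looks at the type (u32), not at the arithmetic.
        _ => unreachable!("u32 % 2 is always 0 or 1"),
    }
}

fn main() {
    println!("{}", parity(41));
}
```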

After making that change, you do lose the benefits of compile-time exhaustiveness checking for that match statement; someone might change the condition to "i % 3" and you won't notice until runtime. But you'll still get the benefits for any code that matches integers you can't guarantee are in a particular sub-range (like the inputs to any API), and for any code that matches enums (presumably what Cyberax meant with state machines). I'd guess those situations are more common in most programs, so the exhaustiveness checking is still a valuable feature even if it's not perfect.

If your code is doing lots of work on bounded integers then I guess you'd want something more like the Wuffs type system, but then you'll get the compromises that Wuffs makes to make that work, which seems to restrict it to a very small niche. And that would still be inadequate if you do "match x & 2" since Wuffs doesn't know the value can't be 1. (Though as far as I can tell, Wuffs doesn't actually support any kind of switch/match statement - you have to write an if-else chain instead.)

Thoughts and clarifications

Posted Sep 16, 2024 14:20 UTC (Mon) by andresfreund (subscriber, #69562) [Link] (4 responses)

> That's easily worked around by adding a catch-all "_ => unreachable!()", ideally with a comment explaining why you believe it's unreachable (assuming the real code isn't quite this trivial), and if you were mistaken then it'll become a runtime panic (unlike C where reaching __builtin_unreachable() is undefined behaviour).

Imo this comparison to __builtin_unreachable() is nonsensical. I dislike a lot of UB in C as well, but you'd IMO only ever use __builtin_unreachable() when you *want* the compiler to treat the case as actually unreachable, to generate better code.

Thoughts and clarifications

Posted Sep 16, 2024 14:52 UTC (Mon) by adobriyan (subscriber, #30858) [Link] (1 responses)

Rust does and doesn't do bounds checking at the same time:
without unreachable!() it is compile error 100% of the time,
but at -O1 code generator knows what remainder does to integers.

https://godbolt.org/z/jbefszqa8

Guaranteed behaviour versus permitted optimizations

Posted Sep 17, 2024 8:49 UTC (Tue) by farnz (subscriber, #17727) [Link]

This is normal for any compiled language; the compiler is allowed but not required to remove dead code, and thus when the optimizer is able to prove that a given piece of code cannot be called, it is allowed to remove it (similar applies to unused data). However, it's never required to remove dead code, and when you're not optimizing, it'll skip the passes that look for dead code in the name of speed.

There's a neat trick that you can use to exploit this; put a unique string in panic functions that doesn't appear anywhere else in the code, and then a simple search of the binary for that unique string tells you whether or not the optimizer was able to remove the unwanted panic. It's not hard to put greps in CI that look for your unique string, and thus get a CI-time check for code that could panic at runtime - if the string is present, the optimizer has failed to see that it can remove the panic, and you need to work out whether that's a missed optimization (and if so, what you're going to do about it - make the code simpler? Improve the optimizer?). If it's absent, then you know that the optimizer saw a route to remove the panic for you.
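A sketch of what the marker half of that trick could look like; the marker text and the `lookup` function are invented for illustration. Whether the optimizer actually removes the panic path depends on the compiler version and optimization level, so the presence or absence of the string in a release binary is something to check rather than assume.

```rust
/// Hypothetical lookup whose panic path carries a unique, greppable marker.
fn lookup(table: &[u8; 4], index: usize) -> u8 {
    // `index % 4` is always in bounds for a 4-element array, so an
    // optimizing build has a chance to prove the None arm is dead.
    match table.get(index % 4) {
        Some(value) => *value,
        // The marker appears nowhere else in the code base; if it shows up
        // in the built binary, the panic path was not optimized away.
        None => panic!("PANIC_MARKER_lookup_3f9c2a"),
    }
}

fn main() {
    let table = [10, 20, 30, 40];
    println!("{}", lookup(&table, 6));
}
```

A CI step would then search the built artifact for `PANIC_MARKER_lookup_3f9c2a` and fail if it is found.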

Thoughts and clarifications

Posted Sep 16, 2024 17:40 UTC (Mon) by excors (subscriber, #95769) [Link] (1 responses)

This is getting slightly tangential, but I don't think it's that far-fetched to compare them - they have basically the same name (especially in codebases like Linux that #define it to "unreachable"), and people do use it in C for non-performance reasons, e.g.:

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (unreachable() when the hardware returns an unexpected chip ID; that doesn't sound safe)

https://github.com/torvalds/linux/blob/v6.11/fs/ntfs3/fre... (followed by error-handling code, suggesting the programmer thought maybe this could be reached)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/kv... (genuinely unreachable switch default case, explicitly to stop compiler warnings)

https://github.com/torvalds/linux/blob/v6.11/arch/mips/la... (looks like they expected unreachable() to be an infinite loop, which I think it was when that code was written, but it will misbehave with __builtin_unreachable())

https://github.com/torvalds/linux/blob/v6.10/drivers/clk/... (probably to stop missing-return-value warnings; not clear if it's genuinely unreachable, since clk_hw looks non-trivial; sensibly replaced by BUG() later (https://lore.kernel.org/all/20240704073558.117894-1-liqia...))

__builtin_unreachable() seems like an attractive nuisance (especially when renamed to unreachable()) - evidently people use it for cases where they think it shouldn't be reached, but they haven't always proved it can't be reached, and if it is then they get UB instead of a debuggable error message. It seems they usually add it to stop compiler warnings, not for better code generation. Often they should have used BUG(), which is functionally equivalent to Rust's unreachable!() though slightly less descriptive.

If you really need the code-generation hint in Rust, when the optimiser (which is a bit smarter than the compiler frontend) still can't figure out that your unreachable!() is unreachable, there's "unsafe { std::hint::unreachable_unchecked() }" which is just as dangerous but much less attractive than Linux's unreachable().
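
To make the contrast concrete, here is a small sketch; the function names and the chip-ID scenario are made up for illustration and are not taken from any of the kernel files linked above. `chip_name` uses the safe macro, so an unexpected value gives a panic with a message, while `lane_unchecked` uses the unsafe hint only where the impossibility is actually established by the mask:

```rust
use std::hint::unreachable_unchecked;

/// Safe form: an unexpected ID panics with a debuggable message,
/// much like BUG() would in the kernel.
pub fn chip_name(id: u8) -> &'static str {
    match id {
        1 => "rev A",
        2 => "rev B",
        other => unreachable!("unexpected chip ID {}", other),
    }
}

/// Unsafe form: only a code-generation hint. Reaching it would be UB,
/// so it is used here only because the mask makes the default arm impossible.
pub fn lane_unchecked(bits: u8) -> &'static str {
    match bits & 0b11 {
        0 => "a",
        1 => "b",
        2 => "c",
        3 => "d",
        // SAFETY: `bits & 0b11` can only be 0, 1, 2 or 3.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

The `unsafe` block and the SAFETY comment are the point: the dangerous version forces the author to write down why the arm cannot be reached.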

Anyway, I didn't originally mean to denigrate C, I was mainly trying to explain the Rust code to readers who might be less familiar with it. But it does also serve as an example of different attitudes to how easy it should be to invoke UB.

Thoughts and clarifications

Posted Sep 16, 2024 18:14 UTC (Mon) by mb (subscriber, #50428) [Link]

Yes, looks like you found a couple of actual soundness bugs in the C code.
I wonder if there are any uses of unreachable that actually make sense. As in: Places where the performance gain actually matters.

Thoughts and clarifications

Posted Sep 22, 2024 18:32 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> but pattern matching and exhaustiveness checking

This was magical to me too. At first it felt super awkward because it felt like an inversion of the order in which things are supposed to read. But when it clicked... oh my god. Combined with the question mark or a return inside of the match, it really helped simplify the structure of the happy path that I could read.

I am so happy I learned Rust. And I'm even happier that I'm getting paid to do it.
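
A small sketch of what I mean, with made-up types (`Command`, `ParseError` and `parse` are hypothetical, not from any real code base). The early `return` in the first match and the `?` on the number parse keep the error cases out of the way, and the compiler would reject the second match if it left any case unhandled:

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
pub enum Command {
    Push(i64),
    Pop,
    Quit,
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    Unknown(String),
    BadNumber(ParseIntError),
}

// Lets `?` convert a number-parsing failure into our own error type.
impl From<ParseIntError> for ParseError {
    fn from(e: ParseIntError) -> Self {
        ParseError::BadNumber(e)
    }
}

pub fn parse(line: &str) -> Result<Command, ParseError> {
    let mut words = line.split_whitespace();
    // Early return inside the match: the happy path continues below.
    let name = match words.next() {
        Some(w) => w,
        None => return Err(ParseError::Empty),
    };
    match name {
        "push" => {
            // `?` returns the error early; a missing argument parses as "".
            let n = words.next().unwrap_or("").parse::<i64>()?;
            Ok(Command::Push(n))
        }
        "pop" => Ok(Command::Pop),
        "quit" => Ok(Command::Quit),
        other => Err(ParseError::Unknown(other.to_string())),
    }
}
```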

Thoughts and clarifications

Posted Sep 22, 2024 17:58 UTC (Sun) by Rudd-O (guest, #61155) [Link] (10 responses)

> "just change the definition and your job is done when the compiler stops complaining" is laughably naive.

In C. And assembly.

> Donald Knuth

— famous C and assembly developer

Thoughts and clarifications

Posted Sep 22, 2024 18:11 UTC (Sun) by pizza (subscriber, #46) [Link] (9 responses)

> "Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

..Your blithe dismissal of Knuth as an "assembly and C programmer" doesn't invalidate his point. "Provably correct" doesn't mean that it actually *works*.

Thoughts and clarifications

Posted Sep 23, 2024 9:15 UTC (Mon) by farnz (subscriber, #17727) [Link] (7 responses)

Knuth's point was not that it wouldn't work, but rather that his proof would only cover the things he cared to prove correct, and not necessarily cover everything that you, the reader of his code, would expect. He wrote that jibe in an era when formal methods practitioners were merrily proving all sorts of properties about code that, to a large extent, were irrelevant to users of computation systems (including those implemented using humans as the compute element), and not considering important properties (like "is this proven to terminate in finite time") because they're not always provable.

Thoughts and clarifications

Posted Sep 23, 2024 9:26 UTC (Mon) by Wol (subscriber, #4433) [Link] (6 responses)

While it may not be what Knuth was thinking of, I also think of it as pointing out that a formal proof merely proves that the mathematical model is internally consistent.

It does not prove that what the model does is what reality does! In a properly specified system, maths and science(reality) usually agree, but there's no guarantee ...

Cheers,
Wol

Thoughts and clarifications

Posted Sep 23, 2024 11:56 UTC (Mon) by pizza (subscriber, #46) [Link] (5 responses)

> It does not prove that what the model does is what reality does! In a properly specified system, maths and science(reality) usually agree, but there's no guarantee ...

This has been my experience.

...The software running on the 737MAX's MCAS was "provably correct" .. for its specifications.

(It turns out that the words "properly specified" are about as common as unicorn-riding sasquatches...)

Thoughts and clarifications

Posted Sep 23, 2024 12:01 UTC (Mon) by farnz (subscriber, #17727) [Link] (4 responses)

Reference for "the software running on the 737MAX's MCAS was "provably correct""? I can't find any evidence anywhere that the MCAS was formally verified at all - merely that it was tested correct, and Boeing asserted to the FAA that the testing covered all plausible scenarios.

Thoughts and clarifications

Posted Sep 23, 2024 14:06 UTC (Mon) by pizza (subscriber, #46) [Link] (3 responses)

> Reference for "the software running on the 737MAX's MCAS was "provably correct""?

I'm giving Boeing's software folks the benefit of the doubt, because the MCAS debacle was a failure of specification (on multiple levels), not one of implementation.

After all, one can't test/validate compliance with a requirement that doesn't exist.

> Boeing asserted to the FAA that the testing covered all plausible scenarios.

It did! Unfortunately, many of those "plausible scenarios" required pilots to be trained differently [1], but a different part of Boeing explicitly said that wasn't necessary [2].

[1] ie recognize what was going on, and flip a circuit breaker (!) to disable MCAS
[2] One of the main selling points of the MAX

737 MAX only tested, not proven correct

Posted Sep 23, 2024 14:40 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

There is a requirement underpinning all avionics that the aircraft's behaviour is safe in the event of a data source failing, and that the avionics are able to detect that a data source has become unreliable and enter the failsafe behaviour mode. This is a specification item for MCAS, and Boeing asserted to the FAA that they had tested MCAS and confirmed that, in the event of an AoA sensor fault, MCAS would detect the fault and enter the failsafe behaviours.

Boeing's tests for this, however, were grossly inadequate, and at least 3 different failure conditions have been found which were not covered by the tests: first is that "AoA DISAGREE" was an optional indication, available during the tests, but not in production MCAS unless purchased (20% of the fleet). Second is that they did not test enough bit error cases, and later investigation found a 5 bit error case that was catastrophic. And third was that the procedure for handling MCAS issues assumed that the pilot would have time to follow the checklist; in practice, the training issues meant that pilots didn't even realise there was a checklist.

737 MAX only tested, not proven correct

Posted Sep 23, 2024 15:49 UTC (Mon) by pizza (subscriber, #46) [Link] (1 responses)

> first is that "AoA DISAGREE" was an optional indication,

It's worse than that -- _redundant sensors_ were optional.

..and they were optional because one set of folks had a different set of functional specifications than another, and management was disincentivized to notice.

737 MAX only tested, not proven correct

Posted Sep 23, 2024 16:23 UTC (Mon) by paulj (subscriber, #341) [Link]

Went and had a read, as it seems you and farnz don't quite agree and/or are talking about slightly different things. ICBW, but my read of the FAA summary report is:

FAA "Safety Item #1: USE OF SINGLE ANGLE OF ATTACK (AOA) SENSOR" - this refers to the use of /data/ from a single AoA sensor by MCAS.

FAA "Safety item #5: AOA DISAGREE:" - refers to the "AOA DISAGREE" warning in the cockpit FDU, which somehow was tied to the optional "AoA Indicator gauge" feature for the FDU, which airlines had to purchase.

AFAICT from the FAA summary. I.e., the change was entirely in the logic - because there was no action item to retro-fit another AoA vane to the 737 Max. Excluding training and maintenance requirements, aircraft changes were all logic updates to do better filtering of the 2 AoA signals, with better differential error detection, and cease ploughing ahead with MCAS commands based on measurements from just 1 vane - which could be faulty. Other aircraft safety items added limits, damping and margins, to prevent a runaway MCAS crashing the aircraft.

Stunning issues really.

Thoughts and clarifications

Posted Sep 25, 2024 9:52 UTC (Wed) by Rudd-O (guest, #61155) [Link]

Not sure what you're referring to with "provably correct", that's Knuth's claim, not mine.

Nevertheless, in my professional experience, the C compiler's silence generally tells you next to nothing about whether the program will run as intended (I concede things have improved over the last 30 years), whereas the Rust compiler's silence generally does mean the program is either going to run as intended or will have substantially fewer problems than the C one (mostly logic errors introduced by the programmer).

You can choose to be dogmatic and insist that all compilers / languages give the same results (a belief I like to call "blank slatism"), or you can choose to test that theory for yourself. I know what I chose and I am quite happy. My sincere recommendation is that you test the theory for yourself.

Thoughts and clarifications

Posted Sep 15, 2024 12:18 UTC (Sun) by sunshowers (guest, #170655) [Link]

Yes, I've performed many refactorings in complex Rust codebases and it's never been scary. The type system does a great job catching mistakes, and experienced Rust developers can find effective ways to leverage it.

For example, using PhantomData to deliberately introduce a lifetime parameter.
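
A minimal sketch of the PhantomData idea, with hypothetical `Arena` and `Handle` types. The handle stores only an index, but the `PhantomData<&'a Arena>` field makes the borrow checker treat it as a borrow of the arena, so the arena cannot be mutated or dropped while a handle is alive:

```rust
use std::marker::PhantomData;

pub struct Arena {
    values: Vec<u32>,
}

/// Holds no reference at runtime, only an index. The PhantomData field
/// deliberately introduces the lifetime parameter `'a`.
#[derive(Clone, Copy)]
pub struct Handle<'a> {
    index: usize,
    _arena: PhantomData<&'a Arena>,
}

impl Arena {
    pub fn new(values: Vec<u32>) -> Self {
        Arena { values }
    }

    /// Hands out a handle only for an index that is in bounds.
    pub fn handle<'a>(&'a self, index: usize) -> Option<Handle<'a>> {
        if index < self.values.len() {
            Some(Handle { index, _arena: PhantomData })
        } else {
            None
        }
    }

    pub fn get<'a>(&'a self, h: Handle<'a>) -> u32 {
        self.values[h.index]
    }
}
```

Note that this sketch only ties the handle's lifetime to an arena borrow; it does not stop a handle from one arena being passed to another, which would need a further branding technique.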

Or carefully using encapsulation to localize very complex code, testing or formally proving certain properties on it, and then using the type system to turn the local property into a global one.
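
As a toy illustration of the encapsulation point (the `sorted` module and `Sorted` type are hypothetical): the field is private, so the only code that can break the "always sorted" property is the few lines inside the module. Test or prove those, and every `Sorted` value anywhere in the program is known to be sorted:

```rust
pub mod sorted {
    /// A vector that is always in ascending order. The field is private,
    /// so code outside this module cannot violate the invariant.
    pub struct Sorted(Vec<i32>);

    impl Sorted {
        /// Establishes the invariant once, at construction.
        pub fn new(mut values: Vec<i32>) -> Self {
            values.sort();
            Sorted(values)
        }

        /// Preserves the invariant by inserting at the sorted position.
        pub fn insert(&mut self, x: i32) {
            let pos = self.0.binary_search(&x).unwrap_or_else(|p| p);
            self.0.insert(pos, x);
        }

        /// Relies on the invariant: binary search is only valid on sorted data.
        pub fn contains(&self, x: i32) -> bool {
            self.0.binary_search(&x).is_ok()
        }

        /// Read-only view, so callers cannot reorder the elements.
        pub fn as_slice(&self) -> &[i32] {
            &self.0
        }
    }
}
```

A function elsewhere that takes `&sorted::Sorted` can use binary search without re-checking anything, which is the local property turned into a global one.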

If you've never used Rust, then please listen to those of us who write Rust day in and day out (coming on 8 years full-time for me). Rust really is a massive improvement over both C and C++.

Thoughts and clarifications

Posted Sep 22, 2024 17:57 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> Have you ever seen a moderately complex *RUST* codebase that is easy to refactor?

Yes, I have.

Because the compiler yells at me all the way until I'm done with the refactor, which is awesome. Because I know at the end of the refactor, my refactor is almost certain to work properly. Because Rust forced all of the developers before me to encode the behavior of the APIs and the algorithms in the type system.

And the type system + the borrow checker do not forgive you or give you free passes. You cannot cast to void yourself around it. You cannot pass some sort of shim that could have a bug itself. You cannot hold a pointer for the wrong timespan. You have to fix it correctly.

Sure, you can leave a bunch of functions empty, effectively not doing your job in not finishing the refactor, and obviously the program is going to break after that.

But if you do your job, it is substantially easier than refactoring a C code base, only to discover that the first time you run it, it corrupts your data, or it crashes, or it works for a good while without any problems, and you discover all those issues much later down the road.

Anecdotally, Python refactorings have the same problem as C refactorings. In fact, they might actually be worse than C refactorings. Python with types is substantially easier to refactor than Python without types. But Rust? Refactoring Python is much harder, with or without types, than refactoring Rust.

Thoughts and clarifications

Posted Sep 22, 2024 17:00 UTC (Sun) by Rudd-O (guest, #61155) [Link]

> Well, there was Ted Ts'o's rant at Wedson and the others that started with an accusation of wanting to "convert" people to the Rust "religion", followed by a pile of strawman arguments, followed by another person making jokes comparing Rust to Java and more strawmen.

From the short clip I saw, yes, I agree with you, that was totally inappropriate. They didn't even let him finish; it wasn't even a discussion to begin with. It was just a demonstration of how the type system works, and the guy was just trying to explain to them why the type system protects them from problems. And Ted started vaguely accusing RfL devs of converting people to the Rust religion, what the hell? I did a double take when I saw that part of the clip.

Uncalled for, in my opinion. And Ted is an extremely smart man. I wonder what prior interactions he had that led him to that comment.