Time to step up, Linus/GregKH

Posted Jan 31, 2025 11:01 UTC (Fri) by josh (subscriber, #17465)
In reply to: Time to step up, Linus/GregKH by LtWorf
Parent article: Resistance to Rust abstractions for DMA mapping

This is not making extra work for C developers; C developers continue to be free to ignore Rust. At the moment, they're free to accept patches that break Rust bindings, and leave it to the Rust people to fix those bindings.

There are plenty of parts of the kernel many people don't want to work in; ask kernel maintainers how many of them would be enthusiastic about touching the tty layer.

This is an escalation: a maintainer trying to say he doesn't want Rust in the kernel, calling it a "cancer", and saying "I will do everything I can do to stop this". This is not a problem that can be solved by telling someone this is not going to make more work for them and they can ignore it, because they're not looking to ignore it; they're looking to destroy it.

Time to step up, Linus/GregKH

Posted Jan 31, 2025 15:44 UTC (Fri) by LtWorf (subscriber, #124958)

Except that the Rust people will hit social media and cause outrage.

Time to step up, Linus/GregKH

Posted Jan 31, 2025 16:42 UTC (Fri) by dralley (subscriber, #143766)

This has not happened. If you are referring to what I think you're referring to, you are misrepresenting the situation completely.

In the case with Wedson, the fact that a roomful of kernel developers argued without agreement for several minutes on the precise semantics of the existing C APIs is precisely evidence that such semantics need to be thoroughly documented - which was all the Rust developers were asking for. If it's so complex that the foremost experts in the world can't remember exactly how it works, then it's very hard to take seriously the opinion that "everything is fine, no need to write it down for posterity".

In the case with Asahi Lina, that was over code that was *broken* with or without Rust, the issues would have been visible even with a pure C driver, but the maintainer was rejecting her opinion on the basis of being a "Rust person talking about lifetime issues" despite this.

In both cases the oppositional stance of the maintainers is extremely difficult to make sense of from a purely-technical standpoint.

Time to step up, Linus/GregKH

Posted Feb 5, 2025 7:28 UTC (Wed) by qtplatypus (subscriber, #132339)

What was the eventual outcome of the Asahi Lina patches?

Time to step up, Linus/GregKH

Posted Feb 5, 2025 9:46 UTC (Wed) by Wol (subscriber, #4433)

From what I know, the C maintainer was being perfectly reasonable. Asahi was being perfectly reasonable. For whatever reason they just didn't hit it off with each other and got stuck in a slanging match.

Someone higher up stepped in, knocked their heads together, and said "we need to keep these two apart". Then everything got sorted.

Sometimes two people just don't get on. I've had people I can't stand. It happens. It got sorted.

Cheers,
Wol

Time to step up, Linus/GregKH

Posted Feb 5, 2025 14:55 UTC (Wed) by dralley (subscriber, #143766)

The C maintainer wasn't being entirely reasonable - Lina was pointing out flaws in the C code that could be encountered even from C-written drivers.

Eventually she gave up and wrote her own scheduler in Rust to avoid having to deal with drm_sched.

Time to step up, Linus/GregKH

Posted Feb 5, 2025 15:46 UTC (Wed) by Wol (subscriber, #4433)

And as I understood it (and not from the other party - but from a respected kernel maintainer here), the C guy was entirely reasonable.

It's a long way from saying "but that is an obvious bug in the C code", to getting to "and this is a fix *that*will*not*cause*spaghetti*breakage*where*you*least*expect*it*". And THAT is what Lina just could not get!

I'm not going to come down on either side, but I've had plenty of experience of an "obvious" fix going subtly and disastrously wrong, so I can sympathise with the C guy. I'm dealing with this right now at work - a colleague didn't bother to understand the subtleties of some tricky code, put in a fix of her own that happened to work (breaking the overarching design of the original code), and now that things have changed again we're having to bodge something that should have been simple, to avoid breaking her code. All because what was meant to be a simple and clean - *and* *reversible* - design was bodged, and despite having foreseen that we would want to back it out, we now can't because these bodges rely on it being present when it really should not be!

Cheers,
Wol

Time to step up, Linus/GregKH

Posted Feb 5, 2025 15:52 UTC (Wed) by Wol (subscriber, #4433)

To reply to myself, have you ever heard the saying "if it ain't broke, don't fix it"?

Yes, the C code was broke - for ONE person. And if I know the design of the code, I will quite happily dig in and refactor *my* stuff. But as soon as I start digging in other people's code, I'm a lot more cautious.

Lina has a compiler that helps her and alerts her much more to potential screwups. The C guy gets precious little help from the compiler - he doesn't want to get burnt where a "simple" change compiles just fine (but wrong) and then blows up in someone *else's* face.

Cheers,
Wol

Time to step up, Linus/GregKH

Posted Feb 5, 2025 21:26 UTC (Wed) by dralley (subscriber, #143766)

It wasn't broken for "one person", though. Christian König repeatedly agreed that the behavior was problematic, but kept falling back on how it was "as designed" and how the problems were known when it was designed, but everything was designed around that anyway.

See her comments here and the ones from Christian she was directly responding to:

https://lore.kernel.org/lkml/7e53bc1f-7d1e-fb1c-be45-f03c...

And David Airlie dropping in on a second discussion thread w/ Christian and Lina (those comments are also worth reading):

https://lore.kernel.org/lkml/CAPM=9txcC9+ZePA5onJxtQr+nBe...

Now, I'm not entirely unsympathetic to Christian. From the standpoint of how things currently are, there's quite a lot of code written around the broken core abstractions, and un-breaking those abstractions is a lot of work. But the problems she was hitting were real, they were acknowledged, and the existing scheduler was not in any way "simple" as you claim it to be - and it doesn't seem like there was much motivation to fix those things. Hence Lina deciding it wasn't worth the time investment.

Time to step up, Linus/GregKH

Posted Feb 18, 2025 9:42 UTC (Tue) by daenzer (subscriber, #7050)

> In the case with Asahi Lina, that was over code that was *broken* with or without Rust, the issues would have been visible even with a
> pure C driver, but the maintainer was rejecting her opinion on the basis of being a "Rust person talking about lifetime issues" despite this.
>
> In both cases the oppositional stance of the maintainers is extremely difficult to make sense of from a purely-technical standpoint.

I respectfully disagree that these two cases can be classified the same way.

The case with Asahi Lina was about technical issues[0]. Raising technical issues in proposed patches is the job of a maintainer. No DRM maintainer said anything like "I don't want Rust in the kernel" or "Rust is cancer".

[0] You're right that the same issues might have affected C patches, and they would have been raised the same way then. In fact, similar issues were raised in C patches many times before.

Time to step up, Linus/GregKH

Posted Feb 19, 2025 10:27 UTC (Wed) by smurf (subscriber, #17840)

> The case with Asahi Lina was about technical issues

Yeah, well, that's the point: the case *was* about technical issues, those being that adding the Rust bindings uncovered ambiguities and whatnot in the C interface.

But when a maintainer then refuses to engage with the reporter (and instead replies with the kernel ML's equivalent of the "Everything's fine" meme GIFs that are ubiquitous on the 'net), the case ceases to be a *technical* problem in the strict sense of that word.

Time to step up, Linus/GregKH

Posted Feb 19, 2025 13:53 UTC (Wed) by daenzer (subscriber, #7050)

> Yeah, well, that's the point: the case *was* about technical issues,

The text you quoted from my previous comment was in response to "the oppositional stance of the maintainers is extremely difficult to make sense of from a purely-technical standpoint". If a maintainer raising technical issues in patches is "difficult to make sense of from a purely-technical standpoint", maintainers might as well pack up and go shopping.

> those being that adding the Rust bindings uncovered ambiguities and whatnot in the C interface.

I don't think they really "uncovered" anything; the maintainer was already aware of the issues and explained them to the patch author.

> But when a maintainer then refuses to engage with the reporter (and instead replies with the kernel ML's equivalent of the "Everything's fine" meme GIFs that are ubiquitous on the 'net), the case ceases to be a *technical* problem in the strict sense of that word.

Having witnessed that discussion first-hand, I disagree that the maintainer refused to engage with the patch author. There was clearly a communication breakdown, but the maintainer isn't solely responsible for that. Communication is a two-way street.

Do you have a reference to back up the "Everything's fine" claim? My recollection is more like "the patches can't be merged due to these issues", which doesn't imply "everything's fine".

Time to step up, Linus/GregKH

Posted Feb 1, 2025 13:28 UTC (Sat) by ianmcc (subscriber, #88379)

As far as I can tell from skimming some of the mailing list threads, there *is* a problem: there is currently extra work for C developers, and they cannot completely ignore Rust. But it might "simply" be a configuration problem. Namely, https://lwn.net/ml/all/20250131135421.GO5556@nvidia.com/

"It doesn't work like C. Rust builds the PCI bindings always once CONFIG_PCI is turned on. It doesn't matter if no rust driver is being built that consumes those bindings. It won't work like staging does where you can just turn off one driver."

The implication of this, as far as I can tell, is that if you change the API for, say, DMA, get it working in a simple test driver (with all other drivers that use the API disabled, because they won't build anymore), then that would normally be a point where you could start pushing the change to other people (staging?) to work on adapting the other drivers.

But if that API has Rust bindings, then the build will break as soon as CONFIG_RUST is turned on, because that will attempt to build the Rust bindings even if no actual Rust drivers are configured in. And presumably CONFIG_RUST will, at some point, become non-optional if some core code is written in Rust.

But that seems to be a configuration problem, and the fix is to get finer-grained control that only enables the Rust bindings if there are Rust drivers that need them, or if the Rust bindings are explicitly configured separately from CONFIG_RUST?

Time to step up, Linus/GregKH

Posted Feb 4, 2025 4:45 UTC (Tue) by mbp (subscriber, #2737)

It seems like there's an easy fix for people like Christoph who don't want to deal with Rust: build with CONFIG_RUST=n. Perhaps that ought to be a good way to coexist: guarantee that Rust won't ever break the build if it's turned off.

Time to step up, Linus/GregKH

Posted Feb 15, 2025 23:30 UTC (Sat) by lacos (guest, #70616)

> build with CONFIG_RUST=n

Not good enough *if* calling the kernel "releasable" requires "CONFIG_RUST=y" to build.

- The maintainer starts work on a branch at whose fork-off point the kernel is "releasable".
- The maintainer changes C APIs.
- The maintainer fixes up all C-language call sites across the tree.
- The API changes break the Rust bindings.
- The maintainer doesn't care (or even notice) because they build with Rust disabled. Even CI may pass for them, that way. (They might also have some customization for git-grep in order to exclude Rust source files altogether -- i.e., they wouldn't see any API call sites they wouldn't want to fix up.)
- The result, however, is *presumably* an "unreleasable" kernel, because "CONFIG_RUST=y" no longer builds.
- The maintainer ultimately performed a series of actions that turned a releasable kernel into an unreleasable one.
- The Rust-proponent maintainers now have to come in and clean up "after" the original maintainer, in order to restore the tree to "releasable" state.

I perceive this process as one manifestation of Hyrum's Law. I can see why the maintainer wouldn't want it.

Time to step up, Linus/GregKH

Posted Feb 17, 2025 14:55 UTC (Mon) by taladar (subscriber, #68407)

There are really only two options that avoid the scenario you describe.

You can either remove all the features that might potentially be affected by any code touched by any maintainer who does not know how to fix every single consequence of that code change exhaustively. That essentially turns the kernel into a Hello World, because that is likely most features.

Or you can restrict any maintainer from touching any code where they can't exhaustively fix everything that might potentially be affected by their code change. In that case the kernel will likely never be changed again.

The problem you describe is not a Rust problem, it is a problem with an inability to work as a team.