Rust code review and netdev
A fast-moving patch set—seemingly the norm for Linux networking development—seeks to add some Rust abstractions for physical layer (PHY) drivers. Lots of review has been done, and the patch set has been reworked frequently in response to those comments. Unfortunately, the Rust-for-Linux developers are having trouble keeping up with that pace. There is, it would appear, something of a disconnect between the two communities' development practices.
On October 26, Tomonori Fujita posted version 7 of his patch set containing the infrastructure for creating some PHY drivers using Rust; it also includes a driver for the Asix AX88772A PHY. The first version was posted on October 2 and it was preceded by three RFCs starting on September 13. In response to the cover letter for version 7, Rust-for-Linux developer Miguel Ojeda suggested that Fujita slow the pace some:
This patch series has had 8 versions in a month. It would be better to wait more between revisions for this kind of patch series, especially when there is discussion still going on in the previous ones and it is a new "type" of code.

I understand you are doing it so that people can see the latest version (and thus focus on that), and that is helpful, but sending versions very quickly when the discussions are ongoing splits the discussions into several versions. That makes it more confusing for reviewers, and essentially discourages people from being able to follow everything.
But Andrew Lunn, who has been doing a lot of review of the patches, pointed out that the pace is what is expected for the Linux networking subsystem: "[...] netdev moves fast, review comments are expected within about 3 days". He said that new versions of a patch set should not be posted for at least 24 hours, but the fast pace is meant to drive a resolution sooner; "[...] posting a new version is a way to kick reviewers into action". But Rust-for-Linux developer Boqun Feng wondered if posting updated patch sets so frequently really meets that goal: "it's sometimes [frustrating] to see new version post but old comments were not resolved, it rather discourages reviewers".
Lunn also noted that networking patches do not require reviews in order to get merged; "If there is no feedback within three days, and it passes the CI [continuous-integration] tests, it likely will be merged." But Ojeda said that CI tests are not sufficient to determine if the abstractions are "well-designed or sound, which is the key property Rust abstractions need" (see this post for a description of "sound" in Rust terms); he hoped that there were humans in the loop. Lunn replied that humans ultimately decide whether to merge the code, but that API problems are just bugs like any other—they can be fixed later if any are found.
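Ojeda's distinction between "passing CI" and being "sound" can be made concrete. The sketch below is purely hypothetical (it is not from the PHY patch set; the `Registers` type and its fields are invented for illustration): a safe method wrapping an unsafe pointer read is sound only if no safe caller can ever reach undefined behavior—a design property that a passing test run cannot demonstrate by itself.

```rust
// Hypothetical sketch of "soundness": `Registers` wraps a raw,
// MMIO-style pointer, and the safe `read()` method must make
// undefined behavior unreachable from safe code.
struct Registers {
    base: *mut u32,
    len: usize, // invariant: `base..base+len` is valid for reads
}

impl Registers {
    // Sound: the bounds check upholds the invariant, so safe callers
    // can never read out of range. Dropping the check would still pass
    // any test that only uses in-range indices -- but the API would
    // then be unsound, even though CI stays green.
    fn read(&self, index: usize) -> Option<u32> {
        if index < self.len {
            // SAFETY: `index < len`, and the invariant above says the
            // whole range is valid for reads.
            Some(unsafe { self.base.add(index).read_volatile() })
        } else {
            None
        }
    }
}

fn main() {
    // Plain heap memory stands in for a device register mapping.
    let mut backing = vec![0x11u32, 0x22, 0x33];
    let regs = Registers { base: backing.as_mut_ptr(), len: backing.len() };
    assert_eq!(regs.read(1), Some(0x22));
    assert_eq!(regs.read(3), None); // out of range: rejected, not UB
}
```

The point of the example is that the safety argument lives in the `SAFETY` comment and the type's invariant, not in any test—which is why this kind of review takes human time.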
In another part of the thread, Ojeda described the differences he sees between the processes for the netdev community and that of the (nascent) Rust-for-Linux effort. There is a need for the project to do careful review of "a whole new subsystem API" in a new language, he said. Another difficulty is that the Rust-for-Linux project simply does not have enough reviewers and/or review bandwidth to keep up with the netdev pace. "So, if you appreciate/want/need our feedback, you will need to be a bit more patient."
Meanwhile, though, he also quoted parts of the netdev documentation that indicate posting a new revision is not the recommended path to speed things along. In particular: "Do not post a new version of the code if the discussion about the previous version is still ongoing, unless directly instructed by a reviewer." In fact, that is the norm in the rest of the kernel community: posting a new patch set before the discussion has died down is generally unwelcome.
Networking maintainer Jakub Kicinski acknowledged the Rust-for-Linux project's needs, but pointed out that the netdev process is meant to make it possible to handle an enormous patch volume: "Longer review cycles would make keeping track of patches and discussions unmanageable." He wondered if the Rust-for-Linux project would be less involved in patch review after the initial stages. Ojeda agreed that was the goal, but said that the initial set of abstractions will require more review time:
In addition, for Rust, we are trying to get the very first abstractions that get into mainline as reasonably well-designed as we can (and ideally "sound"), so that they can serve as an example. This takes time, and typically several iterations.
There is also a need to spend some time on "getting people on the same page around what is expected from Rust code and abstractions", he said. There is more to it than just the design and logic in the code; formatting is also part of what needs to be hammered out.
A "trivial example" of that is the maximum line length that Lunn had pointed out as a problem in his review. Ojeda said that the Rust-for-Linux project would like to have consistent formatting for its code throughout the kernel, hopefully maintained in an automated fashion, so those differences need to be ironed out kernel-wide if possible. The plan is to provide a solid foundation so that the networking developers "can actually handle any patches swiftly without having to wait on us"—but it is going to take some time to get there.
Lunn cautioned that it may be difficult to achieve the consistency the project is hoping for—at least for line lengths. The netdev community prefers 80-character lines, while other subsystems use 100-character lines and may not like the shorter lines. In the end, formatting is not really a technical problem, but a social issue, Lunn said:
In order to keep the netdev people happy, you are going to limit it to 80. But i would not be too surprised if another subsystem says the code would be more readable with 100, like the C code in our subsystem. We want Rust to be 100 as well.

Linux can be very fragmented like this across subsystems. Its just the way it is, and you might just have to fit in.
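For what it is worth, rustfmt reduces the disputed knob to a single, automatically enforced setting. This is a hypothetical configuration fragment (the kernel's actual rustfmt configuration may differ), shown only to illustrate how small the technical side of the disagreement is compared to the social side:

```toml
# Hypothetical .rustfmt.toml fragment: the whole 80-vs-100 dispute
# comes down to one value that the tool then enforces everywhere.
max_width = 100   # netdev would prefer 80 here
edition = "2021"
```

Because the value is tree-wide, picking one number per subsystem would require exactly the kind of out-of-band agreement Lunn is describing.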
Ojeda seems adamant that the Rust code will be consistently formatted throughout the kernel, which is a departure from the (lack of) consistency of the existing C code. We will have to see how that plays out over time. The discussion highlights some of the problems that will need to be worked out when two disparate development communities "collide". The PHY abstractions and the driver using them would be the first Rust kernel feature that is visible to both driver writers and regular users, so it is no surprise that the developers want to take the time to get it right.
The Linux development pace is far faster than that of most other development projects, which may simply be too fast for the kernel Rust project—at least at this point. On the other hand, the Rust APIs do not have to be set in stone, as Lunn noted; they are internal to the kernel and, thus, can be changed like any other kernel API. One senses that the two projects will find a balance that works for both, perhaps assisted by conversations at the upcoming Linux Plumbers Conference in mid-November.
Posted Oct 31, 2023 19:55 UTC (Tue)
by roc (subscriber, #30627)
[Link] (9 responses)
When you report a bug or submit a change and no-one ever responds because Linux development does not assign responsibility for responding to bugs or code submissions ... is that considered "fast"?
Posted Oct 31, 2023 20:30 UTC (Tue)
by pizza (subscriber, #46)
[Link] (5 responses)
....You're confusing "pace" with "process".
Posted Nov 1, 2023 1:23 UTC (Wed)
by IanKelling (subscriber, #89418)
[Link] (4 responses)
Huh? In context, it doesn't make sense to me.
Posted Nov 1, 2023 3:51 UTC (Wed)
by pizza (subscriber, #46)
[Link] (3 responses)
The former is an example of [an unfounded expectation of] process, whereas the latter is an example of development pace.
Posted Nov 1, 2023 6:43 UTC (Wed)
by roc (subscriber, #30627)
[Link] (1 responses)
Posted Nov 9, 2023 17:46 UTC (Thu)
by pawel44 (guest, #162008)
[Link]
Posted Nov 1, 2023 7:55 UTC (Wed)
by IanKelling (subscriber, #89418)
[Link]
Posted Nov 1, 2023 1:40 UTC (Wed)
by IanKelling (subscriber, #89418)
[Link] (1 responses)
Posted Nov 1, 2023 5:09 UTC (Wed)
by wtarreau (subscriber, #51152)
[Link]
When a series is ignored, it's just because everyone thinks "not sure about this one, let's not risk to send a misleading response, it's probably for someone else who knows better than me".
Posted Nov 1, 2023 8:38 UTC (Wed)
by IanKelling (subscriber, #89418)
[Link]
Posted Oct 31, 2023 21:16 UTC (Tue)
by pbonzini (subscriber, #60935)
[Link]
I think the speed at which internal Linux APIs change is overblown. As someone who has taken part in the RHEL Frankenkernel development, I have seen backports to Linux versions that are several years old (remember that Linux gets close to 70,000 commits every year) and they were time consuming but totally feasible, because large tree-wide or subsystem-wide changes are not even that frequent. Ironically, I have been working recently on a backport that was completely hellish not so much because of changes to the internal APIs, but simply because of file renames and changes to Kconfig symbols.
While minor changes to the APIs do happen, it is reasonable to expect a lot more care for the review of the very first Rust version of an API that has been evolving literally for decades but, at this point, should have reached a relatively firm shape.
Posted Oct 31, 2023 22:17 UTC (Tue)
by jhoblitt (subscriber, #77733)
[Link] (2 responses)
Posted Nov 1, 2023 0:42 UTC (Wed)
by Paf (subscriber, #91811)
[Link] (1 responses)
Posted Nov 1, 2023 2:05 UTC (Wed)
by admalledd (subscriber, #95347)
[Link]
I have some confidence that while there is a bit of a bumpy road ahead as things are sorted out, it will all be for the better. Just going to be challenges like this impedance mismatch along the way.
Aside, on the formatting thing: I do think Rust-For-Linux people have a better chance for a nearly-unified style thanks to rustfmt and such, but fully expect it to still end up being a bit of per-subsystem/maintainer like the C is today. Maybe a bit less, but still a mixup.
Posted Nov 1, 2023 9:45 UTC (Wed)
by sdalley (subscriber, #18550)
[Link] (1 responses)
A well-known software engineering rule-of-thumb is that for every stage in which bugs are not found and fixed, their cost goes up by a factor of <significant-number>. Interface APIs can and do affect the whole conceptual structure of the code that uses them. It really does pay to get them right first time.
Posted Nov 1, 2023 9:53 UTC (Wed)
by Wol (subscriber, #4433)
[Link]
Cheers,
Posted Nov 1, 2023 13:59 UTC (Wed)
by atnot (subscriber, #124910)
[Link] (14 responses)
I really do see the advantages of a technically decentralized transport layer like Email for the kernel, don't get me wrong. But it does feel worth noting that this is a very widely solved problem outside of the kernel workflow.
Posted Nov 1, 2023 14:07 UTC (Wed)
by mb (subscriber, #50428)
[Link] (1 responses)
Posted Nov 1, 2023 14:19 UTC (Wed)
by atnot (subscriber, #124910)
[Link]
Posted Nov 1, 2023 17:33 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link] (8 responses)
On the other hand:
* especially with GitHub, there is a tendency for old revisions to disappear forever
* it is impossible to compare the differences between the revisions. Good luck getting something like this from gitlab or github.
Posted Nov 1, 2023 19:57 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
Posted Nov 1, 2023 21:38 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link]
The existence of squash-on-merge is evidence that the platform teaches you not to bother. But it's a bad habit nevertheless, and one that bites you as soon as your projects becomes large enough that bisecting for regressions is a thing.
Posted Nov 1, 2023 20:15 UTC (Wed)
by geofft (subscriber, #59789)
[Link] (5 responses)
The forges use a Git-native workflow which requires that patches are commit objects. You can certainly compare two commit objects against each other, but the forges, following the lead of git diff, will now reinterpret the commit objects as trees of their final state, and thus compare the two final states against each other. It doesn't get you the rich "how has each patch in the series changed" view that the link you provided gets, and in particular it gets very useless if one of the changes is rebasing against a newer upstream version, because it will pull in all the changes that have been rebased as part of the comparison. With the email workflow, there isn't even an explicit base.
GitLab actually has a button for comparing previous versions of a pull request, but in practice using it at $dayjob, it's totally useless for rebases. I have resorted to pulling the branches/commits locally and manually using diff <(git show ...) <(git show ...) to get a useful view of what's new. The best way to handle this, I think, is for a patch author that's rebasing to rebase and fix as few conflicts as possible, force-push just that, and then make any updates they want to make and force-push again, so you can compare their v2 and v3 even if you can't compare their v1 and v2. But this is not at all obvious.
Also, now that you mention it, I suspect this is why forge users who have never seen/experienced the email workflow tend to make a new git commit -m "Addressed code review comments" instead of amending the existing patch. I never really understood why people do this, but now it makes sense to me: if the object that matters is the final tree, then this approach updates the tree just as well, and it also gives you a standalone commit that you can use to see the changes between versions that is easier to use in the forge-based workflow (and gives you a spot to put the "Changes since the previous revision..." comment that would go in your patch 0 email in the email workflow).
So I think I'm now convinced that forges should treat patches in patch format as first-class objects and reconstruct the commits on their end, and this would all work a lot better. This could be through accepting pushes like they do now, or through some other mechanism that's closer to git format-patch.
(Now that I think about it, Phabricator/Phorge does more or less this, though it only supports a single patch per change request/code review/"revision" in Phabricator terminology. But I think I have seen it do a decent job of the interdiff, subject to that constraint.)
Posted Nov 1, 2023 22:25 UTC (Wed)
by kleptog (subscriber, #1183)
[Link] (2 responses)
There's a lot of hate on the Gerrit workflow but the way it treats the individual commits as the patches and not just as trees makes it really useful for certain styles of development, and being required to use GitLab/GitHub feels like a real step back whenever I use it.
I think you've nailed why GitLab doesn't just steal the algorithm from Gerrit, but it's because GitLab just looks at the trees rather than considering the commits as the objects of interest.
Posted Nov 2, 2023 11:36 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Posted Nov 2, 2023 21:59 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link]
Posted Nov 1, 2023 22:36 UTC (Wed)
by Tobu (subscriber, #24111)
[Link] (1 responses)
Are you aware of git range-diff, for comparing two patch series / commit ranges? I certainly feel more empowered on the CLI side, platforms have a tendency to de-prioritise serving their users once they reach critical mass when local tools get better the more one uses them. GitHub is the forge I've used most and the user experience is degrading these days (with a new code browser that feels heavier and less useful now that it doesn't serve full, searchable text).
Posted Nov 2, 2023 12:49 UTC (Thu)
by jezuch (subscriber, #52988)
[Link]
Posted Nov 3, 2023 2:54 UTC (Fri)
by jschrod (subscriber, #1646)
[Link] (2 responses)
Forge-based workflows are good for centrally-controlled developments.
Do you really think that a forge-based workflow would work in *all* of the Linux kernel subcommunities?
Frankly, then you would have to present more detailed ideas to support your point of view.
Posted Nov 4, 2023 10:35 UTC (Sat)
by snajpa (subscriber, #73467)
[Link] (1 responses)
Posted Nov 4, 2023 10:52 UTC (Sat)
by snajpa (subscriber, #73467)
[Link]
Some bugs are 10 times as serious as others
> Lunn replied that humans ultimately decide whether to merge the code,
> but that API problems are just bugs like any other—they can be fixed later
> if any are found.
Wol
- you can add new changes at any time without completely resending everything
- review comments persist across revisions, unless they are marked as resolved
- ACKs and NACKs for previous versions are preserved in the history
- you can easily review just the diff since you last reviewed some changes
- adding 5 new changes doesn't create 5 new items in the reviewers todo list to sort through and be annoyed about
- changes don't vanish into the ether just because they haven't been resent for a few days
This is, ironically, because the forges make more use of Git than the email workflow. The email workflow conceives of patches primarily as diffs, in their rendered text format, so the concept of an interdiff between the two diffs is natural.
But that won't work for the extremely decentralized development of the Linux kernel.