Impact on developer experience
Posted Sep 9, 2024 16:23 UTC (Mon) by tesarik (subscriber, #52705)
Parent article: Testing AI-enhanced reviews for Linux patches
Has anyone considered the impact on the process in practice?
Let me recap my understanding of the situation:
- LLMs are quite good at spotting formal errors, but not so good at spotting logic errors.
- LLMs will review everything sent to the mailing lists, even if there's nothing to say.
- LLMs tend to act faster than humans.
This sounds to me like the first response to every contribution will be an AI-generated review full of nitpicks. The author may go and fix all the formal issues, send a v2, and get another round of AI comments (presumably less relevant now that the low-hanging fruit is gone). After all this, a human reviewer finally gets to read the code, finds a fundamental design flaw, and requests a complete rewrite. I can imagine such an experience may be frustrating, especially to newcomers and occasional contributors.
Posted Sep 9, 2024 20:37 UTC (Mon) by kleptog (subscriber, #1183)
[Link] (13 responses)
It's only because the Linux kernel doesn't have an enforced coding style that this all becomes fuzzy. VSCode has a checkpatch extension to show issues straight away. If there were a more sophisticated AI tool, it would undoubtedly also be integrated into editors, so it's not nearly the burden you think it is. Those first two levels of review would never even be posted to any mailing list at all. Think of the bandwidth savings.
If you were using some kind of forge, all this checking would happen out of sight. But this is the Linux kernel, so all of it has to be deposited in thousands of mailboxes around the globe.
Having one automatically enforced coding style makes it easier for everyone because then you never have to review patches that aren't in the correct style.
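To make that concrete: the first level of checking can already run locally today, before anything is posted. A minimal sketch, assuming a kernel tree checkout and a patch file given on the command line, wrapping the kernel's own checkpatch.pl:

```python
import subprocess
import sys

# Run the kernel's own style checker on a patch before it is sent anywhere.
# Assumes this is executed from the top of a kernel tree; checkpatch.pl
# exits non-zero when it finds problems, so the exit code can gate sending.
result = subprocess.run(
    ["./scripts/checkpatch.pl", "--terse", sys.argv[1]],
    capture_output=True,
    text=True,
)
print(result.stdout, end="")
sys.exit(result.returncode)
```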
Posted Sep 10, 2024 5:17 UTC (Tue) by tesarik (subscriber, #52705)
[Link] (12 responses)
I believe I get your point, but I'm afraid there is a difference in practice. The existing linters (and other tools) are deterministic and can be executed locally by anyone. LLMs are non-deterministic by design, are executed through a service operated by a third party, and can be used only by those who have the corresponding API token. I don't think the Linux kernel's API token would be public, if only for cost reasons and the potential for abuse.

I am not even sure I appreciate the bandwidth argument, unless it refers to the time spent by everybody reading public mailing lists. However, since you mention mailboxes, you seem to be talking about the duplication of content. Well, just think of all the local Git repository clones around the globe. That's a feature! Go to git-scm.com, and lo and behold, one of the slogans: --distributed-is-the-new-centralized. Linux kernel development is decentralized, which naturally causes a lot of duplication. In a world with millions of copies of kittens and puppies transferred every minute, even LKML looks like a drop in the ocean.

Regarding a forge, this sounds like a good idea, but only if all data and metadata remain in an open format that can be moved elsewhere if needed. Look, the email format hasn't substantially changed since RFC2822 (published in 2001), and most LKML messages would render fine even according to RFC822 (published in 1982). There are multiple LKML archives (yes, decentralization again) going all the way back to the 1990s. That's almost 30 years of history that have tested the sustainability of this process. It may not be the best possible process, but at least its strengths and weaknesses are known and understood. If anybody wants to replace it with another process, it is fair to ask if and how the new process preserves the known strengths of the old one. Unfortunately, I haven't seen serious answers to that. Then again, I'm not omniscient, so I'll be grateful if someone here can share a link.
Posted Sep 10, 2024 18:37 UTC (Tue) by kleptog (subscriber, #1183)
[Link] (11 responses)
There is randomness during generation of the output, but the processing of the input is fully deterministic. So if you tell the LLM to produce just a good/bad flag, it will be very deterministic. Constrain its output to a specific JSON format; just don't ask it to write an essay.
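For example, the contract could look something like this. `query_model` below is a hypothetical stand-in for whatever backend is actually used, so treat this as a sketch rather than a real client:

```python
import json

def query_model(prompt: str, temperature: float = 0.0) -> str:
    """Stub standing in for an LLM backend call (an assumption; a real
    client for a local or hosted model would be swapped in here)."""
    return '{"ok": true, "issues": []}'

def review_patch(diff_text: str) -> dict:
    prompt = (
        "Review this kernel patch for style problems only. Respond with "
        'JSON of the form {"ok": <bool>, "issues": [<string>, ...]} '
        "and nothing else.\n\n" + diff_text
    )
    raw = query_model(prompt, temperature=0.0)  # low temperature for repeatability
    verdict = json.loads(raw)
    # Enforce the contract: reject anything that is not the expected shape.
    if not isinstance(verdict.get("ok"), bool) or not isinstance(
        verdict.get("issues"), list
    ):
        raise ValueError("model output violated the JSON contract")
    return verdict

print(review_patch("--- a/foo.c\n+++ b/foo.c\n"))
```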
> are executed through a service operated by a third party and can be executed only by those who have the corresponding API token
This is a temporary problem. We will get smaller, more focused LLMs that target specific tasks and run on your local machine. You don't need ChatGPT-4 to answer questions about C code.
> I am not even sure I appreciate the bandwidth argument,
Yes, I was referring to the developer bandwidth of thousands of people having to look at an email, determine that it's not for them because it's not their part of the kernel or hasn't passed basic linting checks, and discard it and all follow-ups. The nice thing about something like Gerrit is that you can tell it to show you only open patchsets that touch particular files/directories and have passed basic tests.
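That filtering is also queryable programmatically. A sketch against a hypothetical Gerrit host, using Gerrit's standard REST API:

```python
import json
import urllib.request

# Ask a Gerrit instance for open changes touching a given path.
# The host name is made up; /changes/ and the query operators are standard.
url = "https://gerrit.example.org/changes/?q=status:open+file:%22drivers/net/%22"
with urllib.request.urlopen(url) as resp:
    body = resp.read().decode("utf-8")

# Gerrit prefixes JSON responses with ")]}'" to defeat cross-site script
# inclusion; strip it before parsing.
changes = json.loads(body.removeprefix(")]}'"))
for change in changes:
    print(change["_number"], change["subject"])
```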
> Regarding a forge, this sounds like a good idea, but only if all data and meta-data remains in an open format that can be moved elsewhere if needed.
I can't believe this keeps coming up; it's a solved problem. E.g. Gerrit stores all its metadata in the Git repository itself under a separate branch; that's how its replication works. It's all human-readable (at least the bits I've worked with have been).
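For instance, with Gerrit's NoteDb backend, the review metadata for each change lives on an ordinary Git ref, so plain git can fetch and read it (the change number below is made up):

```python
import subprocess

# Fetch the NoteDb review metadata for a hypothetical change 1234.
# Gerrit stores it as ordinary Git commits on refs/changes/<NN>/<change>/meta,
# where <NN> is the last two digits of the change number.
subprocess.run(
    ["git", "fetch", "origin", "refs/changes/34/1234/meta"], check=True
)
subprocess.run(["git", "log", "--format=%B", "FETCH_HEAD"], check=True)
```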
Posted Sep 11, 2024 5:16 UTC (Wed) by tesarik (subscriber, #52705)
[Link] (10 responses)
Great for integration with a forge. Not so great for email communication. Moreover, I'm not sure this is the currently proposed way of using an LLM for kernel development. Not your call, I know.
> We will get smaller more focussed LLMs that target specific tasks and run on your local machine.
Your use of the future tense makes me believe the introduction of LLMs should happen in the future (when such LLMs are readily available), not right now.
Thanks for mentioning Gerrit, because I have never used it myself. And BTW, Gerrit is now roughly as old as email was when LKML started, i.e. approximately 15 years. How do we persuade the Linux Foundation to host an instance? ;-)
Posted Sep 11, 2024 6:53 UTC (Wed) by Wol (subscriber, #4433)
[Link]
Really? Email was born about the same time as me (I'm not sure which is older). And Linux was released a month before I turned 29.
Cheers,
Wol
Posted Sep 11, 2024 8:01 UTC (Wed) by farnz (subscriber, #17727)
[Link] (7 responses)
As far as I can tell, LKML is about 30 years old today, while e-mail is about 60 years old (and Internet e-mail lists are about 50 years old). We have a while to go before Gerrit is as old as Internet e-mail lists were when LKML started, let alone how old e-mail is.
Posted Sep 11, 2024 9:47 UTC (Wed) by Wol (subscriber, #4433)
[Link] (6 responses)
Cheers,
Wol
Posted Sep 11, 2024 10:24 UTC (Wed) by tesarik (subscriber, #52705)
[Link] (5 responses)
Posted Sep 11, 2024 10:47 UTC (Wed) by farnz (subscriber, #17727)
[Link] (4 responses)
RFC822 starts by saying that it's an update to the older RFC733, itself an update of RFC561, which in turn was an attempt to fix RFC524 from 1973, which itself describes a more formal version of something that already existed: mail via FTP in the style of UUCP.
The earliest RFC I can find for Internet e-mail is RFC 196 from 1971, but that references an existing service at the NIC for Internet email, including mailing lists. RFC822 is just the last time there was a throw-away-and-start-again approach to mail transfer between sites on the Internet.
Posted Sep 11, 2024 11:08 UTC (Wed) by tesarik (subscriber, #52705)
[Link] (3 responses)
Maybe I should explain that it has never been my goal to demonstrate how long I've been using the internet or how much I know about its history. My point is that the tools used for the development of the Linux kernel are designed to work with RFC822-compliant emails, and this format was approximately 15 years old when the current development process was established. Likewise, Gerrit will be 15 years old this year.
I have already admitted that this parallel may be inaccurate, because 2009-era Gerrit data may not be compatible with its current format. If anybody still feels the urge to add more replies just because “someone is WRONG on the internet”, feel free to use your freedom of speech, but I'm not participating.
Posted Sep 11, 2024 12:16 UTC (Wed) by Wol (subscriber, #4433)
[Link] (2 responses)
Note that RFC822 does not describe Internet email, because RFC822 pre-dates the internet :-) (by about four months). Oh, and when was the "current development process" established? Because RFC822 pre-dates Linux itself by between 15 and 20 years.
I hope your programming is more pedantic than your chronology! :-)
Cheers,
Posted Sep 11, 2024 12:39 UTC (Wed) by farnz (subscriber, #17727)
[Link] (1 responses)
Perhaps the deeper point is that there's nothing new introduced by RFC822 - the same process could easily have been built around UUCP e-mails back in 1971. Picking on RFC822 is a lot like saying that you couldn't have a process based around downloading files from remote servers until 1996 and RFC 1945.
Posted Sep 11, 2024 13:38 UTC (Wed) by Wol (subscriber, #4433)
[Link]
Everything now uses "internet domains", but back then if I'd had an internet address it would have been along the lines of "wol@ac.open@janet", ie deliver to the JANET network, within that send it to the Open University, and then on to me.
The envelope, as specified by 821 and friends, has probably changed MUCH more than 822 over the years, as is obvious from my made-up address above! The @'s might even have been !'s. (didn't they used to be called "bang-addresses"?)
Cheers,
Wol
Posted Sep 11, 2024 15:00 UTC (Wed) by kleptog (subscriber, #1183)
[Link]
The output of an LLM should never be wired directly to outgoing email. It should be just one part of a larger system. Using LLMs in conversational form is great for casual use, but not for real work.
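A toy sketch of that shape, with all names illustrative: the model's verdict (say, the JSON contract from earlier in the thread) is one input to a gate, and deterministic tooling always has the final say about whether anything reaches a human:

```python
# Toy gate: nothing is surfaced unless deterministic tooling or a concrete
# LLM finding says so; the LLM alone never triggers outgoing mail.
def should_notify(checkpatch_errors: int, llm_verdict: dict) -> bool:
    if checkpatch_errors > 0:
        return True  # deterministic checks always win
    # Only surface LLM findings that are concrete issues, not chatter.
    return bool(llm_verdict.get("issues")) and not llm_verdict.get("ok", True)

print(should_notify(0, {"ok": False, "issues": ["possible NULL dereference"]}))
```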
> Your use of future tense makes me believe the introduction of LLMs should happen in the future (when such LLMs are readily available) and not right now.
I think people should start thinking now about the kinds of checks they would like to use LLMs for. You know, prototype things and see whether they can actually do what we want them to do.
Once there is something that works, we can look into making it smaller and more efficient so it can be run locally. Maybe you don't need an LLM at all. The first computers were behemoths; it took a while to get them smaller and more efficient. LLMs will take time to follow that path. That doesn't mean they're not useful now.