
An update on the UMN affair

By Jonathan Corbet
April 29, 2021
On April 20, the world became aware of a research program conducted out of the University of Minnesota (UMN) that involved submitting intentionally buggy patches for inclusion into the Linux kernel. Since then, a paper resulting from this work has been withdrawn, various letters have gone back and forth, and numerous patches from UMN have been audited. It's clearly time for an update on the situation.

The writing of a paper on this research [PDF] was not the immediate cause of the recent events; instead, it was the posting of a buggy patch originating from an experimental static-analysis tool run by another developer at UMN. That led developers in the kernel community to suspect that the effort to submit intentionally malicious patches was still ongoing. Since then, it has become apparent that this is not the case, but by the time the full story became clear, the discussion was already running at full speed.

The old saying still holds true: one should not attribute to malice that which can be adequately explained by incompetence.

On April 22, a brief statement was issued by the Linux Foundation technical advisory board (or TAB, of which your editor is a member) stating that, among other things, the recent patches appeared to have been submitted in good faith. Meanwhile, the Linux Foundation and the TAB sent a letter to the UMN researchers outlining how the situation should be addressed; that letter has not been publicly posted, but ZDNet apparently got a copy from somewhere. Among other things, the letter asked for a complete disclosure of the buggy patches sent as part of the UMN project and the withdrawal of the paper resulting from this work.

In response, the UMN researchers posted an open letter apologizing to the community, followed a few days later by a summary of the work they did [PDF] as part of the "hypocrite commits" project. Five patches were submitted overall from two sock-puppet accounts, but one of those was an ordinary bug fix that was sent from the wrong account by mistake. Of the remaining four, one of them was an attempt to insert a bug that was, itself, buggy, so the patch was actually valid; the other three (1, 2, 3) contained real bugs. None of those three were accepted by maintainers, though the reasons for rejection were not always the bugs in question.

The paper itself has been withdrawn and will not be presented in May as was planned. One can, hopefully, assume that UMN will not be pursuing similar lines of research anytime soon.

Patch re-review

One immediate result of the attention drawn to UMN's activities was a loss of trust in its developers, combined with a desire in some quarters to take some sort of punitive action. Thus, one of the first things that happened when this whole affair exploded was the posting by Greg Kroah-Hartman of a 190-part patch series reverting as many patches from UMN as he could find. Actually, it wasn't all of them; he mentioned a list of 68 others requiring manual review because they do not revert easily.

As it happens, these "easy reverts" also needed manual review; once the initial anger passed there was little desire to revert patches that were not actually buggy. That review process has been ongoing over the course of the last week and has involved the efforts of a number of developers. Most of the suspect patches have turned out to be acceptable, if not great, and have been removed from the revert list; if your editor's count is correct, 42 patches are still set to be pulled out of the kernel.

For those 42 patches, the reasoning behind the revert varies from one to the next. In some cases, the patches apply to old and presumably unused drivers and nobody can be bothered to properly review them. In others, the intended change was done poorly and will be reimplemented in a better way. And some of the patches contained serious errors; these definitely needed to be reverted (and should not have been accepted in the first place).

A look at the full set of UMN patches reinforces some early impressions, though. First is that almost all of them do address some sort of real (if obscure and hard to hit) problem; there was a justification for writing a patch. While many of these fixes showed a low level of understanding of what the code was doing and thus contained errors, it seems unlikely that any of them were malicious in their intent.

That said, there are multiple definitions of "malice". To some of the developers involved, posting unverified patches from an experimental static-analysis tool without disclosing their nature is a malicious act. It is another form of experiment involving non-consenting humans. At a minimum, it is a violation of the trust that is required for the kernel's development community to work effectively.

The 42 bad patches out of 190 is a 22% bad-patch rate. Chances are, a detailed review of 190 patches from almost any kernel developer would turn up a few that, in retrospect, were not a good idea. Hopefully that rate would not approach 22%, though. But it must be said that all of those patches were accepted by subsystem maintainers throughout the kernel, which is not a great result. Perhaps that is a more interesting outcome than the one that the original "hypocrite commit" researchers were looking for. They failed in their effort to deliberately insert bugs, but were able to inadvertently add dozens of them.
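As a quick check on that figure, the arithmetic can be sketched in a few lines of Python. The two counts come from the text above; the function name `bad_patch_rate` is purely illustrative, and the result inherits the looseness of the word "bad", which here covers every patch still on the revert list regardless of the reason.

```python
def bad_patch_rate(bad: int, total: int) -> float:
    """Return the fraction of patches judged bad; total must be positive."""
    if total <= 0:
        raise ValueError("total must be a positive patch count")
    return bad / total


# Counts quoted in the article: 42 patches still slated for reversion
# out of the 190 in the original revert series.
rate = bad_patch_rate(42, 190)
print(f"{rate:.1%}")  # prints 22.1%
```

Note that the denominator is only the 190 cleanly reverting patches; the separate list that did not revert easily is not part of this calculation.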

Meanwhile, there is still the list of patches that did not revert cleanly. That list has not been posted publicly, but Kroah-Hartman did start with a subset of seven of them. He also noted that the TAB will be publishing a full report of the audit of all these patches once it is complete. Thus far, none of these patches have actually been reverted in the mainline; that seems likely to happen toward the end of the 5.13 merge window.

Lessons learned

One of the key lessons from this series of events would clearly be: do not use a free-software development community as a sort of free validation service for your experimental tool. Kernel developers are happy to see new tools created and — if the tools give good results — use them. They will also help with the testing of those tools, but they are less pleased to be recipients of tool-inspired patches that lack proper review and an explanation of what is going on.

Another lesson is something we already knew: kernel maintainers (and maintainers of many other free-software projects) are overworked and do not have the time to properly review every patch that passes through their hands. They are, as a result, forced to rely on the trustworthiness of the developers who submit patches to them. The kernel development process is, arguably, barely sustainable when that trust is well placed; it will not hold together if incoming patches cannot, in general, be trusted.

The corollary — also something we already knew — is that code going into the kernel is often not as well reviewed as we like to think. It is comforting to believe that every line of code merged has been carefully vetted by top-quality kernel developers. Some code does indeed receive that kind of review, but not all of it. Consider, for example, the 5.12 development cycle (a relatively small one), which added over 500,000 lines of code to the kernel over a period of ten weeks. The resources required to carefully review 500,000 lines of code would be immense, so many of those lines, unfortunately, received little more than a cursory looking-over before being merged.

One final lesson that one might be tempted to take is that the kernel is running a terrible risk of malicious patches inserted by actors with rather more skill and resources than the UMN researchers have shown. That could be, but the simple truth of the matter is that regular kernel developers continue to insert bugs at such a rate that there should be little need for malicious actors to add more. The 5.11 kernel, released in February, has accumulated 2,281 fixes in stable updates through 5.11.17. If one makes the (overly simplistic) assumption that each fix corrects one original 5.11 patch, then 16% of the patches that went into 5.11 have turned out (so far) to be buggy. That is not much better than the rate for the UMN patches.
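The article gives the number of fixes (2,281) and the resulting rate (16%) but not the number of patches that went into 5.11. A rough sketch, assuming the 16% is rounded to the nearest whole percent, can recover the range of totals consistent with those two figures. The function name `implied_total_range` is hypothetical, and the output is an inference from the quoted numbers rather than an official changeset count.

```python
import math


def implied_total_range(fixes: int, rounded_percent: int) -> tuple[int, int]:
    """Return the smallest and largest totals N for which fixes / N
    rounds to rounded_percent (nearest whole percent)."""
    lo_rate = (rounded_percent - 0.5) / 100  # lowest rate that still rounds up
    hi_rate = (rounded_percent + 0.5) / 100  # rates must stay below this
    smallest = math.floor(fixes / hi_rate) + 1
    largest = math.floor(fixes / lo_rate)
    return smallest, largest


# Figures quoted in the article: 2,281 stable fixes, a 16% rate.
smallest, largest = implied_total_range(2281, 16)
print(f"5.11 patch count implied by the article: {smallest:,} to {largest:,}")
```

Under that rounding assumption the implied total lands somewhere between roughly 13,800 and 14,700 patches. The comparison with the UMN figure (42 of 190, about 22%) is therefore between two rates built on different definitions of "buggy", which is why the article calls its own assumption overly simplistic.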

So perhaps that's the real lesson to take from this whole experience: the speed of the kernel process is one of its best attributes, and we all depend on it to get features as quickly as possible. But that pace may be incompatible with serious patch review and low numbers of bugs overall. For a while, we might see things slow down a little bit as maintainers feel the need to more closely scrutinize changes, especially those coming from new developers. But if we cannot institutionalize a more careful process, we will continue to see a lot of bugs and it will not really matter whether they were inserted intentionally or not.


An update on the UMN affair

Posted Apr 29, 2021 15:06 UTC (Thu) by willy (subscriber, #9762) [Link] (10 responses)

I'm not entirely convinced that Hanlon's Razor applies to someone who already bragged about punching you in the nose.

An update on the UMN affair

Posted Apr 29, 2021 15:55 UTC (Thu) by yann-gael.gueheneuc (guest, #151961) [Link] (6 responses)

"UMN will not be pursuing similar lines of research anytime soon."

Hopefully, they will continue their research but in a different, collaborative manner.

An update on the UMN affair

Posted Apr 29, 2021 20:37 UTC (Thu) by mpg (subscriber, #70797) [Link] (5 responses)

I had the same reaction to this sentence: I think this is a very interesting line of research and I hope some researchers will pursue this, with better ethics and stronger methodology, and as you say, in a collaborative manner: experimenting _with_ the community rather than _on_ people without their informed consent.

An update on the UMN affair

Posted Apr 30, 2021 8:43 UTC (Fri) by patrick_g (subscriber, #44470) [Link] (4 responses)

> I think this is a very interesting line of research

Could you please explain what is interesting in this research? We already know it's possible to introduce vulnerabilities by sending buggy patches to kernel maintainers. And the concept of "hypocrite commits" (minor commit introducing an exploitable issue elsewhere in the code) is not original. Only the name is new.
So I fail to see what is valuable in this research paper.

An update on the UMN affair

Posted Apr 30, 2021 10:32 UTC (Fri) by dsommers (subscriber, #55274) [Link] (1 responses)

It is interesting in the view of seeing and understanding better how an open source community can be tricked into committing code which looks reasonable separately, but will be stepping stones to create a functional attack vector combined. This is highly relevant in today's security discussions related to supply chain attacks.

Open source communities need to better understand how to defend themselves and how to detect such attempts. Which will be an enormous challenge, but with more research it might be possible to find approaches to make such efforts harder to achieve.

An update on the UMN affair

Posted Apr 30, 2021 19:14 UTC (Fri) by viro (subscriber, #7872) [Link]

Bloody hell... What is the relevance of malicious intent, other than improving the odds of acceptance by the conference where they planned to present that... research?

You seem to imply that being a part of malicious plan to introduce a security hole imparts some recognizable features to the patches, making them easier to catch than "innocent" buggy ones. Mind elaborating on that and showing some kind of evidence?

Research into the features that correlate with looser review would be very valuable, exactly because it would allow us to improve the rejection rate for crap. But that would take real experiment design - valid statistics, decently-sized datasets, etc.

An update on the UMN affair

Posted Apr 30, 2021 12:30 UTC (Fri) by mpg (subscriber, #70797) [Link] (1 responses)

I agree that this particular experiment does not really bring valuable data to the table. I don't think "showing it can be done" is very interesting: as you said, we already knew that. I think what would be interesting is better papers studying the review process in more depth, and hopefully helping the community improve.

You know the old saying: if you want to improve something, start by measuring it. I think measuring how easily it can be done (rather than just "it can be done"), or what percentage of bad commits get caught, how that percentage can be correlated to various parameters (type of patch, its size, which area of the kernel, probably parameters I can't think of), etc. could perhaps provide insights into how to improve. And obviously it would have to be done in close collaboration with the community and aimed at producing value to the community, not taking advantage of it.

So perhaps we interpreted "similar lines of research" differently here.

An update on the UMN affair

Posted May 6, 2021 1:24 UTC (Thu) by yann-gael.gueheneuc (guest, #151961) [Link]

Also, patches and reviews are done by people (albeit developers!) and it would be certainly interesting to understand the factors that lead these people to miss bugs. It would certainly benefit the community to identify, understand, and mitigate these factors; this has been done in other "safety critical" domains, e.g., plane pilots and doctors (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1117774/)

An update on the UMN affair

Posted Apr 29, 2021 16:58 UTC (Thu) by blackwood (guest, #44174) [Link]

Yeah in this case it seems to very much indicate a penchant for playing things fast&loose, which seems to be supported by the higher bug rate in patches from this group overall compared to maybe the kernel at large.

I think what applies here is that generally it's best to trust people first and assume benevolent intent. But then ensure that good intent is socially enforced by playing tit-for-tat.

Which is also what happened, even if patches from UMN will land again in the future, they will definitely be subjected to a lot more scrutiny and need to clear a higher bar of usefulness than fairly "trivial" changes to mostly unused code in very old drivers.

An update on the UMN affair

Posted Apr 29, 2021 22:58 UTC (Thu) by Paf (subscriber, #91811) [Link] (1 responses)

In their (sigh) defense, the UMN didn’t brag about it - it was this specific group of researchers. The hunt through every patch ever submitted by the U of M (ever!) seems ... frankly, a little silly though not harmful, and the idea of a blanket revert straight up a self inflicted wound with no useful purpose at *all*.

An update on the UMN affair

Posted May 1, 2021 9:37 UTC (Sat) by dvdeug (guest, #10998) [Link]

This specific group of researchers from UMN. Given that they were all posting from UMN addresses, even without malice, there's no easy way for outsiders to separate this specific group of researchers from other people from UMN. We can always say "this specific group", but the usefulness and ease of separating that group is often lacking.

An update on the UMN affair

Posted Apr 29, 2021 15:37 UTC (Thu) by jkingweb (subscriber, #113039) [Link]

Sober analysis, Mr. Corbet. Thank you for the update.

An update on the UMN affair

Posted Apr 29, 2021 17:26 UTC (Thu) by intgr (subscriber, #39733) [Link] (9 responses)

> For those 42 patches, the reasoning behind the revert varies from one to the next.
> In some cases, the patches apply to old and presumably unused drivers and nobody
> can be bothered to properly review them.

> The 42 bad patches out of 190 is a 22% bad-patch rate.

I don't think it's fair to call all of these patches "bad" if in some cases they will be reverted only for the reason that nobody can be bothered to review them.

In other cases, the patches touched some piece of code that was indeed found to be buggy, but did not entirely address the brokenness. I found many examples of this, e.g. https://lore.kernel.org/lkml/b43fc2b0-b3cf-15ab-7d3c-25c1... :
> This looks like a good commit but should be done now in a different way
> - using pm_runtime_resume_and_get(). Therefore I am fine with revert
> and I can submit later better fix.

A far more interesting metric would be, how many of these patches were actually found to introduce a bug that wasn't present before? Looking at a random sample of the patch reviews suggests it would be quite low.

Even the one patch that triggered this whole reaction (https://lore.kernel.org/linux-nfs/20210407001658.2208535-...) was actually harmless, but added some redundant code.

I'd like to nominate this whole drama for the "overreaction of the year" title. :)

Also: shouldn't LWN issue some sort of retraction or update to the previous article? It implied that these patches were intentionally buggy or malicious a few times, which this article admits was unfounded.

An update on the UMN affair

Posted Apr 29, 2021 18:29 UTC (Thu) by deater (subscriber, #11746) [Link]

> I'd like to nominate this whole drama for the "overreaction of the year" title. :)

this particular issue is complex as it affects three areas. The Linux-kernel reaction is pretty well summarized in this article.

There's the IEEESSP (Oakland) issue where a paper like this never should have made it into a "top" conference, especially as it turns out the authors were doing questionable things like modifying the abstract after the paper was already submitted, and apparently their conclusions in the paper were extremely misleading when actually compared to the actual methodology that was released later. How did this pass peer review? Sadly modern academic publications with double-blind reviews are well-known to be easily vulnerable to peer-review attacks (see the ISCA'19 incident).

The final issue is the academic side of things at UMN. Typically the university would not care at all about this (though since the people involved do not have tenure they're on thinner ice than normal). Possibly UMN is even happy as their CS department is getting a lot of publicity out of this. Universities will put up with all kinds of academic dishonesty, abuse of students, etc, as long as the researchers bring in grant money (again, see ISCA'19, also Ullman winning the Turing award). However the wildcard here is the UMN researchers listed $800k of National Science Foundation grants on the withdrawn paper. NSF has been on a *huge* ethics kick recently. I'm assuming this is why UMN is responding so strongly, there's a chance this funding will be removed, or worse, their whole school will get audited. Sadly money talks these days :(

Retraction

Posted Apr 29, 2021 19:45 UTC (Thu) by corbet (editor, #1) [Link] (6 responses)

So, I just went back and reread the original article, but I find nothing there that seems to be in need of retraction. It was a report on what was happening at the time, and the article itself took no position with regard to the status of the other patches. If there is something specific there that you think needs to be retracted, please point it out to us.

Retraction

Posted Apr 29, 2021 20:21 UTC (Thu) by Homer512 (subscriber, #85295) [Link] (2 responses)

The retraction is likely in response to the unethical practice of human experimentation on the kernel devs without prior consent. Fruits from the poisonous tree, so to speak. Whether this is an overreaction is up for debate.

Retraction

Posted Apr 29, 2021 20:29 UTC (Thu) by corbet (editor, #1) [Link] (1 responses)

I think I must not have been clear, sorry. The comment that I was replying to was asking LWN to retract our previous article on this topic; I was saying that I don't understand why that would be necessary.

Retraction

Posted Apr 29, 2021 20:46 UTC (Thu) by bronson (subscriber, #4806) [Link]

Calling for it to be retracted also seems eligible for an overreaction award.

Retraction

Posted Apr 29, 2021 21:43 UTC (Thu) by mpg (subscriber, #70797) [Link]

So, I also went back and reread the original article. While I don't think a retraction would be in order, I'm afraid I still find this article less balanced than is usually the case here on LWN. To be fair, I don't think the article stated anything wrong, and my concern is more about the weight given to various "sides" of the story or various hypotheticals.

Take this sentence from the first paragraph for example: "The patch [...] was duly questioned [...], but it is not an honest mistake; according to Kroah-Hartman, there has been an attack of sorts underway [...]". The way the semi-colon is placed makes it look (at least to my eyes, not a native speaker) like "it is not an honest mistake" is a statement of fact rather than Greg's opinion. Perhaps a more accurate wording would have been: "but, according to KH, it is not an honest mistake and there has been [...]".

Also, I think the order in which different points of view / hypotheses are presented matters. For example it's only at the end of the article (after a few paragraphs about the reversal process) that we learn the researchers claim that none of the hypocrite commits made it to the kernel. The original paper is linked early in the article and described as detailing "the process of introducing use-after-free bugs into the kernel [...]". After re-reading the paper's "Ensuring the safety of the experiment" paragraph, I think a more accurate description would have been "details the process of _nearly_ introducing UAF bugs into the kernel [...]".

Of course the link to the paper is right there at the beginning of the article, so I can click and go read the paper right away. But more likely, I'll finish reading the LWN article first, and by the time I get to the end, I'll probably subjectively give more weight to the idea / hypothesis that intentional bugs made it to the kernel, perhaps in large numbers, and things might still be going on, than to the idea that no intentional bug was introduced in the kernel, and that there only were 3 hypocrite commits.

I'm afraid the previous paragraphs read like I'm picking at details in the article. I'd like to clarify that my intention is not to attack this article, but rather to try and explain why I found it less balanced than the average LWN article in a more specific (and hopefully constructive) way than "that's just how it felt to me". Also, I want to emphasize that in general I really appreciate the way LWN reports in a balanced and distanced (as in "looking at the big picture" and "cool-headed") way on all sort of topics, including those that generate heated discussion. It's only because the average quality of LWN articles is so high that I can feel perhaps this one was slightly below the usual standards on that front.

Retraction

Posted Apr 29, 2021 21:48 UTC (Thu) by intgr (subscriber, #39733) [Link] (1 responses)

I had to re-read the article as well, and while it's dissatisfying that it picks a side, you're right that it's mostly factually correctly reporting on Linux developers' reaction.

For the record, I did not mean that the article as a whole needs to be "retracted", but I guess "acknowledgement" that the assumptions reported in the article turned out to be false. I'm sorry, "retraction" was too strong a word.

There is one statement that appears to be made in the voice of LWN:

> In fact, a thorough review of patches emanating from UMN over the last year or two is probably in order for other projects, particularly high-profile ones.

As published in the "hypocrite commits" paper, the known malicious patches had actually come from Gmail addresses, and we know now that suspected patches from UMN were not intentionally buggy.

Retraction

Posted Apr 30, 2021 5:41 UTC (Fri) by gfernandes (subscriber, #119910) [Link]

Gmail addresses *belonging to* said committers though. So the article is perfectly valid and you've shown nothing in need of retraction.

Incompetent, but intentionally malicious patches they were indeed.

And we're lucky they were rejected for whatever reasons.

The fact that it happened, is reason enough for the reaction.

An update on the UMN affair

Posted Apr 30, 2021 17:12 UTC (Fri) by calumapplepie (guest, #143655) [Link]

> I don't think it's fair to call all of these patches "bad" if in some cases they will be reverted only for the reason that nobody can be bothered to review them.

It's also not counting the 79 patches which couldn't be auto-reverted. One of the major reasons for an inability to auto-revert is a later bugfix: which seems to be the case for many of those 79. Greg describes this in his second patchset. In short, 22% is probably on the low end.

An update on the UMN affair

Posted Apr 29, 2021 19:27 UTC (Thu) by agrawal-d (guest, #141386) [Link] (13 responses)

I noticed that Jonathan Corbet himself is a member of the Linux Foundation Technical Advisory Board. It is amusing to see he had to link to the ZDNet article to refer to the notice sent to UMN - perhaps he could not have shared it himself as a member of the board but was aware of the contents nonetheless.

An update on the UMN affair

Posted Apr 29, 2021 23:02 UTC (Thu) by Paf (subscriber, #91811) [Link]

This is not an uncommon thing - if something is public, in many cases, a personal obligation to not mention/discuss its contents is considered to expire, as the point of it has lapsed. (Legal obligations may be another matter, but I doubt that’s in play here.)

An update on the UMN affair

Posted Apr 29, 2021 23:03 UTC (Thu) by Paf (subscriber, #91811) [Link] (11 responses)

Huh, I didn’t know John was on the LF TAB.

John, I would say it’s generally standard journalistic practice to disclose when reporting on actions of bodies one is involved with. (I’m not saying anything untoward happened here, not at all, just noting this.)

An update on the UMN affair

Posted Apr 30, 2021 0:36 UTC (Fri) by himi (subscriber, #340) [Link] (10 responses)

Possibly he assumed it was simply common knowledge? Though I have to admit I didn't look into the details of the TAB after hearing it was being set up, Jonathan Corbet (as both "the LWN guy" and the kernel documentation maintainer) seems like a reasonable choice for a seat.

A formal disclosure would probably have been appropriate as soon as the actions of the TAB became part of the discussion, though.

An update on the UMN affair

Posted Apr 30, 2021 1:50 UTC (Fri) by lutchann (subscriber, #8872) [Link] (9 responses)

I, too, would have appreciated such a disclosure, if only for formality's sake. I have full confidence that the reporting here was entirely in good faith, but conflict-of-interest disclosures are expected from reputable media outlets.

TAB

Posted Apr 30, 2021 11:25 UTC (Fri) by sumanah (guest, #59891) [Link] (8 responses)

Was there a general announcement somewhere in a previous article when Corbet joined the TAB, such that LWN assumes that all readers already know this? I did not know or I had forgotten.

TAB

Posted Apr 30, 2021 12:50 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

While there probably was, it is a hard case to make to expect readers to be on the up-and-up for all of LWN's history. Sure, *I've* been reading since 2012, but there are those who haven't. Any arbitrary article could be someone's first, so "basics" like CoI disclaimers are best to put "everywhere" relevant.

For example, I know I see Ars Technica mention their Charter ownership bits whenever they cover Charter/Time Warner or affiliation with Wired when they republish an article at least.

TAB

Posted Apr 30, 2021 13:07 UTC (Fri) by corbet (editor, #1) [Link] (6 responses)

We generally post the results of the TAB elections every year, so there have been a few such announcements.

It honestly just didn't occur to me to add something to the article. I guess it seemed sort of like disclosing that I'm a kernel maintainer every time I write about the kernel — something a certain journalist out there has faulted me for not doing. I also had not really been involved in the TAB response up to the time the article was written.

That said, I certainly wasn't trying to hide anything, and apologize to anybody who felt otherwise. I have added a note to the article disclosing that membership.

TAB

Posted Apr 30, 2021 13:31 UTC (Fri) by sumanah (guest, #59891) [Link] (4 responses)

I appreciate that! I 100% believe and assumed that you were not trying to hide anything -- and I understand why you originally didn't think to do so. Best wishes and thanks as always for your skilled and informed analysis!

TAB

Posted Apr 30, 2021 22:28 UTC (Fri) by Wol (subscriber, #4433) [Link] (3 responses)

And I understand why this disclosure is important, but imho it really grates when it's pushed in your face every article - things like "so and so sponsored our trip to conference X".

I'd be much happier with a page, linked to from the home page, that listed all staff memberships, sponsorship deals, etc etc, but then people will moan that it's not in their face and they didn't know where to look! ...

You can't win, whatever you do.

Cheers.
Wol

TAB

Posted Apr 30, 2021 22:44 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (2 responses)

> pushed in your face every article - things like "so and so sponsored our trip to conference X".

A sentence at the end of the article isn't "in your face" IMO, but experiences may vary…

TAB

Posted May 1, 2021 11:13 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

No it doesn't seem much, but when I'm reading a bunch of articles and it's at the bottom of every one, it gets to me a bit. Other people might not even notice it's there ...

Cheers,
Wol

TAB

Posted May 6, 2021 11:20 UTC (Thu) by kpfleming (subscriber, #23250) [Link]

This is definitely true when a Weekly Edition contains a series of articles covering talks from the same conference; it's not unusual to see that same attribution repeated a half-dozen times in that situation. Since the articles are independent posts that are just presented in aggregate for those who choose to read that way, each article really does need to have the attribution statement.

TAB

Posted Apr 30, 2021 22:47 UTC (Fri) by Paf (subscriber, #91811) [Link]

Thanks for the addition!

I agree it would be patently absurd to mention your involvement in the kernel all the time, particularly given that it is *a public project*. I also know that you tend to mention your work as documentation maintainer when you write anything of length on that topic, which I think is nice if only because it helps the reader understand your perspective on the subject.

So, thank you. This case feels slightly different for what are probably obvious reasons.

An update on the UMN affair

Posted Apr 29, 2021 21:46 UTC (Thu) by chfisher (subscriber, #106449) [Link] (2 responses)

Although it is true that one should never attribute to malice that which can be adequately explained by stupidity or incompetence, as Jerry Pournelle once pointed out, the toleration of certain levels of stupidity or incompetence constitutes malice by itself. It appears that UMN has reached those levels.

An update on the UMN affair

Posted Apr 30, 2021 7:05 UTC (Fri) by taladar (subscriber, #68407) [Link]

I am also not sure that the "our researchers at UMN are stupid, not evil" argument results in a more favourable view on UMN.

As a potential employer I certainly wouldn't want to employ either kind of person so their degrees are worth less than before whether it was stupidity or malice.

An update on the UMN affair

Posted May 5, 2021 10:48 UTC (Wed) by ceplm (subscriber, #41334) [Link]

Making tests on uninformed human subjects by the university-graduated (not even mentioning university-employed with IRB) researchers cannot be explained by their stupidity.

An update on the UMN affair

Posted Apr 29, 2021 23:45 UTC (Thu) by flussence (guest, #85566) [Link] (1 responses)

If I wanted to slip actually-malicious code into the kernel, this isn't how I'd go about it. Raise your hand if you've fully read *any* of the multi-megabyte amdgpu code dumps thrown over the wall.

(from the instability of some of that stuff, sometimes it feels like it *was* used as a trojan horse...)

An update on the UMN affair

Posted May 6, 2021 18:15 UTC (Thu) by a0485302@ti.com (guest, #143318) [Link]

Heh, that's exactly what I was thinking. Patches like these are infinitely easier to review than a massive series that simply no one has time for.

An update on the UMN affair

Posted Apr 30, 2021 3:45 UTC (Fri) by PengZheng (subscriber, #108006) [Link]

> But if we cannot institutionalize a more careful process, we will continue to see a lot of bugs and it will not really matter whether they were inserted intentionally or not.

However, intentionally malicious patches could introduce vulnerabilities that are much more difficult to find.

An update on the UMN affair

Posted Apr 30, 2021 9:03 UTC (Fri) by hejianet (subscriber, #106177) [Link] (1 responses)

Sorry for my dumbness, but even if the patch at [1] is incorrect, put_device() is still needed in the error path, isn't it?
Why is there no follow-up to fix this?

[1] https://lwn.net/ml/linux-kernel/20200821034458.22472-1-ac...

An update on the UMN affair

Posted May 3, 2021 10:53 UTC (Mon) by error27 (subscriber, #8346) [Link]

I was the person who reviewed that patch. The UMN devs claim in their paper that dev_err() on the next line is a use after free. But the caller holds a reference so actually it's fine. Otherwise if the caller didn't hold a reference the whole function would be a giant use after free.

So their patch is actually fine, but I rejected it because that error path is going to crash unless they fix the use after free which I mentioned in my email. Normally "first patch" developers do go back and redo their patches. I try to not waste their time, but I do always want them to think of the bigger picture and the surrounding code.

Their patch #4 is also interesting. Part of their bugfix was correct. The patch reviewer spotted the potential for the bug that they introduced and sort of asked them about it, before asking them to re-write the patch. But that reviewer was too vague and the UMN devs misunderstood the review comments. That's why I try to write in short sentences with simple words. But people don't understand my review comments either. :P
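The reference-counting argument in the first paragraph can be illustrated with a toy model. This is only a sketch: the `Device` class, `error_path()`, and the reference counts are hypothetical stand-ins written in Python, not the kernel's `struct device` or the code touched by the patch. It shows why logging after a `put_device()` call is safe when the caller still holds a reference, and why the same sequence would be a use after free if nobody else did.

```python
class Device:
    """Toy stand-in for a reference-counted object (not the kernel's struct device)."""

    def __init__(self, name, refcount):
        self.name = name
        self.refcount = refcount
        self.freed = False


def put_device(dev):
    """Drop one reference; the object is released when the count hits zero."""
    dev.refcount -= 1
    if dev.refcount == 0:
        dev.freed = True


def dev_err(dev, message):
    """Log through the device; touching a freed object is the bug."""
    if dev.freed:
        raise RuntimeError("use after free: " + dev.name)
    return dev.name + ": " + message


def error_path(dev):
    """Hypothetical error path: drop our own reference, then log."""
    put_device(dev)
    return dev_err(dev, "setup failed")


# The caller holds a reference too (count 2): the log call is safe.
held = Device("held", refcount=2)
safe_message = error_path(held)

# Nobody else holds a reference (count 1): the same code is a use after free.
unheld = Device("unheld", refcount=1)
try:
    error_path(unheld)
    unheld_outcome = "ok"
except RuntimeError as exc:
    unheld_outcome = str(exc)
```

In the first case the count only drops from 2 to 1, so the object is still alive when it is logged through; in the second it drops to 0 before the log call, which is the situation the paper claimed but which, per the comment above, does not arise because the caller holds a reference.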

An update on the UMN affair

Posted Apr 30, 2021 18:38 UTC (Fri) by mcon147 (subscriber, #56569) [Link] (6 responses)

I'm pretty surprised UMN isn't in legal hot water for this; I would have assumed there were all sorts of overly broad computer security laws that would cover this sort of thing.

An update on the UMN affair

Posted May 3, 2021 18:40 UTC (Mon) by error27 (subscriber, #8346) [Link] (5 responses)

Also, it's fraud to sign your patches with someone else's name. But it might be hard to convince a jury about damages.

An update on the UMN affair

Posted May 3, 2021 19:37 UTC (Mon) by ttuttle (subscriber, #51118) [Link] (4 responses)

I'd expect you'd have to prove they were trying to impersonate that someone, rather than just submitting under a pen name or pseudonym or such.

An update on the UMN affair

Posted May 4, 2021 3:32 UTC (Tue) by error27 (subscriber, #8346) [Link] (3 responses)

There are a bunch of different potential legal issues around using a pen name. Also identity theft would probably be several additional crimes as well as fraud.

An update on the UMN affair

Posted May 4, 2021 3:38 UTC (Tue) by ttuttle (subscriber, #51118) [Link] (2 responses)

So, putting aside identity theft, since we both agree that's definitely gonna land you in all manner of legal hot water: what makes this fraud but not someone simply using a pen name? Or are you saying that pen names are inherently fraudulent?

An update on the UMN affair

Posted May 4, 2021 4:46 UTC (Tue) by error27 (subscriber, #8346) [Link]

In MN if you're getting paid under the fake name then you have to fill out a Certificate of Assumed Name form with your county.

But in this case it seems like they were using a fake name in order to deceive. I'm not a lawyer but I think you could be sued for that.

The other thing is we have told everyone that they are supposed to use their real name as they would for signing a legal document. In the past I have had to search for every contributor to the Sparse static checker so we could relicense it. If people use a fake name, how could you find their Facebook page and their new employer?

An update on the UMN affair

Posted May 4, 2021 15:20 UTC (Tue) by Wol (subscriber, #4433) [Link]

In order to be fraudulent there must be intent to deceive. So there's nothing inherently wrong with pen names. After all, I'm well known under my nick on the net, though I don't make it hard for people to find my real name. For anything on the net where real names are important (like, as someone mentioned, getting paid :-), I'd just refer to myself by my real name with "(aka Wol)" at the end of it.

Cheers,
Wol

An update on the UMN affair

Posted Apr 30, 2021 20:49 UTC (Fri) by ecree (guest, #95790) [Link]

Hmm. Because they submitted the patches under pseudonyms, those Signed-off-by lines are *very* dodgy. I think they're *technically* valid, because the DCoO doesn't say anything about using your legal name, or an identity that is associated with the right that (a) certifies; but it means that the Signed-off-by chain doesn't give us the attestation it's supposed to. E.g. if the copyright on these patches is owned by UMN (who knows what's written into these researchers' contracts?) then without a Signed-off-by from the researcher's real name, we can't prove that we got it legitimately (and, before the "full disclosure", we'd have had no way of finding out who "James Louise Bond" was and whether they'd had the right to submit the patch).

I realise that next to everything else these 'researchers' have done, forging SOBs is small fry, but it still raises my hackles. More to the point, the DCoO as written doesn't quite make crystal clear that this isn't OK. SubmittingPatches does ("sorry, no pseudonyms or anonymous contributions"); but that's a step further removed. Maintainers already reject obvious pseudonyms, but plausible-looking names can get through (although not entirely unremarked-upon in this case). Is there anything we need to fix here, and if so, how?

An update on the UMN affair

Posted May 1, 2021 9:11 UTC (Sat) by emorrp1 (guest, #99512) [Link] (2 responses)

On the pace of maintainability aspect, as I understand it, the bulk of kernel LOC is in device drivers so would it not be possible to split those off from the kernel tree entirely? Even if that's not desirable for the same reasons as they're in-tree in the first place, a higher review focus could go to patches changing the "core", on the assumption that code there will affect a higher proportion of users?

An update on the UMN affair

Posted May 1, 2021 11:08 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Given the kernel's development model, what would change? The destination email for a set of pull requests to Linus? I don't see how that would get more eyeballs on anything.

Not to mention that when such a thing is cleaved, some kind of stable API needs to appear or you end up with tandem patch submissions and tandem tagging. I don't think it's worth it.

An update on the UMN affair

Posted May 2, 2021 14:03 UTC (Sun) by nivedita76 (subscriber, #121790) [Link]

The core already gets much higher review focus.

An update on the UMN affair

Posted May 3, 2021 9:59 UTC (Mon) by nim-nim (subscriber, #34454) [Link]

> The 42 bad patches out of 190 is a 22% bad-patch rate. […] Perhaps that is a more interesting outcome than the one that the original "hypocrite commit" researchers were looking for. They failed in their effort to deliberately insert bugs, but were able to inadvertently add dozens of them.

Well, that’s not really interesting, of course submissions from uni addresses are going to be fairly low quality. The students need to gain experience somewhere (and once they do gain it scholarship is over and they change addresses).

What is surprising is when the quality of submissions from companies that release expensive products is not that great either.

Is 500 000 lines of code too much to review?

Posted May 6, 2021 10:34 UTC (Thu) by ededu (guest, #64107) [Link] (4 responses)

> Consider, for example, the 5.12 development cycle (a relatively small one), which added over 500,000 lines of code to the kernel over a period of ten weeks. The resources required to carefully review 500,000 lines of code would be immense, so many of those lines, unfortunately, received little more than a cursory looking-over before being merged.

Well, 500 000 new lines of code for a new release (2.5 months = 60 working days) means around 10 000 lines to review per working day, divided by 10 hours => 1000 lines per hour, divided by ~20 maintainers => around 50 new lines of code to review per hour per maintainer. This does not sound like too much to review, does it?
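The back-of-the-envelope estimate above can be reproduced in a few lines. This is a sketch using only the commenter's own figures (60 working days, 10-hour days, roughly 20 maintainers), all of which are rough assumptions rather than measured values; the function name is made up for illustration.

```python
def lines_per_maintainer_hour(total_lines, working_days, hours_per_day, maintainers):
    """Spread a line count evenly over days, hours, and reviewers."""
    return total_lines / working_days / hours_per_day / maintainers


# Without rounding the intermediate steps:
unrounded = lines_per_maintainer_hour(500_000, 60, 10, 20)

# With the per-day figure rounded up to 10 000 lines, as in the comment:
rounded = 10_000 / 10 / 20

print(round(unrounded, 1))  # 41.7
print(rounded)              # 50.0
```

Either way the result is a few dozen lines per maintainer per hour, and it inherits every assumption of the estimate, including the idea that a line count is a meaningful proxy for review effort.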

Is 500 000 lines of code too much to review?

Posted May 6, 2021 20:56 UTC (Thu) by micka (subscriber, #38720) [Link] (1 responses)

> 2.5 months = 60 working days
> per working day, divided by 10 hours

I don’t know if I’m the only one, but I find both of these figures to be rather exploitative (not to mention that doing one single activity for more than 20 days would probably have me quit right away).
Is that a cultural/country difference?

Is 500 000 lines of code too much to review?

Posted May 7, 2021 8:36 UTC (Fri) by ededu (guest, #64107) [Link]

>> 2.5 months = 60 working days
>> per working day, divided by 10 hours
>
>I don’t know if I’m the only one, but I find both of these figures to be rather exploitative (not to mention that doing one single activity for more than 20 days would probably have me quit right away).
>Is that a cultural/country difference?
Well, I wanted simply to have a rough approximation in order to have an order of magnitude about the number of lines...

Is 500 000 lines of code too much to review?

Posted May 6, 2021 21:46 UTC (Thu) by pebolle (guest, #35204) [Link] (1 responses)

> around 50 new lines of code to review per hour per maintainer. This does not sound like too much to review, does it?

It seems you've not yet run into that one-line patch that, after it turned out to be buggy, took you many days to fix. The number of lines of code changed by a patch is useless for determining the amount of review it needs.

Is 500 000 lines of code too much to review?

Posted May 6, 2021 23:56 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

After reading the memory model-related articles here on LWN, I can bet that knowing whether the addition (or removal) of a single `READ_ONCE` call is correct would likely take me far longer to verify than some of the other changes I've reviewed.

That said, I sense a…bit of sarcasm in the GP post :) .

TAB report on this was posted 2021-05-05 (today)

Posted May 6, 2021 18:52 UTC (Thu) by david.a.wheeler (subscriber, #72896) [Link]

The Linux Foundation's Technical Advisory Board (TAB) has released today its “Report on University of Minnesota Breach-of-Trust Incident” or “An emergency re-review of kernel commits authored by members of the University of Minnesota, due to the Hypocrite Commits research paper.”

You can see it here:
<https://lkml.org/lkml/2021/5/5/1244>


Copyright © 2021, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds