LWN: Comments on "Stabilizing per-VMA locking" https://lwn.net/Articles/937943/ This is a special feed containing comments posted to the individual LWN article titled "Stabilizing per-VMA locking". en-us Tue, 23 Sep 2025 11:10:39 +0000 Tue, 23 Sep 2025 11:10:39 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Slowing down fork-heavy workloads in favour of thread-heavy workloads by THIS MUCH? https://lwn.net/Articles/938899/ https://lwn.net/Articles/938899/ mirabilos <div class="FormattedComment"> An immediate idea to mitigate this: keep a process-global mmap lock until the first thread is created, switch to per-VMA locking only then.<br> <p> Is that workable?<br> <p> Many many standard Unix tools don’t use any threads at all, while the standard workflow uses loops and pipes and is therefore fork-heavy.<br> </div> Fri, 21 Jul 2023 15:09:35 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938856/ https://lwn.net/Articles/938856/ jeremyhetzler <div class="FormattedComment"> Was this test-case, that triggered the bug, added to the test suite? That would be the best way to prevent similar bugs in the future.<br> </div> Fri, 21 Jul 2023 12:06:46 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938658/ https://lwn.net/Articles/938658/ khim <font class="QuotedText">&gt;But for that to work you need to understand your ownership model clearly, otherwise you can't encode it in the type system.</font> <p>And if you couldn't do that then your program can not be compiled and thus wouldn't have any bugs.</p> <p>The question is not whether Rust may prevent such bugs. 
It absolutely could do that, it's designed for that and it works well.</p> <p>The question is whether the price is too high and whether code which would satisfy the Rust compiler would <b>also</b> satisfy the performance demands of kernel developers.</p> <p><b>That</b> is still an open question, that's true.</p> Thu, 20 Jul 2023 14:02:21 +0000 2.4.10 https://lwn.net/Articles/938321/ https://lwn.net/Articles/938321/ Paf <div class="FormattedComment"> Yeah, I can’t argue with the results. That stood out but I was in fact talking as much about the whole process - the differing branches and the seeming lack of maintainers except Linus. It feels a little like watching a kind of multi-mode high traffic near-chaos flow, the kind that seems - to someone used to orderly traffic - like it should lead to constant crashes but somehow doesn’t (usually :x).<br> </div> Sun, 16 Jul 2023 19:30:12 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938293/ https://lwn.net/Articles/938293/ ringerc <div class="FormattedComment"> Ah, that explains it, I missed that it had to be MT. Postgres would never hit that, even in unusual deployments like those using PL/Java.<br> </div> Sat, 15 Jul 2023 10:37:39 +0000 2.4.10 https://lwn.net/Articles/938290/ https://lwn.net/Articles/938290/ corbet The VM switch in 2.4 seemed insane at the time as well. It <i>did</i> seem to stabilize a lot of the persistent problems that had been plaguing that kernel, though; it's been a long time since I've heard anybody who thinks it was the wrong decision. 
Sat, 15 Jul 2023 00:40:51 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938288/ https://lwn.net/Articles/938288/ Paf <div class="FormattedComment"> From the perspective of someone who wasn’t around then and has only seen basically the current process, that all sounds just *insane* as a dev process.<br> </div> Sat, 15 Jul 2023 00:25:39 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938272/ https://lwn.net/Articles/938272/ willy <div class="FormattedComment"> The task calling fork() must be multithreaded to encounter this bug. Most multithreaded tasks don't call fork(). Most tasks that call fork() (including pgbench) are not multithreaded. It's not too surprising that it took months to be uncovered (that is, months since Suren started posting this patch series, not months since it was released in an official kernel).<br> </div> Fri, 14 Jul 2023 19:40:38 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938234/ https://lwn.net/Articles/938234/ mb <div class="FormattedComment"> <span class="QuotedText">&gt;But for that to work you need to understand your ownership model clearly,</span><br> <span class="QuotedText">&gt;otherwise you can't encode it in the type system.</span><br> <p> If you don't understand the ownership model, you can't write correct C code either. The resulting code may be correct by luck, at best.<br> <p> <span class="QuotedText">&gt;I think in this case here it was more an issue of not clearly understanding the</span><br> <span class="QuotedText">&gt;ownership model of vma structs</span><br> <p> Exactly.<br> Because it was not enforced and not even encouraged by the language to think about it.<br> <p> <span class="QuotedText">&gt;not so much a missed case that the compiler could have spotted for you.</span><br> <p> I'm not talking about that kind of thing.<br> I'm talking about Rust encouraging you to think about ownership. 
This is many levels above a compiler run.<br> Programming Rust puts your brain into a completely different mode of thinking. And that's what makes the real difference.<br> </div> Fri, 14 Jul 2023 14:05:08 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938190/ https://lwn.net/Articles/938190/ sam_c <div class="FormattedComment"> I took the above comment as being about the fork problem in 6.4 rather than the stack rot issue from 6.1.<br> </div> Fri, 14 Jul 2023 12:30:46 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938185/ https://lwn.net/Articles/938185/ farnz <p>It's complicated. The newest kernel is, by definition, the best kernel the developers know how to produce, but that does not necessarily make it the best for your workload, because it may have new bugs that matter to you, while the old bugs don't matter to you. <p>And one component of "please use the latest" is that bug reports become less valuable the longer the gap between bug introduced, and bug found; if you report a bug that's not present in 6.4, but is present in 6.5-rc1, not only are there fewer commits to consider that could possibly have exposed the bug, but the people who will fix your bug have the context around <em>why</em> something works the way it does in recent memory. If you report a bug as added in 5.12-rc1 and still present in 6.5-rc1, but not in 5.11, you're still keeping the commit count down, but now you're asking the developers to remember why changes were made in the 5.12 time frame, and what the intended effect was. Fri, 14 Jul 2023 10:25:41 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938181/ https://lwn.net/Articles/938181/ sima <div class="FormattedComment"> I'm not sure rust would have helped here a lot. Rust is really good at enforcing ownership models and pretty flexible at that. But for that to work you need to understand your ownership model clearly, otherwise you can't encode it in the type system. 
And you need to encode it all yourself for custom data structure protection schemes like this one.<br> <p> I think in this case here it was more an issue of not clearly understanding the ownership model of vma structs, and not so much a missed case that the compiler could have spotted for you. Where Rust might help is that it allows you to roll out the ownership model first (in the type system) without changing the locking, and then once you're fairly confident that you got it right, you flip the switch (and watch things blow up if you missed something).<br> </div> Fri, 14 Jul 2023 07:06:09 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938180/ https://lwn.net/Articles/938180/ mb <div class="FormattedComment"> <span class="QuotedText">&gt; Whether, say, a VMA abstraction written in Rust could truly ensure that accesses use proper</span><br> <span class="QuotedText">&gt; locking while maintaining performance has not yet been proven in the kernel context, though. </span><br> <p> Well, I think that such performance-critical paths would either use safe code or they would be marked unsafe {} and use some lockless performance tricks.<br> The unsafe-block by itself would probably help to raise some eyebrows when doing such fundamental locking changes.<br> </div> Fri, 14 Jul 2023 06:26:50 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938179/ https://lwn.net/Articles/938179/ alison <div class="FormattedComment"> Thanks for posting that link into the Wayback Machine: it made me laugh.<br> </div> Fri, 14 Jul 2023 04:54:26 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938177/ https://lwn.net/Articles/938177/ andresfreund <div class="FormattedComment"> I actually happened to run a bunch of postgres benchmarks on a vulnerable kernel, without, as far as I know, encountering the issue. I suspect it's because postmaster just doesn't have a lot of page faults. 
<br> </div> Fri, 14 Jul 2023 04:02:47 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938174/ https://lwn.net/Articles/938174/ jlbec Significant VM changes in major releases that take time and consternation to sort out are a <a href="https://lwn.net/2001/0927/kernel.php3">time-honored tradition</a>. Fri, 14 Jul 2023 02:16:02 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938163/ https://lwn.net/Articles/938163/ ringerc <div class="FormattedComment"> Wow. This bug would've made postgres extremely sad, and should've turned up reasonably quickly in a pgbench workload. <br> <p> Interesting that it got so far, given that postgres has been a useful tool to exercise other MM kernel issues in the past.<br> </div> Thu, 13 Jul 2023 22:34:25 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938161/ https://lwn.net/Articles/938161/ Paf <div class="FormattedComment"> “Kernel developers often make the point that the newest kernels are the best ones that the community knows how to make and that users should not hesitate to upgrade to them. Responding more quickly when an upgrade turns out to be a bad idea would help to build confidence in that advice.”<br> <p> It seems so obvious that the truth of “newest is best” is more complicated than this (I’m not saying those involved don’t know this, it just helps me to say it directly). The basic logic of it is obviously correct, but at the same time, it’s common that new things have new problems not yet found through greater exposure. I don’t really have an answer, maybe it’s to hugely increase testing of kernels before they come out. But it seems likely there will still be a sweet spot with stable kernels that are just a bit older base with more fixes, whatever we may say about the best we know how to produce. 
:/<br> </div> Thu, 13 Jul 2023 22:23:22 +0000 Stabilizing per-VMA locking https://lwn.net/Articles/938157/ https://lwn.net/Articles/938157/ dskoll <p>Anecdote: Firefox began crashing occasionally (1-2x per day) when I switched to kernel 6.4. Since I upgraded to 6.4.3 two days ago, it has not crashed. I don't know for sure it was the kernel, but it seems likely.</p> Thu, 13 Jul 2023 21:02:40 +0000