LWN: Comments on "The conclusion of the 5.14 merge window" https://lwn.net/Articles/861695/ This is a special feed containing comments posted to the individual LWN article titled "The conclusion of the 5.14 merge window". en-us Tue, 30 Sep 2025 12:58:15 +0000 Tue, 30 Sep 2025 12:58:15 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net MADV_POPULATE_* and mbind() https://lwn.net/Articles/901516/ https://lwn.net/Articles/901516/ rockeet <div class="FormattedComment"> Is there any difference between MADV_POPULATE_* and mlock()?<br> </div> Sat, 16 Jul 2022 14:52:37 +0000 The conclusion of the 5.14 merge window https://lwn.net/Articles/862929/ https://lwn.net/Articles/862929/ maxfragg <div class="FormattedComment"> The cost would be higher in an OS less focused on portability than Linux.<br> Linux tends to use its own syscall dispatching rather than the old syscall mechanisms where you use a hardware instruction with an immediate syscall number, which tend to be quite limited.<br> <p> For example, x86-32 used int80h for all syscalls, while some non-portable systems might want to avoid dispatching inside the int80h handler and instead spread syscalls over the interrupts; then, if you run out of interrupt numbers, you have a cost increase. Linux uses dispatching anyway, so there is no big cost to having a thousand syscalls, besides someone having to maintain them all and the desire to basically never break even a single one.<br> </div> Thu, 15 Jul 2021 08:00:58 +0000 MADV_POPULATE_* and mbind() https://lwn.net/Articles/862743/ https://lwn.net/Articles/862743/ abatters <div class="FormattedComment"> Thanks for taking the time to look into this!<br> </div> Tue, 13 Jul 2021 14:41:50 +0000 MADV_POPULATE_* and mbind() https://lwn.net/Articles/862740/ https://lwn.net/Articles/862740/ david.hildenbrand <div class="FormattedComment"> Makes sense!
QEMU similarly reads+writes one byte of each page when told to preallocate guest memory; the read+write is there to trigger COW without overwriting existing data, for example, when some piece of guest memory corresponds to a virtual NVDIMM.<br> <p> In the meantime, I verified that MADV_POPULATE_* and mbind() work together as expected.<br> </div> Tue, 13 Jul 2021 14:16:04 +0000 MADV_POPULATE_* and mbind() https://lwn.net/Articles/862728/ https://lwn.net/Articles/862728/ abatters <div class="FormattedComment"> I just double-checked, and you are correct: my code does write to the memory, and the comment even says that it is to break the COW mapping so that the memory is actually allocated, so my previous comment was in error.<br> </div> Tue, 13 Jul 2021 13:17:44 +0000 The conclusion of the 5.14 merge window https://lwn.net/Articles/862725/ https://lwn.net/Articles/862725/ Sesse <div class="FormattedComment"> The cost is primarily technical, not really about performance. There might be some small cache effects if you call too much different code, but it&#x27;s unlikely to be a big deal.<br> </div> Tue, 13 Jul 2021 11:36:11 +0000 MADV_POPULATE_* and mbind() https://lwn.net/Articles/862717/ https://lwn.net/Articles/862717/ david.hildenbrand <div class="FormattedComment"> <font class="QuotedText">&gt; &quot;by looping over the allocation and reading at PAGE_SIZE-intervals&quot;</font><br> <p> Are you sure that you are *reading* and not writing? On anonymous memory, reading will simply populate the shared zeropage, so I&#x27;d be surprised if it (no populated page vs. populated shared zeropage) makes a real difference when later reading from that mapping (read() ...), or even when writing to it (write() ...) in your example.<br> <p> mlock(), MAP_POPULATE and the new MADV_POPULATE_READ and MADV_POPULATE_WRITE options nowadays all end up calling handle_mm_fault() -- the very basic fault handler also called on page faults on the faulting CPU.
So I&#x27;d be surprised if they behave differently-- but I&#x27;ll double check.<br> <p> Note that there are subtle differences when it comes to shared mappings: mlock() and MAP_POPULATE won&#x27;t trigger COW on shared mappings. But for your example, mmap(MAP_PRIVATE | MAP_ANONYMOUS), the mbind() documentation is quite clear: &quot;pages will be allocated only according to the specified policy when the application writes (stores) to the page. For anonymous regions, an initial read access will use a shared page in the kernel containing all zeros. &quot;. And I&#x27;d assume that holds for any allocations, also when triggering writes from other CPUs, e.g., as part of a syscall.<br> </div> Tue, 13 Jul 2021 08:01:57 +0000 The conclusion of the 5.14 merge window https://lwn.net/Articles/862713/ https://lwn.net/Articles/862713/ Paf <div class="FormattedComment"> I think if new functionality is desired and can be clearly delineated, then it’s no worse than other systems growing larger. It has costs. But nothing enormous.<br> </div> Tue, 13 Jul 2021 04:51:52 +0000 The conclusion of the 5.14 merge window https://lwn.net/Articles/862711/ https://lwn.net/Articles/862711/ JohnVonNeumann <div class="FormattedComment"> Taken from: <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1507f51255c9">https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...</a><br> <p> <font class="QuotedText">&gt; Nowadays a new system call cost is negligible while it is way</font><br> <font class="QuotedText">&gt; simpler for userspace to deal with a clear-cut system calls than with a</font><br> <font class="QuotedText">&gt; multiplexer or an overloaded syscall.</font><br> <p> <p> I am a Kernel noob, was just wondering what/if there are downsides to increasing the number of syscalls? Is there a worry about far too much fragmentation amongst syscalls? 
I guess if I was to make a bad comparison, I&#x27;m aware that the x86 instruction set is massive, and people like Chris Domas have done research and found hidden instructions due to the size of the instruction set. Again, I want to reiterate that I know this is a bad example, but I&#x27;m just trying to illustrate a point.<br> </div> Mon, 12 Jul 2021 23:01:38 +0000 MADV_POPULATE_* and mbind() https://lwn.net/Articles/862704/ https://lwn.net/Articles/862704/ abatters <div class="FormattedComment"> In some of my programs I allocate memory with specific properties:<br> <p> mmap(MAP_PRIVATE | MAP_ANONYMOUS)<br> mbind() to a specific NUMA node<br> set other madvise flags (MADV_HUGEPAGE, MADV_DONTDUMP, MADV_DONTFORK, etc.)<br> prefault in the pages manually by looping over the allocation and reading at PAGE_SIZE-intervals<br> <p> A long time ago (many kernels ago), I found that prefaulting is needed because just doing a system call like read() and passing the buffer without prefaulting from userspace doesn&#x27;t always obey mbind() policy. I once tried using mlock() to prefault the pages, but that ignored the mbind() policy also (again with old kernels).<br> <p> So do these new MADV_POPULATE_* obey mbind() policy?<br> </div> Mon, 12 Jul 2021 22:07:29 +0000
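The prefaulting sequence discussed in this thread (mmap(MAP_PRIVATE | MAP_ANONYMOUS), then populating every page with a write access) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from any of the commenters or from QEMU: the helper names touch_pages(), prefault_writable() and alloc_prefaulted() are hypothetical, the fallback to MADV_POPULATE_WRITE's numeric value is taken from the 5.14 UAPI headers for older libc headers, and the mbind() step is only marked with a comment because its declaration lives in libnuma's numaif.h, which may not be installed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Value from the Linux 5.14 UAPI headers; older libc headers may lack it. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: read one byte per page and write the same value
 * back. The write access forces a real page to be allocated (breaking
 * COW from the shared zeropage) without changing existing contents.
 */
static void touch_pages(volatile unsigned char *p, size_t len, size_t page)
{
    for (size_t off = 0; off < len; off += page)
        p[off] = p[off];
}

/*
 * Hypothetical helper: returns 1 if the kernel populated the range via
 * MADV_POPULATE_WRITE, 0 if we fell back to touching pages by hand
 * (madvise() reported EINVAL, e.g. a kernel older than 5.14), and -1 on
 * any other error.
 */
static int prefault_writable(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0)
        return -1;
    if (madvise(addr, len, MADV_POPULATE_WRITE) == 0)
        return 1;
    if (errno != EINVAL)
        return -1;
    touch_pages(addr, len, (size_t)page);
    return 0;
}

/* Hypothetical helper: anonymous private mapping, prefaulted for writing. */
static void *alloc_prefaulted(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    /* mbind() and further madvise() flags would be applied here,
     * before any page is populated. */
    if (prefault_writable(p, len) < 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A caller would use alloc_prefaulted(len) in place of a bare mmap() and release the region with munmap(). Applying the NUMA policy before populating matters because, per the mbind() documentation quoted in the thread, pages are placed according to the policy at the time they are first written.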