LWN: Comments on "Blocking userfaultfd() kernel-fault handling" https://lwn.net/Articles/819834/ This is a special feed containing comments posted to the individual LWN article titled "Blocking userfaultfd() kernel-fault handling". en-us Wed, 24 Sep 2025 23:36:16 +0000 Wed, 24 Sep 2025 23:36:16 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/823234/ https://lwn.net/Articles/823234/ tobin_baker <div class="FormattedComment"> How about implementing COW private mappings of shared memory with true snapshot semantics?<br> </div> Wed, 17 Jun 2020 00:48:47 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820730/ https://lwn.net/Articles/820730/ smooth1x <div class="FormattedComment"> What happens if the VM contains a database server? I can see this for that use case.<br> </div> Sun, 17 May 2020 08:54:35 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820350/ https://lwn.net/Articles/820350/ nilsmeyer <div class="FormattedComment"> <font class="QuotedText">&gt; Client workflows often can't be interrupted at will and even asking clients nicely to reboot their instances (so they can migrate to other hardware nodes) can take months. It's much easier to involuntarily migrate client VMs to different hardware.</font><br> <p> That is true in a lot of environments, especially when you are dealing with software that manages state. It's easy to say that one can design an application so this isn't necessary (though a lot of the container/cloud-native crowd completely ignores stateful systems), but the reality is very different. <br> </div> Wed, 13 May 2020 08:48:28 +0000 (de-)compression and view are different layers https://lwn.net/Articles/820129/ https://lwn.net/Articles/820129/ gus3 <div class="FormattedComment"> If the kernel handles compression/decompression matters, it's to save on paging space/speed. 
The user space sees nothing different.<br> <p> If the user space handles compression, the kernel doesn't care about it at all.<br> <p> They aren't related.<br> </div> Mon, 11 May 2020 07:02:26 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820068/ https://lwn.net/Articles/820068/ meyert <div class="FormattedComment"> A bit OT, but I recently learned that quintessential comes from Latin "quinta essentia" and means literally "fifth element" and indeed does mean the fifth element, i.e. "ether".<br> <p> </div> Sat, 09 May 2020 22:33:11 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820067/ https://lwn.net/Articles/820067/ NYKevin <div class="FormattedComment"> <font class="QuotedText">&gt; Live migration is ABSOLUTELY justified for cloud computing providers to protect against hypervisor vulnerabilities.</font><br> <p> I don't understand how this contradicts anything that I said...<br> </div> Sat, 09 May 2020 20:33:00 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820059/ https://lwn.net/Articles/820059/ Cyberax <div class="FormattedComment"> I worked at Amazon, but I've heard about T2/T3 migration publicly at AWS re:Invent multiple times. These instance types are severely oversubscribed and migration is used to balance the load.<br> </div> Sat, 09 May 2020 15:59:23 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820056/ https://lwn.net/Articles/820056/ roc <div class="FormattedComment"> I *think* a UFFD_USER_MODE_ONLY flag/mode would work fine for us. 
We don't actually allow this fake process to execute syscalls normally; we catch its syscalls with ptrace and emulate them.<br> </div> Sat, 09 May 2020 13:00:00 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820055/ https://lwn.net/Articles/820055/ roc <div class="FormattedComment"> Our Pernosco omniscient, record-and-replay debugger uses userfaultfd() in a way that's neither 1 nor 2.<br> <p> We have a giant omniscient database which lets us reconstruct the memory state of a process at any point in its recorded history. Sometimes we want to execute an application function "as if" the process was at some point in that history. So we create a new process, ptrace it, create mappings in it corresponding to the VMAs that existed at that point in history, and enable userfaultfd() for those mappings. Then we set the registers into the right state for the function call and PTRACE_CONT. Every time the process touches a new page, we reconstruct the contents of that page from our database. Works great.<br> </div> Sat, 09 May 2020 12:58:37 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820051/ https://lwn.net/Articles/820051/ Sesse <div class="FormattedComment"> You're assuming data is paged out to begin with. :-) A prime candidate for this is if you want to mmap a compressed file (and have your application see uncompressed data).<br> </div> Sat, 09 May 2020 07:59:10 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820050/ https://lwn.net/Articles/820050/ wahern <div class="FormattedComment"> Interesting. Any sources which I could share? All I could find in a quick Google search is an HN comment, "T2 and T3 use live migration to get around this, but it's not public knowledge." 
<a href="https://news.ycombinator.com/item?id=17815806">https://news.ycombinator.com/item?id=17815806</a><br> <p> </div> Sat, 09 May 2020 07:52:38 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820046/ https://lwn.net/Articles/820046/ Cyberax <div class="FormattedComment"> The virtual machine that runs client's code (KVM) looks like a regular process to the host Linux.<br> </div> Sat, 09 May 2020 05:37:34 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820045/ https://lwn.net/Articles/820045/ kccqzy <div class="FormattedComment"> I don't understand the cloud provider argument. It does seem like this feature can help with live VM migration, but when you are a cloud provider, you don't necessarily require all users to run unmodified Linux kernels. If a user runs a non-Linux VM, how can the cloud provider migrate that VM?<br> </div> Sat, 09 May 2020 05:27:26 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820043/ https://lwn.net/Articles/820043/ Cyberax <div class="FormattedComment"> <font class="QuotedText">&gt; AWS doesn't support live migration.</font><br> It actually does behind the scenes with T2 and T3 instances. <br> <p> Live migration is very useful to move client software out of a failing node. So really this makes sense only for large cloud providers.<br> </div> Sat, 09 May 2020 05:02:43 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820042/ https://lwn.net/Articles/820042/ wahern <div class="FormattedComment"> AWS doesn't support live migration. Live migration is useful, but not for cloud computing, where state is kept outside the node. It's useful for traditional architectures where state is maintained on the node, with only backups (hopefully!) elsewhere. 
Not just useful but critical, because you're packing more work on the same piece of hardware, so reboots are more disruptive than with dedicated hardware.<br> <p> <p> </div> Sat, 09 May 2020 04:58:33 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820039/ https://lwn.net/Articles/820039/ Cyberax <div class="FormattedComment"> Live migration is ABSOLUTELY justified for cloud computing providers to protect against hypervisor vulnerabilities.<br> <p> Client workflows often can't be interrupted at will and even asking clients nicely to reboot their instances (so they can migrate to other hardware nodes) can take months. It's much easier to involuntarily migrate client VMs to different hardware.<br> </div> Sat, 09 May 2020 02:00:41 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820038/ https://lwn.net/Articles/820038/ josh <div class="FormattedComment"> There are other use cases for this. Fastly's Lucet uses it for their WebAssembly VM, to catch out-of-bounds memory accesses.<br> </div> Sat, 09 May 2020 01:36:55 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820036/ https://lwn.net/Articles/820036/ NYKevin <div class="FormattedComment"> My 2 cents: If you are not a cloud provider, then you *probably* don't need userfaultfd() at all. It's the low-level equivalent of fiddling with the garbage collection algorithm, or writing your own malloc(). Basically, there are two use cases for this:<br> <p> 1. You're doing live migrations of VMs.<br> 2. You can dynamically regenerate paged-out data faster than the OS can page it in.<br> <p> (1) makes very little sense if you control all of the code in the VM, because it's far easier to just use a container instead of a VM, and start/stop instances as required (with all state living in some kind of database-like-thing, or perhaps a networked filesystem, depending on your needs). 
Sure, this is slightly more upfront design work, but live migration consumes an incredible amount of bandwidth once you try to scale it up, whereas container orchestration is a mature and well-understood technology. Unless you are making money per VM, it's difficult to justify the cost of live migration.<br> <p> (Granted, if all of your VMs are very similar to one another, you might be able to develop a clever compression algorithm that shaves a lot of bytes off of that cost, but you're still not going to beat containers on size.)<br> <p> That leaves (2). What's happening in case (2) is that you're using the page fault mechanism as a substitute for some kind of LRU cache for data that is expensive to compute, but cheaper than actually hitting the disk. But you can build an LRU cache in userspace, and it'll probably be a lot more efficient and easier to tune, since you can design it to exactly fit your specific use case. Trying to rope page faults into that problem makes no logical sense.<br> <p> So, in conclusion, I'd tentatively suggest that distros consider turning the whole feature off and see if anything breaks. Perhaps they should teach their package managers to enable this setting if, and only if, one or more installed packages really need it.<br> </div> Sat, 09 May 2020 01:30:32 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820032/ https://lwn.net/Articles/820032/ Paf <div class="FormattedComment"> The cost is awfully small, and while another option isn’t perfect, distros can enable it if desired. If it doesn’t break much, many or most will do so.<br> <p> It’s not perfect, but this option is low cost.<br> </div> Fri, 08 May 2020 23:25:01 +0000 Blocking userfaultfd() kernel-fault handling https://lwn.net/Articles/820027/ https://lwn.net/Articles/820027/ dvdeug <div class="FormattedComment"> I see the argument for it, but it's yet another obscure option. 
By default, it can't be turned on, so it won't provide any defense to most users. Considering "the existing vm/unprivileged_userfaultfd knob", is the cost of a feature to hobble but not disable userfaultfd really worth it? <br> </div> Fri, 08 May 2020 22:57:05 +0000