LWN: Comments on "May the FOLL_FORCE not be with you" https://lwn.net/Articles/983169/ This is a special feed containing comments posted to the individual LWN article titled "May the FOLL_FORCE not be with you". en-us Wed, 22 Oct 2025 05:15:39 +0000 Wed, 22 Oct 2025 05:15:39 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net thanks for the ptracer option https://lwn.net/Articles/989729/ https://lwn.net/Articles/989729/ nix <div class="FormattedComment"> Absolutely, though you have to take some special measures to avoid having to stop all your threads in the process of attachment :)<br> </div> Wed, 11 Sep 2024 11:17:43 +0000 Julia https://lwn.net/Articles/985924/ https://lwn.net/Articles/985924/ mrugiero <div class="FormattedComment"> <span class="QuotedText">&gt; Nevertheless the /proc/self/mem approach was our favored approach</span><br> <span class="QuotedText">&gt; because it a) Required an attacker to be able to execute syscalls</span><br> <span class="QuotedText">&gt; which is a taller order than getting memory write and b) didn't double</span><br> <span class="QuotedText">&gt; the virtual address space requirements (as a dual mapping approach</span><br> <span class="QuotedText">&gt; would).</span><br> <p> I understand argument b) to some degree (although if your code section takes up a problematic portion of address space there is a bigger problem than just doubling it), but argument a) seems moot to me.<br> You don't protect your process by making the attack surface the whole system (which FOLL_FORCE does), and you don't protect that much against ROP if you have the exact sequence of instructions the attacker would need as part of your initialization (which the interpreter does).<br> So, it doesn't protect significantly against local exploits and it requires opening up a much bigger (system-wide) exploit space instead. 
Further, the claim they do have the fallback means also the code to enable the "just modify the memory" approach is one jump away.<br> </div> Fri, 16 Aug 2024 12:26:02 +0000 thanks for the ptracer option https://lwn.net/Articles/985923/ https://lwn.net/Articles/985923/ mrugiero <div class="FormattedComment"> I believe some do to try to fight attempts to reverse engineering, so I guess it's possible.<br> </div> Fri, 16 Aug 2024 12:19:57 +0000 Julia https://lwn.net/Articles/983885/ https://lwn.net/Articles/983885/ khim <font class="QuotedText">&gt; But editing code while it's still cached by another CPU is a very 90s approach to JIT.</font> <p>And walking on Earth is so last millennium, right?</p> <p>People are using the best approach that's available. JITs are editing code that's currently executing and there are no plans to change that.</p> <p>Usually JITs are using two mappings (one readable, one writable) nowadays, but they absolutely do that, because there are no better approach invented yet.</p> <p>P.S. And yes, code editing is limited in most JITs: they are rewriting jump target addresses and add new code in place where previously there were just NOPs. But these two are not going away any time soon, because they are tightly coupled with the nature of JIT: it's name, quite literally, means “Just In Time” which means, essentially:</p> <ol><li>We are compiling small pieces of code at time, thus couldn't afford waste the whole page for a tiny amount of code produced.</li> <li>We are stitching together code “on the fly” which means that calls to “compile-that-code-and-run-it” in the already finished code are replaced with calls to finished, recompiled, code regularly.</li></ol> <p>If you find a way to beat Oracle Java's JIT, Google's ART JIT, Chrome V8's JIT, Firefox's Warp JIT and all other JITs that are using that approach with something <b>better</b> then it would be time to say that everyone should switch. 
Saying that everyone should stop doing what they are doing just because you don't like it — without offering any alternative, on the other hand, is just irresponsible.</p> Tue, 30 Jul 2024 08:56:14 +0000 thanks for the ptracer option https://lwn.net/Articles/983782/ https://lwn.net/Articles/983782/ gray_-_wolf <div class="FormattedComment"> Is it possible for program to attach to itself as a debugger satisfying the ptrace requirement?<br> </div> Mon, 29 Jul 2024 09:04:37 +0000 Self-modifying code https://lwn.net/Articles/983732/ https://lwn.net/Articles/983732/ khim <font class="QuotedText">&gt; In discussions like this I believe it is useful to go through many possible options and evaluate their pros and cons without being particularly wedded to any one of them.</font> <p>Why? What have that approach brings to you? What have you achieved doing it?</p> <p>I find that very strange. Things that we <b>already have</b>, things that <b>exist</b>, by definition, have a priority. They are already here, they are done, that enough. But any change from the status quo need a justification.</p> <p>Sure, I like to go “back the memory lane” and see <b>why</b> things that we have are like they are. Because situation of today is different from situation of yesterday.</p> <p>But no matter what, even if the thing that made us to pick original decision is no longer valid or even if the original decision was made on a whim without any rational justifications… things that we have are very-very different from things that we don't have.</p> <font class="QuotedText">&gt; Since it's an option that meets the specific constraint that you, yourself, chose to highlight, I felt that it should be included in the discussion.</font> <p>One may invent bazillion crazy schemes if not constrained by anything. 
Talking about them would take forever unless we limit these discussions somehow.</p> <p>“Anything new should come with an extra justification that explains why we should do that if some other solution already exists” is a very good rule if we are talking about something that we plan to implement. I don't know anyone who achieved anything significant while violating it (but note the subtle difference: if we don't yet have a solution <b>at all</b> then someone who doesn't “know” that “<i>X</i> is simply impossible” may achieve something really cool… but when said <i>X</i> is not just possible in theory but we already know how to do <i>X</i> in practice then the situation changes).</p> <p>Well, maybe fiction writers would be an exception, but even they, when they construct their strange imaginary worlds, still play on that contrast between what “we” have and what “they” have. “Does it exist?” is still very much a central question that governs their decisions even if they imagine a world where something that we already have doesn't exist and where evolution of civilization, as a consequence, goes in a different direction.</p> <font class="QuotedText">&gt; This approach appears to be incompatible with your style of arguing, so I'll just be ignoring you from now on.</font> <p>Fine by me. 
I don't like to waste time on pointless discussions without any practical consequences (even if the consequence is minor like “now that I have written that I may just refer people here instead of repeating my arguments again and again”) while you seem to regard these as the only ones worthy of pursuing.</p> Sun, 28 Jul 2024 12:05:21 +0000 Self-modifying code https://lwn.net/Articles/983726/ https://lwn.net/Articles/983726/ malmedal <div class="FormattedComment"> <span class="QuotedText">&gt; You are proposing third (or fourth) one without explaining why it's better than what we already have.</span><br> <p> In discussions like this I believe it is useful to go through many possible options and evaluate their pros and cons without being particularly wedded to any one of them. Since it's an option that meets the specific constraint that you, yourself, chose to highlight, I felt that it should be included in the discussion. This approach appears to be incompatible with your style of arguing, so I'll just be ignoring you from now on. 
<br> </div> Sun, 28 Jul 2024 10:57:12 +0000 Self-modifying code https://lwn.net/Articles/983722/ https://lwn.net/Articles/983722/ khim <font class="QuotedText">&gt; This is an answer, it has a cost, an expensive one, but it is an answer.</font> <p>Sure, but <b>as was mentioned in the very beginning</b>, <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f511c0b17b08">seven years ago</a> there are already two other methods (three if you include self-ptrace).</p> <p>You are proposing third (or fourth) one without explaining why it's better than what we already have.</p> <p>This looks like “we have to do <b>something</b>”, “this is <i>something</i>”, “let's do it!” logic.</p> <p>Such logic rarely produces good designs.</p> Sun, 28 Jul 2024 10:00:28 +0000 Self-modifying code https://lwn.net/Articles/983720/ https://lwn.net/Articles/983720/ malmedal <div class="FormattedComment"> <span class="QuotedText">&gt; Why? What's the point?</span><br> <p> You asked for this earlier:<br> <p> <span class="QuotedText">&gt;&gt; maybe even teach kernel not to ever provide W+X mappings at all</span><br> <p> This is an answer, it has a cost, an expensive one, but it is an answer. <br> <p> </div> Sun, 28 Jul 2024 09:31:59 +0000 Self-modifying code https://lwn.net/Articles/983716/ https://lwn.net/Articles/983716/ khim <font class="QuotedText">&gt; If you don't want the kernel to provide userspace with a W+X mapping you can have the kernel keep the mapping to itself, but if userspace then can ask the kernel to do arbitrary changes, or have separate W and X mappings, you haven't gained that much</font> <p>Yes, you did. You have made life for attackers harder. As explained <a href="https://en.wikipedia.org/wiki/W%5EX">in the Wikipedia article</a>. 
And that article even includes section about JITs, too!</p> <p>Security and usability are always at odds, there exist 100% bullet-proof way to stop any attacks, both local and remote — just turns the computer off and all kinds of attacks are prevented! But this “protection” is not very usable, thus we need something else.</p> <font class="QuotedText">&gt; you want to limit what the kernel will do to the simplest thing that will work.</font> <p>Yes, but now we need to determine <b>what</b> is that work even is!</p> <font class="QuotedText">&gt; The point is to stop the old code at a safe location</font> <p>Do you actually read what I wrote? Just where have I wrote that JIT wants/needs <b>that</b>? It have no such need. On the contrary, what JIT needs is described precisely under titles <i>Asynchronous modification</i> under <i>Cross-Modifying Code</i> in the <a href="https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf#page=268">AMD Manual</a>: <i>the nature of the code being executed by the target thread is such that it is insensitive to the exact timing of the update</i>.</p> <p>JIT (or, heck, dynamic loader that resolves symbols lazily) initially inserts jump to the “slow path” because “fast path” doesn't exist, later, when “fast path” does exist jump is replaced. <b>That's all</b>, JIT doesn't care if “slow path” is used a few times after “fast path” is created, it just wants “eventual consistency” where programs stop using “slow path” after a few milliseconds.</p> <p>And, as I have shown you, with references to AMD manual, CPUs <b>actually offer enough relevant guarantees on the hardware level</b>, description is so precisely tailored to the need of JITs that they could, as well, call it “JIT-friendly code modification” and not “asynchronous modification”.</p> <p>And yet, you repeatedly invent complication that make things more problematic and AFAICS don't achieve anything security-wise? Why? 
What's the point?</p> Sun, 28 Jul 2024 08:40:31 +0000 Julia https://lwn.net/Articles/983705/ https://lwn.net/Articles/983705/ roc <div class="FormattedComment"> If you're only JITting one function per page then your memory usage is going to be terrible. Yes, in reality you can do more than one function at once but in practice you will seldom be able to fill a page before you need to start executing code in it.<br> <p> And the first flip from writable to executable is already a problem.<br> </div> Sun, 28 Jul 2024 02:01:16 +0000 Self-modifying code https://lwn.net/Articles/983703/ https://lwn.net/Articles/983703/ malmedal <div class="FormattedComment"> It's an answer to your previous request: <br> <p> <span class="QuotedText">&gt; maybe even teach kernel not to ever provide W+X mappings at all</span><br> <p> If you don't want the kernel to provide userspace with a W+X mapping you can have the kernel keep the mapping to itself, but if userspace then can<br> ask the kernel to do arbitrary changes, or have separate W and X mappings, you haven't gained that much, so you want to limit what the kernel will do to the simplest thing that will work. The point is to stop the old code at a safe location, on some CPUs you can set address-based breakpoints instead.<br> </div> Sun, 28 Jul 2024 00:38:03 +0000 Self-modifying code https://lwn.net/Articles/983701/ https://lwn.net/Articles/983701/ khim <font class="QuotedText">&gt; I was responding to your earlier statement:</font> <p>Said statement was just a side comment to explain what JITs are doing, why and how what they are doing is guaranteed to work. 
To show that need of JITs to alter running code while it's running was acute enough and understood enough that even hardware makers already created special JIT-tailored guarantees there.</p> <font class="QuotedText">&gt; Have the app ask for a writable alloc, fill it with code.</font> <p>That's possible.</p> <font class="QuotedText">&gt; Then the app tells the kernel make this executable and the safe interrupt point is xxx.</font> <p>How exactly do you plan to do that? Would kernel take 10-20 bytes of generated code and allocate whole 4KB (or, worse, 16KB if we are talking about modern ARM) page to make it executable?</p> <p>This sounds pretty wasteful.</p> <font class="QuotedText">&gt; Then when the time comes to replace the running code the app tells the kernel to write a processor-specific safe stop sequence to the interrupt point.</font> <p>So now you want to introduce something like stop-the-world interrupt in place where previously everything was completely lock-free?</p> <p>Also: JIT doesn't <b>actually</b> replaces running code, it replaces <b>branch target</b> in the running code. Using well-documented and guaranteed approach explicitly described in the CPU manual.</p> <p>Why do you want to change <b>that</b>?</p> <font class="QuotedText">&gt; on x86 this would likely be a sequence of eight int 3 instructions. </font> <p>Why eight and what would it give us? Except more complications and more places to have bugs?</p> <font class="QuotedText">&gt; This will stop the thread and the app can allocate a new writable alloc and tell the kernel to make it executable and have the thread continue there.</font> <p>But app doesn't need that! App just simply wants to replace target of jump! Without all that complicated and useless machinery! Previously there was <code>call COMPILE_ME_FOO</code> and now there would be <code>call FOO_JIT_COMPILED_AND_READY_TO_USE</code>. 
That's all!</p> <p>It's useless because eight int 3 instructions don't guarantee anything (x86 includes instructions longer than eight bytes and with redundant prefixes you may force almost any instruction to be longer) and it's useless because write via <code>/proc/self/mem</code> (or use of two mappings) <b>already</b> does everything that's needed!</p> <p>Why add an API that would be more convoluted and slower yet not actually safer than the existing API?</p> <p>It doesn't really make much sense! You gave us some elaborate solution to some unknown problem, but neglected to say what is the problem that solution is supposed to solve!</p> <p>It's very hard to understand whether a proposal is good or bad if we have no idea what that proposal is even supposed to achieve.</p> <p>As in: what was that dance with eight int 3 instructions and an additional syscall supposed to accomplish? What would it do better than write to <code>/proc/self/mem</code> or two separate mappings (one writable, one executable) for the same chunk of memory?</p> Sat, 27 Jul 2024 22:56:06 +0000 Julia https://lwn.net/Articles/983700/ https://lwn.net/Articles/983700/ willy <div class="FormattedComment"> I don't understand why you'd want to flip a page back to writable. You do a preliminary fast JIT to address A, flip the page from writable to executable. Then you decide the function is hot enough and do a more thorough compilation to address B, and change all the references to address A to address B. Once all the CPUs are no longer executing the code in the address A page, free it. Or reuse it. 
But editing code while it's still cached by another CPU is a very 90s approach to JIT.<br> </div> Sat, 27 Jul 2024 22:17:56 +0000 Self-modifying code https://lwn.net/Articles/983694/ https://lwn.net/Articles/983694/ malmedal <div class="FormattedComment"> I was responding to your earlier statement: <br> <p> <span class="QuotedText">&gt; On x86 CPUs there are special guarantee that it can be done safely if modified part fits fully into 8bytes segment (it probably goes back to 80486 CPUs because I have no idea how to explain that 8bytes limitation)</span><br> <p> Anyway, for your current statement, just have the kernel do the tricky bit.<br> <p> Have the app ask for a writable alloc, fill it with code.<br> <p> Then the app tells the kernel make this executable and the safe interrupt point is xxx. <br> <p> Then when the time comes to replace the running code the app tells the kernel to write a<br> processor-specific safe stop sequence to the interrupt point. <br> <p> on x86 this would likely be a sequence of eight int 3 instructions.<br> <p> This will stop the thread and the app can allocate a new writable alloc and tell the kernel to make it executable and <br> have the thread continue there. <br> <p> </div> Sat, 27 Jul 2024 21:52:33 +0000 Julia https://lwn.net/Articles/983690/ https://lwn.net/Articles/983690/ NYKevin <div class="FormattedComment"> They were already doing that. 
Specifically, the revert patch linked in the article speaks of opening /proc/self/mem, which is your own memory.<br> </div> Sat, 27 Jul 2024 20:30:05 +0000 Self-modifying code https://lwn.net/Articles/983682/ https://lwn.net/Articles/983682/ khim <font class="QuotedText">&gt; I don't remember if the 386 had an executable-only mode, but we certainly had writable memory that could be executed.</font> <p>The main issue that we are discussing here revolves around <a href="https://en.wikipedia.org/wiki/NX_bit#x86">NX bit</a> that allows one to create <b>non</b>-executable code!</p> <p>On 386 the only way to make code non-executable was to play with segments and their limits. On Unix-like OS the best you may do is split 4GB of virtual address space in two: non-excutable area and executable area.</p> <p>That means that approach that <a href="https://lwn.net/Articles/983614/">skissane talks about</a> is just simply not possible on 386! Except if you use extremely weird OS which doesn't use paging, but uses segments for virtual memory.</p> <p>Such OSes may exist, in theory, but I certainly know none, that actually did this thing in practice, that's why I have become so excited you said you did that with 386.</p> <p>But it looks more and more likely that you haven't done what we are talking here about at all and are talking about entirely different situation.</p> <font class="QuotedText">&gt; They are recommending a sort of RCU like approach to avoid this: <blockquote>Note that since stores to the instruction stream are observed by the instruction fetcher in program order, one can do multiple modifications to an area of the target thread's code that is beyond reach of the thread's current control flow, followed by a final asynchronous update that alters the control flow to expose the modified code to fetching and execution.</blockquote> </font> <p>That just happens with JITs automatically: once you have created optimized version of routine there are rarely the need to go 
back to intepreter. But yeah, usually only one <code>call</code>/<code>jmp</code> instruction is patched.</p> <blockquote>It's not a fragile thing to do, it is even supported by ld, see the -N option. Seems that interferes with shared libraries, so if you want that you need to use mprotect.</blockquote> <p>Again: that's different. Keeping something in the write+execute mode is dangerous WRT exploits, but not fragile, but playing with permissions and flipping from read+write to read+execute and back is pretty fragile because you need to ensure that code that you want to patch is not executed on the other core!</p> <font class="QuotedText">&gt; I believe it fell out of favour because the performance advantage became much less when the 486 came with on-chip cache.</font> <p>No, it fell out of favor much later, when people started caring about security and started enforcing <a href="https://en.wikipedia.org/wiki/W%5EX">W^X</a> property.</p> <p>First with segment limit tricks and then, later, with hardware <a href="https://en.wikipedia.org/wiki/NX_bit#x86">NX bit</a>.</p> <p>Only Apple and only on iOS enforces it so radically as make JITs simply impossible, other OSes provide ways for JITs to work, that we are discussing here.</p> <p>But <b>all</b> this discussion is happening in an <a href="https://lwn.net/Articles/983604/">W^X</a> world!</p> <p>Why do you keep bring W+X examples and keep saying that you can do everything easily if only you remove that restriction… of course it's possible to do, what could be simpler?</p> <p>That's simply not what we are discussing here! The idea is to ensure that <a href="https://en.wikipedia.org/wiki/W%5EX">W^X</a> is strictly enforced, maybe even teach kernel not to ever provide W+X mappings at all — and yet still keep JITs working, somehow.</p> Sat, 27 Jul 2024 20:21:52 +0000 thanks for the ptracer option https://lwn.net/Articles/983671/ https://lwn.net/Articles/983671/ Heretic_Blacksheep <div class="FormattedComment"> I agree. 
This is one low hanging fruit down, several left to go, from a security POV. The compromise is acceptable - and hopefully distros will default to "ptrace" in the future after a transition period thus eventually forcing software that can to clean up their acts and responsibly notify users when they functionally can't how to re-enable the old behavior, much as how it worked with OpenBSD's W^X transition some years back.<br> </div> Sat, 27 Jul 2024 17:21:23 +0000 Self-modifying code https://lwn.net/Articles/983666/ https://lwn.net/Articles/983666/ malmedal Thank you. <p> So reading page from page 206, you are talking about asynchronous modification. 8 bytes is not because of 486, it is because 8 bytes is 64 bits, the size that gets atomically updated. Also the 64 bits must be aligned. <p> It is basically warning about the situation where the instruction pointer is in the middle of a 64bit quad word when the quad word gets updated, if the instruction boundary changes so that the IP is not actually at the start of an intended instruction you have a problem. <p> They are recommending a sort of RCU like approach to avoid this: <blockquote> Note that since stores to the instruction stream are observed by the instruction fetcher in program order, one can do multiple modifications to an area of the target thread's code that is beyond reach of the thread's current control flow, followed by a final asynchronous update that alters the control flow to expose the modified code to fetching and execution. </blockquote> Reading a bit further, on synchronous modification where the target thread is waiting while the other thread is writing, the rules are the same as before, you can make whatever changes you want, but the target thread must execute a serialising instruction. <p> I don't remember if the 386 had an executable-only mode, but we certainly had writable memory that could be executed. <p> It's not a fragile thing to do, it is even supported by ld, see the -N option. 
Seems that interferes with shared libraries, so if you want that you need to use mprotect. <p> I believe it fell out of favour because the performance advantage became much less when the 486 came with on-chip cache. Sat, 27 Jul 2024 17:03:40 +0000 Julia https://lwn.net/Articles/983662/ https://lwn.net/Articles/983662/ khim <font class="QuotedText">&gt; We certainly used to be able to do that. It gave a nice speedup on the 386 when you could save a register by having a mov #immediate, register and modify the immediate as needed.</font> <p>386 doesn't even have “executable” bit in its page tables. Were you using segments? I guess this may work with segments since they cache permission in CPU registers. Still sounds very tricky and fragile to me.</p> <p>Are you even talking about <i>change it from writable to executable using mprotect</i> (and in SMP environment) when you say “we used to be able to do that”?</p> <font class="QuotedText">&gt; The old rule was you needed a jump or taken branch instruction to be sure the pipeline was clear after you modified executable code, later you could do any "serialising" instruction, like cpuid, instead.</font> <p>I think we are talking about past each other, again. I'm talking about execution of the code that you are planning to patch by some <b>other</b> CPU core. Were these 386 systems, that you are talking about, even SMP ones? On UP system things are much, <b>much</b>, <b>MUCH</b> simpler. But on SMP systems when you are patching code that other CPU may be executing at this precise moment you need atomicity guarantees or complicated and convoluted scheme that would ensure that code that you are planning to patch is not, currently, executing.</p> <font class="QuotedText">Reference for this?</font> <p>Look for the <i>Asynchronous modification</i> under <i>Cross-Modifying Code</i> in the <a href="https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf#page=268">AMD Manual</a>. 
Intel provides more or less the same guarantees, but I don't remember which section of the manual describes that.</p> Sat, 27 Jul 2024 15:36:07 +0000 Julia https://lwn.net/Articles/983661/ https://lwn.net/Articles/983661/ malmedal <div class="FormattedComment"> <span class="QuotedText">&gt; You can not do that to code that's already compiled and, more importantly, is currently executing.</span><br> <p> We certainly used to be able to do that. It gave a nice speedup on the 386 when you could save a register by having a mov #immediate, register and<br> modify the immediate as needed. <br> <p> <span class="QuotedText">&gt; On x86 CPUs there are special guarantee that it can be done safely if modified part fits fully into 8bytes segment </span><br> <p> Reference for this? The old rule was you needed a jump or taken branch instruction to be sure the pipeline was clear after you modified executable code, later you could do any "serialising" instruction, like cpuid, instead.<br> <p> </div> Sat, 27 Jul 2024 15:04:48 +0000 Julia https://lwn.net/Articles/983649/ https://lwn.net/Articles/983649/ khim <font class="QuotedText">&gt; Why not just write to the page normally, and then change it from writable to executable using mprotect? Wouldn’t that be simpler?</font> <p>You can not do that to code that's already compiled and, more importantly, <b>is currently executing</b>.</p> <font class="QuotedText">&gt; I thought that was what most JITs did.</font> <p>Most “serious” JITs also do what Julia does. 
When you JIT-compile one function you can't be sure that the other functions that are called from the current one need to be JIT-compiled too: if they are handling some exceptional conditions then they may never be called, and the JIT compiler needs to be fast, otherwise it may even be slower than the interpreter, in some cases!</p> <p>That's why instead of putting a call to a JIT-compiled function they put a call to a “compile me later” thunk (or, in some JITs, to the interpreter, I'm not sure what exactly Julia uses).</p> <p>But when you have a fully-optimized version of the function, that call to the “compile me later” thunk is now just pure overhead! You can go eliminate it… but for that you need to patch already compiled (and, presumably, executing!) code.</p> <p>On x86 CPUs there are special guarantee that it can be done safely if modified part fits fully into 8bytes segment (it probably goes back to 80486 CPUs because I have no idea how to explain that 8bytes limitation), most other CPUs don't need this trick because they can't embed an address into one instruction anyway, thus they load the address from memory... but that memory has to be in the same page on some CPUs because of limitations of the instruction encoding!</p> <p>You may guess how important it is for performance from just a simple fact: <a href="https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html">Intel APX</a> added a special new jump format to make that patching easier!</p> <p>And yes, Julia is not an exception, almost all JITs are doing that, they are just using two mappings because that's a simple and cross-platform way of doing that.</p> Sat, 27 Jul 2024 14:12:18 +0000 What's in a name? 
https://lwn.net/Articles/983643/ https://lwn.net/Articles/983643/ intelfx <div class="FormattedComment"> <span class="QuotedText">&gt; follow_page()</span><br> <p> Ah, so that’s what was under all those turtles!<br> <p> Thanks.<br> </div> Sat, 27 Jul 2024 11:33:28 +0000 Julia https://lwn.net/Articles/983638/ https://lwn.net/Articles/983638/ roc <div class="FormattedComment"> I already mentioned above that it's generally harder for an exploit to perform a system call with the right parameters than to just write to the desired address.<br> </div> Sat, 27 Jul 2024 10:11:54 +0000 Julia https://lwn.net/Articles/983634/ https://lwn.net/Articles/983634/ pm215 <div class="FormattedComment"> The commit message to the commit where Linus reverted his initial patch has a quote presumably from one of the Julia developers which gives some context here (<a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f511c0b17b08">https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...</a>) :<br> <p> "We used these semantics as a hardening mechanism in the julia JIT. By<br> opening /proc/self/mem and using these semantics, we could avoid<br> needing RWX pages, or a dual mapping approach. 
We do have fallbacks to<br> these other methods (though getting EIO here actually causes an assert<br> in released versions - we'll updated that to make sure to take the<br> fall back in that case).<br> <p> Nevertheless the /proc/self/mem approach was our favored approach<br> because it a) Required an attacker to be able to execute syscalls<br> which is a taller order than getting memory write and b) didn't double<br> the virtual address space requirements (as a dual mapping approach<br> would)."<br> <p> </div> Sat, 27 Jul 2024 08:26:36 +0000 Julia https://lwn.net/Articles/983628/ https://lwn.net/Articles/983628/ mb <div class="FormattedComment"> <span class="QuotedText">&gt; while the page is writable another thread could modify it to include malicious code.</span><br> <p> yes, well. But that's also possible with /proc/self/mem + FOLL_FORCE, as the article explains.<br> So switching to it doesn't solve that problem.<br> </div> Sat, 27 Jul 2024 07:09:27 +0000 Julia https://lwn.net/Articles/983616/ https://lwn.net/Articles/983616/ roc <div class="FormattedComment"> Flipping pages between writable and executable has a couple of problems. There's performance overhead, since removing prot bits from a page requires IPIs to all other CPUs with the page mapped. And there are security issues, since while the page is writable another thread could modify it to include malicious code.<br> </div> Sat, 27 Jul 2024 01:35:07 +0000 Julia https://lwn.net/Articles/983614/ https://lwn.net/Articles/983614/ skissane <div class="FormattedComment"> Why not just write to the page normally, and then change it from writable to executable using mprotect? Wouldn’t that be simpler? I thought that was what most JITs did.<br> </div> Sat, 27 Jul 2024 00:30:59 +0000 thanks for the ptracer option https://lwn.net/Articles/983605/ https://lwn.net/Articles/983605/ roc <div class="FormattedComment"> Phew, I'm glad the discussion arrived at a solution that supports unlimited access for ptracers. 
It would have been hugely problematic if people had to choose between debuggers working and no protection at all.<br> </div> Fri, 26 Jul 2024 22:54:40 +0000 Julia https://lwn.net/Articles/983604/ https://lwn.net/Articles/983604/ roc <div class="FormattedComment"> It's for the Julia JIT. JITs want to dynamically generate code and execute it, which happens to be the same thing that exploits want to do. So Julia's JIT allocates some read-only executable pages and then writes to them using /proc/.../mem. That's safer than making the pages directly writable, because it's harder for exploits to make that appropriate write system call than to write to memory directly.<br> </div> Fri, 26 Jul 2024 22:53:08 +0000 Julia https://lwn.net/Articles/983603/ https://lwn.net/Articles/983603/ willy <div class="FormattedComment"> It could operate on its own address space instead of on the address space of another process?<br> </div> Fri, 26 Jul 2024 22:42:35 +0000 Julia https://lwn.net/Articles/983598/ https://lwn.net/Articles/983598/ acarno <div class="FormattedComment"> I'm no expert by any means, but it looks like Julia allows both interpreted execution (like Python) as well as just-in-time compilation. It's a consequence of being a language designed for high-performance numerical analysis - it aims to support heavy number crunching as well as quick scripts. They presumably want to be able to live-patch a process as it runs (I'm not sure how you could do this otherwise).<br> </div> Fri, 26 Jul 2024 22:12:19 +0000 Julia https://lwn.net/Articles/983587/ https://lwn.net/Articles/983587/ rgb <div class="FormattedComment"> Could someone enlighten me what is so special about Julia that it needs this giant security hole to operate properly?<br> </div> Fri, 26 Jul 2024 20:48:52 +0000 What's in a name? 
https://lwn.net/Articles/983573/ https://lwn.net/Articles/983573/ pbonzini <div class="FormattedComment"> <span class="QuotedText">&gt; That flag causes the write to succeed, regardless of whether the normal memory protections at the target address would allow writing</span><br> </div> Fri, 26 Jul 2024 19:36:22 +0000 What's in a name? https://lwn.net/Articles/983568/ https://lwn.net/Articles/983568/ nickodell What does <code>FORCE</code> stand for? What check is being ignored? Fri, 26 Jul 2024 18:26:55 +0000 What's in a name? https://lwn.net/Articles/983554/ https://lwn.net/Articles/983554/ abatters <div class="FormattedComment"> "Follow", as in the prefix of bitwise-or flag arguments passed to follow_page(), the kernel function that uses page tables to lookup a struct page * given a struct vma and a virtual address.<br> </div> Fri, 26 Jul 2024 16:26:20 +0000 What's in a name? https://lwn.net/Articles/983553/ https://lwn.net/Articles/983553/ intelfx I'll bite. What does <code>FOLL_</code> stand for, anyway? Fri, 26 Jul 2024 16:03:29 +0000