LWN: Comments on "Betrayed by a bitfield" https://lwn.net/Articles/478657/ This is a special feed containing comments posted to the individual LWN article titled "Betrayed by a bitfield". en-us Thu, 18 Sep 2025 01:07:15 +0000 Thu, 18 Sep 2025 01:07:15 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Betrayed by a bitfield https://lwn.net/Articles/482123/ https://lwn.net/Articles/482123/ jwakely <div class="FormattedComment"> <font class="QuotedText">&gt; as -pthread is supposed to switch on the POSIX memory model in gcc</font><br> <p> Says who?<br> <p> On ia64-linux it appears to be equivalent to just -D_REENTRANT, on x86-linux it also passes -lpthread to the linker, but it has no effect on code-generation.<br> <p> <font class="QuotedText">&gt; they don't want full sequential consistency in all cases because of its efficiency implications.</font><br> <p> Which part of the C11 memory model requires sequential consistency?<br> <p> </div> Fri, 17 Feb 2012 16:20:39 +0000 Betrayed by a bitfield https://lwn.net/Articles/481990/ https://lwn.net/Articles/481990/ chrisV <div class="FormattedComment"> Any chance you could file a bug on this one? 
It is not so bad if the non-standard use of volatile has this effect (although no doubt still very annoying to the kernel developers), but it seems to me to be something else when the code happens to be POSIX-conforming code and corrupts memory locations.<br> <p> (Yes I have seen the argument in papers in the early proposals for the C/C++ threading model that "memory location" is ambiguous in Base Definitions, section 4.11, of the SUS, and could refer to a machine's natural word size rather than individual scalars, and so C11/C++11 now explicitly states that "memory location" in its equivalent wording means any scalar or bitfield, but that is a completely perverse construction of the SUS as it makes POSIX threads completely useless as a standard, bringing threading back to individual non-standard ABIs.)<br> </div> Thu, 16 Feb 2012 20:21:38 +0000 Betrayed by a bitfield https://lwn.net/Articles/481691/ https://lwn.net/Articles/481691/ cbf123 I just tested with gcc version 4.3.2 (Wind River Linux Sourcery G++ 4.3-85) for powerpc. Building the test app (the volatile version) as "gcc x.c -c -m64 -pthread" resulted in the following code, which clearly shows a 64-bit read/write cycle. <pre>
0000000000000000 <.wrong>:
   0: fb e1 ff f8  std r31,-8(r1)
   4: f8 21 ff c1  stdu r1,-64(r1)
   8: 7c 3f 0b 78  mr r31,r1
   c: f8 7f 00 70  std r3,112(r31)
  10: e9 3f 00 70  ld r9,112(r31)
  14: e8 09 00 08  ld r0,8(r9)
  18: 39 60 00 01  li r11,1
  1c: 79 60 f8 2c  rldimi r0,r11,31,32
  20: f8 09 00 08  std r0,8(r9)
  24: e8 21 00 00  ld r1,0(r1)
  28: eb e1 ff f8  ld r31,-8(r1)
  2c: 4e 80 00 20  blr
</pre> Wed, 15 Feb 2012 17:01:14 +0000 A single historical context of volatile https://lwn.net/Articles/480693/ https://lwn.net/Articles/480693/ dfsmith <div class="FormattedComment"> To quote from Harbison &amp; Steele 3rd ed. (1991)* section 4.4.5<br> <p> "... any object ... of a volatile-qualified type should not participate in optimizations that would ... 
modif[y] the object."<br> <p> To my mind that prohibition includes non-volatile object optimizations that would clobber volatile objects.<br> <p> The book then slightly contradicts itself two paragraphs later by saying that "optimizations between sequence points are permitted" before clarifying with a hardware example.<br> <p> <p> * A standard ANSI C reference manual for programmers and compiler implementers at the time.<br> <p> <p> </div> Fri, 10 Feb 2012 22:22:30 +0000 Betrayed by a bitfield https://lwn.net/Articles/480296/ https://lwn.net/Articles/480296/ jd <div class="FormattedComment"> Can't we just give a "Compiler On Acid" error?<br> </div> Thu, 09 Feb 2012 17:51:34 +0000 Betrayed by a bitfield https://lwn.net/Articles/480030/ https://lwn.net/Articles/480030/ daglwn <div class="FormattedComment"> Seeing some of your other posts about -pthread, I think we are in agreement. Apologies if I mischaracterized your understanding.<br> <p> </div> Wed, 08 Feb 2012 15:24:55 +0000 Betrayed by a bitfield https://lwn.net/Articles/480003/ https://lwn.net/Articles/480003/ nix <div class="FormattedComment"> SPARC is another major example of an arch with alignment-restricted loads and stores. 
I have dim memories that suggest that MIPS might be as well.<br> <p> It's actually easier to come up with a list of architectures that do *not* require natural alignment on loads and stores than to come up with a list of those that do.<br> </div> Wed, 08 Feb 2012 13:57:21 +0000 Betrayed by a bitfield https://lwn.net/Articles/480001/ https://lwn.net/Articles/480001/ nix <div class="FormattedComment"> And, thirdly, the kernel is not C11 code -- yet.<br> </div> Wed, 08 Feb 2012 13:51:39 +0000 Betrayed by a bitfield https://lwn.net/Articles/479969/ https://lwn.net/Articles/479969/ khim <p>YMMV, as usual.</p> <p>We recompile the world anyway, so ABI change is less of a problem, but the fact that just a recompilation does not fix the issue and you need to do a lot of investigations is a problem.</p> Wed, 08 Feb 2012 08:54:52 +0000 Betrayed by a bitfield https://lwn.net/Articles/479914/ https://lwn.net/Articles/479914/ daglwn <div class="FormattedComment"> There are plenty of architectures that have these kinds of restricted loads and stores. The old Cray machines only ever operated on 64 bit words, for example.<br> <p> And often a machine implements the smaller accesses but there is a performance penalty due to alignment issues and the bus/DRAM access architecture. It's not uncommon for the RMW to be faster.<br> <p> </div> Wed, 08 Feb 2012 00:11:23 +0000 Betrayed by a bitfield https://lwn.net/Articles/479909/ https://lwn.net/Articles/479909/ daglwn <div class="FormattedComment"> <font class="QuotedText">&gt; And if you get false sharing between two free-standing variables where the &gt; one being operated on is marked volatile, or between fields of a struct </font><br> <font class="QuotedText">&gt; where the struct is marked volatile, there is a compiler bug.</font><br> <p> No, there isn't.<br> <p> There isn't. Really.<br> <p> Volatile does not mean what you think it means.<br> <p> It's a bit like sequential consistency. 
Just when you think you understand it, something unexpected happens that is both non-intuitive and perfectly legal.<br> <p> </div> Tue, 07 Feb 2012 23:32:34 +0000 Betrayed by a bitfield https://lwn.net/Articles/479907/ https://lwn.net/Articles/479907/ daglwn <div class="FormattedComment"> Ah, I read "C11."<br> <p> Yes, you are correct, but I don't think there's an ABI issue. An ABI issue is much harder to deal with than a semantic change.<br> <p> With an ABI issue you've got to recompile the world (your project, libraries it links to, etc.) to get a working application. With a semantic change you only recompile the bits that had to be recoded to account for the change.<br> <p> <p> </div> Tue, 07 Feb 2012 23:29:13 +0000 Betrayed by a bitfield https://lwn.net/Articles/479877/ https://lwn.net/Articles/479877/ khim <blockquote><font class="QuotedText">&gt; Perfectly working C++ program can already be broken by recompilation in <br /> &gt; C++11 mode<br /><br /> But as we all know, C++ is not C. :) I don't see this as a problem.</font></blockquote> <p>C++ is quite explicitly not C, but C++11 pretends that it's still C++.</p> Tue, 07 Feb 2012 20:53:16 +0000 Betrayed by a bitfield https://lwn.net/Articles/479865/ https://lwn.net/Articles/479865/ chrisV <div class="FormattedComment"> I want to keep your issues separate and I have asked you under a separate posting about the tests concerned. On the POSIX point, setting out the assembly output of test cases for Itanium, first without -pthread and then with -pthread, and showing the false sharing would be sufficient, as -pthread is supposed to switch on the POSIX memory model in gcc and it will quickly become apparent if it doesn't.<br> <p> This posting is just to comment on your observations on the C11 memory model. You are quite right in saying it represents accumulated wisdom, because its starting point was the (comparatively under-specified) POSIX memory model for multi-threaded programs. 
However, if you read the lkml postings in question, you will see that the kernel community do not want the generalized C11 memory model for the kernel. They want the preclusion of false sharing; they don't want full sequential consistency in all cases because of its efficiency implications.<br> </div> Tue, 07 Feb 2012 20:18:16 +0000 Betrayed by a bitfield https://lwn.net/Articles/479863/ https://lwn.net/Articles/479863/ dlang <div class="FormattedComment"> the kernel did not have one field marked volatile, but in the research into the problem, someone (I think it was Linus) tested with volatile and the false sharing was happening there as well.<br> </div> Tue, 07 Feb 2012 20:01:12 +0000 Betrayed by a bitfield https://lwn.net/Articles/479859/ https://lwn.net/Articles/479859/ chrisV <div class="FormattedComment"> And since you refer to "none of the options you are suggesting makes a slight difference", can you also clarify whether you mean you have run a test case demonstrating false sharing even in the case of 64-bit scalars, and/or even in the case of assembly instructions which should have forced synchronization?<br> <p> </div> Tue, 07 Feb 2012 19:36:31 +0000 Betrayed by a bitfield https://lwn.net/Articles/479854/ https://lwn.net/Articles/479854/ chrisV <div class="FormattedComment"> Are you telling me that you have actually run a test case compiled with the -pthread option which exhibits this problem? If not, I just don't believe you because I have written shed loads of multi-threaded code using POSIX threads which has never encountered a false sharing problem of this kind.<br> <p> If you run a test case compiled with the -pthread flag which exhibits this problem then this is an egregious failure on the part of the threading implementation of gcc and you should post a bug. 
(It is far more serious than the non-standards complying kernel code.)<br> </div> Tue, 07 Feb 2012 19:21:56 +0000 Betrayed by a bitfield https://lwn.net/Articles/479851/ https://lwn.net/Articles/479851/ chrisV <div class="FormattedComment"> I agree. And if you get false sharing between two free-standing variables where the one being operated on is marked volatile, or between fields of a struct where the struct is marked volatile, there is a compiler bug. It is still not clear whether that is the case with the kernel test case (first, the struct was not marked volatile, only one of its fields; and secondly, we don't know whether the test case involved an asynchronous test (as opposed to threads) or not).<br> </div> Tue, 07 Feb 2012 19:07:43 +0000 Betrayed by a bitfield https://lwn.net/Articles/479844/ https://lwn.net/Articles/479844/ daglwn <div class="FormattedComment"> <font class="QuotedText">&gt; "It says nothing about interrupts."</font><br> <p> Thanks for the correction. But the compiler is still correct here. Volatile doesn't say anything about restricting _when_ it is read or written, only that it will get the "latest" value in a single-thread context.<br> <p> It's perfectly fine for I/O as long as you can guarantee alignment such that there is no "false sharing."<br> <p> </div> Tue, 07 Feb 2012 18:38:57 +0000 Betrayed by a bitfield https://lwn.net/Articles/479843/ https://lwn.net/Articles/479843/ daglwn <div class="FormattedComment"> Ah, cool, didn't know about _Alignas. I'm a codegen guy so I'm not playing around in the C frontend very often. I only look at the standard when I really have to. :)<br> <p> There's still an ABI problem if the compiler always has to align members to uphold the requirement.<br> <p> <font class="QuotedText">&gt; Perfectly working C++ program can already be broken by recompilation in </font><br> <font class="QuotedText">&gt; C++11 mode</font><br> <p> But as we all know, C++ is not C. 
:) I don't see this as a problem.<br> <p> </div> Tue, 07 Feb 2012 18:35:23 +0000 Betrayed by a bitfield https://lwn.net/Articles/479825/ https://lwn.net/Articles/479825/ BenHutchings <div class="FormattedComment"> None of the options you are suggesting makes a slight difference to gcc behaviour in this case. In any case, the C11 memory model is to a large extent a codification of what is already understood in the industry to be necessary to support multithreaded C programs.<br> <p> Your unhelpful standards-lawyering attitude is exactly what many programmers have come to hate about certain compiler developers.<br> <p> </div> Tue, 07 Feb 2012 17:12:14 +0000 Betrayed by a bitfield https://lwn.net/Articles/479823/ https://lwn.net/Articles/479823/ BenHutchings <blockquote>Insane? I imagine the 6- and 16-bit processor guys said the same when Alpha appeared.</blockquote> <p>Yes, Alpha's lack of 8- and 16-bit store instructions was insane. That's why they were added in later versions of the architecture.</p> Tue, 07 Feb 2012 17:06:47 +0000 Betrayed by a bitfield https://lwn.net/Articles/479818/ https://lwn.net/Articles/479818/ chrisV <div class="FormattedComment"> "It says nothing about interrupts."<br> <p> I think it does. See §5.1.2.3/5 and /10, which are normative. The principal purpose of volatile is to deal with arbitrary changes of data values outside the program context of the process in which the code is running (ie asynchronous interrupts). (This does not include threads, which are within the program context and which, because they can run on more than one core, require quite different synchronizations, some of which are not async-signal-safe.)<br> <p> See also the footnote 134 of §6.7.3/8 (which is non-normative despite the "shall not"): "A volatile declaration may be used to describe an object corresponding to a memory-mapped input/output port or an object accessed by an asynchronously interrupting function. 
Actions on objects so declared shall not be 'optimized out' by an implementation or reordered except as permitted by the rules for evaluating expressions." This is a curious note as, as far as I am aware, it is the one and only reference to memory mapping (and about which I mis-spoke in an earlier posting on this article because it is not in C99 which contains no reference to memory mapping).<br> <p> </div> Tue, 07 Feb 2012 16:54:12 +0000 Betrayed by a bitfield https://lwn.net/Articles/479759/ https://lwn.net/Articles/479759/ khim <blockquote><font class="QuotedText">This is why volatile is non-portable. Unfortunately, C99 has no standard way to force alignment of any object.</font></blockquote> <p>GCC, MSVC and other compilers include such an ability and C11 finally adds it to the standard, so it's not all so bad...</p> Tue, 07 Feb 2012 08:46:45 +0000 Betrayed by a bitfield https://lwn.net/Articles/479757/ https://lwn.net/Articles/479757/ khim <p>Since C11 includes _Alignas and you can specify your own alignment, it's not a disaster, albeit it <b>is</b> an inconvenience.</p> <p>Perfectly working C++ program can already be broken by recompilation in C++11 mode, so it's not the first time an upgrade broke things.</p> <p>Of course this is only possible if the original program violated the specs and worked by accident (if you can show me genuinely different behavior in a standards-compliant program it'll be interesting to know, too, but so far all examples I've seen contained subtle violations of one form or another).</p> Tue, 07 Feb 2012 08:43:27 +0000 Betrayed by a bitfield https://lwn.net/Articles/479705/ https://lwn.net/Articles/479705/ dlang <div class="FormattedComment"> in that case I have to agree with the other poster who said that if the compiler considers it Ok to write over any arbitrary memory locations, as long as what it's writing matches what the compiler thinks is already there, then that compiler is unsuitable for use with any memory mapped I/O as it will feel free to 
clobber the new data that is waiting to be read.<br> <p> since this sort of thing has been part of C's traditional strength, this doesn't seem like a sane interpretation to me.<br> </div> Tue, 07 Feb 2012 01:03:27 +0000 Betrayed by a bitfield https://lwn.net/Articles/479700/ https://lwn.net/Articles/479700/ daglwn <div class="FormattedComment"> Again, this is implementation-defined behavior. The standard you're looking for is the ABI for your target. That's where all the memory layout, access size, calling convention and other low-level stuff is specified.<br> <p> Volatile is perfectly fine for I/O as long as you know the address being accessed is suitably aligned to avoid problems, as the ABI should indicate.<br> <p> This is why volatile is non-portable. Unfortunately, C99 has no standard way to force alignment of any object.<br> <p> </div> Tue, 07 Feb 2012 00:52:42 +0000 Betrayed by a bitfield https://lwn.net/Articles/479698/ https://lwn.net/Articles/479698/ daglwn <div class="FormattedComment"> <font class="QuotedText">&gt; if modifying b causes a read/write of a, this is wrong.</font><br> <p> No, it's not. Believe me.<br> <p> The volatile keyword doesn't say anything about when it will change value, be read/written etc. It says simply that it will not be cached in a register such that every read will get the "latest" value expected when executed under the Abstract Machine.<br> <p> It says nothing about threading.<br> <p> It says nothing about interrupts.<br> <p> Simply remember that volatile is not magic. 
Think of it as the opposite of "register."<br> <p> </div> Tue, 07 Feb 2012 00:48:44 +0000 Betrayed by a bitfield https://lwn.net/Articles/479696/ https://lwn.net/Articles/479696/ daglwn <div class="FormattedComment"> <font class="QuotedText">&gt; If that is not available it may have to pad so that any 32-bit scalar </font><br> <font class="QuotedText">&gt; occupies 64 bits of space, where necessary to conform to the standard.</font><br> <p> Or pad 8 bits to 32, etc. This is what I mean by breaking ABI compatibility. If interfaces include struct values this could be a real nightmare.<br> <p> Personally, I think this requirement is ridiculous absent some special attribute to enforce it. If there was a qualifier on a type to indicate "shared," _a_la_ UPC, that would be one thing. But to have this requirement for every single data item in the program is a very bad idea.<br> <p> </div> Tue, 07 Feb 2012 00:43:27 +0000 Betrayed by a bitfield https://lwn.net/Articles/479669/ https://lwn.net/Articles/479669/ giraffedata <blockquote> <blockquote> That's very interesting, because it seems to say the provision for "volatile" is so incomplete as to be a pointless language feature. </blockquote> Not totally true. It's true that volatile doesn't do what most people expect. Basically, volatile says the data can't be cached in a register ... </blockquote> <p> I was commenting at a higher level. Never mind what the compiler can and can't do. The question is, what programs can you write in C because it has the "volatile" feature that you couldn't otherwise? Relying on nothing but that the compiler implements the C standard. <p> I've always understood the originally intended answer to be, "you can write a program that uses memory mapped I/O." But comments in this thread say regardless of how one uses the "volatile" keyword in a program, the compiler may always generate code that arbitrarily writes to memory mapped I/O addresses. There's nothing in the standard to stop it. 
If so, then you not only can't use memory mapped I/O in a C program, you can't even let a C program &mdash; any C program &mdash; <em>run</em> in an address space that includes memory mapped I/O regions. <p> Unless all memory-mapped I/O regions ignore writes. <p> And it's hard to believe that the definers of "volatile" actually had such a useless thing in mind. Mon, 06 Feb 2012 22:22:20 +0000 Betrayed by a bitfield https://lwn.net/Articles/479665/ https://lwn.net/Articles/479665/ dlang <div class="FormattedComment"> Linus and several others in the kernel thread on the subject have said this.<br> </div> Mon, 06 Feb 2012 21:39:31 +0000 Betrayed by a bitfield https://lwn.net/Articles/479662/ https://lwn.net/Articles/479662/ chrisV <div class="FormattedComment"> Well that should make it easier to get the gcc developers to deal with the ia64 at least, although I would take a little persuading that moving memory from 64 bit registers into non-64-bit aligned memory locations didn't cause some slow down in code operating in 64 bit mode. Do you have any citations for that?<br> </div> Mon, 06 Feb 2012 21:32:09 +0000 Betrayed by a bitfield https://lwn.net/Articles/479657/ https://lwn.net/Articles/479657/ chrisV <div class="FormattedComment"> "no, the item in question wasn't related to interrupts, but the cause of the problem causes the same problem if interrupts were involved."<br> <p> No it doesn't. Memory access issues relating to multiple threads (or multiple processes in the case of shared memory) have nothing to do with interrupts, and nothing to do with volatile.<br> <p> This really is beating a dead horse. Compiler switches are available to ensure memory consistency for the case in question, notably the -pthread switch. I can see why the kernel doesn't want to, or can't, use that. 
In that case the kernel authors need to write some assembler or use 64-bit values on architectures where it is important, or persuade the gcc developers to provide another compiler switch dealing with this particular problem.<br> </div> Mon, 06 Feb 2012 21:16:25 +0000 Betrayed by a bitfield https://lwn.net/Articles/479659/ https://lwn.net/Articles/479659/ dlang <div class="FormattedComment"> 32 bit operations are not slower on all 64 bit architectures, including specifically the ia64 architecture where this problem was discovered.<br> </div> Mon, 06 Feb 2012 21:11:26 +0000 Betrayed by a bitfield https://lwn.net/Articles/479656/ https://lwn.net/Articles/479656/ chrisV <div class="FormattedComment"> "Again, not being familiar with C11, what does the memory model give that can help multithreaded programming, then? How does §5.1.2.4/4 work in the case where the machine access granularity is larger than some primitive types and a RMW must happen?"<br> <p> Discarding volatile, which is irrelevant, it requires the faulty behavior observed in the kernel not to occur. On 64 bit architectures it can do this by various pessimizations. It could use slower 32 bit accesses where available. If that is not available it may have to pad so that any 32-bit scalar occupies 64 bits of space, where necessary to conform to the standard.<br> <p> None of this is new. gcc's pthreads implementations have been doing this for years. It might be instructive to look at the assembly output of the same test cases compiled with the -pthread switch.<br> </div> Mon, 06 Feb 2012 21:04:58 +0000 Betrayed by a bitfield https://lwn.net/Articles/479655/ https://lwn.net/Articles/479655/ dlang <div class="FormattedComment"> one of the posts I saw on this topic (I think from Linus) tested the code snippet<br> <p> volatile int a;<br> int b;<br> <p> b++;<br> <p> and the resulting assembler did the exact same read-write on a that caused the problem with the case in question. 
If an interrupt were to happen between the read and the write that changed the value of a, the write would cause that value to be lost.<br> <p> no, the item in question wasn't related to interrupts, but the cause of the problem causes the same problem if interrupts were involved.<br> <p> If the 32 bit loads and stores were significantly more expensive to use than 64 bit loads and stores, there would be at least some reason to have this behaviour available, but in the case of the architectures in question there isn't a performance benefit to doing it this way.<br> </div> Mon, 06 Feb 2012 21:04:17 +0000 Betrayed by a bitfield https://lwn.net/Articles/479654/ https://lwn.net/Articles/479654/ chrisV <div class="FormattedComment"> "the thing is, the current GCC behaviour _would_ overwrite the value of a volatile variable that was modified by an interrupt in the middle of the read-modify-write cycle of a different (but adjacent) variable."<br> <p> I don't believe that has been tested and I don't believe it to be true. On hardware where 64 bit boundaries are important, sig_atomic_t would be likely to be 64-bits wide. (Someone with an ia64 architecture might be able to confirm this.) In that case alignment boundaries would be fully respected.<br> <p> Also, I can't see that the original problem had anything to do with interrupts. It seems to be concerned with either multi-threading, or shared access by multiple processes.<br> <p> Lastly, when you refer to "the bug with respect to volatile", you impute that gcc has the bug. If there is a bug, it is a bug in the kernel.<br> </div> Mon, 06 Feb 2012 20:51:52 +0000 Betrayed by a bitfield https://lwn.net/Articles/479647/ https://lwn.net/Articles/479647/ dlang <div class="FormattedComment"> right, but the programmer is not attempting to do anything with it. 
The programmer is attempting to do something with another variable, one that just happens to be adjacent to the one in question.<br> <p> again the code snippet is<br> <p> volatile int a;<br> int b;<br> <p> b++;<br> <p> if modifying b causes a read/write of a, this is wrong.<br> <p> the programmer has not made any attempt to specify alignment here.<br> </div> Mon, 06 Feb 2012 20:13:05 +0000 Betrayed by a bitfield https://lwn.net/Articles/479644/ https://lwn.net/Articles/479644/ dlang <div class="FormattedComment"> the thing is, the current GCC behaviour _would_ overwrite the value of a volatile variable that was modified by an interrupt in the middle of the read-modify-write cycle of a different (but adjacent) variable.<br> <p> As Linus says, it's possible for someone to fix the bug with respect to volatile, and explicitly leave the bug in for all other accesses (with the claim that the spec allows the compiler to do that), but it would probably be more work to do that than to just change the behaviour across the board.<br> </div> Mon, 06 Feb 2012 20:06:07 +0000 Betrayed by a bitfield https://lwn.net/Articles/479645/ https://lwn.net/Articles/479645/ daglwn <div class="FormattedComment"> The key word is "cached." On any load/store machine the data must be first loaded into a register to do anything with it. volatile says it must be written back out by the next sequence point. I'm sure language lawyers will find some detail I've missed but that's the way I think about the guarantee.<br> <p> </div> Mon, 06 Feb 2012 20:05:28 +0000
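<p>For readers following the alignment sub-thread, here is a minimal sketch (not anything posted in the discussion) of the "volatile int a; int b;" layout quoted above next to a C11 _Alignas variant of the kind khim mentions. The struct names and the helper same_word() are hypothetical, and the sketch assumes a 4-byte int, an 8-byte access granularity, and a struct that itself starts on an 8-byte boundary. It only reports whether two members could be covered by one 64-bit read-modify-write; it says nothing about what any particular gcc version actually emits.</p>

```c
#include <stddef.h>

/* Hypothetical layout mirroring the snippet quoted in the thread:
 * with a 4-byte int, a and b normally sit in the same 8-byte word. */
struct packed_neighbors {
    volatile int a;
    int b;
};

/* Same members, but b is forced onto its own 8-byte boundary with the
 * C11 _Alignas specifier, so a 64-bit store to b's word cannot cover a. */
struct separated_neighbors {
    volatile int a;
    _Alignas(8) int b;
};

/* Returns 1 if two member offsets fall in the same aligned 8-byte word.
 * Assumption: the enclosing struct starts on an 8-byte boundary. */
int same_word(size_t off_x, size_t off_y)
{
    return (off_x / 8) == (off_y / 8);
}

/* Convenience wrappers so callers do not repeat the offsetof calls. */
int packed_shares_word(void)
{
    return same_word(offsetof(struct packed_neighbors, a),
                     offsetof(struct packed_neighbors, b));
}

int separated_shares_word(void)
{
    return same_word(offsetof(struct separated_neighbors, a),
                     offsetof(struct separated_neighbors, b));
}
```

<p>Calling packed_shares_word() and separated_shares_word() from a test program shows the trade-off daglwn raises: the aligned variant keeps the two members apart, but it also changes the struct's size and layout, which is exactly the ABI cost discussed in the comments.</p>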