LWN: Comments on "Memory access and alignment" https://lwn.net/Articles/260832/
This is a special feed containing comments posted to the individual LWN article titled "Memory access and alignment".
Fri, 19 Sep 2025 10:15:35 +0000
Bitfields in C++ https://lwn.net/Articles/261660/ jzbiciak
<P>Keep in mind that a bitvector (sometimes called a bitmap) is something rather different from a bitfield member of a <TT>struct</TT> or <TT>class</TT>. Bitvectors have array-like semantics and are typically used to represent things such as set membership (e.g., a 1 indicates membership and a 0 indicates its absence). Bitfield members, on the other hand, are scalar quantities. There is no indexing associated with them, and their range is often much more than 0 to 1.</P>
<P> Bitfield members are often useful for storing values with limited range in a compact manner. For example, consider dates and times. The month number only goes from 1-12 and the day number only goes from 1-31. You can store those in 4 and 5 bits respectively. If you store year-1900 instead of the full 4-digit number, you can get by with 7 or 8 bits there. Hours fit in 5 bits, minutes and seconds in 6 bits each. That leads to something like this: </P>
<PRE>
struct packed_time_and_date
{
    unsigned int year   : 7;  /* 7 bits for year - 1900      */
    unsigned int month  : 4;  /* 4 bits for month  (1 - 12)  */
    unsigned int day    : 5;  /* 5 bits for day    (1 - 31)  */
    unsigned int hour   : 5;  /* 5 bits for hour   (0 - 23)  */
    unsigned int minute : 6;  /* 6 bits for minute (0 - 59)  */
    unsigned int second : 6;  /* 6 bits for second (0 - 59)  */
};
</PRE>
<P> Now, if I did my math right, that adds up to 33 bits. Under the x86 UNIX ABI, the compiler will allocate two 32-bit words for this. The first five fields (27 bits) will pack into the first 32-bit word, and the sixth field will be in the lower 6 bits of the second word. 
That is, <TT>sizeof(struct packed_time_and_date)</TT> will be 8. If you can manage to squeeze a bit out of the year (say, limit yourself to a smaller range of years), this fits into 32 bits, and <TT>sizeof(struct packed_time_and_date)</TT> will be 4. In either case, the compiler will update these fields-packed-within-words with an appropriate sequence of loads, shifts, masks, and stores. No C++ or STL necessary.</P>
<P>(If you want to see a more in-depth (ab)use of this facility, check out <A HREF="http://spatula-city.org/~im14u2c/intv/jzintv-1.0-beta3/src/cp1600/op_decode.h">this source file.</A> It constructs a <TT>union</TT> of <TT>struct</TT>s, each <TT>struct</TT> modeling a different opcode space in an instruction set.)</P>
<P>Anyhow, this is what I was referring to as "bit-fields" in the comment you replied to. As you can see, it's rather different from bitvectors. :-)</P>
<P>Bitvectors in C++ and STL have their own problems. I've been reading up on C++ and STL, and vectors of bits apparently have a checkered history in STL. There was one implementation (I believe at SGI) of <TT>bitvector</TT> (separate from <TT>vector&lt;bool&gt;</TT>) that didn't make it into the standard, and yet another (apparently a specialization of <TT>vector&lt;&gt;</TT> especially for <TT>bool</TT>) that many feel doesn't actually provide proper iterators. I wish I could provide references or details.</P>
<P>All I know is that it's enough for me to avoid STL on bitvectors, and just resort to coding these manually. It's served me well so far. And I don't see a need to invoke generics like <TT>sort</TT> on them. Something tells me most generic algorithms aren't as interesting on bitvectors. ;-)</P>
Tue, 11 Dec 2007 14:24:25 +0000
Bitfields in C++ https://lwn.net/Articles/261624/ pr1268
<p>Bjarne Stroustrup discusses a C++ Standard Template Library (STL) vector of bools (The C++ Programming Language, Special Edition, p. 
458). This is designed to overcome the wasted space of a 1-bit value taking up 16, 32, or 64 bits of memory.</p>
Mon, 10 Dec 2007 21:31:57 +0000
Here's an idea https://lwn.net/Articles/261524/ jzbiciak
<P>Oh, and bit-fields (that is, fields of somewhat arbitrary bit widths) tend to be based around the size of "int" on a given machine. That is, they tend to be word-oriented. The fields pack together to form words, and a field doesn't straddle two words.</P>
<P>Note that I say "tend to." Bit-field layout and struct layout are actually ABI issues (ABI == Application Binary Interface). For example, here's the <A HREF="http://www.sco.com/developers/devspecs/abi386-4.pdf">SVR4 i386 ABI</A>. Take a look starting at page 27. In the case of the SVR4 ABI, it appears bitfields are actually packed in terms of their base type. I believe the latest C standard only wants you to use <TT>signed int</TT>, <TT>unsigned int</TT> and <TT>_Bool</TT>, though.</P>
Mon, 10 Dec 2007 07:40:35 +0000
Here's an idea https://lwn.net/Articles/261522/ jzbiciak
<P>Yes. On an architecture with alignment constraints, the "filler[3]" field isn't necessary. The compiler will insert padding. You can check this out with the offsetof() macro.</P>
<P>For example, if I compile the following program on my 64-bit Opteron, you can see the pointers all get aligned to 8-byte boundaries like they're supposed to. If I compile it on a 32-bit machine, they get aligned to 4-byte boundaries. This is regardless of whether that filler field is there. 
</P>
<PRE>
#include &lt;stdio.h&gt;
#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Layout with an explicit 3-byte filler after c. */
typedef struct obj_1
{
    uint32_t  a, b;
    char      c;
    char      filler[3];
    uint32_t* p1;
    char**    p2;
    char**    p3;
} obj_1;

/* Same members without the filler; the compiler pads instead. */
typedef struct obj_2
{
    uint32_t  a, b;
    char      c;
    uint32_t* p1;
    char**    p2;
    char**    p3;
} obj_2;

int main(void)
{
    /* offsetof and sizeof yield size_t, so print them with %zu. */
    printf("offset of obj_1.a: %5zu bytes\n", offsetof(obj_1, a));
    printf("offset of obj_1.b: %5zu bytes\n", offsetof(obj_1, b));
    printf("offset of obj_1.c: %5zu bytes\n", offsetof(obj_1, c));
    printf("offset of obj_1.filler: %5zu bytes\n", offsetof(obj_1, filler));
    printf("offset of obj_1.p1: %5zu bytes\n", offsetof(obj_1, p1));
    printf("offset of obj_1.p2: %5zu bytes\n", offsetof(obj_1, p2));
    printf("offset of obj_1.p3: %5zu bytes\n", offsetof(obj_1, p3));
    putchar('\n');
    printf("offset of obj_2.a: %5zu bytes\n", offsetof(obj_2, a));
    printf("offset of obj_2.b: %5zu bytes\n", offsetof(obj_2, b));
    printf("offset of obj_2.c: %5zu bytes\n", offsetof(obj_2, c));
    printf("offset of obj_2.p1: %5zu bytes\n", offsetof(obj_2, p1));
    printf("offset of obj_2.p2: %5zu bytes\n", offsetof(obj_2, p2));
    printf("offset of obj_2.p3: %5zu bytes\n", offsetof(obj_2, p3));
    putchar('\n');
    printf("sizeof(int) = %zu bytes\n", sizeof(int));
    printf("sizeof(long) = %zu bytes\n", sizeof(long));
    printf("sizeof(void*) = %zu bytes\n", sizeof(void*));

    return 0;
}
</PRE>
<P>Output on a 32-bit machine:</P>
<PRE>
offset of obj_1.a:     0 bytes
offset of obj_1.b:     4 bytes
offset of obj_1.c:     8 bytes
offset of obj_1.filler:     9 bytes
offset of obj_1.p1:    12 bytes
offset of obj_1.p2:    16 bytes
offset of obj_1.p3:    20 bytes

offset of obj_2.a:     0 bytes
offset of obj_2.b:     4 bytes
offset of obj_2.c:     8 bytes
offset of obj_2.p1:    12 bytes
offset of obj_2.p2:    16 bytes
offset of obj_2.p3:    20 bytes

sizeof(int) = 4 bytes
sizeof(long) = 4 bytes
sizeof(void*) = 4 bytes
</PRE>
<P>Output on a 64-bit machine:</P>
<PRE>
offset of obj_1.a:     0 bytes
offset of obj_1.b:     4 bytes
offset of obj_1.c:     8 bytes
offset of obj_1.filler:     9 bytes
offset of obj_1.p1:    16 bytes
offset of obj_1.p2:    24 bytes
offset of obj_1.p3:    32 bytes

offset of obj_2.a:     0 bytes
offset of obj_2.b:     4 bytes
offset of obj_2.c:     8 bytes
offset of obj_2.p1:    16 bytes
offset of obj_2.p2:    24 bytes
offset of obj_2.p3:    32 bytes

sizeof(int) = 4 bytes
sizeof(long) = 8 bytes
sizeof(void*) = 8 bytes
</PRE>
Mon, 10 Dec 2007 07:25:04 +0000
Padding structures elsewhere https://lwn.net/Articles/261508/ pr1268
<p>By the way, in a prior life I programmed mainframes in COBOL where we used fixed-length records, which explains my choice of the identifier <font face="monospace">filler</font>. Filler padding gets interesting in COBOL when working with packed-decimal numbers (not to mention the joys of a <font face="monospace">S0C7</font> exception), but I digress...</p>
Sun, 09 Dec 2007 23:34:48 +0000
Here's an idea https://lwn.net/Articles/261506/ pr1268
<p>Here's an idea: Let's all go back to 8-bit architectures and we won't have this problem anymore. ;-)</p>
<p>Okay, that was my one dorky sarcastic comment for the day.</p>
<p>Seriously, I'm curious about what happens without programmer intervention. Recently I had to code for a <font face="monospace">struct</font> that looked like this (a similar example is given in the <a title="GCC packed structures" href="http://sig9.com/articles/gcc-packed-structures"><font face="monospace">packed</font> attribute</a> link in the article):</p>
<blockquote><pre>struct my_object
{
    uint32_t  a;
    char      c;
    char      filler[3];
    uint32_t* p1;
    char**    p2;
    char**    p3;
};</pre></blockquote>
<p>I'm using a 32-bit computer, so I know all pointers occupy 4 bytes. The deal is, the <font face="monospace">char filler[3]</font> array was not going to be used in any shape or form in my program, but I instinctively put it there to pad the whole structure to a multiple of 4 bytes. Would GCC have done that for me automatically if I had not included the <font face="monospace">char filler[3]</font>? 
Or would GCC have rearranged things had I moved the <font face="monospace">char filler[3]</font> to the bottom of the structure (leaving <font face="monospace">char c</font> where it is)? How does the <font face="monospace">-Os</font> optimization affect this? Thanks!</p>
Sun, 09 Dec 2007 23:27:11 +0000
Memory access and alignment https://lwn.net/Articles/261502/ oak
<div class="FormattedComment"><pre>
On some ARM hardware, some unaligned accesses may not raise an
exception; see "Unaligned memory access" here:
<a href="http://www.wirelessnetdesignline.com/howto/199901786">http://www.wirelessnetdesignline.com/howto/199901786</a>
</pre></div>
Sun, 09 Dec 2007 18:28:26 +0000
Memory access and alignment https://lwn.net/Articles/261269/ dsd
<div class="FormattedComment"><pre>
A slightly tweaked version of the document has now been submitted for
inclusion with the kernel documentation. You can read it here:
<a href="http://article.gmane.org/gmane.linux.kernel/609571">http://article.gmane.org/gmane.linux.kernel/609571</a>
</pre></div>
Thu, 06 Dec 2007 21:12:44 +0000
Memory access and alignment https://lwn.net/Articles/261114/ cventers
<div class="FormattedComment"><pre>
One place where alignment does matter on x86 is in SMP. As noted by the
glibc documentation, aligned word-sized reads and writes are atomic on
all known POSIX platforms. If you respect memory visibility issues, there
are certain ways you can exploit this fact to avoid the overhead of
locks. In fact, if you look, the kernel's atomic_t type is pretty
straightforward on most platforms, especially the simple read and store
operations. The only requirement is alignment, and then it is atomic for
free.
</pre></div>
Thu, 06 Dec 2007 05:00:48 +0000
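<P>The lock-free idea in the last comment above can be sketched roughly as follows. This is only an illustration, assuming a C11 compiler that provides <TT>&lt;stdatomic.h&gt;</TT>; it is not the kernel's <TT>atomic_t</TT> code. The names <TT>ready_flag</TT>, <TT>is_naturally_aligned</TT>, <TT>publish</TT> and <TT>observe</TT> are hypothetical, and the release/acquire ordering is one possible way to handle the "memory visibility issues" the comment mentions.</P>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared word; the name is illustrative only. */
static _Atomic uint32_t ready_flag;

/* Nonzero when p lies on a multiple of size, i.e. is naturally aligned. */
static int is_naturally_aligned(const void *p, size_t size)
{
    return ((uintptr_t)p % size) == 0;
}

/* Writer side: a single aligned word-sized store, no lock taken.
   Release ordering makes earlier writes visible to an acquiring reader. */
static void publish(uint32_t value)
{
    assert(is_naturally_aligned((const void *)&ready_flag, sizeof ready_flag));
    atomic_store_explicit(&ready_flag, value, memory_order_release);
}

/* Reader side: a single aligned word-sized load, paired with the store. */
static uint32_t observe(void)
{
    return atomic_load_explicit(&ready_flag, memory_order_acquire);
}
```

<P>In a two-thread program, the writer would call <TT>publish(1)</TT> after filling in its data and the reader would poll <TT>observe()</TT>. Whether plain (non-<TT>_Atomic</TT>) aligned accesses give the same guarantee depends on the platform and the compiler, so treat that part as an assumption to verify.</P>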