LWN: Comments on "What every programmer should know about memory, Part 1" https://lwn.net/Articles/250967/ This is a special feed containing comments posted to the individual LWN article titled "What every programmer should know about memory, Part 1". en-us Fri, 29 Aug 2025 18:07:06 +0000 Fri, 29 Aug 2025 18:07:06 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net What every programmer should know about memory, Part 1 https://lwn.net/Articles/753055/ https://lwn.net/Articles/753055/ farnz <p>Nobody has updated this article because, bar a few details, not a lot has changed. FSB is diagrams 2.1 and 2.2, while QPI/UPI is diagram 2.3. All that's changed is which systems fall into which diagram. <p>The same applies to the discussion of DRAM access details - while the numbers have changed, the differences are minor; DDR4 is a change from DDR3 in the same way that DDR3 is a change from DDR2, and FB-DRAM is now nearly gone from the market. <p>However, beyond these details, the underlying technology remains the same as it was back in 2007. The same applies to later parts (caches etc) - the numbers are changed, but the technology and its behaviour are not significantly different. Sun, 29 Apr 2018 12:59:35 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/753004/ https://lwn.net/Articles/753004/ quocbao <div class="FormattedComment"> I think there is a reason for that, because nobody has done it in the more than ten years since this article was published. 
By the way, Wikipedia already has a good article about QPI.<br> </div> Sat, 28 Apr 2018 04:05:04 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/752895/ https://lwn.net/Articles/752895/ neilbrown <div class="FormattedComment"> Maybe you could be that someone?<br> <p> </div> Thu, 26 Apr 2018 21:37:47 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/752889/ https://lwn.net/Articles/752889/ quocbao <div class="FormattedComment"> I hope someone will update this informative article, because many things have changed since it was written; for example, the FSB has been replaced by QPI/UPI links. <br> </div> Thu, 26 Apr 2018 20:11:35 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/739519/ https://lwn.net/Articles/739519/ cncwebworld <div class="FormattedComment"> Very well compiled article<br> <p> </div> Sat, 18 Nov 2017 11:23:57 +0000 Still very useful and informative https://lwn.net/Articles/638981/ https://lwn.net/Articles/638981/ zenk <div class="FormattedComment"> I am surprised that, almost 8 years on, this article is still very useful and informative.<br> Nowadays performance is almost always related to memory performance, so the information and rationale are even more useful.<br> Thank you Ulrich and LWN!<br> </div> Fri, 03 Apr 2015 07:17:59 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/579273/ https://lwn.net/Articles/579273/ RohitS5 These are the kind of details that make the difference between a good programmer and an average programmer. I am surprised to see how much more there is to learn in this space over time: memory, threading, processing etc. <a rel="nofollow" href="http://javarevisited.blogspot.com">Thank you</a> Mon, 06 Jan 2014 10:26:43 +0000 100MHz × 64bit × 2 = 1,600MB/s ? 
https://lwn.net/Articles/260837/ https://lwn.net/Articles/260837/ sgifford <div class="FormattedComment"><pre> I was confused by the same thing; throughout section 2.2.4 it wasn't clear to me whether MB/s meant megabytes/second or megabits/second. Usually this abbreviation means megabytes, but the text implied that it was megabits. Clearing this up would make that section much... err... clearer. Even with that, a great article! Thanks! </pre></div> Tue, 04 Dec 2007 04:52:47 +0000 The tools used https://lwn.net/Articles/259080/ https://lwn.net/Articles/259080/ Ford_Prefect <div class="FormattedComment"><pre> Is the script publishable? Might save a lot of people a lot of trouble. </pre></div> Mon, 19 Nov 2007 07:46:07 +0000 Hooray! https://lwn.net/Articles/258054/ https://lwn.net/Articles/258054/ zooko <div class="FormattedComment"><pre> Argh -- I shouldn't post to LWN while sleepy. While lecturing people about the value of using precise terminology, I accidentally wrote "gigs" when I meant "teras". If it had been gigs, the people in the example would have been only 7.5% off. Sorry about that. </pre></div> Sun, 11 Nov 2007 06:28:37 +0000 Hooray! https://lwn.net/Articles/258053/ https://lwn.net/Articles/258053/ zooko <div class="FormattedComment"><pre> When we dealt with numbers in the thousands (10^3), approximating a kilo as 2^10 was only 2.5% off. Now that we routinely deal with numbers in the billions (10^9), approximating a giga as 2^30 is 7.5% off. Some of us already deal with numbers in the trillions (10^12), and approximating a tera as 2^40 is a full 10% off! Now if you do binary arithmetic in your head, so that when you see 14,463,188,475,466, you instantly know that it is 13.2 * 2^40, then this comment doesn't apply to you. But you don't. When you see "14,463,188,475,466" you approximate it in your head as "14.5 gigs". 
If you tell someone else that you are looking at 14.5 gigs, and they think that you mean 14.5 2^40's, then they are overestimating the number you are looking at by more than 10%! See also: <a href="http://en.wikipedia.org/wiki/SI_prefix">http://en.wikipedia.org/wiki/SI_prefix</a> A "kilo" has meant 10^3 to the scientific world since 1795. A "tera" has meant 10^12 since 1960. Programmers' use of units is eventually going to have to become compatible with the larger scientific world, not least because the numbers we deal with are getting bigger. </pre></div> Sun, 11 Nov 2007 06:24:02 +0000 This is great! https://lwn.net/Articles/254060/ https://lwn.net/Articles/254060/ nix Warning: Ulrich has no patience at all with people who don't do their <br> homework (by, say, typing in `ulrich drepper home page' in Google).<br> <p> It's &lt;<a href="http://people.redhat.com/drepper/">http://people.redhat.com/drepper/</a>&gt;.<br> Thu, 11 Oct 2007 21:55:08 +0000 This is great! https://lwn.net/Articles/253911/ https://lwn.net/Articles/253911/ vaib Can you please tell me the homepage of the author?<br> Thu, 11 Oct 2007 04:04:11 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/253899/ https://lwn.net/Articles/253899/ wookey You have missed the bit that the '1.066GHz' bus is 'quad-pumped' so isn't really 1GHz at all: it is a quarter of that. Hence about 11:1 rather than about 3:1 ratio between clock speeds. I just learned this from the above article (I had been taken in by marketers before and assumed that FSB speeds were real :-)<br> Wed, 10 Oct 2007 22:22:00 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/253854/ https://lwn.net/Articles/253854/ DonDiego <blockquote>An Intel Core 2 processor running at 2.933GHz and a 1.066GHz FSB have a clock ratio of 11:1 (note: the 1.066GHz bus is quad-pumped). Each stall of one cycle on the memory bus means a stall of 11 cycles for the processor.</blockquote> 11:1? 
I thought it was ~3:1, what have I missed there? It does not look like a typo... Wed, 10 Oct 2007 15:34:29 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/253417/ https://lwn.net/Articles/253417/ njs For the diagrams, see:<br> <a href="http://udrepper.livejournal.com/12663.html">http://udrepper.livejournal.com/12663.html</a><br> <a href="http://udrepper.livejournal.com/12840.html">http://udrepper.livejournal.com/12840.html</a><br> Fri, 05 Oct 2007 19:43:45 +0000 The tools used https://lwn.net/Articles/253368/ https://lwn.net/Articles/253368/ corbet All done in LaTeX and metapost. Conversion to HTML was done by a script I wrote after I gave up on all the more general LaTeX-&gt;HTML tools out there. Fri, 05 Oct 2007 15:51:25 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/253360/ https://lwn.net/Articles/253360/ edmcman A wonderful and professional article!<br> <p> As a side note, does anyone know what this was written in, and perhaps what the diagrams were created in?<br> Fri, 05 Oct 2007 15:33:40 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/253317/ https://lwn.net/Articles/253317/ jschrod Well, it depends where you study. At the TU Darmstadt, Germany, this stuff was part of our undergrad Computer Science courses, back in 1981ff. (I don't know the current curricula, though.)<br> Fri, 05 Oct 2007 11:18:16 +0000 FB-DIMM pins https://lwn.net/Articles/253214/ https://lwn.net/Articles/253214/ anton Actually only the interface at the memory controller is 69 pins (allowing more channels from one memory controller chip). The FB-DIMM needs these 69 pins, plus 69 pins to talk to the next FB-DIMM, plus additional pins for power and ground; that's why they have the familiar 240-pin form factor. 
Thu, 04 Oct 2007 20:16:27 +0000 Hyperthreading performance https://lwn.net/Articles/253208/ https://lwn.net/Articles/253208/ anton <blockquote>But I think the usual reason for hyperthreading slowdown is just the overhead of switching threads. </blockquote> In SMT (and that includes hyperthreading), there is no thread switching overhead. The execution core just executes instructions from different contexts at the same time (but in different resources). <p>I don't know why the Pentium 4 variant of SMT performs as badly as it does; cache thrashing may contribute, but I don't think that this is the main reason. The main reasons are probably some obscure microarchitectural details, maybe the <a href="http://en.wikipedia.org/wiki/Replay_system">replay system</a>, maybe something else. Thu, 04 Oct 2007 20:10:22 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/253152/ https://lwn.net/Articles/253152/ tjrtech Great article and good review. I learned all this as a formally educated computer engineer. This shows why computer engineers write faster code than comp sci or informally trained coders.<br> Thu, 04 Oct 2007 16:44:19 +0000 100MHz × 64bit × 2 = 1,600MB/s ? https://lwn.net/Articles/253026/ https://lwn.net/Articles/253026/ pdfan 100MHz × 64bit × 2 = 1,600MB/s<br> <p> bash-3.2# bc<br> bc 1.06<br> Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.<br> This is free software with ABSOLUTELY NO WARRANTY.<br> For details type `warranty'.<br> 100 * 64 * 2<br> 12800<br> <p> 100 * 64 * 2 / 8<br> 1600<br> quit<br> bash-3.2#<br> <p> Thu, 04 Oct 2007 06:15:02 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/252817/ https://lwn.net/Articles/252817/ wyrdwright Excellent article; clear and concise without sacrificing too much detail. 
Looking forward to the rest.<br> Wed, 03 Oct 2007 14:19:44 +0000 Reader Comments https://lwn.net/Articles/252586/ https://lwn.net/Articles/252586/ roelofs Two more clarification-comments: <P> <FONT COLOR="#880044"><I>Recent RAM types require two separate buses (or channels as they are called for DDR2, see Figure 2.8) which doubles the available bandwidth.</I></FONT> <P> Unless I'm missing something fundamental, Figure 2.8 has nothing to do with DDR2 channels. Indeed, I don't believe the comment even refers to Figures 2.12 or 2.13; I see nothing relevant. Perhaps the figure in question was dropped at some point? <P> <FONT COLOR="#880044"><I>In this example the SDRAM spits out one word per cycle.</I></FONT> <P> Here and in several other places, the text is ambiguous. "Cycle" in this context apparently means clock cycle, but there's an implicit (larger) cycle measured from RAS to RAS (for example) that defines the overall throughput. Figure 2.8 actually shows four words going out in that larger cycle. <P> Greg Mon, 01 Oct 2007 22:22:32 +0000 Grammar correction https://lwn.net/Articles/252415/ https://lwn.net/Articles/252415/ valankar "Implementing this is trivial: one only has the use the same column address for two DRAM cells and access them in parallel."<br> <p> should be:<br> <p> "Implementing this is trivial: one only has to use the same column address for two DRAM cells and access them in parallel."<br> Mon, 01 Oct 2007 03:47:14 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/252389/ https://lwn.net/Articles/252389/ k8to Agreed. 
But knowing how a DRAM cell is implemented is more than one level of abstraction below low-level programming, and that is more than one level of abstraction below what "every" programmer will ever deal with.<br> Sun, 30 Sep 2007 20:27:02 +0000 on-board video cards https://lwn.net/Articles/252271/ https://lwn.net/Articles/252271/ foom <blockquote style="margin:0; padding-left:10px; border-left:1px dotted #336699; color:#336699;">But I guess DVI designers decided the computer wants to update the picture by a full raster scan 60 times a second anyway, so there's no need for internal refresh. Doing a little reading just now, it looks like the DVI data stream is a simple raster scan. It even apparently has "blanking intervals," though they couldn't possibly be for the same purpose as on a CRT.</blockquote> <p> DVI's timing and blanking intervals are the same as VGA's. I believe it was designed this way to make the modification to the video cards easier, and to facilitate dual-output DVI / VGA video cards. (so the VGA port is basically just the DVI port with an extra D2A converter in the path.) Sat, 29 Sep 2007 23:06:05 +0000 on-board video cards https://lwn.net/Articles/252259/ https://lwn.net/Articles/252259/ giraffedata <p> It's not the physics, but the modernness that I think makes the refresh not necessary for LCDs at the level it is for CRTs: If I were designing a monitor in the 1990s out of parts that need to be refreshed (even a CRT), I would put the required refresh function inside, rather than pass the responsibility off to the computer. In SVGA days, though, it probably made sense to keep the monitor dumb. <p> But I guess DVI designers decided the computer wants to update the picture by a full raster scan 60 times a second anyway, so there's no need for internal refresh. Doing a little reading just now, it looks like the DVI data stream is a simple raster scan. It even apparently has "blanking intervals," though they couldn't possibly be for the same purpose as on a CRT. 
Sat, 29 Sep 2007 21:49:38 +0000 Good article (so far) https://lwn.net/Articles/252091/ https://lwn.net/Articles/252091/ filker0 The article here is primarily about x86 type systems, assumes (I think) 64 bit multi-core CPUs, and also assumes a general computing environment. Extending this to other architectures might make it more useful (not that it's not a good overview so far; I learned a few things, and it's only 1/7th of the way through) to the folks this will matter the most to -- the embedded Linux programmer. Embedded programmers have more control over their environment than typical user-space programmers, and often need to tweak things to get rid of every wasted cycle possible.<br> <p> Knowing better how the memory works, how it's connected to the rest of the system, and how software can be written to take this into account can lead to better performance. Applying this at the kernel level when organizing kernel data structures and code, as well as in the design of service code (DMA, paging, interrupt handlers, data streams/pipes, IPC, etc.), could lead to better system performance.<br> <p> Thanks for the article. I look forward to the rest of the parts.<br> Fri, 28 Sep 2007 19:51:51 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/252058/ https://lwn.net/Articles/252058/ pm101 Personally, I think it is useful to have a reasonable knowledge one level of abstraction down, and one level of abstraction up. There are a number of reasons for this: <ul> <li> You can often find optimizations that cross abstraction barriers. It is difficult to predict what you'll need to know to do this, so you really need to have a deep understanding of both layers. In some cases, you can also influence the hardware design. <li> You can predict how the technology may evolve. <li> You gain intellectual depth. </ul> This is especially important for systems programmers -- the target of this article. 
If I'm designing a kernel, or a virtual machine (as in JVM, or .net runtime), or a high-performance systems library, I want to design it in such a way that it can take advantage of possible future underlying technologies. <p> Indeed, in many cases, I may even be able to influence underlying technologies. If I am aware of the circuit requirements of memory refresh, I can design code that explicitly leaves time for the refresh, while giving good bandwidth and latency when the memory is actually accessed. If something is a good idea, and a major OS or runtime can take advantage of it, you can bet that hardware designers somewhere will add support for it. <p> The major reason most CPUs only have 2-4 cores today, and didn't have multiple cores a while ago, was that software could not take advantage of them. Right now, optimum performance comes from about 64 cores at 700MHz each (the Tilera processor), but it can only be used in esoteric applications because software designers a decade ago were not aware of where the hardware is headed, and did not design applications, languages, or run-times in a parallelism-friendly way (programmer-friendly parallelism is only starting to happen today with languages like Fortress). Fri, 28 Sep 2007 15:56:56 +0000 on-board video cards https://lwn.net/Articles/252042/ https://lwn.net/Articles/252042/ pm101 Are you sure? <br> <p> My impression was that active matrix LCDs worked a lot like DRAM. They had a transistor and a capacitor for each pixel (the transistor was added in the move from passive to active matrix), but the voltage on that capacitor decayed and needed to be periodically refreshed. I was unaware of LCD displays having any on-board memory from which to do the refresh, but that could have been added while I wasn't following the market, although I'd be surprised, since it seems like it'd be an unnecessary cost item. 
<br> Fri, 28 Sep 2007 14:41:51 +0000 Another grammar fix https://lwn.net/Articles/251961/ https://lwn.net/Articles/251961/ rmunn <p>I spotted another grammar oops. In section 2 ("Commodity Hardware Today"), seventh paragraph (the one immediately after the first bullet-point list), the last sentence reads: "This problem, therefore, <i>must to be</i> taken into account." That should be either "<i>needs to be</i> taken into account" or "<i>must be</i> taken into account."</p> <p>This is, of course, an artifact of editing the paper, where "needs to be" was changed into "must be" at some point but the leftover "to" was missed.</p> Thu, 27 Sep 2007 21:41:06 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/251948/ https://lwn.net/Articles/251948/ Unleashed Hey, this article and the likes you can find here just made me finally decide to subscribe! Bravo Ulrich!<br> Thu, 27 Sep 2007 20:06:41 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/251940/ https://lwn.net/Articles/251940/ jvestby Excellent stuff.<br> <p> I believe I have found a small typo in the formula just above figure 2.12.<br> Something like 133MHz is needed to get 1600. <br> Thu, 27 Sep 2007 19:36:32 +0000 Hooray! https://lwn.net/Articles/251879/ https://lwn.net/Articles/251879/ dwheeler I'm delighted to see this series, thanks for running it. It's always frustrated me that people who develop software often have no clue what's going on underneath, and as a result write hideous code. E.G., yes, it <i>does</i> matter what order you access matrices in. I presume this series will eventually get there. <p> Also: let me say that I <i>like</i> the SI binary prefixes (GiByte, etc.); when computer memories were 48K, the difference between the binary and decimal prefixes didn't matter much, but as everything is getting bigger/faster, the differences have been getting bigger too. 
When you're being imprecise, it doesn't matter, but when you want to be precise (e.g., when describing product specs or presenting a diagnostic report), I find them REALLY helpful. In some circumstances it's also the law: claiming your product does something, but not actually meeting your claims (because you used the wrong prefix) can actually get you hauled into court. There's a much bigger world beyond computing, and they already know what "Giga" means; it's 10^9. Thu, 27 Sep 2007 14:42:56 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/251811/ https://lwn.net/Articles/251811/ tyhik "... but none of them need be aware of the circuit design. None."<br> <p> There are programmers out there who write memory controller configuration code for boot loaders. I have done it, and knowing the electrical design of memory cells really helped to answer simple questions like why the hell DRAM needs a configurable controller while the on-chip SRAM is nicely ready for use right after powerup.<br> <p> <p> Thu, 27 Sep 2007 08:09:54 +0000 Reader Comments https://lwn.net/Articles/251767/ https://lwn.net/Articles/251767/ drepper I appreciate (most of) the comments and have actually already made a few changes based on them to clarify a few things (and correct typos etc).<br> <p> But I'm not going to reply to anything specific here and now. This is just section 2 (with 1 only being an introduction). Some of what has been discussed in comments goes far beyond what is in these sections. 
Once you've read section 6 you probably have a better understanding about what is covered and what isn't (and to some extent: why certain things are covered in the first place).<br> <p> So, don't regard my silence as a sign of disinterest, it just means that many questions will automatically be answered later.<br> Wed, 26 Sep 2007 23:25:01 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/251750/ https://lwn.net/Articles/251750/ k8to I cannot help but be a sourpuss and say: almost no programmer needs to know this about memory.<br> <p> Some architects on software that needs good optimization should probably be acquainted with the performance characteristics discussed, but none of them need be aware of the circuit design. None.<br> Wed, 26 Sep 2007 21:22:50 +0000 What every programmer should know about memory, Part 1 https://lwn.net/Articles/251735/ https://lwn.net/Articles/251735/ dankamongmen Ulrich's amazing, and the main source of my understanding of the modern glibc / linux-userspace API. Thanks again for such excellent code and attending documentation!<br> Wed, 26 Sep 2007 20:18:45 +0000
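<p> Three pieces of arithmetic come up repeatedly in the comments in this feed: the 100MHz × 64bit × 2 = 1,600MB/s formula, the 11:1 versus ~3:1 clock ratio, and the gap between binary and decimal prefixes. A minimal Python sketch for checking them follows. The function names are hypothetical and chosen only for illustration; the input figures are the ones quoted in the comments, and the divisor of 4 for the quad-pumped bus is an assumption taken from the quoted note that the 1.066GHz FSB is quad-pumped.

```python
def bandwidth_mb_per_s(bus_mhz, width_bits, transfers_per_cycle):
    """Peak transfer rate in decimal megabytes per second.

    bus_mhz * width_bits * transfers_per_cycle is megabits per second;
    dividing by 8 converts bits to bytes.
    """
    megabits_per_s = bus_mhz * width_bits * transfers_per_cycle
    return megabits_per_s / 8


def clock_ratio(cpu_mhz, fsb_effective_mhz, pump_factor):
    """Ratio of the CPU clock to the real (un-pumped) bus clock."""
    real_bus_mhz = fsb_effective_mhz / pump_factor
    return cpu_mhz / real_bus_mhz


def binary_prefix_excess(power_of_two, power_of_ten):
    """Fraction by which 2**power_of_two exceeds 10**power_of_ten."""
    return 2 ** power_of_two / 10 ** power_of_ten - 1


if __name__ == "__main__":
    # The formula questioned in the "1,600MB/s ?" comments: MB here is megabytes.
    print(bandwidth_mb_per_s(100, 64, 2))            # 1600.0

    # Quad-pumped bus: the real clock is a quarter of the advertised 1066MHz.
    print(round(clock_ratio(2933, 1066, 4)))         # 11
    # Treating the advertised figure as the real clock gives the ~3:1 reading.
    print(round(clock_ratio(2933, 1066, 1)))         # 3

    # Kilo, giga and tera: how far the power of two overshoots the SI value.
    for p2, p10 in ((10, 3), (30, 9), (40, 12)):
        print(p2, p10, round(binary_prefix_excess(p2, p10) * 100, 1))
```

Under these assumptions the sketch agrees with the bc session posted above (12,800 megabits/s is 1,600 megabytes/s) and with the quad-pumped explanation of the 11:1 ratio. The prefix gap comes out at roughly 2.4%, 7.4% and 10%, close to the rounded figures given in the prefix discussion.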