
KS2009: How Google uses Linux

Posted Oct 21, 2009 9:13 UTC (Wed) by sdalley (subscriber, #18550)
Parent article: KS2009: How Google uses Linux

> Mike noted that it is increasingly hard to buy memory which actually works, especially if you want to go cheap.

Wow. Who actually wants to buy memory that doesn't work, or flips random bits on an off-day?

I'd prefer my code, documents and calculations not to be twiddled with behind my back, thank you.

Reminds me of a new, improved version of an HP text editor I used once, backalong. We were maintaining a large Fortran program which started misbehaving in odd ways, then stopped working altogether. It turned out that each time you saved the file, the editor would lose a random line or two.



KS2009: How Google uses Linux

Posted Oct 21, 2009 9:37 UTC (Wed) by crlf (guest, #25122) (2 responses)

> Wow. Who actually wants to buy memory that doesn't work, or flips random bits on an off-day?

The issue is one of probability and large numbers. Memory errors are already common today, and the continued increase in density will not help matters tomorrow.
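To put a rough sketch on "large numbers" (my own illustrative figures, not from the paper, and assuming machines fail independently): even a modest per-machine rate becomes a near-certainty of some error somewhere in a fleet.

    # Sketch: probability that at least one machine in a fleet sees a
    # memory error in a year. The per-machine rate p is an assumed,
    # illustrative value; failures are assumed independent.
    p = 1.0 / 3.0
    for n in (1, 10, 100, 1000):
        p_any = 1 - (1 - p) ** n
        print(f"{n:>5} machines: P(at least one error) = {p_any:.4f}")

At 10 machines you are already at about 98%; at 100 it is effectively certain.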

KS2009: How Google uses Linux

Posted Oct 21, 2009 12:28 UTC (Wed) by sdalley (subscriber, #18550)

Thank you, very interesting paper.

So, roughly a one-in-three chance per annum of suffering a memory error on a given machine.

With ECC memory, which Google uses as standard, 19 out of 20 of these errors will be transparently corrected.

With non-ECC memory, as in commodity PCs, stiff biscuit every time.
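As a back-of-the-envelope check of those two figures (treating them as independent and uniform, which glosses over the paper's per-DIMM breakdown):

    # Combine the one-in-three annual error chance with the 19-in-20
    # ECC correction rate quoted above. Treating the two figures as
    # independent is a simplification on my part.
    p_error = 1.0 / 3.0         # chance of a memory error per machine-year
    p_escapes_ecc = 1.0 / 20.0  # fraction of errors ECC cannot correct
    p_uncorrectable = p_error * p_escapes_ecc
    print(f"uncorrectable errors per machine-year: {p_uncorrectable:.4f}")
    print(f"roughly one per {1 / p_uncorrectable:.0f} machine-years")

That works out to about one uncorrectable error per sixty machine-years with ECC; without it, all one-in-three land on you.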

KS2009: How Google uses Linux

Posted Oct 21, 2009 21:08 UTC (Wed) by maney (subscriber, #12630)

You imply that denser chips will cause higher error rates, but that is not what they found:

We studied the effect of chip sizes on correctable and uncorrectable errors, controlling for capacity, platform (dimm technology), and age. The results are mixed. When two chip configurations were available within the same platform, capacity and manufacturer, we sometimes observed an increase in average correctable error rates and sometimes a decrease.

There were other, also mixed, differences when comparing only memory module sizes, but that mixes together differences in chip density and number of chips on the module - and quite possibly chip width as well.

The best we can conclude therefore is that any chip size effect is unlikely to dominate error rates given that the trends are not consistent across various other confounders such as age and manufacturer.

Which, I think, summarizes decades of experience refuting various claims that ever-shrinking memory cells just had to lead to terrible error problems. I may still have an old Intel whitepaper on this from back in the days when chip sizes were measured in Kbits.

KS2009: How Google uses Linux

Posted Oct 21, 2009 12:30 UTC (Wed) by nye (subscriber, #51576) (1 response)

>Who actually wants to buy memory that doesn't work, or flips random bits on an off-day?

Anyone who wants to buy real memory that exists in the physical world, really.

KS2009: How Google uses Linux

Posted Nov 7, 2009 21:01 UTC (Sat) by jlin (guest, #61855)

I heard at a conference that Google actually goes to DRAM manufacturers and buys, in bulk, memory chips that failed QA, then assembles the RAM DIMM modules itself in order to take advantage of scale. For Google, the low price outweighs the trouble of validating the failed RAM chips for salvageable parts.

