KHB: Real-world disk failure rates: temperature

Posted Jun 18, 2007 3:52 UTC (Mon) by giraffedata (subscriber, #1954)
In reply to: KHB: Real-world disk failure rates: temperature by vaurora
Parent article: KHB: Real-world disk failure rates: surprises, surprises, and more surprises

That puts the question of temperature/failure correlation in a whole different light. The correlation people expect has to do with the idea that if you put a disk drive in a hot environment, it will die sooner than if you put it in a cool environment.

But my guess is that the outside temperature and air flow around the drives don't vary among Google's sample, so any temperature difference inside the drive is due to the design of the disk drive. IOW, the correlation shows that the designs that run cooler are also the ones that fail more.
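To illustrate that guess, here is a minimal sketch with made-up numbers (none of them come from the Google data). It assumes two hypothetical drive designs, "cool" and "warm", where temperature has no effect on failure within either design, yet the pooled sample still shows a negative temperature/failure correlation simply because the cooler design fails more often.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical "cool" design: half the drives fail, regardless of temperature.
cool_temps = [30, 30, 32, 32]
cool_failed = [1, 0, 1, 0]

# Hypothetical "warm" design: a quarter of the drives fail, regardless of temperature.
warm_temps = [40, 40, 40, 40, 42, 42, 42, 42]
warm_failed = [1, 0, 0, 0, 1, 0, 0, 0]

within_cool = pearson(cool_temps, cool_failed)
within_warm = pearson(warm_temps, warm_failed)
pooled = pearson(cool_temps + warm_temps, cool_failed + warm_failed)

print("within cool design:", within_cool)
print("within warm design:", within_warm)
print("pooled sample:     ", pooled)
```

Within each design the correlation is zero, while the pooled figure is negative. That is only what the numbers were constructed to show; whether the real data behaves this way is exactly the open question.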

Considering that engineers design and test for failure rates, i.e. the failure rate is the independent variable, I would not expect a drive to fail more just because it runs hotter. The engineers would have designed it to run that hot.

But I might be convinced that in their struggle to make a drive consume less power, and thus run cooler, the engineers sacrificed longevity. (I don't know enough about disk drive design to know how such a tradeoff would be made, but I'm sure there's a way.) That could explain the negative correlation.



KHB: Real-world disk failure rates: temperature

Posted Jun 20, 2007 2:09 UTC (Wed) by mhelsley (subscriber, #11324)

Considering the energy needed to manufacture the hard drives and their lower lifetimes, I wonder if it truly results in net energy savings.

KHB: Real-world disk failure rates: temperature

Posted Jun 20, 2007 16:40 UTC (Wed) by giraffedata (subscriber, #1954)

> Considering the energy needed to manufacture the hard drives and their lower lifetimes I wonder if it truly results in net energy savings.

That's an insightful question, but as the true goal is to save resources in general, and not just energy (or fossil fuels and clean and clear air), the right question is actually a little simpler: Does the money a data center saves on electricity make up for the cost of more frequent replacements? The amount of energy used in manufacturing a drive is reflected in its price.
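As a rough sketch of that comparison, the snippet below adds up a year of electricity and the expected replacement cost for one drive. Every figure in it (wattage, electricity price, drive price, annual failure rates) is a hypothetical placeholder, not a number from either study, and the function name `annual_cost` is just illustrative.

```python
HOURS_PER_YEAR = 24 * 365

def annual_cost(power_watts, price_per_kwh, drive_price, annual_failure_rate):
    """Expected yearly cost of one drive: electricity plus expected replacements."""
    energy_cost = power_watts / 1000 * HOURS_PER_YEAR * price_per_kwh
    replacement_cost = drive_price * annual_failure_rate
    return energy_cost + replacement_cost

# Hypothetical low-power drive that fails twice as often.
efficient = annual_cost(power_watts=8, price_per_kwh=0.10,
                        drive_price=100, annual_failure_rate=0.06)

# Hypothetical higher-power drive with the lower failure rate.
hungry = annual_cost(power_watts=12, price_per_kwh=0.10,
                     drive_price=100, annual_failure_rate=0.03)

print(f"low-power drive:  {efficient:.2f} per year")
print(f"high-power drive: {hungry:.2f} per year")
```

With these particular placeholders the low-power drive comes out slightly ahead, but the result flips easily if the drive price or the failure-rate gap is larger. The sketch also ignores cooling, labor, and the cost of lost data.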

Of course, we're only speculating at this point that there is any correlation between energy efficiency and failure rate. The one really useful result of these two studies is that manufacturers don't know what their drive lifetimes are, so users have been making buying decisions based on wrong numbers.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds