Monitor disks with the S.M.A.R.T. monitoring tools
Posted Mar 14, 2008 21:45 UTC (Fri) by giraffedata (subscriber, #1954)
In reply to: Monitor disks with the S.M.A.R.T. monitoring tools by NRArnot
Parent article: Monitor disks with the S.M.A.R.T. monitoring tools

"However, the value of the data is usually much greater than the cost of the disk, so it's quite an easy decision."

I don't think that's true. Often, the data is relatively unimportant, like a Google web page cache or a small part of a stream of undifferentiated experimental data. The rest of the time, the data is easily reconstructable, e.g. by copying from a mirror disk or backup tape.

People set up storage systems so that the value of preserving the data is commensurate with the cost of preserving it. If you perturb that system by replacing drives more often based on SMART data, I think you'll have a net loss. On the other hand, if you could exploit SMART data so as to get the same reliability with fewer redundant copies, that would be a win.

Either the Google paper or another that came out around the same time concluded that the best policy was to wait for a drive to fail, then replace it.

"One surprise: keeping disks cooled under 30C *reduces* life expectancy"

Only if you want to jump to conclusions; the study didn't actually isolate the cooling policy. It merely showed that drives that failed tended to be the ones that were cooler. That's a long way from saying that if you speed up the fans, the disks will fail more. Just as likely is that the cool drives were of models where the engineers traded durability for low power consumption. Remember, the one great, consistent, fully controlled correlation these studies show is between failure rate and model.
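
To make the confounding argument concrete, here is a minimal sketch in Python. Every number in it is invented purely for illustration and comes from no study; the model names "low_power" and "standard" are hypothetical too. It assumes one model runs cool and fails often, the other runs warm and fails rarely, and that temperature has no effect at all within either model.

```python
# Hypothetical fleet, invented for illustration only (not data from any study).
# Key: (model, temperature band) -> (number of drives, number of failures)
FLEET = {
    ("low_power", "cool"): (900, 90),   # 10% fail
    ("low_power", "warm"): (100, 10),   # 10% fail
    ("standard", "cool"): (100, 2),     # 2% fail
    ("standard", "warm"): (900, 18),    # 2% fail
}


def failure_rate(rows):
    """Failures divided by drives, summed over (drives, failures) pairs."""
    drives = sum(d for d, _ in rows)
    failures = sum(f for _, f in rows)
    return failures / drives


def rate_by_temperature(fleet, band):
    """Failure rate of all drives in one temperature band, ignoring model."""
    return failure_rate([v for (_, b), v in fleet.items() if b == band])


def rate_by_model_and_temperature(fleet, model, band):
    """Failure rate for one model within one temperature band."""
    return failure_rate([fleet[(model, band)]])


if __name__ == "__main__":
    for band in ("cool", "warm"):
        print(f"all models, {band}: {rate_by_temperature(FLEET, band):.3f}")
    for model in ("low_power", "standard"):
        for band in ("cool", "warm"):
            rate = rate_by_model_and_temperature(FLEET, model, band)
            print(f"{model}, {band}: {rate:.3f}")
```

With these made-up counts, the pooled figures make cool drives look more than three times as likely to fail as warm ones, yet within each model the rate is identical at both temperatures. The whole apparent temperature effect comes from which model happens to run cool.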

Monitor disks with the S.M.A.R.T. monitoring tools
Posted Mar 20, 2008 5:01 UTC (Thu) by roelofs (subscriber, #2599)

"Either the Google paper or another that came out around the same time concluded that the best policy was to wait for a drive to fail, then replace it."

...for some definition of "fail." Keep in mind that performance drops, sometimes significantly, before unrecoverable data loss occurs.

Greg
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.