Support for shingled magnetic recording devices
Posted Mar 27, 2014 18:10 UTC (Thu) by k3ninho (subscriber, #50375)
I recall reading this at Anand Tech:
Posted Mar 27, 2014 22:51 UTC (Thu) by giraffedata (subscriber, #1954)
It's 20%. You can get 25% more areal density, but after accounting for the extra data you have to store to manage it, you get 20% more data on a square millimeter, or on a device, or on a square meter of floor space. That's in an apples to apples comparison: two disk drives that differ only in that one is shingled and the other not.
That makes it hard to see how bending over backwards to make applications fit the requirements of a shingled disk (basically, sequential writing) is worth the cost.
If you could identify systems that exist today and happen to access existing disk drives in the required pattern, then maybe it would make sense to steer those systems toward shingled disks, but even then I suspect the administrative cost of splitting your storage applications into two camps would outweigh the 20% cost savings at the lower levels.
And if you could arrange to have a lot of sequential writing, unless you also have fast random reading, it would be hard to justify not using tape, for a 500% volume-per-dollar saving.
Posted Mar 27, 2014 22:55 UTC (Thu) by dlang (subscriber, #313)
Posted Mar 27, 2014 23:05 UTC (Thu) by giraffedata (subscriber, #1954)
shingled drives have the same speed random reading as non-shingled drives
Right, so if you can find an application where you have large sequential writing, but fast random reading, that might justify shingled disks. If instead, you have random writing, you'll want a non-shingled disk, and if instead you have large sequential reading, you'll want tape.
Posted Mar 28, 2014 5:12 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)
And you'd be amazed by the number of applications where you need fast access to immutable (or slowly changing) data. Even better, it's possible to use faster hard drives (or even SSDs) as a frontend for the slow shingled disks.
Posted Mar 28, 2014 23:07 UTC (Fri) by giraffedata (subscriber, #1954)
Many of the applications that could tolerate shingled disk could also tolerate tape.
Density per tape cartridge isn't really the point. Cost is the point, and tape is cheaper per terabyte than shingled disk. Part of the reason for that is the storage density per tape drive is far, far greater than for disk drives. Data rate per drive is much greater too.
I don't think I would be amazed at the number of applications that are appropriate for shingled disk, but I also know that there are a lot of applications that aren't, and there are significant storage management costs in using different kinds of disk drives for different kinds of data. I suspect one would need more than a 20% differential in per-terabyte cost to justify that.
Even better, it's possible to use faster hard drives (or even SSDs) as a frontend for the slow shingled disks.
That's the bending over backwards I was talking about that I doubt is worth it for a 20% improvement. Have we ever seen people make such a disruptive transition for 20%? Would people have gone from floppy disks to CD-ROM for 20%? Or CD-ROM to DVD? Would people even have gzipped tar files for only 20%?
Posted Mar 29, 2014 11:25 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
And shingled disks are not that bad, it's not a completely disruptive transition. Sure, it'll require some additional engineering for the front-end write-through caches but it's not a big deal compared to tape.
So think about it - would you build a tape library with expensive robots and lots of tape or would you just prefer to buy somewhat slower hard drives?
Posted Mar 29, 2014 19:15 UTC (Sat) by giraffedata (subscriber, #1954)
Well, tape is actually NOT cheaper unless you do it in a REALLY big way with tape libraries and robots.
Yes, that's what I was talking about. When I compare the economics of storage technologies, I think of large scale storage. With tape, there are thousands of cartridges and plenty of robots.
Even then it's only marginally cheaper than HDs.
So think about it - would you build a tape library with expensive robots and lots of tape or would you just prefer to buy somewhat slower hard drives?
I'm not sure what you're comparing here. Shingled disks aren't somewhat slower. Used right, they're the same speed as regular drives; used wrong, they're unusably slow. Since tape applications also work on shingled drives, the question would be, would you use something with 2 minute access time or just use slightly more expensive disk drives. Only since I'm claiming shingled drives are 4X more expensive than tape, that question is moot.
By the way, some of my data is on tape. My company backs up its general purpose filesystem to tape. It takes me 4 minutes to recover a lost file - 2 minutes to go through the interactive dialog and 2 minutes for the robots and tape drives to do their thing. Shingled disk would cut that to 2 minutes total. I can't imagine my company switching unless there is virtually no difference in the storage cost.
Posted Mar 29, 2014 20:32 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
LTO6 tapes are about $40 per tape (2.5TB) when bought in bulk. Maybe $30 if you are really big. Hard drives are around $50 per 2TB in bulk.
And that's without considering the cost of streamers (multiple $$$$), tape robots ($$$$$) and the storage software solution (shockingly, there are no good open source hierarchical storage managers).
Of course, HDDs need some kind of SAN, but they are cheap these days. AoE/iSCSI solution for 1000 drives capable of 100Gb throughput can be bought for just under $30k and doesn't need any fancy software.
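To put the quoted media prices on a common footing, here is a minimal sketch of the per-terabyte arithmetic. It uses only the bulk prices given in this comment and assumes the capacities are in terabytes; the function and variable names are made up for illustration, and drives, robots, SAN hardware and software are deliberately left out.

```python
# Media-only cost per terabyte, from the bulk prices quoted in the comment.
# Assumption: capacities are terabytes (2.5 for LTO6, 2 for the hard drive).
# Streamers, robots, SAN hardware and software are NOT included.

def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the media price in dollars per terabyte."""
    return price_usd / capacity_tb

lto6_bulk = cost_per_tb(40, 2.5)    # LTO6 cartridge bought in bulk
lto6_large = cost_per_tb(30, 2.5)   # LTO6 cartridge for a very large buyer
hdd_bulk = cost_per_tb(50, 2.0)     # 2TB hard drive bought in bulk

# How many times more the disk media costs per terabyte than the tape media.
hdd_vs_tape = hdd_bulk / lto6_bulk

print(f"LTO6 (bulk):       ${lto6_bulk:.2f}/TB")
print(f"LTO6 (very large): ${lto6_large:.2f}/TB")
print(f"HDD (bulk):        ${hdd_bulk:.2f}/TB")
print(f"HDD / LTO6 ratio:  {hdd_vs_tape:.2f}x")
```

On these figures alone the media gap is well under 2X, which is why the fixed costs of drives and robots matter so much to the comparison.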
Posted Mar 30, 2014 10:22 UTC (Sun) by khim (subscriber, #9252)
Shingled disks aren't somewhat slower. Used right, they're the same speed as regular drives; used wrong, they're unusably slow.
And when used with GFS or HDFS, which need random read access to their 64 megabyte chunks and kinda streamlined write access to these chunks (GFS gives you the ability to append-write files, HDFS does not even offer that today), they are fast.
Shingled disks are only on the horizon today, but an API which is basically custom-tailored for their limitations is more than a decade old, and there are thousands of companies and millions of drives which are used for such kinds of applications. End of story.
Posted Mar 30, 2014 16:34 UTC (Sun) by giraffedata (subscriber, #1954)
And when used with GFS or HDFS, which need random read access to their 64 megabyte chunks and kinda streamlined write access to these chunks (GFS gives you the ability to append-write files, HDFS does not even offer that today), they are fast.
I think those are two examples of things that would require re-engineering to work with shingled disks, because neither writes in log-structured fashion today. The only reason I can think of that an existing disk drive application would write in log-structured fashion (fill the drive, or a large segment of it, from beginning to end) is to maximize write speed. But GFS and HDFS assume there is little writing. HDFS is specifically aimed at fast sequential reading of large files, which means it needs to keep files contiguous on disk, which is not possible with a log-structured file system.
Posted Mar 31, 2014 10:36 UTC (Mon) by dlang (subscriber, #313)
I've seen a number of packages that keep their data in large chunks but don't write new data directly to those chunks. Instead, new data is written sequentially to an 'updates' file, and periodically some other job comes along, rewrites the large chunks to include the changes from the updates files, and then deletes the updates files.
Such systems would be perfect for shingled drives; they would just need their chunk sizes and alignments adjusted to match.
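A rough sketch of that pattern, not taken from any particular package: the class name, file names and JSON encoding below are all hypothetical, and a real system would size and align its chunks to the drive's zones rather than use one small file.

```python
import json
import os
import tempfile

class ChunkStore:
    """Hypothetical store: one large chunk file plus an append-only updates file."""

    def __init__(self, directory):
        self.chunk_path = os.path.join(directory, "chunk.json")
        self.updates_path = os.path.join(directory, "updates.log")

    def put(self, key, value):
        # New data is only ever appended, so the write pattern is sequential.
        with open(self.updates_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"k": key, "v": value}) + "\n")

    def _load_chunk(self):
        if not os.path.exists(self.chunk_path):
            return {}
        with open(self.chunk_path, encoding="utf-8") as f:
            return json.load(f)

    def _load_updates(self):
        updates = {}
        if os.path.exists(self.updates_path):
            with open(self.updates_path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    updates[record["k"]] = record["v"]  # later records win
        return updates

    def get(self, key):
        # Reads check pending updates first, then fall back to the big chunk.
        updates = self._load_updates()
        if key in updates:
            return updates[key]
        return self._load_chunk().get(key)

    def compact(self):
        # The periodic job: rewrite the whole chunk front to back with the
        # updates merged in, then delete the updates file.
        merged = self._load_chunk()
        merged.update(self._load_updates())
        new_path = self.chunk_path + ".new"
        with open(new_path, "w", encoding="utf-8") as f:
            json.dump(merged, f)
        os.replace(new_path, self.chunk_path)
        if os.path.exists(self.updates_path):
            os.remove(self.updates_path)


with tempfile.TemporaryDirectory() as directory:
    store = ChunkStore(directory)
    store.put("a", 1)
    store.put("a", 2)
    store.put("b", 3)
    before = store.get("a")      # served from the updates file
    store.compact()
    after = store.get("a")       # served from the rewritten chunk
    log_gone = not os.path.exists(store.updates_path)
```

Nothing here ever overwrites data in place: writes are either appends to the log or a full rewrite of a chunk, which is the access pattern a shingled drive wants.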
Posted Apr 1, 2014 11:10 UTC (Tue) by ricwheeler (subscriber, #4980)
Not sure where you got the 20% number from, but the better way to think of this is that each kind of disk technology hits a plateau at some point. Further investment in that will not bring improvements in density.
Moving to SMR is effectively moving from the curve for our existing technology, which is about to plateau, onto a new curve. The delta between the current curve and the new one does start small, but over time it will take us to a significant density improvement. The slide shown by one vendor showed that eventual difference being closer to 3-4 times the density, but that was more of a hand wave, I would guess.
The drive vendors shied away from specific numbers, but they all agreed that SMR was a new foundation that other technologies will build on (not something that will be replaced in time).
Reading a crystal ball is hard, but it does seem like a promising technology to invest in :)
Posted Apr 3, 2014 1:56 UTC (Thu) by giraffedata (subscriber, #1954)
Not sure where you got the 20% number from,
There was a paper from Seagate, which I believe is referenced earlier in this thread, that said the technology provided a 25% improvement in areal density. I read somewhere else that there's a 5% overhead for something else - metadata or guard bands or something, bringing the effective improvement down to 20%.
If that's just a prototype figure and the technology eventually gets to 3-4 times areal density improvement, that's a different story.
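How the 25% and the 5% combine depends on how the overhead is counted, which the comment does not pin down. A small sketch of the two obvious readings, both of which are assumptions rather than anything stated in the Seagate paper:

```python
# Two ways to combine a 25% areal density gain with a 5% overhead.
# Which one (if either) matches the real drives is an assumption.

def gain_subtractive(density_gain: float, overhead: float) -> float:
    """Overhead counted as percentage points taken off the gain."""
    return density_gain - overhead

def gain_multiplicative(density_gain: float, overhead: float) -> float:
    """Overhead counted as a fraction of the (denser) capacity that is lost."""
    return (1 + density_gain) * (1 - overhead) - 1

subtractive = gain_subtractive(0.25, 0.05)
multiplicative = gain_multiplicative(0.25, 0.05)

print(f"subtractive:    {subtractive:.2%}")
print(f"multiplicative: {multiplicative:.2%}")
```

The first reading gives the 20% used throughout this thread; the second comes out a little lower, at just under 19%. Either way the argument about a roughly 20% differential is unchanged.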
Posted Mar 28, 2014 17:52 UTC (Fri) by Creideiki (subscriber, #38747)
Posted Mar 28, 2014 23:27 UTC (Fri) by giraffedata (subscriber, #1954)
Posted Mar 29, 2014 7:05 UTC (Sat) by Creideiki (subscriber, #38747)
Posted Mar 29, 2014 12:00 UTC (Sat) by james (subscriber, #1325)
These are companies that design their own servers to save money: they'll certainly be interested in minimising the cost of storing that data. They also already identify this data: sending it to special disks is not a big cost for them.
This market is probably big enough on its own to justify the investment: Facebook and Google will certainly have been consulted on these drives, and may have committed to buying a certain quantity if they meet price, performance, and reliability criteria.
Then there are systems like CCTV and personal video recorders, which may well run Linux but won't need small random write speeds (and will love the extra capacity).
The four remaining questions are:
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds