Runtime filesystem consistency checking

Posted Apr 3, 2012 18:58 UTC (Tue) by martinfick (subscriber, #4455)
In reply to: Runtime filesystem consistency checking by Tara_Li
Parent article: Runtime filesystem consistency checking

I don't understand your point. Have random accesses slowed down? Are they anticipated to slow down?


Runtime filesystem consistency checking

Posted Apr 3, 2012 20:00 UTC (Tue) by drag (subscriber, #31333) [Link]

The slowdown is relative to other metrics related to computer performance, I imagine.

Also, while reliability and capacity have both increased, capacity has far outstripped reliability. So while today's drives are generally more reliable than older ones (as in bad/corrupt blocks lost per GB), the chances of losing part of your data are much higher simply because there is so much more of it.
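To make that argument concrete, here is a rough back-of-the-envelope sketch in Python. The capacities and per-GB corruption rates are made-up illustrative values, not measurements; the only point is that a 100x better per-GB rate can still lose to a 10,000x larger disk.

```python
def expected_bad_blocks(capacity_gb: float, bad_blocks_per_gb: float) -> float:
    """Expected number of bad/corrupt blocks, assuming a uniform per-GB rate."""
    return capacity_gb * bad_blocks_per_gb

# Hypothetical numbers, chosen only for illustration.
old_drive = expected_bad_blocks(capacity_gb=0.3, bad_blocks_per_gb=1e-3)
new_drive = expected_bad_blocks(capacity_gb=3000.0, bad_blocks_per_gb=1e-5)

print(f"old drive: {old_drive:.1e} expected bad blocks")
print(f"new drive: {new_drive:.1e} expected bad blocks")
```

Under these assumed rates the newer drive is 100 times more reliable per GB, yet it is expected to hold about 100 times more bad blocks in total.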

This sort of thing is why online fsck and scrubs (reading in data and comparing it to checksums to detect and correct corruption) are so important on modern file systems. Previously the only people who needed to care were the ones who could justify the expense of purchasing big SAN devices and whatnot.
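As a minimal sketch of what a scrub does, the following Python reads data block by block and compares each block against a previously stored checksum. It is not how any particular filesystem implements scrubbing: the 4096-byte block size, the use of CRC32, and the function names are all assumptions made for illustration, and it only detects corruption (correcting it needs a redundant copy or parity, which is not modelled here).

```python
import zlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, purely illustrative

def checksum_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return one CRC32 per block of data."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

def scrub(data: bytes, stored: List[int], block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indices of blocks whose current CRC32 differs from the stored one."""
    current = checksum_blocks(data, block_size)
    return [i for i, (now, then) in enumerate(zip(current, stored)) if now != then]

# Usage: checksum three blocks, flip one byte in the second block, then scrub.
original = bytes(3 * BLOCK_SIZE)
stored_sums = checksum_blocks(original)
damaged = bytearray(original)
damaged[5000] ^= 0xFF  # simulated silent corruption in block 1
print(scrub(bytes(damaged), stored_sums))
```

The cost of such a pass is a full read of the device, which is why the rest of this thread about whole-disk read times matters.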

Runtime filesystem consistency checking

Posted Apr 3, 2012 20:01 UTC (Tue) by cmccabe (guest, #60281) [Link]

Hard disk sizes have continued increasing exponentially, while rotations per minute (RPMs) have more or less stopped increasing. So seeks are becoming more expensive, and fsck in general is starting to take much longer on hard disks.

SSDs don't have these limitations, however.

Runtime filesystem consistency checking

Posted Apr 4, 2012 8:00 UTC (Wed) by dgm (subscriber, #49227) [Link]

That doesn't make a lot of sense.

Disk capacity may have increased, but disk platters are exactly the same size as before: 3.5 inches. So, moving the read head around should cost mostly the same as before. The only factor I can think of is that the head has to be more precisely positioned, and that may (or may not) be more costly because of physical limitations (rebounds).

On the other hand there are two factors that should make seek time decrease: improved machinery and more density. More density means that more data goes faster under the read head, so more often seeks can be satisfied without moving the read head, just waiting for the data to pass below.

Runtime filesystem consistency checking

Posted Apr 4, 2012 9:22 UTC (Wed) by epa (subscriber, #39769) [Link]

I thought the other poster's point was about rotational speed. If the disk rotates at 100 revolutions per second then you may have to wait ten milliseconds in the worst case, even if the head is already positioned correctly. That ten milliseconds is not getting any shorter because disks are not spinning faster. However, the other components in the system are getting faster, so the ten millisecond overhead becomes more and more significant. Similarly, the disk head takes almost as long to move into position today as it did twenty years ago, even though processors and RAM are many times faster.
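The ten-millisecond figure follows directly from the rotation rate. A small Python sketch of the arithmetic (the function name is made up for illustration; 100 revolutions per second corresponds to 6000 RPM):

```python
from typing import Tuple

def rotational_latency_ms(rpm: float) -> Tuple[float, float]:
    """Return (worst_case_ms, average_ms) rotational latency for a given RPM.

    Worst case is one full revolution; the average is half a revolution,
    assuming the target sector is uniformly distributed around the track.
    """
    revolution_ms = 60_000.0 / rpm
    return revolution_ms, revolution_ms / 2

for rpm in (5400, 6000, 7200, 15000):
    worst, avg = rotational_latency_ms(rpm)
    print(f"{rpm:>6} RPM: worst {worst:.2f} ms, average {avg:.2f} ms")
```

Since common spindle speeds have stayed in this range for years, these numbers have barely moved while CPU and RAM speeds have.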

Or maybe the point is that larger filesystems necessarily require more random accesses and hence more disk seeks when you fsck them. Larger RAM would mitigate this but I don't know whether increased RAM for caching has kept pace with filesystem sizes enough. An fsck expert would be able to give some numbers.

Runtime filesystem consistency checking

Posted Apr 4, 2012 10:27 UTC (Wed) by khim (subscriber, #9252) [Link]

Actually the original poster was wrong: seeks are no more expensive. They have the same cost, but you need more of them. Even if you grow the filesystem data structures to reduce fragmentation, the undeniable fact is that the number of tracks is growing and the time to read a single track is constant.

This means that the time needed to read the whole disk from beginning to end is growing.

Runtime filesystem consistency checking

Posted Apr 4, 2012 12:17 UTC (Wed) by epa (subscriber, #39769) [Link]

Makes sense. I imagine that the number of tracks grows with the square root of disk capacity.

Runtime filesystem consistency checking

Posted Apr 4, 2012 12:59 UTC (Wed) by khim (subscriber, #9252) [Link]

More or less. This means that when you go from Linux 0.1 (with a typical HDD size of 200-300MB) to Linux 3.0 (with a typical HDD size of 2-4TB), the filesystem slows down by a factor of 100, not by a factor of 10'000. But a 100x slowdown is still a lot.
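The 100 versus 10'000 figures can be checked with a short Python sketch. It assumes, as the comments above do, that the number of tracks grows with the square root of capacity and that the time for a full pass over the disk is proportional to the number of tracks. The 250 MB and 2.5 TB sizes are simply midpoints of the ranges quoted above.

```python
import math

def track_growth(old_capacity_bytes: int, new_capacity_bytes: int) -> float:
    """Growth in track count, assuming tracks scale with sqrt(capacity)."""
    return math.sqrt(new_capacity_bytes / old_capacity_bytes)

old_capacity = 250_000_000          # ~250 MB, midpoint of 200-300MB
new_capacity = 2_500_000_000_000    # ~2.5 TB, within the 2-4TB range

capacity_growth = new_capacity / old_capacity
slowdown = track_growth(old_capacity, new_capacity)
print(f"capacity grew {capacity_growth:.0f}x, full-pass time grew about {slowdown:.0f}x")
```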

Runtime filesystem consistency checking

Posted Apr 4, 2012 16:01 UTC (Wed) by wazoox (subscriber, #69624) [Link]

I'm currently testing 4 TB drives. RAID rebuild now takes 48 hours, up from 10 hours for 1 TB.
An individual 4 TB drive needs more than 9 hours simply to fill it up sequentially.
We'll need to index blocks on our spinning rust on SSD cache before long :)
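For a sense of scale, the throughput implied by those times can be worked out as below. This is only arithmetic on the figures quoted in this comment, assuming decimal terabytes (1 TB = 10^12 bytes); the function name is made up for illustration.

```python
def implied_throughput_mb_s(capacity_bytes: float, hours: float) -> float:
    """Average throughput in MB/s needed to cover capacity_bytes in the given hours."""
    return capacity_bytes / (hours * 3600) / 1e6

fill_4tb = implied_throughput_mb_s(4e12, 9)       # sequential fill of one 4 TB drive
rebuild_4tb = implied_throughput_mb_s(4e12, 48)   # RAID rebuild, 4 TB drives
rebuild_1tb = implied_throughput_mb_s(1e12, 10)   # RAID rebuild, 1 TB drives

print(f"sequential fill: {fill_4tb:.1f} MB/s")
print(f"4 TB rebuild:    {rebuild_4tb:.1f} MB/s")
print(f"1 TB rebuild:    {rebuild_1tb:.1f} MB/s")
```

Because the fill takes "more than" 9 hours, the first figure is an upper bound on the drive's average sequential rate.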

Runtime filesystem consistency checking

Posted Apr 4, 2012 19:41 UTC (Wed) by khim (subscriber, #9252) [Link]

Contemporary 4TB HDDs are especially slow because they use 5 plates (where your 1TB disks probably used 2 or 3). This means that not only do you see the slowdown from the growing number of tracks, you also see an additional slowdown from the growing number of plates!

Thankfully in this direction 5 is the limit: I doubt we'll see the return of 30-plate monsters like the infamous Winchester… all 3.5" HDDs to date have had 5 plates or fewer.

Runtime filesystem consistency checking

Posted Apr 5, 2012 9:18 UTC (Thu) by misiu_mp (guest, #41936) [Link]

I don't think it's clear why more plates would be slower. More plates means more heads, with possibility for concurrency - that should increase sequential transfer speed.
If data is written cylinder-wise, the latency should be similar to one-plate disk.
If it is written plate-wise, the latency should vary up and down in relation to block numbers. It's possible the average latency would still be comparable.
The only clear negative about multi-platter systems is the increased inertia of the head assembly. It's not so clear whether it has a practical implication.
Apart from this unclear performance implication, there is of course the decreased reliability and increased cost of multi-platter solutions. That is the main reason we don't see that many of them.

Runtime filesystem consistency checking

Posted Apr 5, 2012 10:00 UTC (Thu) by khim (subscriber, #9252) [Link]

More plates means more heads, with possibility for concurrency - that should increase sequential transfer speed.

Good idea. Sadly it's about ten years too late. Today's tracks are too small: when the head is on a track on one plate, the other heads are not on that same track. In fact they are not on a track at all. They just drift randomly between 2-3 adjacent tracks. That's why you can only use one head actively. (How can we use even one if it's all so unstable? Well, it's easy: there is an active scheme which dynamically moves the head to keep it on track.)

If data is written cylinder-wise, the latency should be similar to one-plate disk.

Latency of seeks - yes, number of tracks - no. If you use the same plates then a filesystem on a single-plate HDD will be roughly five times faster than a filesystem on a five-plate HDD.
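A toy model of that claim in Python: if only one head can transfer data at a time, a full pass costs one revolution per track per surface, so the total time scales with the number of plates. The track count and RPM below are hypothetical placeholders, and the model ignores seek and head-switch time entirely.

```python
def full_pass_seconds(tracks_per_surface: int, plates: int, rpm: float,
                      surfaces_per_plate: int = 2) -> float:
    """Time to read every track once, with only one head active at a time."""
    revolution_s = 60.0 / rpm
    total_tracks = tracks_per_surface * plates * surfaces_per_plate
    return total_tracks * revolution_s

# Hypothetical drive geometry, for illustration only.
one_plate = full_pass_seconds(tracks_per_surface=100_000, plates=1, rpm=7200)
five_plates = full_pass_seconds(tracks_per_surface=100_000, plates=5, rpm=7200)
print(f"five plates take {five_plates / one_plate:.0f}x as long as one plate")
```

The ratio does not depend on the assumed track count or RPM, only on the number of plates.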

That is the main reason we don't see that many of them.

The main reason we don't see many of them is cost. They are more expensive to produce and since they are less reliable they incur more warranty overhead. They are also slower, but this is a secondary problem.

Runtime filesystem consistency checking

Posted Apr 13, 2012 8:47 UTC (Fri) by ekj (guest, #1524) [Link]

So, we "only" need to make the arms move independently then. :-)

Runtime filesystem consistency checking

Posted Apr 5, 2012 19:01 UTC (Thu) by cmccabe (guest, #60281) [Link]

When I said "seeks are becoming more expensive" I meant in relation to other things going on in the system, not in an absolute sense.

From a programmer's perspective, the growth in hard disk capacity has not been matched by a corresponding increase in either throughput or worst-case latency.

Because hard disk throughput has not kept pace, in a high performance setup, your only hope for reasonable throughput is to use RAID with striping. But RAID increases the minimum size that you can read: before, that minimum was a sector; with RAID, it's a stripe. This makes hard disks even less of a random-access medium, since you never want to be reading just a few bytes. You want to read a whole RAID stripe at a time in order to be efficient.
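To illustrate the effect, here is a small Python sketch that rounds a requested byte range out to whole stripes, following the premise above that the efficient read unit is a full stripe. The 256 KiB stripe size and the function name are assumptions for illustration; real arrays use various chunk and stripe sizes.

```python
from typing import Tuple

def stripe_aligned_read(offset: int, length: int, stripe_size: int) -> Tuple[int, int]:
    """Return (aligned_offset, aligned_length) covering the request in whole stripes."""
    start = (offset // stripe_size) * stripe_size
    # Ceiling division to find the end of the last stripe touched.
    end = -(-(offset + length) // stripe_size) * stripe_size
    return start, end - start

STRIPE_SIZE = 256 * 1024  # assumed stripe size of 256 KiB

aligned_offset, aligned_length = stripe_aligned_read(1_000_000, 100, STRIPE_SIZE)
print(f"requested 100 bytes, read {aligned_length} bytes starting at {aligned_offset}")
```

A 100-byte request turns into a read of an entire stripe, which is the sense in which the disk becomes "even less of a random-access medium".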

Most programmers don't know about these details because the database does all this for you.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds