And this is relevant to ext4... exactly how?
Posted Dec 1, 2011 1:59 UTC (Thu) by dlang (✭ supporter ✭, #313)
Posted Dec 1, 2011 3:29 UTC (Thu) by tytso (subscriber, #9993)
But yes, the primary feature of a journal by itself is avoiding long fsck times. One nice thing with ext4 is that fsck times are (typically) reduced by a factor of 7-12. So a TB file system that previously took 20-25 minutes might now only take 2-3 minutes.
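As a quick sanity check on those figures, the arithmetic can be sketched as below. The helper name `reduced_range` is made up for illustration; the inputs are just the numbers quoted above, not measurements.

```python
def reduced_range(old_min, old_max, factor_low, factor_high):
    """Return the (best, worst) new duration given a range of speedup factors.

    Best case: shortest old time divided by the largest factor.
    Worst case: longest old time divided by the smallest factor.
    """
    return old_min / factor_high, old_max / factor_low


# 20-25 minutes before, sped up by a factor of 7-12 (figures from the comment).
best, worst = reduced_range(20, 25, 7, 12)
print(f"{best:.1f} to {worst:.1f} minutes")  # prints "1.7 to 3.6 minutes"
```

That range brackets the "2-3 minutes" estimate, so the two sets of numbers are consistent with each other.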
If you are replicating your data anyway because you're using a cluster file system such as Hadoopfs, and you're confident that your data center has appropriate contingencies against a simultaneous data-center-wide power loss event (i.e., you have batteries, diesel generators, etc., and you test all of this equipment regularly), then it may be that going without a journal makes sense. You really need to know what you are doing though, and it requires careful design at the hardware level, at the data center level, and in the storage stack above the local disk file system.
Posted Dec 2, 2011 18:55 UTC (Fri) by walex (subscriber, #69836)
One nice thing with ext4 is that fsck times are (typically) reduced by a factor of 7-12. So a TB file system that previously took 20-25 minutes might now only take 2-3 minutes.
Posted Dec 2, 2011 19:10 UTC (Fri) by dlang (✭ supporter ✭, #313)
Posted Dec 3, 2011 0:40 UTC (Sat) by walex (subscriber, #69836)
A file system is usually not that damaged after an unclean shutdown, although it can happen with a particularly bad one (lots of stuff in flight, for example on a wide RAID) or with RAM/disk errors. The report I saw was not for an "enterprise" system with battery, ECC and a redundant storage layer.
Posted Dec 2, 2011 21:41 UTC (Fri) by nix (subscriber, #2304)
Fill up the fs, even once, and this benefit goes away -- but a *lot* of filesystems sit for years mostly empty. fscking those filesystems is very, very fast these days (I've seen subsecond times for mostly-empty multi-TB filesystems).
Posted Dec 2, 2011 22:45 UTC (Fri) by tytso (subscriber, #9993)
Not all of the improvements in fsck time come from being able to skip reading portions of the inode table. Extent tree blocks are also far more efficient than indirect blocks, and that accounts for much of the speed improvement when fsck'ing an ext4 file system compared to an ext2 or ext3 file system.
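To give a rough feel for the extent-versus-indirect-block difference, here is a back-of-the-envelope sketch that counts the mapping blocks fsck would have to read for a single file. It assumes 4 KiB blocks, 4-byte block pointers with 12 direct pointers in the inode for the ext2/ext3 scheme, and for ext4 a fully contiguous file with at most 32768 blocks per extent, 4 extents stored in the inode, and 340 extents per leaf block. The function names are made up for illustration, and a fragmented file would need more extents than this best case.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def indirect_blocks(data_blocks, ptrs_per_block=1024, direct=12):
    """Count indirect mapping blocks for an ext2/ext3-style file."""
    remaining = max(0, data_blocks - direct)
    total = 0
    # Single indirect: one block of pointers to data blocks.
    if remaining > 0:
        total += 1
        remaining -= min(remaining, ptrs_per_block)
    # Double indirect: one top block plus one pointer block per 1024 data blocks.
    if remaining > 0:
        covered = min(remaining, ptrs_per_block ** 2)
        total += 1 + ceil_div(covered, ptrs_per_block)
        remaining -= covered
    # Triple indirect: one top block, then two further levels of pointer blocks.
    if remaining > 0:
        covered = min(remaining, ptrs_per_block ** 3)
        level1 = ceil_div(covered, ptrs_per_block)
        level2 = ceil_div(level1, ptrs_per_block)
        total += 1 + level2 + level1
        remaining -= covered
    if remaining > 0:
        raise ValueError("file too large for this mapping scheme")
    return total


def extent_blocks_contiguous(data_blocks, max_extent_len=32768,
                             extents_in_inode=4, extents_per_leaf=340):
    """Count extent tree blocks for a fully contiguous ext4-style file.

    Only handles trees of depth 0 or 1 to keep the sketch short.
    """
    extents = ceil_div(data_blocks, max_extent_len)
    if extents <= extents_in_inode:
        return 0  # all extents fit in the inode itself
    leaves = ceil_div(extents, extents_per_leaf)
    if leaves > extents_in_inode:
        raise ValueError("deeper extent trees are not modelled here")
    return leaves


one_gib = (1 << 30) // 4096  # 262144 data blocks of 4 KiB
print(indirect_blocks(one_gib))           # prints 257
print(extent_blocks_contiguous(one_gib))  # prints 1
```

Under these assumptions a contiguous 1 GiB file needs 257 indirect blocks but only a single extent leaf block, which is the kind of gap that shows up in fsck times.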
Posted Dec 2, 2011 23:35 UTC (Fri) by nix (subscriber, #2304)
We could fix things so that as you delete files from a full file system, we reduce the high watermark field for each block group's inode table
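A toy model of the per-block-group high watermark discussed in this thread may help show why filling the file system even once removes the benefit, and what lowering the watermark on deletion would buy. This is only a sketch: the `BlockGroup` class and its methods are hypothetical and do not correspond to the actual ext4 or e2fsprogs code.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    inodes_per_group: int
    high_watermark: int = 0               # inode table slots ever used
    in_use: set = field(default_factory=set)

    def allocate(self):
        """Take the lowest free inode slot, raising the watermark if needed."""
        for slot in range(self.inodes_per_group):
            if slot not in self.in_use:
                self.in_use.add(slot)
                self.high_watermark = max(self.high_watermark, slot + 1)
                return slot
        raise RuntimeError("block group is out of inodes")

    def free(self, slot):
        """Release a slot; the watermark is deliberately left alone."""
        self.in_use.discard(slot)

    def shrink_watermark(self):
        """Hypothetical fix: lower the watermark to just past the last used slot."""
        self.high_watermark = max(self.in_use, default=-1) + 1

    def slots_to_scan(self):
        """A checker only needs to read slots below the watermark."""
        return self.high_watermark


group = BlockGroup(inodes_per_group=64)
for _ in range(10):
    group.allocate()
print(group.slots_to_scan())   # prints 10: mostly empty, cheap to check

for _ in range(54):
    group.allocate()           # fill the group once
for slot in range(10, 64):
    group.free(slot)           # then delete almost everything
print(group.slots_to_scan())   # prints 64: the benefit is gone

group.shrink_watermark()
print(group.slots_to_scan())   # prints 10 again
```

In this model the cost of a check follows the watermark rather than the number of live inodes, so only an explicit shrink step brings the scan back down after a fill-and-delete cycle.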
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds