Improving ext4: bigalloc, inline data, and metadata checksums
Posted Dec 1, 2011 10:46 UTC (Thu) by trasz (guest, #45786)
Or rather, like UFS has done for over a decade. In UFS, "blocks" (roughly what are called clusters here) are 32kB by default and are made up of "fragments", which are 4kB by default.
Posted Dec 1, 2011 16:36 UTC (Thu) by tytso (subscriber, #9993)
Which UFS are you talking about? UFS as found in 4.4BSD and FreeBSD uses a default cluster size of 8kB with 1kB fragments.
Posted Dec 1, 2011 20:03 UTC (Thu) by trasz (guest, #45786)
UFS as found in FreeBSD 10 uses 32kB/4kB. Older versions have used 16kB/2kB since, IIRC, FreeBSD 4. See the newfs(8) manual page (http://www.freebsd.org/cgi/man.cgi?newfs).
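To make the block/fragment geometries mentioned in this thread concrete, here is a rough sketch of how a file would be laid out as full blocks plus tail fragments. The function name `ffs_layout` is made up for illustration, and the model is deliberately simplified: it ignores metadata and any restrictions a real UFS implementation places on when fragments may be used.

```python
def ffs_layout(file_size, block_size, frag_size):
    """Return (full_blocks, tail_frags, allocated_bytes) for a file.

    Simplified model (an assumption, not the real UFS allocator): the file
    is stored as whole blocks, and the remaining tail is rounded up to
    whole fragments.
    """
    if block_size % frag_size != 0:
        raise ValueError("block size must be a multiple of the fragment size")
    full_blocks, tail_bytes = divmod(file_size, block_size)
    tail_frags = -(-tail_bytes // frag_size)  # ceiling division
    allocated = full_blocks * block_size + tail_frags * frag_size
    return full_blocks, tail_frags, allocated


if __name__ == "__main__":
    # The three block/fragment pairs quoted in the thread, in bytes.
    for block, frag in [(8192, 1024), (16384, 2048), (32768, 4096)]:
        print(block, frag, ffs_layout(100_000, block, frag))
```

Under this model the wasted space is bounded by one fragment rather than one block, which is the point of the scheme.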
Posted Dec 2, 2011 23:43 UTC (Fri) by walex (subscriber, #69836)
«UFS as found in FreeBSD 10 uses 32kB/4kB»
That is terrible, because it means that, except for the tail, the system enforces a fixed 32KiB read-ahead and write-behind, rather than an adaptive (or at least tunable) one.
Posted Dec 3, 2011 1:01 UTC (Sat) by walex (subscriber, #69836)
BTW many years ago I persuaded the original developer of ext to not implement in it the demented BSD FFS idea of large block/small fragment, arguing that adaptive read-ahead and write-behind would give better dynamic performance, and adaptive allocate-ahead (reservations) better contiguity, without the downsides.
Not everything got implemented as I suggested, but at least all the absurd complications of large block/small fragment (for example the page mapping issues) were avoided in Linux, as well as the implied fixed read-ahead, write-behind and allocate-ahead.
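For readers unfamiliar with the term, "adaptive read-ahead" can be illustrated with a toy model like the one below. This is only a sketch under assumed parameters (the class name `ReadAhead`, the 4-page starting window and the 128-page cap are all hypothetical); it is not the algorithm used by Linux or by any BSD.

```python
class ReadAhead:
    """Toy adaptive read-ahead: grow the window while access is sequential."""

    def __init__(self, min_window=4, max_window=128):
        # Window sizes are in pages; both limits are illustrative assumptions.
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window
        self.next_expected = None  # page we expect if access stays sequential

    def on_read(self, page, count=1):
        """Record a read of `count` pages at `page`; return the new window."""
        if page == self.next_expected:
            # Sequential access: double the window, up to the cap.
            self.window = min(self.window * 2, self.max_window)
        else:
            # First read or a seek: fall back to the smallest window.
            self.window = self.min_window
        self.next_expected = page + count
        return self.window
```

A fixed-size scheme corresponds to always returning the same window regardless of the access pattern, which is the behaviour being objected to above.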
Posted Dec 3, 2011 11:06 UTC (Sat) by nix (subscriber, #2304)
But of course we have had page-mapping-related bugs, in the *other* direction, from people building filesystems with sub-page-size blocks. (Support for this case is unavoidable unless you want filesystems not to be portable from machines with a large page size to machines with a small one, but it's still tricky stuff.)
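The portability point can be shown with a little arithmetic. The sketch below assumes, as the comment implies, that a filesystem whose block size exceeds the machine's page size cannot be used there; the helper name `blocks_per_page` is hypothetical.

```python
def blocks_per_page(page_size, block_size):
    """Return how many filesystem blocks fit in one page.

    Returns None when the block is larger than the page, which this sketch
    treats as "not usable on this machine" (an assumption taken from the
    discussion, not a statement about any particular kernel version).
    """
    if block_size > page_size:
        return None
    if page_size % block_size != 0:
        raise ValueError("page size must be a multiple of the block size")
    return page_size // block_size
```

For example, a filesystem made with 4kB blocks maps 16 blocks into each page on a 64kB-page machine, and exactly one on a 4kB-page machine, so it stays usable on both; one made with 64kB blocks would not be usable on the 4kB-page machine under this assumption.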
Posted Jan 3, 2012 17:38 UTC (Tue) by jsdyson (guest, #71944)
Actually, as the author of earlier forms of the FreeBSD readahead/writebehind, I do know that FreeBSD can be very aggressive with larger reads/writes than just the block size. One really big advantage of the FreeBSD buffering is that the length of the queues/pending writes is generally planned to be smaller, thereby avoiding that nasty sluggish feeling (or apparent stopping) that occurs with horribly large pending writes.