Improving ext4: bigalloc, inline data, and metadata checksums

Posted Nov 30, 2011 2:05 UTC (Wed) by nix (subscriber, #2304)
Parent article: Improving ext4: bigalloc, inline data, and metadata checksums

"rather than allocate single blocks, a filesystem using clusters will allocate them in larger groups"
Like FAT, only less forced-by-misdesign. Everything old is new again...


Posted Dec 1, 2011 10:46 UTC (Thu) by trasz (guest, #45786) [Link]

Or rather, like UFS has done for over a decade. In UFS, "blocks", which are roughly what is called clusters here, are 32kB by default and are made up of "fragments", 4kB by default.
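To make the block/fragment arithmetic concrete, here is a minimal sketch that compares the space a file would occupy under a block-plus-fragment scheme with the space it would occupy if only whole clusters could be allocated. The function names are hypothetical, and the model is deliberately simplified: it assumes every file's tail is rounded up to fragments and ignores indirect blocks and other metadata, which a real UFS does not.

```python
def allocated_with_fragments(file_size, block_size=32 * 1024, frag_size=4 * 1024):
    """Bytes used when full blocks are allocated and the tail is rounded up to fragments.

    Simplified model (assumption): metadata and indirect blocks are ignored.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if block_size % frag_size != 0:
        raise ValueError("block_size must be a multiple of frag_size")
    full_blocks, tail = divmod(file_size, block_size)
    tail_frags = -(-tail // frag_size)  # ceiling division
    return full_blocks * block_size + tail_frags * frag_size


def allocated_with_clusters(file_size, cluster_size=32 * 1024):
    """Bytes used when only whole clusters can be allocated."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    return -(-file_size // cluster_size) * cluster_size


if __name__ == "__main__":
    for size in (5000, 70000):
        print(size, allocated_with_fragments(size), allocated_with_clusters(size))
```

Under these assumptions a 5000-byte file takes two 4kB fragments rather than a whole 32kB unit, which is the space saving fragments are meant to provide.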

Posted Dec 1, 2011 16:36 UTC (Thu) by tytso (subscriber, #9993) [Link]

Which UFS are you talking about? UFS as found in BSD 4.4 and FreeBSD uses a default cluster size of 8k with 1k fragments.

Posted Dec 1, 2011 20:03 UTC (Thu) by trasz (guest, #45786) [Link]

UFS as found in FreeBSD 10 uses 32kB/4kB. Older versions have used 16kB/2kB since, IIRC, FreeBSD 4. See the newfs(8) manual page (http://www.freebsd.org/cgi/man.cgi?newfs).

Posted Dec 2, 2011 23:43 UTC (Fri) by walex (subscriber, #69836) [Link]

«UFS as found in FreeBSD 10 uses 32kB/4kB»

That is terrible, because it means that, except for the tail, the system enforces a fixed 32KiB read-ahead and write-behind rather than an adaptive (or at least tunable) one.
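For contrast with a fixed window, the following toy sketch shows one way an adaptive read-ahead policy could behave: the window doubles while reads stay sequential and falls back to the minimum on a non-sequential read. The class name, the doubling rule, and the 4KiB/128KiB limits are all assumptions chosen for illustration; this is not how any particular kernel implements it.

```python
class AdaptiveReadAhead:
    """Toy read-ahead policy: grow the window on sequential reads, reset otherwise."""

    def __init__(self, min_window=4 * 1024, max_window=128 * 1024):
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window
        self.next_expected = None  # offset at which a sequential read would start

    def on_read(self, offset, length):
        """Record a read and return the read-ahead window (in bytes) to use after it."""
        if offset == self.next_expected:
            # Sequential access: double the window, up to the cap.
            self.window = min(self.window * 2, self.max_window)
        else:
            # First read or random access: fall back to the minimum window.
            self.window = self.min_window
        self.next_expected = offset + length
        return self.window


if __name__ == "__main__":
    ra = AdaptiveReadAhead()
    for off in (0, 4096, 8192, 1_000_000):
        print(off, ra.on_read(off, 4096))
```

A fixed scheme would return the same window for every call, regardless of the access pattern.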

Posted Dec 3, 2011 1:01 UTC (Sat) by walex (subscriber, #69836) [Link]

BTW many years ago I persuaded the original developer of ext to not implement in it the demented BSD FFS idea of large block/small fragment, arguing that adaptive read-ahead and write-behind would give better dynamic performance, and adaptive allocate-ahead (reservations) better contiguity, without the downsides.

Not everything got implemented as I suggested, but at least all the absurd complications of large block/small fragment (for example the page mapping issues) were avoided in Linux, as well as the implied fixed ra/wb/aa.

Posted Dec 3, 2011 11:06 UTC (Sat) by nix (subscriber, #2304) [Link]

But of course we have had page-mapping-related bugs, in the *other* direction, from people building filesystems with sub-page-size blocks. (Support for this case is unavoidable unless you want filesystems not to be portable from machines with a large page size to machines with a small one, but it's still tricky stuff.)
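The portability point can be illustrated with a little arithmetic. The sketch below assumes the rule that a filesystem's block size must not exceed the machine's page size, and counts how many blocks share one page. The function name and the example sizes are hypothetical, picked only to show the sub-page case.

```python
def blocks_per_page(page_size, block_size):
    """Number of filesystem blocks that fit in one memory page.

    Assumption: the block size must divide the page size and not exceed it.
    """
    if block_size > page_size:
        raise ValueError("block size larger than page size")
    if page_size % block_size != 0:
        raise ValueError("page size must be a multiple of block size")
    return page_size // block_size


if __name__ == "__main__":
    # A 4KiB-block filesystem on a 4KiB-page machine: one block per page.
    print(blocks_per_page(4096, 4096))
    # The same filesystem moved to a 64KiB-page machine: sixteen blocks per page,
    # so the page cache has to track several blocks inside each page.
    print(blocks_per_page(65536, 4096))
```

Any result greater than one is the sub-page-size case, where per-block state inside a single page has to be tracked.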

Posted Jan 3, 2012 17:38 UTC (Tue) by jsdyson (guest, #71944) [Link]

Actually, as the author of earlier forms of the FreeBSD readahead/writebehind, I do know that FreeBSD can be very aggressive with larger reads/writes than just the block size. One really big advantage of the FreeBSD buffering is that the length of the queues/pending writes is generally planned to be smaller, thereby avoiding that nasty sluggish feeling (or apparent stopping) that occurs with horribly large pending writes.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds