A short history of btrfs
Posted Aug 1, 2009 17:27 UTC (Sat) by graydon
In reply to: A short history of btrfs
Parent article: A short history of btrfs
A couple points. I'm not expert enough to be sure these answer your question:
- As far as I know, slab allocators are based on a regional grouping: "this entire range of memory is dedicated to objects of size N bytes", say. The Lea allocator keeps a freelist that can hold objects of N bytes, but there's nothing saying those objects are either allocated from an N-byte-object slab initially, nor that they're exactly N bytes. Just that that's the right power-of-two freelist to put them on. It also coalesces adjacent allocations into larger blocks and moves them to larger freelists whenever it can.
- Furthermore -- here I am getting more sketchy in my understanding -- I think that the design of a slab-like allocator implies having a certain degree of co-operation from your allocation client. The client has to be willing to ask for memory from slabs, and the memory it asks for has to (commonly) be in slab-entry-sized units. This is fine -- actually a correct design observation that slab allocation was invented to take advantage of -- when your client is a kernel with a fixed-at-compile-time assortment of struct sizes it's going to be (mostly) allocating. It is not really a correct design observation for things like "general-purpose mallocs" or "general-purpose filesystems", where the clients are unknown and aren't interested in confining themselves to slab-appropriate behavior.
IOW I think both are trying to "reduce fragmentation", but with a different assumed type of workload: a slab allocator eliminates external fragmentation almost entirely if the client can co-operate, but a more general allocator like Lea malloc lacks that co-operation promise, and so takes a simpler approach that (on average) keeps both internal and external fragmentation "under control", but with less of a guarantee about it.
I believe, if I'm reading correctly, that btrfs will be more in the Lea-malloc camp: possibly subject to more external fragmentation (say if you let a bunch of leaves from the middle of blocks drop to 0-refcount), but possibly less internally fragmenting than ZFS (due to the absence of slab size-grouping), so better on average for the general workload of a filesystem. Allocators are always tuned to workload assumptions.
to post comments)