ext3 block reservation

Like most modern filesystems, ext3 tries to lay out files contiguously on the disk. This layout allows files to be read and written quickly, without a lot of disk head seeks in the middle. This strategy can be thwarted, however, by the fact that ext3 allocates blocks as they are actually needed by a file. By the time a file requests a new block, the space immediately after the file on disk may well have been allocated for some other file. At that point, a contiguous allocation will be impossible.

Mingming Cao has attempted to fix this problem with a set of "block reservation" patches for ext3; those patches are currently part of the -mm tree. The core idea behind these patches is that the filesystem should think ahead of time about where it might place blocks for growing files and reserve that space. That way, when the file does grow, there will be blocks available in a useful part of the disk.

To that end, the ext3 block allocator has been replaced by a reservation-oriented version. The first time a block is needed for a file, the filesystem creates a "reservation window" which sets aside a range of blocks (eight of them, initially); the actual block allocations are then taken from the window. When the window is exhausted, a new, possibly expanded window is allocated, as near as possible to the old window, to replace it. Reservations only last until the process writing the file closes it; thereafter, the blocks become free once again.
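The window life cycle described above can be sketched as a toy model. This is only an illustration, not the patch's code: the names `Window` and `FileWriter` are hypothetical, the new window is assumed to start immediately after the old one, and the window is assumed to double in size each time (the text says only that it is "possibly expanded" and placed "as near as possible").

```python
from dataclasses import dataclass, field
from typing import List, Optional

INITIAL_WINDOW = 8  # initial window size in blocks, as stated in the article


@dataclass
class Window:
    start: int     # first block set aside for the file
    size: int      # number of blocks in the window
    used: int = 0  # blocks already handed out from this window

    @property
    def end(self) -> int:
        return self.start + self.size  # one past the last reserved block


@dataclass
class FileWriter:
    """Hypothetical per-open-file state; not the kernel's data structure."""
    goal: int                        # where the file would like its first block
    window: Optional[Window] = None
    blocks: List[int] = field(default_factory=list)

    def allocate_block(self) -> int:
        if self.window is None:
            # First block needed: create the initial reservation window.
            self.window = Window(self.goal, INITIAL_WINDOW)
        elif self.window.used == self.window.size:
            # Window exhausted: replace it with a new one next to the old one.
            # Doubling the size is an assumption made for this sketch.
            self.window = Window(self.window.end, self.window.size * 2)
        block = self.window.start + self.window.used
        self.window.used += 1
        self.blocks.append(block)
        return block

    def close(self) -> None:
        # The reservation lasts only until the file is closed; blocks that
        # were reserved but never allocated simply become free again.
        self.window = None
```

With a goal of block 100, ten successive calls to `allocate_block()` return blocks 100 through 109 in order: the first eight come from the initial window, the last two from its replacement.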

Interestingly, nothing in the filesystem itself tracks block reservations; they are all handled by a single, in-core linked list (per filesystem). A block reservation will not actually prevent blocks inside the window from being allocated to some other file. Since the filesystem allocates out of reservation windows whenever possible, however, and those windows do not overlap, the reservations are almost always honored. In some situations (such as when all remaining free blocks are reserved) the filesystem will forget about reservations and allocate blocks from anywhere.
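A minimal sketch of such an in-memory, per-filesystem list of non-overlapping windows might look like the following. Everything here is hypothetical (the `ReservationList` class, its methods, and the forward-only search); the real patch uses a linked list, and this toy does not model the on-disk block bitmap at all.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) block numbers, end exclusive


class ReservationList:
    """Toy per-filesystem reservation list, kept sorted by start block."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks
        self.windows: List[Span] = []

    def reserve(self, goal: int, size: int) -> Optional[Span]:
        """Reserve `size` blocks in the first gap at or after `goal`.

        Returns None when no gap is large enough. A caller would then
        ignore reservations and take a free block from anywhere.
        """
        start = goal
        for w_start, w_end in self.windows:
            if start + size <= w_start:
                break            # the candidate fits before this window
            if start < w_end:
                start = w_end    # overlap: move just past this window
        if start + size > self.total_blocks:
            return None          # ran off the end of the filesystem
        window = (start, start + size)
        self.windows.append(window)
        self.windows.sort()
        return window

    def release(self, window: Span) -> None:
        # Called when the file is closed; only memory is touched, because
        # nothing about reservations is ever stored on disk.
        self.windows.remove(window)
```

Because the list lives only in memory, it does not survive an unmount or a crash, and nothing in the on-disk format has to change. For simplicity the sketch searches only forward from the goal block.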

Some benchmark results show significant performance improvements, especially when large numbers of processes are running. To some extent, this improvement comes about because block reservations narrow down the area of the disk that must be searched for free blocks and increase the chances that a block will be found quickly. The real benefit, however, is that the on-disk layout of the files is much improved. Unless problems turn up, this patch may find its way into the mainline fairly quickly.



ext3 defragmentation

Posted Apr 22, 2004 11:23 UTC (Thu) by jbh (subscriber, #494) [Link]

There has been some progress to avoid fragmentation. But it doesn't help much for older filesystems that are already heavily fragmented. Does anyone know of an ext3 defragmenter? Online or offline.

I've seen one for ext2 but I don't trust it, especially after seeing this:
"Tried it on a spare ext3 partition that I backed up first. Did a diff after, the defrag corrupted data."
http://lists.debian.org/debian-user/2003/debian-user-200308/msg03259.html

ext3 defragmentation

Posted Apr 22, 2004 16:42 UTC (Thu) by Duncan (guest, #6647) [Link]

There's not enough info in that limited quote to tell for sure, but if an
ext2 defragger is run on an ext3 filesystem either mounted, or unmounted
but with unsynced data in the journal, it WILL likely corrupt data,
because the ext3 side of things won't be aware of the data-blocks moving
out from where it expects them to be, and could easily attempt to write
data to the OLD location.

Some years ago, back in early '98, b4 I switched from MSWormOS and while I
was participating in the public betas for IE4, it had a similar conflict
between the IE cache code and the 95 defragger. IE3 had used a mechanism
whereby the cache index file location was kept in memory for direct
writing, bypassing the file-system lookup after it was loaded and writing
directly to disk, for performance reasons. That worked with IE3, because
it was only temporarily loaded and was shut down during defrags. MS
changed the rules with IE4 and its desktop extensions, however, and it
remained loaded as long as Windows Explorer was loaded, because it WAS now
Windows Explorer as WELL as IE. Normally, such constantly loaded "system"
files remained untouched and unmoved by the defragger. Unfortunately in
the IE4 second beta, they forgot to set the IE cache index file as
"system" and the defragger would move it out from under the still live
IE/WE process, which would then write all over whatever replaced the file,
when it tried to rewrite it to disk. The simple enough fix was to set the
system flag on the file, but IE would reset it every time it was started,
so one had to keep on top of things.

Fortunately, here, I'd set up my temp dir as a separate partition, and had
the IE cache set to use my temp partition. Thus, the only data I had
exposed was temporary anyway, and the bug wasn't a big problem for me.
However, some of the other beta newsgroup regulars and others that posted
only when they had problems, lost valuable data to that bug. Keeping temp
files in their own partition saved my butt, but I've never forgotten that,
as it left a BIG impression on me, not only on the risks of beta, but ALSO
on the wisdom of limiting potential damage with multiple partitions. Of
course, it also impressed me with the wisdom of making SURE nothing is
going to be writing to that defragged partition without being aware of the
new location of the data.

It's entirely possible the test was done with a properly unmounted and
journal-empty ext3 partition for the defrag, but if it wasn't, it's no
surprise there were data integrity issues.

Duncan

ext3 defragmentation

Posted Apr 23, 2004 1:43 UTC (Fri) by prat (guest, #20866) [Link]

I've actually looked at this before, and found a few solutions, but quickly concluded that the programs in question (which had to be run offline) seemed a little too unmaintained and unreliable to test out on my partition. The best answer I've gotten from anyone so far is that yes, there is a program that can do this, and that program is tar. =) Of course, you'd need someplace to back everything up to. Then you can just untar everything onto a clean partition, possibly with these new patches in place in the kernel you use during the restore, but since tar is probably doing this linearly anyway, I doubt it would make much difference.

Long story short, most people I've talked to have never had any problems with fragmented ext[23] filesystems. Sorry.

ext3 defragmentation

Posted Apr 23, 2004 18:35 UTC (Fri) by southey (subscriber, #9466) [Link]

In my (poor) opinion defragmentation is a myth. At least on Windows it doesn't change a thing except the time spent waiting for it to finish. There is more benefit in having files that are used together stored in the same sequence - at least this is one of the tricks MS uses to get Windows to 'boot' faster. Linux's ability to put that 'unused' memory to 'good use' also probably helps minimize fragmentation delays with file access. Harddrive technology probably also makes this less of an issue.

ext3 defragmentation

Posted Apr 25, 2004 11:15 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

ext3 filesystems that are cleanly unmounted are also valid ext2 filesystems and can therefore be defrag'd by any working ext2 defrag application. I last used such a thing in 1998 and it seemed to work fine at that time. I believe there have since been one or perhaps at most two incompatible changes to the ext2 structures that might (not sure) affect defrag programs. You should check that any defrag application you consider is aware of ext2/3's compatibility flags, which would prevent it from modifying a filesystem with features that it doesn't understand.
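As a rough illustration of that kind of check, the sketch below reads the feature words from an ext2/ext3 superblock and refuses to proceed if the filesystem was not cleanly unmounted, still needs journal recovery, or carries features the tool does not know. The field offsets and flag values follow the ext2 on-disk superblock layout as I understand it; the `UNDERSTOOD_*` sets and the function names are hypothetical and stand in for whatever a particular defrag tool actually supports.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the superblock starts 1024 bytes into the device
SUPERBLOCK_SIZE = 1024
EXT2_MAGIC = 0xEF53
STATE_VALID_FS = 0x0001    # set when the filesystem was cleanly unmounted

INCOMPAT_FILETYPE = 0x0002
INCOMPAT_RECOVER = 0x0004  # ext3 journal still needs to be replayed
RO_COMPAT_SPARSE_SUPER = 0x0001
RO_COMPAT_LARGE_FILE = 0x0002

# Hypothetical: the features this imaginary tool knows how to handle.
UNDERSTOOD_INCOMPAT = INCOMPAT_FILETYPE
UNDERSTOOD_RO_COMPAT = RO_COMPAT_SPARSE_SUPER | RO_COMPAT_LARGE_FILE


def read_superblock(path: str) -> bytes:
    """Read the raw superblock from a device or image file."""
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET)
        return dev.read(SUPERBLOCK_SIZE)


def safe_to_modify(superblock: bytes) -> bool:
    """Return True only if a writing tool may touch this filesystem."""
    magic, state = struct.unpack_from("<HH", superblock, 56)
    if magic != EXT2_MAGIC or not state & STATE_VALID_FS:
        return False   # not ext2/ext3, or not cleanly unmounted
    _compat, incompat, ro_compat = struct.unpack_from("<III", superblock, 92)
    if incompat & INCOMPAT_RECOVER:
        return False   # unreplayed journal: moving blocks would be unsafe
    if incompat & ~UNDERSTOOD_INCOMPAT:
        return False   # some feature this tool does not understand
    if ro_compat & ~UNDERSTOOD_RO_COMPAT:
        return False   # a writer must also understand every ro_compat feature
    return True
```

Unknown bits in the plain "compat" word are ignored here, since those features are defined to be safe for older code to overlook; a more cautious tool could reject them as well.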


Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds