ext4 and data consistency

Posted May 13, 2010 20:45 UTC (Thu) by quotemstr (subscriber, #45331)
In reply to: ext4 and data consistency by drag
Parent article: The Next3 filesystem

> The difference is that due to a fluke of Ext3's design the window that the 'zero length files' would be created on improper shutdown is much shorter than the same window for Ext4 (or XFS or whatever)

That window must be vanishingly small because neither I nor anyone else has ever been able to make ext3 create zero-length files in the way you describe. Quirk or not, rename atomicity is an important feature that works just fine on a running filesystem, and filesystems ought to preserve its qualities on a restart. Allowing random garbage to exist on the filesystem after a restart is terrible policy and reflects a profound ignorance on the part of filesystem developers as to how applications and users expect their systems to work.



ext4 and data consistency

Posted May 13, 2010 23:07 UTC (Thu) by njs (guest, #40338)

And ext4 now has rename atomicity over crashes. I also think that this is the right decision, but I wince when I see people tear into filesystem developers over this; if anything, it seems to reflect a profound ignorance of the difficulty of the trade-offs fs developers have to make, the disparity between what people want from a fs and what fs's have historically provided, etc. Keep in mind that if you go two web-pages over, you can find people tearing into POSIX for providing *too* strong guarantees and how we absolutely need to relax them for real-world usage (atime is the obvious example, but there are others). So I can hardly blame fs developers for being *cautious* about introducing strong *new* guarantees.

ext4 and data consistency

Posted May 14, 2010 13:43 UTC (Fri) by anton (subscriber, #25547)

> [...] trade-offs fs developers have to make, the disparity between what people want from a fs and what fs's have historically provided, etc.

Yes, different people expect different things from file systems.

E.g., I expect data consistency from a file system; Linux file systems don't give any guarantee on that, but at least ext3 does OK in most cases; some people may consider this a fluke (but is Stephen Tweedie, the creator of ext3, among them?), but that's the reality.

Other people expect maximum speed. And for these people Linux provides tmpfs and ext4.

Given this choice, ext4 is certainly not a replacement for ext3 for me.

> Keep in mind that if you go two web-pages over, you can find people tearing into POSIX for providing *too* strong guarantees and how we absolutely need to relax them for real-world usage (atime is the obvious example, but there are others).

Yes, there are different kinds of users. I lost quite a bit of time because Linux does not follow POSIX atime semantics by default anymore. I find them useful in my real-world usage. Those who don't want atime have been able to use noatime for a long time, and now there is relatime, but making it the default (especially with mounts that don't know about strictatime) is a bad practice.

ext4 and data consistency

Posted May 14, 2010 15:53 UTC (Fri) by bronson (subscriber, #4806)

What on earth do you use atime for? Personally, the last time I ever needed to worry about atime was in the 1990s, and it was very easy to replace.

ext4 and data consistency

Posted May 15, 2010 8:36 UTC (Sat) by anton (subscriber, #25547)

I use atime to check whether some complex software really does access the files that I think it does.

ext4 and data consistency

Posted May 20, 2010 19:23 UTC (Thu) by oak (guest, #2786)

> I use atime to check whether some complex software really does access the files that I think it does.

Wouldn't "strace -f" be handier for that kind of thing? With that you notice also a lot of other stuff that the SW does.

Strace-account script gives an overview of file accesses in the strace output:
http://blogs.gnome.org/mortenw/2005/12/14/strace-account/

ext4 and data consistency

Posted May 21, 2010 12:04 UTC (Fri) by anton (subscriber, #25547)

It would not be handier exactly because it tells me a huge amount of other stuff the software does and that I am not interested in.

ext4 and data consistency

Posted Jun 8, 2010 22:17 UTC (Tue) by elanthis (guest, #6227)

Meet grep. Grep is your friend. Grep can make your life much easier. Grep is here to help you.
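
For readers who want to try the strace-plus-filter approach suggested here, the following is a minimal sketch in Python rather than grep. It assumes the usual textual strace output format for `open`, `openat` and `creat` calls; the function name `opened_paths` and the sample lines are hypothetical and were written by hand, not captured from a real run.

```python
import re

# Matches the first quoted argument of open(), openat() or creat()
# as strace prints them, e.g. openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3
OPEN_CALL = re.compile(r'\b(?:open|openat|creat)\(.*?"((?:[^"\\]|\\.)*)"')

def opened_paths(strace_lines):
    """Return the paths that were successfully opened, in first-seen order."""
    paths = []
    for line in strace_lines:
        if "= -1 " in line:
            continue  # skip failed calls such as ENOENT
        match = OPEN_CALL.search(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths

if __name__ == "__main__":
    # Hand-written sample lines in strace's style (illustrative only).
    sample = [
        'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
        '[pid  1234] open("/etc/hostname", O_RDONLY) = 3',
        'openat(AT_FDCWD, "/missing", O_RDONLY) = -1 ENOENT (No such file or directory)',
    ]
    for path in opened_paths(sample):
        print(path)
```

This only looks at open-style calls, so accesses through other system calls (for example `stat` or `execve`) would need their own patterns.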

ext4 and data consistency

Posted Jun 9, 2010 9:06 UTC (Wed) by anton (subscriber, #25547)

And how is that handier than just doing "stat <file>"?
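
The atime check described in this subthread can also be scripted. The sketch below is one possible way to do it in Python, under the assumption that the filesystem is mounted so that reads actually update atime (for example with strictatime); the helper name `was_accessed` is hypothetical.

```python
import os

def was_accessed(path, action):
    """Run action() and report whether the access time of path advanced.

    The answer is only meaningful if the mount updates atime on reads;
    with noatime it never changes, and with relatime it changes only
    in some cases.
    """
    before = os.stat(path).st_atime_ns
    action()  # e.g. a callable that runs the software under test
    after = os.stat(path).st_atime_ns
    return after > before
```

A caller would pass a callable that runs the program being investigated. Two accesses that fall within the same timestamp granularity cannot be told apart this way.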

ext4 and data consistency

Posted May 14, 2010 17:33 UTC (Fri) by njs (guest, #40338)

I'm not aware of any common filesystem that provides "data consistency" in any coherent sense, unless you do weird things like mount -o sync. Speed is too important -- Stephen Tweedie didn't make data=journal the default, either. At most you get guarantees in particular situations -- e.g., both ext3 and ext4 guarantee that a rename will not be committed to disk until writes to the file being renamed have been committed to disk. They even both try to guarantee that programmers who do horrible things like truncating the file and *then* rewriting it are somewhat protected from their incompetence.
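
As an illustration of the replace-via-rename pattern discussed above (as opposed to truncating a file and then rewriting it), here is a minimal sketch in Python for a POSIX system. The function name `atomic_write` is hypothetical, and the explicit fsync calls are a conservative assumption for applications that do not want to rely on filesystem-specific behaviour; they are not something the filesystems mentioned here require.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace path with data so that readers see the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new data to stable storage
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself is recorded (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that `tempfile.mkstemp` creates the file readable and writable only by its owner, so a real application may want to copy the permissions of the file it replaces.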

But maybe there are other cases where ext3 does better than ext4. You must have some excellent ones in mind to lump ext4 in with tmpfs... can you give any examples?

ext4 and data consistency

Posted May 15, 2010 9:19 UTC (Sat) by anton (subscriber, #25547)

> Speed is too important

For whom? For me data consistency is much more important. Before barriers were supported, we ran ext3 on IDE disks without write caching, and that's really slow. The file system was still fast enough.

> Stephen Tweedie didn't make data=journal the default, either.

Actually he did, at least at the start. Later it got changed (by whom?) to data=ordered; that still has the potential to provide data consistency unless existing files are overwritten.

As for an example: Consider a process writing file A and then file B. With ext4 I expect that it can happen that after recovery B is present and A is not or is empty. With ext3 I expect that this does not happen. But given that I did not find any documented guarantees in Documentation/filesystems/ext3.fs, maybe we should lump ext3 with tmpfs, too.
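
For comparison, an application that does not want to depend on the filesystem for this ordering can enforce it explicitly. The following is a minimal sketch in Python for a POSIX system; the helper names `write_durably` and `write_in_order` are hypothetical, and the approach (fsync of A and its directory before B is written) is an assumption about what such an application might do, at the speed cost debated in this thread.

```python
import os

def write_durably(path, data):
    """Write data to path and wait until file and directory entry are on disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # file contents reach stable storage
    # fsync the containing directory so the new name is recorded too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

def write_in_order(path_a, data_a, path_b, data_b):
    """Write A, then B, so that B is never on disk without A."""
    write_durably(path_a, data_a)  # returns only after A has been synced
    write_durably(path_b, data_b)
```

With this ordering, a crash can leave A without B, but should not leave B present while A is missing or empty.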

Still, my search brought up a Linux file system that gives guarantees: In nilfs2.txt it says:

order=strict	Apply strict in-order semantics that preserves sequence
		of all file operations including overwriting of data
		blocks.  That means, it is guaranteed that no
		overtaking of events occurs in the recovered file
		system after a crash.
Yes, that's exactly the guarantee I want to see. This means that any application that keeps its files consistent as visible from other processes will also have consistent files after an OS crash.

ext4 and data consistency

Posted May 16, 2010 3:57 UTC (Sun) by njs (guest, #40338)

> For whom? For me data consistency is much more important

That's fine. I'd like data consistency too. But I still don't mount my disks with -o sync, nor does pretty much anyone else, even most of the people who say they want data consistency. That's the reality that fs developers live in.

Maybe on SSD (where nilfs2 is designed to live), we'll be able to get guaranteed data consistency as a matter of course. That'll be nice if it happens.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds