Temporary files: RAM or disk?
Posted Jun 4, 2012 10:44 UTC (Mon) by Serge (guest, #84957)
Posted Jun 4, 2012 23:05 UTC (Mon) by dlang (✭ supporter ✭, #313)
then do an fsync on some file in a different directory (with a small change to the file)
watch the fsync take a long time to complete on ext3, and almost no time on any other filesystem.
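The fsync half of that test can be sketched roughly as below. This is only an illustration, not the original poster's procedure: the helper name `time_fsync` and the probe path are hypothetical, and whatever background write load is meant to precede it has to be started separately.

```python
import os
import tempfile
import time


def time_fsync(path):
    """Append one byte to `path`, fsync it, and return the fsync time in seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, b"x")           # the "small change to the file"
        start = time.monotonic()
        os.fsync(fd)                 # blocks until the kernel reports the data flushed
        return time.monotonic() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Assumption: for a meaningful number, replace this temporary directory
    # with a directory on the filesystem under test (it may be tmpfs here).
    with tempfile.TemporaryDirectory() as d:
        elapsed = time_fsync(os.path.join(d, "probe"))
        print(f"fsync took {elapsed:.3f}s")
```

Running it once on an idle system and once while another process is writing heavily to the same filesystem gives the two numbers to compare.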
because of this behavior, users and sloppy programmers have been conditioned to expect that fsync calls make their programs pause unexpectedly for a potentially long time (I think I've seen Ted Ts'o report that he's seen delays longer than 30 seconds)
If you go back to the blog posts about fsync and data reliability from when people were claiming that ext4 was eating their KDE configuration data, you will see detailed discussions about this.
Posted Jun 5, 2012 5:24 UTC (Tue) by Serge (guest, #84957)
Thanks for a detailed description.
I did not notice a major difference on my system; perhaps my HDD is too fast for that (100MB/s, ordinary rotating HDD, no RAID, no LVM). I thought this "bug" was fixed a few years ago with Linus's blessing, btw.
But the question is: how is this related to /tmp? Nobody fsync()s files in /tmp. Small short-lived files won't trigger it either, since there's no real chance they are being written at the moment of the fsync (and even if they are, you won't notice the difference, because they're small). So there must be a program writing a large file in /tmp, large enough that an fsync on another partition is noticeably delayed. But a large file will hit the dirty_*ratio thresholds (vm.dirty_ratio, vm.dirty_background_ratio) and start being written to disk, so it would not delay fsync() much anyway.
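For reference, the writeback thresholds mentioned here can be read from `/proc/sys/vm` on Linux. A minimal sketch, where the function name `read_dirty_thresholds` is made up for illustration:

```python
import os


def read_dirty_thresholds(base="/proc/sys/vm"):
    """Return the dirty-page thresholds (percent) or None where unreadable."""
    result = {}
    for name in ("dirty_ratio", "dirty_background_ratio"):
        try:
            with open(os.path.join(base, name)) as f:
                result[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Not Linux, no /proc, or unexpected file contents.
            result[name] = None
    return result


if __name__ == "__main__":
    print(read_dirty_thresholds())
```

Background writeback starts once dirty pages exceed `dirty_background_ratio`; writers are throttled once they exceed `dirty_ratio`.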
BTW, even if fsync was delayed, in what application would you notice the delay? What I'm trying to say is that I can't think of any real-world use case for which /tmp on ext3 is a bad fit.
> I think I've seen Ted Ts'o report that he's seen delays longer than 30 seconds
Technically it may be possible (if it had still not been fixed 3 years ago). You'd need a machine with a lot of RAM and a very slow HDD, then increase dirty_*ratio to 90%, write a few GB, and call fsync(). But that would prove little, because it doesn't correspond to any real-world use case.
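A back-of-the-envelope version of that scenario, as a rough sketch only: it assumes the fsync has to wait for all pending dirty data to be written at a constant sequential rate, and it treats the dirty limit as a percentage of total RAM (the kernel actually applies it to available memory). Both function names are hypothetical.

```python
def dirty_limit_bytes(ram_bytes, dirty_ratio_percent):
    """Approximate cap on dirty page-cache data for a given ratio."""
    return ram_bytes * dirty_ratio_percent // 100


def worst_case_fsync_delay(dirty_bytes, disk_bytes_per_sec):
    """Seconds needed to flush `dirty_bytes` at a constant disk throughput."""
    if disk_bytes_per_sec <= 0:
        raise ValueError("disk throughput must be positive")
    return dirty_bytes / disk_bytes_per_sec


if __name__ == "__main__":
    MB = 10**6
    GB = 10**9
    # 3 GB of pending writes on a 100 MB/s disk:
    print(worst_case_fsync_delay(3 * GB, 100 * MB))  # 30.0
```

Under this simplified model, a 30-second stall on a 100MB/s disk needs about 3 GB of dirty data queued ahead of the fsync.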
> If you go back to the blog posts about fsync and data reliability from when people were claiming that ext4 was eating their KDE configuration data, you will see detailed discussions about this.
Those were different, ext4-specific problems: recently modified files lost on a crash (a usual thing, actually, and an official XFS "feature"). They were not related to fsync(), and definitely not related to /tmp, as far as I remember. And those were fixed a few years ago anyway.
Posted Jun 5, 2012 7:12 UTC (Tue) by dlang (✭ supporter ✭, #313)
something along the lines that the ext3 journal doesn't know what blocks are related to the metadata, so to avoid revealing old data that may be on disk, the filesystem is required to flush all pending writes.
the XFS/ext4 'problem' you refer to is the way every filesystem other than ext3 works. If you don't do an fsync, the data isn't safe and a crash at the wrong time can give you grief.
the functionality that makes this less of a problem on ext3 is the same functionality that makes it behave so horribly when you do an fsync
this is good for crash-prone desktop systems running software that wasn't written to be crash safe (note that it doesn't make the systems safe, it just reduces the probability of data loss)
but if you are running any software that is written to be crash safe, ext3 is about the worst filesystem you could use (in some cases worse than ext2 or other non-journaled filesystems).
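For readers wondering what "written to be crash safe" usually means for something like a configuration file, one common pattern is write-to-temporary, fsync, then rename. The sketch below is a general illustration of that pattern on a POSIX system, not code from any of the programs discussed; the name `atomic_write` and the `.tmp` suffix are hypothetical.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so a crash leaves either the old or new file."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"         # hypothetical temporary name
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()                    # push Python's buffer to the kernel
        os.fsync(f.fileno())         # push the kernel's copy to stable storage
    os.replace(tmp_path, path)       # atomic rename over the old file
    # Assumption: POSIX; fsync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "settings.conf")
        atomic_write(target, b"key=value\n")
        print(open(target, "rb").read())
```

The fsync before the rename is the step that pays the latency cost described above.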
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds