Temporary files: RAM or disk?

Posted Jun 4, 2012 8:00 UTC (Mon) by neilbrown (subscriber, #359)
In reply to: Temporary files: RAM or disk? by dvdeug
Parent article: Temporary files: RAM or disk?

> The default policy should be that when I save a file it's saved.

You are, of course, correct.
However, this is a policy that is encoded in your editor, not in the filesystem. And I suspect most editors do exactly that, i.e. they call fsync() before close().

But not every "open, write, close" sequence is an instance of "save a file". It may well be "create a temporary file which is completely uninteresting if I get interrupted". In that case an fsync would be pointless and costly. So the filesystem doesn't force an fsync on every close as the filesystem doesn't know what the 'close' means.

Any application that is handling costly-to-replace data should use fsync. An app that is handling cheap data should not. It is really that simple.
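
As a rough illustration of the distinction drawn above, here is a minimal C sketch. The helper names write_all() and write_file() and the "durable" flag are hypothetical, invented for this example; only open(), write(), fsync() and close() are real POSIX calls. The same "open, write, close" sequence is used for both cases, and the caller decides whether an fsync() is worth its cost.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all of buf to fd, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper. durable != 0 means "this is a save": call fsync()
 * before close(). durable == 0 means "cheap data": skip the fsync().
 * Returns 0 on success, -1 on error.
 */
int write_file(const char *path, const char *buf, size_t len, int durable)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (write_all(fd, buf, len) < 0 || (durable && fsync(fd) < 0)) {
        int saved = errno;   /* keep the original error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

An editor saving a document would call write_file(path, buf, len, 1); a program writing scratch data would pass 0. This sketch leaves out details a real editor would probably want, such as writing to a temporary name and renaming, and syncing the containing directory for a newly created file.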


Temporary files: RAM or disk?

Posted Jun 4, 2012 9:11 UTC (Mon) by dvdeug (subscriber, #10998)

Another choice for a set of semantics would be to make programs that don't want to use a filesystem as a permanent storage area for files specify that. That is, fail safe, not fail destructive. As it is, no C program can portably save a file; fsync is not part of the C89/C99/C11 standards. Many other languages cannot save a file at all without using an interface to C.

I've never seen this in textbooks, and surely it should be front and center in any discussion of file I/O: if you're actually saving user data, you need to use fsync. It's not something you'll see very often in actual code. But should you actually be in a situation where this blows up in your face, it will be all your fault.

Temporary files: RAM or disk?

Posted Jun 4, 2012 9:51 UTC (Mon) by dgm (subscriber, #49227)

It's not in the C standard because it has nothing to do with C itself, but with the underlying OS. You will find fsync() in POSIX, and it's portable as long as the target OS supports POSIX semantics (even Windows used to).

Temporary files: RAM or disk?

Posted Jun 4, 2012 10:24 UTC (Mon) by dvdeug (subscriber, #10998)

What do you mean, nothing to do with C itself? Linux is interpreting C semantics to mean that a standard C program cannot reliably produce permanent files. That's certainly legal, but it means that most people who learn to write C will learn to write code that doesn't reliably produce permanent files. Linux could interpret the C commands as asking for the creation of permanent files and force people who want temporary files to use special non-portable commands.

Temporary files: RAM or disk?

Posted Jun 4, 2012 10:33 UTC (Mon) by andresfreund (subscriber, #69562)

Mount your filesystems with the sync option (in effect, O_SYNC for everything) and see how long you can endure that. Making everything synchronous by default is a completely useless behaviour. *NO* general purpose OS in recent years does that.
Normally you need only very few points where you fsync (or equivalent) and quite some more places where you write data...

Temporary files: RAM or disk?

Posted Jun 4, 2012 11:20 UTC (Mon) by neilbrown (subscriber, #359)

To be fair, O_SYNC is much stronger than what some people might reasonably want to expect.

O_SYNC means every write request is safe before the write system call returns.

An alternate semantic is that a file is safe once the last "close" on it returns. I believe this has been implemented for VFAT filesystems which people sometimes like to pull out of their computers without due care.
It is quite an acceptable trade-off in that context.

This is nearly equivalent to always calling fsync() just before close().

Adding a generic mount option to impose this semantic on any fs might be acceptable. It might at least silence some complaints.
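
To make the two semantics concrete, here is a hedged C sketch of what each looks like when an application asks for it explicitly. The function names are hypothetical; O_SYNC, fsync() and the other calls are standard POSIX. With O_SYNC every write() waits for the data to be safe, while the "safe at close" variant lets writes be buffered and pays for a single fsync() at the end.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `count` copies of `chunk` to `path`.
 * extra_flags is OR-ed into the open() flags; fsync_at_close selects
 * whether fsync() is called once before close().
 * A short write is treated as an error to keep the sketch small.
 */
static int write_chunks(const char *path, const char *chunk, size_t len,
                        int count, int extra_flags, int fsync_at_close)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0644);
    if (fd < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        if (write(fd, chunk, len) != (ssize_t)len) {
            close(fd);
            return -1;
        }
    }

    if (fsync_at_close && fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* O_SYNC: each write() returns only once that write is on stable storage. */
int write_chunks_osync(const char *path, const char *chunk, size_t len, int count)
{
    return write_chunks(path, chunk, len, count, O_SYNC, 0);
}

/* "Safe at close": buffered writes, then one fsync() just before close(). */
int write_chunks_sync_on_close(const char *path, const char *chunk, size_t len, int count)
{
    return write_chunks(path, chunk, len, count, 0, 1);
}
```

For a file written in many small pieces, the first variant forces a wait on every write() and the second only once, which is why the second is the cheaper thing to impose by default.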

Temporary files: RAM or disk?

Posted Jun 4, 2012 12:19 UTC (Mon) by andresfreund (subscriber, #69562)

> To be fair, O_SYNC is much stronger than what some people might reasonably want to expect.
> O_SYNC means every write request is safe before the write system call returns.
Hm. Not sure if that really is what people expect. But I can certainly see why it would be useful for some applications. Should probably be a fd option or such though? I would be really unhappy if a rm -rf or cp -r would behave that way.

Sometimes I wish userspace-controllable metadata transactions were possible with a sensible effort/interface...

Temporary files: RAM or disk?

Posted Jun 4, 2012 16:44 UTC (Mon) by dgm (subscriber, #49227)

Linux does not interpret C semantics. Linux implements POSIX semantics, and C programs use POSIX calls to access those semantics. So this has nothing to do with C, but POSIX.

POSIX offers a tool to make sure your data is safely stored: the fsync() call. POSIX and the standard C library are careful not to make any promises regarding the reliability of writes, because this would mean a burden for all systems implementing those semantics, some of which do not even have a concept of fail-proof disk writes.

Now Linux could choose to deviate from the standard, but that would be exactly the reverse of portability, wouldn't it?
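
For a program written against the standard C library, the usual way to reach the POSIX call looks roughly like the sketch below. The function name save_with_stdio() is hypothetical; fopen(), fputs(), fflush() and fclose() are standard C, while fileno() and fsync() come from POSIX, which is the dividing line this thread is arguing about.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations (fileno, fsync) */
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `text` to `path` using stdio, then make it
 * durable. fflush() moves data from the stdio buffer to the kernel;
 * fsync() asks the kernel to push it to stable storage.
 * Returns 0 on success, -1 on error.
 */
int save_with_stdio(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF ||
        fflush(fp) == EOF ||
        fsync(fileno(fp)) < 0) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

The order matters: calling fsync() without the preceding fflush() could leave data sitting in the stdio buffer, where the kernel has not seen it yet.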

Temporary files: RAM or disk?

Posted Jun 4, 2012 15:37 UTC (Mon) by giraffedata (subscriber, #1954)

> Any application that is handling costly-to-replace data should use fsync. An app that is handling cheap data should not. It is really that simple.

Well, it's a little more complex because applications are more complex than just C programs. Sometimes the application is a person sitting at a workstation typing shell commands. The cost of replacing the data is proportional to the amount of data lost. For that application, the rule isn't that the application must use fsync, but that it must use a sync shell command when the cost of replacement has exceeded some threshold. But even that is oversimplified, because it makes sense for the system to do a system-wide sync automatically every 30 seconds or so to save the user that trouble.

On the other hand, we were talking before about temporary files on servers, some of which do adhere to the fsync dogma such that an automatic system-wide sync may be exactly the wrong thing to do.

Temporary files: RAM or disk?

Posted Jun 4, 2012 23:06 UTC (Mon) by dlang (✭ supporter ✭, #313)

A system-wide sync can take quite a bit of time, and during that time it may block a lot of other activity (or make it so expensive that the system may as well be blocked).

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds