Deferring mtime and ctime updates

Posted Aug 23, 2013 18:45 UTC (Fri) by bfields (subscriber, #19510)
In reply to: Deferring mtime and ctime updates by dlang
Parent article: Deferring mtime and ctime updates

You're suggesting ensuring that any pending ctime/mtime/change attribute updates be committed to disk before responding to an nfs stat? I'm not sure that's practical.


Deferring mtime and ctime updates

Posted Aug 24, 2013 1:31 UTC (Sat) by dlang (subscriber, #313) [Link]

Remember that the NFS spec requires that any write to an NFS volume be safe on disk before the write completes.

This requires an fsync after every write, which absolutely kills performance (unless you avoid ext3 and you have NVRAM or battery-backed cache to write to). Updating the attribute at the same time seems to be required by the standard.

Now, many people configure their systems outside the standard, accepting less data reliability in the name of performance, but if you are trying to provide all of the NFS guarantees, you need to update the timestamp after every write.

This is why it's a _really_ bad idea to put things like Apache logs on NFS, unless you have a server with a lot of NVRAM to buffer your writes, and even then it's questionable.
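To make the cost of "an fsync after every write" concrete, here is a minimal local sketch in Python. It is not NFS code; it only contrasts syncing after each write with syncing once at the end. The helper names `write_chunks` and `demo` and the file names are made up for illustration.

```python
import os
import tempfile


def write_chunks(path, chunks, sync_each):
    """Write chunks to path and return how many fsync() calls were issued."""
    syncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            os.write(fd, chunk)
            if sync_each:
                # Force this write to stable storage before continuing.
                os.fsync(fd)
                syncs += 1
        if not sync_each:
            # Deferred mode: one flush covers all of the writes above.
            os.fsync(fd)
            syncs += 1
    finally:
        os.close(fd)
    return syncs


def demo():
    chunks = [b"x" * 4096] * 8
    with tempfile.TemporaryDirectory() as d:
        per_write = write_chunks(os.path.join(d, "stable"), chunks, True)
        at_end = write_chunks(os.path.join(d, "deferred"), chunks, False)
    return per_write, at_end
```

Eight writes cost eight flushes in the first mode and one in the second. On rotating storage without a non-volatile write cache, each of those flushes is typically the expensive part.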

Deferring mtime and ctime updates

Posted Aug 24, 2013 2:18 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

I… think that reminding the maintainer of the kernel NFS server how NFS works might be a touch unnecessary.

Deferring mtime and ctime updates

Posted Aug 24, 2013 2:28 UTC (Sat) by dlang (subscriber, #313) [Link]

Could be (and for the record, I didn't recognize that was who he was), but I've seen people manage to miss obvious things in their own area of expertise before (and I've done it myself).

If I'm wrong about my understanding of what NFS requires, I'm interested in being corrected; I'll learn something and be in a better position to set up or troubleshoot in the future.

David Lang

Deferring mtime and ctime updates

Posted Aug 24, 2013 19:45 UTC (Sat) by bfields (subscriber, #19510) [Link]

No problem, I can overlook the obvious....

But as jlayton says, what you describe is not the typical case for NFS since v3, and reverting to NFSv2-like behavior would be a significant regression in some common cases.

And on a quick check.... I think the Linux v4 client, as one example, does request the change attribute on every write (assuming it doesn't hold a delegation), so under that scheme the server would be forcing a commit to disk on every write.

Deferring mtime and ctime updates

Posted Aug 24, 2013 20:11 UTC (Sat) by dlang (subscriber, #313) [Link]

Ok, I wasn't aware that newer versions of NFS had relaxed the standard. (I've been dealing with NFS for a while, but for the last 10 years or so it's either been with home-grade machines that I didn't expect great performance from, or with high-end EMC/NetApp devices that include a lot of NVRAM to handle writes fast anyway.)

Just so I can check that I've got the use cases correct, my understanding is that we have the following cases:

1. no NFS: ctime and mtime updates can be deferred

2. NFSv2 in use: all writes are synchronous and ctime/mtime updates should be as well.

3. NFSv3+ in use: writes can be delayed (which should include ctime/mtime updates), unless the client says they can't, in which case NFSv2 rules apply

It seems to me that having mount options like relctime or relmtime, where the timestamp gets written out when the file is closed or munmap()ed, when an fsync is done, or sooner if the kernel feels like it, should work (assuming NFS does flushes).

The only gap I can see is when the writes to the file are being done locally (mmap, for example): then the writes may not be visible to NFS clients immediately. But if this is a mount option like relatime is, people who care about this case just don't use the mount option and get the old (slower but reliable) mode.
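The mount-option idea above can be sketched as a toy model in Python. Everything here is hypothetical: `ToyInode` is not a kernel structure, and "relmtime" is only the name proposed in this comment, not an existing option. The model keeps the newest mtime in memory and writes it back on fsync or close.

```python
class ToyInode:
    """Toy model of the proposed (hypothetical) relmtime-style behavior."""

    def __init__(self):
        self.mtime_mem = 0          # newest timestamp, held in memory
        self.mtime_disk = 0         # timestamp as last written to disk
        self.timestamp_writes = 0   # how often the on-disk copy was updated

    def _writeback(self):
        # Only touch the disk if the in-memory timestamp is newer.
        if self.mtime_disk != self.mtime_mem:
            self.mtime_disk = self.mtime_mem
            self.timestamp_writes += 1

    def write(self, now, deferred):
        self.mtime_mem = now
        if not deferred:
            # Old behavior: every write also pushes the timestamp out.
            self._writeback()

    def fsync(self):
        self._writeback()

    def close(self):
        self._writeback()


def run(deferred):
    inode = ToyInode()
    for now in (1, 2, 3):
        inode.write(now, deferred)
    before_close = inode.mtime_disk
    inode.close()
    return before_close, inode.mtime_disk, inode.timestamp_writes
```

With `deferred=True` the on-disk mtime stays stale until the close and is written once. With `deferred=False` it is written on every write. The stale window before the close is the gap mentioned above for anything that reads the on-disk value.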

Deferring mtime and ctime updates

Posted Aug 24, 2013 11:23 UTC (Sat) by jlayton (subscriber, #31672) [Link]

> remember that the NFS spec requires that any writes to a NFS volume must
> be safe on disk before the write completes. This requires a fsync after
> every write, which absolutely kills performance (unless you avoid ext3 and
> you have NVRAM or battery backed cache to write to), updating the
> attribute at the same time seems to be required by the standard.

That was true for NFSv2, but NFSv3 and later allow you to do UNSTABLE writes. Those don't need to be written to stable storage until the client issues a COMMIT (though the server is free to write them out earlier if it needs to). Most clients (Linux's included) will use UNSTABLE writes for the bulk of the writes that they do. STABLE (NFSv2-ish) writes are still used in some cases, but that's only where we deem that it's more efficient to do it that way.
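As a rough illustration of the difference, here is a toy Python model of a server that accepts both kinds of write. It is a sketch under simplifying assumptions, not the Linux NFS server: `ToyServer` and its fields are invented, the stable flag is reduced to a boolean, and "disk" is just a dictionary.

```python
class ToyServer:
    """Toy model of UNSTABLE writes followed by COMMIT."""

    def __init__(self):
        self.cache = {}    # offset -> data accepted but not yet stable
        self.disk = {}     # offset -> data on (pretend) stable storage
        self.flushes = 0   # number of times we had to go to stable storage

    def write(self, offset, data, stable):
        if stable:
            # NFSv2-style: data must be stable before we reply.
            self.cache.pop(offset, None)
            self.disk[offset] = data
            self.flushes += 1
            return "FILE_SYNC"
        # UNSTABLE: reply at once; data may sit in cache until COMMIT.
        self.cache[offset] = data
        return "UNSTABLE"

    def commit(self):
        # COMMIT: everything previously written UNSTABLE becomes stable.
        if self.cache:
            self.disk.update(self.cache)
            self.cache.clear()
            self.flushes += 1


def count_flushes(stable, nwrites=4):
    server = ToyServer()
    for i in range(nwrites):
        server.write(i * 4096, b"x" * 4096, stable)
    server.commit()
    return server.flushes, len(server.disk)
```

In this model four UNSTABLE writes plus one COMMIT cost a single flush, while four stable writes cost four. A real server may also write UNSTABLE data out before the COMMIT arrives, which the model leaves out.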


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds