Atime and btrfs: a bad combination?

Posted Jun 1, 2012 4:30 UTC (Fri) by jzbiciak (guest, #5246)
Parent article: Atime and btrfs: a bad combination?

I seem to recall posting a crazy idea about moving atime out to its own separately managed data structure and keeping it out of the inode entirely. (I can't find the comment now, of course.)

But still... atime is broken. It turns reads into writes and is generally just nasty. Furthermore, most things just don't need it.

Here's a totally radical, unsellable idea:

  • Disable all atime updates by default everywhere.
  • Add an extended attribute saying "do atime updates on this file only" (or relatime, if you so choose).
  • For systems that truly need accurate atime everywhere: Create a new mechanism that all filesystems can use just for storing atime. Create a backing store that is highly optimized for atime updates and nothing else. Provide an option to roll atime updates into the filesystem if necessary, but in most cases, allow this parallel structure to manage atime. Everyone else that doesn't need atime updates can ignore that kludgy thing.

Even in the absence of that crazy idea, it still sounds like having atime in the inode allows for trivial bandwidth and storage-size amplification attacks. If you could factor out atime updates to some dedicated on-disk structure that relied more on versioning semantics than COW, you could at least fix btrfs' immediate problem without totally ditching atime. With 8-byte atimes and 8-byte inode numbers (let's say), the 2.2GB quoted in the article is enough space to store 128M atime updates if they were stored like a journal.
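The arithmetic in that last sentence can be checked with a small sketch. It assumes a hypothetical fixed-size journal record (an 8-byte inode number followed by an 8-byte atime in seconds) and reads 2.2GB as 2.2 × 10^9 bytes. The record layout and the function names are made up for illustration; nothing here is how btrfs actually stores anything.

```python
import struct

# Hypothetical journal record: 8-byte inode number + 8-byte atime (seconds).
RECORD = struct.Struct("<QQ")


def journal_capacity(space_bytes):
    """Number of whole records that fit in space_bytes."""
    return space_bytes // RECORD.size


def append_update(journal, inode, atime):
    """Append one atime update to the end of the journal (a bytearray)."""
    journal.extend(RECORD.pack(inode, atime))


def replay(journal):
    """Replay the journal in order; the last record for an inode wins."""
    latest = {}
    for inode, atime in RECORD.iter_unpack(journal):
        latest[inode] = atime
    return latest
```

Under these assumptions `journal_capacity(2_200_000_000)` is 137,500,000 records, a little more than 128M (128 × 2^20 = 134,217,728), so the estimate holds.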


Atime and btrfs: a bad combination?

Posted Jun 1, 2012 4:51 UTC (Fri) by neilbrown (subscriber, #359) (3 responses)

No, please don't disable 'atime'.

I don't use it a lot, but I certainly do use it from time to time to see what files are being accessed. Not a killer feature, but a valuable one.

I'm a big fan of keeping atime in a separate data structure. The liveness properties, stability requirements, and necessary precision are very different from other values in the inode and keeping it together with them is a simplification, not a requirement.

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 5:06 UTC (Fri) by jzbiciak (guest, #5246)

My real point (which your comment echoes) was that atime is so different from just about anything else that, if you want to keep it, it really deserves to be treated rather differently from everything else. And, as the example in the article shows, atime can have real negative consequences even if you largely ignore it most of the time.

It seems to me the other option, if you don't fix atime, is to mitigate it with hacks (relatime -- which doesn't work well for the attack against btrfs shown in the article) or outright disable it everywhere or almost everywhere.

My comment above was perhaps slightly over the top. Sorry for any confusion.

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 14:10 UTC (Fri) by jezuch (subscriber, #52988) (1 responses)

> I don't use it a lot, but I certainly do use it from time to time to see what files are being accessed. Not a killer feature, but a valuable one.

Hah. I disable atime on all my filesystems and the only use I have for it is a side effect: it functions as creation time, which is much more valuable for me than access time :)

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 15:13 UTC (Fri) by jamesh (guest, #1159)

Provided no one goes and updates the atime via utime() or touch.
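A minimal sketch of that caveat, using Python's `os.utime` (a wrapper around the same family of calls that touch uses): an explicit timestamp update takes effect whether or not the filesystem updates atime on reads, so a frozen atime is only a creation time until someone does this. The helper names `set_atime` and `demo` are made up for the example.

```python
import os
import tempfile


def set_atime(path, atime_seconds):
    """Set atime explicitly, leaving mtime as it was."""
    st = os.stat(path)
    os.utime(path, ns=(atime_seconds * 10**9, st.st_mtime_ns))


def demo():
    """Create a scratch file, overwrite its atime, return the new value."""
    with tempfile.NamedTemporaryFile() as f:
        set_atime(f.name, 86400)  # one day after the epoch
        return os.stat(f.name).st_atime_ns
```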

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 7:53 UTC (Fri) by MrWim (subscriber, #47432) (4 responses)

Interesting. It seems to me that the infrastructure for this already exists, so for any particular use case this could be implemented in userspace using inotify and the IN_ACCESS flag. Presumably tmpwatch and tmpreaper could use a mechanism like this, listening for accesses to files in /tmp (tmpwatchd?), or perhaps a daemon could be written which you could ask to collect atime-like information for a particular directory hierarchy. Potentially mutt could use the same mechanism.

One nice property of this solution is that reads being writes is now explicit: if the disk runs out of space, read() isn't going to fail, and the failure mode can instead be implemented in the watching daemon.

Potentially you could take the dconf-like design, where a convenient atime API is provided such that atime can be read synchronously by mapping the atime "database" read-only into the process that is reading it, whereas the atimed process would be the only process with write access to this file.
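A rough sketch of the database half of this idea follows. Everything in it is an assumption made for illustration: the fixed-size slot layout, the linear-probing lookup, the treatment of inode number 0 as "empty slot", and the function names. The inotify side is left out, since Python's standard library has no inotify binding; `record_access` stands in for what a hypothetical atimed would do on each IN_ACCESS event, and it ignores the synchronization between writer and readers that a real daemon would need.

```python
import mmap
import struct

# Hypothetical on-disk slot: (inode number, atime in seconds). Key 0 = empty.
SLOT = struct.Struct("<QQ")
N_SLOTS = 1024


def create_db(path):
    """Create an empty, zero-filled database file."""
    with open(path, "wb") as f:
        f.truncate(N_SLOTS * SLOT.size)


def _find(buf, inode):
    """Byte offset of the slot holding inode, or of the first free slot."""
    start = inode % N_SLOTS
    for i in range(N_SLOTS):
        off = ((start + i) % N_SLOTS) * SLOT.size
        key, _ = SLOT.unpack_from(buf, off)
        if key in (0, inode):
            return off
    raise RuntimeError("atime database is full")


def record_access(db_path, inode, atime):
    """Writer side: only the daemon maps the file read-write."""
    with open(db_path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        SLOT.pack_into(m, _find(m, inode), inode, atime)


def read_atime(db_path, inode):
    """Reader side: map read-only and look the inode up synchronously."""
    with open(db_path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        key, atime = SLOT.unpack_from(m, _find(m, inode))
        return atime if key == inode else None
```

Readers never need write permission on the file, which is the property the dconf comparison is after; a real implementation would keep the mapping open rather than remapping on every call.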

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 12:45 UTC (Fri) by ablock (guest, #84933)

I really like that idea. This would allow Linux to completely get rid of atime in the filesystem code, or at least to mount with noatime by default.

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 13:13 UTC (Fri) by jzbiciak (guest, #5246) (2 responses)

Of course, here's where it breaks: How do you export that over NFS? I guess nfsd would also need to talk to that infrastructure.

My suggestion about doing it at the kernel level is that you retain the userspace ABI, and there's never an application that breaks because you've radically changed where atime gets monitored and recorded.

I guess you could provide a FUSE-like mechanism to hook userspace back up to 'stat'.

Atime and btrfs: a bad combination?

Posted Jun 3, 2012 22:39 UTC (Sun) by MrWim (subscriber, #47432) (1 responses)

To export it over NFS you would just export the atime database over NFS. The daemon would only run on the server and clients would access the database file directly (read-only). This would work just as well as the local case (i.e. well if your applications are atimed-aware, and not at all for non-atimed-aware apps which care about atime).

As you say, a FUSE filesystem could be offered to preserve the stat() interface, but it would probably be less work to get the existing apps which use atime to use some new library. The only examples I ever hear of are tmpwatch and mutt.

Atime and btrfs: a bad combination?

Posted Jun 7, 2012 12:29 UTC (Thu) by MrWim (subscriber, #47432)

On second thought this is probably way more complicated than it needs to be. In particular this generic atimed approach introduces a whole bunch of synchronization complexity. It would probably be easier to patch mutt to update atime explicitly and introduce a bespoke tmpwatchd explicitly for the /tmp case.
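For a sense of how little state a bespoke tmpwatchd would need, here is a sketch of just its bookkeeping. The class and method names are hypothetical, and the event source is assumed: something (inotify, presumably) would call `on_create`, `on_access` and `on_delete` for files under /tmp, and a periodic sweep would remove whatever `stale` returns.

```python
class TmpWatchState:
    """Last-access bookkeeping for a hypothetical tmpwatchd."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.last_access = {}  # path -> time of most recent access

    def on_create(self, path, when):
        # A new file counts as accessed at creation time.
        self.last_access.setdefault(path, when)

    def on_access(self, path, when):
        # Keep the newest timestamp even if events arrive out of order.
        previous = self.last_access.get(path, when)
        self.last_access[path] = max(previous, when)

    def on_delete(self, path):
        self.last_access.pop(path, None)

    def stale(self, now):
        """Paths not accessed within the last max_age seconds."""
        return sorted(
            path for path, when in self.last_access.items()
            if now - when > self.max_age
        )
```

None of this touches the inode, so /tmp could be mounted noatime and the cleanup policy would still work.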

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 9:41 UTC (Fri) by bergwolf (guest, #55931) (1 responses)

Enabling atime on a per-file/directory basis sounds good. mutt is already broken anyway on relatime mounts. Adding such an interface would allow applications like mutt to work again without having to tweak whole-filesystem performance.

Atime and btrfs: a bad combination?

Posted Jun 5, 2012 10:25 UTC (Tue) by mgedmin (subscriber, #34497)

How is Mutt broken by relatime? I'm using it this way every day, with mbox files on ext4 mounted with relatime. New mail notification works reliably.

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 18:48 UTC (Fri) by Yorick (guest, #19241) (4 responses)

Your suggestions sound eminently sensible to me, although I would settle for the first item and call it a day. More fundamentally, the article author seems to think that it is a problem that atime is slow on btrfs. Quite the contrary: it is excellent news, especially since it seems to be caused by the basic btrfs design principles (so it is hard to "fix").

In fact, once most people agree that there is no reason whatsoever not to mount everything with noatime, we can drop it altogether and start reaping the benefits. All operations are faster, the code becomes simpler, and we can put the now free space in inodes (both on disk and in memory) to more productive use. It is difficult to see any costs here—what would break? finger?

Then, once that has been taken care of, we can go on dealing with some other part of the baroque Unix legacy. Remove 99 % of the TTY options, perhaps? We can start slowly, by taking away the one that converts lower to upper case, and see if anyone notices.

Atime and btrfs: a bad combination?

Posted Jun 1, 2012 21:40 UTC (Fri) by jzbiciak (guest, #5246)

<stty olcuc>

WE CAN START SLOWLY, BY TAKING AWAY THE ONE THAT CONVERTS LOWER TO UPPER CASE, AND SEE IF ANYONE NOTICES.

I'M SURE ANYONE WHO MIGHT COMPLAIN WILL DO SO VERY LOUDLY.

<STTY -OLCUC>

Atime and btrfs: a bad combination?

Posted Jun 4, 2012 12:59 UTC (Mon) by nix (subscriber, #2304) (2 responses)

And with every change you lose a bit of your userbase. Before you know it you end up with not very much userbase left at all.

Atime and btrfs: a bad combination?

Posted Jun 4, 2012 14:50 UTC (Mon) by hummassa (subscriber, #307)

You made me smile.

One of our problems as developers is exactly that: one does not simply take features away. Lots of systems made me bury them exactly by trying to take "my" features (the ones I used and cared for and needed) away.

Atime and btrfs: a bad combination?

Posted Jun 5, 2012 12:19 UTC (Tue) by Yorick (guest, #19241)

I'm going to assume you mean atime specifically, and not olcuc or anything else (you would have a hard time arguing for that one).

To remove old cruft, a good start is quarantine. Simply don't implement atime in new file systems (btrfs); people who need it for their business-critical fingerd can run UFS or ext2 or something else. The important part is that we don't let use of a bad feature spread, since that is only going to make it harder to get rid of.

Instead of making code worse for everyone for the (dubious) benefit of a vocal minority of cavemen, deal with the problem head-on. Give them a chance to adapt - help them all you can - but set a firm date for when the coddling stops.

Atime and btrfs: a bad combination?

Posted Jun 5, 2012 11:42 UTC (Tue) by roblucid (guest, #48964)

No, atime is NOT broken!!
The problem is filesystem designers not implementing it well.

Rather than whinge about it on LKML (kernel developers & casual enthusiasts aren't the ppl who find atimes useful), implementing atimes better ought to be the focus of discussion. Ppl think about advanced features like snapshotting and ignore the basic POSIX requirements.

atime doesn't need synchronous update guarantees; in real use the fuzzy relatime (better a 23hr minimum update interval than 24, to be predictable for daily jobs) would likely be adequate. If your FS can't stand some inode info updates during reading, then it is what is broken, not the spec.
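To make the 23hr-versus-24hr point concrete, here is a sketch of the relatime decision as it is usually described: update atime on a read if the stored atime is not newer than mtime or ctime, or if the stored atime is at least a day old. The `interval` parameter is an assumption added for the example; as far as I know Linux hard-codes 24 hours rather than making it tunable.

```python
DAY = 24 * 60 * 60
HOUR = 60 * 60


def relatime_needs_update(atime, mtime, ctime, now, interval=DAY):
    """Decide whether a read at time `now` should write a new atime."""
    if mtime >= atime:
        return True  # file was modified since it was last read
    if ctime >= atime:
        return True  # inode was changed since it was last read
    return now - atime >= interval  # stored atime is simply too old


def daily_job_example(interval):
    """A daily job that happens to run 10 seconds early on day two."""
    atime = 1_000_000               # yesterday's read set atime here
    mtime = ctime = atime - 1_000   # file untouched since before that read
    now = atime + DAY - 10
    return relatime_needs_update(atime, mtime, ctime, now, interval)
```

With the 24-hour interval, `daily_job_example(DAY)` is False: the second run lands just inside the window and atime is only refreshed every other day. With `daily_job_example(23 * HOUR)` it is True on every run.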
