
Once upon atime

By Jonathan Corbet
August 8, 2007
Among the metadata maintained by most filesystems is the last file access time, or "atime." This time can be a useful value to have - it lets an administrator (or a program) know when a file was last used. There is a strong downside to this feature, though: it forces a write to the disk every time a file is accessed. So read-only operations, which might have been satisfied entirely from cache, turn into filesystem writes to keep the atime value up to date.

A recent discussion on write throttling turned to atime after Ingo Molnar pointed out that atime was probably a bigger performance problem than just about everything else. He went on to say:

Atime updates are by far the biggest IO performance deficiency that Linux has today. Getting rid of atime updates would give us more everyday Linux performance than all the pagecache speedups of the past 10 years, _combined_.

He also claimed that it was "perhaps the most stupid Unix design idea of all times".

Such discussion leads quickly to the question of what should be done about this old situation. One step that any Linux user can take now is to mount filesystems with the noatime option, which turns off the tracking of access times. For filesystem-intensive tasks, the performance reward can be immediately apparent. Unfortunately, turning off atime unconditionally will occasionally break software. Some mail tools will compare modification and access times to determine whether there is unread mail or not. The tmpwatch utility and some backup tools also use atime and can misbehave if atime is not correct. For this reason, distributors tend not to make noatime the default on installed systems.
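
The mail-tool breakage is easier to see in code. What follows is a minimal sketch, not taken from any particular mail program: the helper names mailbox_has_unread() and check_mailbox() are made up for illustration, and the rule "modified more recently than read" is an assumption about how such tools typically decide.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: treat a single-file mailbox as holding unread mail
 * when it was modified (mail delivered) more recently than it was read.
 */
static bool mailbox_has_unread(const struct stat *st)
{
        return st->st_mtime > st->st_atime;
}

/* Stat the mailbox at path; returns -1 on error, otherwise 0 or 1. */
static int check_mailbox(const char *path)
{
        struct stat st;

        if (stat(path, &st) != 0)
                return -1;
        return mailbox_has_unread(&st) ? 1 : 0;
}
```

On a noatime filesystem, reading the mailbox no longer advances st_atime, so after the first delivery a check like this would keep reporting unread mail until something else updates the access time.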

Another approach was added in 2.6.20: the relatime mount option. If this flag is set, access times are only updated if they are (before the update) earlier than the modification time. This change allows utilities to see if the current version of a file has been read, but still cuts down significantly on atime updates. This option is not heavily used, perhaps because few people have heard of it and many distributions lack a version of mount which is new enough to know about it. Using relatime can still confuse tools which want to ask questions like "has this file been accessed in the last week?"
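
The relatime rule described above can be sketched as a single comparison. This is a user-space illustration only, not the kernel's code; the function name is hypothetical, and treating equal timestamps as "needs update" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Sketch of the relatime decision: write a new atime only when the stored
 * atime does not postdate the last modification. Equal timestamps are
 * treated as needing an update (an assumption of this sketch).
 */
static bool relatime_should_update(time_t atime, time_t mtime)
{
        return atime <= mtime;
}
```

With this rule, the first read after a write updates atime, and every later read of the unchanged file is skipped. That is why a question like "has this file been accessed in the last week?" can get the wrong answer.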

To fix that problem, Linus suggested a tweak to how relatime works: update it if the current value is more than a certain time in the past - one day, for example. Ingo responded with a patch implementing that behavior and adding a couple of new boot options: relatime_interval, which specifies the update interval in seconds, and default_relatime, which turns on the relatime option in all filesystems by default.
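
A rough sketch of the rule Linus proposed follows. It is not Ingo's patch: the function and macro names are hypothetical, and the one-day default simply follows the example given above.

```c
#include <stdbool.h>
#include <time.h>

/* One day in seconds; the example interval mentioned in the discussion. */
#define EXAMPLE_RELATIME_INTERVAL (24 * 60 * 60)

/*
 * Update atime if the file has been modified since it was last read, or
 * if the stored atime is at least `interval` seconds old.
 */
static bool relatime_interval_should_update(time_t atime, time_t mtime,
                                            time_t now, time_t interval)
{
        if (atime <= mtime)
                return true;    /* plain relatime rule */
        return now - atime >= interval;
}
```

With a one-day interval, a file that is read constantly would see at most about one atime write per day, while a tool asking whether the file was used in the last week would still get a correct answer.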

Something resembling this version of the patch might go into 2.6.24. It was suggested that, whenever a file's inode is to be written to disk anyway, the kernel might as well update atime as well. Alan Cox objected that this change might make the overall behavior less predictable, which might not be desirable. No new version of the patch with this feature has been posted, so chances are it will not be in the version which gets merged - if and when that happens.


Defer updates

Posted Aug 9, 2007 9:09 UTC (Thu) by addw (guest, #1771) [Link]

Part of the reason that atime causes disk traffic is that if a file is read many times (each read satisfied from cache), an atime update is generated each time the file is read. So you get multiple disk writes to the same inodes over and over.

Why not defer the inode updates? Then you might get one set of disk writes when the file system is unmounted, when it becomes idle, or at scheduled intervals (e.g. every half hour). In practice many systems do not open large numbers of files, but repeatedly access a subset time and again. The usual exception to this is the backup program.

From my point of view: if the system dies before it writes the atime out then not a lot is lost; maybe a spurious ''you have mail'' when I login again.

Memory usage is the primary cost, as the in-memory inode needs to be kept longer than it normally would be. This could be ameliorated by having a new struct (device, inode#, atime) to store this info if we trash the in-memory inode.
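
One way to picture the (device, inode#, atime) record is the user-space sketch below. Everything in it is hypothetical: the struct layout, the fixed-size table and the function names are only an illustration of the idea, not kernel code.

```c
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical record of one pending atime update. */
struct deferred_atime {
        dev_t  dev;
        ino_t  ino;
        time_t atime;
        int    in_use;
};

#define DEFERRED_SLOTS 64

static struct deferred_atime deferred[DEFERRED_SLOTS];

/*
 * Remember an access. Repeated reads of the same file overwrite one slot
 * instead of queueing another write. Returns -1 if the table is full, in
 * which case the caller should flush first.
 */
static int defer_atime(dev_t dev, ino_t ino, time_t when)
{
        struct deferred_atime *free_slot = NULL;

        for (size_t i = 0; i < DEFERRED_SLOTS; i++) {
                struct deferred_atime *d = &deferred[i];

                if (d->in_use && d->dev == dev && d->ino == ino) {
                        d->atime = when;
                        return 0;
                }
                if (!d->in_use && free_slot == NULL)
                        free_slot = d;
        }
        if (free_slot == NULL)
                return -1;
        *free_slot = (struct deferred_atime){ dev, ino, when, 1 };
        return 0;
}

/* Hand every pending record to write_out (if given) and empty the table. */
static size_t flush_deferred(void (*write_out)(dev_t, ino_t, time_t))
{
        size_t flushed = 0;

        for (size_t i = 0; i < DEFERRED_SLOTS; i++) {
                if (!deferred[i].in_use)
                        continue;
                if (write_out != NULL)
                        write_out(deferred[i].dev, deferred[i].ino,
                                  deferred[i].atime);
                deferred[i].in_use = 0;
                flushed++;
        }
        return flushed;
}
```

A flush at unmount, on idle, or on a timer would then produce one write per file touched, however many times each file was read in between.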

noatime and mail programs

Posted Aug 9, 2007 12:26 UTC (Thu) by rfunk (subscriber, #4054) [Link] (2 responses)

I'm pretty sure the noatime incompatibility with mail programs shows up only when using a single-file mailbox format (usually mbox). Yet another reason to use maildir instead.

noatime and mail programs

Posted Jun 2, 2012 22:52 UTC (Sat) by mirabilos (subscriber, #84359) [Link] (1 responses)

I believe it’s exactly the other way round, split-into-files formats would have issues with this.

noatime and mail programs

Posted Jun 3, 2012 18:19 UTC (Sun) by mathstuf (subscriber, #69389) [Link]

Experimenting a bit, the unread flag is marked by files being in the new/ directory in maildir. So it seems that the common maildir implementation wouldn't care about atime at all.

Once upon atime

Posted Aug 9, 2007 14:52 UTC (Thu) by davecb (subscriber, #1574) [Link]

The article notes: It was suggested that, whenever a file's inode is to be written to disk anyway, the kernel might as well update atime as well. Alan Cox objected that this change might make the overall behavior less predictable, which might not be desirable.

Solaris has a "defer atime" mount option that does the atime update only when other disk I/O is scheduled, which gives a fairly fine-grained update, and makes the behavior less predictable, but closer to what happens when atime is turned on. This approach has been in place for a number of years (five or six, at least) and doesn't confuse programs or unsuspecting users.

--dave

nodiratime

Posted Aug 9, 2007 16:12 UTC (Thu) by elanthis (guest, #6227) [Link] (5 responses)

Everyone knows about noatime, but it seems almost nobody uses nodiratime.

The thread this discussion was in notes that kernel devs like Linus explicitly use not only noatime, but also nodiratime. I ASSume that noatime only affects files and that directories need nodiratime.

nodiratime

Posted Aug 9, 2007 16:17 UTC (Thu) by corbet (editor, #1) [Link] (4 responses)

noatime implies nodiratime - it's a superset. You can use nodiratime by itself to turn off the updates on directories only. As you note, it does not seem to be widely used.

Does noatime imply nodiratime?

Posted Aug 9, 2007 20:17 UTC (Thu) by tarvin (guest, #4412) [Link] (3 responses)

Are you sure about noatime implying nodiratime?

mount(8) states:
  noatime
    Do not update inode access times on this file system[...]

  nodiratime
    Do not update directory inode access times on this filesystem.

while mount(2) puts it this way:
  MS_NOATIME
    Do not update access times for (all types of) files on this file system.

  MS_NODIRATIME
    Do not update access times for directories on this file system.

I wonder how to interpret mount(2)'s (all types of) files for noatime: Is a directory considered a file in this context?

Does noatime imply nodiratime?

Posted Aug 9, 2007 20:39 UTC (Thu) by corbet (editor, #1) [Link] (2 responses)

Yep, I'm sure. When in doubt, use the source. From touch_atime() in fs/inode.c:

void touch_atime(struct vfsmount *mnt, struct dentry *dentry)
{
        /* ... */
        if (inode->i_flags & S_NOATIME)
                return;
        if (IS_NOATIME(inode))
                return;
        if ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode))
                return;

So if NOATIME is set, the NODIRATIME flag is never even checked.

Does noatime imply nodiratime?

Posted Aug 10, 2007 6:41 UTC (Fri) by tarvin (guest, #4412) [Link] (1 responses)

OK. I've been using noatime systematically for years. And then I read about a prominent kernel developer like Molnar using both noatime and nodiratime, making me worried: Have I been missing out on I/O-performance for years?

- But your definite statement is comforting, thanks. I'll keep using just "noatime".

Does noatime imply nodiratime?

Posted Aug 10, 2007 14:26 UTC (Fri) by jzbiciak (guest, #5246) [Link]

Actually, Andrew Morton corrected Ingo on this point:
From: Andrew Morton 
To:	Ingo Molnar 
Subject: Re: [PATCH 00/23] per device dirty throttling -v8
Date:	Sun, 5 Aug 2007 00:29:34 -0700

On Sun, 5 Aug 2007 09:21:41 +0200 Ingo Molnar [email blocked] wrote:

> even on a noatime,nodiratime filesystem

noatime is a superset of nodiratime, btw.

I trust Andrew on this point. :-)

Once upon atime

Posted Aug 10, 2007 12:39 UTC (Fri) by liljencrantz (guest, #28458) [Link]

One other possibility is to keep atime updates in memory and only write them out to disk if the page was to be flushed to disk anyway. That would mean storing atimes in memory in a separate structure outside of the regular page.

The additional memory requirements should be relatively small, but the extra time needed to perform atime lookup might not be.

It would also mean that after a crash atimes will be wrong.

Once upon atime

Posted Aug 11, 2007 1:12 UTC (Sat) by xkahn (subscriber, #1575) [Link]

Interestingly, Microsoft seems to have looked at this issue in the past. From MSDN:

http://msdn2.microsoft.com/en-us/library/ms724290.aspx
Not all file systems can record creation and last access time and not
all file systems record them in the same manner. For example, on NT
FAT, create time has a resolution of 10 milliseconds, write time has a
resolution of 2 seconds, and access time has a resolution of 1 day
(really, the access date). On NTFS, access time has a resolution of 1
hour.

Popularity contest

Posted Aug 17, 2007 15:19 UTC (Fri) by endecotp (guest, #36428) [Link]

The only program that I run that benefits from atime is Debian's "popcon", which records which packages are installed and which of those have been used recently. This would work well with the "update atime when > 1 day" approach suggested by Linus.


Copyright © 2007, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds