
File creation times

By Jonathan Corbet
July 26, 2010
Linux systems, like the Unix systems that came before, maintain three different timestamps for each file. The semantics of those timestamps are often surprising to users, though, and they don't provide the information that users often want to know. The possible addition of a new system call is giving kernel developers the opportunity to make some changes in this area, but there is not, yet, a consensus on how that should be done.

The Unix file timestamps, as long-since enshrined by POSIX, are called "atime," "ctime," and "mtime." The atime stamp is meant to record the last time that the file was accessed. This information is almost never used, though, and can be quite expensive to maintain; Ingo Molnar once called atime "perhaps the most stupid Unix design idea of all times". So atime is often disabled on contemporary systems or, at least, rolled back to the infrequently-updated "relatime" mode. Mtime, instead, makes a certain amount of sense; it tells the user when the file was last modified. Modification requires writing to the file anyway, so updating this time is often free, and the information is often useful.

That leaves ctime, which is a bit of a strange beast. Users who do not look deeply are likely to interpret ctime as "creation time," but that is not what is stored there; ctime, instead, is updated whenever a file's metadata is changed. The main consumer of this information, apparently, is the venerable dump utility, which likes to know that a file's metadata has changed (so that information must be saved in an incremental backup), but the file data itself has not and need not be saved again. The number of dump users has certainly fallen over the years, to the point that the biggest role played by ctime is, arguably, confusing users who really just want a file's creation time.
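The difference between ctime and mtime is easy to see from the shell. What follows is a minimal sketch, assuming GNU coreutils `stat` (where `%Y` prints mtime and `%Z` prints ctime, both in seconds since the epoch); the file name and the two-second sleep are purely illustrative.

```shell
# A metadata-only change (chmod) should move ctime but leave mtime alone.
dir=$(mktemp -d)
f="$dir/somefile"
echo data > "$f"

mtime_before=$(stat -c %Y "$f")
ctime_before=$(stat -c %Z "$f")

sleep 2            # make the change visible at one-second resolution
chmod 600 "$f"     # touches the inode's metadata, not the file's data

mtime_after=$(stat -c %Y "$f")
ctime_after=$(stat -c %Z "$f")

echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -rf "$dir"
```

Nothing in that sequence records when the file first came into existence, which is the piece of information users tend to be looking for.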

So where do users find the creation time? They don't: Linux systems do not store that time and provide no interface for applications to access it.

That situation could change, though. Some newer filesystems (Btrfs and ext4, for example) have been designed with space for file creation times. Other operating systems also provide this information, and some network filesystem protocols expect to have access to it. So it would be nice if Linux properly supported file creation times; the proposed addition of the xstat() system call would be the ideal time to make that change.

Current xstat() implementations do, in fact, add a st_btime field to struct xstat; the "b" stands for "birth," which is a convention established in the BSD camp. There has been a fair amount of discussion about that addition, though, based on naming and semantics.
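To give a feel for what a consumer of a birth-time field might look like, here is a small sketch. It assumes a later GNU coreutils `stat`, whose `%W` format prints the birth time in seconds since the epoch, or 0 when no birth time is reported; that tool is not part of the xstat() patch discussed here, and whether a value comes back at all depends on the kernel and filesystem in use.

```shell
# Ask for a file's birth time and cope with it being unavailable.
dir=$(mktemp -d)
f="$dir/somefile"
touch "$f"

btime=$(stat -c %W "$f")   # seconds since the epoch, or 0 if unknown

case "$btime" in
    0|-|'') echo "birth time not available here" ;;
    *)      echo "born at $btime (seconds since the epoch)" ;;
esac
rm -rf "$dir"
```

Any application using such a field would need a fallback path like the first branch, since older filesystems have nowhere to store the value.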

The naming issue, one would think, would be relatively straightforward. It was pointed out, though, that other names have been used in the kernel. JFS and Btrfs use "otime," for some reason, while ext4 uses "crtime." And BSD, it turns out, uses "birthtime" instead of "btime." That discussion inspired Linus to exclaim:

Oh wow. And all of this just convinces me that we should _not_ do any of this, since clearly it's all totally useless and people can't even agree on a name.

After that, though, Linus looked a bit more deeply at the problem, which he saw as primarily being to provide a Windows-style creation time that Samba could use. It turns out that Windows allows the creation time to be modified, so Linus saw it as being a sort of variation on the Unix ctime notion. That led to a suggestion to change the semantics of ctime to better suit the Windows case. After all, almost nobody uses ctime anyway, and it would be a trivial change to make ctime look like the Windows creation time. This behavior could be specified either as a per-process flag or a mount-time option; then there would be no need to add a new time field.

This idea was not wildly popular, though; Jeremy Allison said it would lead to "more horrible confusion". If ctime could mean different things in different situations, even fewer people would really understand it, and tools like Samba could not count on its semantics. Jeremy would rather just see the new field added; that seems like the way things will probably go.

There is one last interesting question, though: should the kernel allow the creation time to be modified? Windows does allow modification, and some applications evidently depend on that feature. Windows also apparently has a hack which, if a file is deleted and replaced by another with the same name, will reuse the older file's creation time. BSD systems, instead, do not allow the creation time to be changed. When Samba is serving files from a BSD system, it stores the "Windows creation time" in an extended attribute so that the usual Windows semantics can be provided.

If the current xstat() patch is merged, Linux will disallow changes to the creation time by default - there will be no system call which can make that change. Providing that capability would require an extended version of utimes() which can accept the additional information. Allowing the time to be changed would make it less reliable, but it would also be useful for backup/restore programs which want to restore the original creation time. That is a discussion which has not happened yet, though; for now, creation times cannot be changed.

Index entries for this article
Kernel: Filesystems/Access-time tracking



Cygwin too!

Posted Jul 29, 2010 2:15 UTC (Thu) by quotemstr (subscriber, #45331) [Link]

stat is particularly painful under Cygwin; collecting the information necessary to emulate the various parts of struct stat (and especially interpreting the NT ACL and simulating permission bits) takes a while, and most of that effort is wasted when applications use only a few fields of the result.

I've been wondering for a little now how much effort it'd be to implement something like xstat() for Cygwin. (Mostly to scratch the itching caused by slow directory listings.) I might as well implement the same interface Linux will have; the Windows file attribute bits would be in the extra stats. Cygwin would benefit a lot more from xstat than any native system would.

The main problem isn't implementing the system call, however, but patching programs all over the place to use it. It'd be hard to get a Cygwin-specific patch upstream. But hopefully, with two implementations, it'd be easy to get those changes merged.

File creation times

Posted Jul 29, 2010 3:09 UTC (Thu) by ctpm (guest, #35884) [Link] (1 responses)

"After all, almost nobody uses ctime anyway"

Well that's not quite true. Unless you discount Tar as a seldom used, almost unknown obscure application.

The fact is that, generally, a program that relies on tar (for example, Amanda) to create incremental dumps should be able to get a proper ctime for each file. See the "-g" option on the GNU Tar manpage.
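A rough sketch of that dependency, assuming GNU tar (the `-g` / `--listed-incremental` option is a GNU extension) and GNU coreutils; the directory and file names are made up for the example. One file is left alone and the other gets only a chmod, so only its ctime moves between the two dumps.

```shell
# Level-0 dump, then a metadata-only change, then an incremental dump.
dir=$(mktemp -d)
mkdir "$dir/src"
echo one > "$dir/src/a"
echo two > "$dir/src/b"
sleep 1

tar -C "$dir" -g "$dir/snapshot" -cf "$dir/full.tar" src   # full dump
sleep 1

chmod 600 "$dir/src/b"     # only ctime changes; data and mtime do not

tar -C "$dir" -g "$dir/snapshot" -cf "$dir/incr.tar" src   # incremental
incr_list=$(tar -tf "$dir/incr.tar")

echo "$incr_list"
rm -rf "$dir"
```

If ctime stopped tracking metadata changes, the second archive would presumably miss `src/b`, and the permission change would be lost on restore.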

There might be other tools that depend on current ctime behaviour. Somehow I don't think that breaking people's backups is the way to go, so a separate field seems rather wiser.

Regards

Cláudio

File creation times

Posted Jul 29, 2010 22:00 UTC (Thu) by nix (subscriber, #2304) [Link]

And dar. And rdiff-backup. And probably amanda and every other backup program on earth capable of incremental backups. (I haven't found any but dump(8) that can avoid backing up the data if only the metadata has changed. That's a clever trick that should be emulated, I think, and I never thought I'd say that about *anything* dump(8) did...)

ctime

Posted Jul 29, 2010 4:04 UTC (Thu) by markh (subscriber, #33984) [Link] (2 responses)

ctime is actually very useful, as it is currently the only somewhat reliable way to determine that something has changed (by checking whether ctime or mtime has been updated). For some reason when people copy files from other places they like to also copy the last modification time. Fortunately that will set the ctime to the current time and they cannot alter that. So when something breaks and I want to find what changed I can at least search for recent ctime or mtime.
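Here is a small sketch of that search, assuming GNU coreutils and findutils; the paths and the 2001 date are only for illustration. A file copied with `cp -p` keeps its old mtime, so a search by mtime misses it, while a search by ctime still finds it.

```shell
# A freshly copied file with a preserved (old) mtime.
dir=$(mktemp -d)
mkdir "$dir/orig" "$dir/copy"
echo data > "$dir/orig/file"
touch -m -d '2001-01-01' "$dir/orig/file"
cp -p "$dir/orig/file" "$dir/copy/file"     # preserves mtime

# "Changed in the last 24 hours", by mtime and by ctime.
by_mtime=$(find "$dir/copy" -type f -mtime -1)
by_ctime=$(find "$dir/copy" -type f -ctime -1)

echo "found by mtime: ${by_mtime:-nothing}"
echo "found by ctime: ${by_ctime:-nothing}"
expected="$dir/copy/file"
rm -rf "$dir"
```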

ctime

Posted Jul 29, 2010 23:46 UTC (Thu) by giraffedata (guest, #1954) [Link] (1 responses)

> For some reason when people copy files from other places they like to also copy the last modification time.

OK, I'll bite. The reason is that they use the modification time as a version indication of the content of the file, not of the file per se. I found that I most often want mtime to be the former, so many years ago I changed "cp" to an alias that preserves mtime, and I have been much happier since. I use mtimes in manual processes a lot.

Of course, I get bitten the reverse way sometimes.

ctime

Posted Jul 30, 2010 0:45 UTC (Fri) by markh (subscriber, #33984) [Link]

I have no problem with people using mtime this way, as long as there is some other way to determine that some change was made. ctime satisfies that nicely. I am just concerned that Linus of all people would suggest that ctime is not useful and would be so quick to suggest that it be repurposed.

File creation times

Posted Jul 29, 2010 4:50 UTC (Thu) by magfr (subscriber, #16052) [Link] (1 responses)

Another question is whether the btime relates to the creation time of the file or of the link to the file.

Consider
touch f1
ln f1 f2
Is btime(f1) == btime(f2)?

File creation times

Posted Jul 29, 2010 8:09 UTC (Thu) by eugeniy (guest, #24280) [Link]

Well, the metadata is stored in the inode, and those links point to the same inode. So the answer is yes.
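This can be checked for the timestamps that exist today. A minimal sketch, assuming GNU coreutils `stat` (`%i` is the inode number, `%h` the link count, `%X %Y %Z` the atime, mtime and ctime in seconds):

```shell
# Two names, one inode: both names report the same metadata.
dir=$(mktemp -d)
touch "$dir/f1"
ln "$dir/f1" "$dir/f2"

inode1=$(stat -c %i "$dir/f1")
inode2=$(stat -c %i "$dir/f2")
links=$(stat -c %h "$dir/f1")
stamps1=$(stat -c '%X %Y %Z' "$dir/f1")
stamps2=$(stat -c '%X %Y %Z' "$dir/f2")

echo "inodes: $inode1 $inode2 (link count $links)"
echo "f1 times: $stamps1"
echo "f2 times: $stamps2"
rm -rf "$dir"
```

A birth time kept in the inode would presumably behave the same way, so it would describe the file rather than any one link to it.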

File creation times

Posted Jul 29, 2010 7:34 UTC (Thu) by dlang (guest, #313) [Link] (7 responses)

if a file gets larger so that another block needs to be allocated, doesn't this change the metadata for the file?

File creation times

Posted Jul 29, 2010 9:18 UTC (Thu) by saffroy (guest, #43999) [Link] (6 responses)

Any change to file data will require updating mtime, which is metadata, hence ctime will be updated too. If you enlarge the file with truncate() without touching its content, you still change the file length, which is again metadata.
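A quick sketch of the truncate case, assuming GNU coreutils (`truncate -s` sets the file length; `stat -c %s` and `%Z` print the size and the ctime in seconds). The 4096-byte size and the sleep are arbitrary.

```shell
# Growing a file without writing data still updates ctime.
dir=$(mktemp -d)
f="$dir/somefile"
echo data > "$f"

size_before=$(stat -c %s "$f")
ctime_before=$(stat -c %Z "$f")

sleep 2
truncate -s 4096 "$f"      # extend the file; the existing bytes are untouched

size_after=$(stat -c %s "$f")
ctime_after=$(stat -c %Z "$f")

echo "size:  $size_before -> $size_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -rf "$dir"
```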

File creation times

Posted Jul 29, 2010 9:48 UTC (Thu) by fperrin (subscriber, #61941) [Link] (5 responses)

> Any change to file data will require updating mtime, which is metadata, hence ctime will be updated too.

Does this mean that ctime is always more recent than mtime?

File creation times

Posted Jul 29, 2010 10:01 UTC (Thu) by Mog (subscriber, #29529) [Link] (4 responses)

> Does this mean that ctime is always more recent than mtime?

Since mtime can be set to any value (using touch for example), not always.

File creation times

Posted Jul 29, 2010 17:10 UTC (Thu) by docwhat (guest, #40373) [Link] (3 responses)

I had to try it for myself...
$ mkdir -p /tmp/tmp.e0EkZhvZQR
$ touch /tmp/tmp.e0EkZhvZQR/somefile
$ stat /tmp/tmp.e0EkZhvZQR/somefile
  File: `/tmp/tmp.e0EkZhvZQR/somefile'
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d	Inode: 2760536     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  holtje)   Gid: ( 1000/  holtje)
Access: 2010-07-29 13:09:14.990815140 -0400
Modify: 2010-07-29 13:09:14.990815140 -0400
Change: 2010-07-29 13:09:14.990815140 -0400
$ touch -m -d 2110-01-01 /tmp/tmp.e0EkZhvZQR/somefile
$ stat /tmp/tmp.e0EkZhvZQR/somefile
  File: `/tmp/tmp.e0EkZhvZQR/somefile'
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d	Inode: 2760536     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  holtje)   Gid: ( 1000/  holtje)
Access: 2010-07-29 13:09:14.990815140 -0400
Modify: 2110-01-01 00:00:00.000000000 -0500
Change: 2010-07-29 13:09:14.990815140 -0400
Yup!

File creation times

Posted Jul 29, 2010 23:45 UTC (Thu) by jzbiciak (guest, #5246) [Link] (2 responses)

So how come ctime did *not* change? The metadata (i.e., mtime) changed. I'm willing to believe that mtime is *not* in the set of metadata that ctime represents.

File creation times

Posted Jul 30, 2010 0:13 UTC (Fri) by docwhat (guest, #40373) [Link] (1 responses)

It happened too fast. I'm using a script. Here it is with a sleep.
$ mkdir -p /tmp/tmp.RZFIqs7625
$ touch /tmp/tmp.RZFIqs7625/somefile
$ stat /tmp/tmp.RZFIqs7625/somefile
  File: `/tmp/tmp.RZFIqs7625/somefile'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: ca02h/51714d    Inode: 1622396     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1002/ docwhat)   Gid: ( 1002/ docwhat)
Access: 2010-07-29 19:50:55.000000000 -0400
Modify: 2010-07-29 19:50:55.000000000 -0400
Change: 2010-07-29 19:50:55.000000000 -0400
$ sleep 2
$ touch -m -d 2110-01-01 /tmp/tmp.RZFIqs7625/somefile
$ stat /tmp/tmp.RZFIqs7625/somefile
  File: `/tmp/tmp.RZFIqs7625/somefile'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: ca02h/51714d    Inode: 1622396     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1002/ docwhat)   Gid: ( 1002/ docwhat)
Access: 2010-07-29 19:50:55.000000000 -0400
Modify: 2110-01-01 00:00:00.000000000 -0500
Change: 2010-07-29 19:50:57.000000000 -0400
The script I used:
#!/bin/bash

set -x

mkdir -p /tmp/tmp.e0EkZhvZQR
touch /tmp/tmp.e0EkZhvZQR/somefile
stat /tmp/tmp.e0EkZhvZQR/somefile
sleep 2
touch -m -d 2110-01-01 /tmp/tmp.e0EkZhvZQR/somefile
stat /tmp/tmp.e0EkZhvZQR/somefile
Here's how I ran it:
$ /tmp/foo.sh |& perl -p -e 's@^\+@\$@'
I just cut-and-pasted the middle of the output.

Ciao!

File creation times

Posted Jul 30, 2010 13:49 UTC (Fri) by jzbiciak (guest, #5246) [Link]

Ah, ok. Mystery solved!

Rsync?

Posted Jul 29, 2010 12:26 UTC (Thu) by NRArnot (subscriber, #3033) [Link] (1 responses)

Doesn't rsync by default use ctime to decide whether it needs to compare file checksums?

If so it's hardly little-used.

And if changing the file's metadata did not update its ctime, then wouldn't that subtly break scripts that create remote backups or mirrors of filesystems using rsync? Rsync would then have to compare all metadata, or at least a checksum of all the relevant metadata. Does this cause a lot of extra I/O, or is it not significantly costlier to fetch all metadata compared to fetching just ctime?

Rsync?

Posted Jul 29, 2010 17:13 UTC (Thu) by docwhat (guest, #40373) [Link]

The man page says it uses "either a changed size or a changed last-modified time".
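In shell terms, a check of that kind might look like the sketch below. `needs_transfer` is a hypothetical helper written for this illustration, not rsync's actual code, and it assumes GNU coreutils `stat` (`%s` is the size, `%Y` the mtime in seconds).

```shell
# Hypothetical quick check: transfer if size or mtime differ; ctime is ignored.
needs_transfer() {
    # $1 = source file, $2 = destination file
    [ -e "$2" ] || return 0
    [ "$(stat -c '%s %Y' "$1")" != "$(stat -c '%s %Y' "$2")" ]
}

dir=$(mktemp -d)
echo data > "$dir/src"
cp -p "$dir/src" "$dir/dst"          # same size, same mtime

chmod 600 "$dir/dst"                 # moves ctime only
if needs_transfer "$dir/src" "$dir/dst"; then after_chmod=transfer; else after_chmod=skip; fi

touch -m -d '2001-01-01' "$dir/dst"  # now the mtimes differ
if needs_transfer "$dir/src" "$dir/dst"; then after_touch=transfer; else after_touch=skip; fi

echo "after chmod: $after_chmod, after touch: $after_touch"
rm -rf "$dir"
```

A check built this way never looks at ctime, so it would not care whether ctime's meaning changed.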

File creation times

Posted Jul 29, 2010 19:04 UTC (Thu) by butlerm (subscriber, #13312) [Link] (9 responses)

Creation times are a great idea. The idea that a creation time should not be changeable is an extraordinarily bad one. Backup software, copy commands with the preserve attribute option specified, and any application that does file replacement with a write / fsync / rename sequence need that ability. Otherwise it is borderline useless, telling you something much more akin to the time of last modification than when the file was created.
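The file-replacement case can be sketched as follows, assuming GNU coreutils; the file names are invented and the fsync step is left out for brevity. After the rename, the name refers to a different inode, so any creation time stored in the inode would presumably be that of the replacement.

```shell
# Replace a file by writing a temporary copy and renaming it into place.
dir=$(mktemp -d)
f="$dir/config"
echo old > "$f"
inode_before=$(stat -c %i "$f")

echo new > "$f.tmp"        # a careful application would fsync this first
mv "$f.tmp" "$f"           # rename over the original

inode_after=$(stat -c %i "$f")
content_after=$(cat "$f")

echo "inode: $inode_before -> $inode_after (content: $content_after)"
rm -rf "$dir"
```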

File creation times

Posted Jul 29, 2010 19:31 UTC (Thu) by hppnq (guest, #14462) [Link] (5 responses)

> The idea that a creation time should not be changeable is an extraordinarily bad one.

Not really. It allows you to easily verify whether a file has been tampered with by an attacker.

File creation times

Posted Jul 29, 2010 20:45 UTC (Thu) by sync (guest, #39669) [Link] (4 responses)

No. When the attacker overwrites (rather than replaces) the file, the creation time doesn't change.

File creation times

Posted Jul 29, 2010 21:10 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (3 responses)

Worse, the attacker is not obliged to obey your conventions.

Just because _you_ don't want to change the create time, doesn't prevent the attacker from doing so. "Oh," you say "but there will be no syscall". Again, this is a problem for you, the legitimate user, but not for the attacker, he can just force the relevant blocks out to disk, scribble on the raw disk, and let them be read back in again - voila!

File creation times

Posted Jul 30, 2010 7:14 UTC (Fri) by hppnq (guest, #14462) [Link] (2 responses)

It's quite likely that you can't use ctime to verify that a file has NOT been tampered with. If someone is able to scribble something poetic on a raw device, it makes no sense to worry about the ctime on /bin/ls. This should not be trivial for an attacker, of course.

But obviously, if a ctime has changed unexpectedly, there's no doubt someone messed with the file, or the kernel.

File creation times

Posted Jul 30, 2010 11:50 UTC (Fri) by sync (guest, #39669) [Link] (1 responses)

Now you are talking about change time (ctime) not creation time.

And a ctime change doesn't mean that someone messed with the file. There are a lot of false positives:
SELinux relabels the file
backup program resets atime
...

And of course ctime should not be user changeable. But not for security reasons.

File creation times

Posted Jul 30, 2010 16:49 UTC (Fri) by hppnq (guest, #14462) [Link]

> Now you are talking about change time (ctime) not creation time.

Ah, I assumed indeed that the original comment was about ctime. I was never talking about creation time. Sorry for the confusion.

> And a ctime change doesn't mean that someone messed with the file.

Of course not.

> And of course ctime should not be user changeable. But not for security reasons.

Look up some real-world examples of intrusions and how they were detected, or delve deeper into forensic discovery with The Coroner's Toolkit or its successor The Sleuth Kit. Fascinating stuff.

File creation times

Posted Aug 9, 2010 11:41 UTC (Mon) by dgm (subscriber, #49227) [Link] (2 responses)

Also, creation times may come from outside. For example, I would love to be able to list all my pictures in a directory by the time I created (shot) them. That would be impossible without the ability to modify the creation time, because it would not be the time provided by the camera, but the one stamped by the kernel when the pictures were copied from it, which is far less useful for me.

File creation times

Posted Sep 29, 2010 13:36 UTC (Wed) by misiu_mp (guest, #41936) [Link] (1 responses)

Photo creation time is usually written in the EXIF data. I don't know if the camera can be trusted to set the file creation time correctly for the files on the SD card. Not to mention that they are stored on a removable FAT filesystem, which means they could be modified by a multitude of systems running other OSes.

File creation times

Posted Sep 29, 2010 16:43 UTC (Wed) by bronson (subscriber, #4806) [Link]

Most cameras that I've seen (especially idiot boxes) can't be trusted to have their clocks set to the current month, much less the correct time zone.

There's a photographer that I know who often cares about the exact time a picture was taken (professional building shots, relying on sun angles). He first takes a picture of his GPS so, if things look weird, he can figure out the correction.

Anyhow point is, unless the camera is running NTP or a GPS receiver, I wouldn't put much weight in EXIF data!

File creation times - choose it at the moment of creation

Posted Sep 29, 2010 13:19 UTC (Wed) by misiu_mp (guest, #41936) [Link]

"Allowing the time to be changed would make it less reliable, but it would also be useful for backup/restore programs which want to restore the original creation time."

How about making it possible to choose the creation time at the moment of creation? This way backups could be restored with creation time intact and no mess caused by changing the creation time of existing files.


Copyright © 2010, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds