Avoiding disk-full problems
Posted Jun 8, 2012 3:09 UTC (Fri) by pjm
In reply to: Avoiding disk-full problems
Parent article: Atime and btrfs: a bad combination?
We all agree on what's physically happening,
and I'm sure we agree that it's not reading or snapshotting by itself that uses extra
space: it's the combination of a read and a preceding snapshot.
The only question is what to do about the possibility of there not being enough space to rewrite the inode.
Some possibilities include:
- Return ENOSPC on read. (The undesirable prospect alluded to in the article.)
- Let the read go ahead but don't update the atime (even the in-memory atime?) if there's no space left. (I gather that this is the current solution.)
- Let the read go ahead but scribble over the snapshot's atime.
- Exclude atimes from snapshots. (What does that mean? I.e. what atime do people see when doing ls -ltu in the snapshot?)
- Laptop mode (lossy atimes): Never initiate a write just for the sake of updating an on-disk atime,
but still copy the in-memory atime to disk if we're writing the inode for some other reason.
- Never store atime on disk in the first place, but still have accesses update the in-memory atime,
like in romfs, cramfs etc. (What value would the in-memory atime get initialized to when reading
the inode from disk? 1970, or some function of ctime and mtime?)
- Mandatory noatime: the atime that stat(2) sees (and hence find, ls, mutt etc.) is just the creation time.
- Reserve enough space for atime to be reliable. E.g. have the superblock
record the number of inodes for which we are "in debt": initially 0 at
filesystem creation, set by a snapshot to the (then-current) number of
inodes, and decreased by one by each copy-on-write of an inode. This debt is
tied to the amount of free space left, influencing whether an allocation or
snapshot operation returns ENOSPC. Snapshotting is still a cheap operation
both in time (no immediate write necessary, and one or two integers in the
superblock to update in write-behind) and disk space: a million snapshots a
year still only requires as much disk space as the writes that occur
between snapshots, except with the difference that we also reserve space
for inode writes to occur in the future. This is a once-off reservation:
there's no difference in cost between one snapshot a year and one million.
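To make the last option concrete, here is a minimal Python sketch of the debt accounting. Everything in it is hypothetical: the names Superblock, inode_debt, blocks_per_inode and NoSpace are made up for illustration, the one-block-per-inode rewrite cost is an assumption, and none of this is meant to describe how btrfs actually accounts for space.

```python
from dataclasses import dataclass


class NoSpace(Exception):
    """Stand-in for ENOSPC (hypothetical, for illustration only)."""


@dataclass
class Superblock:
    free_blocks: int           # blocks not yet allocated
    inode_count: int = 0       # live inodes in the filesystem
    inode_debt: int = 0        # inodes still shared with the latest snapshot
    blocks_per_inode: int = 1  # assumed cost of rewriting one inode

    def reserved(self) -> int:
        # Space held back so every indebted inode can still be rewritten.
        return self.inode_debt * self.blocks_per_inode

    def available(self) -> int:
        # What ordinary allocations are allowed to use.
        return self.free_blocks - self.reserved()

    def allocate(self, blocks: int) -> None:
        # Ordinary allocation: may not eat into the reservation.
        if blocks > self.available():
            raise NoSpace
        self.free_blocks -= blocks

    def snapshot(self) -> None:
        # Only superblock integers change; the debt is set, not added to,
        # so repeated snapshots do not pile up reservations.
        if self.inode_count * self.blocks_per_inode > self.free_blocks:
            raise NoSpace
        self.inode_debt = self.inode_count

    def cow_inode(self) -> None:
        # First rewrite of an inode since the snapshot (e.g. an atime
        # update): spends reserved space and pays off one unit of debt.
        # The caller is assumed to know the inode is still shared.
        if self.inode_debt <= 0:
            raise ValueError("no indebted inodes left")
        self.free_blocks -= self.blocks_per_inode
        self.inode_debt -= 1
```

With 10 free blocks and 4 inodes, a snapshot leaves 6 blocks available to ordinary allocations, and taking a second snapshot straight away changes nothing, which is the once-off property described above. The sketch ignores inode creation and deletion after the snapshot, which a real implementation would have to track.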
I don't want to advocate one solution over another, and I'm pretty happy with
what I'm told is the current approach; I'm just listing some of the options.