JLS2009: A Btrfs update

Posted Oct 30, 2009 16:11 UTC (Fri) by giraffedata (subscriber, #1954)
Parent article: JLS2009: A Btrfs update

Doing snapshots at the device mapper/LVM layer involves making a lot more copies of the relevant data. Chris ran an experiment where he created a 400MB file, created a bunch of snapshots, then overwrote the file. Btrfs is able to just write the new version, while allowing all of the snapshots to share the old copy. LVM, instead, copies the data once for each snapshot.

I don't follow. LVM doesn't have snapshots of a volume share blocks?


JLS2009: A Btrfs update

Posted Nov 2, 2009 8:50 UTC (Mon) by njs (guest, #40338)

I believe that when you have an original volume and make multiple snapshots of it, then no, the snapshot volumes are logically independent. Each one can share blocks with the original volume, but they cannot share blocks with each other (except when those blocks are also still present in the original volume).

JLS2009: A Btrfs update

Posted Nov 2, 2009 15:14 UTC (Mon) by giraffedata (subscriber, #1954)

OK, I see. When you write to the base version, LVM copies the original data to a new block for each existing snapshot, then updates the original block. Btrfs instead writes the new data to a new block for the base version and leaves the snapshots pointing to the original block.
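To make the difference concrete, here is a rough Python sketch of the two strategies as described in this thread. It is a toy model, not how LVM or Btrfs is actually implemented: the function names, the dictionary-based "volumes", and the block counts are all made up for illustration, and metadata writes are ignored entirely.

```python
def copy_before_write(num_blocks, num_snapshots, writes):
    """Toy model of the LVM-style behavior described above (an assumption,
    not real LVM code): before a base block is overwritten, the old data is
    copied once into every snapshot that does not yet hold its own copy."""
    base = {b: f"old{b}" for b in range(num_blocks)}
    snapshots = [dict() for _ in range(num_snapshots)]  # private copies per snapshot
    block_writes = 0
    for block, data in writes:
        for snap in snapshots:
            if block not in snap:          # first overwrite since the snapshot
                snap[block] = base[block]  # preserve the old data for this snapshot
                block_writes += 1
        base[block] = data                 # then update the original block in place
        block_writes += 1
    return block_writes, base, snapshots


def redirect_on_write(num_blocks, num_snapshots, writes):
    """Toy model of the Btrfs-style behavior described above: new data goes
    to a fresh block, and the snapshots keep pointing at the old one."""
    store = {b: f"old{b}" for b in range(num_blocks)}   # physical block -> data
    base_map = {b: b for b in range(num_blocks)}        # logical -> physical
    snapshot_maps = [dict(base_map) for _ in range(num_snapshots)]
    next_free = num_blocks
    block_writes = 0
    for block, data in writes:
        store[next_free] = data            # write the new version somewhere new
        base_map[block] = next_free        # only the base version is repointed
        next_free += 1
        block_writes += 1
    return block_writes, store, base_map, snapshot_maps


if __name__ == "__main__":
    # Overwrite every block of a 4-block "file" that has 3 snapshots.
    writes = [(b, f"new{b}") for b in range(4)]
    print(copy_before_write(4, 3, writes)[0])   # 16 data block writes
    print(redirect_on_write(4, 3, writes)[0])   # 4 data block writes
```

In this model the first strategy costs one extra block copy per snapshot on the first overwrite of each block, which matches the "copies the data once for each snapshot" observation quoted at the top of the thread.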

What I was hoping to get to is whether this difference is an inherent difference between doing snapshots in the filesystem vs in the logical volume. Apparently, it isn't, because LVM could use the same strategy if it wanted to.

Or maybe it's more important in LVM than Btrfs for the original block to stay with the base version?

JLS2009: A Btrfs update

Posted Nov 8, 2009 3:18 UTC (Sun) by butlerm (subscriber, #13312)

I understand that ZFS and NetApp use a very similar copy-on-write technique
to make read-only snapshots of filesystems (hence the lawsuit). NetApp uses
the same scheme to make read-only snapshots of virtual block devices as well.

The problem is that something like that is probably much too complex for
LVM, comparable in complexity to Btrfs itself. So for LVM to avoid the
copy-before-write problem, presumably it would have to use a scheme where
the physical locations of one or more versions of each block are stored in
a persistent segment somewhere.

However, if the version-tracking segment is itself on a typical storage
device, every random write to something that a snapshot has been taken of
requires both a write to a new block on the disk and a write to the version
pointer entry. Short of locating the segment in NVRAM or on a
more-reliable-than-average flash device, that is a bit of a problem.
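
As a back-of-the-envelope illustration of that doubling, here is a small
Python sketch. The VersionMapVolume class and its layout are purely
hypothetical (nothing like this exists in LVM); it only counts the device
writes that such a scheme would issue if the version map lived on the same
kind of storage as the data, with no batching or journaling of map updates.

```python
class VersionMapVolume:
    """Hypothetical volume that redirects each write to a new physical block
    and records the new location in a persistent version map."""

    def __init__(self):
        self.device_writes = []   # log of ("data" | "map", key) writes to disk
        self.version_map = {}     # logical block -> physical locations, newest last
        self.next_free = 0        # next unused physical block
        self.logical_writes = 0

    def write(self, logical_block):
        physical = self.next_free
        self.next_free += 1
        # 1. The new data goes to a fresh physical block.
        self.device_writes.append(("data", physical))
        # 2. The version pointer entry must be made persistent as well.
        self.version_map.setdefault(logical_block, []).append(physical)
        self.device_writes.append(("map", logical_block))
        self.logical_writes += 1

    def amplification(self):
        """Device writes issued per logical write."""
        return len(self.device_writes) / self.logical_writes


if __name__ == "__main__":
    vol = VersionMapVolume()
    for block in [7, 3, 7, 42, 0]:   # scattered (random) logical writes
        vol.write(block)
    print(vol.amplification())       # 2.0
```

A real implementation could presumably batch or log the map updates to
soften this, at the cost of more complexity; the sketch assumes the naive
one-pointer-write-per-data-write case described above.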

JLS2009: A Btrfs update

Posted Nov 8, 2009 11:54 UTC (Sun) by nix (subscriber, #2304)

If there's md in there as well, with its superblock updates to track the
array dirty state, one write could be amplified to, what, six? (of course
you don't get a superblock update with every write unless writes are quite
rare... but often writes *are* rare.)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds