de-duplication in filesystems
Posted Jul 8, 2006 17:44 UTC (Sat) by giraffedata
In reply to: Interesting work - and some ideas for the future
Parent article: The 2006 Linux Filesystems Workshop (Part III)
The tricky part of de-duplication is identifying the duplicate files.
Users today create multiple copies of files because it's easier than sharing. The idea of de-duplication is that the users maintain that ease, but get the benefits of sharing because the system stores only one copy anyhow.
The copy-on-write technology is pretty much the same as that used today for snapshot copies. But the identification of duplicate files (or, in some proposals, blocks) is something I have yet to see done with demonstrable gain.
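To make the identification problem concrete, here is a minimal sketch (not from any of the proposals discussed, and with illustrative names) of the simplest offline approach: bucket files by size first, so that most unique files are never read, then confirm candidates with a hash of their contents.

```python
import hashlib
import os
from collections import defaultdict

def find_duplicate_files(root):
    """Group files under `root` that have identical contents.

    First bucket by size (cheap metadata lookup), then confirm
    with a SHA-256 digest of the contents, so files with a
    unique size are never opened at all.
    """
    by_size = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a file with a unique size has no duplicate
        for path in paths:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            by_digest[h.hexdigest()].append(path)

    # Keep only digests shared by two or more files.
    return {d: p for d, p in by_digest.items() if len(p) > 1}
```

An in-filesystem implementation would of course have to do this incrementally and handle hash collisions (e.g. with a final byte-by-byte compare) rather than trusting the digest alone; the sketch only shows why the identification step, not the copy-on-write sharing, is where the cost lives.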