repository format?

Posted Feb 25, 2010 0:57 UTC (Thu) by dmarti (subscriber, #11625)
In reply to: Hg Init: a Mercurial tutorial by lambda
Parent article: Hg Init: a Mercurial tutorial

Keith Packard wrote a while ago, "Mercurial uses a truncated forward delta scheme where file revisions are appended to the repository file, as a string of deltas with occasional complete copies of the file (to provide a time bound on operations). This suffers from two possible problems—the first is fairly obvious where corrupted writes of new revisions can affect old revisions of the file. The second is more subtle — system failure during commit will leave the file contents half written. Mercurial has recovery techniques to detect this, but they involve truncating existing files, a piece of the Linux kernel which has constantly suffered from race conditions and other adventures."

Is this still the case? I still feel less nervous about putting potentially long-lived projects into git, but maybe I have nothing to worry about with hg.


repository format?

Posted Feb 25, 2010 2:50 UTC (Thu) by bboissin (subscriber, #29506) [Link]

It was never the case; it just shows a lack of knowledge of how hg works
(this blog entry was considered a bit FUD'ish at that time).

First, a corrupted write doesn't affect old revisions (that's the whole
point of the append-only design of hg; maybe he is mistaking the hg format
for svn's, where I think one of the formats does the delta the other way
round).

Second, partial writes are detected: hg asks you to use `hg recover`, which
truncates the files. During the (already) almost five years I've been
involved in hg, I've never seen a single case of a problem with truncate.
The scariest things I've seen were with xfs, where after a hard failure and
a reboot, the repo would be in a state where files that were written had
disappeared (git would have the same problem as hg in that case).
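
To make the append-only and truncate-on-recovery idea concrete, here is a minimal sketch in Python. It is not Mercurial's actual revlog or journal format; the class name AppendOnlyLog, the ".journal" suffix, and the method names are all made up for illustration. The only point is that old bytes are never rewritten, so a failed append can be undone by cutting the file back to its last known-good length.

```python
import os
import tempfile


class AppendOnlyLog:
    """Toy append-only store (hypothetical; NOT Mercurial's real format)."""

    def __init__(self, path):
        self.path = path
        self.journal = path + ".journal"  # assumed name, for illustration
        open(path, "ab").close()  # create the data file if it is missing

    def begin(self):
        # Record the known-good length before any new bytes are appended.
        with open(self.journal, "w") as j:
            j.write(str(os.path.getsize(self.path)))
            j.flush()
            os.fsync(j.fileno())

    def append(self, data):
        # New revisions only ever go at the end; old bytes are not touched.
        with open(self.path, "ab") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())

    def commit(self):
        # Removing the journal marks the transaction as complete.
        os.remove(self.journal)

    def recover(self):
        # A leftover journal means the last transaction was interrupted.
        if not os.path.exists(self.journal):
            return False
        with open(self.journal) as j:
            good_length = int(j.read())
        with open(self.path, "r+b") as f:
            f.truncate(good_length)  # drop the half-written tail
        os.remove(self.journal)
        return True


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "data.log")
    log = AppendOnlyLog(path)
    log.begin(); log.append(b"rev1"); log.commit()
    log.begin(); log.append(b"half-writ")  # simulated crash: no commit
    AppendOnlyLog(path).recover()
    with open(path, "rb") as f:
        print(f.read())
```

Everything before the recorded length is left byte-for-byte alone, which is why an interrupted commit in a design like this should not damage older revisions.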

Finally, unpacked git repos might be very safe (you basically write a new
file for every version of every file) but are not space efficient (the repo
grows quickly). If you use packs (as everybody does), corrupted writes in
the pack could destroy the history too (but they probably use transactions
and checks like hg to detect it).

repository format?

Posted Feb 26, 2010 21:42 UTC (Fri) by bfields (subscriber, #19510) [Link]

"If you use packs (as everybody does), corrupted writes in the pack could destroy the history too (but they probably use transactions and checks like hg to detect it)."

No, my understanding is that packs are write-once just like other objects. (So you have to pack loose objects every now and then to keep performance from degrading.)
