Support large repositories!

Posted Apr 4, 2010 21:31 UTC (Sun) by nix (subscriber, #2304)
In reply to: Support large repositories! by marcH
Parent article: A proposed Subversion vision and roadmap

No, you won't. hg's delta algorithm is binary, as is svn's; git can
transform its series of commit snapshots in all sorts of ways without
changing visible behaviour, and can detect commonality between entirely
unrelated files merged from completely different source repositories. The
diff you see when you do 'git diff' is completely unrelated to the
delta-compression algorithm. (IMHO, this is one of git's biggest
architectural strengths: it can change its on-disk representation almost
beyond recognition without changing anything the user sees or breaking
compatibility in any way.)



Support large repositories!

Posted Apr 5, 2010 11:35 UTC (Mon) by marcH (subscriber, #57642)

> hg's delta algorithm is binary, as is svn's; git can transform its series of commit snapshots in all sorts of ways without changing visible behaviour,

Many binary formats are compressed by default. This usually prevents computing deltas. Are these tools clever enough to transparently uncompress revisions before comparing?

Support large repositories!

Posted Apr 5, 2010 16:24 UTC (Mon) by nix (subscriber, #2304)

Of course the source formats being compressed doesn't prevent computing
deltas, but it does mean that the deltas might be larger than they would
otherwise be. (If they end up too large, you'll just end up storing a
sequence of snapshots.)

Support large repositories!

Posted Apr 5, 2010 21:47 UTC (Mon) by marcH (subscriber, #57642)

> Of course the source formats being compressed doesn't prevent computing deltas, but it does mean that the deltas might be larger than they would otherwise be.

Every time I tried this, the delta was almost as big as the file itself. Would you have counter-examples?

Support large repositories!

Posted Apr 5, 2010 23:14 UTC (Mon) by nix (subscriber, #2304)

No. Currently, you'll end up with a lot of snapshots rather than deltas.
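
For anyone who wants to reproduce the effect discussed in this subthread, here is a rough Python sketch. It is not the delta algorithm of git, hg, or svn. As a stand-in it uses zlib with a preset dictionary, so `delta_size` (a made-up helper name) only approximates how many bytes a binary delta against the old revision would need. The test data (800 generated lines, 8 of them rewritten) is likewise invented for illustration.

```python
import random
import zlib


def delta_size(old: bytes, new: bytes) -> int:
    """Rough proxy for a binary delta: size of `new` when the compressor
    may copy from `old`, supplied as a zlib preset dictionary.
    This is an assumption for illustration, not any VCS's real format."""
    comp = zlib.compressobj(level=9, zdict=old)
    return len(comp.compress(new) + comp.flush())


# First revision: about 21 KB of text, small enough for zlib's 32 KB window.
rng = random.Random(0)
lines = [f"line {i}: {rng.getrandbits(64):016x}\n" for i in range(800)]
old = "".join(lines).encode()

# Second revision: rewrite every 100th line, leave the rest untouched.
for i in range(0, 800, 100):
    lines[i] = f"line {i}: {rng.getrandbits(64):016x}\n"
new = "".join(lines).encode()

# Delta between the plain revisions vs. delta between their compressed forms.
old_packed = zlib.compress(old)
new_packed = zlib.compress(new)
raw_delta = delta_size(old, new)
packed_delta = delta_size(old_packed, new_packed)

print("revision size:        ", len(new))
print("compressed size:      ", len(new_packed))
print("delta, plain input:   ", raw_delta)
print("delta, compressed in: ", packed_delta)
```

The plain-input delta should come out as a small fraction of the revision, while the delta between the two compressed streams should be far larger, because a small edit perturbs most of the compressed byte stream that follows it.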

Support large repositories!

Posted Apr 5, 2010 17:25 UTC (Mon) by dlang (✭ supporter ✭, #313)

The ability is there in git to have the delta algorithm uncompress revisions before comparing them.

This has been discussed several times (especially in the context of handling things like ODF files, which are compressed XML). What needs to be done to handle formats like this is well understood. Git even has the mechanism to flag files as being of a specific type and call arbitrary tools (external scripts/programs) to handle different file types.

Unfortunately, I am not aware of anyone having good, simple examples of this. It's possible, but will take some git-fu to get set up.
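
As one possible starting point, here is a minimal Python sketch of such an external tool: a git "clean" filter that rewrites a zip-based document so every member is stored uncompressed, which lets the delta code see the XML itself. The driver name `rezip`, the script name `rezip.py`, and the `--clean` flag are all made up for this example. It also assumes the application can still open a document whose zip members are stored rather than deflated, which would need checking per format.

```python
"""Hypothetical git clean filter: re-store zip members uncompressed.

Assumed wiring (names are illustrative, not shipped with git):

    # .gitattributes
    *.odt filter=rezip

    # one-time setup in the repository
    git config filter.rezip.clean "python3 rezip.py --clean"
"""
import io
import sys
import zipfile


def rezip_stored(data: bytes) -> bytes:
    """Return a copy of the zip archive in `data` with all members stored
    uncompressed, keeping member order, names and timestamps."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", compression=zipfile.ZIP_STORED) as dst:
        for info in src.infolist():
            # Fresh ZipInfo so the old compression settings are not carried over.
            stored = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            stored.compress_type = zipfile.ZIP_STORED
            stored.external_attr = info.external_attr
            dst.writestr(stored, src.read(info))
    return out.getvalue()


# git pipes the working-tree file to stdin and stores whatever comes out on stdout.
if __name__ == "__main__" and "--clean" in sys.argv[1:]:
    sys.stdout.buffer.write(rezip_stored(sys.stdin.buffer.read()))
```

With only a clean filter configured, checkouts hand back the uncompressed archive, so working-tree files will be larger than what the application originally wrote. A matching smudge filter that recompresses is possible but is left out of this sketch.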

Support large repositories!

Posted Apr 6, 2010 17:00 UTC (Tue) by Spudd86 (guest, #51683)

What would be nice is if someone wrote the code to handle the common files of this type and just included it as part of git (or at least posted it somewhere).

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds