There are cases that current free VCS systems cannot handle well, if at all. These are related to storing large binary blobs (including compiled binaries) in the VCS, which is a fairly common thing to do in corporate environments. Instead of storing lots of tar files of releases, they like to check all the files of each release into the VCS.
Perforce handles this well (for all its shortcomings in other areas).
Posted Oct 21, 2009 7:46 UTC (Wed) by epa (subscriber, #39769)
svn also copes fairly well with big blobs. But perhaps when Google set out to pick a version control system all those years ago, svn wasn't mature enough.
Git sounds like it should cope well with large objects in the repository, but the general view is that it doesn't perform so well. I wonder why not.
Perforce
Posted Oct 21, 2009 9:25 UTC (Wed) by cortana (subscriber, #24596)
I just ran into one of the reasons yesterday. We were trying to check in a bunch of large binary
files, some several hundred megabytes in size. Git ran out of memory with a fairly uninformative
error message while 'packing' objects, whatever that means...
Fortunately, a short trip to #git revealed the cause of the problem: git compresses objects
before sending them to a remote repository; it simply ran out of virtual memory while
compressing some of the larger files.
There were two fixes.
1. Use a 64-bit version of git. I'd be happy to, but there isn't an x64 binary download available
from the msysgit web site.
2. Tell git not to attempt delta compression: 'echo "* -delta" > .git/info/attributes' (the inner
quotes keep the shell from expanding the *). Somewhat undocumented, but at least I will be able to
search for this LWN comment if I ever run into this problem again. :)
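For anyone who would rather not switch off delta compression for every path, the same file takes one pattern per line. Below is a minimal sketch in Python of generating it. The function names and the *.iso / *.bin patterns are made up for illustration; only the "pattern -delta" line format and the .git/info/attributes location come from the fix above.

```python
from pathlib import Path


def render_attributes(patterns):
    """Return attributes-file text that unsets the delta attribute for each pattern."""
    return "".join(f"{pattern} -delta\n" for pattern in patterns)


def write_info_attributes(repo_dir, patterns):
    """Write the rules to <repo_dir>/.git/info/attributes.

    This replaces any existing file, just as the shell '>' redirect would.
    """
    path = Path(repo_dir) / ".git" / "info" / "attributes"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_attributes(patterns))
    return path


# Hypothetical patterns: only large binary formats skip delta compression.
EXAMPLE_PATTERNS = ["*.iso", "*.bin"]
```

Calling write_info_attributes(".", ["*"]) from the top of a working tree should produce the same file as the one-liner, while passing EXAMPLE_PATTERNS limits the setting to those file types. Building the text in Python also sidesteps the shell expanding the *.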
-delta
Posted Oct 21, 2009 23:31 UTC (Wed) by joey (subscriber, #328)
Me and my 50 GB git repos thank you for that! But since finding LWN comments
in future is not my strong suit, I sent in a patch to document it in
gitattributes(5) ;)
Looks to me to make large object commits fast, but git pull will still
compress the objects, and still tends to run out of memory when they're
large.
-delta
Posted Oct 21, 2009 23:51 UTC (Wed) by cortana (subscriber, #24596)
Thanks so much for that. I would have suggested a patch, honest, but I'm super busy at work at the moment... ;)
Presumably git-pull running out of memory would be a server-side issue? And in that case, if you're not running a sensible 64-bit operating system on your server then you deserve what you get... ;)
Perforce
Posted Oct 21, 2009 12:13 UTC (Wed) by dlang (✭ supporter ✭, #313)
Quote:
Git sounds like it should cope well with large objects in the repository, but the general view is that it doesn't perform so well. I wonder why not.
git mmaps the files to access them, and the pack definition is limited to no more than 4G (and since the over-the-wire protocol for download is the same as a pack file, you run into limits there).
4G is _huge_ for source code, especially with the compression that git does, but when you start storing binaries that don't diff against each other, the repository size can climb rapidly.
This has been documented several times, but it seems to be in the category of 'interesting problem, we should do something about that someday, but not a priority' right now.
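To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the worst case described above, where each release adds its full size to the pack because nothing deltas against the previous version. The 300 MiB and 50 KiB figures are illustrative assumptions, not measurements.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

# The 4G ceiling mentioned above, taken here as 4 GiB.
PACK_LIMIT = 4 * GIB


def additions_until_limit(bytes_per_addition, limit_bytes=PACK_LIMIT):
    """Count how many whole additions fit under the limit.

    Assumes every addition grows the pack by its full size, i.e. no
    savings from deltas against earlier versions.
    """
    return limit_bytes // bytes_per_addition


# Assumed: a release that adds 300 MiB of binaries that do not delta.
print(additions_until_limit(300 * MIB))  # 13

# Assumed: a source commit that adds 50 KiB after compression.
print(additions_until_limit(50 * KIB))  # 83886
```

Under those assumptions, a few hundred megabytes of binaries per release reaches 4G after roughly a dozen releases, while a source tree adding tens of kilobytes per commit has room for tens of thousands of commits.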
Perforce
Posted Oct 21, 2009 14:30 UTC (Wed) by drag (subscriber, #31333)
Yes. Git is exceptionally good at managing text.
Pretty shitty at everything else. It's too bad, because I'd like to use it for synchronizing my desktop.
Perforce
Posted Nov 1, 2009 18:54 UTC (Sun) by mfedyk (guest, #55303)
You probably want to look at CouchDB and the FUSE driver for it.
Perforce
Posted Oct 21, 2009 9:22 UTC (Wed) by nix (subscriber, #2304)
It's also useful if your project consists of a large number of loosely-related changes, so you really don't *want* tree-wide commits. Amazing though it may sound, some organizations actually depend on this sort of thing.