Posted Oct 21, 2009 5:19 UTC (Wed) by bradfitz (subscriber, #4378)
So with stuff like that, or git-svn, if you're going to have a blessed
"central" repo anyway, who really cares if that repo is actually git, svn,
perforce, etc, as long as you can use your DVCS of choice at the edge?
The alternative is changing years of accumulated tools & checks every time a
new VCS comes out and you change your master repo's storage format.
Posted Oct 21, 2009 22:02 UTC (Wed) by ianw (subscriber, #20143)
Although everyone always talks about getting rid of it, there is so much build, QA, and release infrastructure built around it that I can't fathom it ever happening. But, using git wrappers, we developers can pretty much forget that it's even there :)
Posted Oct 21, 2009 6:07 UTC (Wed) by dlang (✭ supporter ✭, #313)
Perforce handles this well (for all its shortcomings in other areas)
Posted Oct 21, 2009 7:46 UTC (Wed) by epa (subscriber, #39769)
Git sounds like it should cope well with large objects in the repository, but the general view is that it doesn't perform so well. I wonder why not.
Posted Oct 21, 2009 9:25 UTC (Wed) by cortana (subscriber, #24596)
Fortunately, a short trip to #git revealed the cause of the problem: git compresses objects
before sending them to a remote repository; it simply ran out of virtual memory while
compressing some of the larger files.
There were two fixes.
1. Use a 64-bit version of git. I'd be happy to, but there isn't an x64 binary download available
from the msysgit web site.
2. Tell git not to perform the compression; 'echo * -delta > .git/info/attributes'. Somewhat
undocumented, but at least I will be able to search for this LWN comment if I ever run into this
problem again. :)
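The attributes workaround above can also be applied per file type rather than repository-wide; a minimal sketch (the `bigrepo` path and the `*.iso`/`*.bin` patterns are illustrative; note that `.git/info/attributes` stays local to the repository, unlike a committed `.gitattributes` file):

```shell
# Mark large binary types as non-delta so git skips the memory-hungry
# delta-compression search when packing or pushing them.
REPO=bigrepo                            # placeholder repository path
mkdir -p "$REPO/.git/info"
printf '*.iso -delta\n*.bin -delta\n' >> "$REPO/.git/info/attributes"
cat "$REPO/.git/info/attributes"
```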
Posted Oct 21, 2009 23:31 UTC (Wed) by joey (subscriber, #328)
Looks to me like it makes large object commits fast, but git pull will still compress the objects, and still tends to run out of memory when they're large.
Posted Oct 21, 2009 23:51 UTC (Wed) by cortana (subscriber, #24596)
Presumably git-pull running out of memory would be a server-side issue? And in that case, if you're not running a sensible 64-bit operating system on your server then you deserve what you get... ;)
Posted Oct 21, 2009 12:13 UTC (Wed) by dlang (✭ supporter ✭, #313)
git mmaps the files to access them, and the pack format is limited to no more than 4G (and since the over-the-wire download protocol is the same as a pack file, you run into limits there too)

4G is _huge_ for source code, especially with the compression that git does, but when you start storing binaries that don't delta against each other, the repository size can climb rapidly.

This has been documented several times, but it seems to be in the category of 'interesting problem, we should do something about that someday, but not a priority' right now
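One partial mitigation for that ceiling is to cap pack sizes and delta-search memory in the repository config; a sketch with illustrative values (writing `.git/config` directly here is equivalent to running `git config pack.packSizeLimit 1g` and so on):

```shell
# Keep each generated pack below the 32-bit mmap ceiling, and bound the
# memory the delta-search window may use. Values are illustrative.
REPO=bigrepo                            # placeholder repository path
mkdir -p "$REPO/.git"
cat >> "$REPO/.git/config" <<'EOF'
[pack]
	packSizeLimit = 1g
	windowMemory = 256m
EOF
grep -A 2 '\[pack\]' "$REPO/.git/config"
```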
Posted Oct 21, 2009 14:30 UTC (Wed) by drag (subscriber, #31333)
Pretty shitty at everything else, though. It's too bad, because I'd like to use it for synchronizing my desktop.
Posted Nov 1, 2009 18:54 UTC (Sun) by mfedyk (guest, #55303)
Posted Oct 21, 2009 9:22 UTC (Wed) by nix (subscriber, #2304)
Posted Oct 21, 2009 11:54 UTC (Wed) by jonth (subscriber, #4008)
First, some background. The company had grown over the course of its first three years into a two-site, 100-person company. Up to that point, we had used CVS with a bunch of bespoke scripts to handle branches and merges. We used GNATS (I _really_ wouldn't recommend this) as our bug tracking system. These decisions had been taken somewhat by default, but by 2005 it was clear that we needed something new that would scale to our needs in the future. Our requirements were:
a) Integration of bug tracking and source control management. For us, we felt that it was vital to understand that SCM is only half of the problem. I think that this tends to be overlooked in non-commercial environments.
b) Scalable to multi-site and 100s of users.
c) Ease of use.
g) Windows/Linux support. We're predominantly a Linux shop, but we have teams who write Windows drivers.
We looked at the following systems (in no particular order):
a) git. Git had been active for about 6 months when we started looking at it. We liked the design principles, but at that time there was no obvious way to integrate it into an existing bug tracking system. It also had no GUI then (although I'm a confirmed command line jockey, a GUI for these things definitely improves productivity) and there was no Windows version of git. Finally, the underlying storage was still somewhat in flux, and all in all, it seemed just too young to risk the future of the company on it.
b) Mercurial. Many of the problems we had with git also applied to Mercurial. However, even then it did integrate with Trac, so we could have gone down that route. In the end, like git, it was just too new to risk.
c) Clearcase/Clearquest. Too slow, too expensive, and rubbish multi-site support.
d) Bitkeeper. Nice solution, but we were scared of the "Don't piss Larry off" license.
e) Perforce/Bugzilla. Provided "out of the box" integration with Bugzilla, worked pretty well with multi-site using proxies, had a nice GUI, scaled well, was stable (our major supplier had used it for a few years), had client versions for Windows and Linux, and was pretty quick, too.
f) MKS. No better than CVS.
g) SVN. In many ways similar to Perforce in terms of how it is used. In fact, one part of the company decided to use SVN instead of Perforce; this lasted for about 6 months. I don't know the details, but due to some technical difficulties they gave up and moved over to Perforce.
All in all, Perforce integrated with a customized version of Bugzilla, while not perfect (git/mercurial/bk's model of how branches work is more sensible I think), gave us the best fit to our needs. We now have ~200 users spread all over the world, with no real performance problems. The bug tracking integration works well. Perforce's commercial support is responsive and good, we've never lost any data and we can tune the whole system to our needs.
If we had to revisit the decision, it's possible that Mercurial/Trac would have fared better, but to be honest the system we chose has stood the test of time and so there is no reason to change.
Posted Oct 21, 2009 12:16 UTC (Wed) by ringerc (subscriber, #3071)
BDB is great if used in the (optional) transactional mode on an utterly stable system where nothing ever goes wrong. In other words, in the real world I like to see its use confined to throw-away databases (caches, etc).
I've been using SVN with the fsfs backend for years both personally and in several OSS projects and I've been very happy. Of course, the needs of those projects and my personal needs are likely quite different to your company's.
Posted Oct 21, 2009 19:21 UTC (Wed) by ceswiedler (subscriber, #24638)
People use Perforce because it works very well for centralized version control, and that's what a lot of companies need. It enforces user security, integrates with a lot of other software, can be backed up centrally, and has a lot of very good tools. On the other hand, it doesn't scale as well as DVCSs do, and can't be used offline.
Posted Oct 21, 2009 21:11 UTC (Wed) by man_ls (subscriber, #15091)
Posted Oct 31, 2009 4:55 UTC (Sat) by Holmes1869 (guest, #42043)
That being said, I feel that some of git's features will only ever be used by people who take source control seriously. The people I work with check in code without commit messages, mistakenly commit files that they forgot they changed (or other random files that ended up in their sandbox), and don't ever perform a simple 'svn diff' (or Subclipse comparison) just to make sure they are checking in what they want. Do you think these people care that they can re-order or squash commits to create a single pristine, neat, atomic commit that fixes exactly one particular bug? Probably not, unfortunately. I hope to one day work with people who do care.
Posted Oct 22, 2009 7:38 UTC (Thu) by cmccabe (guest, #60281)
I've worked with perforce, subversion, and git in the past. The three systems all have very different philosophies.
perforce has some abilities that none of the other systems have. When you start editing a file, it tells you who else has it open. You can look at their changes, too.
Both perforce and subversion can check out part of a repository without checking out the whole thing. Git can't do this. Arguably, you should use git subprojects to solve this problem. I've never actually done that, so I don't know how well it works.
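The subprojects mentioned above are wired up through a `.gitmodules` file committed in the superproject, which maps a working-tree path to a separately cloned repository; a hypothetical sketch (the submodule name, path, and URL are invented):

```ini
[submodule "bigassets"]
	path = bigassets
	url = git://example.org/bigassets.git
```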
Of course, git allows you to work offline, which neither perforce nor subversion can do. git also allows you to merge changes from one client to another ("branch", in git lingo). I've definitely been frustrated in the past by having to manually port changes from one perforce client to another; I even wrote scripts to automate it. What a waste.
"p4 merge" is a powerful command, much more powerful than "svn copy." p4 preserves the "x was integrated into y" relationships between files, whereas svn does not. Imagine a company that has branches for product 1.0, 2.0, and 3.0. It periodically integrates changes from 1.0 into 2.0, and 2.0 into 3.0. In this situation, the relative lack of sophistication of svn copy is a real Achilles heel. Imagine how much pain renaming a file in version 2.0 causes for the hapless svn copy user. Each time the build monkey does the integration from 1.0 to 2.0, he has to remember the files that were renamed. Except that with perforce, the system remembers it for him.
git I think has heuristics to detect this sort of thing. In general git was built from the ground up to do merging on a massive basis.
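Those heuristics are visible in `git log --follow`, which re-detects renames by content similarity rather than relying on recorded file identities; a small demonstration (the `renamedemo` repository and file names are invented, and git is assumed to be installed):

```shell
# Build a two-commit history containing a rename, then follow the file
# across it. Plain "git log -- helpers.c" would stop at the rename
# commit; --follow re-detects the rename by content similarity.
git init -q renamedemo
G="git -C renamedemo -c user.email=demo@example.org -c user.name=demo"
echo 'core logic' > renamedemo/util.c
$G add util.c
$G commit -q -m 'add util.c'
$G mv util.c helpers.c
$G commit -q -m 'rename to helpers.c'
$G log --follow --oneline -- helpers.c
```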
perforce also has excellent Windows support, a pile of GUI tools, and was about a dozen years earlier to the party. git and svn are catching up with these advantages, but it will take some time.
Posted Oct 22, 2009 19:17 UTC (Thu) by dsas (subscriber, #58356)
Posted Oct 30, 2009 21:57 UTC (Fri) by lkundrak (subscriber, #43452)
Posted Oct 29, 2009 3:05 UTC (Thu) by tutufan (guest, #60063)
Wow. I can almost hear the punch card reader in the background. Talk about an obsolete mindset. If I'm editing file X, do I really want to know whether somebody, somewhere, working on some idea that I have no idea about, is trying out something that also somehow involves file X, something that ultimately may never see the light of day? No.
If we get to the point of merging, I think about it then (if necessary).
Posted Nov 4, 2009 21:44 UTC (Wed) by jengelh (subscriber, #33263)
Sure, you could split it up, but it's all too tightly integrated. Should anything move to git in the future, I would guess all repositories will start with a fresh slate.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds