Large projects and decentralized development

By Jake Edge
August 22, 2007

Development using Git, with its decentralized model, is gaining proponents for projects beyond its Linux kernel heritage. Some recent threads on the kde-core-devel mailing list have been discussing how Git might be used by some developers without disrupting the Subversion (svn) infrastructure that is used by KDE. That conversation has broadened to consider how a large project like KDE might reorganize to take advantage of Git's strengths. It does not look like KDE is really considering a switch – they converted from CVS a little over two years ago – but the discussion is useful to anyone thinking about using Git.

There are really two separate discussions taking place: the first concerns using Git without disrupting svn, while the second covers the larger question of how to structure and use Git for a large project. The two are intertwined because the "best practice" for a KDE-sized project is to convert incrementally. Smaller sub-projects, a particular KDE application for example, would use Git while still committing their changes back to the svn repository. Trying to do a wholesale conversion of a project the size of KDE, with many developers, testers, translators and users – not to mention millions of lines of code – would be something approaching impossible.

For tracking an svn repository while using Git locally, the git-svn tool is indispensable. It uses any of the svn protocols to check out a repository, optionally including branches and tags, and installs the result as a Git repository. A developer then uses Git commands locally, invoking git-svn again when ready to update from, or push changes to, the svn repository. It is not a perfect fit (complaints about losing history in the conversion have been heard), but it does give Git users a way to interact with svn.
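As a rough sketch of that round trip, the Python below only assembles the commands a developer might run and prints them; nothing is executed. The repository URL and directory name are placeholders, the function name is made up for illustration, and the `--stdlayout` option assumes the svn repository follows the conventional trunk/branches/tags layout.

```python
def git_svn_round_trip(svn_url, workdir):
    """Return the git-svn commands for a clone / update / push-back cycle.

    svn_url and workdir are placeholders supplied by the caller.
    """
    return [
        # One-time import; --stdlayout assumes trunk/, branches/ and tags/.
        ["git", "svn", "clone", "--stdlayout", svn_url, workdir],
        # Fetch new svn revisions and replay local commits on top of them.
        ["git", "svn", "rebase"],
        # Send each local Git commit back to svn as its own revision.
        ["git", "svn", "dcommit"],
    ]


# Hypothetical URL and directory, for illustration only.
commands = git_svn_round_trip("svn://svn.example.org/project", "project")
for cmd in commands:
    print(" ".join(cmd))
```

Between the rebase and the dcommit, all day-to-day work (committing, branching, viewing history) happens with ordinary Git commands in the local repository.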

The decentralized nature of the Git development model is often a stumbling block for projects that are used to the single, central repository model of svn and other revision control systems. Adam Treat invited a rather well-known expert on Git, with some small experience in applying it to large projects, to comment on some of the questions he and others had. Linus Torvalds, who is also a KDE user, responded, at length, with some very useful insights.

Breaking the project into sub-projects is the first step:

So I'm hoping that if you guys are seriously considering git, you'd also split up the KDE repository so that it's not one single huge one, but with multiple smaller repositories (ie kdelibs might be one, and each major app would be its own), and then using the git "submodule" support to tie it all together.

Using the git-submodule command, a project can be broken up into many pieces, each with its own Git repository. Those separate repositories can then be stitched together into a "superproject" that understands how to handle a collection of repositories. If a change affects multiple modules, it can still be handled in an atomic way:

What happens is that you do a single commit in each submodule that is atomic to that *private* copy of that submodule (and nobody will ever see it on its own, since you'd not push it out), and then in the supermodule you make *another* commit that updates the supermodule to all the changes in each submodule.

See? It's totally atomic. Anybody that updates from the supermodule will get one supermodule commit, when that in turn fetches all the submodule changes, you never have any inconsistent state.
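The idea can be illustrated with a toy model in Python. This is not how Git is implemented; it only assumes that a superproject commit records one commit ID per submodule, so that a reader always sees a matching set. The class, the helper function, and the module names and contents are all invented for the example.

```python
import hashlib


def commit_id(payload):
    """Hash some content into a short stand-in for a commit ID."""
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:12]


class Superproject:
    """Records, per commit, which commit of each submodule belongs together."""

    def __init__(self):
        self.history = []  # each entry: dict of submodule name -> commit ID

    def commit(self, pins):
        # One superproject commit records all submodule pins at once.
        self.history.append(dict(pins))

    def checkout(self, index=-1):
        # A reader gets every pin from a single commit, never a mixture.
        return dict(self.history[index])


# Hypothetical module names and contents, for illustration only.
superproject = Superproject()
superproject.commit({
    "kdelibs": commit_id("kdelibs: old API"),
    "app": commit_id("app: uses old API"),
})

# A cross-module change: one private commit per submodule...
new_pins = {
    "kdelibs": commit_id("kdelibs: new API"),
    "app": commit_id("app: uses new API"),
}
# ...then a single superproject commit that publishes both together.
superproject.commit(new_pins)

before = superproject.checkout(0)
after = superproject.checkout(-1)
print(before)
print(after)
```

Anyone checking out the superproject sees either the old pair of pins or the new pair; no superproject commit exists that combines the new library with the old application.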

Users of a development tree have differing needs, which Git supports by not requiring a central repository that all users must interact with. Torvalds believes that the development organization, not the tool, should determine which repositories are central:

I certainly agree that almost any project will want a "central" repository in the sense that you want to have one canonical default source base that people think of as the "primary" source base.

But that should not be a *technical* distinction, it should be a *social* one, if you see what I mean. The reason? Quite often, certain groups would know that there is a primary archive, but for various reasons would want to ignore that knowledge.

For Linux, his kernel Git tree is the center, but for a variety of other users (those of the "stable" tree or of distribution kernel trees, for example), those repositories are the source. Such repositories can and do update from the main tree from time to time, but their maintainers control when, and the users of those trees don't have to care.
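A small Python sketch can make the point concrete. It is a deliberately simplified model, assuming a repository is nothing more than a set of commit labels; real Git tracks a commit graph and performs merges. The class and the repository names are illustrative only.

```python
class Repo:
    """A toy repository: a name plus the set of commit labels it holds."""

    def __init__(self, name):
        self.name = name
        self.commits = set()

    def commit(self, label):
        self.commits.add(label)

    def pull(self, other):
        # Every repository has the same pull operation; none is special.
        self.commits |= other.commits


mainline = Repo("mainline")
mainline.commit("release")

stable = Repo("stable")
stable.pull(mainline)            # stable starts from the main tree

mainline.commit("experimental")  # mainline moves ahead
stable.commit("bugfix")          # stable carries its own fix

user = Repo("user")
user.pull(stable)                # this user treats stable as "the source"
first_view = sorted(user.commits)
print(first_view)

stable.pull(mainline)            # the stable maintainer decides when to update
user.pull(stable)
print(sorted(user.commits))
```

Nothing in the model marks any repository as central; which one counts as "the source" depends only on where each user chooses to pull from.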

On the subject of mapping the current KDE practices to Git, Torvalds is, characteristically, not shy about expressing his opinion:

Hey, you can use your old model if you want to. git doesn't *force* you to change. But trust me, once you start noticing how different groups can have their own experimental branches, and can ask people to test stuff that isn't ready for mainline yet, you'll see what the big deal is all about.

Centralized _works_. It's just *inferior*.

There is a clash of development models going on and Torvalds is pushing the kernel's model. His reasons are good, though they may not convince everyone, which is why Git tries hard to avoid forcing any particular style. As he did with open source development, Torvalds is trying to lead by example, while not forcing anyone to change.

Reading the full threads, including the entire posting by Torvalds, will be very interesting to those who follow source code management issues. This culture clash, centralized and somewhat bureaucratic versus decentralized and freewheeling, will come up again and again over the next few years. Torvalds seems to think the Git model will work most everywhere, and his track record for making smart choices is good. It will be interesting to watch.



Privilege error.

Posted Aug 23, 2007 1:35 UTC (Thu) by dw (subscriber, #12017) [Link] (1 responses)

Hey Jake,

Nice article, but I noticed I'm getting a "privilege error" when trying to click through to http://lwn.net/Articles/246381/ from the first link "responded". Forget to tick a checkbox? :)

Privilege error.

Posted Aug 23, 2007 2:07 UTC (Thu) by jake (editor, #205) [Link]

Sorry about that. Should be fixed now, thanks!

jake

Large projects and decentralized development

Posted Aug 23, 2007 4:36 UTC (Thu) by thedevil (guest, #32913) [Link] (3 responses)

Well, "git-submodule" not only doesn't exist on my machine (which runs etch with a few lenny packages mixed in), but the file-in-package search for it doesn't return any hits, either, which means it is not even in the version in sid. Evidently, Linus would like the KDE people to use a bleeding-edge git. No matter how good it is, I wouldn't agree if I were one of them.

Large projects and decentralized development

Posted Aug 23, 2007 5:50 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)

submodules are fairly new, about 6 months old. it does require the 1.5 version of git.

unfortunately Debian hasn't moved from the 1.4 version yet, and they won't for etch (a policy decision by debian)

Large projects and decentralized development

Posted Aug 23, 2007 7:35 UTC (Thu) by Wummel (guest, #7591) [Link]

The git-submodule script is available only in development versions, eg. 1.5.3rc5. The current version 1.5.2.5 from the download page does not include it. So I consider it still bleeding-edge.

Large projects and decentralized development

Posted Aug 23, 2007 10:48 UTC (Thu) by gnb (subscriber, #5132) [Link]

>Linus would like the KDE people to use a bleeding-edge git.
But all this is still at the speculative discussion stage. Odds are that
any decision will take long enough that the feature will be quite far
from the bleeding edge by then, so his position isn't unreasonable.

Large projects and decentralized development

Posted Aug 23, 2007 7:59 UTC (Thu) by Wummel (guest, #7591) [Link] (3 responses)

I listened to most of Linus' git talk at Google. I remember that he said the reason why KDE did not choose git was that there was no Windows support for it. That might have changed a little, judging from the Wiki Windows page.

But Windows support is still not on the same level as Unix. The wiki page even suggests using CVS when there are only a few Windows developers on the project. That makes me feel that the Windows port is not really serious.

Additionally there are currently no plugins for IDEs such as Eclipse, kdevelop or anjuta. There are plans/feature requests for all of those though. The future might bring a full alternative to Subversion, but right now that is not the case when Subversion is used with Windows or IDEs.

Large projects and decentralized development

Posted Aug 23, 2007 10:57 UTC (Thu) by ms (subscriber, #41272) [Link] (1 responses)

There is more than one DSCM. Mozilla has recently switched to Bazaar. Then there's Mercurial and Monotone. Darcs is kinda neat too and then there's always Arch/TLA.

Some of these have damn good windows support. Most of them have saner command line interfaces than Git.

Large projects and decentralized development

Posted Aug 23, 2007 10:59 UTC (Thu) by ms (subscriber, #41272) [Link]

Damn, sorry, that's factually wrong. Mozilla went to Mercurial, not Bazaar.

http://weblogs.mozillazine.org/preed/2007/04/version_cont...

Large projects and decentralized development

Posted Aug 30, 2007 6:26 UTC (Thu) by biehl (subscriber, #14636) [Link]

There does seem to be a Git Eclipse plugin. Haven't tried it yet, myself

http://git.or.cz/gitwiki/EclipsePlugin

Large projects and decentralized development

Posted Aug 24, 2007 9:06 UTC (Fri) by jospoortvliet (guest, #33164) [Link]

Thanx for the article. I did follow the discussion on the mailing list, as it was interesting and informed. It seems, even though there still are some issues to work out, the most important problems are more social than technical. Settling down, as Aaron suggested, is a good move - KDE has been in flux, changing its core technologies for the last 2 years. Now it's time to write some code. But sure, there is already some experimentation with Git and the old adage of 'who codes, decides' still goes. Just like the transition to CMake, which became the de facto standard in KDE, even though at first everyone thought it would be SCons...


Copyright © 2007, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds