SCM innovation in git
Posted Apr 21, 2005 12:03 UTC (Thu) by kevinbsmith
Parent article: A very quick guide to starting with git
Some folks may question the value of git, given that several free SCM tools already exist. I did at first, too. But increasingly it appears that Linus has managed to advance the state of the art in a few ways. It's too early to know for sure, but git appears to be an impressive piece of design work.
It relies on a simpler core than most SCMs, and has a very coherent philosophy. It fully separates the "engine" or "filesystem" layer from the user-oriented SCM layer. I expect there will be at least a couple of front-ends to git (cogito being one), sharing compatible back-end data. The core design is very simple, making it easy for people to experiment with better merging algorithms and other SCM-level features.
One of the most interesting design choices is not to track file renames. At first, this seems like a step backward, since most of the other tools use rename tracking to support commands like "blame", where you can track the history of a piece of code, even if the module it is in has been renamed. If you don't track renames, aren't you back in the CVS world where history gets lost? Fortunately, it looks like that is not the case.
Linus claims that a rename (or move) is merely a special case of a more interesting problem: text moving from one file to another. He points out that it is pretty common to cut and paste a function from one module to another, or to split a file into two or three pieces. If you only track renames, you will lose history in any of those cases.
Instead, he has laid out the design of a tool that would allow you to point to some code in the current version, and track that code backward through time, even as it moved from file to file. Even better, this tool would not need any patch metadata to do its work. It would rely solely on the tree snapshots (or diffs).
The trick is that when the tool wants to know where some text came from, it doesn't have to search the entire directory tree. It only has to search those files that have been touched by this changeset. So it can be fast, even without requiring any "hints" to be written at commit time.
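To make the idea concrete, here is a rough sketch of how such a backward search might look. This is not Linus's design or anything in git itself, just my own illustration under simplifying assumptions: each snapshot is modeled as a plain dict mapping a path to the file's full text, history is a linear list of snapshots, and "the same code" means an exact substring match. The names `Snapshot`, `touched_files`, `find_in_parent`, and `trace_origin` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a tree snapshot is modeled as {path: full file text}.
Snapshot = Dict[str, str]


def touched_files(parent: Snapshot, child: Snapshot) -> List[str]:
    """Paths that were added, removed, or modified between two snapshots."""
    paths = set(parent) | set(child)
    return sorted(p for p in paths if parent.get(p) != child.get(p))


def find_in_parent(snippet: str, path: str,
                   parent: Snapshot, child: Snapshot) -> Optional[str]:
    """Find which file held the snippet in the parent snapshot.

    Checks the same path first, then only the files touched by this
    changeset, never the whole tree.
    """
    if snippet in parent.get(path, ""):
        return path
    for candidate in touched_files(parent, child):
        if snippet in parent.get(candidate, ""):
            return candidate
    return None  # the snippet first appeared in the child snapshot


def trace_origin(snippet: str, path: str,
                 history: List[Snapshot]) -> List[Tuple[int, str]]:
    """Walk history (oldest first) backward from the newest snapshot.

    Returns (snapshot index, path) pairs, newest first, for every
    snapshot in which the snippet could be located.
    """
    trail: List[Tuple[int, str]] = []
    index = len(history) - 1
    if snippet not in history[index].get(path, ""):
        return trail
    trail.append((index, path))
    while index > 0:
        previous = find_in_parent(snippet, path,
                                  history[index - 1], history[index])
        if previous is None:
            break
        index -= 1
        path = previous
        trail.append((index, path))
    return trail


# Made-up history: a function is cut from util.c into math.c,
# then math.c is renamed to arith.c.
ADD = "int add(int a, int b) { return a + b; }\n"
MAIN = "int main(void) { return 0; }\n"
history = [
    {"util.c": ADD, "main.c": MAIN},
    {"util.c": "", "math.c": ADD, "main.c": MAIN},
    {"util.c": "", "arith.c": ADD, "main.c": MAIN},
]
print(trace_origin("return a + b;", "arith.c", history))
# prints [(2, 'arith.c'), (1, 'math.c'), (0, 'util.c')]
```

Note that no rename or move was ever recorded anywhere: the cut-and-paste and the rename are both recovered from the snapshots alone. A real tool would need fuzzier matching than an exact substring test, and this sketch would miss code copied from a file the changeset did not touch.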
Very cool! I have never heard of an SCM that does this. Certainly not any of the free distributed SCM systems.