Book Review: Pragmatic Version Control Using Git

By Jake Edge
April 13, 2009

Given the ubiquity of Git as a version control system throughout the free software community, one would expect there to be more books about it. So far, that is not the case—though there are indications that is changing—so Travis Swicegood's Pragmatic Version Control Using Git is welcome for those trying to come up to speed on Git. Overall, the book provides a nice starting point, though there are some rough spots.

Like any book covering a free software package, this one begins with some important basics: where to get the tool and how to install it. For Linux users, this guide is probably unnecessary, as Git is packaged for most distributions these days; Mac OS X or Windows users may find more of interest. The discussion of Git configuration is quite useful for all, particularly the reminder to set the user.name and user.email parameters before doing any commits, something that I regularly forget when setting up a new machine.
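As a minimal sketch of that reminder, the two settings can be stored once per machine with git config. The name and address are placeholders, and pointing HOME at a scratch directory is an assumption made here so that running the script leaves any real configuration untouched; it is not something taken from the book.

```shell
#!/bin/sh
set -e
# Use a throwaway HOME so this demo does not touch a real ~/.gitconfig.
HOME="$(mktemp -d)"
export HOME
unset XDG_CONFIG_HOME

# Placeholder identity; substitute your own values.
git config --global user.name "Jane Hacker"
git config --global user.email "jane@example.org"

# Read the values back to confirm what future commits will record.
git config --global --get user.name
git config --global --get user.email
```

On a real machine one would drop the HOME lines and run only the two git config --global commands.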

Unlike some other authors, though, Swicegood takes the time to give a bit of the flavor of Git through a discussion of its concepts—along with some indication of why one might want to use it—before descending into the much more boring installation guide. He takes a "30,000 foot" view of the tool and, with no command syntax or specific usage details, spells out what Git can do.

One of the primary problems that any text on a version control system (VCS) must overcome is the need to give "real-world" examples while still keeping the book to a reasonable size. Swicegood does a good job here, by following one example repository throughout the text. One could quibble with the scope of some of his examples, but, by and large, they give a good idea of how things work. In some ways, the simplicity of those examples appears to encourage curious readers to do some experimentation. That is, after all, a pretty good way to learn how to use a tool.

The book is broken up into three main sections (plus an Appendix with a reference and some pointers to more information), but the meat of the text is in Section II, "Everyday Git". For whatever reason, the last chapter of the first section covers setting up local repositories as well as cloning remote repositories. That might make sense, but it is rather puzzling that it starts talking about things like git rebase, branches, and doing releases here. Much of that is covered in further detail later and it doesn't seem to belong.

In Section II, the book does an excellent job of covering how to use Git on a day-to-day basis. I have found myself referring to it several times since reading it to remind myself of the syntax of a command, or of the name of the command itself. The sequence is logical, starting with adding and committing files, moving through branch creation and management as well as examining and working with history in Git, and completing the core with a look at remote repositories. Two additional chapters cover somewhat more advanced (or just less often used) features, such as organizing the repository and working with multiple remote projects, as well as things like compacting a repository and working with the reflog.

Swicegood uses the term "staging" for what is commonly referred to, at least in other Git documentation, as the "index". Some readers, especially if they are already well-versed in Git, may find this a bit confusing, but I found that it made sense and, in some ways, simplified the concept. In any case, it seems clear that this is how Swicegood envisions the Git index, so passing it along to his readers is a nice touch.
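For readers who have not met the concept under either name, the following sketch (not taken from the book) shows what "staging" means in practice: git add copies a snapshot of a file into the index, and a later commit records that snapshot, not whatever the working tree contains by then. The file name, contents, and identity are made up for illustration.

```shell
#!/bin/sh
set -e
# Work in a scratch repository so nothing real is modified.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Jane Hacker"        # placeholder identity
git config user.email "jane@example.org"

echo "first draft" > notes.txt
git add notes.txt                     # stage the file (snapshot it into the index)
echo "second thoughts" >> notes.txt   # edit again, without staging

git diff --cached --name-only   # staged: what the next commit would record
git diff --name-only            # unstaged: working tree compared with the index
git commit -q -m "Add notes"    # records only the staged snapshot
git status --short              # notes.txt is still listed as modified
```

After the commit, the repository holds the one-line version of notes.txt while the working tree still has the second line waiting to be staged.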

There is no specific mention of the Git version covered by the book—though some early examples mention 1.6.0.2—which is a rather large oversight. Git development moves rapidly, so some of what Swicegood talks about could well be out-of-date. New Git features, such as the unmentioned git stash, were left out, but it isn't clear whether that was done on purpose or because they were added after the book was completed. Most of what is covered should be unaffected, though, as the basic operation of the tool is fairly stable.

The third, thinnest, and weakest section is "Administration", which covers migrating to Git and running Git servers. Both chapters seem to suffer from a lack of breadth. In the migration chapter, only CVS and Subversion are considered, and tools like tailor are not even mentioned.

Two things about Swicegood's choices of Git features stood out in a negative way. He seems overly enamored of git rebase, which certainly has its place, but it has some drawbacks that he doesn't fully caution against. His solution for how to create a repository for others to use was somewhat unsatisfying; Git itself can be configured to support such things. Instead, Swicegood reaches for Gitosis, a Python tool for managing remote git repositories. The project seems to have no web page (other than a gitweb page) and one must install it by cloning its repository. Given that there is no mention of how to "manually" set up a Git server, it all seems a bit strange.
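One possible reading of a "manual" setup, offered as a sketch and not as what the book or Gitosis does, is a bare repository that contributors clone from and push to. Everything here lives in a scratch directory on one machine; on a real server the same bare repository would typically be reached over SSH, and the directory names and identity below are invented for the example.

```shell
#!/bin/sh
set -e
top="$(mktemp -d)"
cd "$top"

# A bare repository has no working tree; it exists to be pushed to and fetched from.
git init -q --bare shared.git

# One contributor clones it, commits, and pushes.
git clone -q shared.git alice
cd alice
git config user.name "Alice Example"      # placeholder identity
git config user.email "alice@example.org"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q origin HEAD                   # push the current branch under its own name
cd ..

# A second clone sees the pushed work.
git clone -q shared.git bob
cat bob/README
```

Access control, which is the part Gitosis automates, is deliberately left out of this sketch.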

There are a handful of less-substantive complaints I could make as well: a throwaway George Santayana quote in the history chapter was a bit annoying, an embarrassing "EMCAScript" typo in one of the examples stood out, as did a few other minor flaws. Swicegood complains frequently about having to truncate or otherwise modify the output of commands to fit on the page, which seems a bit silly. Either fix the problem somehow in the production process or ignore it to the extent possible; involving the reader in the pain of the typographic process seems unnecessary. But these are nits.

While I had some complaints—it is a rare book indeed where I don't—Pragmatic Version Control Using Git has certainly found a spot for itself on my shelf. It especially shines as a quick reference to commands needed daily or nearly so. It will also provide a good starting point for those who wish to learn Git from scratch. Once other Git books come out, it will be interesting to see which end up on my shelf and which are shuffled off to long-term storage. In the end, that is the best test for a good book.


Book Review: Pragmatic Version Control Using Git

Posted Apr 13, 2009 20:46 UTC (Mon) by pr1268 (subscriber, #24648)

There is no specific mention of the Git version covered by the book—though some early examples mention 1.6.0.2—which is a rather large oversight.

I don't see this as an oversight, considering how rapid git development has been over the past several years. On the one hand, having the entire book refer to one specific version has a feel of consistency and stability; on the other hand, some may consider 1.6.0.2 too "old". The author usually has to decide how best to not give his book the appearance of obsolescence.

Just a thought... FWIW my Slackware (ver. 12.2) box has git 1.6.1.3.

One more

Posted Apr 14, 2009 0:58 UTC (Tue) by dmarti (subscriber, #11625)

This one by Jon Loeliger is in the pipe from O'Reilly: Version Control with Git. I like it -- Jon calls the index the index.

One more

Posted Apr 14, 2009 5:52 UTC (Tue) by PO8 (guest, #41661)

As somebody who helped to review/edit Loeliger's book, I can recommend it. It seems to me to have a good balance of theory and pragmatics and to be reasonably comprehensive for the beginning to intermediate Git user.

Book Review: Pragmatic Version Control Using Git

Posted Apr 14, 2009 2:28 UTC (Tue) by bronson (subscriber, #4806)

I thought the Git team agreed last year to start calling the index the "staging area". The idea was to use a less generic term for such an important concept.

Maybe this is like the metric system in the US... Most people agree it would be a good idea but it just can't seem to stick?

Book Review: Pragmatic Version Control Using Git

Posted Apr 14, 2009 13:25 UTC (Tue) by nye (guest, #51576)

I highly doubt it. It seems more like the addition of the new phrasing is considered a Bad Thing. eg:
http://marc.info/?l=git&m=123896158631869

staging vs index vs cache

Posted Apr 14, 2009 20:32 UTC (Tue) by rfunk (subscriber, #4054)

Sounds to me like a classic case of developers wanting to use names based
on internal concepts, and users/teachers wanting to use names based on
user-visible behavior. And in this case, the preference for "index"
appears to be primarily for historical reasons. (I can't really blame
them for that.)

As a user, I'd prefer to have both consistent user-visible naming and
naming that's consistent with the behavior I see. Based on that mailing
list post, the developers appear to be already stuck with the
inconsistency of "index" vs "cache", and don't want a third name even if
it's more consistent with the behavior.

Even (especially) after reading the developer's explanation, I
think "index" is the wrong name for the concept and just adds confusion.
I'm not sure "cache" is much better, though it is a little better.

staging vs index vs cache

Posted Apr 17, 2009 16:58 UTC (Fri) by giraffedata (subscriber, #1954)

Learning Git, I wasted far too much time and effort trying to reconcile what I was reading about "the index" with my understanding of "index" as a common English (and data processing) word.

In fact, I don't think I ever did figure out what the index is; I eventually had to give up on Git for lack of quality documentation. I'm looking forward to reading this book, especially now that I know the author isn't afraid to improve on classic terminology. Maybe, unlike so many engineers, he's able to place himself in the shoes of a beginner.

Book Review: Pragmatic Version Control Using Git

Posted Apr 16, 2009 16:04 UTC (Thu) by bronson (subscriber, #4806)

Oof, that's really too bad. The index / cache / staging ambiguity is probably the single most confusing part of trying to learn git. That's where I've had to spend most of my time explaining things anyway; merge commits are a distant second.

Most people have memorized "git diff --cached" as an idiom. The word "cache" doesn't appear anywhere in git add's help page, the word "index" doesn't appear anywhere in the help for --cached, and "staging" has been scattered all throughout the docs but rarely connects to "index" or "cache".

Now, what does "git-diff-index --cached" do? Why should this require yet another trip to the docs? It's a mess.

Consistent naming would make Git a lot more usable.

Unfortunately, this leads to quite a poor first impression. It has contributed to a lot of people believing that Git internals are complex and chaotic when they're actually simple and rather pretty.

Is my experience atypical? Is there any other concept that causes as much trouble for people trying to learn git? (an honest question, not rhetorical at all!)

Book Review: Pragmatic Version Control Using Git

Posted Apr 19, 2009 1:38 UTC (Sun) by jlokier (guest, #52227)

Your experience is similar to mine. I tried to learn Git twice, and decided it wasn't worth it both times. (I'll try again one of these days...)

I could understand some of the excellent tutorials which are around. The concepts are nice, and easy enough. I love data structures. But the terminology and command structure didn't help: I'd never have confidence that I wasn't going to accidentally wreck my repository - or someone else's - by accidentally not using the right set of obscure options to each command.

Book Review: Pragmatic Version Control Using Git

Posted Apr 23, 2009 14:37 UTC (Thu) by Duncan (guest, #6647)

I figured this was basic *ix sysadmin/developer strategy, but as soon as I
begin to get familiar enough with a command set I'm going to be using
regularly to discern what commands and variants I'll be using frequently,
I generally set up scriptlet/alias wrappers expressing the commands as I
interact with and think of them. Characteristically, these wrappers are
far shorter than the original commands, and devoid of slightly difficult
to touch-type characters like =/-/+. I might use g* to indicate git
commands, and gru for git remote update, for instance.
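A minimal sketch of such wrappers follows. It uses shell functions in place of aliases, since functions also work in non-interactive scripts; only the name gru comes from the comment above, and the other short names are hypothetical.

```shell
#!/bin/sh
# Hypothetical short wrappers; only "gru" is named in the comment above.
gru() { git remote update "$@"; }    # fetch from all configured remotes
gst() { git status --short "$@"; }   # compact status listing
gdc() { git diff --cached "$@"; }    # show what is currently staged
```

Keeping definitions like these in a file of their own and sourcing it from the shell startup file makes each one available as an ordinary command, with any extra arguments passed straight through to git.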

When there are enough scriptlet/alias task families that the short-form
namespaces start colliding, it's time to make them switchable. For instance,
put the scriptlets in their own subdir and add it to the path, or source the
alias file, to turn them on; remove the subdir from the path and run hash -r,
or source the alias unsetter file, to turn them off. It's also easy enough
to extend any supplied tab-completion scripts to include the wrappers as
well, making things even easier, as one then has /both/ tab-completion and
naturally/logically expressed wrappers to help them.

One quickly finds themselves with a set of 2-4 letter commands, obtuse to
others perhaps, but easy enough for the individual to work with as they've
been effectively customized to match his own thought processes. This
technique has certainly served me well over a reasonable range of task
families here, and generally, one either finds themselves learning the
commands and options the individual letters correspond to almost by
accident, if the mapping to the original commands is close enough (as I
did with my distribution's package management commands and options, for
instance), or can quickly look up the wrapper to see what it does and
invoke the appropriate original command with appropriate modification as
necessary (as I've done with my kernel fetch/config/compile/install
scripts, for instance).

Book Review: Pragmatic Version Control Using Git

Posted May 12, 2009 0:56 UTC (Tue) by fbriere (subscriber, #4961)

> Is my experience atypical? Is there any other concept that causes as much trouble for people trying to learn git? (an honest question, not rhetorical at all!)

As much as I love git, it's been over a year, and I still occasionally get confused with checkout vs. reset, path vs. no path, --hard vs. --mixed, etc.

And wait until you realize that git will gladly let you commit to a detached HEAD. Be prepared to fish from the reflog if you checkout another branch before realizing what you did. (This was deemed a "cool" feature on the mailing-list, and is therefore unlikely to go away.)
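The fishing expedition can be sketched as follows, assuming a reasonably recent Git (the --detach and --short options used here postdate the version discussed in the review). The file name, commit messages, and the branch name "rescued" are made up for the example.

```shell
#!/bin/sh
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Jane Hacker"        # placeholder identity
git config user.email "jane@example.org"

echo one > file.txt
git add file.txt
git commit -q -m "First commit"
branch="$(git symbolic-ref --short HEAD)"   # master or main, depending on configuration

# Detach HEAD and commit; no branch points at the new commit.
git checkout -q --detach
echo two >> file.txt
git commit -q -a -m "Commit made on a detached HEAD"

# Switching away leaves that commit reachable only through the reflog.
git checkout -q "$branch"
git reflog -3                   # recent HEAD history, including the stranded commit

# HEAD@{1} is where HEAD pointed before the last checkout; give it a branch name.
git branch rescued 'HEAD@{1}'
git log --oneline rescued
```

Once a branch points at the commit, it is safe from garbage collection and can be merged or rebased like any other work.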

Book Review: Pragmatic Version Control Using Git

Posted Apr 14, 2009 13:57 UTC (Tue) by vonbrand (subscriber, #4458)

First off, I got this book for review as a teaching/reference aid here at UTFSM. I had planned to buy it anyway.

All in all, I'm quite happy with the book (minor nits as given by Our Illustrious Reviewer notwithstanding). But what I do really miss is discussion of the workflow (quite normal in our local work, and heavily used in "real projects" like git itself) of sending patches by email and having an "integrator" who then publishes the result (after polishing, if need be), along with the tools used.

Book Review: Pragmatic Version Control Using Git

Posted Apr 19, 2009 21:11 UTC (Sun) by oak (subscriber, #2786)

> It especially shines as a quick reference to commands needed daily or
nearly so

Why would one need a reference book for Git commands that one uses (almost)
daily?

(I'm a newbish Mercurial user and at least its command line help seems
quite sufficient as a reference.)

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds