Bazaar on the slow track

Posted Sep 11, 2012 21:31 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
Parent article: Bazaar on the slow track

Another point - bzr has a very confusing model. It's a DVCS for projects with the central 'trunk'.

Git is conceptually simple (it's just a list of hash-linked diffs between revisions). Mercurial is more complicated (it tracks tree changes) but by now it's not really that different from git in functionality.

Bzr stands out among them. And for reasons that are not really clear.


Bazaar on the slow track

Posted Sep 11, 2012 21:42 UTC (Tue) by juliank (subscriber, #45896) [Link]

It supports automatically pushing stuff on commit, but that's not really required or needed. You can just use bzr branch instead of bzr checkout to create a branch that is disconnected from the parent one.

Bazaar just adopted common terminology: checkout works the same way as in svn, branch (aka get/clone, but those are deprecated) works like clone in git.

Bazaar on the slow track

Posted Sep 11, 2012 22:03 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

No, I'm talking about the way it does versioning. There is no clear way to unambiguously refer to a commit, because version numbers are repository-dependent and are clearly designed with the idea of the 'central' repository.

For example: http://bazaar.launchpad.net/~bzr-pqm/bzr/bzr.dev/view/hea...
You see that this file has revision 6558. This revision number is repository-local, as there's no way to create a distributed numbering algorithm without synchronization points (mathematically, bzr revision numbers form a totally ordered set). This fact underlies the whole bzr design: it's ridiculously hard to work in a truly distributed manner with bzr. There's even that scary threat of renumbering, where numbers in the trunk _change_.

In comparison, hg and git are truly distributed - they're using hashes to identify commits: http://selenic.com/repo/hg/rev/8fea378242e3 This design makes sure that there's no single global ordering of commits, but there is always a clearly-defined local ordering (i.e. git/hg commits form a partially ordered set).
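To make the distinction concrete, here is a toy Python sketch. It assumes a hypothetical `commit_id` (a SHA-1 over parent IDs and content, which is not git's or hg's real object format) and a hypothetical `Repo` that numbers commits by local arrival order. That is not how bzr actually assigns revision numbers; it only illustrates why a sequential number depends on the repository while a hash does not.

```python
import hashlib


def commit_id(parents, content):
    """Hypothetical commit ID: hash of the parent IDs plus the content."""
    h = hashlib.sha1()
    for p in sorted(parents):
        h.update(p.encode())
    h.update(content.encode())
    return h.hexdigest()


class Repo:
    """Toy repository that numbers commits in the order it received them."""

    def __init__(self):
        self.order = []  # local arrival order

    def add(self, cid):
        if cid not in self.order:
            self.order.append(cid)

    def revno(self, cid):
        # Sequential number, meaningful only inside this repository.
        return self.order.index(cid) + 1


root = commit_id([], "initial")
a = commit_id([root], "alice's change")
b = commit_id([root], "bob's change")

alice, bob = Repo(), Repo()
for cid in (root, a, b):  # Alice sees her own commit first
    alice.add(cid)
for cid in (root, b, a):  # Bob sees his own commit first
    bob.add(cid)

# The same commit gets different local numbers but one global hash.
print(alice.revno(a), bob.revno(a))
print(a)
```

Neither `a` nor `b` is an ancestor of the other, so the hashes impose no order between them; only the parent links do, which is the partial order mentioned above.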

Bazaar on the slow track

Posted Sep 11, 2012 22:23 UTC (Tue) by james_w (subscriber, #51167) [Link]

Click "View revision" and you will find the unique id of that revision: pqm@pqm.ubuntu.com-20120905205226-8s3bzolvduug3ifj. That id will never change. The revision numbers are just for convenience when you know the context you are talking about. If you don't know the context then use the ids.

Bazaar on the slow track

Posted Sep 11, 2012 22:32 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Yet almost all bzr tools insist on working with and showing revision numbers rather than the unwieldy global IDs. There are lots more smaller annoyances in bzr that are simply not present in git/hg.

For instance, try to google your ID - it's not present in any publicly crawled repository viewers.

Bazaar on the slow track

Posted Sep 11, 2012 23:10 UTC (Tue) by marcH (subscriber, #57642) [Link]

I've used CVS and SVN extensively. I've used git extensively. And also a few others from time to time. In theory this should give me plenty of background to easily understand Bazaar's data model(s), shouldn't it? Yet no. The few times I tried to run a few basic Bazaar commands I got completely confused by this very strange mix of centralized AND distributed models. Or by other things? I guess I'll never know.

I much prefer a simple, clear and sound data model over a familiar and supposedly "user-friendly" interface.

On the other hand I've met a significant number of people who want to know as little as possible about version control *in general* (I know this is wrong but what can you do?). They just want to run the same very small subset of commands again and again to publish their work, with absolutely no interest in what happens behind the scenes or in any other actual version control feature. git's complex and inconsistent command line makes their life extremely difficult. They would probably much prefer Bazaar. As noted in the article, this type of luser would also be extremely unlikely to contribute to any VC tool in any way.

Great quote: "A common response I get to complaints about Git’s command line complexity is that “you don’t need to use all those commands, you can use it like Subversion if that’s what you really want”. Rubbish. That’s like telling an old granny that the freeway isn’t scary, she can drive at 20kph in the left lane if she wants."
http://steveko.wordpress.com/2012/02/24/10-things-i-hate-...

Sometimes I wish I weren't that comfortable with git because that makes me too lazy now to try and learn Mercurial...

Bazaar on the slow track

Posted Sep 12, 2012 2:33 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

> Sometimes I wish I weren't that comfortable with git because that makes me too lazy now to try and learn Mercurial...

I tried using it for a few hours and my git experience made me lose some changes[1]. I'd call it data loss, but the devs consider it just a bug. Sure, they're both a DVCS, but I think starting from step one is probably easier when learning a new one than trying to make analogies based on my experience with darcs (XMonad repos and some other Haskell stuff), hg (tried to make a patch for mutt and udiskie), and one attempt to do something with bzr (don't even remember what it was, but zsh's vcs_info plugin for it is/was molasses (as in 5s for a prompt to appear with just the branch name and "dirty" status)). Every time, I get the feeling "I'd rather use git", but that's probably just familiarity talking.

[1]http://bz.selenic.com/show_bug.cgi?id=3423

Bazaar on the slow track

Posted Sep 11, 2012 23:32 UTC (Tue) by nix (subscriber, #2304) [Link]

> it's just a list of hash-linked diffs between revisions
ITYM 'it's just a parent-linked tree of filesystem tree snapshots in a content-addressable store'.

(Sure, the *pack implementation* happens to be delta-compressed, but one of git's very nice features, shared with bzr as it happens, is that this is not visible to the user at all: the conceptual model and the storage mechanism are completely decoupled. Recently-added (loose) objects, note, are gzipped but not delta-compressed at all, but the user need not care.)
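A minimal sketch of that conceptual model, in Python. The JSON serialization and the `put`, `get` and `commit` helpers are assumptions made for illustration and do not match git's real object format; the point is only that every commit names a complete tree snapshot plus its parents, and that object IDs are derived from content.

```python
import hashlib
import json

store = {}  # content-addressable store: object ID -> serialized bytes


def put(obj):
    """Store an object under the hash of its serialized content."""
    data = json.dumps(obj, sort_keys=True).encode()
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = data
    return oid


def get(oid):
    return json.loads(store[oid])


def commit(files, parents, message):
    """Record a full snapshot of the tree, linked to its parent commits."""
    tree_id = put({"type": "tree", "files": files})
    return put({"type": "commit", "tree": tree_id,
                "parents": parents, "message": message})


c1 = commit({"README": "hello"}, [], "first")
c2 = commit({"README": "hello", "main.c": "int main(){}"}, [c1], "add main.c")

# Reading a commit yields the whole tree; no diff is ever consulted.
snapshot = get(get(c2)["tree"])["files"]
print(sorted(snapshot))
```

How the bytes in `store` are compressed on disk (loose, packed, delta-compressed) could change without touching any of these IDs, which is the decoupling described above.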

Bazaar on the slow track

Posted Sep 13, 2012 7:51 UTC (Thu) by mbp (subscriber, #2737) [Link]

bzr's model is a DAG with special emphasis given to the path from the branch tip back to the origin through left-hand ancestors. It's quite mathematical and not that hard to understand.
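As a rough illustration of "the path from the branch tip back to the origin through left-hand ancestors", here is a Python sketch over a made-up history. The revision names and the `parents` table are hypothetical, and this is not bzr's implementation; it only shows the walk itself.

```python
# Hypothetical history: each revision lists its parents, left-hand parent first.
parents = {
    "r1": [],
    "r2": ["r1"],
    "f1": ["r1"],        # feature branch forked from r1
    "f2": ["f1"],
    "r3": ["r2", "f2"],  # merge of the feature branch into trunk
}


def mainline(tip):
    """Follow left-hand (first) parents from the tip back to the origin."""
    path = []
    rev = tip
    while rev is not None:
        path.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    return path


print(mainline("r3"))  # ['r3', 'r2', 'r1']
```

The feature-branch revisions `f1` and `f2` remain in the DAG as ancestors of `r3`; they are simply not on the path that the walk treats as the main history.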

In many projects, after a patch/feature/fix is merged to trunk, the history of just how that patch was written becomes relatively unimportant: to start with, people looking at history just want to see "jdoe fixed bug 123". One approach is to literally throw that history away and just submit a plain patch, as is often done with git. I wanted to try something different that would keep all the history, but also have a view of which path through the DAG was the main history. (You can also do the former in bzr of course.)

The other major difference with bzr is that revisions are hashed for integrity, but primarily identified by assigned ids. This avoids transitions when the representation changes and allows directly talking about revisions in foreign systems. But, hash or not, they still have globally unique ids.

Bazaar on the slow track

Posted Sep 13, 2012 19:42 UTC (Thu) by dlang (✭ supporter ✭, #313) [Link]

> One approach is to literally throw that history away and just submit a plain patch, as is often done with git. I wanted to try something different that would keep all the history, but also have a view of which path through the dag was the main history.

note that git doesn't force you to throw away the history.

If you pull from the mainline, create your patch, and send a pull request, your history will show up in the main repository.

you can even edit your patch history prior to sending the pull request. This is commonly done by people doing major changes as it lets them clean things up and make each patch 'correct' and self contained rather than showing the reality where one patch may introduce a bug that's fixed several patches later.

the only question is defining "which path was the main history" because git really doesn't define a "main history".

Bazaar on the slow track

Posted Sep 19, 2012 14:22 UTC (Wed) by pboddie (subscriber, #50784) [Link]

> note that git doesn't force you to throw away the history.

I think that the intention may have been to describe the apparently common practice, particularly amongst git-using projects, of aggressively rebasing everything and asking people to collapse their contributions into a single patch so that the project history is kept "clean".

Bazaar on the slow track

Posted Sep 19, 2012 15:31 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

one of the features of git is the ability to recreate and combine patches before pushing them upstream.

Yes, this can be abused to combine a huge amount of work into one monster patch.

But it can be used sanely to re-order and combine patches from a line of development into a clean series of logical patches.

When you are developing something, if you check it in frequently as you go along, you are going to have cases where you introduce a bug at one point and don't find and fix it for several commits. You are also going to have things that you do at some point that you later find were not the best way to do something, and that you change back (while wanting to keep the other work you have done in the meantime).

you now have the option of either pushing this history, including all your false starts, bugs, etc.

Or you can clean the history up, combining the bugfixes with the earlier patches that introduced the bug, eliminating the false starts, etc and push the result.

The first approach has the advantage that everything is visible, but it has the disadvantage that there are a lot of commits in the history where things just don't work.

If the project in question encourages the use of bisect to track down problems, having lots of commits where things just don't work makes it really hard for users trying to help the developers track down the bugs.

As a result, many projects encourage the developers to take the second approach.

Now, many developers misunderstand this to mean that they are encouraged to rebase their entire development effort into one monster patch relative to the latest head, but that's actually a bad thing to do.

And in any case, the history is still available to the developer, they are just choosing not to share that history with the outside world.

Bazaar on the slow track

Posted Sep 19, 2012 19:52 UTC (Wed) by smurf (subscriber, #17840) [Link]

What he said.

A "clean" history (meaning "to the best of my knowledge, every change transforms program X1 into a strictly better program X2") means that you can take advantage of one of git's main features when you do find a regression.

Bisecting.

If you do break something, "git bisect" requires ten compile-test-run cycles to find the culprit, among a thousand changes. Or twenty cycles if you have a million changes. (OK, more like 13 and 30, because history isn't linear, but you get the idea.) If you try to keep track of that manually you'd go bonkers.
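The arithmetic behind those numbers can be sketched in Python. This is a simplified model that assumes a strictly linear history and a hypothetical `first_bad` index standing in for the compile-test-run cycle; it is not git's actual bisection algorithm, which has to pick midpoints in a revision graph.

```python
def bisect(n_changes, first_bad):
    """Binary-search a linear history for the first bad change.

    Returns (index of the first bad change, number of tests run).
    """
    lo, hi = 0, n_changes - 1  # the first bad change lies in [lo, hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one compile-test-run cycle
        if mid >= first_bad:  # stand-in for "the build at mid is broken"
            hi = mid
        else:
            lo = mid + 1
    return lo, tests


for n, culprit in ((1_000, 637), (1_000_000, 123_456)):
    found, tests = bisect(n, culprit)
    print(n, found, tests)
```

Each test halves the candidate range, so the worst case is the base-2 logarithm of the number of changes rounded up: at most 10 tests for a thousand changes and at most 20 for a million.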

Of course this isn't restricted to git. bzr and hg also implemented the command. The idea was too good not to. ;-)
I don't know how well they do in finding reasonable bisection points in a complex revision graph; git's algorithm is very good these days.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds