It's a common developer practice to track down a bug by looking for the change that introduced it. This is most efficiently done by performing a binary search between the last known working commit and the first known broken commit in the commit history. git bisect is a feature of the Git version control system that helps developers do just that.
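To make the binary search concrete, here is a minimal shell sketch. It is not Git's actual implementation: it assumes a strictly linear history whose commits are numbered 1 to 100, and a hypothetical is_good test that stands in for whatever check the developer performs. Real histories contain merges, so git bisect has to pick the next commit more carefully.

```shell
#!/bin/sh
# is_good INDEX: hypothetical stand-in for "build and test this commit".
# In this toy history, commits 1..6 work and commit 7 introduced the bug.
is_good() {
    [ "$1" -lt 7 ]
}

# first_bad GOOD BAD: GOOD is a known working commit index, BAD a known
# broken one. Halve the interval until the two are adjacent.
first_bad() {
    good=$1
    bad=$2
    steps=0
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        steps=$((steps + 1))
        if is_good "$mid"; then
            good=$mid
        else
            bad=$mid
        fi
    done
    echo "first bad: $bad (tested $steps commits)"
}

first_bad 1 100   # prints: first bad: 7 (tested 7 commits)
```

Each test halves the range of suspects, so the number of tests grows only with the logarithm of the number of commits between the good and bad endpoints.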
git bisect may also be well known by LWN readers for heated discussions on the Linux kernel mailing list about "asking" (or "forcing" depending on the point of view) users to find the bad commit when they report a regression. But a little-known addition, git bisect run, can allow a developer to completely automate the process. This can be very useful and may enable switching to interesting new debugging workflows.
At each step of the binary search, git bisect checks out the source code at the commit chosen by the search. The user then has to test whether the software works at that commit. If it does, the user runs git bisect good; otherwise they run git bisect bad, and the search proceeds accordingly. git bisect run works differently: it uses a script or a shell command to determine whether the source code (which git bisect checked out automatically) is "good" or "bad".
This idea was suggested by Bill Lear in March 2007, and I implemented it shortly thereafter. It was then released in Git 1.5.1.
Technically, the script or command passed to git bisect run is run at each step of the bisection process, and its exit code is interpreted as "good" if it is 0, or "bad" otherwise. There are two exceptions: 125 asks for the current commit to be skipped, and values greater than 127 abort the bisection. See the git bisect documentation for more information.
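The mapping from exit codes to verdicts can be summarized in a few lines of shell. This only illustrates the rules given in the git bisect documentation; classify_exit_code is a hypothetical name, not something Git provides.

```shell
#!/bin/sh
# classify_exit_code CODE: print how "git bisect run" treats the exit
# code CODE of the test command (hypothetical helper, for illustration).
classify_exit_code() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo good
    elif [ "$code" -eq 125 ]; then
        echo skip      # the commit cannot be tested
    elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
        echo bad
    else
        echo abort     # the bisection stops instead of recording a verdict
    fi
}

for code in 0 1 2 125 127 128 255; do
    printf '%s -> %s\n' "$code" "$(classify_exit_code "$code")"
done
```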
One simple and yet useful way to take advantage of that is to use git bisect run to find which commit broke the build. Some kernel developers like this very much. Ingo Molnar wrote:
For example, with a not too old Git (version 1.5.2 or greater), bisecting a build bug in the Linux kernel may be just a matter of launching:
git bisect start linux-next/master v2.6.26-rc8
git bisect run make kernel/fork.o
because the git bisect start command, when it is passed two (or more) revisions, here "linux-next/master" and "v2.6.26-rc8", interprets the first one as "bad" and the other ones as "good".
This works as follows: git bisect checks out the source code of a commit to be tested, then runs make kernel/fork.o. make will exit with code 0 if it builds, or with something else (usually 2) otherwise. This gets recorded as "good" or "bad" for the commit that was checked out, which will enable the binary search to continue by finding another commit to check out, then run make again, and so on, until the first "bad" commit in the history is found.
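The same loop can be tried safely without a kernel tree. The sketch below is a hypothetical demo, not part of the original example: it creates a throwaway repository in a temporary directory, commits eight versions of a file named file.txt, plants a marker word (BROKEN) from the sixth commit onward, and lets git bisect run find that commit. The file name, the marker, and the bisect_demo function are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical demo: build a throwaway repository with 8 commits in which
# commit 6 introduces a "bug" (the word BROKEN appears in file.txt), then
# let "git bisect run" find it. Nothing here touches an existing repository.
bisect_demo() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        git config user.name "Bisect Demo"
        git config user.email "demo@example.com"
        i=1
        while [ "$i" -le 8 ]; do
            if [ "$i" -ge 6 ]; then
                echo "BROKEN $i" > file.txt
            else
                echo "ok $i" > file.txt
            fi
            git add file.txt
            git commit -q -m "commit $i"
            i=$((i + 1))
        done
        oldest=$(git rev-list --max-parents=0 HEAD)
        # The first revision is taken as "bad", the second as "good".
        git bisect start HEAD "$oldest"
        # The test command exits 0 ("good") while BROKEN is absent,
        # and 1 ("bad") once it appears.
        git bisect run sh -c '! grep -q BROKEN file.txt'
        git bisect reset >/dev/null 2>&1
    )
    status=$?
    rm -rf "$repo"
    return "$status"
}

bisect_demo
```

The output ends with a line naming the first bad commit, followed by that commit's description, which here is the one whose message is "commit 6".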
Bisecting regressions that show up only in the running code, as opposed to build problems, is usually more complicated. You will probably have to write a test script and pass it to git bisect run.
For example, a test script for an application built with make and printing on its standard output might look like this:
#!/bin/sh
# an exit code of 125 asks "git bisect" to "skip" the current commit
make || exit 125
# run the application and check that it produces good output
./my_app arg1 arg2 | grep 'my good output'
See this message from Junio Hamano, the Git maintainer, for explanations and a real world example of git bisect run used to find a regression in Git. The git bisect documentation has some short examples too.
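One thing to watch for in test scripts in general: a program killed by a signal makes the shell report an exit code above 127, which stops the bisection instead of marking the commit "bad". A slightly more defensive variant is sketched below. The bisect_step helper is hypothetical; it takes the build command and the test command as strings, reports 125 when the build fails, and clamps every test failure to 1.

```shell
#!/bin/sh
# bisect_step BUILD_CMD TEST_CMD (hypothetical helper)
#   returns 125 if BUILD_CMD fails (commit is untestable, skip it),
#   returns 0   if TEST_CMD succeeds (commit is good),
#   returns 1   for any TEST_CMD failure, including crashes (commit is bad).
bisect_step() {
    build_cmd=$1
    test_cmd=$2

    # A commit that does not build tells us nothing about the regression.
    sh -c "$build_cmd" >/dev/null 2>&1 || return 125

    if sh -c "$test_cmd" >/dev/null 2>&1; then
        return 0
    fi
    # Clamp every failure to 1 so that exit codes above 127 never
    # abort the bisection.
    return 1
}

# Possible last line of a script passed to "git bisect run":
# bisect_step 'make' "./my_app arg1 arg2 | grep 'my good output'"
```

Whether a crash should count as "bad" or as "skip" depends on the bug being hunted; this sketch assumes that any failure of the test command is the regression.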
It's even trickier for kernel hackers, because you have to reboot the computer each time you want to test a new kernel, but some kernel hackers suggest that it be used anyway if the problem is "reproducible, scriptable, and you have a second box". Ingo Molnar describes his bisection environment this way:
So it's possible to use git bisect run on a wide array of applications. This means that, for example, your nightly builds can automatically find the commit that broke the build or the test suite, and then use information from that commit to send a warning email to the developer responsible for it.
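A nightly job along these lines might look like the following sketch. The report_culprit function and its argument order are assumptions made for illustration; it relies on the fact that git bisect records its current "bad" commit in refs/bisect/bad, which points at the first bad commit once the run has succeeded. Actually sending the email is left out, since that depends on the local mail setup.

```shell
#!/bin/sh
# report_culprit GOOD BAD CMD... (hypothetical helper)
# Bisect between GOOD and BAD using CMD as the test, then print the
# hash and author email of the first bad commit. Must be run inside
# the repository to be bisected.
report_culprit() {
    good=$1
    bad=$2
    shift 2

    git bisect start "$bad" "$good" >/dev/null || return 1

    status=0
    git bisect run "$@" >/dev/null 2>&1 || status=$?

    if [ "$status" -eq 0 ]; then
        # refs/bisect/bad now names the first bad commit.
        git log -1 --format='%H %ae' refs/bisect/bad
    fi

    git bisect reset >/dev/null 2>&1
    return "$status"
}

# Possible use in a nightly script (the tag name is made up):
# report_culprit last-good-nightly HEAD make test
```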
But what may be more interesting is that fully automated bisection may enable new workflows. On the git mailing list, Andreas Ericsson, a Git developer, reported:
So it requires a little more work to make sure that every commit is small and easily bisectable. Then, to debug regressions, they follow these steps:
This may seem more complicated than a traditional workflow. But when asked about it, Andreas says:
So this kind of workflow makes good use of the test cases you write. But what about overall productivity? Four months after having said that he uses git bisect run, Andreas Ericsson wrote that git bisect "is well-nigh single-handedly responsible for reducing our average bugreport-to-fix time from 4 days to 6 hours".
Now, after more than one year of using it, he gives the following details:
So quality costs, but, when using the right tools and workflows, it can bring in a rather nice return on investment!
Fully automated bisecting with "git bisect run"
Posted Feb 5, 2009 2:13 UTC (Thu) by ncm (subscriber, #165) [Link]
Fully automated bisecting with "git bisect run"
Posted Feb 5, 2009 3:48 UTC (Thu) by dtlin (✭ supporter ✭, #36537) [Link]
Fully automated bisecting with "git bisect run"
Posted Feb 5, 2009 5:58 UTC (Thu) by christian_couder (subscriber, #56350) [Link]
Exit code 125 tells "git bisect" to use "git bisect skip". "git bisect skip" marks the current commit as untestable and checks out another one nearby to be tested.
See the "git bisect" documentation for more information:
http://www.kernel.org/pub/software/scm/git/docs/git-bisec...
Fully automated bisecting with "git bisect run"
Posted Feb 5, 2009 7:44 UTC (Thu) by dlang (subscriber, #313) [Link]
how big a problem this is depends on what your failure condition is.
if you know that when it fails it generates message X then you just look for message X and mark everything else as 'good' (it may actually crash and not do anything useful, but it's not the bug you are looking for)
if you are looking for a hang (or failure to boot like Ingo did in one example) then it's harder, you may end up going down the wrong path because some other bug is causing the problem (failing to boot in this example)
Bisect on patchset boundaries ?
Posted Feb 6, 2009 10:33 UTC (Fri) by lbt (subscriber, #29672) [Link]
A bisect that throws you into the middle of a patch set that messes with your filesystem is a dangerous place to be; especially when you ask 'normal' users to run a bisect on their everyday machines without warning them that they are potentially about to expose their data to random collections of code.
The problem is that all patches are supposed to be non-toxic - and git is deliberately not good at revisionist history ;)
That means that marking 'safe' bisect points is hard - but maybe a step in the right direction would be worth cutting on patchset boundaries?
Or maybe an external (well, inside .git/) list of 'good' commits eg rc releases?
Obviously this would make the bisect slightly less efficient but it may reduce the risk.
Other similar tools
Posted Feb 5, 2009 10:30 UTC (Thu) by epa (subscriber, #39769) [Link]
An alternative is to randomly generate in-between files to find what difference causes the change. I believe DD.py <http://www.st.cs.uni-saarland.de/dd/ddusage.php3> is a tool for doing this.
Finally, a plug for delta <http://delta.tigris.org/> which isn't quite the same thing, but will automatically generate a minimal test case given a larger one.
Fully automated bisecting with "git bisect run"
Posted Feb 7, 2009 18:19 UTC (Sat) by oak (guest, #2786) [Link]
Fully automated bisecting with "git bisect run"
Posted Feb 9, 2009 18:19 UTC (Mon) by droundy (subscriber, #4559) [Link]
Fully automated bisecting with "git bisect run"
Posted Feb 8, 2009 16:06 UTC (Sun) by wfranzini (guest, #6946) [Link]
People interested in a workflow that includes tests can look at Aegis (http://aegis.sf.net/). It also has an aebisect(1) command that runs the command being investigated.
Fully automated bisecting with "git bisect run"
Posted Feb 10, 2009 4:41 UTC (Tue) by christian_couder (subscriber, #56350) [Link]
The long usage message is:
$ git bisect help
Usage: git bisect [help|start|bad|good|skip|next|reset|visualize|replay|log|run]
git bisect help
        print this long help message.
git bisect start [<bad> [<good>...]] [--] [<pathspec>...]
        reset bisect state and start bisection.
git bisect bad [<rev>]
        mark <rev> a known-bad revision.
git bisect good [<rev>...]
        mark <rev>... known-good revisions.
git bisect skip [(<rev>|<range>)...]
        mark <rev>... untestable revisions.
git bisect next
        find next bisection to test and check it out.
git bisect reset [<branch>]
        finish bisection search and go back to branch.
git bisect visualize
        show bisect status in gitk.
git bisect replay <logfile>
        replay bisection log.
git bisect log
        show bisect log.
git bisect run <cmd>...
        use <cmd>... to automatically bisect.
Please use "git help bisect" to get the full man page.
So you have to use "git bisect run <cmd>..." to automatically run the command you investigate. And if you don't want to automatically run the command, you can test by yourself at each step of the binary search and then use "git bisect good" or "git bisect bad" or "git bisect skip" depending on the result of your tests.
Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds