
Development

Race detection and more with ThreadSanitizer 2

By Nathan Willis
May 14, 2014

Race conditions are notoriously difficult to catch, in part because they are hard to spot and reproduce through traditional testing. For example, they may reveal themselves only under specific, hard-to-reproduce timing conditions. Consequently, some means of automatically detecting race conditions is a sure-fire way to spare developers hours of labor. A team from Google has developed a tool designed to help catch at least a significant portion of race conditions in C, C++, and Go programs. The first incarnation of the tool, ThreadSanitizer (TSan), was built on top of Valgrind. Its recently unveiled update works at compile time instead, as an option that is available in both GCC and LLVM.

Google released its first implementation of TSan in 2009. It was a binary translation tool implemented for the Valgrind debugger that did dynamic detection of data races (as opposed to, say, race conditions in file access or threads competing for network resources). Specifically, TSan monitored the executing code of a program, noting memory-access events and synchronization events (such as locking events or signals), keeping track of the program state. When it noted, for example, two threads accessing the same object without proper locking, it would report the context of the potential race.

The algorithm used is described in a 2009 paper [PDF]; it was based on the algorithm used by the Helgrind race detector with improvements made for both speed and accuracy. The key question the algorithm tries to answer is whether or not two events have a "happens-before" relationship—that is, whether or not they can be demonstrably placed in order. In a single thread, this question is easy; with multiple concurrent threads, the state machine must keep track of each thread's context, assemble sequences of happens-before events, and try to connect discrete happens-before events into transitive sequences. All of this overhead contributes to execution slowdown, so the challenge is increasing the accuracy of the race-detection without reducing speed to a crawl.
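The happens-before question can be illustrated with vector clocks, a standard textbook technique for ordering events across threads. The sketch below is only a model under that assumption; it is not taken from the TSan paper, and the names ThreadClock and happens_before are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadClock:
    """Hypothetical per-thread vector clock: thread id -> logical time."""
    tid: int
    times: dict = field(default_factory=dict)

    def tick(self):
        # Advance this thread's own component for each recorded event.
        self.times[self.tid] = self.times.get(self.tid, 0) + 1

    def snapshot(self):
        # Copy of the clock, used to stamp an event or a lock release.
        return dict(self.times)

    def join(self, released):
        # Acquire side of a synchronization event: absorb everything
        # the releasing thread had seen, which makes ordering transitive.
        for tid, t in released.items():
            self.times[tid] = max(self.times.get(tid, 0), t)


def happens_before(earlier, later):
    """True if the event stamped `earlier` is ordered before `later`."""
    return earlier != later and all(
        later.get(tid, 0) >= t for tid, t in earlier.items()
    )


if __name__ == "__main__":
    a, b, c, d = (ThreadClock(i) for i in (1, 2, 3, 4))

    a.tick()
    write_a = a.snapshot()        # thread 1 writes, then releases lock 1
    b.join(a.snapshot())          # thread 2 acquires lock 1
    b.tick()
    c.join(b.snapshot())          # thread 3 acquires a lock thread 2 released
    c.tick()
    read_c = c.snapshot()
    d.tick()
    write_d = d.snapshot()        # thread 4 never synchronizes

    print(happens_before(write_a, read_c))   # ordered through thread 2
    print(happens_before(write_a, write_d) or happens_before(write_d, write_a))
```

The first check succeeds only because thread 3 inherited thread 1's history through thread 2; the second shows two events that cannot be placed in order, which is the situation a race detector flags.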

The team used TSan to find nearly 180 bugs in various components of Google's Chromium browser, but the tool was slow enough (slowing down execution by a factor of 20 to 300, according to a 2012 presentation [PDF]) that it was not feasible to use in full-blown browser testing. Thus, the team set out to rewrite TSan from scratch as a compile-time tool.

The TSan v2 project is hosted at Google Code, although the runtime library is actually developed inside the LLVM trunk, and is imported into the GCC trunk, so it is available in both compilers upstream. TSan v2 is BSD-licensed, and there are instructions available to compile it into GCC and Clang, if recent enough packages are not available from one's distribution.

For either compiler, TSan can be activated with the -fsanitize=thread switch, and the program must be built as a position-independent executable as well (via the -fPIE switch when compiling and -pie when linking). The tool works by instrumenting almost every memory access in the target program, omitting the ones that are provably race-free (such as reads of constant global variables) or are redundant (such as a read that is followed immediately by a write to the same location). When the instrumented build is executed, the runtime library then starts a state machine to track all accesses to the instrumented memory regions, recording the thread ID (an integer assigned by TSan) and time stamp of each access, the access size and offset within the region, and whether or not the access is a write.
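The two omission rules can be modeled as a filter over a straight-line sequence of accesses. This is a simplified illustration, not the compiler pass itself: the Access type, the symbolic location names, and the accesses_to_instrument function are all hypothetical, and a real pass works on compiler IR rather than strings.

```python
from typing import NamedTuple


class Access(NamedTuple):
    kind: str       # "read" or "write"
    location: str   # hypothetical symbolic name for a memory location


def accesses_to_instrument(accesses, constant_globals=frozenset()):
    """Return the accesses that would still need a runtime check."""
    kept = []
    for i, acc in enumerate(accesses):
        nxt = accesses[i + 1] if i + 1 < len(accesses) else None
        if acc.kind == "read" and acc.location in constant_globals:
            # Provably race-free: nothing ever writes this location.
            continue
        if (acc.kind == "read" and nxt is not None
                and nxt.kind == "write" and nxt.location == acc.location):
            # Redundant: the write that follows immediately gets checked.
            continue
        kept.append(acc)
    return kept


if __name__ == "__main__":
    # Roughly what `counter += 1` guarded by a flag might look like.
    block = [
        Access("read", "kTableSize"),
        Access("read", "counter"),
        Access("write", "counter"),
        Access("read", "flag"),
    ]
    for acc in accesses_to_instrument(block, {"kTableSize"}):
        print(acc.kind, acc.location)
```

In this model only the write to counter and the read of flag remain, which shows how the omissions reduce the number of runtime checks without hiding an unsynchronized write.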

The maximum number of threads that the state machine will monitor for any given memory location is configurable (currently the options are 2, 4, and 8); if a particular region is accessed by more threads, the state machine overwrites its record of one of the already-tracked threads, selected at random. During execution, TSan steps through the target program and each time it encounters a memory access, compares it against the current machine state for the region. If it determines that the new access does not have a happens-before relationship with previous accesses to the same memory location, that constitutes a race and TSan reports a warning, indicating the file names and line numbers of the access events.
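A rough model of the bounded per-region state described above follows. It is a sketch under stated assumptions rather than the runtime's actual data layout: AccessRecord, ShadowRegion, and the ordered_before callback (standing in for the happens-before check) are hypothetical names, and replacing a thread's own earlier record in place is a simplification.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    tid: int          # thread ID assigned by the detector
    timestamp: int
    offset: int       # byte offset within the region
    size: int         # access size in bytes
    is_write: bool


def conflicts(old, new):
    """Different threads, overlapping bytes, and at least one write."""
    overlap = (old.offset < new.offset + new.size
               and new.offset < old.offset + old.size)
    return old.tid != new.tid and overlap and (old.is_write or new.is_write)


class ShadowRegion:
    def __init__(self, max_slots=4, rng=None):
        if max_slots not in (2, 4, 8):
            raise ValueError("max_slots must be 2, 4, or 8")
        self.max_slots = max_slots
        self.slots = []
        self.rng = rng or random.Random()

    def record(self, new, ordered_before):
        """Store `new`; return earlier accesses that race with it."""
        races = [old for old in self.slots
                 if conflicts(old, new) and not ordered_before(old, new)]
        for i, old in enumerate(self.slots):
            if old.tid == new.tid:
                self.slots[i] = new          # refresh this thread's record
                return races
        if len(self.slots) < self.max_slots:
            self.slots.append(new)
        else:
            # More threads than slots: overwrite a random tracked thread.
            self.slots[self.rng.randrange(self.max_slots)] = new
        return races


if __name__ == "__main__":
    region = ShadowRegion(max_slots=2)
    unordered = lambda old, new: False       # pretend nothing is synchronized
    region.record(AccessRecord(1, 10, 0, 4, True), unordered)
    for old in region.record(AccessRecord(2, 11, 0, 4, False), unordered):
        print("possible data race with thread", old.tid)
```

Random eviction is what keeps memory use per region fixed; the cost in this model is that a race against an evicted record goes unreported, which is one way a detector of this kind can miss a race.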

In addition to data races, TSan v2 can also detect other types of bugs, including deadlocks, destruction of locked mutexes, and use-after-free races. It also recognizes atomic operations, which allows it to detect races in lockless code. TSan v2 works for C and C++, as well as for Google's Go language, and is reportedly twenty times faster than the original TSan implementation. For the Chromium development team, that makes the new tool fast enough to use when testing the full browser, not just unit tests, and the team has reportedly used it to find more than 100 additional bugs.

Naturally, no tool can claim to catch all race conditions, and there is some risk of false positives (which is why TSan v2 warns only of "possible" data races), but TSan v2 does offer some advantages over other race detectors like Helgrind and DRD. Speed is certainly an important factor; the Valgrind-based tools take a performance hit due to Valgrind's binary translation. The TSan developers also claim to be able to detect more types of data races than the older tools, although there has not been a detailed head-to-head comparison. On the other hand, TSan v2 is limited to 64-bit architectures, due to the limited address space of 32-bit systems.

Perhaps the biggest point in favor of TSan v2, though, is the fact that it can be easily integrated into essentially any GCC- or LLVM-based build process. It debuted in GCC 4.8 and in Clang 3.2. Considering the enormous percentage of Linux software built with one of those two compilers, that makes the tool readily available to a wide variety of free-software projects at the flip of a compiler switch.


Brief items

Quotes of the week

Personally, I think it would be *awesome* if our regression tests started failing due to the establishment of Mars/Mons_Olympus as a real time zone.
Robert Haas (hat tip to Martijn van Oosterhout)

Despite the fact that I spend hundreds of dollars a year and hours of work to host my own email server, Google has about half of my personal email! Last year, Google delivered 57% of the emails in my inbox that I replied to. They have delivered more than a third of all the email I’ve replied to every year since 2006 and more than half since 2010. On the upside, there is some indication that the proportion is going down. So far this year, only 51% of the emails I’ve replied to arrived from Google.

The numbers are higher than I imagined and reflect somewhat depressing news. They show how it’s complicated to think about privacy and autonomy for communication between parties. I’m not sure what to do except encourage others to consider, in the wake of the Snowden revelations and everything else, whether you really want Google to have all your email. And half of mine.

Benjamin Mako Hill


PyPy 2.3 released

The PyPy project has released version 2.3 of its high-performance implementation of the Python language. Along with a number of fixes, this release includes support for several new modules, the ability to embed the interpreter within hosting applications, OpenBSD support, and more.


digiKam Software Collection 4.0.0 released

DigiKam 4.0 has been released. The image editor and photo-management tool includes several new features in this update, such as a tool to organize image tags hierarchically, an automatic image-tagging tool, and considerable work refactoring the code for the transition to Qt5. We took a preview look at this release in early April.


Emacspeak 40.0 available

Version 40.0 of the Emacspeak audio desktop environment has been released. Highlights include updates to the EWW web browser as well as the feed reader, web-search engine, and map-searching tool.


Git v1.9.3 and 2.0.0-rc3 available

Git version 1.9.3 has been released. The update consists mainly of bug fixes, including replacing some shell constructs that did not work well on FreeBSD, fixing the handling of some zero-width Unicode codepoints, and fixing a security flaw in the PROMPT_COMMAND shell prompt interface. This is anticipated to be the last maintenance release in the 1.9 series.

In addition, the third release candidate for Git 2.0.0 has been released. There are quite a few changes compared to the current stable Git series, including revisions to the interface and Git workflow, plus new features. The changes over the previous release candidate are few, however, so Git 2.0 would appear to be on its way to a stable release. Should no surprises be found in rc3, Git 2.0 will be the next release.


Newsletters and articles

Development newsletters from the past week


Balazs: Results of Card Sorting the KDE System Settings

On his blog, Björn Balazs reports the results of a "card sorting" test performed on KDE's configuration options, which have at times been described as overly complex. "Participants are asked to build a structure that fits best their representation by creating groups and sorting all items (aka index cards in the offline world) into these groups. Items sorted into the same group get higher similarity value. And the statistical evaluation aggregates the individual grouping to an average model represented by a dendrogram," he explained. Perhaps unsurprisingly, the dendrograms produced by users differ from the hierarchy shown in the KDE menus, which suggests some areas for improvement. "Based on this list we will outline an idea how to get both simple access to a module and the full set of features without introducing a further navigation."


Page editor: Nathan Willis


Copyright © 2014, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds