
Hw-breakpoint: shared debugging registers

September 16, 2009

This article was contributed by Jon Ashburn

Modern processors support hardware breakpoint or watchpoint debugging functionality, but the Linux kernel does not provide a way for debuggers, such as kgdb or gdb, to access these breakpoint registers in a shared manner. Thus, debuggers running concurrently can easily collide in their use of these registers, causing them to behave in strange and confusing ways. For example, a program that continues executing through a breakpoint, rather than stopping, would certainly confuse a programmer.

This issue is being addressed by a proposed kernel API called hw-breakpoint (alternatively hw_breakpoint). The hw-breakpoint functionality, developed in a series of patches by K. Prasad, Frederic Weisbecker, and Alan Stern, aims to provide a consistent, portable, and robust method for multiple programs to access special hardware debug registers. These registers are useful for any application that needs to observe memory data accesses, or to trigger the collection of program information based on data accesses. Such applications include debugging, tracing, and performance monitoring. While these patches initially target the x86, they attempt to provide a generic API that can be supported in an architecture-independent manner on various processors. Although the details are still being ironed out, hw-breakpoint should make hardware debug resources concurrently available to various users in a more portable manner.

The most common debugging scenarios that would use the hw-breakpoint patches are memory corruption bugs. Programming mistakes such as bad pointers, buffer overruns, and improper memory allocation/deallocation can lead to memory corruption where valid data is accidentally overwritten. These bugs can be hard to find; the corruption can occur anywhere in the program. The error resulting from the corruption often occurs long after the corruption. These bugs cannot typically be found by focusing on the local sections of code that explicitly access the corrupted data. Instead, debugger watchpoints, which are a special type of breakpoint, are the first choice for debugging memory corruption problems.

Debugger breakpoints halt program execution at a given address and transfer control to the debugger. This allows the program state (variables, memory, and registers) to be examined. When programmers talk of breakpoints they usually are referring to software breakpoints. For example, in gdb the break command sets a software breakpoint at the specified instruction address. The break command replaces the specified instruction with a trap instruction that, when executed, passes control to gdb.
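The byte-patching idea behind a software breakpoint can be sketched in a few lines of C. This is only an illustration operating on an ordinary byte buffer, not gdb's actual implementation; the structure and function names are hypothetical, and the only hardware fact assumed is that the one-byte x86 trap instruction (int3) is encoded as 0xCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRAP_OPCODE 0xCC  /* x86 int3, a one-byte trap instruction */

/* Hypothetical bookkeeping for one software breakpoint. */
struct sw_breakpoint {
    size_t  offset;      /* where in the code buffer the trap was planted */
    uint8_t saved_byte;  /* original byte, needed to resume execution */
    int     active;      /* nonzero while the trap is in place */
};

/* Plant a trap at `offset`, remembering the byte it overwrites.
 * Returns 0 on success, -1 if the offset is outside the buffer. */
int sw_breakpoint_set(uint8_t *code, size_t len, size_t offset,
                      struct sw_breakpoint *bp)
{
    if (offset >= len)
        return -1;
    bp->offset = offset;
    bp->saved_byte = code[offset];
    bp->active = 1;
    code[offset] = TRAP_OPCODE;
    return 0;
}

/* Put the original byte back so the instruction can run normally. */
void sw_breakpoint_clear(uint8_t *code, struct sw_breakpoint *bp)
{
    assert(bp->active);
    code[bp->offset] = bp->saved_byte;
    bp->active = 0;
}
```

Because this technique rewrites instructions, it can only trap on execution of a particular address; it cannot notice a stray write to a data location, which is where hardware support comes in.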

In contrast, watchpoints are best implemented using hardware breakpoints; software implementations of watchpoints are extremely slow. Hardware breakpoints, however, require special debug registers in the processor. These debug registers continuously monitor the memory addresses generated by the processor, and a trap handler is invoked if the address in a register matches the address generated by the processor.

Memory accesses can be for data read, data write, or instruction execute (fetch), so hardware breakpoints usually support trapping on not only the address, but also the type of access: read, write, read/write, or execute. Hardware debug registers may also support trapping on IO port accesses in addition to memory accesses. In either case, a watchpoint is a trap on any type of data access rather than just an instruction execute access. Since memory corruption can happen anywhere in the program, a watchpoint set to trap on writes to the corrupted variable/location can be a good way to catch these bugs in the act.
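The matching rule described above (compare the address, then the type of access) can be modeled in software. The following is a simplified sketch, not a description of any particular processor: the type and function names are hypothetical, and real hardware adds constraints, such as alignment of the watched range, that are ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Access types, usable as a bitmask (names are illustrative). */
enum access_type {
    ACCESS_READ  = 1,
    ACCESS_WRITE = 2,
    ACCESS_EXEC  = 4
};

/* A software model of one hardware breakpoint register. */
struct watch_reg {
    uintptr_t addr;     /* first watched byte */
    unsigned  len;      /* number of watched bytes */
    unsigned  types;    /* bitmask of access_type values to trap on */
    bool      enabled;
};

/* Would an access of `len` bytes at `addr` trigger this register?
 * It does if the register is enabled, the access type is one being
 * watched, and the two byte ranges overlap. */
bool watch_hits(const struct watch_reg *w, uintptr_t addr, unsigned len,
                enum access_type type)
{
    if (!w->enabled || !(w->types & type))
        return false;
    return addr < w->addr + w->len && w->addr < addr + len;
}
```

A watchpoint for a corrupted variable would set `types` to `ACCESS_WRITE`, so that the many legitimate reads of the variable do not cause traps.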

These hardware debug registers are limited resources: Intel x86 processors support up to four hardware breakpoints/watchpoints using the special-purpose DR0 to DR7 registers. Registers DR0 to DR3 can be programmed with the virtual memory address of the desired hardware breakpoint or watchpoint. DR4 and DR5 are reserved for processor use. DR6 is a status register that gives information about the last breakpoint hit, such as the register number of the breakpoint, and DR7 is the breakpoint control register. DR7 includes controls such as local and global enables, memory access type, and memory access length. As with any limited hardware resource, however, multiple software users must contend for access to these registers.
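To make the DR7 controls concrete, here is a sketch of how a control value could be composed for one of the four slots. It assumes the bit layout given in Intel's architecture manuals: local and global enable bits for slot n at bits 2n and 2n+1, and a four-bit field starting at bit 16 + 4n holding the access type (R/W) followed by the length (LEN). The function and enum names are hypothetical, and the code only computes a value; actually loading DR7 requires privileged instructions.

```c
#include <assert.h>
#include <stdint.h>

/* R/W field encodings (two bits per slot). */
enum dr7_rw {
    DR7_RW_EXEC  = 0x0,  /* instruction execution only */
    DR7_RW_WRITE = 0x1,  /* data writes only */
    DR7_RW_IO    = 0x2,  /* I/O reads or writes (needs CR4.DE set) */
    DR7_RW_RW    = 0x3   /* data reads or writes */
};

/* LEN field encodings (two bits per slot). */
enum dr7_len {
    DR7_LEN_1 = 0x0,
    DR7_LEN_2 = 0x1,
    DR7_LEN_8 = 0x2,     /* not available on all processors */
    DR7_LEN_4 = 0x3
};

/* Return `dr7` with slot 0..3 enabled for the given type and length.
 * `global` selects the global enable bit rather than the local one. */
uint32_t dr7_enable_slot(uint32_t dr7, unsigned slot, enum dr7_rw rw,
                         enum dr7_len len, int global)
{
    assert(slot < 4);
    unsigned ctl_shift = 16 + slot * 4;

    dr7 &= ~(0xFu << ctl_shift);          /* clear old R/W and LEN bits */
    dr7 &= ~(0x3u << (slot * 2));         /* clear old enable bits */
    dr7 |= (uint32_t)rw << ctl_shift;
    dr7 |= (uint32_t)len << (ctl_shift + 2);
    dr7 |= 1u << (slot * 2 + (global ? 1 : 0));
    return dr7;
}
```

Since all four slots share this single control register, two users who each write a complete DR7 value without coordinating will silently disable or reconfigure one another's breakpoints.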

Since existing released kernels do not control or arbitrate access to these registers, software users can unknowingly clash in their usage, which will usually result in a software error or crash. Hw-breakpoint solves this problem by arbitrating access to these limited hardware registers from both user-space and kernel-space software. User-space access, such as from gdb, is done via the ptrace() system call. Kernel-space access includes kgdb and KVM (only during context switches between host and guests). Hw-breakpoint arbitration keeps kernel and/or user-space debuggers from stepping on each other's toes.

Additional kernel patches have been developed to take advantage of the hw-breakpoint API. A plug-in for ftrace (ftrace has previously been discussed in LWN articles here and here) has been developed to dynamically trace any kernel global symbol. This functionality, called ksym_tracer, allows all read and write accesses on a kernel variable to be displayed in debugfs. Since it uses the hw-breakpoint API, it relies on underlying hardware breakpoint support. This new feature of ftrace could be very useful for memory corruption bugs that are difficult to catch with watchpoints. These difficulties include such things as: 1) an erroneous write that is lurking beneath a large quantity of valid writes, 2) the necessity to set up a remote machine to run kgdb, and 3) kernel bugs which no longer manifest themselves when the machine is halted via breakpoints. Hw-breakpoint allows the concurrent use of both ksym_tracer and debugger watchpoints without the risk of hardware debug register corruption.

In addition to ftrace, perfcounters (see LWN articles here and here) can be enhanced through the generic hw-breakpoint functionality. Specifically, counters can be updated based on data accesses rather than instruction execution. A patch to perfcounters has been developed to use kernel-space hardware breakpoints to monitor performance events associated with data accesses. For example, spinlock accesses can be counted by monitoring the spinlock flag itself. Currently this patch is rather limited in supporting the definition and use of breakpoint counters. However, additional features are planned.

With the addition of the ftrace and perfcounter patches, the hw-breakpoint API can now potentially be used by several pieces of code: kgdb, KVM, ptrace, ftrace, and perfcounters. This increased potential usage has resulted in increased scrutiny of the API by various developers: hw-breakpoint is no longer solely of concern to debugger developers. This scrutiny has resulted in major changes to the hw-breakpoint code that are still ongoing. In particular, the coupling of perfcounters to hw-breakpoint has caused the rethinking of a significant chunk of the original hw-breakpoint functionality and structure.

The original (pre-perfcounter support) hw-breakpoint functionality was primarily developed by K. Prasad. It supported global, system-wide kernel-space breakpoints and per-thread user-space breakpoints. Whereas user-space breakpoints were only enabled during thread execution, kernel breakpoints were always present on all CPUs in the system. Additionally, no reservation policy was implemented. Requests for hardware debug registers were granted on a first-come, first-served basis. Once all physical debug registers were used, hw-breakpoint returned an error for further breakpoint requests.
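The first-come, first-served policy amounts to handing out the first free register and failing once none remain. A minimal sketch of that behavior follows; the table structure, function names, and the -1 error value are all assumptions made for illustration rather than details of the original patches.

```c
#include <stddef.h>

#define NUM_DEBUG_SLOTS 4   /* x86 has four address registers, DR0 to DR3 */

/* Hypothetical table recording which requester owns each slot;
 * a NULL owner marks a free slot. */
struct slot_table {
    const void *owner[NUM_DEBUG_SLOTS];
};

/* Grant the lowest-numbered free slot to `owner`.
 * Returns the slot number, or -1 when every slot is taken. */
int slot_alloc(struct slot_table *t, const void *owner)
{
    for (int i = 0; i < NUM_DEBUG_SLOTS; i++) {
        if (t->owner[i] == NULL) {
            t->owner[i] = owner;
            return i;
        }
    }
    return -1;
}

/* Release a slot so a later request can succeed. */
void slot_free(struct slot_table *t, int slot)
{
    if (slot >= 0 && slot < NUM_DEBUG_SLOTS)
        t->owner[slot] = NULL;
}
```

With no reservation policy, whichever users ask first hold the registers until they release them; a fifth requester simply gets an error, regardless of how important its breakpoint is.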

As Peter Zijlstra pointed out, this original hw-breakpoint implementation is "an utter mis-match" for perfcounter functionality, for three reasons. First, counters (either user- or kernel-space) can be defined per-CPU or per-task; this conflicts with hw-breakpoint's system-wide kernel breakpoints. Second, perfcounter schedules per-task counters so as to avoid unnecessary context swaps of the underlying hardware resources. Third, counters can be multiplexed, in a time-sliced fashion, beyond the resource limit of the underlying hardware PMU (performance monitoring unit), which for x86 hardware breakpoints is four. These incongruities led to a debate about whether hw-breakpoint and perfcounter should be coupled at all. However, a consensus formed that integrating hw-breakpoint into perfcounter's PMU reservation and scheduling infrastructure would be beneficial, given perfcounter's richer support for scheduling, reservation, and management of hardware resources. About these benefits Frederic Weisbecker writes:

And in the end we have a pmu (which unifies the control of this profiling unit through a well established and known object for perfcounter) controlled by a high level API that could also benefit to other debugging subsystems.
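The time-sliced multiplexing mentioned above, in which more counters exist than hardware slots, can be illustrated with a simple round-robin rotation. This sketch is not perfcounter's actual scheduler; the rotation policy (advance the window by one counter per tick) and the function name are assumptions chosen only to show the idea.

```c
#include <stdbool.h>

/* Decide whether `counter` (0-based) occupies a hardware slot during
 * time slice `tick`, when `n_counters` counters share `n_slots` slots.
 * The window of scheduled counters starts at tick % n_counters and
 * wraps around the end of the counter list. */
bool counter_is_scheduled(unsigned counter, unsigned n_counters,
                          unsigned n_slots, unsigned tick)
{
    if (counter >= n_counters)
        return false;
    if (n_counters <= n_slots)
        return true;            /* everything fits; no multiplexing */

    unsigned start = tick % n_counters;
    unsigned distance = (counter + n_counters - start) % n_counters;
    return distance < n_slots;
}
```

With six counters and four slots, each counter is on the hardware for four of every six ticks, so any events it records have to be interpreted with that duty cycle in mind.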

Newly posted in the last week is Weisbecker's patch to integrate hw-breakpoint and perfcounter code. Conceptually, this splits the hw-breakpoint functionality into two halves: 1) the top level API, and 2) the low level debug register control. In between these halves lies the perfcounter functionality. With this patch each breakpoint is a specific perfcounter instance called a breakpoint counter. Perfcounter handles register scheduling, and thread/CPU attachment of these breakpoint counter instances. The modified hw-breakpoint API still handles requests from ptrace(), ftrace, and kgdb for breakpoints by creating a breakpoint counter. Breakpoint counters can also be created directly from the existing perfcounter system call (perf_counter_open()). The breakpoint counter layer interacts with the low-level, architecture specific hw-breakpoint code that handles reading and writing the processor's debug registers.

Unfortunately, because of the very recent integration into perfcounters, the hw-breakpoint API has changed and additional changes to the API are planned. Since the existing API appears likely to change, I will summarize it rather than cover it in detail. Two function calls are provided to set a new hardware breakpoint.

     int register_user_hw_breakpoint(struct task_struct *tsk, struct hw_breakpoint *bp);
     int register_kernel_hw_breakpoint(struct hw_breakpoint *bp, int cpu);
where:
     cpu   is the cpu number to set the breakpoint on;
     *tsk  is a pointer to 'task_struct' of the process to which the address belongs;
     *bp   is a pointer to the breakpoint property information which includes:
             1) a pointer to the handler function to be invoked upon hitting the breakpoint;
             2) a pointer to architecture dependent data (struct arch_hw_breakpoint).
The struct arch_hw_breakpoint provides breakpoint properties such as the memory address of the breakpoint, the type of memory access (read/write, read, or write), and the length of the memory access (byte, short, word, ...). These parameters are highly dependent upon the specific support provided by the hardware. For example, while x86 supports virtual memory addresses, other processors support physical memory addresses. To keep the rest of the API architecture-independent, these details are confined to this one architecture-dependent structure.
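The split between a generic breakpoint descriptor and architecture-dependent data can be pictured with a small user-space model. Everything in this sketch is hypothetical: the structures carry a `_sketch` suffix because the real field names and layouts are not given here, and `deliver_hit()` merely stands in for what a trap handler would do when a debug register matches.

```c
#include <stddef.h>

/* Stand-in for the architecture-dependent data (illustrative fields). */
struct arch_hw_breakpoint_sketch {
    unsigned long address;   /* first watched byte */
    unsigned int  len;       /* number of watched bytes */
    unsigned int  type;      /* access type, encoded per architecture */
};

/* Stand-in for the generic descriptor: a handler plus arch data. */
struct hw_breakpoint_sketch {
    void (*handler)(struct hw_breakpoint_sketch *bp);
    struct arch_hw_breakpoint_sketch *arch;
    unsigned long hits;      /* bookkeeping used by count_hit() */
};

/* Example handler: just count how often the breakpoint fires. */
void count_hit(struct hw_breakpoint_sketch *bp)
{
    bp->hits++;
}

/* Invoke the handler if `addr` falls inside the watched range.
 * Returns 1 if the handler ran, 0 otherwise. */
int deliver_hit(struct hw_breakpoint_sketch *bp, unsigned long addr)
{
    if (bp == NULL || bp->arch == NULL || bp->handler == NULL)
        return 0;
    if (addr < bp->arch->address ||
        addr >= bp->arch->address + bp->arch->len)
        return 0;
    bp->handler(bp);
    return 1;
}
```

The point of the arrangement is that code such as `count_hit()` never looks inside the architecture-dependent part; only the low-level layer needs to know how an address, length, and type map onto real registers.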

To avoid having to register and unregister a breakpoint if it just needs modification, the following function is provided:

    int modify_user_hw_breakpoint(struct task_struct *tsk, struct hw_breakpoint *bp);

Hardware breakpoints are removed by an unregister function:

    void unregister_hw_breakpoint(struct hw_breakpoint *bp);

Hw-breakpoint has made its way into the -tip tree, the kernel source development tree maintained by Ingo Molnar. In June it was tentatively targeted for merging from -tip into the 2.6.32 kernel. However, the delayed integration with perfcounters has pushed any merge out past 2.6.32.

Whenever it is released, hw-breakpoint promises to provide a portable and robust method for debuggers to access hardware breakpoints without conflict. While the hw-breakpoint functionality started out as a relatively isolated feature to support debuggers, its existence has spawned new tracing and performance monitoring features. These new features should prove useful in situations where data memory access, rather than instruction access, provides the appropriate trigger to collect dynamic information. By leveraging the perfcounter resource scheduling and reservation functionality, hw-breakpoint gains a very general method for managing limited hardware breakpoint registers. The release of hw-breakpoint promises to enable new ways for Linux users to track down difficult bugs such as memory corruption, and to enable diverse dynamic data access techniques (such as gdb watchpoints and the ftrace ksym_tracer) to play well together.



Hw-breakpoint: shared debugging registers

Posted Sep 25, 2009 18:33 UTC (Fri) by HalfMoon (guest, #3211)

I'll be skeptical about the framework until I see it working with three different hardware implementations ... and working with GDB. Such things can be tricky to get right!

After x86, I'd suggest the EmbeddedICE as seen on ARM9 chips ... Linux will need to implement an in-kernel debug monitor. ARM9 chips, notably ARM926, are extremely common. An alternative might be the newer Cortex-A8 chips; extremely interesting, and with far richer debug hardware, but for the moment they're only really visible in OMAP3 chips (and thus for example BeagleBoard.org hardware).

One issue there will be how to interact with JTAG debuggers. Better maybe to call it an "opportunity"; this might be the guts of the hook needed for tools like OpenOCD to support debugging of non-kernel code.

ARM hardware has some pretty serious integrated hardware debug tools. Does x86 include stuff like an Embedded Trace Module (ETM)? On ARM those sometimes hook up to the breakpoint/watchpoint hardware to get a few more sources of trace triggers. You can do things like record per-instruction costs for the Nth invocation of a given function. Presumably this work would eventually want to coexist with that too.

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds