Jump label

By Jonathan Corbet
October 27, 2010

The kernel is filled with tests whose results almost never change. A classic example is tracepoints, which will be disabled on running systems with only very rare exceptions. There has long been interest in optimizing the tests done in such places; with 2.6.37, the "jump label" feature will make those tests go away entirely.

Consider the definition of a typical tracepoint, which, behind all of the preprocessor madness, looks something like:

    static inline void trace_foo(args)
    {
        if (unlikely(trace_foo_enabled))
            goto do_trace;
        return;
    do_trace:
        /* Actually do tracing stuff */
    }

The cost of a test for a single tracepoint is essentially zero. The number of tracepoints in the kernel is growing, though, and each one adds a new test. Each test must fetch a value from memory, adding to the pressure on the cache and hurting performance. Given that the value almost never changes, it would be nice to find a way to optimize the "tracepoint disabled" case.

In 2.6.37, this tracepoint can be rewritten using a new macro, JUMP_LABEL(), whose simplest definition looks like:

    #include <linux/jump_label.h>

    #define JUMP_LABEL(key, label)          \
        if (unlikely(*key))                 \
            goto label;

The nice thing is that JUMP_LABEL() does not have to be implemented like that. It can, instead, (1) note the location of the test and the key value in a special table, and (2) simply insert a no-op instruction. That reduces the cost of the test (and the tracepoint) to zero for the common "not enabled" case. Most of the time, the tracepoint will never be enabled and the omitted test will never be missed.
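As a rough illustration, here is how the tracepoint shown earlier might look when written with the plain preprocessor form of JUMP_LABEL(). This is a userspace sketch rather than kernel code: the unlikely() definition, the integer argument, and the trace_foo_hits counter are stand-ins assumed here so that the example compiles with GCC or Clang on its own.

```c
#include <stdio.h>

/* Userspace stand-in for the kernel's unlikely() hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Preprocessor form of JUMP_LABEL(), as given above: an ordinary test. */
#define JUMP_LABEL(key, label)          \
	if (unlikely(*key))             \
		goto label;

/* Hypothetical key and hit counter, for illustration only. */
static int trace_foo_enabled;
static int trace_foo_hits;

static inline void trace_foo(int arg)
{
	JUMP_LABEL(&trace_foo_enabled, do_trace);
	return;
do_trace:
	/* Stand-in for the real tracing work. */
	trace_foo_hits++;
	printf("trace_foo(%d)\n", arg);
}
```

With trace_foo_enabled left at zero, a call to trace_foo() returns immediately; setting it to a nonzero value makes the same call fall through to the do_trace label. The optimized implementation keeps this source-level shape and changes only what the macro expands to.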

The tricky part happens when somebody wants to enable the tracepoint. Changing its status now requires calling one of a pair of special functions:

    void enable_jump_label(void *key);
    void disable_jump_label(void *key);

A call to enable_jump_label() will look up the key in the jump label table, then replace the special no-op instructions with the assembly equivalent of "goto label", enabling the tracepoint. Disabling the jump label will cause the no-op instruction to be restored.
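The lookup-and-patch step can be modelled with a toy in userspace. The sketch below makes no attempt to rewrite machine instructions as the real code does; it only shows the bookkeeping, with an array of "instruction slots" standing in for kernel text. Every name beginning with toy_ or TOY_ is hypothetical and does not come from the kernel source.

```c
#include <stddef.h>

/* Toy "instruction": each patch site holds either a no-op or a jump. */
enum toy_insn { TOY_NOP, TOY_JMP };

/* One table entry per JUMP_LABEL() site: the controlling key and the site. */
struct toy_jump_entry {
	void *key;
	enum toy_insn *site;
};

/* Hypothetical keys and a stand-in for kernel text (zero means TOY_NOP). */
static int toy_key_a, toy_key_b;
static enum toy_insn toy_text[3];

static struct toy_jump_entry toy_jump_table[] = {
	{ &toy_key_a, &toy_text[0] },
	{ &toy_key_b, &toy_text[1] },
	{ &toy_key_a, &toy_text[2] },	/* one key may control several sites */
};

/* Walk the table and rewrite every site controlled by the given key. */
static void toy_patch(void *key, enum toy_insn insn)
{
	size_t n = sizeof(toy_jump_table) / sizeof(toy_jump_table[0]);

	for (size_t i = 0; i < n; i++)
		if (toy_jump_table[i].key == key)
			*toy_jump_table[i].site = insn;
}

static void toy_enable_jump_label(void *key)
{
	toy_patch(key, TOY_JMP);
}

static void toy_disable_jump_label(void *key)
{
	toy_patch(key, TOY_NOP);
}
```

Calling toy_enable_jump_label(&toy_key_a) turns slots 0 and 2 into jumps and leaves slot 1 alone; toy_disable_jump_label(&toy_key_a) restores the no-ops. The cost of the table walk is paid only when a key changes state, which is the trade the feature is making.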

The end result is a significant reduction in the overhead of disabled tracepoints. This feature only works on architectures which support it (x86 only, at the moment) and only with relatively recent versions of GCC; otherwise the preprocessor version is used.

Jump label

Posted Oct 28, 2010 5:37 UTC (Thu) by daney (guest, #24551) [Link] (2 responses)

Two points:

1) Jump label works on SPARC and MIPS too, not just x86.

2) Jump label doesn't work anywhere reliably due to bugs in current GCC releases.

Jump label

Posted Oct 28, 2010 21:41 UTC (Thu) by nevets (subscriber, #11875) [Link] (1 responses)

It's looking like a bug just with i386. I think all other archs are fine (including x86_64). In fact, it is caused by the same thing that broke the function tracer a while back: the craziness gcc does without the -maccumulate-outgoing-args option.

Jump label

Posted Oct 29, 2010 21:18 UTC (Fri) by daney (guest, #24551) [Link]

Indeed, although it now looks like the GCC issue has been sorted out, so those that like living on the bleeding edge can have their jump label on i386 too.

Jump label

Posted Oct 28, 2010 7:19 UTC (Thu) by epa (subscriber, #39769) [Link] (2 responses)

Thus we take another step closer to the self-compiling kernel.

Jump label

Posted Oct 29, 2010 11:11 UTC (Fri) by marcH (subscriber, #57642) [Link] (1 responses)

Coming soon: JIT

Jump label

Posted Oct 29, 2010 15:31 UTC (Fri) by nix (subscriber, #2304) [Link]

Of course jit is coming soon. We already *have* git, so this should just be a matter of a few ++s away (on a big-endian architecture).

Jump label

Posted Oct 28, 2010 7:40 UTC (Thu) by tcourbon (guest, #60669) [Link] (4 responses)

Is there any way to measure the actual overhead caused by the test in the current kernel? To my naive eyes this thing looks complicated (at least from the developer's point of view).

(Furthermore, does the x86 architecture include x86_64? If not, it looks, again, overcomplicated for a not-so-impressive expected result.)

Trade-off

Posted Oct 28, 2010 13:22 UTC (Thu) by CChittleborough (subscriber, #60775) [Link] (2 responses)

With the new approach, there is very little overhead in the instrumented code: either a NOP or an unconditional branch. The overhead has been moved to compile time (setting up the table of 'jump labels') and the time at which jumps are turned on or off (by calling enable_jump_label() or disable_jump_label(), which use the table to replace the NOP with a branch or vice versa).

Trade-off

Posted Oct 28, 2010 14:50 UTC (Thu) by nix (subscriber, #2304) [Link]

It looks more than slightly reminiscent of the PLT's lazy binding trick to me.

Trade-off

Posted Oct 30, 2010 22:05 UTC (Sat) by giraffedata (guest, #1954) [Link]

"Is there any way to measure the actual overhead caused by the test in the current kernel?"

"With the new approach, there is very little overhead"

I believe "current kernel" refers to kernel code before the new approach.

I.e., what do we get in return for this complication?

Jump label

Posted Dec 31, 2010 17:25 UTC (Fri) by SEJeff (guest, #51588) [Link]

32-bit and 64-bit x86 were merged together some time ago. You must have missed the HUGE flamewar that resulted in Ingo winning and the two being merged.

Jump label

Posted Oct 28, 2010 7:54 UTC (Thu) by ptman (subscriber, #57271) [Link] (4 responses)

Is this comparable to dtrace in Solaris? Zero overhead when probes are disabled?

Jump label

Posted Oct 28, 2010 13:09 UTC (Thu) by i3839 (guest, #31386) [Link] (2 responses)

There is no such thing as zero overhead. In the best case you still waste a little bit of memory per tracepoint, which all adds up to something significant if you have too many of them.

Jump label

Posted Nov 1, 2010 20:58 UTC (Mon) by ThomasBellman (guest, #67902) [Link] (1 responses)

Instead of inserting a NOP instruction, you could fill that slot with a useful instruction. Then, of course, that useful instruction must also be part of the tracepoint code, in order to have it performed when the tracepoint is enabled as well, when the original instruction is replaced with a jump.

Jump label

Posted Nov 2, 2010 20:28 UTC (Tue) by i3839 (guest, #31386) [Link]

That's very smart, but a bit tricky to implement (at least on x86 and other archs with variable instruction lengths). I'd say go for it!

Jump label

Posted Oct 30, 2010 3:50 UTC (Sat) by compudj (subscriber, #43335) [Link]

The overhead can be expected to be even lower than the static DTrace instrumentation, because jump labels, in combination with Tracepoints, branch over the whole stack setup and function call. DTrace only nops out the actual function call with a special linker phase, leaving in place the whole stack setup.