
# Likely unlikely()s

Posted Dec 16, 2010 5:09 UTC (Thu) by jzbiciak (✭ supporter ✭, #5246)
Parent article: Likely unlikely()s

> Being wrong 39% of the time was pretty obviously too much and led to the removal of the annotation for that test.

What that's saying is that a 60:40 ratio is too close. So what is an appropriate threshold?

It seems like any notable bias in a particular direction should be reason enough. The key is whether the bias is stable over a range of reasonable workloads. (Note I say "reasonable": if you have likely(doesn't segfault) and then run a workload full of segfaults, well, that's just silly.)

Consider a simplified model: you have two costs, k_correct and k_incorrect. If you predict a branch is taken and the branch is in fact taken, you get charged k_correct; otherwise you get charged k_incorrect. (The whole exercise is only worthwhile if k_correct < k_incorrect.) You also have two probabilities: p_correct is the probability that your prediction matches what the processor did, and (1 - p_correct) is the probability it didn't. In this model, overall performance should approach k_correct * p_correct + k_incorrect * (1 - p_correct) if you annotated your branch correctly. In the 60:40 example from the article, p_correct would be 0.6.

In this model, you end up with a few possibilities:

1. You annotated your branch correctly: performance = k_correct * p_correct + k_incorrect * (1 - p_correct)
2. You annotated your branch incorrectly, effectively swapping k_correct and k_incorrect: performance = k_incorrect * p_correct + k_correct * (1 - p_correct)
3. You left it up to the compiler to figure it out. Whether you get #1 or #2 above depends on how good/lucky your compiler is.

Now you're left weighing the quality of the compiler against your ability to get the annotation right, and your desire to maintain that annotation. And that's just in this simplified model. In a more complex model, the probability that a branch gets taken can vary with other factors, some of which may be discoverable by the CPU's branch-prediction logic, for example.

Given all that, I can see why you'd want a very clear signal for 'likely'/'unlikely'. But still, how clear?


Likely unlikely()s

Posted Dec 16, 2010 6:51 UTC (Thu) by cras (guest, #7000)

I was once adding likely()s and unlikely()s to heavily CPU-intensive code. I remember that adding unlikely() to a branch that was rarely executed (IIRC only a few % of calls) was slower than simply not giving any hint. Nowadays I add them only to branches where getting the branch prediction wrong just doesn't matter (error handling mostly).

Likely unlikely()s

Posted Dec 16, 2010 13:45 UTC (Thu) by dmk (subscriber, #50141)

Wouldn't it sometimes be worthwhile to annotate a code path with likely() even if it isn't true in reality, just to make that one code path fast? For example, if we are in the code path that decides whether the world continues or not, we would want the decision to be really fast if the world continues; otherwise we don't care.

Likely unlikely()s

Posted Dec 16, 2010 13:58 UTC (Thu) by cras (guest, #7000)

Well, if the code was like:

```c
while (likely(!is_world_ending())) {
	update_world_status();
}
save_the_world();
```

And the branch prediction was wrong pretty much always, that would also increase the amount of time used to update world status. So even when it finally sees the world ending and jumps immediately to saving the world, perhaps it could have done that in the previous iteration already if the inner loop had been faster.

Or something like that :)

Likely unlikely()s

Posted Dec 18, 2010 20:54 UTC (Sat) by giraffedata (subscriber, #1954)

dmk's basic idea seems valuable, but the example is bad. And cras seems to have misread that example. We need a better example.

The problem with what dmk writes ("we would want to get the decision really fast") is that likely() doesn't affect how fast the test is. It affects how fast what you do as a result of that test is. If the test is world_continues and you want to respond quickly when it does, then the truth is the right way to go: world_continues is likely, and that's how you want to annotate it.

I believe cras assumes we want to respond quickly if the world is ending, because it isn't really ending -- we can save it. That's not what dmk described: he said the world really is ending, so all we can do is say good bye, and it just won't make any difference how slowly we do that.

Likely unlikely()s

Posted Dec 27, 2010 10:54 UTC (Mon) by tesarik (guest, #52705)

Well, let's give a better example of this. Consider a spinlock slow path (i.e., on contention):

```c
while (unlikely(is_locked(lock)))
	cpu_relax();
```

Although it's likely that the lock is still held many times before we can take it, it's more important to run the critical section as fast as possible when the lock finally becomes available.

Likely unlikely()s

Posted Dec 16, 2010 19:44 UTC (Thu) by vonbrand (subscriber, #4458)

Sounds nonsensical to me... if the annotated code is not in any way performance critical, doing the extra work of annotating it is pure waste.

Likely unlikely()s

Posted Dec 16, 2010 20:21 UTC (Thu) by cras (guest, #7000)

Of course. But if there are some heavily used macros and functions where it's easy to add them once and then get them used throughout the code, it's not much extra trouble even if the benefit is small.

Although now that I think about it, maybe many of the unlikely()s I used to put in my code are nowadays pointless, because those branches call functions that I've since marked with __attribute__((cold)), which should add the unlikely()s automatically. (But does it? I remember testing that the cold attribute did nothing in some gcc version.)

Likely unlikely()s

Posted Dec 19, 2010 3:39 UTC (Sun) by quotemstr (subscriber, #45331)

You're assuming that p_correct and p_incorrect are constant over time, and they're often not. CPUs these days have large global branch-prediction buffers and can detect fairly complex patterns (e.g., branch taken every third time) automatically. The overall result of this dynamic prediction may be superior to a purely static assertion, even if that assertion is correct on average.

Think of a branch that goes one way during the day and another way at night. If we suppose that activity is twice as high during the day, then we'll see a 2:1 branch bias in favor of the daylight path. But over smaller intervals of time (say, an hour), the branch bias will be almost totally in favor of one path or the other. Unlike static assertions, the CPU's automatic prediction can look at the specific circumstances of the system and make predictions based on recent behavior and specific circumstances. Using static branch prediction could actually hurt.

Likely unlikely()s

Posted Dec 20, 2010 20:35 UTC (Mon) by jzbiciak (✭ supporter ✭, #5246)

I did state my model was simplified, and did allow for the fact that branch prediction could change the cost over time and that that wasn't included in my model.

My understanding is that static prediction mainly involves putting a hint on the branch opcode (on architectures that support it) and/or changing which path is the "fall through" path versus the "branch taken" path, overlaid with a rule such as "predict backward branches taken, forward branches not-taken". In the presence of dynamic prediction, a branch hint should be just that: a hint. A strong signal from the dynamic predictor ought to override it.

The main purpose of branch hinting is to help the cold-predictor case. This article describes one such scheme. Basically, if your branch isn't in the Branch History Table (BHT), then it falls back to the static prediction. This puts an upper bound on how much damage the static prediction can do, as well as an upper bound on how much it can help.

For static branch prediction to hurt, you'd need to have a large enough working set that the branch in question isn't in the BHT (to use the P4's scheme above), but still hit often enough to make a measurable performance difference.

For your bimodal case (day/night shifting patterns), whichever way you schedule your code, you'll penalize one period or the other. If you have accurate long-term branch statistics, though, you can still pick the direction that occurs less often and penalize that. So, if daytime has twice the load of nighttime, that should show up in the statistics, and you can put likely/unlikely on the conditional to favor daytime.

Really, the argument that makes the most sense to me against likely/unlikely in marginal cases is that different users can have completely different workloads with wildly different statistics (multimodal across all users, even though each user might be unimodal). There's no good training set and no way to avoid penalizing someone, so you might as well let the compiler guess. Only put likely/unlikely where it's clearly right across the vast majority of workloads; otherwise, there's a good chance it will be wrong for more than half the users out there, and it isn't worth the maintenance burden.

So, that leaves the question: what are the criteria for figuring that out? The taken vs. not-taken ratio seems like a shaky heuristic to me.

Likely unlikely()s

Posted Dec 21, 2010 13:58 UTC (Tue) by vonbrand (subscriber, #4458)

It has been observed that programmers are terrible at guessing where their programs spend most of their time... and that should be much easier than predicting the frequency of some critical branch going one way or the other.

Best leave it alone, or only annotate when extensive measurements show a high bias (and the compiler gets it wrong).

Likely unlikely()s

Posted Dec 27, 2010 11:00 UTC (Mon) by tesarik (guest, #52705)

*The* compiler? Which compiler?

Surely gcc's heuristic decisions change from version to version...

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds