
An introduction to lockless algorithms

An introduction to lockless algorithms

Posted Feb 20, 2021 0:05 UTC (Sat) by pebolle (guest, #35204)
Parent article: An introduction to lockless algorithms

I'm dumb: what does "total" mean?



An introduction to lockless algorithms

Posted Feb 20, 2021 0:12 UTC (Sat) by mpr22 (subscriber, #60784) [Link] (2 responses)

From the fine article:

"The ordering of events is total within a single thread of execution. In layman's terms, for any two events from the same thread you can always say which came first and which came second."

An introduction to lockless algorithms

Posted Feb 20, 2021 0:25 UTC (Sat) by pebolle (guest, #35204) [Link] (1 responses)

> [...] all accesses to the data structure form a total ordering [...]

and

> The ordering of operations is total within a single thread.

Totally clear.

An introduction to lockless algorithms

Posted Feb 20, 2021 7:21 UTC (Sat) by pbonzini (subscriber, #60935) [Link]

"All accesses to the data structure form a total ordering" means that you can always say which access came first and which came second. This is what you'd expect from a mutex, because:

1) you can always say which critical section came first and which came second (the lock waits for the other critical section to complete, if there is one), and

2) accesses within a critical section come from a single thread, and there is always a total order (defined in the article) within a thread.
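
As a minimal sketch of point 1), assuming POSIX threads (the counter and its lock are made-up names, purely for illustration), every access to the shared data happens inside a critical section:

#include <pthread.h>

static long counter;                                   /* shared data */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void increment(void)
{
    pthread_mutex_lock(&counter_lock);     /* waits for any other critical section */
    counter++;                             /* access happens inside the critical section */
    pthread_mutex_unlock(&counter_lock);
}

long read_counter(void)
{
    long value;

    pthread_mutex_lock(&counter_lock);
    value = counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}

Since the mutex serializes the critical sections, and accesses within each critical section are already totally ordered by point 2), any two accesses to counter can be compared: whichever critical section acquired the lock first came first.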

An introduction to lockless algorithms

Posted Feb 20, 2021 2:32 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (22 responses)

It's a mathematical term (a "totally ordered set"). It means that for any two events you can tell which one came earlier.

This is opposed to a partially ordered set. A good example of a partially ordered set is a genealogical tree. From the information within the tree you can tell that some people are older than others (a father is always older than his son), but in some cases there's not enough information (e.g. you can be younger or older than your cousin).

An introduction to lockless algorithms

Posted Feb 20, 2021 9:34 UTC (Sat) by pebolle (guest, #35204) [Link] (21 responses)

> A good example of a partially ordered set is a genealogical tree.

That helps.

I'm mathematically challenged. So, for example, I found it very annoying in programming exercises that recursion would always be taught with Fibonacci numbers. Who cares about that? It wasn't until I once had to walk through directories and filenames that recursion started to make sense to me.

An introduction to lockless algorithms

Posted Feb 21, 2021 11:20 UTC (Sun) by ballombe (subscriber, #9523) [Link] (20 responses)

Maybe you missed the point of the exercise?
The point is that recursion *does not* work for computing Fibonacci numbers...

An introduction to lockless algorithms

Posted Feb 21, 2021 12:13 UTC (Sun) by pebolle (guest, #35204) [Link] (9 responses)

> [...] recursion *does not* work for computing Fibonacci numbers...

I just double checked - I was afraid that my memory had let me down again - but recursion does work for Fibonacci. At least, that's what the web and the only introductory programming book I have on my shelves tell me.

Perhaps we disagree on the meaning of "recursion" and "work"?

An introduction to lockless algorithms

Posted Feb 21, 2021 12:29 UTC (Sun) by Wol (subscriber, #4433) [Link] (6 responses)

int fibonacci(int f)
{
    if (f <= 1)
        return f;
    else
        return fibonacci(f - 1) + fibonacci(f - 2);
}

Which will blow your stack for large values of f :-)

I hope I've got it right ... :-)

Cheers,
Wol

An introduction to lockless algorithms

Posted Feb 21, 2021 13:08 UTC (Sun) by dtlin (subscriber, #36537) [Link]

With that implementation, there's a good chance you'll experience integer wraparound (and thus wrong results) before you blow the stack. fib(92) = 7540113804746346429, fib(93) = 12200160415121876738 which doesn't fit in a 64-bit signed integer.

The problem with the recursive solution is the exponential runtime; the stack usage is linear.

An introduction to lockless algorithms

Posted Feb 21, 2021 15:24 UTC (Sun) by Wol (subscriber, #4433) [Link] (1 responses)

I knew I should have checked it ...

The start of the conditional should be

if (f <= 2) return 1;

Cheers,
Wol

An introduction to lockless algorithms

Posted Feb 21, 2021 16:16 UTC (Sun) by yann.morin.1998 (guest, #54333) [Link]

> The start of the conditional should be
> if (f <= 2) return 1;

Not necessarily. The generalised Fibonacci sequence starts with F(0) = 0 and F(1) = 1 [0], so your initial condition was in fact correct.

[0] https://en.wikipedia.org/wiki/Fibonacci_number

An introduction to lockless algorithms

Posted Feb 21, 2021 17:10 UTC (Sun) by matthias (subscriber, #94967) [Link]

This will definitely not blow the stack. Even for quite small values of f, the runtime will be longer than the age of the universe.

An introduction to lockless algorithms

Posted Feb 21, 2021 23:41 UTC (Sun) by neilbrown (subscriber, #359) [Link]

> Which will blow your stack for large values of f :-)

Only if you try to implement it on a computer.
As a mathematical abstraction, this is perfectly fine and well defined for all f.

An introduction to lockless algorithms

Posted Mar 5, 2021 0:13 UTC (Fri) by qzPhUZ7af5M6kjJi3tDgMaVq (guest, #140267) [Link]

Recursive definitions work just fine in certain programming languages.

https://wiki.haskell.org/The_Fibonacci_sequence

In a Haskell interpreter:

Prelude> let fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
Prelude> fibs !! 100000

(20,899 digit number omitted... it took about 2 seconds to compute it)

An introduction to lockless algorithms

Posted Feb 22, 2021 9:49 UTC (Mon) by ballombe (subscriber, #9523) [Link] (1 responses)

To compute fibo(50) the naive sequential algorithm needs 50 additions, while the naive recursive algorithm needs 12586269025 additions. That is what I call 'not working'.

An introduction to lockless algorithms

Posted Feb 22, 2021 10:04 UTC (Mon) by pebolle (guest, #35204) [Link]

I'll file this under 'Disagreeing what "work" means' too.

An introduction to lockless algorithms

Posted Feb 22, 2021 11:47 UTC (Mon) by jezuch (subscriber, #52988) [Link] (9 responses)

Actually, I don't think I've ever seen a programming tutorial that introduced recursion using the Fibonacci sequence and then said that this was a bad example. Which I too think is sad. But I guess this is the prototypical "recursive function" known from math classes, so...

An introduction to lockless algorithms

Posted Feb 22, 2021 12:14 UTC (Mon) by Wol (subscriber, #4433) [Link] (8 responses)

Thing is, the *definition* of fibonacci is recursive. Doesn't mean it's the best way to calculate it ...

Cheers,
Wol

An introduction to lockless algorithms

Posted Feb 24, 2021 23:49 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (7 responses)

The recursive definition is perfectly fine for computing it as long as you memoize appropriately. You don't even need all that much memo space, because you only ever need to keep the largest two values (which I believe will generally also be the two most recently computed values, so you can just bolt on a bog-standard LRU cache).
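
A minimal sketch of that in C, assuming results are kept in uint64_t (the names and array size here are made up for illustration); for simplicity it memoizes every value, even though, as noted above, only the two largest are really needed:

#include <stdint.h>

#define MAXN 94   /* F(93) is the largest Fibonacci number that fits in a uint64_t */

static uint64_t memo[MAXN];   /* zero-initialized; F(n) > 0 for n >= 1, so 0 means "not computed yet" */

/* Caller must pass 0 <= n < MAXN. */
uint64_t fib(int n)
{
    if (n <= 1)
        return n;
    if (memo[n] == 0)
        memo[n] = fib(n - 1) + fib(n - 2);
    return memo[n];
}

With the memo in place each value is computed exactly once, so the recursive definition now does a linear number of additions rather than an exponential one.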

Of course, if you *really* want to motivate memoization, you probably want to talk about something more interesting, like Hashlife. I remember reading a lengthy series of articles on Eric Lippert's blog where he spends much of the series painstakingly squeezing out every ounce of performance from the iterative approach (vectorization, bit-twiddling, lookup tables, etc.), and then he switches to Hashlife ("Gosper's algorithm"), with hardly any weird optimizations at all beyond basic algorithm design and memoization, using *garbage collected objects* to represent everything, and suddenly the whole thing is orders of magnitude faster* for large-enough N. My conclusion: Using the right algorithm is far, far more effective than heavily optimizing the wrong algorithm.

* Strictly speaking, it is faster when solving the problem "Figure out what the board will look like at generation N." If the problem is instead "Figure out what the board will look like at every generation from 1 through N," then Hashlife is much less effective. So using it for an interactive UI that sequentially displays one generation at a time is probably not a great idea. But if you want to advance directly to generation 2^27 or something like that, Hashlife is basically the only game in town.

An introduction to lockless algorithms

Posted Mar 2, 2021 20:07 UTC (Tue) by jezuch (subscriber, #52988) [Link] (6 responses)

Well actually...

Fibonacci numbers are defined by a recursive function which has an exact closed-form solution, so you can calculate every number in O(1) time (assuming no bignums are needed). So it's doubly (or triply) a bad example to introduce recursion to new programmers ;)
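
The exact solution being referred to is presumably Binet's closed form, which expresses F(n) directly in terms of the golden ratio:

F(n) = \frac{\varphi^n - \psi^n}{\sqrt{5}},
\qquad \varphi = \frac{1 + \sqrt{5}}{2},
\quad \psi = \frac{1 - \sqrt{5}}{2}

In floating point this is only exact for small n, and with exact (bignum) arithmetic raising \varphi to the n-th power is not a constant-time operation, which is the objection raised in the reply below.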

An introduction to lockless algorithms

Posted Mar 5, 2021 3:16 UTC (Fri) by dtlin (subscriber, #36537) [Link] (5 responses)

It has a closed-form solution but that does not make it O(1).

An introduction to lockless algorithms

Posted Mar 11, 2021 9:56 UTC (Thu) by anselm (subscriber, #2796) [Link] (4 responses)

It has a closed-form solution but that does not make it O(1).

If your platform only supports integers up to a certain maximum (say 2^64-1), it is O(1) because it is straightforward to create a lookup table for all n where F(n) < 2^64 and use that rather than the recursive definition.
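
A minimal sketch of that lookup table in C, assuming unsigned 64-bit integers (the names are made up for illustration); the table is filled once and every later query is a bounds check plus a constant-time array access:

#include <stdint.h>
#include <stddef.h>

#define FIB_MAX 94   /* F(0) .. F(93) are the Fibonacci numbers that fit in a uint64_t */

static uint64_t fib_table[FIB_MAX];

static void fib_table_init(void)
{
    fib_table[0] = 0;
    fib_table[1] = 1;
    for (size_t i = 2; i < FIB_MAX; i++)
        fib_table[i] = fib_table[i - 1] + fib_table[i - 2];
}

/* Returns F(n), or 0 if n is out of the supported range. */
uint64_t fib_lookup(int n)
{
    if (n < 0 || n >= FIB_MAX)
        return 0;
    return fib_table[n];
}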

An introduction to lockless algorithms

Posted Mar 11, 2021 19:28 UTC (Thu) by nybble41 (subscriber, #55106) [Link] (3 responses)

> … it is O(1) because it is straightforward to create a lookup table for all n where F(n) < 2^64 and use that rather than the recursive definition.

If you restrict permissible inputs to a predetermined finite range, then every algorithm becomes O(1). That's not a very useful definition, though. And while a 93-entry table suffices to represent all Fibonacci numbers less than 2^64, many languages have built-in arbitrary-precision integer types limited only by available memory.

An introduction to lockless algorithms

Posted Mar 12, 2021 10:49 UTC (Fri) by anselm (subscriber, #2796) [Link] (1 responses)

Obviously. I think the point here is that Fibonacci numbers are often used as a prime example for the use of recursion, but in practice, calculating Fibonacci numbers through a 1:1 implementation of their recursive definition is not what one would actually ever do. As such it gives people silly ideas. It would be better to come up with examples where there are more compelling arguments for using recursion in the first place, or at the very least put a big health warning on the Fibonacci example.

An introduction to lockless algorithms

Posted Mar 13, 2021 8:54 UTC (Sat) by neilbrown (subscriber, #359) [Link]

I recall in first year CompSci, recursion was introduced using the Towers of Hanoi game. I think that is an excellent vehicle, much better than Fibonacci.
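
For comparison, a minimal recursive Towers of Hanoi sketch in C (the names are made up for illustration); unlike the Fibonacci case, here the recursion is also the natural way to actually solve the problem:

#include <stdio.h>

/* Move n disks from peg 'from' to peg 'to', using 'via' as the spare peg. */
void hanoi(int n, char from, char to, char via)
{
    if (n == 0)
        return;
    hanoi(n - 1, from, via, to);              /* move the n-1 smaller disks out of the way */
    printf("move disk %d: %c -> %c\n", n, from, to);
    hanoi(n - 1, via, to, from);              /* move them back on top of the big disk */
}

/* Example: hanoi(3, 'A', 'C', 'B') prints the 7 moves needed for three disks. */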

An introduction to lockless algorithms

Posted Mar 16, 2021 20:49 UTC (Tue) by nix (subscriber, #2304) [Link]

The canonically crazy example of this is the wonderful Hutter search: https://arxiv.org/abs/cs/0206022. The title describes it well: it is, in fact, the fastest and shortest general algorithm possible if you want one single algorithm that can solve all well-defined problems -- but given that it involves searching for and then executing all possible programs until it finds the one that solves the problem you asked about, it is somewhat impractical. You might find it useful if you are looking for something for your hypercomputer to run when you are bored of having it work out all the busy beaver numbers up to 10^100...

An introduction to lockless algorithms

Posted Feb 21, 2021 21:31 UTC (Sun) by thoughtpolice (subscriber, #87455) [Link] (2 responses)

Many people are giving you exact definitions of the term "total ordering", which is relevant to the article, obviously, but it is also worth pointing out that the word "total", when applied to some operation, be it in mathematics or a computer program, means the operation works and "makes sense" for every possible input you could feed it. This is something computer programmers encounter regularly, though it may not seem like it.

For example, many people define division as a function that takes two parameters, 'n' and 'd', and computes n / d. But most people would agree that d=0 is an invalid parameter: your program will actually fault. It's nonsensical.[1] So division is not total, then: you can give it an input and it can fail. But multiplication is total: it doesn't matter what two integers you give it, you can always multiply them, and there is always a sensible result. Thus, being "total", or "totality", is a property of algorithms/functions/actions you can perform on some objects (those objects being the set of all possible parameters, commonly called "the input domain").

There is another operator that you often use when programming: "is x less than y", or <. You use it to figure out how things are ordered. Is 1 less than 2? It turns out this is very similar to the question "does one thing happen before another?", and it's a very important question. They are both ways of asking what order things occurred in, just for different kinds of "things". But, as an algorithm, ordering is not always total: sometimes you just can't tell whether A happened before B, or whether A is less than B. It might not make sense for the objects in question.
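
A concrete example of an ordering that is not total, assuming IEEE 754 floating point: with NaN in the picture, < cannot rank every pair of values, so some pairs are simply incomparable:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double a = NAN, b = 1.0;

    /* All three comparisons are false: a and b cannot be ordered. */
    printf("%d %d %d\n", a < b, a > b, a == b);   /* prints: 0 0 0 */
    return 0;
}

That is exactly the "you just can't tell" situation: NaN is neither less than, greater than, nor equal to any value, so < on doubles is only a partial order.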

I'll also say that partiality and totality, and especially partially ordered sets (sometimes called "posets"), are an extremely, extremely fundamental component of computer science and mathematics. They're very important, and very well worth understanding, even if it's only at a high level.

[1] It's only nonsensical to some people or computers. On some CPUs, such as RISC-V for example, division by zero is well defined, as n/0=0. This is fine, and not really problematic. At the end of the day, math is what human beings make of it. For rigorous mathematics, n/0=0 might be heresy. But for computers, for humans using them, it's maybe OK to let that slide.

An introduction to lockless algorithms

Posted Feb 25, 2021 1:47 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

Another hint: For *finite* sets (which is basically all the sets the average programmer is likely to care about), a poset is (mostly) equivalent to a DAG. So for example:

* Git history is a poset (under the relationship "commit X is an ancestor of commit Y").
* The packages in a (sufficiently powerful) package manager are a poset (under the relationship "package X is a transitive dependency of package Y").
* In languages with OOP and inheritance, the set of classes (or classes and interfaces) is a poset (under the relationship "class/interface X is a direct or indirect subclass/subtype of class/interface Y").

In short: A poset is a thing that lets you have some objects come "logically before" other objects, for whatever definition of "logically before" is appropriate.

Then, a totally ordered set is just a poset with the additional requirement that for every X and Y, either X comes before Y or Y comes before X (or X = Y). In the DAG interpretation, this is what you would get if you require that the DAG forms a simple chain A -> B -> C ... -> Z, without any branching or whatnot allowed. So, to continue our examples:

* Git history is totally ordered, if you never create any branches or merges, never use detached HEAD mode, and never create a second root commit (i.e. a commit without parents, other than the first commit in the repository).
* Packages are pretty much never totally ordered, in real-world cases. But maybe, if you wanted to ensure a totally deterministic installation order, you could insert artificial dependencies to force the package manager to install everything in the exact order you want.
* In languages with single inheritance, each individual class's set of superclasses (not counting interfaces) is totally ordered. But the language as a whole is not, in most reasonable cases.

An introduction to lockless algorithms

Posted Feb 26, 2021 21:07 UTC (Fri) by thoughtpolice (subscriber, #87455) [Link]

In fact it's so close to a DAG that a Hasse diagram of a poset (when drawn correctly) might make you blink twice. :) My original post was already long enough, though...

