
User-space RCU: Atomic-operation and utility API

November 12, 2013

This article was contributed by Paul E. McKenney, Mathieu Desnoyers, and Lai Jiangshan


Many of URCU's atomic operations were derived from the implementations in the BSD-licensed Boehm-Demers-Weiser conservative garbage collector for license-compatibility reasons. This section also includes utility functions that are not strictly speaking atomic, but are often used in conjunction with atomic operations.

With a few exceptions, these operations have one of the following three prefixes:

  1. caa_: Concurrent Architecture Abstraction.
  2. cmm_: Concurrent Memory Model.
  3. uatomic_: URCU Atomic Operation.

The individual operations are as follows:

  1. caa_container_of(ptr, type, member): Given a pointer ptr to a field named member in an enclosing structure of type type, return a pointer to the enclosing structure.
  2. caa_likely(x) and caa_unlikely(x): Inform the compiler of the developer's opinion on whether the specified condition x is likely to evaluate to true.
  3. caa_max(a,b) and caa_min(a,b): Take the maximum or minimum, respectively, of a and b.
  4. cmm_barrier(): Prevent the compiler from carrying out any code-motion optimizations that would move memory references across this directive.
  5. cmm_smp_mb(): Memory barrier that prevents both the compiler and the CPU from carrying out any code-motion optimizations that would move memory references across this directive.
  6. cmm_smp_rmb(): Memory barrier that prevents both the compiler and the CPU from carrying out any code-motion optimizations that would move reads from memory across this directive.
  7. cmm_smp_wmb(): Memory barrier that prevents both the compiler and the CPU from carrying out any code-motion optimizations that would move writes to memory across this directive. Please see the memory-barrier menagerie for more details on how these three memory-barrier primitives can be used.
  8. CMM_ACCESS_ONCE(x): Force the compiler to access x exactly as specified in the source code, preventing the compiler from carrying out any optimizations that would either combine or split accesses. This is quite similar to the Linux kernel's ACCESS_ONCE() primitive.
  9. CMM_LOAD_SHARED(p): Perform a load from p with CMM_ACCESS_ONCE() semantics.
  10. CMM_STORE_SHARED(x, v): Store v to x, with CMM_ACCESS_ONCE(x) semantics on the store and also with type checking. This is similar to CMM_ACCESS_ONCE(x) = v.
  11. DEFINE_URCU_TLS(type, name): Define a thread-local variable of type type of name name. Initializers are not permitted. This macro uses __thread where available, and falls back to POSIX get/set specific otherwise (as is also the case for DECLARE_URCU_TLS() and URCU_TLS() below).
  12. DECLARE_URCU_TLS(type, name): Declare a thread-local variable of type type of name name so that other compilation units can access it.
  13. URCU_TLS(name): Access a thread-local variable. This produces a C-language lvalue, permitting both loads and stores.
  14. uatomic_set(addr, v): This is semantically equivalent to the assignment statement *addr = v, but with CMM_STORE_SHARED() semantics. Note that no ordering constraints are placed on the CPU.
  15. uatomic_read(addr): Load from *addr with CMM_LOAD_SHARED() semantics. Note that no ordering constraints are placed on the CPU.
  16. uatomic_cmpxchg(addr, old, _new): Atomically compare the data referenced by addr with old, storing _new if the two values are equal, and in either case returning the previous value referenced by addr. This is a standard compare-and-swap operation, constraining both the compiler and the CPU to avoid optimizations that would reorder the uatomic_cmpxchg() with code either preceding or following it.
  17. uatomic_xchg(addr, v): Atomically store the value v into the location referenced by addr, returning the prior value of the location referenced by addr. This is the standard atomic-exchange operation, constraining both the compiler and the CPU to avoid optimizations that would reorder the uatomic_xchg() with code either preceding or following it.
  18. uatomic_and(addr, v): Atomically AND the value v into the location referenced by addr. This operation does not return any value, and imposes no ordering constraints on either the compiler or the CPU (this follows the convention in the Linux kernel that ordering is provided only if the atomic primitive returns a value). If such constraints are desired, they may be provided using cmm_smp_mb__before_uatomic_and() and cmm_smp_mb__after_uatomic_and().
  19. uatomic_or(addr, v): Atomically OR the value v into the location referenced by addr. This operation does not return any value, and imposes no ordering constraints on either the compiler or the CPU. If such constraints are desired, they may be provided using cmm_smp_mb__before_uatomic_or() and cmm_smp_mb__after_uatomic_or().
  20. uatomic_add_return(addr, v): Atomically add v to the location referenced by addr, returning the result. This is a standard add-and-fetch operation, constraining both the compiler and the CPU to avoid optimizations that would reorder the uatomic_add_return() with code either preceding or following it.
  21. uatomic_add(addr, v): Atomically add the value v to the location referenced by addr. This operation does not return any value, and imposes no ordering constraints on either the compiler or the CPU. If such constraints are desired, they may be provided using cmm_smp_mb__before_uatomic_add() and cmm_smp_mb__after_uatomic_add().
  22. uatomic_inc(addr): Atomically increment the location referenced by addr. This operation does not return any value, and imposes no ordering constraints on either the compiler or the CPU. If such constraints are desired, they may be provided using cmm_smp_mb__before_uatomic_inc() and cmm_smp_mb__after_uatomic_inc().
  23. uatomic_dec(addr): Atomically decrement the location referenced by addr. This operation does not return any value, and imposes no ordering constraints on either the compiler or the CPU. If such constraints are desired, they may be provided using cmm_smp_mb__before_uatomic_dec() and cmm_smp_mb__after_uatomic_dec().

Unlike the Linux kernel, user-space RCU's atomic operations are type-generic, supporting multiple operand sizes. In some cases, hardware will restrict the available sizes. If 8-bit atomics are supported, the UATOMIC_HAS_ATOMIC_BYTE C-preprocessor macro will be defined, and if 16-bit atomics are supported, the UATOMIC_HAS_ATOMIC_SHORT macro will be defined. In addition, 32-bit implementations support 32-bit operands and 64-bit implementations support both 32-bit and 64-bit operands. Therefore, atomic operations on ints and longs are portable. Finally, the operands for atomic operations must be properly aligned, which means that you should not attempt to use atomic operations on fields within packed structures.

Although these operations can be used directly, they are provided primarily for use by the rest of the library; in the common case, applications should instead use the higher-level operations defined elsewhere in the library.




Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds