Documentation/kdbus.txt (from the initial patch set)

D-Bus is a system for powerful, easy to use interprocess communication (IPC).

The focus of this document is an overview of the low-level, native kernel D-Bus
transport called kdbus. Kdbus in the kernel acts much like a device driver;
all communication between processes takes place over special character device
nodes in /dev/kdbus/.

For the general D-Bus protocol specification, the payload format, the
marshaling, and the communication semantics, please refer to:
  http://dbus.freedesktop.org/doc/dbus-specification.html

For a kdbus specific userspace library implementation please refer to:
  http://cgit.freedesktop.org/systemd/systemd/tree/src/syst...

Articles about D-Bus and kdbus:
  http://lwn.net/Articles/580194/


1. Terminology
===============================================================================

  Domain:
    A domain is a named object containing a number of buses. A system
    container that contains its own init system and users usually also
    runs in its own kdbus domain. The /dev/kdbus/domain/<container-name>/
    directory shows up inside the domain as /dev/kdbus/. Every domain offers
    its own "control" device node to create new buses or new sub-domains.
    Domains have no connection to each other and can neither see nor talk to
    each other. See section 5 for more details.

  Bus:
    A bus is a named object inside a domain. Clients exchange messages
    over a bus. Multiple buses themselves have no connection to each other;
    messages can only be exchanged on the same bus. The default entry point to
    a bus, to which clients connect, is the "bus" device node
    /dev/kdbus/<bus name>/bus.
    Common operating system setups create one "system bus" per system, and one
    "user bus" for every logged-in user. Applications or services may create
    their own private named buses. See section 5 for more details.

  Endpoint:
    An endpoint provides the device node to talk to a bus. Opening an
    endpoint creates a new connection to the bus to which the endpoint belongs.
    Every bus has a default endpoint called "bus".
    A bus can optionally offer additional endpoints with custom names to
    provide a restricted access to the same bus. Custom endpoints carry
    additional policy which can be used to give sandboxed processes only
    a locked-down, limited, filtered access to the same bus.
    See section 5 for more details.

  Connection:
    A connection to a bus is created by opening an endpoint device node of
    a bus and becoming an active client with the HELLO exchange. Every
    connected client connection has a unique identifier on the bus and can
    address messages to every other connection on the same bus by using
    the peer's connection id as the destination.
    See section 6 for more details.

  Pool:
    Each connection allocates a piece of shmem-backed memory that is used
    to receive messages and answers to ioctl commands from the kernel. It is
    never used to send anything to the kernel. In order to access that memory,
    userspace must mmap() it into its task.
    See section 12 for more details.

  Well-known Name:
    A connection can, in addition to its implicit unique connection id, request
    the ownership of a textual well-known name. Well-known names are noted in
    reverse-domain notation, such as com.example.service1. Connections offering
    a service on a bus are usually reached by their well-known name. The analogy
    of connection id and well-known name is an IP address and a DNS name
    associated with that address.

  Message:
    Connections can exchange messages with other connections by addressing
    the peers with their connection id or well-known name. A message consists
    of a message header with kernel-specific information on how to route the
    message, and the message payload, which is a logical byte stream of
    arbitrary size. Messages can carry additional file descriptors to be passed
    from one connection to another. Every connection can specify which set of
    metadata the kernel should attach to the message when it is delivered
    to the receiving connection. Metadata contains information like: system
    timestamps, uid, gid, tid, proc-starttime, well-known-names, process comm,
    process exe, process argv, cgroup, capabilities, seclabel, audit session,
    loginuid and the connection's human-readable name.
    See sections 7 and 13 for more details.

  Item:
    The API of kdbus implements a notion of items, submitted through and
    returned by most ioctls, and stored inside data structures in the
    connection's pool. See section 4 for more details.

  Broadcast and Match:
    Broadcast messages are potentially sent to all connections of a bus. By
    default, the connections will not actually receive any of the sent
    broadcast messages; a broadcast message passes this filter only after the
    connection has installed a match for specific message properties.
    See section 10 for more details.

  Policy:
    A policy is a set of rules that define which connections can see, talk to,
    or register a well-known name on the bus. A policy is attached to buses and
    custom endpoints, and modified by policy holder connections or owners of
    custom endpoints. See section 11 for more details.

    Access rules that control who can see a name on the bus are only checked
    on custom endpoints. Policies may be defined with names that end with '.*'.
    When matching a well-known name against such a wildcard entry, the last
    part of the name is dropped and the remainder is compared with the wildcard
    name without the trailing '.*'. See section 11 for more details.

  Privileged bus users:
    A user connecting to the bus is considered privileged if it is either the
    creator of the bus, or if it has the CAP_IPC_OWNER capability flag set.
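
The wildcard rule described under "Policy" above can be sketched in userspace
as follows. This is only an illustration of the matching rule as worded in
this document, not the kernel's implementation; policy_entry_matches() is a
hypothetical helper name and is not part of the kdbus API.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the well-known name 'name' is
 * covered by the policy entry 'entry'. An entry ending in ".*" matches
 * when 'name', with its last dot-separated part removed, equals the entry
 * without the trailing ".*". Any other entry must match exactly.
 */
static bool policy_entry_matches(const char *entry, const char *name)
{
	size_t elen = strlen(entry);

	if (elen >= 2 && strcmp(entry + elen - 2, ".*") == 0) {
		const char *last_dot = strrchr(name, '.');
		size_t prefix_len;

		if (!last_dot)
			return false;

		/* Length of 'name' without its last part and the dot. */
		prefix_len = (size_t)(last_dot - name);
		return prefix_len == elen - 2 &&
		       strncmp(name, entry, prefix_len) == 0;
	}

	return strcmp(entry, name) == 0;
}
```

Under this reading, the entry "com.example.*" covers "com.example.service1",
but neither "com.example" itself nor "com.example.a.b".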


2. Device Node Layout
===============================================================================

The kdbus interface is exposed through device nodes in /dev.

  /sys/bus/kdbus
  `-- devices
    |-- kdbus!0-system!bus -> ../../../devices/virtual/kdbus/kdbus!0-system!bus
    |-- kdbus!2702-user!bus -> ../../../devices/virtual/kdbus/kdbus!2702-user!bus
    |-- kdbus!2702-user!ep.app -> ../../../devices/virtual/kdbus/kdbus!2702-user!ep.app
    `-- kdbus!control -> ../../../devices/kdbus!control

  /dev/kdbus
  |-- control
  |-- 0-system
  |   |-- bus
  |   `-- ep.apache
  |-- 1000-user
  |   `-- bus
  |-- 2702-user
  |   |-- bus
  |   `-- ep.app
  `-- domain
      |-- fedoracontainer
      |   |-- control
      |   |-- 0-system
      |   |   `-- bus
      |   `-- 1000-user
      |       `-- bus
      `-- mydebiancontainer
          |-- control
          `-- 0-system
              `-- bus

Note:
  The device node subdirectory layout is arranged so that a future version of
  kdbus could be implemented as a file system with a separate instance mounted
  for each domain. For any future changes, this always needs to be kept
  in mind. Also the dependency on udev's userspace hookups or sysfs attribute
  use should be limited to the absolute minimum for the same reason.
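
Following the /dev/kdbus tree shown above, the path of an endpoint device node
can be assembled from the bus name and the endpoint name, optionally prefixed
by a domain name when looking at a sub-domain from its parent. The helper
below is a minimal sketch; ex_endpoint_path() is a hypothetical name used for
illustration only.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes the device node path of endpoint 'ep' on
 * bus 'bus' into 'buf'. If 'domain' is not NULL, the path is the one
 * seen from the parent domain. Returns 0 on success, -1 if the path
 * does not fit into 'buf'.
 */
static int ex_endpoint_path(char *buf, size_t len, const char *domain,
			    const char *bus, const char *ep)
{
	int n;

	if (domain)
		n = snprintf(buf, len, "/dev/kdbus/domain/%s/%s/%s",
			     domain, bus, ep);
	else
		n = snprintf(buf, len, "/dev/kdbus/%s/%s", bus, ep);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```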


3. Data Structures and flags
===============================================================================

3.1 Data structures and interconnections
----------------------------------------

  +-------------------------------------------------------------------------+
  | Domain (Init Domain)                                                    |
  | /dev/kdbus/control                                                      |
  | +---------------------------------------------------------------------+ |
  | | Bus (System Bus)                                                    | |
  | | /dev/kdbus/0-system/                                                | |
  | | +-------------------------------+ +-------------------------------+ | |
  | | | Endpoint                      | | Endpoint                      | | |
  | | | /dev/kdbus/0-system/bus       | | /dev/kdbus/0-system/ep.app    | | |
  | | +-------------------------------+ +-------------------------------+ | |
  | | +--------------+ +--------------+ +--------------+ +--------------+ | |
  | | | Connection   | | Connection   | | Connection   | | Connection   | | |
  | | | :1.22        | | :1.25        | | :1.55        | | :1.81        | | |
  | | +--------------+ +--------------+ +--------------+ +--------------+ | |
  | +---------------------------------------------------------------------+ |
  |                                                                         |
  | +---------------------------------------------------------------------+ |
  | | Bus (User Bus for UID 2702)                                         | |
  | | /dev/kdbus/2702-user/                                               | |
  | | +-------------------------------+ +-------------------------------+ | |
  | | | Endpoint                      | | Endpoint                      | | |
  | | | /dev/kdbus/2702-user/bus      | | /dev/kdbus/2702-user/ep.app   | | |
  | | +-------------------------------+ +-------------------------------+ | |
  | | +--------------+ +--------------+ +--------------+ +--------------+ | |
  | | | Connection   | | Connection   | | Connection   | | Connection   | | |
  | | | :1.22        | | :1.25        | | :1.55        | | :1.81        | | |
  | | +--------------+ +--------------+ +--------------+ +--------------+ | |
  | +---------------------------------------------------------------------+ |
  |                                                                         |
  | +---------------------------------------------------------------------+ |
  | | Domain (Container; inside it, fedoracontainer/ becomes /dev/kdbus/) | |
  | | /dev/kdbus/domain/fedoracontainer/control                           | |
  | | +-----------------------------------------------------------------+ | |
  | | | Bus (System Bus of "fedoracontainer")                           | | |
  | | | /dev/kdbus/domain/fedoracontainer/0-system/                     | | |
  | | | +-----------------------------+                                 | | |
  | | | | Endpoint                    |                                 | | |
  | | | | /dev/.../0-system/bus       |                                 | | |
  | | | +-----------------------------+                                 | | |
  | | | +-------------+ +-------------+                                 | | |
  | | | | Connection  | | Connection  |                                 | | |
  | | | | :1.22       | | :1.25       |                                 | | |
  | | | +-------------+ +-------------+                                 | | |
  | | +-----------------------------------------------------------------+ | |
  | |                                                                     | |
  | | +-----------------------------------------------------------------+ | |
  | | | Bus (User Bus for UID 2702 of "fedoracontainer")                | | |
  | | | /dev/kdbus/domain/fedoracontainer/2702-user/                    | | |
  | | | +-----------------------------+                                 | | |
  | | | | Endpoint                    |                                 | | |
  | | | | /dev/.../2702-user/bus      |                                 | | |
  | | | +-----------------------------+                                 | | |
  | | | +-------------+ +-------------+                                 | | |
  | | | | Connection  | | Connection  |                                 | | |
  | | | | :1.22       | | :1.25       |                                 | | |
  | | | +-------------+ +-------------+                                 | | |
  | | +-----------------------------------------------------------------+ | |
  | +---------------------------------------------------------------------+ |
  +-------------------------------------------------------------------------+

The above description uses the D-Bus notation of unique connection names that
adds a ":1." prefix to the connection's unique ID. kdbus itself doesn't
use that notation, neither internally nor externally. However, libraries and
other userspace code that aims for compatibility with D-Bus might.

3.2 Flags
---------

All ioctls used in the communication with the driver contain two 64-bit fields,
'flags' and 'kernel_flags'. In 'flags', the behavior of the command can be
tweaked, whereas in 'kernel_flags', the kernel driver writes back the mask of
supported bits upon each call, and sets the KDBUS_FLAGS_KERNEL bit. This is a
way to probe possible kernel features and make code forward and backward
compatible.

All bits that are not recognized by the kernel in 'flags' are rejected, and the
ioctl fails with -EINVAL.
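
A caller could use the mask written back in 'kernel_flags' to probe for a
feature along the following lines. This is a sketch under assumptions: the
bit value chosen for EX_KDBUS_FLAGS_KERNEL is illustrative only (the real
value of KDBUS_FLAGS_KERNEL comes from the kdbus header), and
ex_flag_supported() is a hypothetical helper.

```c
#include <stdint.h>

/* Assumed value, for illustration only; use the one from kdbus.h. */
#define EX_KDBUS_FLAGS_KERNEL (1ULL << 63)

/*
 * Returns 1 if 'flag' is part of the mask the kernel wrote back, 0 if it
 * is not, and -1 if 'kernel_flags' was not filled in by the kernel at all
 * (the marker bit is missing).
 */
static int ex_flag_supported(uint64_t kernel_flags, uint64_t flag)
{
	if (!(kernel_flags & EX_KDBUS_FLAGS_KERNEL))
		return -1;

	return (kernel_flags & flag) == flag ? 1 : 0;
}
```

Checking the marker bit first lets a caller tell "the kernel reported that the
flag is unsupported" apart from "the field was never written".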


4. Items
===============================================================================

To flexibly augment transport structures used by kdbus, data blobs of type
struct kdbus_item are used. An item has a fixed-sized header that only stores
the type of the item and the overall size. The total size is variable: in some
cases it is defined by the item type, in other cases items can be of
arbitrary length (for instance, a string).

In the external kernel API, items are used for many ioctls to transport
optional information from userspace to kernelspace. They are also used for
information stored in a connection's pool, such as messages, name lists or
requested connection information.

In all such occasions where items are used as part of the kdbus kernel API,
they are embedded in structs that have an overall size of their own, so there
can be many of them.

The kernel expects all items to be aligned to 8-byte boundaries.

A simple iterator in userspace would iterate over the items until the items
have reached the embedding structure's overall size. An example implementation
of such an iterator can be found in tools/testing/selftests/kdbus/kdbus-util.h.
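
For readers without the selftests at hand, such an iterator might look
roughly like the sketch below. It does not use the kdbus headers: struct
ex_item is a stand-in layout that assumes a 64-bit size followed by a 64-bit
type, with 'size' covering header plus payload and the next item starting at
the next 8-byte boundary. All ex_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct kdbus_item; the field layout is an assumption. */
struct ex_item {
	uint64_t size;	/* header plus payload, without padding */
	uint64_t type;
	uint8_t data[];
};

#define EX_ALIGN8(v) (((v) + 7ULL) & ~7ULL)

/*
 * Walks the item list starting at 'items' (which must be 8-byte aligned)
 * and spanning 'total' bytes. Returns the number of items of type 'type',
 * or -1 if an item header is truncated or reports an impossible size.
 */
static int ex_count_items(const void *items, uint64_t total, uint64_t type)
{
	const uint8_t *p = items;
	uint64_t off = 0;
	int n = 0;

	while (off < total) {
		const struct ex_item *it;

		if (total - off < sizeof(struct ex_item))
			return -1;

		it = (const struct ex_item *)(p + off);
		if (it->size < sizeof(*it) || it->size > total - off)
			return -1;

		if (it->type == type)
			n++;

		/* Items are aligned to 8 bytes; skip the padding too. */
		off += EX_ALIGN8(it->size);
	}

	return n;
}
```

Here 'total' is the size of the item area, that is, the embedding
structure's overall size minus the size of its fixed part.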


5. Creation of new domains, buses and endpoints
===============================================================================

The initial kdbus domain is unconditionally created by the kernel module. A
domain contains a "control" device node which can be used to create a new bus
or domain. New domains do not have any buses created by default.


5.1 Domains and buses
---------------------

Opening the control device node returns a file descriptor that accepts the
ioctls KDBUS_CMD_BUS_MAKE and KDBUS_CMD_DOMAIN_MAKE, which specify the name of
the new bus or domain to create. The control file descriptor needs to be kept
open for the entire lifetime of the created bus or domain; closing it will
immediately clean up the entire bus or domain and all its associated
resources and connections. Every control file descriptor can only be used once
to create a new bus or domain; from that point, it is not used for any
further communication until the final close().

Each bus will generate a random, 128-bit UUID upon creation. It will be
returned to the creators of connections through kdbus_cmd_hello.id128 and can
be used by userspace to uniquely identify buses, even across different machines
or containers. The UUID will have its variant bits set to 'DCE', and denote
version 4 (random).
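
The two properties named above can be checked on the 16 bytes returned in
kdbus_cmd_hello.id128 as sketched below. The sketch assumes the bytes are
stored in the usual RFC 4122 octet order, where the version lives in the high
nibble of octet 6 and the variant in the top bits of octet 8;
ex_is_random_dce_uuid() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if 'id128' carries version 4 (random) and the DCE
 * (RFC 4122) variant, assuming RFC 4122 octet order.
 */
static bool ex_is_random_dce_uuid(const uint8_t id128[16])
{
	bool version4 = (id128[6] & 0xf0) == 0x40;	/* 0100xxxx */
	bool variant_dce = (id128[8] & 0xc0) == 0x80;	/* 10xxxxxx */

	return version4 && variant_dce;
}
```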

When a new domain is created, its structure in /dev/kdbus/domain/<name>/ is a
replication of what's initially created in /dev/kdbus. In fact, internally,
a dummy default domain is set up when the driver is loaded. This allows
userspace to bind-mount domain subtrees of /dev/kdbus into a container's
filesystem view, and hence achieve complete isolation from the host's domain
and those of other containers.


5.2 Endpoints
-------------

Endpoints are entry points to a bus. By default, each bus has a default
endpoint called 'bus'. The bus owner has the ability to create custom
endpoints with specific names, permissions, and policy databases (see below).

To create a custom endpoint, use the KDBUS_CMD_ENDPOINT_MAKE ioctl with struct
kdbus_cmd_make. Custom endpoints always have a policy db that, by default,
does not allow anything. Everything that users of this new endpoint should be
able to do has to be explicitly specified through KDBUS_ITEM_NAME and
KDBUS_ITEM_POLICY_ACCESS items.

5.3 Creating domains, buses and endpoints
-----------------------------------------

KDBUS_CMD_BUS_MAKE, KDBUS_CMD_DOMAIN_MAKE and KDBUS_CMD_ENDPOINT_MAKE take a
struct kdbus_cmd_make argument.

struct kdbus_cmd_make {
  __u64 size;
    The overall size of the struct, including its items.

  __u64 flags;
    The flags for creation.

    KDBUS_MAKE_ACCESS_GROUP
      Make the device node group-accessible

    KDBUS_MAKE_ACCESS_WORLD
      Make the device node world-accessible

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  struct kdbus_item items[0];
    A list of items, only used for creating custom endpoints. Ignored for
    buses and domains.
};


6. Connections
===============================================================================


6.1 Connection IDs and well-known connection names
--------------------------------------------------

Connections are identified by their connection id, internally implemented as a
uint64_t counter. The IDs of every newly created bus start at 1, and every new
connection will increment the counter by 1. The ids are not reused.

In higher level tools, the user visible representation of a connection is
defined by the D-Bus protocol specification as ":1.<id>".
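
A library that wants to present kdbus connection ids in that D-Bus notation
could format them as in the following sketch. ex_format_unique_name() is a
hypothetical helper and not part of kdbus or of any D-Bus library.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Writes the ":1.<id>" representation of a connection id into 'buf'.
 * Returns the length of the string, or -1 if it does not fit.
 */
static int ex_format_unique_name(char *buf, size_t len, uint64_t id)
{
	int n = snprintf(buf, len, ":1.%" PRIu64, id);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```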

Messages with a specific uint64_t destination id are directly delivered to
the connection with the corresponding id. Messages with the special destination
id KDBUS_DST_ID_BROADCAST are broadcast messages and are potentially delivered
to all known connections on the bus; clients interested in broadcast messages
need to subscribe to the specific messages they are interested in, though,
before any broadcast message reaches them.

Messages synthesized and sent directly by the kernel will carry the special
source id KDBUS_SRC_ID_KERNEL (0).

In addition to the unique uint64_t connection id, established connections can
request the ownership of well-known names, under which they can be found and
addressed by other bus clients. A well-known name is associated with one and
only one connection at a time. See section 8 on name acquisition and the
name registry, and the validity of names.

Messages can specify the special destination id 0 and carry a well-known name
in the message data. Such a message is delivered to the destination connection
which owns that well-known name.

  +-------------------------------------------------------------------------+
  | +---------------+     +---------------------------+                     |
  | | Connection    |     | Message                   | -----------------+  |
  | | :1.22         | --> | src: 22                   |                  |  |
  | |               |     | dst: 25                   |                  |  |
  | |               |     |                           |                  |  |
  | |               |     |                           |                  |  |
  | |               |     +---------------------------+                  |  |
  | |               |                                                    |  |
  | |               | <--------------------------------------+           |  |
  | +---------------+                                        |           |  |
  |                                                          |           |  |
  | +---------------+     +---------------------------+      |           |  |
  | | Connection    |     | Message                   | -----+           |  |
  | | :1.25         | --> | src: 25                   |                  |  |
  | |               |     | dst: 0xffffffffffffffff   | -------------+   |  |
  | |               |     |  (KDBUS_DST_ID_BROADCAST) |              |   |  |
  | |               |     |                           | ---------+   |   |  |
  | |               |     +---------------------------+          |   |   |  |
  | |               |                                            |   |   |  |
  | |               | <--------------------------------------------------+  |
  | +---------------+                                            |   |      |
  |                                                              |   |      |
  | +---------------+     +---------------------------+          |   |      |
  | | Connection    |     | Message                   | --+      |   |      |
  | | :1.55         | --> | src: 55                   |   |      |   |      |
  | |               |     | dst: 0 / org.foo.bar      |   |      |   |      |
  | |               |     |                           |   |      |   |      |
  | |               |     |                           |   |      |   |      |
  | |               |     +---------------------------+   |      |   |      |
  | |               |                                     |      |   |      |
  | |               | <------------------------------------------+   |      |
  | +---------------+                                     |          |      |
  |                                                       |          |      |
  | +---------------+                                     |          |      |
  | | Connection    |                                     |          |      |
  | | :1.81         |                                     |          |      |
  | | org.foo.bar   |                                     |          |      |
  | |               |                                     |          |      |
  | |               |                                     |          |      |
  | |               | <-----------------------------------+          |      |
  | |               |                                                |      |
  | |               | <----------------------------------------------+      |
  | +---------------+                                                       |
  +-------------------------------------------------------------------------+
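
The three addressing modes shown in the diagram can be summarized in code as
follows. This is a sketch of the routing decision as described in this
section, not kernel code. The constant values follow the text and diagram
above (0 for delivery by well-known name, 0xffffffffffffffff for broadcast);
the EX_* names and ex_classify_dst() are hypothetical.

```c
#include <stdint.h>

#define EX_DST_ID_NAME		0ULL	/* destination given by name */
#define EX_DST_ID_BROADCAST	(~0ULL)	/* 0xffffffffffffffff */

enum ex_route {
	EX_ROUTE_UNICAST,	/* deliver to the connection with this id */
	EX_ROUTE_BY_NAME,	/* deliver to the owner of the name */
	EX_ROUTE_BROADCAST,	/* deliver to all matching connections */
	EX_ROUTE_INVALID,	/* id 0 without a well-known name */
};

/*
 * Classifies a message by its destination id and the optional well-known
 * name carried in the message data ('name' may be NULL).
 */
static enum ex_route ex_classify_dst(uint64_t dst_id, const char *name)
{
	if (dst_id == EX_DST_ID_BROADCAST)
		return EX_ROUTE_BROADCAST;

	if (dst_id == EX_DST_ID_NAME)
		return (name && name[0]) ? EX_ROUTE_BY_NAME
					 : EX_ROUTE_INVALID;

	return EX_ROUTE_UNICAST;
}
```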


6.2 Creating connections
------------------------

A connection to a bus is created by opening an endpoint device node of
a bus and becoming an active client with the KDBUS_CMD_HELLO ioctl. Every
connected client connection has a unique identifier on the bus and can
address messages to every other connection on the same bus by using
the peer's connection id as the destination.

The KDBUS_CMD_HELLO ioctl takes the following struct as argument.

struct kdbus_cmd_hello {
  __u64 size;
    The overall size of the struct, including all attached items.

  __u64 conn_flags;
    Flags to apply to this connection:

    KDBUS_HELLO_ACCEPT_FD
      When this flag is set, the connection can be sent file descriptors
      as message payload. If it's not set, any attempt to do so will
      result in -ECOMM on the sender's side.

    KDBUS_HELLO_ACTIVATOR
      Make this connection an activator (see below). With this bit set,
      an item of type KDBUS_ITEM_NAME has to be attached which describes
      the well-known name this connection should be an activator for.

    KDBUS_HELLO_POLICY_HOLDER
      Make this connection a policy holder (see below). With this bit set,
      an item of type KDBUS_ITEM_NAME has to be attached which describes
      the well-known name this connection should hold a policy for.

    KDBUS_HELLO_MONITOR
      Make this connection an eaves-dropping connection that receives all
      unicast messages sent on the bus. To also receive broadcast messages,
      the connection has to upload appropriate matches as well.
      This flag is only valid for privileged bus connections.

  __u64 attach_flags;
      Request the attachment of metadata for each message received by this
      connection. The metadata actually attached may augment the list
      of requested items. See section 13 for more details.

  __u64 bus_flags;
      Upon successful completion of the ioctl, this member will contain the
      flags of the bus it connected to.

  __u64 id;
      Upon successful completion of the ioctl, this member will contain the
      id of the new connection.

  __u64 pool_size;
      The size of the communication pool, in bytes. The pool can be accessed
      by calling mmap() on the file descriptor that was used to issue the
      KDBUS_CMD_HELLO ioctl.

  struct kdbus_bloom_parameter bloom;
      Bloom filter parameter (see below).

  __u8 id128[16];
      Upon successful completion of the ioctl, this member will contain the
      128 bit wide UUID of the connected bus.

  struct kdbus_item items[0];
      Variable list of items to add optional additional information. The
      following items are currently expected/valid:

      KDBUS_ITEM_CONN_NAME
        Contains a string that describes this connection's name, so it can be
        identified later.

      KDBUS_ITEM_NAME
      KDBUS_ITEM_POLICY_ACCESS
        For activators and policy holders only, combinations of these two
        items describe policy access entries (see section about policy db).

      KDBUS_ITEM_CREDS
      KDBUS_ITEM_SECLABEL
        Privileged bus users may submit these types in order to create
        connections with faked credentials. The only real use case for this
        is a proxy service which acts on behalf of some other tasks. For a
        connection that runs in that mode, the message's metadata items will
        be limited to what's specified here. See section 13 for more
        information.

      Items of other types are silently ignored.
};


6.3 Activator and policy holder connection
------------------------------------------

An activator connection is a placeholder for a well-known name. Messages sent
to such a connection can be used by userspace to start an implementor
connection, which will then get all the messages from the activator copied
over. An activator connection cannot be used to send any message.

A policy holder connection only installs a policy for one or more names.
These policy entries are kept active as long as the connection is alive, and
are removed once it terminates. Such a policy connection type can be used to
deploy restrictions for names that are not yet active on the bus. A policy
holder connection cannot be used to send any message.

The creation of activator, policy holder or monitor connections is an operation
restricted to privileged users on the bus (see section "Terminology").


6.4 Retrieving information on a connection
------------------------------------------

The KDBUS_CMD_CONN_INFO ioctl can be used to retrieve credentials and
properties of the initial creator of a connection. This ioctl uses the
following struct:

struct kdbus_cmd_info {
  __u64 size;
    The overall size of the struct, including the name with its 0-byte string
    terminator.

  __u64 flags;
    Specify which items should be attached to the answer.
    The following flags can be used:

    KDBUS_ATTACH_NAMES
      Add an item to the answer containing all the names the connection
      currently owns.

    KDBUS_ATTACH_CONN_NAME
      Add an item to the answer containing the connection's name.

    After the ioctl returns, this field will contain the current metadata
    attach flags of the connection.

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  __u64 id;
    The connection's numerical ID to retrieve information for. If set to a
    non-zero value, the well-known name passed in 'items' is ignored.

  __u64 offset;
    When the ioctl returns, this value will yield the offset of the connection
    information inside the caller's pool.

  struct kdbus_item items[0];
    The optional item list, containing the well-known name to look up as
    a KDBUS_ITEM_NAME. Only required if the 'id' field is set to 0.
    All other items are currently ignored.
};

After the ioctl returns, the following struct will be stored in the caller's
pool at 'offset'.

struct kdbus_info {
  __u64 size;
    The overall size of the struct, including all its items.

  __u64 id;
    The connection's unique ID.

  __u64 flags;
    The connection's flags as specified when it was created.

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  struct kdbus_item items[0];
    Depending on the 'flags' field in struct kdbus_cmd_info, items of
    types KDBUS_ITEM_NAME and KDBUS_ITEM_CONN_NAME follow here.
};

Once the caller is finished with parsing the return buffer, it needs to call
KDBUS_CMD_FREE for the offset.


6.5 Getting information about a connection's bus creator
--------------------------------------------------------

The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
the bus the connection is attached to. The metadata returned by this call is
collected during the creation of the bus and is never altered afterwards, so
it provides pristine information on the task that created the bus, at the
moment when it did so.

In response to this call, a slice in the connection's pool is allocated and
filled with an object of type struct kdbus_info, pointed to by the ioctl's
'offset' field.

struct kdbus_info {
  __u64 size;
    The overall size of the struct, including all its items.

  __u64 id;
    The bus' ID

  __u64 flags;
    The bus' flags as specified when it was created.

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  struct kdbus_item items[0];
    Metadata information is stored in items here.
};

Once the caller is finished with parsing the return buffer, it needs to call
KDBUS_CMD_FREE for the offset.


6.6 Updating connection details
-------------------------------

Some of a connection's details can be updated with the KDBUS_CMD_CONN_UPDATE
ioctl, using the file descriptor that was used to create the connection.
The update command uses the following struct.

struct kdbus_cmd_update {
  __u64 size;
    The overall size of the struct, including all its items.

  struct kdbus_item items[0];
    Items to describe the connection details to be updated. The following item
    types are supported:

    KDBUS_ITEM_ATTACH_FLAGS
      Supply a new set of items to be attached to each message.

    KDBUS_ITEM_NAME
    KDBUS_ITEM_POLICY_ACCESS
      Policy holder connections may supply a new set of policy information
      with these items. For other connection types, -EOPNOTSUPP is returned.
};


6.7 Termination
---------------

A connection can be terminated by simply closing the file descriptor that was
used to start the connection. All pending incoming messages will be discarded,
and the memory in the pool will be freed.

An alternative way of closing down a connection is calling the
KDBUS_CMD_BYEBYE ioctl on it, which will only succeed if the message queue
of the connection is empty at the time of closing; otherwise, -EBUSY is
returned.

When this ioctl returns successfully, the connection has been terminated and
won't accept any new messages from remote peers. This way, a connection can
be terminated race-free, without losing any messages.


7. Messages
===============================================================================

Messages consist of a fixed-size header followed directly by a list of
variable-sized data 'items'. The overall message size is specified in the
header of the message. The chain of data items can contain well-defined
message metadata fields, raw data, references to data, or file descriptors.


7.1 Sending messages
--------------------

Messages are passed to the kernel with the KDBUS_CMD_MSG_SEND ioctl. Depending
on the destination address of the message, the kernel delivers the message
to the specific destination connection or to all connections on the same bus.
Sending messages across buses is not possible. Messages are always queued in
the memory pool of the destination connection (see below).

The KDBUS_CMD_MSG_SEND ioctl uses struct kdbus_msg to describe the message to
be sent.

struct kdbus_msg {
  __u64 size;
    The overall size of the struct, including the attached items.

  __u64 flags;
    Flags for message delivery:

    KDBUS_MSG_FLAGS_EXPECT_REPLY
      Expect a reply from the remote peer to this message. With this bit set,
      the timeout_ns field must be set to a non-zero number of nanoseconds in
      which the receiving peer is expected to reply. If such a reply is not
      received in time, the sender will be notified with a timeout message
      (see below). The value must be an absolute value, in nanoseconds and
      based on CLOCK_MONOTONIC.

      For a message to be accepted as reply, it must be a direct message to
      the original sender (not a broadcast), and its kdbus_msg.cookie_reply
      must match the previous message's kdbus_msg.cookie.

      Expected replies also temporarily open the policy of the sending
      connection, so the other peer is allowed to respond within the given
      time window.

    KDBUS_MSG_FLAGS_SYNC_REPLY
      By default, all calls to kdbus are considered asynchronous,
      non-blocking. However, as there are many use cases that need to wait
      for a remote peer to answer a method call, there's a way to send a
      message and wait for a reply in a synchronous fashion. This is what
      KDBUS_MSG_FLAGS_SYNC_REPLY controls. The KDBUS_CMD_MSG_SEND ioctl
      will block until the reply has arrived, the timeout limit is reached,
      the remote connection is shut down, or the call is interrupted by a
      signal before any reply arrives; see signal(7).

      The offset of the reply message in the sender's pool is stored in
      'offset_reply' when the ioctl has returned without error. Hence,
      there is no need for another KDBUS_CMD_MSG_RECV ioctl or anything else
      to receive the reply.

    KDBUS_MSG_FLAGS_NO_AUTO_START
      By default, when a message is sent to an activator connection, the
      activator is notified and will start an implementor. This flag inhibits
      that behavior. With this bit set, and the remote being an activator,
      -EADDRNOTAVAIL is returned from the ioctl.

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call of
    KDBUS_CMD_MSG_SEND.

  __s64 priority;
    The priority of this message. Receiving messages (see below) may
    optionally be constrained to messages of a minimal priority. This
    allows for use cases where timing critical data is interleaved with
    control data on the same connection. If unused, the priority should be
    set to zero.

  __u64 dst_id;
    The numeric ID of the destination connection, or KDBUS_DST_ID_BROADCAST
    (~0ULL) to address every peer on the bus, or KDBUS_DST_ID_NAME (0) to look
    it up dynamically from the bus' name registry. In the latter case, an item
    of type KDBUS_ITEM_DST_NAME is mandatory.

  __u64 src_id;
    Upon return of the ioctl, this member will contain the sending
    connection's numerical ID. Should be 0 at send time.

  __u64 payload_type;
    Type of the payload in the actual data records. Currently, only
    KDBUS_PAYLOAD_DBUS is accepted as input value of this field. When
    receiving messages that are generated by the kernel (notifications),
    this field will yield KDBUS_PAYLOAD_KERNEL.

  __u64 cookie;
    Cookie of this message, for later recognition. Also, when replying
    to a message (see above), the cookie_reply field must match this value.

  __u64 timeout_ns;
    If the message sent requires a reply from the remote peer (see above),
    this field contains the timeout in absolute nanoseconds based on
    CLOCK_MONOTONIC.

  __u64 cookie_reply;
    If the message sent is a reply to another message, this field must
    match the cookie of the formerly received message.

  __u64 offset_reply;
    If the message successfully got a synchronous reply (see above), this
    field will yield the offset of the reply message in the sender's pool.
    This is what KDBUS_CMD_MSG_RECV usually does for asynchronous messages.

  struct kdbus_item items[0];
    A dynamically sized list of items to contain additional information.
    The following items are expected/valid:

    KDBUS_ITEM_PAYLOAD_VEC
    KDBUS_ITEM_PAYLOAD_MEMFD
    KDBUS_ITEM_FDS
      Actual data records containing the payload. See section "Passing of
      Payload Data".

    KDBUS_ITEM_BLOOM_FILTER
      Bloom filter for matches (see below).

    KDBUS_ITEM_DST_NAME
      Well-known name to send this message to. Required if dst_id is set
      to KDBUS_DST_ID_NAME. If a connection holding the given name can't
      be found, -ESRCH is returned.
      For messages to a unique name (ID), this item is optional. If present,
      the kernel will make sure the name owner matches the given unique name.
      This allows userspace to tie the message sending to the condition that a
      name is currently owned by a certain unique name.
};

The message will be augmented by the requested metadata items when queued into
the receiver's pool. See also section 13.1 ("Metadata and namespaces").
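
As a rough illustration of the timeout and reply rules above, the sketch below
computes an absolute CLOCK_MONOTONIC deadline for timeout_ns and checks whether
an incoming message qualifies as a reply. struct msg_hdr is a reduced,
hypothetical stand-in for struct kdbus_msg that carries only fields named in
this section; the helper names are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define DST_ID_BROADCAST (~0ULL)	/* documented value of KDBUS_DST_ID_BROADCAST */

/* Hypothetical subset of struct kdbus_msg, for illustration only. */
struct msg_hdr {
	uint64_t dst_id;
	uint64_t src_id;
	uint64_t cookie;
	uint64_t cookie_reply;
	uint64_t timeout_ns;
};

/* Absolute deadline: current CLOCK_MONOTONIC time plus a relative timeout. */
static uint64_t deadline_ns(uint64_t relative_ns)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL +
	       (uint64_t)ts.tv_nsec + relative_ns;
}

/*
 * A message counts as a reply if it is addressed directly to the original
 * sender (not a broadcast) and its cookie_reply equals the request's cookie.
 * 'request' is the header as returned by the send ioctl, with src_id filled
 * in by the kernel.
 */
static bool is_reply_to(const struct msg_hdr *request,
			const struct msg_hdr *incoming)
{
	assert(request != NULL && incoming != NULL);
	return incoming->dst_id != DST_ID_BROADCAST &&
	       incoming->dst_id == request->src_id &&
	       incoming->cookie_reply == request->cookie;
}
```

A sender would store the result of deadline_ns() in timeout_ns before issuing
the send ioctl with KDBUS_MSG_FLAGS_EXPECT_REPLY set.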


7.2 Message layout
------------------

The layout of a message is shown below.

  +-------------------------------------------------------------------------+
  | Message                                                                 |
  | +---------------------------------------------------------------------+ |
  | | Header                                                              | |
  | | size: overall message size, including the data records              | |
  | | destination: connection id of the receiver                          | |
  | | source: connection id of the sender (set by kernel)                 | |
  | | payload_type: "DBusDBus" textual identifier stored as uint64_t      | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | Data Record                                                         | |
  | | size: overall record size (without padding)                         | |
  | | type: type of data                                                  | |
  | | data: reference to data (address or file descriptor)                | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | padding bytes to the next 8 byte alignment                          | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | Data Record                                                         | |
  | | size: overall record size (without padding)                         | |
  | | ...                                                                 | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | padding bytes to the next 8 byte alignment                          | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | Data Record                                                         | |
  | | size: overall record size                                           | |
  | | ...                                                                 | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | padding bytes to the next 8 byte alignment                          | |
  | +---------------------------------------------------------------------+ |
  +-------------------------------------------------------------------------+
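
The record chain in the diagram can be walked as sketched below. This is only
an illustration: struct item_hdr is a hypothetical mirror of the record header
(size and type, both assumed to be __u64), and count_items() is a made-up
helper that expects a pointer to the first record after the message header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN8(v) (((v) + 7ULL) & ~7ULL)

/* Hypothetical mirror of the record header shown in the diagram. */
struct item_hdr {
	uint64_t size;	/* header plus data, without trailing padding */
	uint64_t type;
};

/*
 * Count the records in a region of 'len' bytes. Returns -1 if a record is
 * truncated or its size field is smaller than the record header.
 */
static int count_items(const uint8_t *items, size_t len)
{
	size_t pos = 0;
	int n = 0;

	assert(items != NULL);
	while (pos < len) {
		struct item_hdr h;

		if (len - pos < sizeof(h))
			return -1;
		memcpy(&h, items + pos, sizeof(h));
		if (h.size < sizeof(h) || h.size > len - pos)
			return -1;
		/* The next record starts on the next 8-byte boundary. */
		pos += ALIGN8(h.size);
		n++;
	}
	return n;
}
```

For example, a record with size 19 occupies 24 bytes in the chain, so the
following record begins 24 bytes after it.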


7.3 Passing of Payload Data
---------------------------

When connecting to the bus, receivers request a memory pool of a given size,
large enough to carry all backlog of data enqueued for the connection. The
pool is internally backed by a shared memory file which can be mmap()ed by
the receiver.

KDBUS_ITEM_PAYLOAD_VEC:
  Messages are directly copied by the sending process into the receiver's
  pool. That way, two peers can exchange data with effectively a single copy
  from one process to another; the kernel will not buffer the data anywhere
  else.

KDBUS_ITEM_PAYLOAD_MEMFD:
  Messages can reference memfd files which contain the data.
  memfd files are tmpfs-backed files that allow sealing of the content of the
  file, which prevents all writable access to the file content.
  Only sealed memfd files are accepted as payload data, which enforces
  reliable passing of data; the receiver can assume that neither the sender nor
  anyone else can alter the content after the message is sent.

Apart from the sender filling in the content of the memfd files, the data is
passed zero-copy from one process to another, read-only, and shared between
the peers.
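
A minimal sketch of preparing sealed payload follows. It assumes the
memfd_create(2) system call and the F_ADD_SEALS fcntl of mainline Linux, which
may differ from the memfd interface that accompanied this patch set, and it
does not show how the descriptor is attached to a message. The helper name and
the chosen set of seals are assumptions; this document does not state which
seals kdbus requires.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a memfd, fill it with 'len' bytes from 'data' and seal it so that
 * its content can no longer change. Returns the file descriptor, or -1 on
 * error.
 */
static int make_sealed_memfd(const void *data, size_t len)
{
	int fd;

	assert(data != NULL);
	fd = memfd_create("payload", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	if (write(fd, data, len) != (ssize_t)len ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW |
				   F_SEAL_WRITE | F_SEAL_SEAL) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

After sealing, further write() calls on the descriptor fail, which is what
lets a receiver rely on the content staying unchanged.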


7.4 Receiving messages
----------------------

Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The
endpoint device node of the bus supports poll() to wake up the receiving
process when new messages are queued up to be received.

With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used.

struct kdbus_cmd_recv {
  __u64 flags;
    Flags to control the receive command.

    KDBUS_RECV_PEEK
      Just return the location of the next message. Do not install file
      descriptors or anything else. This is usually used to determine the
      sender of the next queued message.

    KDBUS_RECV_DROP
      Drop the next message without doing anything else with it, and free the
      pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE.

    KDBUS_RECV_USE_PRIORITY
      Use the priority field (see below).

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  __s64 priority;
    With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in
    the queue with at least the given priority. If no such message is waiting
    in the queue, -ENOMSG is returned.

  __u64 offset;
    Upon return of the ioctl, this field contains the offset in the
    receiver's memory pool.
};

Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the
offset field contains the location of the new message inside the receiver's
pool. The message is stored as struct kdbus_msg at this offset, and can be
interpreted with the semantics described above.

Also, if the connection allowed for file descriptors to be passed
(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
returns. The receiving task is obliged to close all of them appropriately.

The caller is obliged to call KDBUS_CMD_FREE with the returned offset when
the memory is no longer needed.
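
Turning the returned offset into a message pointer might look like the sketch
below. It assumes the pool has already been mmap()ed by the receiver and that
the first __u64 of the message is its overall size, as described in section
7.1. msg_at() is a hypothetical helper; its bounds checks are purely defensive
and not something this document requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return a pointer to the message stored at 'offset' inside the mapped pool,
 * or NULL if the message would not fit inside the pool.
 */
static const uint8_t *msg_at(const uint8_t *pool, size_t pool_size,
			     uint64_t offset)
{
	uint64_t size;

	assert(pool != NULL);
	if (offset > pool_size || pool_size - offset < sizeof(size))
		return NULL;

	/* The message starts with its overall size. */
	memcpy(&size, pool + offset, sizeof(size));
	if (size < sizeof(size) || size > pool_size - offset)
		return NULL;

	return pool + offset;
}
```

Once the message has been processed, the same offset is what gets passed to
KDBUS_CMD_FREE.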


7.5 Canceling messages synchronously waiting for replies
--------------------------------------------------------

When a connection sends a message with KDBUS_MSG_FLAGS_SYNC_REPLY and
blocks while waiting for the reply, the KDBUS_CMD_MSG_CANCEL ioctl can be
used on the same file descriptor to cancel the message, based on its cookie.
If there are multiple messages with the same cookie that are all synchronously
waiting for a reply, all of them will be canceled. Obviously, this is only
possible in multi-threaded applications.


8. Name registry
===============================================================================

Each bus instantiates a name registry to resolve well-known names into unique
connection IDs for message delivery. The registry will be queried when a
message is sent with kdbus_msg.dst_id set to KDBUS_DST_ID_NAME, or when a
registry dump is requested.

All of the below is subject to policy rules for SEE and OWN permissions.


8.1 Name validity
-----------------

A name has to comply with the following rules to be considered valid:

 - The name has two or more elements separated by a period ('.') character
 - All elements must contain at least one character
 - Each element must only contain the ASCII characters "[A-Z][a-z][0-9]_"
   and must not begin with a digit
 - The name must contain at least one '.' (period) character
   (and thus at least two elements)
 - The name must not begin with a '.' (period) character
 - The name must not exceed KDBUS_NAME_MAX_LEN (255)
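
The rules above can be checked in userspace before a name is submitted. The
following is a sketch of such a check, not the kernel's implementation;
valid_name() is a made-up helper, and it assumes that the 255-byte limit
counts the characters of the name without the terminating 0-byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 255	/* documented value of KDBUS_NAME_MAX_LEN */

static bool valid_name(const char *name)
{
	bool element_start = true;	/* next character begins an element */
	bool seen_dot = false;
	size_t len, i;

	assert(name != NULL);
	len = strlen(name);
	if (len == 0 || len > NAME_MAX_LEN)
		return false;

	for (i = 0; i < len; i++) {
		char c = name[i];

		if (c == '.') {
			if (element_start)	/* leading '.' or empty element */
				return false;
			seen_dot = true;
			element_start = true;
		} else if ((c >= 'A' && c <= 'Z') ||
			   (c >= 'a' && c <= 'z') || c == '_') {
			element_start = false;
		} else if (c >= '0' && c <= '9') {
			if (element_start)	/* must not begin with a digit */
				return false;
			element_start = false;
		} else {
			return false;
		}
	}

	/* A trailing '.' would leave an empty last element. */
	return seen_dot && !element_start;
}
```

With this check, "org.foo.bar" and "org.foo1" are accepted, while "org",
".org.foo", "org..foo" and "org.1foo" are rejected.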


8.2 Acquiring a name
--------------------

To acquire a name, a client uses the KDBUS_CMD_NAME_ACQUIRE ioctl with the
following data structure.

struct kdbus_cmd_name {
  __u64 size;
    The overall size of this struct, including the name with its 0-byte string
    terminator.

  __u64 flags;
    Flags to control details in the name acquisition.

    KDBUS_NAME_REPLACE_EXISTING
      Acquiring a name that is already present usually fails, unless this flag
      is set in the call and KDBUS_NAME_ALLOW_REPLACEMENT (see below) was set
      when the current owner of the name acquired it, or if the current owner
      is an activator connection (see below).

    KDBUS_NAME_ALLOW_REPLACEMENT
      Allow other connections to take over this name. When this happens, the
      former owner of the name will be notified of the name loss.

    KDBUS_NAME_QUEUE (acquire)
      A name that is already acquired by a connection, and which wasn't
      requested with the KDBUS_NAME_ALLOW_REPLACEMENT flag set can not be
      acquired again. However, a connection can put itself in a queue of
      connections waiting for the name to be released. Once that happens, the
      first connection in that queue becomes the new owner and is notified
      accordingly.

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  struct kdbus_item items[0];
    Items to submit the name. Currently, only one item of type KDBUS_ITEM_NAME
    is expected and allowed, and the contained string must be a valid bus name.
};


8.3 Releasing a name
--------------------

A connection may release a name explicitly with the KDBUS_CMD_NAME_RELEASE
ioctl. If the connection was an implementor of an activatable name, its
pending messages are moved back to the activator. If there are any connections
queued up as waiters for the name, the oldest one of them will become the new
owner. The same happens implicitly for all names once a connection terminates.

The KDBUS_CMD_NAME_RELEASE ioctl uses the same data structure as the
acquisition call, but with slightly different field usage.

struct kdbus_cmd_name {
  __u64 size;
    The overall size of this struct, including the name with its 0-byte string
    terminator.

  __u64 flags;

  struct kdbus_item items[0];
    Items to submit the name. Currently, only one item of type KDBUS_ITEM_NAME
    is expected and allowed, and the contained string must be a valid bus name.
};


8.4 Dumping the name registry
-----------------------------

A connection may request a complete or filtered dump of currently active bus
names with the KDBUS_CMD_NAME_LIST ioctl, which takes a struct
kdbus_cmd_name_list as argument.

struct kdbus_cmd_name_list {
  __u64 flags;
    Any combination of flags to specify which names should be dumped.

    KDBUS_NAME_LIST_UNIQUE
      List the unique (numeric) IDs of the connection, whether it owns a name
      or not.

    KDBUS_NAME_LIST_NAMES
      List well-known names stored in the database which are actively owned by
      a real connection (not an activator).

    KDBUS_NAME_LIST_ACTIVATORS
      List names that are owned by an activator.

    KDBUS_NAME_LIST_QUEUED
      List connections that are not yet owning a name but are waiting for it
      to become available.

  __u64 offset;
    When the ioctl returns successfully, the offset to the name registry dump
    inside the connection's pool will be stored in this field.
};

The returned list of names is stored in a struct kdbus_name_list that in turn
contains a dynamic number of struct kdbus_name_info that carry the actual
information. The fields inside struct kdbus_name_info are described next.

struct kdbus_name_info {
  __u64 size;
    The overall size of this struct, including the name with its 0-byte string
    terminator.

  __u64 flags;
    The current flags for this name. Can be any combination of

    KDBUS_NAME_ALLOW_REPLACEMENT

    KDBUS_NAME_IN_QUEUE (list)
      When retrieving a list of currently acquired names in the registry, this
      flag indicates whether the connection actually owns the name or is
      currently waiting for it to become available.

    KDBUS_NAME_ACTIVATOR (list)
      An activator connection owns a name as a placeholder for an implementor,
      which is started on demand as soon as the first message arrives. There's
      some more information on this topic below. In contrast to
      KDBUS_NAME_REPLACE_EXISTING, when a name is taken over from an activator
      connection, all the messages that have been queued in the activator
      connection will be moved over to the new owner. The activator connection
      will still be tracked for the name and will take control again if the
      implementor connection terminates.
      This flag can not be used when acquiring a name, but is implicitly set
      through KDBUS_CMD_HELLO with KDBUS_HELLO_ACTIVATOR set in
      kdbus_cmd_hello.conn_flags.

  __u64 owner_id;
    The owning connection's unique ID.

  __u64 conn_flags;
    The flags of the owning connection.

  struct kdbus_item items[0];
    Items containing the actual name. Currently, only one item of type
    KDBUS_ITEM_NAME will be attached.
};

The returned buffer must be freed with the KDBUS_CMD_FREE ioctl when the user
is finished with it.


9. Notifications
===============================================================================

The kernel will notify its users of the following events.

  * When connection A is terminated while connection B is waiting for a reply
    from it, connection B is notified with a message with an item of type
    KDBUS_ITEM_REPLY_DEAD.

  * When connection A does not receive a reply from connection B within the
    specified timeout window, connection A will receive a message with an item
    of type KDBUS_ITEM_REPLY_TIMEOUT.

  * When a connection is created on or removed from a bus, messages with an
    item of type KDBUS_ITEM_ID_ADD or KDBUS_ITEM_ID_REMOVE, respectively, are
    sent to all bus members that match these messages through their match
    database.

  * When a connection owns or loses a name, or a name is moved from one
    connection to another, messages with an item of type KDBUS_ITEM_NAME_ADD,
    KDBUS_ITEM_NAME_REMOVE or KDBUS_ITEM_NAME_CHANGE are sent to all bus
    members that match these messages through their match database.

A kernel notification is a regular kdbus message with the following details.

  * kdbus_msg.src_id == KDBUS_SRC_ID_KERNEL
  * kdbus_msg.dst_id == KDBUS_DST_ID_BROADCAST
  * kdbus_msg.payload_type == KDBUS_PAYLOAD_KERNEL
  * Has exactly one of the aforementioned items attached


10. Message Matching, Bloom filters
===============================================================================

10.1 Matches for broadcast messages from other connections
----------------------------------------------------------

A message addressed at the connection ID KDBUS_DST_ID_BROADCAST (~0ULL) is a
broadcast message, delivered to all connected peers which installed a rule to
match certain properties of the message. Without any rules installed in the
connection, no broadcast message or kernel-side notifications will be delivered
to the connection. Broadcast messages are subject to policy rules and TALK
access checks.

See section 11 for details on policies, and section 11.5 for more
details on implicit policies.

Matches for messages from other connections (not kernel notifications) are
implemented as bloom filters. The sender adds certain properties of the message
as elements to a bloom filter bit field, and sends that along with the
broadcast message.

The connection adds the message properties it is interested in as elements to a
bloom mask bit field, and uploads the mask to the match rules of the
connection.

The kernel will match the broadcast message's bloom filter against the
connection's bloom mask (simply by &-ing it), and decide whether the message
should be delivered to the connection.

The kernel has no notion of any specific properties of the message; all it
sees are the bit fields of the bloom filter and mask to match against. The
use of bloom filters allows simple and efficient matching, without exposing
any message properties or internals to the kernel side. Clients need to deal
with the fact that they might receive broadcasts which they did not subscribe
to, as the bloom filter might allow false-positives to pass the filter.

To allow the future extension of the set of elements in the bloom filter, the
filter specifies a "generation" number. A later generation must always contain
all elements of the set of the previous generation, but can add new elements
to the set. The match rules mask can carry an array with all previous
generations of masks individually stored. When the filter and mask are matched
by the kernel, the mask with the closest matching "generation" is selected
as the index into the mask array.


10.2 Matches for kernel notifications
-------------------------------------

To receive kernel generated notifications (see section 9), a connection must
install special match rules that are different from the bloom filter matches
described in the section above. They can be filtered by a sender connection's
ID, by one of the names the sender connection owns at the time of sending the
message, or by type of the notification (id/name add/remove/change).

10.3 Adding a match
-------------------

To add a match, the KDBUS_CMD_MATCH_ADD ioctl is used, which takes a struct
kdbus_cmd_match as described below.

Note that each of the items attached to this command will internally create
one match 'rule', and the collection of them, which is submitted as one block
via the ioctl is called a 'match'. To allow a message to pass, all rules of a
match have to be satisfied. Hence, adding more items to the command will only
narrow the set of messages that can pass, and will make it less likely that
the connection's user space process is woken up.

Multiple matches can be installed per connection. As long as one of them has a
set of rules which allows the message to pass, this one will be decisive.
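
The relationship between rules and matches (all rules of one match, any match
of the connection) can be sketched as follows. Only two rule kinds are
modelled, and a single 64-bit word stands in for the bloom bit field; all type
and function names are hypothetical. The bloom comparison follows the rule
given in section 10.4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum rule_kind { RULE_SENDER_ID, RULE_BLOOM_MASK };

struct rule {
	enum rule_kind kind;
	uint64_t value;		/* sender ID, or a 64-bit bloom mask */
};

struct match {
	uint64_t cookie;
	const struct rule *rules;
	size_t n_rules;
};

struct broadcast {
	uint64_t src_id;
	uint64_t bloom_filter;
};

static bool rule_ok(const struct rule *r, const struct broadcast *b)
{
	switch (r->kind) {
	case RULE_SENDER_ID:
		return r->value == b->src_id;
	case RULE_BLOOM_MASK:
		/* All bits set in the filter must be set in the mask. */
		return (b->bloom_filter & r->value) == b->bloom_filter;
	}
	return false;
}

/* A match passes only if all its rules do; one passing match is enough. */
static bool should_deliver(const struct match *matches, size_t n,
			   const struct broadcast *b)
{
	size_t i, j;

	assert(b != NULL);
	for (i = 0; i < n; i++) {
		for (j = 0; j < matches[i].n_rules; j++)
			if (!rule_ok(&matches[i].rules[j], b))
				break;
		if (j == matches[i].n_rules)
			return true;
	}
	return false;	/* no matches installed, or none passed */
}
```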

struct kdbus_cmd_match {
  __u64 size;
    The overall size of the struct, including its items.

  __u64 cookie;
    A cookie which identifies the match, so it can be referred to at removal
    time.

  __u64 flags;
    Flags to control the behavior of the ioctl.

    KDBUS_MATCH_REPLACE:
      Remove all entries with the given cookie before installing the new one.
      This allows for race-free replacement of matches.

  struct kdbus_item items[0];
    Items to define the actual rules of the matches. The following item types
    are expected. Each item will cause one new match rule to be created.

    KDBUS_ITEM_BLOOM_MASK
      An item that carries the bloom filter mask to match against in its
      data field. The payload size must match the bloom filter size that
      was specified when the bus was created.
      See section 10.4 for more information.

    KDBUS_ITEM_NAME
      Specify a name that a sending connection must own at a time of sending
      a broadcast message in order to match this rule.

    KDBUS_ITEM_ID
      Specify a sender connection's ID that will match this rule.

    KDBUS_ITEM_NAME_ADD
    KDBUS_ITEM_NAME_REMOVE
    KDBUS_ITEM_NAME_CHANGE
      These items request delivery of broadcast messages that describe a name
      acquisition, loss, or change. The details are stored in the item's
      kdbus_notify_name_change member. All information specified must be
      matched in order to make the message pass. Use KDBUS_MATCH_ID_ANY to
      match against any unique connection ID.

    KDBUS_ITEM_ID_ADD
    KDBUS_ITEM_ID_REMOVE
      These items request delivery of broadcast messages that are generated
      when a connection is created or terminated. struct kdbus_notify_id_change
      is used to store the actual match information. This item can be used to
      monitor one particular connection ID, or, when the id field is set to
      KDBUS_MATCH_ID_ANY, all of them.

    Other item types are ignored.
};


10.4 Bloom filters
------------------

Bloom filters allow checking whether a given word is present in a dictionary.
This allows a connection to set up a mask for the information it is interested
in; it will then be delivered broadcast messages that have a matching filter.

For general information on bloom filters, see

  https://en.wikipedia.org/wiki/Bloom_filter

The size of the bloom filter is defined per bus when it is created, in
kdbus_bloom_parameter.size. All bloom filters attached to broadcast messages
on the bus must match this size, and all bloom filter matches uploaded by
connections must also match the size, or a multiple thereof (see below).

The calculation of the mask has to be done on the userspace side. The kernel
just checks the bitmasks to decide whether or not to let the message pass. All
bits set in the filter must also be set in the mask, which is checked with a
bit-wise AND, but the mask may have more bits set than the filter.
Consequently, false positive
matches are expected to happen, and userspace must deal with that fact.

Masks are entities that are always passed to the kernel as part of a match
(with an item of type KDBUS_ITEM_BLOOM_MASK), and filters can be attached to
broadcast messages (with an item of type KDBUS_ITEM_BLOOM_FILTER).

For a broadcast to match, all set bits in the filter have to be set in the
installed match mask as well. For example, consider a bus has a bloom size
of 8 bytes, and the following mask/filter combinations:

    filter  0x0101010101010101
    mask    0x0101010101010101
            -> matches

    filter  0x0303030303030303
    mask    0x0101010101010101
            -> doesn't match

    filter  0x0101010101010101
    mask    0x0303030303030303
            -> matches

Hence, in order to catch all messages, a mask filled with 0xff bytes can be
installed as a wildcard match rule.

Uploaded matches may contain multiple masks, each of which is the size of the
bloom size defined by the bus. Each block of a mask is called a 'generation',
starting at index 0.

At match time, when a broadcast message is about to be delivered, a bloom
mask generation is passed, which denotes which of the bloom masks the filter
should be matched against. This allows userspace to provide backward compatible
masks at upload time, while older clients can still match against older
versions of filters.
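
The comparison described in this section, including the selection of a mask
generation, might be written as in the sketch below. It implements the rule as
stated here (every bit set in the filter must also be set in the mask).
Falling back to the newest uploaded mask when the requested generation is not
available is an assumption, as is the function name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 'filter' holds 'size' bytes. 'masks' holds 'n_generations' consecutive
 * blocks of 'size' bytes each, with generation 0 first.
 */
static bool bloom_match(const uint8_t *filter, const uint8_t *masks,
			size_t size, size_t n_generations, size_t generation)
{
	const uint8_t *mask;
	size_t i;

	assert(filter != NULL && masks != NULL && n_generations > 0);

	/* Assumption: use the newest uploaded mask if none matches exactly. */
	if (generation >= n_generations)
		generation = n_generations - 1;
	mask = masks + generation * size;

	for (i = 0; i < size; i++)
		if ((filter[i] & mask[i]) != filter[i])
			return false;
	return true;
}
```

Applied to the three filter/mask pairs listed above, the function yields
match, no match and match, in that order.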


10.5 Removing a match
---------------------

Matches can be removed through the KDBUS_CMD_MATCH_REMOVE ioctl, which again
takes struct kdbus_cmd_match as argument, but its fields are used slightly
differently.

struct kdbus_cmd_match {
  __u64 size;
    The overall size of the struct. As it has no items in this use case, the
    value should yield 16.

  __u64 cookie;
    The cookie of the match, as it was passed when the match was added.
    All matches that have this cookie will be removed.

  __u64 flags;
    Unused for this use case.

  __u64 kernel_flags;
    Valid flags for this command, returned by the kernel upon each call.

  struct kdbus_item items[0];
    Unused for this use case.
};


11. Policy
===============================================================================

A policy database restricts the possibilities of connections to own, see and
talk to well-known names. It can be associated with a bus (through a policy
holder connection) or a custom endpoint.

See section 8.1 for more details on the validity of well-known names.

Default endpoints of buses always have a policy database. The default
policy is to deny all operations except for operations that are covered by
implicit policies. Custom endpoints always have a policy, and by default,
a policy database is empty. Therefore, unless policy rules are added, all
operations will also be denied by default.

See section 11.5 for more details on implicit policies.

A set of policy rules is described by a name and multiple access rules, defined
by the following struct.

struct kdbus_policy_access {
  __u64 type;	/* USER, GROUP, WORLD */
    One of the following.

    KDBUS_POLICY_ACCESS_USER
      Grant access to a user with the uid stored in the 'id' field.

    KDBUS_POLICY_ACCESS_GROUP
      Grant access to users in the group with the gid stored in the 'id' field.

    KDBUS_POLICY_ACCESS_WORLD
      Grant access to everyone. The 'id' field is ignored.

  __u64 access;	/* OWN, TALK, SEE */
    The access to grant.

    KDBUS_POLICY_SEE
      Allow the name to be seen.

    KDBUS_POLICY_TALK
      Allow the name to be talked to.

    KDBUS_POLICY_OWN
      Allow the name to be owned.

  __u64 id;
    For KDBUS_POLICY_ACCESS_USER, stores the uid.
    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
};

Policies are set through KDBUS_CMD_HELLO (when creating a policy holder
connection), KDBUS_CMD_CONN_UPDATE (when updating a policy holder connection),
KDBUS_CMD_ENDPOINT_MAKE (creating a custom endpoint) or
KDBUS_CMD_ENDPOINT_UPDATE (updating a custom endpoint). In all cases, the name
and policy access information is stored in items of type KDBUS_ITEM_NAME and
KDBUS_ITEM_POLICY_ACCESS. For this transport, the following rules apply.

  * An item of type KDBUS_ITEM_NAME must be followed by at least one
    KDBUS_ITEM_POLICY_ACCESS item
  * An item of type KDBUS_ITEM_NAME can be followed by an arbitrary number of
    KDBUS_ITEM_POLICY_ACCESS items
  * An arbitrary number of groups of names and access levels can be passed

uids and gids are internally always stored in the kernel's view of global ids,
and are translated back and forth on the ioctl level accordingly.


11.2 Wildcard names
-------------------

Policy holder connections may upload names that contain the wildcard suffix
(".*"). That way, a policy can be uploaded that is effective for every
well-known name that extends the provided name by exactly one more level.

For example, if an item of a set of uploaded policy rules contains the name
"foo.bar.*", both "foo.bar.baz" and "foo.bar.bazbaz" are valid, but
"foo.bar.baz.baz" is not.

This allows connections to take control over multiple names that the policy
holder doesn't need to know about when uploading the policy.

Such wildcard entries are not allowed for custom endpoints.
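
The wildcard rule can be expressed as a short string comparison. This is a
sketch of the semantics described above under a hypothetical function name,
not the kernel's matching code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if 'name' is covered by 'pattern'. A pattern ending in ".*"
 * covers names that extend its prefix by exactly one more element; any
 * other pattern must equal the name.
 */
static bool policy_name_matches(const char *pattern, const char *name)
{
	const char *rest;
	size_t plen;

	assert(pattern != NULL && name != NULL);
	plen = strlen(pattern);
	if (plen < 2 || strcmp(pattern + plen - 2, ".*") != 0)
		return strcmp(pattern, name) == 0;

	/* Compare up to and including the '.' in front of the '*'. */
	if (strncmp(pattern, name, plen - 1) != 0)
		return false;

	/* Exactly one more, non-empty element must follow. */
	rest = name + plen - 1;
	return *rest != '\0' && strchr(rest, '.') == NULL;
}
```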


11.3 Policy example
-------------------

For example, a set of policy rules may look like this:

  KDBUS_ITEM_NAME: str='org.foo.bar'
  KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, id=1000
  KDBUS_ITEM_POLICY_ACCESS: type=USER, access=TALK, id=1001
  KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=SEE
  KDBUS_ITEM_NAME: str='org.blah.baz'
  KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, id=0
  KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=TALK

That means that 'org.foo.bar' may only be owned by uid 1000, but every user on
the bus is allowed to see the name. However, only uid 1001 may actually send
a message to the connection and receive a reply from it.

The second rule allows 'org.blah.baz' to be owned by uid 0 only, but every user
may talk to it.
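
The example database can be modelled in a few lines of C to see which requests
it admits. The sketch below is illustrative only: the types and the
policy_allows() helper are hypothetical, each access level is checked on its
own (whether OWN also implies TALK or SEE is not stated here), and the implicit
policies of section 11.5 are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum access_type { ACCESS_USER, ACCESS_GROUP, ACCESS_WORLD };
enum access_level { POLICY_SEE, POLICY_TALK, POLICY_OWN };

struct policy_entry {
	const char *name;
	enum access_type type;
	enum access_level access;
	uint64_t id;		/* uid or gid; ignored for ACCESS_WORLD */
};

/* The example rules from this section. */
static const struct policy_entry example_db[] = {
	{ "org.foo.bar",  ACCESS_USER,  POLICY_OWN,  1000 },
	{ "org.foo.bar",  ACCESS_USER,  POLICY_TALK, 1001 },
	{ "org.foo.bar",  ACCESS_WORLD, POLICY_SEE,  0 },
	{ "org.blah.baz", ACCESS_USER,  POLICY_OWN,  0 },
	{ "org.blah.baz", ACCESS_WORLD, POLICY_TALK, 0 },
};

static bool policy_allows(const struct policy_entry *db, size_t n,
			  const char *name, enum access_level want,
			  uint64_t uid, uint64_t gid)
{
	size_t i;

	assert(db != NULL && name != NULL);
	for (i = 0; i < n; i++) {
		const struct policy_entry *e = &db[i];

		if (e->access != want || strcmp(e->name, name) != 0)
			continue;
		if (e->type == ACCESS_WORLD ||
		    (e->type == ACCESS_USER && e->id == uid) ||
		    (e->type == ACCESS_GROUP && e->id == gid))
			return true;
	}
	return false;	/* nothing granted the access: deny */
}
```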


11.4 TALK access and multiple well-known names per connection
-------------------------------------------------------------

Note that TALK access is checked against all names of a connection.
For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
permission is also granted to 'org.foo.bar'. That might sound illogical, but
after all, we allow messages to be directed to either the ID or a well-known
name, and policy is applied to the connection, not the name. In other words,
the effective TALK policy for a connection is the most permissive of all names
the connection owns.
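
The "most permissive of all names" rule can be illustrated with a small
userspace model. The types and function names below are made up for this
sketch and do not mirror the kernel's data structures; only explicit TALK
entries are modeled, and the implicit policies of section 11.5 are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative types only; these are not the kdbus UAPI definitions. */
enum access_type { ACCESS_USER, ACCESS_WORLD };
enum access_level { ACCESS_SEE, ACCESS_TALK, ACCESS_OWN };

struct policy_entry {
	const char *name;
	enum access_type type;
	enum access_level access;
	unsigned int id;	/* uid, only meaningful for ACCESS_USER */
};

/* True if an explicit TALK entry for 'name' covers 'uid'. */
static bool name_allows_talk(const struct policy_entry *db, size_t n_db,
			     const char *name, unsigned int uid)
{
	assert(db != NULL || n_db == 0);

	for (size_t i = 0; i < n_db; i++) {
		if (strcmp(db[i].name, name) != 0)
			continue;
		if (db[i].access != ACCESS_TALK)
			continue;
		if (db[i].type == ACCESS_WORLD ||
		    (db[i].type == ACCESS_USER && db[i].id == uid))
			return true;
	}
	return false;
}

/* TALK is granted if any name owned by the destination allows it. */
static bool conn_allows_talk(const struct policy_entry *db, size_t n_db,
			     const char *const *owned, size_t n_owned,
			     unsigned int uid)
{
	for (size_t i = 0; i < n_owned; i++)
		if (name_allows_talk(db, n_db, owned[i], uid))
			return true;
	return false;
}
```

Fed with the rules from section 11.3, a destination that owns only
'org.foo.bar' accepts uid 1001 and rejects an arbitrary other uid, while a
destination that owns both names accepts every uid because of the WORLD entry
on 'org.blah.baz'.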

If a policy database exists for a bus (because a policy holder created one on
demand) or for a custom endpoint (which always has one), each one is consulted
during name registry listing, name owning or message delivery. If either check
fails, the operation fails with -EPERM.

As a best practice, connections that own names with a restricted TALK
access should not install matches. This avoids cases where the sent
message may pass the bloom filter due to false-positives and may also
satisfy the policy rules.

11.5 Implicit policies
----------------------

Depending on the type of the endpoint, a set of implicit rules might be
enforced. On default endpoints, the following set is enforced:

  * Privileged connections always override any installed policy. Those
    connections could easily install their own policies, so there is no
    reason to enforce installed policies.
  * Connections can always talk to connections of the same user. This
    includes broadcast messages.
  * Connections that own names might send broadcast messages to other
    connections that belong to a different user, but only if that
    destination connection does not own any name.

Custom endpoints have stricter policies. The following rules apply:

  * Policy rules are always enforced, even if the connection is a privileged
    connection.
  * Policy rules are always enforced for TALK access, even if both ends are
    running under the same user. This includes broadcast messages.
  * To restrict the set of names that can be seen, endpoint policies can
    install "SEE" policies.


12. Pool
===============================================================================

A pool for data received from the kernel is installed for every connection of
the bus, and is sized according to kdbus_cmd_hello.pool_size. It is accessed
when one of the following ioctls is issued:

  * KDBUS_CMD_MSG_RECV, to receive a message
  * KDBUS_CMD_NAME_LIST, to dump the name registry
  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection

Internally, the pool is organized in slices, stored in an rb-tree. The offsets
returned by either one of the aforementioned ioctls describe offsets inside the
pool. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE
has to be called on the offset.

To access the memory, the caller is expected to mmap() it to its task, like
this:

  /*
   * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the
   * value that was previously passed in the .pool_size field of struct
   * kdbus_cmd_hello.
   */

  buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0);
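
Since the pool size must be a non-zero multiple of the page size, a caller
may want to round a desired size up before filling in
kdbus_cmd_hello.pool_size and before calling mmap(). A minimal sketch,
assuming a POSIX system; pool_size_round_up() is a hypothetical helper and
not part of any kdbus library:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: round 'wanted' up to the next multiple of the
 * system page size. A request of 0 yields one page, because a pool
 * size of 0 is rejected. Overflow for huge values is not handled.
 */
static size_t pool_size_round_up(size_t wanted)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t ps;

	assert(page > 0);
	ps = (size_t)page;

	if (wanted == 0)
		return ps;
	return ((wanted + ps - 1) / ps) * ps;
}
```

The same value would then be used both for the .pool_size field and as the
length argument of the mmap() call shown above.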


13. Metadata
===============================================================================

When a message is delivered to a receiver connection, it is augmented by
metadata items in accordance with the destination's current attach flags. The
information stored in those metadata items refers to the sender task at the
time of sending the message, so even if any detail of the sender task has
already changed upon message reception (or if the sender task does not exist
anymore), the information is still preserved and won't be modified until the
message is freed.

Note that there are two exceptions to the above rules:

  a) Kernel generated messages don't have a source connection, so they won't be
     augmented.

  b) If a connection was created with faked credentials (see section 6.2),
     the only attached metadata items are the ones provided by the connection
     itself. The destination's attach_flags won't be looked at in such cases.

Also, there are two things to be considered by userspace programs regarding
those metadata items:

  a) Userspace must cope with the fact that it might get more metadata than
     it requested. That happens, for example, when a broadcast message is
     sent and receivers have different attach flags. Items that haven't been
     requested should hence be silently ignored.

  b) Userspace might not always get all the metadata items that it
     requested. That is because some of those items are only added if a
     corresponding kernel feature has been enabled. Also, the two exceptions
     described above will lead to fewer items being attached than
     requested.
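
A receiver can follow both rules by walking the item list, picking out the
types it is interested in, skipping everything else, and treating a missing
item as a normal case. The sketch below uses a stand-in header struct; the
layout (a 64-bit size followed by a 64-bit type, with items padded to 8
bytes) is an assumption of this example, and real code should use struct
kdbus_item from the kdbus header instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the item header; the layout is assumed for this sketch. */
struct item_hdr {
	uint64_t size;	/* header plus payload, without padding */
	uint64_t type;
};

#define ALIGN8(v) (((v) + 7) & ~(uint64_t)7)

/*
 * Walk 'len' bytes of items in 'buf' and count those whose type equals
 * 'wanted'. Items of any other type are skipped silently. Returns -1 if
 * an item is malformed, and 0 if the wanted item is simply not present.
 */
static int count_items(const uint8_t *buf, size_t len, uint64_t wanted)
{
	size_t pos = 0;
	int found = 0;

	assert(buf != NULL || len == 0);

	while (pos < len) {
		struct item_hdr hdr;

		if (len - pos < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + pos, sizeof(hdr));

		if (hdr.size < sizeof(hdr) || hdr.size > len - pos)
			return -1;
		if (hdr.type == wanted)
			found++;

		/* The next item starts at the next 8-byte boundary. */
		pos += ALIGN8(hdr.size);
	}
	return found;
}
```

A return value of 0 covers case b) above: the caller asked for an item that
the kernel did not attach, and should continue without it.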


13.1 Known item types
---------------------

The following attach flags are currently supported.

  KDBUS_ATTACH_TIMESTAMP
    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
    monotonic and the realtime timestamp, taken when the message was
    processed on the kernel side.

  KDBUS_ATTACH_CREDS
    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.

  KDBUS_ATTACH_AUXGROUPS
    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
    number of auxiliary groups the sending task was a member of.

  KDBUS_ATTACH_NAMES
    Attaches items of type KDBUS_ITEM_NAME, one for each name the sending
    connection currently owns. The name is stored in kdbus_item.str for each
    of them.

  KDBUS_ATTACH_COMM
    Attaches items of type KDBUS_ITEM_PID_COMM and KDBUS_ITEM_TID_COMM,
    both transporting the sending task's 'comm', for both the pid and the tid.
    The strings are stored in kdbus_item.str.

  KDBUS_ATTACH_EXE
    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
    executable of the sending task, stored in kdbus_item.str.

  KDBUS_ATTACH_CMDLINE
    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
    arguments of the sending task, as an array of strings, stored in
    kdbus_item.str.

  KDBUS_ATTACH_CGROUP
    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.

  KDBUS_ATTACH_CAPS
    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
    that should be accessed via kdbus_item.caps.caps. Also, userspace should
    be written in a way that it takes kdbus_item.caps.last_cap into account,
    and derives the number of sets and rows from the item size and the
    reported number of valid capability bits.

  KDBUS_ATTACH_SECLABEL
    Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux
    security label of the sending task. Access via kdbus_item.str.

  KDBUS_ATTACH_AUDIT
    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit label
    of the sending task. Access via kdbus_item.str.

  KDBUS_ATTACH_CONN_NAME
    Attaches an item of type KDBUS_ITEM_CONN_NAME that contains the
    sending connection's current name in kdbus_item.str.


13.2 Metadata and namespaces
----------------------------

Note that if the user or PID namespaces of a connection at the time of sending
differ from those that were active when the connection was created
(KDBUS_CMD_HELLO), data structures such as messages will not have any metadata
attached to prevent leaking security-relevant information.


14. Error codes
===============================================================================

Below is a list of error codes that might be returned by the individual
ioctl commands. The list focuses on the return values from kdbus code itself,
and might not cover those of all kernel internal functions.

For all ioctls:

  -ENOMEM	The kernel memory is exhausted
  -ENOTTY	Illegal ioctl command issued for the file descriptor
  -ENOSYS	The requested functionality is not available

For all ioctls that carry a struct as payload:

  -EFAULT	The supplied data pointer was not 64-bit aligned, or was
		inaccessible from the kernel side.
  -EINVAL	The size inside the supplied struct was smaller than expected
  -EMSGSIZE	The size inside the supplied struct was bigger than expected
  -ENAMETOOLONG	A supplied name is larger than the allowed maximum size

For KDBUS_CMD_BUS_MAKE:

  -EINVAL	The flags supplied in the kdbus_cmd_make struct are invalid or
		the supplied name does not start with the current uid and a '-'
  -EEXIST	A bus of that name already exists
  -ESHUTDOWN	The domain for the bus is already shut down
  -EMFILE	The maximum number of buses for the current user is exhausted

For KDBUS_CMD_DOMAIN_MAKE:

  -EPERM	The calling user does not have CAP_IPC_OWNER set
  -EINVAL	The flags supplied in the kdbus_cmd_make struct are invalid, or
		no name supplied for top-level domain
  -EEXIST	A domain of that name already exists

For KDBUS_CMD_ENDPOINT_MAKE:

  -EPERM	The calling user is not privileged (see Terminology)
  -EINVAL	The flags supplied in the kdbus_cmd_make struct are invalid
  -EEXIST	An endpoint of that name already exists

For KDBUS_CMD_HELLO:

  -EFAULT	The supplied pool size was 0 or not a multiple of the page size
  -EINVAL	The flags supplied in the kdbus_cmd_hello struct are invalid, or
		an illegal combination of KDBUS_HELLO_MONITOR,
		KDBUS_HELLO_ACTIVATOR and KDBUS_HELLO_POLICY_HOLDER was passed
		in the flags, or an invalid set of items was supplied
  -EPERM	A KDBUS_ITEM_CREDS item was supplied, but the current user is
		not privileged
  -ESHUTDOWN	The bus has already been shut down
  -EMFILE	The maximum number of connections on the bus has been reached

For KDBUS_CMD_BYEBYE:

  -EALREADY	The connection has already been shut down
  -EBUSY	There are still messages queued up in the connection's pool

For KDBUS_CMD_MSG_SEND:

  -EOPNOTSUPP	The connection is unconnected, or an fd was passed that is
		either a kdbus handle itself or a unix domain socket. Both are
		currently unsupported.
  -EINVAL	The submitted payload type is KDBUS_PAYLOAD_KERNEL,
		KDBUS_MSG_FLAGS_EXPECT_REPLY was set without a timeout value,
		KDBUS_MSG_FLAGS_SYNC_REPLY was set without
		KDBUS_MSG_FLAGS_EXPECT_REPLY, an invalid item was supplied,
		src_id was != 0 and different from the current connection's ID,
		a supplied memfd had a size of 0, a string was not properly
		nul-terminated
  -ENOTUNIQ	KDBUS_MSG_FLAGS_EXPECT_REPLY was set, but the dst_id is set
		to KDBUS_DST_ID_BROADCAST
  -E2BIG	Too many items
  -EMSGSIZE	A payload vector was too big, and the current user is
		unprivileged.
  -ENOTUNIQ	A fd or memfd payload was passed in a broadcast message, or
		a timeout was given for a broadcast message
  -EEXIST	Multiple KDBUS_ITEM_FDS, KDBUS_ITEM_BLOOM_FILTER or
		KDBUS_ITEM_DST_NAME items were supplied
  -EBADF	A memfd item contained an illegal fd
  -EMEDIUMTYPE	A file descriptor that is not a kdbus memfd was passed as
		KDBUS_MSG_PAYLOAD_MEMFD and refused.
  -EMFILE	Too many file descriptors inside a KDBUS_ITEM_FDS
  -EBADMSG	An item had illegal size, both a dst_id and a
		KDBUS_ITEM_DST_NAME were given, or both a name and a bloom
		filter were given
  -ETXTBSY	A kdbus memfd file cannot be sealed or the seal removed,
		because it is shared with other processes or still mmap()ed
  -ECOMM	A peer does not accept the file descriptors addressed to it
  -EFAULT	The supplied bloom filter size was not 64-bit aligned
  -EDOM		The supplied bloom filter size did not match the bloom filter
		size of the bus
  -EDESTADDRREQ	dst_id was set to KDBUS_DST_ID_NAME, but no KDBUS_ITEM_DST_NAME
		was attached
  -ESRCH	The name to look up was not found in the name registry
  -EADDRNOTAVAIL KDBUS_MSG_FLAGS_NO_AUTO_START was given but the destination
		 connection is an activator.
  -ENXIO	The passed numeric destination connection ID couldn't be found,
		or is not connected
  -ECONNRESET	The destination connection is no longer active
  -ETIMEDOUT	Timeout while synchronously waiting for a reply
  -EINTR	System call interrupted while synchronously waiting for a reply
  -EPIPE	When sending a message, a synchronous reply from the receiving
		connection was expected but the connection died before
		answering
  -ECANCELED	A synchronous message sending was cancelled
  -ENOBUFS	Too many pending messages on the receiver side
  -EREMCHG	Both a well-known name and a unique name (ID) were given, but
		the name is not currently owned by that connection.

For KDBUS_CMD_MSG_RECV:

  -EINVAL	Invalid flags or offset
  -EAGAIN	No message found in the queue
  -ENOMSG	No message of the requested priority found

For KDBUS_CMD_MSG_CANCEL:

  -EINVAL	Invalid flags
  -ENOENT	Pending message with the supplied cookie not found

For KDBUS_CMD_FREE:

  -ENXIO	No pool slice found at given offset
  -EINVAL	Invalid flags provided, or the offset is valid but the user is
		not allowed to free the slice. This happens, for example, if
		the offset was retrieved with KDBUS_RECV_PEEK.

For KDBUS_CMD_NAME_ACQUIRE:

  -EINVAL	Illegal command flags, illegal name provided, or an activator
		tried to acquire a second name
  -EPERM	Policy prohibited name ownership
  -EALREADY	Connection already owns that name
  -EEXIST	The name already exists and can not be taken over
  -ECONNRESET	The connection was reset during the call

For KDBUS_CMD_NAME_RELEASE:

  -EINVAL	Invalid command flags, or invalid name provided
  -ESRCH	Name is not found in the registry
  -EADDRINUSE	Name is owned by a different connection and can't be released

For KDBUS_CMD_NAME_LIST:

  -EINVAL	Invalid flags
  -ENOBUFS	No available memory in the connection's pool.

For KDBUS_CMD_CONN_INFO:

  -EINVAL	Invalid flags, or neither an ID nor a name was provided,
		or the name is invalid.
  -ESRCH	Connection lookup by name failed
  -ENXIO	No connection with the provided numeric connection ID found

For KDBUS_CMD_CONN_UPDATE:

  -EINVAL	Illegal flags or items
  -EOPNOTSUPP	Operation not supported by connection.
  -E2BIG	Too many policy items attached
  -EINVAL	Wildcards submitted in policy entries, or illegal sequence
		of policy items

For KDBUS_CMD_ENDPOINT_UPDATE:

  -E2BIG	Too many policy items attached
  -EINVAL	Invalid flags, or wildcards submitted in policy entries,
		or illegal sequence of policy items

For KDBUS_CMD_MATCH_ADD:

  -EINVAL	Illegal flags or items
  -EDOM		Illegal bloom filter size
  -EMFILE	Too many matches for this connection

For KDBUS_CMD_MATCH_REMOVE:

  -EINVAL	Illegal flags
  -ENOENT	A match entry with the given cookie could not be found.




Copyright © 2014, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds