A few hints on supporting kdbus as backend in your favorite D-Bus library.

Before you read this, have a look at the DIFFERENCES and
GVARIANT_SERIALIZATION texts, which you find in the same directory as
this text.

We invite you to port your favorite D-Bus protocol implementation
over to kdbus. However, there are a couple of complexities
involved. On kdbus we only speak GVariant marshaling, kdbus clients
ignore traffic in dbus1 marshaling. Thus, you need to add a second,
GVariant compatible marshaler to your library first.

After you have done that, here's the basic principle of how kdbus works:

You connect to a bus by opening its bus node in /dev/kdbus/. All
buses have a device node there, whose name starts with the numeric UID
of the owner of the bus, followed by a dash and a string identifying
the bus. The system bus is thus called /dev/kdbus/0-system, and for
user buses the device node is /dev/kdbus/1000-user (if 1000 is your
user ID).

(Before we proceed, please always keep a copy of libsystemd-bus next
to you. Ultimately that's where the details are, this document is
simply a rough overview to help you grok things.)

To connect to a bus, simply open() its device node, and issue the
KDBUS_CMD_HELLO ioctl. That's it. Now you are connected. Do not send
Hello messages or so (as you would on dbus1), that does not exist for
kdbus.

The structure you pass to the ioctl will contain a couple of
parameters that you need to know to operate on the bus.

There are two flags fields, one indicating features of the kdbus
kernel side ("conn_flags"), the other one ("bus_flags") indicating
features of the bus owner (i.e. systemd). Both flags fields are 64bit
wide.

When calling into the ioctl, you need to place your own supported
feature bits into these fields. This tells the kernel about the
features you support. When the ioctl returns, the fields will contain
the features the kernel supports.

If any of the upper 32bit are set in either of the two flags fields
and your client does not know what they mean, it must disconnect. The
upper 32bit are used to indicate "incompatible" feature additions on
the bus system, the lower 32bit indicate "compatible" feature
additions. A client that does not support a "compatible" feature
addition can go on communicating with the bus, however a client that
does not support an "incompatible" feature must not proceed with the
connection.

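The compatible/incompatible split boils down to a simple mask test. The
following is only a rough sketch: the helper name and the idea of
keeping your understood feature bits in a single 64bit mask are
assumptions made for illustration, they are not part of the kdbus API.

```c
#include <stdbool.h>
#include <stdint.h>

/* The upper 32 bits of a flags field carry "incompatible" features. */
#define INCOMPATIBLE_MASK UINT64_C(0xFFFFFFFF00000000)

/*
 * Hypothetical helper: "returned" is a flags field as filled in by the
 * HELLO ioctl, "understood" is the set of feature bits this client
 * knows how to handle. Returns true if the client may stay connected.
 */
static bool flags_acceptable(uint64_t returned, uint64_t understood) {
        uint64_t unknown = returned & ~understood;

        /* Unknown bits in the lower half are fine, in the upper half fatal. */
        return (unknown & INCOMPATIBLE_MASK) == 0;
}
```

You would run this check on both flags fields after the ioctl returns,
and close the fd if either of them fails it.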
The hello structure also contains another flags field "attach_flags"
which indicates metadata that is optionally attached to all incoming
messages. You probably want to set KDBUS_ATTACH_NAMES unconditionally
in it. This has the effect that all well-known names of a sender are
attached to all incoming messages. You need this information to
implement matches that match on a message sender name correctly. Of
course, you should only request the attachment of as little metadata
as you need.

The kernel will return your unique id in the "id" field. This is a
simple numeric value. For compatibility with classic dbus1 simply
format this as string and prefix ":0.".

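Turning the numeric id into a dbus1-style unique name is a one-liner. A
minimal sketch, where the function name is made up for illustration
and the ":0." prefix is taken from the text above:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats the 64bit id returned by the HELLO
 * ioctl as a classic unique name. Returns 0 on success, -1 if the
 * buffer is too small.
 */
static int format_unique_name(uint64_t id, char *buf, size_t size) {
        int n = snprintf(buf, size, ":0.%" PRIu64, id);

        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

A buffer of 24 bytes is enough for any 64bit id plus the prefix and the
trailing NUL.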
The kernel will also return the bloom filter size used for the signal
broadcast bloom filter (see below).

The kernel will also return the bus ID of the bus in a 128bit field.

The pool size field specifies the size of the memory mapped buffer
where received messages are placed by the kernel.

After calling the hello ioctl, you should memory map the kdbus
fd. Use the pool size returned by the hello ioctl as map size. In this
memory mapped region the kernel will place all your incoming messages.

Use the MSG_SEND ioctl to send a message to another peer. The ioctl
takes a structure that contains a variety of fields:

The flags field corresponds closely to the old dbus1 message header
flags field, though the DONT_EXPECT_REPLY field got inverted into
EXPECT_REPLY.

The dst_id/src_id fields contain the unique ids of the destination
and the sender. The sender field is usually overridden by the kernel,
hence you shouldn't fill it in. The destination field can also take
the special value KDBUS_DST_ID_BROADCAST for broadcast messages. For
messages intended for a well-known name set the field to
KDBUS_DST_ID_NAME, and attach the name in a special "items" entry to
the message (see below).

The payload field indicates the payload. For all dbus traffic it
should carry the value 0x4442757344427573ULL. (Which encodes
"DBusDBus" in ASCII.)

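If you wonder where that constant comes from: read byte by byte,
starting with the most significant one, it spells out an ASCII
string. The little sketch below shows this, the macro and function
names are made up for illustration and are not kdbus identifiers.

```c
#include <stdint.h>

/* Illustrative name only, the value is the one given in the text. */
#define DBUS_PAYLOAD_MAGIC UINT64_C(0x4442757344427573)

/*
 * Writes the eight bytes of "magic", most significant byte first,
 * into "out" and terminates the result with a NUL byte.
 */
static void payload_magic_to_string(uint64_t magic, char out[9]) {
        for (int i = 0; i < 8; i++)
                out[i] = (char) ((magic >> (56 - 8 * i)) & 0xff);

        out[8] = '\0';
}
```

Feeding DBUS_PAYLOAD_MAGIC into this yields "DBusDBus" (0x44 is 'D',
0x42 is 'B', 0x75 is 'u', 0x73 is 's').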
The cookie field corresponds to the "serial" field of classic
dbus1. We simply renamed it here (and extended it to 64bit) since we
didn't want to imply the monotonicity of the assignment the way the
word "serial" indicates it.

When sending a message that expects a reply, you need to set the
EXPECT_REPLY flag in the message flags field. In this case you should
also fill out the "timeout_ns" value which indicates the timeout in
nsec for this call. If the peer does not respond in this time you will
get a notification of a timeout. Note that this is also used for
security purposes: a single reply message is only allowed through the
bus as long as the timeout has not ended. With this timeout value you
hence "open a time window" in which the peer might respond to your
request and the policy allows the response to go through.

When sending a message that is a reply, you need to fill in the
cookie_reply field, which is similar to the reply_serial field of
dbus1. Note that a message cannot have EXPECT_REPLY and a reply_serial
at the same time.

This pretty much explains the ioctl header. The actual payload of the
data is now referenced in additional items that are attached to this
ioctl header structure at the end. When sending a message, you attach
items of the type PAYLOAD_VEC, PAYLOAD_MEMFD, FDS, BLOOM, DST_NAME to
it:

  KDBUS_ITEM_PAYLOAD_VEC: contains a pointer + length pair for
  referencing arbitrary user memory. This is how you reference most
  of your data. It's a lot like the good old iovec structure of glibc.

  KDBUS_ITEM_PAYLOAD_MEMFD: for large data blocks it is preferable
  to send prepared "memfds" (see below) over. This item contains an
  fd for a memfd plus a size.

  KDBUS_ITEM_PAYLOAD_FDS: for sending over fds attach an item of this
  type with an array of fds.

  KDBUS_ITEM_BLOOM: the calculated bloom filter of this message, only
  for undirected (broadcast) messages.

  KDBUS_DST_NAME: for messages that are directed to a well-known name
  (instead of a unique name), this item contains the well-known name.

A single message may consist of no, one or more payload items of type
PAYLOAD_VEC or PAYLOAD_MEMFD. D-Bus protocol implementations should
treat them as a single block that just happens to be split up into
multiple items. Some restrictions apply however:

  The message header in its entirety must be contained in a single
  payload item.

  You may only split your message up right in front of each GVariant
  contained in the payload, as well as immediately before the framing
  of a GVariant, as well as after any padding bytes if there are any.
  The padding bytes must be wholly contained in the preceding
  PAYLOAD_VEC/PAYLOAD_MEMFD item. You may not split up simple types
  nor arrays of trivial types. The latter is necessary to allow APIs
  to return direct pointers to linear chunks of fixed size trivial
  arrays. Examples: The simple types "u", "s", "t" have to be in the
  same payload item. The arrays of simple types "ay", "ai" have to be
  fully contained in the same payload item. For an array "as" or
  "a(si)" the only restriction however is to keep each string
  individually in an uninterrupted item, and to keep the framing of
  each element and the array in a single uninterrupted item. The
  various strings however might end up in different items.

Note again that splitting up messages into separate items is up to the
implementation. Also note that the kdbus kernel side might merge
separate items if it deems this to be useful. However, the order in
which items are contained in the message is left untouched.

PAYLOAD_MEMFD items allow zero-copy data transfer (see below regarding
the memfd concept). Note however that the overhead of mapping these
makes them relatively expensive, and only worth the trouble for memory
blocks > 128K (this value appears to be quite universal across
architectures, as we tested). Thus we recommend sending PAYLOAD_VEC
items over for small messages and resorting to PAYLOAD_MEMFD items for
messages > 128K. Since while building up the message you might not
know yet whether it will grow beyond this boundary, a good approach is
to simply build the message unconditionally in a memfd
object. However, when the message is sealed to be sent away, check for
the size limit. If the size of the message is < 128K, then simply send
the data as PAYLOAD_VEC and reuse the memfd. If it is >= 128K, seal
the memfd and send it as PAYLOAD_MEMFD, and allocate a new memfd for
the next message.

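The decision described above is trivial, but it is easy to get the
boundary wrong by one. A minimal sketch, assuming your library records
the final message size when the message is sealed. The enum and
function names are hypothetical, only the 128K threshold comes from
the text:

```c
#include <stddef.h>

/* 128K, the break-even point mentioned in the text. */
#define MEMFD_THRESHOLD ((size_t) 128 * 1024)

enum payload_kind {
        PAYLOAD_AS_VEC,   /* send the data as PAYLOAD_VEC, reuse the memfd */
        PAYLOAD_AS_MEMFD  /* seal the memfd, send it as PAYLOAD_MEMFD */
};

/* Picks the transport for a fully built message of the given size. */
static enum payload_kind choose_payload_kind(size_t message_size) {
        return message_size < MEMFD_THRESHOLD ? PAYLOAD_AS_VEC
                                              : PAYLOAD_AS_MEMFD;
}
```

A message of exactly 128K hence already goes out as a memfd.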
Use the MSG_RECV ioctl to read a message from kdbus. This will return
an offset into the pool memory map, relative to its beginning.

The received message structure more or less follows the structure of
the message originally sent. However, certain changes have been
made. In the header the src_id field will be filled in.

The payload items might have gotten merged and PAYLOAD_VEC items are
not used. Instead, you will only find PAYLOAD_OFF and PAYLOAD_MEMFD
items. The former contain an offset into your memory mapped pool
where you find the payload, plus its size.

If during the HELLO ioctl you asked for metadata to be attached to
your messages, you will find additional KDBUS_ITEM_CREDS,
KDBUS_ITEM_PID_COMM, KDBUS_ITEM_TID_COMM, KDBUS_ITEM_TIMESTAMP,
KDBUS_ITEM_EXE, KDBUS_ITEM_CMDLINE, KDBUS_ITEM_CGROUP,
KDBUS_ITEM_CAPS, KDBUS_ITEM_SECLABEL, KDBUS_ITEM_AUDIT items that
contain this metadata. This metadata will be gathered from the sender
at the point in time it sends the message. This information is
uncached, and since it is appended by the kernel, trustable. The
KDBUS_ITEM_SECLABEL item usually contains the SELinux security label.

After processing the message you need to call the KDBUS_CMD_FREE
ioctl, which releases the message from the pool, and allows the kernel
to store another message there. Note that the memory used by the pool
is ordinary anonymous, swappable memory that is backed by tmpfs. Hence
there is no need to copy the message out of it quickly, instead you
can just leave it there as long as you need it and release it via the
FREE ioctl only after that's done.

The kernel does not understand dbus marshaling, it will not look into
the message payload. To allow clients to subscribe to specific subsets
of the broadcast messages we employ bloom filters.

When broadcasting messages, a bloom filter needs to be attached to the
message in a KDBUS_ITEM_BLOOM item (and only for broadcast
messages!). If you don't know what bloom filters are, read up now on
Wikipedia. In short: they are a very efficient way to
probabilistically check whether a certain word is contained in a
vocabulary. A bloom filter knows no false negatives, but it does know
false positives.

The bloom filter that needs to be included has the parameters m=512
(bits in the filter), k=8 (nr of hash functions). The underlying hash
function is SipHash-2-4. We calculate two hash values for each input
string, one with the hash key b9660bf0467047c18875c49c54b9bd15 (this
is supposed to be read as a series of 16 hexadecimally formatted
bytes), and one with the hash key
aaa154a2e0714b39bfe1dd2e9fc54a3b. This results in two 64bit hash
values, A and B. The 8 hash functions for the bloom filter require a 9
bit output each (since m=512=2^9). To generate these we XOR combine
the first 8 bit of A shifted to the left by 1, with the first 8 bit of
B. Then, for the next hash function we use the second 8 bit pair, and
so on.

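To make the index derivation more concrete, here is a rough sketch of
how the eight 9 bit values could be derived from A and B once SipHash
has been run (the hashing itself is left out). Two things are
assumptions on our part and should be checked against libsystemd-bus:
that the "first 8 bit" are the least significant byte, and that bit n
of the filter lives in byte n/8 at position n%8. All names are
illustrative.

```c
#include <stdint.h>

#define BLOOM_BITS   512 /* m */
#define BLOOM_HASHES 8   /* k */

/* Derives the k bit indexes from the two 64bit SipHash results. */
static void bloom_indexes(uint64_t a, uint64_t b,
                          uint16_t idx[BLOOM_HASHES]) {
        for (int i = 0; i < BLOOM_HASHES; i++) {
                /* Assumption: byte 0 is the least significant byte. */
                uint16_t pa = (uint16_t) ((a >> (8 * i)) & 0xff);
                uint16_t pb = (uint16_t) ((b >> (8 * i)) & 0xff);

                /* 8 bit shifted by one, XORed with 8 bit: 0..511 */
                idx[i] = (uint16_t) ((pa << 1) ^ pb);
        }
}

/* Sets the k bits for one hashed string in a 512 bit filter. */
static void bloom_set(uint8_t filter[BLOOM_BITS / 8],
                      uint64_t a, uint64_t b) {
        uint16_t idx[BLOOM_HASHES];

        bloom_indexes(a, b, idx);

        for (int i = 0; i < BLOOM_HASHES; i++)
                filter[idx[i] / 8] |= (uint8_t) (1u << (idx[i] % 8));
}
```

Note that every resulting index is below 512 by construction, so no
modulo operation is needed.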
For each message to send across the bus we populate the bloom filter
with all possible matchable strings. If a client then wants to
subscribe to messages of this type, it simply tells the kernel to test
its own calculated bit mask against the bloom filter of each message.

More specifically, the following strings are added to the bloom filter
of each message that is broadcast:

  The string "interface:" suffixed by the interface name

  The string "member:" suffixed by the member name

  The string "path:" suffixed by the path name

  The string "path-slash-prefix:" suffixed with the path name, and
  also all prefixes of the path name (cut off at "/"), also prefixed
  with "path-slash-prefix:".

  The string "message-type:" suffixed with the strings "signal",
  "method_call", "error" or "method_return" for the respective message
  type.

  If the first argument of the message is a string, "arg0:" suffixed
  with the first argument.

  If the first argument of the message is a string, "arg0-dot-prefix:"
  suffixed with the first argument, and also all prefixes of the
  argument (cut off at "."), also prefixed with "arg0-dot-prefix:".

  If the first argument of the message is a string,
  "arg0-slash-prefix:" suffixed with the first argument, and also all
  prefixes of the argument (cut off at "/"), also prefixed with
  "arg0-slash-prefix:".

  Similar for all further arguments that are strings up to 63, for the
  arguments and their "dot" and "slash" prefixes. On the first
  argument that is not a string, addition to the bloom filter should be
  stopped.

(Note that the bloom filter does not contain sender nor receiver
names!)

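The prefix rules are easiest to understand with an example. The
following sketch enumerates the strings for one key/value pair, it
does not hash them. The function name, the fixed buffer size and the
decision not to emit a prefix for the bare root (i.e. stopping when
only a leading separator is left) are assumptions for illustration,
please compare with libsystemd-bus before relying on them.

```c
#include <stdio.h>
#include <string.h>

#define BLOOM_STR_MAX 256

/*
 * Fills "out" with "<key>:<value>" followed by one entry for each
 * shorter prefix of value, cut off at "sep". Returns the number of
 * entries written, or 0 if the strings do not fit.
 */
static size_t bloom_prefix_strings(const char *key, const char *value,
                                   char sep,
                                   char out[][BLOOM_STR_MAX],
                                   size_t max) {
        char buf[BLOOM_STR_MAX];
        size_t n = 0;

        if (strlen(key) + strlen(value) + 2 > BLOOM_STR_MAX)
                return 0;

        strcpy(buf, value);

        while (n < max) {
                char *e;

                snprintf(out[n++], BLOOM_STR_MAX, "%s:%s", key, buf);

                /* Cut off the last component, stop at the root. */
                e = strrchr(buf, sep);
                if (!e || e == buf)
                        break;
                *e = '\0';
        }

        return n;
}
```

For the key "path-slash-prefix" and the path "/org/foo/bar" this
produces "path-slash-prefix:/org/foo/bar", "path-slash-prefix:/org/foo"
and "path-slash-prefix:/org". For "arg0-dot-prefix" and "org.foo.Bar"
with "." as separator you get the full name, "org.foo" and "org".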
When a client wants to subscribe to messages matching a certain
expression, it should calculate the bloom mask following the same
algorithm. The kernel will then simply test the mask against the
attached bloom filters.

Note that bloom filters are probabilistic, which means that clients
might get messages they did not expect. Your bus protocol
implementation must be capable of dealing with these unexpected
messages (which it needs to anyway, given that transfers are
relatively unrestricted on kdbus and people can send you all kinds of
messages).

To install matches for broadcast messages, use the KDBUS_CMD_ADD_MATCH
ioctl. It takes a structure that contains an encoded match expression,
and that is followed by one or more items, which are combined in an
AND way. (Meaning: a message is matched exactly when all items
attached to the original ioctl struct match).

To match against other user messages add a KDBUS_ITEM_BLOOM item in
the match (see above). Note that the bloom filter does not include
matches on the sender names. To additionally check against sender
names, use the KDBUS_ITEM_ID (for unique id matches) and
KDBUS_ITEM_NAME (for well-known name matches) item types.

To match against kernel generated messages (see below) you should add
items of the same type as the kernel messages include,
i.e. KDBUS_ITEM_NAME_ADD, KDBUS_ITEM_NAME_REMOVE,
KDBUS_ITEM_NAME_CHANGE, KDBUS_ITEM_ID_ADD, KDBUS_ITEM_ID_REMOVE and
fill them out. Note however, that you have some wildcards in this
case, for example the .id field of KDBUS_ITEM_ID_ADD/KDBUS_ITEM_ID_REMOVE
structures may be set to 0 to match against any id addition/removal.

Note that dbus match strings do not map 1:1 to these ioctl() calls. In
many cases (where the match string is "underspecified") you might need
to issue up to six different ioctl() calls for the same match. For
example, the empty match (which matches against all messages), would
translate into one KDBUS_ITEM_BLOOM ioctl, one KDBUS_ITEM_NAME_ADD,
one KDBUS_ITEM_NAME_CHANGE, one KDBUS_ITEM_NAME_REMOVE, one
KDBUS_ITEM_ID_ADD and one KDBUS_ITEM_ID_REMOVE.

When creating a match, you may attach a "cookie" value to it, which
is used for deleting this match again. The cookie can be selected freely
by the client. When issuing KDBUS_CMD_REMOVE_MATCH, simply pass the
same cookie as before and all matches matching the same "cookie" value
will be removed. This is particularly handy for the case where multiple
ioctl()s are issued for a single match string.

The "memfd" concept is used for zero-copy data transfers (see
above). memfds are file descriptors to memory chunks of arbitrary
sizes. If you have a memfd you can mmap() it to get access to the data
it contains or write to it. They are comparable to file descriptors to
unlinked files on a tmpfs, or to anonymous memory that one may refer
to with an fd. They have one particular property: they can be
"sealed". A memfd that is "sealed" is protected from alteration. Only
memfds that are currently not mapped and to which a single fd refers
may be sealed (they may also be unsealed in that case).

The concept of "sealing" makes memfds useful as transport for kdbus
messages: only when the receiver knows that the message it has
received cannot change while it is looking at it, can it safely
parse it without having to copy it to a safe memory area. memfds can also
be reused in multiple messages. A sender may send the same memfd to
multiple peers, and since it is sealed, it can be sure that the receiver
will not be able to modify it. "Sealing" hence provides both sides of
a transaction with the guarantee that the data stays constant.

memfds are a generic concept that can be used outside of the immediate
kdbus usecase. You can send them across AF_UNIX sockets too, sealed or
unsealed. In kdbus itself, they can be used to send zero-copy
payloads, but may also be sent as normal fds.

memfds are allocated with the KDBUS_CMD_MEMFD_NEW ioctl. After allocation,
simply memory map them and write to them. To set their size, use
KDBUS_CMD_MEMFD_SIZE_SET. Note that memfds will be increased in size
automatically if you touch previously unallocated pages. However, the
size will only be increased in multiples of the page size in that
case. Thus, in almost all cases, an explicit KDBUS_CMD_MEMFD_SIZE_SET
is necessary, since it allows setting memfd sizes in finer
granularity. To seal a memfd use the KDBUS_CMD_MEMFD_SEAL_SET ioctl
call. It will only succeed if the caller has the only fd reference to
the memfd open, and if the memfd is currently unmapped.

If memfds are shared, keep in mind that the file pointer used by
write/read/seek is shared too, only pread/pwrite are safe to use.

memfds may be sent across kdbus via KDBUS_ITEM_PAYLOAD_MEMFD items
attached to messages. If this is done, the data included in the memfd
is considered part of the payload stream of a message, and is treated
the same way as KDBUS_ITEM_PAYLOAD_VEC by the receiving side. It is
possible to interleave KDBUS_ITEM_PAYLOAD_MEMFD and
KDBUS_ITEM_PAYLOAD_VEC items freely. The reader will consider them
a single stream of bytes in the order these items appear in
the message, that just happens to be split up at various places
(regarding rules how they may be split up, see above). The kernel will
refuse taking KDBUS_ITEM_PAYLOAD_MEMFD items that refer to memfds that
are not sealed.

Note that sealed memfds may be unsealed again if they are not mapped
and you have the only fd reference to them.

Alternatively to sending memfds as KDBUS_ITEM_PAYLOAD_MEMFD items
(where they are just a part of the payload stream of a message) you can
also simply attach any memfd to a message using
KDBUS_ITEM_PAYLOAD_FDS. In this case, the memfd contents are not
considered part of the payload stream of the message, but simply fds
like any other, that happen to be attached to the message.

MESSAGES FROM THE KERNEL

A couple of messages previously generated by the dbus1 bus driver are
now generated by the kernel. Since the kernel does not understand the
payload marshaling, they are generated by the kernel in a different
format. This is indicated with the "payload type" field of the
messages set to 0. Library implementations should take these messages
and synthesize traditional driver messages for them on reception.

Instead of the NameOwnerChanged, NameLost, NameAcquired signals,
kernel messages containing KDBUS_ITEM_NAME_ADD,
KDBUS_ITEM_NAME_REMOVE, KDBUS_ITEM_NAME_CHANGE, KDBUS_ITEM_ID_ADD,
KDBUS_ITEM_ID_REMOVE items are generated (each message will contain
exactly one of these items). Note that in libsystemd-bus we have
obsoleted NameLost/NameAcquired messages, since they are entirely
redundant to NameOwnerChanged. This library will hence only
synthesize NameOwnerChanged messages from these kernel messages,
and never generate NameLost/NameAcquired. If your library needs to
stay compatible with the old dbus1 userspace, you might need
to synthesize both a NameOwnerChanged and a NameLost/NameAcquired
message from the same kernel message.

When a method call times out, a KDBUS_ITEM_REPLY_TIMEOUT message is
generated. This should be synthesized into a method error reply
message to the original call.

When a method call fails because the peer terminated the connection
before responding, a KDBUS_ITEM_REPLY_DEAD message is
generated. Similarly, it should be synthesized into a method error
reply.

For synthesized messages we recommend setting the cookie field to
(uint32_t) -1 (and not (uint64_t) -1!), so that the cookie is not 0
(which the dbus1 spec does not allow), but clearly recognizable as
synthetic.

Note that the KDBUS_ITEM_NAME_XYZ messages will actually inform you
about all kinds of names, including activatable ones. Classic dbus1
NameOwnerChanged messages OTOH are only generated when a name is
really acquired on the bus and not just simply activatable. This means
you must explicitly check for the case where an activatable name
becomes acquired or an acquired name is lost and returns to be
activatable.

To acquire names on the bus, use the KDBUS_CMD_NAME_ACQUIRE ioctl(). It
takes a flags field similar to dbus1's RequestName() bus driver call,
however the NO_QUEUE flag got inverted into a QUEUE flag instead.

To release a previously acquired name use the KDBUS_CMD_NAME_RELEASE
ioctl().

To list acquired names use the KDBUS_CMD_CONN_INFO ioctl. It may be
used to list unique names, well-known names as well as activatable
names and clients currently queuing for ownership of a well-known
name. The ioctl will return an offset into the memory pool. After
reading all the data you need, you need to release this via the
KDBUS_CMD_FREE ioctl(), similar to how you release a received message.

kdbus can optionally attach various kinds of metadata about the sender at
the point of time of sending ("credentials") to messages, on request
of the receiver. This is supported on both directed and undirected
(broadcast) messages. The metadata to attach is selected at the time of
the HELLO ioctl of the receiver via a flags field (see above). Note
that clients must be able to handle messages that contain more
metadata than they asked for themselves, to simplify implementation of
broadcasting in the kernel. The receiver should not rely on this data
to be around though, even though it will be correct if it happens to
be attached. In order to avoid programming errors in applications, we
recommend though not to pass this data on to clients that did not
explicitly ask for it.

Credentials may also be queried for a well-known or unique name. Use
the KDBUS_CMD_CONN_INFO ioctl for this. It will return an offset into
the pool area again, which will contain the same credential items as
messages have attached. Note that when issuing the ioctl, you can
select a different set of credentials to gather than what was
originally requested for being attached to incoming messages.

Credentials are always specific to the sender namespace that was
current at the time of sending, and of the process that opened the
bus connection at the time of opening it. Note that this latter data

The kernel enforces only very limited policy on names. It will not do
access filtering by userspace payload, and thus not by interface or
method name.

This ultimately means that most fine-grained policy enforcement needs
to be done by the receiving process. We recommend using PolicyKit for
any more complex checks. However, libraries should make simple static
policy decisions regarding privileged/unprivileged method calls
easy. We recommend doing this by enabling KDBUS_ATTACH_CAPS and
KDBUS_ATTACH_CREDS for incoming messages, and then granting client
access based on some capability, or if sender and receiver UIDs match.

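Such a static policy check could look roughly like the following. This
is only a sketch: the struct is hypothetical and assumes your library
has already pulled the sender UID and the relevant capability bit out
of the KDBUS_ITEM_CREDS and KDBUS_ITEM_CAPS items of the incoming
message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical digest of the metadata items of one message. */
struct sender_info {
        uint32_t uid;        /* sender UID, from the creds item */
        bool has_capability; /* whether the sender holds the capability
                              * this method requires, from the caps item */
};

/*
 * Allows a privileged method call if the sender holds the required
 * capability, or if sender and receiver run under the same UID.
 */
static bool privileged_call_allowed(const struct sender_info *sender,
                                    uint32_t own_uid) {
        return sender->has_capability || sender->uid == own_uid;
}
```

Anything more elaborate than this is better delegated to PolicyKit.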
When connecting to kdbus use the "kernel:" protocol prefix in DBus
address strings. The device node path is encoded in its "path="
parameter.

Client libraries should use the following connection string when
connecting to the system bus:

    kernel:path=/dev/kdbus/0-system/bus;unix:path=/run/dbus/system_bus_socket

This will ensure that kdbus is preferred over the legacy AF_UNIX
socket, but compatibility is kept. For the user bus use:

    kernel:path=/dev/kdbus/$UID-user/bus;unix:path=$XDG_RUNTIME_DIR/bus

With $UID replaced by the caller's numeric user ID, and $XDG_RUNTIME_DIR
following the XDG basedir spec.

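Building the user bus address is plain string formatting. A minimal
sketch, with a made-up function name. It follows the
/dev/kdbus/1000-user naming of user bus nodes described at the top of
this document, and leaves looking up the UID (getuid()) and
$XDG_RUNTIME_DIR (getenv()) to the caller:

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes the user bus address for the given UID
 * and runtime directory into buf. Returns 0 on success, -1 if the
 * buffer is too small.
 */
static int user_bus_address(unsigned long uid, const char *runtime_dir,
                            char *buf, size_t size) {
        int n = snprintf(buf, size,
                         "kernel:path=/dev/kdbus/%lu-user/bus;"
                         "unix:path=%s/bus",
                         uid, runtime_dir);

        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

For UID 1000 and a runtime directory of /run/user/1000 this yields
"kernel:path=/dev/kdbus/1000-user/bus;unix:path=/run/user/1000/bus".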
Of course the $DBUS_SYSTEM_BUS_ADDRESS and $DBUS_SESSION_BUS_ADDRESS
variables should still take precedence.

Activatable services for kdbus may not use classic dbus1 service
activation files. Instead, programs should drop in native systemd
.service and .busname unit files, so that they are treated uniformly
with other types of units and activation of the system.

Note that this results in a major difference to classic dbus1:
activatable bus names can be established at any time in the boot process.
This is unlike dbus1 where activatable names are unconditionally available
as long as dbus-daemon is running. Being able to control when
activatable names are established is essential to allow usage of kdbus
during early boot and in initrds, without the risk of triggering
services too early.

This all is so far just the status quo. We are putting this together because
we are quite confident that further API changes will be smaller, but
to make this very clear: this is all subject to change, still!

We invite you to port over your favorite dbus library to this new
scheme, but please be prepared to make minor changes when we still
change these interfaces!