chiark / gitweb /
core: add "invocation ID" concept to service manager
authorLennart Poettering <lennart@poettering.net>
Tue, 30 Aug 2016 21:18:46 +0000 (23:18 +0200)
committerSven Eden <yamakuzure@gmx.net>
Wed, 5 Jul 2017 06:50:53 +0000 (08:50 +0200)
This adds a new invocation ID concept to the service manager. The invocation ID
identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is
generated each time a unit moves from and inactive to an activating or active
state.

The primary usecase for this concept is to connect the runtime data PID 1
maintains about a service with the offline data the journal stores about it.
Previously we'd use the unit name plus start/stop times, which however is
highly racy since the journal will generally process log data after the service
already ended.

The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel,
except that it applies to an individual unit instead of the whole system.

The invocation ID is passed to the activated processes as environment variable.
It is additionally stored as extended attribute on the cgroup of the unit. The
latter is used by journald to automatically retrieve it for each log logged
message and attach it to the log entry. The environment variable is very easily
accessible, even for unprivileged services. OTOH the extended attribute is only
accessible to privileged processes (this is because cgroupfs only supports the
"trusted." xattr namespace, not "user."). The environment variable may be
altered by services, the extended attribute may not be, hence is the better
choice for the journal.

Note that reading the invocation ID off the extended attribute from journald is
racy, similar to the way reading the unit name for a logging process is.

This patch adds APIs to read the invocation ID to sd-id128:
sd_id128_get_invocation() may be used in a similar fashion to
sd_id128_get_boot().

PID1's own logging is updated to always include the invocation ID when it logs
information about a unit.

A new bus call GetUnitByInvocationID() is added that allows retrieving a bus
path to a unit by its invocation ID. The bus path is built using the invocation
ID, thus providing a path for referring to a unit that is valid only for the
current runtime cycleof it.

Outlook for the future: should the kernel eventually allow passing of cgroup
information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we
can alter the invocation ID to be generated as hash from that rather than
entirely randomly. This way we can derive the invocation race-freely from the
messages.

man/sd_id128_get_machine.xml
src/basic/cgroup-util.c
src/basic/log.c
src/basic/log.h
src/core/cgroup.c
src/libelogind/sd-bus/bus-common-errors.c
src/libelogind/sd-id128/id128-util.c
src/libelogind/sd-id128/id128-util.h
src/libelogind/sd-id128/sd-id128.c
src/shared/bus-util.c

index 4fa6f84cf2c6c34725892149cab1cbcb1ea86d32..cd6c39032c497b8995bd705dda38cf4b0b07f243 100644 (file)
@@ -45,6 +45,7 @@
   <refnamediv>
     <refname>sd_id128_get_machine</refname>
     <refname>sd_id128_get_boot</refname>
+    <refname>sd_id128_get_invocation</refname>
     <refpurpose>Retrieve 128-bit IDs</refpurpose>
   </refnamediv>
 
         <paramdef>sd_id128_t *<parameter>ret</parameter></paramdef>
       </funcprototype>
 
+      <funcprototype>
+        <funcdef>int <function>sd_id128_get_invocation</function></funcdef>
+        <paramdef>sd_id128_t *<parameter>ret</parameter></paramdef>
+      </funcprototype>
+
     </funcsynopsis>
   </refsynopsisdiv>
 
     for more information. This function also internally caches the
     returned ID to make this call a cheap operation.</para>
 
-    <para>Note that <function>sd_id128_get_boot()</function> always
-    returns a UUID v4 compatible ID.
-    <function>sd_id128_get_machine()</function> will also return a
-    UUID v4-compatible ID on new installations but might not on older.
-    It is possible to convert the machine ID into a UUID v4-compatible
+    <para><function>sd_id128_get_invocation()</function> returns the invocation ID of the currently executed
+    service. In its current implementation, this reads and parses the <varname>$INVOCATION_ID</varname> environment
+    variable that the service manager sets when activating a service, see
+    <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry> for details. The
+    ID is cached internally. In future a different mechanism to determine the invocation ID may be added.</para>
+
+    <para>Note that <function>sd_id128_get_boot()</function> and <function>sd_id128_get_invocation()</function> always
+    return UUID v4 compatible IDs.  <function>sd_id128_get_machine()</function> will also return a UUID v4-compatible
+    ID on new installations but might not on older.  It is possible to convert the machine ID into a UUID v4-compatible
     one. For more information, see
     <citerefentry><refentrytitle>machine-id</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
 
   <refsect1>
     <title>Notes</title>
 
-    <para>The <function>sd_id128_get_machine()</function> and
-    <function>sd_id128_get_boot()</function> interfaces are available
-    as a shared library, which can be compiled and linked to with the
-    <literal>libelogind</literal> <citerefentry project='die-net'><refentrytitle>pkg-config</refentrytitle><manvolnum>1</manvolnum></citerefentry>
-    file.</para>
+    <para>The <function>sd_id128_get_machine()</function>, <function>sd_id128_get_boot()</function> and
+    <function>sd_id128_get_invocation()</function> interfaces are available as a shared library, which can be compiled
+    and linked to with the <literal>libsystemd</literal> <citerefentry
+    project='die-net'><refentrytitle>pkg-config</refentrytitle><manvolnum>1</manvolnum></citerefentry> file.</para>
   </refsect1>
 
   <refsect1>
     <title>See Also</title>
 
     <para>
-      <citerefentry><refentrytitle>elogind</refentrytitle><manvolnum>8</manvolnum></citerefentry>,
+      <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
       <citerefentry><refentrytitle>sd-id128</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
       <citerefentry><refentrytitle>machine-id</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
-      <citerefentry><refentrytitle>random</refentrytitle><manvolnum>4</manvolnum></citerefentry>,
-      <citerefentry><refentrytitle>sd_id128_randomize</refentrytitle><manvolnum>3</manvolnum></citerefentry>
+      <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
+      <citerefentry><refentrytitle>sd_id128_randomize</refentrytitle><manvolnum>3</manvolnum></citerefentry>,
+      <citerefentry project='man-pages'><refentrytitle>random</refentrytitle><manvolnum>4</manvolnum></citerefentry>
     </para>
   </refsect1>
 
index 65884568e51cdb767c79f85d711cb4cfad4781a4..fce0b9e5df7ebf632d0f23663d8354bf16ff7e7e 100644 (file)
@@ -876,7 +876,7 @@ int cg_set_task_access(
         if (r < 0)
                 return r;
 
-        unified = cg_all_unified();
+        unified = cg_unified(controller);
         if (unified < 0)
                 return unified;
         if (unified)
@@ -891,6 +891,43 @@ int cg_set_task_access(
 }
 #endif // 0
 
+int cg_set_xattr(const char *controller, const char *path, const char *name, const void *value, size_t size, int flags) {
+        _cleanup_free_ char *fs = NULL;
+        int r;
+
+        assert(path);
+        assert(name);
+        assert(value || size <= 0);
+
+        r = cg_get_path(controller, path, NULL, &fs);
+        if (r < 0)
+                return r;
+
+        if (setxattr(fs, name, value, size, flags) < 0)
+                return -errno;
+
+        return 0;
+}
+
+int cg_get_xattr(const char *controller, const char *path, const char *name, void *value, size_t size) {
+        _cleanup_free_ char *fs = NULL;
+        ssize_t n;
+        int r;
+
+        assert(path);
+        assert(name);
+
+        r = cg_get_path(controller, path, NULL, &fs);
+        if (r < 0)
+                return r;
+
+        n = getxattr(fs, name, value, size);
+        if (n < 0)
+                return -errno;
+
+        return (int) n;
+}
+
 int cg_pid_get_path(const char *controller, pid_t pid, char **path) {
         _cleanup_fclose_ FILE *f = NULL;
         char line[LINE_MAX];
@@ -901,18 +938,17 @@ int cg_pid_get_path(const char *controller, pid_t pid, char **path) {
         assert(path);
         assert(pid >= 0);
 
-        unified = cg_all_unified();
+        if (controller) {
+                if (!cg_controller_is_valid(controller))
+                        return -EINVAL;
+        } else
+                controller = SYSTEMD_CGROUP_CONTROLLER;
+
+        unified = cg_unified(controller);
         if (unified < 0)
                 return unified;
-        if (unified == 0) {
-                if (controller) {
-                        if (!cg_controller_is_valid(controller))
-                                return -EINVAL;
-                } else
-                        controller = SYSTEMD_CGROUP_CONTROLLER;
-
+        if (unified == 0)
                 cs = strlen(controller);
-        }
 
         fs = procfs_file_alloca(pid, "cgroup");
         log_debug_elogind("Searching for PID %u in \"%s\" (controller \"%s\")",
@@ -980,7 +1016,7 @@ int cg_install_release_agent(const char *controller, const char *agent) {
 
         assert(agent);
 
-        unified = cg_all_unified();
+        unified = cg_unified(controller);
         if (unified < 0)
                 return unified;
         if (unified) /* doesn't apply to unified hierarchy */
@@ -1031,7 +1067,7 @@ int cg_uninstall_release_agent(const char *controller) {
         _cleanup_free_ char *fs = NULL;
         int r, unified;
 
-        unified = cg_all_unified();
+        unified = cg_unified(controller);
         if (unified < 0)
                 return unified;
         if (unified) /* Doesn't apply to unified hierarchy */
@@ -1087,7 +1123,7 @@ int cg_is_empty_recursive(const char *controller, const char *path) {
         if (controller && (isempty(path) || path_equal(path, "/")))
                 return false;
 
-        unified = cg_all_unified();
+        unified = cg_unified(controller);
         if (unified < 0)
                 return unified;
 
@@ -2276,9 +2312,10 @@ int cg_kernel_controllers(Set *controllers) {
 }
 #endif // 0
 
-static thread_local int unified_cache = -1;
+static thread_local CGroupUnified unified_cache = CGROUP_UNIFIED_UNKNOWN;
+
+static int cg_update_unified(void) {
 
-int cg_all_unified(void) {
         struct statfs fs;
 
         /* Checks if we support the unified hierarchy. Returns an
@@ -2286,8 +2323,8 @@ int cg_all_unified(void) {
          * have any other trouble determining if the unified hierarchy
          * is supported. */
 
-        if (unified_cache >= 0)
-                return unified_cache;
+        if (unified_cache >= CGROUP_UNIFIED_NONE)
+                return 0;
 
         if (statfs("/sys/fs/cgroup/", &fs) < 0)
                 return -errno;
@@ -2302,18 +2339,41 @@ int cg_all_unified(void) {
          * add such a support. */
         if (F_TYPE_EQUAL(fs.f_type, TMPFS_MAGIC)) {
 #endif // 0
-                unified_cache = true;
-        else if (F_TYPE_EQUAL(fs.f_type, TMPFS_MAGIC))
-                unified_cache = false;
-        else
+                unified_cache = CGROUP_UNIFIED_ALL;
+        else if (F_TYPE_EQUAL(fs.f_type, TMPFS_MAGIC)) {
+                if (statfs("/sys/fs/cgroup/systemd/", &fs) < 0)
+                        return -errno;
+
+                unified_cache = F_TYPE_EQUAL(fs.f_type, CGROUP2_SUPER_MAGIC) ?
+                        CGROUP_UNIFIED_SYSTEMD : CGROUP_UNIFIED_NONE;
+        } else
                 return -ENOMEDIUM;
 
-        return unified_cache;
+        return 0;
+}
+
+int cg_unified(const char *controller) {
+
+        int r;
+
+        r = cg_update_unified();
+        if (r < 0)
+                return r;
+
+        if (streq_ptr(controller, SYSTEMD_CGROUP_CONTROLLER))
+                return unified_cache >= CGROUP_UNIFIED_SYSTEMD;
+        else
+                return unified_cache >= CGROUP_UNIFIED_ALL;
+}
+
+int cg_all_unified(void) {
+
+        return cg_unified(NULL);
 }
 
 #if 0 /// UNNEEDED by elogind
 void cg_unified_flush(void) {
-        unified_cache = -1;
+        unified_cache = CGROUP_UNIFIED_UNKNOWN;
 }
 
 int cg_enable_everywhere(CGroupMask supported, CGroupMask mask, const char *p) {
@@ -2421,8 +2481,10 @@ bool cg_is_unified_systemd_controller_wanted(void) {
 
         r = get_proc_cmdline_key("systemd.legacy_systemd_cgroup_controller", NULL);
         if (r > 0) {
+        if (r > 0)
                 wanted = false;
         } else {
+        else {
                 _cleanup_free_ char *value = NULL;
 
                 r = get_proc_cmdline_key("systemd.legacy_systemd_cgroup_controller=", &value);
index a158212d586a0cbb82d1817f2278b0ec1b0517d0..521ea92efd16c892d4c1f3352f28a87db35e175d 100644 (file)
@@ -342,8 +342,6 @@ static int write_to_console(
                 const char *file,
                 int line,
                 const char *func,
-                const char *object_field,
-                const char *object,
                 const char *buffer) {
 
         char location[256], prefix[1 + DECIMAL_STR_MAX(int) + 2];
@@ -402,8 +400,6 @@ static int write_to_syslog(
                 const char *file,
                 int line,
                 const char *func,
-                const char *object_field,
-                const char *object,
                 const char *buffer) {
 
         char header_priority[2 + DECIMAL_STR_MAX(int) + 1],
@@ -465,8 +461,6 @@ static int write_to_kmsg(
                 const char *file,
                 int line,
                 const char *func,
-                const char *object_field,
-                const char *object,
                 const char *buffer) {
 
         char header_priority[2 + DECIMAL_STR_MAX(int) + 1],
@@ -498,7 +492,8 @@ static int log_do_header(
                 int level,
                 int error,
                 const char *file, int line, const char *func,
-                const char *object_field, const char *object) {
+                const char *object_field, const char *object,
+                const char *extra_field, const char *extra) {
 
         snprintf(header, size,
                  "PRIORITY=%i\n"
@@ -508,6 +503,7 @@ static int log_do_header(
                  "%s%s%s"
                  "%s%.*i%s"
                  "%s%s%s"
+                 "%s%s%s"
                  "SYSLOG_IDENTIFIER=%s\n",
                  LOG_PRI(level),
                  LOG_FAC(level),
@@ -526,6 +522,9 @@ static int log_do_header(
                  isempty(object) ? "" : object_field,
                  isempty(object) ? "" : object,
                  isempty(object) ? "" : "\n",
+                 isempty(extra) ? "" : extra_field,
+                 isempty(extra) ? "" : extra,
+                 isempty(extra) ? "" : "\n",
                  program_invocation_short_name);
 
         return 0;
@@ -539,6 +538,8 @@ static int write_to_journal(
                 const char *func,
                 const char *object_field,
                 const char *object,
+                const char *extra_field,
+                const char *extra,
                 const char *buffer) {
 
         char header[LINE_MAX];
@@ -548,7 +549,7 @@ static int write_to_journal(
         if (journal_fd < 0)
                 return 0;
 
-        log_do_header(header, sizeof(header), level, error, file, line, func, object_field, object);
+        log_do_header(header, sizeof(header), level, error, file, line, func, object_field, object, extra_field, extra);
 
         IOVEC_SET_STRING(iovec[0], header);
         IOVEC_SET_STRING(iovec[1], "MESSAGE=");
@@ -573,6 +574,8 @@ static int log_dispatch(
                 const char *func,
                 const char *object_field,
                 const char *object,
+                const char *extra,
+                const char *extra_field,
                 char *buffer) {
 
         assert(buffer);
@@ -604,7 +607,7 @@ static int log_dispatch(
                     log_target == LOG_TARGET_JOURNAL_OR_KMSG ||
                     log_target == LOG_TARGET_JOURNAL) {
 
-                        k = write_to_journal(level, error, file, line, func, object_field, object, buffer);
+                        k = write_to_journal(level, error, file, line, func, object_field, object, extra_field, extra, buffer);
                         if (k < 0) {
                                 if (k != -EAGAIN)
                                         log_close_journal();
@@ -616,7 +619,7 @@ static int log_dispatch(
                 if (log_target == LOG_TARGET_SYSLOG_OR_KMSG ||
                     log_target == LOG_TARGET_SYSLOG) {
 
-                        k = write_to_syslog(level, error, file, line, func, object_field, object, buffer);
+                        k = write_to_syslog(level, error, file, line, func, buffer);
                         if (k < 0) {
                                 if (k != -EAGAIN)
                                         log_close_syslog();
@@ -631,7 +634,7 @@ static int log_dispatch(
                      log_target == LOG_TARGET_JOURNAL_OR_KMSG ||
                      log_target == LOG_TARGET_KMSG)) {
 
-                        k = write_to_kmsg(level, error, file, line, func, object_field, object, buffer);
+                        k = write_to_kmsg(level, error, file, line, func, buffer);
                         if (k < 0) {
                                 log_close_kmsg();
                                 log_open_console();
@@ -639,7 +642,7 @@ static int log_dispatch(
                 }
 
                 if (k <= 0)
-                        (void) write_to_console(level, error, file, line, func, object_field, object, buffer);
+                        (void) write_to_console(level, error, file, line, func, buffer);
 
                 buffer = e;
         } while (buffer);
@@ -665,7 +668,7 @@ int log_dump_internal(
         if (_likely_(LOG_PRI(level) > log_max_level))
                 return -error;
 
-        return log_dispatch(level, error, file, line, func, NULL, NULL, buffer);
+        return log_dispatch(level, error, file, line, func, NULL, NULL, NULL, NULL, buffer);
 }
 
 int log_internalv(
@@ -692,7 +695,7 @@ int log_internalv(
 
         vsnprintf(buffer, sizeof(buffer), format, ap);
 
-        return log_dispatch(level, error, file, line, func, NULL, NULL, buffer);
+        return log_dispatch(level, error, file, line, func, NULL, NULL, NULL, NULL, buffer);
 }
 
 int log_internal(
@@ -721,6 +724,8 @@ int log_object_internalv(
                 const char *func,
                 const char *object_field,
                 const char *object,
+                const char *extra_field,
+                const char *extra,
                 const char *format,
                 va_list ap) {
 
@@ -754,7 +759,7 @@ int log_object_internalv(
 
         vsnprintf(b, l, format, ap);
 
-        return log_dispatch(level, error, file, line, func, object_field, object, buffer);
+        return log_dispatch(level, error, file, line, func, object_field, object, extra_field, extra, buffer);
 }
 
 int log_object_internal(
@@ -765,13 +770,15 @@ int log_object_internal(
                 const char *func,
                 const char *object_field,
                 const char *object,
+                const char *extra_field,
+                const char *extra,
                 const char *format, ...) {
 
         va_list ap;
         int r;
 
         va_start(ap, format);
-        r = log_object_internalv(level, error, file, line, func, object_field, object, format, ap);
+        r = log_object_internalv(level, error, file, line, func, object_field, object, extra_field, extra, format, ap);
         va_end(ap);
 
         return r;
@@ -796,7 +803,7 @@ static void log_assert(
 
         log_abort_msg = buffer;
 
-        log_dispatch(level, 0, file, line, func, NULL, NULL, buffer);
+        log_dispatch(level, 0, file, line, func, NULL, NULL, NULL, NULL, buffer);
 }
 
 noreturn void log_assert_failed(const char *text, const char *file, int line, const char *func) {
@@ -905,7 +912,7 @@ int log_struct_internal(
                 bool fallback = false;
 
                 /* If the journal is available do structured logging */
-                log_do_header(header, sizeof(header), level, error, file, line, func, NULL, NULL);
+                log_do_header(header, sizeof(header), level, error, file, line, func, NULL, NULL, NULL, NULL);
                 IOVEC_SET_STRING(iovec[n++], header);
 
                 va_start(ap, format);
@@ -953,7 +960,7 @@ int log_struct_internal(
         if (!found)
                 return -error;
 
-        return log_dispatch(level, error, file, line, func, NULL, NULL, buf + 8);
+        return log_dispatch(level, error, file, line, func, NULL, NULL, NULL, NULL, buf + 8);
 }
 
 int log_set_target_from_string(const char *e) {
index 0b82b768c97ac951f2bf96774b76a5242d42051a..2e1d8c5833bde82794d36f65d3798d3fa29ea77f 100644 (file)
@@ -102,18 +102,22 @@ int log_object_internal(
                 const char *func,
                 const char *object_field,
                 const char *object,
-                const char *format, ...) _printf_(8,9);
+                const char *extra_field,
+                const char *extra,
+                const char *format, ...) _printf_(10,11);
 
 int log_object_internalv(
                 int level,
                 int error,
-                const char*file,
+                const char *file,
                 int line,
                 const char *func,
                 const char *object_field,
                 const char *object,
+                const char *extra_field,
+                const char *extra,
                 const char *format,
-                va_list ap) _printf_(8,0);
+                va_list ap) _printf_(9,0);
 
 int log_struct_internal(
                 int level,
@@ -203,7 +207,6 @@ void log_assert_failed_return(
 #else
 #  define log_debug_elogind(...) do {} while (0)
 #endif // ENABLE_DEBUG_ELOGIND
-
 /* Structured logging */
 #define log_struct(level, ...) log_struct_internal(level, 0, __FILE__, __LINE__, __func__, __VA_ARGS__)
 #define log_struct_errno(level, error, ...) log_struct_internal(level, error, __FILE__, __LINE__, __func__, __VA_ARGS__)
index e55b54633e2b7fdce71733a52400c57e9fb976ab..ccdab64a623fb91699dcb0886dbd45dc76519e7a 100644 (file)
@@ -67,6 +67,7 @@ void cgroup_context_init(CGroupContext *c) {
 
         c->memory_high = CGROUP_LIMIT_MAX;
         c->memory_max = CGROUP_LIMIT_MAX;
+        c->memory_swap_max = CGROUP_LIMIT_MAX;
 
         c->memory_limit = CGROUP_LIMIT_MAX;
 
@@ -174,6 +175,7 @@ void cgroup_context_dump(CGroupContext *c, FILE* f, const char *prefix) {
                 "%sMemoryLow=%" PRIu64 "\n"
                 "%sMemoryHigh=%" PRIu64 "\n"
                 "%sMemoryMax=%" PRIu64 "\n"
+                "%sMemorySwapMax=%" PRIu64 "\n"
                 "%sMemoryLimit=%" PRIu64 "\n"
                 "%sTasksMax=%" PRIu64 "\n"
                 "%sDevicePolicy=%s\n"
@@ -195,6 +197,7 @@ void cgroup_context_dump(CGroupContext *c, FILE* f, const char *prefix) {
                 prefix, c->memory_low,
                 prefix, c->memory_high,
                 prefix, c->memory_max,
+                prefix, c->memory_swap_max,
                 prefix, c->memory_limit,
                 prefix, c->tasks_max,
                 prefix, cgroup_device_policy_to_string(c->device_policy),
@@ -618,7 +621,7 @@ static unsigned cgroup_apply_blkio_device_limit(Unit *u, const char *dev_path, u
 }
 
 static bool cgroup_context_has_unified_memory_config(CGroupContext *c) {
-        return c->memory_low > 0 || c->memory_high != CGROUP_LIMIT_MAX || c->memory_max != CGROUP_LIMIT_MAX;
+        return c->memory_low > 0 || c->memory_high != CGROUP_LIMIT_MAX || c->memory_max != CGROUP_LIMIT_MAX || c->memory_swap_max != CGROUP_LIMIT_MAX;
 }
 
 static void cgroup_apply_unified_memory_limit(Unit *u, const char *file, uint64_t v) {
@@ -666,7 +669,7 @@ static void cgroup_context_apply(Unit *u, CGroupMask mask, ManagerState state) {
                 bool has_weight = cgroup_context_has_cpu_weight(c);
                 bool has_shares = cgroup_context_has_cpu_shares(c);
 
-                if (cg_unified() > 0) {
+                if (cg_all_unified() > 0) {
                         uint64_t weight;
 
                         if (has_weight)
@@ -847,12 +850,14 @@ static void cgroup_context_apply(Unit *u, CGroupMask mask, ManagerState state) {
         }
 
         if ((mask & CGROUP_MASK_MEMORY) && !is_root) {
-                if (cg_unified() > 0) {
+                if (cg_all_unified() > 0) {
                         uint64_t max = c->memory_max;
+                        uint64_t swap_max = c->memory_swap_max;
 
-                        if (cgroup_context_has_unified_memory_config(c))
+                        if (cgroup_context_has_unified_memory_config(c)) {
                                 max = c->memory_max;
-                        else {
+                                swap_max = c->memory_swap_max;
+                        } else {
                                 max = c->memory_limit;
 
                                 if (max != CGROUP_LIMIT_MAX)
@@ -862,6 +867,7 @@ static void cgroup_context_apply(Unit *u, CGroupMask mask, ManagerState state) {
                         cgroup_apply_unified_memory_limit(u, "memory.low", c->memory_low);
                         cgroup_apply_unified_memory_limit(u, "memory.high", c->memory_high);
                         cgroup_apply_unified_memory_limit(u, "memory.max", max);
+                        cgroup_apply_unified_memory_limit(u, "memory.swap.max", swap_max);
                 } else {
                         char buf[DECIMAL_STR_MAX(uint64_t) + 1];
                         uint64_t val = c->memory_limit;
@@ -1021,7 +1027,7 @@ CGroupMask unit_get_own_mask(Unit *u) {
                 e = unit_get_exec_context(u);
                 if (!e ||
                     exec_context_maintains_privileges(e) ||
-                    cg_unified() > 0)
+                    cg_all_unified() > 0)
                         return _CGROUP_MASK_ALL;
         }
 
@@ -1247,7 +1253,7 @@ int unit_watch_cgroup(Unit *u) {
                 return 0;
 
         /* Only applies to the unified hierarchy */
-        r = cg_unified();
+        r = cg_unified(SYSTEMD_CGROUP_CONTROLLER);
         if (r < 0)
                 return log_unit_error_errno(u, r, "Failed detect whether the unified hierarchy is used: %m");
         if (r == 0)
@@ -1357,6 +1363,26 @@ int unit_attach_pids_to_cgroup(Unit *u) {
         return 0;
 }
 
+static void cgroup_xattr_apply(Unit *u) {
+        char ids[SD_ID128_STRING_MAX];
+        int r;
+
+        assert(u);
+
+        if (!MANAGER_IS_SYSTEM(u->manager))
+                return;
+
+        if (sd_id128_is_null(u->invocation_id))
+                return;
+
+        r = cg_set_xattr(SYSTEMD_CGROUP_CONTROLLER, u->cgroup_path,
+                         "trusted.invocation_id",
+                         sd_id128_to_string(u->invocation_id, ids), 32,
+                         0);
+        if (r < 0)
+                log_unit_warning_errno(u, r, "Failed to set invocation ID on control group %s, ignoring: %m", u->cgroup_path);
+}
+
 static bool unit_has_mask_realized(Unit *u, CGroupMask target_mask, CGroupMask enable_mask) {
         assert(u);
 
@@ -1400,6 +1426,7 @@ static int unit_realize_cgroup_now(Unit *u, ManagerState state) {
 
         /* Finally, apply the necessary attributes. */
         cgroup_context_apply(u, target_mask, state);
+        cgroup_xattr_apply(u);
 
         return 0;
 }
@@ -1526,6 +1553,8 @@ void unit_prune_cgroup(Unit *u) {
         if (!u->cgroup_path)
                 return;
 
+        (void) unit_get_cpu_usage(u, NULL); /* Cache the last CPU usage value before we destroy the cgroup */
+
         is_root_slice = unit_has_name(u, SPECIAL_ROOT_SLICE);
 
         r = cg_trim_everywhere(u->manager->cgroup_supported, u->cgroup_path, !is_root_slice);
@@ -1647,7 +1676,7 @@ int unit_watch_all_pids(Unit *u) {
         if (!u->cgroup_path)
                 return -ENOENT;
 
-        if (cg_unified() > 0) /* On unified we can use proper notifications */
+        if (cg_unified(SYSTEMD_CGROUP_CONTROLLER) > 0) /* On unified we can use proper notifications */
                 return 0;
 
         return unit_watch_pids_in_path(u, u->cgroup_path);
@@ -1721,7 +1750,7 @@ static int on_cgroup_inotify_event(sd_event_source *s, int fd, uint32_t revents,
 int manager_setup_cgroup(Manager *m) {
         _cleanup_free_ char *path = NULL;
         CGroupController c;
-        int r, unified;
+        int r, all_unified, systemd_unified;
         char *e;
 
         assert(m);
@@ -1762,11 +1791,17 @@ int manager_setup_cgroup(Manager *m) {
         if (r < 0)
                 return log_error_errno(r, "Cannot find cgroup mount point: %m");
 
-        unified = cg_unified();
-        if (unified < 0)
-                return log_error_errno(r, "Couldn't determine if we are running in the unified hierarchy: %m");
-        if (unified > 0)
+        all_unified = cg_all_unified();
+        systemd_unified = cg_unified(SYSTEMD_CGROUP_CONTROLLER);
+
+        if (all_unified < 0 || systemd_unified < 0)
+                return log_error_errno(all_unified < 0 ? all_unified : systemd_unified,
+                                       "Couldn't determine if we are running in the unified hierarchy: %m");
+
+        if (all_unified > 0)
                 log_debug("Unified cgroup hierarchy is located at %s.", path);
+        else if (systemd_unified > 0)
+                log_debug("Unified cgroup hierarchy is located at %s. Controllers are on legacy hierarchies.", path);
         else
                 log_debug("Using cgroup controller " SYSTEMD_CGROUP_CONTROLLER ". File system hierarchy is at %s.", path);
 
@@ -1774,7 +1809,7 @@ int manager_setup_cgroup(Manager *m) {
                 const char *scope_path;
 
                 /* 3. Install agent */
-                if (unified) {
+                if (systemd_unified) {
 
                         /* In the unified hierarchy we can get
                          * cgroup empty notifications via inotify. */
@@ -1852,7 +1887,7 @@ int manager_setup_cgroup(Manager *m) {
                         return log_error_errno(errno, "Failed to open pin file: %m");
 
                 /* 6.  Always enable hierarchical support if it exists... */
-                if (!unified)
+                if (!all_unified)
                         (void) cg_set_attribute("memory", "/", "memory.use_hierarchy", "1");
         }
 
@@ -1989,7 +2024,6 @@ int manager_notify_cgroup_empty(Manager *m, const char *cgroup) {
         return 0;
 }
 #endif // 0
-
 #if 0 /// UNNEEDED by elogind
 int unit_get_memory_current(Unit *u, uint64_t *ret) {
         _cleanup_free_ char *v = NULL;
@@ -2004,7 +2038,7 @@ int unit_get_memory_current(Unit *u, uint64_t *ret) {
         if ((u->cgroup_realized_mask & CGROUP_MASK_MEMORY) == 0)
                 return -ENODATA;
 
-        if (cg_unified() <= 0)
+        if (cg_all_unified() <= 0)
                 r = cg_get_attribute("memory", u->cgroup_path, "memory.usage_in_bytes", &v);
         else
                 r = cg_get_attribute("memory", u->cgroup_path, "memory.current", &v);
@@ -2049,7 +2083,7 @@ static int unit_get_cpu_usage_raw(Unit *u, nsec_t *ret) {
         if (!u->cgroup_path)
                 return -ENODATA;
 
-        if (cg_unified() > 0) {
+        if (cg_all_unified() > 0) {
                 const char *keys[] = { "usage_usec", NULL };
                 _cleanup_free_ char *val = NULL;
                 uint64_t us;
@@ -2089,7 +2123,21 @@ int unit_get_cpu_usage(Unit *u, nsec_t *ret) {
         nsec_t ns;
         int r;
 
+        assert(u);
+
+        /* Retrieve the current CPU usage counter. This will subtract the CPU counter taken when the unit was
+         * started. If the cgroup has been removed already, returns the last cached value. To cache the value, simply
+         * call this function with a NULL return value. */
+
         r = unit_get_cpu_usage_raw(u, &ns);
+        if (r == -ENODATA && u->cpu_usage_last != NSEC_INFINITY) {
+                /* If we can't get the CPU usage anymore (because the cgroup was already removed, for example), use our
+                 * cached value. */
+
+                if (ret)
+                        *ret = u->cpu_usage_last;
+                return 0;
+        }
         if (r < 0)
                 return r;
 
@@ -2098,7 +2146,10 @@ int unit_get_cpu_usage(Unit *u, nsec_t *ret) {
         else
                 ns = 0;
 
-        *ret = ns;
+        u->cpu_usage_last = ns;
+        if (ret)
+                *ret = ns;
+
         return 0;
 }
 
@@ -2108,6 +2159,8 @@ int unit_reset_cpu_usage(Unit *u) {
 
         assert(u);
 
+        u->cpu_usage_last = NSEC_INFINITY;
+
         r = unit_get_cpu_usage_raw(u, &ns);
         if (r < 0) {
                 u->cpu_usage_base = 0;
index d73b8faeac79cf52cc95840b0a41de245b27eb3d..ed3f5fa38e426b3e32982e240c2ed0118ccd4827 100644 (file)
@@ -28,6 +28,7 @@ BUS_ERROR_MAP_ELF_REGISTER const sd_bus_error_map bus_common_errors[] = {
 #if 0 /// UNNEEDED by elogind
         SD_BUS_ERROR_MAP(BUS_ERROR_NO_SUCH_UNIT,                 ENOENT),
         SD_BUS_ERROR_MAP(BUS_ERROR_NO_UNIT_FOR_PID,              ESRCH),
+        SD_BUS_ERROR_MAP(BUS_ERROR_NO_UNIT_FOR_INVOCATION_ID,    ENOENT),
         SD_BUS_ERROR_MAP(BUS_ERROR_UNIT_EXISTS,                  EEXIST),
         SD_BUS_ERROR_MAP(BUS_ERROR_LOAD_FAILED,                  EIO),
         SD_BUS_ERROR_MAP(BUS_ERROR_JOB_FAILED,                   EREMOTEIO),
@@ -54,6 +55,8 @@ BUS_ERROR_MAP_ELF_REGISTER const sd_bus_error_map bus_common_errors[] = {
         SD_BUS_ERROR_MAP(BUS_ERROR_MACHINE_EXISTS,               EEXIST),
         SD_BUS_ERROR_MAP(BUS_ERROR_NO_PRIVATE_NETWORKING,        ENOSYS),
 #endif // 0
+        SD_BUS_ERROR_MAP(BUS_ERROR_NO_SUCH_USER_MAPPING,         ENXIO),
+        SD_BUS_ERROR_MAP(BUS_ERROR_NO_SUCH_GROUP_MAPPING,        ENXIO),
 
         SD_BUS_ERROR_MAP(BUS_ERROR_NO_SUCH_SESSION,              ENXIO),
         SD_BUS_ERROR_MAP(BUS_ERROR_NO_SESSION_FOR_PID,           ENXIO),
@@ -66,6 +69,7 @@ BUS_ERROR_MAP_ELF_REGISTER const sd_bus_error_map bus_common_errors[] = {
         SD_BUS_ERROR_MAP(BUS_ERROR_DEVICE_NOT_TAKEN,             EINVAL),
         SD_BUS_ERROR_MAP(BUS_ERROR_OPERATION_IN_PROGRESS,        EINPROGRESS),
         SD_BUS_ERROR_MAP(BUS_ERROR_SLEEP_VERB_NOT_SUPPORTED,     EOPNOTSUPP),
+        SD_BUS_ERROR_MAP(BUS_ERROR_SESSION_BUSY,                 EBUSY),
 
 #if 0 /// UNNEEDED by elogind
         SD_BUS_ERROR_MAP(BUS_ERROR_AUTOMATIC_TIME_SYNC_ENABLED,  EALREADY),
@@ -85,6 +89,25 @@ BUS_ERROR_MAP_ELF_REGISTER const sd_bus_error_map bus_common_errors[] = {
         SD_BUS_ERROR_MAP(BUS_ERROR_LINK_BUSY,                    EBUSY),
         SD_BUS_ERROR_MAP(BUS_ERROR_NETWORK_DOWN,                 ENETDOWN),
 
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "FORMERR",               EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "SERVFAIL",              EHOSTDOWN),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "NXDOMAIN",              ENXIO),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "NOTIMP",                ENOSYS),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "REFUSED",               EACCES),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "YXDOMAIN",              EEXIST),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "YRRSET",                EEXIST),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "NXRRSET",               ENOENT),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "NOTAUTH",               EACCES),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "NOTZONE",               EREMOTE),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADVERS",               EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADKEY",                EKEYREJECTED),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADTIME",               EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADMODE",               EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADNAME",               EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADALG",                EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADTRUNC",              EBADMSG),
+        SD_BUS_ERROR_MAP(_BUS_ERROR_DNS "BADCOOKIE",             EBADR),
+
         SD_BUS_ERROR_MAP(BUS_ERROR_NO_SUCH_TRANSFER,             ENXIO),
         SD_BUS_ERROR_MAP(BUS_ERROR_TRANSFER_IN_PROGRESS,         EBUSY),
 #endif // 0
index c3f527d6570f83d7ad323e42fec039b5c4303418..337eae24b45132adfd5c8f1ba6f7dfe7e094ca36 100644 (file)
@@ -192,3 +192,16 @@ int id128_write(const char *p, Id128Format f, sd_id128_t id, bool do_sync) {
 
         return id128_write_fd(fd, f, id, do_sync);
 }
+
+void id128_hash_func(const void *p, struct siphash *state) {
+        siphash24_compress(p, 16, state);
+}
+
+int id128_compare_func(const void *a, const void *b) {
+        return memcmp(a, b, 16);
+}
+
+const struct hash_ops id128_hash_ops = {
+        .hash = id128_hash_func,
+        .compare = id128_compare_func,
+};
index 3ba59acbcab33ae3074f198c2b4bfb0efadc6d6c..6b3855acbbba3f6268533dcaa40119ddffb29ef3 100644 (file)
@@ -22,6 +22,8 @@
 #include <stdbool.h>
 
 #include "sd-id128.h"
+
+#include "hash-funcs.h"
 #include "macro.h"
 
 char *id128_to_uuid_string(sd_id128_t id, char s[37]);
@@ -43,3 +45,7 @@ int id128_read(const char *p, Id128Format f, sd_id128_t *ret);
 
 int id128_write_fd(int fd, Id128Format f, sd_id128_t id, bool do_sync);
 int id128_write(const char *p, Id128Format f, sd_id128_t id, bool do_sync);
+
+void id128_hash_func(const void *p, struct siphash *state);
+int id128_compare_func(const void *a, const void *b) _pure_;
+extern const struct hash_ops id128_hash_ops;
index 9f47d04e61b26f1fe61750ae2858e87d4bf93172..d4450c70a003c357b0dafce92a542e2f67f87459 100644 (file)
@@ -129,6 +129,28 @@ _public_ int sd_id128_get_boot(sd_id128_t *ret) {
         return 0;
 }
 
+_public_ int sd_id128_get_invocation(sd_id128_t *ret) {
+        static thread_local sd_id128_t saved_invocation_id = {};
+        int r;
+
+        assert_return(ret, -EINVAL);
+
+        if (sd_id128_is_null(saved_invocation_id)) {
+                const char *e;
+
+                e = secure_getenv("INVOCATION_ID");
+                if (!e)
+                        return -ENXIO;
+
+                r = sd_id128_from_string(e, &saved_invocation_id);
+                if (r < 0)
+                        return r;
+        }
+
+        *ret = saved_invocation_id;
+        return 0;
+}
+
 static sd_id128_t make_v4_uuid(sd_id128_t id) {
         /* Stolen from generate_random_uuid() of drivers/char/random.c
          * in the kernel sources */
index fde3c3119a02875ebd60f311a7e59954021d6a43..d22a613af56c9b4e794726fd6f461c0d1850fbc4 100644 (file)
@@ -1348,7 +1348,7 @@ int bus_property_get_id128(
         if (sd_id128_is_null(*id)) /* Add an empty array if the ID is zero */
                 return sd_bus_message_append(reply, "ay", 0);
         else
-                return sd_bus_message_append_array(reply, 'b', id->bytes, 16);
+                return sd_bus_message_append_array(reply, 'y', id->bytes, 16);
 }
 
 #if __SIZEOF_SIZE_T__ != 8