Prep v234: Eventually fix the cgroup stuff. elogind is not init.
Prep v233.3: Unmask various functions for future coverage tests. These functions, although not used by elogind itself, are mostly tiny and crucial for important tests to work.
Prep v233: Add missing updates from upstream in src/core
core/mount-setup: if unified hierarchy is not supported, fall back to legacy

    We need this to gracefully support older or strangely configured kernels.

    v2:
    - do not install a callback handler, just embed the right conditions
      into cg_is_*_wanted()

    v3:
    - fix bug in cg_is_legacy_wanted()
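    The idea of embedding the fallback into the cg_is_*_wanted() predicates
    can be sketched roughly as follows. This is only an illustration under
    stated assumptions: the SketchProbe struct and the sketch_* names are
    hypothetical, and the real functions query the running system rather
    than a struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical probe results; real code would inspect the kernel and its
 * command line instead of taking them as a struct. */
typedef struct {
        bool kernel_has_unified;  /* cgroup v2 filesystem is available */
        bool unified_requested;   /* full unified mode was asked for */
        bool hybrid_requested;    /* hybrid mode was asked for */
} SketchProbe;

/* Unified is only wanted if it was requested AND the kernel supports it. */
bool sketch_is_unified_wanted(const SketchProbe *p) {
        assert(p);
        return p->kernel_has_unified && p->unified_requested;
}

/* Hybrid also needs kernel support, and loses to a full unified request. */
bool sketch_is_hybrid_wanted(const SketchProbe *p) {
        assert(p);
        return p->kernel_has_unified && !p->unified_requested && p->hybrid_requested;
}

/* Legacy is the fallback: wanted whenever neither unified flavour is. */
bool sketch_is_legacy_wanted(const SketchProbe *p) {
        return !sketch_is_unified_wanted(p) && !sketch_is_hybrid_wanted(p);
}
```

    Because the support check lives inside the predicates, a caller that
    walks a table of mounts needs no separate callback: on a kernel without
    cgroup v2, only the legacy predicate returns true.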
Rename cg_is_unified_elogind_controller_wanted to cg_is_hybrid_wanted

    Less typing and doesn't make the table so incredibly wide.
core: add comment why we don't bother with MS_SHARED remounting of / in containers
core: make hybrid cgroup unified mode keep compat /sys/fs/cgroup/elogind hierarchy

    Currently the hybrid mode mounts cgroup v2 on /sys/fs/cgroup instead of
    the v1 name=elogind hierarchy. While this works fine for elogind itself,
    it breaks tools which expect cgroup v1 hierarchy on
    /sys/fs/cgroup/elogind.

    This patch updates the hybrid mode so that it mounts v2 hierarchy on
    /sys/fs/cgroup/unified and keeps v1 "name=elogind" hierarchy on
    /sys/fs/cgroup/elogind for compatibility. elogind itself doesn't depend
    on the "name=elogind" hierarchy at all. All operations take place on the
    v2 hierarchy as before but the v1 hierarchy is kept in sync so that any
    tools which expect it to be there can keep doing so. This allows elogind
    to take advantage of cgroup v2 process management without requiring
    other tools to be aware of the hybrid mode.

    The hybrid mode is implemented by mapping the special elogind controller
    to /sys/fs/cgroup/unified and making the basic cgroup utility operations
    - cg_attach(), cg_create(), cg_rmdir() and cg_trim() - also operate on
    the /sys/fs/cgroup/elogind hierarchy whenever the cgroup2 hierarchy is
    updated.

    While a bit messy, this will allow dropping complications from using
    cgroup v1 for process management a lot sooner than otherwise possible
    which should make it a net gain in terms of maintainability.

    v2: Fixed !cgns breakage reported by @evverx and renamed the unified
        mount point to /sys/fs/cgroup/unified as suggested by @brauner.

    v3: chown the compat hierarchy too on delegation. Suggested by @evverx.

    v4: [zj]
    - drop the change to default, full "legacy" is still the default.
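    The "keep the v1 hierarchy in sync" behaviour can be pictured as every
    basic operation resolving to two directories instead of one. The sketch
    below only computes those two paths. sketch_hybrid_targets() and
    SKETCH_PATH_MAX are hypothetical names, and the real cg_create() and
    friends do considerably more than path construction.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PATH_MAX 256

/* The two mount points named in the commit message above. */
static const char *const sketch_roots[2] = {
        "/sys/fs/cgroup/unified",  /* cgroup v2, where operations really happen */
        "/sys/fs/cgroup/elogind",  /* v1 "name=elogind" compat hierarchy */
};

/* Fill `paths` with the directories one cgroup operation has to touch in
 * hybrid mode. Returns the number of paths (2), or -1 on invalid input. */
int sketch_hybrid_targets(const char *cgroup, char paths[2][SKETCH_PATH_MAX]) {
        assert(cgroup);
        assert(paths);

        if (cgroup[0] != '/')
                return -1;

        for (int i = 0; i < 2; i++) {
                /* The root cgroup "/" maps to the mount point itself. */
                int n = snprintf(paths[i], SKETCH_PATH_MAX, "%s%s", sketch_roots[i],
                                 strcmp(cgroup, "/") == 0 ? "" : cgroup);
                if (n < 0 || n >= SKETCH_PATH_MAX)
                        return -1;
        }

        return 2;
}
```

    A caller would apply the same mkdir, rmdir or attach step to both
    returned paths, which is what keeps tools reading the compat hierarchy
    working.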
tree-wide: stop using canonicalize_file_name(), use chase_symlinks() instead

    Let's use chase_symlinks() everywhere, and stop using GNU
    canonicalize_file_name(). For most cases this should not change
    behaviour, but it increases the exposure of our function so that it
    gets better tested. Most importantly, in a few cases (most notably
    nspawn) it can take the correct root directory into account when
    chasing symlinks.
core: use the unified hierarchy for the elogind cgroup controller hierarchy

    Currently, elogind uses either the legacy hierarchies or the unified
    hierarchy. When the legacy hierarchies are used, elogind uses a named
    legacy hierarchy mounted on /sys/fs/cgroup/elogind without any kernel
    controllers for process management. Due to the shortcomings in the
    legacy hierarchy, this involves a lot of workarounds and complexities.

    Because the unified hierarchy can be mounted and used in parallel to
    legacy hierarchies, there's no reason for elogind to use a legacy
    hierarchy for management even if the kernel resource controllers need to
    be mounted on legacy hierarchies. It can simply mount the unified
    hierarchy under /sys/fs/cgroup/elogind and use it without affecting
    other legacy hierarchies. This disables a significant amount of fragile
    workaround logic and would allow using features which depend on the
    unified hierarchy membership, such as the bpf cgroup v2 membership test.
    In time, this would also allow deleting the said complexities.

    This patch updates elogind so that it prefers the unified hierarchy for
    the elogind cgroup controller hierarchy when legacy hierarchies are used
    for kernel resource controllers.

    * cg_unified(@controller) is introduced, which tests whether the
      specific controller is on the unified hierarchy and is used to choose
      the unified hierarchy code path for process and service management
      when available. Kernel controller specific operations remain gated by
      cg_all_unified().

    * "elogind.legacy_elogind_cgroup_controller" kernel argument can be used
      to force the use of legacy hierarchy for elogind cgroup controller.

    * nspawn: By default nspawn uses the same hierarchies as the host. If
      UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for
      all. If 0, legacy for all.

    * nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes
      one of three options - legacy, only elogind controller on unified, and
      unified. The value is passed into mount setup functions and controls
      cgroup configuration.

    * nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual
      mount option is moved to mount_legacy_cgroup_hierarchy() so that it
      can take an appropriate action depending on the configuration of the
      host.

    v2: - CGroupUnified enum replaces open coded integer values to indicate
          the cgroup operation mode.
        - Various style updates.

    v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during
        v2.

    v4: Restored legacy container on unified host support and fixed another
        bug in detect_unified_cgroup_hierarchy().
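    The three-way mode and the two kinds of checks described above can be
    sketched like this. All sketch_* and SKETCH_* names are hypothetical
    stand-ins for the CGroupUnified enum, cg_unified() and cg_all_unified().
    The controller string "name=elogind" is assumed from the hierarchy name
    used elsewhere in this log, and the environment handling covers only the
    "0" and "1" values the message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef enum {
        SKETCH_UNIFIED_NONE,     /* legacy hierarchies for everything */
        SKETCH_UNIFIED_ELOGIND,  /* only the elogind controller on unified */
        SKETCH_UNIFIED_ALL,      /* unified hierarchy for everything */
} SketchCGroupUnified;

/* Assumed controller name, see the lead-in. */
#define SKETCH_ELOGIND_CONTROLLER "name=elogind"

/* Gate for kernel controller specific operations. */
bool sketch_cg_all_unified(SketchCGroupUnified mode) {
        return mode == SKETCH_UNIFIED_ALL;
}

/* Gate for process management: is this one controller on unified? */
bool sketch_cg_unified(SketchCGroupUnified mode, const char *controller) {
        assert(controller);

        switch (mode) {
        case SKETCH_UNIFIED_ALL:
                return true;
        case SKETCH_UNIFIED_ELOGIND:
                return strcmp(controller, SKETCH_ELOGIND_CONTROLLER) == 0;
        default:
                return false;
        }
}

/* nspawn-style selection: follow the host unless the variable says 0 or 1. */
SketchCGroupUnified sketch_nspawn_mode(const char *env, SketchCGroupUnified host) {
        if (!env)
                return host;
        if (strcmp(env, "1") == 0)
                return SKETCH_UNIFIED_ALL;
        if (strcmp(env, "0") == 0)
                return SKETCH_UNIFIED_NONE;
        return host;  /* other values are not interpreted in this sketch */
}
```

    The middle mode is what makes the split useful: process management
    checks pass for the elogind controller while resource controller checks
    still take the legacy path.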
Prep v231: Apply missing fixes from upstream (2/6) src/core
Prep v230: Apply missing upstream fixes and updates (4/8) src/core.
core/mount-setup.c: also relabel /dev/shm for selinux (#3039)

    Daemons that wish to transition state from the initramfs to the real
    root might use /dev/shm for their state. As /dev is not relabeled across
    mount points, /dev/shm has to be relabeled explicitly.
Prep v229: Add missing fixes from upstream [2/6] src/core
core: log about path_is_mount_point() errors

    We really shouldn't fail silently, but print a log message about these
    errors. Also make sure to attach error codes to all log messages where
    that makes sense. (While we are at it, add a couple of (void) casts to
    functions where we knowingly ignore return values.)
mount-setup.c: fix handling of symlink Smack labelling in cgroup setup

    The code introduced in f8c1a81c51 (= elogind 227) failed for me with:

      Failed to copy smack label from net_cls to /sys/fs/cgroup/net_cls: No such file or directory

    There is no need for a symlink in this case because source and target
    are identical. The symlink() call is allowed to fail when the target
    already exists. When that happens, copying the Smack label must be
    skipped.

    But the code also failed when there is a symlink, like
    "cpu -> cpu,cpuacct", because mac_smack_copy() got called with
    src="cpu,cpuacct" which fails to find the entry because the current
    directory is not inside /sys/fs/cgroup. The absolute path to the
    existing entry must be used instead.
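    The two rules from this fix (skip the copy when the symlink already
    exists, otherwise copy from an absolute source path) can be sketched as
    a small decision helper. sketch_label_source() is a hypothetical
    function, not part of the actual patch. It takes the outcome of
    symlink() as 0 or a negative errno value so that the logic can be shown
    without touching the filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Decide what to do after trying to create an alias symlink.
 * Returns 1 and writes the absolute label source into `src` if the label
 * should be copied, 0 if copying must be skipped, negative errno on error. */
int sketch_label_source(int symlink_result, const char *target, char *src, size_t size) {
        int n;

        assert(target);
        assert(src);

        /* The link (or a directory of that name) already exists: skip. */
        if (symlink_result == -EEXIST)
                return 0;

        /* Any other failure is a real error. */
        if (symlink_result < 0)
                return symlink_result;

        /* A bare "cpu,cpuacct" would be resolved against the current
         * directory, so build the absolute path instead. */
        n = snprintf(src, size, "/sys/fs/cgroup/%s", target);
        if (n < 0 || (size_t) n >= size)
                return -ENAMETOOLONG;

        return 1;
}
```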
Prep v228: Condense elogind source masks (5/5)
Prep v228: Condense elogind source masks (4/5)
Prep v228: Add remaining updates from upstream (3/3)

    Apply the remaining fixes, and the move of utility functions into their
    own foo-util.[hc] files, to the rest of elogind.
[3/5] Apply missing fixes from upstream
mount: propagate error codes correctly

    Make sure to propagate error codes from mount-loops correctly. Right
    now, we return the return-code of the first mount that did _something_.
    This is not what we want. Make sure we return an error if _any_ mount
    fails (and then make sure to return the first error to not hide proper
    errors due to consequential errors like -ENOTDIR).

    Reported by cee1 <fykcee1@gmail.com>.
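    The intended return-value rule can be sketched as a loop that attempts
    every mount but remembers only the first failure. The names
    sketch_mount_all(), sketch_mount_fn and sketch_fake_mount() are
    hypothetical, and the callback here simply returns 0 or a negative
    errno value, which is a simplification of the real per-mount helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-entry callback: 0 on success, negative errno on failure. */
typedef int (*sketch_mount_fn)(size_t index);

/* Try all n mounts. Returns 0 only if every one succeeded, otherwise the
 * first error that was seen. */
int sketch_mount_all(sketch_mount_fn do_mount, size_t n) {
        int r = 0;

        assert(do_mount);

        for (size_t i = 0; i < n; i++) {
                int k = do_mount(i);

                /* Keep the first error so that later, consequential
                 * failures such as -ENOTDIR do not mask the root cause. */
                if (k < 0 && r == 0)
                        r = k;
        }

        return r;
}

/* Stand-in for real mounts: entry 1 fails with -EACCES and entry 2 with
 * -ENOTDIR, everything else succeeds. */
int sketch_fake_mount(size_t index) {
        switch (index) {
        case 1:
                return -EACCES;
        case 2:
                return -ENOTDIR;
        default:
                return 0;
        }
}
```

    With the stand-in above, running four entries reports -EACCES rather
    than the later -ENOTDIR, and the loop still attempts the fourth entry.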