Systemd 257 #356818

Merged
merged 9 commits into from
Dec 20, 2024

@jmbaur jmbaur commented Nov 17, 2024

Release notes for 257

CHANGES WITH 257:

Incompatible changes:

  • The --purge switch of systemd-tmpfiles (which was added in v256) has been reworked: it will now only apply to tmpfiles.d/ lines marked with the new "$" flag. This is an incompatible change, and means any tmpfiles.d/ files which shall be used together with --purge need to be updated accordingly. This change has been made to make it harder to accidentally delete too many files when using --purge incorrectly.

  • The systemd-creds 'cat' verb now expects base64-encoded encrypted credentials as input, for consistency with the 'decrypt' verb and the LoadCredentialEncrypted= service setting. Previously it could only read raw, unencoded binary data.

  • Support for automatic flushing of the nscd user/group database caches has been dropped.

  • The FileDescriptorName= setting for socket units is now honored by Accept=yes sockets too, where it was previously silently ignored and "connection" was used unconditionally.

  • systemd-logind now always obeys block inhibitor locks, where previously it ignored locks taken by the caller or when the caller was root. A privileged caller can always close the other sessions, remove the inhibitor locks, or use --force or --check-inhibitors=no to ignore the inhibitors. This change thus doesn't affect security, since everything that was possible before at a given privilege level is still possible, but it should make the inhibitor logic easier to use and understand, and also help avoid accidental reboots and shutdowns. New 'block-weak' inhibitor modes were added; if taken, they make the inhibitor lock work as in previous versions. Inhibitor locks can also be taken by remote users (subject to polkit policy).

  • systemd-nspawn will now mount the unified cgroup hierarchy into a container if no systemd installation is found in a container's root filesystem. $SYSTEMD_NSPAWN_UNIFIED_HIERARCHY=0 can be used to override this behavior.

  • /dev/disk/by-id/nvme-* block device symlinks without an NVMe namespace identifier are now fixed to namespace 1 of the device. If no namespace 1 exists for a device no such symlink is created. Previously, these symlinks would point to an unspecified namespace, and thus not be strictly stable references to multi-namespace NVMe devices. These un-namespaced symlinks are mostly obsolete, users and applications should always use the ones with encoded namespace information instead. This change should not affect too many systems, because most NVMe devices only know a namespace 1 by default.

  • Support for cgroup v1 ('legacy' and 'hybrid' hierarchies) is now considered obsolete and systemd by default will ignore configuration that enables them. To forcibly reenable cgroup v1 support, SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 must additionally be set on the kernel command line.
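
The FileDescriptorName= change above matters to services that look up their sockets by name. As a rough sketch (not systemd's own code), a Python service could read the names from the documented socket-activation variables $LISTEN_PID, $LISTEN_FDS and $LISTEN_FDNAMES. The function name `named_listen_fds` and the name "client" in the usage note are made up for illustration.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed fd under the sd_listen_fds() convention


def named_listen_fds(environ=None, pid=None):
    """Return {name: [fd, ...]} for file descriptors passed via socket activation.

    Hypothetical helper: it only parses the environment variables and does not
    validate that the descriptors are actually open.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid

    # The variables are only meant for the process named in $LISTEN_PID.
    if environ.get("LISTEN_PID") != str(pid):
        return {}

    count = int(environ.get("LISTEN_FDS", "0"))
    raw_names = environ.get("LISTEN_FDNAMES", "")
    names = raw_names.split(":") if raw_names else []

    result = {}
    for index in range(count):
        # Placeholder label chosen by this sketch when no name was passed.
        name = names[index] if index < len(names) else "unknown"
        result.setdefault(name, []).append(SD_LISTEN_FDS_START + index)
    return result
```

With a socket unit that sets FileDescriptorName=client together with Accept=yes, a v257 service would be expected to find its connection under "client" rather than "connection"; code that hard-codes the old name may need a fallback.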

Announcements of Future Feature Removals:

  • The D-Bus method org.freedesktop.systemd1.StartAuxiliaryScope() is deprecated because accounting data and such cannot be reasonably migrated between cgroups. It is likely to be fully removed in a future release (reach out if you have use cases).

  • The recommended kernel baseline version has been bumped to v5.4 (released in 2019). Expect limited testing on older kernel versions, where the "old-kernel" taint flag would also be set. Support for them will be phased out in a future release in 2025, i.e. we expect to bump the minimum baseline to v5.4 then too.

  • The complete removal of support for cgroup v1 ('legacy' and 'hybrid' hierarchies) is scheduled for v258.

  • Support for System V service scripts is deprecated and will be removed in v258. Please make sure to update your software now to include a native systemd unit file instead of a legacy System V script to retain compatibility with future systemd releases.

  • To work around limitations of X11's keyboard handling systemd's keyboard mapping hardware database (hwdb.d/60-keyboard.hwdb) so far mapped the microphone mute and touchpad on/off/toggle keys to the function keys F20, F21, F22, F23 instead of their correct key codes. This key code mangling will be removed in the next systemd release. To maintain compatibility with X11 applications that rely on the old function key code mappings, this mangling has now been moved to the relevant X11 keyboard driver modules. In order to ensure these keys continue to work, update to xf86-input-evdev >= 2.11.0 and xf86-input-libinput >= 1.5.0 before updating to systemd >= 258.

  • Support for the SystemdOptions EFI variable is deprecated. 'bootctl systemd-efi-options' will emit a warning when used. It seems that this feature is little-used and it is better to use alternative approaches like credentials and confexts. The plan is to drop support altogether at a later point, but this might be revisited based on user feedback.

  • systemd-run's switch --expand-environment= which currently is disabled by default when combined with --scope, will be changed in a future release to be enabled by default.
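
Two of the announcements above (the v5.4 kernel baseline and the cgroup v1 removal) can be checked on a running machine ahead of time. The following is a minimal sketch, not an official tool; the helper names are hypothetical, and the cgroup check relies on the assumption that a unified (v2) hierarchy exposes a `cgroup.controllers` file at the top of /sys/fs/cgroup.

```python
import re
from pathlib import Path

RECOMMENDED_KERNEL = (5, 4)  # baseline named in the announcement above


def kernel_version(release):
    """Extract (major, minor) from a release string such as '6.6.8-200.fc39'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_kernel_baseline(release, baseline=RECOMMENDED_KERNEL):
    """True if the release is at least the recommended baseline."""
    return kernel_version(release) >= baseline


def uses_unified_cgroups(root="/sys/fs/cgroup"):
    """Heuristic: the cgroup v2 root contains a 'cgroup.controllers' file."""
    return (Path(root) / "cgroup.controllers").is_file()


if __name__ == "__main__":
    import platform

    release = platform.release()
    print(f"kernel {release}: baseline met = {meets_kernel_baseline(release)}")
    print(f"unified cgroup hierarchy = {uses_unified_cgroups()}")
```

A machine that reports `False` for either check is the kind of system the announcements are warning about.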

libsystemd:

  • systemd's JSON API is now available as public interface of libsystemd, under the name "sd-json". The purpose of the library is to allow structures to be conveniently created in C code and serialized to JSON, and for JSON to be conveniently deserialized into in-memory structures, using callbacks to handle specific keys. Various data types like integers, floats, booleans, strings, UUIDs, base64-encoded and hex-encoded binary data, and arrays are supported natively. The library has been part of systemd for a while as internal component, and is now made publicly available. One major user of sd-json is sd-varlink (see below). Note that the documentation of sd-json is very much incomplete for now, but the systemd codebase provides plenty of real-life code examples.

  • systemd's Varlink IPC API is now available as part of libsystemd, under the name "sd-varlink". This library is a C implementation of the Varlink IPC system (https://varlink.org/) that has been adopted by systemd for various interfaces. It relies on the sd-json JSON component, see above. Note that the documentation of sd-varlink is very much incomplete for now, but the systemd codebase provides plenty of real-life code examples.

  • sd-bus gained a new call sd_bus_pending_method_calls() which returns the number of currently open asynchronous method calls initiated on this connection towards peers.

  • sd-device gained a new call sd_device_monitor_is_running() that returns whether the specified monitor object is already running. It also gained sd_device_monitor_get_fd(), sd_device_monitor_get_events(), sd_device_monitor_get_timeout() and sd_device_monitor_receive() to permit sd-device to run on top of a foreign event loop implementation. It also gained sd_device_get_driver_subsystem() which returns the subsystem of driver objects. The new sd_device_get_device_id() call returns a short string identifying the device record.

System and Service Management:

  • The environment variable $REMOTE_ADDR is now set when using per-connection socket activation for AF_UNIX stream sockets. It contains the AF_UNIX peer address of the connection. (Previously the environment variable was only set for IP sockets.)

  • Multipath TCP (MPTCP) is now supported as a socket protocol for .socket units.

  • A new /etc/fstab option x-systemd.wants= creates "Wants=" dependencies. (This is similar to the previously available x-systemd.requires=.)

  • The initialization of the system clock during boot and updates has been simplified: both PID 1 and systemd-timesyncd will pick the latest minimum time as indicated by the compiled-in epoch, /usr/lib/clock-epoch, and /var/lib/systemd/timesync/clock. See systemd(1) for an updated detailed description.

  • The kernel's Ctrl-Alt-Delete handling is re-enabled during late shutdown, so that the user may use it to initiate a reboot if the system freezes otherwise.

  • The new value "identity" for the unit setting PrivateUsers= may be used to request a user namespace with an identity mapping for the first 65536 UIDs/GIDs. This is analogous to systemd-nspawn's --private-users=identity.

  • The new value "disconnected" for the unit setting PrivateTmp= may be used to specify that a separate tmpfs instance should be used for /tmp/ and /var/tmp/ for the unit.

  • The service manager (and various other tools too) now uses pidfds in more places to refer to processes.

  • A build option -D link-executor-shared=false can be used to build the systemd-executor binary (added in a previous release) in a way where it does not link to shared libsystemd-shared-….so library. PID1 holds a reference to the executor binary that was on disk when the manager was started or restarted, but the shared libraries it is linked to are not loaded until the executor binary needs to be used. This partial static linking is a workaround for the issue where, during upgrades, the old libsystemd-shared-….so may have already been removed and the pinned executor binary will just fail to execute.

  • The systemd.machine_id= kernel command line parameter interpreted by PID 1 now supports an additional special value: if set to "firmware" the machine ID is initialized from the SMBIOS/DeviceTree system UUID. (Previously this was already done automatically in VM environments, this extends the concept to any system, but only on explicit request via this option.)

  • The ImportCredential= setting in service unit files now permits renaming of credentials as they are imported.

  • The RestartMode= setting gained a new "debug" value. If specified and the service fails so that it shall be restarted it is invoked in "debugging mode". Debugging mode means that the $DEBUG_INVOCATION environment variable will be set to "1" for the new invocation. Moreover, any setting LogLevelMax= will be temporarily changed to "debug" for the next invocation. This mode is useful to automatically repeat invocation of tools in case they fail – but with additional logging or testing routines enabled.

  • A new service setting BindLogSockets= has been added that controls whether the AF_UNIX sockets required for logging shall be bind mounted to the mount sandbox allocated for the service.

  • At early boot, PID 1 will now optionally load a policy for the new Linux IPE LSM.

  • Transient services (as invoked by the StartTransientUnit() D-Bus method) may now receive additional, arbitrary file descriptors to pass to executed service processes during activation using the new ExtraFileDescriptor= unit property.

  • Calendar .timer units gained a new boolean DeferReactivation= option. If enabled and the repetitive calendar timer elapses again while the service the timer activates is still running, immediate reactivation of the service once it finishes is skipped, and the timer has to elapse again before the service is reactivated.

  • Generator processes invoked by the service manager will now receive a new environment variable $SYSTEMD_SOFT_REBOOTS_COUNT that indicates how many times the system has been soft-rebooted since the kernel initialized.

  • A new service property ManagedOOMMemoryPressureDurationSec= has been added that complements the existing ManagedOOMMemoryPressureLimit= and specifies the PSI measurement interval for the specific unit.

  • The sd_notify() protocol has been extended to allow changing the main PID of a process by providing a pidfd of the new main process, or by specifying the pidfd inode number. Previously this was only supported by specifying the classic UNIX PID, which of course is racy.

  • The SocketUser=/SocketGroup= settings of .socket units are now also applied to POSIX message queues.

  • The ProtectControlGroups= unit file setting now supports two additional values: if set to "private" a new cgroup namespace is allocated for the service and cgroupfs mounted accordingly; if set to "strict" a new cgroup namespace is allocated for the service, and cgroupfs is mounted read-only for the service.

  • The StateDirectory=, RuntimeDirectory=, CacheDirectory=, LogsDirectory=, and ConfigurationDirectory= settings gained support for configuring the respective directories as read-only, via a ':ro' flag that can be appended to each setting's value.

  • When DynamicUser= is combined with StateDirectory=/RuntimeDirectory=/CacheDirectory=/LogsDirectory= and ID mapped mounts are available on the referenced path, the data in there is now preferably made available by establishing an ID mapped mount from the "nobody" user to the dynamic user, rather than via recursive chown()ing.

  • A new service property PrivatePIDs= has been added that runs executed processes as PID 1 - the init process - within their own PID namespace. PrivatePIDs= also mounts /proc/ so only processes within the new PID namespace are visible.
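
The RestartMode=debug entry above says a restarted service sees $DEBUG_INVOCATION set to "1". How a service reacts to that is up to the service; one plausible pattern, sketched here with Python's standard logging module, is to raise its own verbosity. The logger name "example-service" and the helper `pick_log_level` are illustrative assumptions, not part of systemd.

```python
import logging
import os


def pick_log_level(environ=None):
    """Choose DEBUG when the service manager requested a debug invocation."""
    environ = os.environ if environ is None else environ
    # The release notes state the variable is set to "1" in debugging mode.
    if environ.get("DEBUG_INVOCATION") == "1":
        return logging.DEBUG
    return logging.INFO


def main():
    logging.basicConfig(level=pick_log_level())
    log = logging.getLogger("example-service")  # hypothetical service name
    log.debug("extra diagnostics enabled for this invocation")
    log.info("service starting")


if __name__ == "__main__":
    main()
```

Because the notes say LogLevelMax= is also temporarily raised to "debug" for that invocation, the additional messages should not be filtered out on the way to the journal.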

systemd-udevd:

  • udev rules now set 'uaccess' for /dev/udmabuf, giving locally logged-in users access to the hardware. This is useful in order to support IPMI cameras with libcamera.

  • Serial port devices will no longer show up as systemd units, unless they have an IO port or memory assigned to them. This means that only serial ports that actually exist should show up as .device units now.

  • mtd devices (i.e. certain kinds of flash memory devices) will now show up as .device units in systemd.

  • The firmware_node/sun sysfs attribute will now be used (if available) for naming slot-based network interfaces, i.e. ID_NET_NAME_SLOT. Moreover the interface aliases specified in DeviceTree are now searched for both on the interface's parent device (as before) and the device itself (new).

  • Various USB hardware wallets are now recognized by udev via a .hwdb file, and get the ID_HARDWARE_WALLET= property set, which enables "uaccess" for them, i.e. direct unprivileged access.

  • udevadm info will now output the device ID string in lines prefixed with "J:", and the driver subsystem in lines prefixed with "B:".

  • udev rules files now support case-insensitive attribute matching (e.g. ATTR{foo}==i"abcd")

systemd-logind:

  • New DesignatedMaintenanceTime= configuration option allows shutdowns to be automatically scheduled at the specified time.

  • logind now reacts to Ctrl-Alt-Shift-Esc being pressed. It will send out a org.freedesktop.login1.SecureAttentionKey signal, indicating a request by the user for the system to display a secure login dialog. The handling of SAK can be suppressed in logind configuration.

  • logind now supports handing off session-managed access to hidraw devices via its D-Bus APIs, the same way it already supports that for DRM and evdev input devices. This permits unprivileged clients to get hidraw fds for a device, that are automatically suspended when the session switches away.

  • systemd-logind now exposes two D-Bus properties CanLock and CanIdle for all sessions. These properties indicate whether the session's class supports screen locking and idleness detection.

  • systemd-inhibit now allows interactive polkit authorization. It gained a --no-ask-password option to suppress it.

systemd-machined:

  • Unprivileged clients are now allowed to register VMs and containers. Machines started via the [email protected] unit will now be registered with systemd-machined.

  • systemd-machined gained a pretty complete set of Varlink APIs exposing its functionality. This is an alternative to the pre-existing D-Bus interface.

systemd-resolved:

  • The resolvconf command now supports '-p' switch. If specified, the interface will not be used as the default route for domain name lookups.

  • resolvectl now enables interactive polkit authorization. It gained a --no-ask-password option to suppress it.

systemd-networkd and networkctl:

  • IPv6 address labels can be also configured in a new [IPv6AddressLabel] section with Prefix= and Label= settings in networkd.conf. Please see networkd.conf(5) for more details.

  • 'networkctl edit' can now read the new file contents from standard input with the new --stdin option.

  • 'networkctl edit' and 'cat' now support editing/showing .netdev files by link. 'networkctl cat' can also list all configuration files associated with an interface at once with ':all'.

  • networkctl gained a --no-ask-password option to suppress interactive polkit authorization.

  • "mac" has been added to the default AlternativeNamesPolicy= setting for network links (via 99-default.link). This means "enx*" interface names will now be added to the list of alternative interface names by default, for all interfaces that have a MAC address assigned by hardware.

  • networkd .netdev bridge devices gained a new setting FDBMaxLearned= for setting a limit on the number of dynamically learned FDB entries.

  • networkd .network files for bridge devices now support Layer 2 (in addition to the pre-existing Layer 3) MDB entries, via MulticastGroupAddress=.

  • systemd-networkd will now log when per-network sysctls belonging to network interfaces managed by it are changed outside of networkd, thus highlighting conflict of ownership/management of these knobs.

  • systemd-networkd will now make RFC9463 DNR fields available to systemd-resolved, for automatic DNS DoT configuration, and similar.

  • The "dhcp" and "dhcp-on-stop" values for KeepConfiguration= setting in .network file are replaced with "dynamic" and "dynamic-on-stop", respectively. When specified, systemd-networkd will preserve all dynamic configurations via DHCPv4, DHCPv6, NDISC, and IPv4LL with ACD, while previously only DHCPv4 configurations were kept. Also, when systemd-networkd is restarted, regardless of the setting, these dynamic configurations are unconditionally kept. So, systemd-networkd can be restarted without disturbing ongoing connections.

  • systemd-networkd now updates traffic control configuration without clearing existing settings. Thus, those settings can be updated by editing relevant .network files and triggering 'networkctl reload'.

  • systemd-networkd now gracefully updates netdev settings specified in .netdev files when 'networkctl reload' is called. Previously, if the relevant interfaces existed, new settings would not be applied. Now, new settings will be applied if possible. Some settings cannot be updated after a netdev is configured, e.g. VLAN ID can be only specified on creation. To change such settings, user needs to remove existing interfaces, and invoke 'networkctl reload' or restart systemd-networkd.
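
For the KeepConfiguration= rename described above, existing .network files can be updated mechanically. The snippet below is a small sketch of such a rewrite, operating on file contents as a string; the function name is hypothetical, and it deliberately touches only the two values the notes list as replaced.

```python
import re

# Old value -> new value, as listed in the release notes above.
_RENAMES = {"dhcp": "dynamic", "dhcp-on-stop": "dynamic-on-stop"}
_SETTING = re.compile(r"^(\s*KeepConfiguration\s*=\s*)(\S+)(\s*)$")


def migrate_keep_configuration(text):
    """Rewrite renamed KeepConfiguration= values, leaving other lines untouched."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # preserve the original line terminator
        match = _SETTING.match(body)
        if match and match.group(2) in _RENAMES:
            body = match.group(1) + _RENAMES[match.group(2)] + match.group(3)
        out.append(body + ending)
    return "".join(out)
```

Running it over a file containing `KeepConfiguration=dhcp-on-stop` yields `KeepConfiguration=dynamic-on-stop`; other values such as `static` pass through unchanged.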

systemd-boot, systemd-stub, and related tools:

  • The EFI stub now supports loading of .ucode sections with microcode from PE add-on files. It also now supports loading .initrd sections from PE add-on files.

  • A new .profile PE section type is now documented and supported in systemd-measure, ukify, systemd-stub and systemd-boot. These new sections allow multiple "profiles" to be stored together in the UKI, where each .profile section creates groupings of sections in the UKI, allowing some sections to be shared and other sections like .cmdline or .initrd unique to the profile. This may be used to provide a single UKI that synthesizes multiple menu items in the boot menu (for example, a regular one to boot, plus a debugging one, or a factory reset one, and so on – which only differ in kernel command line, but nothing else).

  • New .dtbauto and .hwids sections are now documented and supported in systemd-measure, ukify, systemd-stub, and systemd-boot. A single UKI can contain multiple .dtbauto sections, and the 'compatible' string therein will be compared with the equivalent field in the DTB provided by the firmware, if present. If absent, SMBIOS will be used to calculate hardware IDs (CHIDs) and look them up in the content of .hwids, hopefully revealing a fallback 'compatible' string. This allows including multiple DTBs in a single UKI, with systemd-stub automatically loading the correct one for the current hardware.

  • ukify gained an --extend switch to import an existing UKI to be extended, and a --measure-base= switch to support measurement of multi-profile UKIs.

  • ukify gained a --certificate-provider switch to use an OpenSSL provider to load the certificate used to sign artifacts, instead of having to provide the path to a file on disk.

  • bootctl, systemd-keyutil, systemd-measure, systemd-repart, and systemd-sbsign gained a new --certificate-source switch that allows loading the X.509 certificate from an OpenSSL provider instead of a file system path.

  • systemd-boot's menu will now react to volume up/down rocker presses the same way as to arrow up/down presses: they move the menu item up or down. This is useful on device form factors that have only a volume rocker but no arrow keys (e.g. phones).

  • systemd-stub will report the partition UUID and image identifier its UKI executable is placed on separately from the data systemd-boot provides about where to find its own executable, via EFI variables. This is useful when systemd-boot and UKIs are placed on distinct partitions (i.e. ESP and XBOOTLDR).

  • bootctl gained new switches --print-loader-path and --print-stub-path that output the path to the boot loader or UKI used for the current boot.

  • bootctl kernel-identify now recognizes EFI add-ons.

  • bootctl gained a --random-seed=yes|no option to control provisioning of the random seed file in the ESP. (This is useful when producing an image that will be used in multiple instances.)

  • bootctl now optionally supports installing UEFI Secure Boot databases (i.e. db/dbx/… databases in ESL format) for systemd-boot to pick up and automatically enroll if the system is booted in Setup Mode. This is controlled via bootctl's new --secure-boot-auto-enroll=yes switch (and some auxiliary ones). A certificate can be provided in DER format, and is automatically converted into an ESL, as needed.

  • bootctl, systemd-measure, systemd-repart when referencing signing keys on OpenSSL engines may now query for PINs and similar via systemd's native systemd-ask-password logic (and take benefit of its caching and UI).

  • A new systemd-sbsign tool has been added, that can be used to sign EFI binaries (PE) for Secure Boot. This tool supports OpenSSL engines and providers, with pin caching support for PKCS11. ukify supports it as an alternative to sbsigntool and pesign.

  • A new systemd-keyutil tool has been added, that can be used to perform various operations on private keys and X.509 certificates.

The journal:

  • journalctl can now list invocations of a unit with the new --list-invocations option and show logs for a specific invocation with the new --invocation/-I option. (This is analogous to the --list-boots/--boot/-b options.)

systemd-sysupdate and related tools:

  • systemd-sysupdated has been added as system service, allowing unprivileged clients to update the system via D-Bus calls. Note that for now the systemd-sysupdated API is considered experimental, and is not considered stable yet.
    A new updatectl command-line tool can be used to control the service.

  • systemd-sysupdate gained a new --offline option to force it to operate locally. This is useful when listing locally installed versions.

  • systemd-sysupdate gained a new --transfer-source= option to set the directory relative to which transfer sources configured with PathRelativeTo=explicit will be interpreted.

  • systemd-sysupdate now reports download progress via sd_notify().

  • systemd-sysupdate now supports output in JSON mode for all commands.

  • systemd-sysupdate definitions may now carry references to ChangeLog and AppStream metadata.

  • Transfer definitions for systemd-sysupdate are supposed to carry the ".transfer" suffix now, changing from ".conf". The latter remains supported for compatibility, but it's recommended to rename all files reflecting this suffix change.

  • systemd-sysupdate now supports new ".feature" files that may be used in conjunction with ".transfer" files to group them together, and allow them to be turned off or on, individually per group.
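
The recommended ".conf" to ".transfer" rename above is easy to script. This is a cautious sketch rather than an official migration tool: the function names are invented, the directory to process is supplied by the caller, and it assumes every ".conf" file in that directory is a transfer definition. It defaults to a dry run and skips files whose target name already exists.

```python
from pathlib import Path


def plan_renames(names):
    """Map each '*.conf' name to its '*.transfer' name, skipping collisions."""
    existing = set(names)
    plan = {}
    for name in sorted(existing):
        if name.endswith(".conf"):
            target = name[: -len(".conf")] + ".transfer"
            if target not in existing:
                plan[name] = target
    return plan


def rename_transfer_files(directory, dry_run=True):
    """Print the planned renames and perform them unless dry_run is set."""
    directory = Path(directory)
    plan = plan_renames(p.name for p in directory.iterdir() if p.is_file())
    for old, new in plan.items():
        print(f"{old} -> {new}")
        if not dry_run:
            (directory / old).rename(directory / new)
    return plan
```

Calling `rename_transfer_files(some_dir)` first and reviewing the printed plan before passing `dry_run=False` keeps the change reversible.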

TPM & systemd-cryptsetup:

  • The 'has-tpm2' verb which reports whether TPM2 functionality is available has been moved from systemd-creds to systemd-analyze.

  • systemd-tpm2-setup will gracefully handle TPMs that have a PIN set on the TPM, and not attempt to automatically set up a Storage Root Key (SRK) in that case.

  • New crypttab option password-cache=yes|no|read-only can be used to customize password caching.

  • New crypttab options fido2-pin=, fido2-up=, fido2-uv= can be used to enable/disable the PIN query, User Presence check, and User Verification.

  • systemd-cryptenroll gained new options --fido2-salt-file= and --fido2-parameters-in-header= to simplify manual enrollment of FIDO2 tokens.

  • systemd-cryptenroll, systemd-repart, and systemd-storagetm gained a new --list-devices option to list appropriate candidate block devices.

  • systemd-cryptenroll/systemd-cryptsetup now support combined signed PCR policies and local systemd-pcrlock policies for unlocking a disk. Or in other words, it's now possible to bind unlocking of a local disk to a specific OS vendor and a locally managed set of measurements describing the local system.

varlinkctl:

  • varlinkctl gained a new verb 'list-methods' to show a list of methods implemented by a service.

  • varlinkctl gained a --quiet/-q option to suppress method call replies.

  • varlinkctl gained a --graceful= option to suppress specific Varlink errors, and treat them as success.

  • varlinkctl gained a --timeout= option to limit how long the invocation can take.

  • varlinkctl allows remote invocations over ssh, via the new "ssh-exec:" address specification. It'll make an ssh connection, start the specified executable on the remote side, and communicate with the remote process using the Varlink protocol.
    The "ssh:" address specification has been renamed to "ssh-unix:" (reflecting the fact it is used to connect to a remote AF_UNIX socket via SSH). The old syntax is still supported for backwards compatibility.

  • varlinkctl's 'introspect' verb no longer requires specification of an interface name. If none is specified all interfaces exposed by the service are shown. Moreover, more than one interface name may be specified now, in which case all specified ones are displayed.

systemd-repart:

  • systemd-repart's CopyBlocks= directive can now use a character device as source (in addition to previously supported regular files and block devices). This is useful for initializing a partition from /dev/urandom or similar.

  • systemd-repart gained new Compression= and CompressionLevel= settings to enable internal compression in filesystems created offline.

  • systemd-repart understands a new MakeSymlinks= option to create one or more symlinks (each specified as a symlink name and target) within a newly formatted file system.

  • systemd-repart gained a new SupplementFor= setting that allows allocating a partition only if some other existing partition cannot be adjusted to match the constraints defined for it. This is useful to generate an XBOOTLDR partition if and only if an ESP already exists that is too small for the required constraints.

  • The default size of verity hash partitions is now automatically derived from SizeMaxBytes= of the data partition it is protecting.

systemd-ssh-proxy:

  • systemd-ssh-proxy now also supports the AF_UNIX-based "VSOCK MUX" protocol used by CloudHypervisor/Firecracker to expose AF_VSOCK sockets of the VM on the host. Or in other words: it's now possible to directly connect to ssh via AF_VSOCK from hosts to VMs of these two hypervisors (previously this was only supported for hypervisors which expose AF_VSOCK on the host as AF_VSOCK, such as qemu).

  • systemd-ssh-proxy can now reference local VMs by their name: connect to any local VM "foobar" registered with systemd-machined via "ssh machine/foobar" using the AF_VSOCK protocol.

systemd-analyze:

  • systemd-analyze will now show the SMBIOS Type #11 vendor strings set for the machine with a new 'smbios11' verb.

  • systemd-analyze gained a new --instance= option that can be used to provide an instance name to analyze multiple templates instantiated with the same instance name.

  • systemd-analyze's "capability" verb gained a new --mask parameter. It takes a numeric capability mask, which is decoded into the capabilities it contains.

  • systemd-analyze's "plot" verb gained two new settings: --scale-svg= allows the X axis of the plot to be stretched by a factor. If --detailed is specified activation timestamps are shown in the plot.
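
To illustrate what decoding a capability mask means, here is a short sketch that expands a numeric mask into capability names, using the bit numbering from the kernel's capability list. It is not systemd-analyze's implementation and makes no claim about that tool's input or output format; the function name and the "cap_<n>" fallback for unknown bits are this sketch's own choices.

```python
# Capability names indexed by bit number, following linux/capability.h.
CAPABILITY_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read", "cap_perfmon", "cap_bpf",
    "cap_checkpoint_restore",
]


def decode_capability_mask(mask):
    """Return the names of all capabilities whose bit is set in mask."""
    if mask < 0:
        raise ValueError("capability mask must be non-negative")
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            if bit < len(CAPABILITY_NAMES):
                names.append(CAPABILITY_NAMES[bit])
            else:
                names.append(f"cap_{bit}")  # bit not known to this table
        bit += 1
    return names
```

For example, a mask with bits 12 and 21 set decodes to `cap_net_admin` and `cap_sys_admin`.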

busctl:

  • 'busctl monitor' gained new options --limit-messages= and --timeout= to set the number of matches or limit the runtime of the command.

  • busctl now supports doing method calls with embedded unix file descriptors.

  • busctl acquired a new "wait" command to wait for a specific signal to arrive.

systemd-nspawn:

  • systemd-nspawn --bind-user= will now propagate the bound user's SSH public key (if included in the user record) into the container, ensuring that any such bound user is directly accessible via ssh.

  • systemd-nspawn now supports unprivileged FUSE inside containers.

systemd-importd:

  • A new generator systemd-import-generator has been added to synthesize image download jobs. This provides functionality similar to importctl, but is configured via the kernel command line and system credentials. It may be used to automatically download sysext, confext, portable service, nspawn container or vmspawn VM images at boot.

  • systemd-importd now provides a Varlink IPC interface, in addition to its existing D-Bus IPC interface.

  • The individual import/export tools will now display a nice progress bar when downloading files.

systemd-userdb & systemd-homed:

  • userdbctl gained a pair of switches --uid-min= and --uid-max= to filter the UID/GID range of the listed users or groups. It also gained a new switch --disposition= to filter them by disposition (i.e. show only system users or only regular users, and so on). It also gained a new switch --fuzzy that permits a "fuzzy" search for a user, i.e. doing a substring and string distance search, and looking into the real name field of the user and other similar fields. It gained a new switch --boundaries=no for disabling display of the UID/GID range boundaries in its output.

  • User records learnt a new set of fields that may list field names that may be changed by the user themselves without requiring administrator authentication. This new field is honoured by systemd-homed to allow users to change selected properties of their own user records.

systemd-run & run0:

  • run0 gained a new pair of settings --pty and --pipe that control whether to invoke the specified binary on a freshly allocated pseudo TTY, or whether to pass the client's STDIN/STDOUT/STDERR through directly.

  • run0 gained a new switch --shell-prompt-prefix= that permits passing in a string to display on each shell prompt as prefix. If not specified otherwise this will show a superhero emoji (🦸), in order to visually communicate the temporarily elevated privileges a run0 session provides. This makes use of the $SHELL_PROMPT_PREFIX environment variable mentioned below.

  • systemd-run can output some of its runtime data in JSON format via the new --json= option.

systemd-tmpfiles:

  • systemd-tmpfiles --purge switch now requires specification of at least one tmpfiles.d/ drop-in file.

  • tmpfiles.d/ files gained a new '?' specifier for the 'L' line type to create a symlink only if the source exists, and gracefully skip the line otherwise.
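
The "create the symlink only if the source exists, otherwise skip" behaviour described above can be modelled in a few lines. This is a sketch of the semantics only, assuming absolute paths; it is not how systemd-tmpfiles is implemented, and the helper name `symlink_if_source_exists` is hypothetical.

```python
import os


def symlink_if_source_exists(link_path: str, source: str) -> bool:
    """Create link_path -> source only when source exists.

    Returns True if a symlink was created and False if the request was
    skipped because the source is missing. Handling of an already
    existing link_path is out of scope for this sketch.
    """
    if not os.path.exists(source):
        # Source is missing: skip gracefully instead of failing or
        # leaving a dangling symlink behind.
        return False
    os.symlink(source, link_path)
    return True
```

The point of the model is the early return: a missing source is treated as a normal outcome rather than an error.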

Miscellaneous:

  • systemctl now supports the --now option with the 'reenable' verb.

  • systemd-mount can now output JSON with a new --json= switch, for use with --list-devices. It also shows the "diskseq" property in the block device list.

  • systemd-id128 gained a new 'var-partition-uuid' verb to calculate the DPS UUID for /var/ keyed by the local machine-id.

  • localectl gained a -l/--full option to show output without ellipsization.

  • timedatectl now supports interactive polkit authorization.

  • The new Linux mseal(), listmount(), statmount() syscalls have been added to relevant system call groups.

  • The systemd-ask-password logic has been extended with a per-user scope, i.e. user programs may now ask for passwords via the same mechanism that was previously available only system-wide.

  • A new set of system/service credentials are added: shell.prompt.prefix, shell.prompt.suffix and shell.welcome. At login time these are propagated into the $SHELL_PROMPT_PREFIX, $SHELL_PROMPT_SUFFIX, $SHELL_PROMPT_WELCOME environment variables. These in turn are included in the shell prompt of interactive shells and shown at login time, via /etc/profile.d/70-systemd-shell-extra.sh. This functionality is useful to visually highlight the fact a specific shell prompt originates from a specific system, execution context or tool. These credentials and environment variables are supposed to be generically useful within and outside of the immediate systemd context. It is also used by 'run0', see above.

  • New RELEASE_TYPE=, EXPERIMENT=, EXPERIMENT_URL= fields have been defined for the /etc/os-release file. For example, "RELEASE_TYPE=development|stable|lts" can be used to indicate various stages of the release life cycle, and "RELEASE_TYPE=experimental" can indicate experimental builds, with the EXPERIMENT= field providing a human-readable description of the nature of the experiment.

  • A new sleep.conf HibernateOnACPower= option has been added, which when disabled will suppress hibernation in suspend-then-hibernate mode until the system is disconnected from a power source.

  • A bunch of patches to ease building against musl have been merged.

  • The various components that display progress bars (i.e. systemd-repart, systemd-sysupdate/updatectl, importctl), will now also issue the ANSI sequences for progress reports that Windows Terminal understands. Most Linux terminals currently do not support this sequence (and ignore it), but hopefully this will change one day. The progress information is used to display a nice progress animation in the terminal tab and icon. For details about the ANSI sequence and its effects, see:
    Implement ConEmu's OSC 9;4 to set the taskbar progress indicator (microsoft/terminal#8055)
    https://conemu.github.io/en/AnsiEscapeCodes.html#ConEmu_specific_OSC

  • systemd-sysusers is now able to create fully locked user accounts. For compatibility it so far created accounts with a locked (i.e. invalid) password, but not marked locked as a whole. With the new "!" modifier for "u" lines, it is now possible to create fully locked accounts. The distinction between accounts with a locked password and fully locked accounts is relevant when considering non-password forms of authentication, i.e. SSH and such. It is strongly recommended to make use of this new feature for almost all system accounts, since they usually do not require (and should not permit) interactive logins. All of systemd's own system users have been changed to be marked as fully locked.

  • systemd-coredump now supports a new EnterNamespace= option, which defaults to off. If enabled, systemd-coredump will access the mount namespace of any crashed process to acquire debug symbol information, in order to be able to symbolize backtraces. This option is useful to improve backtraces of processes of containerized applications. (Note that the host systemd-coredump preferably dispatches coredump processing to the container itself, if it supports that. Only full-OS containers which run systemd inside will support this however, in other cases EnterNamespace= might be a suitable approach to acquire symbolized backtraces.)
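
For the progress-report item in the list above, the OSC 9;4 sequence documented by ConEmu has the form `ESC ] 9 ; 4 ; state ; progress ST`, where state 1 sets a percentage and state 0 removes the indicator. The following Python sketch builds such strings. The helper names are hypothetical, and it does not claim to reproduce the exact bytes that systemd's tools emit.

```python
ESC = "\x1b"
ST = ESC + "\\"  # String Terminator that ends an OSC sequence


def osc_progress(percent: int) -> str:
    """Return the OSC 9;4 sequence that sets progress to `percent` (0-100)."""
    clamped = max(0, min(100, percent))
    return f"{ESC}]9;4;1;{clamped}{ST}"


def osc_progress_clear() -> str:
    """Return the OSC 9;4 sequence that removes the progress indicator.

    The trailing value is a placeholder; it is ignored when state is 0.
    """
    return f"{ESC}]9;4;0;0{ST}"
```

A program would write `osc_progress(n)` to the terminal as work advances and `osc_progress_clear()` when done. Terminals that do not understand the sequence are expected to ignore it, as the release note says.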

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@philiptaron
Contributor

philiptaron commented Dec 11, 2024

Now that the release has happened, I'm going to get my machine cranking on building this overnight to see if I can boot with it.

Lennart Poettering has published a series of Mastodon posts about this release's highlights (source):

  1. Creating fully locked accounts with sysusers.d
  2. Combined signed PCR policies and local systemd-pcrlock policies
  3. Progress report terminal escape sequences for long-running operations
  4. UKIs with multiple "profiles"
  5. sd-varlink, and more use of Varlink
  6. Extending systemd-ask-password to the user scope
  7. A Secure Attention Key sequence to bring you to a secure login prompt
  8. Propagating user SSH keys into systemd-nspawn containers
  9. Deferred reactivation of systemd timers
  10. IPE policies
  11. Credentials to control the shell prompt
  12. Logging when networking sysctls are changed outside of systemd-networkd
  13. CPU microcode add-ons for UKIs
  14. systemd-sbsign — Secure Boot signing tool
  15. logind support for hidraw devices (gamepads, fancy keyboards, etc.)
  16. userdbctl options for filtering list of users
  17. MAC-address-based enx* alternate network link names
  18. Optional file/symlink creation with tmpfiles.d
  19. RestartMode=debug to restart services in debug mode if they fail
  20. Listing service invocations with journalctl
  21. SupplementFor= for systemd-repart
  22. Multiple Devicetree blobs in a single UKI
  23. varlinkctl over SSH
  24. Installing Secure Boot databases with bootctl
  25. Automatically downloading and importing sysexts, confexts, etc. on boot
  26. Designated maintenance times for scheduled reboots
  27. PrivatePids= for PID namespacing

@github-actions github-actions bot added 10.rebuild-darwin: 1-10 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS and removed 10.rebuild-darwin: 101-500 10.rebuild-linux: 501+ labels Dec 11, 2024
@Princemachiavelli Princemachiavelli mentioned this pull request Dec 11, 2024
13 tasks
philiptaron added a commit to philiptaron/flock.nix that referenced this pull request Dec 12, 2024
@philiptaron
Contributor

I'm happy to report that philiptaron/flock.nix@8638c62 boots and runs successfully with this PR. 🎉

I haven't had a chance to read through the boot logs with a fine-toothed comb, but on the surface this was easy-peasy.

@alyssais
Member

@jmbaur
Contributor Author

jmbaur commented Dec 18, 2024

musl patches are already available https://git.openembedded.org/openembedded-core/tree/meta/recipes-core/systemd/systemd?h=master-next

Thanks @alyssais! Updated with latest patches

@philiptaron
Contributor

@ElvishJerricco and crew, what's the checklist in your mind to get this merged?

@philiptaron
Contributor

philiptaron commented Dec 19, 2024

https://github.com/systemd/systemd/releases/tag/v257.1

First dot release! I'll get my machine busy building this overnight and report back.

@jmbaur
Contributor Author

jmbaur commented Dec 20, 2024

systemd-repart tests now fixed!

Removes the creation of a new PID namespace during nixos-enter so that
systemd doesn't detect the child process as being in a "container". In
systemd v257, container detection slightly changed to check if the
process is a part of the root PID namespace (see https://github.com/systemd/systemd/blob/96c4d9d94d06c6c0a8b68be376505f8d8b5eba2b/src/basic/virt.c#L735).
Tooling such as `bootctl` will not perform certain actions if it detects
we are in a container, such as not populating EFI variables. This
results in broken systemd-boot VM tests.
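
As an illustration of what "part of the root PID namespace" means in the commit message above, one way to check it from user space is to compare the inode of /proc/<pid>/ns/pid with the fixed inode number the kernel gives the initial PID namespace (PROC_PID_INIT_INO, 0xEFFFFFFC). This Python sketch assumes that constant and a mounted /proc. The helper name is hypothetical, and the commit message does not state whether systemd uses exactly this comparison.

```python
import os

# Inode number of the initial PID namespace (PROC_PID_INIT_INO in the
# kernel's include/linux/proc_ns.h). Treated as an assumption here.
PROC_PID_INIT_INO = 0xEFFFFFFC


def in_root_pid_namespace(pid="self"):
    """Return True or False, or None if it cannot be determined."""
    try:
        return os.stat(f"/proc/{pid}/ns/pid").st_ino == PROC_PID_INIT_INO
    except OSError:
        # No /proc, no such process, or insufficient permissions.
        return None
```

A process started in a fresh PID namespace would get False here, which matches the situation the commit avoids by no longer creating a new PID namespace in nixos-enter.
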

This ensures that GNU parted doesn't complain that partitions are
unaligned.
Contributor

@ElvishJerricco ElvishJerricco left a comment

Code looks good to me, @philiptaron has done extensive testing (thanks!), and I looked through the systemd NEWS changelog and saw nothing else that needs changing. The eval error in CI appears to be unrelated. Let's send it, and figure out any unexpected fallout in staging.

@ElvishJerricco ElvishJerricco merged commit 704cf68 into NixOS:staging Dec 20, 2024
18 of 23 checks passed
@jmbaur jmbaur deleted the systemd-257 branch December 20, 2024 17:33
@jmbaur jmbaur mentioned this pull request Dec 20, 2024
13 tasks
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: systemd 10.rebuild-darwin: 1-10 10.rebuild-linux: 5001+
5 participants