[1.1] runc delete: call systemd's reset-failed #3932
Merged
There is no such thing as linux.resources.memorySwap (the mem+swap is set as linux.resources.memory.swap). As it is not used in this test anyway, remove it. Fixes: 4929c05 Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit dacb3aa) Signed-off-by: Kir Kolyshkin <[email protected]>
Sometimes we call resetFailedUnit as a cleanup measure, and we don't care if it fails or not. So, move error reporting to its callers, and ignore error in cases we don't really expect it to succeed. Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit 91b4cd2) Signed-off-by: Kir Kolyshkin <[email protected]>
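The split described in this commit can be sketched as follows. This is a minimal illustration, not the actual runc code: the `unitResetter` interface, the `resetStrict` and `resetBestEffort` callers, and the `alwaysFailing` fake are all hypothetical names introduced here. Only the idea comes from the commit message: `resetFailedUnit` returns its error, and each caller decides whether to report or ignore it.

```go
package main

import (
	"errors"
	"fmt"
)

// unitResetter is a hypothetical stand-in for a systemd D-Bus connection.
type unitResetter interface {
	ResetFailedUnit(name string) error
}

// resetFailedUnit performs the call and returns the error as-is.
// It does no logging of its own, so callers decide what to do with it.
func resetFailedUnit(c unitResetter, name string) error {
	return c.ResetFailedUnit(name)
}

// resetStrict is a hypothetical caller that cares about the outcome
// and reports the error with some context.
func resetStrict(c unitResetter, name string) error {
	if err := resetFailedUnit(c, name); err != nil {
		return fmt.Errorf("reset-failed %s: %w", name, err)
	}
	return nil
}

// resetBestEffort is a hypothetical cleanup caller: the unit may well
// not be in a failed state, so the error is deliberately ignored.
func resetBestEffort(c unitResetter, name string) {
	_ = resetFailedUnit(c, name)
}

// alwaysFailing is a fake connection used only for demonstration.
type alwaysFailing struct{ calls int }

func (a *alwaysFailing) ResetFailedUnit(string) error {
	a.calls++
	return errors.New("unit not loaded")
}
```

With the fake above, `resetStrict` returns a wrapped error while `resetBestEffort` silently proceeds, which is the distinction the commit draws between regular and cleanup call sites.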
runc delete is supposed to remove all the container's artefacts. When the systemd cgroup driver is used and the systemd unit has failed (e.g. it was oom-killed), systemd won't remove the unit (that is, unless the "CollectMode: inactive-or-failed" property is set). Call reset-failed from manager.Destroy so the failed unit will be removed during "runc delete". Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit 43564a7) Signed-off-by: Kir Kolyshkin <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]> (cherry picked from commit 58a811f) Signed-off-by: Kir Kolyshkin <[email protected]>
The passing run (with the fix) looks like this:

----
delete.bats
 ✓ runc delete removes failed systemd unit [4556]
   runc spec (status=0):

   runc run -d --console-socket /tmp/bats-run-B08vu1/runc.lbQwU5/tty/sock test-failed-unit (status=0):

   Warning: The unit file, source configuration file or drop-ins of runc-cgroups-integration-test-12869.scope changed on disk. Run 'systemctl daemon-reload' to reload units.
   × runc-cgroups-integration-test-12869.scope - libcontainer container integration-test-12869
        Loaded: loaded (/run/systemd/transient/runc-cgroups-integration-test-12869.scope; transient)
     Transient: yes
       Drop-In: /run/systemd/transient/runc-cgroups-integration-test-12869.scope.d
                └─50-DevicePolicy.conf, 50-DeviceAllow.conf
        Active: failed (Result: timeout) since Tue 2023-06-13 14:41:38 PDT; 751ms ago
      Duration: 2.144s
           CPU: 8ms

   Jun 13 14:41:34 kir-rhat systemd[1]: Started runc-cgroups-integration-test-12869.scope - libcontainer container integration-test-12869.
   Jun 13 14:41:37 kir-rhat systemd[1]: runc-cgroups-integration-test-12869.scope: Scope reached runtime time limit. Stopping.
   Jun 13 14:41:38 kir-rhat systemd[1]: runc-cgroups-integration-test-12869.scope: Stopping timed out. Killing.
   Jun 13 14:41:38 kir-rhat systemd[1]: runc-cgroups-integration-test-12869.scope: Killing process 1107438 (sleep) with signal SIGKILL.
   Jun 13 14:41:38 kir-rhat systemd[1]: runc-cgroups-integration-test-12869.scope: Failed with result 'timeout'.
   runc delete test-failed-unit (status=0):

   Unit runc-cgroups-integration-test-12869.scope could not be found.
----

Before the fix, the test was failing like this:

----
delete.bats
 ✗ runc delete removes failed systemd unit
   (in test file tests/integration/delete.bats, line 194)
     `run -4 systemctl status "$SD_UNIT_NAME"' failed, expected exit code 4, got 3
   ....
----

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit ad040b1)
Signed-off-by: Kir Kolyshkin <[email protected]>
lifubang
approved these changes
Jul 8, 2023
AkihiroSuda
approved these changes
Jul 12, 2023
This is a backport of #3888 to the release-1.1 branch. Original description follows.
runc delete is supposed to remove all the container's artefacts. When the systemd cgroup driver is used and the systemd unit has failed (e.g. it was oom-killed), systemd won't remove the unit (that is, unless the "CollectMode: inactive-or-failed" property is set).
Call reset-failed from manager.Destroy so the failed unit will be removed during "runc delete".
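As a rough sketch of the behaviour described above (not the actual runc implementation), a destroy path could stop the unit and then issue a best-effort reset-failed, so that a unit left in the failed state is released. The `systemdConn` interface, the `manager` struct, and the in-memory `fakeSystemd` are hypothetical and exist only to make the example self-contained; the fake's behaviour is a simplification and does not model real systemd.

```go
package main

import "errors"

// systemdConn is a hypothetical stand-in for the D-Bus connection to systemd.
type systemdConn interface {
	StopUnit(name string) error
	ResetFailedUnit(name string) error
}

// manager is a simplified, hypothetical cgroup manager.
type manager struct {
	conn     systemdConn
	unitName string
}

// Destroy stops the unit, then asks systemd to reset its failed state.
// The reset is best-effort: most units are not in a failed state,
// so an error from it is ignored.
func (m *manager) Destroy() error {
	stopErr := m.conn.StopUnit(m.unitName)
	_ = m.conn.ResetFailedUnit(m.unitName)
	return stopErr
}

// fakeSystemd keeps unit states in memory, for demonstration only.
type fakeSystemd struct {
	units map[string]string // unit name -> "active" or "failed"
}

// StopUnit removes active units; failed units stay around,
// mimicking the situation described in the PR.
func (f *fakeSystemd) StopUnit(name string) error {
	if f.units[name] == "active" {
		delete(f.units, name)
	}
	return nil
}

// ResetFailedUnit removes a unit only if it is in the failed state.
func (f *fakeSystemd) ResetFailedUnit(name string) error {
	if f.units[name] != "failed" {
		return errors.New("unit is not in a failed state")
	}
	delete(f.units, name)
	return nil
}
```

In this model, calling `Destroy` on a manager whose unit is marked "failed" leaves no entry for it in `fakeSystemd.units`, whereas without the reset-failed call the entry would remain.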
This fixes Issue A from #3780 (which, in its original form, can only be reproduced with RHEL/CentOS 9 systemd version < 252.14, i.e. before they've added redhat-plumbers/systemd-rhel9#149). A test case that works with any recent systemd version is also added.