[4.6] libct/cg/sd: reconnect and retry on dbus connection error #7
Commits on May 1, 2021
libct/cgroups/systemd: eliminate runc/systemd race
In case it takes more than 1 second for systemd to create a unit, startUnit() times out with a warning and then runc proceeds (to create cgroups using fs manager and so on).

Now runc and systemd are racing, and multiple scenarios are possible. In one such scenario, by the time runc calls systemd manager's Apply() the unit is not yet created, the dbusConnection.SetUnitProperties() call fails with "unit xxx.scope not found", and the whole container start also fails.

To eliminate the race, we need to return an error in case the timeout is hit.

To reduce the chance to fail, increase the timeout from 1 to 30 seconds, to not error out too early on a busy/slow system (and times like 3-5 seconds are not unrealistic).

While at it, as the timeout is quite long now, make sure to not leave a stray timer.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 3844789)
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 7235c92

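For illustration, the timeout handling described in the commit message above could look roughly like the sketch below. This is not the code from the commit: waitForUnit, errTimeout, and the plain string channel are hypothetical stand-ins, and the "done" job result is assumed to be the value reported on success.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout is a hypothetical sentinel error; the real code builds its own message.
var errTimeout = errors.New("timeout waiting for systemd to create unit")

// waitForUnit blocks until a job result arrives on statusChan or the
// timeout elapses. Hitting the timeout is an error rather than a warning,
// so the caller never proceeds while the unit may still be missing.
func waitForUnit(statusChan <-chan string, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop() // do not leave a stray timer behind

	select {
	case s := <-statusChan:
		if s != "done" { // assumption: "done" is the success result
			return fmt.Errorf("error creating systemd unit: got %q", s)
		}
		return nil
	case <-timer.C:
		return errTimeout
	}
}

func main() {
	ch := make(chan string, 1)
	ch <- "done"
	// 30 seconds is the new timeout named in the commit message.
	fmt.Println(waitForUnit(ch, 30*time.Second))
}
```

Using time.NewTimer with a deferred Stop, instead of time.After, releases the timer as soon as the function returns, which matters more now that the timeout is 30 seconds.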
libct/cg/sd/systemdVersion: don't return error
As the caller of this function just logs the error, it does not make sense to pass it. Instead, log it (once) and return -1.

This is a preparation for the second user.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit eee425f)
[Minor merge conflict due to missing "return" removed by a hunk from commit 978fa6e]
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 6951756

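A minimal sketch of the "log once, return -1" pattern from the commit message above, assuming sync.Once is used to cache the result. The queryVersion helper and the queryCalls counter are hypothetical stand-ins for the real dbus query, which is not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
)

var (
	versionOnce sync.Once
	version     int
	queryCalls  int // counts how often the (hypothetical) query runs
)

// queryVersion is a hypothetical stand-in for asking systemd for its
// version over dbus; here it always fails, to exercise the error path.
func queryVersion() (int, error) {
	queryCalls++
	return 0, errors.New("dbus unavailable")
}

// systemdVersion returns the cached version, or -1 if it could not be
// determined. The error is logged once instead of being returned.
func systemdVersion() int {
	versionOnce.Do(func() {
		v, err := queryVersion()
		if err != nil {
			log.Printf("unable to get systemd version: %v", err)
			v = -1
		}
		version = v
	})
	return version
}

func main() {
	fmt.Println(systemdVersion())
	fmt.Println(systemdVersion()) // cached; nothing is logged the second time
}
```

With this shape, a second caller can use the function without handling (or re-logging) an error of its own.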
[@kolyshkin: documentation nits]

Signed-off-by: Shiming Zhang <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit cdbed6f)
[minor merge conflict due to missing upstream commit 73f22e7]
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 7c6eb9d

Generalize isUnitExists as isDbusError, and use errors.As while at it (which can handle wrapped errors as well).

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit bacfc2c)
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 629e2cf

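To show why errors.As helps here, the sketch below matches a dbus-style error by name even when it has been wrapped. The dbusError type is a local stand-in for the D-Bus library's error type so that the example is self-contained, failingStart is a hypothetical helper, and the body of isDbusError is an assumption about how such a check might be written, not a copy of the commit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// dbusError is a local stand-in for the D-Bus library's error type,
// which carries the error name reported by the bus.
type dbusError struct {
	Name string
}

func (e dbusError) Error() string { return e.Name }

// isDbusError reports whether err is, or wraps, a dbusError whose name
// contains the given string.
func isDbusError(err error, name string) bool {
	var derr dbusError
	if errors.As(err, &derr) {
		return strings.Contains(derr.Name, name)
	}
	return false
}

// failingStart is a hypothetical operation that returns a wrapped error.
func failingStart() error {
	base := dbusError{Name: "org.freedesktop.systemd1.UnitExists"}
	return fmt.Errorf("starting unit: %w", base)
}

func main() {
	fmt.Println(isDbusError(failingStart(), "org.freedesktop.systemd1.UnitExists"))
	fmt.Println(isDbusError(errors.New("unrelated"), "org.freedesktop.systemd1.UnitExists"))
}
```

A plain type assertion would miss the wrapped case; errors.As walks the wrap chain, so the helper keeps working when callers add context to errors.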
libct/cg/sd: add renew dbus connection
[@kolyshkin: doc nits, use dbus.ErrClosed and isDbusError]

Signed-off-by: Shiming Zhang <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 15fee98)
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 57d2ef8

Signed-off-by: Shiming Zhang <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 6122bc8)
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 809ecfa

libct/cg/sd: retry on dbus disconnect
Instead of reconnecting to dbus after some failed operations, and returning an error (so a caller has to retry), reconnect AND retry in place for all such operations.

This should fix issues caused by a stale dbus connection after e.g. a dbus daemon restart.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 47ef9a1)
[Minor merge conflicts due to missing upstream commits 52390d6 and af521ed.]
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit bda2c9a
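
The "reconnect AND retry in place" idea from the commit message above can be sketched as a small wrapper that every dbus operation goes through. Everything here is a simplified assumption: conn and connManager are hypothetical stand-ins for the real connection types, and errClosed stands in for the library's closed-connection error (dbus.ErrClosed is the name mentioned in an earlier commit note).

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a stand-in for the D-Bus library's "connection closed" error.
var errClosed = errors.New("dbus: connection closed")

// conn is a hypothetical stand-in for a dbus connection.
type conn struct{ id int }

// connManager hands out a connection and replaces it when it goes stale.
type connManager struct {
	current *conn
	dials   int // number of connections opened so far
}

// getConnection returns the current connection, opening one if needed.
func (m *connManager) getConnection() *conn {
	if m.current == nil {
		m.dials++
		m.current = &conn{id: m.dials}
	}
	return m.current
}

// resetConnection drops the stale connection so the next call reconnects.
func (m *connManager) resetConnection(stale *conn) {
	if m.current == stale {
		m.current = nil
	}
}

// retryOnDisconnect runs op and, if it fails because the connection was
// closed, reconnects and retries in place instead of returning the error.
func (m *connManager) retryOnDisconnect(op func(*conn) error) error {
	for {
		c := m.getConnection()
		err := op(c)
		if !errors.Is(err, errClosed) {
			return err // success, or an error unrelated to disconnects
		}
		m.resetConnection(c)
	}
}

func main() {
	m := &connManager{}
	err := m.retryOnDisconnect(func(c *conn) error {
		if c.id == 1 {
			return errClosed // simulate a stale first connection
		}
		return nil
	})
	fmt.Println(err, m.dials)
}
```

The caller sees either success or a genuine failure; a stale connection (for example after a dbus daemon restart) is handled inside the wrapper.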
Commits on May 21, 2021
libct/cg/sd: introduce and use getManagerProperty
Commit 47ef9a1 forgot to wrap GetManagerProperty("ControlGroup") into retryOnDisconnect. Since there's one other user of GetManagerProperty, add getManagerProperty wrapper and use it.

Fixes: 47ef9a1
Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit 99c5c50)
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 9c9761f

libct/cg/sd: use global dbus connection
Using per cgroup manager dbus connection instances means that every cgroup manager instance gets a new connection, and those connections are never closed, ultimately resulting in file descriptors limit being hit.

Revert back to using a single global dbus connection for everything, without changing the callers.

NOTE that it is assumed a runtime can't use both root and rootless dbus at the same time. If this happens, we panic.

Signed-off-by: Kir Kolyshkin <[email protected]>
(cherry picked from commit c7f847e)
Signed-off-by: Kir Kolyshkin <[email protected]>

Commit 482b488
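
As a rough illustration of the single global connection described in the commit message above, the sketch below keeps the per-manager type (so callers are unchanged) while sharing one mutex-guarded connection across all managers. The names conn, connManager, newConnManager, and the package-level variables are hypothetical; only the behavior (one shared connection, panic on mixing root and rootless) is taken from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a dbus connection.
type conn struct{ rootless bool }

// Package-level state shared by every manager instance.
var (
	dbusMu       sync.Mutex
	dbusC        *conn
	dbusInited   bool
	dbusRootless bool
)

// connManager keeps the per-manager API but holds no connection itself.
type connManager struct{}

// newConnManager records whether root or rootless dbus is in use.
// Mixing both in one process is assumed impossible, so it panics.
func newConnManager(rootless bool) *connManager {
	dbusMu.Lock()
	defer dbusMu.Unlock()
	if dbusInited && rootless != dbusRootless {
		panic("can't have both root and rootless dbus")
	}
	dbusInited = true
	dbusRootless = rootless
	return &connManager{}
}

// getConnection returns the shared connection, creating it on first use.
func (m *connManager) getConnection() *conn {
	dbusMu.Lock()
	defer dbusMu.Unlock()
	if dbusC == nil {
		dbusC = &conn{rootless: dbusRootless}
	}
	return dbusC
}

func main() {
	a := newConnManager(false)
	b := newConnManager(false)
	// Both managers share one connection, so creating many managers
	// does not consume additional file descriptors.
	fmt.Println(a.getConnection() == b.getConnection())
}
```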