remote: exec: do not leak session IDs on errors #20405
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Force-pushed bc1fb91 to edf1654
You're going to need to rebase and repush to incorporate #20404, and then remove that skip, but no need to do that yet. Let's see how this CI run goes. Thank you for working on this.
Force-pushed edf1654 to 7765d46
Idea LGTM. Haven't done a solid look over the code yet, though.
Force-pushed d7dae7c to ddecb51
Do we really want to expose a REST endpoint for this?
AFAIK, via remote we use the 5-minute delay for the conmon cleanup process (conf.Engine.ExitCommandDelay), so the expectation of the test is just wrong IMO.
So if we want to clean up right away, this should just be set to 0, but that would then cause race conditions on the client, as it has to inspect after exec finishes to get the exit code.
So I am not sure this has to be fixed at all, as the exec session will be cleaned up eventually. And in your test we only have an error condition, so I see no reason why the server shouldn't handle the removal to make it work. Trusting clients to do the right thing sounds wrong to me.
As far as I can tell, Docker also only cleans up exec sessions every 5 minutes:
https://github.com/moby/moby/blob/0253fedf03f4964c20906795e119404633cb9c1a/daemon/exec.go#L323
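The delay mentioned above is configurable in containers.conf; a minimal sketch, assuming the key is `exit_command_delay` under `[engine]` (key name and default taken from my understanding of containers.conf, so double-check against your podman version's documentation):

```toml
[engine]
# Seconds to wait before the exit command (conmon cleanup) runs.
# The default is typically 300 (5 minutes); lowering it makes exec
# sessions get reaped sooner, at the cost of the race described above.
exit_command_delay = 60
```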
defer func() {
	if retErr != nil {
		_ = containers.ExecRemove(ic.ClientCtx, sessionID, nil)
	}
}()
Shouldn't this be handled on the server side? We should not expect all API clients to know about this.
Force-pushed ddecb51 to 00d2303
commit fa19e1b partially introduced the fix, but was merged too quickly and didn't work with remote. Introduce a new binding to allow removing a session from the remote client. Signed-off-by: Giuseppe Scrivano <[email protected]>
This reverts commit 44ed415. Signed-off-by: Giuseppe Scrivano <[email protected]>
Force-pushed 00d2303 to 1d2589c
Having the cleanup server-side would simplify the implementation, but IMO it is cleaner if this is handled by the client, since the client is already responsible for creating the session. If we handle it server-side, then should the session be deleted on any kind of error (e.g. an invalid JSON request) or only on internal errors? I am not sure whether it qualifies as a breaking change, but it changes the current behavior, since the session ID used by the client won't be valid anymore after an error, which is not the case now.
Yes, I guess that is a fair question, but IMO dumping responsibility on clients makes it worse now, as they have a responsibility to clean up exec sessions that they did not have before (OK, for now it was just leaked). If we were talking about podman-remote only, I would agree that we can take care of it, but this endpoint will be part of our public stable API. Also, if I read the moby code correctly, deleting the exec session on a failed start seems to be what Docker is doing, so I doubt this would cause problems for API users.
Force-pushed aa7abd9 to 106b097
Thanks for checking that. If that is the expectation for Docker compatibility, let's go with it. Pushed a simpler implementation with the server-side cleanup.
Force-pushed 106b097 to 5a6252a
LGTM, tests are red though
If we delete the session ID on errors, we cannot retrieve the exit code later.
Force-pushed 5a6252a to 1d2589c
We would break the current use case where we can query the session ID after an error. I've reverted to the previous version of doing the removal client-side; if you don't like this approach, we can just disable the test for remote and rely on the timeout mechanism.
/lgtm Can we also do it server side in a separate PR?
/hold
I've tried doing it server-side before, but it breaks the current client code, where we check the exit status after the failure. If we delete the session immediately, the client won't be able to retrieve the exit code. Server-side, I think the best we can do is what we already have with the timeout mechanism, unless we do it in a new API, which seems overkill for such a minor issue. I've not worked much on the server/client part, so if anyone has better ideas, I am all ears. @cevich CI is currently failing with:
/hold cancel
CI has returned to normal |
commit fa19e1b partially introduced the fix, but was merged too quickly and didn't work with remote.
Introduce a new binding to allow removing a session from the remote client.
[NO NEW TESTS NEEDED] since it fixes a CI failure
Does this PR introduce a user-facing change?