Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that sandboxed processes exit before their sandboxes are clean… #18642

Merged
merged 1 commit into from
Mar 31, 2023

Conversation

jsirois
Copy link
Contributor

@jsirois jsirois commented Mar 31, 2023

…ed up (Cherry-pick of #18632)

As @jsirois discovered and described in
#16778 (comment), because the local::CommandRunner does not wait for its child process to have exited, it might be possible for a SIGKILL to not have taken effect on the child process before its sandbox has been torn down. And that could result in the process seeing a partial sandbox (and in the case of #16778, potentially cause named cache corruption).

This change ensures that we block to call wait when spawning local processes by wrapping them in the ManagedChild helper that we were already using for interactive processes. It then attempts to clarify the contract of the CapturedWorkdir trait (but in order to improve cherry-pickability does not attempt to adjust it), to make it clear that child processes should fully exit before the run_and_capture_workdir method has returned.

Fixes #16778 🤞.


Co-authored-by: John Sirois [email protected]
(cherry picked from commit 1462bef)

…ed up (Cherry-pick of pantsbuild#18632)

As @jsirois discovered and described in
pantsbuild#16778 (comment),
because the `local::CommandRunner` does not `wait` for its child process
to have exited, it might be possible for a SIGKILL to not have taken
effect on the child process before its sandbox has been torn down. And
that could result in the process seeing a partial sandbox (and in the
case of pantsbuild#16778, potentially cause named cache corruption).

This change ensures that we block to call `wait` when spawning local
processes by wrapping them in the `ManagedChild` helper that we were
already using for interactive processes. It then attempts to clarify the
contract of the `CapturedWorkdir` trait (but in order to improve
cherry-pickability does not attempt to adjust it), to make it clear that
child processes should fully exit before the `run_and_capture_workdir`
method has returned.

Fixes pantsbuild#16778 🤞.

---------

Co-authored-by: John Sirois <[email protected]>
(cherry picked from commit 1462bef)
@jsirois jsirois added the category:bugfix Bug fixes for released features label Mar 31, 2023
@jsirois jsirois requested review from stuhood, benjyw and tdyas March 31, 2023 21:43
@jsirois
Copy link
Contributor Author

jsirois commented Mar 31, 2023

This cherry-pick was not clean. The conflicts were use conflicts, {} -> {e} inline formats and the non-existence of a docker process executor in 2.14.x.

subprocess.kill().map_err(|e| format!("Failed to interrupt child process: {}", e)).await?;
};
subprocess.wait().await.map_err(|e| e.to_string())
let exit_status = session.clone()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected cargo fmt to want to re-indent this, since in 2.15.x+ it was nested in a macro, and it isn't here. But if it passes, then not worth re-running CI over.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah. It does seem to have been fine with this; so I'll submit.

@jsirois jsirois merged commit 0e52fb8 into pantsbuild:2.14.x Mar 31, 2023
@jsirois jsirois deleted the 18632/cherry-pick/2.14.x branch March 31, 2023 23:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:bugfix Bug fixes for released features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants