Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli-plugins: don't use abstract sockets on macOS #4783

Merged
merged 2 commits into from
Jan 12, 2024

Conversation

laurazard
Copy link
Member

@laurazard laurazard commented Jan 12, 2024

macOS does not support abstract sockets, and as such the current implementation is problematic since the sockets are created in the current working directory, which alters build contexts and can break builds if bind-mounted.

- What I did

- How I did it

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@codecov-commenter
Copy link

codecov-commenter commented Jan 12, 2024

Codecov Report

Merging #4783 (6d0b329) into master (d469be2) will increase coverage by 0.04%.
Report is 2 commits behind head on master.
The diff coverage is 0.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4783      +/-   ##
==========================================
+ Coverage   59.59%   59.64%   +0.04%     
==========================================
  Files         287      287              
  Lines       24784    24772      -12     
==========================================
+ Hits        14770    14775       +5     
+ Misses       9125     9109      -16     
+ Partials      889      888       -1     

@laurazard
Copy link
Member Author

I didn't tag you yet/add tags because I still wanted to do a little refactor, but wanted to get WIP out in the meantime.

@laurazard
Copy link
Member Author

@neersighted

@neersighted neersighted force-pushed the fix-no-abstract-sockets branch from a27c20f to 37f6838 Compare January 12, 2024 18:35
@neersighted
Copy link
Member

I've pushed up a proposed refactor/restructure to go with the non-abstract socket, PTAL @thaJeztah @laurazard

@neersighted neersighted requested a review from thaJeztah January 12, 2024 18:36
@neersighted neersighted self-assigned this Jan 12, 2024
@neersighted neersighted marked this pull request as ready for review January 12, 2024 18:36
@neersighted neersighted force-pushed the fix-no-abstract-sockets branch from 37f6838 to 7fb53b5 Compare January 12, 2024 18:43
}

func onAccept(conn *net.UnixConn, listener *net.UnixListener) {
syscall.Unlink(listener.Addr().String())
Copy link

@whalelines whalelines Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't the socket getting automatically cleaned up on Linux but not on macOS? Since you are still using the namespaces socket on Linux, i.e., prefixing it with "@", won't it still get cleaned up? Shouldn't this unlink be in the socket_darwin onAccept?

Is there a way to test the behavior in both scenarios?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shoot, good catch! Looks like I forgot that correction with the git add -p.

To test the behavior of this, it's sufficient to build a copy of the CLI and run a plugin command, and check that the sockets are no longer leaked.

To test the core functionality of the socket, something like docker compose up works (checking that after Ctrl-C, there is no detached docker-compose process still running).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw @neersighted mention some reasoning behind this for Linux, but I agree that it's not obvious, so I would suggest adding a comment outlining this. (ISTR he had some references with details)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL.. browsers and race-conditions... ignore my comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you mean; the logic is Darwin-specific and not Linux/Windows-specific. Do you mean linking to the manpage for unix(7) where it is suggested that unlink(2) can be used to forward-remove a socket present in the filesystem?

Copy link

@whalelines whalelines Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this https://docker.slack.com/archives/C0W4XJVFW/p1705067942173219?thread_ts=1705055721.030329&cid=C0W4XJVFW , Go will try to remove sockets not starting with "@" upon close. So perhaps the unlink is not necessary in either case since no "@" is being prepended on darwin?

I was asking about automated tests…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're doing a forward removal so that even if the CLI process crashes/is killed, we don't leave a socket on "disk" (read: in $TMPDIR). You're right that it's not strictly needed if we exit gracefully, but we also don't need the socket to stick around longer than the first client connects as this is more-or-less a "handshake" or "rendezvous."

Regarding tests, we don't have any right now; I don't think it would be too terrible to integration test this, but we should leave that for a follow-up as this is a pretty painful regression and a blocker for 25.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also with regards to that manpage/the reason for doing the forward removal, the relevant text is here: https://man7.org/linux/man-pages/man7/unix.7.html#:~:text=Binding%20to%20a,it%20is%20closed.

@neersighted neersighted force-pushed the fix-no-abstract-sockets branch from 7fb53b5 to 36c3122 Compare January 12, 2024 18:47
@neersighted neersighted force-pushed the fix-no-abstract-sockets branch from 36c3122 to ceea10c Compare January 12, 2024 18:49
@neersighted
Copy link
Member

Pushed a squashed version of the refactor per @thaJeztah's request.


func listen(socketname string) (*net.UnixListener, error) {
return net.ListenUnix("unix", &net.UnixAddr{
Name: filepath.Join(os.TempDir(), socketname),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was changed from config.Path("run") correct?

LOL, also don't look at the windows implementation of os.TempDir() (which is a lot more complex than expected; https://cs.opensource.google/go/go/+/refs/tags/go1.21.6:src/os/file_windows.go;l=310-333)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is one difference from @laurazard's original version; I didn't see any reason to not just use the TMPDIR after verifying length should not be an issue.

As macOS does not support the abstract socket namespace, use a temporary
socket in $TMPDIR to connect with the plugin. Ensure this socket is
cleaned up even in the case of crash/ungraceful termination by removing
it after the first connection is accepted.

Co-authored-by: Laura Brehm <[email protected]>
Signed-off-by: Bjorn Neergaard <[email protected]>
@neersighted neersighted force-pushed the fix-no-abstract-sockets branch from ceea10c to 6d0b329 Compare January 12, 2024 19:31

// EnvKey represents the well-known environment variable used to pass the plugin being
// executed the socket name it should listen on to coordinate with the host CLI.
const EnvKey = "DOCKER_CLI_PLUGIN_SOCKET"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @milas @glours FYI this const was moved; in case code changes are needed when updating the module in compose

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit ad12276 into docker:master Jan 12, 2024
77 checks passed
@neersighted neersighted deleted the fix-no-abstract-sockets branch January 12, 2024 21:08
neersighted added a commit to neersighted/docker-cli that referenced this pull request Mar 22, 2024
Go's `net` package [will unlink][1] for us, as long as we used listen to
create the Unix socket.

Go will even skip the unlink when the socket appears to be abstract
(start with a NUL, represented by an @), though we must be cautious to
only create sockets with an abstract address on platforms that actually
support it -- this caused [several][2] [bugs][3] before.

  [1]: https://pkg.go.dev/net#UnixListener.SetUnlinkOnClose
  [2]: docker#4783
  [3]: docker#4863

Signed-off-by: Bjorn Neergaard <[email protected]>
neersighted added a commit to neersighted/docker-cli that referenced this pull request Mar 22, 2024
Go's `net` package [will unlink][1] for us, as long as we used Listen &
friends to create the Unix socket.

Go will even skip the unlink when the socket appears to be abstract
(starts with a NUL, represented by an @), though we must be cautious to
only create sockets with an abstract address on platforms that actually
support it -- this caused [several][2] [bugs][3] before.

  [1]: https://pkg.go.dev/net#UnixListener.SetUnlinkOnClose
  [2]: docker#4783
  [3]: docker#4863

Signed-off-by: Bjorn Neergaard <[email protected]>
neersighted added a commit to neersighted/docker-cli that referenced this pull request Mar 22, 2024
Go's `net` package [will unlink][1] for us, as long as we used Listen &
friends to create the Unix socket.

Go will even skip the unlink when the socket appears to be abstract
(starts with a NUL, represented by an @), though we must be cautious to
only create sockets with an abstract address on platforms that actually
support it -- this caused [several][2] [bugs][3] before.

  [1]: https://pkg.go.dev/net#UnixListener.SetUnlinkOnClose
  [2]: docker#4783
  [3]: docker#4863

Signed-off-by: Bjorn Neergaard <[email protected]>
rige1 pushed a commit to rige1/cli that referenced this pull request Apr 16, 2024
Go's `net` package [will unlink][1] for us, as long as we used Listen &
friends to create the Unix socket.

Go will even skip the unlink when the socket appears to be abstract
(starts with a NUL, represented by an @), though we must be cautious to
only create sockets with an abstract address on platforms that actually
support it -- this caused [several][2] [bugs][3] before.

  [1]: https://pkg.go.dev/net#UnixListener.SetUnlinkOnClose
  [2]: docker/cli#4783
  [3]: docker/cli#4863

Signed-off-by: Bjorn Neergaard <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[25.0.0-rc.1] regression: CLI leaves behind plugin socket mount-points
5 participants