All communication with components should occur using unix sockets / named pipes #3998

Closed
cmacknz opened this issue Jan 3, 2024 · 13 comments · Fixed by #4249
Labels: Team:Elastic-Agent (label for the Agent team)

Comments

@cmacknz
Member

cmacknz commented Jan 3, 2024

Today the control protocol uses local TCP on port 6789 to communicate with components.

# agent.grpc:
# # listen address for the GRPC server that spawned processes connect back to.
# address: localhost
# # port for the GRPC server that spawned processes connect back to.
# port: 6789

We regularly see issues where a firewall or iptables rules are configured to drop all traffic except for ports that have been explicitly whitelisted. The symptoms of this problem are not obvious in the agent logs, and users have to take extra steps to get Elastic Agent to run at all.

We can solve this problem and a few others by moving away from local TCP to using Unix sockets or Windows named pipes to communicate with subprocesses.
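As a rough illustration of the proposal, the sketch below picks a local endpoint per operating system. The helper name `localEndpoint`, the `npipe` network label, the run directory, and the endpoint name are all assumptions made for this example and are not taken from the elastic-agent code base. Note that Go's standard library cannot listen on Windows named pipes; a package such as github.com/Microsoft/go-winio would be needed for that side.

```go
package main

import (
	"fmt"
	"path"
	"runtime"
)

// localEndpoint is a hypothetical helper, not elastic-agent code. It returns
// the network label and address a local listener could use on the given OS.
func localEndpoint(goos, runDir, name string) (network, address string) {
	if goos == "windows" {
		// Windows named pipes live under the \\.\pipe\ namespace.
		// "npipe" is only a label chosen for this sketch.
		return "npipe", `\\.\pipe\` + name
	}
	// Unix domain sockets are addressed by a filesystem path, so no TCP
	// port is involved and port-based firewall rules do not apply.
	return "unix", path.Join(runDir, name+".sock")
}

func main() {
	network, address := localEndpoint(runtime.GOOS, "/tmp", "agent-example")
	fmt.Println(network, address)
}
```

Neither endpoint type occupies a TCP port, which is why the port 6789 firewall problem described above would not arise.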

This change should be made for:

  1. The control protocol connection to supervised components. This change should be transparent to components as they can simply be passed the new address in the connection information message used to bootstrap communication with agent. The file ownership of the sockets must be set to match the privilege level of the agent.

https://github.com/elastic/elastic-agent-client/blob/f57f63489dbbce98522c174dae00158f895ddc84/elastic-agent-client.proto#L458-L463

  2. The connection info server used to bootstrap endpoint security. In this case the connection info socket must be owned by root. This change must be coordinated with the endpoint security team.

listener, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
if err != nil {
	return nil, fmt.Errorf("failed to start connection credentials listener: %w", err)
}
s := &connInfoServer{log: log, listener: listener, stopTimeout: defaultStopTimeout}
var cn context.CancelFunc
s.waitCtx, cn = context.WithCancel(context.Background())
go func() {
	defer cn()
	for {
		conn, err := listener.Accept()
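
The snippet above binds the connection info listener to 127.0.0.1 over TCP. Below is a minimal sketch of how a listener like it might bind to a Unix domain socket instead, with the socket file restricted to its owner. The helpers `listenUnix` and `roundTrip`, the socket file name, and the 0600 mode are assumptions made for illustration; this is not the change that was merged for this issue.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
)

// listenUnix is a hypothetical helper, not elastic-agent code. It binds a
// Unix domain socket at path and restricts access to the owning user.
func listenUnix(path string) (net.Listener, error) {
	// A socket file left over from a previous run makes bind fail, so remove it.
	if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
		return nil, fmt.Errorf("failed to remove stale socket: %w", err)
	}
	listener, err := net.Listen("unix", path)
	if err != nil {
		return nil, fmt.Errorf("failed to start connection credentials listener: %w", err)
	}
	// Assumption: owner-only access. If the process runs as root, the
	// socket file is then owned by root and unreadable by other users.
	if err := os.Chmod(path, 0o600); err != nil {
		listener.Close()
		return nil, fmt.Errorf("failed to restrict socket permissions: %w", err)
	}
	return listener, nil
}

// roundTrip starts a listener in dir, sends msg from a client, and returns
// what the server side read. It exists only to exercise listenUnix.
func roundTrip(dir, msg string) (string, error) {
	path := filepath.Join(dir, "conninfo.sock")
	listener, err := listenUnix(path)
	if err != nil {
		return "", err
	}
	defer listener.Close()

	received := make(chan string, 1)
	go func() {
		conn, err := listener.Accept()
		if err != nil {
			received <- ""
			return
		}
		defer conn.Close()
		data, _ := io.ReadAll(conn)
		received <- string(data)
	}()

	conn, err := net.Dial("unix", path)
	if err != nil {
		return "", err
	}
	if _, err := conn.Write([]byte(msg)); err != nil {
		conn.Close()
		return "", err
	}
	conn.Close() // closing the client signals EOF to the reader
	return <-received, nil
}

func main() {
	dir, err := os.MkdirTemp("", "agent-sock")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)
	got, err := roundTrip(dir, "connection info")
	fmt.Println(got, err)
}
```

There is a short window between bind and chmod in which the socket has default permissions; creating the socket inside a directory that is already restricted to the owner is one way to close that gap.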

@cmacknz added the Team:Elastic-Agent label Jan 3, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@lucabelluccini
Contributor

This would be useful for users on OpenShift who have Red Hat OpenShift Data Foundation Managed Service, as it uses port 6789 for Ceph Monitor.

@cmacknz
Member Author

cmacknz commented Mar 25, 2024

In progress in #4249

@rgarcia89

I would love to see this being configurable as well.

@rgarcia89

@aleksmaus in which version will this be included?

@cmacknz
Member Author

cmacknz commented Jun 11, 2024

There is follow up work to do before we can enable this by default and recommend it to users: #4899

It isn't enabled by default yet, Defend doesn't support it properly, and we haven't turned it on in our testing framework. All the plumbing through agent to enable it is done though.

In 8.15 you should be able to try this out, but it isn't GA functionality yet.

@aleksmaus
Member

> In 8.15 you should be able to try this out, but it isn't GA functionality yet.

You can't try it in 8.15 without changing the Agent code: the code was commented out or changed until gRPC via domain sockets is fully supported by Endpoint. Let me know if you need a branch/PR that enables this functionality before then, so you can build your own Agent and try it out.

@cmacknz
Member Author

cmacknz commented Jun 11, 2024

As long as we aren't testing it let's keep it disabled to avoid unpleasant surprises for people.

@lucabelluccini
Contributor

Hello @cmacknz
Do we have plans on when we will add tests and allow users to opt-in or switch them to default to named pipes in future versions?
Do we have a tracking issue for this?

If I understand correctly, this option would remove the use of the management port 6789/gRPC in favor of named pipes/Unix sockets, removing a possible source of port conflicts with other applications/servers. Is that correct?

@cmacknz
Member Author

cmacknz commented Aug 12, 2024

> If I understand correctly, this option would remove the use of the management port 6789/gRPC in favor of named pipes/Unix sockets, removing a possible source of port conflicts with other applications/servers. Is that correct?

Yes

> Do we have plans on when we will add tests and allow users to opt-in or switch them to default to named pipes in future versions?
> Do we have a tracking issue for this?

We are blocked by the gRPC library used in Defend not supporting Windows named pipes; we can use this everywhere by default once that is resolved. Until then, whether this feature can be used depends on the integration and the OS.

@cmacknz
Member Author

cmacknz commented Aug 12, 2024

(The tracking issue will be private since the endpoint security implementation is not open source.)

@lucabelluccini
Contributor

Perfect! Thank you @cmacknz for the clarifications. I'll follow the internal one.

@rgarcia89

How can we manage these parameters for Fleet-managed agents?

5 participants