
client: use RPC address and not serf after initial Consul discovery #16217

Merged Mar 2, 2023 (1 commit)

Conversation

@tgross (Member) commented Feb 17, 2023

Fixes #16211

Nomad servers can advertise independent IP addresses for serf and rpc. Somewhat unexpectedly, the serf address is used both for Serf gossip and for server-to-server RPC communication (including Raft RPC). The address advertised for rpc is used only for client-to-server RPC. This split was introduced intentionally in Nomad 0.8.

When clients use Consul discovery to connect to servers, they query Consul with the correct rpc tag to get an initial set of server addresses. The client then makes a Status.Peers RPC to get the list of those servers that are Raft peers. But this endpoint is shared between servers and clients, and it returns the address used for Raft.
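
For illustration, here's a minimal sketch of that initial discovery step against the Consul catalog API, assuming the default nomad service name; this is a hedged illustration, not the actual Nomad client code:

```go
package main

import (
	"fmt"

	consulapi "github.com/hashicorp/consul/api"
)

// discoverServers queries the Consul catalog for Nomad servers by the "rpc"
// tag, so the addresses returned here are the ones advertised for
// client-to-server RPC. Assumes the default "nomad" service name.
func discoverServers() ([]string, error) {
	client, err := consulapi.NewClient(consulapi.DefaultConfig())
	if err != nil {
		return nil, err
	}
	services, _, err := client.Catalog().Service("nomad", "rpc", nil)
	if err != nil {
		return nil, err
	}
	addrs := make([]string, 0, len(services))
	for _, svc := range services {
		addrs = append(addrs, fmt.Sprintf("%s:%d", svc.ServiceAddress, svc.ServicePort))
	}
	return addrs, nil
}

func main() {
	addrs, err := discoverServers()
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println("candidate servers:", addrs)
}
```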

Most of the time this is harmless because servers bind on 0.0.0.0 anyway. But in topologies where servers are on a private network and clients are on separate subnets (or even public subnets), clients will make initial contact with the server to get the list of peers but then populate their local server set with unreachable addresses.

Cluster administrators can work around this problem by using server_join with specific IP addresses (or DNS names), because the Node.UpdateStatus endpoint returns the correct set of RPC addresses when updating the node. So once a client has registered, it self-heals to the right addresses.
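
That recovery path amounts to something like the sketch below; NodeServerInfo and its RPCAdvertiseAddr field echo Nomad's structs, but treat the whole thing as a hedged illustration rather than the real client code:

```go
package main

import "fmt"

// NodeServerInfo carries a server's advertised client-facing RPC address in
// the Node.UpdateStatus response. The field names follow Nomad's structs,
// but treat them as assumptions here, not verbatim source.
type NodeServerInfo struct {
	RPCAdvertiseAddr string
	Datacenter       string
}

// setServersFromHeartbeat replaces a client's server set with the addresses
// from an UpdateStatus response, which is why a registered client recovers
// the correct RPC addresses even if discovery seeded it with bad ones.
func setServersFromHeartbeat(infos []NodeServerInfo) []string {
	addrs := make([]string, 0, len(infos))
	for _, info := range infos {
		addrs = append(addrs, info.RPCAdvertiseAddr)
	}
	return addrs
}

func main() {
	resp := []NodeServerInfo{{RPCAdvertiseAddr: "203.0.113.10:4647", Datacenter: "dc1"}}
	fmt.Println(setServersFromHeartbeat(resp)) // [203.0.113.10:4647]
}
```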

This changeset updates the client logic to query Status.Members instead of Status.Peers, and then extract the advertised RPC address and port from the response body.
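
The extraction step is roughly the following sketch. The rpc_addr and port names match the serf tags Nomad servers advertise, but the surrounding types are simplified stand-ins, not the real response structs:

```go
package main

import (
	"fmt"
	"net"
)

// ServerMember is a simplified stand-in for the entries in the
// Status.Members response.
type ServerMember struct {
	Name string
	Addr net.IP            // the serf address
	Tags map[string]string // advertised serf tags
}

// rpcAddrFromMember prefers the advertised RPC address over the serf
// address, falling back to the serf address when no rpc_addr tag is set
// (e.g. when both interfaces are the same).
func rpcAddrFromMember(m ServerMember) (string, error) {
	host := m.Tags["rpc_addr"]
	if host == "" {
		host = m.Addr.String()
	}
	port := m.Tags["port"]
	if port == "" {
		return "", fmt.Errorf("member %q has no advertised RPC port", m.Name)
	}
	return net.JoinHostPort(host, port), nil
}

func main() {
	m := ServerMember{
		Name: "server-1.global",
		Addr: net.ParseIP("10.0.0.10"), // private serf address
		Tags: map[string]string{"rpc_addr": "203.0.113.10", "port": "4647"},
	}
	addr, _ := rpcAddrFromMember(m)
	fmt.Println(addr) // 203.0.113.10:4647
}
```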


Notes:

  • The end-to-end testing required for this means it's going to just miss the 1.5.0 GA. This is probably safe to land in a patch release without worrying about backwards compatibility, because anyone with this topology is already using the server_join workaround described above.
  • This also closes #11895 ("fix wrong control plane discovery, when serf advertised address located in network, unreachable from clients"), which unfortunately got abandoned because we had some trouble understanding exactly what the problem was.
  • I've done some bench testing of this with a local development cluster, some extra IP addresses, and iptables rules.
  • I also stood it up on a real multi-host cluster using the changes to our E2E config from #16218 ("E2E: add multi-home networking to test infrastructure"), and everything seems to work as usual.
  • This is outside the code path of the non-Consul cloud discovery features, but I also tested it by disabling server join via Consul (by deliberately breaking the consul config block) and using AWS EC2 auto-join via tags, and that works just fine as well.

@tgross added this to the 1.5.x milestone Feb 17, 2023
@tgross force-pushed the client-discover-by-members-rpc branch from c1b0c79 to 3c094d3 on February 17, 2023 at 20:32
@tgross self-assigned this Feb 17, 2023
tgross added a commit that referenced this pull request Feb 17, 2023
Add an Elastic Network Interface (ENI) to each Linux host, on a secondary subnet
we have provisioned in each AZ. Revise security groups as follows:

* Split out client security groups from servers so that we can't have clients
  accidentally accessing serf addresses or other unexpected cross-talk.
* Add new security groups for the secondary subnet that allow communication
  only within the security group, so we can exercise behaviors with
  multiple IPs.

This changeset doesn't include any Nomad configuration changes needed to take
advantage of the extra network interface. I'll include those with testing for
PR #16217.
jrasell pushed a commit that referenced this pull request Feb 20, 2023 (same commit message as above)
@tgross marked this pull request as ready for review February 28, 2023 21:53
@tgross requested review from lgfa29 and angrycub February 28, 2023 21:53
@tgross added the backport/1.3.x and backport/1.4.x labels Feb 28, 2023
@tgross force-pushed the client-discover-by-members-rpc branch from 3c094d3 to 0bf6e65 on February 28, 2023 at 21:55
@lgfa29 (Contributor) left a comment:
LGTM!

tgross added a commit that referenced this pull request Mar 14, 2023
In #16217 we switched clients using Consul discovery to the `Status.Members`
endpoint for getting the list of servers so that we're using the correct
address. This endpoint has an authorization gate, so this fails if the anonymous
policy doesn't have `node:read`. We also can't check the `AuthToken` for the
request for the client secret, because the client hasn't yet registered so the
server doesn't have anything to compare against.

Create a new `Status.RPCServers` endpoint that mirrors the `Status.Peers`
endpoint but provides the RPC server addresses instead of the Raft
addresses. This fixes the authentication bug but also ensures we're only
registering with servers in the client's region and not in any other servers
that might have registered with Consul.

This changeset also expands the test coverage of the RPC endpoint and closes up
potential holes in the `ResolveACL` method that aren't currently bugs but easily
could become bugs if we called the method without ensuring its invariants are
upheld.

Co-authored-by: tantra35 <[email protected]>
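
In spirit, the new endpoint behaves like this hedged sketch (illustrative names and types, not Nomad's exact code): filter the known servers to the caller's region and report their client-facing RPC addresses instead of the Raft addresses that Status.Peers returns:

```go
package main

import (
	"fmt"
	"net"
)

// serverPart mirrors the per-server info a Status-style endpoint could
// expose; the field names are illustrative, not Nomad's exact structs.
type serverPart struct {
	Region  string
	RPCAddr net.TCPAddr
}

// rpcServers returns the advertised client-facing RPC addresses, filtered
// to the caller's region, rather than the Raft (serf-derived) addresses.
func rpcServers(known []serverPart, region string) []string {
	var out []string
	for _, s := range known {
		if s.Region != region {
			continue // don't hand clients servers from other regions
		}
		out = append(out, s.RPCAddr.String())
	}
	return out
}

func main() {
	servers := []serverPart{
		{Region: "global", RPCAddr: net.TCPAddr{IP: net.ParseIP("203.0.113.10"), Port: 4647}},
		{Region: "eu", RPCAddr: net.TCPAddr{IP: net.ParseIP("198.51.100.7"), Port: 4647}},
	}
	fmt.Println(rpcServers(servers, "global")) // [203.0.113.10:4647]
}
```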
tgross added a commit that referenced this pull request Mar 16, 2023
In #16217 we switched clients using Consul discovery to the `Status.Members`
endpoint for getting the list of servers so that we're using the correct
address. This endpoint has an authorization gate, so this fails if the anonymous
policy doesn't have `node:read`. We also can't check the `AuthToken` for the
request for the client secret, because the client hasn't yet registered so the
server doesn't have anything to compare against.

Instead of hitting the `Status.Peers` or `Status.Members` RPC endpoint, use the
Consul response directly. Update the `registerNode` method to handle the list of
servers we get back in the response; if we get a "no servers" or "no path to
region" response we'll kick off discovery again and retry immediately rather
than waiting 15s.
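
The retry behavior amounts to something like the following sketch; the sentinel errors and function names are assumptions drawn from the commit message, not Nomad's actual identifiers:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	errNoServers      = errors.New("no servers")
	errNoPathToRegion = errors.New("no path to region")
)

// register stands in for the node registration RPC; it always fails here so
// the example terminates quickly.
func register() error { return errNoServers }

// triggerDiscovery stands in for kicking off the Consul discovery loop again.
func triggerDiscovery() { fmt.Println("re-running Consul discovery") }

// registerWithRetry shows the described behavior: on a "no servers" or "no
// path to region" error, re-run discovery and retry immediately instead of
// sleeping out the full 15s interval.
func registerWithRetry(attempts int) {
	const retryInterval = 15 * time.Second
	for i := 0; i < attempts; i++ {
		err := register()
		if err == nil {
			return
		}
		if errors.Is(err, errNoServers) || errors.Is(err, errNoPathToRegion) {
			triggerDiscovery()
			continue // retry immediately
		}
		time.Sleep(retryInterval) // other errors wait out the interval
	}
}

func main() { registerWithRetry(3) }
```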
tgross added further commits that referenced this pull request Mar 20, 2023, each repeating the commit message above.
Labels: backport/1.3.x, backport/1.4.x, backport/1.5.x, theme/client, type/bug

Successfully merging this pull request may close these issues:

Nomad client tries to connect to serf IP address, not to RPC (#16211)