Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server-acl-init: Add -server-address and -server-port #238

Merged
merged 4 commits into from
Apr 9, 2020

Conversation

ishustava
Copy link
Contributor

@ishustava ishustava commented Mar 30, 2020

This PR proposes to change the server-acl-init command to use user-provided server-address and server-port instead of discovering servers' Pod IPs and ports via Kubernetes API.

Why

Currently, the server-acl-init command knows a lot about how the servers are deployed. It makes some assumptions about the resource names of the stateful set and uses Kubernetes API as its discovery mechanism for server addresses. This creates a fair amount of extra logic to correctly discover the server IPs and ports. In addition, we have some logic that checks if the stateful set rollout is currently in progress and waits until it's complete. The Statefulset logic was added specifically to prevent a situation when we would discover the IPs of the servers right before Kubernetes would kill and restart the container, causing its IP to change and the server-acl-init job to run forever trying to talk to the pod IP that no longer exists.

This PR was motivated by the idea that we don't really need to use Kube API to discover the IPs of the servers. Kubernetes already has a built-in mechanism for discovering IPs of the stateful set because each pod in a stateful set has a "well-known" DNS name. Using these DNS names would allow us to not only simplify the logic of this command but also avoid potential edge cases where the pod IP we've discovered has changed while the job is running. Additionally, this would allow us to eventually run this command against external servers.

Summary of the Changes

Command Flags

  • ❗️ ❗️ [Breaking change] Add the required -server-address command flag
  • Add -server-port command flag and default it to 8500.
  • ❗️ ❗️ [Breaking change] Remove -expected-replicas, -release-name (already deprecated), and -server-label-selector flags because we no longer need them.

Other changes

  • Because we're requiring server addresses be provided, we're removing the logic that checks whether stateful set rollout is in progress. In the case when the server addresses are DNS names of the stateful set, the reasons for this logic go away since even if pod IPs change, the Kube DNS should handle that for us.

Testing

All testing was done on AKS with the ishustava/consul-k8s-dev:04-07-2020-497214b image (or equivalent).
Cases tested:

  • Clean install
  • Upgrade of servers
  • Upgrade when there's a config change causing an update to ACL policies
  • Upgrade of servers when servers were in an unhealthy state (split-brain)

Feedback

I'm looking for feedback in the following areas:

  • Am I missing any test cases?
  • Flag names

@ishustava ishustava marked this pull request as ready for review April 8, 2020 05:39
@ishustava ishustava requested a review from a team April 8, 2020 05:39
@ishustava ishustava force-pushed the acl-init-refactor branch from 497214b to c6b03fe Compare April 8, 2020 16:47
* Require -server-address to be provided instead of discovering the
  server IPs from Kubernetes pods. This allows us to eventually
  to run this command against external servers or servers deployed
  on Kube in the same way. On Kubernetes, instead of discovering Pod IPs,
  we can use server's stateful set DNS names.
* [Breaking change] Remove -expected-replicas, -release-name, and
  -server-label-selector flags because we no longer need them.
@ishustava ishustava force-pushed the acl-init-refactor branch from c6b03fe to c2581c5 Compare April 8, 2020 16:49
@lkysow lkysow self-assigned this Apr 9, 2020
Copy link
Member

@lkysow lkysow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to actually run the code but here's the code review. I love how much this has reduced LOC.

subcommand/server-acl-init/command.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command.go Show resolved Hide resolved
subcommand/server-acl-init/command_test.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command_test.go Show resolved Hide resolved
subcommand/server-acl-init/command_test.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command.go Show resolved Hide resolved
subcommand/server-acl-init/command_test.go Show resolved Hide resolved
@lkysow
Copy link
Member

lkysow commented Apr 9, 2020

I've tested this now and it works like a charm. I tried to mess it up by deleting pods and starting upgrades but it always worked!

Am I missing any test cases?

No, I think you're good.

Flag names

I don't know the best word for IP or hostname but not port. I think server-address is fine if it's documented that you shouldn't add the port.

@ishustava
Copy link
Contributor Author

@lkysow it could also be called -server-host. As far as I can tell, that's how this part of the URL is called.

@lkysow
Copy link
Member

lkysow commented Apr 9, 2020

But then you're using it as a possible join url in the next pr. Also join and join-wan use the term "address":

  -join=<value>
     Address of an agent to join at start time. Can be specified
     multiple times.

  -join-wan=<value>
     Address of an agent to join -wan at start time. Can be specified
     multiple times.

@ishustava
Copy link
Contributor Author

Agree. I left it as server-address.

I've made updates based on your suggestions (thank you so much for this careful review!).

subcommand/server-acl-init/command_test.go Outdated Show resolved Hide resolved
@ishustava ishustava merged commit afb4b84 into master Apr 9, 2020
@lkysow lkysow deleted the acl-init-refactor branch July 13, 2020 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/acls Related to ACLs type/enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants