Support ACL replication #226

Merged 1 commit on Mar 16, 2020

lkysow (Member) commented Mar 6, 2020

Summary

  • Adds flag -acl-replication-token-file for setting the ACL replication
    token. This token is used by secondary datacenters to create ACL
    policies. Setting this flag turns on ACL replication mode.
    In this mode we will not bootstrap ACLs since we expect replication
    to be running.
  • Modifies various policies and tokens to only be applicable to the
    local datacenter. These policies should have been only local before.
  • If running in a secondary DC, append the datacenter name to the
    policy name. This is required because policies must be globally
    unique.
  • Note: we aren't sharing policies between datacenters because
    each server-acl-init could modify the policy depending on
    its local config.
  • Adds agent:read permissions to the replication token, which are
    needed to look up the current datacenter.
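
The naming rule for secondary datacenters described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper `policyName`, its parameters, and the example base name `client-token` are assumptions made up for the sketch.

```go
package main

import "fmt"

// policyName is a hypothetical helper (not taken from this PR).
// In the primary datacenter the base name is used unchanged. In a secondary
// datacenter the datacenter name is appended, so each datacenter's policy
// has a distinct name even though policy names must be globally unique.
func policyName(base, datacenter string, isSecondary bool) string {
	if !isSecondary {
		return base
	}
	return fmt.Sprintf("%s-%s", base, datacenter)
}

func main() {
	// "client-token" is an illustrative base name only.
	fmt.Println(policyName("client-token", "dc1", false))
	fmt.Println(policyName("client-token", "dc2", true))
}
```

Because each datacenter gets its own policy, a server-acl-init run in one datacenter can change its policy based on local config without affecting the others.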

Test Description

I took an incremental approach to the tests instead of adding a replication-enabled permutation to every existing test. The replication changes are themselves incremental, and adding that permutation everywhere would make the tests overly complicated and harder to debug when they fail, since multiple Consul servers are involved and there are a lot of logs to pick through.

How to test

DC1

  • Bring up two kube clusters
  • Checkout the wan-fed-acls Helm branch
  • Run helm install using primary-dc.yaml (see below)
  • Wait for mesh gateway svc to get external ip (kubectl get svc consul-mesh-gateway)
  • Edit primary-dc.yaml and update meshGateway.wanAddress.host to that IP. Run helm upgrade.
  • Export the CA cert, CA key, and ACL replication token secrets
    kubectl get secret consul-ca-cert -o yaml > consul-ca-cert.yaml
    kubectl get secret consul-ca-key -o yaml > consul-ca-key.yaml
    kubectl get secret consul-acl-replication-acl-token -o yaml > consul-acl-replication-acl-token.yaml
    
  • Note down mesh gateway IP

DC2

  • Switch kubectl context to dc2
  • Edit secondary-dc.yaml and update primary_gateways in server.extraConfig to the primary's mesh gateway IP and port (for example, 52.188.143.194:443)
  • Import the secrets: kubectl apply -f consul-ca-cert.yaml -f consul-ca-key.yaml -f consul-acl-replication-acl-token.yaml
  • Run helm install using secondary-dc.yaml (see below)
  • Wait for mesh gateway svc to get external ip (kubectl get svc consul-mesh-gateway)
  • Edit secondary-dc.yaml and update meshGateway.wanAddress.host to that IP. Run helm upgrade.
  • 🤞 it works!
primary-dc.yaml
global:
  image: lkysow/consul:1.8.0-beta1-mar10
  imageK8S: lkysow/consul-k8s-dev:mar10-2020-wanfed
  tls:
    enabled: true
  bootstrapACLs: true
  acls:
    createReplicationToken: true
  federation:
    enabled: true
  name: consul
server:
  replicas: 1
  bootstrapExpect: 1
  extraConfig: |
    {
      "log_level": "debug"
    }
client:
  extraConfig: |
    {
      "log_level": "debug"
    }
connectInject:
  enabled: true
  imageEnvoy: envoyproxy/envoy-alpine:v1.13.0
meshGateway:
  enabled: true
  replicas: 1
  enableHealthChecks: false
  wanAddress:
    useNodeIP: false
    host: "52.188.143.194"
  service:
    enabled: true
    type: LoadBalancer
  imageEnvoy: envoyproxy/envoy-alpine:v1.13.0
secondary-dc.yaml
global:
  datacenter: dc2
  image: lkysow/consul:1.8.0-beta1-mar10
  imageK8S: lkysow/consul-k8s-dev:mar10-2020-wanfed
  tls:
    enabled: true
    caCert:
      secretName: consul-ca-cert
      secretKey: tls.crt
    caKey:
      secretName: consul-ca-key
      secretKey: tls.key
  bootstrapACLs: true
  acls:
    replicationToken:
      secretName: consul-acl-replication-acl-token
      secretKey: token
  federation:
    enabled: true
  name: consul
server:
  replicas: 1
  bootstrapExpect: 1
  extraConfig: |
    {
      "primary_datacenter": "dc1",
      "primary_gateways": ["52.188.143.194:443"],
      "log_level": "debug"
    }
client:
  extraConfig: |
    {
      "log_level": "debug"
    }
connectInject:
  enabled: true
  imageEnvoy: envoyproxy/envoy-alpine:v1.13.0
meshGateway:
  enabled: true
  replicas: 1
  enableHealthChecks: false
  wanAddress:
    useNodeIP: false
    host: "52.143.82.105"
  service:
    enabled: true
    type: LoadBalancer
  imageEnvoy: envoyproxy/envoy-alpine:v1.13.0

@lkysow lkysow added labels area/multi-dc (Related to running with multiple datacenters) and type/enhancement (New feature or request) Mar 6, 2020
@lkysow lkysow force-pushed the wan-federation-acl-replication branch from de7b01c to 8e27358 Compare March 6, 2020 22:48
@lkysow lkysow force-pushed the wan-federation-acl-replication branch from 8e27358 to e60bd68 Compare March 10, 2020 15:42
lkysow (Member, Author) commented Mar 10, 2020

NOTE: I will put up a follow-up PR for configuring the anonymous policy to allow cross-dc calls to work.

@lkysow lkysow marked this pull request as ready for review March 10, 2020 19:29
@lkysow lkysow requested a review from a team March 10, 2020 19:30
ishustava (Contributor) left a comment

Luke, awesome work so far! I've reviewed the code and left some comments inline.

Now onto testing it with the Helm chart!

Inline review comments:
subcommand/server-acl-init/command.go (outdated)
subcommand/server-acl-init/command.go (outdated)
subcommand/server-acl-init/command.go (outdated)
subcommand/server-acl-init/command.go
subcommand/server-acl-init/command.go
subcommand/server-acl-init/command_test.go (outdated)
subcommand/server-acl-init/create_or_update.go (outdated)
subcommand/server-acl-init/create_or_update.go (outdated)
subcommand/server-acl-init/command_test.go (outdated)
subcommand/server-acl-init/command_test.go
@lkysow lkysow requested a review from ishustava March 13, 2020 15:53
ishustava (Contributor) left a comment

Approving since my tests with the Helm chart PR were all successful 🎉 There are a couple of comments that are still unresolved, but I don't think they are blocking. Great job!

- Adds flag -acl-replication-token-file for setting the ACL replication
  token. This token is used by secondary datacenters to create ACL
  policies. If set, this flag turns on ACL replication mode.
  In this mode we will not bootstrap ACLs since we expect replication
  to be running.
- Modifies various policies and tokens to only be applicable to the
  local datacenter. These policies should have been only local before.
- If running in a secondary DC, append the datacenter name to the
  policy name. This is required because policies must be globally
  unique.
- Note: we aren't sharing policies between datacenters because
  each server-acl-init could modify the policy depending on
  its local config.
- Adds agent:read permissions to the replication token which is
  needed to get the current datacenter.
@lkysow lkysow force-pushed the wan-federation-acl-replication branch from 377040e to 433e036 Compare March 16, 2020 16:57
@lkysow lkysow merged commit 7a5b597 into wan-federation-base Mar 16, 2020
@lkysow lkysow deleted the wan-federation-acl-replication branch July 13, 2020 18:19
2 participants