[BUG] Vault installation fails when using canal/calico network plugin #1398

Closed
przemyslavic opened this issue Jun 30, 2020 · 5 comments · Fixed by #1434
@przemyslavic
Collaborator

Describe the bug
Vault installation fails when using canal/calico network plugin

To Reproduce
Steps to reproduce the behavior:

  1. Edit the config file to use the calico or canal network plugin.
  2. Run epicli apply.

Expected behavior
The cluster is deployed successfully.

OS (please complete the following information):

  • OS: [all]

Cloud Environment (please complete the following information):

  • Cloud Provider [e.g. AWS/Azure]

Additional context

2020-06-29T21:16:46.8872983Z 21:16:46 INFO cli.engine.ansible.AnsibleCommand - TASK [vault : Run configuration script] ****************************************
2020-06-29T21:21:51.6636713Z 21:21:51 INFO cli.engine.ansible.AnsibleCommand - fatal: [ec2-15-188-78-178.eu-west-3.compute.amazonaws.com]: FAILED! => {"changed": true, "cmd": "/opt/vault/bin/configure-vault.sh -c /opt/vault/script.config -a 10.1.1.125 -p https -v true", "delta": "0:05:03.375734", "end": "2020-06-29 21:21:51.418573", "msg": "non-zero return code", "rc": 1, "start": "2020-06-29 21:16:48.042839", "stderr": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   176  100   176    0     0   1541      0 --:--:-- --:--:-- --:--:--  1557\nError: timed out waiting for the condition", "stderr_lines": ["  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current", "                                 Dload  Upload   Total   Spent    Left  Speed", "", "  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0", "100   176  100   176    0     0   1541      0 --:--:-- --:--:-- --:--:--  1557", "Error: timed out waiting for the condition"], "stdout": "2020-06-29-21:16:48: Checking if Vault is already initialized...\n2020-06-29-21:16:48: Initializing Vault...\n2020-06-29-21:16:48: Vault initialized.\n2020-06-29-21:16:48: Checking if vault is already unsealed...\n2020-06-29-21:16:48: Unsealing Vault...\nKey                Value\n---                -----\nSeal Type          shamir\nInitialized        true\nSealed             true\nTotal Shares       5\nThreshold          3\nUnseal Progress    1/3\nUnseal Nonce       6ede4193-ea85-7149-3abb-b5704997e066\nVersion            1.4.0\nHA Enabled         false\n2020-06-29-21:16:48: Unseal performed.\nKey                Value\n---                -----\nSeal Type          shamir\nInitialized        true\nSealed             true\nTotal Shares       5\nThreshold          3\nUnseal Progress    
2/3\nUnseal Nonce       6ede4193-ea85-7149-3abb-b5704997e066\nVersion            1.4.0\nHA Enabled         false\n2020-06-29-21:16:48: Unseal performed.\nKey             Value\n---             -----\nSeal Type       shamir\nInitialized     true\nSealed          false\nTotal Shares    5\nThreshold       3\nVersion         1.4.0\nCluster Name    vault-cluster-4b42ed8f\nCluster ID      d3feab7d-4d06-dc5d-f2a4-8a65bd8933e9\nHA Enabled      false\n2020-06-29-21:16:48: Unseal performed.\n2020-06-29-21:16:48: Checking if vault is already unsealed...\nKey             Value\n---             -----\nSeal Type       shamir\nInitialized     true\nSealed          false\nTotal Shares    5\nThreshold       3\nVersion         1.4.0\nCluster Name    vault-cluster-4b42ed8f\nCluster ID      d3feab7d-4d06-dc5d-f2a4-8a65bd8933e9\nHA Enabled      false\n2020-06-29-21:16:48: Logging into Vault.\n2020-06-29-21:16:48: Login successful.\n2020-06-29-21:16:48: Checking if secret engine is already initialized...\n2020-06-29-21:16:48: Mounting secret engine...\nSuccess! Enabled the kv secrets engine at: secret/\n2020-06-29-21:16:48: Secret engine enabled under path: secret.\n2020-06-29-21:16:48: Checking if Kubernetes authentication is enabled...\n2020-06-29-21:16:48: Turning on Kubernetes authentication...\nSuccess! Enabled kubernetes auth method at: kubernetes/\n2020-06-29-21:16:48: Kubernetes authentication enabled.\n2020-06-29-21:16:48: Applying Epiphany default Vault policies...\nSuccess! Uploaded policy: admin\n2020-06-29-21:16:49: Admin policy applied.\nSuccess! Uploaded policy: provisioner\n2020-06-29-21:16:49: Provisioner policy applied.\n2020-06-29-21:16:49: Checking if userpass authentication is enabled...\n2020-06-29-21:16:49: Turning on userpass authentication...\nSuccess! Enabled userpass auth method at: userpass/\n2020-06-29-21:16:49: Userpass authentication enabled.\n2020-06-29-21:16:49: Creating user: admin...\nSuccess! 
Data written to: auth/userpass/users/admin\n2020-06-29-21:16:49: User: admin created.\n2020-06-29-21:16:49: Creating user: provisioner...\nSuccess! Data written to: auth/userpass/users/provisioner\n2020-06-29-21:16:49: User: provisioner created.\n2020-06-29-21:16:49: Configuring Kubernetes...\n2020-06-29-21:16:49: Applying vault-endpoint-configuration.yml...\nservice/external-vault created\nendpoints/external-vault created\n2020-06-29-21:16:50: vault-endpoint-configuration.yml: Success.\n2020-06-29-21:16:50: Applying vault-service-account.yml...\nserviceaccount/vault-auth created\nsecret/vault-auth created\nclusterrolebinding.rbac.authorization.k8s.io/role-tokenreview-binding created\n2020-06-29-21:16:50: vault-service-account.yml: Success.\n2020-06-29-21:16:50: Applying app-service-account.yml...\nserviceaccount/internal-app created\n2020-06-29-21:16:50: app-service-account.yml: Success.\n2020-06-29-21:16:50: Checking if Vault Agent Helm Chart is already installed...\n2020-06-29-21:16:51: Installing Vault Agent Helm Chart...\nRelease \"vault\" does not exist. Installing it now.\n2020-06-29-21:21:51: ERROR: There was an error during installation of Vault Agent Helm Chart. 
Exit status: 1", "stdout_lines": ["2020-06-29-21:16:48: Checking if Vault is already initialized...", "2020-06-29-21:16:48: Initializing Vault...", "2020-06-29-21:16:48: Vault initialized.", "2020-06-29-21:16:48: Checking if vault is already unsealed...", "2020-06-29-21:16:48: Unsealing Vault...", "Key                Value", "---                -----", "Seal Type          shamir", "Initialized        true", "Sealed             true", "Total Shares       5", "Threshold          3", "Unseal Progress    1/3", "Unseal Nonce       6ede4193-ea85-7149-3abb-b5704997e066", "Version            1.4.0", "HA Enabled         false", "2020-06-29-21:16:48: Unseal performed.", "Key                Value", "---                -----", "Seal Type          shamir", "Initialized        true", "Sealed             true", "Total Shares       5", "Threshold          3", "Unseal Progress    2/3", "Unseal Nonce       6ede4193-ea85-7149-3abb-b5704997e066", "Version            1.4.0", "HA Enabled         false", "2020-06-29-21:16:48: Unseal performed.", "Key             Value", "---             -----", "Seal Type       shamir", "Initialized     true", "Sealed          false", "Total Shares    5", "Threshold       3", "Version         1.4.0", "Cluster Name    vault-cluster-4b42ed8f", "Cluster ID      d3feab7d-4d06-dc5d-f2a4-8a65bd8933e9", "HA Enabled      false", "2020-06-29-21:16:48: Unseal performed.", "2020-06-29-21:16:48: Checking if vault is already unsealed...", "Key             Value", "---             -----", "Seal Type       shamir", "Initialized     true", "Sealed          false", "Total Shares    5", "Threshold       3", "Version         1.4.0", "Cluster Name    vault-cluster-4b42ed8f", "Cluster ID      d3feab7d-4d06-dc5d-f2a4-8a65bd8933e9", "HA Enabled      false", "2020-06-29-21:16:48: Logging into Vault.", "2020-06-29-21:16:48: Login successful.", "2020-06-29-21:16:48: Checking if secret engine is already initialized...", "2020-06-29-21:16:48: Mounting secret engine...", "Success! 
Enabled the kv secrets engine at: secret/", "2020-06-29-21:16:48: Secret engine enabled under path: secret.", "2020-06-29-21:16:48: Checking if Kubernetes authentication is enabled...", "2020-06-29-21:16:48: Turning on Kubernetes authentication...", "Success! Enabled kubernetes auth method at: kubernetes/", "2020-06-29-21:16:48: Kubernetes authentication enabled.", "2020-06-29-21:16:48: Applying Epiphany default Vault policies...", "Success! Uploaded policy: admin", "2020-06-29-21:16:49: Admin policy applied.", "Success! Uploaded policy: provisioner", "2020-06-29-21:16:49: Provisioner policy applied.", "2020-06-29-21:16:49: Checking if userpass authentication is enabled...", "2020-06-29-21:16:49: Turning on userpass authentication...", "Success! Enabled userpass auth method at: userpass/", "2020-06-29-21:16:49: Userpass authentication enabled.", "2020-06-29-21:16:49: Creating user: admin...", "Success! Data written to: auth/userpass/users/admin", "2020-06-29-21:16:49: User: admin created.", "2020-06-29-21:16:49: Creating user: provisioner...", "Success! 
Data written to: auth/userpass/users/provisioner", "2020-06-29-21:16:49: User: provisioner created.", "2020-06-29-21:16:49: Configuring Kubernetes...", "2020-06-29-21:16:49: Applying vault-endpoint-configuration.yml...", "service/external-vault created", "endpoints/external-vault created", "2020-06-29-21:16:50: vault-endpoint-configuration.yml: Success.", "2020-06-29-21:16:50: Applying vault-service-account.yml...", "serviceaccount/vault-auth created", "secret/vault-auth created", "clusterrolebinding.rbac.authorization.k8s.io/role-tokenreview-binding created", "2020-06-29-21:16:50: vault-service-account.yml: Success.", "2020-06-29-21:16:50: Applying app-service-account.yml...", "serviceaccount/internal-app created", "2020-06-29-21:16:50: app-service-account.yml: Success.", "2020-06-29-21:16:50: Checking if Vault Agent Helm Chart is already installed...", "2020-06-29-21:16:51: Installing Vault Agent Helm Chart...", "Release \"vault\" does not exist. Installing it now.", "2020-06-29-21:21:51: ERROR: There was an error during installation of Vault Agent Helm Chart. Exit status: 1"]}

The command that most likely causes the problem:

  if [ "$helm_custom_values_set_bool" = "true" ]; then
    helm upgrade --install --wait -f /tmp/vault_helm_chart_values.yaml vault /tmp/v0.4.0.tar.gz
  else
    helm upgrade --install --wait vault /tmp/v0.4.0.tar.gz
  fi
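
The timestamps in the script output put the start of the chart installation at 21:16:51 and the error at 21:21:51. The sketch below checks that gap, which appears to match Helm's default wait timeout of five minutes, so the release resources most likely never became ready rather than failing outright. It assumes GNU date; inspect_vault_release is a hypothetical helper (not part of configure-vault.sh) that needs a working helm and kubectl context, and it is only defined here, not called.

```shell
#!/usr/bin/env bash
# Timestamps copied from the script output above.
start="2020-06-29 21:16:51"   # "Installing Vault Agent Helm Chart..."
end="2020-06-29 21:21:51"     # "ERROR: There was an error during installation..."

# GNU date is assumed here (-d is not portable to BSD/macOS date).
elapsed=$(( $(date -u -d "$end" +%s) - $(date -u -d "$start" +%s) ))
echo "helm waited ${elapsed}s"

# Hypothetical helper: places to look after a --wait timeout.
# All namespaces are listed because the release namespace is not known here.
inspect_vault_release() {
  helm status vault
  kubectl get pods --all-namespaces -o wide
  kubectl get events --all-namespaces --sort-by=.lastTimestamp
}
```

Pods of the release that stay in a non-ready state in the inspect_vault_release output would be the first thing to look at.
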
@mkyc
Contributor

mkyc commented Jul 2, 2020

@przemyslavic, isn't this related to #1072?

@erzetpe
Contributor

erzetpe commented Jul 7, 2020

The problem is with network policies. To solve this issue, I will add a separate vault namespace and the appropriate policies to allow traffic between the Kubernetes API, the Vault agent, and the external Vault.

@przemyslavic
Collaborator Author

przemyslavic commented Jul 23, 2020

All supported configurations were tested.
**Two of them are not working properly:**

- AWS/RHEL/flannel
- AWS/RHEL/canal

Steps to reproduce the bug:

  1. Deploy a new cluster.
  2. Log in to Vault: vault login
  3. Add a secret: vault kv put secret/devwebapp/config username='test' password='test'
  4. Deploy the test application:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: devwebapp
  labels:
    app: devwebapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: devwebapp
  template:
    metadata:
      labels:
        app: devwebapp
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "devweb-app"
        vault.hashicorp.com/agent-inject-secret-credentials.txt: "secret/data/devwebapp/config"
        vault.hashicorp.com/tls-skip-verify: "true"
    spec:
      serviceAccountName: internal-app
      containers:
      - name: app
        image: busybox
        command:
        - sleep
        - "3600"
        imagePullPolicy: IfNotPresent
  5. Check the logs of the vault-agent-init container: kubectl logs devwebapp-xxx-xxx -c vault-agent-init
  6. Check whether the secret exists in the application container: kubectl exec devwebapp-xxx-xxx -c app -- cat /vault/secrets/credentials.txt

Actual behavior:
The pod contains only one container, named 'app'.

[ec2-user@ec2-xx-xx-xx-xx ~]$ kubectl get pods -A
NAMESPACE   NAME              READY   STATUS    RESTARTS   AGE
default     devwebapp-xx-xx   1/1     Running   0          5s

Neither the vault-agent-init nor the vault-agent container exists, so secrets cannot be injected.

[ec2-user@ec2-xx-xx-xx-xx ~]$ kubectl logs devwebapp-xx-xx -c vault-agent-init
error: container vault-agent-init is not valid for pod devwebapp-xx-xx
[ec2-user@ec2-xx-xx-xx-xx ~]$ kubectl logs devwebapp-xx-xx -c vault-agent
error: container vault-agent is not valid for pod devwebapp-xx-xx
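
The same check can be made without relying on kubectl logs errors, by listing the container names in the pod spec. A minimal sketch: has_container and check_injection are hypothetical helper names, and the pod name passed to check_injection is whatever kubectl get pods reports for the devwebapp deployment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the first argument appears
# among the remaining arguments (the container names).
has_container() {
  local wanted="$1"; shift
  local name
  for name in "$@"; do
    [ "$name" = "$wanted" ] && return 0
  done
  return 1
}

# Hypothetical helper: reports whether the Vault agent containers
# were added to the given pod. Requires a working kubectl context.
check_injection() {
  local pod="$1"
  local names
  names=$(kubectl get pod "$pod" \
    -o jsonpath='{.spec.initContainers[*].name} {.spec.containers[*].name}')
  # Word splitting of $names is intentional: one argument per container.
  # shellcheck disable=SC2086
  if has_container vault-agent-init $names && has_container vault-agent $names; then
    echo "injected: $names"
  else
    echo "not injected: $names"
  fi
}
```

Usage: check_injection devwebapp-xx-xx. On the two failing configurations this should take the "not injected" branch, since only 'app' is present.
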

Apiserver logs showing the issue:

I0723 13:46:14.698896       1 trace.go:116] Trace: "Call mutating webhook" configuration:vault-agent-injector-cfg,webhook:vault.hashicorp.com,resource:/v1, Resource=pods,subresource:,operation:CREATE,UID:xxx (started: 2020-07-23 13:45:44.698698084 +0000 UTC m=+5802.412268297) (total time: 30.000155842s):
Trace: [30.000155842s] [30.000155842s] END
W0723 13:46:14.698966       1 dispatcher.go:168] Failed calling webhook, failing open vault.hashicorp.com: failed calling webhook "vault.hashicorp.com": Post https://vault-agent-injector-svc.vault.svc:443/mutate?timeout=30s: context deadline exceeded
E0723 13:46:14.698984       1 dispatcher.go:169] failed calling webhook "vault.hashicorp.com": Post https://vault-agent-injector-svc.vault.svc:443/mutate?timeout=30s: context deadline exceeded
I0723 13:46:14.702704       1 trace.go:116] Trace: "Create" url:/api/v1/namespaces/default/pods,user-agent:kube-controller-manager/v1.17.7 (linux/amd64) kubernetes/b445510/system:serviceaccount:kube-system:replicaset-controller,client:10.1.2.210 (started: 2020-07-23 13:45:44.693306426 +0000 UTC m=+5802.406876613) (total time: 30.00934833s):
Trace: [30.005736493s] [30.005653171s] About to store object in database
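
The "failing open" warning means the apiserver gave up on the mutating webhook after 30 seconds and created the pod unmodified, which is why the agent containers are missing. A rough sketch of where to look next, using only the names that appear in the log lines above (vault-agent-injector-cfg and vault-agent-injector-svc.vault.svc). inspect_webhook is a hypothetical helper and needs a working kubectl context; it is defined here but not called.

```shell
#!/usr/bin/env bash
# Host taken from the apiserver log line above.
webhook_host="vault-agent-injector-svc.vault.svc"

# <service>.<namespace>.svc: split with parameter expansion.
svc="${webhook_host%%.*}"
rest="${webhook_host#*.}"
ns="${rest%%.*}"
echo "service=$svc namespace=$ns"

# Hypothetical helper: objects involved in the webhook call path.
inspect_webhook() {
  kubectl get mutatingwebhookconfiguration vault-agent-injector-cfg -o yaml
  kubectl -n "$ns" get service "$svc" -o wide
  kubectl -n "$ns" get endpoints "$svc"
  kubectl -n "$ns" get pods -o wide
  kubectl get networkpolicy --all-namespaces
}
```

An empty endpoints list would point at the injector pod itself; populated endpoints together with the timeout would point at traffic between the apiserver and the pod network being dropped.
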

@przemyslavic
Collaborator Author

I also tested with TLS disabled.
The same two configurations, AWS/RHEL/flannel and AWS/RHEL/canal, still do not work properly.

@przemyslavic
Collaborator Author

The issue described in this task no longer occurs. Another issue was found and reported in #1500.
