Describe the bug
Juju units get stuck in the "Waiting for vault to be available" waiting state when Kubernetes pods get rescheduled.
To Reproduce
1. Deploy Vault:
   juju deploy vault-k8s --channel=edge --trust -n 4
2. Wait for the vault units to reach the active/idle state.
3. Delete one of the pods:
   kubectl delete pod vault-k8s-1 -n <your model>
4. Run juju status:
guillaume@potiron:~$ juju status
Model  Controller          Cloud/Region        Version  SLA          Timestamp
dem    microk8s-localhost  microk8s/localhost  3.1.6    unsupported  11:50:00-04:00

App        Version  Status   Scale  Charm      Channel  Rev  Address         Exposed  Message
vault-k8s           waiting      4  vault-k8s  edge      44  10.152.183.110  no       installing agent

Unit          Workload  Agent  Address      Ports  Message
vault-k8s/0*  active    idle   10.1.182.49
vault-k8s/1   waiting   idle   10.1.182.42         Waiting for vault to be available
vault-k8s/2   active    idle   10.1.182.57
vault-k8s/3   active    idle   10.1.182.38
Expected behavior
The affected units should return to the active/idle state automatically once their pods are running again.
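The expected state can be checked mechanically. The following is a minimal sketch, not part of the charm: it assumes the JSON produced by `juju status --format=json` nests units under `applications.<app>.units` with `workload-status.current` and `juju-status.current` fields. The helper name `units_not_settled` and the sample data are hypothetical, written by hand to mirror the status output in this report.

```python
import json


def units_not_settled(status: dict) -> dict[str, str]:
    """Return {unit: 'workload/agent'} for every unit that is not active/idle."""
    stuck = {}
    for app in status.get("applications", {}).values():
        for unit, info in app.get("units", {}).items():
            workload = info.get("workload-status", {}).get("current")
            agent = info.get("juju-status", {}).get("current")
            if (workload, agent) != ("active", "idle"):
                stuck[unit] = f"{workload}/{agent}"
    return stuck


# Hand-written sample mirroring the report (assumed layout, not real output).
sample = json.loads("""
{
  "applications": {
    "vault-k8s": {
      "units": {
        "vault-k8s/0": {"workload-status": {"current": "active"},
                        "juju-status": {"current": "idle"}},
        "vault-k8s/1": {"workload-status": {"current": "waiting",
                                            "message": "Waiting for vault to be available"},
                        "juju-status": {"current": "idle"}}
      }
    }
  }
}
""")

print(units_not_settled(sample))
```

With real data, the dictionary would come from parsing the output of `juju status --format=json`. An empty result would mean every unit has settled back to active/idle.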
Environment
Charm / library version (if relevant):
Juju version (output from juju --version): 3.1.6
Cloud Environment: MicroK8s v1.27.6
Kubernetes version: 1.27
This issue may be due to the fact that we use IPs instead of hostnames. When pods go down and come back up, they come back with a different IP. I'd recommend that we use K8s hostnames instead, so that a unit's identity is not affected when its pod is rescheduled.
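To illustrate the suggestion, here is a minimal sketch of how a stable hostname could be derived for a unit. It relies on the Kubernetes convention that a StatefulSet pod behind a headless service resolves as `<pod>.<service>.<namespace>.svc.<cluster domain>`. The function names are hypothetical, and the service name `vault-k8s-endpoints` and the default cluster domain `cluster.local` are assumptions rather than values taken from this report.

```python
def unit_to_pod_name(unit_name: str) -> str:
    """Map a Juju unit name such as 'vault-k8s/1' to the pod name 'vault-k8s-1'."""
    app, _, index = unit_name.rpartition("/")
    return f"{app}-{index}"


def pod_fqdn(pod_name: str, service: str, namespace: str,
             cluster_domain: str = "cluster.local") -> str:
    """Return the DNS name of a StatefulSet pod behind a headless service."""
    return f"{pod_name}.{service}.{namespace}.svc.{cluster_domain}"


# 'dem' is the model (namespace) from the status output above;
# 'vault-k8s-endpoints' is an assumed headless service name.
print(pod_fqdn(unit_to_pod_name("vault-k8s/1"), "vault-k8s-endpoints", "dem"))
```

Unlike the pod IP, this name depends only on the pod's ordinal, the service, and the namespace, so it is the same before and after the pod is rescheduled.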
Fixes #69 by using the FQDN to connect to the vault instance instead of
using the IP address.
Using the IP address caused an issue because the IP address would change
after a crash or removal of a pod. In turn, this would cause the TLS
certificate to no longer be valid, as the TLS certificate is validated
against the new IP address (but was only issued for the old IP address).
We could re-issue the certificate, but a certificate issued for the FQDN
stays valid, because the replacement pod shares the same FQDN.
This change removes the IP address from the certificate, and relies on
the FQDN instead.
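The failure mode described in this commit message can be modelled with a simplified subject alternative name (SAN) check. This is an illustrative sketch only: `san_matches` is a hypothetical helper, the addresses and hostname are made up, and real TLS verification applies more rules (wildcards, SAN types, chain validation) than this comparison does.

```python
import ipaddress


def san_matches(sans: list[str], target: str) -> bool:
    """Simplified check: does `target` appear among the certificate's SANs?

    IP values are compared as parsed addresses, DNS names case-insensitively.
    """
    def normalize(value: str):
        try:
            return ipaddress.ip_address(value)
        except ValueError:
            # Not an IP address: treat it as a DNS name.
            return value.lower().rstrip(".")

    wanted = normalize(target)
    return any(normalize(san) == wanted for san in sans)


# Hypothetical values: the pod's address before and after rescheduling,
# and an assumed stable hostname for the same pod.
old_ip, new_ip = "10.1.182.10", "10.1.182.42"
fqdn = "vault-k8s-1.vault-k8s-endpoints.dem.svc.cluster.local"

print(san_matches([old_ip], new_ip))  # certificate issued for the old IP
print(san_matches([fqdn], fqdn))      # certificate issued for the FQDN
```

A certificate whose only SAN is the old IP no longer matches once the client connects to the new IP, whereas a certificate issued for the FQDN keeps matching because the name does not change.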