Vault not booting on AWS using awskms seal #368
I've got it working with this awskms block:

Maybe see if adding the region helps?
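For reference, a minimal `awskms` seal stanza with the region set explicitly might look like the following sketch; the key ID and region are placeholders, not values from this thread:

```hcl
# Sketch of an awskms seal stanza with an explicit region.
# Both values are placeholders.
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal-key"
}
```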
---

I actually bypassed the seal config stanza completely and set it up with environment variables instead:

```yaml
extraEnvironmentVars:
  VAULT_SEAL_TYPE: awskms

extraSecretEnvironmentVars:
  - envName: AWS_ACCESS_KEY_ID
    secretName: vault-aws-auth
    secretKey: VAULT_AWS_ACCESS_KEY_ID
  - envName: AWS_SECRET_ACCESS_KEY
    secretName: vault-aws-auth
    secretKey: VAULT_AWS_SECRET_ACCESS_KEY
  - envName: AWS_REGION
    secretName: vault-aws-auth
    secretKey: VAULT_AWS_REGION
  - envName: VAULT_AWSKMS_SEAL_KEY_ID
    secretName: vault-aws-auth
    secretKey: VAULT_AWSKMS_SEAL_KEY_ID
```
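This assumes a Kubernetes secret named `vault-aws-auth` already exists with those exact keys; a hedged sketch of creating one (namespace and credential values are placeholders):

```shell
# Sketch: create the secret referenced by extraSecretEnvironmentVars above.
# The namespace and all values are placeholders.
kubectl create secret generic vault-aws-auth \
  --namespace vault \
  --from-literal=VAULT_AWS_ACCESS_KEY_ID=AKIAEXAMPLE \
  --from-literal=VAULT_AWS_SECRET_ACCESS_KEY=example-secret \
  --from-literal=VAULT_AWS_REGION=us-east-1 \
  --from-literal=VAULT_AWSKMS_SEAL_KEY_ID=alias/vault-unseal-key
```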
---

@echoboomer As @cbohrtarwater mentioned, I've usually seen the region specified in the awskms config block. As for debugging it, we're working on adding more logging around the awskms credential code, but in the meantime could you check the logs of the vault container? Vault should eventually log the error it encountered and which credentials it's using.
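A hedged way to pull those container logs, assuming the chart's default naming (a `vault-0` pod in a `vault` namespace):

```shell
# Sketch: tail the vault server container's logs; pod name, namespace,
# and container name are assumptions based on the chart defaults.
kubectl logs vault-0 -n vault -c vault --tail=100 -f
```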
---

I have tried providing the region both in the config and via environment variables. There are no k8s logs, nor logs from the container. Running the following in the pod:

```shell
$ vault status
Error checking seal status: Get "http://127.0.0.1:8200/v1/sys/seal-status": dial tcp 127.0.0.1:8200: connect: connection refused
```

The Helm values I use are the following:

```yaml
server:
  standalone:
    enabled: true
  extraEnvironmentVars:
    VAULT_SEAL_TYPE: awskms
    VAULT_AWS_REGION: $AWS_REGION
    VAULT_AWSKMS_SEAL_KEY_ID: $VAULT_KMS_KEY_ID
  dataStorage:
    enabled: true
    size: 10Gi
    storageClass: null
    accessMode: ReadWriteOnce
  service:
    enabled: true

ui:
  enabled: true
  serviceType: LoadBalancer
```

I also tried this:

```yaml
server:
  standalone:
    enabled: true
    config: |
      ui = true

      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }

      storage "file" {
        path = "/vault/data"
      }

      service_registration "kubernetes" {}

      seal "awskms" {
        kms_key_id = "$VAULT_KMS_KEY_ID"
        region = "$AWS_REGION"
      }
  dataStorage:
    enabled: true
    size: 10Gi
    storageClass: null
    accessMode: ReadWriteOnce
  service:
    enabled: true

ui:
  enabled: true
  serviceType: LoadBalancer
```

Both result in the same issue: vault is not reachable.
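When `vault status` cannot even connect, inspecting the pod and its events can be a useful first step; a sketch assuming the default pod name and namespace:

```shell
# Sketch: check pod state, restart count, and recent events; names are assumptions.
kubectl get pods -n vault
kubectl describe pod vault-0 -n vault
kubectl get events -n vault --sort-by='.metadata.creationTimestamp'
```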
---

When explicitly setting the following environment variables, it works:

```yaml
AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN: $AWS_SESSION_TOKEN
```

According to the documentation it should also be possible to run without these and use the role principal. We are sure we managed to get it working once, but can't reproduce it.
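One way to see which AWS credentials the pod actually ends up with (static keys versus role-based ones) is to inspect its environment; the pod name and namespace below are assumptions:

```shell
# Sketch: list AWS-related environment variables inside the vault pod.
# Role-based (IRSA) setups should show AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE
# rather than static AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY values.
kubectl exec -n vault vault-0 -- env | grep -i aws
```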
---

So vault should eventually crash and log which AWS credentials it's trying to use for auto-unseal, and why they failed. Something like this:

I have noticed that sometimes it takes a few minutes for the AWS auth to fail; the vault pod then exits with an error, and the pod restarts. So you may need to use something like
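Pulling logs from the container instance that ran before the most recent restart is typically done along these lines; the pod name and namespace are assumptions:

```shell
# Sketch: read logs from the container that crashed before the latest restart.
kubectl logs vault-0 -n vault --previous
```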
---

In case it helps, we added more debug logging to the awskms auto-unseal credential code in Vault 1.6: hashicorp/vault#9794. And as of Vault 1.5.5 and 1.6.0 we also decreased the timeout in the AWS client so that a failure to authenticate returns faster: https://www.vaultproject.io/docs/upgrading/upgrade-to-1.5.0#aws-instance-metadata-timeout
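Seeing that extra debug output requires raising the server's log level; a hedged sketch using the chart's `server.extraArgs` value (the release name, namespace, and use of `extraArgs` here are assumptions about this particular deployment):

```shell
# Sketch: raise the Vault server log level to debug via the Helm chart.
# Release name and namespace are assumptions.
helm upgrade vault hashicorp/vault \
  --namespace vault \
  --reuse-values \
  --set server.extraArgs="-log-level=debug"
```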
---

Does EKS auto-unseal work with IAM Roles for Service Accounts? When my stateful set comes up, I get:

Using Vault 1.6.1, it seems like Vault starts in seal migration mode. The behavior is the same when using the IAM role on the EC2 instance; it doesn't even get to the authentication step.
---

@cabrinha Yes, Vault's auto-unseal works with EKS and IAM Roles for Service Accounts. Though it looks like this was unsealed using shamir, then started up with auto-unseal configured. I'd suggest taking a look at the migration docs: https://www.vaultproject.io/docs/concepts/seal#seal-migration
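For reference, on recent Vault versions the migration itself is driven by restarting with the new seal stanza in place and then supplying the existing Shamir keys with the migrate flag; a hedged sketch, not a full procedure:

```shell
# Sketch: after restarting Vault with the awskms seal stanza added,
# provide the existing Shamir unseal keys with -migrate
# (the command prompts for a key; repeat up to the unseal threshold).
vault operator unseal -migrate
```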
---

The end goal is to not need to migrate at all. After deleting and recreating my KMS key, DynamoDB table, and statefulset, I'm now getting this error:

Is it required to run
---

Yeah, if you just want auto-unseal, set an awskms seal config block:

```hcl
seal "awskms" {
  region     = "us-west-2"
  kms_key_id = "alias/my-vault-role-key"
}
```

And either add the role annotation to the service account in the chart values:

```yaml
server:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: <role-arn>
```

or specify which existing service account to use (https://www.vaultproject.io/docs/platform/k8s/helm/configuration#serviceaccount):

```yaml
server:
  serviceAccount:
    create: false
    name: vault
```

EKS will set the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables in the pod if IRSA is set up correctly, and the awskms logic will attempt to use those credentials for accessing KMS (turn on debug logging for more info on that part of the process). Then run

(If you haven't seen this tutorial, you may also find it useful: https://learn.hashicorp.com/tutorials/vault/autounseal-aws-kms)
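A hedged way to confirm that wiring, i.e. that the service account carries the annotation and the pod received the injected variables (names and namespace are assumptions):

```shell
# Sketch: verify the role annotation on the service account
# and the IRSA-injected environment variables in the pod.
kubectl describe serviceaccount vault -n vault
kubectl exec -n vault vault-0 -- env | grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'
```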
---

I have IRSA working. The role and policy are created with the following Terraform:

```hcl
module "iam_assumable_role_vault" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
  version = "2.14.0"

  create_role                   = true
  role_name                     = "vault"
  provider_url                  = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
  role_policy_arns              = [aws_iam_policy.vault.arn]
  oidc_fully_qualified_subjects = ["system:serviceaccount:vault:vault"]
}

resource "aws_iam_policy" "vault" {
  name_prefix = "vault"
  description = "EKS vault cluster ${module.eks.cluster_id}"
  policy      = data.aws_iam_policy_document.vault.json
}

data "aws_iam_policy_document" "vault" {
  statement {
    sid    = "VaultKMSUnseal"
    effect = "Allow"

    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:DescribeKey",
    ]

    resources = ["*"]
  }
}
```

These are my Helm values:

```yaml
global:
  enabled: true

server:
  serviceAccount:
    annotations:
      # attach IAM vault role so Vault can interact with, and unlock, our vault
      eks.amazonaws.com/role-arn: arn:aws:iam::472409228388:role/vault
  extraEnvironmentVars:
    VAULT_SEAL_TYPE: awskms
  extraSecretEnvironmentVars:
    - envName: AWS_ACCESS_KEY_ID
      secretName: eks-creds
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: eks-creds
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: AWS_REGION
      secretName: eks-creds
      secretKey: VAULT_AWS_REGION
    - envName: VAULT_AWSKMS_SEAL_KEY_ID
      secretName: eks-creds
      secretKey: VAULT_AWSKMS_SEAL_KEY_ID
  ha:
    enabled: true
```

So far, so good. Here's where I encounter difficulties.

EKS sets the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables in the pod.

What am I doing wrong here?
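One thing worth double-checking in a setup like this is that the role's trust policy matches the service account the pod runs as; a hedged sketch using the role name from the Terraform above:

```shell
# Sketch: inspect the role's trust (assume-role) policy and confirm the
# OIDC subject condition matches system:serviceaccount:vault:vault.
aws iam get-role --role-name vault --query 'Role.AssumeRolePolicyDocument'
```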
---

This turned out to be KMS key related. For some reason the original key just didn't want to allow permissions. Created a new key and boom, all is well.
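Before recreating a key, it can be worth confirming whether the key policy is what blocks access; a hedged sketch, with the key ID as a placeholder:

```shell
# Sketch: inspect the key and its key policy to see which principals are allowed.
# <kms-key-id> is a placeholder for the key's ID or ARN.
aws kms describe-key --key-id <kms-key-id>
aws kms get-key-policy --key-id <kms-key-id> --policy-name default
```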
---

Glad you got it figured out, @worldofgeese! Closing this issue for now.
---

I'm deploying Vault on EKS using the following values.yaml. I'm continuously getting the following error:

```shell
$ vault status
Error checking seal status: Get "http://127.0.0.1:8200/v1/sys/seal-status": dial tcp 127.0.0.1:8200: connect: connection refused
```

This prevents me from initializing Vault. My KMS key does have the required policy, so the EKS node can access it. Is there a way to debug? I have also tried using the default image tag 1.4.2, but that one fails as well.