Vault loop - High CPU with token lookup #4179
Comments
Forgot to mention: I can stop the loop by manually removing /v1/kv/vault/sys/token/parent/a4773073fd5dede80c0ea08b8bac4bf78144aec0/ from Consul.
I don't think you can get into this state unless the storage entries were edited manually. As soon as the last child token is revoked and the parent token no longer has any child tokens, there shouldn't be an entry for that parent token under the Can you provide reproducible steps for this error?
@rhuddleston When Vault gets stuck in this loop, can you successfully shut down Vault by sending a SIGINT? I'm experiencing a (possibly) similar scenario, but I'm unable to shut it down without forcefully killing it.
@SoMuchToGrok @rhuddleston can you try out a build off master? Once built and running, you'd have to run the tidy operation to clean up any dangling parent prefixes. Let us know if that fixes your issue!
Thanks for the assistance @calvn. Just deployed master and ran the tidy, but unfortunately no luck. Still seeing the loop in the Consul monitor. Vault v0.9.6 ('5e3930cc9d38b2ec0412f46da9a4845448979aac')
Consul monitor output
Thanks for the info! I think I might know what's going on, but I'll need to verify some things on my end. Hang tight in the meantime!
I thought I had something going on, but upon further inspection it seems that the issue is not likely where I'm looking.
@rhuddleston let me know if a build and tidy from the latest master did the trick for you. These could be two different issues that we're bumping into.
A bit of an update: I had this issue again. I upgraded to v0.10.1, and after the upgrade it was still stuck in the loop, so I ran tidy. The first run complained about consistency issues, but the second went through and the looping stopped. I'll let you know if this issue comes up again now that we're on the new version.
@rhuddleston we're still finalizing some things on an open PR, but we've got confirmation from some folks on issue #4143 that PR #4512 has resolved looping issues similar to the one you're experiencing.
This just happened again on 0.10.1 and tidy did not clean it up. I had to manually delete a bunch of keys until the looping stopped. Any chance you can cut 0.10.2 soon?
Would it be possible for you to build a Vault binary based off master and test using that?
I upgraded to 0.10.2 and put it on the clusters that keep having this issue. After upgrading and running tidy, the immediate issue was fixed, but the problem recurred later and Vault started looping over a couple of values: 1062 /v1/kv/vault/sys/token/parent/69cb89543ac2aafb16dd31be26f1be561948b977/?keys=&separator=%2F so in a few seconds it hit each of the above over 1000 times when I straced it. The /v1/kv/vault/sys/token/parent/9b3d89943dfbaa68aaebf02587de19423a7c15a4 folder existed and had one value under it, 69cb89543ac2aafb16dd31be26f1be561948b977. Both existed under /v1/kv/vault/sys/token/id/. Deleting the looping keys stops the problem, but it then returns for a different key later.
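For anyone wanting to reproduce the per-path counts quoted above, here is a minimal sketch of how captured request lines could be tallied. It assumes each captured line (from strace or the Consul monitor) contains a `GET <path>` request line; the function name `count_requests` and the sample lines are hypothetical, not taken from the original capture.

```python
import re
from collections import Counter

# Assumption: every request of interest appears as "GET <path>" somewhere in a line.
REQUEST_RE = re.compile(r"GET (\S+)")


def count_requests(lines):
    """Count how often each GET path appears in the captured lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample; a real capture would contain thousands of lines.
looping_path = "/v1/kv/vault/sys/token/parent/69cb89543ac2aafb16dd31be26f1be561948b977/?keys=&separator=%2F"
sample = [
    "GET " + looping_path,
    "GET " + looping_path,
    "unrelated line without a request",
]

# Print the most frequently requested paths first.
for path, n in count_requests(sample).most_common():
    print(n, path)
```

A path that shows up hundreds of times within a few seconds is a likely candidate for one of the looping keys.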
Would it be possible for you to enable the You can read the entries with the following command:
Hey @calvn, this happened for us as well. I grabbed a raw output, but I'm not sure how much there is to get from it. Looks like it was a Nomad server token. It also looks like it was last renewed in January?
@holtwilkins what version?
I keep having to manually fix these CPU loops. I can upgrade, but I'd rather not unless we think something was fixed that would stop it.
Can you list steps to repro the issue (perhaps the way the tokens got created)? In your log bits above, I see that Can you post the result from a
I see that
The issue might have been fixed in #5364, would you be willing to give v0.11.2 or above a try?
No response after a year, closing.
Environment:
Vault Config File:
storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault/"
  advertise_addr = "https://node1.prod:8200"
  redirect_addr = "https://node1.prod:8200"
  cluster_addr = "https://node1.demo.prod:8199"
  service_tags = "0.9.5"
  token = "xxxxxx"
}
listener "tcp" {
  address = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8199"
  tls_cert_file = "/keystore/corp.inter.root.bundle.crt.pem"
  tls_key_file = "/keystore/corp.key.pem"
  tls_min_version = "tls12"
  tls_cipher_suites = "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
  tls_prefer_server_cipher_suites = "true"
}
Expected Behavior:
If a Vault token no longer exists but is still listed under a parent, Vault should not go into a loop.
Actual Behavior:
Since upgrading to 0.9.5 I've had two instances where Vault goes into a loop, hammering Consul with requests. For example, in this case it is constantly making the following requests in a loop:
GET /v1/kv/vault/sys/token/parent/4148eb28e502542be3e32db4e68bece1e65f9868/?keys=&separator=%2F
GET /v1/kv/vault/sys/token/parent/a4773073fd5dede80c0ea08b8bac4bf78144aec0/?keys=&separator=%2F
GET /v1/kv/vault/sys/token/id/4148eb28e502542be3e32db4e68bece1e65f9868
Of these three keys, only one currently exists:
/v1/kv/vault/sys/token/parent/a4773073fd5dede80c0ea08b8bac4bf78144aec0/
The above key is actually a directory containing a single key, "4148eb28e502542be3e32db4e68bece1e65f9868".
So it seems that if Vault gets into this situation, where a key listed under the parent directory doesn't exist in the /v1/kv/vault/sys/token/id/ keyspace, it ends up going into an infinite loop.
I'm opening this issue because it has now occurred twice since I recently upgraded Vault. In any case, I'm hoping you can take a look at the code and avoid this loop if this condition occurs in the future.
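The inconsistency described above (a child listed under a parent directory with no matching entry in the id keyspace) can be sketched as a check over a flat list of storage keys. This is only an illustration, not Vault's actual code: the key layout is inferred from the request paths in this report, the function name `find_dangling_children` is hypothetical, and fetching the key list from Consul is left out.

```python
# Assumed key prefixes, inferred from the request paths quoted in this issue.
PARENT_PREFIX = "vault/sys/token/parent/"
ID_PREFIX = "vault/sys/token/id/"


def find_dangling_children(keys):
    """Return sorted (parent, child) pairs whose child has no entry under ID_PREFIX."""
    token_ids = {k[len(ID_PREFIX):] for k in keys if k.startswith(ID_PREFIX)}
    dangling = []
    for key in keys:
        if not key.startswith(PARENT_PREFIX):
            continue
        # A parent entry looks like "<parent>/<child>"; a bare "<parent>/" has no child.
        parent, _, child = key[len(PARENT_PREFIX):].partition("/")
        if child and child not in token_ids:
            dangling.append((parent, child))
    return sorted(dangling)


# The state reported above: the parent entry exists, the child's id entry does not.
keys = [
    "vault/sys/token/parent/a4773073fd5dede80c0ea08b8bac4bf78144aec0/"
    "4148eb28e502542be3e32db4e68bece1e65f9868",
]

for parent, child in find_dangling_children(keys):
    print("dangling child", child, "under parent", parent)
```

With the keys from this report, the check flags the single parent/child pair. Once the child has an entry under the id prefix, nothing is reported.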