Connect Vault CA token updates are not retained on restart #11363
Comments
Thank you for the bug report! Sorry it has taken some time to respond. I see the error is from a secondary datacenter. The context on the error indicates that something about the I assume that all 3 datacenters are configured with the Vault provider, and point at the same instance of Vault. They should all use the same path for After revoking the Vault token, did you update the token in all 3 datacenters? The primary DC only shares its public CA certificate with secondaries; it does not share the CA config, so you would need to update all 3 DCs. Because of the issue documented in #11811, every leader rotation in the primary will cause this Did you also see this error in the primary DC, or only secondary DCs? If there was another error in the primary DC, can you please share it as well? I expect the error in the primary to contain more information about why the update might have failed. |
Hey @rrijkse Are you still experiencing this issue? |
@Amier3 I did experience this yesterday when upgrading from 1.10.3 to 1.11.1, one of the datacenters "forgot" that we switched to the Consul CA away from the Vault CA (to mitigate this issue). @dnephin All datacenters did use the Vault CA and they all used the same I don't have any way of testing this with the Vault CA provider since we have updated to use the Consul built-in CA, however with that setup the settings are still reverting from the Consul CA to the Vault CA and I couldn't find anything specific in the logs about why that is happening. |
Apologies for the delayed response. How are you currently managing Consul (e.g., systemd, config management)? If two different CA certs were generated and used in the same cluster, it would lead to the same issues you're experiencing. |
It's managed using config management, and we did have a couple of tries at getting the Vault config set up properly before it worked. Could that have caused issues? If so, how do I clear out any of the old configs/CAs? I have run the |
I'm having a somewhat similar looking issue. To reproduce:
> consul connect ca get-config > connect_ca.json
# Edit connect_ca.json with new token
> consul connect ca set-config -config-file connect_ca.json
Error setting CA configuration: Unexpected response code: 500 (backend not initialized)
Actually, after spamming that a bit, it eventually works... Maybe this will help someone. |
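For anyone hitting the same intermittent 500, the "keep retrying" workaround above could be scripted roughly like this. It is only a sketch: the retry helper, the attempt count, and the delay are illustrative assumptions, not something Consul provides, and it does nothing about why the backend reports as not initialized.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (not executed here; attempt count and delay are arbitrary):
#   retry 5 2 consul connect ca set-config -config-file connect_ca.json
```

Capping the number of attempts keeps a genuinely broken configuration from looping forever, and the final non-zero exit status makes the failure visible to config management.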
The log fragment from the original report suggests that Consul is trying to look up the details of the Vault token provided in the CA config, but that token lacks the Vault permissions to look itself up:
What happens if the Vault policy associated with that Vault token is updated to include the following rules?
|
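A quick way to test that theory outside of Consul is to ask Vault directly whether the token can look itself up. This is a minimal sketch, assuming the Vault CLI is installed and VAULT_ADDR points at the same Vault instance Consul uses; the function name can_lookup_self is made up for illustration. Run with no token argument, vault token lookup inspects the token the CLI is currently authenticated with, so a failure here would be consistent with the 403 described above.

```shell
# can_lookup_self TOKEN
# Hypothetical helper: returns 0 if Vault lets TOKEN look up its own details,
# non-zero otherwise (invalid or expired token, missing permission, or Vault unreachable).
can_lookup_self() {
  VAULT_TOKEN="$1" vault token lookup >/dev/null 2>&1
}

# Usage (not executed here; NEW_CA_TOKEN is a placeholder):
#   if can_lookup_self "$NEW_CA_TOKEN"; then
#     echo "token can look itself up"
#   else
#     echo "lookup failed: token is invalid, expired, or lacks permission"
#   fi
```

Because the helper discards Vault's output, rerun vault token lookup by hand with the same token to see the actual error when the check fails.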
For others following this issue who aren't the original reporter: Did you also observe the same behavior with an error in the log about a 403 on a Vault |
On the latest version (consul 1.14.1), I am able to consistently set a new connect ca token with |
@jkirschner-hashicorp I am seeing the 403 log about Vault's Context:
I get the following error in Consul's logs:
And when adding a Vault audit device I see the following audit logs:
Which I guess means that the token is invalid ( I also checked that the new token has the correct permissions: So it seems like Consul is still using a ghost token for Vault, even after a full restart? Using Edit: found out about Vault audit devices
|
When filing a bug, please include the following headings if possible. Any example text in this template can be deleted.
Overview of the Issue
When you use the consul connect ca set-config command to update the Vault token, the configuration is updated and Consul connects to Vault successfully. However, when leadership in the cluster changes or Consul is restarted on a node, the Vault token in the configuration reverts to the previous version (as seen with the consul connect ca get-config command).
Reproduction Steps
Steps to reproduce this issue, eg:
consul connect ca set-config command
It doesn't always happen, but I can reproduce this on all of our clusters (DM me if you want remote access to our SBX environment).
Consul info for both Client and Server
Server info
Operating system and Environment details
Amazon Linux 2 running on EC2 with Consul version 1.10.3 (before upgrade was 1.9.6)
Log Fragments