Is your feature request related to a problem? Please describe.
Many Dynamic Secrets engines cannot support a high number of credential requests from replicated workloads. For example, if the Atlas Secrets Engine needed to provision 100 database credentials for 100 pods, it would likely lock out other vital automation in the Atlas environment, such as backups or scaling.
The solution to this issue is to run a Vault Agent as a caching proxy for credential requests. If all pods use a single k8s service account via the Vault caching proxy, then the Vault server only provisions a single instance of the dynamic credential for all 100 pods. The credentials are now "service account scoped" instead of "pod scoped".
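To make the mechanism concrete, here is a minimal sketch of what such a proxy's agent configuration could look like, shipped as a ConfigMap. The Vault address, auth mount path, and `agent-proxy` role are assumptions for illustration, not part of the original request:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vault-agent-proxy-config
data:
  agent.hcl: |
    vault {
      address = "https://vault.vault.svc:8200"   # assumed in-cluster Vault address
    }

    auto_auth {
      method "kubernetes" {
        mount_path = "auth/kubernetes"
        config = {
          role = "agent-proxy"                   # hypothetical role bound to one service account
        }
      }
    }

    cache {
      # Proxied requests reuse the agent's auto-auth token, so every pod
      # behind the proxy shares one lease per dynamic-secret path.
      use_auto_auth_token = true
    }

    listener "tcp" {
      address     = "0.0.0.0:8200"
      tls_disable = true                         # sketch only; use TLS in production
    }
```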
Describe the solution you'd like
Preferably, the helm chart would support a k8s Deployment that pushes out a cluster (replicated or not) of Vault Agent proxies behind a k8s service.
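As a rough sketch of what the chart could render (names, image tag, and replica count are illustrative; the ConfigMap is the one sketched above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vault-agent-proxy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: vault-agent-proxy
  template:
    metadata:
      labels:
        app: vault-agent-proxy
    spec:
      # One service account for the whole proxy fleet: this is what makes
      # the cached credentials "service account scoped".
      serviceAccountName: vault-agent-proxy
      containers:
        - name: vault-agent
          image: hashicorp/vault:1.12.0          # illustrative image/tag
          args: ["agent", "-config=/vault/config/agent.hcl"]
          ports:
            - containerPort: 8200
          volumeMounts:
            - name: config
              mountPath: /vault/config
      volumes:
        - name: config
          configMap:
            name: vault-agent-proxy-config
---
apiVersion: v1
kind: Service
metadata:
  name: vault-agent-proxy
spec:
  selector:
    app: vault-agent-proxy
  ports:
    - port: 8200
      targetPort: 8200
```

Workloads would then point at the Service instead of Vault directly, e.g. `VAULT_ADDR=http://vault-agent-proxy.<namespace>.svc:8200`.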
Currently #749 attempts to add the Vault Agent proxy as a sidecar for the CSI provider. This provides no benefit for the Vault Injector. A standalone proxy would help both and give operators the control they need to confidently administer Vault workflows.
Describe alternatives you've considered
I've looked at a lot of "operators" that create k8s Secrets from Vault, but that introduces a lot of moving parts, and we lose the air-gapped environment Vault is aiming to provide.
Additional Context
Vault Agent Injector
Users want to be able to configure telemetry on each injected agent in a pod, which leads to a pile of low-value time series for Prometheus, etc. A central proxy would be easier to configure and would provide higher-value time series.
Secrets CSI Provider
I believe the issue indicates that in a deployment with 50 pods, 50 credentials are provisioned, but the CSI provider only uses the last credential deployed. A proxy would provision only 1, and the CSI provider would use that.
Other Technical Advantages
In general, I think there are strong reasons to treat the Vault Agent Proxy as a standalone deployment:
HA/DR
Deploy multiple instances of the cache with topology-aware scheduling to be resilient against zonal failures (see the sketch after this list).
Simpler runbooks: scale up or restart an individual component instead of a coupled one.
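For the topology-aware point above, a fragment of the proxy Deployment's pod spec could look like this, using standard `topologySpreadConstraints` (the zone key and label are assumptions carried over from the earlier sketch):

```yaml
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone   # spread replicas across zones
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: vault-agent-proxy
```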
Monitoring
Monitoring all injected agents for the Vault Injector may be untenable for overloaded Prometheus instances.
A central cache establishes a good "bottleneck": monitor the aggregate, then drill down to identify the issue.
Improve Cache Hit Rates
In large clusters it may be valuable to partition Vault Proxies by application to have smaller deployments with higher cache hit rates.
More Generic -> More Use Cases
Building the Vault Agent proxy into the injector or the CSI is a good idea, but a standalone instance can support more use cases.
More use cases mean more improvements delivered to a smaller set of files in the codebase.
The credentials are now "service account scoped" instead of "pod scoped".
Just to note on this point: to get a cache hit on Agent currently, the token used for logging in has to be the exact same token. But in modern k8s versions, every pod gets its own projected service account token with a different TTL, pod owner, etc. So to get cache hits from different pods, we'd either have to engineer every pod to use the same token (probably not tenable), or implement a feature in Agent that allows a cache hit based on some local token validation and service account matching, or some other similar feature that relaxes the requirements for a cache hit without risking impersonation by attackers.
That's not to say it's not possible, but it's a bit more work than it looks like upfront.
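For reference on the projected-token point: in modern k8s the default service account token is a projected, pod-bound token. Declared explicitly, the equivalent volume looks roughly like this (audience and TTL are illustrative):

```yaml
      volumes:
        - name: vault-token
          projected:
            sources:
              - serviceAccountToken:
                  audience: vault            # illustrative audience
                  expirationSeconds: 3600    # illustrative TTL
                  path: token
# The issued JWT carries pod-binding claims (pod name/UID), so two pods
# under the same service account still present different token strings,
# which is why the exact-token cache key never matches across pods.
```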