-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Initial Kubernetes secrets plugin implementation. #1
base: master
Are you sure you want to change the base?
Conversation
da6f911
to
2054ea0
Compare
6e53cc0
to
0da2914
Compare
storage logical.Storage | ||
salt *salt.Salt | ||
// TODO: Do we need to support invalidation + salt replacement? | ||
initSalt sync.Once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approle
backend has a mechanism for invalidating the cached salt:
https://github.com/hashicorp/vault/blob/master/builtin/credential/approle/backend.go#L141
I'm trying to figure out if/when that is needed. If the value of the salt actually changed, we would lose ability to authenticate any of the previously-issued tokens. If it doesn't change, we don't need to invalidate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jefferai do I need to support invalidation of the cached salt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will update to support this. When is invalidation triggered, though? It is a disruptive event, and I want to understand the implications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally this will happen if a mount of this type is synced over from a replication primary. The mount will get synced first, but not the underlying salt data. Then the salt data will get synced, at which point it should be invalidated.
pkg/backend/path_token.go
Outdated
} | ||
username := fmt.Sprintf("v_%s_%s_%s", req.DisplayName, roleName, salt.SaltID(req.ClientToken)) | ||
// The UID should be opaque, but stable with the username. Just salt + hash the username. | ||
uid := salt.SaltID(username) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how bad it is to re-use the same salt here. We could use a separate one.
} | ||
|
||
func (b *backend) tokenPath(ctx context.Context, secret string) (string, error) { | ||
s, err := b.Salt(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approle
backend uses a dedicated HMAC key per role: https://github.com/hashicorp/vault/blob/2f67754951b993e49dff1e40ce17d5315a302e22/builtin/credential/approle/path_role.go#L29
We could do something similar, but I'm not sure it's worth the complexity cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jefferai should I use a different salt for each role, or is a single salt OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approle uses both: for normal storage of roles and secret id accessors (which are random/unique) we do a simple salt. The reason HMAC is used for secret IDs is to prevent the same secret ID in two different roles from having the same underlying hashed value. So whether you need a similar scheme depends on whether any given value generated from across multiple roles might end up with the same value.
if err != nil { | ||
return logical.ErrorResponse("extra must be a map: string -> list of strings"), nil | ||
} | ||
role := &roleEntry{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment I don't really see the need for locking around role objects during read or write operations.
I believe the storage itself is thread-safe? So, with or without locking we are still just in a last-write-wins situation?
I guess if we were creating a dedicated salt per-role, we would need to lock around role creation to avoid races where two create operations happen (with different random salts, leading to problems where the salt would change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a good comment for the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jefferai do I need to do locking around role object manipulations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AppRole does role level locking for that exact reason that you mentioned. I think probably it comes down to whether you end up needing per-role hmac keys (as you suggested).
"user": map[string]interface{}{ | ||
"username": entry.Username, | ||
"uid": entry.UID, | ||
"groups": entry.Groups, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could actually read the groups + extra fields from the current configuration of the role at review time, rather than storing the groups + extra fields on the token entry storage object. Effectively this would mean that the "permissions" assigned to already-issued tokens would change when the role was updated.
I have two concerns about doing this:
- It breaks with the mental model established by other secret backends (like
aws
ordatabase
), which generally seem not to have that behavior. - If we decided at some point to support specifying groups+extra fields at token-request time (so, you would POST to a particular role's token endpoint with the groups you want) the model would get VERY confusing. See discussion here for why we might want this behavior: Add other subject attributes than CN on PKI signing hashicorp/vault#4562 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jefferai should the permissions attached to a token be resolved dynamically from the role at Kubernetes token verification time, or should they be fixed to the role's permissions when the token was created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea. :-)
What does "read the groups + extra fields from the current configuration of the role at review time" mean? What is "review time"? Who is reviewing, when does it happen, what are they reviewing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the flow is this:
- Vault admin defines a role "eng-role" on the kubernetes backend, with a set of assigned
groups
(say "read-group", "deploy-group"). Those groups are associated with permissions in Kubernetes. - Authenticated Vault user reads the
kubernetes/token/eng-role
endpoint, and gets a token back (just a random UUID) - User presents the token to Kubernetes as request header
- Kubernetes POSTs a query with the token to the Vault
kubernetes/tokenreviews
endpoint, and Vault responds with an "allow" (it's a valid token), plus the set of configuredgroups
for "eng-role" (permissions for the user).
- Kubernetes POSTs a query with the token to the Vault
The question is how to handle updates to role configuration. If I remove the "deploy-group" group from the "eng-role" role after the user gets their token from Vault, should Vault respond to the Kubernetes tokenreviews query with the set of groups configured at the time the credentials were obtained ("read-group", "deploy-group") or at the time of the review (just "read-group")? Should the permissions attached to credentials change after the credentials are issued?
For comparison, the permissions attached to database
and aws
plugin credentials do not change dynamically (it would be difficult to implement that behavior). If we added support for client cert creds to the Kubernetes backend, it would be difficult to match the dynamic behavior. And if in the future we allow people to POST requested groups when getting credentials, that would be incompatible with the dynamic behavior.
So at the moment I've opted for freezing permisssions at the time credentials are obtained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to have Kubernetes consume a JWT for this? It'd be nice if Kubernetes didn't need to reach back into Vault. Then you'd save a network call (or many) and you'd not have this problem in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, Kubernetes does support OpenID Connect, and we might be able to craft OIDC ID-compatible JWTs and have Kubernetes accept them without actually implementing the rest of the OIDC protocol. The Kubernetes docs aren't totally clear on this point - they both suggest that Kubernetes won't "phone home" and state strict requirements for the OIDC server. (I suppose we could also fully implement OIDC in Vault, but that seems out of scope).
That approach seems hacky though. Plus, the docs suggest both that it would not support revocation and that you can't use OIDC tokens for Kubernetes Dashboard access, a specific goal of this work.
I definitely agree that the situation is not ideal, but I think in this case we need to meet Kubernetes where it is. Kubernetes caches token review results for a configurable duration (which helps cut down on the network requests), and the decision here isn't so much a problem as a behavior choice.
If we did add OIDC-token based integration, your permissions would also be fixed at the time you got the token. I see that as further reason to keep the current behavior of fixing token permissions at token fetch time.
@jefferai any chance you could take a look at this? Also, now that hashicorp/vault#4663 has landed I may try to incorporate Vault entity IDs into the IDs we give to kubernetes users. Should I consider the entity ID sensitive in any way, or is it safe to expose to Kubernetes? |
It's safe to expose it. In fact, if you want to have us pull this in upstream for wide distribution, I won't accept a plugin that uses req.ClientToken. |
Excellent, I will certainly switch to entity IDs then. Thank you! |
You may also want to think about aliases as they can be a way to tie friendly names in. It's more complicated for sure, and entity IDs have the benefit that they can never be re-used. |
I'd love to pull in alias data for use in assigning Kubernetes usernames (which aim to be more user-friendly). The Kubernetes definitions of usernames + user IDs map pretty well to aliases and entity IDs (usernames are allowed to be re-used and may not be stable, user IDs are stable not not re-used). Do you have any code pointers for accessing alias data for a given entity ID from a secrets backend? Since entity IDs can have multiple aliases, do you have any suggestions for determining which alias to use for human-readable identity data? Is there any notion of "primary" alias, or "active" alias based on authentication method used? My current plan is just to take the first alias in the list. Would that lead to consistent names, or could the first alias be different on different requests? |
The mount accessor should be explicitly configured by an admin, then, use the alias name corresponding to that accessor. |
b4295a8
to
8532872
Compare
The client plugin implements the Kubernetes client-go credential plugin API so that users can run kubectl and transparently get Kubernetes credentials from Vault.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good
Take the first alias available and use the entity ID as the user ID, if available. Fall back to the display name and a random UUID. Bump the required Vault version for entity fetch features.
uid = entity.ID | ||
if len(entity.Aliases) > 0 { | ||
// Take the first alias available for the username. | ||
alias := entity.Aliases[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alias information available (the mount type, the mount accessor) make it pretty user-unfriendly for admins to configure alias preference - mount type still isn't unique, and the accessor has a random number in it rather than just being a path.
For now I'm picking the first alias here, if available. It will be the ideal thing in the very common case of one alias, and slightly suboptimal in the rare case of multiple. Alias preference can be implemented later if we find a good UX story for admins.
@jefferai I've switched to using entity information (thanks @chrishoffman for hashicorp/vault#4681!), though I am still just taking the first alias. Would you mind taking a general look at the code, or perhaps just looking at the GitHub comments above on the parts I'm less certain about? @tyrannosaurus-becks perhaps you might want to take a look? (pinging because you've worked on https://github.com/hashicorp/vault-plugin-auth-kubernetes recently) @rajeshnair2k as an fyi. |
The Travis integration is temporarily busted because I tried to switch to the new "checks" api... |
Travis CI is back with new "Checks API" integration. |
@jgiles You asked about pulling it upstream. I've love to have it in upstream! There are two things, IMHO, that should happen before that's considered:
Is that doable? Thanks! |
@jefferai sounds reasonable, and should be doable. When you say RFC, do you have a particular format in mind? If so, can you link an example? Otherwise I can assemble a pretty standard design-doc kind of artifact. I've already spent most of the normal work cycles I can on this (and we have something that solves our immediate problem), so progress will be stop-and-go. I'm going to make the invalidation code changes, merge this PR (it's getting unwieldy), and then work on docs/RFC as available. |
This Kubernetes secrets backend implements hashicorp/vault#3839 (comment).
There are 2 binaries:
kubectl
.The overall plugin structure is based on the example Vault plugin: https://github.com/hashicorp/vault-auth-plugin-example
The token generation + verification is based loosely on the mechanisms in the
approle
auth backend: https://github.com/hashicorp/vault/tree/master/builtin/credential/approleThe "roles" model for controlling groups + extra fields for authenticated users is roughly modeled on those in the
database
andaws
secret backends: https://github.com/hashicorp/vault/tree/master/builtin/logical/database, https://github.com/hashicorp/vault/tree/master/builtin/logical/aws