Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Initial Kubernetes secrets plugin implementation. #1

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jgiles
Copy link
Member

@jgiles jgiles commented May 27, 2018

This Kubernetes secrets backend implements hashicorp/vault#3839 (comment).

There are 2 binaries:

The overall plugin structure is based on the example Vault plugin: https://github.com/hashicorp/vault-auth-plugin-example

The token generation + verification is based loosely on the mechanisms in the approle auth backend: https://github.com/hashicorp/vault/tree/master/builtin/credential/approle

The "roles" model for controlling groups + extra fields for authenticated users is roughly modeled on those in the database and aws secret backends: https://github.com/hashicorp/vault/tree/master/builtin/logical/database, https://github.com/hashicorp/vault/tree/master/builtin/logical/aws

@jgiles jgiles force-pushed the initial-impl branch 6 times, most recently from da6f911 to 2054ea0 Compare May 27, 2018 03:34

Verified

This commit was signed with the committer’s verified signature.
Kielek Piotr Kiełkowicz
@jgiles jgiles force-pushed the initial-impl branch 2 times, most recently from 6e53cc0 to 0da2914 Compare May 29, 2018 14:40
storage logical.Storage
salt *salt.Salt
// TODO: Do we need to support invalidation + salt replacement?
initSalt sync.Once
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approle backend has a mechanism for invalidating the cached salt:

https://github.com/hashicorp/vault/blob/master/builtin/credential/approle/backend.go#L141

I'm trying to figure out if/when that is needed. If the value of the salt actually changed, we would lose ability to authenticate any of the previously-issued tokens. If it doesn't change, we don't need to invalidate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jefferai do I need to support invalidation of the cached salt?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will update to support this. When is invalidation triggered, though? It is a disruptive event, and I want to understand the implications.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this will happen if a mount of this type is synced over from a replication primary. The mount will get synced first, but not the underlying salt data. Then the salt data will get synced, at which point it should be invalidated.

}
username := fmt.Sprintf("v_%s_%s_%s", req.DisplayName, roleName, salt.SaltID(req.ClientToken))
// The UID should be opaque, but stable with the username. Just salt + hash the username.
uid := salt.SaltID(username)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how bad it is to re-use the same salt here. We could use a separate one.

}

func (b *backend) tokenPath(ctx context.Context, secret string) (string, error) {
s, err := b.Salt(ctx)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approle backend uses a dedicated HMAC key per role: https://github.com/hashicorp/vault/blob/2f67754951b993e49dff1e40ce17d5315a302e22/builtin/credential/approle/path_role.go#L29

We could do something similar, but I'm not sure it's worth the complexity cost.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jefferai should I use a different salt for each role, or is a single salt OK?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approle uses both: for normal storage of roles and secret id accessors (which are random/unique) we do a simple salt. The reason HMAC is used for secret IDs is to prevent the same secret ID in two different roles from having the same underlying hashed value. So whether you need a similar scheme depends on whether any given value generated from across multiple roles might end up with the same value.

if err != nil {
return logical.ErrorResponse("extra must be a map: string -> list of strings"), nil
}
role := &roleEntry{
Copy link
Member Author

@jgiles jgiles May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment I don't really see the need for locking around role objects during read or write operations.

I believe the storage itself is thread-safe? So, with or without locking we are still just in a last-write-wins situation?

I guess if we were creating a dedicated salt per-role, we would need to lock around role creation to avoid races where two create operations happen (with different random salts, leading to problems where the salt would change).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a good comment for the code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jefferai do I need to do locking around role object manipulations?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppRole does role level locking for that exact reason that you mentioned. I think probably it comes down to whether you end up needing per-role hmac keys (as you suggested).

"user": map[string]interface{}{
"username": entry.Username,
"uid": entry.UID,
"groups": entry.Groups,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could actually read the groups + extra fields from the current configuration of the role at review time, rather than storing the groups + extra fields on the token entry storage object. Effectively this would mean that the "permissions" assigned to already-issued tokens would change when the role was updated.

I have two concerns about doing this:

  • It breaks with the mental model established by other secret backends (like aws or database), which generally seem not to have that behavior.
  • If we decided at some point to support specifying groups+extra fields at token-request time (so, you would POST to a particular role's token endpoint with the groups you want) the model would get VERY confusing. See discussion here for why we might want this behavior: Add other subject attributes than CN on PKI signing hashicorp/vault#4562 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jefferai should the permissions attached to a token be resolved dynamically from the role at Kubernetes token verification time, or should they be fixed to the role's permissions when the token was created?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea. :-)

What does "read the groups + extra fields from the current configuration of the role at review time" mean? What is "review time"? Who is reviewing, when does it happen, what are they reviewing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the flow is this:

  • Vault admin defines a role "eng-role" on the kubernetes backend, with a set of assigned groups (say "read-group", "deploy-group"). Those groups are associated with permissions in Kubernetes.
  • Authenticated Vault user reads the kubernetes/token/eng-role endpoint, and gets a token back (just a random UUID)
  • User presents the token to Kubernetes as request header
    • Kubernetes POSTs a query with the token to the Vault kubernetes/tokenreviews endpoint, and Vault responds with an "allow" (it's a valid token), plus the set of configured groups for "eng-role" (permissions for the user).

The question is how to handle updates to role configuration. If I remove the "deploy-group" group from the "eng-role" role after the user gets their token from Vault, should Vault respond to the Kubernetes tokenreviews query with the set of groups configured at the time the credentials were obtained ("read-group", "deploy-group") or at the time of the review (just "read-group")? Should the permissions attached to credentials change after the credentials are issued?

For comparison, the permissions attached to database and aws plugin credentials do not change dynamically (it would be difficult to implement that behavior). If we added support for client cert creds to the Kubernetes backend, it would be difficult to match the dynamic behavior. And if in the future we allow people to POST requested groups when getting credentials, that would be incompatible with the dynamic behavior.

So at the moment I've opted for freezing permisssions at the time credentials are obtained.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to have Kubernetes consume a JWT for this? It'd be nice if Kubernetes didn't need to reach back into Vault. Then you'd save a network call (or many) and you'd not have this problem in the first place.

Copy link
Member Author

@jgiles jgiles Jun 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, Kubernetes does support OpenID Connect, and we might be able to craft OIDC ID-compatible JWTs and have Kubernetes accept them without actually implementing the rest of the OIDC protocol. The Kubernetes docs aren't totally clear on this point - they both suggest that Kubernetes won't "phone home" and state strict requirements for the OIDC server. (I suppose we could also fully implement OIDC in Vault, but that seems out of scope).

That approach seems hacky though. Plus, the docs suggest both that it would not support revocation and that you can't use OIDC tokens for Kubernetes Dashboard access, a specific goal of this work.

I definitely agree that the situation is not ideal, but I think in this case we need to meet Kubernetes where it is. Kubernetes caches token review results for a configurable duration (which helps cut down on the network requests), and the decision here isn't so much a problem as a behavior choice.

If we did add OIDC-token based integration, your permissions would also be fixed at the time you got the token. I see that as further reason to keep the current behavior of fixing token permissions at token fetch time.

@jgiles
Copy link
Member Author

jgiles commented Jun 4, 2018

@jefferai any chance you could take a look at this?

Also, now that hashicorp/vault#4663 has landed I may try to incorporate Vault entity IDs into the IDs we give to kubernetes users. Should I consider the entity ID sensitive in any way, or is it safe to expose to Kubernetes?

@jefferai
Copy link

jefferai commented Jun 4, 2018

It's safe to expose it. In fact, if you want to have us pull this in upstream for wide distribution, I won't accept a plugin that uses req.ClientToken.

@jgiles
Copy link
Member Author

jgiles commented Jun 4, 2018

Excellent, I will certainly switch to entity IDs then. Thank you!

@jefferai
Copy link

jefferai commented Jun 4, 2018

You may also want to think about aliases as they can be a way to tie friendly names in. It's more complicated for sure, and entity IDs have the benefit that they can never be re-used.

@jgiles
Copy link
Member Author

jgiles commented Jun 4, 2018

I'd love to pull in alias data for use in assigning Kubernetes usernames (which aim to be more user-friendly). The Kubernetes definitions of usernames + user IDs map pretty well to aliases and entity IDs (usernames are allowed to be re-used and may not be stable, user IDs are stable not not re-used).

Do you have any code pointers for accessing alias data for a given entity ID from a secrets backend?

Since entity IDs can have multiple aliases, do you have any suggestions for determining which alias to use for human-readable identity data? Is there any notion of "primary" alias, or "active" alias based on authentication method used? My current plan is just to take the first alias in the list. Would that lead to consistent names, or could the first alias be different on different requests?

@jefferai
Copy link

jefferai commented Jun 4, 2018

See hashicorp/vault#4681

@jefferai
Copy link

jefferai commented Jun 4, 2018

The mount accessor should be explicitly configured by an admin, then, use the alias name corresponding to that accessor.

@jgiles jgiles force-pushed the initial-impl branch 6 times, most recently from b4295a8 to 8532872 Compare June 7, 2018 18:19
The client plugin implements the Kubernetes client-go credential plugin
API so that users can run kubectl and transparently get Kubernetes
credentials from Vault.
Copy link
Member

@bmperrea bmperrea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

Take the first alias available and use the entity ID as the user ID, if
available. Fall back to the display name and a random UUID.

Bump the required Vault version for entity fetch features.
uid = entity.ID
if len(entity.Aliases) > 0 {
// Take the first alias available for the username.
alias := entity.Aliases[0]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alias information available (the mount type, the mount accessor) make it pretty user-unfriendly for admins to configure alias preference - mount type still isn't unique, and the accessor has a random number in it rather than just being a path.

For now I'm picking the first alias here, if available. It will be the ideal thing in the very common case of one alias, and slightly suboptimal in the rare case of multiple. Alias preference can be implemented later if we find a good UX story for admins.

@jgiles jgiles requested a review from rajeshnair2k June 7, 2018 19:57
@jgiles
Copy link
Member Author

jgiles commented Jun 7, 2018

@jefferai I've switched to using entity information (thanks @chrishoffman for hashicorp/vault#4681!), though I am still just taking the first alias. Would you mind taking a general look at the code, or perhaps just looking at the GitHub comments above on the parts I'm less certain about?

@tyrannosaurus-becks perhaps you might want to take a look? (pinging because you've worked on https://github.com/hashicorp/vault-plugin-auth-kubernetes recently)

@rajeshnair2k as an fyi.

@jgiles jgiles changed the title WIP: Initial working implementation. WIP: Initial Kubernetes secrets plugin implementation. Jun 7, 2018
@jgiles
Copy link
Member Author

jgiles commented Jun 7, 2018

The Travis integration is temporarily busted because I tried to switch to the new "checks" api...

@jgiles
Copy link
Member Author

jgiles commented Jun 8, 2018

Travis CI is back with new "Checks API" integration.

@jefferai
Copy link

@jgiles You asked about pulling it upstream. I've love to have it in upstream! There are two things, IMHO, that should happen before that's considered:

  1. You should codify what this is doing, why, how, open questions, etc. via an RFC (ideally via Google Docs). It will help get all stakeholders on the same page rather than simply looking at a combination of the code and the docs in the repo. A very good reason to do this is that...

  2. ...I want to leverage our partnership with Google and try to get feedback on the RFC. Their internal Kubernetes team provided guidance and feedback on the Kubernetes auth backend, and it would be great if we can have them do the same with this, including answering the open questions (they might have some insight into the Kubernetes roadmap that will dictate best approaches). That means having a doc as a coordination point though.

Is that doable?

Thanks!

@jgiles
Copy link
Member Author

jgiles commented Jun 24, 2018

@jefferai sounds reasonable, and should be doable. When you say RFC, do you have a particular format in mind? If so, can you link an example? Otherwise I can assemble a pretty standard design-doc kind of artifact.

I've already spent most of the normal work cycles I can on this (and we have something that solves our immediate problem), so progress will be stop-and-go.

I'm going to make the invalidation code changes, merge this PR (it's getting unwieldy), and then work on docs/RFC as available.

@chrsoo
Copy link

chrsoo commented Aug 7, 2019

@jgiles @jefferai last comment was a year ago, any chance of seeing this merged upstream?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants