-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server Plugin UpstreamAuthority cert-manager #2274
Server Plugin UpstreamAuthority cert-manager #2274
Conversation
ca82960
to
133adda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for this contrib @JoshVanL, we sincerely appreciate it! Very excited to gain support for cert-manager :-D
Sorry for the delay in review - I was out most of last week on vacation. I took a pass today and it looks great, had a few small questions here and there but nothing major.
I'm wondering if we should have an integration test for this. How stable is the CertificateRequest API? Is there anything special we need in terms of role bindings, cert-manager config, etc that we need to make all this work?
Finally, it would be great to hear your feedback on the process of authoring a new plugin. We are putting some attention on that currently and hearing a little about your experience would be helpful.
|
||
This plugin will request a signing certificate from cert-manager via a | ||
[CertificateRequest](https://cert-manager.io/docs/concepts/certificaterequest/) | ||
resource. Once the referenced issuer has signed the request, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we lose some text at the end here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have added the rest :)
|
||
| Configuration | Description | | ||
| ------------------------- | ----------------------------------------------------------------- | | ||
| kube_config_path | Path to the kubeconfig used to connect to the Kubernetes cluster. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename this to kube_config_file
and have the default be in-cluster config? That will make it consistent with all the other k8s-y plugins
| ------------------------- | ----------------------------------------------------------------- | | ||
| kube_config_path | Path to the kubeconfig used to connect to the Kubernetes cluster. | | ||
| namespace | The namespace to create CertificateRequests for signing. | | ||
| issuer_name | The name of the issuer to reference in CertificateRequests. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because we can request certs from different issuers using the same cert-manager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly
go.mod
Outdated
k8s.io/kube-aggregator v0.18.2 | ||
k8s.io/utils v0.0.0-20201110183641-67b214c5f920 | ||
sigs.k8s.io/controller-runtime v0.6.0 | ||
k8s.io/api v0.20.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@azdagron raised a question in a call last week about this dep bump. Is it needed for cert-manager upstream support? To date, I think our version has been held back to give a better compat guarantee with older k8s clusters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it is because of importing the cert-manager repo. After some go mod wrangling, I've struggled to reduce the versions. Happy to instead use an unstructured client to make things easier? This would resolve having to import jetstack/cert-manager
.
) | ||
|
||
const ( | ||
pluginName = "cert-manager" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we normally follow snake case for plugin names ... that said, I know that cert-manager is named cert-manager 😅
I don't know what's worse, inconsistent naming with the other plugins or inconsistent naming with the upstream thing! My initial thought here is to just leave it as is, but thought I'd comment for other reviewers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I went back and forth a few times on this! Happy to change it either way 😄
} | ||
// If no issuer_group given, default to cert-manager.io | ||
if len(config.IssuerGroup) == 0 { | ||
p.log.Warn("configuration has empty issuer_group property, defaulting to 'cert-manager.io'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these warnings are necessary. If the default is worthy of a warning, my preference would be to have a different default or none at all. Is there something specific we're worried about here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I will change the default to a debug.
The defaults are the same that are used in cert-manager so made sense to me to port them here from a UX perspective. Happy to remove them completely though, and just error out if they are not defined.
|
||
if cmutil.CertificateRequestIsDenied(cr) { | ||
cond := cmutil.GetCertificateRequestCondition(cr, cmapi.CertificateRequestConditionDenied) | ||
return fmt.Errorf("created CertificateRequest %s/%s has been denied: %s:%s", cr.Namespace, cr.Name, cond.Reason, cond.Message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably log this error with the appropriate fields, and then return a more succinct error message to core. That way we'll get structured logging since core won't extract all of this stuff out from the error it receives.
Similar comment for all the fmt.Errorf
messages that are templating details into the error message
} | ||
|
||
// If the configured issuer did not populate the CA on the request we cannot | ||
// build the upstream roots. We can only error here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
Is this something that can happen "normally"? Do we need to document anything about it e.g. our requirements on a certain cert-manager configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is up to the cert-manager Issuer implementation as to whether they support setting the CA or not, and we can't programmatically determine this before as they are just stringy references.
For cert-manager core Issuers,Vault and CA support setting this field, but ACME for example doesn't.
Thanks for the review @evan2645 😄
Sure thing, will work on an integration test next. The CertificateRequest API is v1 now and should be stable from the client point of view. From a role perspective, the client need only "create" permissions for "CertificateRequests" in the defined namespace. The identity of the client to the API server is important though, as this ties into the approval mechanism. I'll add a note about that in the documentation.
Sure! I mostly found it fairly simple to know which bits I should be adding in, and didn't have much trouble find things. Knowing what goes into |
@evan2645 Had a go at an integration test for cert-manager, let me know what you think 🙂 The go dependency changes are tricky.. Feel that using an unstructured client may be the way to go, wdyt? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woot! This is looking great @JoshVanL I think we are almost there! A few more comments from me, couple things I missed on the last pass :( sorry about that.
Sure! I mostly found it fairly simple to know which bits I should be adding in, and didn't have much trouble find things. Knowing what goes into MintX509CAResponse was a bit tricky, though that is likely due to me not knowing enough about the project yet.
Aah yes ok ... I can see now that you are using our pre-1.0 protos. We have a new set of protos which are better documented, I wonder if they would have helped. You can check them out here, curious if that would have made things better for you. @azdagron do we need to do anything special here since we're still in the middle of this plugin subsystem change?
Had a go at an integration test for cert-manager, let me know what you think 🙂
Awesome, thank you for this!! One of the challenges we've had is maintaining integration code for things we don't know a lot about. I'm sure you all have this problem too, if you have any pointers they'd be very much welcome :-D in the meantime, this integration test goes a long way in helping to mitigate the overhead.
I had some failures trying to run it locally, all related to openssl invocations of course 😆 I think maybe there are a couple typos there. Otherwise, the approach of calling mint and examining the issuer and the bundle sounds good to me.
The go dependency changes are tricky.. Feel that using an unstructured client may be the way to go, wdyt?
I'll defer to @azdagron on what he wants to do here. I am not sure what other implications unstructured client may bring
} | ||
|
||
// loadConfig parses and defaults incoming configure requests | ||
func (p *Plugin) loadConfig(req *spi.ConfigureRequest) (*Config, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we missing a check here for config.Namespace
? The docs suggest it's required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that was an oversight
cmapi.UsageDigitalSignature, | ||
cmapi.UsageKeyEncipherment, | ||
cmapi.UsageCertSign, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are the usages that we normally set when cutting a CA cert:
spire/pkg/server/ca/templates.go
Lines 34 to 36 in e1ffe67
KeyUsage: x509.KeyUsageDigitalSignature | | |
x509.KeyUsageCertSign | | |
x509.KeyUsageCRLSign, |
The key encipherment usage caught my eye here. I don't think that digital signature is strictly necessary and probably should be omitted across the board, but since it's already there let's just keep it?
cmapi.UsageDigitalSignature, | |
cmapi.UsageKeyEncipherment, | |
cmapi.UsageCertSign, | |
cmapi.UsageDigitalSignature, | |
cmapi.UsageCertSign, | |
cmapi.UsageCRLSign, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, I've also added UsageCRLSign
} | ||
// If no issuer_kind given, default to Issuer | ||
if len(config.IssuerKind) == 0 { | ||
p.log.Debug("configuration has empty issuer_kind property, defaulting to 'Issuer'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, sorry I missed this before. All log messages have to be formatted as a sentence without a period. I think most of the messages here can conform to the guideline by simply capitalizing the first letter
p.log.Debug("configuration has empty issuer_kind property, defaulting to 'Issuer'") | |
p.log.Debug("Configuration has empty issuer_kind property, defaulting to 'Issuer'") |
log.Info("waiting for certificaterequest to be signed") | ||
|
||
// Poll the CertificateRequest until it is signed. If not signed after 300 | ||
// polls, error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that there is some kind of periodic clean up that occurs on cert-manager side or elsewhere? I don't see us removing the resource we created, and if we die at this point we'll leak the one we made and create a new one on next boot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't actually.. we should do something here to clean up.
I think what would be best is that we delete successful CertificateRequests in MintX509CA.
We can also annotate created CRs: cert-manager.spiffe.io/trust-domain: <GlobalConfig.trust-domain>
. We can either delete these CRs optimistically at the start of MintX509CA or at the end as part of the same cleanup. If MintX509CA always fails, it makes sense to me to delete at the start. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some prior art to this:
cert-manager revision-manager: This deletes CRs based on a max revision history. This requires running a kube controller which wouldn't be appropriate here AFAIK.
istio-csr optionally deletes the CR after it being successful
cni-lib which optimistically deletes stale CRs during issuance: This is similar to the approach described above^ which I think is the best.
leafURIResult=$(./bin/kubectl exec -n spire $(./bin/kubectl get pod -n spire -o name) -- cat svid.pem | openssl x509 --noout -ext subjectAltName | grep URI | sed 's/^ *//g') | ||
leafIssuerResult=$(./bin/kubectl exec -n spire $(./bin/kubectl get pod -n spire -o name) -- cat svid.pem | openssl x509 --noout -issuer) | ||
caSubjectResult=$(./bin/kubectl exec -n spire $(./bin/kubectl get pod -n spire -o name) -- cat bundle.pem | openssl x509 --noout --subject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These flags are single dash i.e. -noout
, -subject
When I ran it locally, it also complained about -ext
. I couldn't find an equivalent flag for this that made sense given what you're trying to do, I normally use -noout -text
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is based on a commit in our main branch that was published prior to our switch to GitHub Actions ... hence, this PR is currently not triggering CI. We require branches be up-to-date prior to merge, so it's not a problem per se, but if you update your branch now you'll get the CI runners going on this PR which may help for debugging this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These flags are single dash i.e.
-noout
,-subject
When I ran it locally, it also complained about
-ext
. I couldn't find an equivalent flag for this that made sense given what you're trying to do, I normally use-noout -text
Thanks, I'll move it over to that. CLIs between OS's is always tricky 😅, sed
comes to mind 😆
This PR is based on a commit in our main branch that was published prior to our switch to GitHub Actions ... hence, this PR is currently not triggering CI. We require branches be up-to-date prior to merge, so it's not a problem per se, but if you update your branch now you'll get the CI runners going on this PR which may help for debugging this.
Cheers! Once changing to the unstructured client it should make rebasing a bit easier.
ca_ttl = "12h" | ||
ca_subject { | ||
country = ["GB"] | ||
organization = ["cert-manager.io"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might recommend changing this to something SPIRE specific so that it disambiguates the results when we look at leaf issuer and CA (i.e. we can see clearly that leaf was signed by SPIRE, but the root CA is cert-manager). One failure mode here is that we don't take the Upstream path and instead fall back to SPIRE root for some reason, and this case will be harder to distinguish here if we use cert-manager subject for the SPIRE intermediate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, I'll change that back to the standard values.
Naw, we can convert this to v1 trivially after it is merged.
This isn't the first time folks have wanted to update the k8s client deps. We've generally pushed back when there wasn't a requirement. In this case, the unstructured client seems like a workaround, but it's not clear to me how much work. In the end, latching the client code dependency to our minimum supported version only helps compile time compatibility against the minimum version but says nothing about compatibility with the newest versions. A more robust, comprehensive integration test suite that exercised all of the k8s api interactions with at least the minimum and latest versions would give us better assurances. All this to say that I'm probably ok if we want to bump this now, unless the unstructured client is a fairly trivial drop-in. |
@evan2645 @azdagron Thanks both for your reviews! I'll work on adding the cleanup mechanism described above^ and fixing up the integration openssl commands. I'm also going to give a go at moving the client to unstructured- this shouldn't be a huge change and resolves the mod issues (also doesn't mean this PR changes the kube controllers elsewhere).
The kube compatibility of this integration is effectively tied to the compatibility of the cert-manager installation. Since we use the v1 API which was released in cert-manager v1.0, we have compatibility with kube back till v1.11. |
Awesome, I appreciate your willingness! Glad to hear it won't be costly.
😍 that is really good to hear! My comment about the integration tests were geared towards SPIRE's k8s integrations in general, not necessarily this particular interaction. I've opened up #2350 to track that separate discussion. |
142b522
to
d2ec5b9
Compare
Hi @evan2645 @azdagron! After struggling when testing using the unstructured client and the controller-runtime fake client, I have opted to go with adding the Go struct types in an internal package. This seems to work well. I have also added a note that this can be swapped out in future after we create a separate repo with just these types. This is something I want to pick up for the cert-manger v1.5 release. I have added in a cleanup function, which is called on defer when minting. It will delete any CertificateRequests which the SPIRE server is responsible for (the same namespace with the label I have expanded the unit testing out quite a bit, and now covers the |
Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
config example Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
changes Signed-off-by: joshvanl <[email protected]>
considerations and kube_config_file change Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
clean up stale CertificateRequests Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
cleaned-up Signed-off-by: joshvanl <[email protected]>
Signed-off-by: joshvanl <[email protected]>
d2ec5b9
to
9fb031e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am so sorry it took me so long to get back on this. I think what you have here is fine. If/when that change comes in that moves these types to another repo, even better, but this doesn't sound too bad to maintain. I'm good with the direction this takes.
@evan2645, any other thoughts? are we ready to take this?
Signed-off-by: joshvanl <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thank you @JoshVanL!
Pull Request check list
Affected functionality
This PR introduces cert-manager as a server upstream authority plugin. This plugin creates CertificateRequests in the target cluster, configured with the referenced issuer. Once signed by the Issuer it returns the signed certificate and CA.
Description of change
Adds cert-manager as a new UA plugin.
Updates k8s dependencies required when importing cert-manager.
Updates test files and kube controllers with new func signatures from updating dependencies.
Let me know if there is anything missing or would like me to split up the PR.
Which issue this PR fixes
fixes #1796