Controller certificate CN too long #445

Closed
SapientGuardian opened this issue Sep 3, 2020 · 7 comments · Fixed by #822
Labels: bug, community

Comments

@SapientGuardian

SapientGuardian commented Sep 3, 2020

Describe the bug
Kafka clusters with medium-sized names in namespaces with medium-sized names fail to create, with an error like the following in the logs:

{"level":"error","ts":1599151167.7324488,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"kafkauser","request":"pushdelivery-stage/kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local","error":"could not create user certificate: admission webhook \"webhook.cert-manager.io\" denied the request: spec.commonName: Too long: must have at most 64 bytes","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90"}

Steps to reproduce the issue:
Create a Kafka cluster whose name, combined with its namespace, produces a controller certificate commonName longer than 64 bytes.

Additional context
I ran into this error rebuilding my cluster after #444 happened. Weirdly, this was successful before, and in production I have an even longer name ("production" vs "stage") that is running just fine. Even though I have the same versions of cert-manager and kafka-operator in both environments, the stage environment has been upgraded to Kubernetes 1.16, while production is still on 1.15.

I found a workaround for this, which was to disable the admission webhook for cert-manager. I think I created these clusters (and their certs) before the webhook was in play, which is why production is still working - the resource was already admitted before the validation was added.

@stoader
Member

stoader commented Sep 3, 2020

The error message indicates the generation of a certificate for a KafkaUser was rejected by cert-manager due to the CN being longer than 64 characters. (see https://cert-manager.io/docs/reference/api-docs/#cert-manager.io/v1beta1.CertificateSpec -> commonName)

kafka-operator passes the name of the KafkaUser as the CN when requesting a certificate from cert-manager for that user.

Do you have any KafkaUser with a name longer than 64 chars?

@SapientGuardian
Author

I don't. I really don't think this was about a user: I could see that the cert kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local didn't exist, and it couldn't be created until I disabled the webhook.

@stoader
Member

stoader commented Sep 3, 2020

If you look carefully at the error log line you pasted above:

"controller":"kafkauser","request":"pushdelivery-stage/kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local"

This shows that the kafkauser controller ("controller":"kafkauser"), which is responsible for reconciling KafkaUser custom resources, received the request pushdelivery-stage/kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local. In other words, the namespace pushdelivery-stage contains a KafkaUser custom resource named kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local.

For details see:

https://github.com/banzaicloud/kafka-operator/blob/e4655553ce9c244cdc25d8b0b7236bbd1793fc6e/controllers/kafkauser_controller.go#L162

https://github.com/banzaicloud/kafka-operator/blob/master/pkg/pki/certmanagerpki/certmanager_user.go#L58

https://github.com/banzaicloud/kafka-operator/blob/e4655553ce9c244cdc25d8b0b7236bbd1793fc6e/pkg/pki/certmanagerpki/certmanager_user.go#L168

Since kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local is 72 chars long, over the 64-char limit, it was rejected by cert-manager's validation webhook. With the webhook disabled, the 64-char length validation is not enforced.
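To make the check above concrete, here is a minimal Go sketch that splits the "request" value from the log line into namespace and name, then compares the name's byte length against the 64-byte limit quoted in the webhook error. The helpers `splitRequest` and `commonNameTooLong` are hypothetical names for illustration only; they are not part of kafka-operator or cert-manager.

```go
package main

import (
	"fmt"
	"strings"
)

// maxCommonNameBytes mirrors the limit quoted in the webhook error message.
const maxCommonNameBytes = 64

// splitRequest (hypothetical helper) splits a "namespace/name" request
// string, as it appears in the "request" field of the log line.
func splitRequest(request string) (namespace, name string, ok bool) {
	parts := strings.SplitN(request, "/", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

// commonNameTooLong (hypothetical helper) reports whether cn exceeds the
// limit. len counts bytes, which matches the "64 bytes" wording of the error.
func commonNameTooLong(cn string) bool {
	return len(cn) > maxCommonNameBytes
}

func main() {
	request := "pushdelivery-stage/kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local"
	namespace, name, ok := splitRequest(request)
	if !ok {
		fmt.Println("unexpected request format")
		return
	}
	fmt.Printf("namespace=%s\nname=%s\nbytes=%d tooLong=%t\n",
		namespace, name, len(name), commonNameTooLong(name))
}
```

A check like this, run before creating the cluster, would flag the name without needing a round trip through the admission webhook.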

@SapientGuardian
Author

You're right, there is technically a user called kafka-pushdelivery-stage-controller.pushdelivery-stage.mgt.cluster.local. But that isn't a user that I create, it gets created implicitly as part of the cluster.

@stoader
Member

stoader commented Sep 3, 2020

What do you mean by "it gets created implicitly as part of the cluster"? Does your cluster setup process create this user?

@SapientGuardian
Author

SapientGuardian commented Sep 3, 2020

// BrokerControllerFQDNTemplate is combined with the above and cluster namespace
	// to create a 'fake' full-name for the controller user

https://github.com/banzaicloud/kafka-operator/blob/2aa393b9d2df813913779a2292dddd9d059fce53/pkg/util/pki/common.go#L44

That's the broker controller user, isn't it?

@stoader
Member

stoader commented Sep 3, 2020

You're right, that is the user created for the controller.
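For readers following along, the sketch below shows how a controller user name of this shape could be assembled from the cluster name and namespace. The two format strings are assumptions inferred from the name in the log line, not copied from the operator source, and `controllerUserName` is a hypothetical helper. The "production" cluster name and namespace are also assumed, by analogy with the stage ones.

```go
package main

import "fmt"

// Hypothetical templates, inferred from the user name seen in the log line.
// The real constants in kafka-operator may differ.
const (
	controllerTemplate     = "%s-controller"
	controllerFQDNTemplate = "%s.%s.mgt.cluster.local"
)

// controllerUserName (hypothetical helper) builds the 'fake' full name for
// the controller user from the cluster name and the cluster namespace.
func controllerUserName(clusterName, namespace string) string {
	return fmt.Sprintf(controllerFQDNTemplate,
		fmt.Sprintf(controllerTemplate, clusterName), namespace)
}

func main() {
	// The "production" names are assumed analogs of the stage names.
	for _, env := range []string{"stage", "production"} {
		name := controllerUserName("kafka-pushdelivery-"+env, "pushdelivery-"+env)
		fmt.Printf("%s (%d bytes)\n", name, len(name))
	}
}
```

Under these assumed templates, the fixed parts ("-controller", the separating dot, and ".mgt.cluster.local") take 30 bytes, so the cluster name and namespace together would have to fit in 34 bytes to stay within the 64-byte commonName limit.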
