Silent RoleBinding Drops #453

Closed
oliverbaehler opened this issue Oct 19, 2021 · 0 comments · Fixed by #457

Bug description

No RoleBindings are created in a tenant's namespaces if a single item in additionalRoleBindings is misconfigured.

How to reproduce

Steps to reproduce the behavior:

  1. Provide the Capsule Tenant YAML definitions

The following tenant has an invalid subject: a subject of kind ServiceAccount only accepts an RFC 1123 name in the name field. The tenant can still be created without any problems:

apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: bug
spec:
  additionalRoleBindings:
  - clusterRoleName: tenant-faulty
    subjects:
    - kind: Group
      name: idp_idp_groupA
    - kind: Group
      name: idp_idp_groupB
    - kind: ServiceAccount
      name: system:serviceaccount:default:default

  - clusterRoleName: tenant-pass
    subjects:
    - kind: Group
      name: idp_idp_groupA
    - kind: Group
      name: idp_idp_groupB
  owners:
  - kind: ServiceAccount
    name: system:serviceaccount:tenants:tenant-orchestrator
  - kind: Group
    name: idp_kubernetes/applications  
  resourceQuotas:
    items:
    - hard:
        requests.cpu: 0
        requests.memory: 0
        requests.storage: 0
    scope: Tenant

Only the tenant-faulty binding has an invalid subject name; keep that in mind for later.
But when you create a new namespace for the tenant:

kubectl apply -f - << EOF
kind: Namespace
apiVersion: v1
metadata:
  name: gas-production
  labels:
    capsule.clastix.io/tenant: bug
EOF

There won't be any RoleBindings in it: neither the namespace-deleter or namespace:admin RoleBinding, nor the one that wasn't faulty:

$ kubectl get rolebinding -n  gas-production
No resources found in gas-production namespace.

The tenant is Active, so that's not the problem:

kubectl get tenant 
NAME           STATE    NAMESPACE QUOTA   NAMESPACE COUNT   NODE SELECTOR                  AGE
bug            Active                     1                                                6m7s

(See the Logs section)

Expected behavior

In short, if one RoleBinding is faulty, none is created, which IMHO is very critical. So the first thing we should ensure is that all the other RoleBindings still get created. Either we reject the faulty item at tenant creation or we just skip it.

If we choose the second option, the CRD is missing a component: some kind of health status. How would I, as an operator, know that a dev made a mistake in a binding? I could check the controller logs, but the dev could only guess based on the symptoms. Maybe we should add an indication to the Tenant CR that says whether it has any problems and what they might be, so a dev could potentially fix it themselves.

What do you think?

Logs

The creation of the faulty RoleBinding was rejected, but I don't see anything about the other RoleBindings being rejected. Are they batched into one request?

{"level":"debug","ts":"2021-10-19T14:48:51.277Z","logger":"controller-runtime.webhook.webhooks","msg":"received request","webhook":"/namespaces","UID":"67a12615-b2de-4ef8-80af-0dd31fd321c6","kind":"/v1, Kind=Namespace","resource":{"group":"","version":"v1","resource":"namespaces"}}
{"level":"debug","ts":"2021-10-19T14:48:51.277Z","logger":"controller-runtime.webhook.webhooks","msg":"wrote response","webhook":"/namespaces","code":200,"reason":"","UID":"67a12615-b2de-4ef8-80af-0dd31fd321c6","allowed":true}
{"level":"info","ts":"2021-10-19T14:48:51.279Z","logger":"controllers.Tenant","msg":"Starting processing of Resource Quotas","Request.Name":"bug","items":1}
{"level":"debug","ts":"2021-10-19T14:48:51.279Z","logger":"controller-runtime.manager.events","msg":"Normal","object":{"kind":"Tenant","name":"bug","uid":"ae0b5a4b-363e-4011-8670-dbce48a2a321","apiVersion":"capsule.clastix.io/v1beta1","resourceVersion":"41228420"},"reason":"gas-production","message":"Ensuring Namespace metadata"}
{"level":"info","ts":"2021-10-19T14:48:51.279Z","logger":"controllers.Tenant","msg":"Desired hard requests.cpu quota is 0","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.279Z","logger":"controllers.Tenant","msg":"Computed requests.cpu quota for the whole Tenant is 0","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.280Z","logger":"controllers.Tenant","msg":"Desired hard requests.memory quota is 0","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.280Z","logger":"controllers.Tenant","msg":"Computed requests.memory quota for the whole Tenant is 0","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.281Z","logger":"controllers.Tenant","msg":"Desired hard requests.storage quota is 0","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.281Z","logger":"controllers.Tenant","msg":"Computed requests.storage quota for the whole Tenant is 0","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.281Z","logger":"controllers.Tenant","msg":"Pruning objects with label selector capsule.clastix.io/resource-quota,capsule.clastix.io/resource-quota notin (0)","Request.Name":"bug"}
{"level":"info","ts":"2021-10-19T14:48:51.285Z","logger":"controllers.Tenant","msg":"Resource Quota sync result: unchanged","Request.Name":"bug","name":"capsule-bug-0","namespace":"gas-production"}
{"level":"info","ts":"2021-10-19T14:48:51.285Z","logger":"controllers.Tenant","msg":"Ensuring additional RoleBindings for owner","Request.Name":"bug"}
{"level":"debug","ts":"2021-10-19T14:48:51.285Z","logger":"controller-runtime.manager.events","msg":"Normal","object":{"kind":"Tenant","name":"bug","uid":"ae0b5a4b-363e-4011-8670-dbce48a2a321","apiVersion":"capsule.clastix.io/v1beta1","resourceVersion":"41228420"},"reason":"gas-production","message":"Ensuring ResourceQuota capsule-bug-0"}
{"level":"info","ts":"2021-10-19T14:48:51.285Z","logger":"controllers.Tenant","msg":"Pruning objects with label selector capsule.clastix.io/role-binding,capsule.clastix.io/role-binding notin (4a25b5ce4852895e,f47fd25b84ed1669)","Request.Name":"bug"}
{"level":"error","ts":"2021-10-19T14:48:51.290Z","logger":"controllers.Tenant","msg":"Cannot sync Additional RoleBinding","Request.Name":"bug","error":"RoleBinding.rbac.authorization.k8s.io \"capsule-bug-0-tenant-faulty\" is invalid: subjects[2].name: Invalid value: \"system:serviceaccount:default:default\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')","stacktrace":"github.com/clastix/capsule/controllers/tenant.(*Manager).syncAdditionalRoleBindings.func2\n\t/workspace/controllers/tenant/rolebindings.go:46\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"}
{"level":"info","ts":"2021-10-19T14:48:51.290Z","logger":"controllers.Tenant","msg":"Additional RoleBindings sync result: unchanged","Request.Name":"bug","name":"capsule-bug-0-tenant-faulty","namespace":"gas-production"}
{"level":"error","ts":"2021-10-19T14:48:51.290Z","logger":"controllers.Tenant","msg":"Cannot sync additional RoleBindings items","Request.Name":"bug","error":"RoleBinding.rbac.authorization.k8s.io \"capsule-bug-0-tenant-faulty\" is invalid: subjects[2].name: Invalid value: \"system:serviceaccount:default:default\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}
{"level":"debug","ts":"2021-10-19T14:48:51.290Z","logger":"controller-runtime.manager.events","msg":"Warning","object":{"kind":"Tenant","name":"bug","uid":"ae0b5a4b-363e-4011-8670-dbce48a2a321","apiVersion":"capsule.clastix.io/v1beta1","resourceVersion":"41228420"},"reason":"gas-production","message":"Ensuring additional RoleBinding capsule-bug-0-tenant-faulty"}
{"level":"error","ts":"2021-10-19T14:48:51.290Z","logger":"controller-runtime.manager.controller.tenant","msg":"Reconciler error","reconciler group":"capsule.clastix.io","reconciler kind":"Tenant","name":"bug","namespace":"","error":"RoleBinding.rbac.authorization.k8s.io \"capsule-bug-0-tenant-faulty\" is invalid: subjects[2].name: Invalid value: \"system:serviceaccount:default:default\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}
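
For the "reject it on tenant creation" option, the check quoted in the error message could be run before the tenant is accepted. The following Go sketch uses the regex from the log output, anchored at both ends, plus the 253-character limit that Kubernetes applies to DNS subdomain names. The helper name `isValidServiceAccountName` is hypothetical and is not part of Capsule.

```go
package main

import (
	"fmt"
	"regexp"
)

// rfc1123Subdomain is the regex quoted in the error message above,
// anchored so that the whole name has to match.
var rfc1123Subdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// isValidServiceAccountName is a hypothetical helper that reports whether
// name would pass the RFC 1123 subdomain check for a ServiceAccount subject.
func isValidServiceAccountName(name string) bool {
	return len(name) <= 253 && rfc1123Subdomain.MatchString(name)
}

func main() {
	names := []string{
		"system:serviceaccount:default:default", // the value from tenant-faulty
		"default", // a plain ServiceAccount name
	}
	for _, n := range names {
		fmt.Printf("%q valid=%v\n", n, isValidServiceAccountName(n))
	}
}
```

The colons in system:serviceaccount:default:default are what fail the check. In a standard RBAC subject, a ServiceAccount is referenced by its plain name together with a separate namespace field.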

Additional context

  • Capsule version: 0.1.0
  • Helm Chart version: 0.1.0
  • Kubernetes version: (kubectl version)
@oliverbaehler oliverbaehler added blocked-needs-validation Issue need triage and validation bug Something isn't working labels Oct 19, 2021
@prometherion prometherion added this to the v0.1.1 milestone Oct 24, 2021
@prometherion prometherion removed the blocked-needs-validation Issue need triage and validation label Oct 28, 2021