Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Modify multinamespaced cache to support cluster scoped resources #1418

Merged

Conversation

varshaprasad96
Copy link
Member

@varshaprasad96 varshaprasad96 commented Mar 8, 2021

This PR modifies the multinamespacedcache implementation to:

  • create a global cache mapping for an empty namespace, so that when
    cluster scoped resources are fetched, namespace is not required.
  • deduplicate the objects in the List call, based on
    unique combination of resource name and namespace.

Closes: #1377
Closes: #1378

Signed-off-by: varshaprasad96 [email protected]

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 8, 2021
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 8, 2021
@varshaprasad96
Copy link
Member Author

cc: @alvaroaleman @estroz

Copy link
Contributor

@estroz estroz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior right now is a little weird. If I pass client.InNamespace(""), which is corev1.NamespaceAll, multiNamespaceCache.List() will first call the List() method of all the namespaced caches it knows about then that of the global cache, making those namespaced calls redundant. If I set client.InNamespace("my-ns"), then removeDuplicates() is called unnecessarily.

IMO the correct behavior is as follows:

namespaces := getNamespaces(opts)
if contains(namespaces, "") {
	return globalCache.List()
} else {
	var list []client.Object
	for _, namespace := range namespaces {
		out := c.caches[namespace].List()
		list = append(list, out...)
	}
	return list
}

This change is breaking because the caller needs to specify every namespace it wants to list from if non-global. However it results in corev1.NamespaceAll being correctly respected.

An alternative is to define const NamespaceMulti = "__multi", that can be passed like

cache.List(ctx, list, client.InNamespace(pkgcache.NamespaceMulti)

for listing from all cache namespaces.

Then the above would be

namespaces := getNamespaces(opts)
if contains(namespaces, "") {
	return globalCache.List()
} else if contains(namespaces, NamespaceMulti) {
	var list []client.Object
	for _, cache := range c.caches {
		out := cache.List()
		list = append(list, out...)
	}
	return list
} else {
	var list []client.Object
	for _, namespace := range namespaces {
		out := c.caches[namespace].List()
		list = append(list, out...)
	}
	return list
}

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 14, 2021
@@ -57,6 +57,7 @@ func (n *namespacedClient) RESTMapper() meta.RESTMapper {

// isNamespaced returns true if the object is namespace scoped.
// For unstructured objects the gvk is found from the object itself.
// TODO: this is repetitive code. Remove this and use ojectutil.IsNamespaced.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR references a bug, will do refactoring in follow up.

limitations under the License.
*/

package objectutil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have renamed this file from filter.go to objectutil.go to make it generic

pkg/cache/multi_namespace_cache.go Outdated Show resolved Hide resolved
pkg/internal/objectutil/objectutil.go Outdated Show resolved Hide resolved
@estroz
Copy link
Contributor

estroz commented Apr 28, 2021

This should be labelled as 🐛 @varshaprasad96

This PR modifies the multinamespacedcache implementation to:
- create a global cache mapping for an empty namespace, so that when
cluster scoped resources are fetched, namespace is not required.
- deduplicate the objects in the `List` call, based on
unique combination of resource name and namespace.

Signed-off-by: varshaprasad96 <[email protected]>
Modify multinamespaced cache to accept restmapper, which
can be used to identify the scope of the object and handle
the cluster scoped objects accordingly.
@varshaprasad96 varshaprasad96 changed the title ✨ Modify multinamespaced cache to support cluster scoped resources 🐛 Modify multinamespaced cache to support cluster scoped resources Apr 29, 2021
Copy link
Member

@alvaroaleman alvaroaleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve
/label tide/merge-method-squash
/hold
@estroz anything to add or can we merge this?

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. labels Apr 30, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 30, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, varshaprasad96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2021
@estroz
Copy link
Contributor

estroz commented Apr 30, 2021

Nope LGTM!
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 30, 2021
@k8s-ci-robot k8s-ci-robot merged commit 5673ada into kubernetes-sigs:master Apr 30, 2021
@k8s-ci-robot k8s-ci-robot added this to the v0.9.x milestone Apr 30, 2021
aayushrangwala pushed a commit to aayushrangwala/controller-runtime that referenced this pull request May 5, 2021
…ubernetes-sigs#1418)

* 🐛 Modify multinamespaced cache to support cluster scoped resources

This PR modifies the multinamespacedcache implementation to:
- create a global cache mapping for an empty namespace, so that when
cluster scoped resources are fetched, namespace is not required.
- deduplicate the objects in the `List` call, based on
unique combination of resource name and namespace.

Signed-off-by: varshaprasad96 <[email protected]>

* Add restmapper to multinamespaced cache

* Use restmapper to identify scope of the object

Modify multinamespaced cache to accept restmapper, which
can be used to identify the scope of the object and handle
the cluster scoped objects accordingly.

* Rename fileter.go to objectutil.go

Signed-off-by: varshaprasad96 <[email protected]>
@sbueringer
Copy link
Member

sbueringer commented May 8, 2021

@varshaprasad96 @alvaroaleman Could it be that this change leads to get calls on a _cluster-scope Namespace?

E0508 14:49:56.301011       1 reflector.go:138] external/io_k8s_client_go/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kube-system:dhc-resource-burster" cannot list resource "pods" in API group "" in the namespace "_cluster-scope"

with higher log-level:

I0508 17:01:13.539542 4157540 round_trippers.go:432] GET https://53.6.42.77:6443/api/v1/namespaces/_cluster-scope/pods?allowWatchBookmarks=true&resourceVersion=3148&timeoutSeconds=506&watch=true

We're creating the multinamespaced cache like this: (in this case with namespaces=kube-system)

namespacedCache = cache.MultiNamespacedCacheBuilder(strings.Split(namespaces, ","))

I only noticed the error because I'm running this controller only with rights to the kube-system namespace.

@estroz
Copy link
Contributor

estroz commented May 8, 2021

@sbueringer yep. It’s probably worth renaming this value to something like _multinamespace_cache_global_id to make errors like this traceable, or checking for this error type and returning a better message.

@alvaroaleman
Copy link
Member

But that is wrong, we shouldn't be creating a cluster-scoped cache unless something cluster-scoped is actually requested?

@estroz
Copy link
Contributor

estroz commented May 8, 2021

Oh yeah I didn’t notice that, whoops. Agreed that looks like a bug. It doesn’t look like the global cache is being created/used though; instead, the global cache’s namespace is being used in a namespaces lookup. Is it not being filtered out when iterating through the cache set during a list?

@varshaprasad96
Copy link
Member Author

varshaprasad96 commented May 8, 2021

The global cache is being created by default here. Only when cluster scoped objects are requested we look up there.
I think the bug is here. The global namespace shouldn't be looked at and be filtered out of the list.

@varshaprasad96
Copy link
Member Author

varshaprasad96 commented May 8, 2021

@alvaroaleman @estroz I think #1520 should solve this, or is there something else which I am missing?

@sbueringer
Copy link
Member

sbueringer commented May 8, 2021

@varshaprasad96 I tried your fix locally and it wasn't enough, but it's a bit hard to say where the call is coming from. (I set a breakpoint at the reflector log line)
image

@sbueringer
Copy link
Member

I think the problem is at least also that we start an informer for the "_cluster-scope namespace" here: https://github.com/kubernetes-sigs/controller-runtime/pull/1520/files#diff-cf71f1c7387ae1d9a63da1f81380effac1b826d0c3c1594c4d05da4324be2159R105-R116

@alvaroaleman
Copy link
Member

@sbueringer that link doesn't work, can you just link to the code on a branch? Do you mean this

func (c *multiNamespaceCache) Start(ctx context.Context) error {
for ns, cache := range c.namespaceToCache {
go func(ns string, cache Cache) {
err := cache.Start(ctx)
if err != nil {
log.Error(err, "multinamespace cache failed to start namespaced informer", "namespace", ns)
}
}(ns, cache)
}
<-ctx.Done()
return nil
}
?

@sbueringer
Copy link
Member

@alvaroaleman sorry, yes that's the part I mean

naemono added a commit to naemono/cloud-on-k8s that referenced this pull request Dec 20, 2021
…abled, as

  it causes 'unknown namespace' errors in controller-runtime, and cluster-scoped
  resources are handled properly now in ctrl-runtime.
  see kubernetes-sigs/controller-runtime#1418
naemono added a commit to elastic/cloud-on-k8s that referenced this pull request Feb 14, 2022
* Don't append the 'all' namespaces when storage class validation is enabled, as
  it causes 'unknown namespace' errors in controller-runtime, and cluster-scoped
  resources are handled properly now in ctrl-runtime.
  see kubernetes-sigs/controller-runtime#1418

* Webhook ignores requests from namespaces that it doesn't manage
Updates webhook tests to ensure behavior.

* remove spaces in if err block
move log line down for consistency

* Move namespace validation within the Handle func to reduce duplication.

* ES spelling in pkg/controller/elasticsearch/validation/webhook.go

Co-authored-by: Peter Brachwitz <[email protected]>

* ES spelling in pkg/controller/elasticsearch/validation/webhook.go

Co-authored-by: Peter Brachwitz <[email protected]>

* Use set operations to enhance readability.

* Create new package to duplicate less code for setting up webhooks,
  and to allow managedNamespaces in webhooks to be handled properly
  across all webhooks.
Adjust all managed objects to use new webhook package on setup.

* Adjust common webhook to copy object properly, and use metav1.Object to query namespace.
Add logging when skipping resource validation.
Add additional information to 'reason' for allowing request.

* add missing header

* Check type assertion.

* Ensure that the set of managed namespaces isn't all before checking whether the given namespaces is managed.

* Simplify common webhook validation
Adjust how update is handled

* Remove debugging from webhook

* Proper casing in comments.
Decode the old object in upgrade, not the original object.

* Adding unit tests for webhook validation.

* Use keys in the webhook test structs

Co-authored-by: Peter Brachwitz <[email protected]>
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023
* Don't append the 'all' namespaces when storage class validation is enabled, as
  it causes 'unknown namespace' errors in controller-runtime, and cluster-scoped
  resources are handled properly now in ctrl-runtime.
  see kubernetes-sigs/controller-runtime#1418

* Webhook ignores requests from namespaces that it doesn't manage
Updates webhook tests to ensure behavior.

* remove spaces in if err block
move log line down for consistency

* Move namespace validation within the Handle func to reduce duplication.

* ES spelling in pkg/controller/elasticsearch/validation/webhook.go

Co-authored-by: Peter Brachwitz <[email protected]>

* ES spelling in pkg/controller/elasticsearch/validation/webhook.go

Co-authored-by: Peter Brachwitz <[email protected]>

* Use set operations to enhance readability.

* Create new package to duplicate less code for setting up webhooks,
  and to allow managedNamespaces in webhooks to be handled properly
  across all webhooks.
Adjust all managed objects to use new webhook package on setup.

* Adjust common webhook to copy object properly, and use metav1.Object to query namespace.
Add logging when skipping resource validation.
Add additional information to 'reason' for allowing request.

* add missing header

* Check type assertion.

* Ensure that the set of managed namespaces isn't all before checking whether the given namespaces is managed.

* Simplify common webhook validation
Adjust how update is handled

* Remove debugging from webhook

* Proper casing in comments.
Decode the old object in upgrade, not the original object.

* Adding unit tests for webhook validation.

* Use keys in the webhook test structs

Co-authored-by: Peter Brachwitz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
6 participants