
Extend ClusterSnapshot implementations to include StorageInfos lister #4904

Closed
x13n opened this issue May 20, 2022 · 9 comments · Fixed by #5013
Labels
area/cluster-autoscaler · help wanted · kind/feature

Comments

x13n (Member) commented May 20, 2022

Which component are you using?:

cluster autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

Recent scheduler framework changes will cause Cluster Autoscaler code to stop compiling once we update the dependency on scheduler.

Describe the solution you'd like.:

There are three objects implementing the SharedLister interface.

Both BasicClusterSnapshot and DeltaClusterSnapshot can maintain a mapping from namespaced PVC names to sets of namespaced pod names. IsPVCUsedByPods can then just check whether the set is non-empty, giving O(1) complexity for adding and removing pods. In the delta snapshot case, deleted pods also need to be tracked, similarly to how node infos are tracked today, to avoid O(n) computation when forking.
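As a rough illustration of that mapping (a standalone sketch with hypothetical names, not the actual cluster-autoscaler types):

```go
package main

import "fmt"

// pvcUsage tracks, for each namespaced PVC key ("namespace/claim"),
// the set of namespaced pod names that reference it.
type pvcUsage map[string]map[string]bool

// addPod registers a pod's PVC references; O(1) per referenced claim.
func (u pvcUsage) addPod(podKey string, pvcKeys []string) {
	for _, pvc := range pvcKeys {
		if u[pvc] == nil {
			u[pvc] = map[string]bool{}
		}
		u[pvc][podKey] = true
	}
}

// removePod drops a pod's references; empty sets are deleted entirely.
func (u pvcUsage) removePod(podKey string, pvcKeys []string) {
	for _, pvc := range pvcKeys {
		delete(u[pvc], podKey)
		if len(u[pvc]) == 0 {
			delete(u, pvc)
		}
	}
}

// isPVCUsedByPods is just a non-emptiness check on the set, O(1).
func (u pvcUsage) isPVCUsedByPods(key string) bool {
	return len(u[key]) > 0
}

func main() {
	u := pvcUsage{}
	u.addPod("default/pod-a", []string{"default/claim-1"})
	fmt.Println(u.isPVCUsedByPods("default/claim-1")) // true
	u.removePod("default/pod-a", []string{"default/claim-1"})
	fmt.Println(u.isPVCUsedByPods("default/claim-1")) // false
}
```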

Describe any alternative solutions you've considered.:

A no-op implementation would be simpler, but it would incorrectly handle ReadWriteOncePod PVCs, leading to scale-ups that wouldn't actually help some pending pods.

Additional context.:

Scheduler PR changing SharedLister interface: kubernetes/kubernetes#109715

@x13n x13n added the kind/feature Categorizes issue or PR as related to a new feature. label May 20, 2022
x13n (Member, author) commented May 23, 2022

/help

k8s-ci-robot (Contributor)

@x13n:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

> /help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 23, 2022
ahaysx (Contributor) commented May 25, 2022

/assign

ahaysx (Contributor) commented May 29, 2022

@x13n there hasn't been a k/k release > v1.25.0-alpha.0 with kubernetes/kubernetes#109715. Besides implementing IsPVCUsedByPods, is it fine in this change to just set up the snapshots and the delegating lister for the eventual vendor update? The update change will then just need to add StorageInfos.

type basicClusterSnapshotStorageInfoLister BasicClusterSnapshot
func (snapshot *basicClusterSnapshotStorageInfoLister) IsPVCUsedByPods(key string) bool
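Filled out, that delegating-lister pattern could look roughly like this (BasicClusterSnapshot is reduced to a stand-in here, and the pvcRefCount field is a hypothetical placeholder for the real pod-tracking state):

```go
package main

import "fmt"

// BasicClusterSnapshot is a stand-in for the real snapshot type; here it
// just holds a per-PVC reference count as its pod-tracking state.
type BasicClusterSnapshot struct {
	pvcRefCount map[string]int // namespaced PVC key -> number of using pods
}

// basicClusterSnapshotStorageInfoLister is a defined type with the same
// underlying type as BasicClusterSnapshot, so the snapshot can hand the
// scheduler a StorageInfoLister-shaped view of itself via a conversion.
type basicClusterSnapshotStorageInfoLister BasicClusterSnapshot

// IsPVCUsedByPods matches the scheduler's StorageInfoLister method shape.
func (s *basicClusterSnapshotStorageInfoLister) IsPVCUsedByPods(key string) bool {
	return s.pvcRefCount[key] > 0
}

// StorageInfos returns the lister view of the snapshot; the conversion is
// free because the two types share an underlying type.
func (snapshot *BasicClusterSnapshot) StorageInfos() *basicClusterSnapshotStorageInfoLister {
	return (*basicClusterSnapshotStorageInfoLister)(snapshot)
}

func main() {
	s := &BasicClusterSnapshot{pvcRefCount: map[string]int{"default/claim-1": 1}}
	fmt.Println(s.StorageInfos().IsPVCUsedByPods("default/claim-1")) // true
	fmt.Println(s.StorageInfos().IsPVCUsedByPods("default/claim-2")) // false
}
```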

Also, I have some questions about the right strategy in the delta snapshot. I don't really see how a PVC cache would be helpful due to the cache invalidation calls in pod add/remove. If the nodeInfoList were a map, I could update it directly in nodeToModify and only need to invalidate the pod caches in pod add/remove.

Also, maybe we should consider changing all the caches to maps, which would allow for updating in place and removing invalidation. It would make them more effective and maybe reduce the caching complexity in this file.

We can chat on slack at some point this week if that's faster?

x13n (Member, author) commented May 30, 2022

Thanks for working on this!

> @x13n there hasn't been a k/k release > v1.25.0-alpha.0 with kubernetes/kubernetes#109715. Besides implementing IsPVCUsedByPods, is it fine in this change to just set up the snapshots and the delegating lister for the eventual vendor update? The update change will then just need to add StorageInfos.

I think everything can be implemented beforehand, including StorageInfos. That way the vendor update should require no extra changes; the function will just be unused until then.

> type basicClusterSnapshotStorageInfoLister BasicClusterSnapshot
> func (snapshot *basicClusterSnapshotStorageInfoLister) IsPVCUsedByPods(key string) bool

> Also, I have some questions about the right strategy in the delta snapshot. I don't really see how a PVC cache would be helpful due to the cache invalidation calls in pod add/remove. If the nodeInfoList were a map, I could update it directly in nodeToModify and only need to invalidate the pod caches in pod add/remove.
>
> Also, maybe we should consider changing all the caches to maps, which would allow for updating in place and removing invalidation. It would make them more effective and maybe reduce the caching complexity in this file.

Good point about cache invalidation. Since the caches are only used for listing, maintaining them during add/remove makes sense; for Revert they are no longer needed and can simply be dropped. Forking, on the other hand, would require building the entire map, which is O(n) instead of the current O(1). This may be problematic when a snapshot is forked, modified, then reverted (and never listed). The current approach keeps the O(n) cost at listing only (unless the cache is invalid). I think some hybrid approach, where we populate the cache on first listing but then keep it updated when needed (if the cache is not empty), would be best from a performance perspective.
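That hybrid approach could be sketched like this (a minimal standalone model under assumed names, not the actual snapshot code: the list cache is built lazily on first listing and then maintained on add while it exists):

```go
package main

import "fmt"

// lazyPodCache models the hybrid approach: the pod list cache is only
// built on the first List() call; after that, Add keeps it up to date
// instead of invalidating it. A fresh fork can start with a nil cache
// at O(1) cost.
type lazyPodCache struct {
	pods  map[string]bool // source of truth
	cache []string        // nil means "not built yet"
}

func (c *lazyPodCache) Add(name string) {
	c.pods[name] = true
	if c.cache != nil { // maintain the cache only if it already exists
		c.cache = append(c.cache, name)
	}
}

func (c *lazyPodCache) Remove(name string) {
	delete(c.pods, name)
	c.cache = nil // simplest correct choice; real code could update in place
}

func (c *lazyPodCache) List() []string {
	if c.cache == nil { // O(n) rebuild, paid only when actually listing
		c.cache = make([]string, 0, len(c.pods))
		for p := range c.pods {
			c.cache = append(c.cache, p)
		}
	}
	return c.cache
}

func main() {
	c := &lazyPodCache{pods: map[string]bool{}}
	c.Add("pod-a") // cache not built yet, nothing to maintain
	c.Add("pod-b")
	fmt.Println(len(c.List())) // 2 (builds the cache lazily)
	c.Add("pod-c")             // cache exists now, appended in place
	fmt.Println(len(c.List())) // 3
}
```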

> We can chat on slack at some point this week if that's faster?

Yup, we can, just note I'm in CEST timezone.

x13n (Member, author) commented Jun 23, 2022

We chatted about this with @ahaysx on Slack. He won't be able to follow up on this, but some conclusions were:

  • We can turn nodeInfoList into a map to allow updating/deleting entries without invalidating the entire cache.
  • It's harder to do the same trick with the PVC cache, since we'd need to iterate over all pods on a node to update it. I thought about benchmarking this, but it is probably still going to be faster than invalidating the entire cache.
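The first point can be illustrated with a minimal sketch (nodeInfo here is a stand-in for the scheduler's node info type, not the real one):

```go
package main

import "fmt"

// nodeInfo is a minimal stand-in for the scheduler's node info type.
type nodeInfo struct {
	name string
	pods int
}

// nodeInfoMap keeps node infos keyed by node name. Replacing one entry
// in place leaves all other entries untouched, so nothing needs to be
// invalidated when a single node is modified.
type nodeInfoMap map[string]*nodeInfo

// modify updates one entry in place, if it exists.
func (m nodeInfoMap) modify(name string, f func(*nodeInfo)) {
	if ni, ok := m[name]; ok {
		f(ni)
	}
}

func main() {
	m := nodeInfoMap{"node-1": {name: "node-1", pods: 2}}
	m.modify("node-1", func(ni *nodeInfo) { ni.pods++ })
	fmt.Println(m["node-1"].pods) // 3
}
```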

x13n (Member, author) commented Jun 23, 2022

/unassign @ahaysx

x13n (Member, author) commented Jun 30, 2022

/assign @jayantjain93

jayantjain93 (Contributor)

I've started looking into this.
