Faster ExtractList. Add ExtractListWithAlloc variant. #113362

Merged

sxllwx
Member

@sxllwx sxllwx commented Oct 26, 2022

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #113305

related: issue #102718

Special notes for your reviewer:

After the reflector obtains the ObjectList through ListAndWatch, it extracts the individual objects with ExtractList (which uses Go reflection internally) and stores them in the Indexer. If even one Object in the list never changes, the memory occupied by the entire ObjectList cannot be released, even though most of the data in the list is out of date.

We remove the dependency on the ObjectList by allocating a new object and setting its value from the list element (Alloc && Set, a shallow copy). This helps the Go runtime garbage-collect memory that is no longer needed.
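The retention problem described here can be sketched in plain Go, without client-go. The item type and both helper functions below are hypothetical and exist only for illustration; this is not the code in the PR. The sketch shows that a pointer taken with &items[i] refers to memory inside the slice's backing array, while an allocate-and-assign copy does not.

```go
package main

import "fmt"

// item is a hypothetical stand-in for an API object such as a ConfigMap.
type item struct {
	Name string
	Data map[string]string
}

// pointersInto returns pointers to the elements of items. Each pointer refers
// to memory inside the slice's backing array, so holding any one of them keeps
// the whole array reachable.
func pointersInto(items []item) []*item {
	out := make([]*item, len(items))
	for i := range items {
		out[i] = &items[i]
	}
	return out
}

// allocEach allocates a new item per element and shallow-copies the element
// into it. The results do not point into the backing array of items.
func allocEach(items []item) []*item {
	out := make([]*item, len(items))
	for i := range items {
		c := new(item)
		*c = items[i]
		out[i] = c
	}
	return out
}

func main() {
	items := []item{{Name: "a"}, {Name: "b"}}
	aliased := pointersInto(items)
	fresh := allocEach(items)
	fmt.Println(aliased[0] == &items[0]) // true: points into the slice
	fmt.Println(fresh[0] == &items[0])   // false: separate allocation
}
```

With the second form, dropping all but one of the returned pointers leaves only that one object reachable, rather than the whole array.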

Does this PR introduce a user-facing change?

client-go: Improved memory use of reflector caches when watching large numbers of objects which do not change frequently

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 26, 2022
@sxllwx sxllwx force-pushed the ftr/extract_listobject_use_copy branch from a80dd26 to 2b73ffa Compare October 26, 2022 16:08
@lavalamp
Member

I'm not convinced we should deep copy at all. But definitely it shouldn't go here, there are other callers of this code path than just the watch cache.

@sxllwx
Member Author

sxllwx commented Oct 26, 2022

I'm not convinced we should deep copy at all. But definitely it shouldn't go here, there are other callers of this code path than just the watch cache.

@lavalamp

Thank you for your suggestion ~ 😄

I will find a more suitable place to add the relevant logic.

@leilajal
Contributor

/cc @liggitt @wojtek-t @lavalamp
/triage accepted

@k8s-ci-robot k8s-ci-robot requested a review from lavalamp October 27, 2022 16:44
@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 27, 2022
@sxllwx sxllwx force-pushed the ftr/extract_listobject_use_copy branch from 2b73ffa to 5dfcb42 Compare October 28, 2022 15:33
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 28, 2022
@sxllwx sxllwx changed the title Ftr: DeepCopy Object when ExtractList Ftr: Alloc Object when ExtractList Oct 28, 2022
@sxllwx
Member Author

sxllwx commented Oct 28, 2022

/ping @lavalamp

Hi @lavalamp. To avoid breaking the original ExtractList, I added ExtractWithAlloc. Instead of using DeepCopy, it makes a shallow copy into a newly allocated object by using reflect.New and reflect.Value.Set. Maps and slices in the new object share memory with the corresponding element in the list; only fields of basic Go types such as int and string are copied. In my tests it works well (unit tests are being written). I would be grateful for your suggestions on this approach.
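A minimal sketch of the reflect.New plus reflect.Value.Set technique mentioned in this comment follows. The configMap type and the shallowCopyElem helper are hypothetical names chosen for illustration, not the PR's implementation. It shows that scalar fields of the copy are independent while the map is shared with the list element.

```go
package main

import (
	"fmt"
	"reflect"
)

// configMap is a hypothetical, simplified stand-in for corev1.ConfigMap.
type configMap struct {
	Name string
	Data map[string]string
}

// shallowCopyElem allocates a new value of the element type of slice and
// copies element i into it, returning a pointer to the copy.
func shallowCopyElem(slice interface{}, i int) interface{} {
	raw := reflect.ValueOf(slice).Index(i)
	ptr := reflect.New(raw.Type()) // *T pointing at a zero T
	ptr.Elem().Set(raw)            // struct assignment: a shallow copy
	return ptr.Interface()
}

func main() {
	items := []configMap{{Name: "cm", Data: map[string]string{"k": "v"}}}
	c := shallowCopyElem(items, 0).(*configMap)

	c.Name = "renamed"      // does not affect items[0]
	c.Data["k"] = "changed" // visible through items[0]: the map is shared

	fmt.Println(items[0].Name, items[0].Data["k"]) // cm changed
}
```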

@lavalamp
Member

This still double-allocates for some amount of time, it's too late.

I'm actually not even convinced that deep copy is enough, are e.g. strings copied or are they still references to the original deserialized data?

A correct fix is going to be very invasive, which is why I'm questioning that the benefit would be high enough.
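The question about strings can be checked with a small experiment. The sketch below assumes Go 1.20 or later (for unsafe.StringData); the meta type and the helpers sameBytes and cloneString are hypothetical. It shows only that a struct assignment copies the string header and keeps pointing at the same bytes, whereas strings.Clone allocates new bytes. It makes no claim about how any particular decoder allocates the original strings.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// meta is a hypothetical struct with a single string field.
type meta struct{ Name string }

// sameBytes reports whether two strings point at the same underlying bytes.
func sameBytes(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

// cloneString returns a copy of s backed by a fresh allocation.
func cloneString(s string) string { return strings.Clone(s) }

func main() {
	decoded := meta{Name: string([]byte("kube-root-ca.crt"))}
	shallow := decoded                               // struct assignment
	cloned := meta{Name: cloneString(decoded.Name)} // explicit byte copy

	fmt.Println(sameBytes(decoded.Name, shallow.Name)) // true
	fmt.Println(sameBytes(decoded.Name, cloned.Name))  // false
}
```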

@sxllwx sxllwx force-pushed the ftr/extract_listobject_use_copy branch from 5dfcb42 to 101c195 Compare October 29, 2022 11:26
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 29, 2022
@sxllwx
Member Author

sxllwx commented Oct 29, 2022

Some clarifications are necessary here:

Take corev1.ConfigMapList as an example (for convenience, I copied the relevant parts of the code here so that you do not have to jump around):

type ConfigMapList struct {
	....
	Items []ConfigMap
}

type ConfigMap struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
	....
	Data map[string]string `json:"data,omitempty" protobuf:"bytes,2,rep,name=data"`
}

func (in *ConfigMap) DeepCopyObject() runtime.Object {
	...
}

The elements of Items are all of type ConfigMap. ConfigMap itself does not implement runtime.Object (the receiver of the DeepCopyObject method is *ConfigMap), so we need to use *ConfigMap.

ExtractList returns []runtime.Object, so we have the following options:

  1. Take a pointer to the ConfigMapList.Items[i] element directly (the current implementation). This prevents the backing array of ConfigMapList.Items from being released.

  2. DeepCopy (no longer used in this PR): causes double memory consumption. Not only are the values of basic Go type fields [int, string] (Kind, APIVersion, Name, UID, ...) copied, but fields of type map[string]string such as Data, Labels, and Annotations are copied as well.

  3. Allocate a new ConfigMap and assign the element's value to it, with code similar to the following:

ret := configMapList.Items[0]

Only the fields of basic types (int, string, ... as mentioned above) are copied here; the Data field still points to configMapList.Items[0].Data. When the Go runtime runs garbage collection, configMapList.Items[0] is released if nothing else references it. Because Data is also referenced by ret.Data, that memory is not freed.

This raises a puzzling question: why do method 1 and method 3 show the same number of allocations in the benchmark results?

This is because the following code exists in ExtractList:

func ExtractList(obj runtime.Object) ([]runtime.Object, error) {
	....
	for i := range list {
		raw := items.Index(i)

		switch item := raw.Interface().(type) {
		....
	....

The call raw.Interface() in switch item := raw.Interface().(type) has already performed an Alloc && Set of the ConfigMap.
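The copying behaviour of reflect.Value.Interface on a struct element can be observed directly. In this sketch, configMap and boxed are hypothetical names; the interface value returned for a slice element holds its own copy of the struct, so a later write to the slice is not visible through it.

```go
package main

import (
	"fmt"
	"reflect"
)

// configMap is a hypothetical, simplified stand-in for corev1.ConfigMap.
type configMap struct{ Name string }

// boxed returns element i of items as an interface value obtained through
// reflection, the same way the type switch in ExtractList obtains it.
func boxed(items []configMap, i int) interface{} {
	return reflect.ValueOf(items).Index(i).Interface()
}

func main() {
	items := []configMap{{Name: "before"}}
	v := boxed(items, 0)

	items[0].Name = "after" // mutate the slice element afterwards

	fmt.Println(v.(configMap).Name) // before: v holds a copy of the struct
}
```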

ExtractListWithAlloc, however, determines the type by comparing raw.Type. When the current object is determined to be a ConfigMap, it allocates a new ConfigMap and assigns the value to it. This keeps the number of memory allocations the same while avoiding direct references into the Items list.
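The type-based approach can be sketched as follows. This is a simplified illustration under stated assumptions, not the merged ExtractListWithAlloc: the object interface is a hypothetical stand-in for runtime.Object, and configMap and extractWithAlloc are illustrative names. The check inspects the element's reflect.Type instead of calling Interface() on the value, then allocates one new object per element. reflect.PointerTo requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"reflect"
)

// object is a hypothetical stand-in for runtime.Object.
type object interface{ GetName() string }

type configMap struct {
	Name string
	Data map[string]string
}

// GetName has a pointer receiver, so only *configMap satisfies object.
func (c *configMap) GetName() string { return c.Name }

var objectType = reflect.TypeOf((*object)(nil)).Elem()

// extractWithAlloc returns one newly allocated object per slice element.
func extractWithAlloc(items interface{}) ([]object, error) {
	v := reflect.ValueOf(items)
	if v.Kind() != reflect.Slice {
		return nil, fmt.Errorf("expected a slice, got %T", items)
	}
	out := make([]object, v.Len())
	for i := 0; i < v.Len(); i++ {
		raw := v.Index(i)
		// Decide by type, without boxing the element value itself.
		if !reflect.PointerTo(raw.Type()).Implements(objectType) {
			return nil, fmt.Errorf("item %d: *%v does not implement object", i, raw.Type())
		}
		ptr := reflect.New(raw.Type()) // the single allocation per element
		ptr.Elem().Set(raw)            // shallow copy into the new object
		out[i] = ptr.Interface().(object)
	}
	return out, nil
}

func main() {
	items := []configMap{{Name: "a"}, {Name: "b"}}
	objs, err := extractWithAlloc(items)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(objs), objs[1].GetName(), objs[1].(*configMap) == &items[1]) // 2 b false
}
```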

@sxllwx sxllwx force-pushed the ftr/extract_listobject_use_copy branch from 101c195 to b546958 Compare October 29, 2022 14:54
@k8s-ci-robot k8s-ci-robot merged commit fe9ef26 into kubernetes:master May 29, 2023
chiukapoor added a commit to chiukapoor/rancher that referenced this pull request Jan 17, 2024
chiukapoor added a commit to chiukapoor/rancher that referenced this pull request Jan 29, 2024
chiukapoor added a commit to chiukapoor/rancher that referenced this pull request Feb 13, 2024
krunalhinguu pushed a commit to krunalhinguu/rancher that referenced this pull request Feb 13, 2024
chiukapoor added a commit to chiukapoor/rancher that referenced this pull request Feb 14, 2024
chiukapoor added a commit to chiukapoor/rancher that referenced this pull request Feb 16, 2024
chiukapoor added a commit to chiukapoor/rancher that referenced this pull request Apr 23, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Reflector retains initial list result for a long time
6 participants