Backport upstream changes to watch cache enablement #16398

Merged: 3 commits into openshift:master on Sep 26, 2017

Conversation

smarterclayton
Contributor

Disables the watch cache by default for most resources, keeping it only for those accessed by many clients. This has been shown to have only a minor impact on production workloads.

Fixes #16112

@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 17, 2017
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 17, 2017
@stevekuznetsov
Contributor

/unassign

@smarterclayton
Contributor Author

/retest

Backport the change that allows a global default watch cache size and
allows an individual watch cache to be disabled.
Remove some complexity in RESTOptionsGetter and add default watch cache
sizes for resources that are read by nodes.
Any resource named by the heuristics gets a watch cache by default.
Admins can restore the previous behavior by setting
`--default-watch-cache-size` to a positive integer. This significantly
reduces the total memory allocated on large clusters, at a minor cost
in CPU on the etcd process and an increase in network bandwidth to
etcd.
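
The interaction between a global default and per-resource settings described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `parseOverrides` and `cacheSizeFor` are hypothetical helpers, and the `resource#size` override format is an assumption loosely modeled on the upstream `--watch-cache-sizes` flag. A resolved size of 0 is treated as "no watch cache for this resource".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOverrides turns entries such as "pods#1000" into a map of
// resource name to cache size. Hypothetical helper; the "resource#size"
// format is an assumption made for this sketch.
func parseOverrides(entries []string) (map[string]int, error) {
	overrides := make(map[string]int)
	for _, e := range entries {
		parts := strings.SplitN(e, "#", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid override %q: expected resource#size", e)
		}
		size, err := strconv.Atoi(parts[1])
		if err != nil {
			return nil, fmt.Errorf("invalid size in %q: %v", e, err)
		}
		if size < 0 {
			return nil, fmt.Errorf("negative size in %q", e)
		}
		overrides[parts[0]] = size
	}
	return overrides, nil
}

// cacheSizeFor returns the watch cache size for a resource and whether a
// cache should be created at all. An explicit override wins over the
// global default; a resulting size of 0 means the cache is disabled.
func cacheSizeFor(resource string, defaultSize int, overrides map[string]int) (int, bool) {
	size := defaultSize
	if s, ok := overrides[resource]; ok {
		size = s
	}
	return size, size > 0
}

func main() {
	overrides, err := parseOverrides([]string{"pods#1000", "events#0"})
	if err != nil {
		panic(err)
	}
	// With a global default of 0, only resources with a positive
	// override end up with a watch cache.
	for _, r := range []string{"pods", "events", "configmaps"} {
		size, enabled := cacheSizeFor(r, 0, overrides)
		fmt.Printf("%s: size=%d enabled=%t\n", r, size, enabled)
	}
}
```

Under these assumptions, raising the default to a positive integer re-enables caching for every resource that does not carry an explicit override of 0.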
@smarterclayton
Contributor Author

/retest

@@ -150,7 +148,7 @@ func BuildKubeAPIserverOptions(masterConfig configapi.MasterConfig) (*kapiserver
 	server.Etcd.StorageConfig.KeyFile = masterConfig.EtcdClientInfo.ClientCert.KeyFile
 	server.Etcd.StorageConfig.CertFile = masterConfig.EtcdClientInfo.ClientCert.CertFile
 	server.Etcd.StorageConfig.CAFile = masterConfig.EtcdClientInfo.CA
-	server.Etcd.DefaultWatchCacheSize = DefaultWatchCacheSize
+	server.Etcd.DefaultWatchCacheSize = 0
Contributor

This is what is setting us to "off by default", right?

Contributor Author

Correct

@@ -507,6 +505,20 @@ func buildKubeApiserverConfig(
 		return originLongRunningRequestRE.MatchString(r.URL.Path) || kubeLongRunningFunc(r, requestInfo)
 	}

+	if apiserverOptions.Etcd.EnableWatchCache {
+		glog.V(2).Infof("Initializing cache sizes based on %dMB limit", apiserverOptions.GenericServerRunOptions.TargetRAMMB)
+		sizes := cachesize.NewHeuristicWatchCacheSizes(apiserverOptions.GenericServerRunOptions.TargetRAMMB)
Contributor

do we set this target RAMMB to anything by default?

Contributor

do we set this target RAMMB to anything by default?

I'm not seeing where we write a default here, which would set all the heuristic ones to zero by default, right?

Contributor Author

There's a "min" function on the heuristic so we always get something even at 0
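
A heuristic with a floor of this kind could look roughly like the sketch below. It is an illustration only: `heuristicSizes`, the resource names, the 60 MB-per-node estimate, the multipliers, and the floor of 1000 are all assumptions chosen for the example, not values confirmed by this PR. The "min" mentioned above is read here as a minimum size, so the code takes the larger of the computed value and the floor.

```go
package main

import "fmt"

// maxInt returns the larger of a and b.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// heuristicSizes derives per-resource watch cache sizes from a RAM
// target in MB. Hypothetical function; every constant below is an
// illustrative assumption.
func heuristicSizes(targetRAMMB int) map[string]int {
	// Assumption: estimate cluster size at roughly 60 MB of RAM per node.
	clusterSize := targetRAMMB / 60

	// Each size is the larger of a scaled estimate and a fixed floor,
	// so a target of 0 still yields a usable cache size.
	return map[string]int{
		"nodes":     maxInt(5*clusterSize, 1000),
		"pods":      maxInt(50*clusterSize, 1000),
		"endpoints": maxInt(10*clusterSize, 1000),
	}
}

func main() {
	for _, ram := range []int{0, 120000} {
		sizes := heuristicSizes(ram)
		fmt.Printf("targetRAMMB=%d nodes=%d pods=%d endpoints=%d\n",
			ram, sizes["nodes"], sizes["pods"], sizes["endpoints"])
	}
}
```

With these illustrative constants, an unset (zero) RAM target falls back to the floor for every listed resource, which matches the point that the heuristic always returns something even at 0.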

@smarterclayton
Contributor Author

Anything else?

@smarterclayton
Contributor Author

ping @deads2k

@deads2k
Contributor

deads2k commented Sep 25, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 25, 2017
@openshift-merge-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, smarterclayton

Needs approval from an approver in each of these OWNERS Files:
  • OWNERS [deads2k,smarterclayton]

@openshift-merge-robot
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-merge-robot
Contributor

Automatic merge from submit-queue (batch tested with PRs 16546, 16398, 16157)

@openshift-merge-robot openshift-merge-robot merged commit fe04a6f into openshift:master Sep 26, 2017
@openshift-ci-robot

openshift-ci-robot commented Sep 26, 2017

@smarterclayton: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/openshift-jenkins/extended_conformance_gce | 01aeb23 | link | `/test extended_conformance_gce` |

@jeremyeder
Contributor

@openshift/svt

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants