This repository has been archived by the owner on Apr 25, 2023. It is now read-only.
feat: introduce informer cache sync timeout #1460
Merged
k8s-ci-robot
merged 4 commits into
kubernetes-retired:master
from
zqzten:cache_sync_timeout
Oct 21, 2021
k8s-ci-robot
added
cncf-cla: yes
Indicates the PR's author has signed the CNCF CLA.
size/L
Denotes a PR that changes 100-499 lines, ignoring generated files.
labels
Oct 15, 2021
zqzten
force-pushed
the
cache_sync_timeout
branch
from
October 15, 2021 05:22
8b9ff93
to
c1accac
Compare
zqzten
force-pushed
the
cache_sync_timeout
branch
from
October 15, 2021 05:53
c1accac
to
bfd7751
Compare
mars1024
approved these changes
Oct 18, 2021
/lgtm
friendly ping @hectorj2f
/assign @xunpan
hectorj2f
reviewed
Oct 18, 2021
hectorj2f
reviewed
Oct 18, 2021
friendly ping @xunpan for approval or further opinions
hectorj2f
approved these changes
Oct 21, 2021
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hectorj2f, mars1024, zqzten
k8s-ci-robot
added
the
approved
Indicates a PR has been approved by an approver from all required OWNERS files.
label
Oct 21, 2021
Labels
approved
Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes
Indicates the PR's author has signed the CNCF CLA.
lgtm
Indicates that a PR is ready to be merged.
size/L
Denotes a PR that changes 100-499 lines, ignoring generated files.
What this PR does / why we need it:
According to this issue of controller-runtime, an informer cache sync timeout is needed to prevent controllers from blocking indefinitely. KubeFed has far more informers than a simple controller, and a cache sync problem in any one of them blocks the whole reconcile loop (we have encountered this in our prod env several times).
This PR introduces a configurable informer cache sync timeout to the core controllers of KubeFed. If a controller's informers cannot sync their caches within this timeout, the controller errors out. As a result, a KubeFed controller never keeps running without doing any work, which helps users discover watch/list problems in time and gives other working replicas a chance to run.
This PR also adds logs to the `ClustersSynced` check to help find out which member cluster's informer cache sync is blocking.
Which issue(s) this PR fixes:
Fixes #1459
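As an illustration of the per-cluster logging described above, the following sketch reports which member clusters are still unsynced. It is an assumption about the general shape, not KubeFed's actual `ClustersSynced` implementation: the function names, the map of cluster name to sync check, and the log message are all hypothetical.

```go
package main

import (
	"log"
	"sort"
)

// unsyncedClusters returns the sorted names of member clusters whose informer
// cache has not synced yet. The map keys are hypothetical cluster names.
func unsyncedClusters(synced map[string]func() bool) []string {
	var pending []string
	for name, hasSynced := range synced {
		if !hasSynced() {
			pending = append(pending, name)
		}
	}
	sort.Strings(pending) // stable output regardless of map iteration order
	return pending
}

// clustersSynced logs every lagging cluster by name and reports whether all
// clusters are synced.
func clustersSynced(synced map[string]func() bool) bool {
	pending := unsyncedClusters(synced)
	for _, name := range pending {
		log.Printf("informer cache for member cluster %q has not synced yet", name)
	}
	return len(pending) == 0
}
```

Naming the lagging cluster in the log means an operator can go straight to the affected member cluster instead of guessing which informer is stuck.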