fix controller can't restart in helm for dependent secret not found #5305

Merged
merged 1 commit into from
Aug 28, 2024


Member

@chaosi-zju chaosi-zju commented Aug 6, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

Fix karmada-controller-manager being unable to restart in Helm installations because a dependent Secret is not found.

In the Helm installation method, karmada-controller-manager uses an initContainer to wait until karmada-apiserver is ready, which keeps karmada-controller-manager out of CrashLoopBackOff. This feature was introduced in #5010.

To access the host cluster's kube-apiserver from the initContainer, we mount a service-account-token type Secret, because the karmada-controller-manager Deployment sets automountServiceAccountToken: false. Disabling automountServiceAccountToken was introduced in #2523.

However, #5010 deletes that Secret once installation finishes. We still need the Secret after installation; without it, karmada-controller-manager cannot restart because the mounted Secret is missing.
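
To illustrate the arrangement described above, here is a minimal sketch rather than the chart's actual templates. The Secret name, ServiceAccount name, namespace, images, and placeholder command are all assumptions made for illustration. It shows why deleting the Secret breaks restarts: the Pod references the Secret as a volume, so a new Pod cannot start once the Secret is gone.

```yaml
# Illustrative only: every name and image below is hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: example-sa-token                  # hypothetical Secret name
  namespace: karmada-system               # assumed namespace
  annotations:
    # Binds the token to a ServiceAccount (hypothetical name).
    kubernetes.io/service-account.name: example-sa
type: kubernetes.io/service-account-token
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: karmada-controller-manager
  namespace: karmada-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: karmada-controller-manager
  template:
    metadata:
      labels:
        app: karmada-controller-manager
    spec:
      # No token is mounted automatically, so it is mounted by hand below.
      automountServiceAccountToken: false
      initContainers:
        - name: wait                       # hypothetical
          image: example.com/kubectl:latest   # placeholder image
          command: ["sh", "-c", "echo waiting; exit 0"]   # placeholder command
          volumeMounts:
            - name: sa-token
              mountPath: /var/run/secrets/kubernetes.io/serviceaccount
              readOnly: true
      containers:
        - name: karmada-controller-manager
          image: example.com/karmada-controller-manager:latest   # placeholder image
      volumes:
        - name: sa-token
          secret:
            # If this Secret is deleted after installation, a restarted Pod
            # cannot mount the volume and stays stuck before starting.
            secretName: example-sa-token
```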

Which issue(s) this PR fixes:

Fixes #5233

Special notes for your reviewer:

Target installation order in Helm after this PR:

  1. Deploy etcd.
  2. Deploy karmada-apiserver (its initContainer keeps checking etcd connectivity with the curl command, waiting for etcd to be ready).
  3. Deploy Job/karmada-static-resource, which deploys static resources such as CRDs (it uses the kubectl rollout status command to wait for karmada-apiserver to be ready). When it finishes, it writes a ConfigMap to karmada-apiserver.
  4. Deploy the other components (each has an initContainer that uses the kubectl get command to wait for the above ConfigMap to exist, which means Job/karmada-static-resource has finished applying the current version's CRDs to karmada-apiserver).
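
Step 4 above can be sketched as a Pod spec fragment like the following. This is an assumption-laden illustration, not the chart's real template: the container name, image, ConfigMap name, namespace, and kubeconfig path are hypothetical, and a `kubeconfig` volume pointing at karmada-apiserver is assumed to be defined elsewhere in the Pod spec.

```yaml
# Fragment of a Pod spec; all names and paths are hypothetical.
initContainers:
  - name: wait-for-static-resources
    image: example.com/kubectl:latest        # placeholder image
    command:
      - /bin/sh
      - -c
      - |
        # Poll karmada-apiserver until the marker ConfigMap exists.
        # kubectl get exits non-zero while the ConfigMap is missing.
        until kubectl --kubeconfig /etc/kubeconfig \
            get configmap example-version-marker -n karmada-system; do
          echo "waiting for static resources to be applied"
          sleep 2
        done
    volumeMounts:
      - name: kubeconfig                     # assumed to be defined under volumes
        mountPath: /etc/kubeconfig
        subPath: kubeconfig
```

Because the initContainer only exits once the ConfigMap is present, the main container starts after the static resources have been applied.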

Does this PR introduce a user-facing change?:

fix controller can't restart in helm for dependent secret not found

@karmada-bot karmada-bot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 6, 2024
@karmada-bot karmada-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 6, 2024
@chaosi-zju
Member Author

/cc @XiShanYongYe-Chang please help a review

@karmada-bot
Collaborator

@chaosi-zju: GitHub didn't allow me to request PR reviews from the following users: please, a, review.

Note that only karmada-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @XiShanYongYe-Chang please help a review

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@codecov-commenter

codecov-commenter commented Aug 6, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 30.82%. Comparing base (608af76) to head (5710883).
Report is 11 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5305      +/-   ##
==========================================
- Coverage   31.06%   30.82%   -0.24%     
==========================================
  Files         639      640       +1     
  Lines       44343    44414      +71     
==========================================
- Hits        13774    13691      -83     
- Misses      29573    29744     +171     
+ Partials      996      979      -17     
Flag Coverage Δ
unittests 30.82% <ø> (-0.24%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@XiShanYongYe-Chang
Member

Hi @chaosi-zju, the chart lint check has failed.

@chaosi-zju chaosi-zju force-pushed the helm-0627 branch 2 times, most recently from 267c42f to 1661e5a Compare August 8, 2024 11:50
@chaosi-zju
Member Author

/retest

@XiShanYongYe-Chang
Member

Asking @calvin0327 and @zhzhuang-zju to help review.
/cc @calvin0327 @zhzhuang-zju

Member

@iawia002 iawia002 left a comment


Basically LGTM

charts/karmada/templates/_helpers.tpl Outdated Show resolved Hide resolved
@chaosi-zju chaosi-zju force-pushed the helm-0627 branch 2 times, most recently from e678b33 to ae57f84 Compare August 27, 2024 08:07
Member

@RainbowMango RainbowMango left a comment


/assign

Member

@RainbowMango RainbowMango left a comment


  1. I just realized it's a mistake for the initContainer to hold the Secret of the host cluster.
  2. It feels a little bit tricky to have a ConfigMap represent the Karmada version, but I don't have a better idea.

Given that this patch is an improvement, we can move forward even if it is not the ideal solution.

This is much better than the solution in #5150. Thanks.

charts/karmada/templates/_helpers.tpl Outdated Show resolved Hide resolved
charts/karmada/templates/_helpers.tpl Outdated Show resolved Hide resolved
charts/karmada/templates/karmada-static-resource-job.yaml Outdated Show resolved Hide resolved
charts/karmada/templates/karmada-static-resource-job.yaml Outdated Show resolved Hide resolved
@chaosi-zju
Member Author

@RainbowMango comments fixed.

Besides, there is one more thing I am unsure about: after this PR is merged, post-install-job.yaml does only one thing, namely:

kubectl delete job {{ $name }}-static-resource -n {{ $namespace }}

Actually, we could set ttlSecondsAfterFinished: 0 directly on that Job so that it is removed automatically. In that case, post-install-job.yaml becomes redundant and can be removed.

However, I am worried that after deleting the file post-install-job.yaml, it will be troublesome to get this file back if we want to add some logic to this post-install hook point in the future.
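
For reference, the TTL idea mentioned above would look roughly like this. It is a minimal sketch assuming the standard `ttlSecondsAfterFinished` field of a `batch/v1` Job; the Job name, image, and command are hypothetical placeholders rather than the chart's content.

```yaml
# Illustrative Job; name, image, and command are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-static-resource
spec:
  # A finished Job becomes eligible for automatic deletion immediately,
  # so no separate post-install hook is needed to delete it.
  ttlSecondsAfterFinished: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: apply
          image: example.com/kubectl:latest
          command: ["sh", "-c", "echo apply static resources"]
```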

@RainbowMango
Member

However, I am worried that after deleting the file post-install-job.yaml, it will be troublesome to get this file back if we want to add some logic to this post-install hook point in the future.

Why can't we get this file back?

Member

@RainbowMango RainbowMango left a comment


/lgtm
/approve

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 28, 2024
@karmada-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 28, 2024
@karmada-bot karmada-bot merged commit b51840e into karmada-io:master Aug 28, 2024
13 checks passed
@RainbowMango RainbowMango added this to the v1.11 milestone Aug 31, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

karmada-controller-manager can't restart in helm installation for dependent secret not found
6 participants