Switch to using kubelet config file for all supported flags #10433

Open: brandond wants to merge 5 commits into master from kubelet-config-dir

brandond
Member

@brandond brandond commented Jun 29, 2024

Proposed Changes

Switch to using kubelet config file for all supported flags

Types of Changes

enhancement / tech debt

Verification

See linked issue

Testing

Linked Issues

User-Facing Change

Further Comments

Currently using a hacky string replace when marshaling the config file to work around an upstream issue:
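
A minimal sketch of the workaround described above, using only the standard library. It assumes the config has already been marshaled to YAML bytes. The `rewriteResolvKey` and `writeDefaults` names, the `program` parameter, and the sample input are hypothetical and not taken from the PR.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// rewriteResolvKey swaps the serialized key name in already-marshaled YAML.
// It is a plain byte substitution, so it would also rewrite the same text
// if it appeared inside a string value; that is what makes it a hack.
func rewriteResolvKey(b []byte) []byte {
	return bytes.ReplaceAll(b, []byte("resolvConf: "), []byte("resolverConfig: "))
}

// writeDefaults (hypothetical helper) writes the rewritten bytes to
// <dir>/00-<program>-defaults.conf with owner-only permissions.
func writeDefaults(dir, program string, b []byte) (string, error) {
	name := filepath.Join(dir, "00-"+program+"-defaults.conf")
	return name, os.WriteFile(name, rewriteResolvKey(b), 0600)
}

func main() {
	// Use a throwaway directory so the sketch never touches a real node.
	dir, err := os.MkdirTemp("", "kubelet.conf.d")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer os.RemoveAll(dir)

	in := []byte("kind: KubeletConfiguration\nresolvConf: /etc/resolv.conf\n")
	name, err := writeDefaults(dir, "k3s", in)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	out, err := os.ReadFile(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Print(string(out))
}
```

Substituting on the serialized bytes keeps the workaround in one place and easy to delete once the upstream key name is fixed, at the cost of not being aware of YAML structure.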

@brandond brandond requested a review from a team as a code owner June 29, 2024 07:52
@brandond brandond force-pushed the kubelet-config-dir branch 3 times, most recently from c6f443b to 5b85991 Compare June 30, 2024 08:00

codecov bot commented Jun 30, 2024

Codecov Report

Attention: Patch coverage is 66.48936% with 63 lines in your changes missing coverage. Please review.

Project coverage is 43.47%. Comparing base (b55aaeb) to head (0d6cc13).
Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/daemons/agent/agent.go | 71.05% | 31 Missing and 13 partials ⚠️ |
| pkg/daemons/agent/agent_linux.go | 48.27% | 13 Missing and 2 partials ⚠️ |
| pkg/agent/config/config.go | 40.00% | 2 Missing and 1 partial ⚠️ |
| pkg/etcd/etcd.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10433      +/-   ##
==========================================
- Coverage   47.83%   43.47%   -4.37%     
==========================================
  Files         181      181              
  Lines       18794    18906     +112     
==========================================
- Hits         8990     8219     -771     
- Misses       8452     9479    +1027     
+ Partials     1352     1208     -144     
| Flag | Coverage Δ |
|---|---|
| e2etests | 35.81% <66.48%> (-7.21%) ⬇️ |
| inttests | 18.62% <1.59%> (-16.20%) ⬇️ |
| unittests | 14.18% <0.00%> (-0.09%) ⬇️ |


manuelbuil
manuelbuil previously approved these changes Jul 1, 2024
dereknola
dereknola previously approved these changes Jul 8, 2024
Member

@dereknola dereknola left a comment

LGTM pending RKE2 cgroup checks

@brandond
Member Author

I'm going to hold on this until the August release cycle, as I'd like more lead time on testing it.

@brandond brandond marked this pull request as draft July 10, 2024 22:03
@brandond brandond changed the title Switch to using kubelet config file for all supported flags [wip] Switch to using kubelet config file for all supported flags Jul 10, 2024
@brandond brandond force-pushed the kubelet-config-dir branch 2 times, most recently from bf8a25f to c9b86b7 Compare September 24, 2024 22:59
@brandond brandond force-pushed the kubelet-config-dir branch from c9b86b7 to c0a241f Compare October 4, 2024 19:44
@brandond
Member Author

brandond commented Oct 5, 2024

@dereknola I have confirmed this works on Windows, using RKE2: rancher/rke2#6909

It looks like RKE2 adds some extra CLI flags to the kubelet, so those will need to be migrated over to config as well once this is merged: https://github.com/rancher/rke2/blob/b696280a35f87bb260f26b256d7be839f1945535/pkg/pebinaryexecutor/pebinary.go#L146-L168

root@dev02:~# kubectl get node -o wide
NAME    STATUS   ROLES                       AGE     VERSION          INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                  KERNEL-VERSION     CONTAINER-RUNTIME
dev02   Ready    control-plane,etcd,master   3h42m   v1.31.1+rke2r1   10.0.1.184    <none>        Ubuntu 24.04 LTS                          6.8.0-45-generic   containerd://1.7.21-k3s2
win02   Ready    <none>                      3h31m   v1.31.1          10.0.1.224    <none>        Windows Server 2022 Standard Evaluation   10.0.20348.2113    containerd://1.7.21-k3s2
Running kubelet --alsologtostderr=false --cloud-provider=external --config-dir=C:\var\lib\rancher\rke2\agent\etc\kubelet.conf.d --feature-gates=CloudDualStackNodeIPs=true --hostname-override=win02 --kubeconfig=C:\var\lib\rancher\rke2\agent\kubelet.kubeconfig --log-file=\var\lib\rancher\rke2\agent\logs\kubelet.log --log-file-max-size=50 --logtostderr=false --node-ip=10.0.1.224 --node-labels= --stderrthreshold=FATAL
Running RKE2 kubelet [--cgroups-per-qos=false --enforce-node-allocatable= --file-check-frequency=5s --hairpin-mode=promiscuous-bridge --resolv-conf= --sync-frequency=30s --cloud-provider=external --config-dir=C:\var\lib\rancher\rke2\agent\etc\kubelet.conf.d --feature-gates=CloudDualStackNodeIPs=true --hostname-override=win02 --kubeconfig=C:\var\lib\rancher\rke2\agent\kubelet.kubeconfig --node-ip=10.0.1.224 --node-labels=
PS C:\Users\Administrator> c:\usr\local\bin\rke2.exe --version
rke2.exe version v1.31.1+dev.045680fc (045680fc54b2e82cbb5491c9a916655535cd8fb7)
go version go1.22.6

PS C:\Users\Administrator> dir C:\var\lib\rancher\rke2\agent\etc\kubelet.conf.d\


    Directory: C:\var\lib\rancher\rke2\agent\etc\kubelet.conf.d


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         10/5/2024  12:51 AM           1516 00-rke2-defaults.conf


PS C:\Users\Administrator> cat C:\var\lib\rancher\rke2\agent\etc\kubelet.conf.d\00-rke2-defaults.conf
address: 0.0.0.0
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: C:\var\lib\rancher\rke2\agent\client-ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: cgroupfs
clusterDNS:
- 10.43.0.10
clusterDomain: cluster.local
containerRuntimeEndpoint: npipe:////./pipe/containerd-containerd
cpuManagerReconcilePeriod: 10s
evictionHard:
  imagefs.available: 5%
  nodefs.available: 5%
evictionMinimumReclaim:
  imagefs.available: 10%
  nodefs.available: 10%
evictionPressureTransitionPeriod: 5m0s
failSwapOn: false
fileCheckFrequency: 20s
healthzBindAddress: 127.0.0.1
httpCheckFrequency: 20s
imageMaximumGCAge: 0s
imageMinimumGCAge: 2m0s
kind: KubeletConfiguration
logging:
  flushFrequency: 5s
  format: text
  options:
    json:
      infoBufferSize: "0"
    text:
      infoBufferSize: "0"
  verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 5m0s
nodeStatusUpdateFrequency: 10s
resolverConfig: C:\var\lib\rancher\rke2\agent\etc\resolv.conf
runtimeRequestTimeout: 2m0s
serializeImagePulls: false
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: C:\var\lib\rancher\rke2\agent\pod-manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
tlsCertFile: C:\var\lib\rancher\rke2\agent\serving-kubelet.crt
tlsPrivateKeyFile: C:\var\lib\rancher\rke2\agent\serving-kubelet.key
volumeStatsAggPeriod: 1m0s

@brandond brandond marked this pull request as ready for review October 5, 2024 08:33
@brandond brandond requested review from dereknola, manuelbuil and a team October 5, 2024 08:41
@brandond brandond changed the title [wip] Switch to using kubelet config file for all supported flags Switch to using kubelet config file for all supported flags Oct 5, 2024
galal-hussein
galal-hussein previously approved these changes Oct 7, 2024
@brandond
Member Author

Moving this back to WIP until closer to 1.32.

@brandond brandond marked this pull request as draft October 10, 2024 21:27
@brandond brandond changed the title Switch to using kubelet config file for all supported flags [wip] Switch to using kubelet config file for all supported flags Oct 10, 2024
@caroline-suse-rancher
Contributor

Hey @brandond, are we still planning to do this for 1.32?

@brandond
Member Author

brandond commented Dec 4, 2024

I am watching what upstream does with the drop-in config dir feature in kubernetes/enhancements#3983. If it is on by default for 1.32 then I think we can move forward.

Makes logged output more consistent when k3s fails during initialization

Signed-off-by: Brad Davidson <[email protected]>
Expose actual error, so that we can tell if the deployment is not found or not ready/available

Signed-off-by: Brad Davidson <[email protected]>
@brandond brandond marked this pull request as ready for review December 19, 2024 01:19
@brandond brandond changed the title [wip] Switch to using kubelet config file for all supported flags Switch to using kubelet config file for all supported flags Dec 19, 2024
Comment on lines 158 to 164
// replace resolvConf with resolverConfig until Kubernetes 1.32
// ref: https://github.com/kubernetes/kubernetes/pull/127421
b = bytes.ReplaceAll(b, []byte("resolvConf: "), []byte("resolverConfig: "))
return os.WriteFile(filepath.Join(path, "00-"+version.Program+"-defaults.conf"), b, 0600)
Member

Does this need to be changed now that 1.32 is here?

Member Author

dropped.

Member Author

@brandond brandond Dec 20, 2024

ugh nope, still required, I put it back.

root@systemd-node-1:/# grep resolv /var/lib/rancher/k3s/agent/etc/kubelet.conf.d/00-k3s-defaults.conf
resolvConf: /var/lib/rancher/k3s/agent/etc/resolv.conf

root@systemd-node-1:/# kubectl get --raw /api/v1/nodes/systemd-node-1/proxy/configz 2>/dev/null | jq . | grep resolv
    "resolvConf": "/etc/resolv.conf",

root@systemd-node-1:/# echo -e "apiVersion: kubelet.config.k8s.io/v1beta1\nkind: KubeletConfiguration\nresolverConfig: /var/lib/rancher/k3s/agent/etc/resolv.conf" > /var/lib/rancher/k3s/agent/etc/kubelet.conf.d/99-resolv.conf

root@systemd-node-1:/# systemctl restart k3s

root@systemd-node-1:/# grep -rF resolv /var/lib/rancher/k3s/agent/etc/kubelet.conf.d/
/var/lib/rancher/k3s/agent/etc/kubelet.conf.d/00-k3s-defaults.conf:resolvConf: /var/lib/rancher/k3s/agent/etc/resolv.conf
/var/lib/rancher/k3s/agent/etc/kubelet.conf.d/99-resolv.conf:resolverConfig: /var/lib/rancher/k3s/agent/etc/resolv.conf

root@systemd-node-1:/# kubectl get --raw /api/v1/nodes/systemd-node-1/proxy/configz 2>/dev/null | jq . | grep resolv
    "resolvConf": "/var/lib/rancher/k3s/agent/etc/resolv.conf",
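
The manual `configz` check above could be scripted along these lines. This is a rough sketch, not part of the PR: it assumes the endpoint wraps the effective config in a top-level `kubeletconfig` object, and the `effectiveResolvConf` function and the `checkresolv` binary name are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// configz mirrors only the part of the response this check needs.
// The "kubeletconfig" wrapper key is an assumption about the endpoint output.
type configz struct {
	KubeletConfig struct {
		ResolvConf string `json:"resolvConf"`
	} `json:"kubeletconfig"`
}

// effectiveResolvConf extracts the resolv.conf path the kubelet reports
// it is actually using.
func effectiveResolvConf(data []byte) (string, error) {
	var c configz
	if err := json.Unmarshal(data, &c); err != nil {
		return "", err
	}
	return c.KubeletConfig.ResolvConf, nil
}

func main() {
	// Usage:
	//   kubectl get --raw /api/v1/nodes/<node>/proxy/configz | checkresolv <expected-path>
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checkresolv <expected-path>")
		os.Exit(2)
	}
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	got, err := effectiveResolvConf(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	if got != os.Args[1] {
		fmt.Fprintf(os.Stderr, "resolvConf is %q, expected %q\n", got, os.Args[1])
		os.Exit(1)
	}
	fmt.Println("ok:", got)
}
```

Comparing against the live `configz` value rather than the file on disk matters here, because the transcript shows the file containing the intended path while the kubelet still reported `/etc/resolv.conf`.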

@brandond brandond force-pushed the kubelet-config-dir branch 4 times, most recently from 0f93378 to 7f6911b Compare December 19, 2024 21:45
@brandond brandond requested review from dereknola and a team December 19, 2024 21:47
Signed-off-by: Brad Davidson <[email protected]>
Avoid "snapshot save already in progress" flake when snapshot reconcile from previous save is still in progress.

Signed-off-by: Brad Davidson <[email protected]>
5 participants