✨ Enable configuring the kube config timeout #8865

cnmcavoy · 2023-06-15T17:09:44Z

What this PR does / why we need it:

Allows configuring the Kubernetes client configuration timeout. Also moves the other related client configuration into a standard place and reduces duplicated logic in the various main functions.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #8864

k8s-ci-robot · 2023-06-15T17:09:52Z

Welcome @cnmcavoy!

It looks like this is your first PR to kubernetes-sigs/cluster-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot · 2023-06-15T17:09:53Z

Hi @cnmcavoy. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2023-06-15T17:09:54Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cecilerobertmichon for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

nojnhuh · 2023-06-16T19:34:58Z

controllers/remote/flags.go

+	restConfig := ctrl.GetConfigOrDie()
+	restConfig.QPS = restConfigQPS
+	restConfig.Burst = restConfigBurst


Is there a reason Timeout is not also defined here?

cnmcavoy · 2023-06-20

The Kubernetes clients used for the management clusters never had a timeout assigned in any of the various main.go files, and I didn't want to introduce an unintended change in behavior.

If we prefer consistency, I can add the timeout here as well.

nojnhuh · 2023-06-16T19:38:54Z

main.go

-
-	fs.IntVar(&restConfigBurst, "kube-api-burst", 30,
-		"Maximum number of queries that should be allowed in one burst from the controller client to the Kubernetes API server. Default 30")
+	remote.AddRestConfigFlags(fs)


This helper is very obviously pulling its weight! 🙌

nojnhuh · 2023-06-19T04:26:52Z

/ok-to-test

fabriziopandini · 2023-06-20T09:45:46Z

/hold
If I'm not wrong the solution proposed is not addressing the problem being discussed (making the timeout for draining workload clusters configurable), but instead it configures the timeout for management cluster clients.
see also #8864 (comment)

killianmuldoon · 2023-06-20T09:50:03Z

If I'm not wrong the solution proposed is not addressing the problem being discussed (making the timeout for draining workload clusters configurable), but instead it configures the timeout for management cluster clients.

Agreed this doesn't address the issue with drain, but this does configure both workload and management cluster clients AFAIK - just not for drain.

fabriziopandini · 2023-06-20T11:05:28Z

Agreed this doesn't address the issue with drain, but this does configure both workload and management cluster clients AFAIK - just not for drain.

I'm not sure this applies to workload clusters, could you kindly point me to where this is happening

killianmuldoon · 2023-06-20T11:10:37Z

I'm not sure this applies to workload clusters, could you kindly point me to where this is happening

You can see this on the change in #8882. The defaultClientTimeout is set in the remote.RESTConfig function. This is called in a number of places to create clients for workload clusters, including in the ClusterCacheTracker's createAccessor here:

cluster-api/controllers/remote/cluster_cache_tracker.go

Line 275 in ed31880

config, err := RESTConfig(ctx, t.controllerName, t.client, cluster)

sbueringer · 2023-06-20T11:29:52Z

This feels like a very implicit way of configuring the ClusterCacheTracker

cnmcavoy · 2023-06-20T18:45:22Z

This feels like a very implicit way of configuring the ClusterCacheTracker

Can you clarify? I'm not really sure what you mean.

If I'm not wrong the solution proposed is not addressing the problem being discussed (making the timeout for draining workload clusters configurable), but instead it configures the timeout for management cluster clients.

I commented over in the issue, but the HTTP timeout is very much what we are interested in. If we feel that making the QPS flags consistent is an unwanted change, I can revert that part and make this PR only the timeout.

sbueringer · 2023-06-21T09:34:08Z

This feels like a very implicit way of configuring the ClusterCacheTracker

Can you clarify? I'm not really sure what you mean.

Sorry, that was not very clear or actionable feedback :). What I meant is that I don't like introducing flags that write to package-level global variables and thereby affect the behavior of the ClusterCacheTracker and the RESTConfig func.

There are folks who use CAPI as a library and use ClusterCacheTracker directly. I don't want to force them to use the flags in order to configure the timeouts.

I would prefer something like what we did with flags.TLSOptions: provide a util to define the flags and then explicitly hand the resulting options over (in the TLSOptions case to the webhook server, in our case here to the ClusterCacheTracker).

If I'm not wrong the solution proposed is not addressing the problem being discussed (making the timeout for draining workload clusters configurable), but instead it configures the timeout for management cluster clients.

I commented over in the issue, but the HTTP timeout is very much what we are interested in. If we feel that making the QPS flags consistent is an unwanted change, I can revert that part and make this PR only the timeout.

I think you're right that this would change the timeout used for draining (but also more). Let's continue the discussion on the issue and once we have consensus come back to the PR.

cnmcavoy · 2023-06-26T17:44:16Z

Closed in preference to #8917

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 15, 2023

k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 15, 2023

k8s-ci-robot requested review from jackfrancis and JoelSpeed June 15, 2023 17:09

cnmcavoy force-pushed the cnmcavoy/configurable-kube-config branch from c85cc2d to 1339b83 Compare June 15, 2023 17:15

nojnhuh reviewed Jun 16, 2023

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 19, 2023

killianmuldoon mentioned this pull request Jun 19, 2023

[WIP] 🐛 Set defaultClientTimeout to 0 for external clients #8882

Closed

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 20, 2023

Enable configuring the kube config timeout

2f681d3

cnmcavoy force-pushed the cnmcavoy/configurable-kube-config branch from 1339b83 to 2f681d3 Compare June 20, 2023 18:42

cnmcavoy mentioned this pull request Jun 21, 2023

Allow configuring kube client timeouts #8864

Closed

cnmcavoy closed this Jun 26, 2023
