Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proposal for Async Azure Resource Creation and Deletion #1541

Merged

Conversation

CecileRobertMichon
Copy link
Contributor

What type of PR is this?

What this PR does / why we need it: This is a proposal to formalize some of the work that was already done in #1067, and to extend it to other Azure PUT/DELETE operations in the codebase. Working on a POC in parallel but opening this up to get some feedback asynchronously (😉 ).

Credit goes to @devigned for kickstarting this effort with AzureMachinePools, and to @shysank for reflections in #1181.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 20, 2021
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 20, 2021
@CecileRobertMichon
Copy link
Contributor Author

Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just have a couple comments and a small nit. I really like the tie between long running operation state and conditions, though it might be useful to provide a little more perscriptive guidance on whether or not a long running operation would tie to a given condition. For example, would a condition state be tied 1 to 1 with a long running operation and would that state change with the state of a long running operation.

Great work! This has the potential to make a significant improvement in the reactivity of the system and the experience of a user interacting with the controller.

docs/Makefile Outdated Show resolved Hide resolved
@CecileRobertMichon
Copy link
Contributor Author

though it might be useful to provide a little more perscriptive guidance on whether or not a long running operation would tie to a given condition. For example, would a condition state be tied 1 to 1 with a long running operation and would that state change with the state of a long running operation.

Will add. My current thinking is that conditions should not be tied 1 to 1 with a long running operation, although we might have conditions that summarize the state of 1 or more resources created/deleted via long running operations. So long-running operations conditions might be a subset of all conditions.

I'll think about this further and come up with an initial list of added conditions.


The 5s timeout, which is quite aggressive, would be per operation (each call). Then we'd have another overall reconcile loop timeout, which already exists. Currently, that timeout is set at 90 minutes (which is also the ARM deployment timeout). With this proposal, we could bring that overall timeout down to something more responsive, such as 30 seconds.

The specific numbers are not set in stone, and should be revised after doing some performance testing with different values and calculating the P99 expected completion time of operations that are not long-running.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm expecting the numbers to become more aggressive with some testing, maybe closer to 2s per call / 15 seconds overall

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 23, 2021
Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the updates. This is looking really solid. I'm added a couple comments and questions.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 26, 2021
Copy link
Contributor

@shysank shysank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CecileRobertMichon for formalizing the ideas. Looks good overall with a couple of nits/clarifications.

@CecileRobertMichon
Copy link
Contributor Author

There haven't been any comments on this for a while, I'd like to start a lazy consensus for end of next week if no other comments get added by then.

/hold for lazy consensus

/cc @jackfrancis @alexeldeib @mboersma in case you haven't had a chance to review this yet

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 12, 2021
Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Great work!

@nader-ziada
Copy link
Contributor

a little bit late, but also lgtm 🎉

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 23, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: devigned

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 23, 2021
@k8s-ci-robot k8s-ci-robot merged commit a8f7d33 into kubernetes-sigs:main Aug 23, 2021
@CecileRobertMichon CecileRobertMichon mentioned this pull request Sep 1, 2021
3 tasks
@shysank shysank mentioned this pull request Sep 20, 2021
23 tasks
@Jont828 Jont828 mentioned this pull request Dec 10, 2021
3 tasks
@CecileRobertMichon CecileRobertMichon deleted the async-proposal branch February 17, 2023 23:24
@Jont828 Jont828 mentioned this pull request Aug 4, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants