-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🌱 Implement e2e test for clusterctl upgrade #3708
🌱 Implement e2e test for clusterctl upgrade #3708
Conversation
/area testing /milestone v0.4.0 |
4fd507a
to
134b7af
Compare
/test pull-cluster-api-e2e-full-main |
134b7af
to
9c288cb
Compare
9c288cb
to
1dbc0aa
Compare
/test pull-cluster-api-e2e-full-main |
/retest |
/test pull-cluster-api-e2e-full-main |
1dbc0aa
to
6100c5a
Compare
/test pull-cluster-api-e2e-full-main |
@wfernandes @srm09 if you have some spare time PTAL |
6100c5a
to
66d342e
Compare
/test pull-cluster-api-e2e-full-main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to run the test on my system and saw the following error:
Should create a management cluster and then upgrade all the providers [It]
/Users/wfernandes/workspace/cluster-api/test/e2e/clusterctl_upgrade.go:79
Failed to run clusterctl config cluster
Unexpected error:
<*yamlprocessor.errMissingVariables | 0xc000ac6200>: {
Missing: ["CNI_RESOURCES"],
}
value for variables [CNI_RESOURCES] is not set. Please set the value using os environment variables or the clusterctl config file
occurred
Do I need to configure some variables for this E2E test?
66d342e
to
6fc8b94
Compare
@wfernandes the error was due to a conflicting merge with #3896 recently merged; now I have rebased, so everything should be fine /test pull-cluster-api-e2e-full-main |
The test error seems unrelated to this PR... |
For some reason when I run
I'll try and investigate some more but it's not immediately clear as to what is happening. Any thoughts @fabriziopandini ? |
d12ba1e
to
e1a622f
Compare
/test pull-cluster-api-e2e-main |
e1a622f
to
706caea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the approach looks good to me.
@fabriziopandini Should I do another round of review or wait until the test is green or you ping again? |
8b19078
to
4bf0c08
Compare
4bf0c08
to
bb2fbfe
Compare
@@ -219,7 +219,7 @@ func (p *providerComponents) Delete(options DeleteOptions) error { | |||
|
|||
func (p *providerComponents) DeleteWebhookNamespace() error { | |||
log := logf.Log | |||
log.V(5).Info("Deleting %s namespace", repository.WebhookNamespaceName) | |||
log.V(5).Info("Deleting", "namespace", repository.WebhookNamespaceName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an error in clusterctl detected by this test
targetPort: 9443 | ||
selector: | ||
control-plane: controller-manager | ||
targetPort: webhook-server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an inconsistency in CAPD vs other providers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide some more context what the inconsistency is, why we have it and why we are dropping the selector now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly for consistency across all the providers in the CAPI codebase, eg.
- All the provider are using the named port, CAPD was using the port number
- All the provider are not using a selector (they rely on the labels added by kustomize), CAPD had the selector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it, so we're consistent now after this change. Wasn't sure ;).
/test pull-cluster-api-e2e-full-main @vincepri @sbueringer this test already helped in finding a controller runtime defect and a clusterctl error; the job passed when running alone, I'm now testing with e2e-full, but we should now be ready to go |
@fabriziopandini Great! I put it on my review list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly nits. I think relative to the complexity of the test it reads very nicely! :)
@@ -82,6 +85,61 @@ func Init(ctx context.Context, input InitInput) { | |||
Expect(err).ToNot(HaveOccurred(), "failed to run clusterctl init") | |||
} | |||
|
|||
// InitWithBinary uses clusterctl binary to run init with the list of providers defined in the local repository. | |||
func InitWithBinary(_ context.Context, binary string, input InitInput) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the idea behind the context param here (and the other funcs in this file is). Maybe just hand it over in case we need it later on?
I imagine at least in the funcs with binary we will probably never use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to have some context with timeout percolating down the code, and potentially down to the exec command as well...
targetPort: 9443 | ||
selector: | ||
control-plane: controller-manager | ||
targetPort: webhook-server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide some more context what the inconsistency is, why we have it and why we are dropping the selector now?
@fabriziopandini: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/lgtm Just to confirm. No changes in test-infra required as this test will just run with our other tests in pull-cluster-api-e2e-full-main? |
Yes! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vincepri The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
This PR implements a new E2E test providing a signal on clusterctl upgrades
Which issue(s) this PR fixes:
Fixes #3690