✨ clusterctl e2e tests #2236
Conversation
Hi @Arvinderpal. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@Arvinderpal many many thanks for taking on this task!
I'm +100 on getting E2E tests for clusterctl, but I defer to @chuckha to validate the overall approach for ensuring consistency with the other E2E tests.
/assign @chuckha
In the meantime, I did a first pass on the WIP from a clusterctl PoV.
// Create clusterctl.yaml
tmpDir = createTempDir()
cfgFile = createLocalTestClusterCtlConfig(tmpDir, "clusterctl.yaml", "DOCKER_SERVICE_DOMAIN: \"docker.cluster.local\"")
// Let's set up some variables for the workload cluster template
What about moving this up to line 66 and changing it to:
-> Defining variables for the workload cluster template; testing both variables from the clusterctl config file and variables from OS environment variables.
Not sure I see how moving this to line 66 changes anything. Are you saying I should define the same vars in both the config file and the environment?
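For illustration, a minimal sketch of covering both sources in the suite setup, reusing the helpers already in this PR (the DOCKER_POD_CIDRS variable name is an assumption, used here only to show the second source):

```go
// Sketch: one variable comes from the generated clusterctl.yaml,
// the other from an OS environment variable, so template processing
// is exercised against both sources.
tmpDir = createTempDir()
cfgFile = createLocalTestClusterCtlConfig(tmpDir, "clusterctl.yaml",
	"DOCKER_SERVICE_DOMAIN: \"docker.cluster.local\"")
Expect(os.Setenv("DOCKER_POD_CIDRS", "192.168.0.0/16")).To(Succeed())
```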
CheckAndWaitDeploymentExists(kindClient, "capi-kubeadm-bootstrap-system", "capi-kubeadm-bootstrap-controller-manager") | ||
CheckAndWaitDeploymentExists(kindClient, "capd-system", "capd-controller-manager") | ||
|
||
options := clusterctlclient.GetClusterTemplateOptions{ |
This requires a template available in the docker repository or as a local override, which is something that is not handled by the local override script; it also creates an external dependency, which I would like to avoid in order to make the test more predictable.
I guess we should wait for #2133 to be fixed and use a template that is self-contained in the test / stored in the test folder.
Yes, for the time being I'm just creating cluster-template.yaml in the docker overrides folder.
#2133 would be useful here, though I think just the filesystem repository should also work. I'll look into this more.
set -o nounset
set -o pipefail

REPO_ROOT=$(dirname "${BASH_SOURCE[0]}")/../../..
What about using git rev-parse --show-toplevel instead?
I used the cluster-api/scripts/ci-capd-e2e.sh approach.
var _ = BeforeSuite(func() {
	ctx = context.Background()
	// Docker image to load into the kind cluster for testing
	managerImage = os.Getenv("MANAGER_IMAGE")
It might be a silly question, but how does the locally built image get injected into the kind cluster?
Here:
kindCluster, err := kind.NewCluster(ctx, kindClusterName, clientgoscheme.Scheme, managerImage)
I build a local image and ensure the local overrides folder points to that image. For example:
docker/v0.3.0/infrastructure-components.yaml:502: image: gcr.io/arvinders-1st-project/docker-provider-manager-amd64:dev
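So, roughly, the flow looks like this (a sketch assembled only from lines already in this PR; the suite variables are the ones defined in BeforeSuite):

```go
// Sketch: the locally built manager image is read from the environment in
// BeforeSuite and passed to the framework, which side-loads it into the kind
// cluster; the docker local override then references that same image tag.
managerImage = os.Getenv("MANAGER_IMAGE")
kindCluster, err := kind.NewCluster(ctx, kindClusterName, clientgoscheme.Scheme, managerImage)
Expect(err).ToNot(HaveOccurred())
```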
/ok-to-test
I believe clusterctl is going through some more changes. Thanks for all this work! I'm sure we can build upon this once clusterctl changes slow down 😄
_, _, err = c.Init(initOpt)
Expect(err).ToNot(HaveOccurred())
// Confirm controllers exist
CheckAndWaitDeploymentExists(kindClient, "capi-system", "capi-controller-manager")
We have an e2e test framework with some convenience methods. We could use this one to verify that cert-manager is available:
cluster-api/test/framework/convenience.go (line 50 at 5cfd0f1)
func WaitForAPIServiceAvailable(ctx context.Context, mgmt Waiter, serviceName string) {
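For example, something like this sketch (assuming the kind cluster object satisfies the framework's Waiter interface; the cert-manager APIService name here is an assumption):

```go
// Sketch: block until the cert-manager webhook APIService reports Available
// before running clusterctl init against the mgmt cluster.
framework.WaitForAPIServiceAvailable(ctx, kindCluster, "v1beta1.webhook.cert-manager.io")
```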
cmd/clusterctl/test/e2e/helpers.go
"sigs.k8s.io/controller-runtime/pkg/client" | ||
) | ||
|
||
func CreateKindClusterAndClient() (*kind.Cluster, client.Client, error) { |
This may help in creating a kind cluster.
func NewCluster(ctx context.Context, name string, scheme *runtime.Scheme, images ...string) (*Cluster, error) {
Force-pushed from d1a197b to 9673250.
Please take another look. I have incorporated all the feedback. It should be possible for you to run the tests locally as well. For the workload cluster create test, you will need to specify a cluster template in the local-overrides folder for docker. Let me know if you need a copy of that.
"sigs.k8s.io/controller-runtime/pkg/client" | ||
) | ||
|
||
var _ = Describe("clusterctl config cluster", func() { |
Sure. However, currently I have these two tests in separate files. If you don't mind, I would like to keep it that way. At least to me, it makes for better organization to have the _init, _config, _upgrade, etc. tests in their respective files.
os.RemoveAll(tmpDir)
})

Context("using default infra and bootstrap provider", func() {
Done.
CheckAndWaitDeploymentExists(kindClient, "capi-kubeadm-bootstrap-system", "capi-kubeadm-bootstrap-controller-manager") | ||
CheckAndWaitDeploymentExists(kindClient, "capd-system", "capd-controller-manager") | ||
|
||
options := clusterctlclient.GetClusterTemplateOptions{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for the time being I'm just creating cluster-template.yaml
in the docker overrides folder.
#2133 would be useful here, though I think just the filesystem repository should also work. I'll look into this more.
@Arvinderpal I'll try and take a look at these as soon as I can. I would definitely appreciate a copy of the cluster-template for the workload cluster create test 🙂. Thanks.
Force-pushed from 9673250 to 9472b35.
@fabriziopandini Added a simple move test -- it moves the objects of a single node capd workload cluster from one mgmt cluster to another. It's passing for me.
Force-pushed from 9472b35 to 421c839.
@Arvinderpal thanks for adding the move test! Let's now consolidate this PR and get it merged by:
- making the test self-contained, using a local repository instead of local overrides
- combining the init and cluster config tests into a single test (to shorten the overall execution time)
- creating some utility funcs to make the code cleaner / to have some building blocks for creating new tests
})

Context("single node workload cluster", func() {
	It("should move all Cluster API objects to the new mgmt cluster, unpause the Cluster and delete all objects from previous mgmt cluster", func() {
This set of tests is strictly related to the cluster template. Does it make sense to move this into a separate function in the same file that defines the cluster template?
What about running the same test in the "from" cluster any time we create the cluster template?
@fabriziopandini As per your feedback:
- e2e tests are self-contained. They make use of the local repository in the artifacts folder.
- I have also refactored to create util funcs that init a mgmt cluster and create a workload cluster.
- The init test has been removed.
- I added a couple of delete tests as well.
@fabriziopandini FYI, the
In the next iteration we should make this consistent with #2294.
Force-pushed from e8ec819 to 66f0bcc.
Are there any docs to go along with this? I cloned the repo and ran
createTestWorkloadCluster(ctx, mgmtInfo, workloadInfo)
})

AfterEach(func() {
I noticed this tears down the kind cluster and spins the whole thing back up. Do you think it would be possible to create the kind cluster once and then install and remove components without completely tearing down and standing up the cluster?
So, I debated this for a bit. Initially I went with reusing the mgmt cluster for all tests; however, for that you need to be certain that clusterctl delete --all works 100%, or that you manually delete all state (and wait for the delete to complete). Given that spinning up a new kind cluster takes a small portion of the overall execution time and gives you a clean mgmt cluster, it seemed like the right way to go.
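A rough sketch of that per-spec lifecycle (the initMgmtCluster/teardownMgmtCluster helper names are assumptions used only to illustrate the shape; createTestWorkloadCluster is the helper already used in this PR):

```go
// Sketch: every spec gets its own freshly initialized mgmt cluster and tears
// it down unconditionally afterwards, so no reliance on clusterctl delete --all.
BeforeEach(func() {
	mgmtInfo = initMgmtCluster(ctx) // hypothetical helper: kind cluster + clusterctl init
	createTestWorkloadCluster(ctx, mgmtInfo, workloadInfo)
})

AfterEach(func() {
	teardownMgmtCluster(ctx, mgmtInfo) // hypothetical helper: delete kind cluster + temp files
})
```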
}
return nil
}, 3*time.Minute, 5*time.Second,
).ShouldNot(HaveOccurred())
Our convention for this would be .Should(BeNil()) or maybe .Should(Succeed()); in situations like this we tend to avoid the .ShouldNot(HaveOccurred()) pattern.
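So the tail of the Eventually block quoted above would read roughly like this (a sketch; the polled body is elided in the quoted diff):

```go
// Sketch: same polling loop, asserted with Succeed() per the convention above.
Eventually(func() error {
	// ... poll the deployment here, returning a non-nil error until it is ready ...
	return nil
}, 3*time.Minute, 5*time.Second).Should(Succeed())
```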
@chuckha Please see my comments. I'll message you on Slack to see why it's not running on your machine.
Force-pushed from 66f0bcc to 5df11f0.
Finally got a good slot to test it locally; some points should be addressed in follow-up PRs.
@fabriziopandini @chuckha Please see my latest commit. I added a README for anyone who wants to run the tests locally. I also generate the docker
Only two nits from my side; everything else is for the next iterations (as per the previous comment).
@@ -0,0 +1,17 @@
# Running the tests

./run-e2e.sh
I assume the hack should be called before run-e2e.sh, or am I wrong?
Yes, you're right. I have updated the docs. The docker override folder will have to be deleted, since we generate its YAMLs in the test script.
We can add auto-generation of all the other component YAMLs in a follow-up PR.
Force-pushed from 43ae2a1 to 5531675.
I can't get this running locally. From a clean checkout, this is what happens:
@chuckha the test is not self-contained now and it depends on running the clusterctl hack first (see comment #2236 (comment)). I'm prototyping on top of this PR so we can get the test to use the config file defined in the framework + other things in the framework like NewClusterForCAPD, so we can clean up some dependencies and make the experience more consistent across projects.
This has the problem of assuming host networking when trying to contact the workload cluster on CAPD. This assumption makes the e2es fail on OS X, where host networking is not available.
Force-pushed from 5531675 to 6a69ce8.
I updated README.md to reflect the lack of OS X support at the moment. Perhaps we can address this in a follow-up PR.
Sounds great, looks good to me :D! Thanks for helping me fix it locally.
/lgtm
@chuckha do you think we can merge this PR?
* Makes use of the capi e2e framework.
* Added a script to run clusterctl e2e tests. The script issues a make docker-build, sets up env vars to load the image into kind, and sets up various clusterctl files and env vars.
* Added a test that creates a workload cluster.
* Added a clusterctl move test -- it moves the Cluster API objects associated with a single node capd cluster from one mgmt cluster to another mgmt cluster.
* Added a delete test.
* Use a local repository instead of local overrides for docker.
* run-e2e.sh script updated to set up the local repo in the _artifacts dir and also create a custom clusterctl.yaml.
* Created util funcs to init a mgmt cluster and create a workload cluster.
* Added clusterctl delete tests.
* Added a README.md with instructions to run the tests.
Force-pushed from 6a69ce8 to 57ae35d.
@fabriziopandini @chuckha I squashed all the commits into a single commit. Please let me know if you want any additional changes; otherwise, I think we can merge this.
/approve
/assign @fabriziopandini
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Arvinderpal, chuckha
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/lgtm
What this PR does / why we need it:
This PR brings e2e tests to clusterctl. The work leverages the existing capi e2e test framework and the capd infra provider.
Which issue(s) this PR fixes:
Ref #1729
/assign @fabriziopandini