✨ clusterctl e2e tests #2236

Merged: 1 commit, Feb 18, 2020


Arvinderpal
Contributor

What this PR does / why we need it:
This PR brings e2e tests to clusterctl. The work leverages the existing capi e2e test framework and the capd infra provider.

Which issue(s) this PR fixes
Ref #1729

/assign @fabriziopandini

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 31, 2020
@k8s-ci-robot
Contributor

Hi @Arvinderpal. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 31, 2020
@k8s-ci-robot k8s-ci-robot requested review from justinsb and ncdc January 31, 2020 16:13
@Arvinderpal
Contributor Author

This is just a WIP at this point. I wanted to share it now to get early feedback.
/cc @ncdc @vincepri

Member

@fabriziopandini fabriziopandini left a comment

@Arvinderpal many thanks for taking up this task!
I'm +100 on getting E2E tests for clusterctl, but I defer to @chuckha to validate the overall approach and ensure consistency with the other E2E tests.
/assign @chuckha

In the meantime, I did a first pass on the WIP from a clusterctl PoV.

// Create clusterctl.yaml
tmpDir = createTempDir()
cfgFile = createLocalTestClusterCtlConfig(tmpDir, "clusterctl.yaml", "DOCKER_SERVICE_DOMAIN: \"docker.cluster.local\"")
// Let's set up some variables for the workload cluster template
Member

What about moving this up to line 66 and changing it to:
-> Defining variables for the workload cluster template; testing both variables from the clusterctl config file and variables from OS environment variables.

Contributor Author

Not sure I see how moving this to line 66 changes anything. Are you saying I should define the same vars in both the config file as well as in the environment?

CheckAndWaitDeploymentExists(kindClient, "capi-kubeadm-bootstrap-system", "capi-kubeadm-bootstrap-controller-manager")
CheckAndWaitDeploymentExists(kindClient, "capd-system", "capd-controller-manager")

options := clusterctlclient.GetClusterTemplateOptions{
Member

This requires a template available in the docker repository or as a local override, which the local override script does not handle; it also creates an external dependency, which I would like to avoid to make the test more predictable.
I guess we should wait for #2133 to be fixed and use a self-contained template stored in the test folder.

Contributor Author

Yes, for the time being I'm just creating cluster-template.yaml in the docker overrides folder.
#2133 would be useful here, though I think just the filesystem repository should also work. I'll look into this more.

set -o nounset
set -o pipefail

REPO_ROOT=$(dirname "${BASH_SOURCE[0]}")/../../..
Member

What about using git rev-parse --show-toplevel instead?

Contributor Author

I used the cluster-api/scripts/ci-capd-e2e.sh approach.

var _ = BeforeSuite(func() {
ctx = context.Background()
// Docker image to load into the kind cluster for testing
managerImage = os.Getenv("MANAGER_IMAGE")
Member

@fabriziopandini fabriziopandini Feb 1, 2020

It might be a silly question, but how does the locally built image get injected into the kind image?

Contributor Author

Here:
kindCluster, err := kind.NewCluster(ctx, kindClusterName, clientgoscheme.Scheme, managerImage)

I build a local image and ensure the local overrides folder points to that image. For example:

docker/v0.3.0/infrastructure-components.yaml:502:        image: gcr.io/arvinders-1st-project/docker-provider-manager-amd64:dev

@fabriziopandini
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 1, 2020
@ncdc ncdc added this to the v0.3.0 milestone Feb 3, 2020
Contributor

@wfernandes wfernandes left a comment

I believe clusterctl is going through some more changes. Thanks for all this work! I'm sure we can build upon this once clusterctl changes slow down 😄

_, _, err = c.Init(initOpt)
Expect(err).ToNot(HaveOccurred())
// Confirm controllers exists
CheckAndWaitDeploymentExists(kindClient, "capi-system", "capi-controller-manager")
Contributor

We have an e2e test framework with some convenience methods. We could use this method to verify that cert-manager is available.

func WaitForAPIServiceAvailable(ctx context.Context, mgmt Waiter, serviceName string) {

"sigs.k8s.io/controller-runtime/pkg/client"
)

func CreateKindClusterAndClient() (*kind.Cluster, client.Client, error) {
Contributor

This may help in creating a kind cluster.

func NewCluster(ctx context.Context, name string, scheme *runtime.Scheme, images ...string) (*Cluster, error) {

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 5, 2020
Contributor Author

@Arvinderpal Arvinderpal left a comment

Please take another look. I have incorporated all the feedback. It should be possible for you to run the tests locally as well. For the workload cluster create test, you will need to specify a cluster template in the local-overrides folder for docker. Let me know if you need a copy of that.

"sigs.k8s.io/controller-runtime/pkg/client"
)

var _ = Describe("clusterctl config cluster", func() {
Contributor Author

Sure. However, I currently have these two tests in separate files. If you don't mind, I would like to keep it that way. At least to me, it makes for better organization to have the _init, _config, _upgrade, etc. tests in their respective files.

os.RemoveAll(tmpDir)
})

Context("using default infra and bootstrap provider", func() {
Contributor Author

Done.

@wfernandes
Contributor

@Arvinderpal I'll try and take a look at these as soon as I can. I would definitely appreciate a copy of the cluster-template for the workload cluster create test 🙂. Thanks.

@Arvinderpal
Contributor Author

@fabriziopandini Added a simple move test -- move objects of a single node capd workload cluster from one mgmt node to another. It's passing for me.

Member

@fabriziopandini fabriziopandini left a comment

@Arvinderpal thanks for adding the move test! Let's now consolidate this PR and get it merged by:

  • making the test self-contained, using a local repository instead of local overrides
  • merging the init and cluster config tests into a single test (to shorten the overall execution time)
  • creating some utility funcs to make the code cleaner and provide building blocks for new tests

})

Context("single node workload cluster", func() {
It("should move all Cluster API objects to the new mgmt cluster, unpause the Cluster and delete all objects from previous mgmt cluster", func() {
Member

This set of tests is strictly related to the cluster template. Does it make sense to move it into a separate function in the same file that defines the cluster template?

Member

What about expecting the same checks in the "from" cluster, or any time we create the cluster template?

Contributor Author

@Arvinderpal Arvinderpal left a comment

@fabriziopandini As per your feedback:

  • e2e tests are self-contained. They make use of the local repository in the artifacts folder.
  • I have also refactored to create util funcs that init a mgmt cluster and create a workload cluster.
  • The init test has been removed.
  • I added a couple of delete tests as well.

@Arvinderpal
Contributor Author

@fabriziopandini FYI, the "deletes everything" spec is currently failing. I'm trying to debug the issue. The remaining tests should pass and you should be able to run them locally.

@fabriziopandini
Member

In the next iteration we should make this consistent with #2294

@Arvinderpal Arvinderpal changed the title ✨ [wip] clusterctl e2e tests ✨ clusterctl e2e tests Feb 11, 2020
@chuckha
Contributor

chuckha commented Feb 11, 2020

Are there any docs to go along with this? I cloned the repo and ran ./cmd/clusterctl/test/run-e2e.sh, but that gave me a number of errors.

createTestWorkloadCluster(ctx, mgmtInfo, workloadInfo)
})

AfterEach(func() {
Contributor

I noticed this tears down the kind cluster and spins the whole thing back up. Do you think it would be possible to create the kind cluster once and then install and remove components without completely tearing down and standing up the cluster?

Contributor Author

So, I debated this for a bit. Initially I went with reusing the mgmt cluster for all tests; however, for that you need to be certain that clusterctl delete --all works 100%, or that you manually delete all state (and wait for the delete to complete). Given that spinning up a new kind cluster takes a small portion of the overall execution time and gives you a clean mgmt cluster, it seemed like the right way to go.

}
return nil
}, 3*time.Minute, 5*time.Second,
).ShouldNot(HaveOccurred())
Contributor

Our convention for this would be .Should(BeNil()) or maybe .Should(Succeed()); in situations like this we tend to avoid the .ShouldNot(HaveOccurred()) pattern.

Contributor Author

@Arvinderpal Arvinderpal left a comment

@chuckha Please see my comments. I'll message you on slack to see why it's not running on your machine.

@fabriziopandini
Member

Finally got a good slot to test it locally. Some points that should be addressed in follow-up PRs:

  • repository setup:
    • do not override the user's clusterctl-settings.json (we are not calling the clusterctl hack)
    • generate the docker infrastructure-components.yaml (we are using a copy ATM)
    • do not rely on local overrides at all (we are still relying on providers for everything except the docker provider); we should mimic the docker approach of building manifests and storing them in the local repository
  • align the test suite with docker's (use a reporter)
  • reuse the framework approach/helpers introduced by 🏃 Refactor of the e2e framework #2294

@Arvinderpal
Contributor Author

@fabriziopandini @chuckha Please see my latest commit. I added a README for anyone who wants to run the tests locally. I also generate the docker infrastructure-components.yaml instead of copying.

Member

@fabriziopandini fabriziopandini left a comment

Only two nits from my side; everything else is for the next iterations (as per the previous comment).

@@ -0,0 +1,17 @@
# Running the tests

./run-e2e.sh
Member

I assume the hack should be called before the run-e2e test, or am I wrong?

Contributor Author

Yes, you're right. I have updated the docs. The docker override folder will have to be deleted since we generate its yamls in the test script.
We can add auto-generation of all other component yamls in a follow-up PR.

@chuckha
Contributor

chuckha commented Feb 12, 2020

I can't get this running locally. From a clean check out this is what happens:

$ ./cmd/clusterctl/test/run-e2e.sh 
# some time later...
./cmd/clusterctl/test/run-e2e.sh: line 53: /Users/cha/dev/capi-dev/cluster-api/_artifacts/testdata/docker/v0.3.0/infrastructure-components.yaml: No such file or directory

@fabriziopandini
Member

@chuckha the test is not self-contained now and it depends on running the clusterctl hack before (see comment #2236 (comment)).

I'm prototyping on top of this PR so we can get the test to use the config file defined in the framework, plus other things in the framework like NewClusterForCAPD, so we can clean up some dependencies and make the experience more consistent across projects.

@chuckha
Contributor

chuckha commented Feb 12, 2020

This has the problem of assuming host networking when trying to contact the workload cluster on CAPD. This assumption makes the e2es fail on OS X where host networking is not available.

@Arvinderpal
Contributor Author

> This has the problem of assuming host networking when trying to contact the workload cluster on CAPD. This assumption makes the e2es fail on OS X where host networking is not available.

I updated README.md to reflect the lack of OS X support at the moment. Perhaps we can address this in a follow up PR.

@chuckha
Contributor

chuckha commented Feb 12, 2020

sounds great, looks good to me :D! Thanks for helping me fix it locally

@fabriziopandini
Member

/lgtm

@chuckha do you think we can merge this PR?
I have already started to address some of the follow-up work in #2321, introducing a config file similar to the one included in the framework, so we can greatly simplify the run-e2e script introduced by this PR and make the experience of configuring the test much more consistent and flexible

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 13, 2020
 * Makes use of the capi e2e framework
 * Added a script to run clusterctl e2e tests. The script will
issue a `make docker-build` and set up an env var to load the image into
kind. It also sets up various clusterctl files and env vars.
 * Added a test that creates a workload cluster.
 * Added a clusterctl move test -- it moves the Cluster API objects
associated with a single node capd cluster from one mgmt cluster
to another mgmt cluster.
 * Added a delete test.
 * Use a local repository instead of local overrides for docker
 * run-e2e.sh script updated to set up the local repo in the _artifacts dir
and also create a custom clusterctl.yaml.
 * Created util funcs to init a mgmt cluster and create a workload
cluster.
 * Added clusterctl delete tests.
 * Added a README.md with instructions to run tests.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 13, 2020
@Arvinderpal
Contributor Author

@fabriziopandini @chuckha I squashed all the commits into a single commit. Please let me know if you want any additional changes; otherwise, I think we can merge this.

@chuckha
Contributor

chuckha commented Feb 13, 2020

/approve
looks great! It doesn't work on os x but that's documented so assigning to fabrizio for final lgtm

/assign @fabriziopandini
for lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Arvinderpal, chuckha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 13, 2020
@fabriziopandini
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2020
@k8s-ci-robot k8s-ci-robot merged commit ebced80 into kubernetes-sigs:master Feb 18, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.