🐛[e2e] Avoid deleting the shared cluster after each test #2655

sedefsavas · 2020-03-12T15:32:28Z

What this PR does / why we need it:
This PR avoids deleting the shared cluster after each test.

Which issue(s) this PR fixes
Fixes #2654

k8s-ci-robot · 2020-03-12T15:32:36Z

Hi @sedefsavas. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

vincepri · 2020-03-12T15:33:11Z

/ok-to-test

test/infrastructure/docker/e2e/docker_test.go

sedefsavas · 2020-03-12T16:43:42Z

/hold
I am combining this into the AfterSuite in docker_suite_test, as per @detiber's suggestion.

chuckha · 2020-03-13T13:57:47Z

cc

sedefsavas · 2020-03-13T18:41:41Z

/hold cancel

chuckha · 2020-03-13T18:44:55Z

/assign

chuckha · 2020-03-13T18:45:57Z

/lgtm

chuckha · 2020-03-13T20:41:35Z

/assign @detiber

detiber · 2020-03-13T20:48:11Z

test/infrastructure/docker/e2e/docker_suite_test.go

+	framework.DeleteCluster(ctx, deleteClusterInput)
+
+	waitForClusterDeletedInput := framework.WaitForClusterDeletedInput{
+		Getter:  mgmtClient,
+		Cluster: cluster,
+	}
+	framework.WaitForClusterDeleted(ctx, waitForClusterDeletedInput)
+
+	assertAllClusterAPIResourcesAreGoneInput := framework.AssertAllClusterAPIResourcesAreGoneInput{
+		Lister:  mgmtClient,
+		Cluster: cluster,
+	}
+	framework.AssertAllClusterAPIResourcesAreGone(ctx, assertAllClusterAPIResourcesAreGoneInput)
+
+	ensureDockerDeletedInput := ensureDockerArtifactsDeletedInput{
+		Lister:  mgmtClient,
+		Cluster: cluster,
+	}
+	ensureDockerArtifactsDeleted(ensureDockerDeletedInput)


Do these helpers fail (somewhat) gracefully when mgmtClient or cluster are nil? If not, then this might fail during teardown prior to getting the logs below or tearing down the Management cluster.

Though it looks like we'll already fail before tearing down the management cluster if we fail to get the logs.

You are right, they do fail. Trying to find a way to make it continue on failure.
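
For illustration, a minimal sketch of a teardown that keeps going when one step fails, assuming failures surface as returned errors or panics. The names `cluster`, `cleanupStep`, `deleteCluster`, `safely`, and `runAll` are hypothetical and are not part of the framework package; the real helpers may fail differently.

```go
package main

import "fmt"

// cluster is a hypothetical stand-in for the workload cluster object.
type cluster struct{ Name string }

// cleanupStep is a hypothetical named teardown action.
type cleanupStep struct {
	name string
	run  func() error
}

// deleteCluster dereferences c, so it panics when c is nil,
// mimicking a helper that does not guard against nil input.
func deleteCluster(c *cluster) error {
	fmt.Println("deleting cluster", c.Name)
	return nil
}

// safely runs fn and converts a panic into an ordinary error.
func safely(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return fn()
}

// runAll executes every step, even after a failure, and returns all failures.
func runAll(steps []cleanupStep) []error {
	var errs []error
	for _, s := range steps {
		if err := safely(s.run); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
		}
	}
	return errs
}

func main() {
	var c *cluster // nil, as if setup had failed before the cluster was created
	errs := runAll([]cleanupStep{
		{name: "delete cluster", run: func() error { return deleteCluster(c) }},
		{name: "dump logs", run: func() error { fmt.Println("dumping logs"); return nil }},
	})
	fmt.Println("failures:", len(errs))
}
```

With this shape, the log dump still runs after the delete step fails, and the collected errors can be reported once teardown has finished.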

Maybe deletion shouldn't be in AfterSuite and should instead be its own set of assertions that run after the tests run.

It'd be cool if we could do something like FOCUS='Create|Upgrade|ScaleDown|ScaleUp|Delete'. Basically be able to control exactly which operations are running. Unfortunately ginkgo doesn't give us a great way to do that except try and abuse FOCUS, but that is surely fraught with peril.

@chuckha the problem here is that you should not assume ordering between the various specs that match focus.

In general, specs that are defined within the same Describe are run in the order they are defined (unless running in parallel), but that does not hold across different Describe blocks.
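
A small sketch of why a FOCUS pattern cannot express ordering, assuming the pattern is treated as a regular expression matched against each spec name. This is not Ginkgo's implementation; `focus` and the spec names are hypothetical. The filter only decides which specs are selected, so the order of alternatives in the pattern has no effect on the order of the result.

```go
package main

import (
	"fmt"
	"regexp"
)

// focus returns the specs whose names match pattern, in the order the
// specs were given. The pattern selects specs; it does not order them.
func focus(specs []string, pattern string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, s := range specs {
		if re.MatchString(s) {
			out = append(out, s)
		}
	}
	return out, nil
}

func main() {
	// Hypothetical spec names, listed in definition order.
	specs := []string{"Create", "Upgrade", "ScaleUp", "ScaleDown", "Delete"}

	a, _ := focus(specs, "Create|Delete")
	b, _ := focus(specs, "Delete|Create")
	fmt.Println(a) // same selection
	fmt.Println(b) // same selection, same order
}
```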

Yeah, this is the general assumption I broke when originally writing these tests, mostly for the sake of test-running speed. I really don't want to spin up a new cluster for each test, but working with ginkgo instead of against it will likely yield us better results code-wise and, depending on how much prow we can use, worse or similar results speed-wise.

Perhaps one option (getting way out of scope of the PR now...) is to make the BeforeEach as quick as possible by spinning up a single-node control plane cluster with no workers instead of a 3-node control plane. That would put the runtime somewhere around 4 minutes for each test's start-up time plus the time the test takes to run (an additional ~2 minutes per control plane that gets created). But it would make the code organization much better and allow for parallelization.

The other idea I had was to manage the ordering of tests internally, pass some kind of flag to the test run that lists the tests you want to run (-tests="upgrade,scaleup,scaledown") and then the tests keep track of the order and run the tests in the correct order skipping tests that are not selected. However, this ruins any possibility of parallelization.
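
A rough sketch of that second idea, assuming a comma-separated flag value and a fixed internal order. The names `canonicalOrder` and `selectInOrder`, and the test names themselves, are hypothetical; nothing like this exists in the test suite today.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonicalOrder is a hypothetical fixed order in which tests must run.
var canonicalOrder = []string{"create", "upgrade", "scaleup", "scaledown", "delete"}

// selectInOrder parses a comma-separated list of test names and returns the
// requested tests in canonical order, plus any names it did not recognize.
func selectInOrder(flagValue string) (selected, unknown []string) {
	want := map[string]bool{}
	for _, name := range strings.Split(flagValue, ",") {
		name = strings.ToLower(strings.TrimSpace(name))
		if name != "" {
			want[name] = true
		}
	}
	for _, name := range canonicalOrder {
		if want[name] {
			selected = append(selected, name)
			delete(want, name)
		}
	}
	// Whatever is left was requested but is not a known test.
	for name := range want {
		unknown = append(unknown, name)
	}
	sort.Strings(unknown)
	return selected, unknown
}

func main() {
	selected, unknown := selectInOrder("scaledown,upgrade,scaleup")
	fmt.Println("run in this order:", selected)
	fmt.Println("unrecognized:", unknown)
}
```

The requested names come back in the internal order regardless of how they were typed, which is also why this approach serializes the run.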

chuckha · 2020-03-17T13:46:44Z

@sedefsavas I think the only way forward here is to remove the assertion failures on delete :/ Let's have a wait of the same amount of time, but if it times out there is no exception or assertion failure until after the logs have been dumped.
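
Roughly what that could look like, as a sketch under assumptions: the wait returns an error on timeout instead of asserting, the logs are dumped unconditionally, and the timeout is only reported afterwards. `waitFor`, `teardown`, `errTimedOut`, and `demoTimeout` are hypothetical names, and the durations are shortened for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// demoTimeout is deliberately short so the example finishes quickly.
const demoTimeout = 50 * time.Millisecond

var errTimedOut = errors.New("timed out waiting for condition")

// waitFor polls cond every interval until it returns true or timeout elapses.
// It reports a timeout as an error rather than failing the test directly.
func waitFor(cond func() bool, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimedOut
		}
		time.Sleep(interval)
	}
}

// teardown waits for deletion, always dumps logs, and only then returns the
// result of the wait so the caller can fail after the logs are safe.
func teardown(deleted func() bool, dumpLogs func(), timeout time.Duration) error {
	waitErr := waitFor(deleted, 5*time.Millisecond, timeout)
	dumpLogs() // runs whether or not the wait timed out
	return waitErr
}

func main() {
	err := teardown(
		func() bool { return false }, // the cluster never goes away
		func() { fmt.Println("dumping logs") },
		demoTimeout,
	)
	fmt.Println("teardown result:", err)
}
```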

chuckha · 2020-03-17T15:02:31Z

/milestone v0.3.2

vincepri · 2020-03-23T17:56:35Z

/approve

Approving / merging this for now given the immediate improvement / gain that we get from it. @sedefsavas, can you open an issue with the follow-up items from above?

/milestone v0.3.3

k8s-ci-robot · 2020-03-23T17:57:06Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sedefsavas, vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~test/infrastructure/docker/OWNERS~~ [vincepri]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 12, 2020

k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 12, 2020

k8s-ci-robot requested review from justinsb and ncdc March 12, 2020 15:32

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 12, 2020

detiber reviewed Mar 12, 2020

sedefsavas force-pushed the e2efix branch 2 times, most recently from 2fd92d8 to 6acec28 Compare March 12, 2020 16:22

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 12, 2020

sedefsavas force-pushed the e2efix branch from 6acec28 to f3540e2 Compare March 13, 2020 15:55

[e2e] Move cluster cleanups from AfterEach() to AfterSuite()

156359d

sedefsavas force-pushed the e2efix branch from f3540e2 to 156359d Compare March 13, 2020 18:28

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 13, 2020

k8s-ci-robot assigned chuckha Mar 13, 2020

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 13, 2020

k8s-ci-robot assigned detiber Mar 13, 2020

detiber reviewed Mar 13, 2020

k8s-ci-robot added this to the v0.3.2 milestone Mar 17, 2020

vincepri modified the milestones: v0.3.2, v0.3.x Mar 19, 2020

k8s-ci-robot modified the milestones: v0.3.x, v0.3.3 Mar 23, 2020

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 23, 2020

k8s-ci-robot merged commit cf8be83 into kubernetes-sigs:master Mar 23, 2020

Review comments, in order: detiber (Mar 13, 2020), sedefsavas (Mar 13, 2020), chuckha (Mar 16, 2020), detiber (Mar 16, 2020), chuckha (Mar 16, 2020, edited)
