Re-work TestCreateServiceInstanceWithProvisionFailure to mitigate its flake #2153

staebler · 2018-06-22T20:39:24Z

I have thought about this flake many times and have not been able to work out either (1) why the test would fail if the code being tested is correct or (2) why the code being tested would be incorrect. I have never been able to reproduce the flake locally, so I am not sure whether this re-work will fix it. However, if it doesn't, I will be very concerned about what is going on in the code being tested.

I will kick this PR repeatedly to try to provoke a flake in the unit tests.

carolynvs · 2018-06-22T20:56:52Z

Thanks for tackling this! Hit me up when you think it's ready to review after all your kicking. 😀

staebler · 2018-06-23T03:06:17Z

/retest

staebler · 2018-06-23T04:33:13Z

/retest

staebler · 2018-06-24T20:20:30Z

/retest

staebler · 2018-06-25T02:14:12Z

/retest

staebler · 2018-06-25T13:30:56Z

/retest

staebler · 2018-06-25T17:09:04Z

/retest

staebler · 2018-06-26T13:29:54Z

/retest

staebler · 2018-06-28T19:36:29Z

/retest

staebler · 2018-06-28T19:37:44Z

> Thanks for tackling this! Hit me up when you think it's ready to review after all your kicking. 😀

@carolynvs I feel good about this PR fixing the flake. I have not had any test failures in 10 runs of the test. It is ready for review.

carolynvs

LGTM but honestly I'm not familiar enough with this area of service catalog for that to be meaningful. So we'll need someone else to do a more thorough review on it.

MHBauer · 2018-06-29T18:09:09Z

test/util/util.go

@@ -345,7 +345,9 @@ func AssertServiceInstanceCondition(t *testing.T, instance *v1beta1.ServiceInsta
 	}

 	if !foundCondition {
-		t.Fatalf("%v condition not found", conditionType)
+		if status != v1beta1.ConditionFalse || len(reason) != 0 {


how weird, this function wasn't used before this. and the other one was used only once.

change makes sense, but maybe could use a comment that we have indeed found a condition because we have a set status

I'll create a separate function for the false-or-absent assert.

I meant to remove this in lieu of the new function. I've pushed up a new commit with this removed.
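A minimal sketch of what a separate false-or-absent check could look like. The types and the name `conditionFalseOrAbsent` are hypothetical stand-ins, not the helper that was actually added to test/util/util.go; the real code works on `v1beta1.ServiceInstance` conditions and reports through `*testing.T`.

```go
package main

import "fmt"

// Hypothetical stand-ins for the v1beta1 condition types (illustration only).
type conditionStatus string

const (
	conditionTrue  conditionStatus = "True"
	conditionFalse conditionStatus = "False"
)

type condition struct {
	Type   string
	Status conditionStatus
	Reason string
}

// conditionFalseOrAbsent returns nil when no condition of the given type is
// present, or when the one present has status False. Any other status is an error.
func conditionFalseOrAbsent(conds []condition, condType string) error {
	for _, c := range conds {
		if c.Type != condType {
			continue
		}
		if c.Status != conditionFalse {
			return fmt.Errorf("%v condition had unexpected status %v (reason %q)",
				condType, c.Status, c.Reason)
		}
		return nil
	}
	// An absent condition is treated the same as a False one.
	return nil
}

func main() {
	conds := []condition{{Type: "Failed", Status: conditionTrue, Reason: "SomeReason"}}
	fmt.Println(conditionFalseOrAbsent(conds, "Failed")) // non-nil error
	fmt.Println(conditionFalseOrAbsent(conds, "Ready"))  // absent, so nil
}
```

Keeping this separate from the strict assertion means the strict one can still fail loudly when a condition is missing.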

MHBauer · 2018-06-29T18:10:10Z

test/integration/controller_instance_test.go

-		conditionReason          string
-		expectFailCondition      bool
+		provisionErrorReason     string
+		failReason               string
 		triggersOrphanMitigation bool


can we annotate the struct fields with some comments for use?
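One way the requested annotations might read, as a sketch only. The field meanings below are assumptions inferred from the field names in the diff, and the `name` field and `expectsFailedCondition` method are hypothetical additions for illustration.

```go
package main

import "fmt"

// testCase is an illustrative stand-in for the table-driven test's case struct.
type testCase struct {
	// name identifies the case in test output (hypothetical field).
	name string
	// provisionErrorReason is the reason expected on the instance's Ready
	// condition after the provision request fails (assumed meaning).
	provisionErrorReason string
	// failReason is the reason expected on the Failed condition. An empty
	// string means no Failed condition is expected (assumed meaning).
	failReason string
	// triggersOrphanMitigation reports whether the failure should cause the
	// controller to start orphan mitigation (assumed meaning).
	triggersOrphanMitigation bool
}

// expectsFailedCondition replaces a separate boolean such as the removed
// expectFailCondition: the expectation is derived from failReason itself.
func (tc testCase) expectsFailedCondition() bool {
	return tc.failReason != ""
}

func main() {
	tc := testCase{name: "example", provisionErrorReason: "SomeProvisionError"}
	fmt.Println(tc.expectsFailedCondition())
}
```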

MHBauer · 2018-06-29T18:14:46Z

test/integration/controller_instance_test.go

+			// the core of the test so that the resource cleanup can proceed.
+			defer atomic.StoreInt32(&respondWithProvisionSuccess, 1)
+			defer atomic.StoreInt32(&respondWithDeprovisionSuccess, 1)
+			defer atomic.StoreInt32(&blockDeprovision, 0)


not a super fan of atomic ints for what are essentially bools.
I'm fine with the "throw a big lock around it" solution.

It is more code (which is subjectively more difficult to read). I personally don't find it worthwhile to use a mutex just to avoid using an integer as a boolean, but I'll make the change nevertheless.

You are right about select and chan being more idiomatic, though. I'll use that instead.

MHBauer · 2018-06-29T18:31:05Z

test/integration/controller_instance_test.go

-							},
-						}))
+						func(r *osb.ProvisionRequest) (*osb.ProvisionResponse, error) {
+							if atomic.LoadInt32(&respondWithProvisionSuccess) != 0 {


would prefer if this was == 1. we set it explicitly up above and I find equality easier to read than inequality.

MHBauer · 2018-06-29T18:31:48Z

test/integration/controller_instance_test.go

+							if atomic.LoadInt32(&respondWithDeprovisionSuccess) != 0 {
+								return &osb.DeprovisionResponse{}, nil
+							} else {
+								for atomic.LoadInt32(&blockDeprovision) != 0 {


same with these. assert truth rather than other way around.
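A small sketch of how the atomic integers could be wrapped so that call sites assert truth directly. The `atomicFlag` type is hypothetical and not part of the PR; it only uses `atomic.StoreInt32` and `atomic.LoadInt32` from the standard library.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// atomicFlag is a hypothetical boolean built on an int32, so readers see
// IsSet() instead of a comparison against 0 or 1.
type atomicFlag struct {
	v int32
}

// Set marks the flag as true.
func (f *atomicFlag) Set() { atomic.StoreInt32(&f.v, 1) }

// Clear marks the flag as false.
func (f *atomicFlag) Clear() { atomic.StoreInt32(&f.v, 0) }

// IsSet reports whether the flag is true, using the positive == 1 comparison
// preferred in the review.
func (f *atomicFlag) IsSet() bool { return atomic.LoadInt32(&f.v) == 1 }

func main() {
	var respondWithProvisionSuccess atomicFlag
	fmt.Println(respondWithProvisionSuccess.IsSet())

	respondWithProvisionSuccess.Set()
	if respondWithProvisionSuccess.IsSet() {
		fmt.Println("would return a successful provision response")
	}
}
```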

MHBauer

seems okay. need to take another pass.

the atomics are pretty straightforward, but wondering if channels and selects would be more idiomatic.

MHBauer

I think this flows a lot better from top to bottom.
LGTM
/lgtm
/approve

MHBauer · 2018-07-02T23:11:59Z

test/util/util.go

@@ -345,7 +345,9 @@ func AssertServiceInstanceCondition(t *testing.T, instance *v1beta1.ServiceInsta
 	}

 	if !foundCondition {
-		t.Fatalf("%v condition not found", conditionType)
+		if status != v1beta1.ConditionFalse || len(reason) != 0 {


k8s-ci-robot · 2018-07-03T00:00:57Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MHBauer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [MHBauer]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2018-07-03T13:30:37Z

New changes are detected. LGTM label has been removed.

staebler · 2018-07-03T20:31:33Z

Closes #2036.

k8s-ci-robot requested review from duglin and MHBauer June 22, 2018 20:39

k8s-ci-robot added the size/L and cncf-cla: yes labels Jun 22, 2018

staebler added the do-not-merge/hold label Jun 22, 2018

staebler force-pushed the fix_integration_flake branch from ffae2a4 to ae300e0 Compare June 28, 2018 13:00

carolynvs removed the do-not-merge/hold label Jun 29, 2018

carolynvs approved these changes Jun 29, 2018

MHBauer reviewed Jun 29, 2018

Re-work the TestCreateServiceInstanceWithProvisionFailure integration test to mitigate its flake (7da4277)

staebler force-pushed the fix_integration_flake branch from ae300e0 to 7da4277 Compare July 2, 2018 20:34

k8s-ci-robot assigned MHBauer Jul 3, 2018

MHBauer approved these changes Jul 3, 2018

k8s-ci-robot added the lgtm label Jul 3, 2018

k8s-ci-robot added the approved label Jul 3, 2018

MHBauer added the LGTM1 label Jul 3, 2018

Revert change to AssertServiceInstanceCondition (907e126)

k8s-ci-robot removed the lgtm label Jul 3, 2018

carolynvs added LGTM2 and removed LGTM2 labels Jul 3, 2018

carolynvs added the LGTM2 label Jul 13, 2018

carolynvs merged commit 7b9f8bd into kubernetes-retired:master Jul 13, 2018

MHBauer mentioned this pull request Jul 19, 2018

v0.1.26 Release #2213

Merged

MHBauer mentioned this pull request Jul 27, 2018

reenable and debug test flake in test/integration/controller_instance_test.go:TestCreateServiceInstanceWithProvisionFailure #2036

Closed

cblecker unassigned MHBauer Jun 4, 2019
