Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make stress-test-e2e run detects a race condition in test framework.go #1055

Closed
aLekSer opened this issue Sep 10, 2019 · 2 comments · Fixed by #1065
Closed

make stress-test-e2e run detects a race condition in test framework.go #1055

aLekSer opened this issue Sep 10, 2019 · 2 comments · Fixed by #1065
Labels
area/tests Unit tests, e2e tests, anything to make sure things don't break kind/bug These are bugs.
Milestone

Comments

@aLekSer
Copy link
Collaborator

aLekSer commented Sep 10, 2019

Golang stress tests detects a race condition.

What happened:
Data race detected on 7 minute of stress E2E test run.

      /usr/local/go/src/testing/testing.go:672 +0x9e                                                                                                                                                     
  agones.dev/agones/test/e2e/framework.(*Framework).WaitForFleetCondition()                                                                                                                              
      /go/src/agones.dev/agones/test/e2e/framework/framework.go:146 +0x504
  agones.dev/agones/test/e2e.scaleAndWait()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:1005 +0xc4
  agones.dev/agones/test/e2e.TestScaleUpAndDownInParallelStressTest.func1()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:824 +0xe4

What you expected to happen:
No race conditions.

How to reproduce it (as minimally and precisely as possible):

make gcloud-test-cluster
STRESS_TEST_LEVEL=4 make stress-test-e2e

Anything else we need to know?:

Environment:

  • Agones version: 0.12
  • Kubernetes version (use kubectl version): 1.12
  • Cloud provider or hardware configuration: Google Cloud
  • Install method (yaml/helm): yaml
  • Troubleshooting guide log(s):
time="2019-09-10 19:29:37.352" level=info msg="fleet scale-fleet-7-b8l2k has 26/40 ready replicas"                                                                                                       
time="2019-09-10 19:29:37.353" level=info msg="error waiting for fleet condition" error="timed out waiting for the condition" fleet=scale-fleet-7-b8l2k                                                  
==================                                                                                                                                                                                       
WARNING: DATA RACE                                                                                                                                                                                       
Write at 0x00c0002fe151 by goroutine 86:                                                                                                                                                                 
  testing.(*common).FailNow()                                                                                                                                                                            
      /usr/local/go/src/testing/testing.go:608 +0x4c                                                                                                                                                     
  testing.(*common).Fatalf()
      /usr/local/go/src/testing/testing.go:672 +0x9e                                                                                                                                                     
  agones.dev/agones/test/e2e/framework.(*Framework).WaitForFleetCondition()
      /go/src/agones.dev/agones/test/e2e/framework/framework.go:146 +0x504
  agones.dev/agones/test/e2e.scaleAndWait()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:1005 +0xc4
  agones.dev/agones/test/e2e.TestScaleUpAndDownInParallelStressTest.func1()                                                                                                                              
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:824 +0xe4

Previous write at 0x00c0002fe151 by goroutine 82:
  testing.(*common).FailNow()                                                                                                                                                                            
      /usr/local/go/src/testing/testing.go:608 +0x4c
  testing.(*common).Fatalf()
      /usr/local/go/src/testing/testing.go:672 +0x9e                                                                                                                                                     
  agones.dev/agones/test/e2e/framework.(*Framework).WaitForFleetCondition()                                                                                                                              
      /go/src/agones.dev/agones/test/e2e/framework/framework.go:146 +0x504
  agones.dev/agones/test/e2e.scaleAndWait()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:1005 +0xc4
  agones.dev/agones/test/e2e.TestScaleUpAndDownInParallelStressTest.func1()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:824 +0xe4

Goroutine 86 (running) created at:
  agones.dev/agones/test/e2e.TestScaleUpAndDownInParallelStressTest()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:809 +0x128d
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:865 +0x163

Goroutine 82 (finished) created at:
  agones.dev/agones/test/e2e.TestScaleUpAndDownInParallelStressTest()
      /go/src/agones.dev/agones/test/e2e/fleet_test.go:809 +0x128d
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:865 +0x163
==================

...
time="2019-09-10 19:29:47.609" level=info msg=stress_4_fleet_40_scale_down avg=6.64017396 count=5 duration=4.7771784 max=9.9468814 min=5.1667113 p50=5.791677825 p90=9.473440700000001 p95=9.71016105 p99=9.899537330000001 p999=9.942146993000001
--- FAIL: TestScaleUpAndDownInParallelStressTest (355.56s)
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-3-jj5fc
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-7-b8l2k
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-1-hkw7f
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-9-82n47
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-5-tht4x
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-8-cbtwh
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-0-zsjtc
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-2-6r72j
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-6-4krtn
    fleet_test.go:1005: error waiting for fleet condition on fleet scale-fleet-4-sx8dp
    testing.go:809: race detected during execution of test
FAIL
time="2019-09-10 19:29:47.617" level=info msg="Cleaning up now."
time="2019-09-10 19:30:08.409" level=info msg="Finished cleanup."
FAIL    agones.dev/agones/test/e2e      378.119s
  • Others:
@aLekSer aLekSer added the kind/bug These are bugs. label Sep 10, 2019
@aLekSer
Copy link
Collaborator Author

aLekSer commented Sep 10, 2019

I think that the race was detected in test framework itself.
Also note that running tests with STRESS_TEST_LEVEL of 1 passes:

STRESS_TEST_LEVEL=1 make stress-test-e2e 
...
"KUBECONFIG=/root/.kube/config" -e "GO111MODULE=on" -w /go/src/agones.dev/agones  agones-build:af35076ed7 go test -mod=vendor -race agones.dev/agones/test/e2e  --kubeconfig /root/.kube/config \
                -timeout 1h \
                -run '.*StressTest.*' \
                --gameserver-image=gcr.io/agones-images/udp-server:0.14 \
                --pullsecret= \
                --stress 1 \
                --perf-output /go/src/agones.dev/agones/build/.perf
ok      agones.dev/agones/test/e2e      387.278s

@aLekSer
Copy link
Collaborator Author

aLekSer commented Sep 11, 2019

Found a similar ticket. It is not thread-safe to call t.Fatalf() twice.

		t.Fatalf("error waiting for fleet condition on fleet %v", flt.Name)

golang/go#20940

A test ends when its Test function returns or calls any of the methods FailNow, 
Fatal, Fatalf, SkipNow, Skip, or Skipf. Those methods, as well as the Parallel method, 
must be called only from the goroutine running the Test function.

https://tip.golang.org/pkg/testing/#T

@aLekSer aLekSer changed the title make stress-test-e2e run detects a race condition make stress-test-e2e run detects a race condition in test framework.go Sep 11, 2019
@markmandel markmandel added this to the 1.1.0 milestone Oct 22, 2019
@markmandel markmandel added the area/tests Unit tests, e2e tests, anything to make sure things don't break label Oct 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/tests Unit tests, e2e tests, anything to make sure things don't break kind/bug These are bugs.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants