Deadlock when using testing.T and goroutine #346

Tatskaari · 2019-11-12T15:52:08Z

Current state

The gomock framework causes a deadlock when a call is made on a mock from within a goroutine when using testing.T as the TestReporter. This is because testing.T.FailNow() can't be called from any goroutine that's not the main test routine. If it is, that goroutine will exit and may cause hard to diagnose deadlocks in your code.

Here is an example "baz" service that depends on a "foo" service.

package service

import (
	"context"
)

// Service is a service that depends on the Foo service
type Service struct {}

// Foo is an external service
type Foo interface {
	// Bar is an RPC that can be cancelled via it's context
	Bar(ctx context.Context, arg string)
}

// Baz is an endpoint on our service
func (*Service) Baz(ctx context.Context, mock Foo) {
	done := make(chan struct{})
	go func() {
                // This might result in a runtime.Goexit() which means the message on the done chan is never sent 
		mock.Bar(ctx, "test") 
		done <- struct{}{}
	}()

	select {
	case <-done:
	case <-ctx.Done():
	}
}

Here are some tests for the baz service:

import (
	"context"
	"github.com/golang/mock/gomock"
	"service/mock_service"
	"testing"
)


// this test results in 
// === RUN   TestThatCausesDeadlock
// fatal error: all goroutines are asleep - deadlock!
func TestThatCausesDeadlock(t *testing.T) {
	ctrl := gomock.NewController(t)
	mock := mock_service.NewMockFoo(ctrl)

        //We have neglected to set up the expected calls in this test
	new(Service).Baz(context.Background(), mock)

	ctrl.Finish()
}

// This test passes as expected 
func TestThatWorksFine(t *testing.T) {
	ctrl := gomock.NewController(t)
	mock := mock_playgroundf.NewMockFoo(ctrl)

	mock.EXPECT().Bar(gomock.Any(), gomock.Any())

	new(Service).Baz(context.Background(), mock)

	ctrl.Finish()
}

Desired state

The unexpected call to Bar should result in the test erroring out with an "unexpected call" error. We could implement a TestReporter such as:

type testReporter struct {
	t testing.T
}

func (tr *testReporter) Errorf(format string, args ...interface{}) {
	tr.t.Errorf(format, args...)
}

func (tr *testReporter) Fatalf(format string, args ...interface{}) {
	panic(fmt.Sprintf(format, args...))
}

This will result in the whole test process dying so the next tests won't execute however at least it's clear what went wrong.

I've spent quite a lot of time sifting through code to try and figure out what's been happening. The goroutine this was happening in was actually deep in a library we're using so you can imagine how much fun I've had. Would it be worth having NewController wrap the test reporter in such a way so others don't get caught out like this?

Side moan: if we could be trusted with IDs for goroutines, we could detect if we're on the main thread and only panic if we're not...

The text was updated successfully, but these errors were encountered:

codyoss · 2019-11-20T16:56:06Z

Thanks for the report I will look into this

cvgw · 2020-01-11T02:01:04Z

I'm having a hard time coming up with a solution that doesn't involve panic and doesn't involve changing the API for tests.

If there was a reliable way to tell whether we are running in the main test thread we could panic only when we are in a subroutine

cvgw · 2020-01-11T02:04:07Z

It would be great if we could do something channel based (or something like that)

I'm not sure how we would package the API but something like

select {
case <- testFinished:
....
case <- subRoutineFailed:
  t.FailNow()
}

cvgw · 2020-01-18T18:34:42Z

relevant issue golang/go#15758

cvgw · 2020-01-18T18:39:42Z

It may be possible to do something similar to the http2 pkg in order to track the main test routine https://github.com/golang/net/blob/2491c5de3490fced2f6cff376127c667efeed857/http2/gotrack.go#L22

cvgw · 2020-01-18T20:20:39Z

The more I think about it the more building some kind of test wrapper seems like a really good idea.

T.Fatal, T.FailNow, etc shouldn't be called in a goroutine and that probably will never change
Calling panic is non-optimal as it prevents other tests from running

A test wrapper could look something like this
https://play.golang.org/p/NXGIGTlK7fv

func Wrapper(t *testing.T, failChan chan bool, wrappedTestFunc func() error) {
	done := func() chan bool {
		doneChan := make(chan bool)
		go func() {
			wrappedTestFunc()
			doneChan <- true
		}()

		return doneChan
	}()

	select {
	case <-done:
		log.Print("done")
	case <-failChan:
		time.Sleep(time.Second)
		t.Fatal("meow")
	}
}

The only other option I can think of would be to allow the function call on the mock and return zero values (iirc this is default behavior when Return is not specified). We could then call T.FailNow inside Controller.Finish. This would prevent deadlocks and T.FailNow from being called outside the main test routine. It seems like it could produce some unexpected behavior, but no more so than an EXPECT with a Return.

I'm not sure which would be better. The wrapper feels like idiomatic golang IMO and seems like it would be a more robust solution for concurrent/async code.

Allowing methods to be called on mocks even when there is no EXPECT and then failing from the main test routine seems desirable because it requires no change to API. It does seem more brittle to me in some way though; it does depend on users calling Controller.Finish which I don't know how often people forget that.

codyoss · 2020-03-04T16:28:37Z

Related #145

prashantv · 2020-05-10T00:13:48Z

To check whether we're on the main test goroutine, I have a couple of possible ideas:

Track the stacktrace when NewController is called, and when a mock fails, check whether the earlier captured stacktrace is part of the current goroutine's stacktrace. If it isn't, then we are not on the same goroutine.
Check the goroutine stack for testing function (e.g., testing.tRunner), and if we find that, we are on the main test goroutine and can use t.Fatalf, otherwise, we cannot (and may be better off using panic).

The first suggestion assumes that functions like t.Run don't start a separate goroutine, while the second suggestion relies on internals of the testing library that may change across versions.

codyoss added the status: investigating label Nov 20, 2019

cvgw self-assigned this Jan 11, 2020

codyoss added status: backlog and removed status: investigating labels Mar 4, 2020

codyoss unassigned cvgw Mar 30, 2020

lrettig mentioned this issue Sep 25, 2020

Don't call T.FatalX inside goroutines in tests spacemeshos/go-spacemesh#2151

Closed

IronSavior mentioned this issue Feb 28, 2023

Cancelable resource.Test() / UnitTest() hashicorp/terraform-plugin-testing#66

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deadlock when using testing.T and goroutine #346

Deadlock when using testing.T and goroutine #346

Tatskaari commented Nov 12, 2019 •

edited

Loading

codyoss commented Nov 20, 2019

cvgw commented Jan 11, 2020

cvgw commented Jan 11, 2020

cvgw commented Jan 18, 2020

cvgw commented Jan 18, 2020

cvgw commented Jan 18, 2020

codyoss commented Mar 4, 2020

prashantv commented May 10, 2020

Deadlock when using testing.T and goroutine #346

Deadlock when using testing.T and goroutine #346

Comments

Tatskaari commented Nov 12, 2019 • edited Loading

Current state

Desired state

codyoss commented Nov 20, 2019

cvgw commented Jan 11, 2020

cvgw commented Jan 11, 2020

cvgw commented Jan 18, 2020

cvgw commented Jan 18, 2020

cvgw commented Jan 18, 2020

codyoss commented Mar 4, 2020

prashantv commented May 10, 2020

Tatskaari commented Nov 12, 2019 •

edited

Loading