Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Featuretests test data race/panic due to Testing.Log called after test finishes #6055

Closed
sunjayBhatia opened this issue Jan 4, 2024 · 3 comments · Fixed by #6134
Closed
Assignees
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/flake Categorizes issue or PR as related to a flaky test.

Comments

@sunjayBhatia
Copy link
Member

sunjayBhatia commented Jan 4, 2024

See test runs:

Tests have a data race/panic because Testing.Logf is called after the test ends

The offending call comes from here:

log.Debug("stream terminated")

We set up and wait for teardown of the Contour gRPC server here:

contour_xds_v3.RegisterServer(contour_xds_v3.NewContourServer(log, xdscache.ResourcesOf(resources)...), srv)
var wg sync.WaitGroup
ctx, cancel := context.WithCancel(context.Background())
wg.Add(1)
go func() {
// Returns once GracefulStop() is called by the below goroutine.
// nolint:errcheck
srv.Serve(l)
wg.Done()
}()
wg.Add(1)
go func() {
// Returns once the context is cancelled by the cleanup func.
// nolint:errcheck
eh.Start(ctx)
// Close the gRPC server and its listener.
srv.GracefulStop()
wg.Done()
}()
cc, err := grpc.Dial(l.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
require.NoError(t, err)
rh := &resourceEventHandler{
EventHandler: eh,
EndpointsHandler: et,
Sequence: eh.Sequence(),
statusUpdateCacher: statusUpdateCacher,
}
return rh, &Contour{
T: t,
ClientConn: cc,
statusUpdateCache: statusUpdateCacher,
}, func() {
// close client connection
cc.Close()
// stop server
cancel()
// wait for everything to gracefully stop.
wg.Wait()
}
}

GracefulStop() and Serve should be waiting for all server handler goroutines to exit before returning, the test cleanup function should then be waiting for these to return before letting the test end, e.g.

rh, c, done := setup(t)
defer done()

However as can be seen in the tests above, this is not strictly the case, it seems the server is allowed to exit before some of the handlers fully return. Repro by running make check-test-race, modified with -count=50 (we can exacerbate this by adding a sleep to the done function in the contour xDS server).

This issue seems to have popped up post bumping grpc-go to 1.60.0+, I cannot repro it on 1.59.0

Ironically, this was an issue that was supposedly addressed in 1.60.0:

@sunjayBhatia sunjayBhatia added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/flake Categorizes issue or PR as related to a flaky test. labels Jan 4, 2024
@sunjayBhatia sunjayBhatia self-assigned this Jan 12, 2024
@sunjayBhatia
Copy link
Member Author

Minimal repro here: https://github.com/sunjayBhatia/grpc-go-issue-repro/tree/main/shutdown-cleanup

Tracked the offending commit down to grpc/grpc-go#6489

@sunjayBhatia
Copy link
Member Author

Submitted grpc-go issue here: grpc/grpc-go#6921

@sunjayBhatia
Copy link
Member Author

Going to roll back to v1.59.0 until this is fixed to get CI green

sunjayBhatia added a commit to sunjayBhatia/contour that referenced this issue Jan 13, 2024
Temporarily to get CI green

See: projectcontour#6055

Signed-off-by: Sunjay Bhatia <[email protected]>
sunjayBhatia added a commit that referenced this issue Jan 16, 2024
Temporarily to get CI green

See: #6055

Signed-off-by: Sunjay Bhatia <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/flake Categorizes issue or PR as related to a flaky test.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant