-
Notifications
You must be signed in to change notification settings - Fork 31
write: fix test races and enable CI -race
flag
#500
Conversation
writer/payload.go
Outdated
@@ -336,7 +336,7 @@ func (s *QueuablePayloadSender) flushQueue() error { | |||
func (s *QueuablePayloadSender) removeQueuedPayload(e *list.Element) *list.Element { | |||
next := e.Next() | |||
payload := e.Value.(*Payload) | |||
s.currentQueuedSize -= int64(len(payload.Bytes)) | |||
atomic.AddInt64(&s.currentQueuedSize, -int64(len(payload.Bytes))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't find a better way here to stop the race from happening. I think the "correct" way would be to refactor a bit and improve the overall testing of the package, but that seems like a rather large job for this PR alone. The scope here was simply to add the -race
flag into CI. I do plan however to separately improve the testing and perhaps help get rid of the syncBarrier
channel too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's causing the race here? The idea when I implemented this was that currentQueuedSize
was only accessed from the sender routine.
EDIT: Oh I see, it's because you started accessing it from the test. Could we remove the atomic and use the syncBarrier
on the sender to get around the race? We'll probably also have to drop the NumQueuedPayloads
method or then we'll have to add a lock to s.QueuedPayloads
.
statsd/statsd.go
Outdated
|
||
// Client returns the global StatsClient. | ||
func Client() StatsClient { | ||
mu.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of those cases where 100% race correctness bugs me. In normal operation you initialize Client during app bootstrap and never set it again so we could access it lock free everywhere.
However, because we call SetClient
on tests, we now have to introduce a lock on what was before a lock-free operation. I'm not sure if the Go compiler is smart enough to realize it can ditch the lock outside of test runs but I suspect this could have a non-negligible toll on hot and loopy code paths where Client()
is called often.
I wonder if there's a different way to appease -race
in these cases by triggering some global coroutine sync (or something like C/C++ macros, where we could disable some locks at compile time if we detect we are not compiling for a test run).
Anyway, -race
rant mode off 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it's not nice to introduce locks here for the sake of tests. I should've spent more time on this. I'll try again tomorrow by using TestMain(m *testing.M)
and initializing the mock statsd client only once. That should work and it should eliminate the race.
writer/multi_writer_test.go
Outdated
assert.Equal(t, "ping2", msg2.(string)) | ||
assert.ElementsMatch(t, | ||
[]string{"ping1", "ping2"}, | ||
[]string{(<-multi.mch).(string), (<-multi.mch).(string)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is execution order guaranteed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the slice expression is not evaluated until both receives complete. As for assert.ElementsMatch
matches elements regardless of their order.
writer/payload.go
Outdated
@@ -336,7 +336,7 @@ func (s *QueuablePayloadSender) flushQueue() error { | |||
func (s *QueuablePayloadSender) removeQueuedPayload(e *list.Element) *list.Element { | |||
next := e.Next() | |||
payload := e.Value.(*Payload) | |||
s.currentQueuedSize -= int64(len(payload.Bytes)) | |||
atomic.AddInt64(&s.currentQueuedSize, -int64(len(payload.Bytes))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's causing the race here? The idea when I implemented this was that currentQueuedSize
was only accessed from the sender routine.
EDIT: Oh I see, it's because you started accessing it from the test. Could we remove the atomic and use the syncBarrier
on the sender to get around the race? We'll probably also have to drop the NumQueuedPayloads
method or then we'll have to add a lock to s.QueuedPayloads
.
3192e40
to
cc28b52
Compare
@AlexJF PTAL, I've used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
No description provided.