-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store/bucket: fix data race #6575
Conversation
The matchers slice is now sorted in each call but that introduces a data race because the slice is shared between all calls. Do the sorting once on the outermost function. Signed-off-by: Giedrius Statkevičius <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @GiedriusS, can you please explain more, why this change fixed the issue? Thanks
That would require some digging but from my adhoc tests I cannot reproduce the original problem anymore. I guess I can try to add tests that loop infinitely and see what happens. Maybe this can be done separately? The data race needs to be fixed either way 😄 |
@GiedriusS I am wondering if this also fixes #6495 |
:O I forgot about that issue. I think this also should fix it because in my tests I was pressing execute many times and very rarely the values were completely different even though the query is the same. I'll work on tests tomorrow |
Found another race in chunks reader. But that's a small one and a separate issue. I just opened a separate pr #6578 |
cc @rabenhorst potentially a fix for #6495. |
Yeah I think I can reproduce the data race using the test updated yeya24@be85c85 with multiple matchers. Data race is at https://github.com/thanos-io/thanos/blob/main/pkg/store/bucket.go#L2208 and other place we need to read the matchers.
|
Signed-off-by: Giedrius Statkevičius <[email protected]>
Added a test for this exact race. The test fails on the main branch immediately almost:
However, I deployed this prod and #6545 still occurs. I couldn't reproduce #6495 with this. I also tried writing this test using BucketStore but with it, I wasn't able to reproduce the race :/ probably race detector slows down the code so much that the first sorting happens before all of the others. |
@GiedriusS I will merge this pr first to fix the data race. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
* store/bucket: remove sort.Slice data race The matchers slice is now sorted in each call but that introduces a data race because the slice is shared between all calls. Do the sorting once on the outermost function. Signed-off-by: Giedrius Statkevičius <[email protected]> * store: add test for ExpandPostings() race Signed-off-by: Giedrius Statkevičius <[email protected]> --------- Signed-off-by: Giedrius Statkevičius <[email protected]>
Add a fix for issue #6545. Unfortunately, we don't have tests with
-race
at the moment but I tested this locally. Somehow the data race introduces inconsistencies in the retrieved data. I couldn't reproduce anymore after this fix.