TSAN false positives with chromium clang toolchain #953
Comments
You need to add the llvm-symbolizer binary to your PATH. There may also be a runtime flag for specifying the binary path. And of course, also compile with |
I see that |
But! This can also be a real bug in absl mutex. Can you obtain a report with line numbers? |
I tried everything you recommended to get line numbers, without success. llvm-symbolizer is in my PATH, I compile with -g, and I even set TSAN_SYMBOLIZER_PATH and TSAN_OPTIONS=symbolize=1. As for absl, it does have some logic that depends on THREAD_SANITIZER being defined. I pass -DTHREAD_SANITIZER when compiling, and I verified, by adding a line to absl, that it auto-detects TSAN support correctly. |
What the threads are doing:
Nothing is wrong with the test as far as I can tell. The race seems to trigger in absl's mutex internals, where we do 2 writes of different sizes (4 and 8 bytes). Maybe I'm doing something wrong in how I use or configure the Chromium clang toolchain in Bazel. Is there anything I should look out for? |
Zooming into the mutex internals: T1 calls LockWhen, which is where its associated thread identity gets created. The memset in the trace sets this identity object to 0. T0 (main) unlocks the mutex, and by the time this happens some identities have been created with associated waker objects. T0 notifies the waker objects of the Unlock, which is a write to the waker and thus to thread_identity. I'm probably missing something, but I don't see how the memset write is synchronized with the write to the waker state, which is part of the identity. |
Does symbolization work for the simplest cases? It's not a good idea to throw in all possible unknowns into an equation from the beginning. E.g. does the following work for you?
|
Does adding |
Thanks for the help with symbolization. It turns out I had a typo in my blazerc file (s/asan/tsan) which prevented the -g compile option from being added in tsan mode. I can get line numbers now:
|
Also, adding report_atomic_races=0 to TSAN_OPTIONS prevents the race reports. |
Then I think the hypothesis in #953 (comment) is correct. Unfortunately, I am not sure we will have time to fix this in the near future. +@kcc
? |
For others running into this issue, one way to silence atomic races reports is by adding the following to your .bazelrc (assuming you run TSAN with --config=tsan):
|
It looks like on Mac there is a specific workaround in TSan for |
To be clear, adding |
I think this is a false positive related to the fact that we specifically ignore most synchronization in Mutex code: |
Actually I think that this is an actual race, and TSan is right to report it! I still need to dig a bit deeper, but from what I saw so far there simply is no proper happens-before relation between the initialization of the Waiter (which uses some non-atomic writes to initialize the memory) and the later fetch-add. The fact that the Waiter ctor uses an atomic store to explicitly initialize the futex is irrelevant. This reminds me very much of this issue where we concluded that it actually is a race: #1009 |
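To make the missing happens-before relation concrete, here is a minimal sketch of the pattern with the ordering in place. It is not Abseil's code: `Waiter`, `futex`, and `RunPublishedWaiter` are hypothetical names chosen for illustration. One thread initializes the object with plain writes and then publishes it with a release store; the other thread obtains it with an acquire load before doing the fetch-add.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for a per-thread waiter object; not Abseil's type.
struct Waiter {
  // Constructing an atomic is itself a plain (non-atomic) write.
  std::atomic<int> futex{0};
};

// Initializes a waiter on the calling thread, lets a second thread bump its
// counter, and returns the final counter value.
int RunPublishedWaiter() {
  std::atomic<Waiter*> published{nullptr};

  // The notifier starts before the waiter exists, so thread creation alone
  // cannot order the waiter's initialization before the fetch_add.
  std::thread notifier([&published] {
    Waiter* w = nullptr;
    // The acquire load pairs with the release store below: once it observes
    // the pointer, the initializing writes happen-before what follows.
    while ((w = published.load(std::memory_order_acquire)) == nullptr) {
      std::this_thread::yield();
    }
    w->futex.fetch_add(1, std::memory_order_relaxed);
  });

  Waiter waiter;  // Plain initialization, as in the reported trace.
  published.store(&waiter, std::memory_order_release);

  notifier.join();
  return waiter.futex.load(std::memory_order_relaxed);
}
```

If the pointer reached the second thread without such a release/acquire pair (or some equivalent synchronization), the initializing write and the later fetch-add would be unordered, which is the kind of access pair described above.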
I have a similar problem with my grpc-project. I suspect that the problem is related to a bug that was fixed in abseil-cpp |
abseil/abseil-cpp@bb7be49 I did much the same for our project and it now works, so I think this issue is resolved, at least for library users |
I've been trying to extend the Bazel toolchain at https://github.com/vsco/bazel-toolchains (based on the Chromium clang toolchain) to add support for sanitizers. I succeeded in eliminating MSAN false positives after instrumenting libcxx, but the same approach failed for TSAN. In particular, TSAN (with instrumented libcxx, confirmed via nm) reports races when I try to run the Abseil (https://github.com/abseil/abseil-cpp) synchronization tests. I added one report below (sorry, I'm not sure how to get line numbers to work).
Any ideas what may be happening here?