-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race aggregation #1
base: master
Are you sure you want to change the base?
Conversation
@SumedhArani Great job! I'll take a look this week and I'll try to give it a try! |
also I think we should just keep the race-aggregation branch as a development branch, no need for PR now, once everything works than we can do a PR. |
@simoatze: typically having an experimental PR helps ppl to review code with diffing support, inline commenting etc. We won't certainly merge it right away. With this, could you reopen this? |
and if you can keep it updated with the llvm-mirror/compiler-rt. Basically we update both master and race-aggregation branches so they are always allined to the llvm-mirror. |
@dongahn Got it, it makes more sense :) |
Nice work @SumedhArani! A couple of thoughts. Is there a way in which we can actually intercept PrintReport. E.g., using LD_PRELOAD trick of some sort. If we can, perhaps we can keep this logic into Archer itself without having to modify the compiler runtime. This way, we may be able to make this logic work on any Clang/LLVM we install on a system? Two things that caught my eyes.
Is a more relative concept like module+offset or source file+line number also available through ReportSpec? That might be a bit better criteria to use? |
@dongahn, I know that better algorithms could be put to use but I came out with a prototype. My first thoughts were to use a map but tsan has very less documentation and they have written their own data structures (Vector instead of vector) so I found it a lot buggy to get started itself and it shows up with error when I tried to include the STL containers. So for now I've done a linear search with the vector data structure that they have provided. One of the plus points for map are that it if targeted properly, we can even output which races were reduced on which factor. As for point two, I did understand the inhibitions of this type of reduction. Yeah it's possible to do source file + line number. I posted this prototype to verify the design of my code with your vision. If so, I'll have the remaining two scenarios as told in the issue comments up and running with necessary tweaks and also optimisations where possible. And I don't know about LD_PRELOAD! I need to Google it! Thanks, |
And also @dongahn, thanks for the appreciation!!:) It strikes me now that I can probably make use of ADT plugins provided by the llvm framework and make use of a dense map. Let's see how does it come out. I'll keep you updated and will try that denseMap after my today's exam gets over. |
@SumedhArani: did you look at LD_PRELOAD to see if we can replace PrintReport even without having to make any modifications to the compiler runtime source file? I don't recall all the details of LD_PRELOAD but I think it's worth looking into it. |
@dongahn , No. Sorry I was a bit busy with my internal exams happening in my college. I'll look at it and get back to you after three days ? Had a few personal commitments over this weekend. I'll get back to you on Monday with the same. Thanks, |
perfectly fine! Take as much time as you need. |
Thanks! I'll surely look into it once I'm back to college!! Thanks for being so amiable!!:) Cheers, |
@dongahn I was thinking that LD_PRELOAD wouldn't work because tsan is a static library. I think a solution would be to have PrintReport declared as weak symbol and then define a new PrintReport during the instrumentation pass inside the module where the main function is, like I did for the suppressions. I was talking to @SumedhArani and he'll looking into it. |
I see. Yeah it seems the weak symbol trick is the second best then. To support existing clang/llvm installation, though, is there a way to change the attribute of and existing symbol to become weak? (binary patch...) Let me look. If TSan accepts the small patch to change the attribute, this won't be neccessary, of course. |
Yeah after @SumedhArani does the feasibility study, if everything works I can talk to Dmitry and see if they would accept the patch. |
sounds good. In the meantime, https://linux.die.net/man/1/objcopy If you |
Yes it shows up. For the test Sumedh can also just set the function as weak in the source. |
Yes I agree. |
After many long discussions with @simoatze as to how to go about to implement the above suggested technique, I feel that it is not so feasible to get the workaround using weak attributes. Reasons as to why I feel so,
Let me know as to what to proceed with next? Thanks, |
@dongahn I was suggesting @SumedhArani to import the entire "compiler-rt" as a submodule of archer, in this way we can just redefine the __tsan_on_report function and for compiling purposes we will have all the header file we need. The function __tsan_on_report has a (void *) ptr to the RepDesc structure that has all the info for the report reduction, in this way we can just do everything in our shared library and don't need for any edit on the compiler-rt. Does it make sense? |
Nice suggestion! Are the API exposed from compiler-rt pretty stable? Just make sure the new definition of __tsan_on_report is written to only stable APIs which don't change much between different versions of compiler-rt. This way, you don't want to suffer from slight API/ABI mismatches between the imported compiler-rt and the compiler-rt used in a installed version. Also if there are ways to do API "version" checks, make sure you folks use these mechanisms. |
atos is apparently not able to resolve symbol addresses properly on i386-darwin reliably any more. This is causing bot flakiness: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/6841 There have not been any SDK changes on the bot as of late. /Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/llvm/projects/compiler-rt/test/asan/TestCases/Darwin/atos-symbolizer.cc:20:12: error: expected string not found in input // CHECK: #1 0x{{.*}} in main {{.*}}atos-symbolizer.cc:[[@LINE-4]] ^ <stdin>:35:27: note: scanning from here #0 0x112f56 in wrap_free (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/lib/clang/5.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:i386+0x56f56) ^ <stdin>:35:27: note: with expression "@LINE-4" equal to "16" #0 0x112f56 in wrap_free (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/lib/clang/5.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:i386+0x56f56) ^ <stdin>:36:168: note: possible intended match here #1 0xb6f20 in main (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/tools/clang/runtime/compiler-rt-bins/test/asan/I386DarwinConfig/TestCases/Darwin/Output/atos-symbolizer.cc.tmp:i386+0x1f20) git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@304674 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: __builtin_clz used for Log calculation returns an undefined result when argument is 0. I noticed that issue when was testing some fuzzers: ``` /src/libfuzzer/FuzzerTracePC.h:282:33: runtime error: shift exponent 450349 is too large for 32-bit type 'uint32_t' (aka 'unsigned int') #0 0x43d83f in operator() /src/libfuzzer/FuzzerTracePC.h:283:33 #1 0x43d83f in void fuzzer::TracePC::CollectFeatures<fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*)::$_1>(fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*)::$_1) const /src/libfuzzer/FuzzerTracePC.h:290 llvm-mirror#2 0x43cbd4 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:445:7 llvm-mirror#3 0x43e5f1 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:706:5 llvm-mirror#4 0x43e9e1 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:739:3 llvm-mirror#5 0x432f8c in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:754:6 llvm-mirror#6 0x42ee18 in main /src/libfuzzer/FuzzerMain.cpp:20:10 llvm-mirror#7 0x7f17ffeb182f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) llvm-mirror#8 0x407838 in _start (/out/rotate_fuzzer+0x407838) Reviewers: kcc Reviewed By: kcc Subscribers: llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D41457 git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@321211 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Running sanitized 32-bit x86 programs on glibc 2.27 crashes at startup, with: ERROR: AddressSanitizer: SEGV on unknown address 0xf7a8a250 (pc 0xf7f807f4 bp 0xff969fc8 sp 0xff969f7c T16777215) The signal is caused by a WRITE memory access. #0 0xf7f807f3 in _dl_get_tls_static_info (/lib/ld-linux.so.2+0x127f3) #1 0xf7a92599 (/lib/libasan.so.5+0x112599) llvm-mirror#2 0xf7a80737 (/lib/libasan.so.5+0x100737) llvm-mirror#3 0xf7f7e14f in _dl_init (/lib/ld-linux.so.2+0x1014f) llvm-mirror#4 0xf7f6eb49 (/lib/ld-linux.so.2+0xb49) The problem is that glibc changed the calling convention for the GLIBC_PRIVATE symbol that sanitizer uses (even when it should not, GLIBC_PRIVATE is exactly for symbols that can change at any time, be removed etc.), see https://sourceware.org/ml/libc-alpha/2017-08/msg00497.html Fixes google/sanitizers#954 Patch By: Jakub Jelinek Reviewed By: vitalybuka, Lekensteyn Differential Revison: https://reviews.llvm.org/D44623 git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@334363 91177308-0d34-0410-b5e6-96231b3b80d8
Race aggregation based on the memory operation.
Example for in the case of a data race happening at a specific location is shown but it may not show in the case of an array as different locations come in picture.