Race aggregation #1

SumedhArani · 2017-02-13T23:27:31Z

Race aggregation based on the memory operation.

Example for in the case of a data race happening at a specific location is shown but it may not show in the case of an array as different locations come in picture.

simoatze · 2017-02-13T23:40:11Z

@SumedhArani Great job! I'll take a look this week and I'll try to give it a try!
Also, let's move the conversation and issue discussions in here so I'll close or the issure in the archer repository. Thanks!

simoatze · 2017-02-13T23:42:39Z

also I think we should just keep the race-aggregation branch as a development branch, no need for PR now, once everything works than we can do a PR.

dongahn · 2017-02-13T23:44:14Z

@simoatze: typically having an experimental PR helps ppl to review code with diffing support, inline commenting etc. We won't certainly merge it right away. With this, could you reopen this?

simoatze · 2017-02-13T23:44:26Z

and if you can keep it updated with the llvm-mirror/compiler-rt. Basically we update both master and race-aggregation branches so they are always allined to the llvm-mirror.

simoatze · 2017-02-13T23:45:05Z

@dongahn Got it, it makes more sense :)

SumedhArani · 2017-02-14T00:16:32Z

@simoatze Thank you! :)
I'll move the conversation here from next time onwards and when necessary.
Take your own time to review it cause anyways I'm busy with my college internal exams this week.

@dongahn
I'll post in an experimental PR whenever required.

Thank you,
Sumedh.

dongahn · 2017-02-14T03:52:40Z

Nice work @SumedhArani!

A couple of thoughts. Is there a way in which we can actually intercept PrintReport. E.g., using LD_PRELOAD trick of some sort. If we can, perhaps we can keep this logic into Archer itself without having to modify the compiler runtime. This way, we may be able to make this logic work on any Clang/LLVM we install on a system?

Two things that caught my eyes.

There is a linear search in searching for previously seen races. This can become expensive if there is larger number of races. Hash container (e.g., map in C++ STL) is provided, this can be a better data structure to use which allow you to do this operation in O (1).
Doing a reduction on an absolute address will work for reducing races within a single process but won't work across difference processes of an MPI application. Even if a same variable is used, it can be resolved to different addresses across different MPI processes. Further, this certainly won't work on addresses from multiple programs.

Is a more relative concept like module+offset or source file+line number also available through ReportSpec? That might be a bit better criteria to use?

SumedhArani · 2017-02-14T04:47:28Z

@dongahn, I know that better algorithms could be put to use but I came out with a prototype. My first thoughts were to use a map but tsan has very less documentation and they have written their own data structures (Vector instead of vector) so I found it a lot buggy to get started itself and it shows up with error when I tried to include the STL containers.

So for now I've done a linear search with the vector data structure that they have provided.

One of the plus points for map are that it if targeted properly, we can even output which races were reduced on which factor.

As for point two, I did understand the inhibitions of this type of reduction.

Yeah it's possible to do source file + line number.

I posted this prototype to verify the design of my code with your vision. If so, I'll have the remaining two scenarios as told in the issue comments up and running with necessary tweaks and also optimisations where possible.

And I don't know about LD_PRELOAD! I need to Google it!

Thanks,
Sumedh.

SumedhArani · 2017-02-14T04:48:11Z

And also @dongahn, thanks for the appreciation!!:)

It strikes me now that I can probably make use of ADT plugins provided by the llvm framework and make use of a dense map. Let's see how does it come out.

I'll keep you updated and will try that denseMap after my today's exam gets over.

…y other files

dongahn · 2017-02-18T01:53:56Z

@SumedhArani: did you look at LD_PRELOAD to see if we can replace PrintReport even without having to make any modifications to the compiler runtime source file? I don't recall all the details of LD_PRELOAD but I think it's worth looking into it.

SumedhArani · 2017-02-18T02:22:40Z

@dongahn , No. Sorry I was a bit busy with my internal exams happening in my college. I'll look at it and get back to you after three days ? Had a few personal commitments over this weekend.

I'll get back to you on Monday with the same.

Thanks,
Sumedh.

dongahn · 2017-02-18T02:28:05Z

perfectly fine! Take as much time as you need.

SumedhArani · 2017-02-18T13:31:55Z

Thanks! I'll surely look into it once I'm back to college!! Thanks for being so amiable!!:)

Cheers,
Sumedh

simoatze · 2017-02-21T01:11:49Z

@dongahn I was thinking that LD_PRELOAD wouldn't work because tsan is a static library. I think a solution would be to have PrintReport declared as weak symbol and then define a new PrintReport during the instrumentation pass inside the module where the main function is, like I did for the suppressions. I was talking to @SumedhArani and he'll looking into it.

dongahn · 2017-02-21T01:20:37Z

I see. Yeah it seems the weak symbol trick is the second best then.

To support existing clang/llvm installation, though, is there a way to change the attribute of and existing symbol to become weak? (binary patch...) Let me look.

If TSan accepts the small patch to change the attribute, this won't be neccessary, of course.

simoatze · 2017-02-21T01:21:59Z

Yeah after @SumedhArani does the feasibility study, if everything works I can talk to Dmitry and see if they would accept the patch.

dongahn · 2017-02-21T01:37:23Z

sounds good. In the meantime, objcopy has --weaken-symbol option which may be able to rewrite the attrubute.

https://linux.die.net/man/1/objcopy

If you nm on this TSan runtime library, does this PrintReport function at least shows up as an externally visible symbol though?

simoatze · 2017-02-21T01:57:20Z

Yes it shows up. For the test Sumedh can also just set the function as weak in the source.

dongahn · 2017-02-21T03:01:56Z

Yes I agree.

SumedhArani · 2017-02-26T03:40:53Z

After many long discussions with @simoatze as to how to go about to implement the above suggested technique, I feel that it is not so feasible to get the workaround using weak attributes.

Reasons as to why I feel so,

__tsan_on_report_(the function marked as a weak symbol) does have a pointer to an object of class DescRep which internally has many members of different objects.
Because of the above point, there will be a significant amount of dependent header files(in the tsan library) and their corresponding implementation to be imported (copied) into archer.
This so because, the tsan implementation involves their own data structures and have not made use of data structures available in STL library.
Even if decided to be implemented, we would have to update every now and then
Also we would have to copy all of PrintReport as there is a call to it before the __tsan_on_report is called. (__tsan_on_report and PrintReport are called for every race detected) As found out by @simoatze we can set the log path to /dev/null of PrintReport call instead of the stderr to nullify the PrintReport call that happens before __tsan_on_report
Hence, will it not become almost equivalent to shipping our own tsan library ? and defeat the purpose ?

Let me know as to what to proceed with next?

Thanks,
Sumedh.

simoatze · 2017-03-03T00:08:57Z

@dongahn I was suggesting @SumedhArani to import the entire "compiler-rt" as a submodule of archer, in this way we can just redefine the __tsan_on_report function and for compiling purposes we will have all the header file we need. The function __tsan_on_report has a (void *) ptr to the RepDesc structure that has all the info for the report reduction, in this way we can just do everything in our shared library and don't need for any edit on the compiler-rt. Does it make sense?

dongahn · 2017-03-03T01:25:01Z

Nice suggestion! Are the API exposed from compiler-rt pretty stable? Just make sure the new definition of __tsan_on_report is written to only stable APIs which don't change much between different versions of compiler-rt. This way, you don't want to suffer from slight API/ABI mismatches between the imported compiler-rt and the compiler-rt used in a installed version. Also if there are ways to do API "version" checks, make sure you folks use these mechanisms.

atos is apparently not able to resolve symbol addresses properly on i386-darwin reliably any more. This is causing bot flakiness: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/6841 There have not been any SDK changes on the bot as of late. /Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/llvm/projects/compiler-rt/test/asan/TestCases/Darwin/atos-symbolizer.cc:20:12: error: expected string not found in input // CHECK: #1 0x{{.*}} in main {{.*}}atos-symbolizer.cc:[[@LINE-4]] ^ <stdin>:35:27: note: scanning from here #0 0x112f56 in wrap_free (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/lib/clang/5.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:i386+0x56f56) ^ <stdin>:35:27: note: with expression "@LINE-4" equal to "16" #0 0x112f56 in wrap_free (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/lib/clang/5.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:i386+0x56f56) ^ <stdin>:36:168: note: possible intended match here #1 0xb6f20 in main (/Users/buildslave/jenkins/sharedspace/clang-stage1-cmake-RA_workspace/clang-build/tools/clang/runtime/compiler-rt-bins/test/asan/I386DarwinConfig/TestCases/Darwin/Output/atos-symbolizer.cc.tmp:i386+0x1f20) git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@304674 91177308-0d34-0410-b5e6-96231b3b80d8

Summary: __builtin_clz used for Log calculation returns an undefined result when argument is 0. I noticed that issue when was testing some fuzzers: ``` /src/libfuzzer/FuzzerTracePC.h:282:33: runtime error: shift exponent 450349 is too large for 32-bit type 'uint32_t' (aka 'unsigned int') #0 0x43d83f in operator() /src/libfuzzer/FuzzerTracePC.h:283:33 #1 0x43d83f in void fuzzer::TracePC::CollectFeatures<fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*)::$_1>(fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*)::$_1) const /src/libfuzzer/FuzzerTracePC.h:290 llvm-mirror#2 0x43cbd4 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:445:7 llvm-mirror#3 0x43e5f1 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:706:5 llvm-mirror#4 0x43e9e1 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:739:3 llvm-mirror#5 0x432f8c in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:754:6 llvm-mirror#6 0x42ee18 in main /src/libfuzzer/FuzzerMain.cpp:20:10 llvm-mirror#7 0x7f17ffeb182f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) llvm-mirror#8 0x407838 in _start (/out/rotate_fuzzer+0x407838) Reviewers: kcc Reviewed By: kcc Subscribers: llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D41457 git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@321211 91177308-0d34-0410-b5e6-96231b3b80d8

Summary: Running sanitized 32-bit x86 programs on glibc 2.27 crashes at startup, with: ERROR: AddressSanitizer: SEGV on unknown address 0xf7a8a250 (pc 0xf7f807f4 bp 0xff969fc8 sp 0xff969f7c T16777215) The signal is caused by a WRITE memory access. #0 0xf7f807f3 in _dl_get_tls_static_info (/lib/ld-linux.so.2+0x127f3) #1 0xf7a92599 (/lib/libasan.so.5+0x112599) llvm-mirror#2 0xf7a80737 (/lib/libasan.so.5+0x100737) llvm-mirror#3 0xf7f7e14f in _dl_init (/lib/ld-linux.so.2+0x1014f) llvm-mirror#4 0xf7f6eb49 (/lib/ld-linux.so.2+0xb49) The problem is that glibc changed the calling convention for the GLIBC_PRIVATE symbol that sanitizer uses (even when it should not, GLIBC_PRIVATE is exactly for symbols that can change at any time, be removed etc.), see https://sourceware.org/ml/libc-alpha/2017-08/msg00497.html Fixes google/sanitizers#954 Patch By: Jakub Jelinek Reviewed By: vitalybuka, Lekensteyn Differential Revison: https://reviews.llvm.org/D44623 git-svn-id: https://llvm.org/svn/llvm-project/compiler-rt/trunk@334363 91177308-0d34-0410-b5e6-96231b3b80d8

SumedhArani added 2 commits February 14, 2017 04:48

Working aggregator for mem loc

be11cc3

Added a missing line from the orig rep

72c2eff

simoatze closed this Feb 13, 2017

simoatze reopened this Feb 13, 2017

Same prototype maintained: Changes now only in tsan_report and not an…

10159ae

…y other files

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Race aggregation #1

Race aggregation #1

SumedhArani commented Feb 13, 2017

simoatze commented Feb 13, 2017

simoatze commented Feb 13, 2017

dongahn commented Feb 13, 2017

simoatze commented Feb 13, 2017

simoatze commented Feb 13, 2017

SumedhArani commented Feb 14, 2017

dongahn commented Feb 14, 2017

SumedhArani commented Feb 14, 2017 •

edited

Loading

SumedhArani commented Feb 14, 2017 •

edited

Loading

dongahn commented Feb 18, 2017

SumedhArani commented Feb 18, 2017

dongahn commented Feb 18, 2017

SumedhArani commented Feb 18, 2017

simoatze commented Feb 21, 2017

dongahn commented Feb 21, 2017

simoatze commented Feb 21, 2017 •

edited

Loading

dongahn commented Feb 21, 2017

simoatze commented Feb 21, 2017

dongahn commented Feb 21, 2017

SumedhArani commented Feb 26, 2017

simoatze commented Mar 3, 2017

dongahn commented Mar 3, 2017

Race aggregation #1

Are you sure you want to change the base?

Race aggregation #1

Conversation

SumedhArani commented Feb 13, 2017

simoatze commented Feb 13, 2017

simoatze commented Feb 13, 2017

dongahn commented Feb 13, 2017

simoatze commented Feb 13, 2017

simoatze commented Feb 13, 2017

SumedhArani commented Feb 14, 2017

dongahn commented Feb 14, 2017

SumedhArani commented Feb 14, 2017 • edited Loading

SumedhArani commented Feb 14, 2017 • edited Loading

dongahn commented Feb 18, 2017

SumedhArani commented Feb 18, 2017

dongahn commented Feb 18, 2017

SumedhArani commented Feb 18, 2017

simoatze commented Feb 21, 2017

dongahn commented Feb 21, 2017

simoatze commented Feb 21, 2017 • edited Loading

dongahn commented Feb 21, 2017

simoatze commented Feb 21, 2017

dongahn commented Feb 21, 2017

SumedhArani commented Feb 26, 2017

simoatze commented Mar 3, 2017

dongahn commented Mar 3, 2017

SumedhArani commented Feb 14, 2017 •

edited

Loading

SumedhArani commented Feb 14, 2017 •

edited

Loading

simoatze commented Feb 21, 2017 •

edited

Loading