Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to build with TSan #10553

Closed
kit-ty-kate opened this issue May 21, 2024 · 5 comments · Fixed by #10554 or ocaml/opam-repository#25931
Closed

Failure to build with TSan #10553

kit-ty-kate opened this issue May 21, 2024 · 5 comments · Fixed by #10554 or ocaml/opam-repository#25931

Comments

@kit-ty-kate
Copy link
Member

kit-ty-kate commented May 21, 2024

Using OCaml 5.2 with the TSan option on a Linux/arm64 machine, trying to compile dune.3.15.2 results in the following error:

+ /home/kit_ty_kate/.opam/opam-init/hooks/sandbox.sh "build" "./_boot/dune.exe" "build" "dune.install" "--release" "--profile" "dune-bootstrap" "-j" "9" (CWD=/home/kit_ty_kate/.opam/tsan/.opam-switch/build/dune.3.15.2)
- ==================
- WARNING: ThreadSanitizer: data race (pid=72807)
-   Atomic write of size 8 at 0xffff323edd88 by main thread (mutexes: write M0):
-     #0 oldify_one runtime/minor_gc.c:250 (dune.exe+0x14a9740) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #1 caml_do_local_roots runtime/roots.c:64 (dune.exe+0x14ae8b8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #2 caml_thread_scan_roots /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_stubs.c:179 (dune.exe+0x145d95c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #3 caml_empty_minor_heap_promote runtime/minor_gc.c:615 (dune.exe+0x14aa81c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #4 caml_stw_empty_minor_heap_no_major_slice runtime/minor_gc.c:746 (dune.exe+0x14aab90) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #5 caml_stw_empty_minor_heap runtime/minor_gc.c:777 (dune.exe+0x14aae34) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #6 caml_try_run_on_all_domains_with_spin_work runtime/domain.c:1576 (dune.exe+0x1486900) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #7 caml_try_empty_minor_heap_on_all_domains runtime/minor_gc.c:808 (dune.exe+0x14ab000) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #8 caml_empty_minor_heaps_once runtime/minor_gc.c:829 (dune.exe+0x14ab000)
-     #9 caml_poll_gc_work runtime/domain.c:1745 (dune.exe+0x1486178) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #10 caml_handle_gc_interrupt runtime/domain.c:1778 (dune.exe+0x1486f40) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #11 caml_alloc_small_dispatch runtime/minor_gc.c:855 (dune.exe+0x14ab0e4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #12 caml_alloc_string runtime/alloc.c:185 (dune.exe+0x14789c0) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #13 caml_create_bytes runtime/str.c:78 (dune.exe+0x14b59cc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #14 caml_c_call <null> (dune.exe+0x14bfc88) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #15 camlStdlib__Bytes.sub_305 /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/stdlib/bytes.ml:68 (dune.exe+0x1386224) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #16 camlStdlib__Printf.k$27_451 /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/stdlib/bytes.ml:73 (dune.exe+0x13ed3b4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #17 camlStdune__Env.fun_2616 otherlibs/stdune/src/env.ml:44 (dune.exe+0x12de234) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #18 camlStdlib__Map.fold_631 /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/stdlib/map.ml:329 (dune.exe+0x13bba4c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #19 camlStdlib__Map.fold_631 /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/stdlib/map.ml:329 (dune.exe+0x13bba34) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #20 camlStdune__Env.to_unix_1284 otherlibs/stdune/src/env.ml:44 (dune.exe+0x12de10c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #21 camlDune_engine__Process.spawn_inner_4527 src/dune_engine/process.ml:936 (dune.exe+0x114d2f0) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #22 camlDune_engine__Process.fun_4584 src/dune_engine/process.ml:1012 (dune.exe+0x114dcb4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #23 caml_apply2 <null> (dune.exe+0xa00b9c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #24 camlFiber__Scheduler.advance_1192 vendor/fiber/src/scheduler.ml:194 (dune.exe+0x123057c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #25 camlFiber.loop_391 vendor/fiber/src/fiber.ml:15 (dune.exe+0x12227e8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #26 camlDune_engine__Scheduler.run_3186 vendor/fiber/src/fiber.ml:17 (dune.exe+0x115a464) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #27 camlDune_engine__Scheduler.run_and_cleanup_3379 src/dune_engine/scheduler.ml:1141 (dune.exe+0x115adfc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #28 camlDune_engine__Scheduler.go_inner_5501 src/dune_engine/scheduler.ml:1363 (dune.exe+0x115debc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #29 camlCmdliner_term.fun_639 vendor/cmdliner/src/cmdliner_term.ml:24 (dune.exe+0x132b7fc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #30 camlCmdliner_eval.run_parser_576 vendor/cmdliner/src/cmdliner_eval.ml:34 (dune.exe+0x132134c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #31 camlCmdliner_eval.eval_value_inner_1700 vendor/cmdliner/src/cmdliner_eval.ml:202 (dune.exe+0x1322fe4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #32 camlMain.entry bin/main.ml:106 (dune.exe+0xa014b4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #33 caml_program <null> (dune.exe+0x9f890c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #34 caml_start_program <null> (dune.exe+0x14bfe10) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #35 caml_startup_common runtime/startup_nat.c:132 (dune.exe+0x14bf5c4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #36 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0x14bf5c4)
-     #37 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0x14bf6c4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #38 caml_startup runtime/startup_nat.c:144 (dune.exe+0x14bf6c4)
-     #39 caml_main runtime/startup_nat.c:151 (dune.exe+0x14bf6c4)
-     #40 main runtime/main.c:37 (dune.exe+0x9f2480) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
- 
-   Previous read of size 8 at 0xffff323edd88 by thread T3:
-     #0 dune_wait4 otherlibs/stdune/src/wait4_stubs.c:63 (dune.exe+0x146b8d8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #1 caml_c_call <null> (dune.exe+0x14bfc88) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #2 camlStdune__Proc.wait_1084 otherlibs/stdune/src/proc.ml:68 (dune.exe+0x1292954) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #3 camlDune_engine__Scheduler.wait_unix_2604 src/dune_engine/scheduler.ml:600 (dune.exe+0x1156c80) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #4 camlDune_engine__Scheduler.run_2615 src/dune_engine/scheduler.ml:617 (dune.exe+0x1156dc8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #5 camlDune_engine__Scheduler.f_1701 src/dune_engine/scheduler.ml:102 (dune.exe+0x1152f4c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #6 camlThread.fun_768 /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/thread.ml:48 (dune.exe+0x134e63c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #7 caml_start_program <null> (dune.exe+0x14bfe10) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #8 caml_callback_exn runtime/callback.c:201 (dune.exe+0x1481cc8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #9 caml_thread_start /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_stubs.c:634 (dune.exe+0x145f3dc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
- 
-   Location is stack of thread T3.
- 
-   Mutex M0 (0x000001ccbef8) created at:
-     #0 pthread_mutex_init <null> (libtsan.so.2+0x5cd98) (BuildId: 7c77b6e125be6c249d0920f10f0c505c1d25ff16)
-     #1 caml_plat_mutex_init runtime/platform.c:57 (dune.exe+0x14ada80) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #2 caml_init_domains runtime/domain.c:943 (dune.exe+0x14856a0) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #3 caml_init_gc runtime/gc_ctrl.c:353 (dune.exe+0x1492cdc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #4 caml_startup_common runtime/startup_nat.c:111 (dune.exe+0x14bf4c8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #5 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0x14bf4c8)
-     #6 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0x14bf6c4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #7 caml_startup runtime/startup_nat.c:144 (dune.exe+0x14bf6c4)
-     #8 caml_main runtime/startup_nat.c:151 (dune.exe+0x14bf6c4)
-     #9 main runtime/main.c:37 (dune.exe+0x9f2480) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
- 
-   Thread T3 (tid=72812, running) created by main thread at:
-     #0 pthread_create <null> (libtsan.so.2+0x5e2bc) (BuildId: 7c77b6e125be6c249d0920f10f0c505c1d25ff16)
-     #1 st_thread_create /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_pthreads.h:57 (dune.exe+0x145f530) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #2 caml_thread_new /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_stubs.c:685 (dune.exe+0x145f530)
-     #3 caml_c_call <null> (dune.exe+0x14bfc88) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #4 camlThread.create_288 /home/kit_ty_kate/.opam/tsan/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/thread.ml:45 (dune.exe+0x134e574) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #5 camlDune_engine__Scheduler.spawn_1697 src/dune_engine/scheduler.ml:107 (dune.exe+0x1152e80) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #6 camlDune_engine__Scheduler.init_2618 src/dune_engine/scheduler.ml:630 (dune.exe+0x1156fc8) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #7 camlDune_engine__Scheduler.fun_4584 src/dune_engine/scheduler.ml:972 (dune.exe+0x1159760) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #8 camlDune_engine__Scheduler.go_inner_5501 src/dune_engine/scheduler.ml:1342 (dune.exe+0x115de48) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #9 camlCmdliner_term.fun_639 vendor/cmdliner/src/cmdliner_term.ml:24 (dune.exe+0x132b7fc) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #10 camlCmdliner_eval.run_parser_576 vendor/cmdliner/src/cmdliner_eval.ml:34 (dune.exe+0x132134c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #11 camlCmdliner_eval.eval_value_inner_1700 vendor/cmdliner/src/cmdliner_eval.ml:202 (dune.exe+0x1322fe4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #12 camlMain.entry bin/main.ml:106 (dune.exe+0xa014b4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #13 caml_program <null> (dune.exe+0x9f890c) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #14 caml_start_program <null> (dune.exe+0x14bfe10) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #15 caml_startup_common runtime/startup_nat.c:132 (dune.exe+0x14bf5c4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #16 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0x14bf5c4)
-     #17 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0x14bf6c4) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
-     #18 caml_startup runtime/startup_nat.c:144 (dune.exe+0x14bf6c4)
-     #19 caml_main runtime/startup_nat.c:151 (dune.exe+0x14bf6c4)
-     #20 main runtime/main.c:37 (dune.exe+0x9f2480) (BuildId: 961f64b40e1ac76fbc665f18e8a98d9072dd741d)
- 
- SUMMARY: ThreadSanitizer: data race runtime/minor_gc.c:250 in oldify_one
- ==================
- ThreadSanitizer: reported 1 warnings
@jmid
Copy link
Contributor

jmid commented May 21, 2024

@OlivierNicole and I previously bumped into this one. The problem is that this line

pid = wait4(Int_val(v_pid), &status, cv_flags, &ru);

uses a local variable v_pid without the runtime lock in place, which is against the rules.
For integer values, sometimes the rules are bent, but that practice has been overruled by Xavier here: ocaml/ocaml#12737 (comment)

@emillon
Copy link
Collaborator

emillon commented May 21, 2024

Thanks for the quick answer @jmid. If I understand correctly, the fix is to move the Int_val call outside the locked section?

@OlivierNicole
Copy link

That’s correct.

emillon added a commit to emillon/dune that referenced this issue May 21, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
emillon added a commit to emillon/dune that referenced this issue May 21, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
@emillon
Copy link
Collaborator

emillon commented May 21, 2024

There seem to be other similar cases, e.g.

caml_release_runtime_system();
DWORD res = WaitForSingleObject(fsenv->signal, INFINITE);
if (res == WAIT_OBJECT_0 && Int_val(v_debounce) > 0) Sleep(Int_val(v_debounce));
caml_acquire_runtime_system();

How does tsan work, will it always warn when such code is executed or is there a random factor?

@OlivierNicole
Copy link

There is a luck factor on some programs as TSan remembers only the last 4 memory accesses to every memory location. In addition, the memory location needs to be effectively accessed without synchronization.

However, this looks like Windows code, and TSan is not supported on Windows as of today.

emillon added a commit that referenced this issue May 23, 2024
Fixes #10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
emillon added a commit to emillon/dune that referenced this issue May 24, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
emillon added a commit to emillon/dune that referenced this issue May 24, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
emillon added a commit that referenced this issue May 24, 2024
Fixes #10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
emillon added a commit to emillon/opam-repository that referenced this issue May 24, 2024
CHANGES:

### Fixed

- Fix interpretation of `exists_if` predicate in `META` files of installed
  libraries containing more than one element. (ocaml/dune#10564, fixes ocaml/dune#10563, @dbuenzli,
  @nojb)

- Fix TSAN warning in wait4 stubs (ocaml/dune#10554, fixes ocaml/dune#10553, @emillon)
MA0010 pushed a commit to MA0010/dune that referenced this issue Jun 5, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
avsm pushed a commit to avsm/opam-repository that referenced this issue Sep 5, 2024
CHANGES:

### Fixed

- Fix interpretation of `exists_if` predicate in `META` files of installed
  libraries containing more than one element. (ocaml/dune#10564, fixes ocaml/dune#10563, @dbuenzli,
  @nojb)

- Fix TSAN warning in wait4 stubs (ocaml/dune#10554, fixes ocaml/dune#10553, @emillon)
anmonteiro pushed a commit to anmonteiro/dune that referenced this issue Nov 17, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants