Make TSAN tests pass with the GIL disabled in free-threaded builds #117657

Open
2 of 23 tasks
Tracked by #108219
mpage opened this issue Apr 8, 2024 · 3 comments · Fixed by #123697
Assignees
Labels
topic-free-threading type-feature A feature request or enhancement

Comments

@mpage
Contributor

mpage commented Apr 8, 2024

Feature or enhancement

We need to make the TSAN tests pass with the GIL disabled before we can disable the GIL by default in free-threaded builds.

This should proceed in two phases:

  1. Add suppressions for existing TSAN warnings.
  2. Triage, fix, and remove the suppressions for the warnings enumerated in (1).

How to run the TSAN tests

  1. Build with TSAN:
    env CC=clang CXX=clang++ ./configure --disable-gil --with-thread-sanitizer --with-pydebug && make -j
  2. Run tests:
    env TSAN_OPTIONS=suppressions=<repo_root>/Tools/tsan/suppressions_free_threading.txt ./python -m test --tsan -j4
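The run step above can also be driven from a script. The sketch below only assembles the environment and argument list from the commands quoted in this issue. The helper names `tsan_env` and `tsan_test_command` are hypothetical, and it assumes a POSIX checkout in which `./python` was produced by step 1.

```python
import os
from pathlib import PurePosixPath

# Location of the suppressions file relative to the repository root,
# as given in the issue text.
SUPPRESSIONS = PurePosixPath("Tools/tsan/suppressions_free_threading.txt")


def tsan_env(repo_root, base_env=None):
    """Return a copy of the environment with TSAN_OPTIONS set.

    `base_env` defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TSAN_OPTIONS"] = f"suppressions={PurePosixPath(repo_root) / SUPPRESSIONS}"
    return env


def tsan_test_command(jobs=4):
    """Return the argv for step 2 (run the test suite in TSAN mode)."""
    return ["./python", "-m", "test", "--tsan", f"-j{jobs}"]
```

Something like `subprocess.run(tsan_test_command(), cwd=repo_root, env=tsan_env(repo_root))` would then run the suite from the repository root.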

Working on a race

  1. Verify that the TSAN tests are passing using the steps from above.
  2. Pick a suppression from the section below and assign it to yourself (edit it and add your username next to it). Some of them may be related. If you find that other suppressions are related to the race you're working on, please assign them to yourself, or contact their owner if they're already assigned.
  3. Delete the suppression from <repo_root>/Tools/tsan/suppressions_free_threading.txt, run the TSAN tests, and verify that the race is reported by TSAN. You may need to comment out the suppressions for unrelated functions (notably, _PyEval_EvalFrameDefault) in order to reproduce the race.
  4. Fix the race.
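Editing the suppressions file by hand is fine, but the step can be sketched as a small helper. Sanitizer suppression files hold one `type:pattern` rule per line, with `#` starting a comment. The function name `edit_suppressions` and the rule names in the usage example are made up for illustration and are not entries from the real file.

```python
def edit_suppressions(text, symbol, comment_out=False):
    """Remove (or comment out) every rule whose pattern equals `symbol`.

    Returns a tuple (new_text, touched_rules). Comment lines and blank
    lines are passed through unchanged.
    """
    kept, touched = [], []
    for line in text.splitlines(keepends=True):
        rule = line.strip()
        is_rule = bool(rule) and not rule.startswith("#") and ":" in rule
        if is_rule and rule.split(":", 1)[1] == symbol:
            touched.append(rule)
            if comment_out:
                # Keep the rule visible but inactive.
                kept.append("# " + line)
        else:
            kept.append(line)
    return "".join(kept), touched


# Hypothetical file contents, for illustration only.
sample = (
    "# free-threading suppressions\n"
    "race:example_function_a\n"
    "race_top:example_function_b\n"
)
new_text, touched = edit_suppressions(sample, "example_function_b")
```

Here `touched` lists the rules that were dropped, so an empty list means the name did not match anything and the file is unchanged.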

Suppressions

Tasks

Linked PRs

@mpage mpage added type-feature A feature request or enhancement topic-free-threading labels Apr 8, 2024
@mpage mpage self-assigned this Apr 8, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Apr 12, 2024
colesbury pushed a commit that referenced this issue Apr 15, 2024
Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
colesbury pushed a commit that referenced this issue Apr 15, 2024
…rld()` and `tstate_try_attach()` (#117828)

TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int`
on `tstate->state` in `tstate_try_attach()` and the non-atomic load of
`tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails,
but TSAN erroneously treats it as a store.
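To see why a failed compare-exchange should count as a load rather than a store, here is a toy model in Python. It is not CPython's implementation: a lock stands in for hardware atomicity, and the class name and state constants are illustrative assumptions.

```python
import threading

# Illustrative thread states; the real values live in CPython's C headers.
DETACHED, ATTACHED, SUSPENDED = 0, 1, 2


class AtomicInt:
    """Toy atomic integer; a lock stands in for hardware atomicity."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()
        self.stores = 0  # number of times the value was actually written

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, desired):
        """Write `desired` only if the current value equals `expected`."""
        with self._lock:
            if self._value != expected:
                # Failure path: the value was read, never written.
                return False
            self._value = desired
            self.stores += 1
            return True


state = AtomicInt(SUSPENDED)
attached = state.compare_exchange(DETACHED, ATTACHED)  # fails: state is SUSPENDED
```

After the failed call `state.stores` is still 0, which is the sense in which the commit message says the operation behaves like a load.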
DinoV pushed a commit that referenced this issue Apr 17, 2024
…#117954)

Fix data races in the method cache in free-threaded builds

These are technically data races, but I think they're benign (to
the extent that that is actually possible). We update cache entries
non-atomically but read them atomically from another thread, and there's
nothing that establishes a happens-before relationship between the
reads and writes that I can see.
DinoV added a commit to mpage/cpython that referenced this issue Apr 17, 2024
DinoV pushed a commit that referenced this issue Apr 17, 2024
…#117955)

Quiet erroneous TSAN reports of data races in `_PySeqLock`

TSAN reports a couple of data races between the compare/exchange in
`_PySeqLock_LockWrite` and the non-atomic loads in `_PySeqLock_{Abandon,Unlock}Write`.
This is another instance of TSAN incorrectly modeling failed compare/exchange
as a write instead of a load.
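For readers unfamiliar with sequence locks, the following is a generic model of the idea, not the code in CPython. The sequence number is odd while a writer is active and even otherwise; a writer takes the lock with a compare-exchange and releases it by moving to the next even number, or back to the previous one when abandoning. All names are hypothetical and a Python lock stands in for the atomic operations.

```python
import threading


class SeqLock:
    """Generic sequence-lock model: odd sequence means a write is in progress."""

    def __init__(self):
        self._seq = 0
        self._guard = threading.Lock()  # stands in for atomic operations

    def sequence(self):
        with self._guard:
            return self._seq

    def _compare_exchange(self, expected, desired):
        with self._guard:
            if self._seq != expected:
                return False  # failed: nothing is written
            self._seq = desired
            return True

    def try_lock_write(self):
        prev = self.sequence()
        if prev % 2 == 1:
            return False  # another writer holds the lock
        return self._compare_exchange(prev, prev + 1)

    def unlock_write(self):
        with self._guard:
            assert self._seq % 2 == 1, "not locked for writing"
            self._seq += 1  # publish: readers that started earlier must retry

    def abandon_write(self):
        with self._guard:
            assert self._seq % 2 == 1, "not locked for writing"
            self._seq -= 1  # nothing changed: earlier readers stay valid

    def begin_read(self):
        seq = self.sequence()
        return None if seq % 2 == 1 else seq

    def end_read(self, start):
        """Return True if no write completed since `begin_read`."""
        return self.sequence() == start
```

Only the thread that won the compare-exchange ever reaches `unlock_write` or `abandon_write`, so a competing failed compare-exchange cannot have modified the sequence in between.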
DinoV added a commit that referenced this issue Apr 19, 2024
)

Use relaxed load to check if dictkeys are immortal
DinoV added a commit that referenced this issue Apr 19, 2024
…th strong enough semantics (#118111)

Use acquire for load of ob_ref_shared
DinoV pushed a commit that referenced this issue Apr 23, 2024
… `tstate->state` (#118165)

Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state`
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
The `used` field must be written using atomic stores because `set_len`
and iterators may access the field concurrently without holding the
per-object lock.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
The ProcessPoolForkserver combined with resource_tracker starts a thread
after forking, which is not supported by TSan.

Also skip test_multiprocessing_fork for the same reason
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…1551)

The only remaining race in dictobject.c was in _PyDict_CheckConsistency
when the dictionary has shared keys.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
The functions look thread-safe and I haven't seen any warnings issued
when running the tests locally.
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…#121599)

This avoids messages like:

  ThreadSanitizer: starting new threads after multi-threaded fork is not
  supported. Dying (set die_after_fork=0 to override)
colesbury added a commit to colesbury/cpython that referenced this issue Jul 23, 2024
…ded build

The adaptive counter doesn't do anything currently in the free-threaded
build and TSan reports a data race due to concurrent modifications to
the counter.
colesbury added a commit to colesbury/cpython that referenced this issue Jul 23, 2024
These tests fail when run under thread sanitizer due to the use of fork
and threads.
colesbury added a commit that referenced this issue Jul 23, 2024
These tests fail when run under thread sanitizer due to the use of fork
and threads.
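A skip of this kind can be sketched with `unittest.skipIf` and the build configuration reported by `sysconfig`. CPython's test suite has its own helpers for this in `test.support`; the names `built_with_tsan` and `skip_under_tsan` below are hypothetical stand-ins, and the detection is a best-effort assumption based on the configure flag shown earlier in this issue.

```python
import sysconfig
import unittest


def built_with_tsan(cflags=None, config_args=None):
    """Best-effort check for a ThreadSanitizer build of this interpreter.

    The arguments default to the values recorded at build time; they can
    be passed explicitly to check the logic without a TSAN build.
    """
    if cflags is None:
        cflags = sysconfig.get_config_var("CFLAGS") or ""
    if config_args is None:
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return "-fsanitize=thread" in cflags or "--with-thread-sanitizer" in config_args


def skip_under_tsan(reason):
    """Decorator that skips a test or test class on TSAN builds."""
    return unittest.skipIf(built_with_tsan(), reason)


@skip_under_tsan("TSan does not support starting threads after a multi-threaded fork")
class ForkAndThreadsExample(unittest.TestCase):
    # Placeholder body; a real test would fork and then start threads.
    def test_placeholder(self):
        self.assertTrue(True)
```

On a regular build the decorator is a no-op, so the tests keep running everywhere except under TSAN.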
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 23, 2024
)

These tests fail when run under thread sanitizer due to the use of fork
and threads.
(cherry picked from commit 64e221d)

Co-authored-by: Sam Gross
colesbury added a commit that referenced this issue Jul 23, 2024
…122198)

These tests fail when run under thread sanitizer due to the use of fork
and threads.
(cherry picked from commit 64e221d)

Co-authored-by: Sam Gross
colesbury added a commit that referenced this issue Jul 30, 2024
…ild (#122190)

The adaptive counter doesn't do anything currently in the free-threaded
build and TSan reports a data race due to concurrent modifications to
the counter.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 30, 2024
…ded build (pythonGH-122190)

The adaptive counter doesn't do anything currently in the free-threaded
build and TSan reports a data race due to concurrent modifications to
the counter.
(cherry picked from commit 2b163aa)

Co-authored-by: Sam Gross
colesbury added a commit that referenced this issue Jul 30, 2024
…aded build (GH-122190) (#122475)

The adaptive counter doesn't do anything currently in the free-threaded
build and TSan reports a data race due to concurrent modifications to
the counter.
(cherry picked from commit 2b163aa)

Co-authored-by: Sam Gross
dpdani added a commit to dpdani/cpython that referenced this issue Aug 21, 2024
@mpage mpage reopened this Sep 6, 2024
Zheaoli added a commit to Zheaoli/cpython that referenced this issue Sep 9, 2024
Yhg1s added a commit to Yhg1s/cpython that referenced this issue Sep 29, 2024
3 participants