Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make _threadmodule.c thread-safe in --disable-gil builds #114271

Closed
Tracked by #116738 ...
mpage opened this issue Jan 18, 2024 · 3 comments
Closed
Tracked by #116738 ...

Make _threadmodule.c thread-safe in --disable-gil builds #114271

mpage opened this issue Jan 18, 2024 · 3 comments
Assignees
Labels
topic-free-threading type-feature A feature request or enhancement

Comments

@mpage
Copy link
Contributor

mpage commented Jan 18, 2024

Feature or enhancement

Proposal:

Make the functionality in _threadmodule.c thread-safe in free-threaded builds.

Commits from nogil-3.12: https://github.com/colesbury/nogil-3.12/commits/nogil-3.12/Modules/_threadmodule.c

Context from @colesbury:

  • Ignore the CriticalLock commit; this was an idea that didn't work and was prone to deadlock.
  • Ignore the weakrefs change.
  • Unclear if we need to deal with _tstate_lock.
  • For the other changes, use your judgement. You may be able to put together smaller changes
  • _Py_ThreadId() is currently only available in the free-threaded build, so the recursive lock implementation can probably (?) just stick with the existing PyThread_get_thread_ident().

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Linked PRs

@mpage mpage added the type-feature A feature request or enhancement label Jan 18, 2024
@mpage
Copy link
Contributor Author

mpage commented Jan 19, 2024

@colesbury - Would you assign this to me, please?

mpage added a commit to mpage/cpython that referenced this issue Feb 5, 2024
ericsnowcurrently pushed a commit that referenced this issue Feb 12, 2024
…e-threaded builds (gh-115093)

Use atomics to mutate PyInterpreterState.threads.count.
fsc-eriker pushed a commit to fsc-eriker/cpython that referenced this issue Feb 14, 2024
…in free-threaded builds (pythongh-115093)

Use atomics to mutate PyInterpreterState.threads.count.
mpage added a commit to mpage/cpython that referenced this issue Feb 16, 2024
mpage added a commit to mpage/cpython that referenced this issue Feb 16, 2024
mpage added a commit to mpage/cpython that referenced this issue Feb 16, 2024
mpage added a commit to mpage/cpython that referenced this issue Feb 16, 2024
mpage added a commit to mpage/cpython that referenced this issue Feb 16, 2024
colesbury pushed a commit that referenced this issue Feb 16, 2024
…115102)

The ID of the owning thread (`rlock_owner`) may be accessed by
multiple threads without holding the underlying lock; relaxed
atomics are used in place of the previous loads/stores.

The number of times that the lock has been acquired (`rlock_count`)
is only ever accessed by the thread that holds the lock; we do not
need to use atomics to access it.
gpshead pushed a commit that referenced this issue Mar 1, 2024
…uilds (GH-115190)

Make `_thread.ThreadHandle` thread-safe in free-threaded builds

We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.

The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.

The internal thread handle `detach()` method has been removed.
woodruffw pushed a commit to woodruffw-forks/cpython that referenced this issue Mar 4, 2024
…ilds (python#115102)

The ID of the owning thread (`rlock_owner`) may be accessed by
multiple threads without holding the underlying lock; relaxed
atomics are used in place of the previous loads/stores.

The number of times that the lock has been acquired (`rlock_count`)
is only ever accessed by the thread that holds the lock; we do not
need to use atomics to access it.
woodruffw pushed a commit to woodruffw-forks/cpython that referenced this issue Mar 4, 2024
…aded builds (pythonGH-115190)

Make `_thread.ThreadHandle` thread-safe in free-threaded builds

We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.

The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.

The internal thread handle `detach()` method has been removed.
mpage added a commit to mpage/cpython that referenced this issue Mar 5, 2024
mpage added a commit to mpage/cpython that referenced this issue Mar 6, 2024
…in free-threaded builds

Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.

There is still one thread-safety issue where `locked` is checked prior to
releasing the lock, however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
colesbury pushed a commit that referenced this issue Mar 6, 2024
…116433)

Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.

There is still one thread-safety issue where `locked` is checked prior to
releasing the lock, however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
mpage added a commit to mpage/cpython that referenced this issue Mar 8, 2024
mpage added a commit to mpage/cpython that referenced this issue Mar 11, 2024
mpage added a commit to mpage/cpython that referenced this issue Mar 15, 2024
pitrou added a commit to mpage/cpython that referenced this issue Mar 16, 2024
pitrou added a commit to mpage/cpython that referenced this issue Mar 16, 2024
pitrou added a commit that referenced this issue Mar 16, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.

[^1]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201
[^2]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115
[^3]: mpage@8194653

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <[email protected]>
vstinner pushed a commit to vstinner/cpython that referenced this issue Mar 20, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.

[^1]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201
[^2]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115
[^3]: mpage@8194653

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <[email protected]>
adorilson pushed a commit to adorilson/cpython that referenced this issue Mar 25, 2024
…aded builds (pythonGH-115190)

Make `_thread.ThreadHandle` thread-safe in free-threaded builds

We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.

The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.

The internal thread handle `detach()` method has been removed.
adorilson pushed a commit to adorilson/cpython that referenced this issue Mar 25, 2024
…lds (python#116433)

Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.

There is still one thread-safety issue where `locked` is checked prior to
releasing the lock, however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
adorilson pushed a commit to adorilson/cpython that referenced this issue Mar 25, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.

[^1]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201
[^2]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115
[^3]: mpage@8194653

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <[email protected]>
@colesbury
Copy link
Contributor

@mpage is this complete?

@mpage
Copy link
Contributor Author

mpage commented Apr 10, 2024

Yep. I filed #117721 to track the follow up work we discussed for _thread.lock.release().

@mpage mpage closed this as completed Apr 10, 2024
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…ilds (python#115102)

The ID of the owning thread (`rlock_owner`) may be accessed by
multiple threads without holding the underlying lock; relaxed
atomics are used in place of the previous loads/stores.

The number of times that the lock has been acquired (`rlock_count`)
is only ever accessed by the thread that holds the lock; we do not
need to use atomics to access it.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…aded builds (pythonGH-115190)

Make `_thread.ThreadHandle` thread-safe in free-threaded builds

We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.

The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.

The internal thread handle `detach()` method has been removed.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…lds (python#116433)

Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.

There is still one thread-safety issue where `locked` is checked prior to
releasing the lock, however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.

[^1]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201
[^2]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115
[^3]: mpage@8194653

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-free-threading type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants