Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit all built-in modules for thread safety #116738

Open
18 of 120 tasks
Tracked by #108219
swtaarrs opened this issue Mar 13, 2024 · 4 comments
Open
18 of 120 tasks
Tracked by #108219

Audit all built-in modules for thread safety #116738

swtaarrs opened this issue Mar 13, 2024 · 4 comments
Labels
extension-modules C modules in the Modules dir topic-free-threading type-feature A feature request or enhancement

Comments

@swtaarrs
Copy link
Member

swtaarrs commented Mar 13, 2024

Feature or enhancement

Proposal:

We should audit every built-in module for thread safety and make any necessary fixes. This can be done separately from tagging the modules as safe using Py_mod_gil; any data races in built-in modules are considered bugs in the free-threaded build and not incompatibilities.

Below is a list of all source files that contain at least one module definition, as defined by files that contain either a call to PyModule_Create() or an array of PyModuleDef_Slots. If you'd like to help out with this task, take a look at one of the incomplete files. If the conversion is trivial, check it off and attach the PR to this issue. Otherwise, convert the line into an issue and assign it to yourself.

Every module is different, but here are a few high-level points to guide the work:

  1. Use PyMutex as the basic unit of locking.
  2. If you need to hold a lock for a longer period of time, especially across calls that may reenter Python code or acquire other locks, use critical sections.
  3. If the module you're looking at is a thin wrapper around related code elsewhere in CPython (e.g., Modules/_codecsmodule.c and Python/codecs.c), you can also audit/convert the related non-module code. Otherwise, try to contain your work to just the module code, and create separate issues for any dependencies that aren't thread-safe.
  4. Remember that C API functions are generally expected to handle thread-safety internally. C code that operates on Python objects using only C API calls is usually thread-safe by default (but look out for code that reads or modifies PyObject*s in C structs, since that needs synchronization).
  5. Watch out for functions/macros that return borrowed references, like PyList_GetItem() or PyDict_GetItem(). If other threads could have references to the object, prefer functions like PyList_GetItemRef() or PyDict_GetItemRef() that return owned references (and will safely raise an error if the index/key doesn't exist).

Files to audit

Completed Issues

Completed Issues

  1. topic-free-threading type-feature
    erlend-aasland
  2. topic-free-threading type-feature
    mpage
  3. 3.13 extension-modules topic-free-threading type-feature
    chgnrdv
  4. 3.13 extension-modules topic-free-threading type-feature
    tomasr8
  5. extension-modules topic-free-threading type-feature

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

https://peps.python.org/pep-0703/, especially this section

Linked PRs

@swtaarrs swtaarrs added type-feature A feature request or enhancement extension-modules C modules in the Modules dir topic-free-threading labels Mar 13, 2024
@swtaarrs swtaarrs self-assigned this Mar 13, 2024
colesbury pushed a commit that referenced this issue Apr 11, 2024
A collection of small changes aimed at making the `_abc` module safe to
use in a free-threaded build.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
A collection of small changes aimed at making the `_abc` module safe to
use in a free-threaded build.
colesbury pushed a commit that referenced this issue May 2, 2024
The module itself is a thin wrapper around calls to functions in
`Python/codecs.c`, so that's where the meaningful changes happened:

- Move codecs-related state that lives on `PyInterpreterState` to a
  struct declared in `pycore_codecs.h`.

- In free-threaded builds, add a mutex to `codecs_state` to synchronize
  operations on `search_path`. Because `search_path_mutex` is used as a
  normal mutex and not a critical section, we must be extremely careful
  with operations called while holding it.

- The codec registry is explicitly initialized as part of
  `_PyUnicode_InitEncodings` to simplify thread-safety.
SonicField pushed a commit to SonicField/cpython that referenced this issue May 8, 2024
The module itself is a thin wrapper around calls to functions in
`Python/codecs.c`, so that's where the meaningful changes happened:

- Move codecs-related state that lives on `PyInterpreterState` to a
  struct declared in `pycore_codecs.h`.

- In free-threaded builds, add a mutex to `codecs_state` to synchronize
  operations on `search_path`. Because `search_path_mutex` is used as a
  normal mutex and not a critical section, we must be extremely careful
  with operations called while holding it.

- The codec registry is explicitly initialized as part of
  `_PyUnicode_InitEncodings` to simplify thread-safety.
@mpage
Copy link
Contributor

mpage commented May 28, 2024

@swtaarrs - Is it OK if I remove you as the owner?

@swtaarrs swtaarrs removed their assignment May 29, 2024
@swtaarrs
Copy link
Member Author

Yep, I won't be doing any more work on this 😞

@eendebakpt
Copy link
Contributor

I audited Modules/_statisticsmodule.c and it is already thread safe afaics (it only contains a single method with C input, conversion is handled by clinic generated code). Do we need to make a PR/issue for this module or can we check it off the list (once someone did a second look)?

@davidhewitt
Copy link
Contributor

I have a crash in the PyO3 test suite comparing ZoneInfo objects which should be equal. From a quick scan of the zoneinfo.c source code it seems to me that it's highly likely the zoneinfo cache is not thread safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir topic-free-threading type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

4 participants