Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PySequence_Fast needs new macros to be safe in a nogil world #119247

Open
MojoVampire opened this issue May 20, 2024 · 3 comments
Open

PySequence_Fast needs new macros to be safe in a nogil world #119247

MojoVampire opened this issue May 20, 2024 · 3 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-feature A feature request or enhancement

Comments

@MojoVampire
Copy link
Contributor

MojoVampire commented May 20, 2024

Feature or enhancement

Proposal:

Right now, most uses of PySequence_Fast are invalid in a nogil context when it is passed an existing list; PySequence_FAST_ITEMS returns a reference to the internal array of PyObject*s that can be resized at any time if other threads add or delete items, PySequence_FAST_GET_SIZE similarly reports a size that is invalid an instant after it's reported. Similarly, if individual items are replaced without changing size, you'd have similar issues.

But when the argument passed is a tuple (incref-ed and returned unchanged, but safe due to immutability) or any non-list type (converted to new list) no lock is needed. Per conversation with Dino, going to create macros, to be called after a call to PySequence_Fast, to conditionally lock and unlock the original list when applicable, while avoiding locks in all other cases, before any other PySequence* APIs are used.

Preliminary (subject to bike-shedding) macro names are:

Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST
Py_END_CRITICAL_SECTION_SEQUENCE_FAST

both defined in pycore_critical_section.h.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

Discussion occurred with Dino during CPython core sprints.

Linked PRs

@MojoVampire MojoVampire added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading labels May 20, 2024
@MojoVampire MojoVampire self-assigned this May 20, 2024
colesbury pushed a commit that referenced this issue May 22, 2024
…build (#119315)

Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 22, 2024
…eaded build (pythonGH-119315)

Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
(cherry picked from commit baf347d)

Co-authored-by: Josh {*()} Rosenberg <[email protected]>
@MojoVampire
Copy link
Contributor Author

Hey @DinoV and @colesbury: Now that the base patch is merged, should I:

  1. Close this as complete and open new issue(s) to apply similar fixes to other uses of PySequence_Fast* APIs?
  2. Reuse this issue for similar changes?
  3. Skip working on this task specifically, and instead make more core Python objects work free-threaded (e.g. I started working on bytearray to make its use of PySequence_Fast* stuff safe, but it's clearly not free-threaded safe in general yet, so maybe I just work on making bytearray as a whole free-threaded safe?).

@colesbury
Copy link
Contributor

Hi Josh - thanks for fixing str.join. We typically reuse the same issue for related changes. As to whether to continue with PySequence_Fast* or work other changes like bytearray, that's up to you -- both sound really useful.

colesbury pushed a commit that referenced this issue May 22, 2024
…readed build (GH-119315) (#119419)

Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
(cherry picked from commit baf347d)

Co-authored-by: Josh {*()} Rosenberg <[email protected]>
@eendebakpt
Copy link
Contributor

@MojoVampire @colesbury In the _json module the PySequence_Fast API is used in a way that is not thread-safe. I am working on a PR, but could use some advice. The new macro Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST starts with a { (I suspect to avoid name clashes), but that makes it hard to use when the critical section used return statements or goto statements.

In #119438 I added an extra macro Py_RETURN_CRITICAL_SECTION_SEQUENCE_FAST to address this, but I am not convinced this is a nice solution. Another option would be to refactor the code in question to eliminate the goto and return statements inside the critical part, but that will result in quite some churn of the code (and perhaps less readable code)

estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…eaded build (python#119315)

Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants