gh-120496: Make enum_iter thread safe #120591

eendebakpt · 2024-06-16T14:35:32Z

We make enum_iter thread-safe using a critical section. We use the same approach as in #119438 to allow for exits in the middle of the critical section. The method enum_iter_long is guarded by the critical section from enum_long.
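As a rough Python-level analogy of this approach (not the C code in this PR), the sketch below guards each step of a hypothetical `LockedEnumerate` wrapper with a `threading.Lock`, so that advancing the underlying iterator and incrementing the index happen as one unit. The class name and the use of a lock are assumptions for illustration only; the PR itself relies on the critical-section macros in Objects/enumobject.c.

```python
import threading


class LockedEnumerate:
    """Hypothetical pure-Python stand-in for a thread-safe enumerate."""

    def __init__(self, iterable, start=0):
        self._it = iter(iterable)
        self._index = start
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Advance the source and bump the index as one guarded step.
        with self._lock:
            # If next() raises StopIteration, the `with` block still
            # releases the lock: an early exit from the guarded region.
            item = next(self._it)
            index = self._index
            self._index += 1
        # The pair is built from local values, so no lock is needed here.
        return index, item
```

The `with` statement plays the role of the early-exit handling mentioned above: the lock is released whether the body finishes normally or leaves through an exception.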

Without making enum_iter thread-safe, the following problems can occur:

For enumerate(range(10)), pairs (n, m) with n unequal to m can be generated.
When iterating past sys.maxsize, e.g. enumerate(range(sys.maxsize - 10, sys.maxsize + 10)), an overflow can occur.
As an optimization, enum_iter keeps track of the returned tuple. In the free-threaded build, multiple threads can operate on the same tuple.
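The first problem can be probed with a small script along the following lines. This is a minimal sketch, not the test added by the PR: the helper names `consume_shared_enumerate` and `mismatched` are hypothetical, and whether any mismatch actually shows up depends on the build and on thread timing.

```python
import threading


def consume_shared_enumerate(num_threads, length):
    """Let several threads pull pairs from one shared enumerate object."""
    shared = enumerate(range(length))
    per_thread = [[] for _ in range(num_threads)]

    def worker(out):
        # Every thread iterates over the same enumerate instance.
        for pair in shared:
            out.append(pair)

    threads = [
        threading.Thread(target=worker, args=(out,)) for out in per_thread
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Flatten the per-thread results into one list of pairs.
    return [pair for out in per_thread for pair in out]


def mismatched(pairs):
    """Return the pairs (n, m) whose index differs from the value."""
    return [(n, m) for n, m in pairs if n != m]
```

Calling `mismatched(consume_shared_enumerate(8, 100_000))` on a build with the GIL is expected to return an empty list. On a free-threaded build without this change, the description above suggests it may return pairs with n unequal to m, though a single run is not guaranteed to hit the race.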

To determine the overhead of making enumerate thread-safe, here are some benchmarking results for a single-threaded application. Performance when multiple threads actually read from the same enumerate object is considered less important.

Benchmark script:

import pyperf
runner = pyperf.Runner()

setup = """

def weighted_sum(k):
    value = 0
    for n, m in enumerate(k):
        value += n * m 
    return value

range10 = range(10)
range1000 = range(1000)
x = list(range(10))
"""

runner.timeit(name="enumerate(range10)", stmt="enumerate(range10)", setup=setup)
runner.timeit(name="list(enumerate(range10))", stmt="list(enumerate(range10))", setup=setup)
runner.timeit(name="list(enumerate(range1000))", stmt="list(enumerate(range1000))", setup=setup)
runner.timeit(name="weighted_sum", stmt="weighted_sum(x)", setup=setup)

Results of main versus free-threading:

enumerate(range10): Mean +- std dev: [enumerate_main] 73.2 ns +- 3.5 ns -> [enumerate_ft] 94.1 ns +- 3.5 ns: 1.29x slower
list(enumerate(range10)): Mean +- std dev: [enumerate_main] 292 ns +- 10 ns -> [enumerate_ft] 351 ns +- 19 ns: 1.20x slower
list(enumerate(range1000)): Mean +- std dev: [enumerate_main] 31.1 us +- 1.6 us -> [enumerate_ft] 32.5 us +- 1.4 us: 1.04x slower
weighted_sum: Mean +- std dev: [enumerate_main] 433 ns +- 17 ns -> [enumerate_ft] 938 ns +- 49 ns: 2.16x slower

Geometric mean: 1.37x slower
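For reference, the geometric mean can be recomputed from the means listed above. This is a sketch that uses the rounded values as printed, so individual ratios may differ in the last digit from the ones pyperf reports from its unrounded data.

```python
from statistics import geometric_mean

# (main mean, free-threading mean) copied from the results above.
# Units cancel in each ratio, so the ns and us rows can be mixed.
means = {
    "enumerate(range10)": (73.2, 94.1),
    "list(enumerate(range10))": (292, 351),
    "list(enumerate(range1000))": (31.1, 32.5),
    "weighted_sum": (433, 938),
}

# Slowdown factor per benchmark: new mean divided by reference mean.
slowdowns = {name: ft / main for name, (main, ft) in means.items()}

# The geometric mean is the n-th root of the product of the n ratios.
overall = geometric_mean(list(slowdowns.values()))
print(f"Geometric mean: {overall:.2f}x slower")
```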

Results of free-threading vs. free-threading with this PR:

list(enumerate(range10)): Mean +- std dev: [enumerate_ft] 351 ns +- 19 ns -> [enumerate_ft_pr] 410 ns +- 18 ns: 1.17x slower
list(enumerate(range1000)): Mean +- std dev: [enumerate_ft] 32.5 us +- 1.4 us -> [enumerate_ft_pr] 40.9 us +- 1.7 us: 1.26x slower

Benchmark hidden because not significant (2): enumerate(range10), weighted_sum

Geometric mean: 1.10x slower

The case enumerate(range10) is not affected by the PR (it is used to check that the benchmarking is stable). The cases list(enumerate(range(x))) slow down a bit. More representative is perhaps the weighted_sum benchmark, where enumerate is used in a for loop with a minimal amount of work. There the overhead of the locking is not significant.

Issue: Sequence iterator thread-safety #120496

eendebakpt · 2024-06-16T14:40:20Z

Objects/enumobject.c

+    }
+    Py_END_CRITICAL_SECTION();
+
+    if (reuse_result) {


Since we hold the only references to result, this can be outside the critical section.

eendebakpt · 2024-10-19T20:16:24Z

Closing in favor of #125734

Make enum_iter thread safe

3323cf0

eendebakpt requested a review from ethanfurman as a code owner June 16, 2024 14:35

bedevere-app bot mentioned this pull request Jun 16, 2024

Sequence iterator thread-safety #120496

Closed

bedevere-app bot added the awaiting review label Jun 16, 2024

eendebakpt changed the title ~~gh-120496: make enum_iter thread safe~~ gh-120496: Make enum_iter thread safe Jun 16, 2024

pep8

98f72e6

eendebakpt and others added 2 commits June 16, 2024 17:41

fix tests on wasi

ba17086

📜🤖 Added by blurb_it.

07d4574

eendebakpt mentioned this pull request Jun 16, 2024

gh-120608: Make reversed thread-safe #120609

Closed

eendebakpt mentioned this pull request Jul 7, 2024

enum_next and pairwise_next can result in tuple elements with zero reference count in free-threading build #121464

Open
