$ time ./python nogil_bench.py 1
0.89user 0.00system 0:00.97elapsed 91%CPU
# 0.97s per call
$ time ./python nogil_bench.py 8
11.75user 0.00system 0:01.97elapsed 595%CPU
# 0.24s per call
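The script behind these timings is not reproduced in this report. As a rough sketch only, assuming it matches the modified benchmark below minus the class and the attribute update, it might look like this (the `run` helper and its `n` parameter are illustrative names, not taken from the nogil README):

```python
from concurrent.futures import ThreadPoolExecutor


def fib(n):
    # Plain recursive Fibonacci; no instance state is touched.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def run(threads, n=34):
    # Hypothetical helper: submit one fib(n) call per worker thread
    # and collect the results in submission order.
    with ThreadPoolExecutor(max_workers=threads) as executor:
        futures = [executor.submit(fib, n) for _ in range(threads)]
        return [f.result() for f in futures]
```

Under that assumption, `run(1)` and `run(8)` would correspond to the two invocations timed above.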
However, when I modify the benchmark to update an instance member, the time per call skyrockets. Note that the instance isn't shared between threads -- each thread gets its own instance.
import sys
from concurrent.futures import ThreadPoolExecutor

print(f"nogil={getattr(sys.flags, 'nogil', False)}")


class Fibonacci:
    def __init__(self, x):
        self.x = x

    def calculate(self, n):
        # This line doesn't actually matter for the calculation, but this is what
        # causes the nogil threaded performance to drop precipitously.
        self.x += 1
        if n < 2:
            return 1
        return self.calculate(n - 1) + self.calculate(n - 2)


def fib(n):
    f = Fibonacci(1)
    return f.calculate(n)


threads = 8
if len(sys.argv) > 1:
    threads = int(sys.argv[1])

with ThreadPoolExecutor(max_workers=threads) as executor:
    for _ in range(threads):
        executor.submit(lambda: print(fib(34)))
$ time ./python nogil_bench_slow.py 1
2.24user 0.00system 0:02.44elapsed 92%CPU
# 2.44s per call
$ time ./python nogil_bench_slow.py 8
76.39user 150.70system 1:22.25elapsed 276%CPU
# 11.03s per call
Profiling with Linux perf shows _PyObject_GetInstanceAttribute at 10% of runtime in the slow version and 0.0% in the fast version, so this looks like lock contention when getting an instance attribute. The 150.70s of system time in the 8-thread run is consistent with that.
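One way to check that the attribute update alone accounts for the difference is to time both variants in the same process. The following is a minimal sketch, not part of the original benchmark: `WithWrite`, `WithoutWrite`, `timed_call`, and `compare` are hypothetical names, and the per-call time here is measured inside each thread rather than derived from the total elapsed time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class WithWrite:
    def __init__(self):
        self.x = 0

    def calculate(self, n):
        self.x += 1  # instance attribute read-modify-write on every call
        if n < 2:
            return 1
        return self.calculate(n - 1) + self.calculate(n - 2)


class WithoutWrite:
    def __init__(self):
        self.x = 0

    def calculate(self, n):
        # Same recursion, but the instance attribute is never touched.
        if n < 2:
            return 1
        return self.calculate(n - 1) + self.calculate(n - 2)


def timed_call(cls, n):
    # Each task builds its own instance, so nothing is shared between threads.
    start = time.perf_counter()
    result = cls().calculate(n)
    return result, time.perf_counter() - start


def compare(threads, n):
    # Returns {class name: mean seconds per call} for both variants.
    report = {}
    for cls in (WithWrite, WithoutWrite):
        with ThreadPoolExecutor(max_workers=threads) as executor:
            futures = [executor.submit(timed_call, cls, n) for _ in range(threads)]
            durations = [f.result()[1] for f in futures]
        report[cls.__name__] = sum(durations) / len(durations)
    return report
```

Running `compare(1, 34)` and `compare(8, 34)` on the affected build should show whether only the `WithWrite` variant degrades as the thread count grows.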
I do not see this pathological behavior on nogil-3.9, so I'm hoping this is just an isolated bug that is fixable independently.
$ time ./python nogil_bench_slow.py 1
1.50user 0.00system 0:01.63elapsed 92%CPU
# 1.63s per call
$ time ./python nogil_bench_slow.py 8
18.40user 0.01system 0:02.91elapsed 632%CPU
# 0.36s per call
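The "per call" figures in these nogil-3.9 runs appear to be the elapsed wall time divided by the number of threads. A small sketch of that arithmetic, using only the nogil-3.9 numbers above (`per_call` and `speedup` are illustrative helpers, not part of the benchmark):

```python
def per_call(elapsed_seconds, threads):
    # Wall time divided by the number of concurrent fib(34) calls.
    return elapsed_seconds / threads


def speedup(single_thread_elapsed, multi_thread_elapsed, threads):
    # Throughput gain relative to running the same calls one after another.
    return (single_thread_elapsed * threads) / multi_thread_elapsed


# nogil-3.9 figures from the runs above: 1.63s for 1 thread, 2.91s for 8 threads.
print(round(per_call(2.91, 8), 2))       # 0.36
print(round(speedup(1.63, 2.91, 8), 2))  # 4.48
```

By this measure, 8 threads on nogil-3.9 deliver roughly 4.5 times the single-thread throughput for the modified benchmark.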
Please ignore the fact that that line is meaningless to calculating Fibonacci -- this is my attempt at breaking down pyperformance's raytrace benchmark into a more minimal example. I'm sure you agree that modifying instance members is a pretty common thing to do. :)
Your environment
Debian Buster
11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
Bug report
Using the Fibonacci example from the old nogil README (nogil_bench.py, timed above), I'm able to see the time per call decrease with more threads.