
Run the Pyston benchmarks and compare how we're doing with those versus CPython 3.10. #164

Closed
gvanrossum opened this issue Dec 8, 2021 · 4 comments
Labels: benchmarking, blocked

Comments

@gvanrossum
Collaborator

gvanrossum commented Dec 8, 2021

This is a high-level goal ("epic"?). Tasks like refactoring the PyPerformance suite and making changes to the Pyston benchmarks so they run under 3.10 and 3.11 (subtasks of #60) are supportive of this goal. [Or is #60 just the overall "epic" goal? Then we can close/delete this issue.]

Ideally it would be easy to run the Pyston benchmarks for an arbitrary PR or commit.

@ericsnowcurrently
Collaborator

ericsnowcurrently commented Feb 1, 2022

This is blocked by #254, #257, and #175.

@ericsnowcurrently
Collaborator

FYI, I have the Pyston benchmarks running on our internal benchmarking machine. However, only 3 of the 12 are currently succeeding. (See #257.)

For now, here are the results for those three, comparing the 3.10.2 release against main (bebaa95fd0):

Faster (3):
- thrift: 1.02 ms +- 0.01 ms -> 813 us +- 11 us: 1.26x faster
- pycparser: 1.48 sec +- 0.03 sec -> 1.18 sec +- 0.02 sec: 1.25x faster
- json: 5.51 ms +- 0.16 ms -> 4.94 ms +- 0.17 ms: 1.12x faster

Geometric mean: 1.21x faster

So in those 3 benchmarks we're showing a decent speedup at least.
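
(For reference, that geometric-mean figure is just the n-th root of the product of the per-benchmark speedup factors. A minimal sketch of the arithmetic, using the numbers above; this is not part of any benchmarking tooling:)

```python
# Minimal sketch: how a "geometric mean" line follows from per-benchmark speedups.
import math

speedups = {"thrift": 1.26, "pycparser": 1.25, "json": 1.12}
geo_mean = math.prod(speedups.values()) ** (1 / len(speedups))
print(f"Geometric mean: {geo_mean:.2f}x faster")  # ~1.21x
```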

@gvanrossum
Collaborator Author

gvanrossum commented Apr 18, 2022

Interestingly, the Pyston benchmarks that do run end up roughly the same percentage faster as the PyPerformance benchmarks (comparing the results you posted for 3.10.4 to 3.11.0a7, I see a geometric mean of 22% faster).

Compare this to the speedup claimed by Pyston: 34% on Intel, 30% on ARM. That's comparing 3.8 to Pyston, though, so not quite apples to apples. Also, we only run 3 of the 12 Pyston benchmarks, so there may be surprises ahead as we get more of them to work.

To make comparisons easier, perhaps we could add the following:

  • Add the 3.10.4 results for running the Pyston benchmarks (run with PyPerformance and CPython) to the benchmark-results directory (should be easy).
  • Add baseline results for CPython 3.8 (ideally for both the Pyston and PyPerformance benchmark suites).

We could also add results reported by Pyston, if we can get them in JSON format.
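
If Pyston's numbers do come as pyperf-compatible JSON, the comparison itself should be straightforward. A rough sketch, assuming both result files were produced by pyperformance/pyperf (the file names here are hypothetical):

```python
# Sketch: compare two pyperf JSON result files by benchmark mean.
import pyperf

base = pyperf.BenchmarkSuite.load("cpython-3.8-pyston-benchmarks.json")
new = pyperf.BenchmarkSuite.load("cpython-main-pyston-benchmarks.json")

base_by_name = {b.get_name(): b for b in base.get_benchmarks()}
for bench in new.get_benchmarks():
    ref = base_by_name.get(bench.get_name())
    if ref is None:
        continue  # benchmark missing from the baseline run
    ratio = ref.mean() / bench.mean()  # > 1.0 means the new build is faster
    print(f"{bench.get_name()}: {ratio:.2f}x faster")
```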

@gvanrossum
Collaborator Author

(The results reported by Pyston can never be compared directly, because the hardware/OS they used to run their benchmarks is different from ours. Comparing their baseline results for 3.8 with our results for 3.8 (for the same benchmarks) might help us gauge how much bias this introduces -- though it may well differ per benchmark, since hardware features like branch prediction may favor some benchmarks over others.)

@mdboom added the benchmarking label Aug 2, 2022
@mdboom closed this as completed Feb 28, 2023
@github-project-automation moved this from Todo to Done in Fancy CPython Board Feb 28, 2023