perf: introduce stack cache #218

Draft
wants to merge 1 commit into
base: devel

P403n1x87
Owner

We hash stacks so that we can cache them when emitting data in the binary format.
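The idea described above can be illustrated with a minimal sketch. This is not the code from this PR: the names `frame_key_t`, `stack_cache_t`, `stack_hash` and `stack_cache_check`, the FNV-1a hash, and the direct-mapped table of 1024 slots are all assumptions made for illustration. The sketch hashes the sequence of frame identifiers in a stack and uses the hash to decide whether that stack has been seen before, so that an emitter could write a short reference instead of the full stack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cache size; must be a power of two for the mask below. */
#define STACK_CACHE_SLOTS 1024u

/* Hypothetical per-frame identifier (e.g. a key for a code location). */
typedef uint64_t frame_key_t;

/* Hypothetical direct-mapped cache of stack hashes. */
typedef struct {
    uint64_t hashes[STACK_CACHE_SLOTS];
    bool     used[STACK_CACHE_SLOTS];
} stack_cache_t;

/* 64-bit FNV-1a over the bytes of each frame key, in stack order. */
static uint64_t stack_hash(const frame_key_t *frames, size_t n) {
    uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        for (int b = 0; b < 8; b++) {
            h ^= (frames[i] >> (8 * b)) & 0xffu;
            h *= 1099511628211ULL; /* FNV prime */
        }
    }
    return h;
}

/* Mark every slot as empty. */
static void stack_cache_init(stack_cache_t *cache) {
    memset(cache, 0, sizeof *cache);
}

/*
 * Returns true if a stack with this hash is already cached, meaning the
 * emitter could write a reference instead of the full stack. Otherwise
 * records the hash (evicting whatever occupied the slot) and returns false.
 */
static bool stack_cache_check(stack_cache_t *cache, uint64_t hash) {
    size_t slot = (size_t)(hash & (STACK_CACHE_SLOTS - 1u));
    if (cache->used[slot] && cache->hashes[slot] == hash) {
        return true;
    }
    cache->used[slot] = true;
    cache->hashes[slot] = hash;
    return false;
}
```

In this sketch, two distinct stacks that happen to share a hash would be treated as the same stack, so a real implementation would need either a hash wide enough to make that acceptable or an explicit comparison of the frames on a hit.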

P403n1x87 self-assigned this on Apr 9, 2024

codecov bot commented Apr 10, 2024

Codecov Report

Attention: Patch coverage is 93.18182%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.36%. Comparing base (4559915) to head (263d51d).

| Files | Patch % | Lines |
|---|---|---|
| src/py_proc.c | 50.00% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##            devel     #218      +/-   ##
==========================================
- Coverage   68.54%   68.36%   -0.19%     
==========================================
  Files          27       27              
  Lines        2521     2557      +36     
  Branches      771      779       +8     
==========================================
+ Hits         1728     1748      +20     
- Misses        453      466      +13     
- Partials      340      343       +3     

Austin Benchmarks

Running Austin benchmarks with Python 3.10.14

Benchmark Summary

Comparison of dev against 3.6.0.

The following scenarios show a statistically significant difference in performance between the two versions.

| Scenario | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| Wall time [sampling interval: 1] | 🟢 | 🟢 | 🟢 | 🟡 |
| Wall time [sampling interval: 10] | 🟡 | 🟡 | 🟢 | 🟢 |
| CPU time [sampling interval: 1] | 🟡 | 🟡 | 🟢 | 🟡 |
| CPU time [sampling interval: 10] | 🟡 | 🟡 | 🟢 | 🟡 |
| CPU time [sampling interval: 100] | 🟡 | 🟡 | 🟢 | 🟡 |
| RSA keygen [sampling interval: 1] | 🟡 | 🟡 | 🟢 | 🟡 |
| RSA keygen [sampling interval: 100] | 🟡 | 🟡 | 🟢 | 🟡 |
| Full metrics [sampling interval: 1] | 🟡 | 🟡 | 🟢 | 🟡 |
| Full metrics [sampling interval: 10] | 🟡 | 🟢 | 🟢 | 🟡 |
| Full metrics [sampling interval: 100] | 🔴 | 🟡 | 🟡 | 🟡 |
| Multiprocess wall time [sampling interval: 1] | 🟡 | 🟡 | 🟢 | 🟡 |
| Multiprocess wall time [sampling interval: 10] | 🟡 | 🟡 | 🟢 | 🟡 |
| Multiprocess wall time [sampling interval: 100] | 🟡 | 🟡 | 🟢 | 🟡 |

Benchmark Results

Wall time [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 192000 ± 2000 | 1 ± 0 | 6e-06 ± 4e-06 | 7.9 ± 0.3 |
| dev | 197000 ± 3000 | 0.99999 ± 1e-05 | 1e-06 ± 1e-06 | 7.6 ± 0.5 |

Wall time [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 65000 ± 7000 | 0.37 ± 0.04 | 1.6e-05 ± 8e-06 | 9.9 ± 0.3 |
| dev | 63000 ± 5000 | 0.35 ± 0.03 | 6e-06 ± 4e-06 | 9.2 ± 0.4 |

Wall time [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 12300 ± 200 | 0.0005 ± 0.0002 | 1e-05 ± 3e-05 | 9.3 ± 0.7 |
| dev | 12400 ± 200 | 0.0005 ± 0.0002 | 0.0 ± 2e-05 | 8.7 ± 0.5 |

Wall time [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 1860 ± 20 | 0 ± 0 | 0.0 ± 0.0001 | 10.7 ± 0.8 |
| dev | 1870 ± 20 | 0 ± 0 | 0 ± 0 | 10 ± 1 |

CPU time [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 61000 ± 3000 | 1 ± 0 | 1.4e-05 ± 8e-06 | 13.0 ± 0.7 |
| dev | 61000 ± 2000 | 1 ± 0 | 4e-06 ± 5e-06 | 12.9 ± 0.6 |

CPU time [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 57000 ± 3000 | 0.61 ± 0.05 | 1e-05 ± 1e-05 | 13.1 ± 0.3 |
| dev | 57000 ± 4000 | 0.6 ± 0.06 | 2e-06 ± 4e-06 | 13.2 ± 0.4 |

CPU time [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 6370 ± 20 | 0.0006 ± 0.0002 | 6e-05 ± 6e-05 | 17.3 ± 0.9 |
| dev | 6371 ± 9 | 0.0006 ± 0.0002 | 0 ± 0 | 16.7 ± 0.8 |

CPU time [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 947 ± 2 | 0 ± 0 | 0 ± 0 | 20.8 ± 0.6 |
| dev | 947 ± 2 | 0 ± 0 | 0 ± 0 | 20 ± 2 |

RSA keygen [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 61000 ± 4000 | 1 ± 0 | 7e-05 ± 6e-05 | 15.7 ± 0.9 |
| dev | 60000 ± 3000 | 1 ± 0 | 3e-06 ± 1e-05 | 16 ± 1 |

RSA keygen [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 58000 ± 3000 | 0.995 ± 0.003 | 0.00012 ± 7e-05 | 16.2 ± 0.9 |
| dev | 59000 ± 6000 | 0.994 ± 0.006 | 4e-05 ± 7e-05 | 16 ± 2 |

RSA keygen [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 6350 ± 30 | 0.001 ± 0.001 | 0.0005 ± 0.0004 | 20 ± 3 |
| dev | 6340 ± 20 | 0.001 ± 0.001 | 0 ± 0 | 21 ± 3 |

RSA keygen [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 945.1 ± 0.9 | 0 ± 0 | 0.001 ± 0.001 | 22 ± 4 |
| dev | 944 ± 2 | 0 ± 0 | 0.0 ± 0.001 | 23 ± 5 |

Full metrics [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 74000 ± 2000 | 1 ± 0 | 1.2e-05 ± 7e-06 | 20.2 ± 0.6 |
| dev | 75000 ± 2000 | 1 ± 0 | 0 ± 0 | 19.8 ± 0.6 |

Full metrics [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 76000 ± 1000 | 0.9 ± 0.02 | 1.4e-05 ± 8e-06 | 19.5 ± 0.7 |
| dev | 77000 ± 2000 | 0.86 ± 0.02 | 0 ± 0 | 19.3 ± 0.5 |

Full metrics [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 12500 ± 200 | 0.0011 ± 0.0002 | 2e-05 ± 3e-05 | 25 ± 1 |
| dev | 12300 ± 100 | 0.001 ± 0.0002 | 0 ± 0 | 24 ± 1 |

Full metrics [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 1860 ± 30 | 0 ± 0 | 0.0001 ± 0.0002 | 28 ± 1 |
| dev | 1860 ± 20 | 0 ± 0 | 0.0 ± 0.0001 | 27 ± 2 |

Multiprocess wall time [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 7100 ± 700 | 1 ± 0 | 9e-05 ± 3e-05 | 132 ± 8 |
| dev | 7300 ± 500 | 1 ± 0 | 2e-05 ± 2e-05 | 132 ± 8 |

Multiprocess wall time [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 6600 ± 1000 | 0.9978 ± 0.0005 | 7e-05 ± 3e-05 | 137 ± 4 |
| dev | 7100 ± 300 | 0.999 ± 0.002 | 1e-05 ± 1e-05 | 135 ± 5 |

Multiprocess wall time [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 6400 ± 800 | 0.018 ± 0.005 | 4e-05 ± 3e-05 | 130 ± 20 |
| dev | 6970 ± 80 | 0.021 ± 0.001 | 1e-05 ± 1e-05 | 137 ± 2 |

Multiprocess wall time [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
|---|---|---|---|---|
| 3.6.0 | 6500 ± 100 | 0.0013 ± 0.0002 | 1e-05 ± 2e-05 | 41 ± 1 |
| dev | 6510 ± 90 | 0.0013 ± 0.0002 | 3e-06 ± 7e-06 | 40 ± 1 |

P403n1x87 force-pushed the devel branch 2 times, most recently from cb7874e to 0c5264b on October 14, 2024, 14:51