ci: add data validation workflow #175

Merged 1 commit on Mar 7, 2023

P403n1x87 (Owner) commented on Mar 5, 2023

Description of the Change

This change implements the ideas set out in https://arxiv.org/abs/2301.08941 to statistically compare the flame graphs generated by the latest release with those generated by the changes proposed in a PR. The aim is to check whether the data produced by the two versions of Austin come from the same distribution. If they do, we assume that the two versions generate statistically equivalent data.
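
The commit message for this PR names a Hotelling T² test as the validation method. As a rough illustration only, the sketch below runs a two-sample Hotelling T² test in pure Python. It assumes each profiling run has already been reduced to a fixed-length numeric vector (for example, per-stack sample fractions); that reduction, the function names `hotelling_t2`, `betai`, `_betacf` and `_solve`, and the pooled-covariance form of the test are assumptions made for this example and are not taken from the workflow itself.

```python
import math


def _betacf(a, b, x, max_iter=200, eps=3e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd steps of the continued-fraction recurrence.
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
              + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(log_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def _solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def hotelling_t2(xs, ys):
    """Two-sample Hotelling T² test; returns (t2, f_stat, p_value).

    xs and ys are lists of equal-length observation vectors. Both groups
    are assumed to share one covariance matrix (pooled estimate).
    """
    n1, n2, p = len(xs), len(ys), len(xs[0])
    d2 = n1 + n2 - p - 1  # denominator degrees of freedom of the F statistic
    if d2 <= 0:
        raise ValueError("need n1 + n2 > p + 1 observations")
    mx = [sum(col) / n1 for col in zip(*xs)]
    my = [sum(col) / n2 for col in zip(*ys)]
    pooled = [[0.0] * p for _ in range(p)]
    for rows, mean in ((xs, mx), (ys, my)):
        for row in rows:
            dev = [v - mu for v, mu in zip(row, mean)]
            for i in range(p):
                for j in range(p):
                    pooled[i][j] += dev[i] * dev[j] / (n1 + n2 - 2)
    diff = [a - b for a, b in zip(mx, my)]
    # T² = n1*n2/(n1+n2) * diff' * pooled^{-1} * diff
    t2 = n1 * n2 / (n1 + n2) * sum(d * w for d, w in zip(diff, _solve(pooled, diff)))
    f_stat = t2 * d2 / (p * (n1 + n2 - 2))
    # Survival function of F(p, d2) at f_stat, via the incomplete beta function.
    p_value = betai(d2 / 2.0, p / 2.0, d2 / (d2 + p * f_stat))
    return t2, f_stat, p_value
```

A small p-value would suggest that the two sets of vectors do not share a mean, so the two versions would not be treated as statistically equivalent. Note that this form of the test needs more observations than dimensions (`n1 + n2 > p + 1`) and a non-singular pooled covariance matrix, so high-dimensional flame graph data would probably need some form of dimensionality reduction first.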

P403n1x87 added the ci/cd label on Mar 5, 2023

github-actions bot commented on Mar 5, 2023

Austin Benchmarks

Running Austin benchmarks with Python 3.10.10

Wall time [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 123000 ± 5000 | 1 ± 0 | 1.1e-05 ± 6e-06 | 12.6 ± 0.7 |
| 3.5.0 | 126000 ± 2000 | 1 ± 0 | 1.4e-05 ± 4e-06 | 12.1 ± 0.3 |
| dev | 122000 ± 6000 | 1 ± 0 | 1.2e-05 ± 6e-06 | 12.7 ± 0.9 |

Wall time [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 121000 ± 5000 | 0.544 ± 0.002 | 1.3e-05 ± 5e-06 | 12.7 ± 0.5 |
| 3.5.0 | 118000 ± 5000 | 0.544 ± 0.001 | 1.4e-05 ± 6e-06 | 13.0 ± 0.5 |
| dev | 124000 ± 3000 | 0.5449 ± 0.0007 | 1.6e-05 ± 6e-06 | 12.5 ± 0.5 |

Wall time [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 10500 ± 300 | 0.0007 ± 0.0001 | 5e-05 ± 4e-05 | 14.6 ± 0.5 |
| 3.5.0 | 10500 ± 200 | 0.0005 ± 0.0002 | 4e-05 ± 4e-05 | 14.3 ± 0.5 |
| dev | 10600 ± 200 | 0.0006 ± 0.0002 | 5e-05 ± 5e-05 | 14.2 ± 0.8 |

Wall time [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 1850 ± 30 | 0 ± 0 | 0.0 ± 0.0001 | 17.9 ± 0.9 |
| 3.5.0 | 1860 ± 20 | 0.0 ± 0.0001 | 0 ± 0 | 18.3 ± 0.8 |
| dev | 1850 ± 20 | 0.0001 ± 0.0002 | 0 ± 0 | 18.3 ± 0.8 |

CPU time [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 52000 ± 3000 | 1 ± 0 | 2e-05 ± 1e-05 | 22.4 ± 0.8 |
| 3.5.0 | 52000 ± 3000 | 1 ± 0 | 2e-05 ± 1e-05 | 22.4 ± 0.7 |
| dev | 53000 ± 2000 | 1 ± 0 | 3e-05 ± 1e-05 | 22.2 ± 0.4 |

CPU time [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 53000 ± 2000 | 0.997 ± 0.001 | 3e-05 ± 2e-05 | 22.4 ± 0.7 |
| 3.5.0 | 52000 ± 3000 | 0.9965 ± 0.0006 | 2e-05 ± 1e-05 | 22.6 ± 1.0 |
| dev | 51000 ± 3000 | 0.9961 ± 0.0004 | 2e-05 ± 9e-06 | 22.5 ± 0.7 |

CPU time [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 5500 ± 200 | 0.0012 ± 0.0002 | 0.0002 ± 0.0001 | 24.1 ± 0.7 |
| 3.5.0 | 5410 ± 80 | 0.001 ± 0.0004 | 0.0001 ± 0.0001 | 23.8 ± 0.6 |
| dev | 5470 ± 100 | 0.0009 ± 0.0004 | 0.0001 ± 0.0002 | 23.5 ± 0.7 |

CPU time [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 947 ± 3 | 0.0002 ± 0.0004 | 0 ± 0 | 31 ± 1 |
| 3.5.0 | 949 ± 2 | 0.0003 ± 0.0006 | 0.0001 ± 0.0002 | 31 ± 3 |
| dev | 950 ± 2 | 0 ± 0 | 0.0002 ± 0.0003 | 30 ± 1 |

RSA keygen [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 41000 ± 2000 | 1 ± 0 | 0.0004 ± 0.0005 | 24 ± 1 |
| 3.5.0 | 40000 ± 2000 | 1 ± 0 | 0.0002 ± 0.0001 | 24.2 ± 0.8 |
| dev | 40000 ± 2000 | 1 ± 0 | 0.0005 ± 0.0004 | 23.7 ± 0.9 |

RSA keygen [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 39000 ± 2000 | 0.98 ± 0.02 | 0.0004 ± 0.0004 | 24.4 ± 0.5 |
| 3.5.0 | 39000 ± 1000 | 0.98 ± 0.02 | 0.0004 ± 0.0003 | 24.4 ± 0.5 |
| dev | 40000 ± 1000 | 0.97 ± 0.03 | 0.0004 ± 0.0004 | 24.3 ± 0.7 |

RSA keygen [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 6300 ± 20 | 0.0016 ± 0.0009 | 0.0004 ± 0.0005 | 27 ± 2 |
| 3.5.0 | 6290 ± 20 | 0.003 ± 0.003 | 0.002 ± 0.002 | 28 ± 3 |
| dev | 6300 ± 20 | 0.001 ± 0.001 | 0.0006 ± 0.0009 | 27 ± 2 |

RSA keygen [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 940 ± 2 | 0 ± 0 | 0.002 ± 0.002 | 33 ± 2 |
| 3.5.0 | 941 ± 1 | 0 ± 0 | 0.0005 ± 0.001 | 35 ± 2 |
| dev | 941 ± 2 | 0 ± 0 | 0.001 ± 0.001 | 34 ± 2 |

Full metrics [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 51000 ± 1000 | 1 ± 0 | 4.1e-05 ± 1e-05 | 29.8 ± 0.9 |
| 3.5.0 | 51000 ± 3000 | 1 ± 0 | 4e-05 ± 1e-05 | 30 ± 2 |
| dev | 51000 ± 4000 | 1 ± 0 | 3e-05 ± 1e-05 | 30 ± 3 |

Full metrics [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 51000 ± 2000 | 1 ± 0 | 4.1e-05 ± 9e-06 | 29.8 ± 0.9 |
| 3.5.0 | 52000 ± 2000 | 1 ± 0 | 4e-05 ± 2e-05 | 30 ± 1 |
| dev | 51000 ± 2000 | 1 ± 0 | 4e-05 ± 2e-05 | 30 ± 1 |

Full metrics [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 9900 ± 300 | 0.0021 ± 0.0006 | 0.00012 ± 9e-05 | 34 ± 1 |
| 3.5.0 | 9900 ± 500 | 0.003 ± 0.001 | 5e-05 ± 5e-05 | 35 ± 1 |
| dev | 9900 ± 300 | 0.0022 ± 0.0007 | 6e-05 ± 9e-05 | 34.1 ± 0.7 |

Full metrics [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 1850 ± 30 | 0.0004 ± 0.0006 | 0.0 ± 0.0001 | 40 ± 1 |
| 3.5.0 | 1840 ± 30 | 0.0001 ± 0.0002 | 0.0001 ± 0.0001 | 38.5 ± 0.8 |
| dev | 1860 ± 20 | 0.0 ± 0.0001 | 0.0001 ± 0.0002 | 37.4 ± 0.5 |

Multiprocess wall time [sampling interval: 1]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 2600 ± 200 | 1 ± 0 | 0.00019 ± 5e-05 | 250 ± 20 |
| 3.5.0 | 3200 ± 100 | 1 ± 0 | 0.00018 ± 6e-05 | 300 ± 10 |
| dev | 3300 ± 200 | 1 ± 0 | 0.00017 ± 5e-05 | 290 ± 20 |

Multiprocess wall time [sampling interval: 10]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 2400 ± 200 | 1 ± 0 | 0.0003 ± 0.0003 | 270 ± 20 |
| 3.5.0 | 3000 ± 100 | 1 ± 0 | 0.00019 ± 6e-05 | 320 ± 10 |
| dev | 3200 ± 100 | 1 ± 0 | 0.00016 ± 5e-05 | 300 ± 10 |

Multiprocess wall time [sampling interval: 100]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 2300 ± 100 | 0.03 ± 0.02 | 0.00012 ± 6e-05 | 260 ± 20 |
| 3.5.0 | 2900 ± 200 | 0.057 ± 0.008 | 0.00011 ± 4e-05 | 320 ± 20 |
| dev | 3000 ± 200 | 0.05 ± 0.01 | 0.0001 ± 4e-05 | 320 ± 20 |

Multiprocess wall time [sampling interval: 1000]

| Version | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| 3.4.1 | 2100 ± 100 | 0.0006 ± 0.0004 | 2e-05 ± 4e-05 | 42 ± 3 |
| 3.5.0 | 2900 ± 100 | 0.01 ± 0.003 | 2e-05 ± 2e-05 | 80 ± 10 |
| dev | 2900 ± 200 | 0.011 ± 0.003 | 1e-05 ± 2e-05 | 90 ± 10 |

Benchmark Summary

Comparison of dev against 3.5.0.

The following scenarios show a statistically significant difference in performance between the two versions.

| Scenario | Sample Rate | Saturation | Error Rate | Sampling Speed |
| --- | --- | --- | --- | --- |
| Wall time [sampling interval: 10] | 🟢 | 🔴 | 🟡 | 🟢 |
| Full metrics [sampling interval: 1000] | 🟡 | 🟡 | 🟡 | 🟢 |
| Multiprocess wall time [sampling interval: 10] | 🟢 | 🟡 | 🟡 | 🟢 |

P403n1x87 force-pushed the ci/data-validation branch from 014cf92 to 6bbf321 on March 5, 2023 22:16

codecov bot commented on Mar 5, 2023

Codecov Report

Patch coverage has no change and project coverage change: +0.56 🎉

Comparison is base (4c3e26b) 69.84% compared to head (a3dcaa2) 70.41%.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel     #175      +/-   ##
==========================================
+ Coverage   69.84%   70.41%   +0.56%     
==========================================
  Files          25       25              
  Lines        2474     2474              
  Branches      730      730              
==========================================
+ Hits         1728     1742      +14     
+ Misses        413      396      -17     
- Partials      333      336       +3     

| Impacted Files | Coverage Δ |
| --- | --- |
| src/py_string.h | 65.38% <0.00%> (-5.13%) ⬇️ |
| src/py_thread.c | 74.69% <0.00%> (-0.21%) ⬇️ |
| src/austin.c | 69.34% <0.00%> (+1.00%) ⬆️ |
| src/py_proc.c | 69.19% <0.00%> (+3.79%) ⬆️ |


P403n1x87 force-pushed the ci/data-validation branch 7 times, most recently from 9bd5f2d to 20573e4 on March 7, 2023 10:31
This change adds a data validation workflow. Data is validated by
performing a Hotelling T² test on the collected data.
P403n1x87 force-pushed the ci/data-validation branch from 20573e4 to a3dcaa2 on March 7, 2023 22:24
P403n1x87 marked this pull request as ready for review on March 7, 2023 22:43
P403n1x87 merged commit 8ff6d23 into devel on Mar 7, 2023
P403n1x87 deleted the ci/data-validation branch on March 7, 2023 23:27