Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue 6657 - dotProduct overload for small fixed size arrays #6990

Merged
merged 1 commit into from
May 3, 2019

Conversation

n8sh
Copy link
Member

@n8sh n8sh commented Apr 30, 2019

Benchmark using DMD, 10 million trials:

version arg type compiler time (ms)
old float[3] dmd -O 118.503
new float[3] dmd -O 64.535
old float[4] dmd -O 120.456
new float[4] dmd -O 74.575
old float[5] dmd -O 141.211
new float[5] dmd -O 99.659
old float[6] dmd -O 151.537
new float[6] dmd -O 107.559
old float[7] dmd -O 192.009
new float[7] dmd -O 137.358
old float[8] dmd -O 176.876
new float[8] dmd -O 147.655
old float[9] dmd -O 200.092
new float[9] dmd -O 167.429
old float[16] dmd -O 318.174
new float[16] dmd -O 288.255

Benchmark using LDC, 10 million trials:

version arg type compiler time (ms)
old float[3] ldc -O2 55.215
new float[3] ldc -O2 17.219
old float[4] ldc -O2 35.801
new float[4] ldc -O2 20.170
old float[5] ldc -O2 53.151
new float[5] ldc -O2 20.194
old float[6] ldc -O2 62.134
new float[6] ldc -O2 26.126
old float[7] ldc -O2 67.609
new float[7] ldc -O2 26.335
old float[8] ldc -O2 47.428
new float[8] ldc -O2 25.960
old float[9] ldc -O2 66.051
new float[9] ldc -O2 29.275
old float[16] ldc -O2 73.301
new float[16] ldc -O2 45.106

@n8sh n8sh requested a review from andralex as a code owner April 30, 2019 23:01
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @n8sh!

Bugzilla references

Auto-close Bugzilla Severity Description
6657 enhancement dotProduct overload for small fixed size arrays

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#6990"

@n8sh
Copy link
Member Author

n8sh commented May 1, 2019

double benchmark.

Benchmark using DMD, 10 million trials:

version arg type compiler time (ms)
old double[3] dmd -O 286.851
new double[3] dmd -O 167.054
old double[4] dmd -O 280.592
new double[4] dmd -O 178.395
old double[8] dmd -O 342.528
new double[8] dmd -O 243.230
old double[9] dmd -O 388.862
new double[9] dmd -O 263.507
old double[16] dmd -O 496.451
new double[16] dmd -O 353.159

Benchmark using LDC, 10 million trials:

version arg type compiler time (ms)
old double[3] ldc -O2 59.573
new double[3] ldc -O2 18.770
old double[4] ldc -O2 38.978
new double[4] ldc -O2 26.844
old double[8] ldc -O2 55.563
new double[8] ldc -O2 32.187
old double[9] ldc -O2 73.577
new double[9] ldc -O2 35.114
old double[16] ldc -O2 83.846
new double[16] ldc -O2 52.690

@thewilsonator
Copy link
Contributor

Did you do -O3 results for LDC?

@n8sh
Copy link
Member Author

n8sh commented May 1, 2019

float old version ldc2 -O3
version arg type compiler time (ms)
old float[2] ldc -O3 48.728
old float[3] ldc -O3 58.374
old float[4] ldc -O3 32.802
old float[5] ldc -O3 45.219
old float[6] ldc -O3 51.883
old float[7] ldc -O3 65.833
old float[8] ldc -O3 45.824
old float[9] ldc -O3 58.899
old float[10] ldc -O3 68.776
old float[11] ldc -O3 70.097
old float[12] ldc -O3 52.271
old float[13] ldc -O3 71.376
old float[14] ldc -O3 74.872
old float[15] ldc -O3 82.841
old float[16] ldc -O3 68.458
float new version ldc2 -O3
version arg type compiler time (ms)
new float[2] ldc -O3 15.801
new float[3] ldc -O3 19.864
new float[4] ldc -O3 14.314
new float[5] ldc -O3 20.084
new float[6] ldc -O3 20.095
new float[7] ldc -O3 28.606
new float[8] ldc -O3 23.166
new float[9] ldc -O3 30.785
new float[10] ldc -O3 31.497
new float[11] ldc -O3 40.093
new float[12] ldc -O3 35.538
new float[13] ldc -O3 36.830
new float[14] ldc -O3 35.538
new float[15] ldc -O3 44.962
new float[16] ldc -O3 36.207

Had to tweak the test a bit for int because LDC was too clever about optimizing away work. In the integer case by N=16 there is no longer a benefit with LDC but with DMD there still is.

int old version ldc2 -O3
version arg type compiler time (ms)
old int[2] ldc -O3 52.763
old int[3] ldc -O3 70.344
old int[4] ldc -O3 57.306
old int[5] ldc -O3 69.355
old int[6] ldc -O3 79.647
old int[7] ldc -O3 92.875
old int[8] ldc -O3 74.591
old int[9] ldc -O3 80.862
old int[10] ldc -O3 87.535
old int[11] ldc -O3 107.287
old int[12] ldc -O3 88.301
old int[13] ldc -O3 105.696
old int[14] ldc -O3 105.883
old int[15] ldc -O3 117.358
old int[16] ldc -O3 107.997
int new version ldc2 -O3
version arg type compiler time (ms)
new int[2] ldc -O3 19.510
new int[3] ldc -O3 24.590
new int[4] ldc -O3 24.458
new int[5] ldc -O3 29.813
new int[6] ldc -O3 37.051
new int[7] ldc -O3 37.410
new int[8] ldc -O3 65.825
new int[9] ldc -O3 68.708
new int[10] ldc -O3 64.281
new int[11] ldc -O3 70.540
new int[12] ldc -O3 67.946
new int[13] ldc -O3 80.485
new int[14] ldc -O3 101.257
new int[15] ldc -O3 112.384
new int[16] ldc -O3 128.460
int old version dmd -O
version arg type compiler time (ms)
old int[2] dmd -O 120.922
old int[3] dmd -O 138.149
old int[4] dmd -O 147.788
old int[5] dmd -O 150.252
old int[6] dmd -O 181.878
old int[7] dmd -O 192.617
old int[8] dmd -O 204.361
old int[9] dmd -O 215.713
old int[10] dmd -O 231.395
old int[11] dmd -O 248.746
old int[12] dmd -O 240.910
old int[13] dmd -O 274.690
old int[14] dmd -O 287.599
old int[15] dmd -O 310.941
old int[16] dmd -O 330.701
int new version dmd -O
version arg type compiler time (ms)
new int[2] dmd -O 78.174
new int[3] dmd -O 105.965
new int[4] dmd -O 114.752
new int[5] dmd -O 129.529
new int[6] dmd -O 133.568
new int[7] dmd -O 149.295
new int[8] dmd -O 186.759
new int[9] dmd -O 194.025
new int[10] dmd -O 207.918
new int[11] dmd -O 212.257
new int[12] dmd -O 229.368
new int[13] dmd -O 250.924
new int[14] dmd -O 270.067
new int[15] dmd -O 281.089
new int[16] dmd -O 305.268

@thewilsonator
Copy link
Contributor

Very nice! Please merge when you are happy with it.

@PetarKirov
Copy link
Member

Nice! Can you add a unit test to prevent possible future regressions?

@n8sh
Copy link
Member Author

n8sh commented May 2, 2019

You mean a performance benchmark?

@PetarKirov
Copy link
Member

No, just a simple test that calls the function with static arrays of even and odd length.
Performance regressions testing is a very good idea in general, but we don't have any infrastructure to support such.

@n8sh
Copy link
Member Author

n8sh commented May 2, 2019

@ZombineDev makes sense.

@n8sh n8sh force-pushed the issue-6657 branch 2 times, most recently from 21bf0a7 to 7804b9c Compare May 3, 2019 01:23
@n8sh n8sh added the auto-merge label May 3, 2019
@dlang-bot dlang-bot merged commit 36332b8 into dlang:master May 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants