fix: optimize the ARM function for systems with weak SIMD performance #50

Merged
merged 5 commits into from
Jun 29, 2024

Conversation

lemire
Member

@lemire lemire commented Jun 26, 2024

To optimize for weak ARM cores, I am testing on an AWS Graviton 2, which is based on the Neoverse N1. The trick is to perform the sum-across (horizontal add) only once per string, or once per block of ~4 kB for long strings.
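
As a rough illustration of deferring the horizontal sum, here is a minimal sketch that counts UTF-8 continuation bytes. It is not the code from this PR: the class and method names are hypothetical, it uses the portable `Vector128` API rather than ARM-specific intrinsics, and it assumes the ~4 kB limit comes from byte-sized lanes overflowing after 255 additions (255 blocks of 16 bytes is 4080 bytes).

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical sketch, not the PR's implementation.
static class DeferredSumSketch
{
    // Counts UTF-8 continuation bytes (0b10xxxxxx) in data.
    public static int CountContinuationBytes(ReadOnlySpan<byte> data)
    {
        int total = 0;
        int i = 0;
        while (i + 16 <= data.Length)
        {
            // Per-lane byte counters. Each lane grows by at most 1 per block,
            // so 255 blocks (4080 bytes) is the most we can accumulate safely.
            Vector128<byte> acc = Vector128<byte>.Zero;
            int blocks = Math.Min(255, (data.Length - i) / 16);
            for (int b = 0; b < blocks; b++, i += 16)
            {
                Vector128<sbyte> v = Vector128.Create(data.Slice(i, 16)).AsSByte();
                // As signed bytes, continuation bytes (0x80..0xBF) are below -64.
                Vector128<sbyte> mask = Vector128.LessThan(v, Vector128.Create((sbyte)-64));
                // Matching lanes are 0xFF (-1), so subtracting the mask adds 1.
                acc -= mask.AsByte();
            }
            // The expensive horizontal sum runs once per chunk, not once per block.
            (Vector128<ushort> lo, Vector128<ushort> hi) = Vector128.Widen(acc);
            total += Vector128.Sum(lo + hi);
        }
        // Scalar tail for the remaining (fewer than 16) bytes.
        for (; i < data.Length; i++)
        {
            if ((sbyte)data[i] < -64) total++;
        }
        return total;
    }
}
```

The inner loop only does cheap lane-wise work; the widening and cross-lane sum, which tend to be slow on weaker SIMD units, are paid once per chunk.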

New results on @EgorBo's data

| Method | FileName | Mean | Error | StdDev | Speed (GB/s) |
|---|---|---|---|---|---|
| SIMDUtf8ValidationRealData | data/Bogatov1069.utf8.txt | 510.69 ns | 4.810 ns | 0.264 ns | 2.09 |
| SIMDUtf8ValidationRealData | data/Bogatov136.utf8.txt | 92.22 ns | 0.438 ns | 0.024 ns | 1.47 |
| SIMDUtf8ValidationRealData | data/Bogatov286.utf8.txt | 176.81 ns | 1.814 ns | 0.099 ns | 1.62 |
| SIMDUtf8ValidationRealData | data/Bogatov527.utf8.txt | 284.26 ns | 1.160 ns | 0.064 ns | 1.85 |

Old results on @EgorBo's data

| Method | FileName | Mean | Error | StdDev | Speed (GB/s) |
|---|---|---|---|---|---|
| SIMDUtf8ValidationRealData | data/Bogatov1069.utf8.txt | 576.68 ns | 89.757 ns | 4.920 ns | 1.85 |
| SIMDUtf8ValidationRealData | data/Bogatov136.utf8.txt | 98.05 ns | 4.872 ns | 0.267 ns | 1.39 |
| SIMDUtf8ValidationRealData | data/Bogatov286.utf8.txt | 193.19 ns | 13.463 ns | 0.738 ns | 1.48 |
| SIMDUtf8ValidationRealData | data/Bogatov527.utf8.txt | 318.96 ns | 11.377 ns | 0.624 ns | 1.65 |

New results on Twitter:

| Method | FileName | Mean | Error | StdDev | Speed (GB/s) |
|---|---|---|---|---|---|
| SIMDUtf8ValidationRealData | data/twitter.json | 81.39 us | 1.671 us | 0.092 us | 7.76 |

Old results on Twitter:

| Method | FileName | Mean | Error | StdDev | Speed (GB/s) |
|---|---|---|---|---|---|
| SIMDUtf8ValidationRealData | data/twitter.json | 90.52 us | 1.111 us | 0.061 us | 6.98 |

New results on Lipsum:

| Method | FileName | Mean | Error | StdDev | Speed (GB/s) |
|---|---|---|---|---|---|
| SIMDUtf8ValidationRealData | data/Arabic-Lipsum.utf8.txt | 33.085 us | 0.0820 us | 0.0045 us | 2.47 |
| SIMDUtf8ValidationRealData | data/Chinese-Lipsum.utf8.txt | 28.135 us | 0.0356 us | 0.0020 us | 2.48 |
| SIMDUtf8ValidationRealData | data/Emoji-Lipsum.utf8.txt | 26.412 us | 0.0459 us | 0.0025 us | 2.48 |
| SIMDUtf8ValidationRealData | data/Hebrew-Lipsum.utf8.txt | 26.829 us | 0.2250 us | 0.0123 us | 2.48 |
| SIMDUtf8ValidationRealData | data/Hindi-Lipsum.utf8.txt | 37.910 us | 0.1988 us | 0.0109 us | 2.32 |
| SIMDUtf8ValidationRealData | data/Japanese-Lipsum.utf8.txt | 27.758 us | 9.4075 us | 0.5157 us | 2.44 |
| SIMDUtf8ValidationRealData | data/Korean-Lipsum.utf8.txt | 26.945 us | 1.8570 us | 0.1018 us | 2.47 |
| SIMDUtf8ValidationRealData | data/Latin-Lipsum.utf8.txt | 3.737 us | 0.0205 us | 0.0011 us | 23.26 |
| SIMDUtf8ValidationRealData | data/Russian-Lipsum.utf8.txt | 45.083 us | 0.0307 us | 0.0017 us | 2.32 |

Old results on Lipsum:

| Method | FileName | Mean | Error | StdDev | Speed (GB/s) |
|---|---|---|---|---|---|
| SIMDUtf8ValidationRealData | data/Arabic-Lipsum.utf8.txt | 37.712 us | 0.1409 us | 0.0077 us | 2.17 |
| SIMDUtf8ValidationRealData | data/Chinese-Lipsum.utf8.txt | 33.545 us | 0.9888 us | 0.0542 us | 2.08 |
| SIMDUtf8ValidationRealData | data/Emoji-Lipsum.utf8.txt | 37.238 us | 0.1203 us | 0.0066 us | 1.76 |
| SIMDUtf8ValidationRealData | data/Hebrew-Lipsum.utf8.txt | 33.957 us | 0.0790 us | 0.0043 us | 1.96 |
| SIMDUtf8ValidationRealData | data/Hindi-Lipsum.utf8.txt | 44.110 us | 0.0708 us | 0.0039 us | 1.99 |
| SIMDUtf8ValidationRealData | data/Japanese-Lipsum.utf8.txt | 32.090 us | 1.5966 us | 0.0875 us | 2.11 |
| SIMDUtf8ValidationRealData | data/Korean-Lipsum.utf8.txt | 30.836 us | 0.3372 us | 0.0185 us | 2.16 |
| SIMDUtf8ValidationRealData | data/Latin-Lipsum.utf8.txt | 3.687 us | 0.1805 us | 0.0099 us | 23.58 |
| SIMDUtf8ValidationRealData | data/Russian-Lipsum.utf8.txt | 50.254 us | 3.7907 us | 0.2078 us | 2.08 |

Collaborator

@Nick-Nuon Nick-Nuon left a comment

Looking good to me.

src/UTF8.cs Outdated
{
Span<byte> b = stackalloc byte[16];
v.CopyTo(b);
Console.WriteLine(Convert.ToHexString(b));
Collaborator

@EgorBo EgorBo Jun 29, 2024

I think we can file an API proposal for dotnet/runtime to introduce an API something like:

var hex = v.ToString("X"); // common format symbol for hex
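
Until such an API exists, the same output can be obtained with a small helper. This is only a sketch built from the calls in the snippet under review; the extension method name `ToHex` and its class are hypothetical and not part of dotnet/runtime.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper, not an existing dotnet/runtime API.
static class VectorHexSketch
{
    // Returns the 16 bytes of v as an uppercase hex string, lowest element first.
    public static string ToHex(this Vector128<byte> v)
    {
        Span<byte> b = stackalloc byte[16];
        v.CopyTo(b);
        return Convert.ToHexString(b);
    }
}
```

With this in scope, `Console.WriteLine(v.ToHex());` replaces the three-line debug snippet.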

Member Author

Thanks!!! I did not mean to leave this code there. Removed.

I can file the proposal if you think that would be useful, or I can support it if you file it yourself.

@lemire lemire merged commit 7eae2dd into main Jun 29, 2024
6 checks passed
3 participants