Use SSE2 in the ASCII fast path of decodeUtf8 #298

Closed
wants to merge 1 commit into from

@ethercrow commented Oct 3, 2020

Before:

benchmarking DecodeUtf8/Strict+ascii
time                 18.06 ms   (17.12 ms .. 18.56 ms)
                     0.994 R²   (0.984 R² .. 1.000 R²)
mean                 19.66 ms   (18.57 ms .. 24.01 ms)
std dev              4.971 ms   (25.90 μs .. 9.169 ms)
variance introduced by outliers: 86% (severely inflated)

benchmarking DecodeUtf8/Stream+ascii
time                 19.11 ms   (19.08 ms .. 19.13 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 19.16 ms   (19.13 ms .. 19.22 ms)
std dev              100.5 μs   (26.19 μs .. 160.2 μs)

benchmarking DecodeUtf8/Lazy+ascii
time                 18.78 ms   (18.75 ms .. 18.80 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 18.83 ms   (18.80 ms .. 18.88 ms)
std dev              92.52 μs   (25.88 μs .. 141.1 μs)

After:

benchmarking DecodeUtf8/Strict+ascii
time                 11.17 ms   (10.62 ms .. 11.56 ms)
                     0.992 R²   (0.978 R² .. 0.999 R²)
mean                 11.88 ms   (11.07 ms .. 15.03 ms)
std dev              3.993 ms   (305.1 μs .. 8.201 ms)
variance introduced by outliers: 92% (severely inflated)

benchmarking DecodeUtf8/Stream+ascii
time                 6.647 ms   (6.620 ms .. 6.673 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.709 ms   (6.679 ms .. 6.809 ms)
std dev              137.3 μs   (37.82 μs .. 276.3 μs)

benchmarking DecodeUtf8/Lazy+ascii
time                 6.421 ms   (6.372 ms .. 6.479 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 6.447 ms   (6.423 ms .. 6.527 ms)
std dev              123.4 μs   (44.11 μs .. 246.1 μs)
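
For readers unfamiliar with the technique named in the title, here is a minimal sketch of what an SSE2 ASCII fast path can look like. It is not the code from this pull request: the helper name `ascii_prefix_len` is hypothetical, and the scalar fallback for non-SSE2 targets is an assumption added so the example builds anywhere. The idea is to test 16 bytes at a time for a set high bit and only fall back to byte-by-byte work once a non-ASCII byte shows up.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not taken from the PR: returns how many leading
   bytes of src[0..len) are ASCII, i.e. have the high bit clear. */
static size_t ascii_prefix_len(const uint8_t *src, size_t len)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Check 16 bytes per iteration. _mm_movemask_epi8 gathers the top
       bit of every byte, so a nonzero mask means at least one byte in
       the chunk is >= 0x80. */
    while (i + 16 <= len) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(src + i));
        if (_mm_movemask_epi8(chunk) != 0)
            break;
        i += 16;
    }
#endif
    /* Scalar tail: finishes the last < 16 bytes and pinpoints the first
       non-ASCII byte inside a chunk that failed the vector check. */
    while (i < len && src[i] < 0x80)
        i++;
    return i;
}
```

A decoder built on a helper like this could copy or widen the ASCII prefix in bulk and hand the remaining bytes to its general UTF-8 state machine, which is the kind of split that would explain the larger gains on the pure-ASCII benchmarks above.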

@ethercrow changed the title from "Use SSE2 in the x86_64 C version of decodeUtf8" to "Use SSE2 in the ASCII fast path of decodeUtf8" on Oct 3, 2020
@ethercrow marked this pull request as ready for review on October 3, 2020 14:00
@Lysxia linked an issue on Mar 8, 2021 that may be closed by this pull request
@ethercrow closed this on Apr 10, 2021