Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

!perf: Use string for iteration and error reporting #24

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

romshark
Copy link
Owner

@romshark romshark commented Dec 17, 2023

The string type is smaller than a []byte and can be copied much faster. Copying structs <= 32 bytes on modern 64-bit systems is essentially free whereas Error[[]byte] is 40 bytes wide
making jscan work faster with string than with []byte. This PR eliminates the overhead at the cost of backward compatibility improving performance by about 14%.

  • 👎🏻 This change requires the use of unsafe.
  • 👎🏻 Requires backward incompatible changes to type Error:
    • Removed type parameter.
    • Removed field Src.
  • 👍🏻 Slightly better performance overall.
  • 👍🏻 Type Error is no longer required to be type-parameterized.

An alternative solution is presented in #25.

Benchmark results:

goos: darwin
goarch: arm64
pkg: github.com/romshark/jscan/v2
                                    │  main.bench  │        string-iterator.bench        │
                                    │    sec/op    │   sec/op     vs base                │
CalcStats/miniscule_1b__________-10    20.63n ± 0%   14.28n ± 0%  -30.80% (p=0.000 n=10)
CalcStats/tiny_8b_______________-10    27.84n ± 0%   25.92n ± 0%   -6.90% (p=0.000 n=10)
CalcStats/small_336b____________-10    311.8n ± 0%   317.2n ± 0%   +1.75% (p=0.000 n=10)
CalcStats/large_26m_____________-10    13.84m ± 0%   13.79m ± 0%   -0.35% (p=0.000 n=10)
CalcStats/nasa_SxSW_2016_125k___-10    115.4µ ± 0%   114.2µ ± 0%   -1.09% (p=0.000 n=10)
CalcStats/escaped_3k____________-10    1.399µ ± 0%   1.397µ ± 0%   -0.14% (p=0.000 n=10)
CalcStats/array_int_1024_12k____-10    13.83µ ± 0%   13.31µ ± 0%   -3.72% (p=0.000 n=10)
CalcStats/array_dec_1024_10k____-10    11.73µ ± 4%   11.91µ ± 7%        ~ (p=0.105 n=10)
CalcStats/array_nullbool_1024_5k-10    7.482µ ± 1%   7.480µ ± 1%        ~ (p=0.436 n=10)
CalcStats/array_str_1024_639k___-10    146.4µ ± 0%   146.2µ ± 0%   -0.15% (p=0.000 n=10)
Valid/deeparray_____________-10       15.680n ± 1%   4.966n ± 0%  -68.33% (p=0.000 n=10)
Valid/unwind_stack__________-10       15.665n ± 1%   4.966n ± 0%  -68.30% (p=0.000 n=10)
Valid/miniscule_1b__________-10       11.430n ± 0%   7.760n ± 0%  -32.11% (p=0.000 n=10)
Valid/tiny_8b_______________-10        17.38n ± 0%   16.76n ± 0%   -3.57% (p=0.000 n=10)
Valid/small_336b____________-10        236.2n ± 0%   225.9n ± 0%   -4.34% (p=0.000 n=10)
Valid/large_26m_____________-10        11.09m ± 0%   11.00m ± 0%   -0.80% (p=0.000 n=10)
Valid/nasa_SxSW_2016_125k___-10        85.25µ ± 0%   83.16µ ± 0%   -2.45% (p=0.000 n=10)
Valid/escaped_3k____________-10        1.387µ ± 0%   1.386µ ± 0%   -0.07% (p=0.001 n=10)
Valid/array_int_1024_12k____-10        9.135µ ± 1%   9.447µ ± 0%   +3.42% (p=0.000 n=10)
Valid/array_dec_1024_10k____-10        8.672µ ± 0%   8.661µ ± 0%        ~ (p=0.280 n=10)
Valid/array_nullbool_1024_5k-10        3.415µ ± 5%   3.243µ ± 4%   -5.04% (p=0.001 n=10)
Valid/array_str_1024_639k___-10        142.3µ ± 0%   142.4µ ± 0%        ~ (p=0.105 n=10)
Yo/bytes-10                                          13.04n ± 0%
Yo/string-10                                         12.89n ± 1%
geomean                                3.038µ        1.681µ       -13.86%

@romshark romshark self-assigned this Dec 17, 2023
@romshark romshark added major Requires backward incompatible changes performance Improves performance labels Dec 17, 2023
@romshark romshark added this to the v3 milestone Dec 17, 2023
@romshark
Copy link
Owner Author

Alternatively, the error reporting can be changed to avoid needing Src in the first place.
The downside would be slightly less convenient error messages:

error at index 6: illegal control character

instead of:

error at index 6 (0x02): illegal control character

But this can still be replicated by the API user if necessary.

The string type is smaller than a byte slice and can be
copied much faster than []byte on modern 64-bit systems.
Drawbacks: this change requires the use of unsafe.
Type Error is no longer required to be type-parameterized but
it's preserved as Error[S] in order to remain backward compatible.
The type parameter can be removed in a new major version.
Remove type parameter because the Error type
requires backward incompatible changes anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major Requires backward incompatible changes performance Improves performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant