Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Improve Integer Serialization Performance #163

Open
Alexhuszagh opened this issue Sep 25, 2024 · 5 comments
Open

[BUG] Improve Integer Serialization Performance #163

Alexhuszagh opened this issue Sep 25, 2024 · 5 comments
Assignees
Labels
bug Something isn't working
Milestone

Comments

@Alexhuszagh
Copy link
Owner

Description

As of itoa 1.0+, it has significantly faster performance for some use-cases than lexical. Since we use our integer writers for float formatting, this could lead to noticeable performance enhancements too for float formatting.

@Alexhuszagh Alexhuszagh added the bug Something isn't working label Sep 25, 2024
@Alexhuszagh Alexhuszagh added this to the 1.1 milestone Sep 25, 2024
@Alexhuszagh Alexhuszagh self-assigned this Sep 25, 2024
@plokhotnyuk
Copy link

ICYMI https://jk-jeon.github.io/posts/2022/02/jeaiii-algorithm/

@Alexhuszagh
Copy link
Owner Author

Alexhuszagh commented Sep 26, 2024

Oh this is incredible, thank you. We use their algorithm for float formatting as well (Dragonbox) so this sounds like an excellent addition.

@Alexhuszagh
Copy link
Owner Author

Alexhuszagh commented Oct 13, 2024

Looking at this, we're getting excellent performance for 32-bit and smaller integers but I'm looking at the best way currently to fix performance for 64-bit integers due to the excessive branching causing issues. The 2-digit loop (the algorithm proposed by Alexandrescu) still generally is the best for integers larger than 32 bits from my testing but there should be a better way to optimize this.

Another great part of the jeaiii algorithm is if possible, it has no additional overhead from bounds checking of the buffer indexing (table lookups still require bounds checking it generally seems).

@jk-jeon
Copy link

jk-jeon commented Oct 31, 2024

Just noticed this.

Giant branching is indeed a problem. I recall I did some research on doing this for 64-bit integers and concluded that it's just best to decompose it into 32-bit chunks (like one 2-digit chunk and two 9-digit chunks or something like that) rather than to operate directly on 64-bit integers, but don't have any real data.

What I can tell is that James did his own research after reading my post and came up with an updated version, which includes 64-bit: https://github.com/jeaiii/itoa

I recommend you to just have a look at what exact voodoo he did. It seems he decomposed a number into 4-8-8 chunks rather than 2-9-9 which I guess makes a lot of sense.

By the way, if I recall correctly, Andrei Alexandrescu is not really the one who invented the 2-digit grouping trick; it seems more like a folklore (like the famous Quake isqrt). He is just the one who popularized the trick, as far as I know.

@plokhotnyuk
Copy link

plokhotnyuk commented Nov 1, 2024

@Alexhuszagh @jk-jeon Thanks a bunch guys for your projects, blogs, and Reddit posts!

I reused your findings in my project: https://github.com/plokhotnyuk/jsoniter-scala

Here you can see how I reduced branch mis-predictions using branch-less counter of digits and 8-byte store methods during serialization of 64-bit integers.

Below are throughput results for serialization of 128 values of 64-bit signed integers as a JSON array on different JVMs by different JSON libraries (including jsoniter-scala, smithy4sJson that uses jsoniter-scala under the hood, and jackson-module-scala and weePicke that both use 3-digit LUT table of jackson-core library):

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants