Feature Request
Fury Java serializes strings with three encodings:
latin1: used when all chars are latin1. This is just a memory copy on JDK 11+.
utf16: used when 50% or more of the chars are not ASCII.
utf8: used when 50% or more of the chars are ASCII.
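As a rough illustration of the selection rule for the three encodings, here is a minimal Python sketch. The helper names `choose_encoding` and `encode_string` are hypothetical and not part of pyfury. The tie-break at exactly 50% ASCII (sent to utf8 here) and the little-endian UTF-16 byte order are assumptions, since the issue does not specify them.

```python
def choose_encoding(s: str) -> str:
    """Pick one of the three encodings (hypothetical helper, not pyfury API)."""
    # latin1 applies only when every code point fits in one byte.
    if all(ord(ch) < 0x100 for ch in s):
        return "latin1"
    ascii_count = sum(1 for ch in s if ord(ch) < 0x80)
    # Assumption: an exact 50/50 split is treated as "50%+ ASCII" -> utf8.
    if ascii_count * 2 >= len(s):
        return "utf8"
    return "utf16"


def encode_string(s: str):
    """Return (encoding name, encoded bytes) for the chosen encoding."""
    name = choose_encoding(s)
    # Assumption: UTF-16 is written little-endian without a BOM.
    codec = {"latin1": "latin-1", "utf8": "utf-8", "utf16": "utf-16-le"}[name]
    return name, s.encode(codec)
```

Note that Python counts code points while Java counts UTF-16 code units, so the ratios can differ slightly for strings containing characters outside the Basic Multilingual Plane.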
Fury Java also uses superword and bitmask operations to check or write 8 bytes of ASCII at once, which makes encoding faster.
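The superword idea can be sketched in Python as follows. This only illustrates the logic: `is_ascii_swar` is a hypothetical name, it works on raw bytes with the mask `0x8080808080808080`, and Fury Java's actual implementation may differ in word layout and masks. A real pyfury version would presumably live in C++, since Python integer arithmetic gains no speed from this trick.

```python
HIGH_BITS = 0x8080808080808080  # the top bit of each of the 8 bytes in a word


def is_ascii_swar(data: bytes) -> bool:
    """Check 8 bytes per step by masking the high bit of every byte at once."""
    n = len(data)
    full = n - (n % 8)  # length of the prefix made of whole 8-byte words
    for i in range(0, full, 8):
        word = int.from_bytes(data[i:i + 8], "little")
        # A non-zero result means at least one byte in the word is >= 0x80.
        if word & HIGH_BITS:
            return False
    # Fewer than 8 bytes remain: fall back to a per-byte check.
    return all(b < 0x80 for b in data[full:])
```

For example, `is_ascii_swar(b"abcdefgh")` inspects a single 64-bit word instead of eight separate bytes.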
Here is the Fury benchmark result compared with the JDK, Kryo, and Flink string serializers:
For pyfury, we should do something similar. Since pyfury can invoke C++ at low cost, we could implement the string encodings with SIMD in C++ and have pyfury wrap them with Cython.
Is your feature request related to a problem? Please describe
No response
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
#1732
#1754
#1890
#1964