Just a comment that I thought might be of interest to you: swar-utf8-length can be done with one fewer instruction (if we don't count creation of constants). This doesn't make a difference on x86 though, since it has an "andnot" instruction and the current version already needs only three instructions.
There's also a different, perhaps interesting, approach here that does without popcount by generating 0/1 in the low bit of each byte and accumulating that.
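For readers following along, here is a minimal sketch of what a popcount-free count along those lines could look like. It is not the linked code: the names `continuation_flags` and `utf8_length_accumulate` are made up for illustration, and the input is assumed to be valid UTF-8. Each continuation byte (`10xxxxxx`) yields a 1 in the low bit of its byte lane, the lanes are summed as eight independent byte counters, and the counters are flushed before any of them can overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 0x01 in every byte lane of x that holds a UTF-8
   continuation byte (bit 7 set, bit 6 clear), 0x00 in all other lanes. */
static uint64_t continuation_flags(uint64_t x) {
    const uint64_t low_bits = 0x0101010101010101ULL;
    return (x >> 7) & ~(x >> 6) & low_bits;
}

/* Hypothetical name: number of code points in n bytes of valid UTF-8,
   computed as n minus the number of continuation bytes. */
size_t utf8_length_accumulate(const char *s, size_t n) {
    size_t continuation = 0;
    size_t i = 0;

    while (n - i >= 8) {
        uint64_t acc = 0; /* eight independent byte counters */

        /* At most 255 words per block, so no byte counter can overflow. */
        for (int k = 0; k < 255 && n - i >= 8; k++, i += 8) {
            uint64_t x;
            memcpy(&x, s + i, 8); /* unaligned-safe load */
            acc += continuation_flags(x);
        }

        /* Horizontal sum of the eight byte counters. */
        for (int b = 0; b < 8; b++) {
            continuation += (acc >> (8 * b)) & 0xFF;
        }
    }

    /* Scalar tail for the last 0..7 bytes. */
    for (; i < n; i++) {
        continuation += ((unsigned char)s[i] & 0xC0) == 0x80;
    }

    return n - continuation;
}
```

Because only per-byte properties are counted, the result does not depend on endianness.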
@falk-hueffner Hi, thanks for the comment. The "andnot" instruction (ANDN on x86) comes from the BMI1 extension, so it is not available everywhere. I've just checked, and gcc -march=skylake ... emits this instruction for the SWAR code.
Yeah, popcount can be replaced with multiplication. That was the approach proposed by @zwegner on Twitter.
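To make the comparison concrete, here is a small sketch of both variants for a single 64-bit word. The function names are hypothetical and this is not necessarily the exact code from the repository or from the Twitter thread; `__builtin_popcountll` assumes GCC or Clang. The mask step is the shift, and-not, and sequence discussed above; the multiplication variant moves each flag to the low bit of its byte and lets a multiply by `0x0101010101010101` sum the eight bytes into the top byte.

```c
#include <stdint.h>

/* Hypothetical helper: 0x80 in every byte lane of x that holds a UTF-8
   continuation byte (bit 7 set, bit 6 clear). With BMI1 enabled, a
   compiler may turn x & ~(x << 1) into a single and-not instruction. */
static uint64_t continuation_mask(uint64_t x) {
    return x & ~(x << 1) & 0x8080808080808080ULL;
}

/* Variant 1: count the flagged lanes with popcount (GCC/Clang builtin). */
static unsigned count_continuation_popcount(uint64_t x) {
    return (unsigned)__builtin_popcountll(continuation_mask(x));
}

/* Variant 2: no popcount. Each lane holds 0 or 1, so the multiplication
   accumulates all eight lanes into the most significant byte without
   carries (the sum is at most 8). */
static unsigned count_continuation_multiply(uint64_t x) {
    uint64_t flags = continuation_mask(x) >> 7;
    return (unsigned)((flags * 0x0101010101010101ULL) >> 56);
}
```

Both functions return the number of continuation bytes in the word; the number of code points that start in the word is 8 minus that value.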