I don't have a concrete use for this, but when writing the byteset code it occurred to me that people might want to use it for e.g. searching for the next ASCII digit, or the next lowercase ASCII character (it seems to me that this is more generally useful for ranges of u8 rather than char, but I could be wrong).
These can be accomplished with the byteset functions, but less efficiently than dedicated functions for finding bytes in a range.
One of the flags to PCMPESTRI does allow for range checks, and implementing such a check with earlier SSE versions is easier for ranges than for arbitrary byte sets.
The byteset functions could also possibly autodetect this. In many cases it would be cheap (e.g. b"0123456789"), but for some it would take an extra pass over the byte table afterwards to look for the consecutive runs (e.g. b"0918273645"), which seems unfortunate.
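To make the autodetection idea concrete, here is a minimal sketch of that extra pass. The function name `as_single_range` and the 256-entry boolean table are assumptions made for illustration; they are not taken from the existing byteset code.

```rust
/// Hypothetical helper (not an existing API): if the bytes in `set` form
/// exactly one contiguous run, return its inclusive bounds `(lo, hi)`.
fn as_single_range(set: &[u8]) -> Option<(u8, u8)> {
    // Build a membership table, one slot per possible byte value.
    let mut table = [false; 256];
    for &b in set {
        table[b as usize] = true;
    }
    // The extra pass: find the lowest and highest members...
    let lo = table.iter().position(|&member| member)?;
    let hi = table.iter().rposition(|&member| member)?;
    // ...and check that there are no gaps between them.
    if table[lo..=hi].iter().all(|&member| member) {
        Some((lo as u8, hi as u8))
    } else {
        None
    }
}
```

With this sketch, both b"0123456789" and b"0918273645" are recognized as the range from b'0' to b'9', while a set with a gap such as b"ac" is not treated as a range.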
Additionally, having to type out the members of the set is less syntactically convenient than b'0'..=b'9' (or whatever).
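For illustration, a scalar version of such a range search might look roughly like the following. The name `find_byte_in_range` and its signature are hypothetical, and this sketch does not use SIMD at all; it only shows the range-taking call shape and the single-comparison range check.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper (not an existing API): returns the index of the
/// first byte in `haystack` that lies within `range`, if any.
fn find_byte_in_range(haystack: &[u8], range: RangeInclusive<u8>) -> Option<usize> {
    let (lo, hi) = (*range.start(), *range.end());
    if lo > hi {
        // An inverted range such as b'9'..=b'0' contains no bytes.
        return None;
    }
    let width = hi - lo;
    // For unsigned bytes, `b.wrapping_sub(lo) <= width` holds exactly when
    // `lo <= b && b <= hi`, so each byte needs only one comparison.
    haystack.iter().position(|&b| b.wrapping_sub(lo) <= width)
}
```

A caller could then write find_byte_in_range(haystack, b'0'..=b'9') instead of spelling out every member of the set.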
Not sure if these are worth adding. Again, I don't really have a use, so maybe it's worth waiting for someone who does. And it's unclear how extensive you'd like the searching capabilities to be on ByteSlice anyway. Thoughts?
It's a good idea. One possible use case here is in a regex engine, although using routines like this effectively in that context isn't straightforward.
I'd probably prefer to wait until we have a concrete use case for this.
If initialization time becomes a problem then we can add something like a Finder to permit callers to amortize that cost if they need to.