Why is TinyStr ASCII-only? #16
Comments
@raphlinus designed it this way in projectfluent/fluent-langneg-rs#8, and I just took it because the perf/mem wins were outstanding, so it fits all the needs of Locale. I'm wondering if there should be some refactor in TinyStr to handle UTF-8, and then the current ones would be renamed
My concern is that all methods on TinyStr are very ASCII-specific, and their perf is great because they are simple bitmasks. If we had
The main use case I see would be to store grapheme clusters, which tend to be short but could be quite long. So, a
Hmm, maybe it warrants a rewrite of Or maybe one of the other small string crates will be a good match - SmolStr, ArrayString, smallstr and istring?
The good performance of It would in theory be possible to extend it to bounded-length UTF-8, but some of the logic (case conversions, which are important for locale tags) depends pretty strongly on the ASCII-ness (7th bit clear).
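To illustrate the point about case conversions relying on the high bit being clear, here is a minimal sketch of a branch-free ASCII lowercase over four bytes packed into a `u32`. This is not TinyStr's actual code: the names `pack_ascii4`, `to_ascii_lowercase4` and `unpack4` are hypothetical, and the 4-byte little-endian, zero-padded layout is an assumption made for the example.

```rust
/// Hypothetical helper: pack 1..=4 ASCII bytes into a little-endian u32,
/// zero-padded. Returns None for empty, too long, non-ASCII or NUL input.
fn pack_ascii4(s: &str) -> Option<u32> {
    let bytes = s.as_bytes();
    if bytes.is_empty() || bytes.len() > 4 || !s.is_ascii() || bytes.contains(&0) {
        return None;
    }
    let mut buf = [0u8; 4];
    buf[..bytes.len()].copy_from_slice(bytes);
    Some(u32::from_le_bytes(buf))
}

/// Lowercase all four bytes at once. Only valid when every byte is < 0x80.
fn to_ascii_lowercase4(word: u32) -> u32 {
    // Per byte: bit 7 becomes set iff byte >= b'A' (0x41 + 0x3f == 0x80).
    let ge_a = word + 0x3f3f_3f3f;
    // Per byte: bit 7 becomes set iff byte > b'Z' (0x5b + 0x25 == 0x80).
    let gt_z = word + 0x2525_2525;
    // Bit 7 set exactly in the bytes that hold 'A'..='Z'.
    let is_upper = ge_a & !gt_z & 0x8080_8080;
    // Move bit 7 down to bit 5 (0x20) and set it: 'A' | 0x20 == 'a'.
    word | (is_upper >> 2)
}

/// Hypothetical helper: unpack back to a String, stopping at the zero padding.
fn unpack4(word: u32) -> String {
    let bytes = word.to_le_bytes();
    let len = bytes.iter().position(|&b| b == 0).unwrap_or(4);
    String::from_utf8_lossy(&bytes[..len]).into_owned()
}
```

The additions never carry from one byte into the next only because each byte is at most 0x7f (0x7f + 0x3f = 0xbe). A UTF-8 lead byte such as 0xC3 would overflow into its neighbour, so this trick would not carry over unchanged to arbitrary UTF-8 data.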
It would be useful (unicode-org/icu4x#61 (comment)) to be able to use TinyStr for UTF-8 data, not only ASCII data. What was the design decision to make it support ASCII-only? How hard would it be to extend it to support UTF-8?