Just ran into this one in my project. Unfortunately I do not see an easy solution.
The issue isn't zero bytes precisely, but the fact that the library uses a null terminator to ensure that no key is a full prefix of another key. That's the case that needs to be avoided, and what this test produces. If the null character is allowed in the input, I don't see a simple way to ensure that no key is a prefix of another one, though perhaps an adaptive approach could be used.
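To make the failure case concrete, here is a minimal sketch in Go. It assumes the library appends a single zero byte to each key, as described above. `terminate` and `isFullPrefix` are hypothetical helpers written for illustration, not part of the library.

```go
package main

import "bytes"

// terminate mimics the assumed strategy of appending one zero byte to a key.
// Hypothetical helper, not the library's actual code.
func terminate(key []byte) []byte {
	out := make([]byte, 0, len(key)+1)
	out = append(out, key...)
	return append(out, 0)
}

// isFullPrefix reports whether a is a proper prefix of b,
// which is the situation the tree cannot represent.
func isFullPrefix(a, b []byte) bool {
	return len(a) < len(b) && bytes.HasPrefix(b, a)
}
```

With these helpers, `terminate([]byte("a"))` is a full prefix of `terminate([]byte("a\x00b"))`, so the terminator no longer protects the tree once the input itself may contain a zero byte. Keys without embedded zero bytes, such as "a" and "ab", stay prefix-free after termination.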
Another approach (I think I've seen this in other radix tree implementations) would be for internal nodes to have a special leaf node pointer, and use child node pointers exclusively for pointing to other inner nodes. This would require tweaking the algorithm somewhat, and I haven't thought through the performance implications.
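A rough sketch of that idea, under heavy simplification: this is a plain byte trie with a map per node, without path compression or the adaptive node sizes a real ART uses. All type and function names here (`leaf`, `innerNode`, `insert`, `search`) are made up for illustration.

```go
package main

// leaf holds a complete key and its value.
type leaf struct {
	key   []byte
	value interface{}
}

// innerNode keeps a dedicated slot for the key that ends exactly at this
// node; children only ever point to other inner nodes.
type innerNode struct {
	terminal *leaf
	children map[byte]*innerNode
}

func newInner() *innerNode {
	return &innerNode{children: make(map[byte]*innerNode)}
}

// insert walks one byte at a time and stores the leaf in the terminal slot
// of the final node, so no terminator byte is needed.
func (n *innerNode) insert(key []byte, value interface{}) {
	cur := n
	for _, b := range key {
		next, ok := cur.children[b]
		if !ok {
			next = newInner()
			cur.children[b] = next
		}
		cur = next
	}
	// Copy the key so later changes to the caller's slice do not affect the tree.
	cur.terminal = &leaf{key: append([]byte(nil), key...), value: value}
}

// search returns the value stored for key, if any.
func (n *innerNode) search(key []byte) (interface{}, bool) {
	cur := n
	for _, b := range key {
		next, ok := cur.children[b]
		if !ok {
			return nil, false
		}
		cur = next
	}
	if cur.terminal == nil {
		return nil, false
	}
	return cur.terminal.value, true
}
```

Because the end of a key is marked by the terminal slot rather than by a reserved byte, "a" and "a\x00b" can coexist. The cost is an extra pointer in every inner node, which is part of the performance question mentioned above.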
For UTF-8 encoded strings, this shouldn't come up unless you allow null characters in the strings. For UTF-16 or UTF-32, though, zero bytes appear easily, since every ASCII character is encoded with at least one zero byte. For a particular encoding, the problem can be solved by using a null terminator appropriate to the encoding (e.g., "\x00\x00" for UTF-16), but there isn't a general solution.
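As an illustration of the encoding-specific terminator, here is a sketch that builds a big-endian UTF-16 key and appends a two-byte terminator. `utf16BEKey` is a hypothetical helper, and it assumes the input string contains no U+0000 character.

```go
package main

import (
	"encoding/binary"
	"unicode/utf16"
)

// utf16BEKey encodes s as big-endian UTF-16 and appends the two-byte
// terminator 0x00 0x00. Assumes s contains no U+0000; otherwise the
// terminator can again appear inside a key.
func utf16BEKey(s string) []byte {
	units := utf16.Encode([]rune(s))
	out := make([]byte, 2*len(units), 2*len(units)+2)
	for i, u := range units {
		binary.BigEndian.PutUint16(out[2*i:], u)
	}
	return append(out, 0, 0)
}
```

For "a" this yields the bytes `00 61 00 00`, which already contains a zero byte before the terminator. For "ab" it yields `00 61 00 62 00 00`, so the terminated "a" is not a prefix of the terminated "ab": the terminator is a whole zero code unit, which cannot occur at an aligned position inside a key under the stated assumption.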
I'll probably just add null-char checking on my end for now, but I wanted to leave my thoughts here in case anyone else wanted to tackle this (perhaps me in the future).
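The check I have in mind is roughly the following, run on every key before it reaches the tree. `validateKey` and `ErrZeroByte` are names I made up for my own wrapper; they are not part of the library.

```go
package main

import (
	"bytes"
	"errors"
)

// ErrZeroByte is returned for keys that would collide with the terminator.
var ErrZeroByte = errors.New("key contains a zero byte")

// validateKey rejects any key containing a zero byte, so that appending a
// zero terminator keeps the key set prefix-free.
func validateKey(key []byte) error {
	if bytes.IndexByte(key, 0) >= 0 {
		return ErrZeroByte
	}
	return nil
}
```

Valid UTF-8 text only contains a zero byte when it encodes U+0000, so this rejects nothing else for UTF-8 keys. It does reject most UTF-16 and UTF-32 keys.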
(Apparently it's also an undocumented limitation of the C version that you can't have full-prefix keys, and null-terminators are similarly advised there.)
As discovered by @nick-codes, inserting a Unicode key into a new ArtTree panics with an index out of range error.
Here's some code to reproduce the error:
Acceptance criteria: