Does Meili officially support control characters inside strings? #744
LukasKalbertodt started this conversation in Feedback & Feature Proposal
Replies: 1 comment 5 replies
-
Hello @LukasKalbertodt, yes, the control characters are ignored during the tokenization process, meaning that the search never sees any of these characters.
-
I tested this (with Meili v1.4.2):
And it works. I can also retrieve the document again, with the null byte still inside the `desc` field. I can even search for `foobar` and get a match (`start: 0, length: 7`) inside `desc`.

My question is: does this work by accident, or can I rely on Meili working with control characters inside strings?
And a follow-up question: will this slow down Meili? Deserializing JSON with escape codes means that a new string has to be allocated, since one can't simply reference a part of the input file.
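To make the allocation concern concrete, here is a minimal sketch of the general zero-copy idea, not Meili's actual parser. The function `decode_json_str` is a hypothetical helper that handles only a subset of JSON escapes: it borrows from the input when no escape is present and allocates a new `String` only when one is.

```rust
use std::borrow::Cow;

/// Decodes the body of a JSON string literal (the text between the quotes).
/// Hypothetical helper for illustration only; not Meili's real code.
/// Returns `None` on an escape this sketch does not understand.
fn decode_json_str(raw: &str) -> Option<Cow<'_, str>> {
    // Fast path: no backslash means no escapes, so the decoded text is
    // byte-for-byte the input and can simply be borrowed.
    if !raw.contains('\\') {
        return Some(Cow::Borrowed(raw));
    }

    // Slow path: at least one escape, so the decoded text differs from the
    // input bytes and has to be built in a fresh allocation.
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next()? {
            '"' => out.push('"'),
            '\\' => out.push('\\'),
            '/' => out.push('/'),
            'n' => out.push('\n'),
            't' => out.push('\t'),
            'r' => out.push('\r'),
            'b' => out.push('\u{0008}'),
            'f' => out.push('\u{000C}'),
            'u' => {
                // Exactly four hex digits; surrogate pairs are not handled here.
                let hex: String = chars.by_ref().take(4).collect();
                if hex.len() != 4 || !hex.chars().all(|h| h.is_ascii_hexdigit()) {
                    return None;
                }
                let code = u32::from_str_radix(&hex, 16).ok()?;
                out.push(char::from_u32(code)?);
            }
            _ => return None,
        }
    }
    Some(Cow::Owned(out))
}
```

Calling `decode_json_str("foobar")` yields a `Cow::Borrowed`, while `decode_json_str("foo\\u0000bar")` yields a `Cow::Owned` containing the null byte. The extra cost is one allocation and copy per escaped string, so whether it matters in practice depends on how many strings actually contain escapes.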
EDIT: it seems, though, that Meili treats `\0` as a normal word character. For example, with the query "bar", the document is not found (Meili only performs prefix search, and "bar" is not a prefix of that word). All control characters should probably be treated the same as `". "`, i.e. as a strong separator token. Would such a change be welcome in Meili?
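
The difference between the two behaviours can be illustrated with a toy model. This is a sketch under stated assumptions, not Meili's tokenizer: `tokenize` and `prefix_match` are hypothetical functions, and the `control_is_separator` flag switches between keeping control characters inside words (the behaviour observed above) and treating them as separators (the proposed behaviour).

```rust
/// Splits `text` into words on whitespace and ASCII punctuation.
/// Hypothetical toy tokenizer, not Meili's real one.
/// When `control_is_separator` is true, control characters also split words;
/// when false, they remain part of the surrounding word.
fn tokenize(text: &str, control_is_separator: bool) -> Vec<&str> {
    text.split(|c: char| {
        c.is_whitespace()
            || c.is_ascii_punctuation()
            || (control_is_separator && c.is_control())
    })
    .filter(|word| !word.is_empty())
    .collect()
}

/// Simplified prefix search: true if any word of `text` starts with `query`.
fn prefix_match(text: &str, query: &str, control_is_separator: bool) -> bool {
    tokenize(text, control_is_separator)
        .iter()
        .any(|word| word.starts_with(query))
}
```

With `control_is_separator` set to `false`, the string `"foo\0bar"` is a single word, so a prefix search for `bar` finds nothing. With it set to `true`, the string splits into `foo` and `bar`, and the same query matches.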