Support fulltext query with wildcard #794
Comments
Mostly no. If it is getEntryByPath or findByPath, the answer is clearly no: we work at the byte level and search for an exact match. If it is fulltext search or suggestion search, the answer is also no. The query is passed to Xapian, which will parse it, but we don't set
libzim works at the byte level, and the specification says to store UTF-8. |
goldendict-ng also uses Xapian as its fulltext engine. To search CJK characters, FLAG_CJK_NGRAM has to be passed to the TermGenerator. Without this flag, the query results are not correct, as far as I can remember. If libzim supported CJK search, I could skip creating a separate fulltext index for ZIM dictionaries in goldendict-ng and query the ZIM file's built-in fulltext index through libzim instead, which should save a lot of disk space. |
Hum.. Maybe we have to use the |
I probably don't fully understand, but we decided a long time ago not to save positional information, in order to reduce index storage space. I'm not sure this CJK flag can work without that kind of information. I'd be glad to learn that I'm wrong, if that is the case. |
CJK characters are not space-delimited as English words are, so the default tokenization is wrong for them, which leads to wrong query results. The drawback of having no positional information is that the result set is slightly larger than the true set of matches. That can be discussed separately. |
@xiaoyifang I would like to move forward with this ticket. One of the problems is that the feature request is not very clear from the user's perspective. Can we please clarify this:
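To make the two points above concrete, here is a minimal, self-contained sketch in plain Python. It does not use Xapian or libzim; the function names and sample documents are hypothetical, and the bigram scheme is a simplification of what an n-gram tokenizer such as the one enabled by FLAG_CJK_NGRAM might do. It shows that whitespace tokenization finds nothing for a Chinese query, and that bigram terms combined with AND (no positional information) can return a document that does not contain the query as a contiguous phrase.

```python
# Hypothetical sample documents; none contains spaces, as is usual for Chinese.
DOCS = {
    1: "我喜欢中文词典",   # contains the phrase 中文词典
    2: "中文和英文词典",   # contains every bigram of the query, but not the phrase
    3: "英文词典",         # lacks the bigram 中文
}


def whitespace_terms(text):
    """Tokenize on whitespace, as a space-delimited language would expect."""
    return text.split()


def cjk_bigrams(text):
    """Split text into overlapping two-character terms (simplified n-gram scheme)."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return chars
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def build_index(docs, tokenize):
    """Build a term -> set(doc_id) index without any positional information."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search_all_terms(index, query, tokenize):
    """Return the documents containing every query term (AND, no phrase check)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = None
    for term in terms:
        postings = index.get(term, set())
        result = set(postings) if result is None else result & postings
    return result


if __name__ == "__main__":
    query = "中文词典"
    ws_index = build_index(DOCS, whitespace_terms)
    bg_index = build_index(DOCS, cjk_bigrams)
    print(search_all_terms(ws_index, query, whitespace_terms))  # empty set
    print(search_all_terms(bg_index, query, cjk_bigrams))       # documents 1 and 2
```

Document 2 is the "a little more than actual" case: it holds the bigrams 中文, 文词 and 词典, yet never the four characters in sequence. Filtering it out would require stored term positions.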
If we are talking about two different feature requests, we should probably have two tickets. |
I was talking about fulltext search. I do not know much about the Xapian-based suggestions.
I have migrated the GoldenDict fulltext engine to Xapian. During the migration, I found that if the CJK flag is not enabled,
searching in a Chinese dictionary returns fewer results. |
The suggestion search is a way to provide article-title-based suggestions (a completion approach).
Please open a dedicated ticket for this. To me this sounds like a serious bug, and I'm in favour of fixing it ASAP if Chinese search fails to work properly. From now on, this ticket should focus only on wildcard fulltext search. |
This ticket requests that we allow wildcard fulltext searches. Here are my remarks:
@mgautierfr Do you think we could implement this feature without user impact? |
CJK support has been implemented, which was definitely the most important part. For the rest, I think we will pass on wildcards. |
Some questions:
https://xapian.org/docs/apidoc/html/classXapian_1_1QueryParser.html#:~:text=FLAG_WILDCARD-,Support%20wildcards.,-At%20present%20only
3. Does libzim support CJK tokenization and querying? See "Support CJK index creation and query" #802.
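
The Xapian documentation linked above describes FLAG_WILDCARD as currently supporting only right truncation, that is, a `*` at the end of a term. As a rough illustration of what such a wildcard means for a term index, here is a minimal sketch in plain Python. It is not Xapian's or libzim's implementation; `expand_trailing_wildcard`, its `limit` parameter, and the sample terms are hypothetical names introduced for this example.

```python
from bisect import bisect_left


def expand_trailing_wildcard(sorted_terms, pattern, limit=None):
    """Expand a pattern such as 'wik*' into the indexed terms sharing that prefix.

    sorted_terms must be sorted in ascending order. Only a single trailing '*'
    is accepted, mirroring right truncation. limit is a hypothetical cap on the
    number of expanded terms; None means no cap.
    """
    if not pattern.endswith("*") or "*" in pattern[:-1]:
        raise ValueError("only a single trailing '*' is supported")
    prefix = pattern[:-1]

    # All terms with the given prefix are contiguous in a sorted list,
    # so a binary search finds the first candidate.
    i = bisect_left(sorted_terms, prefix)
    matches = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        if limit is not None and len(matches) >= limit:
            raise ValueError("wildcard expands to too many terms")
        matches.append(sorted_terms[i])
        i += 1
    return matches


if __name__ == "__main__":
    terms = sorted(["wiki", "wikipedia", "wiktionary", "xapian", "zim"])
    print(expand_trailing_wildcard(terms, "wik*"))
```

The expanded terms would then be combined with OR in the actual query. The cap is included because a short prefix can expand to a very large number of terms, which is one plausible cost of enabling wildcards for all users.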