Add a config property/env variable specific index field #117
I need to have a closer look, but first: why don't we currently have a match exactly? I'd expect
hey, yes sure.
if you go to a quarkus page https://quarkus.io/guides/all-config and then try running a selection query (that is currently used) from a dev tools console
section from the footer ... so it just didn't index anything from the page, hence no results.
Since we won't be searching for anything shorter than 2 symbols, there's no need to generate 1-gram tokens.
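To illustrate the note above, here is a minimal sketch of what dropping 1-grams means. It assumes a plain sliding-window n-gram tokenizer; the function name `ngrams` and the maximum gram size of 3 are hypothetical and not taken from this PR.

```python
def ngrams(text: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every substring of text whose length is in [min_gram, max_gram]."""
    tokens = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                tokens.append(text[start:start + size])
    return tokens


# With min_gram=1, every single character becomes its own token.
with_unigrams = ngrams("http", 1, 3)

# With min_gram=2, the single-character tokens are never generated.
without_unigrams = ngrams("http", 2, 3)

print(len(with_unigrams), len(without_unigrams))
```

Raising the minimum gram size only removes tokens that a query of 2 or more symbols could never need, so matching behaviour for real searches should stay the same.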
9b08c6a to 5edfff7
I see, thanks for the explanation. I'd have expected just raising the I'll try this locally and then will merge.
yeah, give it a try and see what you think 😃. I wasn't sure about increasing the max on the current analyzer, since it would probably break up more words than just properties... but then I didn't really check how many really loooooong words we have in the guides 😃. And then there's also the fact that a config property would be tokenized and not treated as a single keyword... I was thinking: if I'm searching for a config and start with something like *EDIT: oh and because of that regex tokenizer it should only include the env variables and config properties, no other text
The last change to that size was supposed to set it to 512MiB but mistakenly set it to 512GiB... which is way too much. Currently we only use ~500MB per index, and this will rise to ~1GB per index after quarkusio#117, so 5GiB should be more than enough and leave room for changes to analysis config and indexing of more content in the foreseeable future.
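For reference, the sizes mentioned above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are hypothetical, and "~1GB per index" is treated as exactly 10^9 bytes for the sake of the calculation.

```python
MIB = 2 ** 20  # bytes in one mebibyte
GIB = 2 ** 30  # bytes in one gibibyte

intended_size = 512 * MIB  # what the earlier change meant to set
mistaken_size = 512 * GIB  # what it actually set
new_size = 5 * GIB         # the value proposed here

# The mistaken value is larger than the intended one by this factor.
overshoot_factor = mistaken_size // intended_size

# Rough headroom, assuming an index of about 1 GB (10**9 bytes).
approx_index_bytes = 10 ** 9
headroom = new_size / approx_index_bytes

print(overshoot_factor, round(headroom, 2))
```

The unit slip is a factor of 1024, and 5GiB leaves roughly five times the expected post-#117 index size.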
Ok, so this almost doubles the size of indexes (from 500MB to 1GB), which is kind of odd considering the new field supposedly only indexes config properties...
But I guess that's okay, since we have more than enough room to accommodate these relatively still small indexes (see #119 :x )
Will merge, thanks!
I was looking into Max's comment quarkusio/quarkusio.github.io#1825 (comment) and I've tried a few things, such as a shingle filter and some other stuff.
The idea I've ended up with is that we add a field specific to config properties/env variables. Using a regex tokenizer, we'll put only tokens that are config properties or env variables into it. Then, while searching, we'd use a keyword tokenizer on this particular field, which means that if we find a match there we can give it a boost > 1, since the search term looks suspiciously similar to a config property.
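The idea can be sketched roughly as follows. This is an illustration in plain Python rather than the actual analyzer configuration: the regular expressions, the function names `extract_config_tokens` and `boosted_score`, and the boost value of 2.0 are all assumptions made for the example, not what the PR uses.

```python
import re

# Hypothetical patterns: a dotted lowercase name such as quarkus.http.port,
# or an upper-case name with underscores such as QUARKUS_HTTP_PORT.
CONFIG_TOKEN = re.compile(
    r"\b[a-z][a-z0-9-]*(?:\.[a-z0-9-]+)+\b"
    r"|\b[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+\b"
)


def extract_config_tokens(text: str) -> list[str]:
    """Index side: keep only tokens that look like config properties or env variables."""
    return CONFIG_TOKEN.findall(text)


def boosted_score(base_score: float, query: str, tokens: list[str], boost: float = 2.0) -> float:
    """Search side: treat the whole query as one keyword and boost exact matches."""
    keyword = query.strip()
    return base_score * boost if keyword in tokens else base_score


guide_text = "Set quarkus.http.port or QUARKUS_HTTP_PORT to change the port."
tokens = extract_config_tokens(guide_text)

print(tokens)
print(boosted_score(1.0, "quarkus.http.port", tokens))
print(boosted_score(1.0, "http port", tokens))
```

Because ordinary words never make it into this field, a query such as "http port" gets no boost, while a query that is exactly a property or env variable name does.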