-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor parsing logic of the Query Builder #1793
Refactor parsing logic of the Query Builder #1793
Conversation
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Put important static methods at the top and helper methods at bottom to improve readability. Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall! None of them are blockers, some minor suggestions
nit:
This is a critical path, lets make sure there are heavy unit tests around this. I know there are some, but I am not really sure if it covers all use cases.
Might be worth having KNNQueryBuilderParserTest
which manages all parsing tests. I attempted it with InvalidParamsTest
* @param vector Array of floating points | ||
* @deprecated Use {@code {@link KNNQueryBuilder.Builder}} instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add it back? the intent was to always use the builder so it will validate whenever the object is created
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops - copy and paste error. will add back
INTERNAL_PARSER.declareObject(KNNQueryBuilder.Builder::filter, (p, v) -> parseInnerQueryBuilder(p), FILTER_FIELD); | ||
} | ||
|
||
public static KNNQueryBuilder fromXContent(XContentParser parser) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much simpler. Thanks for making this more maintainable
b.ignoreUnmapped(v); | ||
} | ||
}, IGNORE_UNMAPPED_FIELD); | ||
INTERNAL_PARSER.declareFloat(KNNQueryBuilder.Builder::maxDistance, MAX_DISTANCE_FIELD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding it to the builder while parsing it was exactly what I had in mind. Seems like ObjectParser is giving us a lot of boiler plate. Nice find!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Im surprised more queries do not do this. The fromXContent was becoming a mess. This greatly simplfiies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might need to check how this will impact to overall latency. Latency could be the reason of not using this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats a good point @heemin32. I saw the ConstructingObjectParser may add some latency, but didnt see anything about ObjectParser.
fieldName = in.readString(); | ||
vector = in.readFloatArray(); | ||
k = in.readInt(); | ||
filter = in.readOptionalNamedWriteable(QueryBuilder.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you might have a conflict with backport merge. Might be worth adding back the check and tackle it on major version upgrade
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh is this the issue you ran into? I can look into it.
@@ -114,8 +225,20 @@ public KNNQueryBuilder(String fieldName, float[] vector) { | |||
if (vector.length == 0) { | |||
throw new IllegalArgumentException(String.format("[%s] query vector is empty", NAME)); | |||
} | |||
if (k <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Had marked deprecated so we don't have to add stuff. But I understand the risk, hopefully nothing other than tests are using it
NAME, | ||
KNNQueryBuilder.Builder::new | ||
); | ||
static { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
- Instead of static block you can make a static method out it which returns ObjectParser
- Consider pulling this and the utility methods related to parsing into its own class. I attempted that with MethodParameterParser to single out the responsibility and better testing. The interface is made in a way that forces a lot of responsibilities on this class but we can always pull it out. Ideally I would want to see just the
doToQuery
logic in here, rest all can be delegated to other classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to pull out parsing into separate class. Will make that change.
closing in favor of #1824 |
Description
I was working on some code for #1779 where I had to add a parameter to KNNQueryBuilder and noticed it has gotten pretty complex. I know @shatejas recently did some refactoring and wanted to build upon that to see if I could simplify the parsing logic.
What I found was that we can use ObjectParser on the inner structure of the query to better define the parsing structure so we do not have to deal with tokens. I have noticed a couple other queries also use this, but many still do it the old way (one example using parser: https://github.com/opensearch-project/OpenSearch/blob/main/modules/mapper-extras/src/main/java/org/opensearch/index/query/RankFeatureQueryBuilder.java)
Anyway, this is in early phases. Just wanted to get some preliminary feedback.
Issues Resolved
[List any issues this PR will resolve]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.