-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query refactoring: BoolQueryBuilder and Parser #11121
Query refactoring: BoolQueryBuilder and Parser #11121
Conversation
|
||
/** | ||
* Base class with common code for all {@link QueryBuilder} implementations. | ||
*/ | ||
public abstract class BaseQueryBuilder implements QueryBuilder { | ||
|
||
private final static Map<String, Class<?>> queryBuilderRegistry = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This map should eventually be replaced by registry for the Builders (or whatever we call them in the end) that we have today with the IndicesQueriesRegistry, only that this looks like a registry for the Parsers. I've introduced this static map here for lack of a better solution, but if we eventually merge parsers and builders and can access them this would be good to do here. Any other suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we merge parsers and builders at some point anyway why not re-use the registry for query parsers we already have in place and add a "get builder" method to the existing query parsers?
Something along the lines of:
public abstract class BaseQueryBuilder implements QueryBuilder {
IndexQueryParserService service;
@Inject
public BaseQueryBuilder(IndexQueryParserService service) {
this.service = service;
}
protected QueryBuilder readQueryBuilderFrom(StreamInput in) throws IOException {
QueryBuilder newBuilder;
String name = in.readString();
// the following two lines would become one once parsers and builders are merged
QueryParser parser = this.service.queryParser(name);
QueryBuilder builder = parser.createBuilder();
builder.readFrom(in);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this in general, will start looking into this. However the service / lookup field needs to somehow be static since it's read-only and we need it for every query deserialization, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing about this approach is the dependency that we would create between parser and builder... imagine the request comes from the java API, it's in binary format already, and we have to go to the parser to ask for an instance of the builder, but no parsing is needed. That feels slightly weird, maybe it's not a bit problem though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
about my last comment, along the same lines, that is why I don't love the parserName method in the builders, and I hope it goes away soon (I thought at first it would be temporary, but maybe it'll have to stay, we'll see). Ideally, builders don't know about parsers, but parsers do know about their respective builders. What do you folks think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense, since the Builders can exist without the other (as you said, going through the java API only). The link is the NAME field at the moment. What if in a first step we reverse that direction like you mentioned:
- move all unique NAME constants from the parsers to the builders
- getParserName() in the Builder classes goes away since its known then.
- I think the names() method in the parsers has to stay for now since its used to register them
- as long as we have parser-registry we could add a createBuilder() method to each parser that returns an empty builder instance. Or even stream into that.
I would do that in a separate PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javanna Just for clarification - my proposal only makes sense if ultimately we intend to merge parsers and builders. Otherwise to me it feels like a hack.
@cbuescher +1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaineC yea I see, to be honest I don't know yet if we'll merge those at the end, I kinda like the separation between the two, but we will see. I like Christoph's idea around moving the name to the builder, then the parser can re-use it too rather than the other way around.
I think the names() method in the parsers has to stay for now since its used to register them
sure but it can always return something taken from the corresponding builder
as long as we have parser-registry we could add a createBuilder() method to each parser that returns an empty builder instance. Or even stream into that.
ah I see, so you would avoid registering two things, cause the parser can provide both, and they both need to be registered with the same name. This sounds good to me. Would make it easier for plugins to migrate and keep the registration of custom queries to one single call. Just to clarify, this wouldn't be a create* method, cause it wouldn't create an instance all the time, but simply return always the same empty builder, right?
48ef699
to
0e56b75
Compare
if (disableCoord != null) { | ||
builder.field("disable_coord", disableCoord); | ||
} | ||
builder.field("disable_coord", disableCoord); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure about always printing this out, why?
I left a few comments. I think I would split this into two PRs, one for adding Streamable support to all query builders, and streamline serialization support, second one around actually refactoring the bool query. As I said in my comments I think we should move all query builders to |
… builder As part of the query refactoring, this PR moves the NAME field that currently identifies each query parser to the corresponding query builder. The reason for this, as discussed in elastic#11121 (comment), is that we need still need to be able to link parser and builder implementations, but that the query builder is independent of the parser (queries don't necessarily need to be coverted to XContent any more). Builders don't need to know about their parsers, but parsers need to be linked to their respective builders. Closes elastic#11178
…rsers Currently there is a registry for all QueryParsers accessible via the IndicesQueriesModule. For deserializing nested queries e.g. for the BoolQueryBuilder (elastic#11121) we need to look up query builders by their name to be able to deserialize using a prototype builder of the concrete class. This PR adds a getBuilderPrototype() method to each query parser so we can re-use the parser registry to get the corresponding builder using the query name. Closes elastic#11344
41ee416
to
f2a4e27
Compare
Did a rebase and squash after merge of NamedWritable interface on feature branch. I think this is good for another round of reviews now. |
} | ||
|
||
/** | ||
* Gets the queries that <b>should</b> appear in the matching documents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't quite read it. how about "Returns the clauses that should be matched by the returned documents" ? in general should appear is probably not the best way to say it (I'm not a native english speaker though!), I'd update it in the docs everywhere with some better wording
looks good, left some comments |
Adressed last round of comments, putting reading/writing of NamedWritable lists in the Streaming classes amongst other things. Let me know if you think this is good to go. |
int size = readInt(); | ||
for (int i = 0; i < size; i++) { | ||
C obj = readNamedWriteable(); | ||
if (obj != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it ever be null? I think there is something wrong if it is null, cause it should always throw exception upfront in case of issues?
left a couple of tiny comments, LGTM otherwise |
…text ctx) method into a parsing and a query building part, adding NamedWriteable implementation for serialization and hashCode(), equals() for testing. This change also adds tests using fixed set of leaf queries (Terms, Ids, MatchAll) as nested Queries in test query setup. Also QueryParseContext is adapted to return QueryBuilder instances for inner filter parses instead of previously returning Query instances, keeping old methods in place but deprecating them. Relates to elastic#10217 Closes elastic#11427
cfb749c
to
458d62e
Compare
…olquery Query refactoring: BoolQueryBuilder and Parser
… builder As part of the query refactoring, this PR moves the NAME field that currently identifies each query parser to the corresponding query builder. The reason for this, as discussed in elastic#11121 (comment), is that we need still need to be able to link parser and builder implementations, but that the query builder is independent of the parser (queries don't necessarily need to be coverted to XContent any more). Builders don't need to know about their parsers, but parsers need to be linked to their respective builders. Closes elastic#11178
…rsers Currently there is a registry for all QueryParsers accessible via the IndicesQueriesModule. For deserializing nested queries e.g. for the BoolQueryBuilder (elastic#11121) we need to look up query builders by their name to be able to deserialize using a prototype builder of the concrete class. This PR adds a getBuilderPrototype() method to each query parser so we can re-use the parser registry to get the corresponding builder using the query name. Closes elastic#11344
…ring-boolquery Query refactoring: BoolQueryBuilder and Parser
Refactors BoolQueryBuilder and Parser. Splits the parse() method into a parsing and a query building part, adding NamedWriteable implementation for serialization and hashCode(), equals() for testing.
This change also adds tests using fixed set of leaf queries (Terms, Ids, MatchAll) as nested Queries in test query setup. Also QueryParseContext is adapted to return QueryBuilder instances for inner filter parses instead of previously returning Query instances, keeping old methods in place but deprecating them.
Relates to #10217
PR goes agains query-refacoring feature branch.