Extend size_limit setting in query engine to support unlimited index query. #703
Comments
Probably another way of thinking about this: the
Update the proposal as discussed.
Design
OpenSearch Request
Request Operators
Non Aggregation Query
Interface to the OpenSearch engine used by the
Aggregation Query
There's no scroll request for aggregation queries in OpenSearch.
Request Builder
Physical Plan Implementation
Index Scan Execution
Remaining issues:
Here we assume
These work as expected:
But these don't:
The reason is that the limit is only pushed down to the index scan if the plan nodes are optimized and merged into a single node.
Solution
Option 1
Better logical plan optimization so that the Project logical plan node doesn't block optimization for other plan nodes.
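To make the push-down problem concrete, here is a toy sketch in Python. It is not the plugin's optimizer: the `Node` class, the `push_down_limit` function, and the `through_project` flag are all hypothetical names invented for illustration, and the plan shape (a Project node sitting between Limit and the index scan) is an assumption about the kind of case described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical logical plan node; kind is 'limit', 'project', or 'scan'."""
    kind: str
    limit: Optional[int] = None


def push_down_limit(plan: List[Node], through_project: bool = False) -> List[Node]:
    """Merge the first Limit node into the scan below it, if reachable.

    `plan` is ordered from the root (index 0) down to the scan (last).
    Without `through_project`, a Limit is merged only when it sits
    directly on top of the scan. With it, Project nodes are skipped,
    which is safe in this toy model because a projection does not
    change the number of rows.
    """
    result = [Node(n.kind, n.limit) for n in plan]  # work on a copy
    for i, node in enumerate(result):
        if node.kind != "limit":
            continue
        j = i + 1
        while through_project and j < len(result) and result[j].kind == "project":
            j += 1
        if j < len(result) and result[j].kind == "scan":
            scan = result[j]
            # Keep the tighter bound if the scan already carries a limit.
            scan.limit = node.limit if scan.limit is None else min(scan.limit, node.limit)
            del result[i]
        break  # only the first Limit is considered in this sketch
    return result


# Limit directly above the scan: merged into a single node.
merged = push_down_limit([Node("limit", 5), Node("scan")])

# A Project in between blocks the merge unless the rule can see through it.
blocked = push_down_limit([Node("limit", 5), Node("project"), Node("scan")])
unblocked = push_down_limit(
    [Node("limit", 5), Node("project"), Node("scan")], through_project=True
)
```

In this model, `blocked` keeps the Limit as a separate node and the scan carries no limit, while `unblocked` ends up with the limit attached to the scan. That mirrors the idea behind Option 1, under the stated assumptions.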
One note on the performance.
@seankao-az @dai-chen In my opinion, there are two issues:
My proposal is |
Makes sense to me. So query.size_limit can be any positive number, regardless of max_result_window. Regarding
I think we should let the plugins.query.size_limit setting decide only the final result size, not the size of any intermediate step.
Problem statements
Currently, the query.size_limit setting configures the maximum number of documents to be pulled from OpenSearch. The default value is 200. For example, let's say size_limit = 200 and the index has 10K docs.
source=index
source=index | head 1
source=index | head 11000
Proposal
The query.size_limit setting configures the maximum number of rows returned by a query. The default value is 200. size_limit must be larger than 0. If the query has head (PPL) or limit (SQL), it overrides the query.size_limit setting.
Expectation of search query.
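The rule in this proposal can be sketched as a small Python function. This is only an illustration of the wording above, not code from the plugin: `effective_row_limit` and `rows_returned` are hypothetical names, and the row counts are what the rule implies for an index of 10K docs, not values taken from this thread.

```python
from typing import Optional

DEFAULT_SIZE_LIMIT = 200  # default value stated in the proposal


def effective_row_limit(size_limit: int = DEFAULT_SIZE_LIMIT,
                        explicit_limit: Optional[int] = None) -> int:
    """Hypothetical helper: the maximum number of rows a query may return.

    `explicit_limit` stands for head (PPL) or limit (SQL); when present
    it overrides the size_limit setting, as the proposal describes.
    """
    if size_limit <= 0:
        raise ValueError("size_limit must be larger than 0")
    if explicit_limit is not None:
        return explicit_limit
    return size_limit


def rows_returned(total_rows: int,
                  size_limit: int = DEFAULT_SIZE_LIMIT,
                  explicit_limit: Optional[int] = None) -> int:
    """Rows actually returned: never more than the rows that exist."""
    return min(total_rows, effective_row_limit(size_limit, explicit_limit))


index_docs = 10_000  # "index has 10K docs" from the problem statement

no_head = rows_returned(index_docs)                             # source=index
head_1 = rows_returned(index_docs, explicit_limit=1)            # ... | head 1
head_11000 = rows_returned(index_docs, explicit_limit=11_000)   # ... | head 11000
```

Under these assumptions the three queries yield 200, 1, and 10000 rows respectively; the last one is bounded by the number of documents in the index, not by size_limit.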
source=index
source=index | head 1
source=index | head 11000
Expectation of aggregation query.
source=index | stats request, count(*) by request
source=index | stats request, count(*) by request | head 11000