Support for pagination in v2 engine of SELECT * FROM <table> queries #1666

Merged
merged 17 commits into from
May 30, 2023

@MaxKsyunz MaxKsyunz commented May 29, 2023

Description

The v2 SQL engine can now paginate simple queries. Pagination is initiated by setting the fetch_size property in the request JSON.

For example, to initiate a pagination request, send:

curl -s -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query": "SELECT * from calcs", "fetch_size": 5}'

Send the following to get a subsequent page:

curl -s -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"cursor": "...." }'

Each response to a pagination request will include a cursor property if there is more data available. The last page in the paginated request will not have a cursor property.
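
A client can drive this exchange with a small loop: send the query with fetch_size, then keep posting the returned cursor until a response arrives without one. The Java sketch below is illustrative only. `CursorPager`, `fetchAllPages`, and the `post` transport function are hypothetical names, and the regex-based cursor extraction is a shortcut that a real client would replace with a proper JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: follows cursors until a page arrives without one. */
class CursorPager {
  // Naive match of a "cursor" string property; assumes no data row contains such a key.
  private static final Pattern CURSOR =
      Pattern.compile("\"cursor\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

  /** Returns the cursor value from a response body, or null when the property is absent. */
  static String extractCursor(String responseBody) {
    Matcher m = CURSOR.matcher(responseBody);
    return m.find() ? m.group(1) : null;
  }

  /**
   * Collects the raw response body of every page.
   *
   * @param post assumed transport: takes a request body, returns the response body
   * @param query SQL text; assumed to contain no characters that need JSON escaping
   */
  static List<String> fetchAllPages(UnaryOperator<String> post, String query, int fetchSize) {
    List<String> pages = new ArrayList<>();
    // First request: the query plus fetch_size starts pagination.
    String body = String.format("{\"query\": \"%s\", \"fetch_size\": %d}", query, fetchSize);
    while (body != null) {
      String response = post.apply(body);
      pages.add(response);
      // Later requests carry only the cursor; a missing cursor marks the last page.
      String cursor = extractCursor(response);
      body = cursor == null ? null : String.format("{\"cursor\": \"%s\"}", cursor);
    }
    return pages;
  }
}
```

In practice `post` would wrap an HTTP POST to the `_plugins/_sql` endpoint used in the curl commands above.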

Pagination is implemented using the OpenSearch Scroll API. Please see pagination-v2.md for implementation details.

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Yury-Fridlyand and others added 8 commits April 26, 2023 17:23
* Support pagination in V2 engine, phase 1 (#226)

* Fixing integration tests broken during POC

Signed-off-by: MaxKsyunz <[email protected]>

* Comment to clarify an exception.

Signed-off-by: MaxKsyunz <[email protected]>

* Add support for paginated scroll request, first page.

Implement PaginatedPlanCache.convertToPlan for second page to work.

Signed-off-by: MaxKsyunz <[email protected]>

* Progress on paginated scroll request, subsequent page.

Signed-off-by: MaxKsyunz <[email protected]>

* Move `ExpressionSerializer` from `opensearch` to `core`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Rename `Cursor` `asString` to `toString`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Disable scroll cleaning.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add full cursor serialization and deserialization.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Misc fixes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Further work on pagination.

* Added push down page size from `LogicalPaginate` to `LogicalRelation`.
* Improved cursor encoding and decoding.
* Added cursor compression.
* Fixed issuing `SearchScrollRequest`.
* Fixed returning last empty page.
* Minor code grooming/commenting.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Pagination fix for empty indices.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix error reporting on wrong cursor.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor comments and error reporting improvement.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add an end-to-end integration test.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add `explain` request handlers.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add IT for explain.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address issues flagged by checkstyle build step (#229)

Signed-off-by: MaxKsyunz <[email protected]>

* Pagination, phase 1: Add unit tests for `:core` module with coverage. (#230)

* Add unit tests for `:core` module with coverage. Uncovered: `toCursor`, because it will be changed soon.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Pagination, phase 1: Add unit tests for SQL module with coverage. (#239)

* Add unit tests for SQL module with coverage.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Update sql/src/main/java/org/opensearch/sql/sql/domain/SQLQueryRequest.java

Signed-off-by: Yury-Fridlyand <[email protected]>

Co-authored-by: GabeFernandez310 <[email protected]>

---------

Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>

* Pagination, phase 1: Add unit tests for `:opensearch` module with coverage. (#233)

* Add UT for `:opensearch` module with full coverage, except `toCursor`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix checkstyle.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix the merges.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix explain.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix scroll cleaning.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Store `TotalHits` and use it to report `total` in response.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add missing UT for `:protocol` module.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix PPL UTs damaged in f4ea4ad.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor checkstyle fixes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fallback to v1 engine for pagination (#245)

* Pagination fallback integration tests.

Signed-off-by: MaxKsyunz <[email protected]>

* Add UT with coverage for `toCursor` serialization.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix broken tests in `legacy`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix getting `total` from non-paged requests and from queries without `FROM` clause.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix scroll cleaning.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix cursor request processing.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Update ITs.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix (again) TotalHits feature.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix typo in prometheus config.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Recover commented logging.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Move `test_pagination_blackbox` to a separate class and add logging.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address some PR feedback: rename some classes and revert unnecessary whitespace changes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor commenting.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address PR comments.

* Add javadocs
* Renames
* Cleaning up some comments
* Remove unused code
* Speed up IT

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor missing changes.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Integration tests for fetch_size, max_result_window, and query.size_limit (#248)

Signed-off-by: MaxKsyunz <[email protected]>

* Remove `PaginatedQueryService`, extend `QueryService` to hold two planners and use them.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Move push down functions from request builders to a new interface.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Some file moves.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor clean-up according to PR review.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: MaxKsyunz <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Max Ksyunz <[email protected]>

* Make scroll timeout configurable.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Fix IT to set cursor keep alive parameter.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove `QueryId.None`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Rename according to PR feedback.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove default implementations of `PushDownRequestBuilder`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Merge paginated plan optimizer into the regular optimizer. (#1516)

Merge paginated plan optimizer into the regular optimizer.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>

* Complete rework on serialization and deserialization. (#1498)

Signed-off-by: Yury-Fridlyand <[email protected]>

* Resolve merge conflicts and fix tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor cleanup.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor cleanup - missing changes for the previous commit.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove paginate operator  (#1528)

* Remove PaginateOperator class since it is no longer used.


---------

Signed-off-by: MaxKsyunz <[email protected]>

* Remove `PaginatedPlan` - move logic to `QueryPlan`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Remove default implementations from `SerializablePlan`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Add a doc.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Update design graphs.

Signed-off-by: Yury-Fridlyand <[email protected]>

* More fixes for merge from upstream/main.

Signed-off-by: MaxKsyunz <[email protected]>

---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: MaxKsyunz <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Max Ksyunz <[email protected]>
* Add newer docs for pagination.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address PR feedback.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Complete TODO and add some more info.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Address doc review comments.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Clean up docs.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Apply suggestions from code review

Co-authored-by: Andrew Carbonetto <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>

* Apply suggestions from code review

Co-authored-by: Andrew Carbonetto <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>

* Minor fixes.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
* Update design document to reflect refactor.

Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>
codecov bot commented May 29, 2023

Codecov Report

Merging #1666 (6811a8c) into main (6d796ee) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #1666      +/-   ##
============================================
+ Coverage     97.18%   97.24%   +0.06%     
- Complexity     4150     4259     +109     
============================================
  Files           372      386      +14     
  Lines         10429    10668     +239     
  Branches        716      738      +22     
============================================
+ Hits          10135    10374     +239     
  Misses          287      287              
  Partials          7        7              
Flag Coverage Δ
sql-engine 97.24% <100.00%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ch/sql/planner/optimizer/LogicalPlanOptimizer.java 100.00% <ø> (ø)
...pensearch/sql/planner/physical/FilterOperator.java 100.00% <ø> (ø)
...pensearch/sql/planner/physical/NestedOperator.java 100.00% <ø> (ø)
...java/org/opensearch/sql/storage/StorageEngine.java 100.00% <ø> (ø)
...rc/main/java/org/opensearch/sql/storage/Table.java 100.00% <ø> (ø)
...ch/sql/opensearch/client/OpenSearchRestClient.java 100.00% <ø> (ø)
...ch/sql/opensearch/response/OpenSearchResponse.java 100.00% <ø> (ø)
...rch/sql/opensearch/setting/OpenSearchSettings.java 100.00% <ø> (ø)
...ql/opensearch/storage/OpenSearchStorageEngine.java 100.00% <ø> (ø)
...ge/script/aggregation/AggregationQueryBuilder.java 100.00% <ø> (ø)
... and 47 more

@Yury-Fridlyand Yury-Fridlyand added the pagination Pagination feature, ref #656 label May 30, 2023
Signed-off-by: MaxKsyunz <[email protected]>
acarbonetto
acarbonetto previously approved these changes May 30, 2023
@Yury-Fridlyand Yury-Fridlyand mentioned this pull request May 30, 2023
6 tasks
* Convert a scroll request to string that can be included in a cursor.
* @return a string representing the scroll request.
*/
@Override
Collaborator

  1. Please correct the doc; it is not aligned with the code.
  2. Add the doc in OpenSearchRequest.

@@ -24,36 +24,39 @@
* by delegated builder internally. This is to avoid conditional check of different push down logic
* for non-aggregate and aggregate query everywhere.
*/
- public class OpenSearchIndexScanBuilder extends TableScanBuilder {
+ public abstract class OpenSearchIndexScanBuilder extends TableScanBuilder {
Collaborator

Following up on the same question from #1600:
why is it abstract with no subclass?

Collaborator Author

The subclass is created in OpenSearchIndex.createScanBuilder. The subclass captures data (primarily OpenSearch client reference) necessary in OpenSearchIndexScanBuilder.build to create an instance of OpenSearchIndexScan.

At this point, OpenSearchIndexScanBuilder can become an inner class of OpenSearchIndex but doing so includes refactoring of OpenSearchIndexScanOptimizationTest suite as well.

I plan to include this change as part of follow-up pagination work.

This was my response in #1600 with a slightly different description.
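
The pattern described in this reply, an abstract builder whose only subclass is an anonymous class created by the code that owns the client, can be sketched as follows. All names here (`ScanBuilderSketch`, `Index`, `ScanBuilder`, `Request`, `Scan`) are simplified, hypothetical stand-ins rather than the plugin's real classes, and a `String` stands in for the client reference.

```java
/** Hypothetical, simplified stand-ins; these are not the plugin's real classes. */
class ScanBuilderSketch {
  /** Stand-in for a built request. */
  static final class Request {
    final String index;
    Request(String index) { this.index = index; }
  }

  /** Stand-in for the scan operator; it needs both a client and a request. */
  static final class Scan {
    final String client;
    final Request request;
    Scan(String client, Request request) {
      this.client = client;
      this.request = request;
    }
  }

  /** Abstract: the builder knows the request but has no client of its own. */
  abstract static class ScanBuilder {
    private final Request request;
    ScanBuilder(Request request) { this.request = request; }

    protected abstract Scan createScan(Request request);

    Scan build() { return createScan(request); }
  }

  /** Plays the role of the index (table) class, which owns the client. */
  static final class Index {
    private final String client; // a String stands in for the client reference
    private final String name;
    Index(String client, String name) {
      this.client = client;
      this.name = name;
    }

    ScanBuilder createScanBuilder() {
      // The anonymous subclass captures `client` from the enclosing Index instance.
      return new ScanBuilder(new Request(name)) {
        @Override
        protected Scan createScan(Request request) {
          return new Scan(client, request);
        }
      };
    }
  }
}
```

Because the subclass is anonymous, no named subclass appears anywhere in the code base, which is likely what prompted the reviewer's question.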

}

@Override
public TableScanOperator build() {
- return delegate.build();
+ return createScan(delegate.build());
Collaborator

What is the difference between build and createScan?

Collaborator Author

@MaxKsyunz MaxKsyunz May 30, 2023

OpenSearchIndexScanBuilder.createScan -- create an OpenSearchIndexScan based on the provided OpenSearchRequest.

OpenSearchIndexScanBuilder.build -- build an OpenSearchRequest and an OpenSearchIndexScan based on it.

They will be merged as part of making OpenSearchIndexScanBuilder an inner class of OpenSearchIndexScan.

Related to the other discussion about builder.
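
The split between the two methods can be illustrated with a small sketch that mirrors the changed line `return createScan(delegate.build());`. The classes below (`BuildVsCreateScan`, `RequestBuilder`, `ScanBuilder`, `Request`, `Scan`) and the default page size are hypothetical simplifications, not the plugin's actual types or values.

```java
/** Hypothetical, simplified stand-ins; these are not the plugin's real classes. */
class BuildVsCreateScan {
  static final class Request {
    final String index;
    final int pageSize;
    Request(String index, int pageSize) {
      this.index = index;
      this.pageSize = pageSize;
    }
  }

  static final class Scan {
    final Request request;
    Scan(Request request) { this.request = request; }
  }

  /** Stand-in for the delegated builder: collects push-downs, then builds the request. */
  static final class RequestBuilder {
    private final String index;
    private int pageSize = 200; // arbitrary default chosen for this sketch
    RequestBuilder(String index) { this.index = index; }

    RequestBuilder pushDownPageSize(int size) {
      this.pageSize = size;
      return this;
    }

    Request build() { return new Request(index, pageSize); }
  }

  abstract static class ScanBuilder {
    private final RequestBuilder delegate;
    ScanBuilder(RequestBuilder delegate) { this.delegate = delegate; }

    /** Second step only: wrap an already-built request in a scan. */
    protected abstract Scan createScan(Request request);

    /** Both steps: build the request via the delegate, then create the scan from it. */
    Scan build() { return createScan(delegate.build()); }
  }
}
```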

acarbonetto
acarbonetto previously approved these changes May 30, 2023
Signed-off-by: MaxKsyunz <[email protected]>
Collaborator

@Yury-Fridlyand Yury-Fridlyand left a comment

Observation: there are about 15 files with no-op changes, either whitespace or comment updates. The pagination feature originally had a bigger scope; after a few rounds of rework and improvement, some code changes were reverted, but the accompanying comment changes were not.
I also found a few typos in the javadocs, nothing critical. They will be fixed in follow-up work on the pagination feature.

No issues or objections found during code review and manual testing!

@MaxKsyunz MaxKsyunz merged commit 57ce303 into main May 30, 2023
@Yury-Fridlyand Yury-Fridlyand deleted the feature/pagination/integ branch May 30, 2023 21:44
opensearch-trigger-bot bot pushed a commit that referenced this pull request May 30, 2023
#1666)

v2 SQL engine can now paginate simple queries. Pagination is initiated by setting fetch_size property in the request JSON.

Pagination is implemented using the OpenSearch Scroll API. Please see pagination-v2.md for implementation details.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
(cherry picked from commit 57ce303)
Yury-Fridlyand pushed a commit that referenced this pull request May 30, 2023
…<table>` queries (#1685)

* Support for pagination in v2 engine of `SELECT * FROM <table>` queries (#1666)

v2 SQL engine can now paginate simple queries. Pagination is initiated by setting fetch_size property in the request JSON.

Pagination is implemented using the OpenSearch Scroll API. Please see pagination-v2.md for implementation details.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
(cherry picked from commit 57ce303)
acarbonetto pushed a commit that referenced this pull request May 30, 2023
…<table>` queries (#1684)

* Support for pagination in v2 engine of `SELECT * FROM <table>` queries (#1666)

v2 SQL engine can now paginate simple queries. Pagination is initiated by setting fetch_size property in the request JSON.

Pagination is implemented using the OpenSearch Scroll API. Please see pagination-v2.md for implementation details.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
(cherry picked from commit 57ce303)

* Fix test build failure.

Somehow the import is required in 2.x but not 3.0

Signed-off-by: MaxKsyunz <[email protected]>

---------

Signed-off-by: MaxKsyunz <[email protected]>
Co-authored-by: Max Ksyunz <[email protected]>
MitchellGale pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Jun 12, 2023
opensearch-project#1666)

v2 SQL engine can now paginate simple queries. Pagination is initiated by setting fetch_size property in the request JSON.

Pagination is implemented using the OpenSearch Scroll API. Please see pagination-v2.md for implementation details.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
MitchellGale pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Jun 12, 2023
opensearch-project#1666)

v2 SQL engine can now paginate simple queries. Pagination is initiated by setting fetch_size property in the request JSON.

Pagination is implemented using the OpenSearch Scroll API. Please see pagination-v2.md for implementation details.
---------

Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Max Ksyunz <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>
Co-authored-by: GabeFernandez310 <[email protected]>
Co-authored-by: Andrew Carbonetto <[email protected]>
Signed-off-by: Mitchell Gale <[email protected]>
6 participants