SQL: GROUP BY with timezone and small page size returns duplicate values #51258
Comments
Pinging @elastic/es-search (:Search/SQL)
@bpintea This "issue" has to do with not passing the I will open a PR to clarify this in our docs.
The query is of course executed once with the correct timezone passed with the initial HTTP request, the problem is in the
Previously, when the specified (or default) `fetchSize` led to subsequent HTTP requests and the usage of cursors, those subsequent cursor requests didn't use the `timezone` defined in the connection properties. Even though the query is executed once (with the correct timezone), the processing of the values returned by the `HitExtractors` in the next pages was done using the default timezone `Z`. This could lead to incorrect results. Fix the issue by passing the initially configured timezone to each subsequent cursor HTTP request. Add a note in the docs to clarify that when the REST endpoint is used, the user needs to keep passing the `time_zone` parameter to each subsequent request, as there is no notion of a user session in the REST HTTP endpoint. Relates to: elastic#51258
Ah, I see. I haven't checked what the cursor actually stores, but:
The docs currently state:
I opened #52080 to fix the docs and also fix the issue for the JDBC driver.
@bpintea I was confused by the implementation, and @costin helped out (as always :) ): #52080 (comment). The expected behaviour is not to pass the timezone in the subsequent paging requests, so there is a bug that affects both the HTTP REST endpoint and the JDBC driver.
Previously, when the specified (or default) `fetchSize` led to subsequent HTTP requests and the usage of cursors, those subsequent requests no longer used the client timezone specified in the initial SQL query. As a consequence, even though the query is executed once (with the correct timezone), the processing of the query results by the `HitExtractors` in the next pages was done using the default timezone `Z`. This could lead to incorrect results. Fix the issue by correctly using the initially specified timezone, which is found in the deserialisation of the cursor string. Fixes: elastic#51258
Previously, when the specified (or default) `fetchSize` led to subsequent HTTP requests and the usage of cursors, those subsequent requests no longer used the client timezone specified in the initial SQL query. As a consequence, even though the query is executed once (with the correct timezone), the processing of the query results by the `HitExtractors` in the next pages was done using the default timezone `Z`. This could lead to incorrect results. Fix the issue by correctly using the initially specified timezone, which is found in the deserialisation of the cursor string. Fixes: #51258 (cherry picked from commit 8f7afbd)
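To make the idea behind this fix concrete, here is a minimal toy sketch of a cursor that carries the client timezone, so follow-up page requests don't need to repeat it. This is not Elasticsearch's actual cursor format or API: `encode_cursor`, `decode_cursor`, `fetch_page`, the JSON payload, and the `+01:00` offset are all hypothetical and only illustrate the principle.

```python
import base64
import json
from datetime import datetime, timedelta, timezone


def encode_cursor(position, utc_offset_minutes):
    # Hypothetical cursor layout: paging position plus the client's UTC offset.
    payload = {"pos": position, "tz_offset_min": utc_offset_minutes}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()


def decode_cursor(cursor):
    # Recover both the position and the timezone stored by the first request.
    payload = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return payload["pos"], timezone(timedelta(minutes=payload["tz_offset_min"]))


def fetch_page(keys, page_size, cursor=None, utc_offset_minutes=0):
    """Return (rows, next_cursor); follow-up calls only need the cursor."""
    if cursor is None:
        pos, tz = 0, timezone(timedelta(minutes=utc_offset_minutes))
    else:
        pos, tz = decode_cursor(cursor)  # timezone comes from the cursor
    rows = [k.astimezone(tz).year for k in keys[pos:pos + page_size]]
    next_pos = pos + page_size
    offset_min = int(tz.utcoffset(None).total_seconds() // 60)
    next_cursor = encode_cursor(next_pos, offset_min) if next_pos < len(keys) else None
    return rows, next_cursor


# Year buckets that start at midnight in an assumed +01:00 client timezone.
client_tz = timezone(timedelta(hours=1))
keys = [datetime(y, 1, 1, tzinfo=client_tz) for y in (1952, 1953, 1954, 1955)]

rows, cursor = fetch_page(keys, 2, utc_offset_minutes=60)
all_rows = list(rows)
while cursor is not None:
    rows, cursor = fetch_page(keys, 2, cursor=cursor)  # no timezone passed here
    all_rows.extend(rows)
print(all_rows)
```

Because every page is rendered with the zone recovered from the cursor, the years come out without repeats even though only the first call specifies a timezone.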
The following query on the test `employees.csv`:
will return the following paged results: `null`, `1952`, `1952`, ...
Increasing the page size to 2 will duplicate `1954`.
This doesn't (easily?) reproduce without a non-Z timezone.
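A rough sketch of why a non-Z timezone matters, under stated assumptions: the original query and timezone are not shown here, so the `+01:00` offset, the year buckets, and the helper names below are all hypothetical. The toy model renders the first page with the client zone and later pages with `Z`, which is enough to make a year repeat across a page boundary.

```python
from datetime import datetime, timedelta, timezone

# Assumed client timezone, for illustration only.
CLIENT_TZ = timezone(timedelta(hours=1))
UTC = timezone.utc


def extract_year(instant, tz):
    # Toy stand-in for a hit extractor: render the instant's year in a zone.
    return instant.astimezone(tz).year


def paged_years(keys, page_size, first_page_tz, next_pages_tz):
    # Render page 1 with one zone and every later page with another.
    out = []
    for start in range(0, len(keys), page_size):
        tz = first_page_tz if start == 0 else next_pages_tz
        out.extend(extract_year(k, tz) for k in keys[start:start + page_size])
    return out


# Bucket keys: the start of each year, expressed in the client timezone.
keys = [datetime(y, 1, 1, tzinfo=CLIENT_TZ) for y in (1952, 1953, 1954, 1955)]

buggy = paged_years(keys, 2, CLIENT_TZ, UTC)        # later pages fall back to Z
fixed = paged_years(keys, 2, CLIENT_TZ, CLIENT_TZ)  # same zone on every page
print(buggy)
print(fixed)
```

In this model, midnight on 1 January at `+01:00` is still 31 December of the previous year in `Z`, so the first bucket of the second page is rendered one year too early and collides with the last value of the first page. With a `Z` client timezone both renderings agree, which is consistent with the observation that the problem does not show up without a non-Z zone.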