Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report zero values instead of unknown for empty ingest queries #15674

Merged
merged 9 commits into from
Jan 17, 2024

Conversation

abhishekrb19
Copy link
Contributor

@abhishekrb19 abhishekrb19 commented Jan 12, 2024

MSQ now allows empty ingest queries by default. For such queries that don't generate any output rows, the query counters in the async status result object/task report don't contain numTotalRows and totalSizeInBytes. These properties when not set/undefined can be confusing to API clients. For example, the web-console treats it as unknown values.

This patch fixes the counters by explicitly reporting them as 0 instead of null for empty ingest queries.

  • Add a test case each for an empty insert and empty replace query in SqlMSQStatementResourcePostTest.java. Validate the responses which now include ResultSetInformation with 0 rows instead of null.
  • Add test coverage for all the MSQ Destinations, including DataSourceMSQDestination and TaskReportMSQDestination in SqlStatementResourceHelperTest.java. We don't report 0 values for DurableStorageDestination type as it only applies to select queries as it can be unknown, so the behavior is preserved as-is.
  • Some miscellaneous spacing, typo and interpolation log fixes in QueryHostFinder.java that I found while testing this patch.

Before the fix, web-console interpreting as unknown number of rows:

CleanShot 2024-01-11 at 23 39 53@2x

After the fix:

CleanShot 2024-01-11 at 23 39 15@2x
Key changed/added classes in this PR
  • SqlStatementResourceHelper.java
  • SqlStatementResourceHelperTest.java

Release Note

Related to #15495, MSQE will report report zero values for numTotalRows and totalSizeInBytes.

This PR has:

  • been self-reviewed.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • a release note entry in the PR description.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

- MSQ engine allows empty ingest by default. The statement response and task
status API should show 0 rows instead of null rows. If worker counter snapshot
is present but empty that denotes that 0 rows were processed.
- Add tests for this scenario.
- Fixup miscellaneous interpollation in QueryHostFinder (extra space, incorrect
interpolations).
@github-actions github-actions bot added Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Jan 12, 2024
@cryptoe cryptoe merged commit c27f5bf into apache:master Jan 17, 2024
83 checks passed
@cryptoe
Copy link
Contributor

cryptoe commented Jan 17, 2024

@abhishekrb19 Could you also add a release note for the above PR. It helps out when release notes are compiled.

@cryptoe cryptoe added this to the Druid 29.0.0 milestone Jan 17, 2024
@abhishekrb19 abhishekrb19 deleted the msq_fix_empty_counters branch January 17, 2024 20:29
LakshSingla pushed a commit to LakshSingla/druid that referenced this pull request Jan 30, 2024
…e#15674)

MSQ now allows empty ingest queries by default. For such queries that don't generate any output rows, the query counters in the async status result object/task report don't contain numTotalRows and totalSizeInBytes. These properties when not set/undefined can be confusing to API clients. For example, the web-console treats it as unknown values.

This patch fixes the counters by explicitly reporting them as 0 instead of null for empty ingest queries.
abhishekagarwal87 pushed a commit that referenced this pull request Jan 30, 2024
… (#15791)

MSQ now allows empty ingest queries by default. For such queries that don't generate any output rows, the query counters in the async status result object/task report don't contain numTotalRows and totalSizeInBytes. These properties when not set/undefined can be confusing to API clients. For example, the web-console treats it as unknown values.

This patch fixes the counters by explicitly reporting them as 0 instead of null for empty ingest queries.

Co-authored-by: Abhishek Radhakrishnan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants