[native] Support system tables #21285

Merged: 3 commits, Oct 31, 2023

arhimondr
Member

Description

Querying system tables with native execution enabled may return incorrect results.

Motivation and Context

Native and Java-based execution implement their partitioning functions differently, so queries such as the one below may return incorrect results.

SELECT *
FROM (SELECT DISTINCT regionkey FROM table) t
INNER JOIN (SELECT regionkey FROM table$partitions) p
ON t.regionkey = p.regionkey
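
To illustrate why mismatched partitioning functions can drop join matches, here is a minimal sketch. It is not Presto or Velox code: `partitionA` and `partitionB` are hypothetical stand-ins for two engines' partitioning functions, and the toy join only compares rows that were routed to the same partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartitioningMismatchSketch {
    /** Maps a key to a partition index; each engine is assumed to bring its own. */
    interface Partitioner {
        int partition(long key, int partitionCount);
    }

    // Hypothetical stand-in for one engine's partitioning function.
    static int partitionA(long key, int partitionCount) {
        return (int) Math.floorMod(key, (long) partitionCount);
    }

    // Hypothetical stand-in for another engine's function: same keys, different buckets.
    static int partitionB(long key, int partitionCount) {
        return (int) Math.floorMod(key * 31 + 7, (long) partitionCount);
    }

    // Partitioned equi-join: rows are compared only if they landed in the same partition.
    static List<Long> partitionedJoin(List<Long> left, Partitioner leftFn,
                                      List<Long> right, Partitioner rightFn,
                                      int partitionCount) {
        List<Long> matches = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            for (long l : left) {
                if (leftFn.partition(l, partitionCount) != p) {
                    continue;
                }
                for (long r : right) {
                    if (rightFn.partition(r, partitionCount) == p && l == r) {
                        matches.add(l);
                    }
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Long> keys = Arrays.asList(0L, 1L, 2L, 3L, 4L);
        Partitioner a = PartitioningMismatchSketch::partitionA;
        Partitioner b = PartitioningMismatchSketch::partitionB;
        // Both sides partitioned by the same function: every key finds its match.
        System.out.println(partitionedJoin(keys, a, keys, a, 4).size());
        // Sides partitioned by different functions: equal keys can land in different partitions.
        System.out.println(partitionedJoin(keys, a, keys, b, 4).size());
    }
}
```

With these particular toy functions and keys, the consistent case finds all five matches and the mismatched case finds none. In a real engine the mismatch would typically lose only some of the rows.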

Impact

Incorrect results may be returned when querying system tables with native execution enabled.

Test Plan

Integration test

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

If release note is NOT required, use:

== NO RELEASE NOTE ==

@arhimondr
Member Author

The idea is to insert an extra "GATHER" exchange directly on top of the TableScan of a system table. This ensures that the partitioning function is applied consistently by a native worker.
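
A rough sketch of that rewrite on a toy plan tree follows. The `Node`, `Scan`, `Join`, and `Gather` classes and the `rewrite` method are hypothetical and written only for illustration; they are not Presto's planner classes or the code in this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class GatherAboveSystemScanSketch {
    // Toy plan node for illustration only.
    static abstract class Node {
        final List<Node> sources;
        Node(List<Node> sources) { this.sources = sources; }
        abstract Node withSources(List<Node> newSources);
        abstract String label();
    }

    static final class Scan extends Node {
        final String table;
        final boolean systemTable;
        Scan(String table, boolean systemTable) {
            super(Collections.<Node>emptyList());
            this.table = table;
            this.systemTable = systemTable;
        }
        Node withSources(List<Node> newSources) { return this; }
        String label() { return "Scan[" + table + "]"; }
    }

    static final class Join extends Node {
        Join(Node left, Node right) { super(Arrays.asList(left, right)); }
        Node withSources(List<Node> s) { return new Join(s.get(0), s.get(1)); }
        String label() { return "Join"; }
    }

    static final class Gather extends Node {
        Gather(Node source) { super(Collections.singletonList(source)); }
        Node withSources(List<Node> s) { return new Gather(s.get(0)); }
        String label() { return "Gather"; }
    }

    static boolean isSystemScan(Node node) {
        return node instanceof Scan && ((Scan) node).systemTable;
    }

    // Wraps every system-table scan in a Gather; leaves all other nodes in place.
    static Node rewrite(Node node) {
        if (isSystemScan(node)) {
            return new Gather(node);
        }
        if (node instanceof Gather && isSystemScan(node.sources.get(0))) {
            return node; // already wrapped, so the rewrite stays idempotent
        }
        List<Node> rewritten = new ArrayList<>();
        for (Node source : node.sources) {
            rewritten.add(rewrite(source));
        }
        return node.withSources(rewritten);
    }

    // Renders the tree as nested text, e.g. Join(Scan[a], Scan[b]).
    static String describe(Node node) {
        if (node.sources.isEmpty()) {
            return node.label();
        }
        StringBuilder sb = new StringBuilder(node.label()).append("(");
        for (int i = 0; i < node.sources.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(describe(node.sources.get(i)));
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        Node plan = new Join(new Scan("table", false), new Scan("table$partitions", true));
        System.out.println(describe(rewrite(plan)));
    }
}
```

In this sketch only the system-table side of the join gains a `Gather` node, and the regular table scan is left unchanged.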

Contributor

@mbasmanova mbasmanova left a comment

@arhimondr Thank you, Andrii, for enabling queries that use system tables on Prestissimo and Presto-on-Spark on Velox.

@amitkdutta amitkdutta merged commit 2172b36 into prestodb:master Oct 31, 2023
59 checks passed
@arhimondr
Member Author

@mbasmanova Happy to help

Presto on Spark (even with Java execution) is still not supported, though, for a different reason: in Presto on Spark, a distributed stage cannot read from a coordinator stage. Queries that require a system table to be read by a distributed stage (such as a join) are therefore not supported.

arhimondr added a commit to arhimondr/presto that referenced this pull request Jan 19, 2024
Follow up of prestodb#21285

Partial aggregation output might not be compatible between Java
and C++ implementations
arhimondr added a commit that referenced this pull request Jan 19, 2024
Follow up of #21285

Partial aggregation output might not be compatible between Java
and C++ implementations
kaikalur pushed a commit to kaikalur/presto that referenced this pull request Mar 14, 2024
Follow up of prestodb#21285

Partial aggregation output might not be compatible between Java
and C++ implementations
wypb pushed a commit to wypb/presto that referenced this pull request Apr 25, 2024
Follow up of prestodb#21285

Partial aggregation output might not be compatible between Java
and C++ implementations