Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BEFORE syntax support for presto engine. #22631

Merged
merged 1 commit into from
May 14, 2024

Conversation

gupteaj
Copy link
Contributor

@gupteaj gupteaj commented Apr 29, 2024

Description

Presto issue #21971

Note - This PR has BEFORE syntax and engine side changes. For iceberg connector changes, test and doc, a new PR will be created.

Design description :
This feature will allow iceberg connector to query historical data using BEFORE syntax on a table.
Time travel version option will read bigint snapshot id value for the table. Time travel timestamp option will read
timestamp-with-time-zone value for the table.

examples

// Return table data for tab1 before snapshot id 8772871542276440693
select * from tab1 FOR SYSTEM_VERSION BEFORE 8772871542276440693;

// Return table data for tab1 before '2023-08-17 13:29:46.822 America/Los_Angeles'
select * from tab1 FOR SYSTEM_TIME BEFORE TIMESTAMP '2023-08-17 13:29:46.822 America/Los_Angeles';

Parser, Semantic Analyzer, Metadata

  • support point in time query with BEFORE syntax on table reference
select * from tab1 FOR VERSION BEFORE  < bigint type >
select * from tab1 FOR TIMESTAMP BEFORE <timestamp with time zone type>

In parser tree, update TableVersionExpression class to hold time travel operator ( EQUAL/LESS_THAN) for AS OF/BEFORE state

  • use table version operator in Semantic Analyzer code along with table version type and expression
  • return a new error from Semantic Analyzer for BEFORE syntax
  • rename AS OF code with generic table version state

Motivation and Context

Snowflake provides a BEFORE clause for time travel queries. When used in a query, this clause is specified in the FROM clause immediately after the table name. It determines the point in the past from which historical data is requested for the object. The AT keyword includes changes made by a statement or transaction with a timestamp equal to the specified parameter, while the BEFORE keyword refers to a point immediately preceding the specified parameter.

IBM Netezza supports BEFORE clause similar to Snowflake.

References -
Snowflake - https://docs.snowflake.com/en/sql-reference/constructs/at-before
IBM Netezza - https://cloud.ibm.com/docs/netezza?topic=netezza-runningqueries_tt#before_tt

Impact

A new table level option to return specific snapshot for iceberg connector

Test Plan

  • Add new parser tests for BEFORE syntax
  • Update TestIcebergTableVersion test for BEFORE syntax
  • New iceberg PR will have time travel tests

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

Note - Iceberg PR will have release notes and doc section

Copy link

github-actions bot commented Apr 30, 2024

Codenotify: Notifying subscribers in CODENOTIFY files for diff 12a6e44...153bc1f.

Notify File(s)
@aditi-pandit presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@elharo presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@kaikalur presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@rschlussel presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4

@tdcmeehan tdcmeehan self-assigned this Apr 30, 2024
Copy link
Contributor

@ClarenceThreepwood ClarenceThreepwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include parser/formatter tests in TestSqlParser and TestStatementBuilder

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me. Only some little nits.

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the fix.

Copy link
Contributor

@ClarenceThreepwood ClarenceThreepwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Thanks!

@gupteaj gupteaj merged commit d8d60a9 into prestodb:master May 14, 2024
56 checks passed
@zacw7
Copy link
Member

zacw7 commented May 19, 2024

Hey @gupteaj , this change has broken a bunch of tests.

Stderr:
presto/presto-parser/src/main/java/com/facebook/presto/sql/parser/AstBuilder.java:1607: error: cannot find symbol
                return timestampExpression(getLocation(context), getTableVersionOperator((Token) context.tableVersionState().getChild(0).getPayload()), child);
                                                                                                        ^
  symbol:   method tableVersionState()
  location: variable context of type TableVersionContext
presto/presto-parser/src/main/java/com/facebook/presto/sql/parser/AstBuilder.java:1610: error: cannot find symbol
                return versionExpression(getLocation(context), getTableVersionOperator((Token) context.tableVersionState().getChild(0).getPayload()), child);

Do we need to upgrade the antlr version?

@tdcmeehan
Copy link
Contributor

@zacw7 can you add a link to a test run?

@zacw7
Copy link
Member

zacw7 commented May 20, 2024

@zacw7 can you add a link to a test run?

They are meta internal tests. Let me check the build tool.

@ClarenceThreepwood
Copy link
Contributor

I suspect that the Meta build tool chain needs to recompile the parser module more aggressively. This is the second such instance I have seen in recent times

@wanglinsong wanglinsong mentioned this pull request Jun 25, 2024
36 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants