fix: implement consistent formatting for constraint expressions #1985

Merged on Dec 19, 2023 (3 commits)

Collaborator

@Blajda Blajda commented Dec 19, 2023

Description

Implements consistent formatting for constraint expressions, so that an expression is always stored in one canonical form such as value < 1000, however the input string was written.

Also includes two drive-by improvements:

  1. Test and fix that DataFusion expressions can actually be used when adding a constraint
  2. Test and fix that constraints can be added to columns with capitalized names
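Editorial sketch for item 2: capitalized column names are a common source of trouble because many SQL front ends, DataFusion's included by default, fold unquoted identifiers to lowercase. The thread does not say how this PR handles that, so the two helpers below (`resolve_ident` and `quote_if_needed`, both hypothetical names) only illustrate the general technique of quoting an identifier when folding would change it. This is standard-library Rust, not delta-rs code.

```rust
/// Resolves an identifier the way a case-folding SQL parser typically does:
/// a double-quoted identifier keeps its case, an unquoted one is lowercased.
/// (Hypothetical helper for illustration only.)
fn resolve_ident(raw: &str) -> String {
    let t = raw.trim();
    if t.len() >= 2 && t.starts_with('"') && t.ends_with('"') {
        // Strip the outer quotes and unescape doubled quotes.
        t[1..t.len() - 1].replace("\"\"", "\"")
    } else {
        t.to_lowercase()
    }
}

/// Wraps a column name in double quotes when writing it unquoted would not
/// survive case folding (or is not a plain lowercase identifier).
/// (Hypothetical helper for illustration only.)
fn quote_if_needed(column: &str) -> String {
    let plain = !column.is_empty()
        && !column.starts_with(|c: char| c.is_ascii_digit())
        && column
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
    if plain {
        column.to_string()
    } else {
        format!("\"{}\"", column.replace('"', "\"\""))
    }
}
```

With these helpers, an unquoted `Value` resolves to `value` and would miss a column that is really named `Value`, while the quoted form round-trips back to the original name.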

Related Issue(s)

@github-actions github-actions bot added binding/rust Issues for the Rust crate crate/core labels Dec 19, 2023
Collaborator

@ion-elgreco ion-elgreco left a comment

Great addition!

Left only one comment just for my own understanding

Comment on lines +114 to +115
let expr = into_expr(expr, &schema, &state)?;
let expr_str = fmt_expr_to_sql(&expr)?;
Collaborator

Is this going from a SQL string expression to a DataFusion Expr and then back to a SQL string expression?

Collaborator Author

@Blajda Blajda Dec 19, 2023

Yes, exactly.
If the provided expression is already a DataFusion expression, into_expr simply returns it; otherwise it parses the string expression.
Either way, whether a String or a proper DataFusion expression is provided, it needs to be normalized somewhere.
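Editorial sketch of the round trip described above, using only the Rust standard library. It is not the delta-rs implementation: `Expr` here is a toy type rather than DataFusion's, `into_expr_toy` is a hypothetical stand-in loosely modeled on the `into_expr` call in the diff (the real one also takes a schema and session state), and the parser is assumed to handle only a single `operand op operand` comparison.

```rust
use std::fmt;

/// Toy parsed expression (NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Binary { left: Box<Expr>, op: String, right: Box<Expr> },
}

/// Caller input: raw SQL text or an already-built expression.
enum Expression {
    Str(String),
    Parsed(Expr),
}

/// Returns an already-parsed expression unchanged; parses a string otherwise.
fn into_expr_toy(input: Expression) -> Result<Expr, String> {
    match input {
        Expression::Parsed(e) => Ok(e),
        Expression::Str(s) => parse(&s),
    }
}

/// Parses a single comparison such as `value<1000`.
fn parse(s: &str) -> Result<Expr, String> {
    // Two-character operators come first so `<=` is not read as `<`.
    const OPS: [&str; 6] = ["<=", ">=", "!=", "<", ">", "="];
    for op in OPS {
        if let Some(pos) = s.find(op) {
            let left = operand(&s[..pos])?;
            let right = operand(&s[pos + op.len()..])?;
            return Ok(Expr::Binary {
                left: Box::new(left),
                op: op.to_string(),
                right: Box::new(right),
            });
        }
    }
    Err(format!("no comparison operator in {s:?}"))
}

/// An operand is an integer literal or a bare column name.
fn operand(s: &str) -> Result<Expr, String> {
    let t = s.trim();
    if t.is_empty() {
        return Err("missing operand".to_string());
    }
    match t.parse::<i64>() {
        Ok(n) => Ok(Expr::Literal(n)),
        Err(_) if t.chars().all(|c| c.is_alphanumeric() || c == '_') => {
            Ok(Expr::Column(t.to_string()))
        }
        Err(_) => Err(format!("unsupported operand {t:?}")),
    }
}

/// The single place that decides the canonical SQL spelling.
impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "{name}"),
            Expr::Literal(n) => write!(f, "{n}"),
            Expr::Binary { left, op, right } => write!(f, "{left} {op} {right}"),
        }
    }
}

/// String or expression in, canonical string out.
fn normalize(input: Expression) -> Result<String, String> {
    Ok(into_expr_toy(input)?.to_string())
}
```

Because both input kinds pass through the same formatter, `normalize(Expression::Str("value<1000".to_string()))` and the equivalent `Expression::Parsed(...)` value yield the same string, `value < 1000`.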

Collaborator

Which SQL dialect does Datafusion actually use?

Collaborator

Postgres ... but the dialect that expressions in the Delta log should comply with is as yet unspecified :).

Collaborator

Should we put SQLglot in between on the python side then, so we can allow more flexibility there?

Collaborator Author

We use the generic dialect, which in my mind is similar to Postgres. Given that we are only parsing expressions, not DML or DDL statements, we should be fine.
https://docs.rs/sqlparser/latest/sqlparser/dialect/struct.GenericDialect.html

Collaborator Author

> Should we put SQLglot in between on the python side then, so we can allow more flexibility there?

I'm indifferent to that feature. My only concern is that if we start exposing operations to other languages (i.e. delta-core), then we should have fairly consistent interfaces.
A decision needs to be made and documented on which SQL dialect must be used when exposing expressions to the log. If I had to guess, they would pick Hive SQL.

@Blajda Blajda marked this pull request as ready for review December 19, 2023 19:45
Collaborator

@roeap roeap left a comment

❤️

@roeap roeap merged commit f6d2061 into delta-io:main Dec 19, 2023
21 checks passed

Successfully merging this pull request may close these issues.

Constraint expr not formatted during commit action
3 participants