Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure optimize is run with liquid_clustered_by #463

Merged
merged 5 commits into from
Sep 27, 2023
Merged

Conversation

benc-db
Copy link
Collaborator

@benc-db benc-db commented Sep 26, 2023

Resolves #448

Description

Discovered that we weren't running optimize with liquid_cluster_by because the optimize clause was only set up to respect z-ordering.

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

@benc-db benc-db temporarily deployed to azure-prod September 26, 2023 19:53 — with GitHub Actions Inactive
@benc-db benc-db temporarily deployed to azure-prod September 26, 2023 19:53 — with GitHub Actions Inactive
@benc-db benc-db temporarily deployed to azure-prod September 26, 2023 19:53 — with GitHub Actions Inactive
susodapop
susodapop previously approved these changes Sep 26, 2023
Copy link

@susodapop susodapop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one suggestion.

@@ -605,6 +605,12 @@ def _assertTablesEqualSql(self, relation_a, relation_b, columns=None):

return sql

def assert_in_log(self, message) -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is eloquent. Will it make sense to use it elsewhere in the test suite retroactively?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other place this behavior exists today checks for several different messages. For the most part you shouldn't look at the log, since you can just as easily look at stdout with run_and_capture; here I specifically need to look at debug messages, which I'm only doing because I don't know another good way to observe optimize. So short answer is, I may expand this in the future to take multiple messages, at which point that other test can fold in, but I also don't want to encourage log scanning lol.

def assert_in_log(self, message) -> None:
log_file = os.path.join(self._logs_dir, "dbt.log")
with open(log_file, "r") as f:
log = f.read()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One recommendation, run either .upper() or .lower() on both the input and the log message in case of capitalisation differences.

@benc-db benc-db temporarily deployed to azure-prod September 26, 2023 20:45 — with GitHub Actions Inactive
@benc-db benc-db temporarily deployed to azure-prod September 26, 2023 20:45 — with GitHub Actions Inactive
@benc-db benc-db temporarily deployed to azure-prod September 26, 2023 20:45 — with GitHub Actions Inactive
@benc-db benc-db merged commit 3609a6e into main Sep 27, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run optimize after Liquid Clustering
3 participants