Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dbt): Keep syncing datasets if one failed to sync columns #313

Merged
merged 1 commit into from
Aug 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 11 additions & 5 deletions src/preset_cli/cli/superset/sync/dbt/datasets.py
Original file line number Diff line number Diff line change
Expand Up @@ -349,11 +349,17 @@ def sync_datasets( # pylint: disable=too-many-locals, too-many-arguments
# compute columns
final_dataset_columns = []
if not reload_columns:
refreshed_columns_list = client.get_refreshed_dataset_columns(dataset["id"])
final_dataset_columns = compute_columns(
dataset["columns"],
refreshed_columns_list,
)
try:
refreshed_columns_list = client.get_refreshed_dataset_columns(
dataset["id"],
)
final_dataset_columns = compute_columns(
dataset["columns"],
refreshed_columns_list,
)
except SupersetError:
failed_datasets.append(model["unique_id"])
Copy link

@Antonio-RiveroMartnez Antonio-RiveroMartnez Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very familiar with this part of the code, but since this list is visible only using a certain flag (--raise-failures) and it wasn't there before to consume, it would be good to put that info somewhere in the help of the cli method if not there already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the feedback, @Antonio-RiveroMartnez 🙏 I can confirm the flag is documented in the help command:

(local-cli) ➜  backend-sdk git:(dbt-cloud-versionless) preset-cli superset sync dbt-core --help
Usage: preset-cli superset sync dbt-core [OPTIONS] FILE

Options:
  --project TEXT              Name of the dbt project
  --target TEXT               Target name
  --profiles PATH             Location of profiles.yml file
  --exposures PATH            Path to file where exposures will be written
  --import-db                 Import (or update) the database connection to
                              Superset
  --disallow-edits            Mark resources as managed externally to prevent
                              edits
  --external-url-prefix TEXT  Base URL for resources
  -s, --select TEXT           Model selection
  -x, --exclude TEXT          Models to exclude
  --exposures-only            Do not sync models to datasets and only fetch
                              exposures instead
  --preserve-columns          Preserve column and metric configurations
                              defined in Preset
  --preserve-metadata         Preserve column and metric configurations
                              defined in Preset
  --merge-metadata            Update Preset configurations based on dbt
                              metadata. Preset-only metrics are preserved
  --raise-failures            End the execution with an error if a model fails
                              to sync or a deprecated feature is used
  --help                      Show this message and exit.

Since this is a customer-facing CLI, we also have this logic documented here.

continue

# compute update payload
update = compute_dataset_metadata(
Expand Down
28 changes: 28 additions & 0 deletions tests/cli/superset/sync/dbt/datasets_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -261,6 +261,34 @@ def test_sync_datasets_second_update_fails(mocker: MockerFixture) -> None:
assert failed == [models[0]["unique_id"]]


def test_sync_datasets_sync_columns_fails(mocker: MockerFixture) -> None:
"""
Test ``sync_datasets`` when the request to sync columns fails.
"""
client = mocker.MagicMock()
client.get_refreshed_dataset_columns.side_effect = [SupersetError([error])]
mocker.patch(
"preset_cli.cli.superset.sync.dbt.datasets.get_or_create_dataset",
return_value={"id": 1, "metrics": [], "columns": []},
)
mocker.patch(
"preset_cli.cli.superset.sync.dbt.datasets.compute_metrics",
)
working, failed = sync_datasets(
client=client,
models=models,
metrics=metrics,
database={"id": 1},
disallow_edits=False,
external_url_prefix="",
reload_columns=False,
)

client.update_dataset.assert_not_called()
assert working == []
assert failed == [models[0]["unique_id"]]


def test_sync_datasets_with_alias(mocker: MockerFixture) -> None:
"""
Test ``sync_datasets`` when the model has an alias.
Expand Down
Loading