update uniqueness test #74

Merged: 9 commits, Apr 17, 2024

Conversation

@fivetran-reneeli fivetran-reneeli commented Apr 9, 2024

PR Overview

This PR will address the following Issue/Feature:
#73

This PR will result in the following new package version: v0.11.1

  • updates uniqueness test

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

🚨 Breaking Changes 🚨

  • Updates the unique_invoice_line_item_id uniqueness test in stg_stripe__invoice_line_item to include invoice_id. This is because unique_invoice_line_item_id (unique_id in the raw source invoice_line_item table) was part of an older version of Stripe that was included in the new version to help migrate internal references. See the Stripe API update for more information. The Fivetran connector persists this in order to resolve the pagination break issue for invoice line items that was introduced by the API update.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run --full-refresh && dbt test
  • dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

  • will ask customer to confirm the test works now

If you had to summarize this PR in an emoji, which would it be?

💃

@fivetran-reneeli fivetran-reneeli linked an issue Apr 9, 2024 that may be closed by this pull request
4 tasks
@fivetran-reneeli fivetran-reneeli self-assigned this Apr 9, 2024
@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for working through these updates. I just have a few questions and suggestions before this will be good to approve. Let me know if you would like to discuss in more detail.

Comment on lines 489 to 493
- dbt_utils.unique_combination_of_columns:
    combination_of_columns:
      - unique_invoice_line_item_id
      - invoice_id
      - source_relation
Contributor

Is this necessary? Wouldn't it make more sense to include unique_invoice_line_item_id in the unique combination of columns test above as opposed to creating a new test?

Contributor

Actually, do we even need to include this in the test? It seems the original unique combination of columns test is passing just fine. Since the unique_id is an artifact of an older Stripe API, should we just remove the test on that field altogether? It seems like testing on invoice_id and invoice_line_id is sufficient. What are your thoughts?

Contributor Author

Ah I see, keep the existing unique test on the combination of invoice_id and invoice_line_id, just remove the unique test on unique_invoice_line_item_id. The hesitancy I have with that is it then diverges from the ERD.

[image: ERD screenshot]

Given that, a good compromise could be your first suggestion, adding unique_invoice_line_item_id to the existing test.

Contributor

Yeah I would not create a new test in this release. I would prefer we remove the failing one and then either not test the unique_id anymore or include it in the existing test.

This may be a hot take, but I would actually recommend we don't include the unique_id at all and we just remove the test. From your investigation we have found that the unique_id is an artifact from a release from January 2020. We also can see the unique combo test on invoice_id and invoice_line_id is working as expected. I would worry that adding the unique_id to that test would not actually be an accurate representation of the uniqueness of the table.
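One way to read the concern above, as a minimal Python sketch rather than anything from the package itself: the rows, the `duplicate_keys` helper, and the idea that a legacy `unique_id` repeats across invoices are all assumptions made for illustration. The sketch shows that `unique_id` alone can fail while the `invoice_id` + `invoice_line_id` combination passes, and that adding `unique_id` to the combination can hide a genuine duplicate.

```python
from collections import Counter

# Hypothetical rows; the values are made up for illustration only.
rows = [
    {"invoice_id": "in_1", "invoice_line_id": "il_a", "unique_id": "legacy_1"},
    # Assumption: the legacy unique_id can repeat across invoices.
    {"invoice_id": "in_2", "invoice_line_id": "il_b", "unique_id": "legacy_1"},
    {"invoice_id": "in_2", "invoice_line_id": "il_c", "unique_id": "legacy_2"},
]


def duplicate_keys(rows, columns):
    """Return the value combinations of `columns` that occur more than once."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# unique_id on its own is not unique in this sample.
print(duplicate_keys(rows, ["unique_id"]))

# The existing combination is unique, so the existing test would pass.
print(duplicate_keys(rows, ["invoice_id", "invoice_line_id"]))

# A genuinely duplicated line item that happens to carry a different unique_id.
rows_with_dup = rows + [
    {"invoice_id": "in_2", "invoice_line_id": "il_c", "unique_id": "legacy_3"},
]

# The two-column check catches the duplicate ...
print(duplicate_keys(rows_with_dup, ["invoice_id", "invoice_line_id"]))

# ... but widening the key with unique_id makes every row look distinct again.
print(duplicate_keys(rows_with_dup, ["invoice_id", "invoice_line_id", "unique_id"]))
```

Under these assumptions, widening the key can only make the check more permissive, which is consistent with keeping the test on `invoice_id` and `invoice_line_id` and dropping `unique_id` from testing.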

Contributor Author

As per internal discussion with team, will remove the uniqueness test from unique_id entirely. Will see if any changes are needed on the connector end.

CHANGELOG.md Outdated
[PR [#74](https://github.com/fivetran/dbt_stripe_source/pull/74)] includes the following updates:

## 🚨 Breaking Changes 🚨
- Updates the `unique_invoice_line_item_id` uniqueness test in `stg_stripe__invoice_line_item` to include `invoice_id`. This is because `unique_invoice_line_item_id` (`unique_id` in the raw source `invoice_line_item` table) was part of an older version of Stripe that was included in the new version to help migrate internal references. See the Stripe [API update](https://stripe.com/docs/upgrades#2019-12-03) for more information. The Fivetran connector persists this in order to resolve the pagination break issue for invoice line items that was introduced by the [API update](https://stripe.com/docs/upgrades#2019-12-03).
Contributor

Request to update this based on feedback from the test updates.

Contributor

Additionally, is there a reason this is listed as a breaking change? If it is a breaking change, we will need to bump the version to v0.12.0 and also make a breaking change downstream in the dbt_stripe package.

I am not entirely sure a breaking change is needed here.
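The versioning rule implied here can be sketched in a few lines of Python. This is an illustration only: `next_version` is a hypothetical helper, the starting version `v0.11.0` is inferred from the proposed `v0.11.1`, and the convention that a breaking change bumps the minor number while the package is on `0.x` is taken from the comment above rather than from any documented policy.

```python
def next_version(current: str, breaking: bool) -> str:
    """Bump the minor number for a breaking change, otherwise the patch number."""
    major, minor, patch = (int(part) for part in current.lstrip("v").split("."))
    if breaking:
        # Breaking change: bump minor and reset patch (assumed 0.x convention).
        return f"v{major}.{minor + 1}.0"
    # Bugfix: bump patch only.
    return f"v{major}.{minor}.{patch + 1}"


print(next_version("v0.11.0", breaking=True))
print(next_version("v0.11.0", breaking=False))
```

Treating the change as a bugfix keeps the release at `v0.11.1` and, per the comment above, avoids a matching breaking release downstream in `dbt_stripe`.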

Contributor Author

Yes update based on above! And true, I changed this to a bugfix.

Contributor

Reminder to make the appropriate updates here based on the other comment.

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-reneeli I just responded to your comments. Let me know if you have any questions or would like to discuss further. Once the remaining updates are applied I will give this a re-review.

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

LGTM

@fivetran-reneeli fivetran-reneeli merged commit 4bfb2b3 into main Apr 17, 2024
7 checks passed
Successfully merging this pull request may close these issues.

[Bug] Uniqueness test on unique_invoice_line_item_id
3 participants