Fix duplicate media_id causing failures in the media_stats table #59

Closed
matus-tomlein opened this issue Oct 12, 2023 · 1 comment · Fixed by #66
Labels
category:models Related to the models in the package. priority:medium On the roadmap. type:bug Bugs or weaknesses. The issue has to contain steps to reproduce.

Comments

@matus-tomlein
Contributor

Describe the bug

This problem occurs when either:

  1. The same media label is used for two different pieces of media content
  2. Some events are tracked with a different media_type or media_player_type than other media events for the same content (for example, when the properties are set later in the tracking)

This causes the media_stats table to break because it has a unique key on the media_id while also grouping by the media_label, media_type and media_player_type (see here).
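
The failure mode can be illustrated with a minimal Python sketch. The event rows, the `aggregate_media_stats` and `duplicate_keys` helpers, and the `event_count` column are hypothetical stand-ins, not the package's actual SQL. The sketch only assumes that the model groups by media_id, media_label, media_type and media_player_type, as described above.

```python
from collections import Counter

# Hypothetical events: the same media_id is tracked with two media_type values.
events = [
    {"media_id": "m1", "media_label": "Intro", "media_type": "video", "media_player_type": "html5"},
    {"media_id": "m1", "media_label": "Intro", "media_type": "video", "media_player_type": "html5"},
    {"media_id": "m1", "media_label": "Intro", "media_type": "audio", "media_player_type": "html5"},
]

GROUP_COLUMNS = ("media_id", "media_label", "media_type", "media_player_type")


def aggregate_media_stats(rows):
    """Group rows by GROUP_COLUMNS and count the events in each group."""
    counts = Counter(tuple(row[c] for c in GROUP_COLUMNS) for row in rows)
    return [dict(zip(GROUP_COLUMNS, key), event_count=n) for key, n in counts.items()]


def duplicate_keys(rows, key="media_id"):
    """Return the key values that appear in more than one aggregated row."""
    seen = Counter(row[key] for row in rows)
    return sorted(k for k, n in seen.items() if n > 1)


stats = aggregate_media_stats(events)
duplicates = duplicate_keys(stats)
print(len(stats), duplicates)
```

Because the grouping produces one row per property combination, media_id appears in two aggregated rows here, which is the situation a uniqueness constraint on media_id alone would reject.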

Steps to reproduce

Generate events which have different media_type tracked for the same media_label.

Expected results

We don't want to hide this problem as it signals an issue in the tracking. But we also don't want the model to break. Instead, it would be better if the media_stats table contained multiple rows for each of the tracked property combinations.

Actual results

dbt jobs fail in this case.

Potential solutions

A couple of solutions are possible:

  1. Add media_type and media_player_type to the surrogate key when generating media_id (here) – this would be a breaking change.
  2. Change the unique key for the media_stats table to instead be a combined version of the media_id, media_label, media_type, and media_player_type (or a surrogate key for them).
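
As a rough sketch of the second option, a combined key can be derived by hashing the four columns together. The `composite_key` helper, the `-` separator, the `_null_` placeholder and the choice of SHA-256 are all assumptions made for illustration; the package may build its surrogate keys differently.

```python
import hashlib

KEY_COLUMNS = ("media_id", "media_label", "media_type", "media_player_type")


def composite_key(row, columns=KEY_COLUMNS, null_token="_null_"):
    """Hash the key columns into a single surrogate key (hypothetical helper)."""
    # Replace missing values with a fixed token so the key is always defined.
    parts = [null_token if row.get(c) is None else str(row[c]) for c in columns]
    return hashlib.sha256("-".join(parts).encode("utf-8")).hexdigest()


# Hypothetical aggregated rows sharing a media_id but differing in media_type.
rows = [
    {"media_id": "m1", "media_label": "Intro", "media_type": "video", "media_player_type": "html5"},
    {"media_id": "m1", "media_label": "Intro", "media_type": "audio", "media_player_type": "html5"},
]

keys = [composite_key(r) for r in rows]
print(len(set(keys)) == len(rows))
```

With a key like this, rows that share a media_id but differ in any tracked property stay distinct, so the table would hold one row per combination rather than failing.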
@matus-tomlein matus-tomlein added the type:bug Bugs or weaknesses. The issue has to contain steps to reproduce. label Oct 12, 2023
@github-actions github-actions bot added the status:needs_triage Needs maintainer triage. label Oct 12, 2023
@rlh1994
Contributor

rlh1994 commented Oct 16, 2023

We'll work on this as part of the next major release so we can bundle in any other breaking changes.
