Add missing primary key to media_ad_views
georgewoodhead committed Nov 14, 2023
1 parent 2f41fde commit 6363007
Showing 8 changed files with 33 additions and 14 deletions.
7 changes: 4 additions & 3 deletions CHANGELOG
@@ -1,10 +1,11 @@
snowplow-media-player 0.7.0 (2023-xx-xx)
---------------------------------------
## Summary
This release adds a more robust unique media identifier. This fixes an issue where duplicate `media_id` values could occur in the media stats table as a result of incorrect tracking implementation (e.g. sharing the same media label across different media types).
This release adds a more robust unique media identifier. This fixes an issue where duplicate `media_id` values could occur in the media stats table as a result of incorrect tracking implementation (e.g. sharing the same media label across different media types). This release also fixes the incremental materialization of the media_ad_views table by adding a unique primary key.

## Features
Add unique media identifier (close #59)
## Fixes
- Add unique media identifier (close #59)
- Add missing primary key to media_ad_views

## Under the hood

6 changes: 3 additions & 3 deletions docs/markdown/snowplow_media_player_common_cols.md
@@ -3,15 +3,15 @@ A UUID for each event e.g. `c6ef3124-b53a-4b13-a233-0088f79dcbcb`.
{% enddocs %}

{% docs col_media_identifier %}
The surrogate key generated from `media_id`, `media_label`, `media_type` and `media_player_type` to create a unique media element identifier.
The surrogate key generated from `player_id`, `media_label`, `media_type` and `media_player_type` to create a unique media element identifier.
{% enddocs %}

{% docs col_player_id %}
The HTML id attribute of the media content. It is the `player_id` in case of YouTube and `html_id` in case of HTML5.
{% enddocs %}

{% docs col_play_id %}
The surrogate key generated from `page_view_id`, `media_id`, `media_label`, `media_type` and `media_player_type` to create a unique play event identifier.
The surrogate key generated from `page_view_id`, `player_id`, `media_label`, `media_type` and `media_player_type` to create a unique play event identifier.
{% enddocs %}

{% docs col_page_view_id %}
@@ -301,7 +301,7 @@ The number of pageviews with audio plays of any duration.
{% enddocs %}

{% docs col_last_base_tstamp %}
The start_tstamp of the last processed page_view across all media_ids to be used as a lower limit for subsequent incremental runs.
The start_tstamp of the last processed page_view across all media_identifiers to be used as a lower limit for subsequent incremental runs.
{% enddocs %}

{% docs col_player_current_time %}
4 changes: 2 additions & 2 deletions docs/markdown/snowplow_media_player_macro_docs.md
@@ -137,10 +137,10 @@ The query for the player_id column.
```sql
select
...,
{{ media_id_field(
{{ player_id_field(
youtube_player_id='a.contexts_com_youtube_youtube_1[0]:playerId',
media_player_id='a.contexts_org_whatwg_media_element_1[0]:htmlId::varchar'
) }} as media_id
) }} as player_id
from {{ var('snowplow__events') }} as a
```

2 changes: 1 addition & 1 deletion models/base/scratch/base_scratch.yml
@@ -44,7 +44,7 @@ models:
description: '{{ doc("col_domain_userid") }}'
- name: media_identifier
description: '{{ doc("col_media_identifier") }}'
- name: media_id
- name: player_id
description: '{{ doc("col_player_id") }}'
- name: media_label
description: '{{ doc("col_media_label") }}'
7 changes: 7 additions & 0 deletions models/media_ad_views/media_ad_views.yml
@@ -5,6 +5,13 @@ models:
+tags: "snowplow_media_player_incremental"
description: '{{ doc("table_base") }}'
columns:
- name: media_ad_view_id
description: The primary key of this table
tags:
- primary-key
tests:
- unique
- not_null
- name: media_ad_id
description: '{{ doc("col_media_ad_id") }}'
tests:
7 changes: 7 additions & 0 deletions models/media_ad_views/scratch/base_scratch.yml
@@ -4,6 +4,13 @@ models:
- name: snowplow_media_player_media_ad_views_this_run
description: '{{ doc("table_media_ad_views_this_run") }}'
columns:
- name: media_ad_view_id
description: The primary key of this table
tags:
- primary-key
tests:
- unique
- not_null
- name: media_ad_id
description: '{{ doc("col_media_ad_id") }}'
tests:
Original file line number Diff line number Diff line change
@@ -70,8 +70,11 @@ events_this_run as (

)

select *
{% if target.type in ['databricks', 'spark'] -%}
, date(prep.viewed_at) as viewed_at_date
{%- endif %}
from prep
select
{{ dbt_utils.generate_surrogate_key(['p.play_id', 'p.ad_break_id', 'p.media_ad_id']) }} as media_ad_view_id
, p.*
{% if target.type in ['databricks', 'spark'] -%}
, date(p.viewed_at) as viewed_at_date
{%- endif %}

from prep as p
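
To illustrate how the new `media_ad_view_id` is derived, here is a minimal Python sketch of a surrogate key built from `play_id`, `ad_break_id` and `media_ad_id`. It assumes the default behaviour of `dbt_utils.generate_surrogate_key` (each field cast to text, nulls replaced with a placeholder, values joined with `-`, then MD5-hashed). The function name `surrogate_key` and the sample values are hypothetical, and the SQL that dbt actually compiles varies by warehouse.

```python
import hashlib

# Assumption: placeholder used by dbt_utils for null fields in its default configuration.
NULL_PLACEHOLDER = "_dbt_utils_surrogate_key_null_"


def surrogate_key(*fields):
    """Hash the given fields into one deterministic key (illustrative only)."""
    # Cast every field to text, substituting the placeholder for nulls.
    parts = [NULL_PLACEHOLDER if f is None else str(f) for f in fields]
    # Join with '-' and hash, so the same inputs always give the same key.
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


# Hypothetical values for play_id, ad_break_id and media_ad_id.
key_1 = surrogate_key("play-001", "break-A", "ad-42")
key_2 = surrogate_key("play-001", "break-B", "ad-42")

print(key_1)
print(key_2)
```

The two keys differ because the ad break differs, so each combination of play, ad break and ad maps to its own row in the table.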
Original file line number Diff line number Diff line change
@@ -8,6 +8,7 @@ You may obtain a copy of the Snowplow Personal and Academic License Version 1.0
{{
config(
materialized= "incremental",
unique_key= 'media_ad_view_id',
upsert_date_key='last_event',
sort = 'last_event',
dist = 'media_ad_id',
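
As a rough illustration of why an incremental model benefits from a `unique_key`, the Python sketch below compares appending reprocessed rows with upserting them on `media_ad_view_id`. This is a toy model under stated assumptions, not the package's or dbt's actual merge logic: the helper names `append_only` and `upsert` and the `views` column are hypothetical.

```python
def append_only(existing, new_rows):
    """Without a unique key, reprocessed rows are simply added again."""
    return existing + new_rows


def upsert(existing, new_rows, unique_key):
    """With a unique key, a reprocessed row replaces the row it matches."""
    merged = {row[unique_key]: row for row in existing}
    for row in new_rows:
        merged[row[unique_key]] = row
    return list(merged.values())


# Hypothetical table contents before the run.
existing = [
    {"media_ad_view_id": "k1", "views": 1},
    {"media_ad_view_id": "k2", "views": 1},
]
# A later run reprocesses the row with key "k2".
rerun = [{"media_ad_view_id": "k2", "views": 2}]

without_key = append_only(existing, rerun)
with_key = upsert(existing, rerun, "media_ad_view_id")

print(len(without_key))  # the "k2" row now appears twice
print(len(with_key))     # still one row per key
```

In the toy version, the upsert keeps one row per `media_ad_view_id` and carries the latest values, which is the property the `unique` and `not_null` tests on that column check for.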
