Commit

Add support for new Snowplow media event and entity schemas on Web and mobile (close #49)

Minor changes based on review

Use snowplow_utils.get_field macro from the field macro

Correct table aliases in macro docs

Add comments for event type field

Select all properties from atomic table in base events

Change in to equals when comparing event type

Rename _X_percent_reached columns to percent_reached_X

Change var snowplow__events to snowplow__events_table in src_base.yml

Avoid having only one argument in coalesce in macros

Fix coalesce in a few more macros

Disable mobile events and ad quartile events by default

Select all columns from atomic table in bigquery, postgres, snowflake

Fix subquery for snowflake

Add sort, dist and cluster_by config to base_sessions_lifecycle manifest

Unify macros to not add aliases

Rename session_id to session_identifier

Mention that play time counts rewatched content in docs

Protect access to properties in optional contexts and improve docs

Add check for at least one media player context

Disable media_ad_views and media_ads in case media ad context is disabled

Add session_identifier to media_ad_views

Add a note in column documentation that several properties will be null if the media session entity is not available

Handle cases when the table is empty during incremental run in media_stats

Use the timestamp of the first ad event regardless of its type for viewed_at in ad views and add a test for not null
matus-tomlein committed Sep 20, 2023
1 parent 15fe474 commit 00eae5a
Showing 84 changed files with 15,994 additions and 1,963 deletions.
51 changes: 38 additions & 13 deletions README.md
@@ -19,14 +19,27 @@ The latest version of the snowplow-media-player package supports BigQuery, Datab

 ### Requirements

-- A dataset of media-player web events from the [Snowplow JavaScript tracker][tracker-docs] must be available in the database. In order for this to happen at least one of the JavaScript based media tracking plugins need to be enabled: [Media Tracking plugin][media-tracking] or [YouTube Tracking plugin][youtube-tracking]
-- Have the [`webPage` context][webpage-context] enabled.
-- Have the [media-player event schema][media-player-event-schema] enabled.
-- Have the [media-player context schema][media-player-context-schema] enabled.
-- Depending on the plugin / intention have all the relevant contexts from below enabled:
-- in case of embedded YouTube tracking: Have the [YouTube specific context schema][youtube-specific-context-schema] enabled.
-- in case of HTML5 audio or video tracking: Have the [HTML5 media element context schema][html5-media-element-context-schema] enabled.
-- in case of HTML5 video tracking: Have the [HTML5 video element context schema][html5-video-element-context-schema] enabled.
+- A dataset of media-player events must be available in the database. You can collect media events using our plugins for the JavaScript tracker or using the iOS and Android trackers: [Media plugin][media-plugin], [HTML5 media player plugin][media-tracking], [YouTube plugin][youtube-tracking], [Vimeo plugin][vimeo-tracking] or the [iOS and Android media APIs][mobile-tracking].
+- Have the [`webPage` context][webpage-context] enabled on Web or the [screen context][screen-context] on mobile (default).
+- Have session tracking enabled on the tracker (default).
+
+The model is compatible with all versions of our media tracking APIs. These have evolved over time and may track the media events using two sets of event and context schemas:
+
+1. Version 1 media schemas:
+
+- [media-player event schema][media-player-event-schema] used for all media events.
+- [media-player context v1 schema][media-player-context-schema].
+- Depending on the plugin / intention there are player-specific contexts:
+- in case of embedded YouTube tracking: Have the [YouTube specific context schema][youtube-specific-context-schema] enabled.
+- in case of HTML5 audio or video tracking: Have the [HTML5 media element context schema][html5-media-element-context-schema] enabled.
+- in case of HTML5 video tracking: Have the [HTML5 video element context schema][html5-video-element-context-schema] enabled.
+
+2. Version 2 media schemas (preferred):
+
+- [per-event media event schemas][media-event-schemas].
+- [media-player context v2 schema][media-player-v2-context-schema].
+- optional [media-session context schema][media-session-context-schema].
+- optional [media-ad][media-ad-context-schema] and [ad break][media-ad-break-context-schema] context schemas.

 ### Installation

@@ -40,11 +53,13 @@ Please refer to the [doc site](https://docs.snowplow.io/docs/modeling-your-data/

 The package contains multiple staging models; however, the mart models are as follows:

-| Model | Description |
-|------------------------------------------|--------------------------------------------------------------------------------------------|
-| snowplow_media_player_base | A table summarizing media player events by media and pageview including impressions. |
-| snowplow_media_player_plays_by_pageview | A view summarizing media plays by media on a pageview level. |
-| snowplow_media_player_media_stats | An aggregated table of media metrics on a media_id level. |
+| Model | Description |
+|------------------------------------------|------------------------------------------------------------------------------------------------------------------|
+| snowplow_media_player_base | A table summarizing media player events by media and pageview including impressions. |
+| snowplow_media_player_plays_by_pageview | A view summarizing media plays by media on a pageview level. |
+| snowplow_media_player_media_stats | An aggregated table of media metrics on a media_id level. |
+| snowplow_media_player_media_ad_views | A view summarizing each ad viewed within a media playback (only for v2 schemas, see above). |
+| snowplow_media_player_media_ads | An aggregated table of ad metrics for each ad played within each media content (only for v2 schemas, see above). |

 Please refer to the [dbt doc site][snowplow-media-player-docs-dbt] for details on the model output tables.

@@ -77,19 +92,29 @@ limitations under the License.
 [tracker-docs]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/

 [webpage-context]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracker-setup/initialization-options/#Adding_predefined_contexts
+[screen-context]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/mobile-trackers/tracking-events/screen-tracking/#screen-view-event-and-screen-context-entity

 [media-player-event-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/media_player_event/jsonschema/1-0-0
 [media-player-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/media_player/jsonschema/1-0-0
 [youtube-specific-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.youtube/youtube/jsonschema/1-0-0
 [html5-media-element-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/org.whatwg/media_element/jsonschema/1-0-0
 [html5-video-element-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/org.whatwg/video_element/jsonschema/1-0-0
+[media-event-schemas]: https://github.com/snowplow/iglu-central/tree/master/schemas/com.snowplowanalytics.snowplow.media
+[media-player-v2-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/media_player/jsonschema/2-0-0
+[media-session-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.media/session/jsonschema/1-0-0
+[media-ad-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.media/ad/jsonschema/1-0-0
+[media-ad-break-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.media/ad_break/jsonschema/1-0-0

 [media-tracking]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/plugins/media-tracking/

 [javascript-tracker]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3

 [youtube-tracking]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/plugins/youtube-tracking/

+[media-plugin]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/browser-tracker/browser-tracker-v3-reference/plugins/media/
+[vimeo-tracking]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/javascript-trackers/browser-tracker/browser-tracker-v3-reference/plugins/vimeo-tracking/
+[mobile-tracking]: https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/mobile-trackers/tracking-events/media-tracking/
+
 [dbt-package-docs]: https://docs.getdbt.com/docs/building-a-dbt-project/package-management

 [discourse-image]: https://img.shields.io/discourse/posts?server=https%3A%2F%2Fdiscourse.snowplow.io%2F
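
The README requirements above distinguish version 1 and version 2 media schemas, and the `dbt_project.yml` changes in this commit expose a matching switch for each. For a project whose events were tracked only with the older HTML5 plugin, the override might look roughly like this. It is a minimal sketch: the variable names come from this commit, but scoping them under a `snowplow_media_player` key in the consuming project's `dbt_project.yml`, and the need to turn off the media session entity for version 1 data, are assumptions.

```yaml
# dbt_project.yml of the consuming project (a sketch, not the package's own file)
vars:
  snowplow_media_player:   # assumed package name used to scope the variables
    # Events tracked with the older plugins use the version 1 media schemas
    snowplow__enable_media_player_v1: true
    snowplow__enable_media_player_v2: false
    # Assumption: the media session entity is only tracked alongside version 2 schemas
    snowplow__enable_media_session: false
    # Player-specific contexts for HTML5 audio or video tracking
    snowplow__enable_whatwg_media: true
    snowplow__enable_whatwg_video: true
```

The commit message mentions a check for at least one media player context, so disabling both the v1 and v2 variables would presumably be rejected.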
36 changes: 34 additions & 2 deletions dbt_project.yml
@@ -47,6 +47,7 @@ vars:
 snowplow__max_session_days: 3
 snowplow__upsert_lookback_days: 30
 snowplow__allow_refresh: false
+snowplow__app_id: []

 # Variables - Contexts, filters, and logs
 # Set any of the three variables below to true if the related context schemas are enabled for your warehouse. Note that they cannot be used to filter the data:
@@ -56,15 +57,29 @@ vars:
 snowplow__enable_whatwg_media: false
 # set to true if the HTML5 video element context schema is enabled
 snowplow__enable_whatwg_video: false
-snowplow__app_id: []
+snowplow__enable_media_player_v1: false
+snowplow__enable_media_player_v2: true
+snowplow__enable_media_session: true
+snowplow__enable_media_ad: false
+snowplow__enable_media_ad_break: false
+snowplow__enable_web_events: true
+snowplow__enable_mobile_events: false
+snowplow__enable_ad_quartile_event: false

 # Variables - Warehouse Specific
 snowplow__media_player_event_context: 'com_snowplowanalytics_snowplow_media_player_event_1'
 snowplow__media_player_context: 'com_snowplowanalytics_snowplow_media_player_1'
+snowplow__media_player_v2_context: 'com_snowplowanalytics_snowplow_media_player_2'
+snowplow__media_session_context: 'com_snowplowanalytics_snowplow_media_session_1'
+snowplow__media_ad_context: 'com_snowplowanalytics_snowplow_media_ad_1'
+snowplow__media_ad_break_context: 'com_snowplowanalytics_snowplow_media_ad_break_1'
+snowplow__media_ad_quartile_event: 'com_snowplowanalytics_snowplow_media_ad_quartile_event_1'
 snowplow__youtube_context: 'com_youtube_youtube_1'
 snowplow__html5_media_element_context: 'org_whatwg_media_element_1'
 snowplow__html5_video_element_context: 'org_whatwg_video_element_1'
 snowplow__context_web_page: 'com_snowplowanalytics_snowplow_web_page_1'
+snowplow__context_screen: 'com_snowplowanalytics_mobile_screen_1'
+snowplow__context_mobile_session: 'com_snowplowanalytics_snowplow_client_session_1'
 snowplow__derived_tstamp_partitioned: true
 snowplow__query_tag: 'snowplow_dbt'
 snowplow__enable_load_tstamp: true
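
The defaults in this hunk leave mobile events, the ad entities and the ad quartile event disabled. A project that tracks these might opt in roughly as follows. This is a sketch under two assumptions: that the variables are scoped under a `snowplow_media_player` key in the consuming project's `dbt_project.yml`, and that the corresponding entities and events are actually present in the warehouse.

```yaml
# dbt_project.yml of the consuming project (a sketch, not the package's own file)
vars:
  snowplow_media_player:   # assumed package name used to scope the variables
    # Include events from the iOS and Android trackers alongside web events
    snowplow__enable_web_events: true
    snowplow__enable_mobile_events: true
    # Ad entities; per the commit message, media_ad_views and media_ads
    # are disabled when the media ad context is disabled
    snowplow__enable_media_ad: true
    snowplow__enable_media_ad_break: true
    # Only enable if ad quartile events are tracked
    snowplow__enable_ad_quartile_event: true
```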
@@ -86,7 +101,15 @@ models:
 +materialized: view
 base:
 manifest:
-+schema: 'snowplow_manifest'
++schema: "snowplow_manifest"
+bigquery:
++enabled: "{{ target.type == 'bigquery' | as_bool() }}"
+databricks:
++enabled: "{{ target.type in ['databricks', 'spark'] | as_bool() }}"
+default:
++enabled: "{{ target.type in ['redshift', 'postgres'] | as_bool() }}"
+snowflake:
++enabled: "{{ target.type == 'snowflake' | as_bool() }}"
 scratch:
 +schema: 'scratch'
 +tags: 'scratch'
@@ -114,3 +137,12 @@ models:
 +schema: 'scratch'
 +tags: 'snowplow_media_player_incremental'
 +enabled: false
+media_ad_views:
++schema: 'derived'
++tags: 'snowplow_media_player_incremental'
+scratch:
++schema: 'scratch'
++tags: 'scratch'
+media_ads:
++schema: 'derived'
++tags: 'snowplow_media_player_incremental'
24 changes: 24 additions & 0 deletions docs/markdown/snowplow_media_player_atomic_docs.md
@@ -2,6 +2,14 @@
 This context table contains the `page_view_id` associated with an event.
 {% enddocs %}

+{% docs table_screen_context %}
+This context table contains the screen view ID associated with a mobile event.
+{% enddocs %}
+
+{% docs table_client_session_context %}
+This context table contains user and session identifiers associated with mobile events.
+{% enddocs %}
+
 {% docs table_media_player_event %}
 The table specifying the media player event type (e.g. playing, seek) and the label given for the media for user friendly identification.
 {% enddocs %}
@@ -10,6 +18,22 @@ The table specifying the media player event type (e.g. playing, seek) and the la
 This context table contains a set of entities that are common between media events across platforms.
 {% enddocs %}

+{% docs table_media_session_context %}
+This context table contains context entities for media player events that track sessions of media player usage (a media session is one video playback).
+{% enddocs %}
+
+{% docs table_media_ad_context %}
+This context table contains context entities with information about the currently played ad.
+{% enddocs %}
+
+{% docs table_media_ad_break_context %}
+This context table contains context entities that are added to all ad events belonging to an ad break.
+{% enddocs %}
+
+{% docs table_media_ad_quartile_event %}
+This table contains self-describing event data fired when a quartile of an ad is reached after continuous ad playback at normal speed.
+{% enddocs %}
+
 {% docs table_youtube_context %}
 The context table with data specific to embedded YouTube videos.
 {% enddocs %}
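
Docs blocks such as these are pulled into table and column descriptions through dbt's `doc()` function in a YAML properties file. The following is a minimal sketch, assuming a hypothetical source named `atomic` and table names that mirror the context variables in `dbt_project.yml`; the package's own properties files may be organised differently.

```yaml
# A properties file such as models/base/src_base.yml (layout is illustrative)
version: 2

sources:
  - name: atomic   # hypothetical source name
    tables:
      - name: com_snowplowanalytics_snowplow_media_session_1
        # Renders the text of the table_media_session_context docs block
        description: '{{ doc("table_media_session_context") }}'
      - name: com_snowplowanalytics_snowplow_media_ad_1
        description: '{{ doc("table_media_ad_context") }}'
```

Keeping the text in docs blocks lets the same description be reused wherever the entity is documented.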