v0.11.0 release branch (#34)
* document default LEAD columns (#32)

* changes

* docs

* changelog

* update source package ref

* Update CHANGELOG.md

Co-authored-by: fivetran-catfritz <[email protected]>

* changelog

---------

Co-authored-by: fivetran-catfritz <[email protected]>

* Bug/databricks syntax (#33)

* changes

* docs

* changelog

* update source package ref

* Update CHANGELOG.md

Co-authored-by: fivetran-catfritz <[email protected]>

* changelog

* bug/databricks-syntax

* update pkg

* update cte

* remove ignore nulls

* updates

* updates & regen docs

* update yml

---------

Co-authored-by: Jamie Rodriguez <[email protected]>

* Update packages.yml

---------

Co-authored-by: Jamie Rodriguez <[email protected]>
fivetran-catfritz and fivetran-jamie authored Apr 2, 2024
1 parent f3a58d7 commit 0459112
Showing 18 changed files with 267 additions and 134 deletions.
4 changes: 2 additions & 2 deletions .buildkite/scripts/run_models.sh
@@ -20,7 +20,7 @@ dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
## UPDATE FOR VARS HERE, IF NO VARS, PLEASE REMOVE
dbt run --vars '{marketo__enable_campaigns: true, marketo__enable_programs: true, marketo__activity_delete_lead_enabled: false}' --target "$db" --full-refresh
dbt test --target "$db"
dbt run --vars '{lead_history_columns: ['first_name', 'lead_status', 'urgency', 'priority', 'relative_score', 'relative_urgency', 'demographic_score_marketing', 'behavior_score_marketing'], marketo__enable_campaigns: true, marketo__enable_programs: true, marketo__activity_delete_lead_enabled: false}' --target "$db" --full-refresh
dbt test --target "$db" --vars '{lead_history_columns: ['first_name', 'lead_status', 'urgency', 'priority', 'relative_score', 'relative_urgency', 'demographic_score_marketing', 'behavior_score_marketing'], marketo__enable_campaigns: true, marketo__enable_programs: true, marketo__activity_delete_lead_enabled: false}' --target "$db"
### END VARS CHUNK, REMOVE IF NOT USING
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
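The `--vars` arguments above nest single quotes inside a single-quoted string. A minimal sketch of how a POSIX shell would tokenize such an argument, using Python's `shlex` as an approximation of the shell and a shortened, hypothetical version of the command: the inner quotes close and reopen the outer quoting, so the column names reach dbt as bare words, which YAML still reads as plain strings.

```python
import shlex

# Shortened, illustrative version of the command in run_models.sh.
command = """dbt run --vars '{lead_history_columns: ['first_name', 'lead_status'], marketo__enable_campaigns: true}' --target "$db" --full-refresh"""

# shlex.split follows POSIX quoting rules: adjacent quoted and unquoted
# fragments are concatenated into a single token.
tokens = shlex.split(command)
vars_argument = tokens[tokens.index("--vars") + 1]

print(vars_argument)
# {lead_history_columns: [first_name, lead_status], marketo__enable_campaigns: true}
```

Note that `shlex` does not expand `$db`; it only shows where the token boundaries fall.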
43 changes: 11 additions & 32 deletions .github/PULL_REQUEST_TEMPLATE/maintainer_pull_request_template.md
@@ -4,48 +4,27 @@
**This PR will result in the following new package version:**
<!--- Please add details around your decision for breaking vs non-breaking version upgrade. If this is a breaking change, were backwards-compatible options explored? -->

**Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:**
**Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:**
<!--- Copy/paste the CHANGELOG for this version below. -->

## PR Checklist
### Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
- [ ] dbt compile
- [ ] dbt run --full-refresh
- [ ] dbt run
- [ ] dbt test
- [ ] dbt run --vars (if applicable)
- [ ] dbt run --full-refresh && dbt test
- [ ] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:
- [ ] The appropriate issue has been linked and tagged
- [ ] You are assigned to the corresponding issue and this PR
- [ ] The appropriate issue has been linked, tagged, and properly assigned
- [ ] All necessary documentation and version upgrades have been applied
<!--- Be sure to update the package version in the dbt_project.yml, integration_tests/dbt_project.yml, and README if necessary. -->
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)
- [ ] BuildKite integration tests are passing
- [ ] Detailed validation steps have been provided below

### Detailed Validation
Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":
- [ ] You have validated these changes and assure this PR will address the respective Issue/Feature.
- [ ] You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
- [ ] You have provided details below around the validation steps performed to gain confidence in these changes.
Please share any and all of your validation steps:
<!--- Provide the steps you took to validate your changes below. -->

### Standard Updates
Please acknowledge that your PR contains the following standard updates:
- Package versioning has been appropriately indexed in the following locations:
- [ ] indexed within dbt_project.yml
- [ ] indexed within integration_tests/dbt_project.yml
- [ ] CHANGELOG has individual entries for each respective change in this PR
<!--- If there is a parallel upstream change, remember to reference the corresponding CHANGELOG as an individual entry. -->
- [ ] README updates have been applied (if applicable)
<!--- Remember to check the following README locations for common updates. →
<!--- Suggested install range (needed for breaking changes) →
<!--- Dependency matrix is appropriately updated (if applicable) →
<!--- New variable documentation (if applicable) -->
- [ ] DECISIONLOG updates have been updated (if applicable)
- [ ] Appropriate yml documentation has been added (if applicable)

### dbt Docs
Please acknowledge that after the above were all completed the below were applied to your branch:
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)

### If you had to summarize this PR in an emoji, which would it be?
<!--- For a complete list of markdown compatible emojis check out this git repo (https://gist.github.com/rxaviers/7360908) -->
:dancer:
:dancer:
13 changes: 13 additions & 0 deletions .github/workflows/auto-release.yml
@@ -0,0 +1,13 @@
name: 'auto release'
on:
  pull_request:
    types:
      - closed
    branches:
      - main

jobs:
  call-workflow-passing-data:
    if: github.event.pull_request.merged
    uses: fivetran/dbt_package_automations/.github/workflows/auto-release.yml@main
    secrets: inherit
2 changes: 1 addition & 1 deletion .gitignore
@@ -2,5 +2,5 @@
target/
dbt_modules/
logs/

package-lock.yml
dbt_packages/
51 changes: 51 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,54 @@
# dbt_marketo v0.11.0

[PR #33](https://github.com/fivetran/dbt_marketo/pull/33) includes the following updates:
## Bug Fix
- Removed the use of `ignore nulls` statements in `marketo__lead_history` and `marketo__change_data_scd`, which was incompatible with PostgreSQL and Databricks Runtime. The logic has been updated with a new approach but produces the same results as before.
- Updated model `marketo__change_data_pivot` to use the `activity_id` as a tie-breaker to remove randomness when ordering events having the same `activity_timestamp`.
- Previously if two events happened at the same timestamp, results would be inconsistent, which propagated to downstream models. Now, this model will produce consistent results.
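A minimal sketch of the tie-breaker described above, run against an in-memory SQLite database rather than a warehouse. The window definition follows the `marketo__change_data_pivot` change in this commit; the sample rows and the `new_value` column are invented for illustration, and SQLite 3.25 or later is assumed for window-function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table joined ("
    "activity_id integer, lead_id integer, primary_attribute_value_id integer, "
    "activity_timestamp text, new_value text)"
)
conn.executemany(
    "insert into joined values (?, ?, ?, ?, ?)",
    [
        (101, 1, 7, "2024-04-01 09:00:00", "Open"),
        (102, 1, 7, "2024-04-01 09:00:00", "Qualified"),  # same timestamp as 101
        (103, 1, 7, "2024-04-01 12:30:00", "Closed"),
    ],
)

# With only `order by activity_timestamp asc`, rows 101 and 102 tie and the
# engine may number them in either order. Adding `activity_id desc` makes the
# numbering deterministic.
rows = conn.execute(
    """
    select
        activity_id,
        row_number() over (
            partition by date(activity_timestamp), lead_id, primary_attribute_value_id
            order by activity_timestamp asc, activity_id desc
        ) as row_num
    from joined
    order by row_num
    """
).fetchall()

print(rows)  # [(102, 1), (101, 2), (103, 3)]
```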

## Under the hood
- Added additional variable configurations to integration tests to account for a wider range of situations.

---

[PR #32](https://github.com/fivetran/dbt_marketo/pull/32) and Marketo Source [PR #35](https://github.com/fivetran/dbt_marketo_source/pull/35) include the following updates:

## Feature Updates (includes 🚨 breaking changes 🚨)
- Ensures that `stg_marketo__lead` (and therefore `marketo__leads`) has and documents the below columns, all [standard](https://developers.marketo.com/rest-api/lead-database/fields/list-of-standard-fields/) fields from Marketo. Previously, the package persisted all fields found in your `LEAD` source table but only _ensured_ that the `id`, `created_at`, `updated_at`, `email`, `first_name`, `last_name`, and `_fivetran_synced` fields were included. If any of the following default columns are missing from your `LEAD` table, `stg_marketo__lead` will create a NULL version with the proper data type:
- `phone`
- `main_phone`
- `mobile_phone`
- `company`
- `inferred_company`
- `address_lead`
- `address`
- `city`
- `state`
- `state_code`
- `country`
- `country_code`
- `postal_code`
- `billing_street`
- `billing_city`
- `billing_state`
- `billing_state_code`
- `billing_country`
- `billing_country_code`
- `billing_postal_code`
- `inferred_city`
- `inferred_state_region`
- `inferred_country`
- `inferred_postal_code`
- `inferred_phone_area_code`
- `anonymous_ip`
- `unsubscribed` -> aliased as `is_unsubscribed` (🚨 breaking change 🚨)
- `email_invalid` -> aliased as `is_email_invalid` (🚨 breaking change 🚨)
- `do_not_call`
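The "create a NULL version with the proper data type" behavior can be sketched as follows. This is an illustration of the idea only, not the package's macro: the function name `fill_missing_columns` is hypothetical, and the data types passed in are assumptions, since the actual types are not shown in this diff.

```python
def fill_missing_columns(source_columns, expected_columns):
    """Build a select list that keeps columns found in the source table and
    substitutes a typed NULL for every expected column that is missing."""
    present = {name.lower() for name in source_columns}
    select_list = []
    for name, data_type in expected_columns:
        if name.lower() in present:
            select_list.append(name)
        else:
            select_list.append(f"cast(null as {data_type}) as {name}")
    return select_list


# Hypothetical LEAD table that has `phone` but lacks the other two columns.
select_list = fill_missing_columns(
    source_columns=["id", "email", "phone"],
    expected_columns=[
        ("phone", "string"),         # assumed type
        ("company", "string"),       # assumed type
        ("do_not_call", "boolean"),  # assumed type
    ],
)
print(select_list)
# ['phone', 'cast(null as string) as company', 'cast(null as boolean) as do_not_call']
```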

## Under the Hood
- Updated the maintainer PR template to resemble the most up-to-date format.
- Included auto-releaser GitHub Actions workflow to automate future releases.

# dbt_marketo v0.10.0

## 🚨 Breaking Changes 🚨 (recommend --full-refresh):
6 changes: 3 additions & 3 deletions README.md
@@ -59,7 +59,7 @@ Include the following Marketo package version in your `packages.yml` file.
```yml
packages:
- package: fivetran/marketo
version: [">=0.10.0", "<0.11.0"]
version: [">=0.11.0", "<0.12.0"]
```
Do **NOT** include the `marketo_source` package in this file. The transformation package itself has a dependency on it and will install the source package as well.
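To see which releases the range `[">=0.11.0", "<0.12.0"]` admits, here is a minimal sketch of a range check. It is not dbt's own resolver: the helper names are hypothetical, and it assumes plain `major.minor.patch` versions with no pre-release suffixes.

```python
import operator

# Two-character operators are listed first so ">=" is not mistaken for ">".
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(version):
    """Turn '0.11.0' into (0, 11, 0) so tuples compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def satisfies(version, specs):
    """Return True when the version meets every constraint in specs."""
    current = parse_version(version)
    for spec in specs:
        for symbol, compare in OPERATORS:
            if spec.startswith(symbol):
                if not compare(current, parse_version(spec[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported constraint: {spec}")
    return True


new_range = [">=0.11.0", "<0.12.0"]
print(satisfies("0.11.0", new_range))  # True
print(satisfies("0.10.0", new_range))  # False
print(satisfies("0.12.0", new_range))  # False
```

Under this range, a project pinned to the previous `[">=0.10.0", "<0.11.0"]` range does not pick up v0.11.0 until the pin is updated.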

@@ -89,7 +89,7 @@ vars:
marketo__activity_delete_lead_enabled: false # Disable if you do not have the activity_delete_lead table
```
## (Optional) Step 5: Additional configurations
<details><summary>Expand for details</summary>
<details open><summary>Expand/Collapse details</summary>
<br>

### Passing Through Additional Columns
@@ -152,7 +152,7 @@ This dbt package is dependent on the following dbt packages. Please be aware tha
```yml
packages:
- package: fivetran/marketo_source
version: [">=0.10.0", "<0.11.0"]
version: [">=0.11.0", "<0.12.0"]
- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'marketo'
version: '0.10.0'
version: '0.11.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
on-run-start: "{{ lead_history_columns_warning() }}"
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

24 changes: 12 additions & 12 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion integration_tests/dbt_project.yml
@@ -1,5 +1,5 @@
name: 'marketo_integration_tests'
version: '0.10.0'
version: '0.11.0'
profile: 'integration_tests'
config-version: 2

36 changes: 0 additions & 36 deletions macros/dummy_coalesce_value.sql

This file was deleted.

2 changes: 1 addition & 1 deletion models/intermediate/marketo__change_data_pivot.sql
@@ -43,7 +43,7 @@ with change_data as (
*,
row_number() over (
partition by cast(activity_timestamp as date), lead_id, primary_attribute_value_id
order by activity_timestamp asc
order by activity_timestamp asc, activity_id desc -- If events arrive at the exact same time, we rely on activity_id to determine the order
) as row_num
from joined

81 changes: 49 additions & 32 deletions models/intermediate/marketo__change_data_scd.sql
@@ -9,9 +9,14 @@
}}

{%- set lead_columns = adapter.get_columns_in_relation(ref('int_marketo__lead')) -%}
{% set filtered_lead_columns = [] %}
{% for col in lead_columns if col.name|lower not in ['lead_id','_fivetran_synced'] and col.name|lower in var('lead_history_columns') %}
{% do filtered_lead_columns.append(col) %}
{% endfor %}

{%- set change_data_columns = adapter.get_columns_in_relation(ref('marketo__change_data_pivot')) -%}
{%- set change_data_columns_xf = change_data_columns|map(attribute='name')|list %}

with change_data as (

select *
@@ -45,51 +50,63 @@ with change_data as (
)
}}

), field_partitions as (

select
coalesce(unioned.date_day, current_date) as valid_to,
unioned.date_day,
unioned.lead_id

{% for col in filtered_lead_columns %}
{% if col.name not in change_data_columns_xf %}
, unioned.{{ col.name }}
, null as {{ col.name }}_partition

{% else %}
, unioned.{{ col.name }}
, sum(case when unioned.{{ col.name }} is null and not coalesce(details.{{ col.name }}, true) then 0
else 1 end) over (
partition by unioned.lead_id
order by coalesce(unioned.date_day, current_date) desc
rows between unbounded preceding and current row)
as {{ col.name }}_partition

{% endif %}
{% endfor %}

from unioned
left join details
on unioned.date_day = details.date_day
and unioned.lead_id = details.lead_id

), today as (

-- For each day where a change occurred for each lead, we backfill the values from the subsequent change,
-- going back in time. In order to account for changes that occur to or from null values, we need to do a coalesce
-- with dummy values, which we nullif() at the end.
-- For each day where a change occurred for each lead, we backfill the values from the subsequent change, going back in time.
-- The 'details' table is joined in for exactly this purpose. It tells us, even if a value is null, whether that null
-- value is because no change occurred on that day, or because there was a change and the change involved the null value.

select
coalesce(unioned.date_day, current_date) as valid_to,
unioned.lead_id
{% for col in lead_columns if col.name|lower not in ['lead_id','_fivetran_synced'] and col.name|lower in var('lead_history_columns') %}
,
{% if col.name not in change_data_columns_xf %}
field_partitions.valid_to,
field_partitions.lead_id

{% for col in filtered_lead_columns %}
{% if col.name not in change_data_columns_xf %}
{# If the column does not exist in the change data, grab the value from the current state of the record. #}
last_value(unioned.{{ col.name }}) over (
partition by unioned.lead_id
order by unioned.date_day asc
, last_value(field_partitions.{{ col.name }}) over (
partition by field_partitions.lead_id
order by field_partitions.date_day asc
rows between unbounded preceding and current row) as {{ col.name }}

{% else %}

case

{# if there was a change on the day, as specified by the details table, use that value #}
when coalesce(details.{{ col.name }}, True) then unioned.{{ col.name }}

{# otherwise, grab the most recent value from a day where a change did occur #}
else nullif(

first_value(case when coalesce(details.{{ col.name }}, True) then coalesce(unioned.{{ col.name}}, {{ fivetran_utils.dummy_coalesce_value(col) }}) end ignore nulls) over (
partition by unioned.lead_id
order by coalesce(unioned.date_day, current_date) asc
rows between 1 following and unbounded following),

{{ fivetran_utils.dummy_coalesce_value(col) }})
end as {{ col.name }}
, first_value(field_partitions.{{ col.name }}) over (
partition by field_partitions.lead_id, field_partitions.{{ col.name }}_partition
order by field_partitions.valid_to desc
rows between unbounded preceding and current row)
as {{ col.name }}
{% endif %}
{% endfor %}

from unioned
left join details
on unioned.date_day = details.date_day
and unioned.lead_id = details.lead_id
from field_partitions

), surrogate_key as (
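The `field_partitions` change above replaces `first_value(... ignore nulls)` with a two-step pattern: a running count that starts a new group at every real change, followed by `first_value` within each group. A minimal sketch of that pattern on an in-memory SQLite database (3.25 or later assumed); the `daily` table, its `changed` flag (standing in for the `details` join), and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table daily (lead_id integer, valid_to text, lead_status text, changed integer)"
)
conn.executemany(
    "insert into daily values (?, ?, ?, ?)",
    [
        (1, "2024-04-05", "Customer", 1),
        (1, "2024-04-04", None, 0),  # no change that day: should inherit a value
        (1, "2024-04-03", None, 1),  # a real change *to* null: must stay null
        (1, "2024-04-02", None, 0),  # no change that day: inherits the null
        (1, "2024-04-01", "Open", 1),
    ],
)

rows = conn.execute(
    """
    with field_partitions as (
        select
            lead_id,
            valid_to,
            lead_status,
            -- Rows with no change add 0, so they share a group number with
            -- the nearest changed row that sorts before them (a later date).
            sum(case when lead_status is null and not changed then 0 else 1 end) over (
                partition by lead_id
                order by valid_to desc
                rows between unbounded preceding and current row
            ) as lead_status_partition
        from daily
    )
    select
        valid_to,
        -- The first row of each group is the changed row, so its value
        -- (null included) is spread across the whole group.
        first_value(lead_status) over (
            partition by lead_id, lead_status_partition
            order by valid_to desc
            rows between unbounded preceding and current row
        ) as lead_status
    from field_partitions
    order by valid_to desc
    """
).fetchall()

for row in rows:
    print(row)
```

Because the grouping column carries the "was there a change" information, no dummy coalesce value is needed to distinguish a genuine null from a missing one, which matches the removal of `macros/dummy_coalesce_value.sql` in this commit.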
