Tell us about the problem you're trying to solve
For data that is already historic (e.g. Google Ads reports or Facebook Marketing insights), we may not need an additional _scd table.

Furthermore, with a large lookback window and frequent syncs, values within the lookback window may change slightly between each sync, resulting in a huge amount of _scd data. For example, daily syncs with a 28-day lookback can produce 28x more data in the _scd table than in the final table, and an hourly or 5-minute sync interval multiplies that number even further.
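The worst-case blow-up above is just the ratio of the lookback window to the sync interval: each row inside the window is re-synced once per sync, and if its values drift each time, every re-sync lands another row in the _scd table. A minimal sketch of that arithmetic (the helper name is hypothetical, not anything Airbyte ships):

```python
from datetime import timedelta

def scd_amplification(lookback: timedelta, sync_interval: timedelta) -> float:
    """Worst-case ratio of _scd rows to final-table rows: how many times
    each record inside the lookback window gets re-synced (and, if its
    values change every time, re-recorded as a new SCD version)."""
    return lookback / sync_interval

# Daily syncs with a 28-day lookback: up to 28x the final table.
daily = scd_amplification(timedelta(days=28), timedelta(days=1))

# Hourly syncs with the same lookback: up to 672x.
hourly = scd_amplification(timedelta(days=28), timedelta(hours=1))
```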
Describe the solution you’d like
An option in the UI to not persist the _scd table (i.e. mark it as 'ephemeral' in dbt).
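For reference, this is the kind of dbt configuration the option would toggle: an ephemeral model is compiled into a CTE in its downstream models instead of being persisted to the warehouse. A sketch only; the project and folder names below are assumptions, not the actual layout of Airbyte's generated normalization project:

```yaml
# dbt_project.yml (sketch) -- mark the generated *_scd models as ephemeral
# so they are inlined as CTEs rather than materialized as tables.
models:
  my_normalization_project:   # hypothetical project name
    scd:                      # hypothetical folder holding the _scd models
      +materialized: ephemeral
```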
Describe the alternative you’ve considered or used
Customizing the dbt project and replacing basic normalization, but that doesn't work on Kubernetes (#5091) and is also clunkier to use across multiple connections.
Additional context
I am trying to reduce the amount of excess data stored on Redshift; this plus #2227 would be great.