-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make some of those sweet sweet changes #3
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚫 Not Reviewed
Reason: Nimbus doesn't know about this Repository.
Action: Please try uninstalling and [reinstalling the GitHub App](get a plan).
📚 Refer to the documentation for more details.
❓ If you need further assistance, please contact support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚫 Not Reviewed
Reason: Nimbus doesn't know about this Repository.
Action: Please try uninstalling and [reinstalling the GitHub App](get a plan).
📚 Refer to the documentation for more details.
❓ If you need further assistance, please contact support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👀 Taking a look at your changes! I'll provide feedback shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main suggestion is to remove the unnecessary empty line at the end of the file and ensure the renamed
CTE includes all the necessary columns and transformations to properly stage the date dimension data. Additionally, it's worth noting that the model appears to be incomplete, so consider adding the required transformations within the CTE.
@@ -20,7 +20,6 @@ renamed as ( | |||
d_moy as moy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The added empty line at the end of the file is unnecessary and doesn't contribute to the functionality of the model. Consider removing it to maintain a clean and concise codebase. Example:
renamed as (
)
select * from renamed
Additionally, it's worth noting that this model appears to be incomplete. The renamed
CTE is empty, which might indicate missing transformations. Consider adding the necessary columns and transformations within the CTE to properly stage the date dimension data. For example:
with source as (
select * from {{ source('tpcds_core', 'date_dim') }}
),
renamed as (
select
d_date_sk as date_key,
d_date as date,
d_year as year,
d_month as month,
d_day as day
-- Add more columns as needed
from source
)
select * from renamed
More information on staging models
Staging models are crucial in dbt projects as they serve as the first layer of transformation, cleaning and preparing source data for further use. Ensure that all necessary columns are included and properly renamed to follow your project's naming conventions.
Source: dbt Labs - Staging Layer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👀 Taking a look at your changes! I'll provide feedback shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My review focuses on improving readability and maintainability of the data models. I suggest explicitly defining the columns to include in each model, and adding a comment and a dbt test to explain the removal of the d_fy_year
column and ensure it doesn't cause issues in downstream models.
@@ -1,25 +1,10 @@ | |||
with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a select
statement to explicitly define the columns you want to include from each referenced model. This improves readability and makes it easier to track changes in the future. For example:
select
stores.store_sk,
stores.store_id,
stores.store_name,
-- Add other relevant columns from stores
date_dim.date as store_closed_date
from {{ ref('stg_tpcds_core__store') }} as stores
left join {{ ref('stg_tpcds_core__date_dim') }} as date_dim
on stores.closed_date_sk = date_dim.date_sk
This approach also allows you to rename columns if needed and ensures that only necessary columns are included in the final output.
More info
Source: dbt Labs - SQL Style Guide
@@ -20,7 +20,6 @@ renamed as ( | |||
d_moy as moy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of the d_fy_year
column might impact downstream models. Consider adding a comment explaining why this column was removed and how it affects the data model. Additionally, you may want to add a dbt test to ensure that this change doesn't cause unexpected issues in dependent models. For example:
-- Add a comment explaining the removal
-- d_fy_year column removed as it's no longer needed for current analysis
-- Add a dbt test in your schema.yml file
version: 2
models:
- name: stg_tpcds_core__date_dim
columns:
- name: date_sk
tests:
- unique
- not_null
# Add other relevant tests
tests:
- dbt_utils.equal_rowcount:
compare_model: source('tpcds_core', 'date_dim')
This test ensures that the row count in your staging model matches the source, helping to catch any potential data loss issues.
More info
Source: dbt Docs - Testing
well well well we meet again