[Fix] respect project root when loading seeds #8762

Merged: 9 commits, Oct 10, 2023

Conversation

MichelleArk (Contributor) commented Oct 3, 2023

resolves #6875

Problem

`dbt seed` fails when a partial parse file is reused across dbt invocations of the same project that run from different root directories.

Solution

When loading seeds, use the project root instead of the root path stored on the model node (which may come from a prior dbt run of the same project in a different directory).

Risk assessment (for backporting):

  • Added extra safety to preserve previous behaviour: first try model.root_path, and if the file doesn't exist there, fall back to the project root.
  • no changes to manifest schema
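
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not dbt-core's actual code: the function name `resolve_seed_path` and its parameters are hypothetical, and the real logic lives in `load_agate_table`.

```python
import os
from typing import Optional


def resolve_seed_path(
    stored_root_path: Optional[str],
    project_root: str,
    original_file_path: str,
) -> str:
    """Return the path to a seed file (hypothetical helper, for illustration).

    Prefer the root path stored on the node, which preserves the previous
    behaviour. If the file does not exist there (for example, because the
    stored path comes from a run in a different directory), fall back to
    the current project root.
    """
    if stored_root_path:
        candidate = os.path.join(stored_root_path, original_file_path)
        if os.path.exists(candidate):
            return candidate
    # The stored root path is missing or stale: use the current project root.
    return os.path.join(project_root, original_file_path)
```

With a stale stored root, the call returns the path under the current project root, so the seed file is found regardless of where the partial parse file was produced.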

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@cla-bot cla-bot bot added the cla:yes label Oct 3, 2023
github-actions bot (Contributor) commented Oct 3, 2023

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

codecov bot commented Oct 3, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (0c965c8) 86.62% compared to head (f778dec) 86.50%.
Report is 15 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8762      +/-   ##
==========================================
- Coverage   86.62%   86.50%   -0.13%     
==========================================
  Files         176      176              
  Lines       25772    25861      +89     
==========================================
+ Hits        22325    22371      +46     
- Misses       3447     3490      +43     
Flag          Coverage Δ
integration   83.25% <71.42%> (-0.14%) ⬇️
unit          65.06% <14.28%> (-0.08%) ⬇️


Files                           Coverage Δ
core/dbt/tests/util.py          86.59% <100.00%> (+0.09%) ⬆️
core/dbt/context/providers.py   88.71% <60.00%> (-0.24%) ⬇️

... and 14 files with indirect coverage changes


core/dbt/parser/seeds.py (outdated, resolved)
@@ -875,6 +875,7 @@ class SeedNode(ParsedNode):  # No SQLDefaults!
     config: SeedConfig = field(default_factory=SeedConfig)
     # seeds need the root_path because the contents are not loaded initially
     # and we need the root_path to load the seed later
+    # TODO: remove root_path as it is unused, and instead computed dynamically in load_agate_table
Contributor commented:

What do you think of:

  • remove root_path for v1.7 (manifest v11)
  • backport just the load_agate_table change to earlier versions

MichelleArk (Contributor, Author) replied:

Yep! This is what I was thinking as well. I left this as a TODO so this spike could be merged + backported as-is.

MichelleArk (Contributor, Author) replied:

I've opted to preserve the existing behaviour for getting the seed path, which leverages root_path, to mitigate the risk of regression. So it's not quite in a state yet where this TODO makes sense, as root_path is still used in load_agate_table.

@MichelleArk MichelleArk self-assigned this Oct 5, 2023
@MichelleArk MichelleArk changed the title [Spike] respect project root when loading seeds [Fix] respect project root when loading seeds Oct 6, 2023
@MichelleArk MichelleArk force-pushed the spike/portable-partial-parsing-seeds branch from 5ad4646 to 8e46d49 Compare October 6, 2023 17:44
@MichelleArk MichelleArk force-pushed the spike/portable-partial-parsing-seeds branch from 8e46d49 to d316e77 Compare October 6, 2023 17:45
@MichelleArk MichelleArk force-pushed the spike/portable-partial-parsing-seeds branch 2 times, most recently from 5a8ba41 to 3b23a51 Compare October 9, 2023 19:40
@MichelleArk MichelleArk force-pushed the spike/portable-partial-parsing-seeds branch from 3b23a51 to f778dec Compare October 9, 2023 22:10
@MichelleArk MichelleArk marked this pull request as ready for review October 9, 2023 22:38
@MichelleArk MichelleArk requested a review from a team as a code owner October 9, 2023 22:38
@MichelleArk MichelleArk requested review from martynydbt and gshank and removed request for martynydbt October 9, 2023 22:38
@MichelleArk MichelleArk merged commit 964e0e4 into main Oct 10, 2023
@MichelleArk MichelleArk deleted the spike/portable-partial-parsing-seeds branch October 10, 2023 15:17
github-actions bot (Contributor) commented:

The backport to 1.6.latest failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.6.latest 1.6.latest
# Navigate to the new working tree
cd .worktrees/backport-1.6.latest
# Create a new branch
git switch --create backport-8762-to-1.6.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 964e0e4e8ad50a917074aabc8493b210a84e0258
# Push it to GitHub
git push --set-upstream origin backport-8762-to-1.6.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.6.latest

Then, create a pull request where the base branch is 1.6.latest and the compare/head branch is backport-8762-to-1.6.latest.

MichelleArk added 5 commits that referenced this pull request Oct 10, 2023
MichelleArk (Contributor, Author) commented Oct 10, 2023

Backports:
1.6: #8804
1.5: #8805
1.4: #8806
1.3: #8810
1.2: #8814 (draft)
1.1: #8815 (draft)
1.0: #8816 (draft)

MichelleArk added 3 commits that referenced this pull request Oct 10, 2023
aranke pushed a commit that referenced this pull request Oct 11, 2023
aranke added a commit that referenced this pull request Oct 11, 2023
aranke pushed a commit that referenced this pull request Oct 11, 2023
tatiana added a commit to astronomer/astronomer-cosmos that referenced this pull request May 10, 2024
After partial parsing was enabled in PR #904, testing showed that
seed files could not be located, because the partial parse file
contained a stale `root_path` from previous command runs.
This issue is observed on some earlier versions of dbt-core, such as
`1.5.4` and `1.6.5`, but not on more recent versions such as `1.5.8`,
`1.6.6` and `1.7.0`. I suspect that PR dbt-labs/dbt-core#8762 is
the fix, and that it was backported to later releases of `1.5.x`
and `1.6.x`.

However, irrespective of the dbt-core version, this PR attempts to
correct the `root_path` in the partial parse file by replacing it
with the directory where the project files are located. This
ensures that the feature runs correctly on both older and newer
versions of dbt-core.

closes: #937

---------

Co-authored-by: Tatiana Al-Chueyr <[email protected]>
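
The `root_path` correction described in the commit message above might look roughly like the sketch below. It is an illustration only, not the code from that commit: the function name `patch_stale_root_paths` is hypothetical, and it assumes the partial parse file has already been deserialized into a dict whose `"nodes"` mapping holds node dicts that may carry a `root_path` key. Reading and writing the file itself is left out.

```python
from typing import Any, Dict


def patch_stale_root_paths(manifest: Dict[str, Any], project_dir: str) -> int:
    """Point every stored root_path at the current project directory.

    Hypothetical helper: assumes `manifest` is the deserialized partial
    parse content with a "nodes" mapping. Mutates `manifest` in place and
    returns the number of nodes that were changed.
    """
    patched = 0
    for node in manifest.get("nodes", {}).values():
        # Only touch nodes that actually store a root_path (e.g. seeds).
        if "root_path" in node and node["root_path"] != project_dir:
            node["root_path"] = project_dir
            patched += 1
    return patched
```

Returning the count of changed nodes lets a caller skip rewriting the file when nothing was stale.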
@aranke aranke mentioned this pull request Jul 12, 2024
5 tasks
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
Development

Successfully merging this pull request may close these issues.

[CT-2042] Enable seeds to be handled from stored manifest data
3 participants