[Bug] Duplicated source when seed and model paths overlap #6102
Comments
Hey @jmg-duarte - I agree this is a bug! This sounds the same as one previously reported: #5120. We included a fix for that (#5176) in dbt-core v1.2.0 and v1.1.1. As such, I wasn't able to reproduce this while running locally, using the latest version of Is it possible that this regressed in v1.3? I would expect that this functional test should have started failing: https://github.com/dbt-labs/dbt-core/blob/main/tests/functional/configs/test_dupe_paths.py
Hey @jtcohen6, thanks for the quick response! I think this still happens if the paths resolve to the same one but the name is different. Such as:
I haven't had the chance to look too deep into this, but I've been seeing this since 1.2.2.
I have a minimal reproducible example, steps to reproduce:
seeds: ["path/to/folder"]
models: ["path/to/folder/"]
From an outside perspective, it seems the paths are not resolved before being used, so my suggestion would be to resolve them first. I've also updated the original issue to reflect this finding.
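To illustrate why the two entries above are treated as different, here is a minimal sketch using only the standard library. The variable names are made up for the example and this is not dbt's actual code; it only shows that a plain string comparison keeps both entries, while resolving them first collapses them into one.

```python
import pathlib

# Hypothetical stand-ins for the two config entries from the repro above.
seed_paths = ["path/to/folder"]
model_paths = ["path/to/folder/"]

# Comparing raw strings: the trailing slash makes the entries distinct.
raw = set(seed_paths + model_paths)

# Resolving first: both entries point at the same absolute location.
resolved = {str(pathlib.Path(p).resolve()) for p in seed_paths + model_paths}

print(len(raw))       # two distinct strings
print(len(resolved))  # one resolved path
```

The folder does not need to exist for this to run, since `Path.resolve()` is non-strict by default.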
Ah, good to know that
Turning this:
dbt-core/core/dbt/config/project.py, lines 128 to 139 in 8145eed
Into:
def _all_source_paths(
    model_paths: List[str],
    seed_paths: List[str],
    snapshot_paths: List[str],
    analysis_paths: List[str],
    macro_paths: List[str],
) -> List[str]:
    # We need to turn a list of lists into just a list, then convert to a set to
    # get only unique elements, then back to a list
    paths = chain(model_paths, seed_paths, snapshot_paths, analysis_paths, macro_paths)
    paths = map(lambda s: str(pathlib.Path(s).resolve()), paths)
    return list(set(paths))
Should work. I can try to submit a PR if this looks OK to you.
@jmg-duarte That makes sense to me! If you're up to contribute the fix, plus another test like this one, that would be very much appreciated :)
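For anyone following along, a check in the spirit of the requested test could look roughly like the sketch below. It is standalone and does not use dbt's functional test fixtures: `all_source_paths` is a hypothetical stand-in for the proposed helper, and `check_overlapping_paths_collapse` is a made-up name for the example.

```python
import pathlib
from itertools import chain
from typing import List


def all_source_paths(*path_lists: List[str]) -> List[str]:
    """Hypothetical stand-in for the proposed helper: flatten, resolve, dedupe."""
    resolved = (str(pathlib.Path(p).resolve()) for p in chain(*path_lists))
    return list(set(resolved))


def check_overlapping_paths_collapse() -> None:
    # Three spellings of the same folder, as they might appear in a project config.
    paths = all_source_paths(
        ["path/to/folder"],
        ["path/to/folder/"],
        ["./path/to/folder"],
    )
    # After resolving, only one unique path should remain.
    assert len(paths) == 1


check_overlapping_paths_collapse()
```

One thing to verify before relying on this approach: `Path.resolve()` returns absolute paths, so any code downstream that expects project-relative paths would need checking.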
Is this a new bug in dbt-core?
Current Behavior
If I have the following structure:
And the following configuration:
I will get an error like:
I am aware that I can do:
But this is not practical when there are several data providers. Furthermore, since the data in question is usually not mixed with data from other providers, we would like to keep the structure as close as possible to what it is now.
Expected Behavior
No name clash.
My intuition says that sources shouldn't be analysed when searching for seeds, but there may be a reason behind this.
Steps To Reproduce
Described in current behavior.
Relevant log output
Described in current behavior.
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
No response