[CT-750] [Bug] Raise helpful error if user provides "dbt" as project name during dbt init #5379

Closed
1 task done
philippefutureboy opened this issue Jun 15, 2022 · 4 comments · Fixed by #5620
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors! init Issues related to initializing the dbt starter project

Comments

@philippefutureboy

philippefutureboy commented Jun 15, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Given the following setup

$ poetry add dbt-core dbt-bigquery
$ dbt init
20:28:33  Running with dbt=1.1.1
Enter a name for your project (letters, digits, underscore): dbt
Which database would you like to use?
[1] bigquery

(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)

Enter a number: 1
[1] oauth
[2] service_account
Desired authentication method option (enter a number): 2
keyfile (/path/to/bigquery/keyfile.json): <path>
project (GCP project id): <project>
dataset (the name of your dbt dataset): dbt_default
threads (1 or more): 1
job_execution_timeout_seconds [300]: 
[1] US
[2] EU
Desired location option (enter a number): 1
20:29:42  Profile dbt written to /Users/<username>/.dbt/profiles.yml using target's profile_template.yml and your supplied values. Run 'dbt debug' to validate the connection.
20:29:42  
Your new dbt project "dbt" was created!

For more information on how to configure the profiles.yml file,
please consult the dbt documentation here:

  https://docs.getdbt.com/docs/configure-your-profile

One more thing:

Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:

  https://community.getdbt.com/

Happy modeling!

$ dbt run --project-dir /path/to/folder/named/dbt

I get the following output:

21:05:16  Running with dbt=1.1.1
21:05:16  Encountered an error:
Compilation Error
  dbt found more than one package with the name "dbt" included in this project. Package names must be unique in a project. Please rename one of these packages.

This was pretty confusing at first, until I figured out the issue was the top-level folder name.
As a first-time user of dbt, that's a kind of weird way to get this relationship started 🥴

Expected Behavior

Allow the top-level folder to be named dbt :)
Since we are using dbt in the context of Apache Airflow, I think it's an acceptable and reasonable way to structure the folder hierarchy:

dags/
   dag_helpers/
   dbt/
   utils/
   extract.py
   load.py
   transform.py
   export.py

Steps To Reproduce

No response

Relevant log output

No response

Environment

- OS: macOS
- Python: python 3.8.7, pip 21.3.1
- dbt: 1.1.1

What database are you using dbt with?

bigquery

Additional Context

I searched but could not find an equivalent issue. Feel free to mark as duplicate if I missed something.
Related: #2029
For the time being I'll just rename the folder something else.

@philippefutureboy philippefutureboy added bug Something isn't working triage labels Jun 15, 2022
@github-actions github-actions bot changed the title [Bug] Allow naming project folder "dbt" [CT-750] [Bug] Allow naming project folder "dbt" Jun 15, 2022
@jtcohen6
Contributor

@philippefutureboy Thanks for opening!

As a first-time user of dbt, that's a kind of weird way to get this relationship started 🥴

Fair :)

For good reason, I don't think we're going to be able to support dbt (or dbt_bigquery, for that matter). The problem, as you found, isn't with the name of the file directory, but with the name field in dbt_project.yml, which dbt uses to uniquely identify each project/package namespace.
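To illustrate why the name field matters more than the directory name, here is a minimal sketch of a uniqueness check over package names. It is not dbt's actual loading code; `find_duplicate_package_names` and the `loaded` list are hypothetical and exist only to show how a user project named dbt collides with dbt's own internal project.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_package_names(names: Iterable[str]) -> List[str]:
    """Return every package name that appears more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical set of loaded packages: dbt's internal project, the adapter's
# package, and a user project whose dbt_project.yml also declares `name: dbt`.
loaded = ["dbt", "dbt_bigquery", "dbt"]
duplicates = find_duplicate_package_names(loaded)
print(duplicates)
```

Renaming the folder alone would leave `duplicates` non-empty in this sketch; only changing the declared project name removes the collision.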

A better behavior here would be to raise that validation error earlier, during the dbt init process. At simplest, that would look like an adjustment to the logic in this method:

    def get_valid_project_name(self) -> str:
        """Returns a valid project name, either from CLI arg or user prompt."""
        name = self.args.project_name
        while not ProjectName.is_valid(name):
            if name:
                click.echo(name + " is not a valid project name.")
            name = click.prompt("Enter a name for your project (letters, digits, underscore)")
        return name

It should be fairly simple to check that the user-provided project name doesn't match dbt, a.k.a. GLOBAL_PROJECT_NAME, via:

from dbt.include.global_project import PROJECT_NAME as GLOBAL_PROJECT_NAME

It would be a bit trickier to check that the user-provided name also doesn't collide with one of the installed plugins' reserved package names (dbt_snowflake, dbt_bigquery, etc.). There is a method for that here; it would just require a bit more plumbing to get right. Something in the vein of:

    def get_valid_project_name(self) -> str:
        """Returns a valid project name, either from CLI arg or user prompt."""
        name = self.args.project_name
        # Names reserved by dbt itself and by the installed adapter plugins.
        internal_package_names = {GLOBAL_PROJECT_NAME}
        available_adapters = list(_get_adapter_plugin_names())
        for adapter_name in available_adapters:
            # NOTE: if get_adapter_package_names returns a list of names,
            # this call needs to be .update(...) rather than .add(...).
            internal_package_names.add(get_adapter_package_names(adapter_name))
        # Keep prompting while the name is invalid OR reserved.
        while not ProjectName.is_valid(name) or name in internal_package_names:
            if name:
                click.echo(name + " is not a valid project name.")
            name = click.prompt("Enter a name for your project (letters, digits, underscore)")
        return name
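For anyone who wants to try the loop condition without a dbt checkout, here is a standalone sketch of the same idea. Everything in it is a stand-in: `is_valid_name` approximates `ProjectName.is_valid` with a regex guessed from the prompt text ("letters, digits, underscore"), the reserved names are passed in rather than looked up, and `prompt` is an injected callable instead of `click.prompt`, so the real behavior in dbt-core may differ.

```python
import re
from typing import Callable, Iterable, Optional

# Assumed rule, inferred from the prompt text; dbt's real validation may differ.
_NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def is_valid_name(name: Optional[str]) -> bool:
    """Hypothetical stand-in for ProjectName.is_valid."""
    return bool(name) and _NAME_PATTERN.match(name) is not None


def get_valid_project_name(
    initial: Optional[str],
    reserved: Iterable[str],
    prompt: Callable[[str], str],
) -> str:
    """Ask again until the name is both well-formed and not reserved."""
    name = initial
    reserved_names = set(reserved)
    while not is_valid_name(name) or name in reserved_names:
        if name:
            print(f"{name} is not a valid project name.")
        name = prompt("Enter a name for your project (letters, digits, underscore)")
    return name


# Simulated user input: a reserved name, a malformed name, then a good one.
answers = iter(["dbt_bigquery", "my project", "analytics"])
chosen = get_valid_project_name("dbt", {"dbt", "dbt_bigquery"}, lambda _msg: next(answers))
print(chosen)
```

Injecting `prompt` keeps the loop testable without a terminal, which may also be convenient when extending the existing tests for dbt init.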

There are pretty thorough tests for dbt init in test_init, and a clear example of a similar change made in #4536.

I'm going to mark this one as good_first_issue for a community contributor. Would you be interested?

@jtcohen6 jtcohen6 added good_first_issue Straightforward + self-contained changes, good for new contributors! init Issues related to initializing the dbt starter project and removed triage labels Jun 16, 2022
@jtcohen6 jtcohen6 changed the title [CT-750] [Bug] Allow naming project folder "dbt" [CT-750] [Bug] Raise helpful error if user provides "dbt" as project name during dbt init Jun 16, 2022
@philippefutureboy
Author

Hi @jtcohen6 !

That's a great explainer! I didn't even realize the problem was the project name rather than the folder name! For me that's great news, because I care more about the folder structure than the project name, tbh :)

I'm going to mark this one as good_first_issue for a community contributor. Would you be interested?

I'm going to have to say no. I would love to, but I know I can't keep that promise. Thanks for asking!

Hopefully this will be a great entrypoint for a new prolific contributor :)

Cheers! 🥂

@Artic42

Artic42 commented Jun 19, 2022

Hi @jtcohen6,

I can take on this issue. I was looking for something to start dipping my toes into the project.

@Goodkat
Contributor

Goodkat commented Jul 23, 2022

Hi @jtcohen6,

The following line raises this error when calling dbt init: "No plugin found for postgres"

internal_package_names.add(get_adapter_package_names(adapter_name))

It looks like get_adapter_package_names cannot find the package name, even though adapter_name is correct ("postgres").
I think the reason is that the self.plugins list is still empty at the point of project initialization.
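One possible way around the empty plugin registry is to build the reserved names from the adapter names alone, without asking the registry at all. The sketch below assumes every adapter's package is named `dbt_<adapter>`, which is only inferred from the examples in this thread (dbt_snowflake, dbt_bigquery); `reserved_package_names` is a hypothetical helper, and `GLOBAL_PROJECT_NAME` is hard-coded so the snippet runs without dbt installed.

```python
from typing import Iterable, Set

# Hard-coded stand-in for dbt.include.global_project.PROJECT_NAME.
GLOBAL_PROJECT_NAME = "dbt"


def reserved_package_names(adapter_names: Iterable[str]) -> Set[str]:
    """Build reserved names without loading any adapter plugin.

    Assumption: each adapter's package is named "dbt_<adapter>".
    """
    reserved = {GLOBAL_PROJECT_NAME}
    for adapter_name in adapter_names:
        reserved.add(f"dbt_{adapter_name}")
    return reserved


print(sorted(reserved_package_names(["postgres", "bigquery"])))
```

The trade-off is that an adapter whose package name does not follow this convention would not be caught, so this is a workaround rather than a replacement for a registry lookup.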

4 participants