Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy Between Manifest & Catalog - Non-Defined Columns Missing #5

Open
sisu-callum opened this issue Apr 18, 2021 · 3 comments
Open
Labels
documentation Improvements or additions to documentation question Further information is requested

Comments

@sisu-callum
Copy link

While looking at the results of this last version 0.7.0, I noticed that one of my models only had a single column defined within the LookML. I dug into it and realized that this is because I only defined a single column within my schema.yml for that model. I dug into the manifest of the specific model model using this command cat target/manifest.json | jq '.nodes["model.my_project_name.model_name"]' and noticed that only a single column is returned in the manifest.

I then pulled up the catalog with this command cat target/catalog.json | jq '.nodes["model.my_project_name.model_name"]' and there were 229 columns returned.

I'm not sure if this is an opinionated stance that has been taken (ie, columns that make it to the Looker view should be defined within the corresponding schema.yml) but wanted to highlight just in case it wasn't intentional.

@owlas
Copy link
Collaborator

owlas commented Apr 18, 2021

Thanks for raising this one (came up even earlier than I thought!)

So this was an intentional decision such that only resources declared in dbt appear in looker. This really suits the way we work but might not be for everyone.

I'm quite curious though what would be most useful for your team? I guess you were at least expecting columns to appear for models declared in dbt. What about tables in the warehouse but not in the dbt schema?

@owlas owlas added question Further information is requested documentation Improvements or additions to documentation labels Apr 18, 2021
@sisu-callum
Copy link
Author

I'm a big fan of this decision. My personal ethos is that teams should aim for 100% column coverage, as thats the best way to pass on domain knowledge.

That being said, 100% column coverage is a bit of an aspirational thing and could potentially impact teams that want to iterate model --> lookml creation quickly and programmatically with dbt2looker. I'm imagining a scenario where someone has model_1 and model_2 that create model_3. They use dbt_utils.star in model_3 to get all columns from both models. If they added 2 columns to model_1 and didn't document them, I could see that potentially being an issue that isn't caught until some raises a flag with the new dimensions not appearing in the LookML.

I think the best solution (and probably the most work to be honest) would be the creation of a dbt2looker.yml file where this could be defined at the project level. Default setting would be the current as-is state of not bringing in fields that aren't declared but this could be changed to include everything. That gives different teams the flexibility to choose an option that fits their style of development and documentation.

@mike-weinberg
Copy link

mike-weinberg commented May 3, 2021

perhaps this could be another meta option, eg a default like infer-looker-dimensions=false and then if you flip the switch, automatically generate column descriptions indicating that the field was machine generated, because fundamentally we want some sort of guard to caution users when they interact with data that has not been explicitly documented.

I think this might also solve for the open question discussed in dbt slack, since there's not a huge distinction between columns and subcolumns from dbt's perspective.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants