[CT-1582] [Feature] Refactor tasks to breakout parsing the manifest into a separate piece #6357

Closed
3 tasks done
Tracked by #6356 ...
ChenyuLInx opened this issue Dec 2, 2022 · 3 comments
Labels
enhancement New feature or request python_api Issues related to dbtRunner Python entry point

Comments


ChenyuLInx commented Dec 2, 2022

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Right now, parsing and generating the manifest object happen as part of runtime_initialization under the task.
This hides an important step of dbt under multiple layers of inheritance.
It also forces us into hacky workarounds for dbt-server use cases, where we want an endpoint to parse whenever files are modified and save a manifest object for faster dbt invocation.

To resolve this, we should refactor ManifestTask out of the task inheritance chain and into its own module. Tasks that need the project parsed would then be initialized with a manifest constructed by the refactored-out manifest loader.
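
A minimal sketch of the shape this refactor could take, not dbt's actual code: `Manifest`, `load_manifest`, and `ListTask` are hypothetical stand-ins invented for illustration. The point is only that parsing lives in a standalone function and the task receives the finished manifest through its constructor.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Manifest:
    """Hypothetical stand-in for a parsed manifest: node name -> raw SQL."""
    nodes: Dict[str, str] = field(default_factory=dict)


def load_manifest(project_files: Dict[str, str]) -> Manifest:
    """Hypothetical standalone loader: parsing happens here, outside any task."""
    nodes = {}
    for path, sql in project_files.items():
        if path.endswith(".sql"):
            # Use the file name without directory or extension as the node name.
            name = path.rsplit("/", 1)[-1][: -len(".sql")]
            nodes[name] = sql
    return Manifest(nodes=nodes)


class ListTask:
    """A task that needs a parsed project is handed the manifest; it never parses."""

    def __init__(self, manifest: Manifest) -> None:
        self.manifest = manifest

    def run(self) -> List[str]:
        return sorted(self.manifest.nodes)


project = {
    "models/orders.sql": "select 1 as id",
    "models/customers.sql": "select 2 as id",
    "README.md": "not a model",
}
manifest = load_manifest(project)
task = ListTask(manifest)
print(task.run())
```

With this shape, a caller such as dbt-server could build the manifest once and pass the same object to several tasks.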

One thing tbd is where we want the compile step to happen after this refactor. (Stu: created #6708 to address abstract graph generation as well)

@ChenyuLInx ChenyuLInx added enhancement New feature or request triage labels Dec 2, 2022
@github-actions github-actions bot changed the title [Feature] Refactor tasks to breakout parsing the manifest into a separate piece [CT-1582] [Feature] Refactor tasks to breakout parsing the manifest into a separate piece Dec 2, 2022
@dbeatty10 dbeatty10 added Refinement Maintainer input needed and removed triage labels Dec 2, 2022
@jtcohen6 jtcohen6 added python_api Issues related to dbtRunner Python entry point Team:Execution labels Dec 2, 2022

jtcohen6 commented Dec 2, 2022

This sounds right to me! dbt-server wants to be able to reuse an already-parsed manifest. We should enable that by creating a clean split between:

  1. Steps for creating a manifest, i.e. ManifestLoader.get_full_manifest(self.config)
  2. Everything that happens after a full manifest is provided
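
To illustrate why the split helps a long-running server, here is a hedged sketch of reusing an already-parsed manifest until the project files change. `ManifestCache`, `fingerprint`, and `toy_loader` are hypothetical names; in dbt the loader role would be played by step 1 above, and real change detection may work differently.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple


def fingerprint(project_files: Dict[str, str]) -> str:
    """Hash file paths and contents so any modification changes the key."""
    digest = hashlib.sha256()
    for path in sorted(project_files):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(project_files[path].encode())
        digest.update(b"\0")
    return digest.hexdigest()


class ManifestCache:
    """Hypothetical cache: re-run the loader only when the files have changed."""

    def __init__(self, loader: Callable[[Dict[str, str]], dict]) -> None:
        self._loader = loader
        self._cached: Optional[Tuple[str, dict]] = None
        self.parse_count = 0

    def get(self, project_files: Dict[str, str]) -> dict:
        key = fingerprint(project_files)
        if self._cached is None or self._cached[0] != key:
            self.parse_count += 1
            self._cached = (key, self._loader(project_files))
        return self._cached[1]


def toy_loader(project_files: Dict[str, str]) -> dict:
    """Stand-in for the real manifest creation step (assumption for this sketch)."""
    return {"nodes": sorted(project_files)}


cache = ManifestCache(toy_loader)
files = {"models/a.sql": "select 1"}
first = cache.get(files)           # parses
second = cache.get(files)          # reuses the cached manifest
files["models/a.sql"] = "select 2"
third = cache.get(files)           # file changed, so it parses again
print(cache.parse_count)
```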

> One thing tbd is where we want the compile step to happen after this refactor.

For now, let's keep compilation as a step that happens separately from & subsequent to parsing. The important steps of compilation are:

  1. Interpolating ephemeral model CTEs into models that ref() them — this actually mutates the manifest
  2. Building a networkx graph from the manifest — which is different for dbt build, versus other commands, because build wants additional test edges
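
The second step can be sketched roughly as follows. This uses a plain dict of sets instead of networkx to stay dependency-free, and it assumes the extra `build` edges make each test on a model a prerequisite of that model's downstream nodes; `build_graph`, `add_test_edges`, and the node names are hypothetical, not dbt's implementation.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node -> nodes that must run after it


def build_graph(depends_on: Dict[str, Set[str]]) -> Graph:
    """Turn 'node depends on parents' into 'parent -> children' edges."""
    graph: Graph = {node: set() for node in depends_on}
    for node, parents in depends_on.items():
        for parent in parents:
            graph.setdefault(parent, set()).add(node)
    return graph


def add_test_edges(graph: Graph, tests: Set[str],
                   depends_on: Dict[str, Set[str]]) -> Graph:
    """Return a copy where each test also precedes the non-test children
    of the nodes it tests (the assumed 'build' behaviour)."""
    result: Graph = {node: set(children) for node, children in graph.items()}
    for test in tests:
        for tested in depends_on[test]:
            for child in graph[tested]:
                # Skip other tests and the test's own parents to avoid cycles.
                if child not in tests and child not in depends_on[test]:
                    result[test].add(child)
    return result


depends_on = {
    "stg_orders": set(),
    "orders": {"stg_orders"},
    "not_null_stg_orders_id": {"stg_orders"},
}
tests = {"not_null_stg_orders_id"}

run_graph = build_graph(depends_on)                            # non-build commands
build_cmd_graph = add_test_edges(run_graph, tests, depends_on)  # dbt build

print(sorted(run_graph["not_null_stg_orders_id"]))
print(sorted(build_cmd_graph["not_null_stg_orders_id"]))
```

Because the two graphs differ only by the added edges, both can be derived from the same manifest, which is what makes caching one graph per command family plausible.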

We can think in the future about whether we want to perform that first step (ephemeral model interpolation), and mutate the manifest, before caching it. We can also think about whether we want to additionally create & cache the graph object, created by compilation, for additional performance speedup. We might want to create two graph objects, one for build and one for non-build commands.

@jtcohen6 jtcohen6 removed the Refinement Maintainer input needed label Dec 2, 2022

jtcohen6 commented Jan 8, 2023

Resolved by #6565
