performance degradation #932
Comments
To address the performance degradation and intermittent connection errors with your Airflow DAGs using cosmos for dbt tasks, consider the following targeted steps:
For the Snowflake connection issues:
These steps are derived from analyzing the provided configuration and issues. Implementing these recommendations should help mitigate the performance and connectivity issues you're experiencing.
|
Hi @liranc1, thanks for sharing this information. Could you please confirm which version of Cosmos you were using before and after this performance degradation was observed? |
Cosmos was not used before; I was using the bash operator. |
@liranc1 could you try out Cosmos 1.4 and let us know if there are any performance improvements? |
Some progress: #1014. |
The previously mentioned PR, #1014, is up for review and shows promising results |
If using
Some ways to improve the performance using Cosmos 1.4:
1. Can you pre-compile your dbt project? If yes, this would remove this responsibility from the Airflow DAG processor, greatly impacting the DAG parsing time. You could try this by using and specifying the path to the manifest file:
More information: https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-manifest
2. If you need to use If yes, this will avoid Cosmos having to run dbt deps all the time before running any dbt command, both in the scheduler and worker nodes. In that case, you should set:
More info:
3. If you need to use LoadMode.DBT_LS, is your dbt project large? Could you use selectors to select a subset?
More info: https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html
4. Are you able to install dbt in the same Python virtual environment as you have Airflow installed? If this is a possibility, you'll be able to experience significant performance improvements by leveraging the
More information: |
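As an illustration of the first suggestion, the choice of parsing method can hinge on whether a precompiled manifest is present. This is a minimal sketch, not Cosmos's own logic: it assumes dbt's default artifact location `target/manifest.json`, and `pick_load_mode` and its dictionary keys are hypothetical names introduced here. How the result maps onto Cosmos (a `LoadMode.DBT_MANIFEST` load method plus a `manifest_path` on the project config) is an assumption drawn from the parsing-methods documentation linked above and should be checked against the installed version.

```python
from pathlib import Path
from typing import Optional, TypedDict


class LoadChoice(TypedDict):
    """Hypothetical container for the parsing decision (not a Cosmos type)."""
    load_mode: str
    manifest_path: Optional[str]


def pick_load_mode(project_dir: str) -> LoadChoice:
    """Prefer a precompiled manifest when one exists, else fall back to dbt ls.

    Assumes dbt wrote its artifacts to the default `target/` directory.
    """
    manifest = Path(project_dir) / "target" / "manifest.json"
    if manifest.is_file():
        # Assumed mapping: LoadMode.DBT_MANIFEST with manifest_path set,
        # so the DAG processor does not have to invoke dbt while parsing.
        return {"load_mode": "DBT_MANIFEST", "manifest_path": str(manifest)}
    # Assumed mapping: LoadMode.DBT_LS, which runs dbt at DAG parse time.
    return {"load_mode": "DBT_LS", "manifest_path": None}
```

The manifest would need to be regenerated (for example in CI, before deployment) whenever the dbt project changes; otherwise the rendered DAG can drift from the project.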
Before using Cosmos, the Airflow DAG ran for about 15 minutes for a certain dbt command.
After the change to Cosmos, the same dbt command is much more volatile, often taking 20-30 minutes.
All of the DAG's Airflow resources stayed the same, and there was no change in the dbt connection details.
I also encountered some tasks randomly failing due to a Snowflake connection error; they succeeded on the next run. This issue did not occur without Cosmos.
cosmos configurations used:
ExecutionConfig(dbt_executable_path=DBT_EXECUTABLE_PATH)
RenderConfig(
    select=["models"],
    test_behavior=TestBehavior.NONE,
    load_method=LoadMode.DBT_LS,
    dbt_deps=False
)
ProjectConfig(os.environ["DBT_PROJECT_PATH"], dbt_vars=dbt_vars)
dbt version:
Core:
Plugins: