Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

int_jira__issue_calendar_spine performs slowly on snowflake #111

Open
2 of 4 tasks
jschintz-windriver opened this issue Oct 10, 2023 · 11 comments · May be fixed by #114
Open
2 of 4 tasks

int_jira__issue_calendar_spine performs slowly on snowflake #111

jschintz-windriver opened this issue Oct 10, 2023 · 11 comments · May be fixed by #114
Assignees
Labels
error:forced priority:p3 Affects many users; can wait status:in_progress Currently being worked on type:bug Something is broken or incorrect update_type:models Primary focus requires model updates

Comments

@jschintz-windriver
Copy link

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

The model 'int_jira__issue_calendar_spine' is performing slowly on snowflake - taking about 10-15 minutes depending on the run.
Our instance has about 600M records for this table.
The query plans show that all records are being deleted/inserted for each run.
Should this be materialized as a table instead?

14:51:31 1455 of 1614 OK created sql incremental model AV_int_jira.int_jira__issue_calendar_spine [�[32mSUCCESS 650585743�[0m in 597.48s]
Shouldn't only a small % be loaded with incremental strategy not 99%?

Describe alternatives you've considered

Fork the repo and change myself ... but that requires upkeep :(

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

@jschintz-windriver jschintz-windriver changed the title [Feature] int_jira__issue_calendar_spine performs slowly int_jira__issue_calendar_spine performs slowly on snowflake Oct 10, 2023
@fivetran-reneeli
Copy link
Contributor

fivetran-reneeli commented Oct 11, 2023

Hi @jschintz-windriver! Thanks for raising this and glad to take a look. I don't believe it should be behaving as a full delete and re-load since it's incremental. Can you share more info on your configs, version of the package, and specific warehouse you are using?

@fivetran-reneeli
Copy link
Contributor

Hi @jschintz-windriver, just following up!

@jschintz-windriver
Copy link
Author

@fivetran-reneeli Sorry for the delay!

Some info about our deployment

  • dbt v 1.6X
  • snowflake warehouse
  • package: fivetran/jira version: 0.15.0

@fivetran-reneeli
Copy link
Contributor

Thanks @jschintz-windriver!

I am wondering if there's something in your project yml that's overriding the config set in the model. Because given line 7, it shouldn't be exhibiting that behavior.

incremental_strategy = 'merge' if target.type not in ('snowflake', 'postgres', 'redshift') else 'delete+insert'

And this model should be materialized as incremental based on this line. So after your initial run, so you should see, for example this line of code compiling.

where cast( date_day as date) >= (select min(earliest_open_until_date) from

Would you be able to share your dbt_project.yml?

@jschintz-windriver
Copy link
Author

jschintz-windriver commented Nov 1, 2023

@fivetran-reneeli The incremental logic is indeed compiling - I just don't think its performant for large datasets.
I made a suggested change in my fork and opened a PR in your repo.
Do you see any issues with including this new logic?
It reduced the model load time from 10 minutes to 2 with our deployment.
#113

@fivetran-reneeli
Copy link
Contributor

Hi @jschintz-windriver, our team looked into this, and can confirm we see the same long loading issue with the model. Thank you for opening the PR! When we went through this we saw some opportunities to make this more performant, but this will be helpful as a starting point as we tackle this.

@jschintz-windriver
Copy link
Author

@fivetran-reneeli Thanks so much for the update and taking the time to help us out.

Let me know if there's any testing on my part that could be useful.

@fivetran-joemarkiewicz
Copy link
Contributor

Hi @jschintz-windriver I just wanted to reach back out and let you know this update is still on our roadmap. At the moment it is seeming like we will be able to pick this up and dive deeper in improving the model's performance after the holidays.

We will be sure to share more updates in this thread once we pick up this ticket.

@fivetran-avinash fivetran-avinash added priority:p3 Affects many users; can wait type:bug Something is broken or incorrect status:in_progress Currently being worked on update_type:models Primary focus requires model updates labels Jan 4, 2024
@fivetran-avinash fivetran-avinash self-assigned this Jan 4, 2024
@fivetran-avinash fivetran-avinash linked a pull request Jan 9, 2024 that will close this issue
15 tasks
@fivetran-avinash fivetran-avinash linked a pull request Jan 9, 2024 that will close this issue
15 tasks
@fivetran-avinash
Copy link
Contributor

fivetran-avinash commented Jan 9, 2024

Hello @jschintz-windriver! We wanted to let you know we've been working on a fix for this issue of slow runtime, and we believe we've made several optimizations to the dbt_jira models that should significantly improve performance on incremental runs. You can see the changes that were applied to these models here and an explanation for these changes in the CHANGELOG additions.

Can you test these changes on the following branch by attempting a dbt incremental run and see how your models perform and whether there are any issues? Once we get confirmation that this branch works effectively for incremental runs on Snowflake, we can get these changes merged into the package.


- git: https://github.com/fivetran/dbt_jira.git
  revision: bugfix/optimize-calendar-spine
  warn-unpinned: false

@fivetran-avinash
Copy link
Contributor

Hi @jschintz-windriver ! Just bumping this back up for visibility. Can you test this branch when possible? We don't want to move forward with pushing this fix until we have validation that this is improving the performance of your calendar spine.

@fivetran-avinash
Copy link
Contributor

Hello @jschintz-windriver ! Hope all is well. Will you be able to take a look at our new branch and see if this improves overall performance on Snowflake? Let us know if you need any assistance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
error:forced priority:p3 Affects many users; can wait status:in_progress Currently being worked on type:bug Something is broken or incorrect update_type:models Primary focus requires model updates
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants