Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure data refresh can still be run if no initial index exist #2727

Merged
merged 2 commits into from
Sep 5, 2023

Conversation

sruthiv98
Copy link
Contributor

@sruthiv98 sruthiv98 commented Jul 27, 2023

Fixes

Fixes #2725 by @AetherUnbound

Description

This PR adds a none_failed trigger rule to the generate_index_suffix task in the primary data refresh DAG. It ties together the old index retrieval and old index deletion tasks so both are either run or skipped together. It also adds an ensure_downstream_runs task at the end of the data refresh task group to ensure that the task state from promote is reflected downstream (rather than the task state from trigger_delete_index which can now be skipped).

Before
Screenshot_2023-08-31_14-03-00

After _(ignore the failures, they're expected and should propagate all the way through to the end of the DAG)
Screenshot_2023-08-31_14-01-25

Testing Instructions

See this comment below: #2727 (comment)

When these steps are run on this branch, both the get_current_index and trigger_delete_index steps should be skipped, but all other data refresh steps should run as intended (in the case provided, this means that all tasks after the data refresh task group should remain "upstream failed").

Checklist

  • My pull request has a descriptive title (not a vague title likeUpdate index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I tried running the project locally and verified that there are no visible errors.
  • I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@openverse-bot openverse-bot added 🟩 priority: low Low priority and doesn't need to be rushed 🛠 goal: fix Bug fix 💻 aspect: code Concerns the software code in the repository 🧱 stack: catalog Related to the catalog and Airflow DAGs labels Jul 27, 2023
@AetherUnbound
Copy link
Collaborator

AetherUnbound commented Jul 28, 2023

Thanks for submitting this fix @sruthiv98! You may want to look at our general setup guide for instructions on how to set up just and starting the stack. Once that's set up for you locally, you can run just lint and commit the changes, that will ensure the code is formatted a specific way and pass our linting check 🙂

On testing, this particular case is pretty specific since it technically shouldn't happen during normal processing. I've confirmed that you can replicate the original problem using the following steps:

  1. just down -v && just c && just i (this clears out existing volumes, then launches both the catalog stack and the ingestion server stack)
  2. Enable the audio data refresh DAG (http://localhost:9090/dags/audio_data_refresh/graph)
  3. Mark Success on the get_before_record_count task when it fails (no data exists)
  4. The get_current_index task should skip, and so should the rest of the DAG

After your change, the behavior of step 4 should change so that it skips, but the rest of the DAG runs (it may fail, because there's no data, but as long as it runs even if get_current_index is skipped we're good!)

Hope this helps!

@sruthiv98
Copy link
Contributor Author

After your change, the behavior of step 4 should change so that it skips, but the rest of the DAG runs (it may fail, because there's no data, but as long as it runs even if get_current_index is skipped we're good!)

Thanks for the detailed testing notes!! Super helpful. Just ran this and get_current_index did not automatically skip (it was stuck on marked_for_retry, and when I marked it as success the DAG moved on to the next task. Is this a behavior we were expecting or is the fact that it kinda stalled on marked_for_retry indicate that something isn't right?

@AetherUnbound
Copy link
Collaborator

Thanks for the detailed testing notes!! Super helpful. Just ran this and get_current_index did not automatically skip (it was stuck on marked_for_retry, and when I marked it as success the DAG moved on to the next task. Is this a behavior we were expecting or is the fact that it kinda stalled on marked_for_retry indicate that something isn't right?

Oh, I remember you mentioning something about running the initialization (via just init). Can you try once more starting from just down -v on the testing instructions and see if that does it?

@sruthiv98
Copy link
Contributor Author

Can you try once more starting from just down -v on the testing instructions and see if that does it

Oh this is how I tested it originally! But yes, just tried it again and the run still stalls with up_for_retry at the get_current_index task :/

@AetherUnbound
Copy link
Collaborator

Oh this is how I tested it originally! But yes, just tried it again and the run still stalls with up_for_retry at the get_current_index task :/

Sorry it's taken me so long to get back to this @sruthiv98! Just to confirm, did you also mark the get_before_record_count task as success? That one is inconsequential to the pieces we're testing but does cause the run to fail otherwise! It should be step 3 of the instructions posted above. If it still fails after that, please post the logs of the failed task!

@AetherUnbound AetherUnbound self-assigned this Aug 29, 2023
@AetherUnbound AetherUnbound force-pushed the fix/dag-task-trigger-rule branch from c76c611 to 379c8a0 Compare August 31, 2023 20:27
@AetherUnbound AetherUnbound changed the title WIP: adding task trigger rule Ensure data refresh can still be run if no initial index exist Aug 31, 2023
@AetherUnbound AetherUnbound marked this pull request as ready for review August 31, 2023 21:09
@AetherUnbound AetherUnbound requested a review from a team as a code owner August 31, 2023 21:09
@AetherUnbound
Copy link
Collaborator

@sruthiv98 this required a few more changes than I anticipated, so in the interest of getting this merged I applied those changes and rebased. Thanks for your help on the initial portion!

Copy link
Contributor

@obulat obulat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution, @sruthiv98 🎉

@obulat obulat merged commit 83ce3ea into WordPress:main Sep 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟩 priority: low Low priority and doesn't need to be rushed 🧱 stack: catalog Related to the catalog and Airflow DAGs
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Change trigger rule for data refresh's generate_index_suffix to handle case where get_current_index skips
4 participants