Fetch active tasks from memory in SeekableStreamSupervisor #16098

Open
wants to merge 5 commits into
base: master

AmatyaAvadhanula
Contributor

@AmatyaAvadhanula AmatyaAvadhanula commented Mar 11, 2024

The SeekableStreamSupervisor fetches the task payloads for every active task in its datasource twice on every RunNotice.
In large clusters, this can make a RunNotice take a long time when it could otherwise complete within a couple of seconds.
With hundreds of supervisors, this results in 4 * (number of supervisors) calls to the metadata store every minute to fetch all the active task payloads of a datasource. This change fetches the active tasks from memory instead, which can significantly reduce the load on the DB in such cases.
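
As a rough illustration of the idea, the following Java sketch keeps active tasks in an in-memory index so that listing them per datasource needs no metadata store call. `TaskInfo` and `ActiveTaskIndex` are hypothetical names invented for this example; they are not Druid classes and do not reflect how this PR is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a task payload; NOT Druid's Task interface.
final class TaskInfo {
    final String id;
    final String dataSource;

    TaskInfo(String id, String dataSource) {
        this.id = id;
        this.dataSource = dataSource;
    }
}

// Hypothetical in-memory index of active tasks, keyed by task id.
final class ActiveTaskIndex {
    private final Map<String, TaskInfo> activeTasks = new ConcurrentHashMap<>();

    // Called when a task becomes active.
    void add(TaskInfo task) {
        activeTasks.put(task.id, task);
    }

    // Called when a task completes or fails.
    void remove(String taskId) {
        activeTasks.remove(taskId);
    }

    // Lists active tasks for one datasource without touching any storage.
    List<TaskInfo> activeTasksFor(String dataSource) {
        List<TaskInfo> result = new ArrayList<>();
        for (TaskInfo task : activeTasks.values()) {
            if (task.dataSource.equals(dataSource)) {
                result.add(task);
            }
        }
        return result;
    }
}
```

In this sketch, a supervisor run would call `activeTasksFor(dataSource)` rather than querying the metadata store, so the cost of each lookup is a scan of the in-memory map.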

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@abhishekagarwal87
Contributor

What problems does this PR address?

@AmatyaAvadhanula
Contributor Author

AmatyaAvadhanula commented Mar 11, 2024

The SeekableStreamSupervisor fetches the task payloads for every active task in its datasource twice on every RunNotice.
In large clusters, this can make a RunNotice take a long time when it could otherwise complete within a couple of seconds.
With hundreds of supervisors, this results in 4 * (number of supervisors) calls to the metadata store every minute to fetch all the active task payloads of a datasource. This change fetches the active tasks from memory instead, which can significantly reduce the load on the DB in such cases.

github-actions bot commented Jul 7, 2024

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the [email protected] list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 7, 2024

github-actions bot commented Aug 4, 2024

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Aug 4, 2024
@github-actions github-actions bot removed the stale label Oct 15, 2024
@kfaraz
Contributor

kfaraz commented Oct 17, 2024

@AmatyaAvadhanula, the change here makes sense to me.
Can we move this from Draft to Ready?
There seem to be some merge conflicts.

@AmatyaAvadhanula AmatyaAvadhanula marked this pull request as ready for review October 17, 2024 20:54
Contributor

@kfaraz kfaraz left a comment

LGTM 🚀

@AmatyaAvadhanula, the SeekableStreamSupervisor also makes calls to taskStorage.getTask(). I wonder if these calls should also first check for those tasks in memory. If yes, then we should probably just remove TaskStorage from SeekableStreamSupervisor, use TaskQueryTool instead, and route everything through it.

The TaskQueryTool can decide if a task should be served from memory or storage.
What do you think?
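
A minimal sketch of the routing suggested above, assuming a single lookup component that checks memory first and falls back to storage. `MemoryFirstTaskLookup`, its methods, and the use of plain strings for payloads are all hypothetical and chosen for illustration; this is not the actual TaskQueryTool or TaskStorage API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical facade: callers ask here and never talk to storage directly.
final class MemoryFirstTaskLookup {
    // Payloads of active tasks held in memory (payload type simplified to String).
    private final Map<String, String> activeTaskPayloads = new ConcurrentHashMap<>();
    // Stand-in for a metadata store query; an assumption, not a Druid interface.
    private final Function<String, Optional<String>> storageLookup;
    private int storageCalls = 0;

    MemoryFirstTaskLookup(Function<String, Optional<String>> storageLookup) {
        this.storageLookup = storageLookup;
    }

    void markActive(String taskId, String payload) {
        activeTaskPayloads.put(taskId, payload);
    }

    // Serves the task from memory when it is active, otherwise queries storage.
    Optional<String> getTask(String taskId) {
        String inMemory = activeTaskPayloads.get(taskId);
        if (inMemory != null) {
            return Optional.of(inMemory);
        }
        storageCalls++;
        return storageLookup.apply(taskId);
    }

    // Number of lookups that had to fall back to storage.
    int storageCalls() {
        return storageCalls;
    }
}
```

With this shape, the decision between memory and storage lives in one place, and the caller (here, the supervisor) would not need a direct reference to the storage layer.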
