Make analytics report update job scheduling more efficient #7576

Merged: 1 commit, Mar 11, 2024

Conversation

SpecLad
Contributor

@SpecLad SpecLad commented Mar 8, 2024

Motivation and context

Currently, schedule_analytics_report_autoupdate_job attempts to debounce job scheduling by examining existing jobs before scheduling a new one. Unfortunately, the scheduler.get_jobs function, which it uses for this purpose, scales poorly. Not only does it fetch IDs of all scheduled jobs (and not just ones related to the current object), but it then fetches information about every job, one by one. The current logic doesn't even need this information, but RQ Scheduler provides no method to get just the IDs.

Replace the current logic with a new lightweight approach that uses a custom Redis key to block scheduling of additional jobs.
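
The blocker-key idea described above can be sketched as follows. This is a minimal illustration, not the code from this patch: `schedule_throttled`, `FakeRedis`, and the key name are hypothetical, and `FakeRedis` is an in-memory stand-in so the example runs without a server. The only real API assumed is redis-py's `Redis.set(name, value, ex=..., nx=...)`, which returns `None` when `nx=True` and the key already exists.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FakeRedis:
    """Hypothetical in-memory stand-in for the subset of redis-py's `Redis.set` used here."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry timestamp or None)
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def set(
        self, name: str, value: str, ex: Optional[int] = None, nx: bool = False
    ) -> Optional[bool]:
        now = self._clock()
        current = self._data.get(name)
        # Drop the key if its TTL has elapsed, as Redis would.
        if current is not None and current[1] is not None and current[1] <= now:
            del self._data[name]
            current = None
        if nx and current is not None:
            return None  # NX prevented the write
        self._data[name] = (value, None if ex is None else now + ex)
        return True


def schedule_throttled(
    connection, key: str, delay_seconds: int, enqueue: Callable[[], None]
) -> bool:
    """Call `enqueue` only if no blocker key exists; return True if a job was scheduled."""
    # SET with NX and EX is a single atomic command, so two concurrent
    # callers cannot both acquire the blocker key.
    acquired = connection.set(key, "1", nx=True, ex=delay_seconds)
    if not acquired:
        return False
    enqueue()
    return True
```

With a real `redis.Redis` connection in place of `FakeRedis`, each call costs one Redis round trip regardless of how many jobs are already scheduled, which is the property that `scheduler.get_jobs` lacks.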

How has this been tested?

Manual testing.

Checklist

  • [x] I submit my changes into the develop branch
  • [x] I have created a changelog fragment
  • [ ] I have updated the documentation accordingly
  • [ ] I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary
    (cvat-canvas, cvat-core, cvat-data and cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@SpecLad SpecLad force-pushed the efficient-analytics-debounce branch from 9173a2a to 0f78ac8 Compare March 8, 2024 15:58
@SpecLad SpecLad marked this pull request as ready for review March 8, 2024 15:58
@SpecLad SpecLad requested a review from klakhov March 8, 2024 15:59

codecov bot commented Mar 8, 2024

Codecov Report

Merging #7576 (144381c) into develop (bfb902f) will decrease coverage by 0.09%.
Report is 6 commits behind head on develop.
The diff coverage is 100.00%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7576      +/-   ##
===========================================
- Coverage    83.53%   83.44%   -0.09%     
===========================================
  Files          372      373       +1     
  Lines        39700    39739      +39     
  Branches      3729     3741      +12     
===========================================
  Hits         33162    33162              
- Misses        6538     6577      +39     
| Components | Coverage Δ |
| --- | --- |
| cvat-ui | 79.24% <ø> (-0.19%) ⬇️ |
| cvat-server | 87.33% <100.00%> (+0.01%) ⬆️ |

Contributor

@klakhov klakhov left a comment

Generally, the patch works for me.

Review thread on cvat/apps/analytics_report/report/create.py (outdated, resolved)
@SpecLad SpecLad force-pushed the efficient-analytics-debounce branch from 0f78ac8 to 144381c Compare March 11, 2024 15:07
@SpecLad SpecLad merged commit 009f9f8 into cvat-ai:develop Mar 11, 2024
34 checks passed
@SpecLad SpecLad deleted the efficient-analytics-debounce branch March 11, 2024 17:38
@cvat-bot cvat-bot bot mentioned this pull request Mar 11, 2024
zhiltsov-max pushed a commit that referenced this pull request Mar 29, 2024
…ts (#7596)


### Motivation and context
See #7576 for more details. This patch extracts the high-level
throttling functionality added in that patch and reuses it for quality
reports.

Note: in that patch I referred to this functionality as debouncing, but
throttling seems like a more accurate description. It would be
debouncing if the autoupdate job only ran after no updates occurred for
a period, which is not how it actually works.
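
The distinction drawn above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the project's code: both function names are hypothetical, time is a plain number, and the throttle model assumes the blocker expires at the moment the scheduled job is due.

```python
from typing import List


def throttle_run_times(events: List[float], delay: float) -> List[float]:
    """Throttling: the first update in a window schedules a run `delay` later;
    further updates inside that window schedule nothing."""
    runs: List[float] = []
    blocked_until = float("-inf")
    for t in sorted(events):
        if t >= blocked_until:
            blocked_until = t + delay
            runs.append(blocked_until)
    return runs


def debounce_run_times(events: List[float], delay: float) -> List[float]:
    """Debouncing: every update resets the timer, so a run fires only after
    `delay` has passed with no updates."""
    runs: List[float] = []
    pending = None
    for t in sorted(events):
        if pending is not None and t >= pending:
            runs.append(pending)  # the timer expired before this update arrived
        pending = t + delay
    if pending is not None:
        runs.append(pending)
    return runs


updates = [0, 4, 8, 12, 16]
print(throttle_run_times(updates, 10))  # [10, 22]
print(debounce_run_times(updates, 10))  # [26]
```

With a steady stream of updates, the throttled job still runs once per window, while the debounced job is postponed until the updates stop.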

### How has this been tested?

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have created a changelog fragment <!-- see top comment in CHANGELOG.md -->
- ~~[ ] I have updated the documentation accordingly~~
- ~~[ ] I have added tests to cover my changes~~
- ~~[ ] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))~~
- ~~[ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))~~

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project.
  Feel free to contact the maintainers if that's a concern.