-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flexible scheduling #2786
Flexible scheduling #2786
Conversation
* main: (21 commits) Add IPv6 config to docker-compose.yml (#2256) Remove IPs with zero vulnerabilities (#2319) Translations update from Hosted Weblate (#2311) Chore/update pr template with comments (#2305) Sector report summary - Best and worst scoring security checks (#2312) Remove icons from compliance issue table (#2340) More ulimits for buggy celery (#2338) Remove smartphone from bug report template (#2334) add meta / cache hash for rpki boefje to raw output (#2255) Add max fds ulimit to octopoes api worker (#2327) Fix multiple Debian issues (#2283) Fix/upgrade jinja2 (#2326) Remove preselection from multireport flow (#2318) Updated template file to respect environment prefixes in docs (#2317) fix zero division (#2298) List item behaviour (#2281) Translations update from Hosted Weblate (#2279) Fixed invalid type usage in `get_rabbit_channel` and `close_rabbit_channel` (#2280) Translations update from Hosted Weblate (#2261) Fix export buttons report (#2259) ...
* main: Fix/394 Introduce clearance level control for objects imported by CSV (#2390) Update RabbitMQ to the latest version (#2392) Show created at and data from in reports (#2370) Rename invalid rpki finding to expired (#2377) Convert `docker-compose` to `docker compose` (#2341) Remove uWSGI (#2366) Prevent double github actions (#2374)
* main: Fix/394 Introduce clearance level control for objects imported by CSV (#2390) Update RabbitMQ to the latest version (#2392) Show created at and data from in reports (#2370) Rename invalid rpki finding to expired (#2377) Convert `docker-compose` to `docker compose` (#2341) Remove uWSGI (#2366) Prevent double github actions (#2374) Fix WEASYPRINT_BASEURL default value and change ports in docker-compose.yml (#2373) Remove debian11 packages (#2358) Error handling for Generate Report (#2274) Update dependencies (#2348) Add token authentication (#2349) Adds CAA records to the model, boefje, normalizer, adds a check bit and a finding (#2315) Check for sudo in install and update script (#2360) Feat/normalizer mimetype upload deeplink (#2220) Add hrefs to Basic Security overview (#2330) Render dicts and list ooi attrs as jsonfield (#2355)
…lexible-scheduling * chore/mula/update-architecture-doc: Fix Fix diagram010.svg Update docs and restructure some code Update documentation Add diagrams Update architecture documentation
* main: Fix new boefjes issue for scheduler (#3297) Add audit logging to CRUD actions using Django signals (#3314) fix: Button height (#3316) Add user id to OOI (#3305) Translations update from Hosted Weblate (#3179) Restructure scheduler storage module (#3294) Fix CSRF error in API with token auth (#3313)
* main: Restructure scheduler server module (#3295)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff @jpbruinsslot! Do you have a todo list or timeline on when we can merge this and make the switch?
# We at least delay a job by the grace period | ||
minimum = self.ctx.config.pq_grace_period | ||
deadline = datetime.now(timezone.utc) + timedelta(seconds=minimum) | ||
|
||
# We want to delay the job by a random amount of time, in a range of 5 hours | ||
jitter_range_seconds = 5 * 60 * 60 | ||
jitter_offset = timedelta( | ||
seconds=random.uniform(-jitter_range_seconds, jitter_range_seconds) # noqa | ||
) | ||
|
||
return True | ||
# Check if the adjusted time is earlier than the minimum, and | ||
# ensure that the adjusted time is not earlier than the deadline | ||
adjusted_time = deadline + jitter_offset | ||
adjusted_time = max(adjusted_time, deadline) | ||
|
||
def is_item_on_queue_by_hash(self, item_hash: str) -> bool: | ||
return self.queue.is_item_on_queue_by_hash(item_hash) | ||
return adjusted_time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that, if I read this correctly, this create a uniform distribution between "(minimal) deadline - 5 hours" and "deadline + 5 hours", and then sets everything below the deadline equal to the deadline. In practice this will converge to a situation where 50% of the time the adjusted_time
equals the deadline and the other 50% is uniformly distributed over the 5 hours after the deadline. I think that's not what's intended? Perhaps the jitter offset should be:
jitter_offset = timedelta(seconds=random.uniform(0, 2*jitter_range_seconds))
To spread it out uniformly over the 10 hours after the (minimal) deadline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Donnype I've updated it, would you mind checking?
depends_on = None | ||
|
||
|
||
def upgrade(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I challenge you to test it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, tested it with a minimal set. But would be good to fully test it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just a few remarks. We've already discussed the PR in a high level overview offline
mula/scheduler/utils/cron.py
Outdated
from croniter import croniter | ||
|
||
|
||
def next_run(expression: str, start_time: datetime = datetime.now(timezone.utc)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't assign datetime.now(timezone.utc)
in the signature, otherwise it will always have the same value (based on the module import time). The Pythonic way is to assign a None
as default, and assign a new datetime in the function body if the passed value is indeed None
.
Thanks! And thank you for your suggestions. I wanted to get the reviews out first. For the next steps, I'll implement the suggestions, get QA to look at it and focus on the migration for a bit to make sure that works out fine. When merged I'd like to have it deployed in order to test to uncover edge cases, and how it will fare with more throughput/load. |
Checklist for QA:
What works:General scheduling works on a 'make reset', tasks can be rescheduled, links in the UI to/from the task list seem to work. Findings are created. What doesn't work:
Bug or feature?:n/a |
Updated and amended qa instructions, for additional testing |
Please refer to #204 for more background on this PR
Changes
Since this is a big PR with substantial changes, I've added the following for more guidance on review and qa. It is helpful to read the above-linked issue to get more context. And whenever you need me I can give you this as well in a call.
The main changes are:
Schedule
model to support a more deterministic approach to (re)scheduling of tasksPrioritizedItem
model and its database table. The queue is not a separate table anymore, a 'view' of the database filtered on statusqueued
is the current state of the queued items of a scheduler.Specific code changes of interest:
Model changes
Schedule
model added tomodels/schedule.py
p_item
change todata
in the task table)storage/schedule_store.py
PrioritizedItem
items
tableBoefjeScheduler
changespush_tasks_for_rescheduling()
NormalizerScheduler
changesAbstract base class
Scheduler
changes - some methods have been moved around so below are the main points of interest:post_push()
method creates aSchedule
for when a task has been added to the queue.API server changes
Schedule
Task
endpoints to support the new setupBoefje Runner changes to support the new setup
Rocky changes to support the new setup
QA notes
schedule
objects have been created for the boefje tasks, you can run through the onboarding, do some scans and check the schedulescurl http://localhost:8004/schedules
main
make reset
feature/mula/refactor-queue
docker compose kill
shutdown all the containersmake kat
(we don't want to reset we want the migration to be applied on the initial run)Code Checklist
.env
changes files if required and changed the.env-dist
accordingly.Checklist for code reviewers:
Copy-paste the checklist from the docs/source/templates folder into your comment.
Checklist for QA:
Copy-paste the checklist from the docs/source/templates folder into your comment.