Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use priority queue for job processing #486

Merged
merged 11 commits into from
Jul 24, 2019
Merged

Conversation

redshiftzero
Copy link
Contributor

Description

Fixes #423.

After investigation, #423 is a worthwhile architectural change as we get some major benefits:

  • We can continue to use a single general (priority) queue instead of many separate queues (what we were previously considering) for different actions (i.e. a priority queue is the right data structure for the situation where e.g. we want a metadata sync to occur with high priority every 5 minutes and then other tasks at lower priorities when we get time e.g. starring or flagging). Still worth having the file download queue to be separate so that file downloads (which can take a long time) occur concurrently with other shorter-lived actions (all in the general priority queue).
  • This PR adds an execution priority for each server-related (job) action - unimplemented options are commented out intentionally- and replaces the use of Python's Queue with Python's PriorityQueue.
  • Removes RunnableQueue.last_job while continuing to preserve job ordering in cases where jobs timeout/queue execution pauses.

Other notes to be aware of:

  • A quirk of Python PriorityQueue is that it does not preserve FIFO ordering of objects with equal priorities. A counter was added to our job objects to ensure that the sort order of objects with equal priorities is stable (added notes inline and in commit messages to make this clear).
  • These job counters will need to persist if/when we persist the queues.

Test Plan

I would follow the standard test plan for this one

Checklist

If these changes modify code paths involving cryptography, the opening of files in VMs, network (via the RPC service) traffic, or fine tuning of the graphical user interface, Qubes testing is required. Please check as applicable:

  • I have tested these changes in Qubes
  • I do not have a Qubes OS workstation (the reviewer will need to test these changes in Qubes)
  • This really only modifies how the queue processes jobs, so testing on Qubes is not strictly necessary.

TODO: properly handle last_job
the next job in a priority queue is set by sorted(list(entries))[0]

this means that the job objects themselves must be sortable, i.e.
they must all implement __lt__.

in this commit I'm just implementing __lt__ for two job types where
we don't care about the order as long as all jobs of that type have
the same priority.

see related discussion in https://bugs.python.org/issue31145 and for
a solution using dataclasses (not in python 3.5)
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will run through some tests in the morning but looks good for the initial pass-through

securedrop_client/api_jobs/base.py Outdated Show resolved Hide resolved
securedrop_client/queue.py Outdated Show resolved Hide resolved
@redshiftzero redshiftzero force-pushed the spike-priority-queue branch from aaebfd2 to 09851e6 Compare July 23, 2019 23:12
@sssoleileraaa sssoleileraaa self-requested a review July 23, 2019 23:27
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bug on master now does not occur. Please confirm that this PR fixes it.

However a similar but different bug is happening, STR:

  1. log into the client in Qubes
  2. cut connection to server
  3. send a few replies and wait until they timeout (you'll see the red bars)
  4. send a message as the source with the failed replies
  5. press refresh
  6. reconnect server
  7. wait until refresh is over
  8. send another message as the source and refresh

Expected:

The replies will be sent and no longer red. The messages from the source will show up in the conversation view.

Actual:

The replies are still red (failed). The messages from the source do not show up in the conversation view. When you close the client and reopen the red replies are blue and the messages appear, so this seems to only be a UI refresh issue.

@sssoleileraaa
Copy link
Contributor

  1. send a few replies and wait until they timeout (you'll see the red bars)

Wait for at least one to time out. The actual behavior happens both when a reply is in the middle of being processed and when all replies are finished being processed and failed.

@sssoleileraaa sssoleileraaa force-pushed the spike-priority-queue branch from 8952d13 to bdd6a93 Compare July 24, 2019 19:36
@sssoleileraaa
Copy link
Contributor

I added a commit to test and fix the small UI bug I mentioned in the comment: bdd6a93

Now when replies begin succeeding again, the previous replies that failed show up as successful and new source messages appear.

@sssoleileraaa sssoleileraaa self-requested a review July 24, 2019 19:43
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, all comments have been addressed so I think it's time to :shipit:

@sssoleileraaa sssoleileraaa merged commit 4633946 into master Jul 24, 2019
@sssoleileraaa sssoleileraaa deleted the spike-priority-queue branch July 24, 2019 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate using a priority queue for the general queue server actions
2 participants