Add new metadata queue. #715
Conversation
NOTE: the CI failure is down to an error (missing keys?) in the Debian packaging process.
Aha... thanks for the heads up.
(Edited PR body to not resolve #652, as the scope of that issue is a bit larger, including an increase in the frequency of syncs, but it's fine to split that into multiple PRs.)
The scope of the issue says:
What's in scope for this issue?
So the scope of this issue is to run MetadataSyncJob outside of the main queue (until we add async job support) and to sync more frequently to make up for removing syncs from other areas of the code. What still needs to be decided is:
* How often to sync (idea: every 15 seconds or until a MetadataSyncJob is successful, whichever one is longer)
* Whether or not to create another queue just for MetadataSyncJob processing (using a queue fits within our current architecture of using queues for processing jobs; however, it would also make a lot of sense to _not_ use a queue since we don't ever need to line up more than one MetadataSyncJob at a time)
So this PR needs to increase the frequency of background syncs.
I personally think not using a queue is the right approach, but if you choose to use a queue then limiting it to one job at a time might work.
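To make that trade-off concrete, here is a minimal sketch of the "limit it to one job at a time" option, assuming a plain `queue.Queue` underneath (the real RunnableQueue wraps more machinery); the `maxsize=1` bound and drop-on-full behaviour below are illustrative, not the project's actual implementation.

```python
import queue

# Hypothetical sketch: a metadata queue bounded to a single pending job.
metadata_queue: queue.Queue = queue.Queue(maxsize=1)

def enqueue_metadata_sync(job) -> None:
    try:
        metadata_queue.put_nowait(job)
    except queue.Full:
        # A MetadataSyncJob is already waiting; there is never a reason
        # to line up more than one, so the duplicate is simply dropped.
        pass
```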
@creviera yeah... this was discussed in yesterday's stand-up, but not yet reflected here. 👍 (Apologies for the opacity).
Since we need to ensure that only one job runs at a time, you can either try redshiftzero's idea: #652 (comment), or just run this job outside of a queue (not everything needs to be in a queue).
Sorry, just to be clear: if you want to increase the frequency of syncs in a separate PR, that sounds good to me (smaller PRs ftw!). I updated my review comment with only one requested change.
… the suggested solution.
```diff
 self.sync_update = QTimer()
 self.sync_update.timeout.connect(self.sync_api)
-self.sync_update.start(1000 * 60 * 5)  # every 5 minutes.
+self.sync_update.start(1000 * 60)  # every minute.
```
This time seems good for now; we can increase the sync frequency once all sync_api calls have been removed.
Also: you need to remove the line of code that resumes the queues in on_sync_failure, now that sync is outside of the main queue and we don't need to unpause the queue in order to run the MetadataSyncJob. We should never pause the metadata sync queue.
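For context, the requested change would reduce the failure handler to something like the sketch below; the Controller stub, logger setup, and remaining body are assumptions, and the only point is that the resume_queues() call disappears.

```python
import logging

logger = logging.getLogger(__name__)

class Controller:
    """Stub shown only to situate the method; the real class has much more."""

    def on_sync_failure(self, result: Exception) -> None:
        # The metadata sync queue is never paused, so there is nothing to
        # resume here: the QTimer will simply enqueue another MetadataSyncJob
        # on its next tick.
        logger.debug('sync failure: {}'.format(result))
        # (previously: self.resume_queues() -- removed as requested)
```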
One more note: while offline you can stop the thread and restart it on login, which will take care of #671, since you can remove the sync_api call on login and instead just start the metadata sync thread.
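A sketch of that idea, reusing the self.sync_update timer from the diff above; the on_authenticated/on_logout hook names are placeholders for wherever login and logout are actually handled.

```python
from PyQt5.QtCore import QTimer

class Controller:
    """Stub illustrating starting/stopping the sync timer around login state."""

    def __init__(self) -> None:
        self.sync_update = QTimer()
        self.sync_update.timeout.connect(self.sync_api)

    def on_authenticated(self) -> None:
        # Starting the timer replaces the direct sync_api() call on login;
        # the first sync then happens on the first timeout.
        self.sync_update.start(1000 * 60)  # every minute

    def on_logout(self) -> None:
        # Offline: nothing to sync, so stop the timer entirely.
        self.sync_update.stop()

    def sync_api(self) -> None:
        ...  # enqueue a MetadataSyncJob (elided)
```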
I've removed the queue resumption code as requested. The QTimer will kick in (after a minute) and try another MetadataSyncJob.
See my comment in #671.
```python
self.main_queue.pinged.connect(self.resume_queues)
self.download_file_queue.pinged.connect(self.resume_queues)
self.metadata_queue.pinged.connect(self.resume_queues)
```
This queue doesn't need to pause or resume, so you can remove this as well as the pinged signal. Instead, you can just call resume_queues from the Controller's on_sync_success if the queues are paused.
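In other words, something like the sketch below (the api_job_queue attribute name is assumed here for illustration): a successful sync proves the server is reachable, so that is the natural place to resume anything that paused itself.

```python
class Controller:
    """Stub; only the relevant success handler is shown."""

    def on_sync_success(self) -> None:
        # ... existing success handling (storage updates, GUI refresh) elided ...
        # The server just answered a MetadataSyncJob, so any queue that paused
        # itself after a timeout can safely be resumed.
        self.api_job_queue.resume_queues()  # attribute name assumed
```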
I'm not sure I follow.
The queue decides this itself (see the exception handling in the process method of the RunnableQueue class). Therefore, the metadata queue could be paused if it encounters a problem (which is a good thing). However, the update timer will restart the queue after X period of time, right...?
In which case, I'd argue these signals still need to be there. IYSWIM.
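For readers following the thread, the self-pausing behaviour being referred to is roughly the pattern below, a simplified sketch of a RunnableQueue-style process loop (the real method handles more error types and uses Qt signals rather than a plain flag).

```python
import queue

class RequestTimeoutError(Exception):
    """Stand-in for the SDK's timeout error."""

class RunnableQueue:
    def __init__(self) -> None:
        self.queue: queue.Queue = queue.Queue()
        self.paused = False  # the real class emits a Qt 'paused' signal

    def process(self) -> None:
        while not self.paused:
            job = self.queue.get(block=True)
            try:
                job.run()
            except RequestTimeoutError:
                # The queue pauses *itself* when the server cannot be reached,
                # and puts the job back so it is retried once the queue resumes.
                self.queue.put(job)
                self.paused = True
```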
So currently, we never stop metadata syncs when they fail because of a request timeout. We want them to run in the background until they succeed, which is why it doesn't make sense to pause the metadata sync queue. It looks like your code will try to enqueue another MetadataSyncJob after 60 seconds, but it will be dropped because the queue will be full and paused. The right behavior is to not pause the metadata queue, and it makes the code simpler because you can remove the pinged signal and the complicated logic around unpausing the metadata sync queue when we want to keep reaching out to the server to see if we can unpause the other queues. See #671 (comment) for an idea on how to make this much simpler.
securedrop_client/queue.py (outdated)
```python
self.main_queue.paused.connect(self.on_queue_paused)
self.download_file_queue.paused.connect(self.on_queue_paused)
self.metadata_queue.paused.connect(self.on_queue_paused)

self.main_queue.pinged.connect(self.resume_queues)
```
You can also remove the main_queue and download_file_queue pinged signals, because there is already a resume signal emitted by the queue manager; see securedrop-client/securedrop_client/queue.py, lines 177 to 181 in 33856fb:
```python
def resume_queues(self) -> None:
    logger.info("Resuming queues")
    self.start_queues()
    self.main_queue.resume.emit()
    self.download_file_queue.resume.emit()
```
So once you remove these three lines of code, add `self.resume_queues()` in on_sync_success, and update the queue manager's resume function to check if a queue is paused before emitting the resume signal, this PR should be good to go.
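Putting those requests together, the queue manager's resume function would then look roughly like this sketch; the per-queue paused flag is the assumption this comment asks for, and the metadata queue is deliberately absent because it is never paused.

```python
import logging

logger = logging.getLogger(__name__)

class QueueManager:
    """Stand-in for the project's queue manager; only resume_queues is shown."""

    def resume_queues(self) -> None:
        logger.info("Resuming queues")
        self.start_queues()
        # Only emit resume for a queue that actually paused itself
        # (the 'paused' flag on each RunnableQueue is assumed here).
        if self.main_queue.paused:
            self.main_queue.resume.emit()
        if self.download_file_queue.paused:
            self.download_file_queue.resume.emit()
```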
Again, this is what closes #672; see #715 (comment).
OK... I've made the changes you've requested, updated the unit tests, and rebased with latest master. 👍
…only emit resume signal if queue is paused. Fixed tests too.
The metadata sync queue should never pause, so you should remove the resume signal, but otherwise this lgtm and wfm.
Description
Towards #652.
This does exactly what it says on the tin. Ignore my rather naive questions on the original ticket. Once I dug deeper I realised the suggested solution was relatively simple and followed an existing precedent with the file download queue.
Test Plan
Updated unit tests as per changes to the code.
Checklist
If these changes modify code paths involving cryptography, the opening of files in VMs, or network traffic (via the RPC service), Qubes testing in the staging environment is required. For fine-tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable:
If these changes add or remove files other than client code, packaging logic (e.g., the AppArmor profile) may need to be updated. Please check as applicable: