Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SendReplyJobTimeoutError now derived from Exception #824

Closed
wants to merge 1 commit into from

Conversation

sssoleileraaa
Copy link
Contributor

@sssoleileraaa sssoleileraaa commented Feb 26, 2020

Description

Fixes #820
Unblocks #819
Unblocks #823

  • SendReplyJobTimeoutError now derives from Exception instead of sdclientapi.RequestTimeoutError since it appears that deriving from sdclientapi.RequestTimeoutError makes it so we can't access self.reply_uuid from the Controller when we pass it the exception. I admit this is a fix for an issue I don't fully understand (it seems perfectly reasonable to me to derive from sdclientapi.RequestTimeoutError which derives from Exception, but this does fix the issue.
  • As a result, I removed retry logic if a job encounters a RequestTimeoutError or ServerConnectionError since we already retry jobs that fail for those reasons in the queue. Overall I think this simplifies the code, plus it makes sense not to retry a bunch of times back to back if we get a ServerConnectionError. I also think our timeouts are long enough to account for when the network is temporarily slow, plus we would just pause the queue until the network is fast enough to retry the job anyway.

Test Plan

Make sure #820 (comment) and #820 (comment) no longer happen

Checklist

If these changes modify code paths involving cryptography, the opening of files in VMs or network (via the RPC service) traffic, Qubes testing in the staging environment is required. For fine tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable:

  • I have tested these changes in the appropriate Qubes environment
  • I do not have an appropriate Qubes OS workstation set up (the reviewer will need to test these changes)
  • These changes should not need testing in Qubes

@sssoleileraaa sssoleileraaa marked this pull request as ready for review February 26, 2020 03:57
@@ -96,7 +96,7 @@ def _make_call(self, encrypted_reply: str, api_client: API) -> sdclientapi.Reply
# TODO: Once https://github.com/freedomofpress/securedrop-client/issues/648, we will want to
# pass the default request timeout to reply_source instead of setting it on the api object
# directly.
api_client.default_request_timeout = 5
api_client.default_request_timeout = 0.01
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accidentally left in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops, yup

self.failure_signal.emit(e)
raise
else:
self.success_signal.emit(result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the effect of removing this automatic retry logic could be that a user will see a notification much more often... I think we should understand (roughly) how often the request succeeds on the first try - if it often needs to be retried and succeeds on a subsequent try then we should keep this logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thinking is that retrying 5 times (5 being the default) when we get a ServerConnectionError wouldn't result in more success. it doesn't hurt to retry 5 times back to back, but i don't think it's necessary. we could just fall back to the queue pause/resume-when-sync-can-connect-to-the-server logic. retrying 5 times when we get a RequestTimeoutError could result in more success, but i think more commonly if the request times out after X seconds, since we don't increase that time in the next attempt, it'll probably time out again and again, so it seems to make sense to pause the queue and rely on queue retry logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but i think more commonly if the request times out after X seconds, since we don't increase that time in the next attempt, it'll probably time out again and again, so it seems to make sense to pause the queue and rely on queue retry logic.

@redshiftzero pointed out to me that we would end up notifying the user that we "Trying to reconnect to the SecureDrop server" after one attempt that times out, so even if it's not too common for this to occur, it would be worth researching first before removing the job retry logic and err on the side of caution so that we don't over-notify the user about network blips.

@@ -106,7 +106,7 @@ def __init__(self, message: str, reply_uuid: str):
self.reply_uuid = reply_uuid


class SendReplyJobTimeoutError(RequestTimeoutError):
class SendReplyJobTimeoutError(Exception):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the exception in #820 UnboundLocalError, I would expect that the exception type that the job should raise should be SendReplyJobError - what am I missing?

@sssoleileraaa sssoleileraaa deleted the fix-issue-820 branch May 26, 2020 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

app crashes when object has no attribute 'reply_uuid'
2 participants