-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up threading #77
Conversation
@@ -47,33 +47,29 @@ def __init__(self, api_call, *args, **kwargs): | |||
self.args = args | |||
self.kwargs = kwargs | |||
self.result = None | |||
self.i_timed_out = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flag telling the API thread that it's been timed out, and when it finally gets unblocked it should Do Nothing.
self.timer = QTimer() | ||
self.timer.timeout.connect(lambda: self.timeout.emit()) | ||
self.timer.setSingleShot(True) | ||
self.timer.start(5000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the timer stuff moved to the Client
object.
securedrop_client/logic.py
Outdated
self.sync_flag = os.path.join(home, 'sync_flag') | ||
self.home = home # The "home" directory for client files. | ||
self.data_dir = os.path.join(self.home, 'data') # File data. | ||
self.timer = None # call timeout timer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Client
object now has a timer, to detect API calls which take too long. The home
stuff above is unrelated to the threading changes.
securedrop_client/logic.py
Outdated
|
||
# we've started a timer. when that hits zero, call our | ||
# timeout function | ||
self.timeout_api_call.connect(lambda: self.timeout_cleanup(timeout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than triggering the user-specified callbacks (for success and timeouts) directly, we wrap them in our own functions which will clean up the client, then call the callback functions.
self.api_thread.start() | ||
|
||
else: | ||
logger.info("There's already an API request running, so I'm not going to ignore this one (XXX this may not be the coolest thing to do...)") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the log message implies, we should consider how we actually want to handle "concurrent" API requests...
self.api_runner = None | ||
self.api_thread = None | ||
self.timer = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call_reset
is called by our callback wrappers. We take care to disconnect the timeout handler callbacks: if we don't, then we'll just add more and more callbacks over time, and when one call finally times out, all collected timeout handlers will get triggered (even for requests that have long-since completed).
self.api_thread = QThread(self.gui) | ||
self.api_runner = APICallRunner(function, *args, **kwargs) | ||
self.api_runner.moveToThread(self.api_thread) | ||
self.api_runner.call_finished.connect(callback) | ||
self.api_runner.timeout.connect(timeout) | ||
self.finish_api_call.connect(self.api_runner.on_cancel_timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is handled in our callback wrapper now.
self.api_thread.started.connect(self.api_runner.call_api) | ||
self.api_thread.finished.connect(self.call_reset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also handled in our callback wrapper.
securedrop_client/logic.py
Outdated
|
||
def login(self, username, password, totp): | ||
""" | ||
Given a username, password and time based one-time-passcode (TOTP), | ||
create a new instance representing the SecureDrop api and authenticate. | ||
""" | ||
self.api = sdclientapi.API(self.hostname, username, password, totp) | ||
|
||
self.api = sdclientapi.API(self.hostname, username, password, totp, proxy=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this proxy=True
will probably break things for folks who are not running on Qubes, I think... Hmm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we can snip this out for review on non-Qubes OSes (cc #2)
""" | ||
Handles the result of an authentication call against the API. | ||
""" | ||
self.call_reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now handled in our callback wrapper
""" | ||
self.timer.stop() | ||
|
||
def on_authenticate(self, result, result_data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note the new call signature, adding result_data
(unused in this method, as it turns out)
self.call_reset() | ||
|
||
user_callback() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are our callback wrappers, which reset the state of the Client before calling user callbacks.
securedrop_client/logic.py
Outdated
self.on_sync_timeout, self.api) | ||
logger.info("In sync_api, after call to call_api, I'm in thread {}".format(self.thread().currentThreadId())) | ||
else: | ||
logger.info("It looks like you are not authenticated. Weird, isn't it.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was debugging output and should be removed
securedrop_client/logic.py
Outdated
Called when downloading a file has timed out. | ||
""" | ||
# Update the status bar to indicate a failure state. | ||
self.set_status("Connection to server timed out, please try again.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is leftover from my "message" branch and should be removed from this PR.
@joshuathayer epic stuff and a much needed refactor. Regarding QTimer... my understanding was that it was something that couldn't be blocked (i.e. Qt handled this automagically when the QTimer object was started). But I see from the docs, that we'd need to add an event loop to the thread (http://doc.qt.io/qt-5/qtimer.html#details), or do something else (as you've done) so good catch there! Everything else you've done makes complete sense to me and I like how we're simplifying (or should that be decomplectifying, ;-) ?) . In any case, this completely resonates with the "just build something that works, then refactor when we know more" philosophy of my first pass at this. If we can wrap all the thread-y stuff in this way, it'll make any future API calls much easier. Also, see my comments in #80 about the signal not indicating success/fail via a bool so we fix #71. Could this be wrapped into this PR..? Happy to do this work if need be. |
7d8f550
to
d23affe
Compare
I started making minor changes to this (tests/linting fixes) so I could test and merge... but then I just remembered since this PR was proposed we added the use of |
@@ -67,7 +68,8 @@ def call_api(self): | |||
if self.i_timed_out is False: | |||
self.call_finished.emit(result_flag) | |||
else: | |||
logger.info("Thread returned from API call, but it had timed out.") | |||
logger.info("Thread returned from API call, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note this is basically just for informational purposes for devs right now so this one line I no covered
0944f97
to
5323e5f
Compare
OK this seems to work fine in non-Qubes. In Qubes I was able to download a file but I'm still seeing a lot of timeouts when downloading files... @kushaldas or @joshuathayer if you get a chance to try this out in Qubes please do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice work..! (I read through the code and tested it runs fine against a local SD server and tests pass).
Question: do we want to wrap up all the threading related changes into this one handy PR (see #80)..?
closing this in favor of #85 |
(which also has these commits) |
This fixes our API request timeout handling, and in the process tries to clean up how we handle our API thread in general.
As a quick overview: in order to retain application interactivity while we do remote API calls, we run those API calls in a separate thread (as provided by Qt's
QThread
) from our main UI thread. That's a common and recommended pattern.Specifically, we have a
logic.Client
class, which is responsible for spawning and managing an API thread.Client
holds a ref to anclient.APICallRunner
object, which represents the remote API request. When it's time to run the request, theClient
spawns a new thread, "moves" itsAPICallRunner
object to the new thread, and starts the new thread. TheAPICallRunner
'scall_api
method does the actual remote request, and its thread is blocked while that request runs.The
Client
is responsible for managing that thread: it needs to have methods to handle successes and failures, including timing out.The first issue this PR addresses was that we were started a
QTimer
in theAPICallRunner
thread, setting its timeout handler to a function in that thread. But since that thread is blocked while the remote request is running, it has no way of handling that callback until the remote request returns. This led to a bug where both the success and timeout handlers were being run: the request would take longer than 5 seconds and then eventually succeed. In the meantime, the timer would arrange for the timeout handler to be run on the API thread. When the remote request returned and the thread became unblocked, the success handler would be run (since the request succeeded), and also the timeout handler would run (since the timer arranged for that to happen).I moved the timer and its handler into the
Client
object and its thread. When the timer runs out, we abandon it and itsAPICallRunner
object (from the perspective ofClient
), but the thread is still running and will happily call its success callback when it eventually gets unblocked. So, we set a flag in the CallRunner object telling it that it's run out of time, and should silently do nothing when it eventually becomes unblocked.Next, this tidies up our callbacks a bit, by handling
Client
-object state changes after request/failure in Client-wide functions, rather than in individual callbacks. After every request, we want to reset the client to an "empty" state: noAPICallRunner
object, no API thread, no callbacks registered. There are slightly different ways to do this on success and on failure. Rather than duplicating that logic in every callback handler, this PR now wraps the user-specific callback in a small function which handles the cleanup on its own.Note that this involves a small change to how callbacks get to any response data. Previously, they'd look at
self.api_runner.result
. In this PR, we resetself.api_runner
before the callback is run, soself.api_runner.result
will never be defined while in the callback. Rather, the result is passed as an argument to the callback function directly, so callback signatures should be of the formwhere result is a boolean, and result_data is the actual data returned from the API (previously found at
self.api_runner.result
)This change allows us to greatly simplify how the client is reset between requests, and reduces the responsibilities of the success/failure callbacks (or, if you've consumed the same koolaide as me, it decomplects those responsibilities).
Multitheaded stuff is notoriously hard to test and to reason about. I only found these problems because I'm developing against a remote SD instance and was running in to timeout issues. I've manually tested various success and failure cases, but I suspect that there are still race conditions and cases that cause unexpected behavior... though I do think this puts us in a better state than before?
Also this work originated as part of my branch adding message loading. I manually pulled out this threading stuff, but there are some changes remaining here which have to do with message loading still... I'll to clean those up shortly.