app: add pending reply status, persist replies in the database #578
Conversation
Question: is the above video just looping, or does it ever go from
Oh yeah, the gif is just looping. I can confirm that the urgent-coral bar will only show up if there is some problem that requires user intervention. For example, if the reply send fails the first time due to a network error, the reply gets resent transparently to the user without turning the color bar urgent-coral.
Today I thought a bit about how to handle the server/client reply order state synchronization (the remaining issue with this branch, and the underlying issue behind #489). Here are my thoughts on how to proceed; comment if you think there's an error in my logic or that I'm overlooking a simpler/more elegant solution. First, here are some test cases that we need to consider and handle. For all cases we only consider a single source. Other requirements/constraints are:

Case 1: Single journalist, multiple successful replies
Expected ordering (server and client): M, X, Y

Case 2: Single user, multiple replies, some that fail
Expected ordering (server): M, Y

Case 3: Multiple clients, multiple successful replies
Expected ordering (server, journalist A, journalist B): M, X, Y

Case 4: Single journalist, multiple successful replies, source messaging at the same time
Expected ordering (server, journalist A, journalist B): M1, M2, X

Current behavior

New behavior
Here's my current thinking on the best way to handle this:
Quick feedback while I'm testing this:
The client won't open now, after I closed the application while a reply had failed to send. I'm seeing:
File "/home/creviera/workspace/freedomofpress/securedrop-client/securedrop_client/storage.py", line 166, in update_sources
delete_single_submission_or_reply_on_disk(document, data_dir)
File "/home/creviera/workspace/freedomofpress/securedrop-client/securedrop_client/storage.py", line 437, in delete_single_submission_or_reply_on_disk
filename_without_extensions = obj_db.filename.split('.')[0]
AttributeError: 'DraftReply' object has no attribute 'filename'
Ahh thanks, forgot the source sync deletion iterates through the source.collection.
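For illustration, this is roughly the kind of guard that fix implies, as a sketch rather than the actual patch (the wrapper function name is made up; delete_single_submission_or_reply_on_disk is the storage helper from the traceback above):

```python
# Sketch only: skip drafts when deleting a source's items from disk, since
# DraftReply rows live only in the database and have no filename.
from securedrop_client.db import DraftReply
from securedrop_client.storage import delete_single_submission_or_reply_on_disk


def delete_collection_files_on_disk(source, data_dir: str) -> None:  # hypothetical helper
    for item in source.collection:
        if isinstance(item, DraftReply):
            continue  # nothing on disk to delete for a draft
        delete_single_submission_or_reply_on_disk(item, data_dir)
```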
Not sure what's happening but all my replies are now saved in
Update: I was wrong about replies sending on
For the interested observer: @creviera's report is correct - replies now fail to send - however this is due to #598 on master (to reproduce on master you must first add a new source, since the issue appears to be something related to gpg key imports).
I can confirm that after deleting the lock file cited in freedomofpress/securedrop#4909 (which from my testing just now appears to be the cause of #598), testing this branch in Qubes works without issue for me.
STR:
- Send a reply that you know will fail (cut connection to server)
- Send a couple other replies (you'll see them stay in the pending state)
- Click retry (nothing should change)
- Fix connection to server and refresh
Expected
The red reply and pending replies that follow are sent
Actual
The red reply shows up once in red as failed, followed by a copy of itself once in green as successful. The pending replies that follow are sent and green.
Interesting! I think what is happening here is that the reply send does time out: the reply status is failed and it is stored as a draft on the client side. However, the reply actually did get saved on the server; we just didn't get the response due to network issues. I think the right place to handle this is during the sync action for new replies: we should check if there is a draft locally with the same uuid as the new reply from the server; if yes, we simply delete the local draft. I'll implement this and then retest your scenario.
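A minimal sketch of that dedup check, assuming a SQLAlchemy session and the DraftReply model (the function name and call site are illustrative, not the code in this branch):

```python
from securedrop_client.db import DraftReply


def remove_duplicate_draft(session, remote_reply_uuid: str) -> None:
    """If a reply we marked as failed was actually stored on the server,
    a draft with the same uuid still exists locally; delete it so the
    confirmed reply from the sync replaces it in the conversation."""
    draft = session.query(DraftReply).filter_by(uuid=remote_reply_uuid).one_or_none()
    if draft is not None:
        session.delete(draft)
        session.commit()
```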
during the sync, we don't attempt to delete draft replies in the source.collection: they aren't stored on disk, but will get deleted by the cascade delete when the source is deleted. however, we also ensure that duplicate drafts are cleaned up on sync to handle the scenario where a ReplySendJob "fails" but the reply _was_ actually saved properly on the server.
So - there's a remaining issue with the queue pausing which will complicate testing (basically, as is, the queue isn't pausing when the server is completely down rather than timing out, which requires proxy changes): that's freedomofpress/securedrop-proxy#128 - we'll need to handle this outside of this PR.
Looking good so far, just made a follow-up issue that we can discuss on Monday.
Switching to code review now.
If a user sends multiple replies A, B, C, the order should always be A, B, C, even if:
- A fails
- B sends successfully
- C is pending

We ensure that once a reply sends successfully - or we find that a reply _did_ send but we have it marked as failed because we never got the response (h/t @creviera for testing this case) - C appears after B. This is done by updating the file_counter.
added a commit 95db9b4 to handle a case @creviera and I just discussed synchronously:
At this point reply A has been successfully sent on the server in this scenario - we just never got the response. That reply will appear as failed locally until we sync and find out otherwise. In real world use this can happen, and we now have logic in this PR to handle this situation on sync: if we find that a draft exists locally matching a successfully sent reply on the server, we remove the local draft and update to reflect the server state locally. However, in this case we also need to ensure that pending reply B will appear after the now-successful reply A in the conversation view for the user. To do so, we run the logic that runs after a reply sends successfully, updating the ordering of draft replies to move the drafts to after the confirmed reply.
Yeah, so what I'm seeing is:
This is because, as you say, Reply A actually sent and made it to the server in step 2 but the client doesn't know about it yet. It would be nice to be able to see Reply A as successful before Reply B shows up as successful, because there's about 10 seconds of it looking like Reply A was never sent. You might have mentioned the reason not to sync the client with the server before starting the queue again, but I can't remember it, so it would be helpful to have an explanation in writing somewhere.

Update: Eventually we will move the metadata sync to a job that will be added to a queue at a set interval but could also be added to the main queue when the sync icon is clicked or when we do things like unpausing a queue. I think it might make sense to make what I'm describing above into an issue and mention that we could prioritize a metadata sync job before any other job on the queue.
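To make the prioritization idea concrete, here is a toy sketch using a plain PriorityQueue; the real client queue and job classes will differ, and the priority values are invented:

```python
from queue import PriorityQueue

# Lower number == higher priority; the values here are illustrative only.
JOB_PRIORITIES = {"MetadataSyncJob": 0, "SendReplyJob": 10, "FileDownloadJob": 10}

job_queue: PriorityQueue = PriorityQueue()


def enqueue(job_name: str) -> None:
    job_queue.put((JOB_PRIORITIES.get(job_name, 10), job_name))


# Even if the sync is enqueued after other work (e.g. right after unpausing),
# it is processed first.
enqueue("SendReplyJob")
enqueue("MetadataSyncJob")
print(job_queue.get())  # (0, 'MetadataSyncJob')
```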
Reply A is always showing up before Reply B now, but instead of Reply B showing up as successful it now shows up as failed (in the case where we close the client, see steps to repro below):
Before the latest commit, I'm pretty sure Reply B would send. I'm super curious to see what's going on here, but might have to wait until tomorrow morning.
Oh wait, actually, this is what we want! We don't want to automatically send pending replies between different client sessions. Okay, nvm, ignore latest comment.
So the only thing I haven't reviewed yet is the tests. I have some comments and a few questions, nothing really blocking the PR, but it would be nice to have another day to let all these changes sink in and chat with you more about design etc.
@@ -6,7 +6,8 @@
from securedrop_client.api_jobs.base import ApiJob
from securedrop_client.crypto import GpgHelper
from securedrop_client.db import Reply, Source
from securedrop_client.db import DraftReply, Reply, ReplySendStatus, ReplySendStatusCodes, Source |
so organized!
session.commit()
return reply_db_object.uuid
except RequestTimeoutError as e:
This is out of the scope of this PR, but shouldn't we also be catching AuthError and ApiInaccessibleError and raising a custom exception that includes reply_uuid and message, like we do with SendReplyJobTimeoutError?
:o yes... yes we should
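Something along these lines, perhaps; a sketch of the shape rather than a drop-in change (SendReplyJobError is a hypothetical name, the import paths are assumptions, and the API call signature is approximated):

```python
from sdclientapi import AuthError, RequestTimeoutError  # import path assumed
from securedrop_client.api_jobs.base import ApiInaccessibleError  # import path assumed
from securedrop_client.api_jobs.uploads import SendReplyJobTimeoutError  # import path assumed


class SendReplyJobError(Exception):  # hypothetical sibling of SendReplyJobTimeoutError
    def __init__(self, message: str, reply_uuid: str) -> None:
        super().__init__(message)
        self.reply_uuid = reply_uuid


def send_reply_sketch(api_client, sdk_source, encrypted_reply: str, reply_uuid: str) -> str:
    try:
        # call signature approximated, not copied from the job
        sdk_reply = api_client.reply_source(sdk_source, encrypted_reply, reply_uuid)
        return sdk_reply.uuid
    except RequestTimeoutError as e:
        # existing behavior: carry the uuid so the GUI can mark the right draft failed
        raise SendReplyJobTimeoutError(str(e), reply_uuid) from e
    except (AuthError, ApiInaccessibleError) as e:
        # proposed: same treatment for auth / inaccessible-API failures
        raise SendReplyJobError(str(e), reply_uuid) from e
```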
@@ -54,7 +56,11 @@ def collection(self) -> List:
collection.extend(self.messages)
collection.extend(self.files)
collection.extend(self.replies)
collection.sort(key=lambda x: x.file_counter)
collection.extend(self.draftreplies)
# Sort first by the file_counter, then by timestamp (used only for draft replies).
I wonder if there's a more high-level visible place we should mention how we use timestamps for saved drafts. I can't think of one other than in a docstring for source or in our client architecture doc. I was just hoping to find more information about why we use the timestamp somewhere. I did find your PR comment:
this is the local timestamp that the reply was sent and this will be used for ordering local replies when there are multiple attempted replies in between conversation items from the server.
So we could use a local_file_counter instead of timestamp, right? I don't feel strongly about this, but maybe it makes it clearer that we don't actually care about time; we just care about the order in which a reply was drafted, so that we can display the drafts in the correct order in the client?
Or perhaps we'll want to show the timestamp next to the draft to help the journalist remember when they drafted it?
hmm yeah good point - how about I add a description of the ordering situation here to the wiki architecture page?
i.e. saying something like
"draft replies store:
- a
file_counter
which points to thefile_counter
of the previously sent item. this enables us to interleave the drafts with the items from the source conversation fetched from the server, which do not have timestamps associated with them. - a
timestamp
which contains the timestamp the draft reply was saved locally: this is used to order drafts in the case where there are multiple drafts sent after a given reply (i.e. whenfile_counter
is the same for multiple drafts)"
with an example
I actually did call the DraftReply.file_counter field local_file_counter (😇) but then renamed it back to file_counter to simplify the source.collection.sort key. You're right that we could ditch timestamp and have two fields, file_counter and local_file_counter. imho it is slightly more useful to have the actual timestamp locally in case we ever do want to expose the draft timestamp to users (I could imagine that being useful).
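A toy illustration of the interleaving this gives us (the item shapes and values are made up; only the sort key matters):

```python
from datetime import datetime

# Server items carry a file_counter; each draft stores the file_counter of the
# item it was sent after, plus a local timestamp to order multiple drafts that
# share the same file_counter.
server_items = [
    {"name": "M (message)", "file_counter": 1},
    {"name": "X (sent reply)", "file_counter": 2},
]
drafts = [
    {"name": "draft B", "file_counter": 2, "timestamp": datetime(2020, 1, 2, 10, 5)},
    {"name": "draft A", "file_counter": 2, "timestamp": datetime(2020, 1, 2, 10, 0)},
]


def sort_key(item):
    # Items without a timestamp (the server items) sort before drafts that
    # point at the same file_counter.
    return (item["file_counter"], item.get("timestamp", datetime.min))


conversation = sorted(server_items + drafts, key=sort_key)
print([i["name"] for i in conversation])
# ['M (message)', 'X (sent reply)', 'draft A', 'draft B']
```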
class ReplySendStatusCodes(Enum):
"""In progress (sending) replies can currently have the following statuses""" |
I see a SAVED status in the future 🔮
haha yeah so at some point in this PR, before I created the DraftReply object, this status object was stored on the Reply and had three codes: FAILED, SUCCESSFUL, PENDING. But I removed SUCCESSFUL when I created the DraftReply, as only drafts can be FAILED or PENDING; all replies are by construction SUCCESSFUL.

I could imagine this is a place to store more detailed status codes about the failures in the future, like ENCRYPTION_FAILED_NO_SOURCE_KEY or SEND_FAILED_SOURCE_DELETED. We could also imagine some more granular pending statuses like PENDING_ENCRYPTION_IN_PROGRESS or PENDING_AWAITING_SERVER_RESPONSE.
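For the record, a speculative sketch of where that could go; only PENDING and FAILED reflect this PR, and everything else (including the string values) is hypothetical naming:

```python
from enum import Enum


class ReplySendStatusCodes(Enum):
    PENDING = "PENDING"
    FAILED = "FAILED"
    # speculative, more granular codes discussed above:
    PENDING_ENCRYPTION_IN_PROGRESS = "PENDING_ENCRYPTION_IN_PROGRESS"
    PENDING_AWAITING_SERVER_RESPONSE = "PENDING_AWAITING_SERVER_RESPONSE"
    ENCRYPTION_FAILED_NO_SOURCE_KEY = "ENCRYPTION_FAILED_NO_SOURCE_KEY"
    SEND_FAILED_SOURCE_DELETED = "SEND_FAILED_SOURCE_DELETED"
```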
Send reply and emit a signal so that the gui can be updated immediately, even before the
the reply is saved locally.
Send reply and emit a signal so that the gui can be updated immediately indicating
that it is a pending reply.
🎉
@@ -241,7 +244,7 @@ def update_replies(remote_replies: List[SDKReply], local_replies: List[Reply],
* Existing replies are updated in the local database.
* New replies have an entry created in the local database.
* Local replies not returned in the remote replies are deleted from the
local database.
local database unless they are pending or failed.
👍
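Sketched as a toy, the rule in that docstring amounts to this (data shapes are invented; the real update_replies works on ORM objects, not dicts):

```python
# Prune anything the server no longer returns, except replies that are still
# marked pending or failed locally (i.e. drafts the user may retry).
def prune_local_replies(local_replies, remote_reply_uuids):
    return [
        r for r in local_replies
        if r["uuid"] in remote_reply_uuids or r["status"] in ("PENDING", "FAILED")
    ]


local = [
    {"uuid": "a", "status": None},       # gone from the server -> deleted
    {"uuid": "b", "status": "PENDING"},  # kept: still waiting to send
    {"uuid": "c", "status": "FAILED"},   # kept: the user may retry it
    {"uuid": "d", "status": None},       # still on the server -> kept
]
print([r["uuid"] for r in prune_local_replies(local, remote_reply_uuids={"d"})])
# ['b', 'c', 'd']
```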
When we confirm a sent reply, if there are drafts that were sent after it,
we need to reposition them to ensure that they appear _after_ the confirmed
replies.
"""
I would find it super helpful for you to add a quick description for each parameter. For instance, I was confused about how the new_file_counter was found until I looked at where this function is used. It looks like new_file_counter is from the reply (that is causing the draft replies to be reordered) as the server has recorded it? And old_file_counter is from the reply (that is causing the draft replies to be reordered) as the client has recorded it?

Why do we need to set all the draft replies to have the same file_counter as the new_file_counter?

It looks like this works, but I'm just not sure why you decided to approach it this way.
your understanding here is correct - I've added some more explanation and an example in 44c4394, let me know if that makes sense!
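For readers following along, this is roughly the shape of that repositioning step as described in the thread (a paraphrase, not necessarily the exact code added in 44c4394):

```python
from securedrop_client.db import DraftReply


def update_draft_replies(session, source_id, timestamp, old_file_counter, new_file_counter):
    """Move drafts written after the now-confirmed reply so they sort after it.

    old_file_counter is the position the client had recorded for the reply;
    new_file_counter is the position the server assigned to it. Giving the
    drafts the confirmed reply's file_counter guarantees they land after it
    in the conversation, while their timestamps preserve their relative order.
    """
    drafts_to_move = (
        session.query(DraftReply)
        .filter(
            DraftReply.source_id == source_id,
            DraftReply.file_counter == old_file_counter,
            DraftReply.timestamp > timestamp,
        )
        .all()
    )
    for draft in drafts_to_move:
        draft.file_counter = new_file_counter
    session.commit()
```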
The docstring change is really helpful. This change is significant and not only fixes a very important issue of a journalist potentially losing what they write in the ReplyBox, it also sets us up nicely for adding a "Save" feature and allowing users to selectively send a draft or failed reply.
This looks ready to be framed 🖼️
Let's shipit!
Description
Fixes #350.
Fixes #294.
What it looks like now when you send replies (the failed replies will persist between application restarts and clicking between sources):
For a followup:
Test Plan
To generate lots of failed replies and introduce delay, I'm testing using the following server diff:
You can use that, or you can just use staging and rely on regular ol' Tor network issues, to:
Checklist
If these changes modify code paths involving cryptography, the opening of files in VMs or network (via the RPC service) traffic, Qubes testing in the staging environment is required. For fine tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable: