decide on synchronization strategy when pushing to servers #51
Comments
I'm in favor of option 2. This is how chat apps (Signal, WhatsApp, Gitter) work. The message appears in the conversation with a spinner then a check when it's received or a
I'm also in favor of option 2 from a UX perspective. Of course, properly implementing two-way sync is a known hard problem :) There's been a bunch of exciting movement in the world of distributed, eventually-consistent databases, particularly with the popularization of offline-first application architectures. Apache's CouchDB, for example, says right in the marketing copy
"Challenging network infrastructure" sounds like our environment! It's probably a bit late in the game to change our client's underlying database, but we should be aware that there's a whole world of research and established techniques for distributed, eventually consistent applications, and, to the extent possible and reasonable, try to incorporate those ideas in whatever we design. |
yep - 2 is definitely better and what we want, but is non-trivial

hm, CouchDB looks very interesting @joshuathayer! agreed re: trying to use standard techniques where we can. for the alpha we'll stick with a relatively simple approach. that said, if there are systems, e.g. CouchDB, that can do some of this work for us, we're definitely down to consider them given the importance of maintaining consistent state and not trying to reinvent the wheel.

what about this way forward for now:
implementation-wise: the UI is reflecting the state of the local database, so one way to implement this is to commit the change to the database (we do need to commit the change for the UI to update), then revert the change if step 3 fails (bearing in mind that other user actions can happen between the "change A sent to server" and "change A confirmed by server"). we could do this UI rollback logic via callbacks passed to
thoughts?
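The commit-then-rollback idea above can be sketched roughly as follows. This is a minimal illustration, not the client's actual code: `LocalStore`, `star_optimistically`, and the `send` callable (which takes success and failure callbacks) are all hypothetical names, and an in-memory dict stands in for the local database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class LocalStore:
    """Hypothetical stand-in for the client's local database."""
    starred: Dict[str, bool] = field(default_factory=dict)


def star_optimistically(
    store: LocalStore,
    source_id: str,
    new_value: bool,
    send: Callable[[str, bool, Callable[[], None], Callable[[], None]], None],
) -> None:
    """Commit a star change locally, then hand rollback logic to the sender."""
    previous = store.starred.get(source_id, False)
    store.starred[source_id] = new_value  # commit first so the UI updates

    def on_success() -> None:
        pass  # local state already matches what the server confirmed

    def on_failure() -> None:
        # Other user actions may have happened since the request was sent.
        # Only undo the value this call wrote; leave newer changes alone.
        if store.starred.get(source_id) == new_value:
            store.starred[source_id] = previous

    # `send` is assumed to call exactly one of the callbacks later.
    send(source_id, new_value, on_success, on_failure)
```

The guard in `on_failure` is one simple way to respect the caveat about actions that happen between "sent" and "confirmed"; it is not a complete answer to interleaved changes.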
Yep that all makes sense! One danger comes from here:
Imagine a state machine like
where A and B are change actions. Consider a situation where the user takes action A and we optimistically advance application state to X. Then the user takes action B, and we advance application state to Y. If A and B eventually succeed on the server, then everything is fine. But! If A fails on the server and B succeeds, then we're in a weird state: the server is in some state

Less likely but still possible are out-of-order problems: what happens if A and B happen in the client in rapid succession but, because of weirdness over Tor, B hits the server first? If A and B are both valid transitions from the state of the server but aren't commutative, the server transparently arrives at a different state than the client (

I promise I'm not trying to be a jerk! And I don't have simple answers to these problems, aside from the laborious process of examining the actions we'd actually be taking, considering ways in which their success and failure could interact, and programming defensively against those scenarios. It could be that in practice we don't have to worry much about the more tangled consistency issues.
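To make the out-of-order problem concrete, here is a minimal sketch that assumes the two actions are starring and unstarring a source. The `star` and `unstar` functions and the dict-based state are illustrative only; they are not taken from the client or server code.

```python
from functools import reduce


def star(state: dict) -> dict:
    """Action A: mark the source as starred."""
    return {**state, "starred": True}


def unstar(state: dict) -> dict:
    """Action B: mark the source as not starred."""
    return {**state, "starred": False}


def apply_all(state: dict, actions) -> dict:
    """Apply actions left to right, returning the final state."""
    return reduce(lambda s, action: action(s), actions, state)


initial = {"starred": False}

# Order in which the user clicked, applied optimistically on the client.
client_state = apply_all(initial, [star, unstar])

# Order in which the requests happened to arrive at the server.
server_state = apply_all(initial, [unstar, star])

# Each action is idempotent on its own ...
idempotent = star(star(initial)) == star(initial)

# ... but the pair does not commute, so client and server disagree.
diverged = client_state != server_state
```

Both requests succeed here, yet the client ends up unstarred while the server ends up starred, which is why idempotence alone is not enough to make optimistic updates safe.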
We could also take a hybrid approach: actions which are idempotent and commutative (starring, tagging...) could be done optimistically, and failure could be handled simply: "enqueue this request and try again until it works". Actions whose eventual failure (or misordering) could land the system in an inconsistent or incorrect state could be done synchronously, either beachballing the entire UI or some elements of it until the request either succeeds or fails.
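A rough sketch of that hybrid split, under stated assumptions: `HybridDispatcher`, the `OPTIMISTIC_ACTIONS` set, and the `send(action, payload) -> bool` callable are hypothetical, and which actions really belong in the optimistic set is exactly the question this thread is working through.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# Assumption: the examples named above. The real list needs per-action review.
OPTIMISTIC_ACTIONS = {"star", "tag"}


class HybridDispatcher:
    def __init__(self, send: Callable[[str, str], bool]) -> None:
        self.send = send  # returns True when the server accepted the request
        self.retry_queue: Deque[Tuple[str, str]] = deque()

    def dispatch(self, action: str, payload: str,
                 apply_locally: Callable[[str], None]) -> bool:
        if action in OPTIMISTIC_ACTIONS:
            # Optimistic path: update locally first, retry later on failure.
            apply_locally(payload)
            if not self.send(action, payload):
                self.retry_queue.append((action, payload))
            return True
        # Synchronous path: the caller is expected to block the relevant UI
        # until this returns, and local state changes only on success.
        accepted = self.send(action, payload)
        if accepted:
            apply_locally(payload)
        return accepted

    def retry_pending(self) -> None:
        """Make one pass over the queue, keeping requests that fail again."""
        for _ in range(len(self.retry_queue)):
            action, payload = self.retry_queue.popleft()
            if not self.send(action, payload):
                self.retry_queue.append((action, payload))
```

Retrying is only safe for requests that can be repeated and reordered without changing the outcome, so anything that fails that test should stay on the synchronous path.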
excellent points, thank you @joshuathayer!

i thought a bit more about this - what follows is each write action possible by securedrop clients, and proposals for what the client should do, such that we don't enter into any of the inconsistent states you describe. feel free to point out issues here!

considering first this state machine and scenario:
my understanding is that for this to happen, there have to be actions that are valid at state Y that were invalid at state X. we are fortunate (for now - we'll have to be careful about this until we modify our synchronization process) in that for securedrop we only have two of these:
(we do however have actions that are invalid at state Y that were valid at state X, i.e. all delete actions, so we may block some actions which is ok)
the good news is that many of our actions are idempotent (see below section). the bad news is two of these actions do not commute: unstarring and starring do not commute with each other (

in the non-idempotent cases we have deletion, but the server will report a failure for any actions attempted after a deletion has occurred, so we can safely rollback (and deleting the source does not enable any other previously-disallowed actions).

replying is the one case where the ordering really does matter from the user's perspective, so we should sync the API often to update the replies locally. but it still may be the case that replies arrive at approximately the same time just as in other chat applications, and clients may not agree on the ordering. in this case, the server keeps track of the true global ordering (based on time of arrival) and clients will update to reflect that after sync.

summary of possible write actions and the failure behavior
idempotent actions
non-idempotent actions
suggested proposed callback behavior for each action and implementation notes
idempotent actions
non-idempotent actions
other thoughts
timeouts
if we just don't get a response from any of the above actions, then the change may or may not have been committed to the server, in which case we should sync the API.
durability
queue/retry
proposal for at least beta: we add a queue for processing these API calls and add retry logic (ideally only after transient errors like network issues).
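The timeout and queue/retry notes above could fit together along these lines. This is a sketch only: `ApiJobQueue` and its attributes are hypothetical names, and mapping Python's built-in `ConnectionError` to "transient" and `TimeoutError` to "outcome unknown, sync needed" is an assumption about how the API layer would report failures.

```python
import queue
from typing import Callable, List


class ApiJobQueue:
    """Hypothetical job queue for API calls with bounded retries."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.jobs: "queue.Queue" = queue.Queue()
        self.max_attempts = max_attempts
        self.needs_sync = False  # set when a result is unknown
        self.failed: List[Callable[[], None]] = []  # candidates for rollback

    def enqueue(self, job: Callable[[], None]) -> None:
        self.jobs.put((job, 0))

    def process_once(self) -> None:
        """Run the next job, classifying any failure."""
        job, attempts = self.jobs.get_nowait()
        try:
            job()
        except TimeoutError:
            # No response: the change may or may not have been committed,
            # so do not retry blindly; ask for a sync instead.
            self.needs_sync = True
        except ConnectionError:
            # Transient network error: retry up to max_attempts in total.
            # Note this re-queues at the back, which reorders the job
            # relative to later ones; that matters for non-commuting actions.
            if attempts + 1 < self.max_attempts:
                self.jobs.put((job, attempts + 1))
            else:
                self.failed.append(job)
        except Exception:
            # Anything else is treated as permanent: no retry.
            self.failed.append(job)
```

A worker would call `process_once` in a loop, run a sync whenever `needs_sync` is set, and roll back the local change for anything that lands in `failed`.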
Folks, thanks for such a stimulating set of comments. @redshiftzero that's an epic and really useful summary of the various actions and states. Nice one! I'm not sure I can add much more to this discussion because:
Thoughts on UI: Instead of Beachball-mageddon, why not have a status bar across the bottom of the application (as browsers used to do) which contains a message/icon indicating the current activity/state..? It's a single place for the user to look and means we only have one beachball to worry about IYSWIM from a code perspective. Of course UX may indicate such a status bar should go somewhere else... I'm anxious we retain simple and easy to maintain code. |
ah right this was picked up in #63

otherwise I think we have a decent plan for the alpha behavior and I'm going to close this... I'll link out to my comment re: proposed callbacks on the relevant issues

note there is room after the alpha for further thought on this (i.e. i don't want to shut down the conversation if people have more ideas on sync strategy in the future, but i do want to signal that this is the plan for the alpha 😉)
Jeebus, my brain hurts reading all of that! GitHub needs a nerd-heart emoji, as the simple red heart just doesn't cut it sometimes. :) @ntoll I'm guessing you guys have a set solution in place for Alpha (that will be great, of course!), and that the timed/animated Invision clickthroughs I posted last week adequately display intent-goals for Beta. Pls let me know if anything else is needed; happy to provide! |
Users have reported printers not working because of `ppdc` warning messages reported in the client (asking them to contact an administrator). These warnings do not result in non-zero return codes and are seemingly really just warnings, so no need to get users involved. Fixes #51
Supports logout endpoint (fixes #51)
so far our sync logic is pull logic - it fetches everything on the server, and treats it as the canonical source of truth for updating local databases in securedrop clients. this issue is about a strategy for pushing new changes to servers.
something to bear in mind is that interfaces feel sluggish to humans if there's a >100ms response time. the mean round trip latency for tor is greater than 100ms, so any activity involving interaction with the securedrop server is going to feel sluggish to users if we wait for changes to be successfully pushed to servers prior to reflecting it in the GUI.
push option 1: apply to the server prior to reflecting locally and make the user wait i.e. let it be slow. UX wise we'd need to build in a lot of spinning beach balls so that the user knows that something is happening. this is easier logic-wise and it's what i've done in #50, but it's not necessarily the best strategy long term as spinning beach balls are probably pretty annoying to users.
push option 2: make the change locally, then attempt to apply it on the server. rollback if there was a failure applying the change to the server (there are lots of reasons this can happen: tor is flaky, securedrop application server goes down, etc.). we'd need to present the failure to push the change to the server to the user in this scenario somehow.