Make deletion of multiple sources asynchronous #5257
Conversation
This pull request introduces 2 alerts when merging 41e8297 into 696b47a - view on LGTM.com new alerts:
9846e2c to d813aa4
this looks great! thanks @rmol. tested both in dev env and with ~150 sources in staging, confirmed keys and files both deleted async after deletion is scheduled.
at least before release, we should add a note to the docs about what "scheduled for deletion" means. The user might not realize that the deletions start processing basically immediately, just asynchronously, and might instead wonder/worry "when is it scheduled for?" (cc @ninavizz and @eloquence on any thoughts there).
If another person (@emkll or @zenmonkeykstop) wants to take a look in the next couple days given the scope of the changes, please do, else this is good to merge when CI passes
@@ -38,16 +38,16 @@ def name_length_validation(form, field):


 class NewUserForm(FlaskForm):
-    username = TextField('username', validators=[
+    username = StringField('username', validators=[
(for other reviewers: TextField is deprecated)
Yeah, I was just trying to reduce the noise.
Thanks for flagging the UX implications @redshiftzero. I gave it a quick spin; I agree we may want to tweak the notification language a bit and/or address this in docs. Will discuss w/ Nina & John. Should we add a "SecureDrop Client" section to the test plan, since this does modify the API and has potential implications for sync behavior?
Question for @rmol: What motivates changing the user experience at all? If I'm understanding the scope of changes correctly, we were using a potentially long-running asynchronous process for the deletion of files on-disk before, and are using a potentially long-running asynchronous process for the deletion of files and database records now, to prevent timeouts at any stage of the process. Do I have that right? If so, is there an argument against keeping the Journalist UI exactly as it was? I do think we should document deletion behavior a bit more so admins and curious journalists understand how it works under the hood, but I'm not sure we need to expose those mechanics to the user.
@redshiftzero That's a good point. I'd suggest that we consider changing the wording to "scheduled for immediate deletion", or "deletion of 100 collection(s) has begun".

@eloquence Yes, I'll add to the test plan a check of the implications for the client.

@eloquence My motivation for changing the user experience was that a journalist's perception of what is happening in the system should match reality. You have it right: before, when we said the keypairs and Source records had been deleted, they had been; now, a source's data can remain on the server for a while after that message is shown. It's a better situation than being unable to delete large batches at all when you need to, but I think it's important to be clear about the true status of the source data. I believe that should absolutely be indicated to journalists, at the very least in the confirmation message. I'd like to go further and make it possible to monitor pending deletions somewhere in the journalist interface. If we don't change the UI at all and the only notification of this change is in the docs, few people are going to be aware of it and understand what's actually happening, and I think that's dangerous.

It occurs to me that the client already exposes the asynchronous nature of source deletion, with a visible pending state. It wouldn't be the worst thing for both experiences to be consistent.
I'm not sure I follow. In the current production experience, when I delete a batch of sources, it immediately says "3 collections deleted", but the shredder may not have deleted the associated files yet, no? In that sense it seems we're merely expanding the asynchronicity of an operation that's already partially asynchronous?
That is correct. But the keypairs and Source records are gone by the time the response is rendered.
Yes, but I think the fact that all the sources' data will now be available for a short while is significant and should be made clear.
Given that this currently only schedules immediate deletions, it looks the same as current behaviour from the journalist perspective, so a UX change doesn't seem necessary at this point IMO. (If an option for disappearing messages/trash messages is added subsequently, it would probably be worth differentiating deletion types at that point.)
I feel it's very confusing for a user to execute a "Delete" command, and to then be told their data is still available and will actually be deleted later. I appreciate that that's how it works, and that transparency to users is critical in security. This degree of both, however, I feel causes confusion and too much friction with existing mental models and expectations of how "Delete" should work. Also, "Schedule 'x' Task" is a command option in many apps (such as scheduling emails or Tweets to be sent), and when Erik first pinged me about this my first thought was "I didn't schedule a delete, I just wanted it to delete!" If anything, offering an "Undo" feature while the data is still available meets existing user mental models—but of course that would be either inaccurate, or just odd, in our context.

SD is a very complicated product, and it's a lot of balance between what of that complexity to expose to users and what not to. If the user cannot meaningfully act upon something, making it clear to them that the data remains available provides no value, and may even create confusion. My $0.02. :)
I've changed it to say "secure deletion of 1 collection has begun", so there's no confusion about whether the journalist "scheduled" the operation.
388f7d7 to f3c5dec
Hey John! Thanks for these quick updates. Erik spun them up on his dev env for me, and I had a chance to poke around a bit.

Primary takeaway for me is that we're now introducing a partial concept, vs. more concretely informing the user of the risk introduced by a class of actions. The concept of secure deletion is presented against only 1 of 3 deletion actions the user can take. As the user, I'm left to ask: "Why can't I securely delete files or messages? There's probably a reason I can't securely delete a Source while in the Source Page view." I've observed SecureDrop users to assume a lot of odd things about how things are working under the hood, when concepts aren't clear to them.

TL;DR, I think it would be better for now to keep the old deletion user experience, and make consistent changes across all deletion actions in a future UI iteration. Happy to do a call with you to discuss further. I appreciate all the thought for users, so much. :)
f3c5dec to 4fb965f
The flash message text has been reverted.
Add Source.deleted_at timestamp. Start generally filtering sources that have it set. When deleting multiple sources with journalist_app.utils.col_delete, just mark them deleted with this and defer the removal of their submissions, keypair and database record to asynchronous processing. Add a new systemd service to periodically purge sources marked for deletion.

It's been decided that the journalist experience will not be changed to make this new mechanism apparent; the messaging remains the same and there is no facility for monitoring the deletion process.

Assign explicit names to source GPG keys. Instead of using the python-gnupg default of "Autogenerated Key", label source keys with "Source Key". When asked to delete a reply keypair, verify that its UID contains one of those labels.
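In case it helps reviewers picture the new service: it amounts to a small polling loop over sources flagged for deletion. A minimal sketch of that shape, assuming hypothetical helpers (sources_marked_for_deletion, delete_collection) and an illustrative polling interval rather than the actual SecureDrop code:

```python
import time

POLL_INTERVAL = 60  # seconds between passes; the real interval is an assumption


def sources_marked_for_deletion():
    """Hypothetical helper: would query for sources whose deleted_at is set."""
    return []


def delete_collection(source):
    """Hypothetical helper: would remove the source's submissions, keypair and database row."""


def run():
    # Runs as a long-lived systemd service, so deletions interrupted by an
    # error or a reboot are simply picked up again on the next pass.
    while True:
        for source in sources_marked_for_deletion():
            delete_collection(source)
        time.sleep(POLL_INTERVAL)


if __name__ == "__main__":
    run()
```

The ordering matters for recovery: as discussed in the Deployment notes below, files and the keypair are removed before the source's database record, so a pass interrupted partway through is retried later.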
4fb965f to ee06369
approving, thanks @rmol!
Status
Ready for review
Description of Changes
Add Source.deleted_at timestamp. Start generally filtering sources that have it set. When deleting multiple sources with journalist_app.utils.col_delete, just mark them deleted with this and defer the removal of their submissions, keypair and database record to asynchronous processing. Add a new systemd service to periodically purge sources marked for deletion.

Assign explicit names to source GPG keys. Instead of using the python-gnupg default of "Autogenerated Key", label source keys with "Source Key". When asked to delete a reply keypair, verify that its UID contains one of those labels.
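To make the mark-and-filter behaviour concrete, a self-contained sketch follows; the deleted_at column and the filtering match the description above, while the stripped-down model, field names and helper names are illustrative rather than the actual SecureDrop code:

```python
import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Source(Base):
    """Stand-in for the real model; only the new column matters here."""
    __tablename__ = "sources"
    id = Column(Integer, primary_key=True)
    journalist_designation = Column(String)
    deleted_at = Column(DateTime, nullable=True, default=None)  # the new timestamp


def mark_deleted(session, sources):
    """Roughly what col_delete now does: flag the sources and return immediately."""
    now = datetime.datetime.utcnow()
    for source in sources:
        source.deleted_at = now
    session.commit()


def visible_sources(session):
    """Listings now exclude anything scheduled for deletion."""
    return session.query(Source).filter(Source.deleted_at.is_(None)).all()


if __name__ == "__main__":
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    session = sessionmaker(bind=engine)()
    session.add_all([Source(journalist_designation="source %d" % i) for i in range(3)])
    session.commit()
    mark_deleted(session, session.query(Source).limit(2).all())
    print(len(visible_sources(session)))  # 1 source remains visible
```

In the real changes, queries that list sources add the same deleted_at filter, and the new systemd service periodically removes whatever has been flagged.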
Fixes #5233
Testing
Development configuration
1. Check out this branch (5233-spring-cleaning-support).
2. Run NUM_SOURCES=100 make dev.
3. Delete some sources in the journalist interface. In the terminal running make dev, you should see the source_deleter script deleting the individual sources and their data.

Production configuration

1. Check out this branch (5233-spring-cleaning-support).
2. Run make build-debs && make staging to provision a staging environment with these changes.
3. Use qa_loader.py or another means of populating the staging database with a large number of sources (100 should be sufficient to have triggered the timeout before this change, but 500 or 1000 would be better).
4. Check that the source_deleter service is running on the app server with systemctl status securedrop_source_deleter.
5. Delete the sources in the journalist interface, then follow the source_deleter logging on the app server with journalctl -fu securedrop_source_deleter. Deleting 1000 sources took almost half an hour in my testing.

Deployment
Making the deletion of sources asynchronous makes it possible to delete many at once without having the operation time out. The fact that sources are not removed immediately is indicated only by the changed text of the flash message after they are marked for deletion: it now says "n collections scheduled for deletion".
As mentioned above, deleting large numbers of sources can take a while, so there's the possibility of system errors or a reboot interrupting them. The source_deleter process should resume removal of sources marked for deletion when the system comes back up.

Errors during deletion will only appear in the system logs, and then in OSSEC alerts. At this point problems are going to require support, but the failure modes should be few and unlikely. Deletion of submissions is handled by the existing (reasonably well-tested by now) shredder mechanism. Deletion of GPG keys could fail if permissions on /var/lib/securedrop/keys were altered, or there were a bug in GnuPG/python-gnupg, or there were some other ephemeral error on the server, but in those cases the deletion should abort before removing the source database record, so the operation would be retried.
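As a rough illustration of that key-deletion safeguard, here is a sketch using python-gnupg directly; the keyring path and the "Source Key"/"Autogenerated Key" labels come from this PR, while the function name, the error handling, and the assumption of a python-gnupg version that supports expect_passphrase are mine:

```python
import gnupg

# Labels applied to source reply keys; older keys still carry the python-gnupg default.
SOURCE_KEY_LABELS = ("Source Key", "Autogenerated Key")


def delete_reply_keypair(fingerprint):
    """Delete a reply keypair only if one of its UIDs carries a source-key label."""
    gpg = gnupg.GPG(gnupghome="/var/lib/securedrop/keys")
    keys = {k["fingerprint"]: k for k in gpg.list_keys(secret=True)}
    key = keys.get(fingerprint)
    if key is None:
        return  # already gone; nothing to do
    if not any(label in uid for uid in key["uids"] for label in SOURCE_KEY_LABELS):
        raise ValueError("refusing to delete a key that is not labelled as a source key")
    # Delete the secret key first, then the public key. If this raises, the
    # caller should leave the source's database record in place so the purge
    # is retried on a later pass.
    gpg.delete_keys(fingerprint, secret=True, expect_passphrase=False)
    gpg.delete_keys(fingerprint)
```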
Checklist

If you made changes to the server application code:
- Linting (make lint) and tests (make test) pass in the development container

If you made changes to the system configuration:

If you made non-trivial code changes: