
Deleting large numbers of sources can fail such that it is thereafter impossible to delete the sources #5233

Closed
rmol opened this issue May 6, 2020 · 3 comments · Fixed by #5257

rmol commented May 6, 2020

Description

In the journalist interface, if you select a large number of sources and delete them, the operation can take longer than the Apache timeout. Some sources' store directories will have been moved to the shredder before the failure. Thereafter, if you try to delete them, a ValueError will be thrown at line 246 of store.py, in move_to_shredder. The unhandled exception prevents the deletion of the source record, and this is how we get the zombie apocalypse.

Steps to Reproduce

In an environment using Apache (staging, prod VMs, QA hardware):

  • Add 500 sources with qa_loader.py --source-count 500.
  • Log in as a journalist.
  • Select all sources and click delete.

Expected Behavior

That the sources would be deleted without error.

Actual Behavior

You get a gateway timeout. Navigating back to the source list and trying again results in an internal server error, with a stacktrace in /var/log/apache2/journalist-error.log.

Comments

The fix might be as simple as checking for the existence of the source's store directory in journalist_app.utils.delete_collection and only calling move_to_shredder if it still exists. While there, the key pair deletion should be checked as well, so that if it's already gone, the source database record is still deleted.
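A minimal sketch of that guard, under stated assumptions: `StubStorage` is a stand-in for SecureDrop's real store object, and the method names here (`path`, `move_to_shredder`) are illustrative rather than the project's actual API. The point is only the existence check before shredding, so a retry after a partial failure is a no-op instead of a `ValueError`:

```python
import os
import tempfile


class StubStorage:
    """Illustrative stand-in for the store object; names are assumptions."""

    def __init__(self, root):
        self.root = root
        self.shredded = []

    def path(self, filesystem_id):
        return os.path.join(self.root, filesystem_id)

    def move_to_shredder(self, source_dir):
        # Mirrors the reported behaviour: shredding a missing directory raises.
        if not os.path.isdir(source_dir):
            raise ValueError("will not shred non-directory: %s" % source_dir)
        os.rmdir(source_dir)
        self.shredded.append(source_dir)


def delete_collection(filesystem_id, storage):
    """Guarded deletion: only shred the store directory if it still exists."""
    source_dir = storage.path(filesystem_id)
    if os.path.exists(source_dir):
        storage.move_to_shredder(source_dir)
    # ...the keypair deletion would get the same existence check, and the
    # source database record would then be deleted unconditionally...
```

With this guard, calling `delete_collection` a second time for an already-shredded source simply skips the shredder step rather than raising.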

@eloquence eloquence added the bug label May 6, 2020
@redshiftzero redshiftzero added this to the 1.3.0 milestone May 6, 2020
@eloquence (Member)

Since it's not a regression, the agreement today was to undertake a timeboxed (~2 hour) attempt to resolve this for 1.3.0; if it ends up being more complex, we will likely bump it to 1.4.0.

@rmol rmol self-assigned this May 6, 2020
@redshiftzero redshiftzero modified the milestones: 1.3.0, 1.4.0 May 6, 2020

rmol commented May 6, 2020

To no one's surprise, it turned out to be more complex. When the original deletion request times out, the mod_wsgi process is still seeing it to completion, because it doesn't care that Apache has given up.

Any workaround that presses on past nonexistent source store directories or keypairs can still fail in the final steps of journalist_app.utils.delete_collection, when the source record is queried but has already been deleted. A journalist could keep backing up to the source list, refreshing, selecting all, and deleting, and it might look like their efforts are making headway as the list shrinks, but in fact they would just be tripping over these failures while the initial request is still churning through the deletions.

We could lengthen the Apache timeout, or introduce a request timeout in the mod_wsgi configuration, but neither is a sure fix, and either could introduce other problems. The right thing to do is ensure we generate a response to this request in a reasonable timeframe.
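For reference, the two knobs mentioned would look roughly like this (the values and the daemon process name `journalist` are illustrative, not our actual configuration):

```apache
# Apache-level: how long the server waits before returning a gateway timeout.
Timeout 120

# mod_wsgi-level: force the daemon process to respond (or restart) if a
# request runs longer than this many seconds.
WSGIDaemonProcess journalist request-timeout=120
```

Either setting just moves the cliff; a 500-source deletion can outrun any reasonable value, which is why a fast response plus background work is preferable.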

To that end, I'm going to look at adding a deleted flag to Source. Updating that should be quick. Another background worker process will periodically scan for deleted sources and do the work that is currently done in delete_collection. For now, we'll just omit deleted sources from the journalist interface.
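A rough sketch of that flag-and-worker split, assuming the shape described above; the `Source` class, helper names, and worker loop here are illustrative, not the eventual implementation. The request handler only flips the flag (fast), the interface filters on it, and a background worker does the slow per-source cleanup:

```python
import threading


class Source:
    """Illustrative model: only the new `deleted` flag matters here."""

    def __init__(self, filesystem_id):
        self.filesystem_id = filesystem_id
        self.deleted = False


def mark_deleted(sources):
    """The request handler's job: a quick flag update, then respond."""
    for source in sources:
        source.deleted = True


def visible_sources(sources):
    """The journalist interface omits sources marked as deleted."""
    return [s for s in sources if not s.deleted]


def shredder_worker(sources, delete_collection, stop_event, interval=1.0):
    """Background worker: periodically does the work currently done
    inline in delete_collection, outside any request timeout."""
    while not stop_event.is_set():
        # Iterate over a snapshot so removal is safe mid-loop.
        for source in [s for s in sources if s.deleted]:
            delete_collection(source)
            sources.remove(source)
        stop_event.wait(interval)
```

The key property is that the HTTP request returns as soon as `mark_deleted` commits, no matter how many sources were selected; the shredding happens on the worker's schedule.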

@eloquence (Member)

For the sake of clarity, per a chat with Jen: we'll keep this on the 5/6-5/20 sprint for John to continue investigating, but it stays on the 1.4.0 (not 1.3.0) milestone, so it is not subject to the release/QA timeline pressure.
