Improve deletion of submissions #4713
Replace srm with shred, which is faster, reducing the chance that deletion of a submission will be interrupted. Update rq and redis requirements, to eliminate long-standing bugs. Add manage.py tasks for detecting and correcting submissions that have been disconnected from their files on disk, and vice versa. Update manage.py to explicitly run with the production virtualenv. Also specify the virtualenv in WSGI scripts and the run-test script. In the dev/test Docker container, install requirements in a virtualenv at the same path as production. Add a supervisor script for requeuing interrupted rq jobs. If the app server is rebooted while an rq job is running, that job has already been deleted from the queue and rq will not automatically resume it on reboot, but it does have a record of it in the queue's started job registry. This script checks that registry for jobs that aren't already queued or being run, and requeues them.
I'm SSHed in here trying to debug the test failures (CI only), and I think there's some Docker layer caching weirdness going on: I rebuilt the container, reran the tests, and they passed...
Codecov Report
@@ Coverage Diff @@
## develop #4713 +/- ##
===========================================
- Coverage 82.38% 82.23% -0.15%
===========================================
Files 46 48 +2
Lines 3162 3350 +188
Branches 345 380 +35
===========================================
+ Hits 2605 2755 +150
- Misses 470 503 +33
- Partials 87 92 +5
I ran through the test plan here on a previous version of the diff, and all worked well except for a small issue due to the same underlying problem as #4656 (path to python in manage.py invocation). This is fixed in the latest diff, see discussion in #4656 for more.
Given that only a small change was added since I last tested, I'm going to approve and merge based on visual review of the diff. These PR testing steps are in the draft 1.0.0 test plan.
Status
Ready for review
Description of Changes
Replace srm with shred, which is faster, reducing the chance that deletion of a submission will be interrupted.
Update rq and redis requirements, to eliminate long-standing bugs.
Add manage.py tasks for detecting and correcting submissions that have been disconnected from their files on disk, and vice versa. Update manage.py to explicitly run with the production virtualenv. Also specify the virtualenv in WSGI scripts and the run-test script. In the dev/test Docker container, install requirements in a virtualenv at the same path as production.
Add a supervisor script for requeuing interrupted rq jobs. If the app server is rebooted while an rq job is running, that job has already been deleted from the queue and rq will not automatically resume it on reboot, but it does have a record of it in the queue's started job registry. This script checks that registry for jobs that aren't already queued or being run, and requeues them.
Fixes #4711.
Fixes #4712.
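The requeue decision described above can be sketched in a few lines. This is a minimal illustration, not the script from this PR: the function name `jobs_to_requeue` and the sample job IDs are hypothetical, and the three ID lists are assumed to have been read already from rq's started job registry, the queue, and the running workers.

```python
from typing import Iterable, List


def jobs_to_requeue(
    started_ids: Iterable[str],
    queued_ids: Iterable[str],
    running_ids: Iterable[str],
) -> List[str]:
    """Return IDs from the started job registry that are neither queued nor running.

    Hypothetical helper: the real script would obtain these IDs from rq/redis.
    """
    busy = set(queued_ids) | set(running_ids)
    seen = set()
    result = []
    for job_id in started_ids:
        # Skip jobs that are already accounted for, and avoid duplicates.
        if job_id in busy or job_id in seen:
            continue
        seen.add(job_id)
        result.append(job_id)
    return result


if __name__ == "__main__":
    # Made-up IDs: job-a was interrupted, job-b is queued, job-c is running.
    started = ["job-a", "job-b", "job-c"]
    print(jobs_to_requeue(started, queued_ids=["job-b"], running_ids=["job-c"]))
    # prints ['job-a']
```

Keeping the decision separate from the rq and redis calls makes it easy to check the interrupted-job case without a running worker.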
Testing
Testing everything requires production VMs or hardware. These instructions assume VMs; adjust IP addresses as necessary if you want to test on hardware.
make build-debs
scp build/xenial/*.deb [email protected]:
scp build/xenial/*ossec-server*.deb [email protected]:
dpkg -i --auto-deconfigure
(--auto-deconfigure is necessary to upgrade securedrop-app-code as cron-apt would in production.) On the app server, remove the two "ossec-server" packages and you can dpkg -i --auto-deconfigure *.deb.
Testing automatic requeuing of interrupted deletions
Establish two SSH connections to the app server. In one, become root with sudo su - and in the other become www-data with sudo -u www-data bash. In the www-data shell:
. /opt/venvs/securedrop-app-code/bin/activate
cd /var/www/securedrop
dd if=/dev/zero of=/var/lib/securedrop/store/bigfile bs=1M count=1000
python3
In the root shell:
rqrequeue log: less /var/log/securedrop_worker/rqrequeue.err -- at the end you should see lines like this:
/var/lib/securedrop/store/bigfile should be deleted, and the rqrequeue log should start saying:
Testing detection and correction of disconnected submissions
Visit the source interface and send two messages. First we'll test a disconnected database record.
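For orientation, the two kinds of disconnect exercised in this section can be modeled as set differences between the filenames recorded in the database and the files present in the store. The sketch below only illustrates that idea and is not the manage.py implementation. The function name `find_disconnected` and the second filename are hypothetical.

```python
from typing import Iterable, List, Tuple


def find_disconnected(
    db_filenames: Iterable[str],
    fs_filenames: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (records with no file on disk, files with no database record).

    Hypothetical helper; inputs are plain filename collections.
    """
    db = set(db_filenames)
    fs = set(fs_filenames)
    # Sorted so the output is deterministic.
    return sorted(db - fs), sorted(fs - db)


if __name__ == "__main__":
    # The first filename appears in the test plan; the second is made up.
    db_records = ["1-exhausted_overmantel-msg.gpg"]
    store_files = ["2-example-msg.gpg"]
    missing_files, orphaned_files = find_disconnected(db_records, store_files)
    print(missing_files)   # database records whose files are gone
    print(orphaned_files)  # files with no database record
```

The first list corresponds to what the disconnected-db commands report, and the second to what the disconnected-fs commands report.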
In your www-data shell:
cd /var/lib/securedrop/store
ls -laR
rm
cd /var/www/securedrop
./manage.py check-disconnected-db-submissions should report: There are submissions in the database with no corresponding files. Run "manage.py list-disconnected-db-submissions" for details.
./manage.py list-disconnected-db-submissions should list the ID of the deleted submission, e.g. 2.
./manage.py delete-disconnected-db-submissions should prompt you with: Enter 'y' to delete all submissions missing files: -- reply y and you should see Removing submission IDs [2]... (the ID may differ).
Now we'll delete the remaining database record and verify that its disconnected file is detected. Still in your www-data shell:
sqlite3 /var/lib/securedrop/db.sqlite
Delete the submission record for the remaining message (substitute your filename):
delete from submissions where filename = '1-exhausted_overmantel-msg.gpg';
./manage.py check-disconnected-fs-submissions should report: There are files in the submission area with no corresponding records in the database. Run "manage.py list-disconnected-fs-submissions" for details.
./manage.py list-disconnected-fs-submissions should show a list like:
./manage.py delete-disconnected-fs-submissions should prompt you to delete that file. Do so.
Testing OSSEC reporting of disconnects
Create a file under /var/lib/securedrop/store with touch /var/lib/securedrop/store/testfile. If you don't feel like waiting a day for the OSSEC report, you can edit /var/ossec/etc/ossec.conf, look for check-disconnect, and reduce the <frequency>, then service ossec restart.
Checklist
If you made changes to the server application code:
If you made changes to the system configuration:
If you made non-trivial code changes: