-
Notifications
You must be signed in to change notification settings - Fork 687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add key cache #5184
Add key cache #5184
Conversation
It actually returns a fingerprint, not a key.
Also rename the fingerprint cache and increase its maximum size.
This pull request introduces 1 alert when merging b0b411c into 8576e0b - view on LGTM.com new alerts:
|
As sources are being added, include the count in the output, to give a better idea of progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a visual review of the change, looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good! I haven't ran through the test plan yet, but I've dropped a couple minor comments inline. Will re-review first up on my next work day so we can merge for 1.3.0
One other question: regarding the 100 MB for 1000 sources number - scaling linearly from that (well, not necessarily the best thing to do since only a subset of that 100 MB is due to the caches) but it has got me wondering about the memory usage - can you share an estimate of the memory usage when the caches are fully populated?
Improve the order of operations when cleaning deleted keys out of the CryptoUtil caches. Fix index of deleted journalist in create-dev-data.py.
By default, mod_wsgi waits for the first request to load the application. By adding the "process-group" and "application-group" options to the "WSGIScriptAlias" directive, we can tell it to load on process start instead, so the caches get populated then instead of forcing the first request to wait. See: https://modwsgi.readthedocs.io/en/latest/configuration-directives/WSGIScriptAlias.html
install_files/ansible-base/roles/app/templates/sites-available/journalist.conf
Show resolved
Hide resolved
Since I am still seeing issues with timeouts, I think it's important to test this PR against a staging server with 200, 300, and 1000 sources if there's time (which would close freedomofpress/securedrop-client#1007 if it succeeds), before it's merged. Also, remember to deleted the local database between runs so you can see whether or not the source list is populated. |
I've updated the PR description to add detail about the switch to Redis. @redshiftzero this is ready for another look. @creviera In testing with 1000 sources, I see the client source list populated, with messages being downloaded, in under two minutes. The initial sync completes in around 20 seconds. |
The timing in my last comment was observed with the client running in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am approving this. My only suggestion would be an extra script to run after every reboot, to repopulate the redis cache from the existing fingerprints/keys. That can be useful for the first time syncing by the journalists. The same script should also run after an update via the package.
@kushaldas The cache is populated when Apache starts the mod_wsgi processes (from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all my comments were addressed, I just tested the upgrade in staging and otherwise ran through the test plan and all lgtm. thanks!
Status
Ready for review
Description of Changes
Moves the CryptoUtil caching to Redis. This reduces total memory usage compared to keeping the caches in the WSGI processes, and eliminates the possibility that the cache would have to be repopulated if WSGI processes restart.
Adds caching of public keys, not just fingerprints.
Renames
CryptoUtil.getkey
andCryptoUtil.export_pubkey
to better match what they do.Adds some detail to the output of
create-dev-data.py
to better indicate progress.Uses 1024-bit keys if the environment variable
SECUREDROP_ENV
istest
ordev
, which speeds up the local dev server startup with large numbers of sources.Fixes #5183.
Testing
Run the dev server with the environment variable
NUM_SOURCES
set to a large number, like 500 or 1000. Use the test script from the #5100 test plan. Running it against this branch should show faster response times than master, once the caches are populated.Checking Redis memory usage
Stop Apache with
service apache2 stop
.In
redis-cli
:keys *
CryptoUtil
cache keys if they exist:del sd/crypto-util/fingerprints
del sd/crypto-util/keys
info
and look forused_memory_human
. It should be undera megabyte, e.g.
used_memory_human:517.83K
Start Apache with
service apache2 start
.Wait for the caches to populate. It took a few minutes on my system. You can watch for the Apache and GPG processes to drop out of
top
, then check the number of keys in the Redis caches inredis-cli
withhlen sd/crypto-util/fingerprints
andhlen sd/crypto-util/keys
.Execute
info
inredis-cli
and look forused_memory_human
. With dev/test 1024-bit keys, it should be under two megabytes, e.g.used_memory_human:1.66M
. With production 4096-bit keys I seeused_memory_human:2.64M
.Deployment
This requires additional working memory, but as suggested in the test plan, caching 1000 1024-bit keys and fingerprints in Redis takes under 2 megabytes. Production 4096-bit keys will result in a proportional increase.
Checklist
If you made changes to the server application code:
make lint
) and tests (make test
) pass in the development containerIf you made non-trivial code changes: