Is 40 seconds still a good default timeout for a sync? #1025

Closed
sssoleileraaa opened this issue Mar 30, 2020 · 5 comments

Comments

@sssoleileraaa
Contributor

sssoleileraaa commented Mar 30, 2020

Description

Are these measurements still correct? If not, what are the new measurements for how long each API request takes?

Spawned from #1007 (comment)

Here's some current data on how long our three calls take. We currently use the default timeout of 40 seconds for each of these calls, and this timeout will be used once freedomofpress/securedrop-proxy#145 is fixed.

sources=300, messages=600, replies=600, run_count=3

Unit                            Mean time (seconds)
get_sources endpoint            12
get_all_submissions endpoint    4
get_all_replies endpoint        4
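
Mean times like the ones above could be gathered with a small harness such as the following. This is only a sketch: `mean_call_time` and `exceeds_timeout` are hypothetical helpers, not part of the client or SDK, and the callable passed in stands for whichever API call is being measured.

```python
import statistics
import time
from typing import Callable, List


def mean_call_time(call: Callable[[], object], run_count: int = 3) -> float:
    """Return the mean wall-clock duration of `call`, in seconds, over `run_count` runs."""
    if run_count < 1:
        raise ValueError("run_count must be at least 1")
    durations: List[float] = []
    for _ in range(run_count):
        start = time.perf_counter()
        call()  # hypothetical: e.g. a wrapper around one API request
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def exceeds_timeout(mean_seconds: float, timeout_seconds: float = 40.0) -> bool:
    """Return True if a mean duration would not fit within the sync timeout."""
    return mean_seconds > timeout_seconds
```

With the numbers reported above, a 12-second mean for the sources call fits comfortably within a 40-second timeout at 300 sources.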

sources=1000...

[need test results]

@sssoleileraaa
Contributor Author

sources=1000, messages=2000, replies=2000, run_count=3

client version 0.1.6-dev-20200407-060103 (latest as of now)

On each run, after 120 seconds I saw "The SecureDrop server cannot be reached. Trying to reconnect..." along with a RequestTimeoutError. This tells us that we could not get 1000 sources within the 40-second timeout period, after which we kill the qrexec subprocess call. I also left the client running for 10 minutes on each test to see whether subsequent sync requests would succeed once the cache was populated, but I continued to see RequestTimeoutError.

IMPORTANT: The error appears after 120 seconds not because of the 120-second default proxy timeout, but because we try our 40-second sync 3 times before showing an error.
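
The arithmetic behind that 120-second figure can be sketched as follows. This is an illustration under assumptions, not the client's actual retry code: `sync_with_retries` and `seconds_until_error` are hypothetical names, and `RequestTimeoutError` is redefined locally only so the example is self-contained.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RequestTimeoutError(Exception):
    """Local stand-in for the timeout error, so this sketch runs on its own."""


def seconds_until_error(timeout_seconds: int = 40, attempts: int = 3) -> int:
    """Worst-case wait before an error is shown if every attempt times out."""
    return timeout_seconds * attempts


def sync_with_retries(sync_once: Callable[[], T], attempts: int = 3) -> T:
    """Call `sync_once` up to `attempts` times, re-raising the last timeout."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[RequestTimeoutError] = None
    for _ in range(attempts):
        try:
            return sync_once()
        except RequestTimeoutError as error:
            last_error = error  # remember the failure and try again
    assert last_error is not None
    raise last_error
```

With the defaults, `seconds_until_error()` is 40 × 3 = 120, which happens to equal the proxy's 120-second default even though the two are unrelated.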

@sssoleileraaa
Contributor Author

sssoleileraaa commented Apr 8, 2020

Here are the precise steps I took, information about my system, and results:

  • [sd-app] SecureDrop Client v0.1.6-dev-20200408-060110
  • Staging server running on Debian 10, built off of develop branch, commit 8576e0bb51468ca70c92f1faf6e58f000851a6c8

Test:

  1. Boot up your staging server with 200 sources
  2. In sd-app run rm ~/.securedrop_client/svs.sqlite
  3. Start the client installed on Qubes and log in
  4. Wait for first sync to complete

Results:
In 1 of 3 runs, the first sync timed out when following the Test steps above. In the other two runs, it took about 1 minute 10 seconds for the first sync to succeed and the source list to populate.

I've also tested with 300 sources and see timeouts much more frequently. I believe that, for some reason, populating the cache takes much longer on my staging server, and I'm not sure this is a problem we have to worry about on production systems. To see more of what's happening on the server, I'll need to add logging, or rely on logging patches from @redshiftzero, @rmol, or anyone else who has them.

@kushaldas
Contributor

kushaldas commented Apr 21, 2020

In my app-staging based on commit freedomofpress/securedrop@bc06d98:

  • Direct API access using ssh tunnel (to skip Tor)
  • 1461 sources, 50 submissions, and 20 replies; key size 1024
  • DEFAULT_TIMEOUT in the SDK is 60 seconds (modified by me for this test)
  • get_sources() returns a 504 from Apache

Redis has the keys and the fingerprints.

127.0.0.1:6379> hlen sd/crypto-util/keys
(integer) 1461
127.0.0.1:6379> hlen sd/crypto-util/fingerprints
(integer) 1461

@kushaldas
Contributor

For each source, the code has to go and look up all of its submissions. In my local experiment, that increases the time severalfold. I will get back with numbers later in the week.
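
To illustrate why a per-source lookup can scale badly, here is a minimal sketch that contrasts scanning all submissions once per source with grouping them in a single pass. It is not the server's code: the `(source_id, filename)` tuple shape and both function names are assumptions made for the example, and the real cost on the server may come from per-source database queries rather than list scans.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical shape for illustration: (source_id, filename)
Submission = Tuple[int, str]


def submissions_per_source_slow(
    source_ids: List[int], submissions: List[Submission]
) -> Dict[int, List[str]]:
    """Scan every submission once per source: O(sources * submissions)."""
    return {
        sid: [name for owner, name in submissions if owner == sid]
        for sid in source_ids
    }


def submissions_per_source_grouped(
    source_ids: List[int], submissions: List[Submission]
) -> Dict[int, List[str]]:
    """Group submissions in one pass, then look up: O(sources + submissions)."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for owner, name in submissions:
        grouped[owner].append(name)
    return {sid: grouped.get(sid, []) for sid in source_ids}
```

Both functions return the same mapping; only the amount of work per source differs.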

@eloquence
Member

Per https://github.com/freedomofpress/securedrop-workstation/wiki/Timeouts it is 60 seconds now, and we have a few other issues tracking improvements in this area, so closing this old ticket as stale.
