-
Notifications
You must be signed in to change notification settings - Fork 664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
state_sync tests fail with RPC timeouts #3129
Labels
A-chain
Area: Chain, client & related
Comments
SkidanovAlex
added a commit
that referenced
this issue
Aug 14, 2020
…t the head update, and test fixes Making doomslug wait for 600ms from the last time the node sent an endorsement, not from the head update. Under load this will remove unnecessary idleness. This change breaks catchup tests that do not use doomslug, because they rely on blocks being produced without skips and forks, and without doomslug and the wait not starting from the head update the timing is less predictable. I address it by making the tests actually use doomslug (and upping the block prod times where necessary). Since they rely on having no skips and forks, there's no reason not to use doomslug. For the only test that had two separate versions with and without doomslug, I removed the version without. Separately and unrelatedly to the above, removed backtrace from the formatted client errors. On my (very beefy) machine fetching the backtrace takes 800ms, and on nayduck runners it was more than 2s. Since we format errors in many places (including responces to RPC, though that will change), such delays are not acceptable. This fixes state_sync tests. Fixes: #3139, #3129 Test plan: ---------- For #3139: http://nayduck.eastus.cloudapp.azure.com:3000/#/run/88 For #3129: TODO Catchup tests are split accross many runs, but I ensured they are not flaky after this change.
SkidanovAlex
added a commit
that referenced
this issue
Aug 14, 2020
…t the head update, and test fixes Making doomslug wait for 600ms from the last time the node sent an endorsement, not from the head update. Under load this will remove unnecessary idleness. This change breaks catchup tests that do not use doomslug, because they rely on blocks being produced without skips and forks, and without doomslug and the wait not starting from the head update the timing is less predictable. I address it by making the tests actually use doomslug (and upping the block prod times where necessary). Since they rely on having no skips and forks, there's no reason not to use doomslug. For the only test that had two separate versions with and without doomslug, I removed the version without. Separately and unrelatedly to the above, removed backtrace from the formatted client errors. On my (very beefy) machine fetching the backtrace takes 800ms, and on nayduck runners it was more than 2s. Since we format errors in many places (including responces to RPC, though that will change), such delays are not acceptable. This fixes state_sync tests. Fixes: #3139, #3129 Test plan: ---------- For #3139: http://nayduck.eastus.cloudapp.azure.com:3000/#/run/88 For #3129: (still running, will update the description) Catchup tests are split accross many runs, but I ensured they are not flaky after this change.
bowenwang1996
pushed a commit
that referenced
this issue
Aug 14, 2020
…t the head update, and test fixes (#3164) Making doomslug wait for 600ms from the last time the node sent an endorsement, not from the head update. Under load this will remove unnecessary idleness. This change breaks catchup tests that do not use doomslug, because they rely on blocks being produced without skips and forks, and without doomslug and the wait not starting from the head update the timing is less predictable. I address it by making the tests actually use doomslug (and upping the block prod times where necessary). Since they rely on having no skips and forks, there's no reason not to use doomslug. For the only test that had two separate versions with and without doomslug, I removed the version without. Separately and unrelatedly to the above, removed backtrace from the formatted client errors. On my (very beefy) machine fetching the backtrace takes 800ms, and on nayduck runners it was more than 2s. Since we format errors in many places (including responces to RPC, though that will change), such delays are not acceptable. This fixes state_sync tests. Fixes: #3139, #3129 Test plan: ---------- For #3139: http://nayduck.eastus.cloudapp.azure.com:3000/#/run/88 For #3129: (still running, will update the description) Catchup tests are split accross many runs, but I ensured they are not flaky after this change.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Rather consistently, e.g:
state_sync.py: http://52.149.162.182:3000/#/test/7604
state_sync_routed.py: http://52.149.162.182:3000/#/test/7611
state_sync_late.py: http://52.149.162.182:3000/#/test/7612
The text was updated successfully, but these errors were encountered: