Handle stakepoold/dcrwallet/dcrd disconnects #475

Closed
jholdstock opened this issue Aug 13, 2019 · 7 comments

@jholdstock
Member

Reported by @JoeGruffins:

But if dcrwallet is not running, it seems any command (CreateMultisig at least) will hang without error. Dcrstakepool will not start up if a wallet is not online, but if one goes offline after dcrstakepool has started, stakepoold starts trying to reconnect, and it seems that commands (maybe all of them) wait indefinitely for the reconnect and then fire. The user will see a 504 gateway timeout error, and the operator will get no notice. Maybe this is a problem for another issue. I do not expect this PR is the cause, but if you can also check for the state where stakepoold is trying to reconnect to a wallet, I think that would be good...

@jholdstock
Member Author

#476 has been raised to facilitate this. A proper test harness removes the effort required for repeatedly creating/tearing down full VSP configurations.

@jholdstock
Member Author

Related #166 and #70

@JoeGruffins
Member

As for the dcrd and dcrwallet disconnects, I think our problem lies here
https://github.com/decred/dcrd/blob/8497b9843bcb4191ec1f4a235da178660fb9ad70/rpcclient/infrastructure.go#L655
We are waiting for a reconnect and then all commands are sent.
Some possible solutions:

  • Add a timeout to gRPC commands. If they take too long, throw a timeout error and discard whatever results may come back later.
  • Ping dcrwallet/dcrd before sending commands. This reinstates the dcrwallet link we just broke, though.
  • Turn off the auto-reconnect feature and write our own auto-reconnect that lets us query the status of the connection.

@jholdstock I think the third option is best, or can you think of any more?
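
For reference, the first option could look roughly like the sketch below. It is only an illustration, not code from dcrstakepool or rpcclient: `callWithTimeout` and `ErrCommandTimeout` are made-up names, and the wrapped command is a plain function standing in for any call that blocks while the link is down. A late result is dropped because nobody reads the buffered channel after the timeout.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrCommandTimeout is a hypothetical sentinel error for this sketch.
var ErrCommandTimeout = errors.New("command timed out waiting for backend")

// callWithTimeout runs cmd in its own goroutine and waits at most timeout
// for it to finish. If cmd finishes later, its result is discarded.
func callWithTimeout(timeout time.Duration, cmd func() (string, error)) (string, error) {
	type result struct {
		val string
		err error
	}
	// Buffered so a late cmd can still send and exit without a reader.
	done := make(chan result, 1)
	go func() {
		val, err := cmd()
		done <- result{val, err}
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case r := <-done:
		return r.val, r.err
	case <-timer.C:
		return "", ErrCommandTimeout
	}
}

func main() {
	// block is never closed: it stands in for a wallet link that is down.
	block := make(chan struct{})
	_, err := callWithTimeout(100*time.Millisecond, func() (string, error) {
		<-block
		return "", nil
	})
	fmt.Println(err)
}
```

One caveat with this approach: if the wrapped call never returns, its goroutine stays blocked, so the timeout hides the hang from the user but does not clean it up. Calls that accept a `context.Context` (as gRPC client calls do) can be cancelled properly with `context.WithTimeout` instead.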

@jrick
Member

jrick commented Aug 19, 2019

The third option is what I had done before in dcrwallet (while it was still using rpcclient -- as of master it no longer does) and I recommend that approach.
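
A minimal sketch of what the third option might look like, assuming auto-reconnect is turned off and the application owns the connection state itself. Every name here (`backend`, `Reconnect`, `Do`, `MarkDisconnected`, `ErrNotConnected`) is hypothetical and not taken from dcrwallet or dcrstakepool. The point is that commands fail fast while the link is down, and the state can be reported, for example on a status page.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotConnected is a hypothetical error returned instead of blocking.
var ErrNotConnected = errors.New("backend not connected")

// backend tracks the connection state of one dcrd/dcrwallet/stakepoold link.
type backend struct {
	mu        sync.RWMutex
	connected bool
	dial      func() error // hypothetical: opens a fresh connection
}

func newBackend(dial func() error) *backend {
	return &backend{dial: dial}
}

// Connected reports the last known state of the link.
func (b *backend) Connected() bool {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.connected
}

// MarkDisconnected is meant to be called from a disconnect notification.
func (b *backend) MarkDisconnected() {
	b.mu.Lock()
	b.connected = false
	b.mu.Unlock()
}

// Reconnect makes a single dial attempt and records the outcome.
// A real service would call it periodically from a background goroutine.
func (b *backend) Reconnect() error {
	err := b.dial()
	b.mu.Lock()
	b.connected = err == nil
	b.mu.Unlock()
	return err
}

// Do runs cmd only when the link is up; otherwise it returns immediately.
func (b *backend) Do(cmd func() error) error {
	if !b.Connected() {
		return ErrNotConnected
	}
	return cmd()
}

func main() {
	up := false // simulated availability of the remote process
	b := newBackend(func() error {
		if !up {
			return errors.New("dial failed")
		}
		return nil
	})

	fmt.Println(b.Do(func() error { return nil })) // link down: fails fast

	up = true
	if err := b.Reconnect(); err != nil {
		fmt.Println("reconnect failed:", err)
	}
	fmt.Println(b.Do(func() error { return nil })) // link up: command runs
}
```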

@jholdstock
Member Author

jholdstock commented Aug 20, 2019

I have done a whole bunch of manual testing using the framework below. I think we want to resolve all of these before cutting a release.

Disconnect dcrd

| Test | Pass | Notes |
| --- | --- | --- |
| Check disconnection in logs | ✔️ | |
| Check /status for DaemonConnected == false | ✔️ | |
| All pages should still load | ✔️ | |
| Can still log-in and register | ✔️ | |
| Can still update voting preferences, change password | ✔️ | |
| Error shown when trying to connect decrediton. | ✔️ | |

Restart dcrd

| Test | Pass | Notes |
| --- | --- | --- |
| Check for reconnection in logs | ✔️ | |
| Check /status for DaemonConnected == true | ✔️ | |
| Connect decrediton works | ✔️ | |

Kill one instance of dcrwallet

| Test | Pass | Notes |
| --- | --- | --- |
| Check disconnection in logs | ✔️ | |
| All pages should load | | /tickets does not load. homepage and /stats fail to load after the getstakeinfo cache has been cleared |
| /status shows the disconnected wallet | | /status does not load |
| Change voting preferences without error | | page does not load |
| Changing password works | ✔️ | |
| Login/register works | ✔️ | |
| Error shown when trying to connect decrediton. | | infinite loading bar |

Reconnect dcrwallet:

| Test | Pass | Notes |
| --- | --- | --- |
| Check for reconnection in logs | ✔️ | Reconnection takes up to 1 minute |
| Connect decrediton works | | Error "system error - unable to process wallet commands" |
| All pages load | ✔️ | |
| /status shows wallet status | ✔️ | |
| Change voting preferences without error | ✔️ | |

Disconnect one instance of stakepoold

| Test | Pass | Notes |
| --- | --- | --- |
| Check disconnection in logs | | Nothing shown |
| All pages should load | ✔️ | |
| /status shows the disconnected back-end | ✔️ | |
| Change voting preferences without error | ✔️ | |
| Changing password works | ✔️ | |
| Login/register works | ✔️ | |
| Error shown when trying to connect decrediton. | ✔️ | |

Reconnect stakepoold

| Test | Pass | Notes |
| --- | --- | --- |
| Check for reconnection in logs | | Nothing shown |
| All pages should load | ✔️ | |
| /status shows the reconnected back-end | ✔️ | Reconnection takes up to 1 minute |
| Change voting preferences without error | ✔️ | |
| Connect decrediton works | | Error "system error - unable to process wallet command" |

@JoeGruffins
Member

very awesome tables

@jholdstock
Member Author

Resolved by #494
