This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Gracefully handle timeouts #8061

Closed
Tracked by #7210
bobanm opened this issue Jan 23, 2023 · 1 comment

bobanm commented Jan 23, 2023

Expected behavior

When a timeout occurs, the node should handle it gracefully and keep running instead of shutting down.

Actual behavior

Sometimes, but not always, the application shuts down after a timeout. I first noticed it when invoking the random_setHashOnion endpoint.

lisknode endpoint:invoke random_setHashOnion '{"address":"lske5sqed53fdcs4m9et28f2k7u9fk6hno9bauday"}'
 ›   Error: Response not received in 3000ms

[where lisknode is my local alias for ~/lightcurve/lisk-sdk/examples/pos-mainchain/bin/run]

The response I get is:

2023-01-23T09:29:27.876Z INFO dude engine 28555 [status=success method=system_getMetadata id=1] Handled RPC request
2023-01-23T09:29:27.880Z INFO dude engine 28555 [status=success method=system_getSchema id=2] Handled RPC request
2023-01-23T09:29:27.881Z INFO dude engine 28555 [status=success method=system_getNodeInfo id=3] Handled RPC request
2023-01-23T09:29:30.898Z INFO dude engine 28555 [status=error err=Response not received in 3000ms] Failed to handle IPC request
/home/dude/lightcurve/lisk-sdk/framework/dist-node/abi_handler/abi_client.js:21
        reject(new Error(message !== null && message !== void 0 ? message : `Timed out in ${ms}ms.`));
               ^

Error: Response not received in 3000ms
    at Timeout.<anonymous> (/home/dude/lightcurve/lisk-sdk/framework/dist-node/abi_handler/abi_client.js:21:16)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)
2023-01-23T09:29:34.433Z ERROR dude application 28500 [code=1 signal=] Engine exited unexpectedly
2023-01-23T09:29:34.435Z INFO dude application 28500 [errorCode=1 message=process.exit] Application shutdown started
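
The stack trace above shows the error being raised from a timer callback that rejects a promise. As a rough illustration of the expected behavior, here is a minimal sketch of a timeout wrapper whose rejection is caught and turned into an error result, so the process keeps running. All names here (`withTimeout`, `handleRequest`, `RpcResult`, `slowCall`) are hypothetical and this is not the lisk-sdk implementation; only the fallback message format is taken from the trace.

```typescript
// Hypothetical helper names for illustration; not the lisk-sdk code.

// Reject if `work` does not settle within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number, message?: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(message ?? `Timed out in ${ms}ms.`)), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so it cannot fire after the work has finished.
    clearTimeout(timer);
  }
}

interface RpcResult {
  status: 'success' | 'error';
  data?: unknown;
  err?: string;
}

// Catch the timeout and report it as an error result instead of letting the
// rejection go unhandled, which by default terminates recent Node.js versions.
async function handleRequest(work: Promise<unknown>, ms: number): Promise<RpcResult> {
  try {
    const data = await withTimeout(work, ms, `Response not received in ${ms}ms`);
    return { status: 'success', data };
  } catch (error) {
    return { status: 'error', err: (error as Error).message };
  }
}

// Simulate a backend that answers after `delayMs` milliseconds.
const slowCall = (delayMs: number): Promise<string> =>
  new Promise(resolve => setTimeout(() => resolve('done'), delayMs));
```

In this sketch, `handleRequest(slowCall(200), 20)` resolves to an error result carrying the timeout message rather than throwing, while `handleRequest(slowCall(5), 200)` resolves to a success result.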

Steps to reproduce

  1. Start a node: lisknode start --api-ipc
  2. Invoke random_setHashOnion endpoint with default params, as shown above.

It can also be reproduced by starting the application in a debugger, running any command, setting a breakpoint, and waiting longer than the timeout before stepping over the breakpoint.

Which version(s) does this affect? (Environment, OS, etc...)

Current development branch

@Madhulearn Madhulearn added this to the Sprint 89 milestone Feb 13, 2023
@Madhulearn Madhulearn modified the milestones: Sprint 89, Sprint 90 Feb 28, 2023
@bobanm bobanm self-assigned this Mar 7, 2023

bobanm commented Mar 13, 2023

I locally reverted #8185, rebuilt the framework package, and tried to reproduce the issue. Every time I triggered a timeout using random_setHashOnion, the node handled it gracefully, as expected.

Considering that @shuse2 could not reproduce the issue when I first reported it, and I can't reproduce it now, either something was wrong with my local build back then, or something else has resolved the issue in the meantime.

@bobanm bobanm closed this as completed Mar 13, 2023