Gracefully handle timeouts #8061

bobanm · 2023-01-23T09:36:38Z

Expected behavior

When a timeout occurs, the node should be more robust and not shut down.

Actual behavior

Sometimes, but not always, the application shuts down after a timeout. I first noticed it when invoking the random_setHashOnion endpoint.

lisknode endpoint:invoke random_setHashOnion '{"address":"lske5sqed53fdcs4m9et28f2k7u9fk6hno9bauday"}'
 ›   Error: Response not received in 3000ms

[where lisknode is my local alias for ~/lightcurve/lisk-sdk/examples/pos-mainchain/bin/run]

The node's log output is:

2023-01-23T09:29:27.876Z INFO dude engine 28555 [status=success method=system_getMetadata id=1] Handled RPC request
2023-01-23T09:29:27.880Z INFO dude engine 28555 [status=success method=system_getSchema id=2] Handled RPC request
2023-01-23T09:29:27.881Z INFO dude engine 28555 [status=success method=system_getNodeInfo id=3] Handled RPC request
2023-01-23T09:29:30.898Z INFO dude engine 28555 [status=error err=Response not received in 3000ms] Failed to handle IPC request
/home/dude/lightcurve/lisk-sdk/framework/dist-node/abi_handler/abi_client.js:21
        reject(new Error(message !== null && message !== void 0 ? message : `Timed out in ${ms}ms.`));
               ^

Error: Response not received in 3000ms
    at Timeout.<anonymous> (/home/dude/lightcurve/lisk-sdk/framework/dist-node/abi_handler/abi_client.js:21:16)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)
2023-01-23T09:29:34.433Z ERROR dude application 28500 [code=1 signal=] Engine exited unexpectedly
2023-01-23T09:29:34.435Z INFO dude application 28500 [errorCode=1 message=process.exit] Application shutdown started
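
The trace suggests that the rejection raised by the timer in abi_client.js was not caught by any caller, which would be enough to terminate the engine process. Below is a minimal sketch of the graceful behavior requested here. It assumes two hypothetical helpers, `withTimeout` and `handleRequest`, neither of which is taken from lisk-sdk; the idea is only that the timeout is raced against the pending request and the resulting error is turned into an error response instead of escaping.

```typescript
// Hypothetical helpers for illustration only; not the lisk-sdk implementation.

// Race a pending operation against a timer. The timer is always cleared,
// so no stray rejection can fire after the race has been decided.
async function withTimeout<T>(work: Promise<T>, ms: number, message?: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(message ?? `Timed out in ${ms}ms.`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  }
}

// Assumed response shape, loosely mirroring the status/err fields in the log.
interface RpcResult {
  status: 'success' | 'error';
  result?: unknown;
  err?: string;
}

// Catch the timeout at the request boundary and report it to the caller,
// rather than letting the rejection propagate and end the process.
async function handleRequest(work: Promise<unknown>, ms: number): Promise<RpcResult> {
  try {
    const result = await withTimeout(work, ms, `Response not received in ${ms}ms`);
    return { status: 'success', result };
  } catch (error) {
    return { status: 'error', err: (error as Error).message };
  }
}
```

With this shape, a request that exceeds the limit resolves to an error result carrying the timeout message, and the process keeps running.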

Steps to reproduce

Start a node: lisknode start --api-ipc
Invoke the random_setHashOnion endpoint with default params, as shown above.

It can also be reproduced by starting the application in a debugger, running any command, setting a breakpoint, and waiting longer than the timeout before stepping over the breakpoint.

Which version(s) does this affect? (Environment, OS, etc...)

Current development branch


bobanm · 2023-03-13T07:38:44Z

I locally reverted #8185, rebuilt the framework package, and tried to reproduce the issue. Every time I triggered a timeout using random_setHashOnion, the node handled it gracefully, as expected.

Considering that @shuse2 could not reproduce the issue back when I reported it, and I can't reproduce it now, either there was something wrong with my local build back then, or something else resolved the issue in the meantime.

bobanm added the type: bug label Jan 23, 2023

shuse2 mentioned this issue Jan 27, 2023

Prepare Lisk SDK v6.0.0 for beta #7210

Closed

Madhulearn added this to the Sprint 89 milestone Feb 13, 2023

Madhulearn modified the milestones: Sprint 89, Sprint 90 Feb 28, 2023

bobanm self-assigned this Mar 7, 2023

bobanm closed this as completed Mar 13, 2023
