Websockets subscription hangs after a while #2053

Open
DefiCake opened this issue Nov 10, 2021 · 13 comments
Labels
status:ready This issue is ready to be worked on type:bug Something isn't working

Comments

DefiCake commented Nov 10, 2021

Hi, I am having issues with Hardhat. After a subscription has been open for an extended period, the subscribed client stops receiving notifications.

I double-checked that this only happens with Hardhat, using both ethers.js and web3.js clients. Raw geth, Alchemy, and Infura all seem to work fine with the same clients.

Repository with a very simple dockerized environment to aid with reproduction:

https://github.com/DefiCake/hardhat-ws-issue

iscke commented Nov 22, 2021

this happens because ws subscriptions are implemented as filters, which have a deadline of 5 minutes and get removed after that if you haven't called eth_getFilterChanges to poll them, which you obviously don't need to do if you're receiving changes through a websocket.

i think the way to fix this is just to ignore the filter deadline for subscriptions, since polling a websocket filter to keep it active doesn't really make sense.
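
A minimal sketch of the idea above, assuming a registry that stores polled filters and websocket subscriptions side by side. `FilterRegistry`, `FilterEntry`, and `FILTER_TTL_MS` are hypothetical names for illustration, not Hardhat's actual internals:

```typescript
// Hypothetical model of a filter registry; none of these names come from Hardhat.
const FILTER_TTL_MS = 5 * 60 * 1000; // the 5-minute deadline described above

interface FilterEntry {
  isSubscription: boolean; // true when created through a websocket subscription
  deadline: number; // epoch milliseconds after which a polled filter is stale
}

class FilterRegistry {
  private filters = new Map<number, FilterEntry>();
  private nextId = 1;

  add(isSubscription: boolean, now: number): number {
    const id = this.nextId++;
    this.filters.set(id, { isSubscription, deadline: now + FILTER_TTL_MS });
    return id;
  }

  // Models eth_getFilterChanges: polling renews the deadline.
  poll(id: number, now: number): boolean {
    const entry = this.filters.get(id);
    if (entry === undefined) return false;
    entry.deadline = now + FILTER_TTL_MS;
    return true;
  }

  // Periodic cleanup: expire stale polled filters, but never subscriptions.
  removeExpired(now: number): void {
    this.filters.forEach((entry, id) => {
      if (!entry.isSubscription && now > entry.deadline) {
        this.filters.delete(id);
      }
    });
  }

  // Subscriptions are removed explicitly, e.g. on unsubscribe or socket close.
  remove(id: number): boolean {
    return this.filters.delete(id);
  }

  has(id: number): boolean {
    return this.filters.has(id);
  }
}
```

In this sketch a subscription only disappears through `remove`, so the server would need to call it when the client unsubscribes or the websocket closes, otherwise entries would accumulate.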

iscke commented Nov 22, 2021

(probably relevant: #1780 (comment))

iscke commented Nov 22, 2021

(regarding those comments, eth does have a filter deadline, but as far as i can tell, it's only used for the json-rpc eth_newFilter methods (see the subscription and EventSystem structs in eth/filters/filter_system in the geth codebase))

@alcuadrado alcuadrado self-assigned this Nov 23, 2021

molfar commented Dec 5, 2021

I can confirm there is an issue with the event filter deadline on websocket connections.

@alcuadrado
Member

@iscke are you familiar with how geth handles this? do they also set a deadline? do they ignore it for websocket subscriptions? if so, don't they leak resources forever?

@iangregsondev

Hi,

do you know when this is going to be looked at?

Websockets cease to function after around 5 minutes, so a restart of the Hardhat node is required. I am running it out of process, i.e. "npx hardhat node".

I think my problem is related to this. Websockets work initially and events do arrive, but eventually no more events are delivered.

@shawnwang0715

Same issue here.
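My current workaround is to unsubscribe from all events and re-subscribe every minute, so the subscriptions never hit the "5-minute deadline".

function updateSubscription() {
    // Snapshot the active subscriptions from web3's internal request manager
    const subscriptions = Array.from(web3.eth._requestManager.subscriptions.values());
    for (const item of subscriptions) {
        // Drop each subscription before its filter reaches the deadline
        item.subscription.unsubscribe();
    }
    // Subscribe your events again
    ...
}

// Run the refresh every minute
setInterval(updateSubscription, 60 * 1000);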
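
For anyone who wants this workaround without reaching into web3's private fields, here is a rough sketch under some assumptions. `SubscriptionRefresher` and `Unsubscribable` are hypothetical names, and the caller supplies a `subscribeAll` callback that creates every subscription and returns handles exposing `unsubscribe()`:

```typescript
// Any handle that can be cancelled, like the subscriptions in the snippet above.
interface Unsubscribable {
  unsubscribe(): void;
}

// Hypothetical helper (not part of web3.js, ethers.js, or Hardhat).
class SubscriptionRefresher {
  private current: Unsubscribable[] = [];
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly subscribeAll: () => Unsubscribable[];
  private readonly intervalMs: number;

  // intervalMs defaults to one minute, well below the 5-minute deadline.
  constructor(subscribeAll: () => Unsubscribable[], intervalMs: number = 60 * 1000) {
    this.subscribeAll = subscribeAll;
    this.intervalMs = intervalMs;
  }

  // Cancel the current subscriptions and create fresh ones.
  refresh(): void {
    for (const sub of this.current) sub.unsubscribe();
    this.current = this.subscribeAll();
  }

  // Subscribe immediately, then refresh on every interval tick.
  start(): void {
    this.refresh();
    this.timer = setInterval(() => this.refresh(), this.intervalMs);
  }

  // Stop refreshing and cancel whatever is still subscribed.
  stop(): void {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
    for (const sub of this.current) sub.unsubscribe();
    this.current = [];
  }
}
```

Note that events emitted between the unsubscribe and the re-subscribe may be missed, so this is only suitable when a small gap is acceptable.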

@github-actions
Contributor

This issue was marked as stale because it didn't have any activity in the last 30 days. If you think it's still relevant, please leave a comment indicating so. Otherwise, it will be closed in 7 days.

@github-actions github-actions bot added Stale and removed Stale labels Jul 11, 2022
@iangregsondev

@iscke @fvictorio Any chance we can leave this active and remove the stale label?

This is still an issue. There are some workarounds, but I am not sure they fully support all use cases, and the example workarounds differ between web3.js and ethers.js.

Thanks.

@timolson

Websockets dying is 100% a critical issue preventing us from using Hardhat whatsoever. I'm pretty shocked that so many people seem to be polling for new blocks instead of subscribing to newHeads. Not sure why the priority was removed from this bug, but +1 from me.

sertony commented Feb 14, 2024

In our project we've fixed it by forcing the Hardhat node to generate new blocks periodically, i.e.

 import { HardhatUserConfig } from 'hardhat/config';

 const config: HardhatUserConfig = {
     solidity: '0.8.20',
     networks: {
         hardhat: {
             // See: https://hardhat.org/hardhat-network/docs/reference#mining-modes
             mining: {
                 auto: true,
                 // Produce a new block every 3 minutes to work around the following issues:
                 // https://github.com/NomicFoundation/hardhat/issues/2053
                 // https://github.com/ethers-io/ethers.js/issues/2338
                 // https://github.com/ethers-io/ethers.js/discussions/4116
                 interval: 3 * 60 * 1000, // must be less than 5 minutes for event subscriptions to keep working
             },
         },
     },
 };

 export default config;

This matters because the 5-minute deadline is only renewed when the filter changes are read, and ethers reads them only once a new block is generated. If no block arrives within 5 minutes, the deadline expires.
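
If changing the node config is not an option, the same effect might be achievable from the client by mining a block on a timer. This is only a sketch under assumptions: `mineBlock`, `startBlockHeartbeat`, and `RpcSend` are hypothetical names, and it assumes the node is Hardhat Network, which accepts the `evm_mine` JSON-RPC method:

```typescript
// Signature of a JSON-RPC sender, e.g. (m, p) => provider.send(m, p) with ethers.
type RpcSend = (method: string, params: unknown[]) => Promise<unknown>;

// Assumption: the node is Hardhat Network, which accepts the evm_mine method.
async function mineBlock(send: RpcSend): Promise<void> {
  await send("evm_mine", []);
}

// Hypothetical helper: mines a block on a fixed interval kept below the
// 5-minute deadline. Returns a function that stops the heartbeat.
function startBlockHeartbeat(
  send: RpcSend,
  intervalMs: number = 3 * 60 * 1000,
): () => void {
  const timer = setInterval(() => {
    mineBlock(send).catch((err) => console.error("evm_mine failed:", err));
  }, intervalMs);
  return () => clearInterval(timer);
}
```

With ethers this could be started as `startBlockHeartbeat((m, p) => provider.send(m, p))`. It relies on the same mechanism as the config above, so it may not help in setups where events still go missing with blocks being produced.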

@blastrock

If someone is searching for the new deadline, it has moved here:

Instant::now() + Duration::from_secs(5 * 60)

sim31 commented Oct 4, 2024

In our project we've fixed it by forcing the hardhat node generating new blocks, i.e.

I'm using auto mining mode and am still missing events after around 5 minutes. I have also set up pinging and reconnection to keep the websocket alive, so I know the websocket connection is OK, but the event subscription somehow drops.

markspanbroek added a commit to codex-storage/nim-codex that referenced this issue Nov 7, 2024
To work around this issue when subscriptions are
inactive for more than 5 minutes:
NomicFoundation/hardhat#2053
markspanbroek added a commit to codex-storage/nim-codex that referenced this issue Nov 8, 2024
To work around this issue when subscriptions are
inactive for more than 5 minutes:
NomicFoundation/hardhat#2053
markspanbroek added a commit to codex-storage/nim-codex that referenced this issue Nov 13, 2024
To work around this issue when subscriptions are
inactive for more than 5 minutes:
NomicFoundation/hardhat#2053

Use 100 millisecond polling; default polling interval
of 4 seconds is too close to the 5 second timeout for
`check eventually`.
github-merge-queue bot pushed a commit to codex-storage/nim-codex that referenced this issue Nov 25, 2024
* Use http subscriptions instead of websocket for tests

To work around this issue when subscriptions are
inactive for more than 5 minutes:
NomicFoundation/hardhat#2053

Use 100 millisecond polling; default polling interval
of 4 seconds is too close to the 5 second timeout for
`check eventually`.

* use .confirm(1) instead of confirm(0)

confirm(0) doesn't wait at all, confirm(1) waits
for the transaction to be mined

* speed up partial payout integration test

* update nim-ethers to version 0.10.0

includes fixes for http polling and .confirm()

* fix timing of marketplace tests

allow for a bit more time to withdraw funds

* use .confirm(1) in marketplace tests

to ensure that the transaction has been processed
before continuing with the test

* fix timing issue in validation unit test

* fix proof integration test

there were two logic errors in this test:
- a slot is freed anyway at the end of the contract
- when starting the request takes a long time, the
  first slot can already be freed because there were
  too many missing proofs

* fix intermittent error in contract tests

currentTime() doesn't always correctly reflect
the time of the next transaction

* reduce number of slots in integration test

otherwise the windows runner in the CI won't
be able to start the request before it expires

* fix timing in purchasing test

allow for a bit more time for a request to
be submitted

* fix timing of request submission in test

windows ci is so slow, it can take up to 40 seconds
just to submit a storage request to hardhat

* increase proof period to 90 seconds

* adjust timing of integration tests

reason: with the increased period length of 90 seconds, it
can take longer to wait for a stable challenge at the
beginning of a period.

* increase CI timeout to 2 hours

* Fix slow builds on windows

apparently it takes windows 2-3 seconds to
resolve "localhost" to 127.0.0.1 for every
json-rpc connection that we make 🤦
veaceslavdoina pushed a commit to codex-storage/nim-codex that referenced this issue Nov 26, 2024