Native WebSocket Connection is unreliable #949
Some very preliminary results from an investigation
Steps taken
The run id for the Relay test is:
The two messages with exactly the same timestamp are likely to be the same message, coming from the same js-waku node but received through a different path. Just in case, I added a delay: waku-org/waku-tests@aa0109e
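If the duplicates really are the same message arriving over two paths, a test could collapse them before counting. The following is only a minimal sketch: the `ReceivedMessage` shape and the `dedupeMessages` helper are hypothetical names for illustration and are not part of js-waku or waku-tests.

```typescript
// Hypothetical message shape for illustration; not js-waku's actual type.
interface ReceivedMessage {
  timestampMs: number;
  payload: string;
}

// Collapse messages that share both timestamp and payload,
// keeping the first occurrence and preserving arrival order.
function dedupeMessages(messages: ReceivedMessage[]): ReceivedMessage[] {
  const seen = new Set<string>();
  const unique: ReceivedMessage[] = [];
  for (const msg of messages) {
    // The timestamp is numeric, so the first ":" always separates it from the payload.
    const key = `${msg.timestampMs}:${msg.payload}`;
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(msg);
    }
  }
  return unique;
}
```

Matching on timestamp alone would also merge distinct messages sent in the same millisecond, which is why the sketch includes the payload in the key.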
I reverted this change and it caused some timeouts. I haven't looked into providing a new change. Let me know if it's needed.
Started looking into this again. Summary so far:
@D4nte could we perhaps compile and deploy the |
On local tests, it seems like the websocket connections time out 10 mins after they were established, despite the connection being used and keep-alive (5 min pings) enabled. Unclear if this is affecting the tests, which should run in under 10 mins. Continuing investigation.
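One way to frame this observation: a timeout that is reset by activity should never fire while 5 min pings are flowing, whereas a timer measured from establishment fires regardless of traffic. The sketch below classifies a recorded close event on that basis. `classifyClose` and its parameters are hypothetical and only illustrate the reasoning; they say nothing about how nim-libp2p implements its timers.

```typescript
type CloseKind = "idle-timeout" | "fixed-lifetime" | "other";

// Classify a connection close from locally recorded timestamps (all in ms).
// toleranceMs absorbs small measurement jitter.
function classifyClose(
  establishedMs: number,
  activityMs: number[],
  closedMs: number,
  timeoutMs: number,
  toleranceMs: number = 1000
): CloseKind {
  // With no recorded activity, the idle clock starts at establishment.
  const lastActivityMs = Math.max(establishedMs, ...activityMs);
  const sinceActivity = closedMs - lastActivityMs;
  const sinceEstablished = closedMs - establishedMs;

  if (Math.abs(sinceActivity - timeoutMs) <= toleranceMs) {
    return "idle-timeout"; // timer was reset by the last ping or message
  }
  if (Math.abs(sinceEstablished - timeoutMs) <= toleranceMs) {
    return "fixed-lifetime"; // timer ran from establishment, ignoring activity
  }
  return "other";
}
```

For example, a connection established at `0`, pinged at 5 min, and closed at 10 min with a 10 min timeout would be classified as `"fixed-lifetime"`, which matches the behaviour described above.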
Seems like This would only affect connections after 10 min, so it is not clear whether this is what causes the issues here.
The fix for the timeout is here: vacp2p/nim-libp2p#721
@jm-clius you don't need a certificate for local tests; you can just use plain websocket. For the waku connect fleet: feel free to do as you wish. If you need me to do something, please let me know.
Indeed, I agree that this only uses "receive via relay". I am happy to icebox this and proceed to enable dns discovery and increase js-waku's default number of connections before reviewing this issue.
Logs reveal that this is indeed related to timing in the test: the fleet node starts receiving messages related to the test while it's still in the process of finalising the connection to the |
@D4nte I think this issue can be closed, as the failing tests were unrelated to websocket performance? Since the last websocket fixes, I'm not aware of any remaining websocket issues, and the reliability tests pass.
Fixed in |
I have done some test automation to check the behaviour of the fleets using js-waku in NodeJS.
It does some basic checks:
Here are the findings after running the tests several times from my laptop:
Here are some sample outputs; hopefully they are self-explanatory:
You can trigger the flow manually directly from GitHub: https://github.com/status-im/waku-tests/actions/workflows/run.yml.