bug: possible memory leak #1662
Comments
I just deployed latest master (statusteam/nim-waku:9e1432c9) in my minifleet with just relay enabled, in an attempt to pin down which protocol may be causing issues.
Can't reproduce the memory leak with the above setup: 20 peers with just relay and discovery enabled, 100 msg/second and 2 kB message size (actually more than the traffic we have in the fleet). Container memory usage is <50 MB over almost 1 day, so I would say the memory leak is not related to the relay protocol.
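For reference, a minimal sketch of how a container's memory usage can be sampled over a long run for this kind of check (the use of Docker and the container name nwaku are assumptions, not details of the setup above):

```python
# Minimal sketch: periodically sample a container's memory usage via
# `docker stats` to distinguish flat usage from unbounded growth.
# The container name and sampling interval are assumptions.
import subprocess
import time

CONTAINER = "nwaku"   # assumed container name
INTERVAL_S = 60       # sample once per minute

def mem_usage(container: str) -> str:
    """Return the MemUsage column from `docker stats --no-stream`."""
    out = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", container],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

if __name__ == "__main__":
    while True:
        print(time.strftime("%Y-%m-%dT%H:%M:%S"), mem_usage(CONTAINER))
        time.sleep(INTERVAL_S)
```

One sample per minute over a day-long run is enough to tell a flat <50 MB profile like the one described above apart from unbounded growth.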
We've run overnight on wakuv2.test. I'm also running several other versions on a sandbox machine, but so far the leak is only observed on the fleets. The next step is to try to pin down the specific commit that caused the leak. Since we suspect that one of the submodule updates may be to blame, I'll deploy the commit just after the two submodule updates next.
@jm-clius I checked the current memory status. It seems the leak was already present in the v0.16.0 release: there is no noticeable difference between that release and later versions, all of which are currently using more than 1.2 GB of memory. Restarting ams3 with v0.15.0.
Thanks, @Ivansete-status! I'm starting HK with 43b3f1c, more or less halfway between the two releases, so we can narrow down the offending change.
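The halving strategy above is essentially a manual bisection; a minimal sketch of the equivalent git bisect bookkeeping follows (it assumes the two releases are available as tags v0.15.0 and v0.16.0, and each verdict still requires deploying the candidate commit and watching memory for hours):

```python
# Minimal sketch: drive a manual `git bisect` over the suspect commit range.
# Assumes the last known-good and first known-bad versions are available as
# the tags v0.15.0 and v0.16.0; each candidate must still be deployed and
# observed before it can be marked good or bad.
import subprocess

def git(*args: str) -> str:
    return subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    ).stdout

git("bisect", "start")
git("bisect", "bad", "v0.16.0")
print(git("bisect", "good", "v0.15.0"))  # git suggests the next commit to test

# After deploying the suggested commit and observing memory for long enough:
#   git("bisect", "good")   # memory stayed flat
#   git("bisect", "bad")    # memory kept growing
# Repeat until git reports the first bad commit, then git("bisect", "reset").
```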
In 67db35e the HK node crashed at 1.06 GB |
After analysing the heaptrack reports, we identified the following points as possible memory-leak candidates:
Thanks for this! What is the meaning of the two values next to each other? Are those allocations and deallocations?
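As an aside, a minimal sketch of pulling the headline numbers out of a heaptrack text report (the summary labels below, such as "total memory leaked", are an assumption about the heaptrack_print output format, which can vary between versions):

```python
# Minimal sketch: grep the summary lines out of a `heaptrack_print` text report.
# The labels below are an assumption about the report format and may need
# adjusting for the heaptrack version in use.
import sys

SUMMARY_LABELS = (
    "calls to allocation functions",
    "peak heap memory consumption",
    "total memory leaked",
)

def summarize(report_path: str) -> None:
    with open(report_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if any(label in line for label in SUMMARY_LABELS):
                print(line.rstrip())

if __name__ == "__main__":
    summarize(sys.argv[1])  # e.g. a report saved with `heaptrack_print ... > report.txt`
```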
In 67db35e the HK node is at 1.29 GB. The memory leak is still present.
While we're investigating how these allocations could leak and whether it's due to nwaku or its submodules, I've done the following:
A quick summary of the findings:
Thanks for the summary, @alrevuelta! Regarding the correlation between the memory leak and the libp2p wss option: the image below shows that over the last two hours the US node has had wss disabled but still ~70 peers, and its memory usage is low:
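For a quick check without the dashboard, here is a minimal sketch of reading peer count and resident memory straight from a node's Prometheus metrics endpoint (the URL, port and metric names are assumptions and depend on the node's actual metrics configuration):

```python
# Minimal sketch: read peer count and resident memory from a node's Prometheus
# /metrics endpoint to eyeball the peers-vs-memory relationship.
# The endpoint URL and the metric names below are assumptions.
import re
import urllib.request

METRICS_URL = "http://localhost:8008/metrics"    # assumed metrics endpoint
PEERS_METRIC = "libp2p_peers"                    # assumed metric name
MEMORY_METRIC = "process_resident_memory_bytes"  # assumed metric name

def scrape(url: str) -> str:
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8", errors="replace")

def metric_value(text: str, name: str) -> float | None:
    # Prometheus text format: `name{labels} value` or `name value`
    match = re.search(rf"^{re.escape(name)}(?:\{{[^}}]*\}})?\s+([0-9.eE+-]+)", text, re.M)
    return float(match.group(1)) if match else None

if __name__ == "__main__":
    body = scrape(METRICS_URL)
    print("peers:", metric_value(body, PEERS_METRIC))
    print("resident memory (bytes):", metric_value(body, MEMORY_METRIC))
```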
@Ivansete-status Ah, I see, my bad. I thought wss was enabled again in the star.
The next issue was reported in |
We now have an additional problem: many peers are connecting through wss.
(I checked that all connections with
Several items are being fixed here, but #1800 provides some memory improvement, coupled with better peer management techniques and gossipsub scoring. Monitoring memory usage.
Problem
A possible memory leak was introduced or activated in the wakuv2.test fleet, roughly between 30 March and 5 April.
Impact
Critical. Unbounded memory growth causes crashes and instability.
Screenshots/logs
nwaku version/commit hash
At least c26dcb2 and later
Additional context
This memory leak is not visible on status.test, which runs the same version of nwaku.