New Simple Easily Reproducible Catastrophic TCP Stack Failure Results in Low Heap Crash shown with WiFiServer, Latency > 100ms, and Larger Send Bytes #2925
Comments
I see only this commit to have relevant changes. Try rolling them back and see if that changes your results.
I tried reproducing this last night, but apparently my latency was not high enough. @CircuitSerialKiller any chance you can help narrow down the offending change using
Yes I can. I'm also going to grab a Wireshark capture. I should have a few hours this weekend to dedicate to this.
Hi @igrr and @joelucid The commit before this, Sorry for the long delay before I could dedicate a few hours on a weekend to tracking this down.
Hello, I did try this with lwip2: it works with no bug. With lwip1.4 it does not. I suspect that the bisected commit makes things work better, so lwip1.4 has too much data to deal with and simply cannot cope with overfilled buffers. This is the reason why I wanted to update lwip in the first place. Here is the log of the sketch above, with a display every second, and the available heap size.
lots of requests from now
lwip starts to timeout something
So about 2 min after the burst of requests, memory is released back by lwip timers. I am not pushing for lwip2 in v2.4.0; I still think it is too early. Edit: in the meantime, the same sketch with lwip1.4 did not release memory back after 2368 seconds (~40 min).
I witnessed this same issue with the webserver, and was forced to move to the asyncwebserver. I wish I had seen this before; I would have tested lwip2 instead.
@d-a-v I'm going a bit crazy with the cleanup, but I think I saw a PR with lwip2. Or was it a branch? Anyways, is that up to date? I may just try testing it anyways.
@d-a-v thanks, I will get to this right after I finish my current pending work for this repo.
Hi, this problem wasn't introduced with an LWIP change. This problem still exists in the current Arduino Core with LWIP v2.1.0-14-g33f234f.
@CircuitSerialKiller Yes it still does not work as stated above. It says:
However, once lwip2 is merged (let's be imaginative enough), I guess the chunk size will be reconsidered, because it will have no impact on lwip stability. It could then become configurable for bandwidth purposes (and TCP_MSS too). There will be a tradeoff between bandwidth and free heap.
@CircuitSerialKiller @d-a-v how does release 2.4.0 work out? I would think no trouble with lwip MSS=536.
What is the status here?
A year later, I can tell you that these 256B chunks are not impacting latency (or only very slightly). If you can confirm the OP issue is solved, please close.
Closing due to age and lack of response.
Basic Infos
Hardware
Hardware: ESP-12E, WeMos D1 Mini
Core Version: Latest github version
Description
I have to use the newer core on GitHub for the MEMP_NUM_TCP_PCB_TIME_WAIT = 5 setting, to prevent crashing with frequent HTTP connections. (Issue 2767).
After switching from version 2.3.0 to the latest GitHub version of the ESP8266 core, I found that when accessing my HTTP server over VPN, or from the outside world over higher-latency connections, HTTP responses from the ESP8266 do not make it to the client, and the ESP8266 steadily loses free heap until it fails. I have found this occurs only when the response exceeds a certain number of bytes in length, though I haven't determined the exact number. Responses > 2048 bytes definitely demonstrate this issue. I also tested via telnet directly to port 80, issuing HTTP commands. This does not occur with the 2.3.0 release.
I've modified the WiFiWebServer.ino file to easily demonstrate this issue. Just set up a NAT to your ESP8266 IP address with the appropriate ports, and from a remote site, cell phone, etc., try accessing the following URL repeatedly while watching the serial console. You'll see the free heap steadily decrease and never recover!
http://a.b.c.d/gpio/1
Where a.b.c.d is your outside IP address.
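For anyone who would rather script the repeated requests than reload a browser, here is a minimal client-side sketch for a POSIX machine at the remote site. It is not the modified sketch from this issue: the helper names make_request and fetch_once, the Connection: close header, and the receive timeout are all assumptions made for illustration.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <string>

// Build a minimal HTTP/1.1 GET request. "Connection: close" asks the
// server to end the response by closing the socket.
std::string make_request(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.1\r\nHost: " + host +
           "\r\nConnection: close\r\n\r\n";
}

// Send one GET to host:port and return the number of response bytes
// received, or -1 if connecting, sending, or receiving fails (a receive
// timeout after `timeout_s` seconds also counts as a failure).
long fetch_once(const std::string& host, const std::string& port,
                const std::string& path, int timeout_s = 10) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    addrinfo* res = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &res) != 0) return -1;

    int fd = -1;
    for (addrinfo* p = res; p != nullptr; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0) return -1;

    // Without a timeout, recv() could block forever when the response
    // never arrives, which is the failure described above.
    timeval tv{};
    tv.tv_sec = timeout_s;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    const std::string req = make_request(host, path);
    size_t sent = 0;
    while (sent < req.size()) {
        ssize_t n = send(fd, req.data() + sent, req.size() - sent, 0);
        if (n <= 0) { close(fd); return -1; }
        sent += static_cast<size_t>(n);
    }

    long total = 0;
    char buf[512];
    ssize_t n;
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0) total += n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling fetch_once("a.b.c.d", "80", "/gpio/1") in a loop with a short pause between calls, while watching the serial console, should show whether responses stop arriving (a return value of -1) as the free heap drops.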
Settings in IDE
Module: NodeMCU 0.9
Flash Size: 4MB
CPU Frequency: 80 MHz
Flash Mode: qio
Flash Frequency: 40 MHz
Upload Using: SERIAL
Reset Method: nodemcu
Sketch
Debug Messages