[WiFi] Reduce wifi reset calls and call full wifi init at reset (#3147) #3148
I made a test build for this. |
I often get this now with these changes (merged code, not the bin files above) and an OTA update:
|
I'm also testing this and there is a logic error which I'm hunting down as we speak. P.S. To get the node to enter an endless loop, try disconnecting the ESP from the access point side (Mikrotik allows you to kick off a connected client). |
@TD-er: Something OT: Have you checked the pubSubClient v2.8? Sending works, receiving not... :-/ |
Nope, have not looked at the PubSubClient main repo for a while. |
I downloaded it here (right version?): |
Yep. |
Ah, ok. So I can wait for a long time to get it to work in the orig condition... LOL |
We had a discussion some time ago in a different issue about preventing fast switching of the WiFi state. As a first step, wifistate/wificonnected should not be a function but an int/bool to check. As a second step, the WiFi logic should work more lazily. This will reduce the WiFi overhead. |
Not sure if I understand you correctly @uzi18 What I'm now trying to do (not yet committed) is to consider the What I'm seeing though is that this |
Made a new test build for this PR: |
src/ESPEasyWiFiEvent.cpp
wifiStatus &= ~ESPEASY_WIFI_GOT_IP;
wifiStatus &= ~ESPEASY_WIFI_SERVICES_INITIALIZED;
@TD-er use the bitRead/bitWrite macros. Are you using inverted logic here?
That's perhaps indeed a bit more readable :)
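To make the two styles in this review thread concrete, here is a minimal sketch. It assumes the status flags are bit masks, as the `&= ~FLAG` lines suggest; the bit positions and helper names below are hypothetical and not taken from the project. The macros are local stand-ins with the same definitions as the Arduino core macros, so the sketch also builds off-target.

```cpp
#include <cstdint>

// Local stand-ins for the Arduino bit macros, so this builds without the core.
#ifndef bitRead
#define bitRead(value, bit)  (((value) >> (bit)) & 0x01)
#define bitSet(value, bit)   ((value) |= (1UL << (bit)))
#define bitClear(value, bit) ((value) &= ~(1UL << (bit)))
#define bitWrite(value, bit, bitvalue) ((bitvalue) ? bitSet(value, bit) : bitClear(value, bit))
#endif

// Hypothetical bit positions; the project's real flag values may differ.
constexpr uint8_t WIFI_BIT_CONNECTED            = 0;
constexpr uint8_t WIFI_BIT_GOT_IP               = 1;
constexpr uint8_t WIFI_BIT_SERVICES_INITIALIZED = 2;

// Mask style, as in the reviewed lines: AND with the inverted mask
// clears exactly the bits set in `mask` and leaves the others alone.
inline uint8_t clearWithMask(uint8_t status, uint8_t mask) {
  status &= static_cast<uint8_t>(~mask);
  return status;
}

// Bit-index style, as suggested in the review: bitWrite takes a bit
// position (0..7), not a mask.
inline uint8_t clearWithBitWrite(uint8_t status, uint8_t bit) {
  bitWrite(status, bit, 0);
  return status;
}
```

So `x &= ~MASK` is not inverted logic, it is the usual way to clear a flag. One thing to watch when switching styles: `bitRead`/`bitWrite` expect a bit index, so constants defined as masks cannot be passed to them unchanged.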
OK, there is really something wrong in the core library.
station_status_t status = wifi_station_get_connect_status();
if (status == STATION_CONNECTING && timeOutReached(last_wifi_connect_attempt_moment + 15000)) {
  resetWiFi();
}
The strange part is, this But it is actually happening on only some modules when trying to reconnect, or on the first connection. |
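For readers who want to try the watchdog idea from the snippet above off-target, here is a minimal sketch. The names `deadlineReached`, `StationStatus` and `shouldResetWiFi` are hypothetical stand-ins, not the project's `timeOutReached` or the SDK's `station_status_t`. It assumes time is a 32-bit millisecond counter that wraps around, like `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper: true once `now` has reached or passed `deadline`.
// The unsigned subtraction wraps around, and reading the result as a
// signed value keeps the comparison correct across a counter rollover,
// as long as both values are less than about 24.8 days apart.
inline bool deadlineReached(uint32_t now, uint32_t deadline) {
  return static_cast<int32_t>(now - deadline) >= 0;
}

// Hypothetical stand-in for the SDK's station status enum.
enum class StationStatus { Idle, Connecting, GotIp };

constexpr uint32_t CONNECT_TIMEOUT_MS = 15000;

// Reset only when the station has been stuck in "connecting" for longer
// than the timeout since the last connect attempt.
inline bool shouldResetWiFi(StationStatus status, uint32_t now, uint32_t lastAttempt) {
  return status == StationStatus::Connecting &&
         deadlineReached(now, lastAttempt + CONNECT_TIMEOUT_MS);
}
```

Passing `now` in as a parameter rather than calling `millis()` inside the helper is only a choice made here to keep the sketch testable on a PC.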
After merging this: |
Is Vcc only dropped on the (first?) readings while it was still trying to connect? If so, then it can be explained by the fact that Vcc is not read during the WiFi connect, as reading it may affect WiFi performance and vice versa. I told you before that the loop count is not a really useful measure for detecting the CPU load. |
Now I merged this: |
Usually I wait a few days before trying the new core versions, especially if they include a new toolchain.. But if you build the 'stage' build, you would actually have the same code as now on the master branch of esp8266/Arduino. |
I flashed a new FW just now with your changes and core 2.7.2 and will let it run overnight. Will see what the log says tomorrow.... ;-) |
Well I do have some test code pending with more logging. |
Wow, after 9 hours no crash (reboot), no lost WiFi! My config: ESPeasy custom, core 2.7.2, controllers 2 + 13 + 16 and 1 plugin (P001). But this is with my test node. Will now see what my "real" nodes do after flashing this FW... |
Yep a lot of stability related fixes lately |
Ok, this was the last of my 18 nodes I flashed (all are different). It is risky to do this on nodes that are needed daily, but it is the only real test with usable results. For now it doesn't look so bad.... :-) |
OK, I have to make a decision now, how to tackle the WiFi issues. Option 1: Option 2: What I observe here in my tests, is that I do not seem to miss any events. But some internals of libraries or the core may still rely on the So what's the better option to do here? |
Is option 2 not a bit risky to ignore the WiFi status? BTW: At this time all my nodes seem to run stable with your last changes. |
Well if the wifi status is incorrect, what should you do? The tricky part here is that some nodes appear to have a higher chance of seeing this out of sync behavior compared to others. |
Hard to say. I have here 16 Sonoff nodes and 2 ESP-12F in use. All running without issues now. But I had not really strange WiFi issues in the past. Maybe it depends on the good router (FB7490)!? |
I used to think that too, but since I now have one node that has significantly more issues connecting to WiFi compared to the other nodes, I don't think it is solely the router that makes the difference. After all, it has been reported by users for ages, but I simply was unable to reproduce it. |
Or it depends on the stuff that is compiled within the firmware!? |
Ok, the 6 nodes I flashed yesterday survived the night without any issues... :-) |
Maybe you can also let the node force to reconnect to WiFi and/or kick a node from the access point (if possible, or restart the AP) |
Ooops, CPU load is rising from 10 to 80% after I disabled WLAN at the router. After connection is done it goes back to 10 |
CPU load (as it is computed) will indeed seem to go up. The only way to make this look less (remember the load computation is based on the time spent in loop calls that don't have a scheduled job to run) is to schedule the wifi connect calls too. |
As long as no sensor data is lost under that high load, it doesn't matter to me. ;-) |
It is only lost if the controller buffer is full and it cannot offload it to the other end of the controller because there is no network connection. |
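The loss condition described here can be pictured with a small bounded queue. This is only an illustrative sketch: the class `BoundedSampleQueue` and its methods are hypothetical and do not mirror the project's actual controller queue.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical bounded buffer between sensor readings and a controller.
class BoundedSampleQueue {
public:
  explicit BoundedSampleQueue(std::size_t capacity) : capacity_(capacity) {}

  // Queue a sample. Returns false (and counts a drop) when the buffer is full.
  bool push(const std::string& sample) {
    if (items_.size() >= capacity_) {
      ++dropped_;
      return false;
    }
    items_.push_back(sample);
    return true;
  }

  // Try to offload the oldest sample. Nothing leaves the buffer while the
  // network is down, so the buffer can only fill up in that state.
  bool sendOne(bool networkUp) {
    if (!networkUp || items_.empty()) {
      return false;
    }
    items_.pop_front();
    return true;
  }

  std::size_t size() const { return items_.size(); }
  std::size_t dropped() const { return dropped_; }

private:
  std::size_t capacity_;
  std::size_t dropped_ = 0;
  std::deque<std::string> items_;
};
```

In this picture a short WiFi outage costs nothing as long as fewer samples arrive than the buffer can hold; only a longer outage starts dropping data.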
Under normal circumstances WLAN should be up and the nodes are connected. Most important is, that they do not crash and reboot with a lost WiFi... |
Just added some patches, mainly for ESP32 and for when you're logging to syslog. But I also looked into the 'lockup' you mentioned @v-a-d-e-r. On boards with the CP210x chips, when I had VS Code reading the serial log and then closed VS Code, the node hung. |
I have no ESP32 in use. It is related to all nodes. Only 2 of 18 use serial port. |
The USB to serial thingy I've only seen on the ESP8266 boards. |
Also available for ESP32 with USB. See here.... |
The USB chip is present on almost all boards. By the way, what an exorbitantly high price for such a module. |
That's right. I took the first hit I found, because I needed only the link ;-) |
No need to reboot after changing UDP port setting.
The max. UDP packet size could allow for a large memory allocation when a large packet was received, regardless of whether it was a valid packet to process.
This could lead to extra memory usage until the next (valid) NTP packet was processed. Also the UDP port could be held.
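One way to avoid this kind of allocation is to check the reported packet size before copying anything. The sketch below is an assumption about the general approach, not the code of this commit; `copyIfPlausible` is a hypothetical helper. The 48-byte constant is the size of a standard NTP packet without extension fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Size of a standard NTP packet without extension fields.
constexpr std::size_t NTP_PACKET_SIZE = 48;

// Hypothetical helper: copy a received packet only when its reported size
// is plausible. Oversized or empty packets are rejected before any
// allocation happens, so a large bogus packet cannot claim a large buffer.
inline std::vector<uint8_t> copyIfPlausible(const uint8_t* data,
                                            std::size_t reportedSize,
                                            std::size_t maxSize = NTP_PACKET_SIZE) {
  if (data == nullptr || reportedSize == 0 || reportedSize > maxSize) {
    return {};
  }
  return std::vector<uint8_t>(data, data + reportedSize);
}
```

On a real socket the rejected packet would still have to be discarded (for example by flushing it) so it does not stay queued; that part depends on the UDP API in use and is left out here.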
Fixes: letscontrolit#3155
For the ESP8266, we keep things like the last boot cause in RTC memory. Currently that's not supported on ESP32 (in our code) so the last boot cause is never read and stored.
Fixes: #2931
Fixes: #1258
Fixes: #3155
Fixes: #3141
Fixes: #3111
Connected state is now based on the state reported by the internal WiFi object.
Frequent calls to reset() of WiFi are now rate limited to make sure we don't end up in a loop.
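A rate limit of this kind can be sketched as a small guard object. This is a minimal illustration under assumptions: the class name `ResetRateLimiter`, its interface and the interval are hypothetical and not taken from the PR. Time is passed in as a 32-bit millisecond counter, like the value `millis()` returns.

```cpp
#include <cstdint>

// Hypothetical guard that allows an action at most once per interval.
class ResetRateLimiter {
public:
  explicit ResetRateLimiter(uint32_t minIntervalMs) : minIntervalMs_(minIntervalMs) {}

  // Returns true when a reset may run now and records the time.
  // Returns false when the previous reset was too recent.
  bool tryAcquire(uint32_t now) {
    // Unsigned subtraction stays correct across a counter rollover.
    if (hasRun_ && static_cast<uint32_t>(now - lastRun_) < minIntervalMs_) {
      return false;
    }
    hasRun_ = true;
    lastRun_ = now;
    return true;
  }

private:
  uint32_t minIntervalMs_;
  uint32_t lastRun_ = 0;
  bool hasRun_ = false;
};
```

A caller would wrap the reset in something like `if (limiter.tryAcquire(millis())) { /* reset WiFi */ }`, so that a burst of disconnect events leads to one reset rather than a reset loop.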