Reconnect WiFi (scan for strongest AP) #731
Comments
But it would not be the cleanest solution. It would probably be more like a function that checks the RSSI and, if it is lower than a threshold, initiates a scan and reconnects to a stronger AP. |
It'd be nice if the ESP-IDF would support 802.11r (fast roaming) but they do not, as far as I know. Without fast roaming you have to go through the whole disassociate/scan/reassociate process. @randybb linked to the open esp-idf issue tracking fast roaming support. If they're just now working on adding it to ESP32 I can't imagine we'll ever see it for esp8266. |
Agree. Proper roaming would for sure be the best. But as that does not seem to be happening any time soon, this is a "something" (hopefully more feasible) rather than "nothing", in the meantime. |
I would like something that would periodically rescan too. I have several ESPHome smart plugs (S31) and 3 WiFi APs through my house to provide sufficient coverage. Sometimes I have to reboot an AP (for firmware updates, reconfiguring channel/security/VLANs, troubleshooting, or power failures, since not all are on UPSs) and then the plugs connect back to whichever AP happens to come up first. In some cases, this means connecting to the far end of the house and latching on forever. I have tried setting multiple networks with the BSSID set for priority to the expected nearest one, but this still doesn't work well if the AP boots up slower than the S31 (which is most times). It would be much better if it could properly support roaming in some way. Using a full reboot just to scan for WiFi is very annoying if it's connected to a light, worse if it's a TV, radio, or other device that cannot tolerate a brief blip without doing a full "reboot cycle", and probably not good for the relay contacts if it's a higher-current load such as a washing machine, dishwasher, etc. An additional problem: as the plugs boot up, it seems to pick the first AP it sees (by lowest channel number?) regardless of signal strength. This complicates the issue further... my 2 networks specifying BSSID with higher priority help, but are still a poor workaround. Proper periodic scanning and selecting by signal would correct this problem too. |
Vote for this too, I'd like to have any such functionality as well! |
Remember to upvote FR using the 👍 on OP |
I mean the current If you have multiple matching networks with the same priority, they will automatically be chosen in a round-robin fashion because on each disconnect the previous network gets a If a device connects to a wrong network, it will stay there as long as it's still connected. But as soon as the wifi connection drops it will automatically choose the best network again. |
No, it does not. Critical distinction, "strongest AP" is NOT the same as "strongest network".
This doesn't help when they are the same network, but multiple access points (same SSID, many BSSID to provide redundancy, distribute load, and improved coverage) and it locks onto a weaker BSSID when a stronger one is available and it won't ever switch to the stronger one. In my experience, on bootup the ESP also frequently does not pick the strongest BSSID for the specified SSID either (seems to go by lowest channel number not signal for which BSSID on a given SSID???)
That is part of the problem, especially with multiple BSSIDs on the same SSID. It should scan periodically (or when the signal gets below a threshold) and reconnect to the strongest one. If it doesn't pick the strongest BSSID to begin with, it never tries again either. There is already a way to have it report back its signal strength; there should be a way to have that drop below a threshold and trigger a scan/reconnect without rebooting (which interrupts whatever it's doing, for example cycling the relay on a smart plug and making your TV/radio/whatever reboot too). At least then it would be possible to design something to make it retry periodically with an automation without having to reboot. Additionally, if it insists on using an AP with a poor signal (say at the far end of your house, as I often observe) and causes many retransmitted packets due to low (but not low enough to disconnect) signal quality, that introduces significant performance overhead and impacts not only the ESP device but all other devices on the same access point. |
I'm suffering from this now. Has anyone dug into the required changes already? |
Tasmota fixed this a couple of years ago, but it is disabled by default... (SetOption56/SetOption57, see arendst/Tasmota#3173). Absolutely needed for ESPHome, without it the devices are nearly guaranteed to do the wrong thing on multi-AP networks. That is causing me big headaches currently. |
Ah, workaround for static setups (not the riding lawnmower mentioned above): define the bssid in the WiFi settings. Not as elegant as an automatic scan though, plus it requires one to guess in edge cases. |
Sorry, that was nonsense. The BSSID steers the network, not the specific AP. |
I expect the bssid thing to work, however, it will stick to one AP |
BSSID is what steers the AP (it's the wireless radio's MAC address); the SSID is the network. The workaround for non-moving devices is to set the BSSID in the network settings, BUT if the access point goes offline you still need the device to scan for any AP on the SSID as a backup (lower-priority network). Then I had to make switches, sensors, and automations to reboot the ESPs any time an AP goes offline, to force them to rescan for the one they are supposed to be on. |
Anyway what otro said should work. So this is a bug and not a feature request. Just making that clear. Cannot debug now: no time, and my problematic nodes are hard to serial debug for now. |
As @randybb mentioned, the ideal would be a typical roaming implementation, but that may be too heavy a lift while we wait for upstream. -- 1 -- For example, one of mine is currently connected to an AP with an RSSI of -90 dB, while there is a far closer AP at -65 dB. If I were able to set a minimum, the -90 dB AP would always be ignored during the scan (unless it is the only one available, in which case log it). That would likely be easier than having the average user determine the BSSID. -- 2 -- If that's really the case (the ESP doesn't select the BSSID based on signal), we should add an alert/note to the wifi component documentation to warn users with multi-AP networks. -- 3 -- Again, the ideal is a traditional scan with thresholds and proper roaming support. |
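To make suggestion 1 concrete, here is a rough sketch of the selection rule in Python. It is not ESPHome code: `ScanResult`, `pick_ap`, and the default threshold of -80 are hypothetical names and values made up for illustration. It only shows the idea of ignoring APs below a minimum RSSI and falling back to the strongest one when nothing passes.

```python
from typing import List, NamedTuple, Optional, Tuple


class ScanResult(NamedTuple):
    """One entry from a WiFi scan (hypothetical structure, for illustration only)."""
    ssid: str
    bssid: str
    channel: int
    rssi: int  # signal strength; values closer to 0 are stronger


def pick_ap(results: List[ScanResult], ssid: str,
            min_rssi: int = -80) -> Tuple[Optional[ScanResult], bool]:
    """Pick the strongest AP for `ssid`, ignoring APs weaker than `min_rssi`.

    Returns (ap, below_minimum). `below_minimum` is True when no AP met the
    threshold and the strongest weak AP was used as a fallback, so the caller
    can log that case. Returns (None, False) if the SSID was not seen at all.
    """
    candidates = [r for r in results if r.ssid == ssid]
    if not candidates:
        return None, False
    usable = [r for r in candidates if r.rssi >= min_rssi]
    # Choose by signal strength, never by channel number or scan order.
    best = max(usable or candidates, key=lambda r: r.rssi)
    return best, not usable
```

With a scan that sees the same SSID at -90 on channel 1 and at -65 on channel 11, this picks the channel 11 AP. If only the -90 AP is visible, it is still returned, with the flag set so the fallback can be logged.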
Running into this issue also. Did the initial flash in the house, then moved esp32 to garage where there is enough signal to see the original AP, but the connection is very unreliable. Even when the esp32 completely falls off the network, it won't try to connect to the AP that's in the garage because it can still see the one inside. I had to take my laptop to the garage and do the initial programming on a different esp32 in the garage to get it to use the AP in there. Forcing it to a specific BSSID will not resolve the issue because if that AP ever fails and it tries to connect to one of the APs furthest away, I'll be right back in the same situation. |
esp8266 have a bit better WiFi reception with PCB antennas than esp32. In house I don't have problems - I have an AP for every 50 m2, but outside it is another story. I have been using ESP32 with external antennas (ESP32-WROOM-32U) without any problems where ESP32 with PCB antennas would have problems. |
I have observed #2 with my Sonoff S31 plugs on ESPHome...they go by the lowest channel number they see regardless of signal strength...which in some cases means picking the worst signal. Really hope something can be figured out better than the mess of improvised BSSID and manually rebooting when it "might" be on a different AP... |
I have the same problem: 3 APs in a mesh configuration and 1 AP from the internet provider. The mesh does a periodic reset every night, so the ESP devices connect to the internet provider's AP, which is a much less stable connection. I tried to implement a background scan on the ESP8266; as mentioned above, it interrupts the connection. Each channel takes about 100 ms to scan, so I thought that by scanning one channel every 10 s I'd get a reasonable compromise: a small delay for events to be sent and a full scan every few minutes (14 channels × 10 s/channel = 140 s for a full scan; 100 ms / 10 s = 1% of the time, with events delayed by up to 100 ms). I thought that the TCP/IP stack would handle a small interruption in the link, but the system became unstable. The same happens if a full scan is initiated too quickly. The current implementation supports rescanning, but it must be used sparingly. Why does the API use TCP/IP? Why not use UDP? It uses fewer resources, and the API messages are small, so a datagram, event-driven approach is simpler than the streaming approach used by TCP/IP. |
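For anyone who wants to play with the numbers above, the arithmetic can be written out as a small Python helper. `scan_schedule` is a made-up name for illustration, and the inputs are the figures quoted in the comment (14 channels, roughly 100 ms per channel, one channel every 10 s).

```python
from typing import Tuple


def scan_schedule(channels: int, dwell_s: float,
                  interval_s: float) -> Tuple[float, float]:
    """Return (seconds per full sweep, fraction of time spent off-channel).

    One channel is scanned every `interval_s` seconds, and each scan keeps the
    radio away from the connected channel for about `dwell_s` seconds.
    """
    full_sweep_s = channels * interval_s
    off_channel_share = dwell_s / interval_s
    return full_sweep_s, off_channel_share


full_sweep_s, share = scan_schedule(channels=14, dwell_s=0.1, interval_s=10)
print(f"full sweep: {full_sweep_s:.0f} s, off-channel share: {share:.0%}")
```

Halving the interval to 5 s would halve the sweep time to 70 s while doubling the off-channel share to 2%, which is the trade-off being described.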
UDP would complicate other things, it is not a reliable protocol ("best effort" is often reliable enough in a quiet network, but if the packet is lost it will never retry...would need to rewrite applications to keep retrying and checking if it worked...much more effort). If it was controlling a light show with a nonstop stream of commands, UDP would be better but for simple on off and button press TCP is the reasonable option. I'm curious what "unstable" meant - maybe there are some timeouts or buffers that can be adjusted somewhere? I would have thought it would "look" like high latency (which should be fine, and can naturally exist). I wouldn't even be that upset if it couldn't do full-on roaming if there was a way to do a scheduled reboot say middle of the night and it properly found the best AP...but as it currently stands they seem to prefer the first by channel # rather than signal strength. |
See esp8266/Arduino#4213 or http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html I might be wrong about why the heap is running out - but I got a strong correlation between network errors and heap running out. It sometimes comes back, if no network error occurs for a long time - so it isn't a normal memory leak |
Seems like a super reasonable desired feature, but with some technical limitations. Perhaps attempting it via a rescan, only available as an action that can be called at first? |
As mentioned before, this has been solved by tasmota years ago. I am running 22 ESP8266 and 4 Mesh-APs successfully with tasmota and I am looking into migrating to ESPhome. But without some working "roaming" support it will definitely not work, because I know the troubles I had with tasmota until the "poor man's wifi roaming" was implemented. It is probably not a very clean solution, but it works rock stable for years. All ESPs are always connected to the best AP, after AP reboot they switch to the next best AP, and after a fixed timespan (22 minutes iirc) they move back to the original AP if the RSSI is significantly better. |
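The behaviour described here (rescan on a fixed timer, move only if another AP is significantly better) boils down to a hysteresis check. The Python below is only a sketch of that idea and is not Tasmota's actual implementation: `should_roam` and the 10 dB margin are assumptions made for illustration, and the 22-minute interval is simply the figure recalled in the comment above.

```python
RESCAN_INTERVAL_S = 22 * 60  # rescan period recalled in the comment ("22 minutes iirc")
ROAM_MARGIN_DB = 10          # assumed margin; what counts as "significantly better" is a guess


def should_roam(current_rssi: int, best_other_rssi: int,
                margin_db: int = ROAM_MARGIN_DB) -> bool:
    """Roam only when another AP beats the current one by at least `margin_db`.

    The margin avoids flapping between two APs of similar strength, since
    every switch costs a disconnect and reconnect.
    """
    return best_other_rssi - current_rssi >= margin_db


# Example: one decision per periodic rescan, as (current RSSI, best other RSSI).
for current, other in [(-90, -65), (-70, -65), (-60, -75)]:
    action = "roam" if should_roam(current, other) else "stay"
    print(f"current {current}, best other {other}: {action}")
```

A device stuck at -90 next to an AP at -65 would switch at the next rescan, while a device at -70 would stay put rather than reconnect for a 5 dB gain.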
I was checking the signal strength of some of my devices, and they were not connected to the nearest AP even though the signal was poor. I assumed they would reconnect to the best AP every 5 mins like Tasmota does, but then discovered this request! Even a few reboots did not seem to work for me, and ESPHome still connected to the distant AP even though there was a much closer one. I had to enter fast_connect: off in the config (which should be the default, I believe), and now it's connected to the nearest AP. It will be good to get this feature in ESPHome. I was migrating a few devices from Tasmota but need to pause that now... |
Any news on this? Anyone with a workaround or something? |
For a workaround, using API callable services, you can try something like this:
The priority of a network drops every time it disconnects from the AP. In my setup, the mesh routers reset every night (don't ask), so the priority score is useless and just causes the unit to choose the wrong AP. I've tried multiple ways of calling start_scanning periodically to keep track of the AP with the strongest RSSI, but sometimes the scan causes the link to Home Assistant to disconnect, and sometimes it even causes ESPHome to reboot. For now I've added the following patch to handle the nightly AP resets.
|
It looks like 802.11k & v are now supported, and I believe this would help. Does the linked sample code aid in integrating into ESPHome? |
👍 |
@micronen hey, I have a different (and I think better) solution for you, and maybe for others who came here too. I have a similar problem (in my case an unreliable router), and I made an ESPHome device that restarts this one mesh point by cutting off its power for a couple of seconds if its 2.4 GHz network is unavailable. While the closest mesh point is unavailable, the ESP connects to any mesh point and periodically checks whether the closest one is available. Here is the YAML:
In my case it's a sensor that floods the logs every second, but I see no problem with making it a periodic automation. |
Aside from the new feature (since 2023.06) to disable and re-enable WiFi for a moment if the RSSI drops under a certain value, which may achieve this with OOB features also for Arduino, there is another option to satisfy your exact use case from the antenna side, but you must be able to install OpenWRT and DAWN on the antennas. Depending on the ESP used, this ranges from gently asking the client to roam to refusing AUTH and ASSOC so that not-so-cooperative clients are forced to roam. I had this issue with robots and devices that moved around constantly and always stuck to the same antenna until they lost connection completely for a couple of seconds, and the device was not reachable for ~20 seconds. If there is interest I can post my DAWN config. |
In my case, with this original problem, ESPHome would scan and then keep trying to connect to the AP on the lowest channel number, which in some rooms was below that RSSI threshold. It then ended up fighting with the AP sending reassociation packets and would fail to connect at all, while sitting directly under another AP with full signal on a higher channel number. |
I'm having the same issue: ESPs constantly connect to distant access points. I tried prioritizing the closest access points using multiple network definitions and specifying the bssid of the closest AP as the highest priority, but it still doesn't connect to its closest AP.
wifi:
  networks:
    - ssid: !secret 'wifi_ssid'
      password: !secret 'wifi_password'
      priority: 0.0
    - ssid: !secret 'wifi_ssid'
      bssid: <closest AP mac address>
      password: !secret 'wifi_password'
      priority: 10.0
  fast_connect: false
  domain: .iot.<my-domain>
  power_save_mode: NONE
  ap:
    ssid: chicken-coop-light-hs
    ap_timeout: 1min
  reboot_timeout: 15min
  output_power: 20.0
  passive_scan: false
  enable_on_boot: true
  use_address: chicken-coop-light.iot.<my-domain>
One note is that the AP it is connecting to is on channel 1, while the AP I want it to connect to is on channel 11. I wonder if it's prioritizing the lower channel, rather than prioritizing the bssid I want it to use. |
Same here on all of my ESPHome devices whenever I update or restart one of my APs. |
well, seems almost another year has passed, any updates or good workarounds? |
+1 |
AP roaming is needed indeed. |
As little seems to be happening to this feature request, I've just created FR #2981 which would add a workaround to the ESPHome dashboard. |
Any update on this, is it in the roadmap at all to be added? |
Describe the problem you have/What new integration you would like
I would like to have a function that disconnects the WiFi and then performs a new scan and connects to the strongest AP found.
Please describe your use case for this integration and alternatives you've tried:
I have an ESP32 that rides along my robot lawnmower and keeps track of some things.
I have 3 APs to be able to cover my house + garden. But the ESP is "sticky" and stays connected to whichever AP it decides on first.
I know it is possible to set the "reboot_timeout" to something quite low, but it seems unnecessary to reboot the whole ESP and lose track internally of stuff, when I only really want to try a reconnect to a better positioned AP.
Additional context