Enable 802.11kv causes memory leak and unexpected impact with wifi scan (IDFGH-5703) #7423
Comments
I figured out why my device runs out of memory (OOM) when CONFIG_WPA_11KV_SUPPORT=y. My code flow is as follows (when the user starts a wifi scan, I collect the scan results and display them to the user):
This works fine when CONFIG_WPA_11KV_SUPPORT=n. However, when CONFIG_WPA_11KV_SUPPORT=y, the device also scans in the background. This leads to the following issue:
|
Hi @AxelLin, scans triggered by the supplicant are cleaned up in the esp_supp_scan_done_cleanup() routine. We tried to reproduce this issue in-house but couldn't. If the issue is easy to reproduce in your environment, can you please collect supplicant logs and sniffer captures and attach them here? |
The problem is not esp_supp_scan_done_cleanup(). My code registers an event handler for (WIFI_EVENT, ESP_EVENT_ANY_ID) and allocates memory to collect the scan results. The point is that the supplicant's scan will also post a WIFI_EVENT_SCAN_DONE event, and I cannot tell |
Let me explain it this way:
See the code flow above. This has nothing to do with the environment. |
Hi @AxelLin ,
That's what my original comment #7423 (comment) meant: whenever a scan is issued by the supplicant, the cleanup is also called by the supplicant (the application does not need to call anything). The supplicant's scan cleanup function is only called when the scan was issued by the supplicant. Maybe the issue is more related to your application and how it handles the scan_done event. Can you paste your scan request and handler code so that we can take a look?
|
I'm aware that the application should clean up memory allocated by the application. I'm not sure what you mean by "keep reference in application to check whether scan was issued by application". My application does a simple scan as below:
When it gets the WIFI_EVENT_SCAN_DONE event:
|
A simple code example to handle this.
When you get WIFI_EVENT_SCAN_DONE event: if (scan_issued_by_application == false) /* do scan processing */
|
Your suggestion is racy. You expect: scan_issued_by_application = true; But in practice the supplicant runs in a different task, so it could be: scan_issued_by_application = true; or esp_wifi_scan_start() // triggered by supplicant or scan_issued_by_application = true; In addition, I notice that esp_wifi_scan_start() triggered by the application sometimes returns -1 if a scan was already triggered by the supplicant. |
I also notice something unusual when the application requests a scan (scan all channels). esp_issue_scan() scans with specific scan_params, i.e. it only adds new ids for the same SSID / channel. However, when my application triggers an all-channel scan, I observe the following: I (376316) wpa: BSS: Add new id 206 BSSID ac:84:c6:32:1a:26 SSID 'TEST-MESG' chan 9 |
Once 802.11kv is enabled on the ESP32, the device keeps scanning very frequently. I (70455720) wpa: scan issued at time=2507736990988 I'm testing with a TP-Link Deco X60.
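The flag approach and the race described above can be illustrated with a small host-side model. This is only a sketch: it makes no ESP-IDF calls, and `app_request_scan()`, `on_scan_done()` and `results_collected` are hypothetical names invented for the illustration. The key assumption it encodes is that the SCAN_DONE handler cannot see who started the scan and has only the flag to go on.

```c
#include <stdbool.h>

/* Hypothetical host-side model; none of these names are ESP-IDF APIs. */
static bool scan_issued_by_application = false;
static int results_collected = 0; /* times the app processed a scan */

/* The application marks its own scan just before starting it. */
static void app_request_scan(void)
{
    scan_issued_by_application = true;
    /* on the device, the real scan request would follow here */
}

/*
 * Stand-in for the WIFI_EVENT_SCAN_DONE handler. It has no way to know
 * the true origin of the scan; it can only consult the flag.
 * Returns true if the application processed this event.
 */
static bool on_scan_done(void)
{
    if (!scan_issued_by_application) {
        return false; /* treated as an internal 11kv scan and ignored */
    }
    scan_issued_by_application = false;
    results_collected++; /* placeholder for allocating and reading results */
    return true;
}
```

In this model, a SCAN_DONE that arrives with no pending application request is ignored, which is the intended behaviour. If a supplicant scan is already in flight when `app_request_scan()` sets the flag, however, the supplicant's SCAN_DONE is processed as if it were the application's, and the application's own SCAN_DONE is then dropped. That is the misattribution the race leads to.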
|
+1, we have seen this issue. We ended up just disabling 802.11k and forcing a wifi disconnect before a scan; however, we would have preferred a way to differentiate an internal scan from an intended scan. We tie a lot of onboarding user logic (including LED states) to wifi events. All of our wifi scans use the non-blocking mechanism esp_wifi_scan_start(&scan_config, false). |
Hi @AxelLin , are you enabling client initiated roaming? |
No, that would cause other issues, so I didn't call esp_wifi_set_rssi_threshold() in this test. I just set the config below and enabled rm_enabled and btm_enabled in the wifi config.
I'm not sure I can do that; the sniffer capture may include some confidential data that I cannot share. |
+1, I don't have confidence to enable CONFIG_WPA_11KV_SUPPORT for production code at the moment. |
Hi @AxelLin, in that case the scan is triggered by the AP. You can check whether the AP is sending beacon measurement requests. |
@AxelLin @AshUK We have discussed the implications of the SCAN_DONE event for applications. For internal 11kv scans, we will add an additional event so that applications can clearly distinguish them. Also, regarding the scanning frequency, we will check with different routers with roaming enabled and see how frequently the AP sends "beacon measurement request". |
I have confirmed that the frequent scanning is caused by receiving "beacon measurement request" from the AP. |
Hi @AxelLin, we have btm_enabled & rm_enabled flags in wifi_sta_config_t to control these 11kv behaviours. We may do some optimisation to deal with APs sending continuous measurement requests. |
Thanks for confirming. Actually I had asked about this before: #3671 (comment) Maybe the documentation needs an update to make it clear how to disable wifi roaming at run time. |
In my current test, I just bypass the SCAN_DONE event if it is triggered by an internal 11kv scan. |
@AxelLin - Yes the documentation is WIP. Will have it ready soon. |
Hello, I have bought a Deco M4 V2 (3-pack). I use one as a router and the second and third as access points. I successfully connected the ESP32 to it, but I found that the ESP32 connects to the closest Deco and stays connected to it, even if I walk over to another Deco. Is there any way to make the ESP32 connect to another repeater without disconnecting? Is this related to this issue? |
@IbrahimMuala - The issue is not related. Moving from one AP to another without disconnection can be done with 802.11r roaming. Support for that has not been added yet. However, with the 11kv features you can register to get notified when the signal strength from an AP drops below a certain threshold. The application can then get a list of nearby APs and trigger re-connection with the best candidate if desired. |
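The "best candidate" step mentioned above could look roughly like the sketch below. It is a host-side illustration, not ESP-IDF code: `ap_candidate_t` is a hypothetical, simplified stand-in for a scan record, and the `margin_db` parameter (requiring a candidate to be clearly stronger than the current AP) is an assumption added for the example, not something stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one scan result. */
typedef struct {
    uint8_t bssid[6];
    int8_t rssi; /* signal strength in dBm */
} ap_candidate_t;

/*
 * Return the index of the strongest AP whose RSSI exceeds
 * current_rssi + margin_db, or -1 if no AP qualifies.
 */
static int pick_roam_candidate(const ap_candidate_t *aps, size_t count,
                               int8_t current_rssi, int8_t margin_db)
{
    int best = -1;
    int best_rssi = (int)current_rssi + (int)margin_db;

    for (size_t i = 0; i < count; i++) {
        if ((int)aps[i].rssi > best_rssi) {
            best_rssi = aps[i].rssi;
            best = (int)i;
        }
    }
    return best;
}
```

On a device, the candidate list would presumably be filled from the scan results gathered after the low-RSSI notification, and a return value of -1 would mean staying with the current AP.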
@sagb2015 |
@sagb2015 @kapilkedawat |
If the application issues an all-channel scan, the wpa supplicant will also receive the SIG_SUPPLICANT_SCAN_DONE event. |
I'm also wondering whether the app's esp_wifi_scan_stop call will stop a scan started by the supplicant's esp_wifi_scan_start if that scan is in progress, e.g. |
@sagb2015 @kapilkedawat There is a CONFIG_WPA_SCAN_CACHE setting. The help text for WPA_SCAN_CACHE does not tell what this config is for. |
Hi @AxelLin, when the station gets WIFI_EVENT_STA_BSS_RSSI_LOW, the most likely scenario is that the station has moved from its previous location. In that case, cached results won't help and you should preferably run a new scan. WPA_SCAN_CACHE enables the supplicant to store scan results and adds support for a scan table. It is more advantageous on the AP side, when the AP wants to manage the network and can get the saved results from the station to move the station accordingly. |
Environment
Module or chip used: ESP32-WROOM-32E
IDF version: v4.3-356-g48ae2309fd9c
Build System: idf.py
Compiler version: xtensa-esp32-elf-gcc (crosstool-NG esp-2021r1) 8.4.0
Operating System: Linux
Power Supply: USB
Problem Description
While testing wifi roaming, I noticed that my application keeps receiving the WIFI_EVENT_SCAN_DONE event.
I register an event handler for the WIFI_EVENT_SCAN_DONE event because, when the user issues a wifi scan
command, my application code collects the received scan results.
But now my application code keeps getting called, because scans are issued internally when 802.11kv is enabled.
I'm not sure if there is a proper way to differentiate a scan caused by 802.11kv from
a scan that the user explicitly requested. If not, this seems to be a bug.
BTW, the device issues scans very frequently when 802.11kv is enabled (see the timestamps below).
Is this normal?
I (726679) wpa: action frame sent
I (732690) wpa: scan issued at time=1956986705404
I (732695) wpa: BSS: Add new id 326 BSSID ac:84:c6:32:14:77 SSID 'TEST-MESH' chan 9
I (732709) wpa: BSS: Add new id 327 BSSID d8:47:32:7c:34:ba SSID 'TEST-MESH' chan 9
I (732738) wpa: BSS: Add new id 328 BSSID ac:84:c6:32:35:b5 SSID 'TEST-MESH' chan 9
I (732741) app: Got WIFI_EVENT, event_id 1
I (732742) wpa: scan done received
I (732748) wpa: action frame sent
I (738756) wpa: scan issued at time=1956992771567
I (738761) wpa: BSS: Add new id 329 BSSID ac:84:c6:32:14:77 SSID 'TEST-MESH' chan 9
I (738773) wpa: BSS: Add new id 330 BSSID ac:84:c6:32:1a:26 SSID 'TEST-MESH' chan 9
I (738808) app: Got WIFI_EVENT, event_id 1
I (738808) wpa: scan done received
I (738810) wpa: action frame sent
I (744816) wpa: scan issued at time=1956998831566
I (744822) wpa: BSS: Add new id 331 BSSID ac:84:c6:32:14:77 SSID 'TEST-MESH' chan 9
I (744828) wpa: BSS: Add new id 332 BSSID ac:84:c6:32:1a:26 SSID 'TEST-MESH' chan 9
I (744856) wpa: BSS: Add new id 333 BSSID d8:47:32:7c:34:ba SSID 'TEST-MESH' chan 9
I (744868) app: Got WIFI_EVENT, event_id 1
I (744869) wpa: scan done received
I (744871) wpa: action frame sent
I (750879) wpa: scan issued at time=1957004894061
I (750894) wpa: BSS: Add new id 334 BSSID ac:84:c6:32:14:77 SSID 'TEST-MESH' chan 9
I (750903) wpa: BSS: Add new id 335 BSSID ac:84:c6:32:1a:26 SSID 'TEST-MESH' chan 9
I (750931) app: Got WIFI_EVENT, event_id 1
I (750931) wpa: scan done received
I (750933) wpa: action frame sent
I (757037) wpa: scan issued at time=1957011052506
I (757043) wpa: BSS: Add new id 336 BSSID ac:84:c6:32:14:77 SSID 'TEST-MESH' chan 9
I (757046) wpa: BSS: Add new id 337 BSSID ac:84:c6:32:1a:26 SSID 'TEST-MESH' chan 9
I (757089) app: Got WIFI_EVENT, event_id 1
I (757089) wpa: scan done received
I (757091) wpa: action frame sent
I (763102) wpa: scan issued at time=1957017117584
I (763107) wpa: BSS: Add new id 338 BSSID ac:84:c6:32:1a:26 SSID 'TEST-MESH' chan 9
I (763115) wpa: BSS: Add new id 339 BSSID ac:84:c6:32:14:77 SSID 'TEST-MESH' chan 9
I (763130) wpa: BSS: Add new id 340 BSSID d8:47:32:7c:34:ba SSID 'TEST-MESH' chan 9
I (763151) wpa: BSS: Add new id 341 BSSID ac:84:c6:32:35:b5 SSID 'TEST-MESH' chan 9
I (763154) app: Got WIFI_EVENT, event_id 1
I (763156) wpa: scan done received
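To put a number on "so frequently", the "scan issued" timestamps from the log above can be averaged. The sketch below assumes the value in parentheses is the default ESP-IDF log timestamp in milliseconds since boot; `mean_interval_ms()` and the `scan_issued_ms` array are hypothetical names introduced only for this calculation.

```c
#include <stddef.h>
#include <stdint.h>

/* "scan issued" log timestamps copied from the log above (assumed ms). */
static const uint32_t scan_issued_ms[] = {
    732690, 738756, 744816, 750879, 757037, 763102
};

/*
 * Mean gap between consecutive timestamps, in the same unit as the input.
 * Returns 0 when fewer than two timestamps are given.
 */
static uint32_t mean_interval_ms(const uint32_t *ts, size_t n)
{
    if (n < 2) {
        return 0;
    }
    /* The sum of consecutive gaps telescopes to last minus first. */
    return (ts[n - 1] - ts[0]) / (uint32_t)(n - 1);
}
```

Under that assumption, the six scans span 30412 ms, so the device starts a new scan roughly every 6 seconds.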
Expected Behavior
I'm not sure if there is a proper way to differentiate a scan caused by 802.11kv from
a scan that the user explicitly requested.
If there is, please document how to tell whether a scan was caused by 802.11kv.
If not, this seems to be a bug.
Actual Behavior
See the log above.
Steps to reproduce
Just set the config below and enable rm_enabled and btm_enabled in the wifi config.
CONFIG_WPA_DEBUG_PRINT=y
CONFIG_WPA_11KV_SUPPORT=y