Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wifi Roaming not enabled yet #2092

Closed
caco3 opened this issue Feb 27, 2023 · 19 comments
Closed

Wifi Roaming not enabled yet #2092

caco3 opened this issue Feb 27, 2023 · 19 comments
Assignees
Labels
bug Something isn't working

Comments

@caco3
Copy link
Collaborator

caco3 commented Feb 27, 2023

Support for Wifi Roaming is implemented, but it seems not enabled yet.

@caco3 caco3 added the bug Something isn't working label Feb 27, 2023
@caco3
Copy link
Collaborator Author

caco3 commented Feb 27, 2023

@Slider0007 Ich war mal so frei und habe dich assigned, da du am ehesten weist, was noch fehlt.

@jomjol FYI

@Slider0007
Copy link
Collaborator

No problem :-)

I've already started with heap analysis, but right now it looks not really promising without having risk of instability. One chance would be to reduce heap which is used by MQTT discovery topics. This would improve the situation --> see #2091 (comment)

@caco3 caco3 mentioned this issue Feb 27, 2023
@caco3
Copy link
Collaborator Author

caco3 commented Feb 27, 2023

As a reply to #2091 (comment):

I agree that we only should send any MQTT topics at a defined point in time.
REST API trigger or re-connecting of broker or wifi just should set a flag, but the sending should only happen at the end of the next round.

Do we understand why roaming requires so much additional memory? Can we adjust this?
Keep in mind that the models also will grow over time, thus will take more memory. I think there was already an issue that a model was too large.

As for HA Discovery, I still do not understand why it takes so much memory. In my understanding, MQTT publishing is a blocking task, thus it does not matter if I want to send 1 or 20 topics, the memory usage should be the same.

We might also consider to go and use the upper 4 Mbytes (Hi-Mem, I think @nliaudat looked into this, but I can't find the Github issue/PR) for roaming, eg. make it a requirement to have 8 MBytes for Wifi Roaming.
Most users do not need roaming (since they have only one Wifi Accesspoint), so for them 4 MB would still suffice.
And the ones that need the roaming (and possibly other large features in the future), need to provide a 8MB device.
I ordered 3 times an ESP32-CAM, and it always came with 8 MB (untested, but they are shown as 8 MB). So I am not surer if that many 4 MB devices are around.

@Slider0007
Copy link
Collaborator

I agree that we only should send any MQTT topics at a defined point in time.
REST API trigger or re-connecting of broker or wifi just should set a flag, but the sending should only happen at the end of the next round.

Would it not be easier to just trigger by button which is interlocked with flow finished?

Do we understand why roaming requires so much additional memory? Can we adjust this?

An additional task is used to handle to roaming features. This alone consumes 6kB, and every beacon which needs to be saved used ca. 1kB. I assume this cannot be adjusted.
Source: espressif/esp-idf#3671 (comment)

As for HA Discovery, I still do not understand why it takes so much memory. In my understanding, MQTT publishing is a blocking task, thus it does not matter if I want to send 1 or 20 topics, the memory usage should be the same.

I tried to explain here: #1990 (comment)

Keep in mind that the models also will grow over time, thus will take more memory. I think there was already an issue that a model was too large.

This doesn't matter right now because models are stored in PSRAM. PSRAM is not a limitation anymore. We have with the largest model still 650kB free PSRAM. So there is no need or benefit to increase PSRAM at the moment, so HIMEM is not needed right now.
The actual limitation is the internal RAM, because all task heap and DMA memory (which is used also for wifi stack, IP stack, as well as to write to SD) is only possible with internal RAM. I checked some possiblities to have task heap in PSRAM, but this approach have some limitations and makes higher complexity escpecially with revision 1 boards (bug in PSRAM handling). Also it could a performance topic. Maybe some more investigation could help here...

In sum, it's difficult to find internal RAM and I checked different sources and biggest "variable consumer" (of DMA memory) is still HA discovery including respective DEBUG messages and here I personally see a quite easy way to get a lot memory back without having a big drawback.

Other option would be not implementing wifi roaming. I agree that not everybody (me neither) needs such functionality because usually device is mounted statically.
As alternative I figured out an option to SCAN all wifi channels, sort by RSSI before choosing a best AP.

@jomjol
Copy link
Owner

jomjol commented Feb 27, 2023

As alternative I figured out an option to SCAN all wifi channels, sort by RSSI before choosing a best AP.

I think, this would be a good compromise. Because I have this problem, that the device connects not with the best, but the first found AP constantly and this is sometimes very instable.

@caco3
Copy link
Collaborator Author

caco3 commented Feb 27, 2023

Would it not be easier to just trigger by button which is interlocked with flow finished?

So you would only enable the button when the flow is finshed?
How would you then handle asynchronous REST API calls (which are also called by the button)?

An additional task is used to handle to roaming features. This alone consumes 6kB, and every beacon which needs to be saved used ca. 1kB

Wow, that's heavy!

I tried to explain here: #1990 (comment)

Yes, I have studied that part.
Mei point is that UART logging and also SD-card writing are all blocking calls, this one topic after the other gets handled. And therefore it should not matter (memory wise) how many topics I want to send.
Also it should not mater which kind of topics they are HA Discover, Static, System Topics. They all use the same amount of memory per topic.

revision 1 boards (bug in PSRAM handling).

Maybe we could drop support for them? We could check the revision in the next version and tell the people to report to us if they use rev. 1. And if we do not get too many notes, there is no reason to keep it. Especially as one just could do a drop-in replacement with a newer board. Personally I do not know what I have, it seems not yet be reported in the UART log.

Other option would be not implementing wifi roaming.

If there is an alternative, that might be a go-able way.

I also would like to have roaming.
I have a near-by Fritzbox, wich gets ignored in faviour of a farer away Ubiquity Accesspoint.
Note: I once disabled all of my Ubiquity AP, but it still did not connect to my Fritzbox, so maybe there is a different issue (They use the same SSID and passwords).

@Slider0007
Copy link
Collaborator

Slider0007 commented Feb 27, 2023

Would it not be easier to just trigger by button which is interlocked with flow finished?

So you would only enable the button when the flow is finshed? How would you then handle asynchronous REST API calls (which are also called by the button)?

Setting the flag like you proposed would also be possible. That was just an quick idea, maybe there are other possiblities. Calling discovery topics with REST could just be return a message that this is only available when flow is finished. As long it's sending in a situation which is not critical.
Calling other static, system infos asynchronously are not that critical because those messages are less heavy than discovery topics.

I tried to explain here: #1990 (comment)

Yes, I have studied that part. Mei point is that UART logging and also SD-card writing are all blocking calls, this one topic after the other gets handled. And therefore it should not matter (memory wise) how many topics I want to send. Also it should not mater which kind of topics they are HA Discover, Static, System Topics. They all use the same amount of memory per topic.

Theory is one part which can interpret different way, but you can check it easily just by yourself. With your theory no difference in heap consumption should occur, but numbers show a different story when HA discovery is activated with DEBUG level (keep rest of config the same) --> http://IP/heap

revision 1 boards (bug in PSRAM handling).

Maybe we could drop support for them? We could check the revision in the next version and tell the people to report to us if they use rev. 1. And if we do not get too many notes, there is no reason to keep it. Especially as one just could do a drop-in replacement with a newer board. Personally I do not know what I have, it seems not yet be reported in the UART log.

All my boards a rev.1 boards. I do not like to replace all my boards if not really necessary. I assume there are still a lot around. With newest rolling, board revision number will be printed in logfile at boot.

Other option would be not implementing wifi roaming.

If there is an alternative, that might be a go-able way.

I also would like to have roaming. I have a near-by Fritzbox, wich gets ignored in faviour of a farer away Ubiquity Accesspoint. Note: I once disabled all of my Ubiquity AP, but it still did not connect to my Fritzbox, so maybe there is a different issue (They use the same SSID and passwords).

As alternative I figured out an option to SCAN all wifi channels, sort by RSSI before choosing a best AP.

I think, this would be a good compromise. Because I have this problem, that the device connects not with the best, but the first found AP constantly and this is sometimes very instable.

@caco3, @jomjol: Maybe the simple WIFI full SCAN config could alredy help here. I haven't fully tested this option. If you want to test, I can provide the test firmware.

@caco3
Copy link
Collaborator Author

caco3 commented Feb 27, 2023

Setting the flag would also be possible. That was just an quick idea, maybe there are other possiblities. With REST it could just be return a message that this is only available when flow is finished. As long it's sending in a situation which is not critical.

I do not like that, do we expect a user has to try for several minutes until it is allowed? If the interval is tight, the next flow starts quickly, in the worst case only providing a window of some seconds!

Also, if one would want to trigger it automatically, that would be annoying to handle.

Theory is one part which can interpret different way, but you can check it easily just by yourself. With your theory no difference in heap consumption should occur, but numbers show a different story when HA discovery is activated (keep rest of config the same) --> http://ip/heap

I will see that I can play a bit with it myself.

All my boards a rev.1 boards.

ah, ok. Like said I thought they might be all sold out by now since they are on rev 4 or so.

With newest rolling, board revision number will be printed in logfile at boot.

👍

If you want to test, I can provide the test firmware.

Yes, please do

@Slider0007
Copy link
Collaborator

Slider0007 commented Feb 27, 2023

Also, if one would want to trigger it automatically, that would be annoying to handle.

I understand your point. This is for sure a compromise.
But just for my understanding: I thought, that's just for initial setup and after that it's not needed anymore. Why is there a need to automate this single event of intial setup?

I imagine it this way (with my not existing HA knowledge): Boot the device, have HA ready, wait until device is in state "flow finished". Push the button and that's it.
Excuse me if I'm totally wrong here :-)

All results and system status will be provided round by round anyway.

If you want to test, I can provide the test firmware.

Yes, please do

No problem, maybe tomorrow :-)

@nliaudat
Copy link
Contributor

nliaudat commented Feb 28, 2023

Board revision can be retrieved with esp_sys.cpp : GetChipRevision()

The rev1 RAM bug workaround is enabled by default in sdkconfig.defaults by setting :
CONFIG_ESP32_REV_MIN_0=y

Using HIMEM was discussed in #1842. It need to manually allocate RAM with esp_himem_alloc or heap_caps_malloc()

I've tried with

CTfLiteClass *tflite = (CTfLiteClass *) heap_caps_malloc(sizeof(CTfLiteClass), MALLOC_CAP_8BIT | MALLOC_CAP_SPIRAM); //Allocating memory in IRAM or PSRAM, whichever has more free space
...
free(tflite); 

@caco3
Copy link
Collaborator Author

caco3 commented Feb 28, 2023

But just for my understanding: I thought, that's just for initial setup and after that it's not needed anymore. Why is there a need to automate this single event of intial setup?

Yes, the discovery only is needed once. My scenario was more generic, related to other possible REST API calls which could trigger MQTT publishing or so.

@caco3
Copy link
Collaborator Author

caco3 commented Feb 28, 2023

Board revision can be retrieved with esp_sys.cpp : GetChipRevision()

Great! @Slider0007 Can you add this to your PR or should we add it separately? i think that might get useful in the future.

@Slider0007
Copy link
Collaborator

Slider0007 commented Feb 28, 2023

@caco3 : As described a few posts above, board revision gets already printed to logfile at every boot with latest rolling. So it's already included.

@nliaudat
Copy link
Contributor

board revision gets already printed to logfile at every boot

Is it really necessary to log this information at each boot? I would rather see that kind of information in the web interface under system > info
Besides, the esp_sys.cpp : get_device_info() function is designed for that case.

@Slider0007
Copy link
Collaborator

For diagnostic purpose I think it could be helpful to have it in the log, because whenever user share there issues they don't share Info menu but the logfile, so all info is in one place. Nevertheless, this info can for sure added to Info page as well. But this wasn't the main goal of the PR where I modified boot part.

@Slider0007
Copy link
Collaborator

Besides, the esp_sys.cpp : get_device_info() function is designed for that case.

I was thinking about his, using that function but activating the flag all in the file included functions get activated. Unfortunately, most of them are duplicates of functions which are already available in other files. Therefore a cleanup should be done before this is part of software is usable.

@Slider0007
Copy link
Collaborator

@caco3, @jomjol: I created two firmware versions for testing --> further infos #2107

@Slider0007
Copy link
Collaborator

I created a third firmware version for testing (switch APs with wifi scanning only) --> further infos #2107

@Slider0007
Copy link
Collaborator

Merged into newest rolling. This will be part of next official release 15.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants