Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(ESP32) refactor device value rendering #16

Closed
proddy opened this issue Nov 24, 2020 · 44 comments
Closed

(ESP32) refactor device value rendering #16

proddy opened this issue Nov 24, 2020 · 44 comments
Assignees
Labels
enhancement New feature or request

Comments

@proddy
Copy link
Contributor

proddy commented Nov 24, 2020

there's a lot of code duplication in the device libraries so I'm working through to clean up the code, moving functions back to the base class. One of the changes is in the way the device information is printed to the shell which is very similar to how we create the data to send to the web. I propose

  • have one method to form a structured json document with the key/value pairs
  • use this method for both web and console
  • remove the n= v= elements from the json object as this creates a lot of unnecessary memory. Just use a sequenced list of values as pairs, e.g. n1, v2, n2, v2....

I've tested with Boiler with good results so far.

@proddy proddy self-assigned this Nov 24, 2020
proddy referenced this issue in emsesp/EMS-ESP Nov 26, 2020
@proddy
Copy link
Contributor Author

proddy commented Nov 26, 2020

made some minor changes

  • when packaging data to send to the web, instead of key/value objects it uses a string array. This saves about 30% of the data being sent to the web (or console) and uses less memory
  • the same method to create the web json payload is now the same for the console, removing a lot of duplicate code

@MichaelDvP
Copy link
Contributor

I see that the times are now double, as number in minutes and as text "days-hrs-mins".
Imho the number is much better for mqtt/api, the text better for web/console, but in both cases there is no need to have the values double.
To change this: add bool parameter to export_values() -> export_values(JsonObject & json, const bool textformat)
What do you think?

@proddy
Copy link
Contributor Author

proddy commented Nov 27, 2020

haha! our brains think alike. This is what I was planning to do yesterday night but wanted to get the changes in so you could merge your boiler updates. So lets do it.

@MichaelDvP
Copy link
Contributor

Ok, i'll make the change, the boiler enhancement contains 4 times.

@proddy proddy closed this as completed Nov 29, 2020
@proddy
Copy link
Contributor Author

proddy commented Nov 30, 2020

Still thinking of ways to improve this. Now when data is sent to the console or the web it follows these steps:

  • calls export_values_* function to build up the json doc, which we use to send to MQTT
  • create a new large json doc for storage
  • parses this json doc into the new doc
  • send it to web or shell

This is bit wasteful. I'm thinking there should be one function that takes the current device values and builds the json structure, so it's only done once and used everywher. For example the export_value() takes an additional parameter

  • MQTT: as it's now, key/values with key being the shortened abbreviated name
  • Web/Console: the key is now the full text

Advantages

  • there is less heap used since we don't need to create nested JSON docs
  • the values are in their native format, so numbers are numbers (and not strings). They will be rendered in the web UI by their i8ln, e.g. 2,345 instead of "2345" (EN locale)

Disadvantages

  • there is no easy way to add the suffix (min, uA, % ...). The key/value object pair in JSON is efficient for holding a value. Having a nested object (name:n, value:,v, suffix:s) would triple the size of the JSON payload. One option is to have a separate lookup table that maps value name with a suffix and use that when displaying in the Console and building the web response payload.

Then we save some heap memory

@proddy proddy reopened this Nov 30, 2020
@proddy proddy changed the title refactor how device values are printed to shell refactor device value rendering (to Web, Console or MQTT) to base class Dec 4, 2020
@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

@MichaelDvP I'm working on some changes here, moving all the logic that generates the JSON for Web and MQTT back to the EMSDevices base class. It saves a lot of stack memory, removes 200 lines of code and makes it easier to extend. All device variables (e..g currSelTemp) is registered just like we do with the telegram handlers and the base class handles the formatting. Hopefully, we'll also have less nested DynamicJson objects freeing up Heap.

The downside is that it changes almost all the code, so backporting PRs is time-consuming. I'm adding your changes manually each time. Which also means it's probably buggy.

What do you think about this approach?

  • release 2.1.1 as it is now (or soon) and call in 2.2, since it has more than just bug fixes
  • use dev for 2.2.1 and only adding small fixes, no new functionality
  • use a new branch 2.3 where we can both fine-tune the new framework, and add new features. When it's done we can merge it back into 'dev'

@MichaelDvP
Copy link
Contributor

I agree. Make a release and the refactoring based on this. Fixes and changes in a temporary new dev can be ported later.

@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

great, you can decide when 2.1.1 -> 2.2 -> main/stable as you've done most of the changes. Let me know and I'll do the merge. There's a github actions script that does this. I'll also update the documentation.

@MichaelDvP
Copy link
Contributor

I think, freeze 2.1.1 now, take it as base, wait a day for possible feedback to the latest changes and if there is no massive issue release.

@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

@MichaelDvP another opinion: shall we drop MQTT single mode? Just use either Nested or Home Assistant. It really simplifies the code.

@MichaelDvP
Copy link
Contributor

I like the single, mainly because of dallassensors, we discussed that before. Also thermostat with smaller packages for main data and hc (RC35) is nice, but not necessary.

We can drop, but i will always have a dallas format with unique id in my system.

@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

What I meant is that we remove the MQTT format options. There is just one, which is for everyone, that is nested for thermostat/mixer and does what you want Dallas to do. No choices.

Then there is an option like 'Home Assistant?' which when enabled will do the extra bits needed to make HA work.

The only problem may be the different ways Dallas is been shown, then in that case have a 'Dallas MQTT format' option.

Reason I'm asking is that is greatly simplifies the new code I'm writing for all the EMS devices (not dallas)

@MichaelDvP
Copy link
Contributor

Make it nested and i add a checkbox to mqtt settings [x] Dallas sensor publish by numbers or inverted [x] Dallas sensor publish by id that takes effect on the non-HA-format, ok?

@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

got it.

@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

another thing, looking at the mixer, the topics are either mixer_data_hc<n> or mixer_data_wwc<n> depending on the circuit (and the device ID). The payload also has the type (either hc or wwc). Can we change this in v2.3 to just use mixer_data_<n> or is it possible there is both a HC1 and WWC1 connected?

@proddy
Copy link
Contributor Author

proddy commented Dec 4, 2020

oh forget that. It's only for single mode. I remember we now collect all the mixer payloads, combine them and send them as boiler_data. silly me.

@MichaelDvP
Copy link
Contributor

Yes, nested it's {hc1:{},wwc1:{}}, and it's possible to have all in one system (device 0x20-0x27 hc1-8, 0x28, 0x29: wwc1, 2

@MichaelDvP
Copy link
Contributor

@proddy: I think most issues solved now, but need a few additions.
I can push the changes need for emsesp/EMS-ESP#628 (as setting), emsesp/EMS-ESP#635, emsesp/EMS-ESP#637, emsesp/EMS-ESP#650, and emsesp/EMS-ESP#651.
Should i push to dev now and than you release new main with these changes, or add to next dev?
Or do a PR and you can check before? (I've added the changes to my repo)

BTW: as mentioned in 628 and 635 there seems to be somthing wrong in github for the dev-branch: source download points to 2.10.-main, from here. and here

@proddy
Copy link
Contributor Author

proddy commented Dec 8, 2020

Hi @MichaelDvP , I've changed (or touched) almost every single file with my 2.3 build so an auto merge PR will probably fail. It's better if you push all your changes directly to the dev branch and when you're happy I can merge into the main branch as 2.2 stable.

Then I'll manually try and merge into the fb-632-refactorvalues branch.

@proddy
Copy link
Contributor Author

proddy commented Dec 13, 2020

pushed some quite big changes to the fb-632-refactorvalues branch. I had to touch a lot of the code, and made many fixes and improvements along the way. Roughly what's new is

  • each device registers how the values are rendered
  • all the code in the device files to publish, show to console, check for changes has been removed and controlled in the EMSdevices base class. This removes hundreds of duplicated code lines.
  • add three new settings in the Web to configure the Dallas MQTT format, to enable Home Assistant and how to treat current room temperature values in the thermostat. The thermostat_data has two additional elements, hatemp and hamode based on the earlier suggestions
  • the unit of measure is appended to the value, and also sent as a separate data item to the web

Tested with simulations with an ESP32 and it seems to work.

But the boiler code fails when registering on an ESP8266. I need to debug and find out why. I think we're running out of memory heap perhaps.

@proddy
Copy link
Contributor Author

proddy commented Dec 13, 2020

and forgot to mention @MichaelDvP , still need to manually merge in your latest changes from dev

@MichaelDvP
Copy link
Contributor

Tested on 8266, but heap is very low. Need some retrys to load web, crashes on setting changes. This is without telnet!
v2 3_Heap

@proddy
Copy link
Contributor Author

proddy commented Dec 14, 2020

Yes, its a big problem on the ESP8266 and I'm worried, and angry too! I need to measure how much physical memory the new device value table struct is using. I was hoping very little since it only contains pointers to flash memory.

By the way for my testing I used the standalone version (make file and no ESP) and also an isolated ESP8266 without anything attached (just the USB) and simulating a Boiler using http://ems-esp/api?device=system&cmd=test&data=boiler. Then monitor the memory usage.

I'll continue this week with fine tuning

@MichaelDvP
Copy link
Contributor

I'll test with esp32 later. There are a lot of changes and i need some time to understand all. First look, maybe i'm wrong, but the prefix (tag) is stored as string for each value? And boiler tags are long and nearly the same for all values. Store only _ww, _info, "" to save memory?

I can add the b7-changes manually.

@proddy
Copy link
Contributor Author

proddy commented Dec 14, 2020

I think I'll just a bitmap, it only takes a byte and have masks. Then in the logic I determine the behaviour based on the device_id.

@MichaelDvP
Copy link
Contributor

I was wrong, i tested with shortend strings but it has no effect, It's always the same boiler string and the value-register is only a pointer to it.

@proddy
Copy link
Contributor Author

proddy commented Dec 14, 2020

I was wrong, i tested with shortend strings but it has no effect, It's always the same boiler string and the value-register is only a pointer to it.

yes the std::string uses a & (a reference). I could haved used a && with a std::move to do a real copy but it would have had the same affect. I think the right approach is to create constexpr for the tag and do the logic conversion in the rendering code. I'll try this out tonight and also look at printing out sizeof(dv) to see how big these registered device value objects really are. On the ESP8266 it uses 4 bytes to store a pointer and all flash data must be 32-bit aligned (https://arduino-esp8266.readthedocs.io/en/latest/PROGMEM.html#in-summary).

@proddy
Copy link
Contributor Author

proddy commented Dec 14, 2020

okay, so the new device value registry eats up so much memory is mind boggling. My test environment:

  • ESP8266
  • build with flags debug_flags = -DEMSESP_TEST -DEMSESP_FORCE_SERIAL -DEMSESP_DEBUG
  • MQTT is disabled (done via Web UI). This is so I can isolate only the mem issues in the new code.
  • connect ESP8266 to USB, Open Serial
  • Wifi Connects, MQTT connects
  • initial memory after startup, idle is 15488 bytes (82%)
  • add a boiler (using the test command or http://ems-esp/api?device=system&cmd=test&data=boiler`). Tthis will only add a single boiler. Note I commented out the boiler_data_ww and boiler_data_info, which leaves only 38 entries for boiler_data.
  • memory goes down to 6952 (37%) and 75% fragmentation

So registering just 38 entries takes up 6000 bytes of heap memory. Which is strange since the struct only contains pointers. This is why it's failing. Now I need to analyze why.

proddy referenced this issue in emsesp/EMS-ESP Dec 15, 2020
@proddy
Copy link
Contributor Author

proddy commented Dec 15, 2020

doing some memory profiling. Using the steps above

                                (core)                    Free heap: 81% (15128) (~0),    frag:15% (~0)
000+00:00:36.442 I 22: [system] (starting boiler regs)    Free heap: 70% (13048) (~2080), frag:6%  (~9)
000+00:00:36.442 I 23: [system] (after telegram type reg) Free heap: 64% (11856) (~1192), frag:6%  (~0)
000+00:00:36.442 I 24: [system] (after mqtt cmd reg)      Free heap: 57% (10584) (~1272), frag:12% (~6)

loading 38 device values (sizeof struct is 38 bytes, total mem should be 1520 bytes)
000+00:00:36.442 I 25: [system] (after device value reg)  Free heap: 42% (7784)  (~2800), frag:33% (~21)
000+00:00:40.010 I 26: [system] (core)                    Free heap: 48% (9000)  (~1216), frag:43% (~10)

after 11 mins, memory leak!
000+00:11:45.165 I 29: [system] (core)                    Free heap: 37% (6912) (~0), frag:26% (~0)

conclusions is that the std::vector takes a lot of memory. 2800 bytes for 38 objects. I will experiment with a lightweight implementation of a list and also a dynamic array to see how that helps.

@proddy proddy changed the title refactor device value rendering (to Web, Console or MQTT) to base class refactor device value rendering Dec 22, 2020
@proddy
Copy link
Contributor Author

proddy commented Dec 22, 2020

I've been spending quite some time in understanding how the heap works on an ESP8266 with an effort to reduce it so the implementation to register all the device values doesn't explode the ESP8266.

For the core of EMS-ESP I really need only two containers, a queue for the incoming Rx, outgoing Tx and outgoing MQTT messages and a fixed dynamic array for storing 2D data like the MQTT commands, Telegram handlers and these new device values.

for queues I'm using the C++ STL container implementation that comes with Arduino called std::list. It's a lovely thread-safe container with some nice features (like search) but each node occupies heap memory and can create a little fragmentation. Since all our queues are fixed sized it's better to allocate a continuous block of memory upfront. In tests I've tried out different containers like std::queue and std::deque. You can see the results below.

For the array I was using std::vector. Vector is quite light-weight and fast, but as it grows it reserves double the elements. Since our array's are (for the majority) fixed-size it would also makes sense to reserve what we can early on.

Since there is a lot of overhead in these containers I wrote my own lightweight single-threaded containers. Here are the benchmarking results with storing 200 structs (class elements):

container heap memory used memory fragmentation bytes used per element
std::vector 4416 9% 22
std::list 6712 0% 33
std::queue 4032 1% 20
std::deque 4552 1% 22
emsesp::array 3520 0% 17
emsesp::queue 3520 0% 17

In the struct itself, uint8_t's obviously take 1 byte, flash strings 4 bytes and function pointers using std::function with std::bind or lambda functions take 14 bytes, compared to the the old-style C void (*) pointers.

With these new changes I think I can reduce memory by 10-15%. I'll try later and see what it looks like...

proddy referenced this issue in emsesp/EMS-ESP Dec 29, 2020
@proddy
Copy link
Contributor Author

proddy commented Dec 29, 2020

Pushed in my latest efforts. Replaced all the std:: containers with home grown queues and arrays, thinking it would help. Now the heap is allocated up front, but problem is that it takes up too much on the ESP8266 so its explodes and crashes. I did manage to make some minor improvements along the way, like using C++17 and better optimizations that helped reduce the binary RAM from 50.5% to 48.4% and the Flash from 93% down to 86%. Also bugs and others things in the EMS-ESP code which I've now forgotten.

Note, there is still a memory leak somewhere, which you can see when compiling with EMSESP_DEBUG. I suspect this is in EMSdevice::generate_values_json(). So this is unusable at the moment.

Some next things to try are

  • moving the entire 77 EMS device table to PROGMEM. It's now as a std::vector in heap and taking up almost 1 KB
  • using shared_ptr's instead of structs in the array and queue

If I can't get it working then I'm going to give up, and re-write a version 3.0 using ESP-IDF and not stupid Arduino's which takes up more than half the memory.

@proddy
Copy link
Contributor Author

proddy commented Jan 11, 2021

the coding is complete. Been running stable for a few days. I'll rename the branch to v2.3 and start to extend with new features. The 2.2 dev branch should be renamed to as this will be the last minor release, only ESP8266 patches and bug fixes.

@proddy proddy closed this as completed Jan 11, 2021
@MichaelDvP
Copy link
Contributor

Just to let you know and to prevent double work, i started to test the refactored build on bbqkees esp32 prototype.
I found some bugs in mixer and for my RC35 and added the mqtt-base and dallas_fails so far.
I pushed this to my github, so you can compare.

@proddy
Copy link
Contributor Author

proddy commented Jan 17, 2021

Hi @MichaelDvP great that you're testing the ESP32 too. Please go ahead and submit anything directly to the branch. I will rename that branch shortly and call it something more meaningful like v3.0_esp32. Right now I'm adding in the Ethernet code and testing the current usage when using different Tx power settings

@proddy
Copy link
Contributor Author

proddy commented Jan 17, 2021

actually @MichaelDvP best wait until I've moved the branch. I'm about to push a lot of new changes, mainly renaming the wifi functions

@MichaelDvP
Copy link
Contributor

I also wait for feedback to Base setting and have to do some more tests, no hurry.

@proddy proddy changed the title refactor device value rendering (ESP32) refactor device value rendering Jan 21, 2021
@proddy proddy reopened this Jan 21, 2021
@proddy
Copy link
Contributor Author

proddy commented Jan 21, 2021

For the record, I'm experiencing frequent restarts. It could be memory related so monitoring and will spend some time debugging to try and chase the root cause.

@MichaelDvP
Copy link
Contributor

Memory was really low in the first ETH implementation, now it's much better, but i have only tested with wifi-only.
As mentioned before that mixer is buggy, registering a dataset on each ems-telegram. I've fixed this already.

I'm trying to connect the 8266 v2.2.1 buspowered and the esp32 by servicejack. Different device-ids, mqtt-base, etc. but this makes a lot of issues on ems and mqtt. Ems works best if both on the same id (0x0B) and one is set to tx_mode 0.
Also connecting both to the same mqtt-broker with different base and client-id crashes sometimes the connection. Have to test more, i can not specify the exact circumstances.

Another issue, the system uptime in v3 counts after some 10 hours suddenly +50 days. Maybe something in memory addressing.

@proddy
Copy link
Contributor Author

proddy commented Jan 25, 2021

Thanks for the feedback @MichaelDvP . I've been monitoring the memory with Grafana and haven't seen any more drops so I think its ok now. The restarts I was experiencing were most likely caused by Wifi reconnects.

I'll also test running multiple devices (one ESP8266 and one ESP32) to see if I can fix the MQTT issues you're seeing. I would had expected with different hostnames and client-IDs, and HA disabled, it should work.

Haven't see the issue with the +50 days yet:

{"status":"connected","rssi":92,"uptime":"001+11:43:03.642","uptime_sec":128583,"mqttfails":0,"rxsent":83632,"rxfails":14,"txread":19305,"txwrite":3,"txfails":0,"freemem":109884}

If you could merge in the MQTT base and Mixer fixes, I'll rename the branch to v3-esp32 or something.

I've also started looking into alternative libraries for Web and MQTT which don't use AsyncTCP and customized for the ESP32 which will gives us TLS/SSL.

@MichaelDvP
Copy link
Contributor

Ok, mixer, base and dallas-fails added, also a few fixes for roomctrl and master-thermostat.

I'll also push a new version (v2.2.1b1) with base and sensor-fails, also changes time to datetime and split boiler_info to be compatible with v3.
For mqtt i've found half a kB heap not storing full topic and add the base on publish (only v2.2). Working here for some days without issue.

@proddy
Copy link
Contributor Author

proddy commented Jan 25, 2021

thanks for those changes. i didn't mention earlier that using the Ethernet libraries uses 20Kb of heap too, if activated.

@proddy
Copy link
Contributor Author

proddy commented Jan 26, 2021

going to rename the branch

@proddy proddy closed this as completed Jan 26, 2021
@proddy proddy transferred this issue from emsesp/EMS-ESP Mar 14, 2021
@proddy proddy added the enhancement New feature or request label Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants