Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory & Heap issues with 4.3 (FreeRTOS functions, const char * in DRAM .data and TLSF control_t) (IDFGH-6143) #7822

Closed
philippe44 opened this issue Nov 3, 2021 · 11 comments
Assignees
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally

Comments

@philippe44
Copy link

philippe44 commented Nov 3, 2021

While investigating and issue with available heap decrease between 4.3 and 4.0 while porting my application, I've opened #7813 and #7818 that I've closed now that I understand the problem and I think it's better to make a summary here. I observed a loss of 10-15kB of the precious internal RAM for any application and I've seen 3 causes:

1- On 4.3, the FREERTOS_PLACE_FUNCTIONS_INTO_FLASH seems to be disabled by default where it is not an option in 4.0. I missed that, my bad, but that was a reduction of ~7kB of available on-chip RAM for heap. That is simple to "solve"

2- The TLSF has a ton of asserts() with a string in heap_tlsf.c. That's nice but unfortunately, as these functions are placed in IRAM for efficiency, it seems that (logically?) the linker also places the strings (which should normaly be in .rodata in Flash) in DRAM .data, so that's another 1.5kB loss. The tlsf_assert can be changed to nothing, but that does not seem wise, maybe there is a way to still move these string to .rodata in Flash.

3- The TLSF itself requires a gigantic header/control block of 1.6kB for each heap zone, and as esp32 seems to have 5, then we end up "wasting" ~9kB. This one seems really awful to me (no offense meant) because it's a lot of scarce memory (for a better algorithm, I understand). Maybe moving SL_INDEX_COUNT_LOG2 to 4 would be a better size/efficiency compromise or it could become a config parameter. Or it could be made a runtime parameter that depends on the memory pool size, that would allow to use minimize overhead as each internal block is small and a larger block when SPIRAM is used but there we don't really care. This is obviously more complicated as the sizeof(control_t) is now irrelevant and must be calculated per heap zone.

[edit]: Also, when using SPIRAM, currently the control_t is above 2kB which becomes ridiculous knowing that some hezap zone of internal RAM are as small as 6kB or 15kB. I feel that a variable control_t is necessary and I'd be happy to work on a patch if you agree with the idea.

@espressif-bot espressif-bot added the Status: Opened Issue is new label Nov 3, 2021
@github-actions github-actions bot changed the title Memory & Heap issues with 4.3 (FreeRTOS functions, const char * in DRAM .data and TLSF control_t) Memory & Heap issues with 4.3 (FreeRTOS functions, const char * in DRAM .data and TLSF control_t) (IDFGH-6143) Nov 3, 2021
philippe44 added a commit to philippe44/esp-idf that referenced this issue Nov 4, 2021
@sle118
Copy link

sle118 commented Nov 4, 2021

Looking forward to seeing that commit reviewed and merged!

@0xjakob
Copy link
Contributor

0xjakob commented Nov 16, 2021

@philippe44 regarding 2-:

The idea of asserts is to use them only during development. Once the application is stable, you can disable them. Since they only abort the application and reset the device, they're also not very meaningful for a user. That being said, if you put their string into flash and the assert triggers when flash operations are disabled, then the assert will trigger an exception which is also not very pleasant.

For the rest of the points my knowledge is too limited, I can't make any judgement but I'll pass this to colleagues who are more familiar.

@philippe44
Copy link
Author

@0xjakob: absolutely, I forgot about that but I was surprised because it was a change in 4.3 vs 4.0. I can disable them in general.

@X-Ryl669
Copy link
Contributor

then the assert will trigger an exception which is also not very pleasant.

But very useful. It'll pinpoint where the assert failed, so even without a debugger, with gdbstub, you'll see where the assert failed. If it doesn't trigger an exception, it's very likely the assert will go unnoticed (what's a line in a hundred of lines of logs?) and your code will crash later on because the allocated memory is missing or corrupted.

@philippe44
Copy link
Author

I've moved to make them simply silent. The issue is that our project is becoming quite big and every piece of ram is needed and now every piece of flash is needed as well...

@0xjakob
Copy link
Contributor

0xjakob commented Jan 12, 2022

then the assert will trigger an exception which is also not very pleasant.

But very useful. It'll pinpoint where the assert failed, so even without a debugger, with gdbstub, you'll see where the assert failed. If it doesn't trigger an exception, it's very likely the assert will go unnoticed (what's a line in a hundred of lines of logs?) and your code will crash later on because the allocated memory is missing or corrupted.

I'm afraid not. If the assert does not trigger an exception, then yes, the default is to print out the register stack trace which makes it possible to see where the assert fails. That's what the assert is for.

If, however, there's an exception while the assert function is running and printing the stack trace, then you lose all this information. You may see a stack trace but to the assert code itself, not the code the assert is back tracing at that moment. All you'll know is that an assert triggered, but not where.

@X-Ryl669
Copy link
Contributor

X-Ryl669 commented Jan 12, 2022

You should see this patch and the file heap_tlsf.c so it pinpoints the assert that were changed.
Basically, it goes from:

tlsf_assert(someCondition && "Some text") 

to

tlsf_assert(someCondition); // Some text

There is no change whatsoever in the way the assert is handled. If the assert stop the execution (as it should), you'll get an address somewhere in the stack to this line. You'll use addr2line or gdb or whatever too, you'll get to the code line above and you'll see the message Some text.
If it does not stop the execution, you'll get "Assert failed at __ FILE __: __ LINE __" (which, IMHO is worst, since it implies saving the complete file path in the final binary), but again, you'll jump to the file & line and also see the text.
If the assert trigger an exception where you've lost the source of the assert, then I urgently invite you to change your assert code to something useful. Event a simple #define Assert(X) if (!X) { *(char *)0 = 42; } would be better in this case.

In all case, the patched version is better since it does not require storing useless assert text in the final binary, and you can still have the assert enabled in your build.

@AxelLin
Copy link
Contributor

AxelLin commented Jan 15, 2022

@philippe44 regarding 2-:

The idea of asserts is to use them only during development. Once the application is stable, you can disable them. Since they only abort the application and reset the device, they're also not very meaningful for a user.

No, this is not ture.
It's common to have assertion enabled for relase code.
It can avoid undefined behavior if it indeed hit assertion in some cases.
e.g. reset device is better than have a dead device runing in a endless loop.

@philippe44
Copy link
Author

Well, what I hope really is that my PR will be accepted...

@AxelLin
Copy link
Contributor

AxelLin commented Jan 16, 2022

2- The TLSF has a ton of asserts() with a string in heap_tlsf.c. That's nice but unfortunately, as these functions are placed in IRAM for efficiency, it seems that (logically?) the linker also places the strings (which should normaly be in .rodata in Flash) in DRAM .data, so that's another 1.5kB loss. The tlsf_assert can be changed to nothing, but that does not seem wise, maybe there is a way to still move these string to .rodata in Flash.

Does CONFIG_COMPILER_OPTIMIZATION_ASSERTIONS_SILENT=y help?

@espressif-bot espressif-bot added Resolution: Won't Do This will not be worked on Status: Done Issue is done internally and removed Status: Opened Issue is new labels Aug 9, 2022
espressif-bot pushed a commit that referenced this issue Oct 27, 2022
- remove tlsf_platform.h from esp-idf since the fl_index is now calculated
  based on the size of the requested heap
- update CMakeLists.txt accordingly

* based on the changes made to the TLSF in #7829
* contributes to fix #7822
@AxelLin
Copy link
Contributor

AxelLin commented Dec 23, 2022

v5.0: c173845
v4.4: 345a12c

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

7 participants