Static shared memory in PSRAM for model/imageTMP and tensor_arena #2215

caco3 · 2023-03-20T21:40:53Z

Proof of Concept

‼️ Do not merge!

Use the same memory block for all `tensor_arena`

This is working ok

Use same memory block for both models and `imageTMP`.

If the allocated memory is too small, it gets freed and a larger block gets allocated. This happens within the first round; after that, no further change is needed.
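The grow-on-demand behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `SharedBlock` class and its method names are hypothetical, and `std::malloc`/`std::free` stand in for the PSRAM allocation calls (on the ESP32 these would presumably be `heap_caps_malloc(size, MALLOC_CAP_SPIRAM)` and `heap_caps_free(ptr)`) so that the sketch also runs on a host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper, NOT the PR's actual implementation.
// std::malloc/std::free are host stand-ins for the PSRAM allocator.
class SharedBlock {
public:
    SharedBlock() = default;
    SharedBlock(const SharedBlock&) = delete;
    SharedBlock& operator=(const SharedBlock&) = delete;
    ~SharedBlock() { std::free(data_); }

    // Returns a buffer of at least `needed` bytes, or nullptr on failure.
    // Old contents are NOT preserved when the block has to grow.
    void* reserve(std::size_t needed) {
        assert(needed > 0);
        if (needed <= size_) {
            return data_;                // large enough: reuse as-is
        }
        std::free(data_);                // too small: free first ...
        data_ = std::malloc(needed);     // ... then allocate the larger block
        size_ = (data_ != nullptr) ? needed : 0;
        ++allocations_;
        return data_;
    }

    std::size_t size() const { return size_; }
    int allocations() const { return allocations_; }

private:
    void* data_ = nullptr;
    std::size_t size_ = 0;
    int allocations_ = 0;
};
```

With the sizes from the log further down (315504 bytes for the first model, 53328 for the second, 921600 for `imageTMP`), such a block would be allocated once for the first model, reused for the second, and grown once for `imageTMP`; every later request then fits without another allocation.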

This seems to work fine for the first round, but in the 2nd round I always run into:

I (24722) C IMG BASIS: Image (zwImage) loaded from memory: 0, 0, 0
E (24722) C IMG BASIS: Image with size 0 loaded --> reboot to be done! Check that your camera module is workin

The fb, however, contains valid data!


caco3 · 2023-03-21T23:17:58Z

@Slider0007 @jomjol Maybe you have an idea why we run into the Image with size 0 loaded issue?

caco3 · 2023-03-21T23:27:37Z

Example log where the first loaded model is smaller than the 2nd one and again smaller than imageTMP:

I (11922) PSRAM: Allocated 921600 bytes in PSRAM for 'C IMG BASIS->CImageBasis (rawImage)'
I (13482) PSRAM: Allocated 128004 bytes in PSRAM for 'ALIGN->AlgROI'

I (13592) PSRAM: Allocated 819200 bytes in PSRAM for 'TFLITE->tensor_arena'

I (13642) TFLITE: First model to be loaded: /sdcard/config/dig-cont_0610_s3_q.tflite, Size: 315504
I (13662) PSRAM: Allocated 315504 bytes in PSRAM for 'shared model/imageTMP memory'

I (14532) PSRAM: Allocated 1920 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI dig1)'
I (14552) PSRAM: Allocated 13746 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI dig1 original)'
I (14582) PSRAM: Allocated 1920 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI dig2)'
I (14602) PSRAM: Allocated 13746 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI dig2 original)'
I (14622) PSRAM: Allocated 1920 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI dig3)'
I (14652) PSRAM: Allocated 13746 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI dig3 original)'

I (14772) TFLITE: 2nd model to be loaded: /sdcard/config/ana-cont_1105_s2_q.tflite, Size: 53328
I (14792) TFLITE: Currently allocated shared model/imageTMP memory: 315504
I (14822) TFLITE: Shared model/imageTMP memory is large enough for 2nd model

I (15002) PSRAM: Allocated 3072 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana1)'
I (15032) PSRAM: Allocated 37632 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana1 original)'
I (15052) PSRAM: Allocated 3072 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana2)'
I (15072) PSRAM: Allocated 37632 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana2 original)'
I (15102) PSRAM: Allocated 3072 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana3)'
I (15122) PSRAM: Allocated 37632 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana3 original)'
I (15152) PSRAM: Allocated 3072 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana4)'
I (15172) PSRAM: Allocated 37632 bytes in PSRAM for 'C IMG BASIS->CImageBasis (ROI ana4 original)'

I (15342) TFLITE SERVER: Round #1 started

I (25862) ALIGN: Currently allocated shared model/imageTMP memory: 315504
I (25892) ALIGN: Extending shared model/imageTMP memory so we can fit the imageTMP...
I (25912) PSRAM: Freeing memory in PSRAM used for 'shared model/imageTMP memory'...
I (25942) PSRAM: Allocated 921600 bytes in PSRAM for 'shared model/imageTMP memory'

I (52842) C IMG BASIS: Not freeing shared model/imageTMP memory (ImageTMP)
I (53012) TFLITE: 2nd model to be loaded: /sdcard/config/dig-cont_0610_s3_q.tflite, Size: 315504
I (53042) TFLITE: Currently allocated shared model/imageTMP memory: 921600
I (53062) TFLITE: Shared model/imageTMP memory is large enough for 2nd model

I (62832) TFLITE: 2nd model to be loaded: /sdcard/config/ana-cont_1105_s2_q.tflite, Size: 53328
I (62862) TFLITE: Currently allocated shared model/imageTMP memory: 921600
I (62882) TFLITE: Shared model/imageTMP memory is large enough for 2nd model

I (66232) TFLITE SERVER: Round #1 completed (51 seconds)

I (135362) TFLITE SERVER: Round #2 started

E (142502) C IMG BASIS: Image with size 0 loaded --> reboot to be done! Check that your camera module is working and

Slider0007 · 2023-03-22T19:12:39Z

@caco3, @jomjol
Please find attached some screenshots which try to describe the actual PSRAM usage (only larger allocations are visualized; not really pretty, but maybe helpful for understanding it better).

@caco3: This is my assumption why the first round works with your approach and why it fails in the second round:

ROUND 1:

ROUND 2:

WORST CASE (max. models sizes)

Slider0007 approach: Keep models loaded, share tensor/ImageTMP

I also did some research on this topic. The approach works up to a combined model size (model 1 + 2) of 1000 kB, then it also hangs at the same point as the approach from @caco3. This means that 1200 kB of models no longer work with this approach. I have described further findings in more detail in private chat. The test branch is located here, if you'd like to have a look at it:

Branch: https://github.com/Slider0007/AI-on-the-edge-device/tree/keep-model-loaded
Precompiled firmware: https://github.com/Slider0007/AI-on-the-edge-device/actions/runs/4494015542

### Output of PSRAM memory blocks after a few rounds (flow finished):
Showing data for heap: 0x3f802c70
Block 0x3f8034dc data, size: 40 bytes, Free: No
Block 0x3f803508 data, size: 20 bytes, Free: No
Block 0x3f803520 data, size: 61440 bytes, Free: No
Block 0x3f812524 data, size: 55880 bytes, Free: No
Block 0x3f81ff70 data, size: 80 bytes, Free: No
Block 0x3f81ffc4 data, size: 16 bytes, Free: No
Block 0x3f81ffd8 data, size: 16 bytes, Free: No
Block 0x3f81ffec data, size: 16 bytes, Free: No
Block 0x3f820000 data, size: 16 bytes, Free: No
Block 0x3f820014 data, size: 16 bytes, Free: No
Block 0x3f820028 data, size: 16 bytes, Free: No
Block 0x3f82003c data, size: 16 bytes, Free: No
Block 0x3f820050 data, size: 16 bytes, Free: No
Block 0x3f820064 data, size: 12 bytes, Free: No
Block 0x3f820074 data, size: 24 bytes, Free: No
Block 0x3f820090 data, size: 2080 bytes, Free: No
Block 0x3f8208b4 data, size: 136 bytes, Free: No
Block 0x3f820940 data, size: 136 bytes, Free: No
Block 0x3f8209cc data, size: 1696 bytes, Free: No
Block 0x3f821070 data, size: 1696 bytes, Free: No
Block 0x3f821714 data, size: 1696 bytes, Free: No
Block 0x3f821db8 data, size: 1696 bytes, Free: No
Block 0x3f82245c data, size: 1696 bytes, Free: No
Block 0x3f822b00 data, size: 1696 bytes, Free: No
Block 0x3f8231a4 data, size: 1696 bytes, Free: No
Block 0x3f823848 data, size: 1696 bytes, Free: No
Block 0x3f823eec data, size: 1696 bytes, Free: No
Block 0x3f824590 data, size: 1696 bytes, Free: No
Block 0x3f824c34 data, size: 1696 bytes, Free: No
Block 0x3f8252d8 data, size: 1696 bytes, Free: No
Block 0x3f82597c data, size: 1696 bytes, Free: No
Block 0x3f826020 data, size: 1696 bytes, Free: No
Block 0x3f8266c4 data, size: 1696 bytes, Free: No
Block 0x3f826d68 data, size: 1696 bytes, Free: No
Block 0x3f82740c data, size: 552 bytes, Free: No
Block 0x3f827638 data, size: 768 bytes, Free: No
Block 0x3f82793c data, size: 56 bytes, Free: No
Block 0x3f827978 data, size: 12 bytes, Free: No
Block 0x3f827988 data, size: 12 bytes, Free: No
Block 0x3f827998 data, size: 56 bytes, Free: No
Block 0x3f8279d4 data, size: 12 bytes, Free: No
Block 0x3f8279e4 data, size: 80 bytes, Free: No
Block 0x3f827a38 data, size: 12 bytes, Free: No
Block 0x3f827a48 data, size: 12 bytes, Free: No
Block 0x3f827a58 data, size: 16 bytes, Free: No
Block 0x3f827a6c data, size: 76 bytes, Free: No
Block 0x3f827abc data, size: 56 bytes, Free: No
Block 0x3f827af8 data, size: 80 bytes, Free: No
Block 0x3f827b4c data, size: 184 bytes, Free: No
Block 0x3f827c08 data, size: 184 bytes, Free: No
Block 0x3f827cc4 data, size: 80 bytes, Free: No
Block 0x3f827d18 data, size: 40 bytes, Free: No
Block 0x3f827d44 data, size: 64 bytes, Free: No
Block 0x3f827d88 data, size: 104 bytes, Free: No
Block 0x3f827df4 data, size: 16 bytes, Free: No
Block 0x3f827e08 data, size: 16 bytes, Free: No
Block 0x3f827e1c data, size: 12 bytes, Free: Yes
Block 0x3f827e2c data, size: 44 bytes, Free: No
Block 0x3f827e5c data, size: 108 bytes, Free: Yes
Block 0x3f827ecc data, size: 56 bytes, Free: No
Block 0x3f827f08 data, size: 208 bytes, Free: No
Block 0x3f827fdc data, size: 208 bytes, Free: No
Block 0x3f8280b0 data, size: 16 bytes, Free: No
Block 0x3f8280c4 data, size: 120 bytes, Free: No
Block 0x3f828140 data, size: 64 bytes, Free: No
Block 0x3f828184 data, size: 104 bytes, Free: Yes
Block 0x3f8281f0 data, size: 136 bytes, Free: No
Block 0x3f82827c data, size: 942080 bytes, Free: No
Block 0x3f90e280 data, size: 128004 bytes, Free: No
Block 0x3f92d688 data, size: 226480 bytes, Free: No
Block 0x3f964b3c data, size: 1920 bytes, Free: No
Block 0x3f9652c0 data, size: 4860 bytes, Free: No
Block 0x3f9665c0 data, size: 1920 bytes, Free: No
Block 0x3f966d44 data, size: 4860 bytes, Free: No
Block 0x3f968044 data, size: 1920 bytes, Free: No
Block 0x3f9687c8 data, size: 4860 bytes, Free: No
Block 0x3f969ac8 data, size: 183756 bytes, Free: No
Block 0x3f996898 data, size: 3072 bytes, Free: No
Block 0x3f99749c data, size: 17788 bytes, Free: No
Block 0x3f99ba1c data, size: 3072 bytes, Free: No
Block 0x3f99c620 data, size: 17788 bytes, Free: No
Block 0x3f9a0ba0 data, size: 3072 bytes, Free: No
Block 0x3f9a17a4 data, size: 17788 bytes, Free: No
Block 0x3f9a5d24 data, size: 3072 bytes, Free: No
Block 0x3f9a6928 data, size: 17788 bytes, Free: No
Block 0x3f9aaea8 data, size: 3072 bytes, Free: No
----> same start block like below
Block 0x3f9abaac data, size: 17788 bytes, Free: No
Block 0x3f9b002c data, size: 856 bytes, Free: Yes
Block 0x3f9b0388 data, size: 56 bytes, Free: No
Block 0x3f9b03c4 data, size: 12 bytes, Free: No
Block 0x3f9b03d4 data, size: 512 bytes, Free: No

---> Helper structure memory
Block 0x3f9b05d8 data, size: 632400 bytes, Free: Yes --> the only remaining larger fragmentation that is not solved; potentially the internal CImage helper structure which is necessary to convert JPG to matrix image
<--- Helper structure memory

Block 0x3fa4ac2c data, size: 921604 bytes, Free: No
Block 0x3fb2bc34 data, size: 869316 bytes, Free: Yes

During the Take image state in the following round (CImage helper structure block marked):
----> same start block like above
Block 0x3f9abaac data, size: 17788 bytes, Free: No
Block 0x3f9b002c data, size: 208 bytes, Free: No
Block 0x3f9b0100 data, size: 208 bytes, Free: No
Block 0x3f9b01d4 data, size: 104 bytes, Free: Yes
Block 0x3f9b0240 data, size: 56 bytes, Free: No
Block 0x3f9b027c data, size: 264 bytes, Free: Yes
Block 0x3f9b0388 data, size: 56 bytes, Free: No
Block 0x3f9b03c4 data, size: 12 bytes, Free: No

---> Helper structure memory
Block 0x3f9b03d4 data, size: 18456 bytes, Free: No
Block 0x3f9b4bf0 data, size: 307216 bytes, Free: No
Block 0x3f9ffc04 data, size: 153616 bytes, Free: No
Block 0x3fa25418 data, size: 153616 bytes, Free: No
<--- Helper structure memory

Block 0x3fa4ac2c data, size: 921604 bytes, Free: No
Block 0x3fb2bc34 data, size: 869316 bytes, Free: Yes

If nothing comes in between, the same free blocks are allocated again, but whenever something comes in between and takes only a few bytes, PSRAM gets even more fragmented. That's my major concern!
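To make the concern concrete, here is a toy model of the effect. It is only a sketch under loud assumptions: `ToyHeap` is a hypothetical first-fit allocator without block headers or coalescing, not the ESP-IDF heap, and the helper structure is simplified to a single request. The hole sizes (632400 and 869316 bytes) are the two large free blocks from the dump above; the 56-byte request stands in for "something coming in between".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy first-fit heap made of independent free holes.
// NOT the ESP-IDF allocator; it only illustrates fragmentation.
class ToyHeap {
public:
    explicit ToyHeap(const std::vector<std::size_t>& holes) : holes_(holes) {}

    // Carve `size` bytes out of the first hole that is large enough.
    // Returns false if no single hole can satisfy the request.
    bool allocate(std::size_t size) {
        assert(size > 0);
        for (std::size_t& hole : holes_) {
            if (hole >= size) {
                hole -= size;
                return true;
            }
        }
        return false;
    }

    // Sum of all free bytes, regardless of contiguity.
    std::size_t totalFree() const {
        std::size_t sum = 0;
        for (std::size_t hole : holes_) sum += hole;
        return sum;
    }

    // Size of the largest contiguous free hole.
    std::size_t largestFree() const {
        std::size_t best = 0;
        for (std::size_t hole : holes_) {
            if (hole > best) best = hole;
        }
        return best;
    }

private:
    std::vector<std::size_t> holes_;
};
```

In this model, `ToyHeap heap({632400, 869316})` has 1501716 bytes free in total, yet `heap.allocate(921600)` fails because neither hole is large enough on its own. If a 56-byte allocation then lands in the first hole, a following 632400-byte request no longer fits there and is carved out of the second hole instead, so the largest contiguous free block shrinks even though plenty of memory is still free overall.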

Currently implemented version without any preallocation (v15.0.3)

It seems that the current implementation is the best-balanced version, but it is only viable if no fragmentation occurs, which is in my opinion the main issue.

If we get rid of the fragmentation, we have a really good base for further improvements. Unfortunately, this would exclude the possibility of using the wifi stack and bss in PSRAM and would reduce the internal RAM again, which is really bad to see. Up to now I have no clue whether we can get around this obstacle.

See #2200 for details

caco3 · 2023-03-22T22:50:04Z

Thanks @Slider0007 for the helpful visualization!

As a side note, it would not be difficult to tell the stb library to use the internal RAM! They actually provide #defines so malloc can be customized: https://github.com/jomjol/AI-on-the-edge-device/blob/rolling/code/components/jomjol_image_proc/stb_image.h#L627

Using MALLOC_CAP_INTERNAL we should be able to force it to use the internal RAM. The question is whether we have enough space there. Alternatively, we could modify it to use a static memory block like we did for the other parts.
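A minimal sketch of what such hooks might look like is shown below. `STBI_MALLOC`, `STBI_REALLOC` and `STBI_FREE` are the customization macros that stb_image.h provides (all three have to be defined together), and `heap_caps_malloc`, `heap_caps_realloc` and `heap_caps_free` come from ESP-IDF's `esp_heap_caps.h`. The `stbHook*` function names are hypothetical, and the host fallback branch exists only so the sketch can be compiled outside ESP-IDF. Whether the internal RAM is actually large enough is not answered by this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
// On the ESP32: request byte-addressable internal RAM explicitly.
inline void* stbHookMalloc(std::size_t size) {
    return heap_caps_malloc(size, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}
inline void* stbHookRealloc(void* ptr, std::size_t size) {
    return heap_caps_realloc(ptr, size, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}
inline void stbHookFree(void* ptr) { heap_caps_free(ptr); }
#else
// Host fallback (assumption for testing only): plain C allocator.
inline void* stbHookMalloc(std::size_t size) { return std::malloc(size); }
inline void* stbHookRealloc(void* ptr, std::size_t size) { return std::realloc(ptr, size); }
inline void stbHookFree(void* ptr) { std::free(ptr); }
#endif

// These macros must be defined before stb_image.h is included with
// STB_IMAGE_IMPLEMENTATION, e.g.:
//   #define STB_IMAGE_IMPLEMENTATION
//   #include "stb_image.h"
#define STBI_MALLOC(sz)        stbHookMalloc(sz)
#define STBI_REALLOC(p, newsz) stbHookRealloc(p, newsz)
#define STBI_FREE(p)           stbHookFree(p)
```

Swapping `MALLOC_CAP_INTERNAL` for a different capability, or pointing the hooks at a static block, would only require changing the three hook functions.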


caco3 · 2023-03-29T08:48:25Z

Some links which might help for further analysis:


caco3 · 2023-04-01T15:51:29Z

crash

350 KB model and

-CONFIG_SPIRAM_USE_MALLOC=y
+#CONFIG_SPIRAM_USE_MALLOC=y
+CONFIG_SPIRAM_USE_CAPS_ALLOC=y

=> no fragmentation, but most likely not enough internal RAM

jomjol and others added 5 commits March 19, 2023 18:31

use a single instance of CTfLiteClass and reuse the allocated space i…

d090e00

…n PSRAM

renaming

ebd7b4f

.

a913698

use shared memory for model and imageTMP

58a1571

caco3 changed the title ~~use a single instance of CTfLiteClass and reuse the allocated space in PSRAM~~ Static shared memory in PSRAM for model/imageTMP and tensor_arena Mar 22, 2023

log MQTT connection refused reasons (#2216)

e4a6fd3

Revert PSRAM usage as it lead to memory fragmentation.

267782d

See #2200 for details

This was referenced Mar 22, 2023

Revert PSRAM usage as it lead to memory fragmentation. #2224

Merged

Out of memory #2200

Closed

CaCO3 and others added 10 commits March 23, 2023 21:24

fix missing value data

c4b990a

Revert PSRAM usage as it lead to memory fragmentation. (#2224)

e2b66aa

See #2200 for details Co-authored-by: CaCO3 <[email protected]>

Fix missing value data in graph (#2230)

fa09680

* fix missing value data --------- Co-authored-by: CaCO3 <[email protected]>

Update Changelog.md (#2231)

9ffaf6e

Merge branch 'master' into rolling

db36fe2

Update interface_influxdb.cpp (#2233)

727b871

Merge branch 'rolling' of https://github.com/jomjol/AI-on-the-edge-de…

33bfef0

…vice into rolling

update copyright year

de1dcc4

Cleanup

f79e03f

Set prevalue using MQTT + set prevalue to RAW value (REST+MQTT) (#2252)

b6bfeea

* Use double instead of float * Error handling + set to RAW if newvalue < 0 * REST SetPrevalue: Set to RAW if newvalue < 0 * set prevalue with MQTT

caco3 and others added 4 commits March 30, 2023 21:54

removed the stb_image files and re-add them as a submodule. (#2223)

df12dea

- stb_image.h: Version update 2.25 -> 2.28 - stb_resize.h: Version update 0.96 -> 0.97 - stb_write.h: Version update 1.14 -> 1.16 Co-authored-by: CaCO3 <[email protected]>

Remove obsolete ClassFlowWriteList (#2264)

0e3a50d

Renaming & cleanup of some modules / functions in source code (#2265)

e995d6c

* Rename module tag name * Rename server_tflite.cpp -> MainFlowControl.cpp * Remove redundandant MQTTMainTopic function * Update * Remove obsolete GetMQTTMainTopic

enable USE_SHARED_MODEL_AND_IMAGETMP_MEMORY and CONFIG_MQTT_CUSTOM_OU…

3e752fd

…TBOX for testing

Slider0007 mentioned this pull request Apr 1, 2023

Keep models loaded in RAM - FOR TESTING - NO MERGE #2279

Closed

CaCO3 added 8 commits April 1, 2023 17:29

use a single instance of CTfLiteClass and reuse the allocated space i…

bfabc8e

…n PSRAM

renaming

99498ef

.

8e7c193

use shared memory for model and imageTMP

2b84d88

enable USE_SHARED_MODEL_AND_IMAGETMP_MEMORY and CONFIG_MQTT_CUSTOM_OU…

f1aebef

…TBOX for testing

Merge branch 'shared-psram-objects' of https://github.com/jomjol/AI-o…

c7c967b

…n-the-edge-device into shared-psram-objects

extend psram files

547044e

STBI use PSRAM hooks

59bf0d4

caco3 closed this Apr 30, 2023

caco3 deleted the shared-psram-objects branch May 2, 2023 05:59
