Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix esp32 event callback serialisation #2913

Merged
merged 18 commits into from
Nov 25, 2024

Conversation

mikee47
Copy link
Contributor

@mikee47 mikee47 commented Nov 20, 2024

This PR addresses the issue raised in #2912 by pushing events into the tcpip thread for handling. Network applications therefore 'live' in the tcpip task and the Sming task is created only for non-networked applications. Sming uses a separate event callback queue instead of the IDF one.

  • Use custom queue for sming messaging
  • Serialise idf messages to sming queue in networking code
  • Use idle task watchdogs instead of custom one
  • Push task callbacks to tcpip thread in network builds
  • Fix stack sizes. Network builds just use default size for event task stack
  • Use heap for initial partition table load. Not enough room on event task stack for this.
  • Add a semaphore to smg_uart_write so debug output doesn't get garbled between tasks (e.g. wifi).

TODO:

  • Deal with events from wifi task
  • How to deal with task watchdog? This is attached to the Sming task and reset inside the event loop, but there are also calls to esp_task_wdt_reset (and indirectly via system_soft_wdt_feed) from code which is now running in the tcpip task.
  • Decide whether to keep the task name in debug output. The goal of this PR is to ensure everything is properly serialised to one task, so at that point it will be less useful.

@mikee47 mikee47 marked this pull request as draft November 20, 2024 21:33
Copy link

what-the-diff bot commented Nov 20, 2024

PR Summary

  • Enhanced Thread Safety in UART Operations
    A Lock class was included in uart.cpp to manage mutexes, a tech concept to ensure multiple processes share a single resource without conflict. This makes UART operations secure when performed by multiple threads, thus promoting better functionality.

  • Expanded Common Configuration Size
    CONFIG_LWIP_TCPIP_TASK_STACK_SIZE was expanded in the common configuration. This provides more memory space to handle tasks efficiently.

  • Added Task Name Retrieval Function
    A new function, os_get_task_name(), was presented in esp_system.h that serves to fetch the current task's name. This addition can assist with tracking and managing tasks.

  • Improved System Initialization
    The main() function in startup.cpp was updated to use System.queueCallback(init) as a part of the initialization process instead of direct init() call, giving a more reliable system initialization.

  • Conditional Definition Based on Network Status
    Introduced a condition-based definition of SMING_TASK_STACK_SIZE in startup.cpp depending on the DISABLE_NETWORK preprocessor directive, modifying its behavior in response to network status.

  • Introduced Responsive TCP/IP Event Handler
    A new TCP/IP event handler was created in tasks.cpp to react to events varied based on network conditions, promoting more effective interaction and responsiveness.

  • Improved Partition Information Loading
    Altered the method of loading partition information in Device.cpp to use a dynamic buffer with std::make_unique, leading to a more efficient and responsive loading process.

  • Updated Logging Function
    The debug_e macro in debug_progmem.h was improved to include the task's name in the log output for the ESP32 platform, allowing for superior debugging and tracking of tasks.

@slaff slaff added this to the 6.0.0 milestone Nov 21, 2024
@mikee47
Copy link
Contributor Author

mikee47 commented Nov 22, 2024

So message handling is, I think, much improved. I've discarded the customised (hacked) event loop and reverted to the standard event loop since it's now just used for IDF stuff. Sming now uses a separate queue for queueCallback stuff and includes a watchdog reset in there.

I'm now using the RTC watchdog which is normally disabled after boot, but it's more convenient than a task watchdog as it can be reset from any task.

However, in idle situations the watchdog expires as there's nothing being posted to the message queue.

If anyone has thoughts on this please comment! Here's the IDF page for reference https://docs.espressif.com/projects/esp-idf/en/v5.2.3/esp32/api-reference/system/wdts.html.

@mikee47
Copy link
Contributor Author

mikee47 commented Nov 22, 2024

Also the RTC watchdog is only available for esp32 and esp32s2. I'm inclined to ditch the wdt stuff entirely...

Use separate queue for sming messaging, serviced via tcpip thread or Sming thread if networking disabled.
SmartConfig hangs on `esp_event_handler_register` call if invoked via task queue.
Can deadlock if used inside `init()` as it's called in low-priority thread.
@mikee47 mikee47 force-pushed the fix/tcpip-task-event-handling branch from ef31d29 to e6e9d26 Compare November 25, 2024 11:55
@slaff
Copy link
Contributor

slaff commented Nov 25, 2024

Also the RTC watchdog is only available for esp32 and esp32s2. I'm inclined to ditch the wdt stuff entirely...

Just for my understanding? Are you adding the wdt reset to make it more convenient for the developers or because there is some requirement? If there is no requirement maybe we should ditch the wdt and explain in a document that it is users responsibility to use the watchdog for time-consuming tasks.

@mikee47 mikee47 changed the title Fix tcpip task event handling Fix esp32 event callback serialisation Nov 25, 2024
@mikee47 mikee47 marked this pull request as ready for review November 25, 2024 12:34
@mikee47
Copy link
Contributor Author

mikee47 commented Nov 25, 2024

Also the RTC watchdog is only available for esp32 and esp32s2. I'm inclined to ditch the wdt stuff entirely...

Just for my understanding? Are you adding the wdt reset to make it more convenient for the developers or because there is some requirement? If there is no requirement maybe we should ditch the wdt and explain in a document that it is users responsibility to use the watchdog for time-consuming tasks.

The WDT is there to make behaviour consistent with esp8266 (and rp2040). I've gone back to the default IDF behaviour which enables the IDLE task watchdog by default. All I've done is change the timeout from 5s to 8s. When applications call system_soft_wdt_feed() we yield the task rather than resetting the watchdog hardware directly.

NB. this approach is probably the only one that makes sense since it guards all tasks and not just the Sming ones.

@slaff slaff merged commit 502e304 into SmingHub:develop Nov 25, 2024
31 checks passed
@mikee47 mikee47 deleted the fix/tcpip-task-event-handling branch November 25, 2024 13:35
@mikee47
Copy link
Contributor Author

mikee47 commented Nov 25, 2024

The CI failures are my fault: I made a minor update to the esp-idf fork and omitted to tag it.

Most of the tcpip code is, by default, emitted to flash but many functions can be moved to IRAM with CONFIG_LWIP_IRAM_OPTIMIZATION=y. This PR uses tcpip_callbackmsg_trycallback_fromisr (which in turn calls sys_mbox_trypost_fromisr) but neither of these, it would appear, are candidates for such optimisation.

The esp32 is more tolerant of calling code from flash within interrupt routines, but for efficiency
and reliability these two functions should be in IRAM. This is the reason for the updated.

NB. Prior to v5.3 I want to get v5.2.3 sorted. This will require work to the esp32 Component which can also be simplified considerably now we've dropped older versions :-) This is in progress ATM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants