improve thread safety in esp_timer (IDFGH-9920) #11215
Thank you for the fix @JensGutermuth, the changes look good to me!
Do you happen to have any piece of code which triggers this race condition? I would like to add a test case for this, so if you have some code already it would save a bit of time. (I.e. something that fails without the fix and works correctly after the fix.)
I'll see if I can put together a small example triggering the race tomorrow. I've seen the issue sporadically in a fairly large proprietary project. Heavily using the NimBLE GATT server seemed to have triggered it most often. Side note: You guys are fast! Getting a thoughtful response in under 30 minutes is impressive. Very cool!
Replacing
Code to reproduce the race in esp_timer
#include <stdio.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <esp_timer.h>
#include <esp_err.h>
#include <esp_log.h>

static esp_timer_handle_t shared_timer = NULL;
static volatile int shared_timer_happened = 0;

void esp_timer_callback(void * arg)
{
    // do something
    shared_timer_happened++;
}

/**
 * @brief Starts and stops the shared_timer in a loop to trigger a race
 *
 * Starting and stopping an esp_timer from different tasks may
 * trigger a race. This function usually triggers it within a couple of seconds
 * if run concurrently in different tasks, preferably on different cores.
 */
void toggle_shared_timer(void * pvParameters)
{
    while(1)
    {
        for (int i = 0; i < 10000; i++)
        {
            // return values are deliberately ignored: either call may fail
            // when the other task changed the timer state in between
            esp_timer_start_once(shared_timer, 1000);
            esp_timer_stop(shared_timer);
        }
        ESP_LOGI("check", "ok core %d", xPortGetCoreID());
        // yield to allow the IDLE task to run
        vTaskDelay(1);
    }
}

void app_main(void)
{
    // create shared timer
    esp_timer_create_args_t create_args = {
        .callback = esp_timer_callback,
        .arg = NULL,
        .dispatch_method = ESP_TIMER_TASK,
        .name = "shared_timer",
        .skip_unhandled_events = true
    };
    ESP_ERROR_CHECK(esp_timer_create(&create_args, &shared_timer));
    // start second task
    xTaskCreatePinnedToCore(toggle_shared_timer, "second_task", 4096, NULL, 20, NULL, 1);
    // toggle the shared timer in a loop
    toggle_shared_timer(NULL);
}
Thanks for addressing my comment and for the reproducer, @JensGutermuth! One final request: could you please combine your two commits into one (
Inadequate locking in the esp_timer component allowed corruption of the s_timers linked list:
1. timer_armed(timer) returns false
2. another task arms the timer and adds it to s_timers
3. the list is locked
4. the timer is inserted into s_timers again
The last step results in a loop in the s_timers list, which causes an infinite loop when iterated. This change always locks the list before checking if the timer is already armed avoiding the data race.
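The check-under-lock pattern described in this commit message can be sketched on a host machine as follows. This is only an illustration under stated assumptions, not the actual esp_timer code: the `demo_*` names are hypothetical, a C11 `atomic_flag` spinlock stands in for the real list lock, and the list is an unsorted singly linked list rather than the real s_timers structure. The point it shows is that the armed check and the insertion happen inside the same critical section, so a second caller can never insert the same node again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a timer object; not the real esp_timer struct. */
typedef struct demo_timer {
    struct demo_timer *next;
    bool armed;
} demo_timer_t;

/* Hypothetical stand-ins for s_timers and its lock. */
static demo_timer_t *s_demo_timers = NULL;
static atomic_flag s_demo_lock = ATOMIC_FLAG_INIT;

void demo_list_lock(void)
{
    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&s_demo_lock, memory_order_acquire)) {
    }
}

void demo_list_unlock(void)
{
    atomic_flag_clear_explicit(&s_demo_lock, memory_order_release);
}

/* Returns true if the timer was inserted, false if it was already armed. */
bool demo_timer_start(demo_timer_t *t)
{
    bool inserted = false;
    demo_list_lock();
    if (!t->armed) {            /* the check happens while the lock is held */
        t->next = s_demo_timers;
        s_demo_timers = t;
        t->armed = true;
        inserted = true;
    }
    demo_list_unlock();
    return inserted;
}

/* Returns true if the timer was removed, false if it was not armed. */
bool demo_timer_stop(demo_timer_t *t)
{
    bool removed = false;
    demo_list_lock();
    if (t->armed) {
        for (demo_timer_t **pp = &s_demo_timers; *pp != NULL; pp = &(*pp)->next) {
            if (*pp == t) {
                *pp = t->next;  /* unlink the node */
                removed = true;
                break;
            }
        }
        assert(removed);        /* invariant: an armed timer is in the list */
        t->next = NULL;
        t->armed = false;
    }
    demo_list_unlock();
    return removed;
}

/* Counts list entries; terminates only because the list has no cycle. */
size_t demo_timer_count(void)
{
    size_t n = 0;
    demo_list_lock();
    for (const demo_timer_t *p = s_demo_timers; p != NULL; p = p->next) {
        n++;
    }
    demo_list_unlock();
    return n;
}
```

With this shape, a second `demo_timer_start()` on an already armed timer returns false instead of linking the node into the list a second time.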
a2eb8a4 to 3ba7049
Sure :)
sha=3ba70490c94f306dc2573965879581b2f3210cc2
I can reproduce the panic on the v4.3 branch using #11215 (comment).
I am also hitting this in v4.4 all the time. In my case it caused a circular list of timers and a corresponding infinite loop. Please get this merged and backported. My issue: #11338
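To make the circular-list symptom concrete, here is a small host-side sketch of how inserting the same node twice produces a cycle, together with a cycle check (Floyd's two-pointer technique) that could serve as a debugging aid. It is a simplification under assumptions: `node_t`, `push_front` and `list_has_cycle` are hypothetical names, and a plain singly linked list stands in for the real timer list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal list node; stands in for a timer entry. */
typedef struct node {
    struct node *next;
} node_t;

/* Unguarded head insertion: nothing checks whether n is already linked. */
void push_front(node_t **head, node_t *n)
{
    n->next = *head;
    *head = n;
}

/* Floyd's cycle detection: returns true if following next pointers
 * from head never reaches NULL. */
bool list_has_cycle(const node_t *head)
{
    const node_t *slow = head;
    const node_t *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          /* advances one node per step */
        fast = fast->next->next;    /* advances two nodes per step */
        if (slow == fast) {
            return true;
        }
    }
    return false;
}
```

Pushing a node that is already at the head makes its `next` pointer refer to itself, so any loop that walks the list until it reaches NULL never terminates.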