Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OOB-5.14][DISCO_L475VG_IOT01A] defining MBED_SLEEP_TRACING_ENABLED with LPTICKER fails in mutex ISR error. #11497

Closed
korjaa opened this issue Sep 17, 2019 · 38 comments · Fixed by #13034

Comments

@korjaa
Copy link
Contributor

korjaa commented Sep 17, 2019

Description

Target: DISCO_L475VG_IOT01A
Toolchain: GCC_ARM 6.3.1 20170620 (release)
mbed-cli: 1.10.0
mbed-os: f60e9a8075 (HEAD, tag: mbed-os-5.14.0-rc2, origin/mbed-os-5.14) Merge pull request #11476 from ARMmbed/release-candidate

Steps to reproduce:

  1. Create new project with mbed new
  2. Checkout mbed-os-5.14.0-rc2
  3. Add below to main.cpp (https://os.mbed.com/docs/mbed-os/development/tutorials/power-optimization.html)
#include "mbed.h"
#include "mbed_stats.h"

static DigitalOut led(LED1);

int main() {
   while (1) {
       led = !led;
       wait_ms(2000);

       mbed_stats_cpu_t stats;
       mbed_stats_cpu_get(&stats);
       printf("Uptime: %llu ", stats.uptime / 1000);
       printf("Sleep time: %llu ", stats.sleep_time / 1000);
       printf("Deep Sleep: %llu\n", stats.deep_sleep_time / 1000);
   }
}
  1. Add MBED_SLEEP_TRACING_ENABLED to mbed_app.json macros.
  2. Build with mbed compile --target DISCO_L475VG_IOT01A --toolchain GCC_ARM --clean

Log output:
Below message is printed in a loop.

++ MbedOS Error Info ++
Error Status: 0x80010133 Code: 307 Module: 1
Error Message: Mutex: 0x2000267C, Not allowed in ISR context
Location: 0x8001FC1
Error Value: 0x2000267C
Current Thread: �0�  Id: 0x0 Entry: 0x8002245 StackSize: 0x0 StackMem: 0x80039F5 SP: 0x10000490 
For more info, visit: https://mbed.com/s/error?error=0x80010133&tgt=DISCO_L475VG_IOT01A
-- MbedOS Error Info --

= System will be rebooted due to a fatal error =
= Reboot count(=134268113) reached maximum, system will halt after rebooting

The message it self says the system will be halted, but that doesn't seem to happen.

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug
@0xc0170
Copy link
Contributor

0xc0170 commented Sep 17, 2019

There are similar reports #10051

cc @ARMmbed/team-st-mcd

@korjaa
Copy link
Contributor Author

korjaa commented Sep 17, 2019

There are similar reports #10051

There is a mention of stack overflow issues in https://os.mbed.com/docs/mbed-os/development/tutorials/power-optimization.html
The recommendation is to increase idle thread stack size with "rtos.idle-thread-stack-size": 1024

I'm observing Not allowed in ISR context issue, I'm not sure if they are related.

@LMESTM
Copy link
Contributor

LMESTM commented Sep 17, 2019

Could you still increase the stack size here and see if that can help ?
Rather than increasing rtos.idle-thread-stack-size please increase the idle-thread-stack-size-debug-extra value here (to 512 for instance):
https://github.com/ARMmbed/mbed-os/blob/master/rtos/source/TARGET_CORTEX/mbed_lib.json#L36

Stack overflow seems to possibly have side-effects of mutex/ISR issues ...

@maxgerhardt
Copy link
Contributor

No idea if this type of error still occurs in this version but maybe this is related: #10481

@ciarmcom
Copy link
Member

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-1768

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

Hi @LMESTM, I did try the mentioned value of 1024 on idle thread, but not on debug thread. I'll try that next.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

As a side note, I just tried and it seems to reproduce in NUCLEO_F411RE also. This was tried with "rtos.idle-thread-stack-size": 4096.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

My current mbed_app.json configuration

{
    "target_overrides": {
    	"*": {
    		 "platform.cpu-stats-enabled": 1,
    		 "rtos.idle-thread-stack-size": 4096,
    		 "rtos.idle-thread-stack-size-debug-extra": 4096
    	}
    },
    "macros": ["MBED_TICKLESS=1", "MBED_SLEEP_TRACING_ENABLED"]
}

This fails on both NUCLEO_F411RE and DISCO_L475VG_IOT01A

@LMESTM
Copy link
Contributor

LMESTM commented Sep 18, 2019

Ok I could reproduce it.

The backtrace shows that it is caused by mbed_error_printf being called from interrupt context in the functions related to sleep tracing (MBED_SLEEP_TRACING_ENABLED)

#7  0x08001470 in mbed_error (error_status=error_status@entry=1, error_msg=0x800dd04 "Not allowed in ISR context",
    error_value=error_value@entry=134274308, filename=filename@entry=0x0, line_number=line_number@entry=0)
    at .\platform\source\mbed_error.c:282
#8  0x08001fc0 in trap_rtx_error (error_value=134274308, rtx_status=0, error_status=1)
    at .\rtos\source\TARGET_CORTEX\mbed_rtx_handlers.c:118
#9  0x08002fd0 in osMutexAcquire (mutex_id=<optimized out>, timeout=timeout@entry=4294967295)
    at .\rtos\source\TARGET_CORTEX\rtx5\RTX\Source\rtx_mutex.c:483
#10 0x080018aa in singleton_lock () at ./platform/SingletonPtr.h:54
#11 0x080019f2 in __cxa_guard_acquire (guard_object_p=0x20000dd8 <guard variable for default_console()::console>)
    at .\platform\source\mbed_retarget.cpp:1503
#12 0x08001a9c in default_console () at .\platform\source\mbed_retarget.cpp:274
#13 get_console (fd=2) at .\platform\source\mbed_retarget.cpp:293
#14 mbed::mbed_file_handle (fd=2) at .\platform\source\mbed_retarget.cpp:304
#15 0x08001b22 in write (fildes=fildes@entry=2, buf=buf@entry=0x1000053c, length=length@entry=0)
    at .\platform\source\mbed_retarget.cpp:677
#16 0x0800107e in mbed_error_puts (str=str@entry=0x1000053c "LOCK: lp_ticker.c, ln: 363, lock count: 1\r\n")
    at .\platform\source\mbed_board.c:97
#17 0x080010c8 in mbed_error_vprintf (format=format@entry=0x800e1f1 "lp_ticker.c", arg=..., arg@entry=...)
    at .\platform\source\mbed_board.c:71
#18 0x080010e4 in mbed_error_printf (format=0x800d97b "LOCK: %s, ln: %i, lock count: %u\r\n")
    at .\platform\source\mbed_board.c:55
#19 0x08001700 in sleep_tracker_lock (filename=filename@entry=0x800e1f1 "lp_ticker.c", line=line@entry=363)
    at .\platform\source\mbed_power_mgmt.c:167
#20 0x0800589a in lp_ticker_set_interrupt (timestamp=<optimized out>) at .\targets\TARGET_STM\lp_ticker.c:363
#21 0x080007de in ticker_set_handler (ticker=0x800d260 <lp_data>,
    handler=handler@entry=0x80002c3 <mbed::TimerEvent::irq(unsigned long)>) at .\hal\mbed_ticker_api.c:281
#22 0x080002e8 in mbed::TimerEvent::TimerEvent (
    this=0x20000ce0 <mbed::internal::(anonymous namespace)::os_timer_data>, data=<optimized out>)
    at .\drivers\source\TimerEvent.cpp:31
#23 0x08000a68 in mbed::internal::SysTimer<1000ul, true>::SysTimer (
    this=0x20000ce0 <mbed::internal::(anonymous namespace)::os_timer_data>, data=<optimized out>)
    at .\platform\source\SysTimer.cpp:74
#24 0x0800151e in mbed::internal::init_os_timer () at .\platform\source\mbed_os_timer.cpp:53
#25 0x08002162 in OS_Tick_Enable () at .\rtos\source\TARGET_CORTEX\mbed_rtx_idle.cpp:81
#26 0x0800246e in svcRtxKernelStart () at .\rtos\source\TARGET_CORTEX\rtx5\RTX\Source\rtx_kernel.c:247
#27 0x080021ba in SVC_Handler () at irq_cm4f.S:63

That is somehow surprising as the history of mbed_power_mgmt.c shows that mbed_error_printf is supposed to be ISR friendly ...

$ git log -L 167,167:platform/source/mbed_power_mgmt.c
commit 7e5e5361bf5cb88ae7d47c6f05254fd86a02c57c
Author: deepikabhavnani <[email protected]>
Date:   Fri Jan 4 16:45:09 2019 -0600

    Use mbed_error_printf instead of debug for ISR friendly prints

    Internally in sleep tracing `debug` was used for trace prints, but
    all sleep API's are ISR safe and used in interrupts. This resulted in
    hardfaults / errors.

    Solution is to use `mbed_error_printf` instead for printing on UART

so I'm not sure what to look at next ...
@0xc0170 Who could confirm whether mbed_error_printf is really ISR friendly ?

@0xc0170
Copy link
Contributor

0xc0170 commented Sep 18, 2019

It should be:

 * @note Synchronization level: Interrupt safe, as long as the
 *       FileHandle::write of the stderr device is. The default
 *       serial console is safe, either buffered or not. If the
 *       console has not previously been initialized, an attempt
 *       to use this from interrupt may during console initialization.
 *       Special handling of `mbed_error` relaxes various system traps
 *       to increase the chance of initialization working.
 *
 * @param str    C string that contains data stream to be printed.
 *
 */
void mbed_error_puts(const char *str);

this puts function has write(STDERR_FILENO, str, 0); that might be problematic. From the logs above, I can see this one is triggering it:

 #16 0x0800107e in mbed_error_puts (str=str@entry=0x1000053c "LOCK: lp_ticker.c, ln: 363, lock count: 1\r\n")
    at .\platform\source\mbed_board.c:97

@kjbracey-arm Can you review ?

@kjbracey
Copy link
Contributor

kjbracey commented Sep 18, 2019

mbed_error_printf is ISR friendly (if the underlying console device is) but only if the console device has already been initialised.

A dummy print at the top of main before any sleeps happen will initialise it, making subsequent calls from ISR/critical section work.

@kjbracey
Copy link
Contributor

That note actually describes the behaviour, but the word "crash" is missing.

If the
 *       console has not previously been initialized, an attempt
 *       to use this from interrupt may CRASH during console initialization.

@0xc0170
Copy link
Contributor

0xc0170 commented Sep 18, 2019

A dummy print at the top of main before any sleeps happen will initialise it, making subsequent calls from ISR/critical section work.

I recall this was discussed previously and we haven't found a better way than this. Is this documented? If not we should have. A user should be able to find this once this error is hit in their app.

Looking at https://os.mbed.com/docs/mbed-os/development/tutorials/power-optimization.html , there is no note about console relation , shall we add it to the troubleshooting ? For a user to find it as they will search for it once they hit the error ("sleep logging mutex error").

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

We quickly tried to add a dummy print to the main, but unfortunately without much luck.

@kjbracey
Copy link
Contributor

Is this documented?

It's documented in mbed_error_printf itself, but it certainly needs to be documented in the sleep tracing docs as they're then built on top of that.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

Adding the dummy print to the beginning and using unbuffered serial causes a reset loop like originally repoted.

If this is interrupted, the backtrace looks as following:

(gdb) bt
#0  0x0800587c in serial_writable (obj=obj@entry=0x200038f8 <stdio_uart>)
    at ./mbed-os/targets/TARGET_STM/serial_api.c:522
#1  0x080050e2 in serial_putc (obj=obj@entry=0x200038f8 <stdio_uart>, c=56)
    at ./mbed-os/targets/TARGET_STM/TARGET_STM32F4/serial_device.c:283
#2  0x08001ff0 in DirectSerial::write (this=<optimized out>, buffer=<optimized out>, size=1)
    at ./mbed-os/platform/source/mbed_retarget.cpp:185
#3  0x08002236 in write (fildes=fildes@entry=2, buf=buf@entry=0x2001fd56, length=length@entry=1)
    at ./mbed-os/platform/source/mbed_retarget.cpp:683
#4  0x080017a8 in mbed_error_puts (str=0x2001fd56 "8004269 SP: 0x2001FD90 ", 
    str@entry=0x2001fd0c "\nCurrent Thread: \370\060\320  Id: 0x0 Entry: 0x80029ED StackSize: 0x0 StackMem: 0x8004269 SP: 0x2001FD90 ") at ./mbed-os/platform/source/mbed_board.c:107
#5  0x080017dc in mbed_error_vprintf (format=format@entry=0x0, arg=..., arg@entry=...)
    at ./mbed-os/platform/source/mbed_board.c:71
#6  0x080017f8 in mbed_error_printf (
    format=0x800d3ab "\nCurrent Thread: %s  Id: 0x%lX Entry: 0x%lX StackSize: 0x%lX StackMem: 0x%lX SP: 0x%lX ")
    at ./mbed-os/platform/source/mbed_board.c:55
#7  0x080019e8 in print_error_report (error_msg=error_msg@entry=0x800d8d0 "Not allowed in ISR context", 
    error_filename=error_filename@entry=0x0, error_line=error_line@entry=0, ctx=0x20000fa4 <last_error_ctx>)
    at ./mbed-os/platform/source/mbed_error.c:535
#8  0x08001b88 in mbed_error (error_status=error_status@entry=1, error_msg=0x800d8d0 "Not allowed in ISR context", 
    error_value=error_value@entry=134273232, filename=filename@entry=0x0, line_number=line_number@entry=0)
    at ./mbed-os/platform/source/mbed_error.c:282
#9  0x08002768 in trap_rtx_error (error_value=134273232, rtx_status=0, error_status=1)
    at ./mbed-os/rtos/source/TARGET_CORTEX/mbed_rtx_handlers.c:118
#10 0x08003778 in osMutexAcquire (mutex_id=<optimized out>, timeout=timeout@entry=4294967295)
    at ./mbed-os/rtos/source/TARGET_CORTEX/rtx5/RTX/Source/rtx_mutex.c:483
#11 0x08001fc2 in singleton_lock () at ./mbed-os/platform/SingletonPtr.h:54
#12 0x080020ea in __cxa_guard_acquire (guard_object_p=0x200010e0 <guard variable for default_console()::console>)
    at ./mbed-os/platform/source/mbed_retarget.cpp:1503
#13 0x08002194 in default_console () at ./mbed-os/platform/source/mbed_retarget.cpp:274
#14 get_console (fd=2) at ./mbed-os/platform/source/mbed_retarget.cpp:293
#15 mbed::mbed_file_handle (fd=2) at ./mbed-os/platform/source/mbed_retarget.cpp:304
#16 0x0800221a in write (fildes=fildes@entry=2, buf=buf@entry=0x2001fe4c, length=length@entry=0)
    at ./mbed-os/platform/source/mbed_retarget.cpp:677
#17 0x08001772 in mbed_error_puts (str=str@entry=0x2001fe4c "LOCK: Ticker.cpp, ln: 71, lock count: 1\r\n")
    at ./mbed-os/platform/source/mbed_board.c:97
#18 0x080017dc in mbed_error_vprintf (format=format@entry=0x800cd29 <_fini+100> "Ticker.cpp", arg=..., arg@entry=...)
    at ./mbed-os/platform/source/mbed_board.c:71
#19 0x080017f8 in mbed_error_printf (format=0x800d555 "LOCK: %s, ln: %i, lock count: %u\r\n")
    at ./mbed-os/platform/source/mbed_board.c:55
#20 0x08001e18 in sleep_tracker_lock (filename=filename@entry=0x800cd29 <_fini+100> "Ticker.cpp", line=line@entry=71)
    at ./mbed-os/platform/source/mbed_power_mgmt.c:167
#21 0x080004f4 in mbed::Ticker::attach_us(mbed::Callback<void ()>, unsigned long long) (
    this=this@entry=0x20000ea0 <ticker_wrapper_data+8>, func=..., t=<optimized out>)
    at ./mbed-os/drivers/source/Ticker.cpp:71
#22 0x080008f2 in LowPowerTickerWrapper::_schedule_match (this=this@entry=0x20000e98 <ticker_wrapper_data>, 
    current=<optimized out>) at ./mbed-os/hal/LowPowerTickerWrapper.cpp:312
#23 0x08000990 in LowPowerTickerWrapper::set_interrupt (this=0x20000e98 <ticker_wrapper_data>, timestamp=8846289)
    at ./mbed-os/hal/LowPowerTickerWrapper.cpp:143
#24 0x08000e2e in schedule_interrupt (ticker=ticker@entry=0x20000e98 <ticker_wrapper_data>)
    at ./mbed-os/hal/mbed_ticker_api.c:266
#25 0x08001018 in ticker_insert_event_us (ticker=0x20000e98 <ticker_wrapper_data>, 
    obj=obj@entry=0x20000ff0 <mbed::internal::(anonymous namespace)::os_timer_data+8>, timestamp=<optimized out>, 
    id=id@entry=536874984) at ./mbed-os/hal/mbed_ticker_api.c:385
#26 0x080005ea in mbed::TimerEvent::insert_absolute (
    this=this@entry=0x20000fe8 <mbed::internal::(anonymous namespace)::os_timer_data>, timestamp=<optimized out>)
    at ./mbed-os/drivers/source/TimerEvent.cpp:53
#27 0x08001212 in mbed::internal::SysTimer<1000ul, true>::_schedule_tick (
    this=this@entry=0x20000fe8 <mbed::internal::(anonymous namespace)::os_timer_data>)
    at ./mbed-os/platform/source/SysTimer.cpp:182
#28 0x080013be in mbed::internal::SysTimer<1000ul, true>::start_tick (
    this=0x20000fe8 <mbed::internal::(anonymous namespace)::os_timer_data>)
    at ./mbed-os/platform/source/SysTimer.cpp:176
#29 0x08002912 in OS_Tick_Enable () at ./mbed-os/rtos/source/TARGET_CORTEX/mbed_rtx_idle.cpp:81
#30 0x08002c16 in svcRtxKernelStart () at ./mbed-os/rtos/source/TARGET_CORTEX/rtx5/RTX/Source/rtx_kernel.c:247
#31 0x08002962 in SVC_Handler () at irq_cm4f.S:63

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

Am I reading that correctly that there is mbed_error_printf() call before main()? And as such, there is never a chance to initialize the stdout in non-interrupt context.

This would make sense in a way that I'm unable to break at main().

@kjbracey
Copy link
Contributor

One thought - printf might not be technically quite right, as error_printf goes to STDERR_FILENO, not STDOUT_FILENO. With the built-in buffered or unbuffered serial though, they both initialise the same device, so it doesn't matter.

And yes, it looks like it's calling a mbed_error_printf before main. But in that case, there's a separate layer that's supposed to help which is that the singleton_lock should be a no-op if the RTOS hasn't started yet.

Clearly we haven't quite started yet (we're in osRtxKernelStart), but that flag that singleton_lock is looking at thinks we have. Maybe that check could be adjusted.

@kjbracey
Copy link
Contributor

You could try forcing a pre-main console initialisation with

static int dummy = printf("Hello\n");

@korjaa
Copy link
Contributor Author

korjaa commented Sep 18, 2019

Switched my main.cpp to following:

#include "mbed.h"
#include "mbed_stats.h"

static DigitalOut led(LED1);

static int dummy = printf("Hello\n");

int main() {
   printf("Start example\n");
   fflush(stdout);
   wait_ms(2000);
   while (1) {
       led = !led;
       wait_ms(2000);

       mbed_stats_cpu_t stats;
       mbed_stats_cpu_get(&stats);
       printf("Uptime: %llu ", stats.uptime / 1000);
       printf("Sleep time: %llu ", stats.sleep_time / 1000);
       printf("Deep Sleep: %llu\n", stats.deep_sleep_time / 1000);
   }
}

Unfortunately it still ended up in a reboot loop.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 20, 2019

Recording it here too:
I re-tried this with mbed-os-5.14.0-rc3 and I'm still facing the same issue that when MBED_SLEEP_TRACING_ENABLED is added, the program ends in reboot loop.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 20, 2019

Using MBED_SLEEP_TRACING_ENABLED with K64F & K82F seems to work.

@LMESTM
Copy link
Contributor

LMESTM commented Sep 20, 2019

The reason is that LP TICKER dirver for STM32 makes use of deep sleep lock (which BTW I think should be done as well by Serial drivers for instance to make sure we don't enter deep sleep before the transmit buffer is fully empty ...).

Anyway the lock is being taken before the stdout is available.

I think @kjbracey-arm pointed a possible forward:

And yes, it looks like it's calling a mbed_error_printf before main. But in that case, there's a separate layer that's supposed to help which is that the singleton_lock should be a no-op if the RTOS hasn't started yet.
Clearly we haven't quite started yet (we're in osRtxKernelStart), but that flag that singleton_lock is looking at thinks we have. Maybe that check could be adjusted.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 20, 2019

Based on above I tried "MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER=1" which did indeed re-enable sleep tracing.

@korjaa korjaa changed the title [OOB-5.14][DISCO_L475VG_IOT01A] defining MBED_SLEEP_TRACING_ENABLED fails in mutex ISR error. [OOB-5.14][DISCO_L475VG_IOT01A] defining MBED_SLEEP_TRACING_ENABLED with LPTICKER fails in mutex ISR error. Sep 20, 2019
@LMESTM
Copy link
Contributor

LMESTM commented Sep 20, 2019

Based on above I tried "MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER=1" which did indeed re-enable sleep tracing.

By doing this you actually fully prevent use of deep sleep because us_ticker (comparted to lp_ticker) does not allow deep_sleep ... so you're tracing sleep but will always get 0% deep_sleep ;-)

I'd rather think of asking @kjbracey-arm to propose a change to singleton_lock ...

@korjaa
Copy link
Contributor Author

korjaa commented Sep 20, 2019

That is true, I just wanted to try it with the USTICKER if it removes the error.

@kjbracey
Copy link
Contributor

kjbracey commented Sep 20, 2019

One suggestion - change singleton_lock to check osKernelGetState() == osKernelRunning instead of singleton_mutex_id.

@korjaa
Copy link
Contributor Author

korjaa commented Sep 20, 2019

I tried above, it does remove the crash, but it 99% disables deep sleep too. Maybe this is expected with the deep sleep lock tracing? I'm not sure. The DISCO_L475VG_IOT01A board though seems to get into some indefinite loop (log capture below).

Diff for above modification:

$ git diff
diff --git a/platform/SingletonPtr.h b/platform/SingletonPtr.h
index a71e0873ce..9ca5d47e9e 100644
--- a/platform/SingletonPtr.h
+++ b/platform/SingletonPtr.h
@@ -47,8 +47,8 @@ extern osMutexId_t singleton_mutex_id;
 inline static void singleton_lock(void)
 {
 #ifdef MBED_CONF_RTOS_PRESENT
-    if (!singleton_mutex_id) {
-        // RTOS has not booted yet so no mutex is needed
+    if (osKernelGetState() != osKernelRunning) {
+        // RTOS is not active so no mutex is needed
         return;
     }
     osMutexAcquire(singleton_mutex_id, osWaitForever);
@@ -64,8 +64,8 @@ inline static void singleton_lock(void)
 inline static void singleton_unlock(void)
 {
 #ifdef MBED_CONF_RTOS_PRESENT
-    if (!singleton_mutex_id) {
-        // RTOS has not booted yet so no mutex is needed
+    if (osKernelGetState() != osKernelRunning) {
+        // RTOS is not active so no mutex is needed
         return;
     }
     osMutexRelease(singleton_mutex_id);

The interesting part though is that, with this enabled, it seems to do different things in DISCO_L475VG_IOT01A and NUCLEO_F411RE.

Without deep sleep lock tracing, the both application trace following:

Launch rc3 example
Uptime: 2019 Sleep time: 1 Deep Sleep: 1997
Uptime: 4064 Sleep time: 4 Deep Sleep: 3994
Uptime: 6109 Sleep time: 5 Deep Sleep: 5991
Uptime: 8154 Sleep time: 8 Deep Sleep: 7988

If the deep sleep lock tracing is enabled with DISCO_L475VG_IOT01A:

LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
...

and it never seems to end.

NUCLEO_F411RE:

LOCK: Ticker.cpp, ln: 71, lock count: 1
UNLOCK: Ticker.cpp, ln: 42, lock count: 0
Launch rc3 example
LOCK: Ticker.cpp, ln: 71, lock count: 1
Sleep locks held:
[id: Ticker.cpp, count: 1]
UNLOCK: Ticker.cpp, ln: 42, lock count: 0
Sleep locks held:
...
Sleep locks held:
Sleep locks held:
Sleep locks held:
LOCK: SysTimer.cpp, ln: 134, lock count: 1
UNLOCK: SysTimer.cpp, ln: 152, lock count: 0
Uptime: 2197 Sleep time: 0 Deep Sleep: 0
LOCK: Ticker.cpp, ln: 71, lock count: 1
Sleep locks held:
[id: Ticker.cpp, count: 1]
UNLOCK: Ticker.cpp, ln: 42, lock count: 0
Sleep locks held:
Sleep locks held:
Sleep locks held:
...

The Deep Sleep counter does increase very slowly.

@LMESTM
Copy link
Contributor

LMESTM commented Sep 20, 2019

I think that tracing each and every lock / unlock may not always make sense.
One may want to check who holding a lock at a precise time, so it could make sense to have options here:

  • trace every lock / unlock
  • do not trace by default, but offer an API to trace the current lock holders so that users can debug what happens

@jeromecoutant
Copy link
Collaborator

One suggestion - change singleton_lock to check osKernelGetState() == osKernelRunning instead of singleton_mutex_id.

@kjbracey-arm Is it on going ?

@pdemil
Copy link

pdemil commented Oct 11, 2019

One suggestion - change singleton_lock to check osKernelGetState() == osKernelRunning instead of singleton_mutex_id.

Tried this suggestion of @kjbracey-arm on target DISCO_L072CZ_LRWAN1 with toolchain GCC_ARM
Same result as @korjaa above:

LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
LOCK: lp_ticker.c, ln: 363, lock count: 1
UNLOCK: lp_ticker.c, ln: 233, lock count: 0
...

and the application is not functional.

@LMESTM
Copy link
Contributor

LMESTM commented Oct 11, 2019

@pdemil the suggestion above is needed, but is not enough.
There will be hundreds of lock / unlock from lp_ticker every seconds, that is not worth tracing them all, there is a need for possible filtering. What would actually be needed also is to be able to get the status of what drivers are having a lock at some point in time ... this would be a new tracing mechanism to add

@pdemil
Copy link

pdemil commented Oct 16, 2019

hundreds every second? Wow, I agree it is not worth tracing all those lock/unlocks.

I think @korjaa wants deep sleep for a long time (>5 minutes) and no wake-ups every second, so maybe an RTC based wake-up (or 32 bit LP timer) is needed?
In that case the existing MBED_SLEEP_TRACING_ENABLED would make sense, right @LMESTM ?

@LMESTM
Copy link
Contributor

LMESTM commented Oct 16, 2019

@pdemil Yes it could make sense, but still when your application is active, you would probably get the many lock / unlock iterations.

EDIT : actually using a debugger you can get access to the table of the various lock / unlocker user. It is much more efficient than tracing to access the table at any point in time when setting a breakpoint in your application and see who is holding a lock

For avoiding wake-ups every seconds, you'll need to add application specific code as of now - refer to discussion in #10211

@MarceloSalazar
Copy link

@LMESTM Just checking whether this issue is still valid.
Many things have changed since this was raised.

@MarceloSalazar
Copy link

@LMESTM just a reminder on this issue - seems it's still valid.

@faydr
Copy link

faydr commented May 23, 2020

I'm having similar issues trying to trace sleep locks while developing on my NUCLEO-H743ZI2 board.

In short, to get the sleep tracing to not crash, I need to:

  1. Use non-buffered serial for the console.
  2. Enable "MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER=1", which also has the annoying side-effect of having the board never actually sleep.

@LMESTM
Copy link
Contributor

LMESTM commented May 25, 2020

@MarceloSalazar @faydr I've suggested a few way forward and solutions don't seem to be STM32 related:

  1. Changes done here [OOB-5.14][DISCO_L475VG_IOT01A] defining MBED_SLEEP_TRACING_ENABLED with LPTICKER fails in mutex ISR error. #11497 (comment) should be upstreamed in a Pull Request by mbed-os core team I guess.
  2. What would actually be needed also is to be able to get the status of what drivers are having a lock at some point in time ... this would be a new tracing mechanism to add => change request for a new API
    So far as I know this ticket now actually depends on the 2 above possible changes ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants