Fixes for tickless and LPTICKER_DELAY_TICKS #7524

c1728p9 · 2018-07-16T23:53:37Z

Description

This PR fixes problems that occur when both tickless and LPTICKER_DELAY_TICKS are turned on. The two primary problems are:

1. Tests which use the microsecond ticker directly interfere with the lp ticker wrapper.
2. The Timeout object used by the lp ticker wrapper locks deep sleep at various times, causing tests which assert that deep sleep is allowed to fail.

The tests susceptible to problem 1 are:

mbed_hal/common_tickers
mbed_hal/sleep

The tests susceptible to problem 2 are:

mbed_drivers/sleep_lock
mbed_drivers/timeout
mbed_hal/lp_ticker
mbed_hal/rtc
mbed_hal/sleep
mbed_hal/sleep_manager
mbed_hal/sleep_manager_racecondition

This PR fixes the problems mentioned above. Unfortunately, there are still additional problems which need to be addressed. Further fixes require:

- Common ticker handling should be suspended before calling the us/lp ticker directly. This is important for the tests below, but is not currently causing failures.
  - mbed_hal/common_tickers
  - mbed_hal/common_tickers_freq
  - mbed_hal/lp_ticker
  - mbed_hal/sleep
- The Serial driver must unlock deep sleep only after all bytes have been sent, to prevent bytes from being dropped by entering deep sleep. This requires a HAL extension. I am seeing test failures due to this on the NUCLEO_F401RE when tickless is enabled.

Pull request type

[x] Fix
[ ] Refactor
[ ] New target
[ ] Feature
[ ] Breaking change

jeromecoutant · 2018-07-17T12:17:28Z

Hi Russ

I have started a full non regression for each STM32 family.
I will provide this status later.

In parallel, I quickly checked TICKLESS mode with F401. I agree status is much better.

For the moment, I only added some DeepSleepLock in tests-mbed_drivers-lp_ticker.

mprse

Potential inconsistency between result returned by sleep_manager_can_deep_sleep() and action performed by sleep_manager_sleep_auto().

mprse · 2018-07-17T13:02:02Z

hal/mbed_sleep_manager.c

@@ -196,7 +203,11 @@ void sleep_manager_sleep_auto(void)
 #ifdef MBED_DEBUG
    hal_sleep();
 #else
-    if (sleep_manager_can_deep_sleep()) {
+    if (sleep_manager_can_deep_sleep()
+#if DEVICE_LPTICKER && (LPTICKER_DELAY_TICKS > 0)


If lp ticker's Timeout object is in use (i.e. lp_ticker_get_timeout_pending() returns true) and deep-sleep is not locked by other features (i.e. lock_count==1), then:

sleep_manager_can_deep_sleep() returns true - indicates that we can put board into deep-sleep mode.

sleep_manager_sleep_auto() puts the board into the sleep mode (not deep-sleep).

It looks like a design issue since we cannot rely on the sleep_manager_can_deep_sleep() function.
From the user perspective it can be very misleading. From the testing perspective, sleep_manager_can_deep_sleep() is used in tests to check whether the current environment state allows entering the deep-sleep mode that is to be tested. So in some cases the test may exercise sleep mode instead of the required deep-sleep mode.
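The mismatch described above can be reproduced with a small host-side model. This is only a sketch: the `model_` functions and their parameters are hypothetical stand-ins for the sleep manager state as described in this comment, not the mbed-os sources.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODEL_SLEEP, MODEL_DEEP_SLEEP } model_mode_t;

/* Hypothetical model of the query: the lock held by the wrapper's
 * Timeout is discounted, so the answer can be "allowed" even though
 * the lock counter is 1. */
bool model_can_deep_sleep(uint32_t lock_count, bool timeout_pending)
{
    if (timeout_pending && (lock_count > 0)) {
        lock_count--;
    }
    return lock_count == 0;
}

/* Hypothetical model of the action: deep sleep is entered only when
 * the query allows it AND the wrapper's Timeout is not pending. */
model_mode_t model_sleep_auto(uint32_t lock_count, bool timeout_pending)
{
    if (model_can_deep_sleep(lock_count, timeout_pending) && !timeout_pending) {
        return MODEL_DEEP_SLEEP;
    }
    return MODEL_SLEEP;
}
```

With `lock_count == 1` and the Timeout pending, the query reports that deep sleep is allowed while the action selects plain sleep, which is the inconsistency raised here.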

It looks like a design issue since we cannot rely on the sleep_manager_can_deep_sleep() function.

Where does this come from?

I agree with @mprse, sleep_manager_can_deep_sleep() shouldn't be updated this way. I do understand the purpose, but think this change should be test specific, not general.

Please take a look at these changes c85856b.

Some possible alternatives I can think of are:

Modify tests to suspend the OS before running the tests - this may be intrusive as you can no longer use most OS primitives

Assert in tests using a different function - something like sleep_manager_can_deep_sleep_test()

Assert something different in the tests, such as time actually spent in deep sleep

Do you have any thoughts or preferences on this @mprse and @fkjagodzinski?

In my opinion the first bullet is OK, but only for low-level (HAL) tests, where target-specific drivers are tested and the OS can be omitted. For driver-layer tests I think the OS should be working normally, and tests should take into account the possible influence of the OS, such as systick handling, task scheduling, ticker interrupts, etc. So I suggest using the second bullet for upper-layer tests.

Since one of the problems is:

The Timeout object used by the lp ticker wrapper locks deep sleep at various times causing tests which assert that deep sleep is allowed to fail

we could modify the tests to wait for deep sleep to be unlocked before calling sleep(). But even this would require using a critical section, so it would also introduce more complexity to the tests.

Adding sleep_manager_can_deep_sleep_test() seems straightforward, but such a change would mean we accept that tests are occasionally run in an environment different from the one they were written for. Simply put, the test result can sometimes be meaningless.

In my opinion the most honest solution would be adding

    #if DEVICE_LPTICKER && (LPTICKER_DELAY_TICKS > 0)
    #error [NOT_SUPPORTED] Test not supported for this target
    #endif

or executing affected test cases conditionally based on LPTICKER_DELAY_TICKS > 0.

mprse · 2018-07-17T13:07:55Z

hal/mbed_sleep_manager.c

-    return deep_sleep_lock == 0 ? true : false;
+    uint32_t lock_count = deep_sleep_lock;
+#if DEVICE_LPTICKER && (LPTICKER_DELAY_TICKS > 0)
+    if (lp_ticker_get_timeout_pending() && (lock_count > 0)) {


(lock_count > 0) - is this condition needed?
When the lp ticker's Timeout object is in use, deep sleep is locked (i.e. lock_count > 0).

The check is to prevent underflow. If this function is wrapped in a critical section then the check wouldn't be needed, as lock_count would always be at least 1 if lp_ticker_get_timeout_pending() returned true.
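The underflow the guard protects against can be shown in isolation. This sketch assumes the truncated branch subtracts the wrapper's lock from the snapshot, which the hunk above does not show; both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: discount the lock held by the wrapper's Timeout,
 * keeping the (lock_count > 0) guard from the diff. */
uint32_t effective_locks_guarded(uint32_t lock_count, bool timeout_pending)
{
    if (timeout_pending && (lock_count > 0)) {
        lock_count--;
    }
    return lock_count;
}

/* Same helper without the guard, for comparison only. */
uint32_t effective_locks_unguarded(uint32_t lock_count, bool timeout_pending)
{
    if (timeout_pending) {
        lock_count--;   /* 0 - 1 wraps to UINT32_MAX for an unsigned type */
    }
    return lock_count;
}
```

If the snapshot is taken as 0 and the Timeout becomes pending just afterwards (possible outside a critical section), the unguarded version wraps to `UINT32_MAX` and would report deep sleep as locked; the guarded version stays at 0.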

jeromecoutant · 2018-07-17T14:10:01Z

More detailed status with NUCLEO_F401RE:

I applied a few patches:
c4f0eaf

And I still got issues with:

tests-mbed_drivers-lp_timeout

[1531832056.30][CONN][RXD] >>> Running case #10: 'Zero delay (attach_us)'...
[1531832056.37][CONN][RXD] :213::FAIL: Expected 1 Was 0

tests-mbedmicro-rtos-mbed-systimer

[1531833600.45][CONN][RXD] >>> Running case #5: 'Schedule zero ticks'...
[1531833600.53][CONN][RXD] :229::FAIL: Expected 1 Was 0

mprse · 2018-07-18T12:55:37Z

Regarding lp ticker wrapper I have doubts if this is a correct approach. According to the description it is designed for cases when:

Interrupt may not fire if set earlier than LPTICKER_DELAY_TICKS low power clock cycles.
Setting the interrupt back-to-back will block.

So the first bullet is the case when we have some hardware limitations like on NRF51_DK board:

If the COUNTER is N, writing N or N+1 to a CC register may not trigger a COMPARE event.

In this case the nearest interrupt which can be set is current tick + 2. This can be handled using lp ticker wrapper and LPTICKER_DELAY_TICKS = 2, but currently is handled by ensuring that value written to CC is greater than or equal to current tick + 2 in the Nordic NRF5x lp ticker driver:

mbed-os/targets/TARGET_NORDIC/TARGET_NRF5x/common_rtc.c

Lines 198 to 205 in c29fe89

    
    const uint32_t now = nrf_rtc_counter_get(COMMON_RTC_INSTANCE);
    if (now == ticks_count ||
        RTC_WRAP(now + 1) == ticks_count) {
        ticks_count = RTC_WRAP(ticks_count + 2);
    }
    nrf_rtc_cc_set(COMMON_RTC_INSTANCE, cc_channel, ticks_count);

Personally I prefer the current option, since using the lp ticker wrapper, which relies on a Timeout object in the HAL layer, has too many side effects, as we can see. Additionally, since this is a hardware-specific limitation, it should be handled in the target layer.
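To make the effect of the driver snippet quoted above concrete, here is a host-side sketch of the same adjustment. It assumes a 24-bit counter (as on the nRF5x RTC) and uses a hypothetical `MODEL_RTC_WRAP` mask in place of the driver's `RTC_WRAP` macro.

```c
#include <stdint.h>

/* Assumption: 24-bit counter; the mask stands in for RTC_WRAP. */
#define MODEL_RTC_MASK 0x00FFFFFFu
#define MODEL_RTC_WRAP(val) ((val) & MODEL_RTC_MASK)

/* Return the compare value that would be written for a requested
 * ticks_count, given the current counter value `now`. */
uint32_t model_compare_value(uint32_t now, uint32_t ticks_count)
{
    /* A request for N or N+1 may never match, so push it two ticks out. */
    if (now == ticks_count || MODEL_RTC_WRAP(now + 1) == ticks_count) {
        ticks_count = MODEL_RTC_WRAP(ticks_count + 2);
    }
    return ticks_count;
}
```

For example, with the counter at 100, a request for 100 becomes 102 and a request for 101 becomes 103, while a request for 102 is written unchanged. At the top of the range, a request for 0 while the counter reads 0xFFFFFF becomes 2.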

Regarding second bullet I'm not sure if I correctly understand the problem. Can you provide some use case for this?

c1728p9 · 2018-07-18T15:20:59Z

Hi @mprse, the primary problem is the second point - some low power ticker hardware cannot be written to twice within a certain number of low power clock cycles (LPTICKER_DELAY_TICKS ticks).

1. The OS scheduler enters tickless mode and sets the low power ticker to wake 500ms in the future
2. An interrupt fires right after and wakes the OS early
3. The OS uses the low power ticker to schedule the next tick (1ms in the future)

If the next tick time is written to the low power ticker immediately the new match time will be dropped, as the hardware is still trying to set the first match value. Options to work around this are:

1. Block in lp_ticker_set_interrupt until the new value is written
2. Don't allow low power tickers which require a delay
3. Use a Timeout object to write the second match value after LPTICKER_DELAY_TICKS ticks have passed

Option 1. is what was done prior to the ticker HAL specification. The problem with this is that interrupt latency is impacted. An example in practice is the NUCLEO_F401RE. It has a low power ticker clock speed of 8 kHz and requires 3 clock cycles between writes. This means that back-to-back writes will block for 366 us in a critical section.
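The 366 us figure can be checked with a short calculation. This sketch assumes that "8 kHz" means 8192 Hz (a 32768 Hz clock divided by 4), which is not stated in the comment; `lp_cycles_to_us` is a hypothetical helper.

```c
#include <stdint.h>

/* Time in microseconds taken by `cycles` low power clock cycles at
 * `clock_hz`, rounded down. */
uint32_t lp_cycles_to_us(uint32_t cycles, uint32_t clock_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000u) / clock_hz);
}
```

Under that assumption, `lp_cycles_to_us(3, 8192)` gives 366. With an exact 8000 Hz clock the same three cycles would take 375 us.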

Option 2. was considered, but it was found that a large number of targets have this low power ticker hardware behavior. Vendors with affected targets include ST, Nuvoton and OnSemi, to name a few. Because this would rule out tickless for many targets, this option was less than ideal as well. Note - the large number of vendors/targets impacted was also one of the main reasons this was done in a common layer.

Option 3. both preserves interrupt latency and allows targets with this low power ticker behavior to still implement the low power ticker API. Compared to Option 1, power savings are better since the CPU can go into sleep mode rather than busy waiting. Unfortunately it does add complexity and interferes with deep sleep tests.
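The idea behind Option 3 can be sketched as a small state machine. Everything here is hypothetical (the struct, the function names and the delay value of 3); the actual wrapper uses a Timeout object to perform the deferred write, which this host-side model replaces with an explicit call.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_DELAY_TICKS 3u   /* hypothetical value of LPTICKER_DELAY_TICKS */

typedef struct {
    uint32_t last_write_tick;  /* when the match register was last written */
    uint32_t hw_match;         /* value currently in the match register */
    uint32_t pending_match;    /* value waiting to be written */
    bool     deferred;         /* true while a deferred write is outstanding */
} model_wrapper_t;

/* Request a new match value at time `now`. Returns true if the hardware
 * was written immediately, false if the write was deferred. */
bool model_set_interrupt(model_wrapper_t *w, uint32_t now, uint32_t match)
{
    if ((uint32_t)(now - w->last_write_tick) >= MODEL_DELAY_TICKS) {
        w->hw_match = match;
        w->last_write_tick = now;
        w->deferred = false;
        return true;
    }
    /* Too soon after the previous write: remember the value and let a
     * timer deliver it later instead of blocking here. */
    w->pending_match = match;
    w->deferred = true;
    return false;
}

/* Called by the model's timer once MODEL_DELAY_TICKS have elapsed. */
void model_deferred_write(model_wrapper_t *w, uint32_t now)
{
    if (w->deferred) {
        w->hw_match = w->pending_match;
        w->last_write_tick = now;
        w->deferred = false;
    }
}
```

The caller never blocks: a request that arrives too early returns at once, and the match register is updated later by the deferred write. While that write is outstanding the real wrapper's Timeout holds a deep sleep lock, which is the source of the test interference discussed in this PR.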

@mprse do you know of any alternatives which could simplify this? These were the only options that I could think of, but if there is a better way to do it, that would be great.

mprse · 2018-07-19T09:30:12Z

@c1728p9 Thanks Russ for detailed explanation of the problem. It looks like option number 3 despite its complexity is the best solution for now.

We have discussed possible solutions for the deep-sleep/sleep issue with @fkjagodzinski. Please consider the following option:

sleep_manager_can_deep_sleep(): returns true in case when only Timeout used by the lp ticker wrapper stops us from going into deep-sleep mode (as it is done now).
sleep_manager_sleep_auto(): when we can go into deep-sleep (sleep_manager_can_deep_sleep() returns true), but Timeout object of the lp ticker wrapper is in use (lp_ticker_get_timeout_pending() returns true), then we are waiting in busy loop outside the critical section until timeout is not pending (lp_ticker_get_timeout_pending() returns false) and then go into deep-sleep.

Since the sleep_manager_sleep_auto() function shall automatically decide which mode should be selected, it is a good place to handle this specific case. Additionally, it would solve the problem with the sleep tests.
I'm not sure about possible side effects of this solution, but if we were just about to enter deep-sleep mode anyway, we can afford to waste ~366 us (there is nothing else to do at that point).
We will lose time which we could have spent in sleep mode, but for the rest of the planned time we will be in deep-sleep, which saves more power.

c1728p9 · 2018-07-26T04:17:37Z

@mprse @fkjagodzinski I updated this PR to address the feedback you gave me. Specifically, I added the function sleep_manager_can_deep_sleep_test_check(), which checks to see if deep sleep is allowed within 2ms, and updated the tests to use it. This way, if deep sleep is blocked temporarily for any reason (low power ticker wrapper, buffered uart bytes, or some other unforeseen reason), the tests will not start failing again. Let me know what you think.
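A check of this kind amounts to polling a predicate inside a time window. The following host-side sketch illustrates that pattern only; the function-pointer interface, the `model_` and `fake_` names and the fake clock are assumptions made so the example is self-contained, and it is not the code added by this PR.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t (*model_clock_us_fn)(void);
typedef bool (*model_predicate_fn)(void);

/* Poll `can_deep_sleep` until it returns true or `window_us` elapses. */
bool model_deep_sleep_allowed_within(model_clock_us_fn now_us,
                                     model_predicate_fn can_deep_sleep,
                                     uint32_t window_us)
{
    const uint32_t start = now_us();
    /* Unsigned subtraction keeps the elapsed time correct across wrap. */
    while ((uint32_t)(now_us() - start) < window_us) {
        if (can_deep_sleep()) {
            return true;
        }
    }
    return false;
}

/* Fake clock and predicate so the sketch can run on a host. */
static uint32_t fake_time_us;
static uint32_t fake_unlock_at_us;

/* Each read of the fake clock advances time by 100 us. */
uint32_t fake_now_us(void)
{
    fake_time_us += 100;
    return fake_time_us;
}

/* Deep sleep becomes "allowed" once the fake clock reaches the unlock time. */
bool fake_can_deep_sleep(void)
{
    return fake_time_us >= fake_unlock_at_us;
}
```

With a 2000 us window, a lock that clears after 500 us is tolerated, while one that is still held after the window ends makes the check fail.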

mprse

@c1728p9 Looks good to me.

What bothers me is that a test for sleep_manager_can_deep_sleep_test_check() is missing (even though this function is intended for use in testing). I think it should be added, or a task for this should be created.
I also suggest adding a note to the description of sleep_manager_can_deep_sleep_test_check() that the timeout is set to 2 ms.

fkjagodzinski · 2018-07-30T12:18:27Z

@mprse @c1728p9 I've got an open PR with sleep manager tests update (#7582), I can add a few test cases for that new function.

cmonr · 2018-07-31T17:00:46Z

/morph build

cmonr · 2018-07-31T19:08:46Z

hal/mbed_lp_ticker_wrapper.h

+/**
+ * Wrapper around lp_ticker_set_interrupt to prevent blocking
+ *
+ * Problems this function is solving:


mbed-ci · 2018-07-31T19:35:05Z

Build : SUCCESS

Build number : 2710
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/7524/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

mbed-ci · 2018-07-31T22:35:02Z

Exporter Build : SUCCESS

Build number : 2339
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/exporter/7524/

mbed-ci · 2018-08-01T07:27:52Z

Test : FAILURE

Build number : 2441
Test logs :http://mbed-os-logs.s3-website-us-west-1.amazonaws.com/?prefix=logs/7524/2441

When the define LPTICKER_DELAY_TICKS is set, deep sleep can be randomly disallowed when using the low power ticker. This is because a Timeout object, which locks deep sleep, is used to protect against back-to-back writes to lp tickers which can't support them. This causes tests which assert that deep sleep is allowed to fail intermittently. To fix this intermittent failure, this patch adds the function sleep_manager_can_deep_sleep_test_check(), which checks if deep sleep is allowed over a duration. It updates all the tests to use sleep_manager_can_deep_sleep_test_check() rather than sleep_manager_can_deep_sleep() so the tests work even if deep sleep is spuriously blocked.

Update the low power ticker wrapper code so it does not violate any properties of the ticker specification. Specifically, this patch fixes the following:
- Prevent spurious interrupts
- Fire the interrupt only when the ticker time increments to or past the value set by ticker_set_interrupt
- Disable interrupts when ticker_init is called

cmonr · 2018-08-17T14:32:56Z

/morph build

mbed-ci · 2018-08-17T15:26:58Z

Build : FAILURE

Build number : 2820
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/7524/

Fix the HAL common_tickers and sleep tests so they work correctly when the define LPTICKER_DELAY_TICKS is set.

To handle timer rollovers, the test tests-mbed_hal-common_tickers_freq calls intf->set_interrupt(0). For this to work correctly, the ticker implementation must fire an interrupt on every rollover even though intf->set_interrupt(0) was called only once. Whether an interrupt will fire only once or multiple times is undefined behavior which cannot be relied upon. To avoid this undefined behavior, this patch continually schedules an interrupt and performs overflow detection on every read. This also removes the possibility of race conditions due to overflowCounter incrementing at the wrong time.
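Overflow detection on read can be sketched as follows. This is a generic illustration with hypothetical names, not the test's code; it assumes the counter is read at least once per rollover period, which is presumably what the continually scheduled interrupt is there to guarantee.

```c
#include <stdint.h>

typedef struct {
    uint32_t bits;        /* width of the hardware counter, 1..32 */
    uint32_t last_raw;    /* raw value seen on the previous read */
    uint64_t high;        /* accumulated rollovers, already shifted */
} model_ext_counter_t;

/* Extend a raw counter reading to 64 bits.
 * Assumes at most one rollover between consecutive reads. */
uint64_t model_ext_read(model_ext_counter_t *c, uint32_t raw)
{
    if (raw < c->last_raw) {
        /* The raw value went backwards, so the counter wrapped. */
        c->high += (uint64_t)1 << c->bits;
    }
    c->last_raw = raw;
    return c->high + raw;
}
```

Because the rollover is detected from the values being read rather than from a separately incremented counter, there is no window in which the reading and the overflow count can disagree.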

Only reschedule the Timeout object in the low power ticker wrapper if it is not already pending.

cmonr · 2018-08-17T22:15:18Z

/morph build

mbed-ci · 2018-08-18T00:24:49Z

Build : SUCCESS

Build number : 2835
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/7524/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

mbed-ci · 2018-08-18T01:42:08Z

Exporter Build : SUCCESS

Build number : 2464
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/exporter/7524/

mbed-ci · 2018-08-19T01:12:08Z

Test : FAILURE

Build number : 2585
Test logs :http://mbed-os-logs.s3-website-us-west-1.amazonaws.com/?prefix=logs/7524/2585

cmonr · 2018-08-20T20:07:54Z

/morph test

NirSonnenschein · 2018-08-21T08:24:32Z

/morph uvisor-test

mbed-ci · 2018-08-21T09:24:21Z

Test : SUCCESS

Build number : 2598
Test logs :http://mbed-os-logs.s3-website-us-west-1.amazonaws.com/?prefix=logs/7524/2598

c1728p9 mentioned this pull request Jul 16, 2018

lp_timeout test in TICKLESS mode #7328

Closed

0xc0170 requested a review from a team July 17, 2018 08:33

0xc0170 added the needs: review label Jul 17, 2018

0xc0170 requested review from mprse and a team and removed request for a team July 17, 2018 08:35

mprse reviewed Jul 17, 2018

fkjagodzinski mentioned this pull request Jul 23, 2018

Update sleep manager tests #7582

Merged

c1728p9 force-pushed the tickless_fix branch from 0f80394 to 26b73fd Compare July 26, 2018 04:13

mprse approved these changes Jul 26, 2018

fkjagodzinski approved these changes Jul 30, 2018

cmonr added needs: preceding PR and removed needs: preceding PR labels Jul 30, 2018

cmonr added needs: CI release-version: 5.9.5 and removed needs: review labels Jul 31, 2018

cmonr reviewed Jul 31, 2018

cmonr approved these changes Jul 31, 2018

c1728p9 added 2 commits August 17, 2018 09:29

c1728p9 force-pushed the tickless_fix branch from aeebd80 to 959e687 Compare August 17, 2018 14:30

cmonr added needs: CI and removed needs: work labels Aug 17, 2018

cmonr added needs: work and removed needs: CI labels Aug 17, 2018

c1728p9 added 3 commits August 17, 2018 11:58

Fix tests to work with LPTICKER_DELAY_TICKS

f68958d

Fix the HAL common_tickers and sleep tests so they work correctly when the define LPTICKER_DELAY_TICKS is set.

Speed optimization for LowPowerTickerWrapper

dc2e2c0

Only reschedule the Timeout object in the low power ticker wrapper if it is not already pending.

c1728p9 force-pushed the tickless_fix branch from 959e687 to dc2e2c0 Compare August 17, 2018 17:00

cmonr added needs: CI and removed needs: work labels Aug 20, 2018

NirSonnenschein added ready for merge and removed needs: CI labels Aug 21, 2018

cmonr merged commit e02466a into ARMmbed:master Aug 21, 2018

0xc0170 removed the ready for merge label Aug 21, 2018

mattbrown015 mentioned this pull request Aug 22, 2018

Recent LP Ticker Changes Break Tickless #7858

Closed

pan- pushed a commit to pan-/mbed that referenced this pull request Aug 22, 2018

Merge pull request ARMmbed#7524 from c1728p9/tickless_fix

199ad2a

Fixes for tickless and LPTICKER_DELAY_TICKS

c1728p9 mentioned this pull request Sep 17, 2018

Nuvoton: Fix Greentea test common_tickers failed #8030

Merged

TacoGrandeTX mentioned this pull request Nov 27, 2018

Using DEVICE_LPTICKER on STM32L486 #8783

Closed
