
cpu/native: fix race in thread_yield_higher() #10891

Merged
merged 2 commits into RIOT-OS:master from fix_native_thread_yield_higher
Feb 1, 2019

Conversation

kaspar030
Contributor

@kaspar030 kaspar030 commented Jan 28, 2019

Contribution description

Error case:
1. thread_yield_higher() stores the thread's ucontext
2. creates an "isr ucontext" for isr_thread_yield, switches to it

Case 1: no signals are pending, continues in isr_thread_yield()
3a. sched_run is called
4a. return to sched_active_thread ucontext

Case 2: signals pending (the crashing scenario), continues in native_irq_handler()
3b. handles signals
4b. if sched_context_switch_request is set, call sched_run
5b. return to sched_active_thread ucontext

If sched_context_switch_request is not set, step 4b skips the call to sched_run(), leading to a possible return into a non-ready thread.

Fixes #10881.
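
The essence of the fix, as discussed below, is to set sched_context_switch_request unconditionally before switching to the ISR context. A simplified sketch of the resulting function, not the literal diff: isr_context and isr_stack are placeholders for native's internal ISR ucontext objects, and the in-ISR branch and IRQ masking are omitted.

#include <ucontext.h>

static ucontext_t isr_context;   /* placeholder for native's ISR ucontext */
static char isr_stack[8192];     /* placeholder for native's ISR stack    */

void thread_yield_higher(void)
{
    /* the fix: always request a context switch, so that whichever path the
     * ISR context takes (isr_thread_yield() or native_irq_handler()) ends
     * up calling sched_run() */
    sched_context_switch_request = 1;

    /* (sketched, unchanged) save the thread context and switch to an
     * "isr context" that runs isr_thread_yield() */
    ucontext_t *ctx = (ucontext_t *)sched_active_thread->sp;
    getcontext(&isr_context);
    isr_context.uc_stack.ss_sp = isr_stack;
    isr_context.uc_stack.ss_size = sizeof(isr_stack);
    makecontext(&isr_context, isr_thread_yield, 0);
    swapcontext(ctx, &isr_context);
}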

Testing procedure

See #10881 and #10908

Issues/PRs references

Fixes #10881.
Fixes #6123.

@kaspar030 kaspar030 added the Type: bug, Platform: native, and Impact: major labels Jan 28, 2019
@kaspar030
Contributor Author

kaspar030 commented Jan 28, 2019

This is pretty central in native, so I guess a couple of eyes need to take a look...

@kaspar030
Contributor Author

@LudwigKnuepfer maybe you could take a look?

@kaspar030
Contributor Author

This fixes #6123 for me!

@miri64
Member

miri64 commented Jan 28, 2019

I'm rather wondering if

diff --git a/cpu/native/native_cpu.c b/cpu/native/native_cpu.c
index 2629e55..79534e3 100644
--- a/cpu/native/native_cpu.c
+++ b/cpu/native/native_cpu.c
@@ -142,7 +142,7 @@ void isr_cpu_switch_context_exit(void)
     ucontext_t *ctx;
 
     DEBUG("isr_cpu_switch_context_exit\n");
-    if ((sched_context_switch_request == 1) || (sched_active_thread == NULL)) {
+    if (sched_active_thread == NULL) {
         sched_run();
     }

wouldn't be a more valid fix, because it basically also creates the situation your fix creates:

  • isr_cpu_switch_context_exit() is only called by cpu_switch_context_exit() (either with a context switch to ISR when not in ISR, or directly when in ISR)
  • cpu_switch_context_exit() is called after all signal handlers have run in native_irq_handler(), and in sched_task_exit(), which is called when a thread ends (so the user probably also wants to switch contexts), and also in kernel_init() for some reason...
  • so isr_cpu_switch_context_exit() should only be called when a context switch was requested, right?

@miri64
Member

miri64 commented Jan 28, 2019

Some historical research:

@kaspar030
Contributor Author

if (sched_active_thread == NULL)
wouldn't be a more valid fix.

I don't think so, as it would not call sched_run() anymore (sched_active_thread is only NULL while threading is starting up). Always calling sched_run() there would also be a "fix", but it would probably call the scheduler more often than needed.

@miri64
Member

miri64 commented Jan 28, 2019

The fix I proposed would cause the scheduler to be called only when a thread exits, so it obviously was bullshit. Also, since the title of #229, where this check was introduced, mentions segfaults, we probably shouldn't mess with that.

@kaspar030
Contributor Author

This fixes #6123 for me!

Stable for 30 min now, running: sudo true; for _ in $(seq 50); do sudo ping -s1392 -f "${TARGET}" & done

This bug has haunted my benchmarking efforts on native for years!

@miri64
Member

miri64 commented Jan 28, 2019

Penny for your thoughts on this though #6660 (comment)

@miri64
Member

miri64 commented Jan 28, 2019

This bug has haunted my benchmarking efforts on native for years!

Next time use some elbow grease like I did ;D.

@kaspar030
Contributor Author

Penny for your thoughts on this though #6660 (comment)

You mean "at the end of native's IRQ handler (native_irq_handler()) the function cpu_switch_context_exit() gets called"?

I think the naming is a little off.

cpu_switch_context_exit() basically jumps to another context without saving (or better: while completely ignoring) the current one. That is why thread_yield_higher(), after saving the context (via swapcontext()), ends up there; it is why kernel_init() uses it to kick off scheduling; and it is why the function is put onto a new thread's stack as the return address, so that when a thread function exits, it ends up calling it.
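
To illustrate that point, a conceptual sketch only (the real cpu/native implementation additionally deals with being called from ISR context):

void cpu_switch_context_exit(void)
{
    /* nothing of the current context is saved, on purpose */
    sched_run();  /* select the next ready thread */

    ucontext_t *ctx = (ucontext_t *)sched_active_thread->sp;
    setcontext(ctx);  /* jump into it; this call never returns */
}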

@miri64
Member

miri64 commented Jan 28, 2019

Ok my final assessment for today for this PR is that sched_run might be called unnecessarily in some cases, which on native I can live with. I'll run some tests tomorrow and then give my final verdict.

@miri64 miri64 added the CI: ready for build label Jan 28, 2019
@miri64
Member

miri64 commented Jan 28, 2019

I tested most thread-related tests on native locally again. All succeeded. The rest I leave to Murdock.

@miri64
Member

miri64 commented Jan 28, 2019

I was, however, not able to find an isolated test case yet :-(.

@miri64
Member

miri64 commented Jan 28, 2019

Maybe because I don't fully understand how and when sched_context_switch_request can be 0 and when it is 1.

@miri64
Member

miri64 commented Jan 28, 2019

I was also wondering why not all platforms need to set it in thread_yield_higher(), only native. Apart from native, only atmega_common seems to do that, and only in the else branch of an irq_is_in() == 0 check:

void thread_yield_higher(void)
{
    if (irq_is_in() == 0) {
        __context_save();
        sched_run();
        __context_restore();
        __asm__ volatile ("ret");
    }
    else {
        sched_context_switch_request = 1;
    }
}

(edit: but maybe this was just modeled after native)

All others just seem to be content to trigger a software interrupt, except for msp430_common, which seems to implement its own version of context switching:

__attribute__((naked)) void thread_yield_higher(void)
{
    __asm__("push r2"); /* save SR */
    __disable_irq();
    __save_context();
    /* have sched_active_thread point to the next thread */
    sched_run();
    __restore_context();
    UNREACHABLE();
}

Member

@miri64 miri64 left a comment


Anyway, tests pass and conceptually the change makes sense to me and is hard to argue with. So it gets my soft ACK (which may be counted as 1 approval, given the "Impact: major" label, if others more knowledgeable in the deep dungeons of RIOT's scheduling approve).

@miri64 miri64 requested a review from OlegHahm January 28, 2019 22:13
@miri64
Member

miri64 commented Jan 28, 2019

Maybe @OlegHahm can also have a look, since he reviewed #6660?

@miri64
Member

miri64 commented Jan 28, 2019

Apart from native, only atmega_common seems to do that, and only in the else branch of an irq_is_in() == 0 check:

void thread_yield_higher(void)
{
    if (irq_is_in() == 0) {
        __context_save();
        sched_run();
        __context_restore();
        __asm__ volatile ("ret");
    }
    else {
        sched_context_switch_request = 1;
    }
}

(edit: but maybe this was just modeled after native)

This was introduced in #8904, so summoning @ZetaR60 as well.

@miri64 miri64 added this to the Release 2019.01 milestone Jan 28, 2019
@kaspar030
Contributor Author

Maybe because I don't fully understand how and when sched_context_switch_request can be 0 and when it is 1.

TL;DR sched_context_switch_request is used to request a (necessarily deferred) context switch from within an ISR.

A context switch can be triggered either from the running task or from within an ISR.
In the former case, it can be done right away: save the current context, run sched_run() to find the next thread, then restore that.

In ISRs, the context switch cannot be done right away. See e.g. this (pseudocode):

void foobar_isr() {
  do_stuff();
  if (whatever > FOO) {
    mutex_unlock(mutex_a_thread_is_waiting_on);
  } 
  do_more_stuff(); 
}

In ISR context, the ISR must finish first. It is not possible to "pause" the ISR context, switch to the thread, then "continue" the ISR context. ISRs must finish. So the thread that would probably be unblocked by the mutex_unlock() call is merely put on the runqueue, and we need to somehow pass along the information that a context switch might be necessary.

This is what "sched_context_switch_request" is used for: pass along the information that a context switch might be necessary after an ISR has finished.

Thing is, different platforms have different concepts.

E.g., on Cortex-M, there is PendSV, which will always be served last. Also, there are banked ISR registers (a second set of registers), so no manual context saving is necessary for serving an ISR. Only when a different user thread should continue to run after all ISRs have finished does the current context need to be saved, the next one selected and then restored, through fiddling with the user-mode stack. sched_context_switch_request is used as a flag to let the architecture code know, at the end of an ISR, that this work should be done and thus PendSV should be triggered.

Cortex-M checks the flag here:

static inline void cortexm_isr_end(void)
{
    if (sched_context_switch_request) {
        thread_yield_higher();
    }
}

Other platforms do it differently, but all are using the flag.

@miri64
Member

miri64 commented Jan 28, 2019

Ok, I think I was able to figure out most of that myself, so thanks for the confirmation and for putting it into words. However, I am still not really able to track how the previous code was wrong / led to invalid unblocking, which is why I don't feel 100% confident in this bug fix.

@miri64 miri64 added the Community: Hack'n'ACK candidate label Jan 29, 2019
@kaspar030
Contributor Author

However, I am still not really able to track how the previous code was wrong / led to invalid unblocking, which is why I don't feel 100% confident in this bug fix.

I'll try to put it in different words.

Usually, when thread_yield_higher() is called on native outside of an ISR, an "isr context" is created with makecontext(), set up to execute isr_thread_yield(). swapcontext() then saves the thread context and jumps into the newly created "isr context", launching isr_thread_yield().

Now, if there are no signals pending, the scheduler is run, then a context switch to the then-current sched_active_thread is initiated. All is good.

But it is possible that signals queue up (after irq_disable() in thread_yield_higher()), making isr_thread_yield() call native_irq_handler(). That function does not return; it switches back to thread context by itself after handling the signals. The jump back to thread context is almost identical to how isr_thread_yield() does it (had there been no signals pending). The main difference is that native_irq_handler() does not always call sched_run(), but only if sched_context_switch_request was set.

This led to the crash at hand: msg_receive() removes the current thread from the runqueue, then calls thread_yield_higher() expecting to be scheduled away. But on the off chance that a signal pops up at the wrong time (and its ISR does not set sched_context_switch_request = 1), the scheduler doesn't get called, and msg_receive() returns without receiving.

The fix works by always setting sched_context_switch_request to 1, so that, should isr_thread_yield() degrade to native_irq_handler(), the scheduler is still called.
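
To make the two paths concrete, a simplified sketch of isr_thread_yield() as described above (not the actual cpu/native source; signals_are_pending() is a stand-in for native's real pending-signal check):

void isr_thread_yield(void)
{
    if (signals_are_pending()) {
        /* path 2: native_irq_handler() does not return; it handles the
         * signals and only calls sched_run() if
         * sched_context_switch_request is set */
        native_irq_handler();
    }

    /* path 1: always run the scheduler, then return to the (possibly new)
     * active thread */
    sched_run();
    ucontext_t *ctx = (ucontext_t *)sched_active_thread->sp;
    setcontext(ctx);
}

With the fix, sched_context_switch_request is already set whenever path 2 is taken, so both paths end up running the scheduler.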

@miri64
Member

miri64 commented Jan 29, 2019

Ok, so the tests I started to write yesterday weren't all that wrong (triggering a timer at very short intervals [which causes a SIGALRM on native] while a message exchange between two threads is happening); they just were not able to trigger the race condition.

@kaspar030
Contributor Author

(triggering a timer at very short intervals [which causes a SIGALRM on native] while a message exchange between two threads is happening); they just were not able to trigger the race condition.

Actually yes, if the timer does not trigger a context switch (e.g., no mutex unlock or msg send)!

@miri64 miri64 mentioned this pull request Jan 29, 2019
@miri64
Member

miri64 commented Jan 30, 2019

Actually yes, if the timer does not trigger a context switch (e.g., no mutex unlock or msg send)!

Aha, that was what was missing then :-).

@miri64
Member

miri64 commented Jan 30, 2019

Ok, I tried that, but I am still not really able to reproduce the issue. Probably the receiving thread needs to do a little more than just print something.

@kaspar030
Contributor Author

https://github.com/miri64/RIOT/blob/tests/new/i10881/tests/thread_msg_block_race/main.c

Can you try to make the timeouts of the timer random, and make sure it does not cause a context switch (remove the mutex_unlock())? Also I think thread_1 can go.

@kaspar030
Contributor Author

Aha, that was what missing then :-).

Ah, sorry, I didn't get you. No context switch was right...

@miri64
Member

miri64 commented Jan 30, 2019

Ok, I did that now and also replaced msg_try_send() with msg_send() (otherwise the main thread starves). However, now sending stops after trying 3 times... :(

@miri64
Member

miri64 commented Jan 30, 2019

I got it! Let me write a quick test script and I'll provide a PR.
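
For context, a rough sketch of the kind of stress test discussed above; this is not the actual tests/thread_msg_block_race code from #10908, and the stack size, priorities and timer intervals are made up. A higher-priority worker blocks in msg_receive(), main floods it with msg_send(), and a timer (SIGALRM on native) fires at short pseudo-random intervals without requesting a context switch:

#include "thread.h"
#include "msg.h"
#include "xtimer.h"
#include "random.h"

static char worker_stack[THREAD_STACKSIZE_DEFAULT];
static xtimer_t timer;

/* rearm the timer with a short random offset; deliberately does nothing
 * that could request a context switch */
static void timer_cb(void *arg)
{
    (void)arg;
    xtimer_set(&timer, random_uint32_range(100, 1000));
}

static void *worker(void *arg)
{
    (void)arg;
    while (1) {
        msg_t m;
        /* before the fix, this could return without a message when the
         * scheduler was skipped on the way out of thread_yield_higher() */
        msg_receive(&m);
    }
    return NULL;
}

int main(void)
{
    kernel_pid_t pid = thread_create(worker_stack, sizeof(worker_stack),
                                     THREAD_PRIORITY_MAIN - 1, 0,
                                     worker, NULL, "worker");

    timer.callback = timer_cb;
    xtimer_set(&timer, 1000);

    msg_t m = { .content = { .value = 0 } };
    while (1) {
        msg_send(&m, pid);
    }

    return 0;
}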

Member

@miri64 miri64 left a comment


Works with #10908 and I did not encounter any problems in the other thread tests with native.

Member

@OlegHahm OlegHahm left a comment


I would ACK this PR even though I have to admit that I haven't dug into all the consequences this change might possibly have. However, we have been carrying this bug with us for more than two years now, and apparently this PR fixes it and does not break any other standard test. Plus, it is basically a one-liner.

@emmanuelsearch
Member

emmanuelsearch commented Feb 1, 2019

@kaspar030 so we're green?
(which means start preparing a backport hehe ;)

@kaspar030
Contributor Author

@kaspar030 so we're green?

Yes!

Many many thanks to @miri64 for reducing #6123 to msg_receive() and debugging this!

(which means start preparing a backport hehe ;)

Could someone? Isn't there a script?

@kaspar030 kaspar030 merged commit aa90f93 into RIOT-OS:master Feb 1, 2019
@kaspar030 kaspar030 added the Process: needs backport label Feb 1, 2019
@kaspar030 kaspar030 deleted the fix_native_thread_yield_higher branch February 1, 2019 15:11
@miri64
Member

miri64 commented Feb 1, 2019

Yepp, but it's not merged yet: #8968 (you can still use it from @bergzand's branch, though)

@miri64
Member

miri64 commented Feb 1, 2019

Backport provided in #10921
