
cpu/native: fix race in thread_yield_higher() #10891

Merged
merged 2 commits into RIOT-OS:master from fix_native_thread_yield_higher
Feb 1, 2019

Conversation

kaspar030
Contributor

@kaspar030 kaspar030 commented Jan 28, 2019

Contribution description

Error case:
1. thread_yield_higher() stores the thread's ucontext
2. creates an "isr ucontext" for isr_thread_yield, switches to it

Case 1: no signals are pending, continues in isr_thread_yield()
3a. sched_run is called
4a. return to sched_active_thread ucontext

Case 2: signals pending (the crashing scenario), continues in native_irq_handler()
3b. handles signals
4b. if sched_context_switch_request is set, call sched_run
5b. return to sched_active_thread ucontext

If sched_context_switch_request is not set, step 4b skips the call to sched_run(), leading to a possible return into a non-ready thread.

Fixes #10881.
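
The essence of the fix, as discussed below, is to set sched_context_switch_request unconditionally before switching to the ISR context. A simplified sketch of the resulting function, not the literal diff: isr_context and isr_stack are placeholders for native's internal ISR ucontext objects, and the in-ISR branch and IRQ masking are omitted.

#include <ucontext.h>

static ucontext_t isr_context;   /* placeholder for native's ISR ucontext */
static char isr_stack[8192];     /* placeholder for native's ISR stack    */

void thread_yield_higher(void)
{
    /* the fix: always request a context switch, so that whichever path the
     * ISR context takes (isr_thread_yield() or native_irq_handler()) ends
     * up calling sched_run() */
    sched_context_switch_request = 1;

    /* (sketched, unchanged) save the thread context and switch to an
     * "isr context" that runs isr_thread_yield() */
    ucontext_t *ctx = (ucontext_t *)sched_active_thread->sp;
    getcontext(&isr_context);
    isr_context.uc_stack.ss_sp = isr_stack;
    isr_context.uc_stack.ss_size = sizeof(isr_stack);
    makecontext(&isr_context, isr_thread_yield, 0);
    swapcontext(ctx, &isr_context);
}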

Testing procedure

See #10881 and #10908

Issues/PRs references

Fixes #10881.
Fixes #6123.

@kaspar030 kaspar030 added the Type: bug, Platform: native, and Impact: major labels Jan 28, 2019
@kaspar030
Contributor Author

kaspar030 commented Jan 28, 2019

This is pretty central in native, so I guess a couple of eyes need to take a look...

@kaspar030
Contributor Author

@LudwigKnuepfer maybe you could take a look?

@kaspar030
Contributor Author

This fixes #6123 for me!

@miri64
Member

miri64 commented Jan 28, 2019

I'm rather wondering if

diff --git a/cpu/native/native_cpu.c b/cpu/native/native_cpu.c
index 2629e55..79534e3 100644
--- a/cpu/native/native_cpu.c
+++ b/cpu/native/native_cpu.c
@@ -142,7 +142,7 @@ void isr_cpu_switch_context_exit(void)
     ucontext_t *ctx;
 
     DEBUG("isr_cpu_switch_context_exit\n");
-    if ((sched_context_switch_request == 1) || (sched_active_thread == NULL)) {
+    if (sched_active_thread == NULL) {
         sched_run();
     }

wouldn't be a more valid fix, because it basically also creates the situation your fix creates:

  • isr_cpu_switch_context_exit() is only called by cpu_switch_context_exit() (either with a context switch to ISR when not in ISR, or directly when in ISR)
  • cpu_switch_context_exit() is called after all signal handlers have run in native_irq_handler(), and in sched_task_exit(), which is called when a thread ends (so the user probably also wants to switch contexts), and also in kernel_init() for some reason...
  • so isr_cpu_switch_context_exit() should only be called when a context switch was requested, right?

@miri64
Member

miri64 commented Jan 28, 2019

Some historical research:

@kaspar030
Contributor Author

if (sched_active_thread == NULL)
wouldn't be a more valid fix.

I don't think so, as it would not call sched_run() anymore (sched_active_thread is only NULL while threading is starting up). Always calling sched_run() there would also be a "fix", but it would probably call the scheduler more often than needed.

@miri64
Member

miri64 commented Jan 28, 2019

The fix I proposed would cause the scheduler to be called only when a thread exits, so it obviously was bullshit. Also, since the title of #229, where this check was introduced, mentions segfaults, we probably shouldn't mess with that.

@kaspar030
Contributor Author

This fixes #6123 for me!

Stable for 30 min now, running: sudo true; for _ in $(seq 50); do sudo ping -s1392 -f "${TARGET}" & done

This bug has haunted my benchmarking efforts on native for years!

@miri64
Member

miri64 commented Jan 28, 2019

Penny for your thoughts on this though #6660 (comment)

@miri64
Member

miri64 commented Jan 28, 2019

This bug has haunted my benchmarking efforts on native for years!

Next time use some elbow grease like I did ;D.

@kaspar030
Contributor Author

Penny for your thoughts on this though #6660 (comment)

You mean "at the end of native's IRQ handler (native_irq_handler()) the function cpu_switch_context_exit() gets called"?

I think the naming is a little off.

cpu_switch_context_exit() basically jumps to another context without saving (or better: while completely ignoring) the current one. That is why thread_yield_higher(), after saving the context (via swapcontext()), ends up there; it is why kernel_init() uses it to kick off scheduling; and it is why the function is put onto a new thread's stack as the return address, so that when a thread function exits, it ends up calling it.
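
To illustrate that point, a conceptual sketch only (the real cpu/native implementation additionally deals with being called from ISR context):

void cpu_switch_context_exit(void)
{
    /* nothing of the current context is saved, on purpose */
    sched_run();  /* select the next ready thread */

    ucontext_t *ctx = (ucontext_t *)sched_active_thread->sp;
    setcontext(ctx);  /* jump into it; this call never returns */
}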

@miri64
Member

miri64 commented Jan 28, 2019

Ok my final assessment for today for this PR is that sched_run might be called unnecessarily in some cases, which on native I can live with. I'll run some tests tomorrow and then give my final verdict.

@miri64 miri64 added the CI: ready for build label Jan 28, 2019
@miri64
Member

miri64 commented Jan 28, 2019

I tested most thread-related tests on native locally again. All succeeded. The rest I leave to Murdock.

@miri64
Member

miri64 commented Jan 28, 2019

I was, however, not able to find an isolated test case yet :-(.

@miri64
Member

miri64 commented Jan 28, 2019

Maybe because I don't fully understand how and when sched_context_switch_request can be 0 and when it is 1.

@miri64
Member

miri64 commented Jan 28, 2019

I was also wondering why not all platforms need to set it in thread_yield_higher(), only native. Apart from native, only atmega_common seems to do that, and only in the else branch of an irq_is_in() == 0 check:

void thread_yield_higher(void)
{
    if (irq_is_in() == 0) {
        __context_save();
        sched_run();
        __context_restore();
        __asm__ volatile ("ret");
    }
    else {
        sched_context_switch_request = 1;
    }
}

(edit: but maybe this was just modeled after native)

All others just seem to be content to trigger a software interrupt, except for msp430_common, which seems to implement its own version of context switching:

__attribute__((naked)) void thread_yield_higher(void)
{
    __asm__("push r2"); /* save SR */
    __disable_irq();
    __save_context();
    /* have sched_active_thread point to the next thread */
    sched_run();
    __restore_context();
    UNREACHABLE();
}

Member

@miri64 miri64 left a comment


Anyway, tests pass and conceptually the change makes sense to me and is hard to argue with. So it gets my soft ACK (which may be counted as 1 approval, given the "Impact: major" label, if others more knowledgeable in the deep dungeons of RIOT's scheduling approve).

@miri64 miri64 requested a review from OlegHahm January 28, 2019 22:13
@miri64
Member

miri64 commented Jan 28, 2019

Maybe @OlegHahm can also have a look, since he reviewed #6660?

@miri64
Member

miri64 commented Jan 28, 2019

Apart from native, only atmega_common seems to do that, and only in the else branch of an irq_is_in() == 0 check:

void thread_yield_higher(void)
{
    if (irq_is_in() == 0) {
        __context_save();
        sched_run();
        __context_restore();
        __asm__ volatile ("ret");
    }
    else {
        sched_context_switch_request = 1;
    }
}

(edit: but maybe this was just modeled after native)

This was introduced in #8904, so summoning @ZetaR60 as well.

@miri64 miri64 added this to the Release 2019.01 milestone Jan 28, 2019
@kaspar030
Contributor Author

Maybe because I don't fully understand how and when sched_context_switch_request can be 0 and when it is 1.

TL;DR sched_context_switch_request is used to request a (necessarily deferred) context switch from within an ISR.

A context switch can be triggered either from the running task or from within an ISR.
In the former case, it can be done right away: save the current context, run sched_run() to find the next thread, then restore that.

In ISRs, the context switch cannot be done right away. See e.g. this (pseudocode):

void foobar_isr() {
  do_stuff();
  if (whatever > FOO) {
    mutex_unlock(mutex_a_thread_is_waiting_on);
  } 
  do_more_stuff(); 
}

In ISR context, the ISR must finish first. It is not possible to "pause" the ISR context, switch to the thread, then "continue" the ISR context. ISRs must finish. So the thread that would probably be unblocked by the mutex_unlock() call is merely put on the runqueue, and we need to somehow pass along the information that a context switch might be necessary.

This is what "sched_context_switch_request" is used for: pass along the information that a context switch might be necessary after an ISR has finished.

Thing is, different platforms have different concepts.

E.g., on Cortex-M, there is PendSV, which will always be served last. Also, there are banked ISR registers (a second set of registers), so no manual context saving is necessary for serving an ISR. Only when a different user thread should continue to run after all ISRs have finished does the current context need to be saved, the next one selected and then restored, through fiddling with the user-mode stack. sched_context_switch_request is used as a flag to let the architecture code know, at the end of an ISR, that this work should be done and thus PendSV should be triggered.

Cortex-M checks the flag here:

static inline void cortexm_isr_end(void)
{
    if (sched_context_switch_request) {
        thread_yield_higher();
    }
}

Other platforms do it differently, but all are using the flag.

@miri64
Member

miri64 commented Jan 28, 2019

Ok, I think I was able to figure out most of that myself, so thanks for the confirmation and for putting it into words. However, I am still not really able to track how the previous code was wrong / led to invalid unblocking, which is why I don't feel 100% confident in this bug fix.

@miri64 miri64 added the Community: Hack'n'ACK candidate label Jan 29, 2019
@kaspar030
Contributor Author

However, I am still not really able to track how the previous code was wrong / led to invalid unblocking, which is why I don't feel 100% confident in this bug fix.

I'll try to put it in different words.

Usually, when thread_yield_higher() is called on native outside of an ISR, an "isr context" is created with makecontext(), set up to execute isr_thread_yield(). swapcontext() then saves the thread context and jumps into the newly created "isr context", launching isr_thread_yield().

Now, if there are no signals pending, the scheduler is run, then a context switch to the then-current sched_active_thread is initiated. All is good.

But it is possible that signals queue up (after irq_disable() in thread_yield_higher()), making isr_thread_yield() call native_irq_handler(). That function does not return; it switches back to thread context by itself after handling the signals. The jump back to thread context is almost identical to how isr_thread_yield() does it (had there been no signals pending). The main difference is that native_irq_handler() does not always call sched_run(), but only if sched_context_switch_request was set.

This led to the crash at hand: msg_receive() removes the current thread from the runqueue, then calls thread_yield_higher() expecting to be scheduled away. But on the off chance that a signal pops up at the wrong time (and its ISR does not set sched_context_switch_request = 1), the scheduler doesn't get called, and msg_receive() returns without receiving.

The fix works by always setting sched_context_switch_request to 1, so that, should isr_thread_yield() degrade to native_irq_handler(), the scheduler is still called.
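
To make the two paths concrete, a simplified sketch of isr_thread_yield() as described above (not the actual cpu/native source; signals_are_pending() is a stand-in for native's real pending-signal check):

void isr_thread_yield(void)
{
    if (signals_are_pending()) {
        /* path 2: native_irq_handler() does not return; it handles the
         * signals and only calls sched_run() if
         * sched_context_switch_request is set */
        native_irq_handler();
    }

    /* path 1: always run the scheduler, then return to the (possibly new)
     * active thread */
    sched_run();
    ucontext_t *ctx = (ucontext_t *)sched_active_thread->sp;
    setcontext(ctx);
}

With the fix, sched_context_switch_request is already set whenever path 2 is taken, so both paths end up running the scheduler.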

@miri64
Member

miri64 commented Jan 29, 2019

Ok, so the tests I started to write yesterday weren't all that wrong (triggering a timer at very short intervals [which causes a SIGALRM on native] while a message exchange between two threads is happening); they just were not able to trigger the race condition.

@kaspar030
Contributor Author

(triggering a timer at very short intervals [which causes a SIGALRM on native] while a message exchange between two threads is happening); they just were not able to trigger the race condition.

Actually yes, if the timer does not trigger a context switch (e.g., no mutex unlock or msg send)!

@miri64 miri64 mentioned this pull request Jan 29, 2019
@miri64
Member

miri64 commented Jan 30, 2019

Actually yes, if the timer does not trigger a context switch (e.g., no mutex unlock or msg send)!

Aha, that was what was missing then :-).

@miri64
Member

miri64 commented Jan 30, 2019

Ok, I tried that, but I am still not really able to reproduce the issue. Probably the receiving thread needs to do a little more than just print something.

@kaspar030
Contributor Author

https://github.com/miri64/RIOT/blob/tests/new/i10881/tests/thread_msg_block_race/main.c

Can you try to make the timeouts of the timer random, and make sure it does not cause a context switch (remove the mutex_unlock())? Also I think thread_1 can go.

@kaspar030
Contributor Author

Aha, that was what missing then :-).

Ah, sorry, I didn't get you. No context switch was right...

@miri64
Member

miri64 commented Jan 30, 2019

Ok, I did that now and also replaced msg_try_send() with msg_send() (otherwise the main thread starves). However, now sending stops after trying 3 times... :(

@miri64
Member

miri64 commented Jan 30, 2019

I got it! Let me write a quick test script and I'll provide a PR.
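
For context, a rough sketch of the kind of stress test discussed above; this is not the actual tests/thread_msg_block_race code from #10908, and the stack size, priorities and timer intervals are made up. A higher-priority worker blocks in msg_receive(), main floods it with msg_send(), and a timer (SIGALRM on native) fires at short pseudo-random intervals without requesting a context switch:

#include "thread.h"
#include "msg.h"
#include "xtimer.h"
#include "random.h"

static char worker_stack[THREAD_STACKSIZE_DEFAULT];
static xtimer_t timer;

/* rearm the timer with a short random offset; deliberately does nothing
 * that could request a context switch */
static void timer_cb(void *arg)
{
    (void)arg;
    xtimer_set(&timer, random_uint32_range(100, 1000));
}

static void *worker(void *arg)
{
    (void)arg;
    while (1) {
        msg_t m;
        /* before the fix, this could return without a message when the
         * scheduler was skipped on the way out of thread_yield_higher() */
        msg_receive(&m);
    }
    return NULL;
}

int main(void)
{
    kernel_pid_t pid = thread_create(worker_stack, sizeof(worker_stack),
                                     THREAD_PRIORITY_MAIN - 1, 0,
                                     worker, NULL, "worker");

    timer.callback = timer_cb;
    xtimer_set(&timer, 1000);

    msg_t m = { .content = { .value = 0 } };
    while (1) {
        msg_send(&m, pid);
    }

    return 0;
}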

Member

@miri64 miri64 left a comment


Works with #10908 and I did not encounter any problems in the other thread tests with native.

Member

@OlegHahm OlegHahm left a comment


I would ACK this PR even though I have to admit that I haven't dug into all the consequences this change might possibly have. However, we have been carrying this bug with us for more than two years now, and apparently this PR fixes it and does not break any other standard test. Plus, it is basically a one-liner.

@emmanuelsearch
Member

emmanuelsearch commented Feb 1, 2019

@kaspar030 so we're green?
(which means start preparing a backport hehe ;)

@kaspar030
Contributor Author

@kaspar030 so we're green?

Yes!

Many many thanks to @miri64 for reducing #6123 to msg_receive() and debugging this!

(which means start preparing a backport hehe ;)

Could someone? Isn't there a script?

@kaspar030 kaspar030 merged commit aa90f93 into RIOT-OS:master Feb 1, 2019
@kaspar030 kaspar030 added the Process: needs backport label Feb 1, 2019
@kaspar030 kaspar030 deleted the fix_native_thread_yield_higher branch February 1, 2019 15:11
@miri64
Member

miri64 commented Feb 1, 2019

Yepp, but it's not merged yet: #8968 (you can still use it from @bergzand's branch, though)

@miri64
Member

miri64 commented Feb 1, 2019

Backport provided in #10921
