Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests/isr_thread_yield_higher: test behavior from interrupt #11759

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

cladmi
Copy link
Contributor

@cladmi cladmi commented Jun 28, 2019

Contribution description

From interrupt context on some architectures, thread_yield_higher yields
immediately, on some others it yields after the end of the interrupt.

However, some cpus implementation like 'atmega_common' have an explicit
'irq_is_in' switch to a different handling.

So I assume either this special handling is not required, or the
architectures that behave differently are wrong.

Question

How should thread_yield_higher behave from interrupt context ?
Note that masking interrupts also let it currently yield, at least on the msba2.

The problem was found as xtimer_mutex_lock_timeout is using thread_yield_higher from interrupts #6158 (when not spinning).
It currently brings back to the main thread with xtimer, _in_handler variable still set to 1, so _timer_callback not being finished.

Testing procedure

Run the test on different cpus/boards.

Boards that have a value of 1, so yield after the interrupt

Boards that have a value of 0, so yield inside the interrupt

FIXED value on `msba2`

#11887

2019-06-28 16:22:13,378 - INFO # Connect to serial port /dev/riot/ttyMSBA2
Welcome to pyterm!
Type '/exit' to exit.
2019-06-28 16:22:14,384 - INFO # main(): This is RIOT! (Version: buildtest)
2019-06-28 16:22:14,385 - INFO # Send character to start
2019-06-28 16:22:16,418 - INFO # Go to sleep
2019-06-28 16:22:17,410 - INFO # Value is 0
RIOT MSP430 hardware initialization complete.
main(): This is RIOT! (Version: 2019.07-devel-847-g9f964-pr/tests/isr_thread_yield_higher)
Send character to start
Go to sleep
Value is 0
Timeout in expect script at "child.expect_exact('Value is 1')" (tests/isr_thread_yield_higher/tests/test.py:16)

Test run on most of the boards, it fails for the msba2.
https://ci-ilab.imp.fu-berlin.de/job/experimental-pull-request-tests/40/consoleText

Issues/PRs references

This different behavior was found when testing #11660
The original implementation was using thread_yield_higher already, but not setting a value after.

@cladmi cladmi added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Discussion: RFC The issue/PR is used as a discussion starting point about the item of the issue/PR labels Jun 28, 2019
@cladmi cladmi requested a review from kaspar030 June 28, 2019 14:20
@cladmi
Copy link
Contributor Author

cladmi commented Jun 28, 2019

Not sure who to ask more about this issue…

@cladmi cladmi added this to the Release 2019.10 milestone Jun 28, 2019
@kaspar030
Copy link
Contributor

kaspar030 commented Jun 28, 2019

How should thread_yield_higher behave from interrupt context ?

Run scheduler after interrupts have finished. The equivalent of sched_context_switch_request = 1;.

@cladmi
Copy link
Contributor Author

cladmi commented Jun 28, 2019

How should thread_yield_higher behave from interrupt context ?

Run scheduler after interrupts have finished. The equivalent of set sched_context_switch_request = 1;.

So the test is indeed showing the right behavior, and msp430 and msba2 are wrong.
I can rewrite the test documentation in that direction and update the API documentation.

If we do not mind loosing the debug messages, this should also be updated to directly use thread_yield_higher right ?

RIOT/core/sched.c

Lines 186 to 209 in c9bf22b

void sched_switch(uint16_t other_prio)
{
thread_t *active_thread = (thread_t *) sched_active_thread;
uint16_t current_prio = active_thread->priority;
int on_runqueue = (active_thread->status >= STATUS_ON_RUNQUEUE);
DEBUG("sched_switch: active pid=%" PRIkernel_pid" prio=%" PRIu16 " on_runqueue=%i "
", other_prio=%" PRIu16 "\n",
active_thread->pid, current_prio, on_runqueue, other_prio);
if (!on_runqueue || (current_prio > other_prio)) {
if (irq_is_in()) {
DEBUG("sched_switch: setting sched_context_switch_request.\n");
sched_context_switch_request = 1;
}
else {
DEBUG("sched_switch: yielding immediately.\n");
thread_yield_higher();
}
}
else {
DEBUG("sched_switch: continuing without yield.\n");
}
}

@kaspar030
Copy link
Contributor

this should also be updated to directly use thread_yield_higher right?

Hmm. Let's think this through.

It is obviously a common pattern to check if in isr, maybe yield, maybe set "sched_yield_higher". But the isr check is not necessary in all cases. And we should avoid wasting cycles at this level, or at least establish if always checking for isr_is_in() has a measurable impact on context switches.
Maybe a static inline wrapper for if isr_is_in() sched_context_swittch_request = 1; else; thread_yield_higher(); makes sense. Could be overkill.

@cladmi
Copy link
Contributor Author

cladmi commented Jun 28, 2019

It is obviously a common pattern to check if in isr, maybe yield, maybe set "sched_yield_higher". But the isr check is not necessary in all cases. And we should avoid wasting cycles at this level

It is currently checked everytime for atmega_common, esp32, esp8266.
native also has a special handling, but native is special.

void thread_yield_higher(void)
{
if (irq_is_in() == 0) {
__context_save();
sched_run();
__context_restore();
__asm__ volatile ("ret");
}
else {
sched_context_switch_request = 1;
}
}

/**
* If we are already in an interrupt handler, the function simply sets the
* context switch flag, which indicates that the context has to be switched
* in the _frxt_int_exit function when exiting the interrupt. Otherwise, we
* will generate a software interrupt to force the context switch when
* terminating the software interrupt (see thread_yield_isr).
*/
void thread_yield_higher(void)
{
/* reset hardware watchdog */
system_wdt_feed();
/* yield next task */
#if defined(ENABLE_DEBUG) && defined(DEVELHELP)
if (sched_active_thread) {
DEBUG("%u old task %u %s %u\n", system_get_time(),
sched_active_thread->pid, sched_active_thread->name,
sched_active_thread->sp - sched_active_thread-> stack_start);
}
#endif
if (!irq_is_in()) {
/* generate the software interrupt to switch the context */
DPORT_WRITE_PERI_REG(DPORT_CPU_INTR_FROM_CPU_0_REG, DPORT_CPU_INTR_FROM_CPU_0);
}
else {
/* set the context switch flag */
_frxt_setup_switch();
}
#if defined(ENABLE_DEBUG) && defined(DEVELHELP)
if (sched_active_thread) {
DEBUG("%u new task %u %s %u\n", system_get_time(),
sched_active_thread->pid, sched_active_thread->name,
sched_active_thread->sp - sched_active_thread-> stack_start);
}
#endif
/*
* Instruction fetch synchronize: Waits for all previously fetched load,
* store, cache, and special register write instructions that affect
* instruction fetch to be performed before fetching the next instruction.
*/
__asm__("isync");
return;
}

void thread_yield_higher(void)
{
/* reset hardware watchdog */
system_soft_wdt_feed();
/* yield next task */
#if defined(ENABLE_DEBUG) && defined(DEVELHELP)
if (sched_active_thread) {
DEBUG("%u old task %u %s %u\n", phy_get_mactime(),
sched_active_thread->pid, sched_active_thread->name,
sched_active_thread->sp - sched_active_thread-> stack_start);
}
#endif
if (!irq_is_in()) {
#ifdef CONTEXT_SWITCH_BY_INT
WSR(BIT(ETS_SOFT_INUM), interrupt);
#else
vPortYield();
#endif
}
else {
_frxt_setup_switch();
}
#if defined(ENABLE_DEBUG) && defined(DEVELHELP)
if (sched_active_thread) {
DEBUG("%u new task %u %s %u\n", phy_get_mactime(),
sched_active_thread->pid, sched_active_thread->name,
sched_active_thread->sp - sched_active_thread-> stack_start);
}
#endif
return;
}

JulianHolzwarth added a commit to JulianHolzwarth/RIOT that referenced this pull request Jul 5, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
JulianHolzwarth added a commit to JulianHolzwarth/RIOT that referenced this pull request Jul 5, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
JulianHolzwarth added a commit to JulianHolzwarth/RIOT that referenced this pull request Jul 5, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
@cladmi
Copy link
Contributor Author

cladmi commented Jul 8, 2019

native supposes the function can be called from interrupt context

void thread_yield_higher(void)
{
sched_context_switch_request = 1;
if (_native_in_isr == 0) {
ucontext_t *ctx = (ucontext_t *)(sched_active_thread->sp);
_native_in_isr = 1;
if (!native_interrupts_enabled) {
warnx("thread_yield_higher: interrupts are disabled - this should not be");
}
irq_disable();
native_isr_context.uc_stack.ss_sp = __isr_stack;
native_isr_context.uc_stack.ss_size = SIGSTKSZ;
native_isr_context.uc_stack.ss_flags = 0;
makecontext(&native_isr_context, isr_thread_yield, 0);
if (swapcontext(ctx, &native_isr_context) == -1) {
err(EXIT_FAILURE, "thread_yield_higher: swapcontext");
}
irq_enable();
}
}

@cladmi
Copy link
Contributor Author

cladmi commented Jul 8, 2019

Reference as it was how it was found, xtimer_mutex_lock_timeout is using thread_yield_higher from interrupts #6158 (when not spinning).

JulianHolzwarth added a commit to JulianHolzwarth/RIOT that referenced this pull request Jul 10, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
@cladmi
Copy link
Contributor Author

cladmi commented Jul 18, 2019

@MrKevinWeiss This is still there, show we do something about it for the release.
@JulianHolzwarth has a fix in https://github.com/RIOT-OS/RIOT/pull/11807/files for at least not use thread_yield_higher which led to getting out oft he interrupt before it was finished.

@MrKevinWeiss
Copy link
Contributor

Something tells me the fix won't be ready that fast. The challenge of doing a summer release is many people are not available. As a worst case I will put it in the release notes.

@cladmi
Copy link
Contributor Author

cladmi commented Jul 22, 2019

I noticed that also thread_yield is called from interrupt context in tests/thread_float which breaks execution on the msba2.

JulianHolzwarth added a commit to JulianHolzwarth/RIOT that referenced this pull request Jul 30, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
JulianHolzwarth added a commit to JulianHolzwarth/RIOT that referenced this pull request Aug 9, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
olegart pushed a commit to unwireddevices/RIOT that referenced this pull request Sep 10, 2019
because of pr RIOT-OS#11759: not all boards check for is_in_irq when thread_yield_higher
@cladmi cladmi force-pushed the pr/tests/isr_thread_yield_higher branch from 9f96449 to e1be672 Compare September 11, 2019 10:40
Copy link
Member

@maribu maribu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments

tests/isr_thread_yield_higher/main.c Outdated Show resolved Hide resolved
tests/isr_thread_yield_higher/main.c Outdated Show resolved Hide resolved
@maribu
Copy link
Member

maribu commented Aug 5, 2020

IMO we should also just implement thread_yield_higher as:

static inline void thread_yield_higher(void)
{
    if (irq_is_in()) {
        sched_context_switch_request = 1;
    }
    else {
        thread_yield_higher_arch();
    }
}

And thread_yield_higher_arch() is guaranteed to never be called form ISR and all special handling can be dropped.

@fjmolinas fjmolinas removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Aug 5, 2020
@maribu
Copy link
Member

maribu commented Aug 5, 2020

LGTM. Feel free to squash!

@fjmolinas fjmolinas force-pushed the pr/tests/isr_thread_yield_higher branch from 3b8169a to f3bd3ed Compare August 6, 2020 07:14
@fjmolinas fjmolinas added Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Aug 6, 2020
@fjmolinas fjmolinas added this to the Release 2020.10 milestone Aug 6, 2020
@fjmolinas fjmolinas removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Aug 6, 2020
@fjmolinas fjmolinas force-pushed the pr/tests/isr_thread_yield_higher branch from f3bd3ed to b23c3c8 Compare August 6, 2020 07:36
From interrupt context on some architectures, thread_yield_higher yields
immediately, on some others it yields after the end of the interrupt.

However, some cpus implementation like 'atmega_common' have an explicit
'irq_is_in' switch to do a different handling.

So I assume either this special handling is not required, or the
architectures that behave differently are wrong.
@fjmolinas fjmolinas force-pushed the pr/tests/isr_thread_yield_higher branch from b23c3c8 to 9f5822c Compare August 6, 2020 07:38
@fjmolinas
Copy link
Contributor

Another Travis comment popped up, amended and pushed right away. (Only changed the type for value)

@fjmolinas fjmolinas added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Aug 6, 2020
Copy link
Member

@maribu maribu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late comments. Feel free to squash right in.

tests/isr_thread_yield_higher/main.c Show resolved Hide resolved
tests/isr_thread_yield_higher/main.c Outdated Show resolved Hide resolved
tests/isr_thread_yield_higher/main.c Outdated Show resolved Hide resolved
tests/isr_thread_yield_higher/tests/test.py Outdated Show resolved Hide resolved
@fjmolinas fjmolinas removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Aug 6, 2020
@fjmolinas
Copy link
Contributor

@maribu I also noticed that this PR was before test_utils_interactive_sync was included so I removed the extra send line. Can you check the comment on thread_sleep as well?

include ../Makefile.tests_common

USEMODULE += xtimer
USEMODULE += stdin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring the stdin module. (Sorry for the confusion.)

@fjmolinas
Copy link
Contributor

Is this one still relevant @maribu?

@maribu
Copy link
Member

maribu commented Nov 18, 2021

I think this should get in. It will need a rebase to be passing the CI, though.

Copy link
Contributor

@kfessel kfessel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice test for the expected behavior
needs rebase and optional ztimer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: tests Area: tests and testing framework Discussion: RFC The issue/PR is used as a discussion starting point about the item of the issue/PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants