Avoid ISR queue overflow when using flags #283
I ran into this same issue whilst using mbed. When I discovered that the problem was the queue filling up with identical events, I created a temporary solution that skips putting an item into the queue if it is already there. This seems to work, but I don't know whether it causes any issues in other parts of the OS. See ARMmbed/mbed-os#7986 for the mbed issue. My solution is based on searching the queue for a matching item. As @kjbracey-arm mentioned, it could be more efficient to use a flag to detect whether the item is already queued.
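Roughly, the search looks like the sketch below. This is a minimal illustration, not the exact mbed-os patch; the helper name isr_queue_contains is hypothetical, and it assumes direct access to the osRtxInfo.isr_queue fields (cnt, out, data) that also appear in the diff further down, with the queue size passed in the same way isr_queue_put uses it.

// Hypothetical helper: returns 1 if 'object' is already in the ISR queue.
// Assumes it is called with interrupts disabled, like isr_queue_put itself.
static uint32_t isr_queue_contains (os_object_t *object, uint16_t max) {
  uint16_t idx = osRtxInfo.isr_queue.out;   // oldest queued entry
  uint16_t cnt;

  for (cnt = osRtxInfo.isr_queue.cnt; cnt != 0U; cnt--) {
    if (osRtxObject(osRtxInfo.isr_queue.data[idx]) == object) {
      return 1U;                            // duplicate found, skip the put
    }
    if (++idx == max) {
      idx = 0U;                             // wrap around the circular buffer
    }
  }
  return 0U;                                // object not queued yet
}

The obvious drawback is that the search is O(n) in the queue length and runs with interrupts disabled, which is why a per-object "queued" flag is the more attractive option.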
Thanks for raising awareness of this issue.
My approach is:

--- a/CMSIS/RTOS2/RTX/Include/rtx_os.h
+++ b/CMSIS/RTOS2/RTX/Include/rtx_os.h
@@ -58,6 +58,7 @@ extern "C"
/// Object Flags definitions
#define osRtxFlagSystemObject 0x01U
#define osRtxFlagSystemMemory 0x02U
+#define osRtxFlagQueued 0x04U
--- a/CMSIS/RTOS2/RTX/Source/rtx_system.c
+++ b/CMSIS/RTOS2/RTX/Source/rtx_system.c
@@ -45,7 +82,10 @@ static uint32_t isr_queue_put (os_object_t *object) {
#if (EXCLUSIVE_ACCESS == 0)
__disable_irq();
- if (osRtxInfo.isr_queue.cnt < max) {
+ if ((object->flags & osRtxFlagQueued)) {
+ ret = 1U;
+ } else if (osRtxInfo.isr_queue.cnt < max) {
+ object->flags |= osRtxFlagQueued;
osRtxInfo.isr_queue.cnt++;
osRtxInfo.isr_queue.data[osRtxInfo.isr_queue.in] = object;
if (++osRtxInfo.isr_queue.in == max) {
@@ -60,7 +100,9 @@ static uint32_t isr_queue_put (os_object_t *object) {
__enable_irq();
}
#else
- if (atomic_inc16_lt(&osRtxInfo.isr_queue.cnt, max) < max) {
+ if (test_and_set_bit8(&object->flags, osRtxFlagQueued)) {
+ ret = 1U;
+ } else if (atomic_inc16_lt(&osRtxInfo.isr_queue.cnt, max) < max) {
n = atomic_inc16_lim(&osRtxInfo.isr_queue.in, max);
osRtxInfo.isr_queue.data[n] = object;
ret = 1U;
@@ -94,6 +136,7 @@ static os_object_t *isr_queue_get (void) {
if (++osRtxInfo.isr_queue.out == max) {
osRtxInfo.isr_queue.out = 0U;
}
+ ret->flags &= ~osRtxFlagQueued;
} else {
ret = NULL;
}
@@ -105,6 +148,7 @@ static os_object_t *isr_queue_get (void) {
if (atomic_dec16_nz(&osRtxInfo.isr_queue.cnt) != 0U) {
n = atomic_inc16_lim(&osRtxInfo.isr_queue.out, max);
ret = osRtxObject(osRtxInfo.isr_queue.data[n]);
+ clr_bit8(&ret->flags, osRtxFlagQueued);
} else {
ret = NULL;
}
and something like:

inline static uint8_t test_and_set_bit8(uint8_t *mem, uint8_t bit)
{
uint8_t res;
unsigned long tmp;
unsigned long val;
__asm__ __volatile__ ("1: ldrexb %[res], %[mem]\n"
" orr %[val], %[res], %[bit]\n"
" strexb %[tmp], %[val], %[mem]\n"
" cmp %[tmp], #0\n"
" bne 1b\n"
: [tmp] "=&r" (tmp),
[mem] "+Q" (*mem),
[res] "=&r" (res),
[val] "=&r" (val)
: [bit] "r" (bit)
: "cc");
return (res & bit);
}
inline static void clr_bit8(uint8_t *mem, uint8_t bit)
{
unsigned long tmp;
unsigned long val;
__asm__ __volatile__ ("1: ldrexb %[val], %[mem]\n"
" bic %[val], %[bit]\n"
" strexb %[tmp], %[val], %[mem]\n"
" cmp %[tmp], #0\n"
" bne 1b\n"
: [tmp] "=&r" (tmp),
[val] "=&r" (val),
[mem] "+Q" (*mem)
: [bit] "r" (bit)
: "cc");
}
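As a side note, much the same test-and-set/clear can probably be written with the CMSIS Core exclusive-access intrinsics instead of hand-written assembly. The following is only a sketch, assuming the target core provides LDREXB/STREXB and that the CMSIS core header is included:

// Sketch only: byte-wide test-and-set using CMSIS Core intrinsics
// (__LDREXB/__STREXB from the CMSIS core header, e.g. cmsis_compiler.h).
// Returns non-zero if 'bit' was already set in *mem.
static inline uint8_t test_and_set_bit8 (volatile uint8_t *mem, uint8_t bit) {
  uint8_t old;
  do {
    old = __LDREXB(mem);                                  // load-exclusive flags byte
  } while (__STREXB((uint8_t)(old | bit), mem) != 0U);    // retry if store failed
  return (uint8_t)(old & bit);
}

// Sketch only: atomically clear 'bit' in *mem.
static inline void clr_bit8 (volatile uint8_t *mem, uint8_t bit) {
  uint8_t old;
  do {
    old = __LDREXB(mem);
  } while (__STREXB((uint8_t)(old & (uint8_t)~bit), mem) != 0U);
}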
@RobertRostohar: We should finally conclude on this one.
I agree that ISR queue handling could be extended in order to reduce the number of items put into the queue in certain situations. The solution proposed by @kjbracey-arm is a good approach for Thread Flags and Event Flags. The solution proposed by @ensc, which addresses this at the ISR queue level, covers all RTOS objects that post into the queue and would be my preference. However, there are concerns about the additional cycles that will be introduced when the user calls RTOS functions from an ISR.

But the real question is what kind of use cases this enhancement will solve. ISR queue handling is processed immediately after ISRs and before returning to threads. ISR queue overflow indicates that the system is busy executing ISRs most of the time and has no time to execute threads. Usually it indicates a problem in the application design, and it is questionable whether the application will function as intended. A short burst of interrupts that put items into the ISR queue can simply be handled by increasing the ISR queue size. But if there are sustained interrupts that trigger post-processing, then the above solutions might not help much.
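As a concrete illustration of the "increase the ISR queue size" option: in RTX5 this is a setting in RTX_Config.h. The value below is only an example, and the macro name and default should be checked against the config file shipped with the RTX version in use:

// RTX_Config.h - ISR FIFO Queue buffer entries
// (example value; the shipped default is typically 16)
#define OS_ISR_FIFO_QUEUE 32

This only absorbs bursts; it does not help with sustained interrupt rates, as noted above.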
Is there any update regarding these problems?
It can be quite easy to generate ISR queue overflow if pushing a bit too hard with signalling from interrupts. Maybe this is sometimes unavoidable, but there is one obvious case where RTX seems to be performing suboptimally - osEventFlagsSet and osThreadFlagsSet.

Part of the point of flags is that they're "squashable" - signalling an already-set flag is a no-op. If the notifier is signalling faster than the consumer is reading, it shouldn't cost anything.

But the ISR set routines always put a post-process entry onto the ISR queue even when they haven't modified the flags. So it's easy to cause ISR queue overflow with a trivial interrupt handler that just sets a flag, if it's a pulse-based interrupt.
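To make that failure mode concrete, a handler along these lines is enough to overflow the queue when the interrupt arrives as a fast pulse train, because every call from ISR context enqueues a post-process entry even when the flag is already pending. The handler name and thread id below are placeholders, not code from this issue:

#include "cmsis_os2.h"

extern osThreadId_t consumer_thread_id;   // placeholder: id of the waiting thread

// Placeholder pulse-interrupt handler: each call posts to the RTX ISR queue,
// even though setting an already-set flag is a no-op at the flag level.
void EDGE_IRQHandler (void) {
  osThreadFlagsSet(consumer_thread_id, 0x01U);
}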
I would suggest changing the internal atomic-set helper functions to return the old value rather than the new (as you can deduce the new from the old, but not vice versa). Then it's straightforward for isrRtxEventFlagsSet to queue post-processing only when the flags actually changed (see the sketch at the end of this comment).

You could also conditionalise the core work of svcRtxEventFlagsSet on the same test - possibly not much of an optimisation, but it would make the logic consistent - the conditionalised bit should exactly match the post-process work.

It's possible there could be a more general mechanism whereby you never queue the same object twice for post-processing, but I imagine that would require more thought. It's relatively easy to specifically fix flags, and I've seen a number of cases now where people have naturally chosen flags because they're squashable, only to find out they don't squash from ISR.
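A hedged sketch of how this could look for the event-flags path. The names isrRtxEventFlagsSet and EventFlagsSet follow the RTX sources discussed in this thread, but the signatures, the osRtxPostProcess call, and the cast from the public id are assumptions made for illustration, not the actual patch:

#include "rtx_lib.h"   // assumed: RTX internals (os_event_flags_t, osRtxObject, osRtxPostProcess)

// Proposed change: the atomic-set helper returns the PREVIOUS flag value,
// so the caller can tell whether this call set anything new.
static uint32_t EventFlagsSet (os_event_flags_t *ef, uint32_t flags);

// ISR-side set: queue post-processing only when the flag state changed,
// so a pulse train of identical sets collapses into a single queue entry.
static uint32_t isrRtxEventFlagsSet (osEventFlagsId_t ef_id, uint32_t flags) {
  os_event_flags_t *ef   = (os_event_flags_t *)ef_id;
  uint32_t          prev = EventFlagsSet(ef, flags);

  if ((prev & flags) != flags) {            // at least one flag newly set
    osRtxPostProcess(osRtxObject(ef));      // post-process entry queued once
  }
  return (prev | flags);                    // flag value after setting
}

The same (prev & flags) != flags test could gate the core work of svcRtxEventFlagsSet, as suggested above.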