Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

riscv_common: Increase ISR stack size if DEVELHELP is enabled #16448

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nmeum
Copy link
Member

@nmeum nmeum commented May 5, 2021

Contribution description

The handle_trap() implementation for riscv_common includes the following debugging code if DEVELHELP is enabled:

#ifdef DEVELHELP
printf("Unhandled trap:\n");
printf(" mcause: 0x%" PRIx32 "\n", mcause);
printf(" mepc: 0x%" PRIx32 "\n", read_csr(mepc));
printf(" mtval: 0x%" PRIx32 "\n", read_csr(mtval));
#endif

This code is executed on the ISR stack and not on any standard RIOT thread stack. The stack size of this ISR stack is defined in the ldscript for riscv_common as follows:

__stack_size = DEFINED(__stack_size) ? __stack_size : 256;

Unfortunately, 256 Bytes is insufficient for newlib printf() usage and thus causes a stack overflow on the ISR stack if DEVELHELP is enabled and this code path in handle_trap() is triggered. One solution to avoid this problem is to increase the size of the ISR stack if DEVELHELP is enabled, this is the solution this PR proposes. Alternatively, it would also be possible to simply remove the printf() invocations from handle_trap().

Testing procedure

I discovered this through an external tool I am currently working on which I cannot share at this point. Similar to #16395, this bug can be confirmed independently by checking the value of the sp register after each instruction in handle_trap() using GDB. For instance, by using the following modified version of the hello-world example application to trigger the code path with the printf statements in handle_trap():

diff --git a/examples/hello-world/main.c b/examples/hello-world/main.c
index f51bf8c0a0..4a109f3d59 100644
--- a/examples/hello-world/main.c
+++ b/examples/hello-world/main.c
@@ -28,5 +28,6 @@ int main(void)
     printf("You are running RIOT on a(n) %s board.\n", RIOT_BOARD);
     printf("This board features a(n) %s MCU.\n", RIOT_MCU);
 
-    return 0;
+    int *foo = NULL;
+    return *foo;
 }

I am not sure if there is any interest in these ISR stack overflow edge cases but I wanted to document this anyhow in case some else runs into them by accident. If you don't consider it worthwhile to fix these edge cases let me know.

Issues/PRs references

This is similar to the aforementioned issue #16395 which is also a stack overflow on the ISR stack which occurs due to printf() usage on this stack. This issue is, however, not fixed by this PR as ENABLE_DEBUG can be enabled independent of DEVELHELP.

Without this change, the ISR stack will overflow if DEVELHELP is
enabled and an unhandled trap is raised as cpu/riscv_common/irq_arch.c
uses printf in this case and printf requires more than 256 bytes of
stack space.
@benpicco benpicco requested a review from bergzand May 5, 2021 11:56
@benpicco benpicco added Area: cpu Area: CPU/MCU ports Platform: RISC-V Platform: This PR/issue effects RISC-V-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) labels May 5, 2021
@chrysn
Copy link
Member

chrysn commented May 8, 2021

Just to check for milder means (as having ISR stack size depend on devhelp is correct here but may mask ISR overflow issues when devhelp is switched off):

Would expressing the printf calls as putsprint_str [edit: which doesn't add the trailing newline] and print_u32_hex allow things to fit within the stack?

@nmeum
Copy link
Member Author

nmeum commented May 10, 2021

Would expressing the printf calls as putsprint_str [edit: which doesn't add the trailing newline] and print_u32_hex allow things to fit within the stack?

Yes, that does seem to work. Though the problem then becomes that core_panic (as called by the trap handler) indirectly uses newlib printf functions again through the LOG_ERROR macro:

RIOT/core/panic.c

Lines 75 to 85 in 2692957

LOG_ERROR("*** RIOT kernel panic:\n%s\n\n", message);
#ifdef DEVELHELP
#ifdef MODULE_PS
ps();
LOG_ERROR("\n");
#endif
LOG_ERROR("*** halted.\n\n");
#else
LOG_ERROR("*** rebooting...\n\n");
#endif

@chrysn
Copy link
Member

chrysn commented May 10, 2021 via email

@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 22, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@maribu
Copy link
Member

maribu commented Jan 5, 2023

Does the stack also overflow with picolibc? There seems to be a strong consensus that we should switch to picolibc for all platforms that support it. It is currently only being blocked by the toolchain our CI runs having an out-of-date version of picolibc installed.

@Teufelchen1
Copy link
Contributor

@nmeum hey, still interested in getting this in? I remember running in this exact issue a couple months ago...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: cpu Area: CPU/MCU ports Platform: RISC-V Platform: This PR/issue effects RISC-V-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants