-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISC-V] Fix context restoration as #101709 describes #101865
[RISC-V] Fix context restoration as #101709 describes #101865
Conversation
Link to original discussion #101709. |
The t4 could get a random value in case of the race and in case the stack would overwrite the context. Would there be a way to do something along the lines I did for arm (32 bit) in my PR - storing the t4 value below the SP that we are restoring to, restoring the SP to the location of the stored t4 and popping it off the stack? That would keep it safe. I am not familiar with the RISC-V instruction set though, so maybe it cannot be done there. And in case the ABI supports red zone, it would need to be done slightly differently to not to corrupt it. |
c4ec523
to
1e6b313
Compare
@janvorli Thanks for detailed explanation! In the recent commit Regarding red zone, I did not find reliable information for riscv64. |
|
Sorry, I don't know either. CC @dotnet/samsung Does anyone know about redzone in riscv? |
@t-mustafin is right, there is no red zone on RISC-V. This wording was added to our ABI to clarify it:
|
RISC-V test results for qemu-prio0-checked: 2717 / 2722 (99.82%)detailsGIT: 1e6b313
failed tests
killed tests
skipped tests
|
RISC-V test results for starfive-prio0-checked: 2720 / 2722 (99.93%)detailsGIT: 1e6b313
failed tests
killed tests
skipped tests
|
RISC-V test results for qemu-prio1-checked: 9396 / 9401 (99.95%)detailsGIT: 1e6b313
failed tests
killed tests
skipped tests
|
RISC-V test results for starfive-prio1-checked: 9399 / 9401 (99.98%)detailsGIT: 1e6b313
failed tests
killed tests
skipped tests
|
RISC-V test results for starfive-prio1-checked: 9399 / 9402 (99.97%)detailsGIT: 1e6b313
failed tests
killed tests
skipped tests
|
RISC-V test results for qemu-prio1-checked: 9397 / 9402 (99.95%)detailsGIT: 1e6b313
failed tests
killed tests
skipped tests
|
RISC-V test results for qemu-prio1-checked (
|
@janvorli can you please take a look? |
ld t1, (CONTEXT_Pc)(t4) // Since we cannot control $pc directly, we're going to corrupt t1 | ||
ld t4, (CONTEXT_T4)(t4) | ||
ld sp, (CONTEXT_Sp)(t4) | ||
ld t4, -8(sp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I read this correctly, but this seems to still have an issue. It seems that if an async signal occurs after we load the sp, but before we load t4 from (sp-8), the t4 value could be corrupted by the stack of the async signal handler.
We would need to load the sp with an address that is 8 bytes below the target SP, then load the t4 from there, then increment sp by 8 and finally jump to t1. I have attempted to write that in RISC-V code below, please take it with a grain of salt as this is my first RISC-V code ever.
ld t4, (CONTEXT_Sp)(t4)
addi sp, t4, -8
ld t4, (sp)
addi sp, sp, 8
jr t1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, seems your solution should work undependable to fields of context structure.
RISC-V test results for starfive-prio1-checked (
|
1e6b313
to
3599429
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
RISC-V test results for qemu-prio1-checked (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@janvorli Is it proper fix? Seems for race condition case it would trash 2 temporary registers - t1 and t4, so it may be better completely remove t4 restoration
ld t4, (CONTEXT_T4)(t4)
.cc @jakobbotsch cc @dotnet/samsung