Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpu/avr8_common: Optimize context switch #19799

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

nandojve
Copy link
Contributor

@nandojve nandojve commented Jul 5, 2023

Contribution description

As PR title says, this optimize a little the cost of a context switch for AVR family.

Testing procedure

examples/thread_duel
tests/isr_context_switch_by_ext_int (ATxmega)

Issues/PRs references

Extends #19777, #19784

CC: @maribu , @benpicco , @hugueslarrive I apreciate if you could check on your devices to make sure I didn't miss anything.
This is on top of #19777, #19784 for now and only the last 4 are relevant.

@github-actions github-actions bot added Area: tests Area: tests and testing framework Platform: AVR Platform: This PR/issue effects AVR-based platforms Area: drivers Area: Device drivers Area: boards Area: Board ports Area: cpu Area: CPU/MCU ports Area: Kconfig Area: Kconfig integration labels Jul 5, 2023
@nandojve nandojve force-pushed the avr8/optimize_context_switch branch from 50fd301 to 86eb2cf Compare July 7, 2023 18:05
@nandojve
Copy link
Contributor Author

nandojve commented Jul 7, 2023

@maribu ,

Last commit add support to a dedicated ISR stack and I was surprised that I can not reduce the idle thread size. This means that instead save memory this approach requires at least idle thread stack size to run. IMHO a idle thread which consist of a spin loop should use no more than 50 bytes of RAM for a large mega.

I was thinking about and my guess is that since it is allowed to switch between threads outside an interrupt context the system requires a "huge amount" of memory from each thread to allow that operation. This may explain the why about I can not decrease idle thread size.

Besides that the optimization still valid and I'm sharing the patch about ISR stack for who could be interested to test.

Let me know if you have some inputs about the high usage of idle thread since the ISR not seems the cause of it.

@hugueslarrive
Copy link
Contributor

CC: @maribu , @benpicco , @hugueslarrive I apreciate if you could check on your devices to make sure I didn't miss anything.

I have an arduino-nano and an atmega1284p.

tests/isr_context_switch_by_ext_int does not compile because of the lack of BTN* configuration.

examples/thread_duel seems to be ok on atmega1284p (results not very readable but quite similar to master except that some annoying '�' have disappeared). It no longer fits in the flash of the arduino-nano.

@nandojve
Copy link
Contributor Author

nandojve commented Jul 9, 2023

Hi @hugueslarrive ,

Thank you for testing.

tests/isr_context_switch_by_ext_int does not compile because of the lack of BTN* configuration.

This test was created to make sure xmega doesn't have an issues because it is more complex the ISR handling.

examples/thread_duel seems to be ok on atmega1284p (results not very readable but quite similar to master except that some annoying '�' have disappeared). It no longer fits in the flash of the arduino-nano.

True, I didn't sent a config proposal for small mega devices, related to post #19799 (comment).

If you still interested to test on nano, you can validate the newer changes removing the last commit. That one only add a separated stack for ISR which we need to decide if this approach is feasible or not.

@benpicco
Copy link
Contributor

That one only add a separated stack for ISR which we need to decide if this approach is feasible or not.

I think this is a great improvement - as @maribu mentioned that means we can shrink the other stacks as they no longer need to account for the possibility to also fit the interrupt stack.

@nandojve
Copy link
Contributor Author

That one only add a separated stack for ISR which we need to decide if this approach is feasible or not.

I think this is a great improvement - as @maribu mentioned that means we can shrink the other stacks as they no longer need to account for the possibility to also fit the interrupt stack.

Yes, I believe so too. The thing is that you can not decrease any stack usage with this change at this moment. My expectation is that with a dedicated ISR stack the idle thread could be set to 64 bytes or less but the example thread_dual not ran if you decrease. This is what I can not understand. Maybe I'm missing something.

@nandojve nandojve force-pushed the avr8/optimize_context_switch branch from 86eb2cf to 862dc26 Compare July 11, 2023 18:45
@github-actions github-actions bot removed the Area: drivers Area: Device drivers label Jul 11, 2023
@nandojve
Copy link
Contributor Author

rebased since #19777 got merged.

This try to explore a potential issue that may happen at context switch
inside ISR.

Signed-off-by: Gerson Fernando Budke <[email protected]>
AVR-8 supports nested interrupts but there is no infrastructure to
use that.  This refactor infraestrocuture and enable the enhancement.
It rework the ISR processing to use a shared stack implemeting the
isr prolog/epilog/return.

Signed-off-by: Gerson Fernando Budke <[email protected]>
Adjust the code style of cpu_get_caller_pc function.

Signed-off-by: Gerson Fernando Budke <[email protected]>
The current context switch implementaion don't take in consideration
the call-used registers.  This means that ISR and context switch can
not be optimized.  This rework both context switch and ISR save/restore
order to optimize the context switch on ISR and reduce code size.

This chance should improve the performance of AVR platform on RIOT-OS
if is compared with the 2023.04 release.

Signed-off-by: Gerson Fernando Budke <[email protected]>
Signed-off-by: Gerson Fernando Budke <[email protected]>
@github-actions github-actions bot removed the Area: Kconfig Area: Kconfig integration label Dec 3, 2023
@nandojve
Copy link
Contributor Author

nandojve commented Dec 3, 2023

rebased since #19784 got merged.

@benpicco benpicco added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Dec 3, 2023
@riot-ci
Copy link

riot-ci commented Dec 3, 2023

Murdock results

FAILED

4503f2e cpu/avr8_common: Add dedicated ISR stack

Success Failures Total Runtime
2481 0 8098 04m:07s

Artifacts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: boards Area: Board ports Area: cpu Area: CPU/MCU ports Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: AVR Platform: This PR/issue effects AVR-based platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants