Easy way to transfer code from flash to RAM? #394
There's no easy way but you can place functions in RAM using the latest stable release of cortex-m-rt. Example below:

#![no_std]

extern crate blue_pill; // or some device crate
extern crate cortex_m_rt;

use core::ptr;

#[inline(never)]
fn main() {
    let x = unsafe { ptr::read_volatile(0x2000_0000 as *const u8) };
    // call function placed in RAM
    let y = foo(x);
    unsafe { ptr::write_volatile(0x2000_0001 as *mut u8, y) };
}

// plain function placed in RAM
#[link_section = ".data"] // <- the important part
#[inline(never)] // you don't want this to be inlined, right?
fn foo(x: u8) -> u8 {
    x + 1
}

// interrupt handler placed in RAM
// we can't compose `link_section` with `interrupt!` so we have to write the expansion of
// `interrupt!` here (bummer)
#[allow(non_snake_case)]
#[link_section = ".data"] // <- the important part
#[no_mangle]
pub fn EXTI0() {
    let x = unsafe { ptr::read_volatile(0x2000_0000 as *const u8) };
    // call function placed in RAM
    let y = foo(x);
    unsafe { ptr::write_volatile(0x2000_0001 as *mut u8, y) };
}

I can't test this right now but the disassembly looks correct:

$ arm-none-eabi-objdump -CD target/thumbv7m-none-eabi/release/cortex-m-quickstart
080001a4 <cortex_m_quickstart::main>:
80001a4: b580 push {r7, lr}
80001a6: 466f mov r7, sp
80001a8: f04f 5000 mov.w r0, #536870912 ; 0x20000000
80001ac: 7800 ldrb r0, [r0, #0]
80001ae: f000 f80b bl 80001c8 <___ZN19cortex_m_quickstart3foo17h20bca7cdf6dc430dE_veneer>
80001b2: 2101 movs r1, #1
80001b4: f2c2 0100 movt r1, #8192 ; 0x2000
80001b8: 7008 strb r0, [r1, #0]
80001ba: bd80 pop {r7, pc}
(..)
080001c8 <___ZN19cortex_m_quickstart3foo17h20bca7cdf6dc430dE_veneer>:
80001c8: f85f f000 ldr.w pc, [pc] ; 80001cc <___ZN19cortex_m_quickstart3foo17h20bca7cdf6dc430dE_veneer+0x4>
80001cc: 20000001 andcs r0, r0, r1
Disassembly of section .data:
20000000 <cortex_m_quickstart::foo>:
20000000: 3001 adds r0, #1
20000002: 4770 bx lr
20000004 <EXTI0>:
20000004: b580 push {r7, lr}
20000006: 466f mov r7, sp
20000008: f04f 5000 mov.w r0, #536870912 ; 0x20000000
2000000c: 7800 ldrb r0, [r0, #0]
2000000e: f7ff fff7 bl 20000000 <cortex_m_quickstart::foo>
20000012: 2101 movs r1, #1
20000014: f2c2 0100 movt r1, #8192 ; 0x2000
20000018: 7008 strb r0, [r1, #0]
2000001a: bd80 pop {r7, pc}

Both There seems to be no problem in calling RAM functions from a RAM function or from a Flash function but calling a Flash function from a RAM function gave me this linker error (I moved foo to Flash):

= note: /home/japaric/tmp/cortex-m-quickstart/target/thumbv7m-none-eabi/release/deps/cortex_m_quickstart-893c31d38a914189.cortex_m_quickstart0.rcgu.o: In function `cortex_m_quickstart::EXTI0':
/home/japaric/tmp/cortex-m-quickstart/src/main.rs:32:(.data+0xa): relocation truncated to fit: R_ARM_THM_CALL against `cortex_m_quickstart::foo'

which kind of looks familiar. One thing I should note is that if you call e.g. I think it would make sense to have some macro / attribute to make placing functions in RAM easier but this should get more testing first. Again, I have not tested anything myself :-). |
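A possible way around the `R_ARM_THM_CALL` error, sketched here as an assumption and not something tested on hardware in this thread: make the RAM function call the Flash function through a function pointer, so the compiler has to emit an indirect branch instead of a PC-relative `bl`. The names `flash_helper` and `ram_caller` are made up for illustration, and the `#[link_section = ".data"]` attribute is left off so the snippet also builds on a host.

```rust
use core::ptr;

// Stand-in for a function that stays in Flash (hypothetical name).
#[inline(never)]
fn flash_helper(x: u8) -> u8 {
    x.wrapping_add(1)
}

// Stand-in for the function placed in RAM. On target this would also carry
// `#[link_section = ".data"]`; it is omitted here so the sketch runs anywhere.
#[inline(never)]
fn ram_caller(x: u8) -> u8 {
    // Coerce the function item to a plain function pointer.
    let target: fn(u8) -> u8 = flash_helper;
    // The volatile read keeps the compiler from folding the pointer back into
    // a direct call, so the call below goes through a register.
    let target = unsafe { ptr::read_volatile(&target) };
    target(x)
}
```

This only helps when running Flash code from a RAM function is acceptable; it does nothing for use cases where Flash must not be touched at all.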
Sweet! Thank you so much @japaric. That works an absolute treat and has reduced the time to handle an interrupt (including instrumentation so I can measure it but excluding the veneer overhead which consists of quite a few instructions) from 6.2µs to 5.4µs. |
Nice wins! Glad to hear it worked. Let's keep this issue open to track adding this as a proper feature. |
It would be even nicer if we could directly jump into the function in RAM via the interrupt/exception vector rather than have an entry point to an exception handler then jump into the veneer and then jump into RAM and all the way back. |
Things that need to be decided before implementing this:
We could start by simply supporting the first scenario and postpone supporting the other scenarios. In any case, we don't have a great story for placing linker sections in this or that memory region -- everything is kind of hard coded right now. (*) Only the processor can access the CCRAM so CCRAM has better performance than plain RAM because the processor doesn't have to share the CCRAM bus with the DMA. |
I think a common scenario for this feature might be self-programming. When you erase the Flash you can't read or execute code from it, so you need to place the necessary code and data in RAM. The interrupt table should also be in RAM in this case. |
🤔 |
@pftbest you mean putting the whole program in RAM, right? That sounds like a different feature to me. We could have a Cargo feature that, when enabled, changes the Flash reset handler (boot code) to copy the program from Flash to RAM and then jump to the RAM reset handler. What @therealprof asked for was a way to place certain functions and interrupt handlers in RAM, which is also a wanted feature since you may not always be able to fit the whole program in RAM -- in that case you can place only the most performance-critical sections of your program in RAM. @therealprof you can put the vector table in RAM but you need to adjust the VTOR register accordingly. After reset VTOR will always be 0x0 (Flash), I think, so you need to change VTOR in the boot code. |
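One detail to keep in mind when moving the vector table: VTOR does not accept arbitrary addresses. A rough sketch of the alignment rule, assuming an ARMv7-M style VTOR that implements bits [31:7] (minimum 128-byte alignment) and requires the table to be aligned to its size rounded up to a power of two. The function names and the interrupt counts used below are made up for illustration; check the reference manual of the actual device.

```rust
/// Exception slots that precede the first device interrupt in the vector
/// table (initial stack pointer, reset, NMI, HardFault, ...).
const CORE_ENTRIES: usize = 16;

/// Smallest alignment accepted when VTOR only implements bits [31:7]
/// (assumption; device specific).
const MIN_ALIGN: usize = 128;

/// Required alignment, in bytes, for a vector table with `num_irqs`
/// device interrupts.
fn vector_table_alignment(num_irqs: usize) -> usize {
    // Each entry is one 32-bit word.
    let table_bytes = (CORE_ENTRIES + num_irqs) * 4;
    table_bytes.next_power_of_two().max(MIN_ALIGN)
}

/// Returns true if `addr` could be written to VTOR for such a table.
fn is_valid_vtor(addr: u32, num_irqs: usize) -> bool {
    addr as usize % vector_table_alignment(num_irqs) == 0
}
```

For a hypothetical device with 43 interrupts the table is 236 bytes, so it has to sit on a 256-byte boundary: 0x2000_0000 would be fine, 0x2000_0080 would not.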
I'm not working too much with Cortex-M4... ;) |
I linked the M4 documentation because that was the first Google hit :P. But it's also present on the M3 and seems to be optional on the M0+. No mention of it for the M0 so I guess it doesn't exist there. In any case, it's not necessary to put the vector table in RAM -- unless you want to be able to modify it at runtime. |
I haven't seen an M0/M0+ device that had it implemented, at least. I do have plenty of M3 and higher devices too, but I haven't looked too deeply into those architecture details. ;) But I agree that this is mostly a feature one would need for dynamic interrupt routine changes, and I haven't seen a flashing implementation that used interrupts so far... |
this attribute lets you place functions in RAM closes #42
We're about to do this in my day job. Useful when you want the performance of CCRAM, but not all program code fits into it. |
I had the need to run some code from RAM recently and I figured I'd leave my experiences here. Maybe it will inform any future work on this feature, maybe it will help someone else who's having the same problems, or maybe someone will show up and tell me I'm doing it wrong (hopefully it's the last one :-) ). So, I created a Rust function with First, I got linker errors whenever I wanted to do anything in the function:
Switching to the GNU linker, which gave me this:
It's been a long time since I wrote ARM assembler, but I remember that there are various ways to call a function, and that at least one way involves jumping to a relative address. I understand these error messages to mean that the compiler generated such a relative jump, but that the symbol it wanted to jump to turned out to be too far away. That makes sense to me, since this is a RAM function calling a Flash function.

I didn't spend any time trying to get this to work, as calling any Flash function is unacceptable for my use case. I'm trying to program a page of Flash memory, and any access to Flash while this is ongoing will interrupt the process. Problem is, you can't write a lot of Rust code without running into a call to some function that I don't control and thus can't slap a

I wasn't able to write anything useful in release mode either, and I didn't dive too deeply into that. I figured that any solution I can find would be dependent on compiler implementation details and could break any time.

I then considered inline assembler, but the crate I'm working on currently compiles on stable, and I don't think changing that would be acceptable. I ended up writing my function in C and linking that into the Rust code. (Aside: Putting

To summarize, writing Rust code that is guaranteed to run in RAM seems to be pretty much impossible at this point. The solution proposed here (and presumably the one implemented in rust-embedded/cortex-m-rt#100) may help with some performance optimization, but it is useless if executing from Flash is not acceptable. |
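To put numbers on the "too far away" reading: a Thumb-2 `bl` encodes a signed 25-bit, halfword-aligned offset, so it reaches roughly ±16 MiB from the call site, while Flash at 0x0800_0000 and RAM at 0x2000_0000 are about 384 MiB apart. A small sketch of that check follows; the helper name `bl_reaches` is made up, and the Flash target address in the usage note is only an illustrative value taken from the disassembly earlier in the thread.

```rust
/// Lowest offset a Thumb-2 `BL` immediate can encode (-16 MiB).
const BL_MIN_OFFSET: i64 = -(1 << 24);
/// Highest offset a Thumb-2 `BL` immediate can encode (+16 MiB - 2).
const BL_MAX_OFFSET: i64 = (1 << 24) - 2;

/// Returns true if a `BL` located at `insn_addr` can branch directly to `target`.
fn bl_reaches(insn_addr: u32, target: u32) -> bool {
    // In Thumb state the PC used for the offset is the instruction address + 4.
    let offset = target as i64 - (insn_addr as i64 + 4);
    (BL_MIN_OFFSET..=BL_MAX_OFFSET).contains(&offset)
}
```

With the addresses from the disassembly, `bl_reaches(0x2000_000e, 0x2000_0000)` holds (RAM to RAM), and so does `bl_reaches(0x0800_01ae, 0x0800_01c8)` (Flash to the veneer in Flash), but `bl_reaches(0x2000_000e, 0x0800_01c8)` does not, which matches the relocation being truncated.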
Not sure what you expect us to do about that. That's not what this approach is supposed to address. Indeed if you cannot allow any flash access at all you're screwed. Usually I would expect that you write to one flash region while you execute from the other. It would be great to have a detailed control about the inlining of crate code per crate and I've mentioned that before in various places but if you need to have your code in RAM and you cannot use any flash, your only chance is to do what flashers are doing: Loading a pre-compiled binary blob into RAM and executing that. |
I don't expect you to do anything, and I'm sorry if it came off like that.
Yes, but that wasn't obvious to me after reading the comments in this issue, or in rust-embedded/cortex-m-rt#100. At the very least, my comment may prevent someone with the same problem from wasting their time.
Yes. The STM32L0 I'm working with does have two flash banks, and erasing/writing one while executing from the other is possible. However, to write a half-page, I first need to write all 16 words to the memory controller, and any other Flash access (regardless of bank) will interrupt the process.
That's what I ended up doing. |
@hannobraun I think it would be a good idea to document these limitations directly in the rust-embedded/cortex-m-rt#100 RFC so that it is clear to people in the future what is actually executed from RAM and what is not when using |
@eldruin I posted there and linked to my comment. Not sure what the status of that PR is though. |
@hannobraun Ah, I noticed now that the RFC does not include a separate RFC document as otherwise usual. I am not sure where to add it either; maybe directly in the docs of an example or so. |
I have an interrupt handler which is called very frequently and due to the CPU running at high speed I have to use wait states which seems to slow down processing quite a bit. Is there any simple way to declare that the IRQ handler should be copied to RAM during initialisation instead of being run from flash?
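For context, the "copy to RAM during initialisation" step asked about here is essentially what startup code already does for the `.data` section before `main` runs: it copies the initial values from their load address in Flash to their run address in RAM. A minimal sketch of such a copy loop, assuming word-aligned section boundaries; the function and parameter names are illustrative and not taken from any particular crate.

```rust
use core::ptr;

/// Copies words from the load address `sidata` (in Flash) to the run
/// address range `[sdata, edata)` (in RAM).
///
/// # Safety
/// All pointers must be word-aligned, and the source and destination
/// ranges must be valid and must not overlap.
unsafe fn init_data(mut sdata: *mut u32, edata: *mut u32, mut sidata: *const u32) {
    unsafe {
        while sdata < edata {
            // Copy one word, then advance both cursors.
            ptr::write(sdata, ptr::read(sidata));
            sdata = sdata.add(1);
            sidata = sidata.add(1);
        }
    }
}
```

Anything the linker places in that section, including a function, gets carried along by this copy, so no extra per-function copy code is needed.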