This repository has been archived by the owner on Jan 24, 2022. It is now read-only.

DefaultPreInit gets random address in debug mode [Stack pointer initialization issue] #102

Closed
korken89 opened this issue Sep 3, 2018 · 5 comments

Comments

@korken89
Contributor

korken89 commented Sep 3, 2018

Hi all,

I have stumbled on a strange issue that took some work to track down, as it does not happen often.
When compiling in debug mode, the address provided by __pre_init varies depending on when the system was reset. The cause is that the stack pointer has not been initialized yet, so a corrupt stack pointer value passes through.
This can occur if the system's stack pointer has been corrupted and a soft reset is then requested: the erroneous stack pointer value sticks until after the reset.

I wrongly assumed that the system always reads the stack pointer from the vector table and writes it to SP. Hence I recommend that we add an assembly instruction to set MSP, so that this kind of undefined behavior is not possible.
Comments @rust-embedded/cortex-m ?

Example of corrupt stack pointer after reset:

r0             0x8000251        134218321
r1             0xf00000         15728640
r2             0x20000000       536870912
r3             0x0              0
r4             0x0              0
r5             0x0              0
r6             0x0              0
r7             0x0              0
r8             0x0              0
r9             0x0              0
r10            0x0              0
r11            0x0              0
r12            0x0              0
sp             0xfffffe90       0xfffffe90
lr             0x8000315        0x8000315 <core::mem::uninitialized+12>
pc             0x80001c4        0x80001c4 <Reset+12>
xpsr           0x61000003       1627389955
fpscr          0x0              0
msp            0xfffffe90       0xfffffe90
psp            0x0              0x0
primask        0x0              0
basepri        0x0              0
faultmask      0x0              0
control        0x0              0
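
To make it concrete why the sp/msp value in the dump above cannot be a real stack pointer, here is a minimal host-side sketch in Rust. The RAM bounds (RAM_START, RAM_SIZE) and the helper sp_is_plausible are assumptions made up for illustration; they are not taken from this issue or from cortex-m-rt, and real values would come from the device's memory layout.

```rust
/// Hypothetical RAM bounds, for illustration only. Real values come from
/// the device's linker script, not from this thread.
const RAM_START: u32 = 0x2000_0000;
const RAM_SIZE: u32 = 20 * 1024; // assumed 20 KiB of SRAM

/// Returns true if `sp` could be a valid main stack pointer: it lies in
/// (RAM_START, RAM_START + RAM_SIZE] and is 4-byte aligned. The upper
/// bound is inclusive because a full-descending stack starts one past
/// the last RAM byte.
fn sp_is_plausible(sp: u32) -> bool {
    let ram_end = RAM_START + RAM_SIZE;
    sp > RAM_START && sp <= ram_end && sp % 4 == 0
}

fn main() {
    // sp/msp value taken from the register dump above.
    println!("dump value plausible: {}", sp_is_plausible(0xffff_fe90));
    // An initial SP placed at the (assumed) end of RAM.
    println!("end of RAM plausible: {}", sp_is_plausible(RAM_START + RAM_SIZE));
}
```

A check like this only tells you that the value is garbage. It says nothing about how the core got into that state.
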
@adamgreig
Member

As far as I can tell from the architecture reference manuals, both local and power-on resets should cause SP to be written with the first vector table entry (B1.5.5 for both ARMv6-M and ARMv7-M). How are you generating the reset that causes this situation? Is it possible the code is jumping to the reset handler without actually doing a reset at all?
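
The difference between a real reset and a bare jump to the reset handler can be sketched with a toy model in Rust. This runs on the host and is not actual startup code. The Core struct, its method names, and the initial SP value 0x2000_5000 are assumptions for illustration. The Reset address 0x080001b8 is derived from the `<Reset+12>` at 0x80001c4 in the register dump in the opening comment.

```rust
/// Toy model of the two registers that matter for this issue.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Core {
    msp: u32,
    pc: u32,
}

impl Core {
    /// Roughly what the architecture does on reset: load MSP from vector
    /// table entry 0 and PC from entry 1 (bit 0 of entry 1 is the Thumb
    /// bit, not part of the address).
    fn architectural_reset(&mut self, vector_table: &[u32; 2]) {
        self.msp = vector_table[0] & !0x3;
        self.pc = vector_table[1] & !0x1;
    }

    /// What a debugger might do instead: redirect execution to the reset
    /// handler and leave every other register as it was.
    fn jump_to_reset_handler(&mut self, vector_table: &[u32; 2]) {
        self.pc = vector_table[1] & !0x1;
    }
}

fn main() {
    // Entry 0: assumed initial SP. Entry 1: Reset handler with Thumb bit set.
    let vector_table = [0x2000_5000, 0x0800_01b9];
    // State before the "reset", with the corrupt MSP from the dump.
    let corrupted = Core { msp: 0xffff_fe90, pc: 0 };

    let mut after_reset = corrupted;
    after_reset.architectural_reset(&vector_table);

    let mut after_jump = corrupted;
    after_jump.jump_to_reset_handler(&vector_table);

    println!("real reset: {:x?}", after_reset);
    println!("bare jump:  {:x?}", after_jump);
}
```

In this model both paths end up at the same PC, but only the architectural reset replaces the stale MSP, which matches the symptom described above.
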

@korken89
Contributor Author

korken89 commented Sep 3, 2018

Hmm, I think you are correct. I have seen it both when flashing an empty chip and when restarting (load + continue) via the debugger after a hardfault. Loading with a forced reset does indeed help; I had assumed that load performed a reset on its own.

Still, being able to arrive in this state at all is, for me, unacceptable. :)

@adamgreig
Member

Sounds like an error in the debugger if it's jumping to the reset vector without either doing an actual reset or loading SP itself. Who knows what else is not correctly initialised? I usually use run rather than continue after reflashing or hitting exception handlers, which I think does cause a reset.

I'm not super keen on having every single cortex-m-rt program begin by writing SP when that's exactly what the hardware should do anyway. What else would we have to do to ensure correct state? Any of the peripheral registers (both core and device-specific) could also be wrong...

@ithinuel
Member

ithinuel commented Sep 3, 2018

In my experience you often need a stepi to make gdb/openocd actually load/refresh the register states after any kind of reset. Before that the value you read is not reliable.

@korken89
Contributor Author

korken89 commented Sep 4, 2018

> Who knows what else is not correctly initialised?

Indeed, this is true. A full reset is what should be done to be on the safe side.

For now I will change the startup scripts to do load + tb Reset + run, rather than the old load + stepi that is currently part of cortex-m-quickstart.

Just for comparison, I checked how ChibiOS and FreeRTOS handle initialization, and both set the MSP (ChibiOS conditionally, based on a define).
I do not know the design decisions that went into their startup code, but it seems a bit odd to have it if the value is guaranteed, as you point out, @adamgreig.
