Multiple SoCs have flash configurations unsupported by MCUboot #713

Open
d3zd3z opened this issue Apr 10, 2020 · 30 comments
Labels
area: core Affects core functionality stale

Comments

@d3zd3z
Member

d3zd3z commented Apr 10, 2020

Some newer devices have flash configurations that are not supported currently by MCUboot. This issue attempts to collect these in one place to help with the design of any solutions to handle this situation.

Known devices:

  • LPC55S69, 512-byte writes, 512-byte erase
  • PSOC6x, 512-byte writes, 512-byte erase
  • SAM-X70, 16-byte writes, 128k-byte erase
@jameswalmsley

STM32H7 (43) 32 bytes write, 128K erase

@ghost

ghost commented Jan 26, 2021

A further complication with STM32H7 is that the internal flash has an integrated ECC. Writing to the same flash word twice has a high probability of causing an ECC error, and if it's a double ECC error then a read results in a bus error. Does MCUboot rely on being able to write to the same flash word more than once?

@utzig
Member

utzig commented Jan 26, 2021

Does MCUboot rely on being able to write to the same flash word more than once?

No, it only writes once to any "word", which is the alignment size for the flash supplied by the OS.
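
The write-once rule discussed here can be checked in a host-side model of the flash. The following is only a minimal sketch, not MCUboot's simulator: the names `sim_flash`, `sim_flash_erase`, `sim_flash_write` and the sizes are hypothetical, and the 32-byte write unit is taken from the STM32H7 figure quoted earlier in the thread. A second write to an already programmed unit is rejected, which is roughly what a test harness would need in order to catch code paths that could trigger an ECC error on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulated flash: 32-byte write units, as reported for STM32H7. */
#define SIM_WRITE_SZ   32u
#define SIM_FLASH_SZ   1024u
#define SIM_NUM_WORDS  (SIM_FLASH_SZ / SIM_WRITE_SZ)

struct sim_flash {
    uint8_t data[SIM_FLASH_SZ];
    bool    written[SIM_NUM_WORDS];   /* true once a write unit has been programmed */
};

/* Erase everything: all bytes 0xFF, all write units writable again. */
static void sim_flash_erase(struct sim_flash *f)
{
    assert(f != NULL);
    memset(f->data, 0xFF, sizeof(f->data));
    memset(f->written, 0, sizeof(f->written));
}

/*
 * Program `len` bytes at offset `off`. Returns 0 on success, or -1 if the
 * request is misaligned, out of range, or touches a write unit that was
 * already programmed since the last erase.
 */
static int sim_flash_write(struct sim_flash *f, size_t off,
                           const void *src, size_t len)
{
    assert(f != NULL && src != NULL);

    if (off % SIM_WRITE_SZ != 0 || len % SIM_WRITE_SZ != 0 || len == 0) {
        return -1;
    }
    if (off > SIM_FLASH_SZ || len > SIM_FLASH_SZ - off) {
        return -1;
    }

    size_t first = off / SIM_WRITE_SZ;
    size_t count = len / SIM_WRITE_SZ;

    /* Check every unit first so a rejected request leaves the flash untouched. */
    for (size_t i = 0; i < count; i++) {
        if (f->written[first + i]) {
            return -1;   /* second write to the same unit */
        }
    }

    memcpy(&f->data[off], src, len);
    for (size_t i = 0; i < count; i++) {
        f->written[first + i] = true;
    }
    return 0;
}
```

In this model, writing the same 32-byte unit twice without an erase in between fails on the second call, and so does any write that is not aligned to the write unit.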

@ghost

ghost commented Jan 27, 2021

I have MCUboot working at a PoC level on STM32H7. This required changing both BOOT_MAX_ALIGN and BOOT_MAGIC_SZ to 32. The boot_img_magic array had to be changed as well (one copy in bootutil_misc.c, one in image.py, one in Zephyr's mcuboot.c, maybe more) because it's only 16 bytes, which is less than the alignment. All in all these are simple changes, but just changing BOOT_MAX_ALIGN for everyone isn't backwards compatible.

Is there any good reason to not make BOOT_MAX_ALIGN configurable? A user could set it to e.g. 32, which would then cause the magic to be padded to 32 bytes. (Or 16, or 512?). Everyone else would keep it at 8 and keep compatibility.
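
To make the proposal concrete, here is a minimal sketch of how a configurable alignment could drive the size of the magic field. It is an illustration only: the `SKETCH_*` names and `sketch_fill_magic` are hypothetical and are not MCUboot identifiers, the 16-byte base size comes from the comment above, and padding with 0xFF after the magic is purely an assumption of this sketch, not MCUboot's trailer format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical build-time knob: 8 keeps the current layout, 32 would suit STM32H7. */
#ifndef SKETCH_MAX_ALIGN
#define SKETCH_MAX_ALIGN 32u
#endif

/* Size of the existing magic value, per the discussion above. */
#define SKETCH_BASE_MAGIC_SZ 16u

/* Round `n` up to the next multiple of `align`. */
#define SKETCH_ALIGN_UP(n, align) ((((n) + (align) - 1u) / (align)) * (align))

/* The magic field grows to a whole number of write units. */
#define SKETCH_MAGIC_SZ SKETCH_ALIGN_UP(SKETCH_BASE_MAGIC_SZ, SKETCH_MAX_ALIGN)

/*
 * Build the padded magic field: the 16-byte `magic` followed by padding.
 * The 0xFF pad value is an assumption of this sketch.
 */
static void sketch_fill_magic(uint8_t out[SKETCH_MAGIC_SZ],
                              const uint8_t magic[SKETCH_BASE_MAGIC_SZ])
{
    assert(out != NULL && magic != NULL);
    memset(out, 0xFF, SKETCH_MAGIC_SZ);
    memcpy(out, magic, SKETCH_BASE_MAGIC_SZ);
}
```

With an alignment of 8 or 16 the field stays at 16 bytes, so existing images would be unaffected; with 32 it becomes 32 bytes and with 512 it becomes 512 bytes. Any tool that writes or checks the magic would have to apply the same rounding.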

@jameswalmsley

jameswalmsley commented Jan 27, 2021 via email

@nvlsianpu
Collaborator

I have MCUboot working on a PoC level on STM32H7. This required changing both BOOT_MAX_ALIGN and BOOT_MAGIC_SZ to 32

@d3zd3z @utzig What is your opinion about making this configurable?

@utzig
Member

utzig commented Jan 28, 2021

@d3zd3z @utzig What is your opinion about making this configurable?

If someone is gonna tackle it, the person has to fix bootutil, the simulator, imgtool, mcumgr and newt, maybe the integrations in the supported OSes, and maybe other stuff which I fail to remember. Probably a bit more work than it might seem at first, but I don't think there are any big technical impediments.

@d3zd3z
Member Author

d3zd3z commented Jan 29, 2021

The other thing that is going to come up is that adding simulator support for this type of configuration is going to point out the "rare" or "occasional" failures, and we'll need to actually figure out a way to fix them. Having some percentage of upgraded devices need recovery really isn't something I'd consider acceptable for a regular option.

We do have a completely different swap strategy under development that is intended for devices where the writes are larger (and typically use ECC); however, this requires the erase size to be fairly small. Having 8k erases would waste quite a bit of flash for these sectors.

However, the existing swap code should be assuming that each write block can only be written once, so this is probably a corner case bug, perhaps because of the larger write size.

@ghost

ghost commented Mar 18, 2021

@jameswalmsley Your description of how you handled ECC errors on STM32H7 by involving the watchdog did not fill me with joy. So I came up with a different way to solve the problem, by trapping ECC errors and returning them as -EIO in the flash API. Normally a bus fault can't be trapped, but there is a way around that.

I've opened zephyrproject-rtos/zephyr#33140 with a description of the problem, my proposed solution, and I link to some code that shows that it can work. The code fiddles with some architectural registers, so it would be good to get some feedback from someone who knows more about how those registers interact with the rest of the system.

@jameswalmsley

@weinholtendian Nice, I have to check out your solution.
Yes, the watchdog was really a "catch-all" solution from having to implement something quickly, and the system we work on can easily be recovered by an external device, should it all go wrong.

There are some other systems that we have that won't tolerate that though, so a PR on this is great timing :)
I was trying to find some way of stopping the hard-fault like you have done in your current implementation, but unfortunately didn't have time to attempt it.

I will pull in your PR and check it on our systems and try to review it soon.

We've also created a new swap method for the H7 that makes use of the STM32H7 bank-swapping.
It works really well, and is much faster because fewer sectors need to be erased and written.

@SwissKnife64

I'd like to confirm also that I have been running MCUBoot on STM32H7 for 12 months now. We made the same changes as you described above. We did find it's possible to get an ECC error if we lost power during an image swap, and so MCUBoot would cause a hard-fault during swap-resume. We solved this by:

1. Using a watchdog if an ECC fault is triggered.
2. If boot-reason is due to a watchdog in the bootloader then we erase the image and scratch areas.
3. We provide a recovery mode, where we wait for a repair image over DFU.
4. Just before we boot the application we set a flag to say the firmware was booted. If the firmware triggers WD then we don't cause recovery.
5. If firmware crashes multiple times without a power-cycle, recovery is triggered. (We also count the number of watchdog resets.)

Due to the possibility of getting an ECC error during resume, we have disabled resuming of partial image-swaps. Interruption of image swaps will cause a recovery.

Probably sounds a bit complex, but this has worked really well for our application. We went for catching ECC faults with a watchdog and recovery mode to ensure that all eventualities are covered, and no matter what happens we can recover the device.

Best
J
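
The recovery rules in the quoted comment can be read as a small decision function that runs at the start of the bootloader. The sketch below is one interpretation, not the poster's actual code: `reset_info`, `decide_boot_action`, the action names and the threshold `MAX_APP_WDT_RESETS` are all hypothetical, and how the flags survive a reset (backup registers, retained RAM, and so on) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the bootloader should do next (hypothetical names). */
enum boot_action {
    ACTION_BOOT_NORMALLY,      /* continue with the usual boot flow */
    ACTION_ERASE_AND_RECOVER,  /* erase image and scratch areas, wait for DFU */
    ACTION_RECOVER             /* enter recovery mode without erasing first */
};

/* State assumed to survive a reset but not a power cycle. */
struct reset_info {
    bool     watchdog_reset;   /* the last reset was caused by the watchdog */
    bool     app_was_started;  /* flag set just before jumping to the application */
    unsigned wdt_reset_count;  /* watchdog resets since the last power cycle */
};

/* Assumed threshold for "crashes multiple times"; the thread gives no number. */
#define MAX_APP_WDT_RESETS 3u

static enum boot_action decide_boot_action(const struct reset_info *info)
{
    assert(info != NULL);

    if (!info->watchdog_reset) {
        return ACTION_BOOT_NORMALLY;
    }
    if (!info->app_was_started) {
        /* Watchdog fired inside the bootloader, e.g. an ECC fault during swap. */
        return ACTION_ERASE_AND_RECOVER;
    }
    if (info->wdt_reset_count >= MAX_APP_WDT_RESETS) {
        /* The application keeps crashing without a power cycle. */
        return ACTION_RECOVER;
    }
    /* A single application watchdog reset does not trigger recovery. */
    return ACTION_BOOT_NORMALLY;
}
```

Keeping the decision in a pure function like this makes it easy to test on a host; the hardware-specific parts are reading the reset cause and storing the flag and counter somewhere that survives a watchdog reset.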

Hello J
We are implementing an application with Zephyr on an STM32H743 and are struggling with integrating MCUBoot because of the 32-byte minimum flash write size of the H7.
It looks like you have solved the alignment problem. Could you please share your solution code with us?
Happy coding
Chris

@github-actions

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@d3zd3z
Member Author

d3zd3z commented Jun 8, 2021

Re-opening to track.

@d3zd3z d3zd3z reopened this Jun 8, 2021
@nvlsianpu nvlsianpu reopened this Jun 23, 2021
@github-actions github-actions bot closed this as completed Jul 8, 2021
@d3zd3z d3zd3z removed the stale label Jul 14, 2021
@d3zd3z d3zd3z reopened this Jul 14, 2021
@elcritch

We do have a completely different swap strategy under development that is intended for devices where the writes are larger (and typically use ECC); however, this requires the erase size to be fairly small. Having 8k erases would waste quite a bit of flash for these sectors.

Is there any code that I could help test or implement for ECC flash? I have both LPC55S69 and CC3220 and would be interested to see if it'd be possible to get them both working with OTA, even if it's on a fork (for now).

However, the existing swap code should be assuming that each write block can only be written once, so this is probably a corner case bug, perhaps because of the larger write size.

Alternatively, it sounds like there is a bug, and fixing it would make it easier to use ECC flashes with the current scheme by setting a larger block size? Any pointers on where to dive in would be great.

@elcritch

This seems related to 841: Boot: Introduce new swap method using status partition, especially for chips with ECC based flash?

@dleach02
Contributor

@d3zd3z can we consider reopening this ticket? We have a hyperflash platform that has this problem of not being able to be supported by MCUBoot due to the size of the write (it has ECC)

@irose-PeLtd

I'd like to see a fix for this, and I'd be willing to contribute to that. We use the LPC55xx processors and being able to use MCUboot with them would be nice, especially since we are already using it with the Nordic nRF52 and i.MX RT series of micros. It seems as though the issue is isolated to memories that implement ECC, is that correct?

@d3zd3z
Member Author

d3zd3z commented Sep 13, 2022

I'll go ahead and re-open this, as I am actually working on what I hope is a solution to this. I want to basically add support for flash devices with large write sizes. This will likely also require relatively small erase sizes.

@vipulkute-eaton

Hi @d3zd3z

I am facing a problem with firmware upgrade on an STM32H743 controller because of the 32-byte alignment issue. Has this problem been fixed in any of the newer releases? I am looking for a standard solution that is compatible with other STM32 controllers as well. Could you please provide an update on this issue?
Thanks.

@RomainPelletant

Any news regarding LPC55xx series support? It would be awesome.
If it is still not supported, does a PR or draft that tries to implement it exist?

@maximevince
Contributor

Same question over here. Is there an ongoing effort? Anyone from NXP that can assist? Maybe @DerekSnell ? (Referring to zephyrproject-rtos/zephyr#49246)

@GeorgeCGV
Contributor

GeorgeCGV commented Mar 8, 2023

Is there any good reason to not make BOOT_MAX_ALIGN configurable?

#1609 but that only allows setting a custom value.

Didn't do anything regarding:

fix bootutil, the simulator, imgtool, mcumgr and newt, maybe the integrations in the supported Oses, and maybe other stuff which I fail to remember.

@dleach02
Contributor

Need to unstale this

@github-actions github-actions bot removed the stale label Sep 13, 2023
@github-actions github-actions bot added the stale label Mar 12, 2024
@maximevince
Contributor

maximevince commented Mar 22, 2024

It seems MCUboot is now supported on the LPC5500 devices, although only in UPGRADE_ONLY mode.

Linking @DerekSnell from NXP's reply here, for those interested in LPC55xx support:
zephyrproject-rtos/zephyr#49246 (comment)

@github-actions github-actions bot removed the stale label Mar 23, 2024
@utzig
Member

utzig commented Apr 3, 2024

One fix for this would be to change the way the upgrade process is "logged". Instead of writing to flash each step of the process, as it's done now, which is limited by the "write size", one could build a table of the pre-calculated CRC-32 of every sector which will be swapped, and save it all in a single write. If the swap is interrupted the CRC-32 data can be used to find where it stopped. At least I think it makes sense in theory! Not a walk in the park, but probably not too hard and time consuming to create a PoC.
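
A rough sketch of this idea, under simplifying assumptions, might look like the following. It only models sectors moving from the secondary slot into the primary slot (not a full two-way swap through scratch), both slots are plain RAM buffers, and the names `swap_crc_table`, `build_table`, `find_resume_sector` and the sizes are hypothetical. The CRC-32 is the common reflected variant with polynomial 0xEDB88320.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SZ   64u   /* illustrative sizes, not real flash geometry */
#define NUM_SECTORS 4u

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320, init and final XOR 0xFFFFFFFF). */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Table computed before the upgrade starts and stored with a single flash write. */
struct swap_crc_table {
    uint32_t incoming[NUM_SECTORS];  /* CRC-32 of each sector destined for the primary slot */
};

/* `secondary` points at NUM_SECTORS * SECTOR_SZ bytes of the new image. */
static void build_table(struct swap_crc_table *t, const uint8_t *secondary)
{
    assert(t != NULL && secondary != NULL);
    for (size_t i = 0; i < NUM_SECTORS; i++) {
        t->incoming[i] = crc32_calc(&secondary[i * SECTOR_SZ], SECTOR_SZ);
    }
}

/*
 * After an interruption, return the index of the first primary sector that
 * does not yet hold its incoming content, or NUM_SECTORS if all sectors match.
 */
static size_t find_resume_sector(const struct swap_crc_table *t, const uint8_t *primary)
{
    assert(t != NULL && primary != NULL);
    for (size_t i = 0; i < NUM_SECTORS; i++) {
        if (crc32_calc(&primary[i * SECTOR_SZ], SECTOR_SZ) != t->incoming[i]) {
            return i;
        }
    }
    return NUM_SECTORS;
}
```

The sector at the returned index may be half-written, so a real implementation would presumably erase and redo it. On ECC flash, reading such a sector to compute its CRC could itself fault, which this sketch does not address.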

@macharlachanakya

is triggered. (We also count the number of watchdog resets). Due to the possibility of getting an ECC error during resume, we have disabled resuming of partial image-swaps..

We are also using an STM32H7 and are facing the same issue. We could not follow your comments regarding BOOT_MAX_ALIGN and BOOT_MAGIC_SZ; how can we adapt these changes?

@github-actions github-actions bot added the stale label Jan 19, 2025