scripts: generate image info header file #41685

Merged
merged 2 commits into from
Jan 22, 2022

Conversation

tejlmand
Collaborator

This PR is inspired by the discussion at dev-review, 2022-01-06 and #41579

#41579 embeds a binary Zephyr image inside the current Zephyr image in order to load it onto the second core through RAM.
This works, but the two images are independent of each other.
One disadvantage of converting the ELF file to a binary image and embedding it directly is that any gaps between segments will be padded, and this padding will then also be copied.
By providing the ability to generate a header with segment information, the image can be flashed independently, and it can be copied without padding data because the LMA address and size of each segment are available.


This commit adds the gen_image_info.py script which supports creation
of a header file with image information from the ELF file.

This version populates the header file with:

  • Number of segments in the image
  • LMA address of each segment
  • VMA address of each segment
  • Size of each segment

The header file can be used by a secondary build system which needs this
information.

Signed-off-by: Torsten Rasmussen [email protected]
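
As a rough illustration of the kind of header described above, here is a minimal Python sketch that renders segment information as C preprocessor defines. It is not the actual `gen_image_info.py`; the `Segment` type, the `render_image_info` helper, the include guard, and the macro names (`SEGMENT_NUM`, `SEGMENT_LMA_ADDRESS_<n>`, and so on) are assumptions chosen for illustration, and the segment list is passed in directly instead of being read from an ELF file.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    lma: int   # load address: where the bytes are stored / flashed
    vma: int   # virtual address: where the code expects the bytes at run time
    size: int  # number of bytes in the segment


def render_image_info(segments):
    """Return header text for the given segments (macro names are hypothetical)."""
    lines = ["#ifndef ZEPHYR_IMAGE_INFO_H", "#define ZEPHYR_IMAGE_INFO_H", ""]
    lines.append(f"#define SEGMENT_NUM {len(segments)}")
    for i, seg in enumerate(segments):
        lines.append(f"#define SEGMENT_LMA_ADDRESS_{i} 0x{seg.lma:08x}")
        lines.append(f"#define SEGMENT_VMA_ADDRESS_{i} 0x{seg.vma:08x}")
        lines.append(f"#define SEGMENT_SIZE_{i} 0x{seg.size:x}")
    lines += ["", "#endif /* ZEPHYR_IMAGE_INFO_H */"]
    return "\n".join(lines) + "\n"
```

A consuming build system would then include the generated file and iterate over the `SEGMENT_NUM` entries.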

Collaborator

@SebastianBoe SebastianBoe left a comment

This looks quite useful.

Did you consider include/generated ?

sebo@sebo:~/ncs/nrf/samples/bluetooth/peripheral_uart$ ls build/zephyr/include/
generated
sebo@sebo:~/ncs/nrf/samples/bluetooth/peripheral_uart$ ls build/zephyr/include/generated/
app_data_alignment.ld  devicetree_unfixed.h  snippets-data-sections.ld  syscall_dispatch.c
app_smem_aligned.ld    driver-validation.h   snippets-noinit.ld         syscall_list.h
app_smem.ld            kobj-types-enum.h     snippets-ram-sections.ld   syscalls
app_smem_unaligned.ld  ncs_version.h         snippets-rodata.ld         version.h
autoconf.h             offsets.h             snippets-rom-start.ld
device_extern.h        otype-to-size.h       snippets-rwdata.ld
devicetree_fixups.h    otype-to-str.h        snippets-sections.ld

@SebastianBoe
Collaborator

How dynamic are the values that #41579 needs?

Is it not possible to know statically the locations and sizes and then pass those on to the LD and to the other image?

@tejlmand
Collaborator Author

Did you consider include/generated ?

Yes, but that would be the wrong location for this file, because include/generated is used by the current build, whereas include/public is to be used by another build.
For example, a second Zephyr image has its own include/generated, so placing zephyr_image_info.h in the same folder as the other generated files could become a mess, with a high risk that the second Zephyr image suddenly pulls in the wrong set of headers for everything else.

@tejlmand
Collaborator Author

Is it not possible to know statically the locations and sizes and then pass those on to the LD and to the other image?

well, for the image being built how will you know this before the final linking stage ?

@tejlmand tejlmand added the DNM This PR should not be merged (Do Not Merge) label Jan 10, 2022
@tejlmand
Collaborator Author

DNM until comments from @danieldegrasse / #41579, to be sure this direction is also suitable for that use case.

Collaborator

@danieldegrasse danieldegrasse left a comment

I think this will work for my use case, but I'm unsure if the implementation will be useful for the general "load image to ram" use case. My concern stems from the fact that the M4 core and M7 core on the RT1170 have different address spaces, and hence only some memory can be accessed by the M7 to load the M4 image. To manage this, the M4 image is built with CONFIG_XIP, even though it executes from RAM. This way, the M4 image will store the data section as a relocatable section, and copy it to RAM at boot (the region the M4 uses as RAM is only accessible to it).

My concern is that with this method, the M4 image now must be linked into flash in order to produce useful LMAs. However, with the M4 image linked to flash, and the M4 core executing with CONFIG_XIP enabled, the data section will be copied from the LMA address in flash, correct? In the specific case of the RT1170 the flash part has the same address and is accessible to both the M4 and M7 cores, but I'm not sure this is the case for every SOC.

To summarize, my concern is the following:

  • enabling this method of loading the image requires LMAs to be set to addresses in flash
  • since each core has different memory mappings, data section must be copied from "flash" to the RAM section
  • with LMAs set to addresses in flash, cores that are only capable of reading from RAM addresses will fail to load data sections

@tejlmand Please let me know if this concern makes sense. I might be missing something, but I don't currently see a way to relocate the data section at boot without loading it from the LMA to the VMA within Zephyr

@SebastianBoe
Collaborator

well, for the image being built how will you know this before the final linking stage ?

The sizes of the segments could be determined in Kconfig and then the LD could link accordingly.

But in any case, this is a useful approach for those that want dynamic sizes.

So +1 from me.

@danieldegrasse
Collaborator

How dynamic are the values that #41579 needs?

@SebastianBoe in theory, the primary image could be made aware that the second image will reside within a flash partition of a given size, and the entire contents of the flash partition would be copied from flash to ram. However, this may introduce a large amount of unnecessary memory copying at boot, in the case of a small image. So it would be preferable to know the size of the image (or the size of the image segments) at build time to permit the smallest possible copies
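
To make the trade-off concrete, the following Python sketch simulates copying only the known segments instead of a whole flash partition. It is an illustration under stated assumptions, not code from either PR: `copy_segments`, the byte-array model of flash and RAM, and the base addresses are all hypothetical.

```python
def copy_segments(flash, flash_base, ram, ram_base, segments):
    """Copy each (lma, vma, size) segment from flash to RAM.

    flash and ram are bytes-like buffers that model memory starting at
    flash_base and ram_base. Returns the total number of bytes copied.
    """
    copied = 0
    for lma, vma, size in segments:
        src = lma - flash_base   # offset of the segment inside the flash buffer
        dst = vma - ram_base     # offset of the segment inside the RAM buffer
        ram[dst:dst + size] = flash[src:src + size]
        copied += size
    return copied
```

With per-segment sizes available at build time, the copy cost scales with the image contents and not with the partition size.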

@tejlmand
Collaborator Author

I think this will work for my use case, but I'm unsure if the implementation will be useful for the general "load image to ram" use case.

Thanks for the comment.
I don't believe this PR can solve all the possible cases out there, but it would be nice if it could 😉

But if generating a public header with useful information, just like we export compile commands and generate a Makefile export for integrating with other tools / build systems, will be helpful, then I think it makes sense to do so.

How that information is then used must be decided by the consuming system.

My concern is that with this method, the M4 image now must be linked into flash in order to produce useful LMAs. However, with the M4 image linked to flash, and the M4 core executing with CONFIG_XIP enabled, the data section will be copied from the LMA address in flash, correct?

Yes, that would most likely be the case here, as the image will have the LMA of the flash.
To be sure I understand this arch correctly, then the load procedure the M4 / M7 needs would be:

M7 core                           | M4 core
Copy segments (code/data)         | XIP code 
LMA (flash) to VMA (shared RAM)   | Copy data VMA to M4 only RAM

So what we need to my understanding would be two sets of LMA and VMA addresses for the M4 image, like this:

LMA1 : Load address seen by M7 and where image is flashed
VMA1 : Address in RAM to where the M7 will load the image
LMA2 : Load address seen by M4 and where image is located when M4 starts up.
VMA2 : For M4 code, this is identical to VMA1
       For M4 data, this will be the internal M4 RAM

I can see this being a bit tricky if we want to have a regular M4 image that can be flashed independently of the M7 image, as that requires the ELF image to have LMA1 addresses.
The M7 image will not directly be using the VMA address, because it doesn't know that address. The M7 only knows the addresses provided in zephyr_image_info.h, just like it doesn't know the internals of the M4 image today; it just copies the .remote.data section to the OCRAM address.
So the M4 image should be able to use the VMA2 addresses which would result in:

  • VMA2 (== VMA1) for code will be copied correctly by M7
  • VMA2 for data.

So one option would be to build the image for M4 as today, and then post-process the ELF image adjusting the LMA.
The LMA will be used by the flash tools, but the code itself will use the internal values set during linking and not the information directly in the ELF file.
On target the image knows nothing about ELF.

We can then have all knowledge contained in the info header.
The linked LMA and VMA addresses, but also the adjusted addresses.

@tejlmand
Collaborator Author

tejlmand commented Jan 12, 2022

@danieldegrasse added an extra commit here: 0ddd88d

This commit allows adjusting the LMA address of the built image for flashing purposes.
For the M4 image case you can then build with the correct addresses (LMA/VMA), but adjust the ELF / hex file so that it will be flashed to a dedicated M4 partition from where the M7 can copy the image to RAM.

That can be set up, for example, by having this in Kconfig:

DT_CHOSEN_IMAGE_M4 := nxp,m4-partition
DT_CHOSEN_Z_FLASH := zephyr,flash

config BUILD_OUTPUT_ADJUST_LMA
       default "$(dt_chosen_reg_addr_hex,$(DT_CHOSEN_IMAGE_M4))-\
       $(dt_chosen_reg_addr_hex,$(DT_CHOSEN_Z_FLASH))"

Note, this requires that the devicetree for the M4 uses partitions, which doesn't seem to be the case today.

@danieldegrasse
Collaborator

But if generating a public header with useful information, just like we export compile commands and generate a Makefile export for integrating with other tools / build systems, will be helpful, then I think it makes sense to do so.

I agree. I think even if this PR does not enable the specific use case for the RT1170, it is still a valuable feature.

So what we need to my understanding would be two sets of LMA and VMA addresses for the M4 image, like this:

That set of LMA and VMA addresses looks right to me, as does the boot flow. The key difference, which you captured, is that in the case of the M4, the LMA address the actual ELF file is flashed to by debugging tools and the LMA address the image is built with must differ.

@danieldegrasse added an extra commit here: 0ddd88d

This looks like a great fix. I haven't had time to verify this, but will test later today and ensure it works for the RT1170 use case

Kconfig.zephyr Outdated
This will not affect the internal address symbols inside the image but
can be useful when adjusting LMA addresses for flash tools or multi
stage loaders where a pre-loader may copy image to a second location
before booting second core.
Collaborator

This is very contradictory, the config name implies only LMA is adjusted.

But the first sentence states that VMA is also adjusted.

And then the second sentence says that the "internal address symbols" are not affected. I assume "internal address symbols" is referring to VMA again. So this contradicts the first sentence.

Collaborator Author

@tejlmand tejlmand Jan 12, 2022

This is very contradictory, the config name implies only LMA is adjusted.

But the first sentence states that VMA is also adjusted.

That is because the purpose of this flag is to adjust the LMA so that the image may be automatically flashed at a different address or adjusted when converting to hex (bin files are unaffected by this as they don't contain any address information (LMA/VMA)).

When flashing an ELF, the flash tool will use the information provided by LMA to know where to flash the segments.

But when using objcopy --change-addresses, both LMA and VMA get adjusted, not only the LMA.
Thus I find it important to make the user aware of this, although the purpose is to get the LMA adjusted, hence only LMA in the name (we don't care about the VMA in this particular case).

There could be other bintools, for example elfconvert with the ARM Compiler that supports adjusting only LMA, but so far I've focused only on GNU bintools.

And then the second sentence says that the "internal address symbols" are not affected. I assume "internal address symbols" is referring to VMA again. So this contradicts the first sentence.

Internal address symbols are those used by the code and populated by the linker.
For example, if code refers to __text_region_start and that symbol points to address 0x00001212, then the address of that symbol is unaltered when we change LMA. Meaning that if LMA is adjusted with 0x10000000, then __text_region_start is still pointing to 0x00001212.

I don't expect all developers to be aware of such details, thus I find it important to inform.

I will try to see if I can find a better description, but any proposals are more than welcome.
Hopefully you get the intention of the sentence now.

Collaborator

(bin files are unaffected by this as they don't contain any address information (LMA/VMA)).

VMA is encoded into every machine instruction generated so bin files do contain VMA information.

Collaborator Author

VMA is encoded into every machine instruction generated so bin files do contain VMA information.

Correct, but it is not encoded as VMA information in itself.
In the binary it is purely an address; VMA is a term, and in the ELF file it is the p_vaddr used when loading an ELF.
But a bin file is not loaded as an ELF when flashed (the native posix target does load an ELF, because it uses the ELF loader (ld)).
The bin file for embedded contains the information about where to copy something as an address, but that's just it, an address.

So what is encoded as p_vaddr in the ELF and presented as VMA is lost as soon as the ELF is converted to binary.
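
To show where these values live in an ELF file, here is a small Python sketch that decodes one ELF32 program header with the standard `struct` module. The field order (`p_type`, `p_offset`, `p_vaddr`, `p_paddr`, `p_filesz`, `p_memsz`, `p_flags`, `p_align`) follows the ELF32 specification; treating `p_paddr` as the LMA matches how GNU tools report it. The helper name `parse_phdr32` and the little-endian assumption are choices made for this example only.

```python
import struct

# ELF32 program header: eight 32-bit words (little-endian assumed here).
PHDR32 = struct.Struct("<8I")


def parse_phdr32(raw):
    """Decode one ELF32 program header and return the load-related fields."""
    (p_type, p_offset, p_vaddr, p_paddr,
     p_filesz, p_memsz, p_flags, p_align) = PHDR32.unpack(raw)
    return {
        "vma": p_vaddr,      # address the code was linked to run at
        "lma": p_paddr,      # address the segment is stored / flashed at
        "filesz": p_filesz,  # bytes actually present in the file
    }
```

None of these fields survive a conversion to a flat binary, which is why the header file has to carry them separately.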

Collaborator

I see, you want to adjust the LMA, but are using a flag that adjusts both.

Could you try

--change-section-lma sectionpattern{=,+,-}val

with sectionpattern set to .*

Collaborator Author

--change-section-lma sectionpattern{=,+,-}val

I actually tried yesterday and failed to get the desired behavior, but retried just now.

Happy to tell it works when the pattern is correctly escaped:

--change-section-lma \*+<val>

Will update the PR, because I prefer to only adjust what strictly needs adjustment.
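
For reference, a build script could assemble such an invocation as an argument list, in which case no shell escaping of `*` is needed. The sketch below only builds the list and does not run anything; `lma_adjust_args`, the tool name `arm-zephyr-eabi-objcopy`, and the file names are assumptions for illustration, while `--change-section-lma` is the GNU objcopy option discussed above.

```python
def lma_adjust_args(objcopy, elf_in, elf_out, offset):
    """Build an objcopy argv that shifts the LMA of every section by offset."""
    sign = "+" if offset >= 0 else "-"
    # "*" matches all sections; passed as a list element it is never globbed.
    pattern = f"*{sign}{abs(offset):#x}"
    return [objcopy, "--change-section-lma", pattern, elf_in, elf_out]


args = lma_adjust_args("arm-zephyr-eabi-objcopy", "zephyr.elf",
                       "zephyr_adjusted.elf", 0x0FE00000)
```

The resulting list could be handed to `subprocess.run(args, check=True)` on a machine where the toolchain is installed.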

Collaborator Author

Fixed, and VMA removed from the description which should make it clearer.

@tejlmand
Collaborator Author

@danieldegrasse added an extra commit here: 0ddd88d

This looks like a great fix. I haven't had time to verify this, but will test later today and ensure it works for the RT1170 use case

btw. for a simple test you can adjust directly without using flash partitions from device tree.
For example to adjust LMA from 0x20200000 to 0x30000000 you can simply add:

config BUILD_OUTPUT_ADJUST_LMA
       default "0x0FE00000"

in your board's Kconfig.defconfig file.

(you might want a little higher offset to not collide with M7 code area)
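
The offset in that example is simply the difference between the address the image should be flashed at and the LMA it was linked with. A quick Python check of the arithmetic, with `lma_offset` being a hypothetical helper name:

```python
def lma_offset(linked_lma, flash_lma):
    """Return the value to add to the linked LMA to reach the flash address."""
    return flash_lma - linked_lma


offset = lma_offset(0x20200000, 0x30000000)
print(f"0x{offset:08X}")  # prints 0x0FE00000
```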

@danieldegrasse
Collaborator

@tejlmand Sorry for the delay, but I got a chance to test this and can confirm it works for me. This still needs the primary core image to be dependent on the build of the second core image completing, but provided that is available I think this satisfies the use case for the RT1170.

One question though: the sum of all the segment sizes defined in the image header file does not equal the size of the zephyr binary on disk. Do you know what would be in that extra space in the binary? Both cores execute fine with only the load sizes given by the image header file, so I'm not too concerned, just interested to know what might cause the discrepancy.

@tejlmand
Collaborator Author

One question though: the sum of all the segment sizes defined in the image header file does not equal the size of the zephyr binary on disk.

That is correct.
This is because the binary image doesn't contain direct address information like an ELF or hex file does.
All address information is encoded inside the image itself as addresses for the application to use.

This means that if an ELF image has two segments of size 0xF00, starting at addresses 0x20200000 and 0x20201000 respectively, then the data of the first segment will start at the beginning of the binary and fill it up to offset 0xF00.
The binary itself has no header, so when examining the bin file, the data starting at offset 0x0 is the data that should be loaded at address 0x20200000.

The second segment will start at offset 0x1000 and fill up to 0x1F00.
Because the bin file cannot encode this knowledge anywhere it must pad the binary.
This means the area from 0x0F00 to 0x1000 will be padded,
usually with 0x00 or 0xFF data.
The padded data will not be used by the application.

So using the information from the generated header file actually means the M7 avoids copying padded data.
And this is also why the two sizes are not equal.
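
The numbers from this explanation can be checked with a few lines of Python. This is a sketch only: `flat_binary_size` is a hypothetical helper that assumes a flat binary spans from the lowest segment LMA to the end of the highest segment.

```python
def flat_binary_size(segments):
    """Size of a flat binary covering all (lma, size) segments, gaps included."""
    base = min(lma for lma, _ in segments)
    end = max(lma + size for lma, size in segments)
    return end - base


segments = [(0x20200000, 0xF00), (0x20201000, 0xF00)]
bin_size = flat_binary_size(segments)            # 0x1F00 bytes on disk
payload = sum(size for _, size in segments)      # 0x1E00 bytes of real data
padding = bin_size - payload                     # 0x100 bytes of fill
```

The 0x100 bytes of padding correspond to the gap from 0x0F00 to 0x1000 described above.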

This commit adds the `gen_image_info.py` script which supports creation
of a header file with image information from the ELF file.

This version populates the header file with:
- Number of segments in the image
- LMA address of each segment
- VMA address of each segment
- Size of each segment

The header file can be used by a secondary build system which needs this
information.

Signed-off-by: Torsten Rasmussen <[email protected]>
This commit adds support for adjusting the addresses of the final image.
This is useful when the image is to be flashed at a location different
from the LMA address encoded in the ELF file by the linker.

An example use-case is multicore systems where core A might load image
from a flash partition into RAM in order for core B to execute and load,
but where the image itself is built with the RAM addresses as LMA.

It updates the zephyr_image_info.h header with information of adjustment
value.

Signed-off-by: Torsten Rasmussen <[email protected]>
@tejlmand tejlmand removed the DNM This PR should not be merged (Do Not Merge) label Jan 18, 2022
@cfriedt cfriedt merged commit d51a67b into zephyrproject-rtos:main Jan 22, 2022