DTS device dependency is shifting memory addresses between builds #31546

Closed
andrewboie opened this issue Jan 22, 2021 · 0 comments · Fixed by #31548
Labels: bug, priority: high

Describe the bug
I was notified Friday morning that my demand paging PR had been reverted as it had been identified as the culprit in a failure in tests/kernel/fatal/exception. The crash occurred at very early boot, right after installing page tables.

However, after digging deeper, this turns out not to be the case: my PR just exposed a different issue in another PR that went in at about the same time.

When I reproduced locally, I found that the memory addresses of page tables were shifting in between builds:

$ nm -n zephyr/zephyr.elf | grep ptables
0000000000120000 R z_x86_kernel_ptables
$ nm -n zephyr/zephyr_prebuilt.elf | grep ptables 
000000000011f000 D z_x86_kernel_ptables

This will totally break the system. We absolutely rely on the memory addresses of symbols not shifting between builds, as certain auto-generated CPU structures (such as page tables) will not function correctly if the information used to build them in zephyr_prebuilt.elf becomes stale.
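A quick way to check for this kind of drift is to compare the `nm -n` output of the two images symbol by symbol. The following is a minimal sketch, not part of the Zephyr build system: the function names `parse_nm` and `shifted_symbols` are made up for illustration, and the sample input is just the two lines of output shown above.

```python
def parse_nm(text):
    """Map symbol name -> address from `nm -n` lines such as
    '0000000000120000 R z_x86_kernel_ptables'."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and undefined symbols (no address)
        addr, _kind, name = parts
        symbols[name] = int(addr, 16)
    return symbols


def shifted_symbols(prebuilt_text, final_text):
    """Return [(name, prebuilt_addr, final_addr)] for symbols present in
    both images whose addresses differ, sorted by final address."""
    pre = parse_nm(prebuilt_text)
    fin = parse_nm(final_text)
    moved = [(n, pre[n], fin[n]) for n in pre.keys() & fin.keys()
             if pre[n] != fin[n]]
    return sorted(moved, key=lambda item: item[2])


# Sample input taken from the nm output quoted in this report.
prebuilt = "000000000011f000 D z_x86_kernel_ptables\n"
final = "0000000000120000 R z_x86_kernel_ptables\n"

for name, old, new in shifted_symbols(prebuilt, final):
    print(f"{name}: {old:#x} -> {new:#x} ({new - old:+#x})")
```

In practice the two strings would come from running `nm -n` on zephyr/zephyr_prebuilt.elf and zephyr/zephyr.elf. Any non-empty result means the data derived from the prebuilt image may be stale.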

In this particular case, the result was that the memory mapping for the kernel image was not properly sized. One of the changes in the demand paging PR is that only the kernel image is now memory-mapped, with other page frames left unmapped and available for anonymous memory mappings.

In addition to this problem, this can also break userspace, as the kernel object tables are created from symbol addresses in zephyr_prebuilt.elf. If the addresses of kernel object symbols change between zephyr_prebuilt.elf and zephyr.elf, no syscalls will work. This is why the kernel object tables themselves are located at the very end of RAM, so that other addresses do not shift.
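To make the failure mode concrete, here is a toy model of an address-keyed object table. It is only an illustration: Zephyr's real kernel object lookup is generated code rather than a Python dict, and the object name `my_sem` and both addresses are hypothetical.

```python
# Toy model only: stands in for a lookup table generated from symbol
# addresses in the prebuilt image.

def build_object_table(symbol_addrs):
    """Build an address -> name table from one image's symbol addresses."""
    return {addr: name for name, addr in symbol_addrs.items()}


def validate(table, addr):
    """Return the object name if addr is a known kernel object, else None."""
    return table.get(addr)


# Hypothetical kernel object whose address moves by 2 bytes between images.
prebuilt_addrs = {"my_sem": 0x108090}
final_addrs = {"my_sem": 0x108092}

# The table is generated from the prebuilt image...
table = build_object_table(prebuilt_addrs)

# ...but at runtime the kernel sees addresses from the final image.
print(validate(table, prebuilt_addrs["my_sem"]))  # address known to the table
print(validate(table, final_addrs["my_sem"]))     # shifted address: lookup misses
```

Once the final image's addresses differ from the ones the table was built from, every lookup misses, which is why a shift of even a couple of bytes is fatal for syscall validation.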

I then looked at the image to see where the addresses started shifting. The culprits are symbols prefixed with __devicehdl:

 0000000000108070 R __device_handles_start
-0000000000108070 V __devicehdl_DT_N_S_soc_S_uart_2f8
+0000000000108070 R __devicehdl_sys_init_z_clock_driver_init0
 0000000000108070 R __init_APPLICATION_start
 0000000000108070 R __init_end
 0000000000108070 R __init_POST_KERNEL_start
 0000000000108070 R __init_SMP_start
-0000000000108080 V __devicehdl_DT_N_S_soc_S_uart_3f8
-000000000010808a V __devicehdl_sys_init_z_clock_driver_init0
-0000000000108090 R __app_shmem_regions_end
-0000000000108090 R __app_shmem_regions_start
...snip...
+0000000000108078 R __devicehdl_DT_N_S_soc_S_uart_3f8
+0000000000108088 R __devicehdl_DT_N_S_soc_S_uart_2f8
+0000000000108092 R __app_shmem_regions_end
+0000000000108092 R __app_shmem_regions_start

The overall size of these handles is increasing:

 0000000000108070 R __device_handles_start
-0000000000108090 R __device_handles_end
+0000000000108092 R __device_handles_end

We are off by 2 bytes. As bad luck would have it, this particular test case resulted in the size of the kernel image being pushed out by an additional page due to alignment requirements.
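The arithmetic can be spelled out as follows. This is a sketch under stated assumptions: the 4 KiB page size and the `image_end_*` values are hypothetical, chosen only to show how a 2-byte growth can cross a page boundary. The `__device_handles_start` and `__device_handles_end` addresses are the ones from the listing above.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def align_up(addr, align=PAGE_SIZE):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


handles_start = 0x108070
handles_end_old = 0x108090  # the '-' side of the listing
handles_end_new = 0x108092  # the '+' side of the listing

size_old = handles_end_old - handles_start
size_new = handles_end_new - handles_start
growth = size_new - size_old
print(size_old, size_new, growth)

# Hypothetical image end that sits just below a page boundary.
image_end_old = 0x11EFFF
image_end_new = image_end_old + growth

print(hex(align_up(image_end_old)))
print(hex(align_up(image_end_new)))
```

The handles region grows from 32 to 34 bytes. With the hypothetical image end used here, that pushes the page-aligned end from 0x11f000 to 0x120000, a full extra page.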

To Reproduce
Check out the tree at the commit right before my demand paging PR was reverted and run tests/kernel/fatal/exception using sentinel.conf.

I'm not sure why other userspace tests aren't also failing; we might be getting lucky with alignment directives restoring the synchronization between symbol addresses after they get messed up.

Expected behavior
This new DTS infrastructure can't change in size between builds.

Impact
Showstopper.
