RFC: native_posix fails to compile on macOS #37123
Currently failing with
Need to look into a Mach-O tool to use instead of pyelftools on macOS. PyPI has macholib, although the source code seems to be incorrectly linked.
Force-pushed from 70c82da to 114a77d.
macholib was actually kind of old and unmaintained so I just ran
Now it just fails with errors about Mach-O sections / segments.
More info: https://stackoverflow.com/questions/17669593/how-to-get-a-pointer-to-a-binary-section-in-mac-os-x
After prefixing
Just to recap, Mach-O / Darwin sections look something like
https://opensource.apple.com/source/xnu/xnu-4903.221.2/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
So nasty... why not just use an index into a string table, like ELF?
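To make the fixed-width name fields concrete, here is a minimal Python sketch that decodes one `struct section_64` record as laid out in the `loader.h` linked above. The names `SECTION_64`, `Section64` and `parse_section_64` are made up for illustration and are not part of any existing package; the sketch assumes a little-endian 64-bit Mach-O file.

```python
import struct
from typing import NamedTuple

# struct section_64: sectname[16], segname[16], addr (u64), size (u64),
# then eight u32 fields (offset, align, reloff, nreloc, flags, reserved1-3).
SECTION_64 = struct.Struct("<16s16sQQ8I")


class Section64(NamedTuple):
    sectname: str
    segname: str
    addr: int
    size: int


def parse_section_64(raw: bytes) -> Section64:
    """Decode one little-endian section_64 record (illustrative helper)."""
    sectname, segname, addr, size, *_rest = SECTION_64.unpack(raw)
    # Names are NUL-padded; a 16-character name has no terminator at all.
    return Section64(
        sectname.rstrip(b"\0").decode("ascii"),
        segname.rstrip(b"\0").decode("ascii"),
        addr,
        size,
    )
```

Because the names are stored inline in 16-byte fields rather than as offsets into a string table, anything longer simply does not fit in the record.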
Force-pushed from 4dcf6ec to b764161.
Some static functions might be getting discarded, but it's almost linking now. Getting a few warnings like this:
And then this error:
Force-pushed from 3b53621 to d1d0036.
Just about linking...
Force-pushed from e1315f2 to f41cea9.
Added a few fake section names as symbols. Will need to figure out what to do there soon. Looks like the issue with "__weak" is that the weak versions of symbols are not getting discarded.
Getting a bit further now - just flaking out with
Force-pushed from 8a022fe to 9e738af.
Force-pushed from 5fb5e33 to b96f6f7.
Thanks for this.
As a draft it looks good, but a couple of comments regarding the build system that we must work on before taking it to a mergeable state.
Regarding the `1 < section_name <= 16` issue, I believe #36140 will give us ways to handle that limitation, but that should be investigated after #36140 has completed.
@@ -1461,6 +1464,8 @@ if(CONFIG_OUTPUT_DISASSEMBLE_ALL)
  )
endif()

# probably some equivalent command exists to read stats from a Mach-O binary
if (NOT (${CMAKE_HOST_SYSTEM_NAME} STREQUAL "Darwin" AND CONFIG_ARCH_POSIX))
We should avoid testing for Darwin and ARCH_POSIX everywhere.
We already have an infrastructure in place regarding compiler and linker flags, as well as bintools commands / flags / scripts.
See more here:
https://github.com/zephyrproject-rtos/zephyr/tree/main/cmake/toolchain
https://github.com/zephyrproject-rtos/zephyr/tree/main/cmake/compiler
https://github.com/zephyrproject-rtos/zephyr/tree/main/cmake/bintools
https://github.com/zephyrproject-rtos/zephyr/tree/main/cmake/linker
Of course that is not directly related to the host, mostly the toolchain / compiler / linker, but we do have `host-gnu`, which basically means Linux (I haven't tested if `host` works on Windows, but I don't think it does).
So it could hint that `host-gnu` should support a subidentifier with the system name: Linux, Darwin, Windows.
I believe we should try to see how host specific knowledge can fit into that design.
(Of course this code is fine for exploration work)
@tejlmand - Yea, this PR is very much a hack, so not very pretty.
On macOS, the native toolchain is not GNU-based, so it might not fall into that category precisely. There is a gcc, but it is just a copy of clang (not even a symlink) and then it uses Apple's `ld`.
md5sum /usr/bin/clang /usr/bin/gcc
ad07b41d86b6cff03436930eed7d3b8a /usr/bin/clang
ad07b41d86b6cff03436930eed7d3b8a /usr/bin/gcc
Homebrew does provide a gcc and binutils for native development. I'll see how far that gets us..
if (CONFIG_ARCH_POSIX AND ${CMAKE_HOST_SYSTEM_NAME} STREQUAL "Darwin")
  set(symbol_prefix "_")
else()
  set(symbol_prefix "")
endif()

foreach(symbol ${ARGN})
  zephyr_link_libraries(${LINKERFLAGPREFIX},-u,${symbol})
  zephyr_link_libraries(${LINKERFLAGPREFIX},-u,${symbol_prefix}${symbol})
This should be described through a linker property, for example:
set_linker_property(TARGET linker APPEND PROPERTY symbol_prefix _)
inside:
https://github.com/zephyrproject-rtos/zephyr/blob/main/cmake/linker/ld/host-gcc/linker_flags.cmake
(again, we need to extend current design to take host system into consideration)
Then this code would simply become:
if (CONFIG_ARCH_POSIX AND ${CMAKE_HOST_SYSTEM_NAME} STREQUAL "Darwin")
  set(symbol_prefix "_")
else()
  set(symbol_prefix "")
endif()
foreach(symbol ${ARGN})
  zephyr_link_libraries(${LINKERFLAGPREFIX},-u,${symbol})
  zephyr_link_libraries(${LINKERFLAGPREFIX},-u,${symbol_prefix}${symbol})
  zephyr_link_libraries(${LINKERFLAGPREFIX},-u,$<TARGET_PROPERTY:linker,symbol_prefix>${symbol})
Note: #24851 focused on the compiler; we still need to clean up linker functions like `toolchain_ld_force_undefined_symbols` in a similar way.
@tejlmand - inside `cmake/linker/ld/host-gcc/linker_flags.cmake`, if I add
set_linker_property(TARGET linker APPEND PROPERTY symbol_prefix _)
it results in a CMake error
Unknown CMake command "set_linker_property".
Do you have another PR in progress that allows us to call `set_linker_property()`?
@@ -109,18 +115,30 @@ function(toolchain_ld_link_elf)
    set(use_linker "-fuse-ld=bfd")
  endif()

  if (CONFIG_ARCH_POSIX AND ${CMAKE_HOST_SYSTEM_NAME} STREQUAL "Darwin")
same comment regarding linker flags.
Force-pushed from ea84138 to 8b8d578.
I installed homebrew's gcc and binutils with
a
Incrementally apply hacks from this PR to see if there are any gains using a GNU toolchain rather than the native Clang toolchain.
There did not seem to be any complaints about the weak attribute, which I was a bit suspicious of, but the section error presented itself slightly differently as
Without the "__DATA," prefix (or any segment identifier), the assembler would complain that there was an unexpected token. Actually, there was no warning at all about weak symbols with gcc and it appears that it just linked whatever came first, which is arguably worse (no warning and unexpected behaviour).
The homebrew binutils does not provide any
I think I'll opt to use the native tools for now, as they are what is primarily supported by Apple and used by Homebrew. Will make the changes requested by @tejlmand shortly and update this PR as I go along.
Force-pushed from 8b8d578 to d684da3.
Force-pushed from 2069e54 to cd6e158.
Force-pushed from 6a2eaa5 to 012fb1d.
This was a bit of a hack, but it really highlights some of the things that we take for granted with Linux and otherwise with ELF-based targets.
Specifically, on macOS:
* the Mach-O binary file format:
  * lacks weak symbol support
  * lacks support for section names > 16 characters
  * prefixes all symbols with `_`
* The assembler (both llvm & GNU) does not support the `.type` directive
* Linker options are different with Apple's ld
  * unsupported: `--whole-archive` / `--no-whole-archive`
  * `-Map` becomes `-map`
* linker script format is completely different
  * macOS uses `-Wl,-order_file` instead of `-Wl,-T` for linker script
Details about the Mach-O binary format and linker-generated sections can be found at the links below:
https://opensource.apple.com/source/xnu/xnu-4903.221.2/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
https://github.com/aidansteele/osx-abi-macho-file-format-reference
https://stackoverflow.com/questions/17669593/how-to-get-a-pointer-to-a-binary-section-in-mac-os-x
Fixes zephyrproject-rtos#10945
Signed-off-by: Christopher Friedt <[email protected]>
Force-pushed from 012fb1d to 8856971.
I wrote a machotools python package (took a bit of work, very loosely based on
In a contrived example, the output of
Now, each of these symbols has been placed in a custom section called
I am taking a 2-stage linking approach, where the first stage output is linked with
Furthermore, the linker knows to look for these symbols in a dynamically linked library and fill them in inside of the literal pool!
The next stage involves examining the first binary for
Next, the
Finally, the second-stage link does not use the
So, it's definitely less invasive in that a special case does not need to be made for every different type of symbol section and should make things Just Work ™️ on macOS for
The PoC repo for this stuff is here:
I would imagine that, aside from using dynamic linking, this isn't too far off from what our regular build process is for Zephyr. I would imagine though, that adding in device handles could be kind of tricky, so it will likely take a bit of time before that is finally worked out. In the ideal case, the API would be somewhat consistent when compared to
attn: @stephanosio, @tejlmand, @nashif, @mbolivar-nordic
#38836 may be relevant then
Changing my strategy w.r.t. the 16-byte segment / section issue. Going to patch clang instead so Zephyr should not require any source-level modifications to section names. Instead of using the entire section name, as-is, will calculate the SHA256 of the section name, then will convert to Base64 representation, and truncate at 16 characters. This approach has been used before by other runtimes (e.g. Rust) that need to fit larger section names into Mach-O binaries.
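The name mapping described here (SHA256, then Base64, then truncate at 16 characters) could look roughly like the Python sketch below. This only illustrates the transformation, not the clang patch itself; `shorten_section_name` and `MACHO_NAME_MAX` are hypothetical names, and the standard Base64 alphabet is an assumption.

```python
import base64
import hashlib

MACHO_NAME_MAX = 16  # size of the Mach-O sectname / segname fields


def shorten_section_name(name: str) -> str:
    """Map an arbitrarily long section name to a 16-character stand-in."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Base64 of a 32-byte digest is 44 characters with the "=" padding at
    # the very end, so the first 16 characters never contain padding.
    return base64.b64encode(digest).decode("ascii")[:MACHO_NAME_MAX]
```

Only 96 bits of the digest survive the truncation, so collisions are unlikely but not impossible, and the original name can no longer be read back from the binary without a side table.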
This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.
This was a bit of a hack, but it really highlights some of the things that we take for granted with Linux and otherwise with ELF-based targets.
Specifically, on macOS:
* the assembler does not support the `.type` directive
* the `--whole-archive` option is not supported by the macOS linker
* `va_list` size is inconsistent for cbprintf:
  * `sizeof(va_list)`: 8
  * `sizeof(struct __va_list)`: 32
The Mach-O binary format limitations definitely present the greatest challenges currently to targeting macOS for native_posix_64 in the long run.
Weak Symbol Support
The `__attribute__((weak_import))` is supposedly supported on macOS, but at the time this commit was made, it did absolutely nothing. Perhaps there is some other special applesauce that was missing 🤷.
Often, weak symbols are used in place of explicit infrastructure for overriding function implementations or symbol values. Weak symbols are actually not a standard feature of C and are mainly an ELF convenience. As a result, Zephyr will not be fully portable to different toolchains until weak symbols are removed entirely.
Section Name Limitations
In addition to a section name, macOS also relies on a segment name. The combination of the two looks like this:
__attribute__((section("segment_name,section_name")))
Both the `segment_name` and `section_name` are limited to 1 <= n <= 16 ASCII characters in length. ELF solves that problem by placing section names into the string table. Perhaps that could be a long-term solution if the proper patches could be supplied to LLVM and Apple, but the adoption of that is unlikely. It's quite likely that the same proposal has already been made a number of times.
Currently, we rely a great deal on `Z_ITERABLE_SECTION()` and `Z_STRUCT_SECTION_FOREACH()`. It does not seem like a realistic expectation to rely on the C preprocessor and linker to perform all of the heavy lifting necessary for processing sections on macOS.
In addition to section names that are defined dynamically via macro-pasting (set at compile time), we also have statically defined sections that are just defined in source code without any macro-pasting (set prior to compile time). There is some work currently being done on CMake generated linker scripts (#36140) where the section naming is brought out of C and placed in the build system, which would help for the statically defined sections.
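As a rough illustration of the naming constraint, a build-time check for `"segment_name,section_name"` strings might look like the Python sketch below. `split_macho_section` is a hypothetical helper, and it only handles the two-field form shown above.

```python
from typing import Tuple

MACHO_NAME_MAX = 16  # limit for both the segment and the section name


def split_macho_section(spec: str) -> Tuple[str, str]:
    """Split "segment,section" and enforce 1 <= n <= 16 ASCII characters."""
    parts = spec.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'segment,section', got {spec!r}")
    for kind, name in zip(("segment", "section"), parts):
        if not name.isascii():
            raise ValueError(f"{kind} name {name!r} is not ASCII")
        if not 1 <= len(name) <= MACHO_NAME_MAX:
            raise ValueError(
                f"{kind} name {name!r} must be 1..{MACHO_NAME_MAX} characters"
            )
    return parts[0], parts[1]
```

A check like this would flag macro-pasted names such as `__z_foo_init_PRIORITY_BOOT_1_foo` before the assembler does.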
For dynamic section names that are set at compile-time, currently there are 2 possible solutions.
Possible Solution No. 1
It may be possible to solve this problem by creating a new macro that behaves slightly differently for ELF vs Mach-O targets; on ELF, it could create the section name as usual, but with Mach-O, it could use the `__attribute__((constructor))` to copy the relevant details to an unordered map for later processing. A slight modification of this approach could be used to sort each item into an ordered map.
If we then transition `Z_STRUCT_SECTION_FOREACH()` to a runtime function call with a callback, then the ELF implementation could operate more or less as usual, but the Mach-O implementation could be adjusted to simply process the previously updated data structures.
There is a proof of concept working for that in this PR.
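The register-then-iterate flow can be modelled outside of C. The Python sketch below is only a model of the control flow, not the proof of concept in this PR: `register()` stands in for what a constructor hook would do before `main()`, `struct_section_foreach()` stands in for the runtime replacement of the iteration macro, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict

# Hypothetical registry: section name -> {item name: item}.
_sections: DefaultDict[str, Dict[str, Any]] = defaultdict(dict)


def register(section: str, name: str, item: Any) -> None:
    """Stand-in for the constructor hook: record an item under its section."""
    _sections[section][name] = item


def struct_section_foreach(section: str, callback: Callable[[Any], None]) -> None:
    """Visit every item of a section in name order, like an ordered map."""
    items = _sections.get(section, {})
    for name in sorted(items):
        callback(items[name])
```

Sorting by name at iteration time is one way to get a deterministic order regardless of the order in which the registration calls happen to run.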
Possible Solution No. 2
Another possible solution for dynamic sections is to add symbol metadata in the form of strings and then perform some post-processing of those symbols and sections, similarly to what we do now with the various ELF hack scripts.
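The post-processing step might be sketched in Python as follows, assuming the metadata has already been extracted from the binary as a list of `"symbol_name,intended_elf_section"` strings. `group_by_intended_section` is a hypothetical helper, and reading the strings out of the Mach-O file is out of scope here.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List


def group_by_intended_section(entries: Iterable[str]) -> Dict[str, List[str]]:
    """Group symbol names by the section each metadata string asks for."""
    groups: DefaultDict[str, List[str]] = defaultdict(list)
    for entry in entries:
        symbol, sep, section = entry.partition(",")
        if not sep or not symbol or not section:
            raise ValueError(f"malformed metadata entry: {entry!r}")
        groups[section].append(symbol)
    # Sort sections and symbols so the result is stable across builds.
    return {section: sorted(groups[section]) for section in sorted(groups)}
```

The grouped result is the kind of input a later script could use to lay out each intended section.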
For this approach, my thoughts are:
* `__attribute__((section("__DATA,z_macho_tmp")))`
* `"symbol_name,intended_elf_section"` tuple as a const string in `__attribute__((section("__DATA,z_macho_tuple")))`
  a. possibly some other construct that guarantees uniqueness
  a. so e.g. `__z_foo_init_PRIORITY_BOOT_1_foo` => `a57c313908dbe` or something
* `z_macho_tmp` section
* `z_macho_tuple` section
Useful Links
Details about the Mach-O binary format and linker-generated sections can
be found at the links below:
https://opensource.apple.com/source/xnu/xnu-4903.221.2/EXTERNAL_HEADERS/mach-o/loader.h.auto.html
https://github.com/aidansteele/osx-abi-macho-file-format-reference
https://stackoverflow.com/questions/17669593/how-to-get-a-pointer-to-a-binary-section-in-mac-os-x
Fixes #10945