-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x86_64 architecture #9522
x86_64 architecture #9522
Conversation
arch/x86_64/Kconfig
Outdated
@@ -0,0 +1,4 @@ | |||
gsource "arch/x86_64/soc/*/Kconfig.soc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use a plain source
here now. I recently improved how globbing is handled, and gsource
is only supported for backwards compatibility now: #9415 (the commit message has a motivation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kconfig nit fixed
Codecov Report
@@ Coverage Diff @@
## master #9522 +/- ##
==========================================
+ Coverage 48.27% 48.33% +0.06%
==========================================
Files 295 295
Lines 44283 44290 +7
Branches 10601 10606 +5
==========================================
+ Hits 21377 21408 +31
+ Misses 18637 18612 -25
- Partials 4269 4270 +1
Continue to review full report at Codecov.
|
d1db0d9
to
bc4436f
Compare
One silly question: Is this not a bit far away from Zephyr's target of resource constrained embedded systems? |
Yeah, it's very unlikely anyone will be running a Zephyr app on a PC for production use. The original impetus for this port was to have a virtualized SMP platform for testing under Qemu (right now we can only do SMP on esp32 hardware). It also helps to have a simplified architecture vs. the existing x86 layer which has a bunch of historical cruft. It's also a good testbed to start work on making the Zephyr source 64 bit clean (which isn't happening yet -- this port uses the x32 ABI so pointers are still 32 bit even though the registers are 64 bits wide). But longer term, who knows. A tiny Zephyr layer running on IA hardware opens the door to "Zephyr as bootloader" and maybe "Zephyr as hypervisor", etc... |
welcome back :) |
These files were using z_thread_malloc() without including kernel_internal.h. On existing architectures that works due to transitive includes, but x86_64 has a thinner include layer and doesn't do it for us. Include the files required for the APIs we use. Signed-off-by: Andy Ross <[email protected]>
This was apparently intended to allow for per-arch linker includes, but no such includes ever existed. All it does is senselessly throw an error on unrecognized architectures. Yank. Signed-off-by: Andy Ross <[email protected]>
The HPET default is to deliver events on the same INTIn as the legacy PIT IRQ, and in fact our code requires that because it uses the "legacy routing" option. So this isn't really a configurable and has to be set correctly. Do it right in the kconfig default instead of forcing boards to set it. (No, I have no idea where "20" came from either.) Signed-off-by: Andy Ross <[email protected]>
I was half way through typing up my own one of these when I realized there was one already in the tree. Move it to a shared header. (FWIW: I really doubt that most architectures actually benefit from their own versions of these tools -- GCC's optimizer is really good, and custom assembly defeats optimization and factorizations of the expressions in context.) Signed-off-by: Andy Ross <[email protected]>
These files were relying on _thread_essential_set() from kernel_internal.h, but not including it directly. New architectures won't transitively include things the same way. Signed-off-by: Andy Ross <[email protected]>
It's worth using custom timing information on a few systems to save cycles or gain precision. But make the use of k_cycle_get_32() a proper default instead of hardcoding all the platforms and failing to build on new ones. On Xtensa and RISC-V (and now x86_64) the cycle informatoin from that call is a very fast wrapper around the native counters anyway -- all you would save would be the function call overhead. Signed-off-by: Andy Ross <[email protected]>
Host toolchains don't tend to provide an x32 libgcc. But we don't actually need one for existing code. This is fragile, but better to work for all but obscure cases than break outright. Signed-off-by: Andy Ross <[email protected]>
If we don't have a detected cross compiler for x86_64, use the host compiler instead. Signed-off-by: Andy Ross <[email protected]>
This patch adds a x86_64 architecture and qemu_x86_64 board to Zephyr. Only the basic architecture support needed to run 64 bit code is added; no drivers are added, though a low-level console exists and is wired to printk(). The support is built on top of a "X86 underkernel" layer, which can be built in isolation as a unit test on a Linux host. Limitations: + Right now the SDK lacks an x86_64 toolchain. The build will fall back to a host toolchain if it finds no cross compiler defined, which is tested to work on gcc 8.2.1 right now. + No x87/SSE/AVX usage is allowed. This is a stronger limitation than other architectures where the instructions work from one thread even if the context switch code doesn't support it. We are passing -no-sse to prevent gcc from automatically generating SSE instructions for non-floating-point purposes, which has the side effect of changing the ABI. Future work to handle the FPU registers will need to be combined with an "application" ABI distinct from the kernel one (or just to require USERSPACE). + Paging is enabled (it has to be in long mode), but is a 1:1 mapping of all memory. No MMU/USERSPACE support yet. + We are building with -mno-red-zone for stack size reasons, but this is a valuable optimization. Enabling it requires automatic stack switching, which requires a TSS, which means it has to happen after MMU support. + The OS runs in 64 bit mode, but for compatibility reasons is compiled to the 32 bit "X32" ABI. So while the full 64 bit registers and instruction set are available, C pointers are 32 bits long and Zephyr is constrained to run in the bottom 4G of memory. Signed-off-by: Andy Ross <[email protected]>
The call to _arch_switch is a giant screaming sign inviting optimizer bugs. The code that appears before is what happened long ago when we were switched out, but the version that EXECUTED just now is actually in a different thread. So the assignment to _current before the switch actually assigned OUR thread (the "new_thread" of the old context!) to _current. But obviously the optimizer looks at that code and assumes that the _current which got assigned to the thread we were switching to long ago is still correct, and used it when retrieving the swap return value. Obviously the real bug here is that the _arch_switch() in question lacked a memory clobber (and it's getting one). But we can remove two lines, remove code from inside the interrupt lock and make the implementation more robust by moving the read to after the irq_unlock() (which generally also has a memory clobber). Signed-off-by: Andy Ross <[email protected]>
When tickless was disabled, this inverted test would never fire the first interrupt and the timer would be silent. Just remove it. There's no harm in unconditionally enabling a single timer interrupt at boot. Signed-off-by: Andy Ross <[email protected]>
No MPU support there yet. This test should really be predicated on a kconfig variable, not architecture. Signed-off-by: Andy Ross <[email protected]>
This is intended to be a value set by the platform to adjust the size of stacks created by tests. This test was setting it explicitly, and failing to honor it when creating its own stacks. Signed-off-by: Andy Ross <[email protected]>
This architecture doesn't support stack canaries. In fact the gcc -fstack-protect features don't seem to be working at all. I'm guessing it's an x32 ABI mismatch? Signed-off-by: Andy Ross <[email protected]>
Stacks created by tests should add this amount so thread-hungry architectures can tune it. Signed-off-by: Andy Ross <[email protected]>
There is actually nothing wrong with this test code idiom. But it's tickling a qemu emulator bug with the hpet driver and x86_64[1]. The rapidly spinning calls to k_uptime_get_32() need to disable interrupts, read timer hardware state and enable them. Something goes wrong in qemu with this process and the timer interrupt gets lost. The counter blows right past the comparator without delivering its interrupt, and thus the interrupt won't be delivered until the counter is next reset in idle after exit from the busy loop, which is obviously too late to interrupt the timeslicing thread. Just replace the loops with a single call to k_busy_wait(). The resulting code ends up being much simpler anyway. An added bonus is that we can remove the special case handling for native_posix (which was an entirely unrelated thing, but with a similar symptom). [1] But oddly not the same emulated hardware running with the same driver under the same qemu binary when used with a 32 bit kernel. Signed-off-by: Andy Ross <[email protected]>
This builds with a host compiler, not one from the SDK, and so no newlib library is available. There is work to enable newlib detection at and above the cmake level. This patch can be reverted when that lands. Signed-off-by: Andy Ross <[email protected]>
These two tests are hitting a stack overflow on x86_64 (not entirely surprisingly), but can't just increase stack size because there is an assert in the CMSIS compatibility layer that stacks be under 512 bytes. Just disable for now. Signed-off-by: Andy Ross <[email protected]>
Rebased and pushed. Just does a simple whitelisting for the newlib stuff. One extra change needed as some CMSIS tests started overflowing their stacks since the last version. Note again (as the earlier note is lost somewhere above) that this will continue to show two checkpatch errors (also three warnings). These are false positives: it's misparsing the ":" of a bitfield as a label (probably because it's on the same line as an inline enum declaration) and demanding it not have any whitespace before it. |
# When building in x32 mode with a host compiler, there is no libgcc | ||
# shipped (because it's an x86_64 compiler, not x32). That's | ||
# actually non-fatal, as no known features we hit in existing code | ||
# actually require the library. But I can't find an exaustive list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I just added -lgcc to fix 32bits x86 in PR #12674, I broke this [qemu_]x86_64 support in the ZEPHYR_TOOLCHAIN_VARIANT=host
case. Tentative fix in PR #12805, thanks in advance for reviewing it.
As you can see in my other comments below I'm quite confused about which toolchain configuration(s) this was tested with.
# but falling back to using the host toolchain is a very sane behavior | ||
# too. | ||
if(CONFIG_X86_64) | ||
if(CMAKE_C_COMPILER STREQUAL CMAKE_C_COMPILER-NOTFOUND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyross did you actually test this fallback? I can run it and make it work but only after some serious hacking because the same check in compiler/gcc/generic.cmake
runs first and aborts cmake first before this target.cmake file is even looked at.
In the cover letter of this PR you also wrote:
The architecture has to be built on a Linux host with ZEPHYR_TOOLCHAIN_VARIANT=host.
ZEPHYR_TOOLCHAIN_VARIANT=host means no CROSS_COMPILE and none of this compiler/gcc/target.cmake
code is used at all! (compiler/host-gcc/target.cmake
is used instead)
Were you testing both ZEPHYR_TOOLCHAIN_VARIANT=host and cross-compile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fallback has just been removed: 33ca224ffdda90
# too. | ||
if(CONFIG_X86_64) | ||
if(CMAKE_C_COMPILER STREQUAL CMAKE_C_COMPILER-NOTFOUND) | ||
find_program(CMAKE_C_COMPILER gcc ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW find_program() hits the cmake var CACHE first. So if CROSS_COMPILE is incomplete/partially broken then some of these find_program() will have no effect and the CMAKE_* end results will be partially from CROSS_COMPILE and partially from say /usr/bin.
PR zephyrproject-rtos#9522 series ending with commit c2c9265 ("tests: cmsis: Disable two cmsis portability tests on x86_64") added -mx32 support for the x86_64 ARCH and qemu_x86_64. While this was implemented in "compiler/gcc/target.cmake" as fall back from cross-compilation to the host compiler, it worked with a direct ZEPHYR_TOOLCHAIN_VARIANT=host too. Later, -lgcc was added to "compiler/host-gcc/target.cmake" by PR zephyrproject-rtos#12674 to fix the -m32 x86 build. This broke the x86_64 build when using ZEPHYR_TOOLCHAIN_VARIANT=host because even "multilib" packages usually don't feature the -mx32 version of libgcc. Fix this by excluding -lgcc in compiler/host-gcc/target.cmake just like compiler/gcc/target.cmake always did for x86_64. Signed-off-by: Marc Herbert <[email protected]>
PR #9522 series ending with commit c2c9265 ("tests: cmsis: Disable two cmsis portability tests on x86_64") added -mx32 support for the x86_64 ARCH and qemu_x86_64. While this was implemented in "compiler/gcc/target.cmake" as fall back from cross-compilation to the host compiler, it worked with a direct ZEPHYR_TOOLCHAIN_VARIANT=host too. Later, -lgcc was added to "compiler/host-gcc/target.cmake" by PR #12674 to fix the -m32 x86 build. This broke the x86_64 build when using ZEPHYR_TOOLCHAIN_VARIANT=host because even "multilib" packages usually don't feature the -mx32 version of libgcc. Fix this by excluding -lgcc in compiler/host-gcc/target.cmake just like compiler/gcc/target.cmake always did for x86_64. Signed-off-by: Marc Herbert <[email protected]>
x86_64 architecture support
This patch adds a x86_64 architecture and qemu_x86_64 board to Zephyr.
Only the basic architecture support needed to run 64 bit code is
added; no drivers are added, though a low-level console exists and is
wired to printk().
The support is built on top of a "X86 underkernel" layer, which can be
built in isolation as a unit test on a Linux host.
Limitations:
Right now the SDK lacks an x86_64 toolchain. The architecture has
to be built on a Linux host with ZEPHYR_TOOLCHAIN_VARIANT=host.
No x87/SSE/AVX usage is allowed. This is a stronger limitation than
other architectures where the instructions work from one thread even
if the context switch code doesn't support it. We are passing
-no-sse to prevent gcc from automatically generating SSE
instructions for non-floating-point purposes, which has the side
effect of changing the ABI. Future work to handle the FPU registers
will need to be combined with an "application" ABI distinct from the
kernel one (or just to require USERSPACE).
Paging is enabled (it has to be in long mode), but is a 1:1 mapping
of all memory. No MMU/USERSPACE support yet.
We are building with -mno-red-zone for stack size reasons, but this
is a valuable optimization. Enabling it requires automatic stack
switching, which requires a TSS, which means it has to happen after
MMU support.
The OS runs in 64 bit mode, but for compatibility reasons is
compiled to the 32 bit "X32" ABI. So while the full 64 bit
registers and instruction set are available, C pointers are 32 bits
long and Zephyr is constrained to run in the bottom 4G of memory.
Bugs to be fixed before merge:
The "make run" support for qemu broke recently. This relies on an
existing qemu_machine_hack.py script to change the ELF binary type
of the file, which worked for a while and fails now. For now it
requires a manual conversion with objcopy (which arguably should be
the mechanism of record even on IAMCU):
objcopy -O binary zephyr.elf ztmp.bin
echo '.incbin "ztmp.bin"' | as --32 -c - -o ztmp.o
gcc -m32 -T linker.cmd -Wl,--build-id=none -nostdlib -nodefaultlibs -nostartfiles -o zephyr.elf -fno-pic ztmp.o
All CPUs come up and enter Zephyr, but each is hanging on the first
context swtich. I have this reproduced in the xuk unit test, will
fix.
Each CPU requires a separate timer interrupt vector because of a
thinko with the way the xuk IRQ API was written (timer interrupts
are in some sense "per-CPU devices", but still should share the same
vector)
I need to write up docs for the xuk layer spelling out the way the
layering works.