layout: Support running the firmware as a BIOS image #24
ecb4507
to
dc97d6b
Compare
@rbradford this code is ready for review (but we may want to merge #23 first). I had some questions (commented inline) that we should resolve before merging this.
src/asm/rom.s
Outdated
reset_vector: # 0xffff_fff0
# This code must be 16 bytes or less, so be careful when adding anything.
Is there an easy way to check in this file that the reset_vector code is exactly 16 bytes? I had to use objdump to make sure everything was set up correctly.
I suggest placing only the jmp here, and moving the CR0 write after the jump. The interrupt flag is already 0, so CLI is not needed either.
I rewrote this to just have jmp rom16 at the reset vector, and then the rom16 code just follows the exact steps in the SDM to enter protected mode.
I kept the CLI, as the SDM says to call it before entering protected mode, but I might be misreading it (or reading the wrong section).
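To illustrate why a lone jump leaves plenty of room in the 16-byte reset area, here is a minimal Rust sketch of how a 16-bit near jump (`jmp rel16`, opcode 0xE9) is encoded. The function name and the target offset used in the usage note are hypothetical and are not taken from the PR; the sketch only assumes that rom16 lives in the same 64 KiB segment as the reset vector.

```rust
/// Offset of the reset vector within the top 64 KiB segment (0xffff_fff0).
const RESET_VECTOR_OFFSET: u16 = 0xfff0;
/// Bytes available between the reset vector and the end of the ROM.
const RESET_VECTOR_MAX_LEN: usize = 16;

/// Encode a 16-bit near `jmp rel16` (opcode 0xE9) located at offset `from`,
/// targeting offset `to` within the same 64 KiB segment.
/// (Hypothetical helper for illustration only.)
fn encode_near_jmp(from: u16, to: u16) -> [u8; 3] {
    // The displacement is relative to the instruction that follows the jump,
    // and the jump itself is 3 bytes long.
    let next_ip = from.wrapping_add(3);
    let disp = to.wrapping_sub(next_ip);
    let [lo, hi] = disp.to_le_bytes();
    [0xe9, lo, hi]
}
```

For example, `encode_near_jmp(RESET_VECTOR_OFFSET, 0xff00)` yields three bytes, far below `RESET_VECTOR_MAX_LEN`, so the protected-mode setup can live entirely in rom16.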
src/asm/ram32.s
Outdated
movw $0x3f8, %dx
movb $'R', %al
outb %al, %dx
Should we keep these characters being printed to serial output? I found it useful when debugging/writing the assembly, but it is a little weird to see RPL printed right before the firmware starts up.
Fine for now, but maybe we could look into splitting out output between 0x402 (QEMU firmware I/O debug port) and the serial port. This could be a candidate for that.
I agree, it would be nice to have all of the log! statements go to debug output and only have SimpleTextOutputProtocol log to the standard serial port.
I removed all the serial logs from the assembly. I just have them in a separate branch for local debugging.
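The split discussed in this thread (log output to the 0x402 debug port, console output to the serial port) could be sketched roughly as below. This is only an illustration under stated assumptions: `PortWriter` and its `out` callback are hypothetical names, not the firmware's actual types, and the callback stands in for a real `outb` so the routing can be exercised without port access.

```rust
use core::fmt;

/// QEMU firmware I/O debug port, as mentioned in the review.
const DEBUG_PORT: u16 = 0x402;
/// Data register of the standard COM1 serial port.
const SERIAL_PORT: u16 = 0x3f8;

/// Writes bytes to a single I/O port through a caller-supplied `out`
/// function. In firmware, `out` would wrap the `outb` instruction; here it
/// is a parameter (an assumption made for testability).
struct PortWriter<F: FnMut(u16, u8)> {
    port: u16,
    out: F,
}

impl<F: FnMut(u16, u8)> fmt::Write for PortWriter<F> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        // Emit the string one byte at a time to the configured port.
        for b in s.bytes() {
            (self.out)(self.port, b);
        }
        Ok(())
    }
}
```

A logging macro would then hold a `PortWriter` for `DEBUG_PORT`, while the text output protocol would hold one for `SERIAL_PORT`.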
Looking good. The virtio issue is really that the PCI BAR address is zero, yet that zero BAR is still the one the virtio device says should be used. I think understanding why that PCI BAR is zero is the next step forward.
I suggest only leaving a jmp at the reset vector.
I made some changes to simplify and better document the code.
@rbradford @bonzini do the fixes and comments look good? Is there anything else you want me to do here? I could include a high-level doc file explaining the multistage boot process (rom16 -> rom32 -> ram32 -> ram64 -> Rust), if you think it's necessary.
@josephlr I think what you've got here is great. I do think it's a good idea to boot through to userspace before merging this, as there is obviously something missing. We have a good data point now: PVH mode with SeaBIOS does show the PCI BAR address.
Seems reasonable, I'll focus on #26, and try to get that booting to user-space first, and then come back to this (once we know why it's broken).
@josephlr I was able to dump the registers for QEMU and for CH just before jumping to the kernel. They should be in the same state: https://gist.github.com/rbradford/7ccc15bc2c55d6423840896a42491ac0 I've not done a deep analysis yet, but one obvious thing stands out: the CS and CR0 look different.
I think the CS segments being different is OK (as most of the bits are ignored); the CR0 difference is also interesting. However, manually setting the CR0 and CS bits to match CH didn't seem to help (the kernel stopped in the same place).
In case you want to reproduce: modify main.rs just after "Jumping to kernel" to write to the 0x80 I/O port. CH has a function that prints out debug messages when that port is written to, so it wasn't hard to print the KVM registers. For QEMU I modified the firmware to loop infinitely after the same message, ran with "-serial mon:stdio", and used the monitor (ctrl-a c) to get the registers ("info registers").
Clearing CR0.CD and CR0.NW is a good idea anyway for the firmware, but it's only a matter of tidiness (disabling the cache or writeback doesn't work in VMs).
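As a small illustration of the CR0 cleanup mentioned here, the bit manipulation can be sketched as a pure function in Rust. The function name is hypothetical; the bit positions (NW is bit 29, CD is bit 30) and the power-on CR0 value of 0x6000_0010 come from the Intel SDM. Actually reading and writing CR0 needs privileged instructions and is left out.

```rust
/// CR0.NW (Not Write-through), bit 29.
const CR0_NW: u32 = 1 << 29;
/// CR0.CD (Cache Disable), bit 30.
const CR0_CD: u32 = 1 << 30;

/// Return `cr0` with CD and NW cleared (caching enabled), leaving every
/// other bit untouched. (Hypothetical helper for illustration only.)
fn enable_caching(cr0: u32) -> u32 {
    cr0 & !(CR0_CD | CR0_NW)
}
```

Applied to the power-on value, `enable_caching(0x6000_0010)` leaves only bit 4 (ET) set.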
So setting So the problem is a kernel panic due to it running out of memory.
Oh duh, we don't get an e820 map passed in with PVH, so we have to make it ourselves. Right now we're passing in a map with zero memory, and this makes Linux quite unhappy.
@josephlr Use the CMOS to get the limits and make one? Although CH currently has a CMOS implementation, it might not be sticking around as nothing is currently using it, so it's best to use an E820 map where available and fall back to CMOS where it's not.
It looks like PVH gives us an e820 map, so we should be able to just use that.
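The "map with zero memory" failure above is easy to catch with a sanity check before jumping to the kernel. The sketch below assumes the 20-byte packed entry layout used by the Linux boot protocol's E820 table; the struct and function names are hypothetical and not taken from the firmware.

```rust
/// E820 type for RAM usable by the OS.
const E820_RAM: u32 = 1;
/// E820 type for reserved ranges.
const E820_RESERVED: u32 = 2;

/// One E820 entry in the layout used by the Linux boot protocol
/// (20 bytes, packed). Names are illustrative.
#[repr(C, packed)]
#[derive(Clone, Copy)]
struct E820Entry {
    addr: u64,
    size: u64,
    entry_type: u32,
}

/// Total bytes of usable RAM described by `map`.
fn usable_ram(map: &[E820Entry]) -> u64 {
    let mut total = 0u64;
    for e in map {
        // Copy fields out by value; references into a packed struct
        // are not allowed.
        let (kind, size) = (e.entry_type, e.size);
        if kind == E820_RAM {
            total += size;
        }
    }
    total
}
```

A firmware could refuse to boot (or log loudly) when `usable_ram` returns 0, instead of letting the kernel panic later.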
So it looks like regardless of how we run the firmware (QEMU, PVH, CH, Firecracker, etc..) we eventually need to get two pieces of information from the host:
All the other info we get from the host is optional. We can get the info in the following way:
For Q35: as we have a very simple memory model, we might be able to get away with hardcoded ranges plus the memory size details from the CMOS, which might be easier than implementing QEMU FW_CFG.
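For the CMOS fallback, a rough Rust sketch of decoding the memory size is shown below. It assumes the register convention QEMU uses as far as I know it (0x34/0x35 count 64 KiB blocks above the first 16 MiB; 0x5b to 0x5d count 64 KiB blocks above 4 GiB); that convention and the function names are assumptions to verify against the target VMM. Reading the registers themselves (via ports 0x70/0x71) is omitted.

```rust
const KIB: u64 = 1024;
const MIB: u64 = 1024 * KIB;

/// Decode RAM below 4 GiB from CMOS registers 0x34 (low byte) and 0x35
/// (high byte), assumed to count 64 KiB blocks above the first 16 MiB.
fn ram_below_4g(reg34: u8, reg35: u8) -> u64 {
    let blocks = u16::from_le_bytes([reg34, reg35]) as u64;
    16 * MIB + blocks * 64 * KIB
}

/// Decode RAM above 4 GiB from CMOS registers 0x5b, 0x5c, 0x5d (low to
/// high byte), assumed to count 64 KiB blocks.
fn ram_above_4g(reg5b: u8, reg5c: u8, reg5d: u8) -> u64 {
    let blocks = u32::from_le_bytes([reg5b, reg5c, reg5d, 0]) as u64;
    blocks * 64 * KIB
}
```

Under these assumptions, a 128 MiB guest would report 0x00 and 0x07 in registers 0x34 and 0x35, and zeros in 0x5b to 0x5d.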
89ff693
to
fd40b6e
Compare
cb38019
to
4d22a46
Compare
9217c0a
to
3c821b0
Compare
Our 32-bit GDT just has 1 code and 1 data segment descriptor. They are both flat (spanning the entire 4G address space) and are placed in normal static memory (for now). Signed-off-by: Joe Richey <[email protected]>
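For readers unfamiliar with the descriptor format, a flat 32-bit GDT like the one this commit describes can be sketched in Rust as below. The names and the exact access bytes are illustrative assumptions, not the PR's actual code; the bit layout follows the legacy segment descriptor format in the Intel SDM, and the leading null entry is the one the architecture requires.

```rust
/// Pack a legacy segment descriptor from its fields.
/// Only the low 20 bits of `limit` and the low 4 bits of `flags` are used.
const fn descriptor(base: u32, limit: u32, access: u8, flags: u8) -> u64 {
    let base = base as u64;
    let limit = limit as u64;
    (limit & 0xffff)
        | ((base & 0xff_ffff) << 16)
        | ((access as u64) << 40)
        | (((limit >> 16) & 0xf) << 48)
        | (((flags as u64) & 0xf) << 52)
        | ((base >> 24) << 56)
}

/// Flags: 4 KiB granularity (G) and 32-bit default operand size (D/B).
const FLAGS_32BIT: u8 = 0b1100;
/// Access byte: present, ring 0, code segment, readable.
const ACCESS_CODE: u8 = 0x9a;
/// Access byte: present, ring 0, data segment, writable.
const ACCESS_DATA: u8 = 0x92;

/// Null descriptor followed by one flat code and one flat data segment,
/// each spanning the full 4 GiB address space.
const GDT32: [u64; 3] = [
    0,
    descriptor(0, 0xf_ffff, ACCESS_CODE, FLAGS_32BIT),
    descriptor(0, 0xf_ffff, ACCESS_DATA, FLAGS_32BIT),
];
```

Because `descriptor` is a `const fn`, the table is computed at compile time and can be placed in static memory or in ROM.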
The new assembly files handle:
- reset.s: Jumping from reset
- rom16.s: Transitioning to 32-bit mode
- rom32.s: Copying data from ROM to RAM, jumping to PVH entry point
To place this code correctly, we add a new Program Header for the code and data that expect to be in ROM. See the comments in layout.ld for more information.
We also place the 32-bit GDT in the ROM. This is mostly for convenience, as it lets us use the GDT directly from the ROM code without having to do any complex offset calculations.
As laying out the code for a ROM makes the binary ~45% bigger, we gate building as a ROM behind an optional feature.
Signed-off-by: Joe Richey <[email protected]>
Without this change, QEMU/CloudHV will attempt to load the ROM into the memory region below 4GiB. But something (i.e. SeaBIOS) is already there. We just pick an arbitrary address. It doesn't actually matter where it gets loaded, as the ROM code isn't used when doing PVH boot. Signed-off-by: Joe Richey <[email protected]>
Depends on #37
Fixes #5
Now you can do the following:
And the firmware will go from the real-mode reset vector all the way to Rust.