TD boot failure with 24.04 host and guest #100
Comments
Hello, thanks. I ran a test and was able to create a VM with your qemu command:
Does the failure happen every time for you, or only occasionally?
The issue most likely happens after entering "reboot" in the TD console (the TD actually shuts down instead of rebooting) and then starting the TD again. Once the issue occurs, the next few boots fail with the same error as well.
I reproduced this issue occasionally with my normal qemu command; the host dmesg shows a call trace and a segmentation fault:
Thank you for your feedback! An internal ticket has been created: https://warthogs.atlassian.net/browse/PEK-648.
One more thing: the issue happens frequently if the TD runs a workload for a while and is then shut down and booted again. Configuration (vCPU/memory): 4/16 GB.
[35681.431248] audit: type=1400 audit(1716225051.602:121): apparmor="DENIED" operation="mknod" class="file" profile="ubuntu_pro_apt_news" name="/usr/lib/python3/dist-packages/uaclient/pycache/apt_news.cpython-312.pyc.124103736237104" pid=23850 comm="python3" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
@ruomengh is AppArmor running on your system? I'm not sure whether it is related; I will disable AppArmor and try again.
OK, that did not help; the issue can still be reproduced.
I will add some debugging code on the machine where the issue reproduces to investigate.
Edit: I'm still seeing this happen occasionally after this, so I'll need to look into it further. I think I might have an idea about this. If you check the SEAMCALL's return code, it's actually TDX_TLB_TRACKING_NOT_DONE. I'm running into this issue in
The TD fails to boot with the error below. The failure is intermittent, but once it occurs, the TD cannot boot for the next few attempts.
Error:
qemu-system-x86_64: Failed to get registers: Input/output error
qemu-system-x86_64: Failed to get registers: Input/output error
qemu-system-x86_64: Failed to get registers: Input/output error
qemu-system-x86_64: Failed to get registers: Input/output error
qemu-system-x86_64: Failed to get registers: Input/output error
qemu-system-x86_64: Failed to get registers: Input/output error
qemu-system-x86_64:./qemu-test.sh: line 379: 2988094 Segmentation fault (core dumped) /usr/bin/qemu-system-x86_64 -accel kvm -name process=tdxvm,debug-threads=on -m 16G -vga none -monitor pty -nodefaults -drive file=/home/ruomeng/images/tdx-2404.qcow2,if=virtio,format=qcow2 -monitor telnet:127.0.0.1:9072,server,nowait -bios /usr/share/qemu/OVMF.fd -object tdx-guest,sept-ve-disable=on,id=tdx -cpu host,-kvm-steal-time,pmu=off,tsc-freq=1000000000 -machine q35,hpet=off,kernel_irqchip=split,memory-encryption=tdx -device virtio-net-pci,netdev=mynet0 -netdev user,id=mynet0,net=10.0.2.0/24,dhcpstart=10.0.2.15,hostfwd=tcp::10059-:22 -smp 4 -chardev stdio,id=mux,mux=on,logfile=/tmp/vm_log_2024-05-10T0232.log -device virtio-serial,romfile= -device virtconsole,chardev=mux -monitor chardev:mux -serial chardev:mux -nographic
dmesg of the host:
[74411.414026] kvm: vcpu 0: requested 24992 ns lapic timer period limited to 200000 ns
[74411.416450] kvm: vcpu 1: requested 24992 ns lapic timer period limited to 200000 ns
[74411.418247] kvm: vcpu 2: requested 24992 ns lapic timer period limited to 200000 ns
[74411.419948] kvm: vcpu 3: requested 24992 ns lapic timer period limited to 200000 ns
[74414.466490] SEAMCALL (0x0000000000000006) failed: 0xc0000b0d00000001 RCX 0x8000004683bc70f7 RDX 0x0000000000000400 R8 0x0000004683bc7000 R9 0x0000000000000000 R10 0x0000000000000000 R11 0x0000000000000000
[74414.466498] SEAMCALL (0x0000000000000006) failed: 0xc0000b0d00000001 RCX 0x8000004683b710f7 RDX 0x0000000000000400 R8 0x0000004683b71000 R9 0x0000000000000000 R10 0x0000000000000000 R11 0x0000000000000000
[74415.068003] CPU 2/KVM[2993257]: segfault at 72d21ec00fe8 ip 000072d227e690dc sp 000072d21ec00ff0 error 6 in libc.so.6[72d227e28000+188000] likely on CPU 0 (core 0, socket 0)
[74415.068022] Code: 48 89 45 c8 48 8b 05 3b 9d 19 00 f3 0f 6f 0a 64 8b 00 0f 11 8d b8 fb ff ff 89 85 08 fb ff ff 48 8b 42 10 48 89 85 c8 fb ff ff af f4 fb ff 48 89 de 4c 89 ef 48 89 c2 48 89 85 f8 fa ff ff 49
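For anyone triaging similar logs, here is a minimal sketch of how the `SEAMCALL (...) failed:` lines in the host dmesg above could be pulled apart so the return code is easier to compare across boots. The function name `parse_seamcall_failure` and the field names are hypothetical, and splitting the 64-bit status into an upper and a lower 32-bit half is an assumption about how the status word is laid out; the sketch does not map the value to a symbolic error name.

```python
import re

# Matches the kernel log format seen in this report:
#   SEAMCALL (<leaf>) failed: <status> RCX <v> RDX <v> R8 <v> ...
SEAMCALL_RE = re.compile(
    r"SEAMCALL \((0x[0-9a-fA-F]+)\) failed: (0x[0-9a-fA-F]+)"
    r"((?: (?:RCX|RDX|R\d+) 0x[0-9a-fA-F]+)*)"
)
REG_RE = re.compile(r"(RCX|RDX|R\d+) (0x[0-9a-fA-F]+)")


def parse_seamcall_failure(line):
    """Return the leaf, status, and registers of one SEAMCALL failure line.

    Returns None when the line is not a SEAMCALL failure message.
    """
    match = SEAMCALL_RE.search(line)
    if match is None:
        return None
    status = int(match.group(2), 16)
    return {
        "leaf": int(match.group(1), 16),
        "status": status,
        # Assumption: upper 32 bits carry the status code proper,
        # lower 32 bits carry additional detail.
        "status_hi": status >> 32,
        "status_lo": status & 0xFFFFFFFF,
        "regs": {
            name: int(value, 16)
            for name, value in REG_RE.findall(match.group(3))
        },
    }


if __name__ == "__main__":
    sample = (
        "[74414.466490] SEAMCALL (0x0000000000000006) failed: "
        "0xc0000b0d00000001 RCX 0x8000004683bc70f7 RDX 0x0000000000000400 "
        "R8 0x0000004683bc7000 R9 0x0000000000000000 "
        "R10 0x0000000000000000 R11 0x0000000000000000"
    )
    info = parse_seamcall_failure(sample)
    print(hex(info["leaf"]), hex(info["status_hi"]), hex(info["status_lo"]))
```

Feeding each dmesg line through this function and grouping by `leaf` and `status_hi` should make it quick to see whether every failed boot reports the same SEAMCALL and status.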