Provider's offer not visible in the network #43

Open
stan7123 opened this issue Sep 2, 2024 · 0 comments

Collaborator

stan7123 commented Sep 2, 2024

A long-running provider stopped being visible in the network.
It might be related to GPU power management: the passthrough device fails to leave D3cold, and the runtime self-test run by golemsp then fails.
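
To check the power management suspicion on the host, something like the sketch below could read the runtime PM attributes that the kernel exposes in sysfs for a PCI device. The helper name `read_runtime_pm` and the `sysfs_root` parameter are made up for illustration; the device address `0000:01:00.0` is taken from the dmesg output in this report. It only reads `power/control` and `power/runtime_status` and changes nothing.

```python
from pathlib import Path


def read_runtime_pm(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the runtime PM attributes of a PCI device.

    `bdf` is the PCI address, e.g. "0000:01:00.0" (assumed from the dmesg log).
    Attributes that cannot be read are reported as None.
    """
    power_dir = Path(sysfs_root) / bdf / "power"
    result = {}
    for name in ("control", "runtime_status"):
        try:
            result[name] = (power_dir / name).read_text().strip()
        except OSError:
            # Device missing, attribute absent, or no permission to read it.
            result[name] = None
    return result


# Hypothetical usage on the provider host:
#   print(read_runtime_pm("0000:01:00.0"))
```

If `control` reads `auto`, the kernel is allowed to runtime-suspend the device; `on` keeps it powered. Whether pinning it to `on` avoids the problem here is untested.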

dmesg logs:

[1209655.251126] vfio-pci 0000:01:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[1209655.252414] vfio-pci 0000:01:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[1209655.252422] vfio-pci 0000:01:00.1: can't change power state from D3hot to D0 (config space inaccessible)
[1209655.252428] vfio-pci 0000:01:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[1209655.252667] vfio-pci 0000:01:00.1: can't change power state from D3hot to D0 (config space inaccessible)
[1209657.540216] vfio-pci 0000:01:00.0: not ready 1023ms after bus reset; waiting
[1209658.596269] vfio-pci 0000:01:00.0: not ready 2047ms after bus reset; waiting
[1209660.868316] vfio-pci 0000:01:00.0: not ready 4095ms after bus reset; waiting
[1209665.220456] vfio-pci 0000:01:00.0: not ready 8191ms after bus reset; waiting
[1209673.668715] vfio-pci 0000:01:00.0: not ready 16383ms after bus reset; waiting
[1209692.101287] vfio-pci 0000:01:00.0: not ready 32767ms after bus reset; waiting
[1209726.918320] vfio-pci 0000:01:00.0: not ready 65535ms after bus reset; giving up
[1209726.919548] vfio-pci 0000:01:00.0: invalid power transition (from D3cold to D3hot)
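
For triage across several providers, a minimal sketch along these lines could summarize the two message types seen above: failed power state changes and bus resets that end in "giving up". The function name `summarize` and the regular expressions are illustrative assumptions written against the exact wording of these log lines; other kernel versions may phrase the messages differently.

```python
import re

# Matches e.g. "vfio-pci 0000:01:00.0: can't change power state from D3cold to D0 (...)"
POWER_RE = re.compile(
    r"vfio-pci (?P<bdf>[0-9a-f:.]+): can't change power state "
    r"from (?P<src>\S+) to (?P<dst>\S+)"
)
# Matches e.g. "vfio-pci 0000:01:00.0: not ready 65535ms after bus reset; giving up"
RESET_RE = re.compile(
    r"vfio-pci (?P<bdf>[0-9a-f:.]+): not ready (?P<ms>\d+)ms "
    r"after bus reset; (?P<outcome>waiting|giving up)"
)


def summarize(dmesg_text):
    """Return (failed_transitions, gave_up) extracted from dmesg text.

    failed_transitions: device address -> set of (from_state, to_state)
    gave_up: device address -> milliseconds reported when the reset was abandoned
    """
    failed_transitions = {}
    gave_up = {}
    for line in dmesg_text.splitlines():
        match = POWER_RE.search(line)
        if match:
            failed_transitions.setdefault(match["bdf"], set()).add(
                (match["src"], match["dst"])
            )
            continue
        match = RESET_RE.search(line)
        if match and match["outcome"] == "giving up":
            gave_up[match["bdf"]] = int(match["ms"])
    return failed_transitions, gave_up
```

Fed the output of `dmesg` as a string, it returns which devices could not be woken and which ones never came back after a bus reset.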

golemsp.service logs:

wrz 02 12:34:03 golem-provider golemsp[324920]: [2024-09-02 12:31:41.557727 +00:00] INFO [runtime/src/self_test.rs:135] Task package: /usr/lib/yagna/plugins/ya-runtime-vm-nvidia/runtime/>
wrz 02 12:34:03 golem-provider golemsp[324920]: [2024-09-02 12:31:41.558205 +00:00] INFO [runtime/src/self_test.rs:66] Starting runtime
wrz 02 12:34:03 golem-provider golemsp[324920]: [2024-09-02 12:31:41.558268 +00:00] INFO [runtime/src/vmrt.rs:160] Executing command: Command { std: cd "/usr/lib/yagna/plugins/ya-runtime>
wrz 02 12:34:03 golem-provider golemsp[324920]: [2024-09-02 12:31:41.558531 +00:00] INFO [runtime/src/guest_agent_comm.rs:467] Waiting for Guest Agent socket ...
wrz 02 12:34:03 golem-provider golemsp[324920]: vmrt: -chardev socket,path=/tmp/e76af1ccfa284185ba753ec86b90cacf_vpn.sock,server,wait=off,id=vpn_cdev: warning: short-form boolean option >
wrz 02 12:34:03 golem-provider golemsp[324920]: Please use server=on instead
wrz 02 12:34:03 golem-provider golemsp[324920]: vmrt: -chardev socket,path=/tmp/e76af1ccfa284185ba753ec86b90cacf_inet.sock,server,wait=off,id=inet_cdev: warning: short-form boolean optio>
wrz 02 12:34:03 golem-provider golemsp[324920]: Please use server=on instead
wrz 02 12:34:03 golem-provider golemsp[324920]: vmrt: ../qemu/hw/pci/pci.c:1637: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
wrz 02 12:34:03 golem-provider golemsp[324920]: [2024-09-02 12:34:03.175077 +00:00] ERROR [/home/runner/.cargo/git/checkouts/ya-runtime-sdk-9ad26604fa07f4ec/0395b0c/ya-runtime-sdk/src/ru>
wrz 02 12:34:03 golem-provider golemsp[324920]: [2024-09-02 12:34:03.180541 +00:00] ERROR [exe-unit/src/bin.rs:357] Test failed

yagna: 0.15.0
Linux golem-provider 5.15.0-118-generic #128-Ubuntu SMP Fri Jul 5 09:28:59 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
