Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't access crc from crc start on Fedora 33 - err: ssh: handshake failed #2214

Closed
aireilly opened this issue Apr 12, 2021 · 13 comments
Closed

Comments

@aireilly
Copy link

aireilly commented Apr 12, 2021

I can't get access to the crc VM on crc start.

time="2021-04-12T17:21:04+01:00" level=debug msg="error: Temporary error: ssh command error:\ncommand : exit 0\nerr     : ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain\\n - sleeping 1s"
time="2021-04-12T17:21:05+01:00" level=debug msg="retry loop: attempt 139"
time="2021-04-12T17:21:05+01:00" level=debug msg="Running SSH command: exit 0"
time="2021-04-12T17:21:05+01:00" level=debug msg="Using ssh private keys: [/home/aireilly/.crc/cache/crc_libvirt_4.7.2/id_ecdsa_crc]"
time="2021-04-12T17:21:05+01:00" level=debug msg="SSH command results: err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain, output: "

I've tried:

$ crc stop -f
$ crc delete -f
$ crc cleanup
$ sudo rm -rf ~/.crc
< REBOOT >
$ crc setup
$ crc start --log-level debug

Software version:
CodeReady Containers version: 1.24.0+5f06e84b
OpenShift version: 4.7.2 (embedded in executable
Fedora 33 (Workstation Edition)
crc.log

I'm not on VPN, and running crc on a Lenovo thinkpad t590.

What can I try?

@cfergeau
Copy link
Contributor

The VM has apparently failed to start, not sure why.
Can you check virsh -c qemu:///system list after crc start failed? And also look in journalctl -u libvirtd, and in /var/lib/libvirt/qemu/crc.log

@aireilly
Copy link
Author

aireilly commented Apr 12, 2021

Thanks for quick response!

aireilly@aireilly-t590: ~ $ virsh -c qemu:///system list
 Id   Name   State
----------------------
 1    crc    running

journalctl -u libvirtd

-- Reboot --
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: reading /etc/resolv.conf
Apr 12 17:22:06 aireilly-t590 dnsmasq[5010]: using nameserver 127.0.0.53#53
Apr 12 17:22:06 aireilly-t590 dnsmasq[2499]: using nameserver 127.0.0.53#53
Apr 12 17:22:30 aireilly-t590 dnsmasq-dhcp[5010]: DHCPDISCOVER(crc) 52:fd:fc:07:21:82
Apr 12 17:22:30 aireilly-t590 dnsmasq-dhcp[5010]: DHCPOFFER(crc) 192.168.130.11 52:fd:fc:07:21:82
Apr 12 17:22:30 aireilly-t590 dnsmasq-dhcp[5010]: DHCPREQUEST(crc) 192.168.130.11 52:fd:fc:07:21:82
Apr 12 17:22:30 aireilly-t590 dnsmasq-dhcp[5010]: DHCPACK(crc) 192.168.130.11 52:fd:fc:07:21:82 crc-rsppg-master-0
Apr 12 17:32:20 aireilly-t590 libvirtd[5152]: End of file while reading data: Input/output error
Apr 12 17:53:50 aireilly-t590 dnsmasq-dhcp[5010]: DHCPREQUEST(crc) 192.168.130.11 52:fd:fc:07:21:82
Apr 12 17:53:50 aireilly-t590 dnsmasq-dhcp[5010]: DHCPACK(crc) 192.168.130.11 52:fd:fc:07:21:82 crc-rsppg-master-0
Apr 12 18:21:08 aireilly-t590 dnsmasq-dhcp[5010]: DHCPREQUEST(crc) 192.168.130.11 52:fd:fc:07:21:82
Apr 12 18:21:08 aireilly-t590 dnsmasq-dhcp[5010]: DHCPACK(crc) 192.168.130.11 52:fd:fc:07:21:82 crc-rsppg-master-0
Apr 12 19:06:56 aireilly-t590 libvirtd[5152]: End of file while reading data: Input/output error
Apr 12 19:06:56 aireilly-t590 libvirtd[5152]: End of file while reading data: Input/output error
Apr 12 19:07:14 aireilly-t590 libvirtd[5152]: End of file while reading data: Input/output error


qemu/crc.log doesn't exist?

cat: /var/lib/libvirt/qemu/crc.log: No such file or directory

@aireilly
Copy link
Author

aireilly commented Apr 12, 2021

Some more detail:

aireilly@aireilly-t590: ~ $ crc status
CRC VM:          Running
OpenShift:       Unreachable (v4.7.2)
Disk Usage:      0B of 0B (Inside the CRC VM)
Cache Usage:     13.17GB
Cache Directory: /home/aireilly/.crc/cache
aireilly@aireilly-t590: ~ $ crc config view
- consent-telemetry                     : yes
aireilly@aireilly-t590: ~ $ crc config view
- consent-telemetry                     : yes
aireilly@aireilly-t590: ~ $ crc ip

aireilly@aireilly-t590: ~ $

@aireilly
Copy link
Author

aireilly@aireilly-t590: ~ $ free
              total        used        free      shared  buff/cache   available
Mem:       32620212     4680744     6564908      917296    21374560    26549456
Swap:       4194300           0     4194300

@praveenkumar
Copy link
Member

@aireilly following is what concerning in your case, even VM is running

time="2021-04-12T17:21:05+01:00" level=debug msg="Using ssh private keys: [/home/aireilly/.crc/cache/crc_libvirt_4.7.2/id_ecdsa_crc]"
time="2021-04-12T17:21:05+01:00" level=debug msg="SSH command results: err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain, output: "

Which should be something like.

DEBU Using ssh private keys: [/home/prkumar/.crc/cache/crc_libvirt_4.7.5/id_ecdsa_crc] 
DEBU SSH command results: err: dial tcp 192.168.130.11:22: connect: no route to host, output:  
DEBU error: Temporary error: ssh command error:
command : exit 0
err     : dial tcp 192.168.130.11:22: connect: no route to host\n - sleeping 1s 
DEBU retry loop: attempt 1                        
DEBU Running SSH command: exit 0                  

Can you manually able to ssh to that running machine using ssh -i /home/aireilly/.crc/cache/crc_libvirt_4.7.2/id_ecdsa_crc [email protected] ? Also check from virt-manager if some issue with VM.

@gbraad
Copy link
Contributor

gbraad commented Apr 13, 2021

When looking at:

Apr 12 17:22:30 aireilly-t590 dnsmasq-dhcp[5010]: DHCPACK(crc) 192.168.130.11 52:fd:fc:07:21:82 crc-rsppg-master-0
Apr 12 17:32:20 aireilly-t590 libvirtd[5152]: End of file while reading data: Input/output error

you see the VM does a DHCP request, gets an offer and acks this... but is followed by a failure. Likely it is not possible to persist the DHCP lease ?

$ cat /var/lib/libvirt/dnsmasq/default.leases

Perhaps the following can show more about the network state (untested)

$ domifaddr crc
$ net-dhcp-leases crc

@aireilly
Copy link
Author

I can ssh with ssh -i /home/aireilly/.crc/cache/crc_libvirt_4.7.2/id_ecdsa_crc [email protected]

but default.leases doesn't exist.

aireilly@aireilly-t590: ~ $ cat /var/lib/libvirt/dnsmasq/default.leases
cat: /var/lib/libvirt/dnsmasq/default.leases: No such file or directory

and

virsh # net-dhcp-leases crc
error: failed to get network 'crc'
error: Network not found: no network with matching name 'crc'

virsh # domifaddr crc
error: failed to get domain 'crc'
```
`

@cfergeau
Copy link
Contributor

cfergeau commented Apr 13, 2021

The logs shows that crc failed to connect to the VM using something similar to ssh -i /home/aireilly/.crc/cache/crc_libvirt_4.7.2/id_ecdsa_crc [email protected], but then the same command works. I'm wondering if it could be that the VM was very slow to start?
Or maybe it is caused by time="2021-04-12T17:18:38+01:00" level=debug msg="(crc) DBG | time=\"2021-04-12T17:18:38+01:00\" level=warning msg=\"Unable to determine VM's IP address, did it fail to boot?\""
Not sure why we would continue if we did not get an IP address

cfergeau added a commit to cfergeau/machine-driver-libvirt that referenced this issue Apr 13, 2021
At the moment, nil is returned when Start() fails to gets the VM IP
crc's code assumes that when Start() returns, the VM has an IP.
This should help catch this situation.

This is related to crc-org/crc#2214
cfergeau added a commit to cfergeau/machine-driver-libvirt that referenced this issue Apr 13, 2021
At the moment, nil is returned when Start() fails to gets the VM IP
crc's code assumes that when Start() returns, the VM has an IP.
This should help catch this situation.

This is related to crc-org/crc#2214
cfergeau added a commit to cfergeau/crc that referenced this issue Apr 13, 2021
This can be useful to detect some `crc start` issues early, and is not
always present in the debug log at the moment.

This is related to crc-org#2214
praveenkumar pushed a commit to crc-org/machine-driver-libvirt that referenced this issue Apr 14, 2021
At the moment, nil is returned when Start() fails to gets the VM IP
crc's code assumes that when Start() returns, the VM has an IP.
This should help catch this situation.

This is related to crc-org/crc#2214
praveenkumar pushed a commit that referenced this issue Apr 14, 2021
This can be useful to detect some `crc start` issues early, and is not
always present in the debug log at the moment.

This is related to #2214
@aireilly
Copy link
Author

Hi, so I updated my Fedora OS with some random system updates last night, and now my crc cluster comes up and I can interact with it completely normally. I wish there was more that I could say, is there anything else that I could provide now with running config that could help diagnose future issues like this?

@aireilly
Copy link
Author

thanks for your help, I'll close this now.

@gbraad
Copy link
Contributor

gbraad commented Apr 15, 2021

you could check dnf history and see what could have been the culprit before the update.

@aireilly
Copy link
Author

aireilly commented Apr 15, 2021

    94 | install docker --allowerasing                                                                 | 2021-04-13 13:12 | E, I           |    5 EE
    93 | remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-lo | 2021-04-13 13:11 | Removed        |    5 EE
    92 | install docker-ce docker-ce-cli containerd.io                                                 | 2021-04-13 13:00 | Install        |    7 EE
    91 | remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-lo | 2021-04-13 12:59 | Removed

@gbraad
Copy link
Contributor

gbraad commented Apr 15, 2021

Docker .... interesting. We had issues with this before, but these were around the DNS and resolving, but this mostly affected Docker instead of us. IIRC this might update firewall rules

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants