SSH connection failed #40

Open
dgivens opened this issue Mar 7, 2014 · 16 comments

Comments

@dgivens

dgivens commented Mar 7, 2014

I'm running into an issue where it appears test-kitchen isn't waiting long enough before connecting over SSH when I use /sbin/init as my run_command. If I wait a couple of seconds, I can then ssh into the instance. Is this something that can be addressed by kitchen-docker, or are there any workarounds I might employ?

I'm trying to solve the problem of runit not starting when the package has been installed. On Debian systems, which I'm testing on, it starts via the inittab and the runit cookbook handles this by issuing a telinit once the package is installed.

$ sudo kitchen converge
-----> Starting Kitchen (v1.2.1)
-----> Creating <default-debian-74>...
       Step 0 : FROM dgivens/wheezy
        ---> f8c92d987c7a
       Step 1 : ENV DEBIAN_FRONTEND noninteractive
        ---> Using cache
        ---> 47cd72727fd3
       Step 2 : RUN dpkg-divert --local --rename --add /sbin/initctl
        ---> Using cache
        ---> 68a0adca627d
       Step 3 : RUN ln -sf /bin/true /sbin/initctl
        ---> Using cache
        ---> 22aa8539b9a5
       Step 4 : RUN apt-get update
        ---> Using cache
        ---> e7a978183de9
       Step 5 : RUN apt-get install -y sudo openssh-server curl lsb-release
        ---> Using cache
        ---> c3c4b40c5f5f
       Step 6 : RUN mkdir -p /var/run/sshd
        ---> Using cache
        ---> 44186f2bdfc1
       Step 7 : RUN useradd -d /home/kitchen -m -s /bin/bash kitchen
        ---> Using cache
        ---> 7bd45f37b4a4
       Step 8 : RUN echo kitchen:kitchen | chpasswd
        ---> Using cache
        ---> b20a2a304c97
       Step 9 : RUN echo 'kitchen ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
        ---> Using cache
        ---> bec122d32801
       Successfully built bec122d32801
       e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652
       [{
           "ID": "e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652",
           "Created": "2014-03-07T13:59:25.340911399Z",
           "Path": "/sbin/init",
           "Args": [],
           "Config": {
        "Hostname": "e79dce6f1a2b",
        "Domainname": "",
        "User": "",
        "Memory": 0,
        "MemorySwap": 0,
        "CpuShares": 0,
        "AttachStdin": false,
        "AttachStdout": false,
        "AttachStderr": false,
        "PortSpecs": null,
        "ExposedPorts": {
            "22/tcp": {}
        },
        "Tty": false,
        "OpenStdin": false,
        "StdinOnce": false,
        "Env": [
            "HOME=/",
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "DEBIAN_FRONTEND=noninteractive"
        ],
        "Cmd": [
            "/sbin/init"
        ],
        "Dns": null,
        "Image": "bec122d32801",
        "Volumes": null,
        "VolumesFrom": "",
        "WorkingDir": "",
        "Entrypoint": null,
        "NetworkDisabled": false,
        "OnBuild": null
           },
           "State": {
        "Running": true,
        "Pid": 12513,
        "ExitCode": 0,
        "StartedAt": "2014-03-07T13:59:25.491002705Z",
        "FinishedAt": "0001-01-01T00:00:00Z",
        "Ghost": false
           },
           "Image": "bec122d328011c78d9fe642c0b8d858894c9a5271da51b465831a6e718c935a2",
           "NetworkSettings": {
        "IPAddress": "172.17.0.2",
        "IPPrefixLen": 16,
        "Gateway": "172.17.42.1",
        "Bridge": "docker0",
        "PortMapping": null,
        "Ports": {
            "22/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49195"
                }
            ]
        }
           },
           "ResolvConfPath": "/etc/resolv.conf",
           "HostnamePath": "/var/lib/docker/containers/e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652/hostname",
           "HostsPath": "/var/lib/docker/containers/e79dce6f1a2b245a4314ece432944f0e93f33783b6e75d76783cbed778b2b652/hosts",
           "Name": "/dreamy_davinci5",
           "Driver": "aufs",
           "Volumes": {},
           "VolumesRW": {},
           "HostConfig": {
        "Binds": null,
        "ContainerIDFile": "",
        "LxcConf": [],
        "Privileged": false,
        "PortBindings": {
            "22/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49195"
                }
            ]
        },
        "Links": null,
        "PublishAllPorts": false
           }
       }]
       Finished creating <default-debian-74> (0m0.55s).
-----> Converging <default-debian-74>...
       Preparing files for transfer
       Resolving cookbook dependencies with Berkshelf 2.0.14...
       Removing non-cookbook files before transfer
       Preparing data bags
       Preparing environments
       Preparing encrypted data bag secret
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
$$$$$$ [SSH] connection failed, terminating (#<Net::SSH::Disconnect: connection closed by remote host>)
>>>>>> Converge failed on instance <default-debian-74>.
>>>>>> Please see .kitchen/logs/default-debian-74.log for more details
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: connection closed by remote host
>>>>>> ----------------------
daniel.givens@jenkins-n02:~/fusion$ ssh kitchen@localhost -p 49195
The authenticity of host '[localhost]:49195 ([127.0.0.1]:49195)' can't be established.
ECDSA key fingerprint is 84:80:c6:6b:86:bd:47:ed:35:53:0c:e2:99:07:bd:99.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[localhost]:49195' (ECDSA) to the list of known hosts.
kitchen@localhost's password: 
Linux e79dce6f1a2b 3.11.0-13-generic #20-Ubuntu SMP Wed Oct 23 07:38:26 UTC 2013 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
kitchen@e79dce6f1a2b:~$ 
@dgivens
Author

dgivens commented Mar 7, 2014

I'm finding that test-kitchen with vagrant-lxc is a better fit.

@dgivens dgivens closed this as completed Mar 7, 2014
@portertech portertech reopened this Mar 10, 2014
@portertech
Contributor

@fnichol can we increase the number of SSH attempts for asset scp, or make it configurable?

@portertech
Contributor

Hmm, wait_for_ssh() should be avoiding this issue.

@damm

damm commented Mar 10, 2014

Isn't there some bit about Ubuntu restarting sshd at startup? That could be why the connection is closed by the remote host.
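If sshd really is restarting (or not yet up) while the port already accepts TCP connections, one way to check is to wait for the SSH identification string rather than just an open port. Below is a minimal Ruby sketch under that assumption, which nothing in this thread confirms. `ssh_banner?` and `wait_for_ssh_banner` are hypothetical helper names, not part of Test Kitchen or kitchen-docker, and the defaults (20 attempts, 3 seconds) are arbitrary.

```ruby
require "socket"
require "timeout"

# Hypothetical helper: true only if the first line the server sends
# looks like an SSH identification string ("SSH-2.0-...").
# Simplification: a server is allowed to send other lines before the
# identification string; this sketch only inspects the first line.
def ssh_banner?(host, port, timeout: 3)
  Timeout.timeout(timeout) do
    TCPSocket.open(host, port) do |sock|
      line = sock.gets
      !line.nil? && line.start_with?("SSH-")
    end
  end
rescue Timeout::Error, SystemCallError, IOError, SocketError
  # Refused, reset, closed early, or too slow: treat as "not ready".
  false
end

# Hypothetical helper: poll until the banner shows up, pausing between
# attempts. Returns true on success, false once attempts are exhausted.
def wait_for_ssh_banner(host, port, attempts: 20, delay: 3)
  attempts.times do
    return true if ssh_banner?(host, port)
    sleep delay
  end
  false
end
```

For example, `wait_for_ssh_banner("localhost", 49195)` could be run by hand against the port from the log above before retrying `kitchen converge`.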

@dustinmm80

I'm also running into this issue, running on Jenkins:

       Finished converging <building-tagger-ubuntu-1204> (4m30.00s).
-----> Setting up <building-tagger-ubuntu-1204>...
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
       [SSH] connection failed, retrying (#<Net::SSH::Disconnect: connection closed by remote host>)
$$$$$$ [SSH] connection failed, terminating (#<Net::SSH::Disconnect: connection closed by remote host>)

@fnichol
Contributor

fnichol commented Mar 29, 2014

After looking into this, it looks like the code that establishes an SSH connection in Test Kitchen retries but doesn't pause between attempts (unlike the #wait_for_ssh logic). As a result @portertech and I have been testing out test-kitchen/test-kitchen#399 today with reasonable success. I'd like to add a bit more configuration for Driver authors and then merge it into Test Kitchen core.
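To illustrate the difference between retrying immediately and retrying with a pause, here is a minimal Ruby sketch. It is not the code from test-kitchen/test-kitchen#399 or from Test Kitchen itself; `with_retries`, its parameters, and the `sleeper` hook are hypothetical names chosen for this example.

```ruby
# Hypothetical helper (not Test Kitchen's implementation): run the
# block, and on failure sleep for `delay` seconds before trying again,
# up to `attempts` tries in total. The last error is re-raised.
def with_retries(attempts: 3, delay: 1.0, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError => e
    raise if tries >= attempts
    warn "[SSH] connection failed, retrying in #{delay}s (#{e.class}: #{e.message})"
    sleeper.call(delay) # the pause that an immediate retry loop lacks
    retry
  end
end

# Stand-in for an SSH connect that fails twice before succeeding.
calls = 0
result = with_retries(attempts: 5, delay: 0.01) do
  calls += 1
  raise IOError, "connection closed by remote host" if calls < 3
  :connected
end
```

With a real connection, the block body would be whatever opens the SSH session. The pause gives sshd inside the container time to finish starting between attempts.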

@bplunkert

I too am experiencing this issue, and I believe test-kitchen/test-kitchen/pull/454 may offer a workaround and/or solution if it is merged.

@vitalis

vitalis commented Sep 12, 2014

I'm facing the same issue.
Did someone find a solution?

@Yserz

Yserz commented Dec 22, 2014

Is it possible that this issue is not with SSH itself but with SCP? Can you try to copy something into the container with SCP, or install openssh-server and openssh-client (SCP is in the client package) on the container?

@xacaxulu

+1

@azazi-sa

+1

-----> Starting Kitchen (v1.15.0)
-----> Creating <default-centos-73>...
       Sending build context to Docker daemon 24.58 MB
       Step 1/7 : FROM centos:7
        ---> 67591570dd29
       Step 2/7 : MAINTAINER "msameera" <[email protected]>
        ---> Using cache
        ---> dfeeb2440e5a
       Step 3/7 : ENV container docker
        ---> Using cache
        ---> 3cdba08c07a6
       Step 4/7 : EXPOSE 32773
        ---> Running in 583109b9c836
        ---> 779a403e4c47
       Removing intermediate container 583109b9c836
       Step 5/7 : RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ $i == systemd-tmpfiles-setup.service ] || rm -f $i; done); rm -f /lib/systemd/system/multi-user.target.wants/*;rm -f /etc/systemd/system/*.wants/*;rm -f /lib/systemd/system/local-fs.target.wants/*; rm -f /lib/systemd/system/sockets.target.wants/*udev*; rm -f /lib/systemd/system/sockets.target.wants/*initctl*; rm -f /lib/systemd/system/basic.target.wants/*;rm -f /lib/systemd/system/anaconda.target.wants/*;
        ---> Running in 2451fba617e7
        ---> 2bcb0ff84da1
       Removing intermediate container 2451fba617e7
       Step 6/7 : VOLUME /sys/fs/cgroup
        ---> Running in 0166c6ef6be3
        ---> 509d6f7a7309
       Removing intermediate container 0166c6ef6be3
       Step 7/7 : CMD /usr/sbin/init
        ---> Running in fd5db50d5441
        ---> 2e0bda625ddc
       Removing intermediate container fd5db50d5441
       Successfully built 2e0bda625ddc
       f65e491a3cd55b21cdf56f0bbc308776ae82f0ad9fa7256fcf3d74d2a53e276e
       0.0.0.0:32774
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds
       Waiting for SSH service on localhost:32774, retrying in 3 seconds

@jcalonso

I'm having the same problem and I found a workaround:

  • Run kitchen converge
  • Wait for the message Waiting for SSH service on localhost:32773, retrying in 3 seconds
  • In another terminal, find the container ID with docker ps, then enter the container: docker exec -it xxxxxx bash
  • Reset the kitchen user's password: passwd kitchen
  • Now kitchen is able to log in to the container

@rgarrigue

rgarrigue commented Feb 12, 2019

I have the same problem:

       Waiting for SSH service on localhost:32776, retrying in 3 seconds
       Waiting for SSH service on localhost:32776, retrying in 3 seconds
       Waiting for SSH service on localhost:32776, retrying in 3 seconds

I got rid of it by commenting out run_command: /lib/systemd/systemd.

The follow-up problem is that, of course, no services are running, so I can't test my stuff properly :-(

@EugenMayer

I have the same issue when starting systemd. It seems the default startup command, which run_command: /bin/systemd replaces, takes care of the SSH server setup, so that setup is missing once the command is overridden.

Since I require upstart / systemd, Docker is a roadblock for me in https://github.com/EugenMayer/chef-tinc-cookbook/blob/master/.kitchen.docker.yml

@rgarrigue

Here's our kitchen.yml, which includes our workaround for this issue: the first provision command. The second one fixes a testinfra-related issue, which may be useful for other tools relying on /sbin/init.

---
driver:
  name: docker
  use_sudo: false
  provision_command:
    - rm /lib/systemd/system/ssh.service
    - '[ ! -f /sbin/init ] && ln -s /lib/systemd/systemd /sbin/init || true'
  run_command: /bin/systemd
  privileged: true
  volume:
    - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
  dns:
    - 1.1.1.1
    - 9.9.9.9


transport:
  name: sftp


platforms:
  - name: stretch
    driver_config:
      image: jrei/systemd-debian:9
      platform: debian
  - name: buster
    driver_config:
      image: jrei/systemd-debian:10
      platform: debian

suites:
  - name: nitrogen
    provisioner:
      salt_bootstrap_options: -X -p git -x python2.7 stable 2017.7
  - name: fluorine
    provisioner:
      salt_bootstrap_options: -X -p git -x python2.7 stable 2019.2


provisioner:
  name: salt_solo
  salt_install: bootstrap
  is_file_root: true
  require_chef: true
  salt_copy_filter:
    - .git
  dependencies:
  - name: common
    repo: git
    source: https://gitlab+deploy-token-6:[email protected]/salt/common-formula
    branch: dev
  state_top:
    base:
      "*":
        - bender
  pillars_from_files:
    pillar.sls: test/pillar.sls
  pillars:
    top.sls:
      base:
        "*":
          - pillar


verifier:
  name: shell
  remote_exec: false
  command: pytest --junitxml=test/${KITCHEN_INSTANCE}_test_report.xml --html=test/${KITCHEN_INSTANCE}_test_report.html --self-contained-html --color=yes --host="docker://root@${KITCHEN_CONTAINER_ID}" "test/integration/"

@EugenMayer

EugenMayer commented May 28, 2019

I fixed it just now by doing:

  run_command: /bin/systemd
  provision_command:
    - apt-get install systemd -y
  disable_upstart: false
