"vagrant up provisioner" failing on Ubuntu 20.04.2 with libvirtd backend #59

Closed
nshalman opened this issue Mar 2, 2021 · 8 comments · Fixed by #81
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

nshalman (Member) commented Mar 2, 2021

I have a machine running Ubuntu 20.04.2 where, following the documentation, the vagrant up provisioner step fails (detailed below). I was able to bisect the failure to commit 9edecbf.

Expected Behaviour

vagrant up provisioner should succeed

Current Behaviour

vagrant up provisioner fails

The last bit of the output is:

    provisioner: + cd /certs
    provisioner: + '[' '!' -f ca-key.pem ]
    provisioner: + cfssl gencert -initca ca.json
    provisioner: + cfssljson -bare ca
    provisioner: 2021/03/03 16:13:19 [INFO] generating a new CA key and certificate from CSR
    provisioner: 2021/03/03 16:13:19 [INFO] generate received request
    provisioner: 2021/03/03 16:13:19 [INFO] received CSR
    provisioner: 2021/03/03 16:13:19 [INFO] generating key: rsa-2048
    provisioner: 2021/03/03 16:13:19 [INFO] encoded CSR
    provisioner: 2021/03/03 16:13:19 [INFO] signed certificate with serial number 6198577501498366853967432593968947302343768126
    provisioner: + '[' '!' -f
    provisioner:  server.pem ]
    provisioner: + cfssl
    provisioner:  gencert
    provisioner:  '-ca=ca.pem'
    provisioner:  '-ca-key=ca-key.pem'
    provisioner:  '-config=/ca-config.json' '-profile=server' server-csr.json
    provisioner: + cfssljson -bare server
    provisioner: 2021/03/03 16:13:19 [INFO] generate received request
    provisioner: 2021/03/03 16:13:19 [INFO] received CSR
    provisioner: 2021/03/03 16:13:19 [INFO] generating key: rsa-2048
    provisioner: 2021/03/03 16:13:19 [INFO] encoded CSR
    provisioner: 2021/03/03 16:13:19 [INFO] signed certificate with serial number 185473496253083687571958641817618602018650548890
    provisioner: +
    provisioner: cat
    provisioner:  server.pem
    provisioner:  ca.pem
    provisioner: +
    provisioner: cmp
    provisioner:  -s
    provisioner:  bundle.pem.tmp
    provisioner:  bundle.pem
    provisioner: +
    provisioner: mv
    provisioner:  bundle.pem.tmp
    provisioner:  bundle.pem
    provisioner: Error: No such object:
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

Possible Solution

Unclear, but git bisect blames 9edecbf
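For reference, the bisect can be run roughly like this (a sketch; `<last-good-sha>` is a placeholder for a commit known to work):

```bash
git bisect start
git bisect bad HEAD
git bisect good <last-good-sha>
# At each step git checks out a candidate commit; test it and report back:
(cd deploy/vagrant && { vagrant destroy -f; vagrant up provisioner; }) \
  && git bisect good || git bisect bad
# Repeat the test/report step until git names the first bad commit
# (9edecbf in this case), then clean up:
git bisect reset
```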

Steps to Reproduce (for bugs)

  1. Take an existing Ubuntu 20.04.2 machine with Vagrant and libvirt installed
  2. Clone this repo
  3. cd deploy/vagrant
  4. Run vagrant up provisioner (see the sketch below)
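A minimal end-to-end reproduction sketch, assuming the vagrant-libvirt plugin is already installed; `<repo-url>` stands in for this repository's clone URL:

```bash
git clone <repo-url> sandbox
cd sandbox/deploy/vagrant
vagrant up --provider=libvirt provisioner
```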

Context

I am unable to use Vagrant and Libvirt to bring up the provisioner.

Your Environment

  • Operating System and version (e.g. Linux, Windows, MacOS):
    Ubuntu 20.04.2

  • How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details:
    Vagrant and Libvirt

gianarb (Contributor) commented Mar 3, 2021

Thank you for taking the time to do a bisect! Apparently my fix in #57 was addressing something entirely different.

gianarb (Contributor) commented Mar 3, 2021

This is how I managed to reproduce the issue on Equinix Metal Ubuntu 20.04:

$ apt update
$ apt install qemu libvirt-daemon-system libvirt-clients libxslt-dev libxml2-dev libvirt-dev zlib1g-dev ruby-dev ruby-libvirt ebtables dnsmasq-base qemu-utils build-essential git qemu-kvm
$ curl -O https://releases.hashicorp.com/vagrant/2.2.9/vagrant_2.2.9_x86_64.deb
$ apt install -y ./vagrant_2.2.9_x86_64.deb
$ vagrant plugin install vagrant-libvirt
$ vagrant up --provider=libvirt provisioner

gianarb (Contributor) commented Mar 3, 2021

Just to centralize information:

#57 (comment)

@markjacksonfishing markjacksonfishing self-assigned this Apr 2, 2021
@markjacksonfishing markjacksonfishing added the kind/bug Categorizes issue or PR as related to a bug. label Apr 2, 2021
markjacksonfishing commented

I will take a look at this. Thank you @nshalman for providing this information.

zdyxry commented Apr 8, 2021

I have the same problem. Is there any solution yet?

zdyxry commented Apr 8, 2021

OS version:

NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Vagrant up log:

...
    provisioner: + make_certs_writable
    provisioner: + local certdir=/etc/docker/certs.d/192.168.1.1
    provisioner: + sudo mkdir -p /etc/docker/certs.d/192.168.1.1
    provisioner: + sudo chown -R root /etc/docker/certs.d/192.168.1.1
    provisioner: + ./setup.sh
    provisioner: INFO: starting tinkerbell stack setup
    provisioner: INFO: verifying prerequisites for ubuntu (18.04)
    provisioner:        Found prerequisite: docker
    provisioner:        Found prerequisite: docker-compose
    provisioner:        Found prerequisite: ip
    provisioner:        Found prerequisite: jq
    provisioner:        Found prerequisite: netplan
    provisioner: INFO: tinkerbell network interface is already configured
    provisioner: INFO: found existing osie files, skipping osie setup
    provisioner: Sending build context to Docker daemon  8.704kB
    provisioner: Step 1/5 : FROM alpine:3.11
    provisioner:  ---> 44dc5a8658dc
    provisioner: Step 2/5 : ENTRYPOINT [ "/entrypoint.sh" ]
    provisioner:  ---> Using cache
    provisioner:  ---> e585b202977c
    provisioner: Step 3/5 : RUN apk add --no-cache --update --upgrade ca-certificates postgresql-client
    provisioner:  ---> Using cache
    provisioner:  ---> ad3e1f23d62b
    provisioner: Step 4/5 : RUN apk add --no-cache --update --upgrade --repository=http://dl-cdn.alpinelinux.org/alpine/edge/testing cfssl
    provisioner:  ---> Using cache
    provisioner:  ---> 526eb5787443
    provisioner: Step 5/5 : COPY . .
    provisioner:  ---> Using cache
    provisioner:  ---> e2064efd9051
    provisioner: Successfully built e2064efd9051
    provisioner: Successfully tagged tinkerbell-certs:latest
    provisioner: creating directory
    provisioner: + cd /certs
    provisioner: + '[' '!' -f ca-key.pem ]
    provisioner: + '[' '!' -f server.pem ]
    provisioner: + cat server.pem ca.pem
    provisioner: + cmp -s bundle.pem.tmp bundle.pem
    provisioner: + rm bundle.pem.tmp
    provisioner: Error: No such object:

Boskovits commented

I see exactly the same on Debian 10.7.

@mmlb mmlb assigned mmlb and unassigned markjacksonfishing Apr 26, 2021
mmlb (Contributor) commented Apr 26, 2021

I came across this too and have tracked it down. I'll open a PR tomorrow.

mmlb added a commit that referenced this issue Apr 27, 2021
This fixes the Vagrant-based sandbox not working. It was particularly
annoying to track down because `setup.sh` does not have `set -x`, yet what
looks like xtrace output shows up on stderr. That xtrace output was actually
coming from the `generate_certificates` container:

```
    provisioner: 2021/04/26 21:22:32 [INFO] signed certificate with serial number 142120228981443865252746731124927082232998754394
    provisioner: + cat
    provisioner:  server.pem
    provisioner:  ca.pem
    provisioner: + cmp
    provisioner:  -s
    provisioner:  bundle.pem.tmp
    provisioner:  bundle.pem
    provisioner: + mv
    provisioner:  bundle.pem.tmp
    provisioner:  bundle.pem
    provisioner: Error: No such object:
==> provisioner: Clearing any previously set forwarded ports...
==> provisioner: Removing domain...
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.
```
I ended up doubting the `if ! cmp` blocks until I added `set -euxo pipefail`, at which point
the issue was pretty obviously in docker-compose land.
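For reference, the debugging options amount to something like the following near the top of `setup.sh` (a sketch; the PS4 prefix is an extra touch to make it obvious which script a trace line comes from, not something the repo ships):

```bash
#!/usr/bin/env bash
# Fail fast and trace every command while debugging.
set -euxo pipefail
# Tag xtrace lines with the script name and line number so they are not
# confused with xtrace output coming from containers.
export PS4='+ setup.sh:${LINENO}: '
```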

```
$ vagrant destroy -f; vagrant up provisioner
==> worker: Domain is not created. Please run `vagrant up` first.
==> provisioner: Domain is not created. Please run `vagrant up` first.
Bringing machine 'provisioner' up with 'libvirt' provider...
==> provisioner: Checking if box 'tinkerbelloss/sandbox-ubuntu1804' version '0.1.0' is up to date...
==> provisioner: Creating image (snapshot of base box volume).
==> provisioner: Creating domain with the following settings...
...
    provisioner: 2021/04/27 18:20:13 [INFO] signed certificate with serial number 138080403356863347716407921665793913032297783787
    provisioner: + cat server.pem ca.pem
    provisioner: + cmp -s bundle.pem.tmp bundle.pem
    provisioner: + mv bundle.pem.tmp bundle.pem
    provisioner: + local certs_dir=/etc/docker/certs.d/192.168.1.1
    provisioner: + cmp --quiet /vagrant/deploy/state/certs/ca.pem /vagrant/deploy/state/webroot/workflow/ca.pem
    provisioner: + cp /vagrant/deploy/state/certs/ca.pem /vagrant/deploy/state/webroot/workflow/ca.pem
    provisioner: + cmp --quiet /vagrant/deploy/state/certs/ca.pem /etc/docker/certs.d/192.168.1.1/tinkerbell.crt
    provisioner: + [[ -d /etc/docker/certs.d/192.168.1.1/ ]]
    provisioner: + cp /vagrant/deploy/state/certs/ca.pem /etc/docker/certs.d/192.168.1.1/tinkerbell.crt
    provisioner: + setup_docker_registry
    provisioner: + local registry_images=/vagrant/deploy/state/registry
    provisioner: + [[ -d /vagrant/deploy/state/registry ]]
    provisioner: + mkdir -p /vagrant/deploy/state/registry
    provisioner: + start_registry
    provisioner: + docker-compose -f /vagrant/deploy/docker-compose.yml up --build -d registry
    provisioner: + check_container_status registry
    provisioner: + local container_name=registry
    provisioner: + local container_id
    provisioner: ++ docker-compose -f /vagrant/deploy/docker-compose.yml ps -q registry
    provisioner: + container_id=
    provisioner: + local start_moment
    provisioner: + local current_status
    provisioner: ++ docker inspect '' --format '{{ .State.StartedAt }}'
    provisioner: Error: No such object:
    provisioner: + start_moment=
    provisioner: + finish
    provisioner: + rm -rf /tmp/tmp.ve3XJ7qtgA
```

Notice that `container_id` is empty. This turns out to be because
`docker-compose` is an empty file!

```
vagrant@provisioner:/vagrant/deploy$ docker-compose up --build registry
vagrant@provisioner:/vagrant/deploy$ which docker-compose
/usr/local/bin/docker-compose
vagrant@provisioner:/vagrant/deploy$ docker-compose -h
vagrant@provisioner:/vagrant/deploy$ file /usr/local/bin/docker-compose
/usr/local/bin/docker-compose: empty
```
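Given that the binary ended up empty, a defensive check along these lines (purely hypothetical, not part of the repo) would have surfaced the problem much earlier:

```bash
# Refuse to continue if docker-compose is missing, empty, or not executable.
compose_bin=$(command -v docker-compose || true)
if [[ -z "$compose_bin" || ! -s "$compose_bin" || ! -x "$compose_bin" ]]; then
    echo "ERROR: docker-compose binary '${compose_bin:-<not found>}' is missing, empty, or not executable" >&2
    exit 1
fi
```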

So with the following test patch:

```diff
diff --git a/deploy/vagrant/scripts/tinkerbell.sh b/deploy/vagrant/scripts/tinkerbell.sh
index 915f27f..dcb379c 100644
--- a/deploy/vagrant/scripts/tinkerbell.sh
+++ b/deploy/vagrant/scripts/tinkerbell.sh
@@ -34,6 +34,14 @@ setup_nat() (
 main() (
 	export DEBIAN_FRONTEND=noninteractive

+	local name=docker-compose-$(uname -s)-$(uname -m)
+	local url=https://github.com/docker/compose/releases/download/1.26.0/$name
+	curl -fsSLO "$url"
+	curl -fsSLO "$url.sha256"
+	sha256sum -c <"$name.sha256"
+	chmod +x "$name"
+	sudo mv "$name" /usr/local/bin/docker-compose
+
 	if ! [[ -f ./.env ]]; then
 		./generate-env.sh eth1 >.env
 	fi
```

We can try again and we're back to a working state:

```
$ vagrant destroy -f; vagrant up provisioner
==> worker: Domain is not created. Please run `vagrant up` first.
==> provisioner: Domain is not created. Please run `vagrant up` first.
Bringing machine 'provisioner' up with 'libvirt' provider...
==> provisioner: Checking if box 'tinkerbelloss/sandbox-ubuntu1804' version '0.1.0' is up to date...
==> provisioner: Creating image (snapshot of base box volume).
==> provisioner: Creating domain with the following settings...
...
    provisioner: + setup_docker_registry
    provisioner: + local registry_images=/vagrant/deploy/state/registry
    provisioner: + [[ -d /vagrant/deploy/state/registry ]]
    provisioner: + mkdir -p /vagrant/deploy/state/registry
    provisioner: + start_registry
    provisioner: + docker-compose -f /vagrant/deploy/docker-compose.yml up --build -d registry
    provisioner: Creating network "deploy_default" with the default driver
    provisioner: Creating volume "deploy_postgres_data" with default driver
    provisioner: Building registry
    provisioner: Step 1/7 : FROM registry:2.7.1
...
    provisioner: Successfully tagged deploy_registry:latest
    provisioner: Creating deploy_registry_1 ...
Creating deploy_registry_1 ... done
    provisioner: + check_container_status registry
    provisioner: + local container_name=registry
    provisioner: + local container_id
    provisioner: ++ docker-compose -f /vagrant/deploy/docker-compose.yml ps -q registry
    provisioner: + container_id=2e3d9557fd4c0d7f7e1c091b957a0033d23ebb93f6c8e5cdfeb8947b2812845c
...
    provisioner: + sudo -iu vagrant docker login --username=admin --password-stdin 192.168.1.1
    provisioner: WARNING! Your password will be stored unencrypted in /home/vagrant/.docker/config.json.
    provisioner: Configure a credential helper to remove this warning. See
    provisioner: https://docs.docker.com/engine/reference/commandline/login/#credentials-store
    provisioner: Login Succeeded
    provisioner: + set +x
    provisioner: NEXT:  1. Enter /vagrant/deploy and run: source ../.env; docker-compose up -d
    provisioner:        2. Try executing your fist workflow.
    provisioner:           Follow the steps described in https://tinkerbell.org/examples/hello-world/ to say 'Hello World!' with a workflow.
```

:toot:

Fixes #59

Signed-off-by: Manuel Mendez <[email protected]>
mmlb added a commit that referenced this issue Apr 28, 2021
As it turns out, though, my results are not due to the way docker-compose is
being installed at all. Even with a box built using the new install method I
was still seeing empty docker-compose files. I ran a bunch of experiments to
figure out what was going on. The issue is specific to vagrant-libvirt, since
vagrant-virtualbox works fine. It turns out data isn't being flushed back to
disk at shutdown: either calling `sync` or writing multiple copies of the
binary to the filesystem (at least 3x) ended up working. Then I was pointed to
a known vagrant-libvirt issue that matches this behavior,
vagrant-libvirt/vagrant-libvirt#1013!
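A minimal sketch of the workaround described above, assuming it runs as the final provisioning step when the base box is packaged (the explicit `sync` is the point; this is not the exact change that landed):

```bash
# Final provisioning step when packaging the base box: flush dirty pages to
# disk so freshly written files such as /usr/local/bin/docker-compose survive
# the libvirt domain shutdown.
sync
sleep 5  # give the guest a moment before Vagrant powers the domain off
```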

Fixes #59

Signed-off-by: Manuel Mendez <[email protected]>
@mergify mergify bot closed this as completed in #81 Apr 29, 2021
mergify bot added a commit that referenced this issue Apr 29, 2021
## Description

Ensures docker-compose is correctly downloaded.
Also adds better debuggability to setup.sh and the Vagrant provisioning script.
A bunch of misc cleanups follow the boy scout rule (leave things better than you found them).

## Why is this needed

Fixes: #59 

## How Has This Been Tested?

`vagrant up provisioner` now works

## How are existing users impacted? What migration steps/scripts do we need?

Fixes a bug where the vagrant sandbox wasn't working.

## Checklist:

I have:

- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade