
Update rhcos to 42.80.20190827.1 #766

Merged: 1 commit merged into openshift-metal3:master on Aug 29, 2019

Conversation

stbenjam (Member)

We'll need this when openshift/installer#2264 lands.

Eventually we can avoid this if and when openshift/installer#2092 lands, otherwise there's not a convenient way to get the data out of the installer.

@hardys commented Aug 27, 2019

I've been testing this locally, and the bootstrap VM doesn't come up correctly. I'm still trying to figure out why, but it's also been reproduced by @karmab and a few other folks. Reverting openshift/installer to 0454021 appears to resolve the issue, so I suspect some badness in the new qemu image.

@stbenjam added the "CI check this PR with CI" label on Aug 27, 2019
@metal3ci: Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1099/

@stbenjam (Member, Author) commented Aug 27, 2019

On the latest rhcos, the code that retries while waiting for the etcd cluster to come up isn't working...

Aug 27 18:22:46 localhost bootkube.sh[1669]: https://etcd-0.ostest.test.metalkube.org:2379 is unhealthy: failed to connect: dial tcp: lookup etcd-0.ostest.test.metalkube.org on 192.168.111.1:53: no such host
Aug 27 18:22:46 localhost bootkube.sh[1669]: Error: unhealthy cluster
Aug 27 18:22:46 localhost bootkube.sh[1669]: etcd cluster up. Killing etcd certificate signer...

This should retry:

https://github.com/openshift/installer/blob/master/data/data/bootstrap/files/usr/local/bin/bootkube.sh.template#L272-L288

but it doesn't.
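
For context, the pattern in that part of the template is a shell loop that keeps re-running the etcdctl health check through podman until it succeeds, so it can only keep retrying if a failed check actually exits non-zero. A rough sketch of the pattern (not the actual template; ETCD_IMAGE and ETCD_ENDPOINTS here are placeholders for whatever the template defines, and the waiting message is illustrative):

# Rough sketch of the bootkube.sh retry pattern, simplified.
# The loop only retries while the health check exits non-zero.
until bootkube_podman_run --rm --name etcdctl --env ETCDCTL_API=3 \
        --volume /opt/openshift/tls:/opt/openshift/tls:ro,z \
        --entrypoint etcdctl "${ETCD_IMAGE}" \
        --dial-timeout=5s \
        --cacert=/opt/openshift/tls/etcd-ca-bundle.crt \
        --cert=/opt/openshift/tls/etcd-client.crt \
        --key=/opt/openshift/tls/etcd-client.key \
        --endpoints="${ETCD_ENDPOINTS}" \
        endpoint health
do
    echo "etcd cluster not available yet, waiting 5 seconds..."
    sleep 5
done
echo "etcd cluster up. Killing etcd certificate signer..."

If podman reports exit status 0 even when etcdctl fails, the until condition is satisfied on the first pass and the loop falls straight through, which matches the log above.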

@stbenjam (Member, Author) commented Aug 27, 2019

I can verify it's exiting 0, so we never retry:

# bootkube_podman_run \
>                 --rm \
>                 --name etcdctl \
>                 --env ETCDCTL_API=3 \
>                 --volume /opt/openshift/tls:/opt/openshift/tls:ro,z \
>                 --entrypoint etcdctl \
>                 6f3f175b53ca \
>                 --dial-timeout=5s \
>                 --cacert=/opt/openshift/tls/etcd-ca-bundle.crt \
>                 --cert=/opt/openshift/tls/etcd-client.crt \
>                 --key=/opt/openshift/tls/etcd-client.key \
>                 --endpoints=https://nonsense.example:2379 \
>                 endpoint health
https://nonsense.example:2379 is unhealthy: failed to connect: dial tcp: lookup nonsense.example on 127.0.0.1:53: no such host
Error: unhealthy cluster
[root@localhost openshift]# echo $?
0

Here's the full debug logs:

https://gist.github.com/stbenjam/24bf10326664ddee968df407c716e20b

podman version:

[root@localhost openshift]# podman --version
podman version 1.4.2-stable1

I don't notice any changes to etcd; its last commit is from July.

@stbenjam (Member, Author)

Interactively in the container, it does return 1:

[root@localhost /]# etcdctl --dial-timeout=5s --cacert=/opt/openshift/tls/etcd-ca-bundle.crt --cert=/opt/openshift/tls/etcd-client.crt --key=/opt/openshift/tls/etcd-client.key --endpoints=https://nonsense.example:2379  endpoint health
https://nonsense.example:2379 is unhealthy: failed to connect: dial tcp: lookup nonsense.example on 127.0.0.1:53: no such host
Error: unhealthy cluster
[root@localhost /]# echo $?
1
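
So etcdctl itself behaves correctly, and the non-zero status is being lost somewhere in the podman invocation; podman run is supposed to exit with the container's exit status. A quick way to confirm that independently of etcdctl is to run a container that deliberately exits non-zero (a sketch only; it assumes the image referenced above ships a shell):

# podman run should exit with the container's exit status;
# on a broken podman this prints 0 instead of 7.
podman run --rm --entrypoint sh 6f3f175b53ca -c 'exit 7'
echo $?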

@stbenjam (Member, Author)

Summary: etcdctl exits non-zero inside the container, but when run via bootkube_podman_run the exit status comes back as 0, so the retry loop in bootkube.sh never retries. It looks like a podman exit-status propagation problem rather than a change in etcd.

@stbenjam (Member, Author) commented Aug 28, 2019

42.80.20190827.1 is out with the fixes; I got a good local install.

openshift/installer#2277 bumps the installer rhcos.
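
For reference, the change in this PR itself is essentially the version bump in ocp_install_env.sh that points the dev scripts at the new RHCOS build. Roughly like this (the variable name below is illustrative only; the real name and image URL are whatever ocp_install_env.sh defines):

# Illustrative sketch only: bump the RHCOS build used by the dev scripts.
RHCOS_VERSION="42.80.20190827.1"   # previously 42.80.20190823.0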

@stbenjam changed the title from "Update rhcos to 42.80.20190823.0" to "Update rhcos to 42.80.20190827.1" on Aug 28, 2019
@metal3ci: Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1103/

@metal3ci: Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1104/

@metal3ci: Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1105/

@hardys commented Aug 28, 2019

OK, with openshift/installer#2277 and rebased to include #775, this is working for me locally 👍

@stbenjam thanks for all the effort yesterday debugging the podman issues! :)

Review comment on ocp_install_env.sh (outdated, resolved)
@metal3ci: Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1107/

@metal3ci: Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1108/

@hardys self-requested a review on August 28, 2019 at 18:14
@hardys commented Aug 28, 2019

Proven locally and in CI, so this should be good to go once the installer PR lands and we remove the temporary changes added here to test it 👍

@stbenjam (Member, Author)

Great, thanks! The openshift/installer PR landed. We should get a nightly with that fix in about an hour.

@hardys commented Aug 29, 2019

Looks like the nightly picked up the new image, so I removed the changes that ran the installer from source and force-pushed. I then saw #778, which looks good too; we can discuss later which approach you'd prefer to go with.

@metal3ci: Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1111/

@russellb (Member)

Since CI passed here, let's merge this. #778 can go in on top of it later.

@russellb merged commit 88a71ad into openshift-metal3:master on Aug 29, 2019