We recently tried upgrading our etcd cluster from 3.1 to 3.2. After each node restart, the affected node (we run a 9-node cluster) seemed to come back up and reported its 3.2 version number correctly.
However, after the leader was restarted, the entire cluster fell apart.
Even removing the data folder and restarting from scratch could not restore operation.
We eventually traced this back to having multiple URLs defined in ETCD_LISTEN_PEER_URLS.
We used to set:
ETCD_LISTEN_PEER_URLS=http://127.0.0.1:2380,http://SOMEIP:2380
which seems to have broken the peer listener somehow with 3.2.
When we changed it to:
ETCD_LISTEN_PEER_URLS=http://SOMEIP:2380
the cluster bootstrapped correctly again and resumed operation with the 3.2 feature set.
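For anyone who wants to check their own configuration before upgrading, here is a rough pre-flight sketch in Python. It only splits the comma-separated value and flags the two things that differed between our broken and working setups (more than one URL, and a loopback address). The function name `check_listen_peer_urls` is made up for this example and is not part of etcd, and the warnings reflect our observation only, not a confirmed root cause.

```python
import ipaddress
from urllib.parse import urlsplit


def check_listen_peer_urls(value):
    """Split an ETCD_LISTEN_PEER_URLS value and collect warnings.

    Hypothetical helper: the checks mirror what differed between our
    broken and working configs, nothing more.
    """
    urls = [u.strip() for u in value.split(",") if u.strip()]
    warnings = []
    if len(urls) > 1:
        warnings.append(
            f"{len(urls)} peer listen URLs configured; "
            "our 3.2 peer listener hung with more than one"
        )
    for url in urls:
        host = urlsplit(url).hostname
        try:
            if host and ipaddress.ip_address(host).is_loopback:
                warnings.append(f"{url} is a loopback address")
        except ValueError:
            # Host is a name rather than an IP literal; nothing to check.
            pass
    return urls, warnings


if __name__ == "__main__":
    # 10.0.0.5 is a placeholder for SOMEIP.
    for line in check_listen_peer_urls(
        "http://127.0.0.1:2380,http://10.0.0.5:2380"
    )[1]:
        print("warning:", line)
```

A single non-loopback URL, like the configuration that worked for us, produces no warnings.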
In fact, when probing manually with curl in the broken scenario above, the HTTP server on port 2380 accepted TCP connections but never replied with anything; the request just sat there and was never processed by etcd.
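The same check can be scripted so it does not block forever. The sketch below is an approximation of what we did by hand with curl, not the exact command we ran: it opens a TCP connection, sends a plain HTTP GET, and waits for the first bytes with a timeout. The function name `probe_peer`, the result labels, and the `/version` request path are assumptions made for this example.

```python
import socket


def probe_peer(host, port, timeout=3.0):
    """Classify how a TCP endpoint reacts to a plain HTTP GET.

    Returns one of:
      "no-connect"  the TCP connection could not be established
      "hang"        connection accepted, but no bytes came back in time
      "closed"      connection accepted, then closed without a reply
      "replied"     at least one byte of response was received
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "no-connect"
    with sock:
        sock.settimeout(timeout)
        request = (
            f"GET /version HTTP/1.1\r\n"  # path is an assumption
            f"Host: {host}:{port}\r\n"
            f"Connection: close\r\n\r\n"
        ).encode("ascii")
        try:
            sock.sendall(request)
            data = sock.recv(1024)
        except socket.timeout:
            return "hang"
        except OSError:
            return "closed"
        return "replied" if data else "closed"


if __name__ == "__main__":
    # Replace with the peer address of the node under test.
    print(probe_peer("127.0.0.1", 2380))
```

In our broken state we would expect this to report "hang", matching the curl behaviour: the connection is accepted but nothing ever comes back.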
Downgrading to 3.1 in this broken state also worked and restored operation, so this seems to be a 3.2-specific regression / unintentional breaking change.