etcd start without error, but can't accept connection #2868

Closed
silenceleaf opened this issue May 25, 2015 · 13 comments

Comments

@silenceleaf

etcd2 version: 2.01

I am using the public discovery service.

My configuration:

coreos:
  etcd2:
    name: coreos1
    discovery: https://discovery.etcd.io/a0fc6f91ae40f46c1d3a6c1b4f4f34d8
    initial-cluster-token: etcd2-cluster-1
    initial-advertise-peer-urls: http://192.168.0.111:2380
    listen-peer-urls: http://192.168.0.111:2380
    advertise-client-urls: http://192.168.0.111:2379
    listen-client-urls: http://192.168.0.111:2379    
    data-dir: /home/core/etcd2/
    initial-cluster-state: new

The journal looks OK:
journalctl -b -u etcd2 --no-pager

May 25 13:21:21 coreos1 systemd[1]: Started etcd2.
May 25 13:21:21 coreos1 systemd[1]: Starting etcd2...
May 25 13:21:21 coreos1 etcd2[523]: 2015/05/25 13:21:21 etcd: listening for peers on http://192.168.0.111:2380
May 25 13:21:21 coreos1 etcd2[523]: 2015/05/25 13:21:21 etcd: listening for client requests on http://192.168.0.111:2379
May 25 13:21:21 coreos1 etcd2[523]: 2015/05/25 13:21:21 etcdserver: datadir is valid for the 2.0.1 format
May 25 13:21:22 coreos1 etcd2[523]: 2015/05/25 13:21:22 discovery: found self c5a74c1a94a0abf7 in the cluster
May 25 13:21:22 coreos1 etcd2[523]: 2015/05/25 13:21:22 discovery: found 1 peer(s), waiting for 4 more

When I use:

etcdctl -C 192.168.0.111:2379 member list

It gives me:

context deadline exceeded

When I check netstat -l

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        1      0 192.168.0.111:2379      *:*                     LISTEN
tcp        0      0 192.168.0.111:2380      *:*                     LISTEN

Recv-Q shows one waiting request on port 2379. I think this means the network is OK, but etcd2 has not consumed the data yet.

Because I can't use etcdctl, I don't know whether etcd is working.

Would you mind diagnosing this for me?

@xiang90
Contributor

xiang90 commented May 25, 2015

@silenceleaf As the log shows, etcd has not started yet. It is still waiting for the other members. You have to start the other 4 peers.
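
The waiting state described here can be read straight from the journal. Below is a minimal sketch, assuming the progress line keeps the exact wording quoted in the log above; the function names `discovery_progress` and `still_bootstrapping` are made up for illustration and are not part of etcd.

```python
import re

# Pattern for the progress line quoted in the journal output above.
PROGRESS = re.compile(r"discovery: found (\d+) peer\(s\), waiting for (\d+) more")


def discovery_progress(journal_text):
    """Return (registered, remaining) from the last progress line, or None."""
    matches = PROGRESS.findall(journal_text)
    if not matches:
        return None
    registered, remaining = matches[-1]
    return int(registered), int(remaining)


def still_bootstrapping(journal_text):
    """True while the last progress line reports members still missing."""
    progress = discovery_progress(journal_text)
    return progress is not None and progress[1] > 0


if __name__ == "__main__":
    sample = (
        "May 25 13:21:22 coreos1 etcd2[523]: 2015/05/25 13:21:22 "
        "discovery: found 1 peer(s), waiting for 4 more"
    )
    print(discovery_progress(sample))
    print(still_bootstrapping(sample))
```

One way to use it is to feed it the output of `journalctl -b -u etcd2 --no-pager`. A result of `None` only means no progress line was found, so it says nothing about whether the member is healthy.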

xiang90 closed this as completed May 25, 2015

@silenceleaf
Author

The problem is that even if I start the other 4 peers, nothing happens on machine #1. I can only get etcd working with a static configuration.

@silenceleaf
Author

The network is fine. I can ping the other machines without any issue.

@xiang90
Contributor

xiang90 commented May 25, 2015

@silenceleaf You need to give all 5 members the same discovery token. It would be helpful if you could provide the configuration and logs from all 5 members.
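
A quick way to confirm that every member uses the same discovery URL is to compare the `discovery:` lines of the cloud-config files. This is only a sketch under assumptions: it scans the raw text with a regular expression instead of parsing YAML, and the helper names `discovery_urls` and `share_one_discovery_url` are hypothetical.

```python
import re

# Matches an uncommented "discovery:" key. Lines that start with "#" after
# their indentation are skipped because "#" is not part of the pattern.
DISCOVERY = re.compile(r"^[ \t]*discovery:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def discovery_urls(configs):
    """Map each machine name to the list of discovery URLs in its config text."""
    return {name: DISCOVERY.findall(text) for name, text in configs.items()}


def share_one_discovery_url(configs):
    """True if every config sets exactly one discovery URL and all are equal."""
    found = discovery_urls(configs)
    if not found or any(len(urls) != 1 for urls in found.values()):
        return False
    return len({urls[0] for urls in found.values()}) == 1
```

Here `configs` is a dict such as `{"coreos1": text_of_first_file, "coreos2": text_of_second_file}`. A `False` result points at a member with a missing, commented-out, or different URL.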

@silenceleaf
Author

Here are two of my cloud-configs. These two machines cannot find each other.

coreos1

#cloud-config

hostname: coreos1

coreos:
  etcd2:
    name: coreos1
    discovery: https://discovery.etcd.io/a0fc6f91ae40f46c1d3a6c1b4f4f34d8
    initial-cluster-token: etcd2-cluster-1
    initial-advertise-peer-urls: http://192.168.0.111:2380
    listen-peer-urls: http://192.168.0.111:2380
    # initial-cluster: coreos1=http://192.168.0.111:2380,coreos2=http://192.168.0.112:2380,coreos3=http://192.168.0.113:2380,coreos4=http://192.168.0.114:2380,coreos5=http://192.168.0.115:2380
    advertise-client-urls: http://192.168.0.111:2379
    listen-client-urls: http://192.168.0.111:2379    
    data-dir: /home/core/etcd2/
    initial-cluster-state: new
  # fleet:
    # public-ip: 192.168.0.111
    # etcd_servers: http://192.168.0.111:2379
    # metadata: HomeCluster
  units:
    - name: 10-ens3.network
      content: |
        [Match]
        MACAddress=52:54:00:fe:b3:c1
        [Network]
        Address=192.168.0.111/24
        Gateway=192.168.0.1
        DNS=8.8.8.8
    - name: etcd2.service
      command: start
    # - name: fleet.service
    #  command: start

coreos2

#cloud-config

hostname: coreos2

coreos:
  etcd2:
    name: coreos2
    discovery: https://discovery.etcd.io/a0fc6f91ae40f46c1d3a6c1b4f4f34d8
    initial-cluster-token: etcd2-cluster-1
    initial-advertise-peer-urls: http://192.168.0.112:2380
    listen-peer-urls: http://192.168.0.112:2380
    # initial-cluster: coreos1=http://192.168.0.111:2380,coreos2=http://192.168.0.112:2380,coreos3=http://192.168.0.113:2380,coreos4=http://192.168.0.114:2380,coreos5=http://192.168.0.115:2380
    advertise-client-urls: http://192.168.0.112:2379
    listen-client-urls: http://192.168.0.112:2379    
    data-dir: /home/core/etcd2/
    initial-cluster-state: new
  fleet:
    public-ip: 192.168.0.112
  units:
    - name: 10-ens3.network
      content: |
        [Match]
        MACAddress=52:54:00:fe:b3:c2
        [Network]
        Address=192.168.0.112/24
        Gateway=192.168.0.1
        DNS=8.8.8.8
    - name: etcd2.service
      command: start
    # - name: fleet.service
    #   command: start

@silenceleaf
Author

The log on every machine says it is waiting for connections, but no connection comes in.

May 25 18:20:41 coreos2 etcd2[522]: 2015/05/25 18:20:41 discovery: found peer c5a74c1a94a0abf7 in the cluster
May 25 18:20:41 coreos2 etcd2[522]: 2015/05/25 18:20:41 discovery: found self 5b9b099334e46510 in the cluster
May 25 18:20:41 coreos2 etcd2[522]: 2015/05/25 18:20:41 discovery: found 2 peer(s), waiting for 3 more
May 25 18:21:40 coreos2 etcd2[522]: 2015/05/25 18:21:40 discovery: during waiting for other nodes connection to https://discovery.etcd.io timed out, retrying in 4h33m4s

@xiang90
Contributor

xiang90 commented May 25, 2015

@silenceleaf That is great. The two you gave me work well. You just need to add 3 more.

@xiang90
Contributor

xiang90 commented May 25, 2015

If you do not add 3 more, it will simply wait forever.

@silenceleaf
Author

@xiang90 I assumed that even with only two machines, they would try to connect to each other. Let me start more machines and try again.

@silenceleaf
Author

This is the log of the third machine. It seems to find the other peers through the discovery service. But why does no one try to connect to the other peers?

May 25 21:41:15 coreos3 systemd[1]: Started etcd2.
May 25 21:41:15 coreos3 systemd[1]: Starting etcd2...
May 25 21:41:15 coreos3 etcd2[514]: 2015/05/25 21:41:15 etcd: listening for peers on http://192.168.0.113:2380
May 25 21:41:15 coreos3 etcd2[514]: 2015/05/25 21:41:15 etcd: listening for client requests on http://192.168.0.113:2379
May 25 21:41:15 coreos3 etcd2[514]: 2015/05/25 21:41:15 etcdserver: datadir is valid for the 2.0.1 format
May 25 21:41:16 coreos3 etcd2[514]: 2015/05/25 21:41:16 discovery: found peer c5a74c1a94a0abf7 in the cluster
May 25 21:41:16 coreos3 etcd2[514]: 2015/05/25 21:41:16 discovery: found peer 5b9b099334e46510 in the cluster
May 25 21:41:16 coreos3 etcd2[514]: 2015/05/25 21:41:16 discovery: found self 3bfc2eb37a9d49af in the cluster
May 25 21:41:16 coreos3 etcd2[514]: 2015/05/25 21:41:16 discovery: found 3 peer(s), waiting for 2 more

@xiang90
Contributor

xiang90 commented May 25, 2015

@silenceleaf

I think you need to read the doc here: https://github.com/coreos/etcd/blob/master/Documentation/clustering.md#etcd-discovery. As I said before, when you are using the discovery service, etcd will not start until all of your members have registered.

I do not know why you think they should start to connect to each other. I can tell you they WILL NOT.
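
The behaviour described in this comment can be pictured as a barrier: members register under one discovery token, and nothing proceeds until the expected number have registered. The toy model below is not etcd's implementation; the class `DiscoveryBarrier` and its methods are hypothetical, and the size of 5 is inferred from the "found 1 peer(s), waiting for 4 more" log line earlier in the thread.

```python
class DiscoveryBarrier:
    """Toy model: members register under one token until the expected size is reached."""

    def __init__(self, expected_size):
        self.expected_size = expected_size
        self.members = []  # member IDs in registration order

    def register(self, member_id):
        """Record a member once and return how many members are still missing."""
        if member_id not in self.members and len(self.members) < self.expected_size:
            self.members.append(member_id)
        return self.remaining()

    def remaining(self):
        """Number of members that have not registered yet."""
        return self.expected_size - len(self.members)

    def can_start(self):
        """Members may begin forming the cluster only when nobody is missing."""
        return self.remaining() == 0


if __name__ == "__main__":
    barrier = DiscoveryBarrier(expected_size=5)
    # Member IDs taken from the logs quoted in this thread.
    for member_id in ("c5a74c1a94a0abf7", "5b9b099334e46510", "3bfc2eb37a9d49af"):
        print(member_id, "registered, waiting for", barrier.register(member_id), "more")
    print("can start:", barrier.can_start())
```

With three of five members registered, the model still reports two missing and refuses to start, which mirrors the "found 3 peer(s), waiting for 2 more" line from coreos3.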

@xiang90
Contributor

xiang90 commented May 25, 2015

@silenceleaf Go ahead and start the other 2 members. It will work.

@silenceleaf
Author

Aha, that is what I was confused about. I'll try it now. Thanks for your quick response.
