Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcd 3.1.5 crashes when checking endpoint status from etcdctl #7722

Closed
raoofm opened this issue Apr 12, 2017 · 11 comments
Closed

etcd 3.1.5 crashes when checking endpoint status from etcdctl #7722

raoofm opened this issue Apr 12, 2017 · 11 comments
Labels

Comments

@raoofm
Copy link
Contributor

raoofm commented Apr 12, 2017

sudo ETCDCTL_API=3 ./etcdctl --endpoints=https://dev-vm-01.example.com:2379,https://dev-vm-02.example.com:2379,https://dev-vm-03.example.com:2379 --cacert=/var/home/rm/cfssl/ca.pem -w table endpoint status

[rm@dev-vm-01 etcd]$ sudo ETCDCTL_API=3 ./etcdctl --endpoints=https://dev-vm-01.example.com:2379 version
etcdctl version: 3.1.5
API version: 3.1

2017-04-12 16:18:56.478276 W | embed: expected IP in URL for binding (https://dev-vm-02.example.com:2380)
2017-04-12 16:18:56.478334 W | embed: expected IP in URL for binding (https://dev-vm-02.example.com:2379)
2017-04-12 16:18:56.478347 I | etcdmain: etcd Version: 3.1.5
2017-04-12 16:18:56.478354 I | etcdmain: Git SHA: 20490ca
2017-04-12 16:18:56.478359 I | etcdmain: Go Version: go1.7.5
2017-04-12 16:18:56.478365 I | etcdmain: Go OS/Arch: linux/amd64
2017-04-12 16:18:56.478371 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2017-04-12 16:18:56.478404 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2017-04-12 16:18:56.478416 W | embed: expected IP in URL for binding (https://dev-vm-02.example.com:2380)
2017-04-12 16:18:56.478427 W | embed: expected IP in URL for binding (https://dev-vm-02.example.com:2379)
2017-04-12 16:18:56.478453 I | embed: peerTLS: cert = /var/home/rm/cfssl/member2.pem, key = /var/home/rm/cfssl/member2-key.pem, ca = , trusted-ca = /var/home/rm/cfssl/ca.pem, client-cert-auth = false
2017-04-12 16:18:56.479625 I | embed: listening for peers on https://dev-vm-02.example.com:2380
2017-04-12 16:18:56.479715 I | embed: listening for client requests on dev-vm-02.example.com:2379
2017-04-12 16:18:56.482979 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2017-04-12 16:18:56.484213 I | etcdserver: recovered store from snapshot at index 1810223
2017-04-12 16:18:56.484229 I | etcdserver: name = node02
2017-04-12 16:18:56.484235 I | etcdserver: data dir = /var/lib/etcd/cluster/datadir
2017-04-12 16:18:56.484242 I | etcdserver: member dir = /var/lib/etcd/cluster/datadir/member
2017-04-12 16:18:56.484248 I | etcdserver: heartbeat = 200ms
2017-04-12 16:18:56.484254 I | etcdserver: election = 2000ms
2017-04-12 16:18:56.484260 I | etcdserver: snapshot count = 10000
2017-04-12 16:18:56.484272 I | etcdserver: advertise client URLs = https://dev-vm-02.example.com:2379
2017-04-12 16:18:56.920575 I | etcdserver: restarting member ac463f989b4b89ae in cluster ae708a00c1115219 at commit index 1814996
2017-04-12 16:18:56.921884 I | raft: ac463f989b4b89ae became follower at term 521
2017-04-12 16:18:56.921916 I | raft: newRaft ac463f989b4b89ae [peers: [684f2e4032b4dcc2,ac463f989b4b89ae,c8d81b0c5a97ef5f], term: 521, commit: 1814996, applied: 1810223, lastindex: 1814996, lastterm: 520]
2017-04-12 16:18:56.922122 I | etcdserver/api: enabled capabilities for version 3.1
2017-04-12 16:18:56.922143 I | etcdserver/membership: added member c8d81b0c5a97ef5f [https://dev-vm-01.example.com:2380] to cluster ae708a00c1115219 from store
2017-04-12 16:18:56.922152 I | etcdserver/membership: added member 684f2e4032b4dcc2 [https://dev-vm-03.example.com:2380] to cluster ae708a00c1115219 from store
2017-04-12 16:18:56.922159 I | etcdserver/membership: added member ac463f989b4b89ae [https://dev-vm-02.example.com:2380] to cluster ae708a00c1115219 from store
2017-04-12 16:18:56.922167 I | etcdserver/membership: set the cluster version to 3.1 from store
2017-04-12 16:18:56.992088 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2017-04-12 16:18:56.992792 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2017-04-12 16:18:56.992834 I | rafthttp: starting peer 684f2e4032b4dcc2...
2017-04-12 16:18:56.992871 I | rafthttp: started HTTP pipelining with peer 684f2e4032b4dcc2
2017-04-12 16:18:56.994945 I | rafthttp: started streaming with peer 684f2e4032b4dcc2 (writer)
2017-04-12 16:18:56.994973 I | rafthttp: started streaming with peer 684f2e4032b4dcc2 (writer)
2017-04-12 16:18:56.995940 I | rafthttp: started peer 684f2e4032b4dcc2
2017-04-12 16:18:56.995974 I | rafthttp: added peer 684f2e4032b4dcc2
2017-04-12 16:18:56.995996 I | rafthttp: starting peer c8d81b0c5a97ef5f...
2017-04-12 16:18:56.996029 I | rafthttp: started HTTP pipelining with peer c8d81b0c5a97ef5f
2017-04-12 16:18:56.997333 I | rafthttp: started streaming with peer 684f2e4032b4dcc2 (stream MsgApp v2 reader)
2017-04-12 16:18:56.997693 I | rafthttp: started streaming with peer 684f2e4032b4dcc2 (stream Message reader)
2017-04-12 16:18:56.998921 I | rafthttp: started streaming with peer c8d81b0c5a97ef5f (writer)
2017-04-12 16:18:56.999040 I | rafthttp: started streaming with peer c8d81b0c5a97ef5f (writer)
2017-04-12 16:18:57.001888 I | rafthttp: started peer c8d81b0c5a97ef5f
2017-04-12 16:18:57.001917 I | rafthttp: added peer c8d81b0c5a97ef5f
2017-04-12 16:18:57.001953 I | etcdserver: starting server... [version: 3.1.5, cluster version: 3.1]
2017-04-12 16:18:57.002004 I | embed: ClientTLS: cert = /var/home/rm/cfssl/member2.pem, key = /var/home/rm/cfssl/member2-key.pem, ca = , trusted-ca = /var/home/rm/cfssl/ca.pem, client-cert-auth = false
2017-04-12 16:18:57.003718 I | rafthttp: started streaming with peer c8d81b0c5a97ef5f (stream MsgApp v2 reader)
2017-04-12 16:18:57.004065 I | rafthttp: started streaming with peer c8d81b0c5a97ef5f (stream Message reader)
2017-04-12 16:18:57.036975 I | rafthttp: peer c8d81b0c5a97ef5f became active
2017-04-12 16:18:57.036997 I | rafthttp: established a TCP streaming connection with peer c8d81b0c5a97ef5f (stream Message writer)
2017-04-12 16:18:57.038052 I | rafthttp: established a TCP streaming connection with peer c8d81b0c5a97ef5f (stream MsgApp v2 writer)
2017-04-12 16:18:57.038138 I | rafthttp: established a TCP streaming connection with peer c8d81b0c5a97ef5f (stream MsgApp v2 reader)
2017-04-12 16:18:57.038663 I | rafthttp: established a TCP streaming connection with peer c8d81b0c5a97ef5f (stream Message reader)
2017-04-12 16:18:57.041056 I | raft: ac463f989b4b89ae [term: 521] received a MsgVote message with higher term from c8d81b0c5a97ef5f [term: 523]
2017-04-12 16:18:57.041078 I | raft: ac463f989b4b89ae became follower at term 523
2017-04-12 16:18:57.041092 I | raft: ac463f989b4b89ae [logterm: 520, index: 1814996, vote: 0] cast MsgVote for c8d81b0c5a97ef5f [logterm: 520, index: 1814996] at term 523
2017-04-12 16:18:57.043928 I | raft: raft.node: ac463f989b4b89ae elected leader c8d81b0c5a97ef5f at term 523
2017-04-12 16:18:57.052246 I | etcdserver: published {Name:node02 ClientURLs:[https://dev-vm-02.example.com:2379]} to cluster ae708a00c1115219
2017-04-12 16:18:57.052339 I | embed: ready to serve client requests
2017-04-12 16:18:57.052688 I | embed: serving client requests on 10.112.145.153:2379
2017-04-12 16:19:01.997953 W | rafthttp: health check for peer 684f2e4032b4dcc2 could not connect: dial tcp 10.1.4.6:2380: getsockopt: connection refused
2017-04-12 16:19:06.998102 W | rafthttp: health check for peer 684f2e4032b4dcc2 could not connect: dial tcp 10.1.4.6:2380: getsockopt: connection refused
2017-04-12 16:19:11.998253 W | rafthttp: health check for peer 684f2e4032b4dcc2 could not connect: dial tcp 10.1.4.6:2380: getsockopt: connection refused
2017-04-12 16:19:16.998413 W | rafthttp: health check for peer 684f2e4032b4dcc2 could not connect: dial tcp 10.1.4.6:2380: getsockopt: connection refused
2017-04-12 16:19:21.998584 W | rafthttp: health check for peer 684f2e4032b4dcc2 could not connect: dial tcp: lookup dev-vm-03.example.com on 127.0.0.1:53: dial udp 127.0.0.1:53: i/o timeout
2017-04-12 16:19:26.998734 W | rafthttp: health check for peer 684f2e4032b4dcc2 could not connect: dial tcp 10.1.4.6:2380: getsockopt: connection refused
2017-04-12 16:19:28.617489 I | rafthttp: peer 684f2e4032b4dcc2 became active
2017-04-12 16:19:28.617539 I | rafthttp: established a TCP streaming connection with peer 684f2e4032b4dcc2 (stream Message writer)
2017-04-12 16:19:28.629542 I | rafthttp: established a TCP streaming connection with peer 684f2e4032b4dcc2 (stream MsgApp v2 reader)
2017-04-12 16:19:28.629921 I | rafthttp: established a TCP streaming connection with peer 684f2e4032b4dcc2 (stream MsgApp v2 writer)
2017-04-12 16:19:28.647031 I | rafthttp: established a TCP streaming connection with peer 684f2e4032b4dcc2 (stream Message reader)

2017-04-12 16:20:13.186168 I | auth: authorized root, token is zBSHwivRLencYvpU.1815001
2017-04-12 16:22:32.266362 I | auth: authorized root, token is iJPGEqPNYhoVHKCM.1815002
2017-04-12 16:22:32.366605 I | auth: authorized root, token is GzmlVYDNJkzfmxkF.1815003
2017-04-12 16:22:32.470821 I | auth: authorized root, token is CScsBNbTktJQHDYz.1815004
2017-04-12 16:22:32.574887 I | auth: authorized root, token is vJpRwQlHblgHdKwS.1815005
2017-04-12 16:22:51.752320 I | auth: authorized root, token is yFmylCaqhVLGhVky.1815006
2017-04-12 16:22:51.855501 I | auth: authorized root, token is mnruoHoeSwalINFm.1815007
2017-04-12 16:22:51.955843 I | auth: authorized root, token is aZIIZEBmnknOWCby.1815008
2017-04-12 16:22:52.073383 I | auth: authorized root, token is eNRUvLgrhJCgWihM.1815009
2017-04-12 16:22:54.330292 I | auth: authorized root, token is KygSKUhyOSUlQAPD.1815010
2017-04-12 16:22:54.430929 I | auth: authorized root, token is xZkLiBPXoUrtulbY.1815011
2017-04-12 16:22:54.530982 I | auth: authorized root, token is VaMVPaDtjkKGTSwy.1815012
2017-04-12 16:22:54.645818 I | auth: authorized root, token is WXmSUeuiJbhdZwya.1815013
2017-04-12 16:22:56.640483 I | auth: authorized root, token is dczvDjqugcmzRiWq.1815014
2017-04-12 16:22:56.742951 I | auth: authorized root, token is ulmtpcMGPxFcOsWt.1815015
2017-04-12 16:22:56.845783 I | auth: authorized root, token is oAcfFIQMTLkmdCVE.1815016
2017-04-12 16:22:56.949350 I | auth: authorized root, token is XYoZmJVCKrQfaynZ.1815017
2017-04-12 16:24:05.017492 I | auth: authorized root, token is QtPtfOoyowDRZlKb.1815018
2017-04-12 16:24:05.021815 I | auth: authorized root, token is ZaIteJBwpCpTWyqD.1815019
2017-04-12 16:24:05.022019 I | auth: authorized root, token is euhTINoSkuQMvHCS.1815020
2017-04-12 16:24:23.371087 I | auth: authorized root, token is PgkvGYqLKcCCYoLn.1815021
2017-04-12 16:24:23.470085 I | auth: authorized root, token is tvguqwECtaBmJwzg.1815022
2017-04-12 16:24:23.567769 I | auth: authorized root, token is YfvZqESbDViiwBlH.1815023
2017-04-12 16:24:23.670346 I | auth: authorized root, token is XAyspWVpsjWawtgW.1815024
2017-04-12 16:25:13.979591 I | auth: deleting token zBSHwivRLencYvpU.1815001 for user root
2017-04-12 16:26:45.064763 I | auth: authorized root, token is WkZBtsonQwUpeajI.1815025
2017-04-12 16:26:45.165171 I | auth: authorized root, token is kMIlzaoXHXGgIzkQ.1815026
2017-04-12 16:26:45.267130 I | auth: authorized root, token is mdLgDFYTtCXMqQtB.1815027
2017-04-12 16:26:45.374121 I | auth: authorized root, token is nhmqOtESQKVJWHJz.1815028
2017-04-12 16:26:51.658823 I | auth: authorized root, token is pJBVFLLPsNyloQyn.1815029
2017-04-12 16:26:51.755565 I | auth: authorized root, token is YEzkRvncjqOcvGok.1815030
2017-04-12 16:26:51.856203 I | auth: authorized root, token is pgDCEkLccfficYff.1815031
2017-04-12 16:26:51.962460 I | auth: authorized root, token is lWUrBUuPBsIMrrQR.1815032
2017-04-12 16:27:32.979649 I | auth: deleting token GzmlVYDNJkzfmxkF.1815003 for user root
2017-04-12 16:27:32.979735 I | auth: deleting token vJpRwQlHblgHdKwS.1815005 for user root
2017-04-12 16:27:32.979748 I | auth: deleting token CScsBNbTktJQHDYz.1815004 for user root
2017-04-12 16:27:32.979757 I | auth: deleting token iJPGEqPNYhoVHKCM.1815002 for user root
2017-04-12 16:27:51.979624 I | auth: deleting token mnruoHoeSwalINFm.1815007 for user root
2017-04-12 16:27:51.979693 I | auth: deleting token yFmylCaqhVLGhVky.1815006 for user root
2017-04-12 16:27:51.979704 I | auth: deleting token aZIIZEBmnknOWCby.1815008 for user root
2017-04-12 16:27:52.979610 I | auth: deleting token eNRUvLgrhJCgWihM.1815009 for user root
2017-04-12 16:27:54.979600 I | auth: deleting token VaMVPaDtjkKGTSwy.1815012 for user root
2017-04-12 16:27:54.979652 I | auth: deleting token xZkLiBPXoUrtulbY.1815011 for user root
2017-04-12 16:27:54.979662 I | auth: deleting token KygSKUhyOSUlQAPD.1815010 for user root
2017-04-12 16:27:54.979684 I | auth: deleting token WXmSUeuiJbhdZwya.1815013 for user root
2017-04-12 16:27:56.979596 I | auth: deleting token oAcfFIQMTLkmdCVE.1815016 for user root
2017-04-12 16:27:56.979655 I | auth: deleting token ulmtpcMGPxFcOsWt.1815015 for user root
2017-04-12 16:27:56.979667 I | auth: deleting token XYoZmJVCKrQfaynZ.1815017 for user root
2017-04-12 16:27:56.979675 I | auth: deleting token dczvDjqugcmzRiWq.1815014 for user root
2017-04-12 16:29:05.979638 I | auth: deleting token euhTINoSkuQMvHCS.1815020 for user root
2017-04-12 16:29:05.979715 I | auth: deleting token QtPtfOoyowDRZlKb.1815018 for user root
2017-04-12 16:29:05.979727 I | auth: deleting token ZaIteJBwpCpTWyqD.1815019 for user root
2017-04-12 16:29:23.979613 I | auth: deleting token YfvZqESbDViiwBlH.1815023 for user root
2017-04-12 16:29:23.979669 I | auth: deleting token XAyspWVpsjWawtgW.1815024 for user root
2017-04-12 16:29:23.979698 I | auth: deleting token tvguqwECtaBmJwzg.1815022 for user root
2017-04-12 16:29:23.979718 I | auth: deleting token PgkvGYqLKcCCYoLn.1815021 for user root
2017-04-12 16:31:45.979613 I | auth: deleting token mdLgDFYTtCXMqQtB.1815027 for user root
2017-04-12 16:31:45.979694 I | auth: deleting token kMIlzaoXHXGgIzkQ.1815026 for user root
2017-04-12 16:31:45.979707 I | auth: deleting token nhmqOtESQKVJWHJz.1815028 for user root
2017-04-12 16:31:45.979715 I | auth: deleting token WkZBtsonQwUpeajI.1815025 for user root
2017-04-12 16:31:51.979597 I | auth: deleting token lWUrBUuPBsIMrrQR.1815032 for user root
2017-04-12 16:31:51.979657 I | auth: deleting token pgDCEkLccfficYff.1815031 for user root
2017-04-12 16:31:51.979669 I | auth: deleting token pJBVFLLPsNyloQyn.1815029 for user root
2017-04-12 16:31:51.979677 I | auth: deleting token YEzkRvncjqOcvGok.1815030 for user root
2017-04-12 16:36:53.944279 I | auth: authorized root, token is ruGCojbkDfPBUujz.1815033
2017-04-12 16:36:54.054961 I | auth: authorized root, token is mGJBOcznItNwdKkm.1815034
2017-04-12 16:36:54.153377 I | auth: authorized root, token is YvOeqibIugmEqpme.1815035
2017-04-12 16:36:54.260312 I | auth: authorized root, token is xlJEqXMBeJdMpDyg.1815036
2017-04-12 16:37:43.008787 I | auth: authorized root, token is ffPbBZFpcWJXjXKi.1815038
2017-04-12 16:37:43.020294 N | auth: Authentication already enabled
2017-04-12 16:37:49.185344 I | auth: authorized root, token is dhPTMUGzURHqHwpF.1815040
2017-04-12 16:37:49.286076 I | auth: authorized root, token is VjVqvRJMBTXHWmVG.1815041
2017-04-12 16:37:49.392987 I | auth: authorized root, token is NvYmGNKxFHINZwgd.1815042
2017-04-12 16:37:49.498005 I | auth: authorized root, token is DMOPrEvLAJIdDhhS.1815043
2017-04-12 16:37:54.791031 I | auth: authorized root, token is kfLaFbxhabMPtSpU.1815044
2017-04-12 16:37:54.811727 N | auth: Authentication disabled
2017-04-12 16:38:13.015662 W | rafthttp: lost the TCP streaming connection with peer c8d81b0c5a97ef5f (stream Message reader)
2017-04-12 16:38:13.015789 E | rafthttp: failed to read c8d81b0c5a97ef5f on stream Message (unexpected EOF)
2017-04-12 16:38:13.015822 I | rafthttp: peer c8d81b0c5a97ef5f became inactive
2017-04-12 16:38:13.015863 W | rafthttp: lost the TCP streaming connection with peer c8d81b0c5a97ef5f (stream MsgApp v2 reader)
2017-04-12 16:38:15.122395 I | raft: ac463f989b4b89ae is starting a new election at term 523
2017-04-12 16:38:15.122463 I | raft: ac463f989b4b89ae became candidate at term 524
2017-04-12 16:38:15.122485 I | raft: ac463f989b4b89ae received MsgVoteResp from ac463f989b4b89ae at term 524
2017-04-12 16:38:15.122507 I | raft: ac463f989b4b89ae [logterm: 523, index: 1815048] sent MsgVote request to 684f2e4032b4dcc2 at term 524
2017-04-12 16:38:15.122528 I | raft: ac463f989b4b89ae [logterm: 523, index: 1815048] sent MsgVote request to c8d81b0c5a97ef5f at term 524
2017-04-12 16:38:15.122545 I | raft: raft.node: ac463f989b4b89ae lost leader c8d81b0c5a97ef5f at term 524
2017-04-12 16:38:15.125785 W | rafthttp: lost the TCP streaming connection with peer c8d81b0c5a97ef5f (stream Message writer)
2017-04-12 16:38:15.128251 I | raft: ac463f989b4b89ae received MsgVoteResp from 684f2e4032b4dcc2 at term 524
2017-04-12 16:38:15.128305 I | raft: ac463f989b4b89ae [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
2017-04-12 16:38:15.128327 I | raft: ac463f989b4b89ae became leader at term 524
2017-04-12 16:38:15.128369 I | raft: raft.node: ac463f989b4b89ae elected leader ac463f989b4b89ae at term 524
2017-04-12 16:38:15.144457 W | etcdserver: failed to reach the peerURL(https://dev-vm-01.example.com:2380) of member c8d81b0c5a97ef5f (Get https://dev-vm-01.example.com:2380/version: dial tcp 10.112.145.152:2380: getsockopt: connection refused)
2017-04-12 16:38:15.144492 W | etcdserver: cannot get the version of member c8d81b0c5a97ef5f (Get https://dev-vm-01.example.com:2380/version: dial tcp 10.112.145.152:2380: getsockopt: connection refused)
2017-04-12 16:38:16.999365 W | rafthttp: lost the TCP streaming connection with peer c8d81b0c5a97ef5f (stream MsgApp v2 writer)
2017-04-12 16:38:19.148654 W | etcdserver: failed to reach the peerURL(https://dev-vm-01.example.com:2380) of member c8d81b0c5a97ef5f (Get https://dev-vm-01.example.com:2380/version: dial tcp 10.112.145.152:2380: getsockopt: connection refused)
2017-04-12 16:38:19.148691 W | etcdserver: cannot get the version of member c8d81b0c5a97ef5f (Get https://dev-vm-01.example.com:2380/version: dial tcp 10.112.145.152:2380: getsockopt: connection refused)
2017-04-12 16:38:22.026415 W | etcdserver: timed out waiting for read index response
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8bcc2a]

goroutine 3621 [running]:
panic(0xd15a00, 0xc4200100a0)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/auth.(*authStore).AuthInfoFromToken(0xc42011dce0, 0xc4219fb460, 0x18, 0x7f68af6ec5a8, 0xc421775e60)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/auth/store.go:650 +0x2a
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/auth.(*authStore).AuthInfoFromCtx(0xc42011dce0, 0x7f68af6ec5a8, 0xc421775e60, 0xc42026a810, 0x40f9e8, 0x20)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/auth/store.go:995 +0x109
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).doSerialize(0xc42016b080, 0x7f68af6ec5a8, 0xc421775e60, 0xc420206c20, 0xc42026a888, 0x0, 0x0)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/v3_server.go:620 +0x52
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).Range(0xc42016b080, 0x7f68af6ec5a8, 0xc421775e60, 0xc4219b8bd0, 0x12e3380, 0xc420895c80, 0x12e0f80)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/v3_server.go:111 +0x234
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*kvServer).Range(0xc420199400, 0x7f68af6ec5a8, 0xc421775e60, 0xc4219b8bd0, 0xc42026a9c0, 0x61fa4a, 0xc420895c80)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/key.go:50 +0x88
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*quotaKVServer).Range(0xc420199440, 0x7f68af6ec5a8, 0xc421775e60, 0xc4219b8bd0, 0x5, 0xe500c6, 0xf)
:23 +0x6b
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb._KV_Range_Handler.func1(0x7f68af6ec5a8, 0xc421775e60, 0xd9f1a0, 0xc4219b8bd0, 0xc420430f00, 0xc42026aa38, 0x9bc256, 0xc421775e60)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:2287 +0xa2
github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.UnaryServerInterceptor(0x7f68af6ec5a8, 0xc421775e60, 0xd9f1a0, 0xc4219b8bd0, 0xc421794020, 0xc421794040, 0x40f9e8, 0x20, 0xd47a20, 0x1)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:29 +0xa4
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.newUnaryInterceptor.func1(0x7f68af6ec5a8, 0xc421775e60, 0xd9f1a0, 0xc4219b8bd0, 0xc421794020, 0xc421794040, 0x1d, 0x6, 0x0, 0xc42026ab28)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/interceptor.go:57 +0xc3
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb._KV_Range_Handler(0xd5fe40, 0xc420199440, 0x7f68af6ec5a8, 0xc421775e60, 0xc420430eb0, 0xc4208d6e60, 0x0, 0x0, 0xc4204fee00, 0x20)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:2289 +0x166
github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc42087d680, 0x12e3560, 0xc4216e9800, 0xc42177ad20, 0xc4203dfe90, 0x1301820, 0x0, 0x0, 0x0)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/server.go:638 +0xc14
github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*Server).handleStream(0xc42087d680, 0x12e3560, 0xc4216e9800, 0xc42177ad20, 0x0)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/server.go:796 +0x6b0
github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc4216ecec0, 0xc42087d680, 0x12e3560, 0xc4216e9800, 0xc42177ad20)
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/server.go:449 +0xab
created by github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
/home/gyuho/go/src/github.com/coreos/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/server.go:450 +0xa3

@raoofm
Copy link
Contributor Author

raoofm commented Apr 12, 2017

@xiang90 @heyitsanthony I was running 3.1.3 before and i saw this crash and searched below on etcd issues.

runtime error: invalid memory address or nil pointer dereference

saw that this was fixed in 3.1.4 and so I upgraded all the clusters to 3.1.5. This is a major issue as this brings the whole cluster down, one by one by crashing each member with each run of the etcdctl command.

@raoofm
Copy link
Contributor Author

raoofm commented Apr 12, 2017

update... seems unrelated to running etcdctl endpoint status... the raft index on 3 members before crash on every check gives a sequence 311,312,313 ............. 314,315,316..........317,318,319

now the cluster doesn't come up after the crash... it comes up and crashes again on two nodes and 1 can work. Earlier all 3 crashed when they crashed for the first time.

I removed the data directory and brought up a new cluster... will update if I find anything new.

@heyitsanthony
Copy link
Contributor

@raoofm OK, I think I figured out what's happening here. The auth code expects auth to be enabled if it gets a syntactically valid simple token. If auth is disabled while a client still holds a token, it'll try to access a structure that is non-nil only when auth is enabled. If the request goes through raft, this will cause crashes even after restarting the process since etcd will try to replay it.

Patch in the works.

@raoofm
Copy link
Contributor Author

raoofm commented Apr 12, 2017

@heyitsanthony sounds good. Its none of my business but just asking for better planning. Targeted for 3.1 or 3.2 and around what time frame?

@heyitsanthony
Copy link
Contributor

The fix will be backported into 3.1. Expect a new release in a few days.

heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Apr 12, 2017
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Apr 12, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes etcd-io#7722
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Apr 12, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes etcd-io#7722
@raoofm
Copy link
Contributor Author

raoofm commented Apr 12, 2017

should this be marked as a bug?

@raoofm
Copy link
Contributor Author

raoofm commented Apr 12, 2017

@heyitsanthony Does this seem right? can the raft index be different on all nodes. They can be +1 as per doc, but can all 3 have different index. When I started the cluster the index was same for a long time and then when I enabled auth may be it deviated. I'm not sure... Is this fine?

[rm@node-01 etcd]$ sudo ETCDCTL_API=3 ./etcdctl --endpoints=https://node-01.example.com:2379,https://node-02.example.com:2379,https://node-03.example.com:2379 --cacert=/var/home/rm/cfssl/ca.pem -w table --user=root:password endpoint status
2017-04-12 20:46:50.196685 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
+----------------------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://node-01.example.com:2379 | c8d81b0c5a97ef5f | 3.1.5 | 49 kB | true | 4 | 96 |
| https://node-02.example.com:2379 | ac463f989b4b89ae | 3.1.5 | 49 kB | false | 4 | 97 |
| https://node-03.example.com:2379 | 684f2e4032b4dcc2 | 3.1.5 | 49 kB | false | 4 | 98 |
+----------------------------------------------------------+------------------+---------+---------+-----------+-----------+------------+

@raoofm
Copy link
Contributor Author

raoofm commented Apr 12, 2017

With every check of endpoint status there are 4 tokens generated in logs (shouldn't there be 3?) and the next index is

+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://node-01.example.com:2379 | c8d81b0c5a97ef5f | 3.1.5 | 49 kB | true | 4 | 100 |
| https://node-02.example.com:2379 | ac463f989b4b89ae | 3.1.5 | 49 kB | false | 4 | 101 |
| https://node-03.example.com:2379 | 684f2e4032b4dcc2 | 3.1.5 | 49 kB | false | 4 | 102 |
+----------------------------------+------------------+---------+---------+-----------+-----------+------------+

gyuho pushed a commit that referenced this issue Apr 13, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722

Signed-off-by: Gyu-Ho Lee <[email protected]>
@raoofm
Copy link
Contributor Author

raoofm commented Apr 13, 2017

@heyitsanthony can you please answer the above query?

@heyitsanthony
Copy link
Contributor

It'll generate four tokens because it'll authenticate for the client on one endpoint then authenticate again once for each endpoint since it dials out with a new grpc connection. The raft index is different for each member because the each authentication request goes through raft; the status commands aren't issued as a single atomic rpc.

gyuho pushed a commit that referenced this issue Apr 13, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722

Signed-off-by: Gyu-Ho Lee <[email protected]>
@raoofm
Copy link
Contributor Author

raoofm commented Apr 13, 2017

@heyitsanthony thanks

gyuho pushed a commit that referenced this issue Apr 13, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722
gyuho pushed a commit that referenced this issue Apr 13, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue Apr 17, 2017
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes etcd-io#7722
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants