This repository has been archived by the owner on Jan 31, 2024. It is now read-only.

Plugin v 2.6.0 does not work with ES 1.6.0 (GCE instances don't see each other) #54

Closed
yaraju opened this issue Jul 8, 2015 · 8 comments

Comments

@yaraju

yaraju commented Jul 8, 2015

I'm trying out the GCE plugin on ES 1.6.0 with plugin version 2.6.0.

I'm unable to get multicast autodiscovery to work at all.

Here is my startup script for the instances. (Includes elasticsearch.yml config)

And here is the log output with discovery logging set to "TRACE":
(NOTE: These logs are from a 2nd run, for which I created a fresh GCloud project with ID: escloud-1000)

[2015-07-08 12:13:45,017][INFO ][node                     ] [Jonathan "John" Garrett] version[1.6.0], pid[8724], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-07-08 12:13:45,018][INFO ][node                     ] [Jonathan "John" Garrett] initializing ...
[2015-07-08 12:13:45,055][INFO ][plugins                  ] [Jonathan "John" Garrett] loaded [marvel, cloud-gce], sites [marvel, head]
[2015-07-08 12:13:45,113][INFO ][env                      ] [Jonathan "John" Garrett] using [1] data paths, mounts [[/ (/dev/sda1)]], net usable_space [7.8gb], net total_space [9.8gb], types [ext4]
[2015-07-08 12:13:48,138][DEBUG][discovery.zen.elect      ] [Jonathan "John" Garrett] using minimum_master_nodes [-1]
[2015-07-08 12:13:48,144][DEBUG][discovery.zen.ping.multicast] [Jonathan "John" Garrett] using group [224.2.2.4], with port [54328], ttl [3], and address [null]
[2015-07-08 12:13:48,148][DEBUG][discovery.zen.ping.unicast] [Jonathan "John" Garrett] using initial hosts [], with concurrent_connects [10]
[2015-07-08 12:13:48,149][DEBUG][discovery.gce            ] [Jonathan "John" Garrett] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2015-07-08 12:13:48,151][DEBUG][discovery.zen.fd         ] [Jonathan "John" Garrett] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-07-08 12:13:48,153][DEBUG][discovery.zen.fd         ] [Jonathan "John" Garrett] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-07-08 12:13:49,196][INFO ][node                     ] [Jonathan "John" Garrett] initialized
[2015-07-08 12:13:49,196][INFO ][node                     ] [Jonathan "John" Garrett] starting ...
[2015-07-08 12:13:49,276][INFO ][transport                ] [Jonathan "John" Garrett] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/10.240.180.235:9300]}
[2015-07-08 12:13:49,295][INFO ][discovery                ] [Jonathan "John" Garrett] dummy/jF69Ngu7Tyu4nYCcsec81g
[2015-07-08 12:13:49,299][TRACE][discovery.gce            ] [Jonathan "John" Garrett] starting to ping
[2015-07-08 12:13:49,308][TRACE][discovery.zen.ping.multicast] [Jonathan "John" Garrett] [1] sending ping request
[2015-07-08 12:13:49,312][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] connecting to [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]
[2015-07-08 12:13:49,344][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] connected to [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]
[2015-07-08 12:13:49,345][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] sending to [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]
[2015-07-08 12:13:49,372][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] received response from [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]: [ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[1], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[2], master [null], hasJoinedOnce [false], cluster_name[dummy]}]
[2015-07-08 12:13:50,810][TRACE][discovery.zen.ping.multicast] [Jonathan "John" Garrett] [1] sending ping request
[2015-07-08 12:13:50,812][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] sending to [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]
[2015-07-08 12:13:50,819][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] received response from [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]: [ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[1], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[3], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[4], master [null], hasJoinedOnce [false], cluster_name[dummy]}]
[2015-07-08 12:13:52,313][TRACE][discovery.zen.ping.multicast] [Jonathan "John" Garrett] [1] sending last pings
[2015-07-08 12:13:52,314][TRACE][discovery.zen.ping.multicast] [Jonathan "John" Garrett] [1] sending ping request
[2015-07-08 12:13:52,321][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] sending to [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]
[2015-07-08 12:13:52,328][TRACE][discovery.zen.ping.unicast] [Jonathan "John" Garrett] [1] received response from [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]: [ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[1], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[3], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[5], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]]], id[6], master [null], hasJoinedOnce [false], cluster_name[dummy]}]
[2015-07-08 12:13:53,065][TRACE][discovery.gce            ] [Jonathan "John" Garrett] full ping responses: {none}
[2015-07-08 12:13:53,066][DEBUG][discovery.gce            ] [Jonathan "John" Garrett] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-07-08 12:13:53,076][INFO ][cluster.service          ] [Jonathan "John" Garrett] new_master [Jonathan "John" Garrett][jF69Ngu7Tyu4nYCcsec81g][es-node1.c.escloud-1000.internal][inet[/10.240.180.235:9300]], reason: zen-disco-join (elected_as_master)
[2015-07-08 12:13:53,090][TRACE][discovery.gce            ] [Jonathan "John" Garrett] cluster joins counter set to [1] (elected as master)
[2015-07-08 12:13:53,225][INFO ][http                     ] [Jonathan "John" Garrett] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/10.240.180.235:9200]}
[2015-07-08 12:13:53,229][INFO ][node                     ] [Jonathan "John" Garrett] started
[2015-07-08 12:13:53,367][INFO ][gateway                  ] [Jonathan "John" Garrett] recovered [1] indices into cluster_state

Please let me know if I can provide any additional info.

@yaraju
Author

yaraju commented Jul 8, 2015

Additional info:
I was running with two nodes:
es-node1
es-node2
Each of the Marvel indices expects 1 shard and 1 replica, but even though the 2nd machine came up, neither node noticed the other.
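
One way to confirm this symptom independently of the logs is to ask each node how many nodes it believes are in its cluster. The following is a minimal Python sketch, assuming the standard `_cluster/health` endpoint on HTTP port 9200 (the port shown in the logs). The helper names (`fetch_health`, `node_count`, `all_joined`) and the sample response body are illustrative assumptions, not output captured from these instances.

```python
import json
from urllib.request import urlopen


def fetch_health(host, port=9200, timeout=5):
    """Fetch the raw _cluster/health JSON body from one node (needs network access)."""
    with urlopen(f"http://{host}:{port}/_cluster/health", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def node_count(health_body):
    """Return the number of nodes this node believes are in its cluster."""
    return json.loads(health_body)["number_of_nodes"]


def all_joined(health_bodies, expected_nodes):
    """True only if every node reports the expected cluster size."""
    return all(node_count(body) == expected_nodes for body in health_bodies)


# Illustrative response body (hypothetical values, not captured from es-node1 or es-node2).
SAMPLE_BODY = '{"cluster_name": "dummy", "status": "yellow", "number_of_nodes": 1}'

# Two nodes that each formed their own one-node cluster would both report 1.
split = not all_joined([SAMPLE_BODY, SAMPLE_BODY], expected_nodes=2)
print("nodes did not join each other" if split else "nodes joined")
```

Against live instances, the bodies would come from `fetch_health("es-node1")` and `fetch_health("es-node2")`, run from a machine that can reach port 9200 on both.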

@yaraju
Author

yaraju commented Jul 8, 2015

Also, logs from 2nd node:

[2015-07-08 12:14:14,507][INFO ][node                     ] [White Tiger] version[1.6.0], pid[9627], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-07-08 12:14:14,508][INFO ][node                     ] [White Tiger] initializing ...
[2015-07-08 12:14:14,535][INFO ][plugins                  ] [White Tiger] loaded [marvel, cloud-gce], sites [marvel, head]
[2015-07-08 12:14:14,585][INFO ][env                      ] [White Tiger] using [1] data paths, mounts [[/ (/dev/sda1)]], net usable_space [7.8gb], net total_space [9.8gb], types [ext4]
[2015-07-08 12:14:16,990][DEBUG][discovery.zen.elect      ] [White Tiger] using minimum_master_nodes [-1]
[2015-07-08 12:14:16,992][DEBUG][discovery.zen.ping.multicast] [White Tiger] using group [224.2.2.4], with port [54328], ttl [3], and address [null]
[2015-07-08 12:14:16,995][DEBUG][discovery.zen.ping.unicast] [White Tiger] using initial hosts [], with concurrent_connects [10]
[2015-07-08 12:14:16,996][DEBUG][discovery.gce            ] [White Tiger] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2015-07-08 12:14:16,998][DEBUG][discovery.zen.fd         ] [White Tiger] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-07-08 12:14:17,000][DEBUG][discovery.zen.fd         ] [White Tiger] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-07-08 12:14:17,832][INFO ][node                     ] [White Tiger] initialized
[2015-07-08 12:14:17,834][INFO ][node                     ] [White Tiger] starting ...
[2015-07-08 12:14:17,895][INFO ][transport                ] [White Tiger] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/10.240.170.154:9300]}
[2015-07-08 12:14:17,909][INFO ][discovery                ] [White Tiger] dummy/kJxYRSn_RtyEaWxBMZkvXQ
[2015-07-08 12:14:17,912][TRACE][discovery.gce            ] [White Tiger] starting to ping
[2015-07-08 12:14:17,921][TRACE][discovery.zen.ping.multicast] [White Tiger] [1] sending ping request
[2015-07-08 12:14:17,923][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] connecting to [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]
[2015-07-08 12:14:17,949][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] connected to [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]
[2015-07-08 12:14:17,949][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] sending to [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]
[2015-07-08 12:14:17,973][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] received response from [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]: [ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[1], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[2], master [null], hasJoinedOnce [false], cluster_name[dummy]}]
[2015-07-08 12:14:19,422][TRACE][discovery.zen.ping.multicast] [White Tiger] [1] sending ping request
[2015-07-08 12:14:19,424][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] sending to [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]
[2015-07-08 12:14:19,426][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] received response from [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]: [ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[1], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[3], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[4], master [null], hasJoinedOnce [false], cluster_name[dummy]}]
[2015-07-08 12:14:20,925][TRACE][discovery.zen.ping.multicast] [White Tiger] [1] sending last pings
[2015-07-08 12:14:20,925][TRACE][discovery.zen.ping.multicast] [White Tiger] [1] sending ping request
[2015-07-08 12:14:20,927][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] sending to [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]
[2015-07-08 12:14:20,929][TRACE][discovery.zen.ping.unicast] [White Tiger] [1] received response from [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]: [ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[1], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[3], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[5], master [null], hasJoinedOnce [false], cluster_name[dummy]}, ping_response{node [[White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]]], id[6], master [null], hasJoinedOnce [false], cluster_name[dummy]}]
[2015-07-08 12:14:21,676][TRACE][discovery.gce            ] [White Tiger] full ping responses: {none}
[2015-07-08 12:14:21,677][DEBUG][discovery.gce            ] [White Tiger] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-07-08 12:14:21,685][INFO ][cluster.service          ] [White Tiger] new_master [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]], reason: zen-disco-join (elected_as_master)
[2015-07-08 12:14:21,695][TRACE][discovery.gce            ] [White Tiger] cluster joins counter set to [1] (elected as master)
[2015-07-08 12:14:21,805][INFO ][http                     ] [White Tiger] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/10.240.170.154:9200]}
[2015-07-08 12:14:21,805][INFO ][node                     ] [White Tiger] started
[2015-07-08 12:14:21,894][INFO ][gateway                  ] [White Tiger] recovered [1] indices into cluster_state
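
The pattern in both logs is the same: the GCE discovery round ends with `full ping responses: {none}`, and the node then elects itself master. A small Python sketch of how that pattern could be detected automatically is shown below. It assumes each log entry sits on a single line (terminal-wrapped entries would need rejoining first); the regex, the `diagnose` function, and the report keys are hypothetical names introduced for illustration. The sample lines are copied from the second node's log above.

```python
import re

# Prefix of an Elasticsearch 1.x log line: [timestamp][LEVEL][logger] [node name] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<logger>[^\]]+?)\s*\] "
    r"\[(?P<node>.+?)\] (?P<msg>.*)$"
)


def diagnose(log_text):
    """Per node, count empty GCE ping rounds and flag self-election as master."""
    report = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a log entry
        node, logger, msg = match.group("node", "logger", "msg")
        entry = report.setdefault(node, {"empty_ping_rounds": 0, "self_elected": False})
        if logger == "discovery.gce" and msg.startswith("full ping responses: {none}"):
            entry["empty_ping_rounds"] += 1
        if logger == "cluster.service" and msg.startswith("new_master [" + node + "]"):
            entry["self_elected"] = True
    return report


SAMPLE = """\
[2015-07-08 12:14:21,676][TRACE][discovery.gce            ] [White Tiger] full ping responses: {none}
[2015-07-08 12:14:21,685][INFO ][cluster.service          ] [White Tiger] new_master [White Tiger][kJxYRSn_RtyEaWxBMZkvXQ][es-node2.c.escloud-1000.internal][inet[/10.240.170.154:9300]], reason: zen-disco-join (elected_as_master)
[2015-07-08 12:14:21,805][INFO ][node                     ] [White Tiger] started
"""

report = diagnose(SAMPLE)
for name, entry in report.items():
    print(name, entry)
```

A node that shows at least one empty ping round followed by self-election never saw a peer, which is what both es-node1 and es-node2 report here.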

@yaraju changed the title from "Plugin does not work (GCE instances don't see each other)" to "Plugin v 2.6.0 does not work with ES 1.6.0 (GCE instances don't see each other)" on Jul 8, 2015
@yaraju
Author

yaraju commented Jul 8, 2015

I rewrote my script to use ES 1.5.0 and plugin version 2.5.0, and that works fine.

So my script is fine, but it fails with the new plugin.

I can share my Gcloud project with you if you'd like to take a closer look.

@akleiman

Seems like the same problem I had in #53

@schonfeld

This, too (see #53), is probably due to someone replacing some important code with a "TODO" comment...

https://github.com/elastic/elasticsearch-cloud-gce/blob/master/src/main/java/org/elasticsearch/discovery/gce/GceDiscovery.java#L49

@danielschonfeld

cc @schonfeld @dadoonet 👍

@dadoonet
Member

This issue was moved to elastic/elasticsearch#13459

@nezda

nezda commented Jan 4, 2016

For users still on the 1.x series, shouldn't this issue remain open?

I'm not sure what problems this has, but I was able to get a 1.7.4 cluster working using 2.5.0 of this plugin. That would seem to indicate this bug was introduced somewhere in v2.5.0...es-1.6, which doesn't look like a very big search space.
