Consul Mesh-GateWay Federation for Kubernetes as the Primary #10139

Closed
ZEROYXY opened this issue Apr 28, 2021 · 1 comment

ZEROYXY commented Apr 28, 2021

When filing a bug, please include the following headings if possible. Any example text in this template can be deleted.

Overview of the Issue

Hi all, I am running into an issue while setting up Consul WAN federation through mesh gateways. I am building four Consul clusters and connecting them to each other through Consul mesh gateways. The primary Consul cluster runs on Kubernetes and has been running well since it started. I then added two more Consul clusters, also running on Kubernetes, and connected them to the primary through mesh gateway federation.
I then tried to add a Consul cluster running on VMs to this federation, following the "Kubernetes as the Primary" steps at https://www.consul.io/docs/k8s/installation/multi-cluster/vms-and-kubernetes. I then hit the errors below:

2021-04-28T15:25:18.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.77:8302: Remote DC has no server currently reachable
2021-04-28T15:25:18.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.146:8302: Remote DC has no server currently reachable
2021-04-28T15:25:18.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.207:8302: Remote DC has no server currently reachable
2021-04-28T15:25:19.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.186.199:8302: Remote DC has no server currently reachable
2021-04-28T15:25:19.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.77:8302: Remote DC has no server currently reachable
2021-04-28T15:25:19.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.140.78:8302: Remote DC has no server currently reachable
2021-04-28T15:25:19.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.146:8302: Remote DC has no server currently reachable
2021-04-28T15:25:19.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.186.199:8302: Remote DC has no server currently reachable
2021-04-28T15:25:26.616+0800 [INFO]  agent.server.memberlist.wan: memberlist: Suspect consul-server-0.dc1 has failed, no acks received
2021-04-28T15:25:27.082+0800 [INFO]  agent.server.serf.lan: serf: EventMemberUpdate: consul-dc4-server2
2021-04-28T15:25:27.082+0800 [INFO]  agent.server: Updating LAN server: server="consul-dc4-server2 (Addr: tcp/192.168.20.71:8300) (DC: dc4)"
2021-04-28T15:25:27.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.140.78:8302: Remote DC has no server currently reachable
2021-04-28T15:25:27.117+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.186.199:8302: Remote DC has no server currently reachable
2021-04-28T15:25:27.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.146:8302: Remote DC has no server currently reachable
2021-04-28T15:25:27.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.186.199:8302: Remote DC has no server currently reachable
2021-04-28T15:25:27.881+0800 [INFO]  agent.server.serf.wan: serf: EventMemberUpdate: consul-dc4-server2.dc4
2021-04-28T15:25:27.882+0800 [INFO]  agent.server: Handled event for server in area: event=member-update server=consul-dc4-server2.dc4 area=wan
2021-04-28T15:25:28.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.207:8302: Remote DC has no server currently reachable
2021-04-28T15:25:28.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.146:8302: Remote DC has no server currently reachable
2021-04-28T15:25:28.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.248.252:8302: Remote DC has no server currently reachable
2021-04-28T15:25:28.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.186.199:8302: Remote DC has no server currently reachable
2021-04-28T15:25:29.115+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.248.252:8302: Remote DC has no server currently reachable
2021-04-28T15:25:29.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.207:8302: Remote DC has no server currently reachable
2021-04-28T15:25:31.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: Remote DC has no server currently reachable
2021-04-28T15:25:36.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: Remote DC has no server currently reachable
2021-04-28T15:25:41.494+0800 [ERROR] agent: Coordinate update error: error="ACL not found"
2021-04-28T15:25:41.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: Remote DC has no server currently reachable
2021-04-28T15:25:46.396+0800 [ERROR] agent.server.memberlist.wan: memberlist: Push/Pull with consul-server-0.dc2 failed: Remote DC has no server currently reachable
2021-04-28T15:25:46.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: Remote DC has no server currently reachable
2021-04-28T15:25:51.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: Remote DC has no server currently reachable
2021-04-28T15:25:56.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: Remote DC has no server currently reachable
2021-04-28T15:25:58.422+0800 [ERROR] agent: Coordinate update error: error="ACL not found"
2021-04-28T15:25:59.905+0800 [ERROR] agent.anti_entropy: failed to sync remote state: error="ACL not found"
2021-04-28T15:26:04.617+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: Remote DC has no server currently reachable
2021-04-28T15:26:13.443+0800 [ERROR] agent: Coordinate update error: error="ACL not found"
2021-04-28T15:26:22.050+0800 [ERROR] agent.anti_entropy: failed to sync remote state: error="ACL not found"
2021-04-28T15:26:27.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.146:8302: Remote DC has no server currently reachable
2021-04-28T15:26:27.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 10.100.186.199:8302: Remote DC has no server currently reachable
2021-04-28T15:26:27.616+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.207:8302: Remote DC has no server currently reachable
2021-04-28T15:26:28.116+0800 [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 173.0.0.207:8302: Remote DC has no server currently reachable.

Reproduction Steps

Steps to reproduce this issue, eg:

Follow the steps on website https://www.consul.io/docs/k8s/installation/multi-cluster/vms-and-kubernetes

My /etc/consul.d/consul.hcl is as follows:

cert_file = "/home/cloud/consul/dc4-server-consul-0.pem"
key_file = "/home/cloud/consul/dc4-server-consul-0-key.pem"
ca_file = "/home/cloud/consul/consul-agent-ca.pem"
primary_gateways = ["192.168..:443"]
acl {
enabled = true
default_policy = "deny"
down_policy = "extend-cache"
tokens {
agent = "a6c61787-e229-41f7-8541-0e5adc414b34"
replication = "05eea11e-f7e8-9635-de3e-a8c9d1439135"
}
}
encrypt = "0NcEVIpbnovNODzVnPSXo0QnuLuNXXjkmKzpVwHnX4E="

# Other server settings

server = true
datacenter = "dc4"
data_dir = "/opt/consul"
enable_central_service_config = true
primary_datacenter = "dc1"
connect {
enabled = true
enable_mesh_gateway_wan_federation = true
}
verify_incoming_rpc = true
verify_outgoing = true
verify_server_hostname = true
ports {
https = 8501
http = -1
grpc = 8502
}

log_level = "INFO"
node_name = "consul-dc4-server1"
bind_addr = "192.168.."

I got the gossip encryption key from the consul-gossip-encryption-key secret in Kubernetes as follows:
kubectl get secrets/consul-gossip-encryption-key --template='{{.data.key}}' |base64 -d
Is this the correct way to do it?
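
One way to sanity-check the decoded key, offered as a sketch rather than an official procedure: Kubernetes stores secret values base64-encoded, so decoding once should yield the gossip key, which is itself a base64 string. Assuming the key was generated with a recent `consul keygen` (which, as far as I know, produces 32 random bytes), decoding the `encrypt` value from the configuration above one more time should give 32 raw bytes. The variable names here are illustrative.

```shell
# The `encrypt` value from consul.hcl in this issue
# (that is, the secret after one round of base64 decoding).
gossip_key='0NcEVIpbnovNODzVnPSXo0QnuLuNXXjkmKzpVwHnX4E='

# Decode the key itself and count the raw bytes.
# `tr -d ' '` strips the padding some `wc` implementations add.
key_len=$(printf '%s' "$gossip_key" | base64 -d | wc -c | tr -d ' ')

echo "gossip key length: ${key_len} bytes"
```

If the length is not one of the sizes Consul accepts, the value was probably decoded too many or too few times.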

I got the ACL replication token from the consul-acl-replication-acl-token secret in Kubernetes as follows:
kubectl get secrets/consul-acl-replication-acl-token --template='{{.data.token}}' |base64 -d
05eea11e-f7e8-9635-de3e-a8c9d1439135

The command shown on the website https://www.consul.io/docs/k8s/installation/multi-cluster/vms-and-kubernetes does not include the base64 -d step. But the token obtained that way looks strange, as shown below:
[root@consul-dc1-master consul]# kubectl get secrets/consul-acl-replication-acl-token --template='{{.data.token}}'
MDVlZWExMWUtZjdlOC05NjM1LWRlM2UtYThjOWQxNDM5MTM1

I don't think the token MDVlZWExMWUtZjdlOC05NjM1LWRlM2UtYThjOWQxNDM5MTM1 is the correct one, right?
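
The two values above appear to be the same token in two encodings: Kubernetes returns the `.data` fields of a secret base64-encoded, so the raw output has to be decoded before it is used in the Consul configuration. A minimal sketch using only the values quoted in this issue (the variable names are illustrative):

```shell
# Raw value returned by kubectl without the decode step.
encoded='MDVlZWExMWUtZjdlOC05NjM1LWRlM2UtYThjOWQxNDM5MTM1'

# Decode it the same way as the `| base64 -d` pipeline above.
decoded=$(printf '%s' "$encoded" | base64 -d)

# Prints the UUID-style token that goes into `replication = "..."`.
printf '%s\n' "$decoded"
```

The decoded string has the UUID format Consul uses for ACL tokens, which suggests the decoded form is the one to put in the `replication` field.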

Also, I don't understand where to find the value for agent = "" shown on the website https://www.consul.io/docs/k8s/installation/multi-cluster/vms-and-kubernetes, as below:
acls {
tokens {
agent = ""
replication = "e7924dd1-dc3f-f644-da54-81a73ba0a178"
}
}

I set the ACL section as below, but it does not work. I suspect this is caused by the setting agent = "a6c61787-e229-41f7-8541-0e5adc414b34". Could you please tell me where I can get the correct agent token, and whether I should use the agent token of the primary Consul cluster or of the secondary VM Consul cluster?
acl {
enabled = true
default_policy = "deny"
down_policy = "extend-cache"
tokens {
agent = "a6c61787-e229-41f7-8541-0e5adc414b34"
replication = "05eea11e-f7e8-9635-de3e-a8c9d1439135"
}
}

Consul info for both Client and Server

Client info
[root@consul-dc4-server1 consul]# consul version
Consul v1.9.4
Revision 10bb6cb3b
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)


Server info
[root@consul-dc4-server1 consul]# consul version
Consul v1.9.4
Revision 10bb6cb3b
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)


Operating system and Environment details

[root@consul-dc4-server1 consul]# cat /etc/redhat-release
CentOS Linux release 8.3.2011

Log Fragments

Include appropriate Client or Server log fragments. If the log is longer than a few dozen lines, please include the URL to the gist of the log instead of posting it in the issue. Use -log-level=TRACE on the client and server to capture the maximum log detail.

@ChipV223
Contributor

Closing as this is a duplicate of #10138
