ETCD with TLS showing warning "transport: authentication handshake failed: remote error: tls: bad certificate" #9785

Closed
JinsYin opened this issue May 29, 2018 · 23 comments

@JinsYin

JinsYin commented May 29, 2018

I referred to the following two articles:

https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md
https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md

Initialize a certificate authority

$ cat ca-config.json
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth"
        ]
      },
      "client": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ]
      },
      "peer": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}

$ cat ca-csr.json
{
  "CN": "My own CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "CA",
      "O": "My Company Name",
      "ST": "San Francisco",
      "OU": "Org Unit 1",
      "OU": "Org Unit 2"
    }
  ]
}

$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

Generate server certificate

# cfssl print-defaults csr > server.json
$ cat server.json
{
  "CN": "etcd1",
  "hosts": [
    "192.168.1.221"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  },
  "names": [
    {
        "C": "US",
        "L": "CA",
        "ST": "San Francisco"
    }
  ]
}

$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server

Etcd Server

$ etcd --name infra0 --data-dir infra0 \
  --client-cert-auth --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
  --advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-29 11:17:10.374455 I | etcdmain: etcd Version: 3.3.5
2018-05-29 11:17:10.374527 I | etcdmain: Git SHA: 70c872620
2018-05-29 11:17:10.374534 I | etcdmain: Go Version: go1.9.6
2018-05-29 11:17:10.374540 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-29 11:17:10.374546 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-29 11:17:10.374859 I | embed: listening for peers on http://localhost:2380
2018-05-29 11:17:10.374899 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-29 11:17:10.377043 I | etcdserver: name = infra0
2018-05-29 11:17:10.377067 I | etcdserver: data dir = infra0
2018-05-29 11:17:10.377074 I | etcdserver: member dir = infra0/member
2018-05-29 11:17:10.377079 I | etcdserver: heartbeat = 100ms
2018-05-29 11:17:10.377087 I | etcdserver: election = 1000ms
2018-05-29 11:17:10.377092 I | etcdserver: snapshot count = 100000
2018-05-29 11:17:10.377125 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-29 11:17:10.377133 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2018-05-29 11:17:10.377143 I | etcdserver: initial cluster = infra0=http://localhost:2380
2018-05-29 11:17:10.379279 I | etcdserver: starting member 8e9e05c52164694d in cluster cdf818194e3a8c32
2018-05-29 11:17:10.379320 I | raft: 8e9e05c52164694d became follower at term 0
2018-05-29 11:17:10.379337 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-05-29 11:17:10.379344 I | raft: 8e9e05c52164694d became follower at term 1
2018-05-29 11:17:10.385248 W | auth: simple token is not cryptographically signed
2018-05-29 11:17:10.388175 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-29 11:17:10.388842 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-05-29 11:17:10.389395 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-29 11:17:10.392890 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = true, crl-file = 
2018-05-29 11:17:10.479773 I | raft: 8e9e05c52164694d is starting a new election at term 1
2018-05-29 11:17:10.479819 I | raft: 8e9e05c52164694d became candidate at term 2
2018-05-29 11:17:10.479887 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 2
2018-05-29 11:17:10.479906 I | raft: 8e9e05c52164694d became leader at term 2
2018-05-29 11:17:10.479915 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 2
2018-05-29 11:17:10.480540 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-29 11:17:10.480670 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-29 11:17:10.480694 I | embed: ready to serve client requests
2018-05-29 11:17:10.480718 I | etcdserver: setting up the initial cluster version to 3.3
2018-05-29 11:17:10.481430 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-29 11:17:10.481638 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-29 11:17:10.532133 I | embed: serving client requests on 127.0.0.1:2379
2018-05-29 11:17:10.539294 I | embed: rejected connection from "127.0.0.1:39794" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
@JinsYin
Author

JinsYin commented May 29, 2018

When I replaced the server certificate with the peer certificate, the warning was gone. Why?

# -profile=peer
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer server.json | cfssljson -bare server
$ etcd --name infra0 --data-dir infra0 \
  --client-cert-auth --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
  --advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-29 11:21:09.053070 I | etcdmain: etcd Version: 3.3.5
2018-05-29 11:21:09.053133 I | etcdmain: Git SHA: 70c872620
2018-05-29 11:21:09.053141 I | etcdmain: Go Version: go1.9.6
2018-05-29 11:21:09.053146 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-29 11:21:09.053152 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-29 11:21:09.053557 I | embed: listening for peers on http://localhost:2380
2018-05-29 11:21:09.053597 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-29 11:21:09.055180 I | etcdserver: name = infra0
2018-05-29 11:21:09.055195 I | etcdserver: data dir = infra0
2018-05-29 11:21:09.055202 I | etcdserver: member dir = infra0/member
2018-05-29 11:21:09.055207 I | etcdserver: heartbeat = 100ms
2018-05-29 11:21:09.055212 I | etcdserver: election = 1000ms
2018-05-29 11:21:09.055220 I | etcdserver: snapshot count = 100000
2018-05-29 11:21:09.055230 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-29 11:21:09.055237 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2018-05-29 11:21:09.055246 I | etcdserver: initial cluster = infra0=http://localhost:2380
2018-05-29 11:21:09.056700 I | etcdserver: starting member 8e9e05c52164694d in cluster cdf818194e3a8c32
2018-05-29 11:21:09.056732 I | raft: 8e9e05c52164694d became follower at term 0
2018-05-29 11:21:09.056747 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-05-29 11:21:09.056753 I | raft: 8e9e05c52164694d became follower at term 1
2018-05-29 11:21:09.059841 W | auth: simple token is not cryptographically signed
2018-05-29 11:21:09.061318 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-29 11:21:09.061669 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-05-29 11:21:09.062072 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-29 11:21:09.063469 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = true, crl-file = 
2018-05-29 11:21:09.657081 I | raft: 8e9e05c52164694d is starting a new election at term 1
2018-05-29 11:21:09.657149 I | raft: 8e9e05c52164694d became candidate at term 2
2018-05-29 11:21:09.657179 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 2
2018-05-29 11:21:09.657203 I | raft: 8e9e05c52164694d became leader at term 2
2018-05-29 11:21:09.657215 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 2
2018-05-29 11:21:09.657608 I | etcdserver: setting up the initial cluster version to 3.3
2018-05-29 11:21:09.658381 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-29 11:21:09.658457 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-29 11:21:09.658520 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-29 11:21:09.658536 I | embed: ready to serve client requests
2018-05-29 11:21:09.658751 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-29 11:21:09.712055 I | embed: serving client requests on 127.0.0.1:2379

@JinsYin JinsYin changed the title WARNING: Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry. ETCD with TLS showing warning "transport: authentication handshake failed: remote error: tls: bad certificate" May 29, 2018
@hexfusion
Contributor

hexfusion commented May 29, 2018

@JinsYin your config defines the server profile with server auth only, while the peer profile has both the server auth and client auth extensions. I can see how this is confusing, as the example uses server in the file name.

embed: rejected connection from "127.0.0.1:39794" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.

So it seems that as soon as client auth is attempted it fails, because the server profile does not produce certificates that can be used for client auth. That is how I read it, at least.

ref https://github.com/cloudflare/cfssl/blob/master/doc/cmd/cfssl.txt
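
The profile difference described above can be checked mechanically. The following is a minimal sketch, not part of cfssl or etcd: it assumes the profile layout of the ca-config.json posted at the top of this issue, and the helper name `profiles_missing_usage` is made up for illustration.

```python
import json

# Trimmed copy of the ca-config.json posted at the top of this issue
# (expiry fields omitted, they do not matter for this check).
CA_CONFIG = json.loads("""
{
  "signing": {
    "profiles": {
      "server": {"usages": ["signing", "key encipherment", "server auth"]},
      "client": {"usages": ["signing", "key encipherment", "client auth"]},
      "peer":   {"usages": ["signing", "key encipherment", "server auth", "client auth"]}
    }
  }
}
""")


def profiles_missing_usage(ca_config, usage):
    """Return the sorted names of signing profiles that do not list `usage`."""
    profiles = ca_config.get("signing", {}).get("profiles", {})
    return sorted(
        name for name, profile in profiles.items()
        if usage not in profile.get("usages", [])
    )


if __name__ == "__main__":
    # Profiles whose certificates cannot be presented as client certificates.
    print(profiles_missing_usage(CA_CONFIG, "client auth"))
```

With the config from this issue, only the server profile lacks "client auth", which matches the observation that swapping to the peer profile made the warning disappear.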

@JinsYin
Author

JinsYin commented May 30, 2018

@hexfusion I agree. What confuses me is why the etcd server needs client auth.

@JinsYin
Author

JinsYin commented May 30, 2018

When I set the --client-cert-auth parameter to false, the warning was gone. So I guess the etcd process will do a health check as a client.

# server auth & --client-cert-auth=false
$ etcd --name infra0 --data-dir infra0 \
  --client-cert-auth=false --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
  --advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-30 11:43:23.150450 I | etcdmain: etcd Version: 3.3.5
2018-05-30 11:43:23.150561 I | etcdmain: Git SHA: 70c872620
2018-05-30 11:43:23.150577 I | etcdmain: Go Version: go1.9.6
2018-05-30 11:43:23.150590 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-30 11:43:23.150602 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-30 11:43:23.150699 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2018-05-30 11:43:23.151409 I | embed: listening for peers on http://localhost:2380
2018-05-30 11:43:23.151494 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-30 11:43:23.152450 I | etcdserver: name = infra0
2018-05-30 11:43:23.152471 I | etcdserver: data dir = infra0
2018-05-30 11:43:23.152484 I | etcdserver: member dir = infra0/member
2018-05-30 11:43:23.152496 I | etcdserver: heartbeat = 100ms
2018-05-30 11:43:23.152516 I | etcdserver: election = 1000ms
2018-05-30 11:43:23.152529 I | etcdserver: snapshot count = 100000
2018-05-30 11:43:23.152550 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-30 11:43:23.153964 I | etcdserver: restarting member 8e9e05c52164694d in cluster cdf818194e3a8c32 at commit index 14
2018-05-30 11:43:23.154047 I | raft: 8e9e05c52164694d became follower at term 7
2018-05-30 11:43:23.154074 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 7, commit: 14, applied: 0, lastindex: 14, lastterm: 7]
2018-05-30 11:43:23.158976 W | auth: simple token is not cryptographically signed
2018-05-30 11:43:23.161144 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-30 11:43:23.162710 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-30 11:43:23.163138 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-30 11:43:23.163261 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-30 11:43:23.165712 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = false, crl-file = 
2018-05-30 11:43:25.054746 I | raft: 8e9e05c52164694d is starting a new election at term 7
2018-05-30 11:43:25.054839 I | raft: 8e9e05c52164694d became candidate at term 8
2018-05-30 11:43:25.054875 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 8
2018-05-30 11:43:25.054908 I | raft: 8e9e05c52164694d became leader at term 8
2018-05-30 11:43:25.054930 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 8
2018-05-30 11:43:25.056827 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-30 11:43:25.056909 I | embed: ready to serve client requests
2018-05-30 11:43:25.057110 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-30 11:43:25.113424 I | embed: serving client requests on 127.0.0.1:2379

@detiber

detiber commented Jun 12, 2018

I found this issue while troubleshooting problems that arose during an etcd upgrade from 3.1.x to 3.2.x using kubeadm. After some debugging I determined that the new (as of etcd 3.2.x) requirement for client usage on the serving certificate comes from etcd using the server certificate as a client certificate for the gRPC gateway.

This requirement doesn't appear to be documented in any of the places I would expect, such as:
https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
https://coreos.com/etcd/docs/latest/op-guide/security.html
https://coreos.com/etcd/docs/latest/dev-guide/api_grpc_gateway.html
https://coreos.com/etcd/docs/latest/op-guide/configuration.html
https://coreos.com/etcd/docs/latest/upgrades/upgrade_3_2.html

Ideally, I would expect there to be a configuration option to specify a separate client cert for the gRPC gateway (and, tangentially, to specify separate client/server certs for the peer certificates as well).
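
To make the behaviour described in this comment concrete, here is a toy model of it. It is only a sketch of what this thread reports, not etcd's actual code: the function name `loopback_dial_accepted` and the usage strings (borrowed from the cfssl profiles earlier in the issue) are assumptions for illustration.

```python
CLIENT_AUTH = "client auth"
SERVER_AUTH = "server auth"


def loopback_dial_accepted(client_cert_auth, cert_usages):
    """Toy model: is etcd's own loopback connection to its client port accepted?

    Per this thread, etcd presents the --cert-file certificate as a client
    certificate when it dials itself. With --client-cert-auth enabled, that
    certificate must therefore carry the client auth usage.
    """
    if not client_cert_auth:
        # No client certificate verification, so usages are not checked.
        return True
    return CLIENT_AUTH in cert_usages


if __name__ == "__main__":
    # The three runs reported earlier in this issue.
    runs = [
        ("server profile, --client-cert-auth", True, {SERVER_AUTH}),
        ("peer profile, --client-cert-auth", True, {SERVER_AUTH, CLIENT_AUTH}),
        ("server profile, --client-cert-auth=false", False, {SERVER_AUTH}),
    ]
    for label, flag, usages in runs:
        outcome = "accepted" if loopback_dial_accepted(flag, usages) else "rejected"
        print(label, "->", outcome)
```

The model reproduces the three observations above: only the first configuration is rejected.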

@KIVagant

KIVagant commented Oct 23, 2018

TL;DR: How to fix the issue:

ca-config.json: add "client auth" to the "server" section

{
    "signing": {
        "default": {
            "expiry": "1000000h"
        },
        "profiles": {
            "server": {
                "expiry": "1000000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "client": {
                "expiry": "1000000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "43800h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}

Regenerate the cert

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server

Check server certificate: (I copied it to /etc/etcd/server.pem)

$ openssl x509 -in /etc/etcd/server.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
...
    Signature Algorithm: sha256WithRSAEncryption
...
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE

Environment vars:

ETCD_CLIENT_CERT_AUTH=true
ETCD_KEY_FILE=/etc/etcd/server-key.pem
ETCD_CERT_FILE=/etc/etcd/server.pem
ETCD_TRUSTED_CA_FILE=/etc/etcd/ca.pem
...

Run etcd

sudo etcd --peer-auto-tls=true
...
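
The certificate check in the steps above can also be scripted. This is a rough sketch that assumes the text layout printed by `openssl x509 -text -noout` as shown in this comment (the usage list on the line after the "X509v3 Extended Key Usage" header); the function names are hypothetical.

```python
import subprocess

# Extension section copied from the openssl output in this comment.
SAMPLE = """\
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
"""


def extended_key_usages(openssl_text):
    """Return the extended key usages listed in `openssl x509 -text` output."""
    lines = openssl_text.splitlines()
    for index, line in enumerate(lines):
        if "X509v3 Extended Key Usage" in line and index + 1 < len(lines):
            # The usages are printed comma-separated on the following line.
            return [item.strip() for item in lines[index + 1].split(",") if item.strip()]
    return []


def usable_for_client_auth(openssl_text):
    """True if the certificate text lists TLS client authentication."""
    return "TLS Web Client Authentication" in extended_key_usages(openssl_text)


def dump_certificate(path):
    """Run the openssl command from this comment and return its text output."""
    result = subprocess.run(
        ["openssl", "x509", "-in", path, "-text", "-noout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(extended_key_usages(SAMPLE))
    print(usable_for_client_auth(SAMPLE))
```

On a real host, `usable_for_client_auth(dump_certificate("/etc/etcd/server.pem"))` would run the same check against the deployed certificate, provided openssl is installed.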

@KIVagant

Btw, even after the issue was fixed, I still see a lot of messages like this in the log:

embed: rejected connection from "35.111.222.111:41886" (error "EOF", ServerName "")

I feel like it could be related to health checks from a Network Load Balancer.

@wenjiaswe
Contributor

@JinsYin For your confusion about server and client auth, here is the up-to-date documentation on etcd TLS setup: example 1 covers the case without "client-cert-auth", and example 2 covers "client-cert-auth" set to true. Thanks to @KIVagant for the detailed demo!

@KIVagant for your "embed: rejected connection from "35.111.222.111:41886" (error "EOF", ServerName "")" comment, may I ask if you are using etcd in k8s? Because there is a bug in k8s that would lead to that. If you are, I will add more details, never mind if not.

@KIVagant

KIVagant commented Oct 24, 2018

@wenjiaswe, I'm preparing etcd for K8s, but there is nothing in my test Google Cloud environment except ETCD, its network balancer and a bastion host. Still, I see 4 or 5 different IP addresses trying to connect to etcd, so I still don't really know where they come from.

@wenjiaswe
Contributor

@KIVagant So now it's just ETCD with no k8s at all, right?
/cc @jpbetz Joe, do you have any insight on this?

@KIVagant

Yes, it's just a clean isolated installation of ETCD.

@mindcrime

I'm also seeing a ton of the "rejected connection" errors, running vanilla etcd (likewise, preparing for k8s) on EC2. I've been following the instructions here: https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/04-certificate-authority.md and I suspect that one or more of my certificates are missing something, or need to be tweaked. Still digging into it, but any advice would be much appreciated.

@KIVagant

KIVagant commented Oct 26, 2018

@mindcrime first of all, check whether the cluster works. I believe there is a big difference between a working cluster that something external is trying to connect to and a cluster whose nodes really can't join. In my case all nodes operate normally, and I can get member info and put messages.

wking added a commit to wking/kubecsr that referenced this issue Dec 6, 2018
Avoid issues like [1]:

  WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.

In the discussion there, the issue seems to be that etcd 3.2 started
requiring the client usage for the server cert, which is (for some
reason) used when connecting to a gRPC gateway [2,3].

[1]: etcd-io/etcd#9785 (comment)
[2]: etcd-io/etcd#9785 (comment)
[3]: https://github.com/etcd-io/etcd/blob/v3.3.10/Documentation/dev-guide/api_grpc_gateway.md
@gbolo

gbolo commented Feb 9, 2019

I ran into this as well. Adding client usage fixed it. I agree that there should be an option to specify a separate client cert for this purpose instead of hijacking the server certificate!

@RedSofaForEveryone

RedSofaForEveryone commented Apr 21, 2019

@KIVagant
How do you achieve this fix with openssl only?
(Not using cfssl)

edit

Figured it out.

Use the documentation from Kubernetes here:
https://kubernetes.io/docs/concepts/cluster-administration/certificates/

You want to utilize the v3_ext config at the bottom when you are signing your csr with your CA. Note that this is part of the x509 command, not the req command.
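
As a rough illustration of that approach, the sketch below renders an extension file with a `v3_ext` section and builds the matching `openssl x509 -req` signing command. It is loosely modeled on the Kubernetes guide rather than copied from it. The key usage values mirror the certificate shown earlier in this thread, and all file names and helper names are hypothetical.

```python
def render_ext_file(dns_names, ip_addresses):
    """Render an OpenSSL extension file whose v3_ext section allows both
    server and client authentication.

    At least one DNS name or IP address is needed, otherwise OpenSSL
    will reject the empty alt_names section.
    """
    lines = [
        "[ v3_ext ]",
        "basicConstraints = CA:FALSE",
        "keyUsage = digitalSignature, keyEncipherment",
        "extendedKeyUsage = serverAuth, clientAuth",
        "subjectAltName = @alt_names",
        "",
        "[ alt_names ]",
    ]
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(dns_names, 1)]
    lines += [f"IP.{i} = {ip}" for i, ip in enumerate(ip_addresses, 1)]
    return "\n".join(lines) + "\n"


def signing_command(csr, ca_cert, ca_key, out_cert, ext_file, days=365):
    """Build the `openssl x509 -req` command that signs `csr` with the CA.

    The extensions are applied here, at signing time, via -extfile and
    -extensions, not when the CSR is created with `openssl req`.
    """
    return [
        "openssl", "x509", "-req",
        "-in", csr,
        "-CA", ca_cert, "-CAkey", ca_key, "-CAcreateserial",
        "-out", out_cert,
        "-days", str(days), "-sha256",
        "-extfile", ext_file, "-extensions", "v3_ext",
    ]


if __name__ == "__main__":
    # 192.168.1.221 is the host address used in the original report.
    print(render_ext_file([], ["192.168.1.221"]))
    print(" ".join(signing_command(
        "server.csr", "ca.pem", "ca-key.pem", "server.pem", "server-ext.cnf")))
```

Writing the rendered text to the extension file and running the printed command should yield a certificate carrying both extended key usages, which can then be confirmed with `openssl x509 -text -noout`.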

@mordf

mordf commented Aug 5, 2019

I'm having trouble deploying etcd-cluster on k8s using the Bitnami charts. I get a lot of embed: rejected connection from "127.0.0.1:45488" (error "tls: first record does not look like a TLS handshake", ServerName "") messages when I try to run commands within the pod.

@wenjiaswe
Contributor

@mordf It's probably because the traffic is HTTP rather than HTTPS. This post might have the answer: #9917
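
Several different "rejected connection" messages have come up in this thread, each with a different suggested cause. The sketch below pulls the peer address and error out of such a log line and attaches the hint given here. The hints are the commenters' suggestions, not confirmed diagnoses, and the regular expression assumes the log format quoted in this issue.

```python
import re

# Matches the "embed: rejected connection from ..." lines quoted in this issue.
REJECTED = re.compile(
    r'rejected connection from "(?P<peer>[^"]+)" '
    r'\(error "(?P<error>.*)", ServerName "(?P<server_name>[^"]*)"\)'
)

# (substring of the error, hint suggested in this thread)
HINTS = [
    ("incompatible key usage",
     "certificate presented as a client cert lacks the client auth usage"),
    ("first record does not look like a TLS handshake",
     "client is probably speaking plain HTTP to an HTTPS endpoint"),
    ("EOF",
     "possibly a TCP health check, e.g. from a load balancer"),
]

SAMPLE_LINES = [
    'embed: rejected connection from "127.0.0.1:39794" (error "tls: failed to verify client\'s certificate: x509: certificate specifies an incompatible key usage", ServerName "")',
    'embed: rejected connection from "35.111.222.111:41886" (error "EOF", ServerName "")',
    'embed: rejected connection from "127.0.0.1:45488" (error "tls: first record does not look like a TLS handshake", ServerName "")',
]


def diagnose(line):
    """Return (peer, error, hint) for a rejected-connection line, else None."""
    match = REJECTED.search(line)
    if match is None:
        return None
    error = match.group("error")
    for needle, hint in HINTS:
        if needle in error:
            return match.group("peer"), error, hint
    return match.group("peer"), error, "no hint from this thread"


if __name__ == "__main__":
    for sample in SAMPLE_LINES:
        print(diagnose(sample))
```

Grouping the results by peer address may help separate external probes from the loopback connections etcd makes to itself.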

@mordf

mordf commented Aug 6, 2019

It seems I made a mistake in how I addressed etcd from etcdctl within the pod: I ran kubectl exec etcd-cluster-0 sh and then ran etcdctl without --cert, --key and --cacert. I thought you didn't need them when running from within the pod, but I guess you do.
So it's working now.
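
For reference, a small sketch of assembling an etcdctl call with the three TLS flags mentioned above. The certificate paths and endpoint are placeholders, the helper names are made up, and setting `ETCDCTL_API=3` is an assumption that applies to etcd 3.3-era etcdctl, where the v3 API (which uses these flag names) is not the default.

```python
import os
import subprocess


def etcdctl_command(endpoint, cacert, cert, key, *args):
    """Build an etcdctl (v3 API) argument list with the TLS flags set."""
    return [
        "etcdctl",
        "--endpoints", endpoint,
        "--cacert", cacert,   # CA used to verify the etcd server certificate
        "--cert", cert,       # client certificate presented to etcd
        "--key", key,         # private key for the client certificate
        *args,
    ]


def run_etcdctl(command):
    """Run the command with ETCDCTL_API=3 and return its standard output."""
    env = dict(os.environ, ETCDCTL_API="3")
    result = subprocess.run(command, env=env, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder paths: substitute wherever the pod mounts its certificates.
    cmd = etcdctl_command(
        "https://127.0.0.1:2379",
        "/path/to/ca.crt", "/path/to/client.crt", "/path/to/client.key",
        "endpoint", "health",
    )
    print(" ".join(cmd))
```

Leaving the flags out makes etcdctl talk plain HTTP to the TLS port, which is consistent with the "first record does not look like a TLS handshake" message reported above.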

alaypatel07 pushed a commit to alaypatel07/kubecsr that referenced this issue Sep 26, 2019
hexfusion pushed a commit to openshift/kubecsr that referenced this issue Nov 7, 2019
@stale

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@wenjiaswe
Contributor

Yes, you need to specify --cacert ./ca.crt, --cert ./server.crt and --key ./server.key flags for it to work. Looks like you figured it out. I am closing this issue.

@zhangguanzhang
Contributor

etcd \
  ....\
  --cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384

@james-ok

I still have this problem. I have added the etcd server certificate to be used as client certificate authentication, but the following error still occurs:
authentication handshake failed: remote error: tls: bad certificate
In my case, my etcd cluster uses three different CAs: one for two-way authentication of inter-cluster communication, a server CA, and a CA for verifying client identities. Each issues its corresponding server or client certificates.
My guess is that the CA root certificate used by the gRPC server in ETCD to verify the identity of the gRPC-gateway client is wrong. If this is the case, which CA should the gRPC server use to verify the identity of the gRPC gateway by default?
How can I deal with this problem? Thanks!

@dromadaire54

I get the error "server-name":"etcd-apisix-2.etcd-apisix-headless.apisix.svc.cluster.local","error":"remote error: tls: bad certificate" when I deploy etcd with the Bitnami Helm chart. Client auth is enabled, and the server certificate is generated with the DNS alt names.
