SNI certificates not functioning as expected #1792

Closed
mehstg opened this issue Sep 5, 2019 · 10 comments
Labels
stale Issue is stale and will be closed

Comments

@mehstg

mehstg commented Sep 5, 2019

Describe the bug
Currently working with @kflynn on this issue via Slack. Defining TLS certificates using TLSContext does not seem to be working correctly.

To Reproduce
Steps to reproduce the behavior:
I removed any trace of the default 'ambassador-certs' secret
I set up a TLSContext as follows,

          ---
          apiVersion: ambassador/v1
          kind: TLSContext
          name: mytls
          ambassador_id: external-gateway
          hosts:
            - "stage01.mydomain.com"
          secret: my-certs

I set up a mapping as follows,

          ---
          apiVersion: ambassador/v1
          kind:  Mapping
          name:  campus-tenant-mapping
          host:  "stage01.mydomain.com"
          ambassador_id: external-gateway
          prefix: /
          service: web-application.frontend:80
          timeout_ms: 5000

I also have the tls module set up in its most basic form to provide HTTP-to-HTTPS redirection, as follows:

          ---
          apiVersion: ambassador/v1
          kind: Module
          name: tls
          ambassador_id: external-gateway
          config:
            server:
              redirect_cleartext_from: 8080

We are also running an AuthService and Tracing module, however I have disabled these to aid the fault finding.

Expected behavior

I would expect the certificate to be served by Ambassador; however, this does not happen. In Chrome I see an ERR_CONNECTION_RESET, and from the OpenSSL command line I see the following:

connected(00000005)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 327 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)

If I change the TLSContext to use a wildcard, everything is fine, but obviously I can then only serve a single certificate.

          ---
          apiVersion: ambassador/v1
          kind: TLSContext
          name: mytls
          ambassador_id: external-gateway
          hosts:
            - "*"
          secret: my-certs

Versions:

  • Ambassador: 0.75 (also tried on 0.76)
  • Kubernetes environment: Amazon EKS
  • Version: v1.12


@stefansedich
Contributor

I am seeing the same, however I just noticed if I disable the tls module so that I do not get redirect_cleartext_from it appears to work properly using a non-wildcard TLSContext.

@tomwganem

This issue is happening for me as well.

I am trying to serve three distinct domains using ambassador v0.76.0.

---
apiVersion: getambassador.io/v1
kind: TLSContext
name: aoc-tls
hosts:
- "api.realdomain.com"
secret: aoc-cert
ambassador_id: {{ $ambassadorId }}
min_tls_version: v1.2
max_tls_version: v1.3
alpn_protocols: h2[, http/1.1]
---
apiVersion: getambassador.io/v1
kind: TLSContext
name: short-tls
hosts:
- "domain.pub"
secret: short-cert
ambassador_id: {{ $ambassadorId }}
min_tls_version: v1.2
max_tls_version: v1.3
alpn_protocols: h2[, http/1.1]
---
apiVersion: getambassador.io/v1
kind: TLSContext
name: aoc-tls-old
hosts:
- "api.olddomain.com"
secret: aoc-tls-old
ambassador_id: {{ $ambassadorId }}
min_tls_version: v1.2
max_tls_version: v1.3
alpn_protocols: h2[, http/1.1]
---
apiVersion: getambassador.io/v1
kind: Module
name: tls
ambassador_id: {{ $ambassadorId }}
config:
  server:
    redirect_cleartext_from: 8080

The "ambassador-secret" cert in the namespace is for api.realdomain.com.

Calls for api.realdomain.com work correctly.

But when I try to call something in domain.pub, the request is not routed correctly, and the certificate served is the one for api.realdomain.com.

@stefansedich
Contributor

After playing with a bunch of different things today, I decided to move from the classic ELB to an NLB, and everything appears to be working now, including a default self-signed cert for all hosts, per-domain real certificates, and even cleartext redirection.

@kflynn
Member

kflynn commented Sep 9, 2019

So this is a bug around SNI and host_regex, and I think the simple way forward is this:

  • make the host field in a Mapping support globs with a leading *
  • make the SNI host match be a glob match

Or, in plainer terms, make things like *.example.com in the host field of a Mapping do what you would expect them to. Only a leading * will be supported. You won't be able to do e.g. foo.*.example.com (not that I've ever personally run across anything that does support that).

SNI + host_regex will still be problematic, but then host_regex should be largely unnecessary if we support glob patterns.

(Note also that I'm not too concerned about breaking people currently using glob patterns in the host field, because * is not legal in a DNS name, so there's no real way it could have worked.)
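
The proposed rule (a glob match in which only a leading * acts as a wildcard) can be sketched in a few lines of Python. This only illustrates the semantics described above and is not Ambassador's implementation. The function name `host_glob_match` is hypothetical. Letting `*.example.com` also match multi-label prefixes such as `a.b.example.com` is an assumption that the comment does not settle.

```python
def host_glob_match(pattern: str, host: str) -> bool:
    """Return True if host matches pattern, where only a leading '*' is a wildcard.

    Hypothetical helper for illustration; comparison is case-insensitive
    because DNS names are.
    """
    pattern = pattern.lower()
    host = host.lower()

    if pattern.startswith("*"):
        suffix = pattern[1:]
        if "*" in suffix:
            # e.g. "*.foo.*.com": wildcards after the first character are rejected
            raise ValueError("only a single leading '*' is supported")
        # Require at least one character in place of the '*',
        # so "*.example.com" does not match ".example.com".
        return len(host) > len(suffix) and host.endswith(suffix)

    if "*" in pattern:
        # e.g. "foo.*.example.com"
        raise ValueError("only a single leading '*' is supported")

    # No wildcard: plain exact match
    return host == pattern
```

With this sketch, `host_glob_match("*.example.com", "foo.example.com")` is true, `host_glob_match("*.example.com", "example.com")` is false, and a pattern like `foo.*.example.com` is rejected outright rather than silently never matching.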

@tomwganem

I'm not using host_regex.

Some more info:

I have three domains: ["api.olddomain.com", "api.realdomain.com", "domain.pub"]

I have certificates for each of these domains: ["files-cert", "a1-cert", "short-cert"]. I also have a secret called "ambassador-certs", which is just a copy of "a1-cert".

Here is my annotation on my ambassador service:

metadata:
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: getambassador.io/v1
      kind:  Module
      name:  ambassador
      ambassador_id: analytics-qa
      config:
        service_port: 8443
        diagnostics:
          enabled: true
        cors:
          origins: '*'
          methods: 'GET, POST, DELETE, PUT, OPTIONS'
          headers:
          - authorization
          - content-type
          exposed_headers: ''
          max_age: '1728000'
        load_balancer:
          policy: ring_hash
          source_ip: true
      ---
      apiVersion: getambassador.io/v1
      kind: TLSContext
      name: aoc-tls
      hosts:
      - "api.qa.realdomain.com"
      secret: a1-cert
      ambassador_id: analytics-qa
      min_tls_version: v1.2
      max_tls_version: v1.3
      alpn_protocols: h2[, http/1.1]
      ---
      apiVersion: getambassador.io/v1
      kind: TLSContext
      name: short-tls
      hosts:
      - "domain.pub"
      secret: short-cert
      ambassador_id: analytics-qa
      min_tls_version: v1.2
      max_tls_version: v1.3
      alpn_protocols: h2[, http/1.1]
      ---
      apiVersion: getambassador.io/v1
      kind: TLSContext
      name: files-tls
      hosts:
      - "api.qa.olddomain.com"
      secret: files-cert
      ambassador_id: analytics-qa
      min_tls_version: v1.2
      max_tls_version: v1.3
      alpn_protocols: h2[, http/1.1]
      ---
      apiVersion: getambassador.io/v1
      kind: Module
      name: tls
      ambassador_id: analytics-qa
      config:
        server:
          redirect_cleartext_from: 8080

Here are the mappings I have for "api.realdomain.com" and "api.olddomain.com"

metadata:
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v1
      kind: Mapping
      name: analytics-qa-f4-back-end-puma-mapping-0
      prefix: /
      host: api.qa.realdomain.com
      service: analytics-qa-f4-back-end-puma.analytics-qa
      rewrite: ""
      ambassador_id: analytics-qa
      tls: aoc-tls
      bypass_auth: true
      timeout_ms: 90000
      ---
      apiVersion: ambassador/v1
      kind: Mapping
      name: analytics-qa-f4-back-end-puma-mapping-1
      prefix: /
      host: api.qa.olddomain.com
      service: analytics-qa-f4-back-end-puma.analytics-qa
      rewrite: ""
      ambassador_id: analytics-qa
      tls: files-tls
      bypass_auth: true
      timeout_ms: 90000

Here is my mapping for "domain.pub":

metadata:
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v1
      kind: Mapping
      name: analytics-qa-f4-back-end-short-f4-back-end-short-mapping-0
      prefix: /
      service: http://analytics-qa-f4-back-end-short.analytics-qa
      rewrite: ""
      host: domain.pub
      ambassador_id: analytics-qa
      bypass_auth: true

If I look in the admin view, it looks like everything is good to go:

{
    "match": {
        "case_sensitive": true,
        "headers": [
            {
                "exact_match": "api.qa.olddomain.com",
                "name": ":authority"
            }
        ],
        "prefix": "/",
        "runtime_fraction": {
            "default_value": {
                "denominator": "HUNDRED",
                "numerator": 100
            },
            "runtime_key": "routing.traffic_shift.cluster_analytics_qa_f4_back_end_puma_an-1"
        }
    },
    "per_filter_config": {
        "envoy.ext_authz": {
            "disabled": true
        }
    },
    "route": {
        "cluster": "cluster_analytics_qa_f4_back_end_puma_an-1",
        "cors": {
            "allow_headers": "authorization, content-type",
            "allow_methods": "GET, POST, DELETE, PUT, OPTIONS",
            "allow_origin": [
                "*"
            ],
            "enabled": true,
            "max_age": "1728000"
        },
        "hash_policy": [
            {
                "connection_properties": {
                    "source_ip": true
                }
            }
        ],
        "priority": null,
        "timeout": "90.000s"
    }
}
{
    "match": {
        "case_sensitive": true,
        "headers": [
            {
                "exact_match": "api.qa.realdomain.com",
                "name": ":authority"
            }
        ],
        "prefix": "/",
        "runtime_fraction": {
            "default_value": {
                "denominator": "HUNDRED",
                "numerator": 100
            },
            "runtime_key": "routing.traffic_shift.cluster_analytics_qa_f4_back_end_puma_an-0"
        }
    },
    "per_filter_config": {
        "envoy.ext_authz": {
            "disabled": true
        }
    },
    "route": {
        "cluster": "cluster_analytics_qa_f4_back_end_puma_an-0",
        "cors": {
            "allow_headers": "authorization, content-type",
            "allow_methods": "GET, POST, DELETE, PUT, OPTIONS",
            "allow_origin": [
                "*"
            ],
            "enabled": true,
            "max_age": "1728000"
        },
        "hash_policy": [
            {
                "connection_properties": {
                    "source_ip": true
                }
            }
        ],
        "priority": null,
        "timeout": "90.000s"
    }
}
{
    "match": {
        "case_sensitive": true,
        "headers": [
            {
                "exact_match": "domain.pub",
                "name": ":authority"
            }
        ],
        "prefix": "/",
        "runtime_fraction": {
            "default_value": {
                "denominator": "HUNDRED",
                "numerator": 100
            },
            "runtime_key": "routing.traffic_shift.cluster_http___analytics_qa_f4_back_end_-0"
        }
    },
    "per_filter_config": {
        "envoy.ext_authz": {
            "disabled": true
        }
    },
    "route": {
        "cluster": "cluster_http___analytics_qa_f4_back_end_-0",
        "cors": {
            "allow_headers": "authorization, content-type",
            "allow_methods": "GET, POST, DELETE, PUT, OPTIONS",
            "allow_origin": [
                "*"
            ],
            "enabled": true,
            "max_age": "1728000"
        },
        "hash_policy": [
            {
                "connection_properties": {
                    "source_ip": true
                }
            }
        ],
        "priority": null,
        "timeout": "3.000s"
    }
}

But routing just doesn't seem to work at all:

$ curl https://x.x.x.x/api/v1/keys -H "Host: api.qa.olddomain.com" -k -v
*   Trying x.x.x.x...
* TCP_NODELAY set
* Connected to x.x.x.x (x.x.x.x) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=CA; L=Emeryville; O=Aspera Inc.; OU=Aspera_IT; CN=*.qa.realdomain.com
*  start date: Mar 13 00:00:00 2018 GMT
*  expire date: Mar 20 12:00:00 2020 GMT
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
*  SSL certificate verify ok.
> GET /api/v1/keys HTTP/1.1
> Host: api.qa.olddomain.com
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Fri, 13 Sep 2019 21:47:44 GMT
< server: envoy
< content-length: 0
<
* Connection #0 to host x.x.x.x left intact
$ curl https://169.63.33.93/api/v1/keys -H "Host: api.qa.realdomain.com" -k -v
*   Trying 169.63.33.93...
* TCP_NODELAY set
* Connected to 169.63.33.93 (169.63.33.93) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=CA; L=Emeryville; O=Aspera Inc.; OU=Aspera_IT; CN=*.qa.realdomain.com
*  start date: Mar 13 00:00:00 2018 GMT
*  expire date: Mar 20 12:00:00 2020 GMT
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
*  SSL certificate verify ok.
> GET /api/v1/keys HTTP/1.1
> Host: api.qa.realdomain.com
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Fri, 13 Sep 2019 21:52:13 GMT
< server: envoy
< content-length: 0
<
* Connection #0 to host 169.63.33.93 left intact
$ curl https://x.x.x.x/api/v1/keys -H "Host: domain.pub" -k -v
*   Trying x.x.x.x...
* TCP_NODELAY set
* Connected to x.x.x.x (x.x.x.x) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=CA; L=Emeryville; O=Aspera Inc.; OU=Aspera_IT; CN=*.qa.realdomain.com
*  start date: Mar 13 00:00:00 2018 GMT
*  expire date: Mar 20 12:00:00 2020 GMT
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
*  SSL certificate verify ok.
> GET /api/v1/keys HTTP/1.1
> Host: domain.pub
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< date: Fri, 13 Sep 2019 21:54:20 GMT
< server: envoy
< content-length: 0
<
* Connection #0 to host x.x.x.x left intact
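
A caveat about these checks, stated as general curl and TLS behavior rather than anything specific to this setup: curl does not send an SNI name when the URL host is an IP address, and `-H "Host: ..."` only changes the HTTP header. The requests above therefore do not exercise SNI-based certificate selection. Below is a minimal Python sketch, using only the standard library, that connects to an address while sending an explicit SNI name and returns the certificate that comes back. The function names are hypothetical, and certificate verification is deliberately disabled, mirroring `-k`.

```python
import socket
import ssl


def make_unverified_context() -> ssl.SSLContext:
    """Build a client context that skips verification (like curl -k)."""
    ctx = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_cert_pem(address: str, sni: str, port: int = 443,
                   timeout: float = 5.0) -> str:
    """Connect to address:port, send `sni` in the ClientHello, return the cert as PEM."""
    ctx = make_unverified_context()
    with socket.create_connection((address, port), timeout=timeout) as sock:
        # server_hostname is what sets the SNI extension
        with ctx.wrap_socket(sock, server_hostname=sni) as tls:
            # binary_form=True returns the DER bytes even without verification
            der = tls.getpeercert(binary_form=True)
    return ssl.DER_cert_to_PEM_cert(der)


# Example (placeholder address as used in this thread):
# print(fetch_cert_pem("x.x.x.x", "domain.pub"))
```

Comparing the PEM returned for each host name shows whether the listener selects a different certificate per SNI name. The PEM text can be decoded with any X.509 tool.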

The logs aren't helpful:

2019-09-13 21:50:38 diagd 0.76.0 [P248TAmbassadorEventWatcher] INFO: Scout reports {"latest_version": "0.78.0", "application": "ambassador", "cached": true, "timestamp": 1568410987.584329}
2019-09-13 21:50:38 diagd 0.76.0 [P248TAmbassadorEventWatcher] INFO: Scout notices: [{"level": "DEBUG", "message": "Returning cached result"}, {"level": "INFO", "message": "Upgrade available! to Ambassador version 0.78.0"}]
time="2019-09-13T21:50:38Z" level=info msg="Loaded file /ambassador/envoy/envoy.json"
time="2019-09-13T21:50:38Z" level=info msg="Pushing snapshot v4"
ACCESS [2019-09-13T21:50:50.747Z] "GET /api/v1/keys HTTP/1.1" 404 NR 0 0 0 - "172.30.192.128" "curl/7.54.0" "472dd1ed-5729-4992-8683-f6f6e9049c19" "api.qa.realdomain.com" "-"
ACCESS [2019-09-13T21:50:59.511Z] "GET /api/v1/keys HTTP/1.1" 404 NR 0 0 0 - "172.30.192.128" "curl/7.54.0" "215787d0-5048-4a1d-8377-af806f18fd1c" "api.qa.olddomain.com" "-"
ACCESS [2019-09-13T21:56:16.457Z] "GET /api/v1/keys HTTP/1.1" 404 NR 0 0 0 - "172.30.192.128" "curl/7.54.0" "5d08d7be-c25c-448b-bd90-6404015f9778" "domain.pub" "-"
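
One detail in these lines is worth reading closely: the `NR` after the status code is Envoy's response flag for "no route configured", so Envoy itself returned each 404 and no request reached a backend. The Python sketch below pulls that flag and the `:authority` value out of lines in this format. The regular expression is written against the three lines shown here and is an assumption about the format, not an official parser; the function names are hypothetical.

```python
import re

# Field order follows the ACCESS lines shown above.
ACCESS_RE = re.compile(
    r'^ACCESS \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<flags>\S+) '
    r'(?P<bytes_received>\d+) (?P<bytes_sent>\d+) (?P<duration>\d+) '
    r'(?P<upstream_time>\S+) '
    r'"(?P<forwarded_for>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'"(?P<request_id>[^"]*)" "(?P<authority>[^"]*)" "(?P<upstream_host>[^"]*)"$'
)


def parse_access_line(line: str):
    """Return a dict of named fields, or None if the line is not an ACCESS line."""
    match = ACCESS_RE.match(line.strip())
    return match.groupdict() if match else None


def unrouted_authorities(lines):
    """Collect the :authority of every request flagged NR (no route configured)."""
    hosts = []
    for line in lines:
        entry = parse_access_line(line)
        # Multiple response flags, if present, are comma-separated
        if entry and "NR" in entry["flags"].split(","):
            hosts.append(entry["authority"])
    return hosts
```

Run over the three lines above, `unrouted_authorities` would list all three host names, which matches the observation that every host-specific route fails to match.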

@tomwganem

tomwganem commented Sep 13, 2019

Really, it seems that routing doesn't work at all if you define a service to use a specific host. All my routes that don't have a host defined work as expected; all my routes that do have one defined do not.

@tomwganem

We think we're running into envoyproxy/envoy#3411, because Ambassador renders an envoy.json with an empty filter_chain_match:

                "filter_chains": [
                    {
                        "filter_chain_match": {},
...snip
                    {
                        "filter_chain_match": {
                            "server_names": [
                                "api.qa.realdomain.com"
                            ]
                        },
...snip
                    {
                        "filter_chain_match": {
                            "server_names": [
                                "domain.pub"
                            ]
                        },
...etc
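
To check a rendered envoy.json for this pattern without reading it by hand, something like the Python sketch below can help. It relies only on the `filter_chains`, `filter_chain_match` and `server_names` keys visible in the snippet above. Because the snippet does not show where the listeners sit in the full file, the sketch walks the whole document instead of assuming a path. The function names are hypothetical.

```python
import json


def find_filter_chains(node):
    """Yield every dict found in any list stored under a 'filter_chains' key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "filter_chains" and isinstance(value, list):
                for chain in value:
                    if isinstance(chain, dict):
                        yield chain
            else:
                yield from find_filter_chains(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_filter_chains(item)


def summarize_chains(config):
    """Return (number of chains without server_names, list of SNI names seen)."""
    without_sni = 0
    sni_names = []
    for chain in find_filter_chains(config):
        match = chain.get("filter_chain_match") or {}
        names = match.get("server_names") or []
        if names:
            sni_names.extend(names)
        else:
            without_sni += 1
    return without_sni, sni_names


# Example, using the file path that appears in the log output earlier in this thread:
# with open("/ambassador/envoy/envoy.json") as fh:
#     print(summarize_chains(json.load(fh)))
```

For the fragment shown above, this would report one chain without server_names next to the SNI-specific chains for api.qa.realdomain.com and domain.pub.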

@stale

stale bot commented Nov 15, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Nov 15, 2019
@stale

stale bot commented Nov 16, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot closed this as completed Nov 23, 2019
@virgild

virgild commented Jan 4, 2020

Any updates on this? It's still happening.
