Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server,ui: infinite HTTP redirect when accessing UI for secure cluster behind nginx-ingress in k8s #228

Closed
sergeyshaykhullin opened this issue May 26, 2020 · 23 comments · Fixed by cockroachdb/cockroach#109694

Comments

@sergeyshaykhullin
Copy link
Contributor

sergeyshaykhullin commented May 26, 2020

Describe the problem
I am getting too many redirects when trying to access secure cockroachdb cluster behind ingress-nginx using helm chart.

Please describe the issue you observed, and any steps we can take to reproduce it:

It is happening because tls.enabled == secure cluster with tls termination.
Same thing i got with argo cd and solved using --insecure flag (Insecure - just about tls termination)
So i have to use insecure cluster behind ingress (losing auth screen, users with passwords etc) or secure cluster without ingress inside k8s (And i have to manage certificates, use not 80 and 443 ports, it's painful)

To Reproduce

  1. Setup basic k8s cluster
  2. Setup nginx-ingress, cert-manager
  3. Setup secure cockroachdb cluster
  4. Create k8s Ingress for cockroachdb service
  5. Try to access admin panel

Expected behavior
I can disable tls termination in web ui, but don't lose security cluster benefits

Additional data / screenshots

Environment:

  • CockroachDB version 20.1
  • Server OS: Debian 10

Jira issue: CRDB-4219

@blathers-crl
Copy link

blathers-crl bot commented May 26, 2020

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

  • @florence-crl (member of the technical support engineering team)

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@knz
Copy link

knz commented Jun 1, 2020

xref cockroachdb/cockroach#44842

craig bot referenced this issue in cockroachdb/cockroach Sep 10, 2020
53991: pgwire: accept non-TLS client conns safely in secure mode r=aaron-crl,irfansharif,bdarnell a=knz

Fixes #44842.
Informs #49532. 
Informs #53404.

This change makes it possible for a DBA / system administrator to
reconfigure individual nodes *in a secure cluster* to accept SQL
client sessions over TCP without mandating a TLS
handshake. Authentication remains mandatory as per the HBA rules.

Motivation: we have at least two high-profile customers who keep their
nodes and client apps in a private secure network (with network-level
encryption / privacy) and who experience client-side TLS as
unnecessary and expensive friction.

Additionally, **this feature is a prerequisite to upgrade an insecure
cluster to secure mode without downtime.**

Why this does not impair security:

- authentication remains mandatory (as per the HBA rules -- [1] [2]).
- the feature is opt-in: the operator must set a command-line flag
  (`--accept-sql-without-tls`), which is not enabled by default.
- there is an interlock: the user must both set up the flag
  and set log-in passwords for their SQL users (by default,
  users get created without a password and thus cannot log
  in without client certs).
- for now, node-node connections still require TLS.

[1]: https://www.postgresql.org/docs/12/auth-pg-hba-conf.html
[2]: https://dr-knz.net/authentication-in-postgresql-and-cockroachdb.html

For context, the default HBA configuration is the following:

```
host  all root all cert-password # fixed rule
host  all all  all cert-password # built-in CockroachDB default
local all all      password      # built-in CockroachDB default
```

The directive `host` covers both TLS and non-TLS incoming TCP
connections (`local` is for the unix socket). The method
`cert-password` means "client cert or password": without a cert, the
password is mandatory.

As previously, the user can further secure the configuration by
restricting non-TLS connections to just a subnetwork, for example:

```
host  all all 10.0.0.0/8 password # accept conns on the 10/8 network.
host  all all all        reject   # refuse conns from other nets.
local all all            password
```

Note that this change is limited to the server side: CockroachDB's own
`cockroach` CLI commands do not yet know how to connect to a
CockroachDB server without TLS; such connections are only supported
from `psql` or SQL client drivers in apps. See #53994 for a follow-up.

Release justification: fixes for high-priority or high-severity bugs in existing functionality


54019: roachtest: de-flake 'inconsistent' r=knz a=tbg

This test sets up an intentionally corrupted replica and wants its node
to shut down as a result of its detection. When only two of the three
nodes were included in the consistency check, either one of them could
end up terminating (as no obvious majority of healthy replicas can be
determined). Change the test so that we wait for the cluster to come
fully together before setting a low consistency check interval.

Closes #54005.

Release justification: testing
Release note: None


Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
@knz knz changed the title Secure cluster behind nginx-ingress in k8s server,ui: infinite HTTP redirect when accessing UI for secure cluster behind nginx-ingress in k8s May 28, 2021
@knz
Copy link

knz commented May 28, 2021

cc @dhartunian this seems to fall between the cracks of server / obs infra. What to do?

@mihneastaub
Copy link

mihneastaub commented Feb 1, 2022

@knz any updates on this issue?

@knz
Copy link

knz commented Feb 1, 2022

We've not found the direct cause of the issue yet, but you might have interest in cockroachdb/cockroach#74329.

@mihneastaub
Copy link

mihneastaub commented Feb 1, 2022

We've not found the direct cause of the issue yet, but you might have interest in cockroachdb/cockroach#74329.

Thanks for the information, but it looks that neither of those flags will not resolve the problem with the redirect.

@udnay
Copy link

udnay commented Mar 1, 2022

@sergeyshaykhullin can you connect to the web ui from inside the cluster if you do portforwarding and bypass the nginx ingress?

I assume you have something like
external request (https) -> nginx ingress (tls termination) -> web ui (http) and the redirect loop is happening at the CRDB side?

@udnay
Copy link

udnay commented Mar 1, 2022

@mihneastaub or @sergeyshaykhullin can you also provide your k8s nginx annotations?

@mihneastaub
Copy link

@udnay I'm using istio gateway for the ingress controller.

@sergeyshaykhullin
Copy link
Contributor Author

metadata:
    annotations:
      kubernetes.io/tls-acme: "true"
      meta.helm.sh/release-name: crdb
      meta.helm.sh/release-namespace: crdb
      nginx.ingress.kubernetes.io/auth-signin: https://$host/oauth2/start?rd=$escaped_request_uri
      nginx.ingress.kubernetes.io/auth-url: https://$host/oauth2/auth
      nginx.ingress.kubernetes.io/service-upstream: "true"

@pseudomuto pseudomuto transferred this issue from cockroachdb/cockroach Mar 3, 2022
@pseudomuto
Copy link
Contributor

While starting to take a look into this, I was able to reproduce this issue using the following steps. I figured these steps could be helpful if anyone else is interested in verifying the issue or playing around with different config settings.

Start by creating a few files we'll use for getting everything set up.

Support files for testing
# hack/issuer.yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: [email protected]
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
     - http01:
         ingress:
           class: nginx
# hack/values.yaml
ingress:
  enabled: true
  annotations:
    cert-manager.io/issuer: letsencrypt-staging
    kubernetes.io/ingress.class: nginx
  hosts:
    - crdb.example.com
  tls:
    - hosts: [crdb.example.com]
      secretName: crdb-example-com-tls
# hack/exec (don't forget to chmod +x this file)
#!/usr/bin/env bash
set -euo pipefail

CLUSTER="${CLUSTER:-"$(whoami)-test"}"
REGION="${REGION:-us-east1}"

main() {
  case "${1:-}" in
    create-cluster) create_cluster;;
    delete-cluster) delete_cluster;;
    apply)
      # ensure we've got the necessary binaries
      make bin/helm bin/kubectl
      install_cert_manager
      install_ingress_nginx
      install_chart;;
    *)
      echo "Unknown command '${1:-}'." 1>&2
      echo "Usage: ${0} <command>"
      echo "  commands:"
      echo "    create-cluster    create a GKE cluster for testing"
      echo "    delete-cluster    delete the GKE cluster"
      echo "    apply             (idempotently) install cert-manager, ingress-nginx, and this chart"
      exit 1;;
  esac
}

create_cluster() {
  println "Creating GKE cluster ${CLUSTER}..."
  gcloud container clusters create "${CLUSTER}" --region "${REGION}" --num-nodes 1
}

delete_cluster() {
  println "Deleting GKE cluster ${CLUSTER}..."
  gcloud container clusters delete "${CLUSTER}" --region "${REGION}"
}

install_chart() {
  println "Installing local cockroachdb chart..."
  bin/helm upgrade -f hack/values.yaml --install crdb ./cockroachdb
}

install_ingress_nginx() {
  println "Installing ingress-nginx..."
  bin/helm upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx --create-namespace

  println "Waiting for ingress controller to be available..."
 bin/kubectl wait --namespace ingress-nginx \
    --for=condition=ready pod \
    --selector=app.kubernetes.io/component=controller \
    --timeout=90s
}

install_cert_manager() {
  println "Installing cert-manager..."
  bin/helm upgrade --install cert-manager cert-manager \
    --repo https://charts.jetstack.io \
    --namespace cert-manager \
    --create-namespace \
    --set installCRDs=true

  println "Creating letencrypt issuer..."
  bin/kubectl apply -f hack/issuer.yaml
}

println() {
  echo -e "\033[36m${1}\033[0m"
}

main "$@"

You'll need a working K8s cluster. If you don't have one you can run hack/exec create-cluster to create a GKE cluster. This assumes you've got gcloud installed and configured with an account that can create/manage GKE clusters. The cluster can be deleted with hack/exec delete-cluster when you're done.

Install cert-manager, ingress-nginx, and the cockroachdb chart by running hack/exec apply. This can be re-run as needed (idempotent) which can be useful if you change the values or issuer files and want to upgrade the charts.

With everything set up, get a hold of the external IP for the ingress object:

bin/kubectl get ing crdb-cockroachdb-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

cURL the ingress and verify you get an error about maximum redirects reached:

curl -kL --resolve crdb.example.com:443:<LB_IP_FROM_ABOVE> https://crdb.example.com
> curl: (47) Maximum (50) redirects followed

@dhartunian
Copy link

I think this behavior is due to this snippet below which redirects all traffic on the HTTP port to HTTPS except for the healthcheck. @pseudomuto I'm guessing /health resolves correctly in your example.

Maybe what we should do is inspect X-Forwarded-Proto and disable redirect when it's set to https. What do you think @udnay @pseudomuto ? (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-Proto, I assume nginx sets this)

https://github.com/cockroachdb/cockroach/blob/946b9c545127557bc8a6e1e05905a85db5095e7f/pkg/server/server_http.go#L222-L238

@nv30
Copy link

nv30 commented Mar 30, 2022

Is there any workaround for this problem?

@Djankaa
Copy link

Djankaa commented May 10, 2022

Any progress here?

@HcgRandon
Copy link

Waiting to see a solution for this? A lot of people use ingresses...

@HcgRandon
Copy link

HcgRandon commented May 19, 2022

Heads up for others; If your ingress is nginx based you can add the nginx.ingress.kubernetes.io/backend-protocol: HTTPS annotation to the ingress definition and this will fix the issue.

Found this solution here: https://stackoverflow.com/questions/54459015/how-to-configure-ingress-to-direct-traffic-to-an-https-backend-using-https/54459898#54459898

@Djankaa
Copy link

Djankaa commented May 19, 2022

@HcgRandon
Works perfectly! Thank you!

@ghost
Copy link

ghost commented May 25, 2022

NOPE STILL DOESNT WORK

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    kubernetes.io/ingress.class: traefik
    meta.helm.sh/release-name: cockroachdb
    meta.helm.sh/release-namespace: whocares-prod
    traefik.ingress.kubernetes.io/service.serversscheme: https
  labels:
    app.kubernetes.io/instance: cockroachdb
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cockroachdb
    helm.sh/chart: cockroachdb-7.0.3
  name: cockroachdb-ingress
  namespace: whocares-prod
spec:
  rules:
  - host: some.hostname
    http:
      paths:
      - backend:
          service:
            name: cockroachdb-public
            port:
              number: 8080
        path: /
        pathType: Prefix
status:
  loadBalancer: {}

@JonatanCruz
Copy link

JonatanCruz commented Jun 14, 2022

@ghost
you can try add the ingress configuration in operator deploy file

apiVersion: crdb.cockroachlabs.com/v1alpha1
kind: CrdbCluster
metadata:
  # this translates to the name of the statefulset that is created
  name: cockroachdb
spec:
  dataStore:
    pvc:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "20Gi"
        volumeMode: Filesystem
  resources:
    requests:
      # This is intentionally low to make it work on local kind clusters.
      cpu: 2
      memory: 4Gi
    limits:
      cpu: 4
      memory: 8Gi
  tlsEnabled: true
# You can set either a version of the db or a specific image name
# cockroachDBVersion: v21.2.8
  image:
    name: cockroachdb/cockroach:v22.1.1
  # nodes refers to the number of crdb pods that are created
  # via the statefulset
  nodes: 3
  additionalLabels:
    crdb: is-cool
  # affinity is a new API field that is behind a feature gate that is
  # disabled by default.  To enable please see the operator.yaml file.

  # The affinity field will accept any podSpec affinity rule.
  # affinity:
  #   podAntiAffinity:
  #      preferredDuringSchedulingIgnoredDuringExecution:
  #      - weight: 100
  #        podAffinityTerm:
  #          labelSelector:
  #            matchExpressions:
  #            - key: app.kubernetes.io/instance
  #              operator: In
  #              values:
  #              - cockroachdb
  #          topologyKey: kubernetes.io/hostname

  # nodeSelectors used to match against
  # nodeSelector:
  #   worker-pool-name: crdb-workers
  
  # Ingress for access the services
  ingress:
    ui:
      ingressClassName: nginx
      annotations:
        nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" <-- add here
      host: some.hostname

@rajavarapusaiprasanth-hpe

has anyone tried using path based routing instead of host based routing? I am facing issue on /api/v2/sql and /uiconfig when using a different path instead of /

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/rewrite-target: /$2
  name: cockroachdb-ingress
  namespace: cockroach
spec:
  rules:
  - host: shared.host.name
    http:
      paths:
      - backend:
          service:
            name: croach-cockroachdb-public
            port:
              number: 8080
        path: /cockroach(/|$)(.*)
        pathType: ImplementationSpecific

@udnay
Copy link

udnay commented Jul 24, 2023

@prafull01 can you triage?

@prafull01
Copy link
Collaborator

@himanshu-cockroach Can you please have a look

@himanshu-cockroach
Copy link
Contributor

@rajavarapusaiprasanth-hpe The path based routing is not working currently because of this issue: cockroachdb/cockroach#91429. since all the other paths including UI dashboard, other than the absolute paths which are being redirected to via the code, its working through path based routing, we can close this thread.

dhartunian added a commit to dhartunian/cockroach that referenced this issue Aug 29, 2023
This commit removes all prefix `/` characters from request paths in
the DB Console and Cluster UI codebases. This ensures that if the
DB Console is proxied at a subpath the requests continue to work as
expected and are relative to the correct base path.

Resolves: cockroachdb/helm-charts#228
Resolves: cockroachdb#91429

Epic: None

Release note (ops change, ui change): The DB Console now constructs
client-side requests using relative URLs instead of absolute ones.
This enables proxying of the DB Console at arbitrary subpaths.
dhartunian added a commit to dhartunian/cockroach that referenced this issue Aug 29, 2023
This commit removes all prefix `/` characters from request paths in
the DB Console and Cluster UI codebases. This ensures that if the
DB Console is proxied at a subpath the requests continue to work as
expected and are relative to the correct base path.

Resolves: cockroachdb/helm-charts#228
Resolves: cockroachdb#91429

Epic: None

Release note (ops change, ui change): The DB Console now constructs
client-side requests using relative URLs instead of absolute ones.
This enables proxying of the DB Console at arbitrary subpaths.
craig bot pushed a commit to cockroachdb/cockroach that referenced this issue Sep 8, 2023
108824: sql: implement datetime builtins r=rafiss,chrisseto a=annrpom

Previously, `make_date`, `make_timestamp`, and `make_timestamptz`
built-ins were not implemented. In addition, a new signature for
`date_trunc` was not implemented. This was inadequate because it caused
an incompatibility issue for tools that needed these datetime functions.
To address this, this patch adds said datetime built-ins. Note that behaviors
do not align with postgres when negative years are inpits, as we adhere
to ISO 8601 standard.

Epic: none
Fixes: #108448

Release note (sql change): Datetime built-ins (make_date,
make_timestamp, and make_timestamptz) are implemented - allowing for
the creation of timestamps, timestamps with time zones, and dates. In
addition, date_trunc now allows for a timestamp to be truncated in a
provided timezone (to a provided precision).

109346: kvcoord: Pace rangefeed client goroutine creation r=miretskiy a=miretskiy

Acquire catchup scan quota prior to goroutine creation in order to pace the goroutine creation rate.

This change results in nice and smooth growth in
goroutine count, thus reducing the pressure on goroutine scheduler, which in turn reduces the impact on SQL latency during changefeed startup.

This change also improves observability in rangefeed client by introducing new counters:
 * `distsender.rangefeed.retry.<reason>`: counter keeping track of the number of ranges that ecountered a retryable error of particular type (e.g. slow counsumer, range split, etc).

Observability also enhanced by adding a column to
 `crdb_internal.active_rangefeed` virtual table augment to indicate
if the range is currently in catchup scan mode.

Fixes #98842

Release note (enterprise change): Pace rangefeed goroutine creation
rate to improve scheduler latency.  Improve observability by adding
additional metrics indicating the reason for rangefeed restart
as well as additional column in the `crdb_internal.active_rangefeed`
table to indicate if the range is currently in catchup scan mode.

109694: ui: all xhr paths from db console are now relative r=santamaura,zachlite,j82w a=dhartunian

This commit removes all prefix `/` characters from request paths in the DB Console and Cluster UI codebases. This ensures that if the DB Console is proxied at a subpath the requests continue to work as expected and are relative to the correct base path.

Resolves: cockroachdb/helm-charts#228
Resolves: #91429

Epic CRDB-21265

Release note (ops change, ui change): The DB Console now constructs client-side requests using relative URLs instead of absolute ones. This enables proxying of the DB Console at arbitrary subpaths.

109774: sql,keys: disregard checks in *ZoneConfig*Batch r=rafiss a=annrpom

Previously, a bug occured where a transactional schema change,
 `ALTER RANGE default CONFIGURE ZONE...`, statement would
produce a CPut failure, even though both statements do take effect.
This was due to a guard in the code that blocked us from adding the
first zone config to the uncommitted cache, causing the expValues
being updated to the KV batch to be the same for both statements.
This patch addresses the issue by removing the check and allowing
for `default` and psuedotables (like system ranges) to be added to
the uncommitted cache.

Epic: none
Release note (bug fix): two `ALTER RANGE default CONFIGURE ZONE` 
statements on the same line no longer displays an error. 
Fixes: #108253

110250: changefeedccl: Fix data race in lagging spans metric r=miretskiy a=miretskiy

Fix a race bug in lagging spans metric.

Fixes: #110235

Release note: None

Co-authored-by: Annie Pompa <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: David Hartunian <[email protected]>
dhartunian added a commit to dhartunian/cockroach that referenced this issue Sep 12, 2023
This commit removes all prefix `/` characters from request paths in
the DB Console and Cluster UI codebases. This ensures that if the
DB Console is proxied at a subpath the requests continue to work as
expected and are relative to the correct base path.

Resolves: cockroachdb/helm-charts#228
Resolves: cockroachdb#91429

Epic: None

Release note (ops change, ui change): The DB Console now constructs
client-side requests using relative URLs instead of absolute ones.
This enables proxying of the DB Console at arbitrary subpaths.
dhartunian added a commit to dhartunian/cockroach that referenced this issue Sep 12, 2023
This commit removes all prefix `/` characters from request paths in
the DB Console and Cluster UI codebases. This ensures that if the
DB Console is proxied at a subpath the requests continue to work as
expected and are relative to the correct base path.

Resolves: cockroachdb/helm-charts#228
Resolves: cockroachdb#91429

Epic: None

Release note (ops change, ui change): The DB Console now constructs
client-side requests using relative URLs instead of absolute ones.
This enables proxying of the DB Console at arbitrary subpaths.
dhartunian added a commit to cockroachdb/cockroach that referenced this issue Oct 4, 2023
This commit removes all prefix `/` characters from request paths in
the DB Console and Cluster UI codebases. This ensures that if the
DB Console is proxied at a subpath the requests continue to work as
expected and are relative to the correct base path.

Resolves: cockroachdb/helm-charts#228
Resolves: #91429

Epic: None

Release note (ops change, ui change): The DB Console now constructs
client-side requests using relative URLs instead of absolute ones.
This enables proxying of the DB Console at arbitrary subpaths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment