This repository has been archived by the owner on Oct 8, 2024. It is now read-only.

Tracing protocol is HTTP even though tempo-k8s is related to self-signed-certificates #153

Closed
shayancanonical opened this issue Jul 26, 2024 · 9 comments

Comments

@shayancanonical

Bug Description

The tracing endpoint advertised to the charm uses HTTP even though tempo-k8s is related to self-signed-certificates. Furthermore, although we are using charm_tracing_config and tracing is enabled, server_cert returns None. Because tempo-k8s is serving HTTPS, every attempt by the mysql charm to export spans to tempo fails, and the charm gets stuck in the same hook handler: the handler does not return because span export keeps retrying indefinitely.

To Reproduce

In k8s model:

  1. juju deploy cos-lite --trust --channel edge
  2. juju deploy tempo-k8s --channel edge
  3. juju integrate tempo-k8s:grafana-dashboard grafana:grafana-dashboard
  4. juju integrate tempo-k8s:grafana-source grafana:grafana-source
  5. juju integrate tempo-k8s:ingress traefik:traefik-route
  6. juju integrate tempo-k8s:metrics-endpoint prometheus:metrics-endpoint
  7. juju integrate tempo-k8s:logging loki:logging
  8. juju deploy self-signed-certificates
  9. juju integrate tempo-k8s:certificates self-signed-certificates

In lxd model:

  1. check out the code from mysql-operator#465 ("Update charm tracing libs + add support for exporting traces via HTTPS")
  2. tox -e build-production
  3. juju deploy -n 1 ./mysql_ubuntu-22.04-amd64.charm
  4. juju consume uk8s:admin/cos.tempo-k8s
  5. juju consume uk8s:admin/cos.self-signed-certificates
  6. juju integrate mysql:tracing-certificates self-signed-certificates
  7. juju integrate mysql tempo-k8s

Environment

juju: 3.4.3
microk8s: MicroK8s v1.27.13 revision 6744
lxd: 6.1
ubuntu: 22.04.3 LTS

Relevant log output

...
unit-mysql-3: 15:55:47 WARNING unit.mysql/3.juju-log <class '__main__.MySQLOperatorCharm'>.tracing_server_ca_cert is None; sending traces over INSECURE connection.
unit-mysql-3: 15:55:47 INFO unit.mysql/3.juju-log Unit workload member-state is online with member-role primary
unit-mysql-3: 15:55:48 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 1s.
unit-mysql-3: 15:55:49 WARNING unit.mysql/3.juju-log Timeout was exceeded in force_flush().
unit-mysql-3: 15:55:49 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 2s.
unit-mysql-3: 15:55:51 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 4s.
unit-mysql-3: 15:55:55 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 8s.
unit-mysql-3: 15:56:03 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 16s.
unit-mysql-3: 15:56:19 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 32s.
unit-mysql-3: 15:56:52 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-mysql-3: 15:56:52 WARNING unit.mysql/3.juju-log <class '__main__.MySQLOperatorCharm'>.tracing_server_ca_cert is None; sending traces over INSECURE connection.
unit-mysql-3: 15:56:53 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 1s.
unit-mysql-3: 15:56:54 WARNING unit.mysql/3.juju-log Timeout was exceeded in force_flush().
unit-mysql-3: 15:56:54 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 2s.
unit-mysql-3: 15:56:56 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 4s.
unit-mysql-3: 15:57:00 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 8s.
unit-mysql-3: 15:57:08 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 16s.
unit-mysql-3: 15:57:24 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 32s.
unit-mysql-3: 15:57:57 WARNING unit.mysql/3.juju-log <class '__main__.MySQLOperatorCharm'>.tracing_server_ca_cert is None; sending traces over INSECURE connection.
unit-mysql-3: 15:57:57 INFO unit.mysql/3.juju-log Unit workload member-state is online with member-role primary
unit-mysql-3: 15:57:58 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 1s.
unit-mysql-3: 15:57:59 WARNING unit.mysql/3.juju-log Timeout was exceeded in force_flush().
unit-mysql-3: 15:57:59 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 2s.
unit-mysql-3: 15:58:01 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 4s.
unit-mysql-3: 15:58:05 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 8s.
unit-mysql-3: 15:58:13 WARNING unit.mysql/3.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 16s.
...
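
The retry delays in the log double on each attempt (1s, 2s, 4s, 8s, 16s, 32s). As a rough sketch, assuming the exporter simply doubles the delay and gives up after the six retries visible above (the function name and parameters are illustrative, not taken from the exporter's code), the time a single hook can be blocked adds up like this:

```python
def backoff_delays(first_delay=1, retries=6):
    """Return the delay before each retry, doubling every time.

    Both defaults are assumptions read off the log output above,
    not values taken from the exporter's source.
    """
    return [first_delay * 2**attempt for attempt in range(retries)]


delays = backoff_delays()
print(delays)       # [1, 2, 4, 8, 16, 32]
print(sum(delays))  # 63
```

That is about a minute of waiting per hook, which lines up with the gap between the first export failure (15:55:48) and the update-status hook completing (15:56:52).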

Additional context

Certificate retrieved in mysql charm:

ubuntu@juju-5ba1f4-3:~$ ls -la /var/snap/charmed-mysql/common/var/run/
total 16
drwxr-xr-x 3 snap_daemon root 4096 Jul 26 15:27 .
drwxr-xr-x 5 snap_daemon root 4096 Jul 26 15:25 ..
drwxr-xr-x 2 snap_daemon root 4096 Jul 26 15:26 mysqld
-rw-r--r-- 1 root        root 1240 Jul 26 15:27 tracing-ca.crt

databag from the perspective of mysql (note that protocol type is http):

  - relation-id: 19
    endpoint: tracing
    cross-model: true
    related-endpoint: tracing
    application-data:
      receivers: '[{"protocol": {"name": "otlp_http", "type": "http"}, "url": "http://10.0.0.44:4318"}]'
    related-units:
      tempo-k8s/0:
        in-scope: true
        data:
          egress-subnets: 10.152.183.195/32
          ingress-address: 10.152.183.195
          private-address: 10.152.183.195
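
A quick way to spot this condition from the databag is to parse the receivers value and look at each URL's scheme. The sketch below uses only the standard library; the helper name insecure_receivers is hypothetical, and the JSON string is the one shown above:

```python
import json
from urllib.parse import urlparse


def insecure_receivers(receivers_json):
    """Return the receiver URLs whose scheme is not https."""
    receivers = json.loads(receivers_json)
    return [r["url"] for r in receivers if urlparse(r["url"]).scheme != "https"]


# Value copied from the application-data shown above.
databag_value = (
    '[{"protocol": {"name": "otlp_http", "type": "http"}, '
    '"url": "http://10.0.0.44:4318"}]'
)
print(insecure_receivers(databag_value))  # ['http://10.0.0.44:4318']
```

The check looks at the url field rather than protocol.type, since the URL is what a client would actually connect to.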
@mmkay
Contributor

mmkay commented Jul 29, 2024

Quick passing thought: did you add the certificates relation to traefik too? Maybe that one was missing.

@shayancanonical
Author

@mmkay I related traefik with self-signed-certificates (on both the certificates and receive-ca-cert endpoints). I am getting the Query Error below on Grafana's Explore page. Any ideas what may have gone wrong?

image

@mmkay
Contributor

mmkay commented Jul 29, 2024

Not immediately, but you can go to Administration -> Datasources and check which URL is shown next to Tempo's datasource. In your setup I'd expect https://10.0.0.44:3200. If that is what you see, you can also try reaching this address from your grafana host by running juju ssh into it and using curl.

@lucabello
Contributor

@shayancanonical Could you verify that the certificate is being copied into the charm container of mysql-operator and that you're running update-ca-certificates afterwards?

This way, the charm should trust the passed CA.
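
As a minimal sketch of the first half of that check, the snippet below tests whether a CA file exists and can be loaded into a TLS context using only the Python standard library. The helper name ca_file_loadable is hypothetical, the path is the one from the directory listing earlier in this thread, and the check does not confirm that update-ca-certificates has been run:

```python
import ssl


def ca_file_loadable(path):
    """Return True if `path` exists and contains a certificate that
    Python's ssl module can load as a trusted CA, else False."""
    try:
        ssl.create_default_context(cafile=path)
    except OSError:
        # Covers a missing file as well as ssl.SSLError for unparsable
        # content (ssl.SSLError is a subclass of OSError).
        return False
    return True


# Path taken from the `ls -la` output in the issue description.
print(ca_file_loadable("/var/snap/charmed-mysql/common/var/run/tracing-ca.crt"))
```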

@Abuelodelanada
Contributor

Hi @shayancanonical, are you still experiencing this issue?

@shayancanonical
Author

@Abuelodelanada @lucabello I can confirm that the issue is still reproducible (even after verifying that the cert is added by running update-ca-certificates). However, we have some concerns about having to add another relation interface to our charm just to retrieve the CA from the certificates operator related to tempo-k8s. We suggest that tempo-k8s pass the CA it is using to the related database charm (MySQL in this case) via the databag.

We have a similar use case in MySQL: MySQL relates to the self-signed-certificates operator and, instead of asking the app related to MySQL to get the cert by relating to self-signed-certificates itself, MySQL passes the cert to the client app through the databag.

The reason I mention this is because doing so would greatly simplify our implementation and reduce the probability of human error.

Logs while reproducing:

unit-mysql-0: 19:49:11 INFO juju.worker.uniter.operation ran "update-status" hook (via hook dispatching script: dispatch)
unit-mysql-0: 19:49:11 WARNING unit.mysql/0.juju-log <class '__main__.MySQLOperatorCharm'>.<property object at 0x7fd85a03f060> returned None; sending traces over INSECURE connection.
unit-mysql-0: 19:49:12 WARNING unit.mysql/0.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 1s.
unit-mysql-0: 19:49:13 WARNING unit.mysql/0.juju-log Timeout was exceeded in force_flush().
unit-mysql-0: 19:49:13 WARNING unit.mysql/0.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 2s.
unit-mysql-0: 19:49:15 WARNING unit.mysql/0.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 4s.
unit-mysql-0: 19:49:19 WARNING unit.mysql/0.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 8s.
unit-mysql-0: 19:49:27 WARNING unit.mysql/0.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 16s.
unit-mysql-0: 19:49:43 WARNING unit.mysql/0.juju-log Transient error Internal Server Error encountered while exporting span batch, retrying in 32s.

@PietroPasotti
Contributor

image

Attempting to repro with:

  • juju deploy cos-lite --channel edge --trust
  • juju deploy tempo-k8s --channel edge tempo
  • jhack imatrix fill
  • juju deploy self-signed-certificates --channel edge ssc
  • jhack imatrix fill
  • offer/consume as specified by OP

Everything seems to work. The endpoint is https.
image

@PietroPasotti
Contributor

The issue was that traefik wasn't related to SSC (self-signed-certificates). Closing this and opening a different ticket instead to see if we can surface this situation better.
