Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DPE-1794 Implement COS integration #93

Merged
merged 40 commits into from
Mar 19, 2024
Merged

Conversation

shayancanonical
Copy link
Contributor

@shayancanonical shayancanonical commented Nov 21, 2023

Issue

We need to introduce COS support for MySQLRouter

Solution

  1. Introduce COS support for MySQLRouter
  2. Introduce the scaffolding for secrets to be able to store the monitoring password in the juju secrets backend

Prerequisites

  1. We need to merge the snap PR that adds the COS machinery to the snap. After, we need to update the snap revision in the snap.py file
  2. We need to import the secrets lib after the following secrets PR in data-platform-libs is merged

TODO

  1. Determine why the All Connections Information and Route byte to/from server sections on the dashboard are not populated when more than 1 routers are deployed (see DPE-1794 Implement COS integration #93 (comment))
  2. Test router log collection (via Loki) when the following issue is resolved (see DPE-1794 Implement COS integration #93 (comment))

Demo

image

Copy link
Contributor

@carlcsaposs-canonical carlcsaposs-canonical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the purpose of using Juju secrets/are they necessary?

@shayancanonical
Copy link
Contributor Author

I believe that there are small things which result in using juju secrets to store the monitoring password a good idea:

  1. What if the admin wants to retrieve the monitoring password in order to use the password manually? We would have no way of implementing the get-password user=monitoring action
  2. We would need to chain the ephemeral password through a bunch of functions in a stateless approach (which is a minor maintainability nit)

Copy link
Contributor

@carlcsaposs-canonical carlcsaposs-canonical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(still reviewing, haven't looked at most of it yet)

pyproject.toml Outdated Show resolved Hide resolved
src/snap.py Outdated Show resolved Hide resolved
src/snap.py Show resolved Hide resolved
src/utils.py Outdated Show resolved Hide resolved
src/utils.py Outdated Show resolved Hide resolved
src/utils.py Outdated Show resolved Hide resolved
lib/charms/data_platform_libs/v0/data_secrets.py Outdated Show resolved Hide resolved
metadata.yaml Outdated
Comment on lines 42 to 43
mysql-router-peers:
interface: mysql_router_peers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you name this based on what the relation is used for?

my thinking is that it's a better practice to use separate peer relations for separate things, so that you get different events for different things (e.g. upgrades uses its own peer relation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our convention has been <charm_name>_peers to store all internal charm related data in the peerbag as well as secrets. please let me know if you can think of a better name for this interface/endpoint

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think this comment went through on last review: suggest cos-rest-api-password or something similar

src/abstract_charm.py Outdated Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
src/relations/cos.py Outdated Show resolved Hide resolved
src/snap.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
src/relations/cos.py Outdated Show resolved Hide resolved
src/relations/cos.py Show resolved Hide resolved
src/relations/cos.py Outdated Show resolved Hide resolved
@shayancanonical
Copy link
Contributor Author

@carlcsaposs-canonical @taurus-forever refactored and polished quite a bit addressing this last iteration of PR feedback

the main functional change: we will always use the rest_api_credentials file for the HTTP server. this enables the disabling/enabling of the router everytime there is a relation formed/broken with COS. instead, we will only change the credentials in this rest_api_credentials file everytime we form/break a COS relation

1 similar comment
@shayancanonical
Copy link
Contributor Author

@carlcsaposs-canonical @taurus-forever refactored and polished quite a bit addressing this last iteration of PR feedback

the main functional change: we will always use the rest_api_credentials file for the HTTP server. this enables the disabling/enabling of the router everytime there is a relation formed/broken with COS. instead, we will only change the credentials in this rest_api_credentials file everytime we form/break a COS relation

Copy link
Contributor

@paulomach paulomach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good
For integration testing, wouldn't deploying grafana-agent-k8s and relating, be enough to start the exporter and test the endpoints?

src/container.py Outdated Show resolved Hide resolved
src/relations/cos.py Show resolved Hide resolved
src/snap.py Outdated Show resolved Hide resolved
@shayancanonical
Copy link
Contributor Author

Added an integration test for the exporter endpoint (when relation formed with COS) in 05302d9

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one minor change (can be done later or skipped).
Please fire all TODOs from PR descriptions as Jira tickets.

tests/integration/test_exporter.py Outdated Show resolved Hide resolved
@shayancanonical
Copy link
Contributor Author

Confirmed that all the TODOs in the PR description are no longer relevant.

  1. Confirmed that 2 MySQLRouter applications integrated with the same COS stack results in functioning dashboards (fully populated) for both applications
  2. Confirmed that logs flowing through to loki and accessible through COS grafana for both MySQLRouter applications

src/container.py Outdated Show resolved Hide resolved
src/container.py Outdated Show resolved Hide resolved
src/container.py Outdated Show resolved Hide resolved
src/container.py Outdated Show resolved Hide resolved
src/container.py Outdated Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
tests/integration/test_exporter.py Outdated Show resolved Hide resolved
tests/integration/test_exporter.py Show resolved Hide resolved
src/snap.py Outdated Show resolved Hide resolved
src/container.py Outdated Show resolved Hide resolved
Comment on lines 315 to 387
workload_.reconcile(
tls=self._tls_certificate_saved,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is used in k8s charm & abstract_charm.py is shared with k8s charm

looking at k8s charm, looks like there might be a bug with self._tls_certificate_saved not beginning with underscore in kubernetes_charm.py

src/workload.py Outdated
*,
container_: container.Container,
logrotate_: logrotate.LogRotate,
cos_: "relations.cos.COSRelation",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cos shouldn't conflict with relations.cos (would only conflict with import cos)

am I mistaken?

src/workload.py Outdated
Comment on lines 311 to 314
if tls:
self.enable_tls() and self._container.mysql_router_service_enabled and self._restart(
tls=tls
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo, it would be more robust to have tls input on reconcile and call reconcile from TLS file

so that everything uses the reconcile approach instead of an implicit dependency in the reconcile loop on the event-based tls enable/disable

src/relations/cos.py Outdated Show resolved Hide resolved
import ops
from charms.grafana_agent.v0.cos_agent import COSAgentProvider

import abstract_charm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this cause circular import?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated. it indeed caused a circular import

Comment on lines 43 to 57
self.peer_relation_app = data_interfaces.DataPeer(
self,
relation_name=self._PEER_RELATION_NAME,
secret_field_name=self._SECRET_INTERNAL_LABEL,
deleted_label=self._SECRET_DELETED_LABEL,
)
self.peer_relation_unit = data_interfaces.DataPeerUnit(
self,
relation_name=self._PEER_RELATION_NAME,
additional_secret_fields=[
relations.cos.MONITORING_PASSWORD_KEY,
],
secret_field_name=self._SECRET_INTERNAL_LABEL,
deleted_label=self._SECRET_DELETED_LABEL,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason this is in machine_charm.py instead of abstract_charm.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated: moved to cos file

Comment on lines 33 to 34
# DEPRECATED shared-db: Enable legacy "mysql-shared" interface
del self._database_provides
self._database_provides = relations.database_providers_wrapper.RelationEndpoint(self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed due to the following error: https://github.com/canonical/mysql-router-operator/actions/runs/8233518223/job/22513139520

As far as I can tell, we are instantiating _database_provides.RelationEndpoint twice with the same unique identifier. I remove the variable declaration in the super class, and we will ensure that both the machine_charm and k8s_charm files define it exactly once

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did the test failure recently appear? I don't think I've seen it before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it appeared when i upgraded the data_interfaces charm lib

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems like it might be a bug in the lib; suggest reporting

if that bug won't get fixed in the lib, would recommend using self._database_provides as an argument to __init__ for database_providers_wrapper.RelationEndpoint so that abstract_charm.py can remain unchanged (and typed correctly)

metadata.yaml Outdated
Comment on lines 42 to 43
mysql-router-peers:
interface: mysql_router_peers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think this comment went through on last review: suggest cos-rest-api-password or something similar

Comment on lines 43 to 48
self.peer_relation_app = data_interfaces.DataPeer(
self,
relation_name=self._PEER_RELATION_NAME,
secret_field_name=self._SECRET_INTERNAL_LABEL,
deleted_label=self._SECRET_DELETED_LABEL,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are app secrets used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

app secrets is not used at the moment, but it is needed for the proven get_secret() and set_secret() methods. the code will soon be replaced when new charm structure is adopted, so not worth tinkering much with it. this does not create an app scoped secrets. just allows storing app related secrets if we want to in the future

src/abstract_charm.py Outdated Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
src/constants.py Outdated Show resolved Hide resolved
src/snap.py Outdated Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
@@ -107,6 +97,31 @@ def _read_write_endpoint(self) -> str:
def _read_only_endpoint(self) -> str:
"""MySQL Router read-only endpoint"""

@property
def tls_certificate_saved(self) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be private method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to #93 (comment)

Comment on lines 106 to 116
@property
def tls_key(self) -> str:
"""Custom TLS key"""
# TODO VM TLS: Remove property after implementing TLS on machine_charm
return None

@property
def tls_certificate(self) -> str:
"""Custom TLS certificate"""
# TODO VM TLS: Remove property after implementing TLS on machine_charm
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not in this pr but when this lands in k8s, I think these should be moved to tls.py

similar to https://github.com/canonical/mysql-router-k8s-operator/blob/d43a43481b7f966db340fa994a93cb18cca592dd/src/kubernetes_charm.py#L53

src/abstract_charm.py Outdated Show resolved Hide resolved
Comment on lines +85 to +89
@property
@abc.abstractmethod
def _cos(self) -> relations.cos.COSRelation:
"""COS"""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is relations/cos.py vm specific?

if not, this method shouldn't be abstract

if so, the type here should be an abstract implementation, not a concrete implementation (feel free to ignore for now, but keep in mind for k8s)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to #93 (comment)

src/workload.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
@shayancanonical
Copy link
Contributor Author

Thanks for the reviews! Merging, and will iterate on further improvements with follow up PRs

@shayancanonical shayancanonical merged commit a7c99af into main Mar 19, 2024
20 checks passed
@shayancanonical shayancanonical deleted the feature/cos_integration branch March 19, 2024 08:50
shayancanonical added a commit to canonical/mysql-router-k8s-operator that referenced this pull request Apr 1, 2024
## Issue
We need to introduce support for COS
Fixes #218

## Solution
1. Introduce support for COS (ported from [PR in MySQLRouter
VM](canonical/mysql-router-operator#93))
2. Fix all TLS databag usage to instead use secrets (using
data_interfaces charm lib)

## TODO
File bug regarding intermittent issue where router exporter does not
start up cleanly (due to bind-address in use) which auto-resolves due to
pebble retries.

## Demo
Grafana dashboard:

![image](https://github.com/canonical/mysql-router-k8s-operator/assets/99665202/13ca1ebb-5c34-45dd-9c59-3a80e5f166ca)

Juju model tested along with TLS certificates operator:

![image](https://github.com/canonical/mysql-router-k8s-operator/assets/99665202/3af37174-899a-40a9-8d2d-7c9a0170ce55)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants