DPE-1794 Implement COS integration #93

Merged 40 commits on Mar 19, 2024
Changes from 5 commits
Commits
84af432
WIP: Initial implementation of COS with a lot of rough edges
shayancanonical Nov 20, 2023
a71cbd1
WIP: rough working prototype of cos integration
shayancanonical Nov 21, 2023
fb501f7
Replace secrets charm lib with published one
shayancanonical Nov 27, 2023
26c99d4
Merge branch 'main' into feature/cos_integration
shayancanonical Nov 27, 2023
bb7f341
WIP
shayancanonical Nov 27, 2023
79a5d90
Fix bug where credentials were passed to router options instead of th…
shayancanonical Nov 28, 2023
efea884
Address PR feedback
shayancanonical Dec 1, 2023
4a20fa7
Address PR feedback + update data_interfaces and cos_agent charm libs
shayancanonical Dec 1, 2023
927cffe
Merge branch 'main' into feature/cos_integration
shayancanonical Dec 1, 2023
05302d9
Add integration test for the exporter endpoint
shayancanonical Dec 4, 2023
feaa8ad
Increase timeout for exporter tests to avoid unnecessary timeout erro…
shayancanonical Dec 4, 2023
426bddf
Skip exporter tests on focal
shayancanonical Dec 4, 2023
88da3a9
Delete extra ) from CI workflow file
shayancanonical Dec 4, 2023
7a241ac
Try removing curly braces from integration test condition in ci.yaml
shayancanonical Dec 4, 2023
5cb2d56
Try using single quotes for matrix variable values in integration tes…
shayancanonical Dec 4, 2023
743604a
Try adding curly braces back + use single quotes instead of double qu…
shayancanonical Dec 4, 2023
0418ba5
Try to exclude matrix entry for exporter tests in focal
shayancanonical Dec 5, 2023
234bf0b
Add comment explaining why we skip exporter tests on focal
shayancanonical Dec 5, 2023
44f00cf
Import latest version of data_secrets lib and test focal compatibilit…
shayancanonical Mar 4, 2024
8027248
Merge branch 'main' into feature/cos_integration
shayancanonical Mar 4, 2024
3235040
Update outdated charm libs
shayancanonical Mar 4, 2024
b215877
Specify series for subordinate charms to avoid conflict with principa…
shayancanonical Mar 5, 2024
6689a35
Merge branch 'main' into feature/cos_integration
shayancanonical Mar 5, 2024
b16525e
Fix typos and issues from merge conflict resolution
shayancanonical Mar 6, 2024
a7e6c77
Reconcile all workloads (router, exporter, tls) in one method
shayancanonical Mar 7, 2024
908a011
Run code format
shayancanonical Mar 7, 2024
8ef349b
Address minor PR feedback
shayancanonical Mar 11, 2024
b0cba95
Update outdated charm libs
shayancanonical Mar 11, 2024
1776402
Address PR feedback
shayancanonical Mar 13, 2024
9d03646
Minor leftover improvements + fix bugs
shayancanonical Mar 13, 2024
d45df31
Fix typo in method call
shayancanonical Mar 13, 2024
635d807
Fix bugs + remove usage of data_secrets and use dynamic usage of secr…
shayancanonical Mar 14, 2024
d5b41ce
Another round of feedback + move abstracted secrets code to a separat…
shayancanonical Mar 14, 2024
bdb40a9
Address PR feedback
shayancanonical Mar 15, 2024
ade2bff
Leftovers cleanup
shayancanonical Mar 15, 2024
917d965
Fix bug introduced during refactor
shayancanonical Mar 15, 2024
6e76961
Address feedback + make database test more resilient by using block_u…
shayancanonical Mar 15, 2024
50044f0
Avoid using reconcile_services wrapper in workload
shayancanonical Mar 18, 2024
9665f3e
Make properties in abstract charm private as they will not be invoked…
shayancanonical Mar 18, 2024
7e49cd7
Instantiate database_provides in abstract_charm for type hinting
shayancanonical Mar 18, 2024
889 changes: 598 additions & 291 deletions lib/charms/data_platform_libs/v0/data_interfaces.py

Large diffs are not rendered by default.

144 changes: 0 additions & 144 deletions lib/charms/data_platform_libs/v0/data_secrets.py

This file was deleted.

103 changes: 45 additions & 58 deletions lib/charms/grafana_agent/v0/cos_agent.py
@@ -211,14 +211,14 @@ def __init__(self, *args):
from collections import namedtuple
from itertools import chain
from pathlib import Path
from typing import TYPE_CHECKING, Any, Callable, ClassVar, Dict, List, Optional, Set, Union
from typing import TYPE_CHECKING, Any, Callable, ClassVar, Dict, List, Optional, Set, Tuple, Union

import pydantic
from cosl import GrafanaDashboard, JujuTopology
from cosl.rules import AlertRules
from ops.charm import RelationChangedEvent
from ops.framework import EventBase, EventSource, Object, ObjectEvents
from ops.model import Relation, Unit
from ops.model import Relation
from ops.testing import CharmType

if TYPE_CHECKING:
@@ -234,7 +234,7 @@ class _MetricsEndpointDict(TypedDict):

LIBID = "dc15fa84cef84ce58155fb84f6c6213a"
LIBAPI = 0
LIBPATCH = 7
LIBPATCH = 8

PYDEPS = ["cosl", "pydantic < 2"]

@@ -258,7 +258,9 @@ class CosAgentProviderUnitData(pydantic.BaseModel):
metrics_alert_rules: dict
log_alert_rules: dict
dashboards: List[GrafanaDashboard]
subordinate: Optional[bool]
# subordinate is no longer used but we should keep it until we bump the library to ensure
# we don't break compatibility.
subordinate: Optional[bool] = None

# The following entries may vary across units of the same principal app.
# this data does not need to be forwarded to the gagent leader
@@ -277,9 +279,9 @@ class CosAgentPeersUnitData(pydantic.BaseModel):
# We need the principal unit name and relation metadata to be able to render identifiers
# (e.g. topology) on the leader side, after all the data moves into peer data (the grafana
# agent leader can only see its own principal, because it is a subordinate charm).
principal_unit_name: str
principal_relation_id: str
principal_relation_name: str
unit_name: str
relation_id: str
relation_name: str

# The only data that is forwarded to the leader is data that needs to go into the app databags
# of the outgoing o11y relations.
@@ -299,7 +301,7 @@ def app_name(self) -> str:
TODO: Switch to using `model_post_init` when pydantic v2 is released?
https://github.com/pydantic/pydantic/issues/1729#issuecomment-1300576214
"""
return self.principal_unit_name.split("/")[0]
return self.unit_name.split("/")[0]


class COSAgentProvider(Object):
@@ -375,7 +377,6 @@ def _on_refresh(self, event):
dashboards=self._dashboards,
metrics_scrape_jobs=self._scrape_jobs,
log_slots=self._log_slots,
subordinate=self._charm.meta.subordinate,
)
relation.data[self._charm.unit][data.KEY] = data.json()
except (
@@ -468,12 +469,6 @@ class COSAgentRequirerEvents(ObjectEvents):
validation_error = EventSource(COSAgentValidationError)


class MultiplePrincipalsError(Exception):
"""Custom exception for when there are multiple principal applications."""

pass


class COSAgentRequirer(Object):
"""Integration endpoint wrapper for the Requirer side of the cos_agent interface."""

@@ -559,13 +554,13 @@ def _on_relation_data_changed(self, event: RelationChangedEvent):
if not (provider_data := self._validated_provider_data(raw)):
return

# Copy data from the principal relation to the peer relation, so the leader could
# Copy data from the cos_agent relation to the peer relation, so the leader could
# follow up.
# Save the originating unit name, so it could be used for topology later on by the leader.
data = CosAgentPeersUnitData( # peer relation databag model
principal_unit_name=event.unit.name,
principal_relation_id=str(event.relation.id),
principal_relation_name=event.relation.name,
unit_name=event.unit.name,
relation_id=str(event.relation.id),
relation_name=event.relation.name,
metrics_alert_rules=provider_data.metrics_alert_rules,
log_alert_rules=provider_data.log_alert_rules,
dashboards=provider_data.dashboards,
@@ -592,39 +587,7 @@ def trigger_refresh(self, _):
self.on.data_changed.emit() # pyright: ignore

@property
def _principal_unit(self) -> Optional[Unit]:
"""Return the principal unit for a relation.

Assumes that the relation is of type subordinate.
Relies on the fact that, for subordinate relations, the only remote unit visible to
*this unit* is the principal unit that this unit is attached to.
"""
if relations := self._principal_relations:
# Technically it's a list, but for subordinates there can only be one relation
principal_relation = next(iter(relations))
if units := principal_relation.units:
# Technically it's a list, but for subordinates there can only be one
return next(iter(units))

return None

@property
def _principal_relations(self):
relations = []
for relation in self._charm.model.relations[self._relation_name]:
if not json.loads(relation.data[next(iter(relation.units))]["config"]).get(
["subordinate"], False
):
relations.append(relation)
if len(relations) > 1:
logger.error(
"Multiple applications claiming to be principal. Update the cos-agent library in the client application charms."
)
raise MultiplePrincipalsError("Multiple principal applications.")
return relations

@property
def _remote_data(self) -> List[CosAgentProviderUnitData]:
def _remote_data(self) -> List[Tuple[CosAgentProviderUnitData, JujuTopology]]:
"""Return a list of remote data from each of the related units.

Assumes that the relation is of type subordinate.
@@ -641,7 +604,15 @@ def _remote_data(self) -> List[CosAgentProviderUnitData]:
continue
if not (provider_data := self._validated_provider_data(raw)):
continue
all_data.append(provider_data)

topology = JujuTopology(
model=self._charm.model.name,
model_uuid=self._charm.model.uuid,
application=unit.app.name,
unit=unit.name,
)

all_data.append((provider_data, topology))

return all_data
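
A note on the new return shape: `_remote_data` now yields `(provider_data, topology)` pairs, so every caller has to unpack two values. The sketch below shows that pattern in isolation. `UnitData` and `remote_data` are hypothetical stand-ins (not the library's classes), the topology is a plain dict rather than a `JujuTopology`, and the application name is derived from the unit name by splitting on `/`, as `app_name` does elsewhere in this diff.

```python
from typing import Dict, List, NamedTuple, Tuple


class UnitData(NamedTuple):
    """Hypothetical stand-in for CosAgentProviderUnitData."""

    log_slots: List[str]


def remote_data(raw: Dict[str, List[str]], model: str) -> List[Tuple[UnitData, Dict[str, str]]]:
    """Pair each unit's data with a topology-like dict built from the unit name."""
    pairs = []
    for unit_name, slots in raw.items():
        topology = {
            "model": model,
            "application": unit_name.split("/")[0],
            "unit": unit_name,
        }
        pairs.append((UnitData(log_slots=slots), topology))
    return pairs


# Hypothetical values, for illustration only.
pairs = remote_data({"mysql-router/0": ["mysql-router:logs"]}, model="dev")

# Callers that only need the data discard the topology with `_`.
slots = [slot for data, _ in pairs for slot in data.log_slots]
```

Callers that need the labels (such as `metrics_jobs`) keep both elements; callers that do not (such as `snap_log_endpoints`) bind the second to `_`.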

@@ -711,7 +682,7 @@ def metrics_alerts(self) -> Dict[str, Any]:
def metrics_jobs(self) -> List[Dict]:
"""Parse the relation data contents and extract the metrics jobs."""
scrape_jobs = []
for data in self._remote_data:
for data, topology in self._remote_data:
for job in data.metrics_scrape_jobs:
# In #220, relation schema changed from a simplified dict to the standard
# `scrape_configs`.
@@ -727,6 +698,22 @@ def metrics_jobs(self) -> List[Dict]:
"tls_config": {"insecure_skip_verify": True},
}

# Apply labels to the scrape jobs
for static_config in job.get("static_configs", []):
topo_as_dict = topology.as_dict(excluded_keys=["charm_name"])
static_config["labels"] = {
# Be sure to keep labels from static_config
**static_config.get("labels", {}),
# TODO: We should add a new method in juju_topology.py
# that like `as_dict` method, returns the keys with juju_ prefix
# https://github.com/canonical/cos-lib/issues/18
**{
"juju_{}".format(key): value
for key, value in topo_as_dict.items()
if value
},
}

scrape_jobs.append(job)

return scrape_jobs
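
To make the label merge in this hunk easier to review, here is a minimal standalone sketch of the same idea. It assumes the topology is already a plain dict (standing in for `topology.as_dict(excluded_keys=["charm_name"])`); `apply_topology_labels` and the sample values are hypothetical and not part of the library.

```python
from typing import Any, Dict


def apply_topology_labels(job: Dict[str, Any], topology: Dict[str, str]) -> Dict[str, Any]:
    """Add juju_-prefixed topology labels to every static_config in a scrape job."""
    # Empty topology values are skipped, mirroring the `if value` filter in the diff.
    juju_labels = {f"juju_{key}": value for key, value in topology.items() if value}
    for static_config in job.get("static_configs", []):
        # Existing labels are unpacked first, so a topology label wins on a key clash.
        static_config["labels"] = {**static_config.get("labels", {}), **juju_labels}
    return job


# Hypothetical values, for illustration only.
topology = {
    "model": "dev",
    "model_uuid": "",
    "application": "mysql-router",
    "unit": "mysql-router/0",
}
job = {"static_configs": [{"targets": ["localhost:9152"], "labels": {"env": "test"}}]}
labelled = apply_topology_labels(job, topology)
```

Because the existing labels are unpacked before the `juju_` ones, a label already named, say, `juju_unit` in a `static_config` would be overwritten by the topology value; any other existing label is preserved.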
@@ -735,7 +722,7 @@ def metrics_jobs(self) -> List[Dict]:
def snap_log_endpoints(self) -> List[SnapEndpoint]:
"""Fetch logging endpoints exposed by related snaps."""
plugs = []
for data in self._remote_data:
for data, _ in self._remote_data:
targets = data.log_slots
if targets:
for target in targets:
@@ -775,7 +762,7 @@ def logs_alerts(self) -> Dict[str, Any]:
model=self._charm.model.name,
model_uuid=self._charm.model.uuid,
application=app_name,
# For the topology unit, we could use `data.principal_unit_name`, but that unit
# For the topology unit, we could use `data.unit_name`, but that unit
# name may not be very stable: `_gather_peer_data` de-duplicates by app name so
# the exact unit name that turns up first in the iterator may vary from time to
# time. So using the grafana-agent unit name instead.
@@ -808,9 +795,9 @@ def dashboards(self) -> List[Dict[str, str]]:

dashboards.append(
{
"relation_id": data.principal_relation_id,
"relation_id": data.relation_id,
# We have the remote charm name - use it for the identifier
"charm": f"{data.principal_relation_name}-{app_name}",
"charm": f"{data.relation_name}-{app_name}",
"content": content,
"title": title,
}