Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPE-5637, DPE-5276] Implement expose-external config option with values false (clusterip), nodeport and loadbalancer #328

Merged
merged 25 commits into from
Nov 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
5931e15
WIP: Implement expose-external config option with values false (clust…
shayancanonical Oct 10, 2024
7cac8af
Manually tested expose-external config + some code cleanup
shayancanonical Oct 14, 2024
c1459f2
Fix some issues + implement integration tests for expose-external
shayancanonical Oct 15, 2024
f471813
Add missing config file
shayancanonical Oct 15, 2024
902509f
Remove redundant action return code check
shayancanonical Oct 15, 2024
d115742
Add workaround for juju client errors when patching svc to loadbalanc…
shayancanonical Oct 16, 2024
681a5c1
Merge branch 'main' into feature/expose-external
shayancanonical Oct 16, 2024
b60e5e3
Standardize abstract_charm.py with the VM charm
shayancanonical Oct 16, 2024
0c048db
Update outdated tracing charm libs
shayancanonical Oct 16, 2024
ffc728a
Address PR feedback
shayancanonical Oct 17, 2024
d76b325
Merge branch 'main' into feature/expose-external
shayancanonical Oct 17, 2024
ee7bccb
Avoid waiting for service and let further events update endpoints onc…
shayancanonical Oct 18, 2024
ab49776
Appropriately set app status when invalid expose-external value
shayancanonical Oct 18, 2024
f938351
Address PR feedback + update outdate charm tracing lib
shayancanonical Oct 21, 2024
7254856
Add check for None when declaring variable
shayancanonical Oct 21, 2024
4d33cbb
Update tests to retry connections to endpoints after it fails
shayancanonical Oct 22, 2024
c6f6869
Test against EKS + make updates to be complaint with DA122
shayancanonical Nov 22, 2024
e7a9bf1
Merge branch 'main' into feature/expose-external
shayancanonical Nov 22, 2024
4006177
Adjust scenario to include new router peer relation
shayancanonical Nov 22, 2024
887317d
Fix lint warnings
shayancanonical Nov 22, 2024
b9575c3
Revert charm tracing lib to libpatch 2
shayancanonical Nov 22, 2024
c5df314
Update TLS SANs to include DNS for reaching the unit directly
shayancanonical Nov 25, 2024
369f8b6
Add K8s service SANs (the one created by juju)
shayancanonical Nov 26, 2024
fb65030
Address PR feedback
shayancanonical Nov 26, 2024
66138e3
Update data platform workflows to v23.1.0
shayancanonical Nov 27, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ jobs:

build:
name: Build charm
uses: canonical/data-platform-workflows/.github/workflows/build_charm.yaml@v23.0.5
uses: canonical/data-platform-workflows/.github/workflows/build_charm.yaml@v23.1.0
with:
cache: true

Expand Down Expand Up @@ -96,7 +96,7 @@ jobs:
- lint
- unit-test
- build
uses: canonical/data-platform-workflows/.github/workflows/integration_test_charm.yaml@v23.0.5
uses: canonical/data-platform-workflows/.github/workflows/integration_test_charm.yaml@v23.1.0
with:
artifact-prefix: ${{ needs.build.outputs.artifact-prefix }}
architecture: ${{ matrix.architecture }}
Expand All @@ -106,5 +106,6 @@ jobs:
juju-snap-channel: ${{ matrix.juju.snap_channel }}
libjuju-version-constraint: ${{ matrix.juju.libjuju }}
_beta_allure_report: ${{ matrix.juju.allure_on_amd64 && matrix.architecture == 'amd64' }}
metallb-addon: true
permissions:
contents: write # Needed for Allure Report beta
4 changes: 2 additions & 2 deletions .github/workflows/release.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,14 +15,14 @@ jobs:

build:
name: Build charm
uses: canonical/data-platform-workflows/.github/workflows/build_charm.yaml@v23.0.5
uses: canonical/data-platform-workflows/.github/workflows/build_charm.yaml@v23.1.0

release:
name: Release charm
needs:
- ci-tests
- build
uses: canonical/data-platform-workflows/.github/workflows/release_charm.yaml@v23.0.5
uses: canonical/data-platform-workflows/.github/workflows/release_charm.yaml@v23.1.0
with:
channel: 8.0/edge
artifact-prefix: ${{ needs.build.outputs.artifact-prefix }}
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/sync_docs.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ on:
jobs:
sync-docs:
name: Sync docs from Discourse
uses: canonical/data-platform-workflows/.github/workflows/sync_docs.yaml@v23.0.5
uses: canonical/data-platform-workflows/.github/workflows/sync_docs.yaml@v23.1.0
with:
reviewers: a-velasco,izmalk
permissions:
Expand Down
17 changes: 17 additions & 0 deletions config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Copyright 2024 Canonical Ltd.
# See LICENSE file for licensing details.

options:
expose-external:
description: |
String to determine how to expose the MySQLRouter externally from the Kubernetes cluster.
Possible values: 'false', 'nodeport', 'loadbalancer'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: default false may imply true is valid value. Change to something else? no none

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulomach any ideas for alternatives? expose-external: none? expose-external: clusterip?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

additionally, the possible values are false and nodeport in kafka-k8s. it may not be a good idea to introduce inconsistencies

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh bummer, ok maybe for another time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, it seems it was a Marc nitpick also that Mykola did not responded to in the original kafka spec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default false may imply true is valid value. Change to something else? no none

👋🏻 I subscribe Paulo's comment here.

I understand we are "compromising" in order to have a similar interface across DP charms (i.e. kafka-k8s), but it is confusing and we all should make an effort to reduce these occurrences in the future.

type: string
default: "false"

loadbalancer-extra-annotations:
description: |
A JSON string representing extra annotations for the Kubernetes service created
and managed by the charm.
type: string
default: "{}"
15 changes: 8 additions & 7 deletions lib/charms/tempo_coordinator_k8s/v0/charm_tracing.py
Original file line number Diff line number Diff line change
Expand Up @@ -269,7 +269,7 @@ def _remove_stale_otel_sdk_packages():
# Increment this PATCH version before using `charmcraft publish-lib` or reset
# to 0 if you are raising the major API version

LIBPATCH = 1
LIBPATCH = 2

PYDEPS = ["opentelemetry-exporter-otlp-proto-http==1.21.0"]

Expand Down Expand Up @@ -371,10 +371,6 @@ class UntraceableObjectError(TracingError):
"""Raised when an object you're attempting to instrument cannot be autoinstrumented."""


class TLSError(TracingError):
"""Raised when the tracing endpoint is https but we don't have a cert yet."""


def _get_tracing_endpoint(
tracing_endpoint_attr: str,
charm_instance: object,
Expand Down Expand Up @@ -484,10 +480,15 @@ def wrap_init(self: CharmBase, framework: Framework, *args, **kwargs):
)

if tracing_endpoint.startswith("https://") and not server_cert:
raise TLSError(
logger.error(
"Tracing endpoint is https, but no server_cert has been passed."
"Please point @trace_charm to a `server_cert` attr."
"Please point @trace_charm to a `server_cert` attr. "
"This might also mean that the tracing provider is related to a "
"certificates provider, but this application is not (yet). "
"In that case, you might just have to wait a bit for the certificates "
"integration to settle. "
)
return

exporter = OTLPSpanExporter(
endpoint=tracing_endpoint,
Expand Down
24 changes: 16 additions & 8 deletions lib/charms/tempo_coordinator_k8s/v0/tracing.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ def __init__(self, *args):
`TracingEndpointRequirer.request_protocols(*protocol:str, relation:Optional[Relation])` method.
Using this method also allows you to use per-relation protocols.

Units of provider charms obtain the tempo endpoint to which they will push their traces by calling
Units of requirer charms obtain the tempo endpoint to which they will push their traces by calling
`TracingEndpointRequirer.get_endpoint(protocol: str)`, where `protocol` is, for example:
- `otlp_grpc`
- `otlp_http`
Expand All @@ -44,7 +44,10 @@ def __init__(self, *args):
If the `protocol` is not in the list of protocols that the charm requested at endpoint set-up time,
the library will raise an error.

## Requirer Library Usage
We recommend that you scale up your tracing provider and relate it to an ingress so that your tracing requests
go through the ingress and get load balanced across all units. Otherwise, if the provider's leader goes down, your tracing goes down.

## Provider Library Usage

The `TracingEndpointProvider` object may be used by charms to manage relations with their
trace sources. For this purposes a Tempo-like charm needs to do two things
Expand Down Expand Up @@ -107,7 +110,7 @@ def __init__(self, *args):

# Increment this PATCH version before using `charmcraft publish-lib` or reset
# to 0 if you are raising the major API version
LIBPATCH = 1
LIBPATCH = 3

PYDEPS = ["pydantic"]

Expand Down Expand Up @@ -985,11 +988,16 @@ def charm_tracing_config(
is_https = endpoint.startswith("https://")

if is_https:
if cert_path is None:
raise TracingError("Cannot send traces to an https endpoint without a certificate.")
elif not Path(cert_path).exists():
# if endpoint is https BUT we don't have a server_cert yet:
# disable charm tracing until we do to prevent tls errors
if cert_path is None or not Path(cert_path).exists():
# disable charm tracing until we obtain a cert to prevent tls errors
logger.error(
"Tracing endpoint is https, but no server_cert has been passed."
"Please point @trace_charm to a `server_cert` attr. "
"This might also mean that the tracing provider is related to a "
"certificates provider, but this application is not (yet). "
"In that case, you might just have to wait a bit for the certificates "
"integration to settle. "
)
return None, None
return endpoint, str(cert_path)
else:
Expand Down
3 changes: 1 addition & 2 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 0 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,6 @@ tenacity = "^8.5.0"
allure-pytest = "^2.13.5"
allure-pytest-collection-report = {git = "https://github.com/canonical/data-platform-workflows", tag = "v23.0.2", subdirectory = "python/pytest_plugins/allure_pytest_collection_report"}


[tool.coverage.run]
branch = true

Expand Down
59 changes: 48 additions & 11 deletions src/abstract_charm.py
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ def __init__(self, *args) -> None:
self._database_requires = relations.database_requires.RelationEndpoint(self)
self._database_provides = relations.database_provides.RelationEndpoint(self)
self._cos_relation = relations.cos.COSRelation(self, self._container)
self._ha_cluster = None
self.framework.observe(self.on.update_status, self.reconcile)
self.framework.observe(
self.on[upgrade.PEER_RELATION_ENDPOINT_NAME].relation_changed, self.reconcile
Expand Down Expand Up @@ -99,23 +100,29 @@ def _logrotate(self) -> logrotate.LogRotate:

@property
@abc.abstractmethod
def _read_write_endpoint(self) -> str:
def _read_write_endpoints(self) -> str:
"""MySQL Router read-write endpoint"""

@property
@abc.abstractmethod
def _read_only_endpoint(self) -> str:
def _read_only_endpoints(self) -> str:
"""MySQL Router read-only endpoint"""

@property
@abc.abstractmethod
def _exposed_read_write_endpoint(self) -> str:
"""The exposed read-write endpoint"""
def _exposed_read_write_endpoint(self) -> typing.Optional[str]:
"""The exposed read-write endpoint.

Only defined in vm charm.
"""

@property
@abc.abstractmethod
def _exposed_read_only_endpoint(self) -> str:
"""The exposed read-only endpoint"""
def _exposed_read_only_endpoint(self) -> typing.Optional[str]:
"""The exposed read-only endpoint.

Only defined in vm charm.
"""

@abc.abstractmethod
def is_externally_accessible(self, *, event) -> typing.Optional[bool]:
Expand All @@ -124,6 +131,14 @@ def is_externally_accessible(self, *, event) -> typing.Optional[bool]:
Only defined in vm charm to return True/False. In k8s charm, returns None.
"""

@property
@abc.abstractmethod
def _status(self) -> ops.StatusBase:
"""Status of the charm.

Only applies to Kubernetes charm
"""

@property
def _tls_certificate_saved(self) -> bool:
"""Whether a TLS certificate is available to use"""
Expand Down Expand Up @@ -201,6 +216,8 @@ def _determine_app_status(self, *, event) -> ops.StatusBase:
# (Relations should not be modified during upgrade.)
return upgrade_status
statuses = []
if self._status:
statuses.append(self._status)
for endpoint in (self._database_requires, self._database_provides):
if status := endpoint.get_status(event):
statuses.append(status)
Expand All @@ -212,6 +229,9 @@ def _determine_unit_status(self, *, event) -> ops.StatusBase:
workload_status = self.get_workload(event=event).status
if self._upgrade:
statuses.append(self._upgrade.get_unit_juju_status(workload_status=workload_status))
# only in machine charms
if self._ha_cluster:
statuses.append(self._ha_cluster.get_unit_juju_status())
statuses.append(workload_status)
return self._prioritize_statuses(statuses)

Expand All @@ -232,19 +252,30 @@ def wait_until_mysql_router_ready(self, *, event) -> None:
"""

@abc.abstractmethod
def _reconcile_node_port(self, *, event) -> None:
"""Reconcile node port.
def _reconcile_service(self) -> None:
"""Reconcile service.

Only applies to Kubernetes charm
"""

def _check_service_connectivity(self) -> bool:
"""Check if the service is available (connectable with a socket).

Returns false if Machine charm. Overridden in Kubernetes charm.
"""
return False

@abc.abstractmethod
def _reconcile_ports(self, *, event) -> None:
"""Reconcile exposed ports.

Only applies to Machine charm
"""

@abc.abstractmethod
def _update_endpoints(self) -> None:
"""Update the endpoints in the provider relation if necessary."""

# =======================
# Handlers
# =======================
Expand Down Expand Up @@ -311,6 +342,10 @@ def reconcile(self, event=None) -> None: # noqa: C901
f"{self._cos_relation.is_relation_breaking(event)=}"
)

# only in machine charms
if self._ha_cluster:
self._ha_cluster.set_vip(self.config.get("vip"))

try:
if self._unit_lifecycle.authorized_leader:
if self._database_requires.is_relation_breaking(event):
Expand All @@ -324,15 +359,17 @@ def reconcile(self, event=None) -> None: # noqa: C901
and isinstance(workload_, workload.AuthenticatedWorkload)
and workload_.container_ready
):
self._reconcile_node_port(event=event)
self._reconcile_service()
shayancanonical marked this conversation as resolved.
Show resolved Hide resolved
self._database_provides.reconcile_users(
event=event,
router_read_write_endpoint=self._read_write_endpoint,
router_read_only_endpoint=self._read_only_endpoint,
router_read_write_endpoints=self._read_write_endpoints,
router_read_only_endpoints=self._read_only_endpoints,
exposed_read_write_endpoint=self._exposed_read_write_endpoint,
exposed_read_only_endpoint=self._exposed_read_only_endpoint,
shell=workload_.shell,
)
self._update_endpoints()

if workload_.container_ready:
workload_.reconcile(
event=event,
Expand Down
Loading
Loading