Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPE-3689, DPE-4179] Expose read-write and read-only endpoints when related to data-integrator + TLS support #119

Merged
merged 34 commits into from
Apr 24, 2024

Conversation

shayancanonical
Copy link
Contributor

@shayancanonical shayancanonical commented Apr 4, 2024

Issue

  1. We are currently not reacting to external-node-connectivity that may be sent to mysql-router via data-interfaces when related with data-integrator
  2. We are not able to relate with the TLS operators
  3. We broke upgrades (Upgrades broken #130) when we implemented reconcile pattern for COS

Solution

  1. Support external-node-connectivity by binding to address 0.0.0.0 when related to data-integrator. Else, only expose router through a unix socket
  2. Add capability to relate to TLS operators
  3. Refactor to fix upgrades -- remaining integration test will be added in a follow up PR

Testing

Added integration tests for TLS + COS, TLS, and TLS + Data Integrator

todo

Change the charmed-mysql snap version to the official 8.0/edge version instead of a branch build

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some questions inside.
BTW: are we missing CA chain support on mysql-router? See canonical/mysql-operator#396 + canonical/mysql-k8s-operator#383

.github/workflows/ci.yaml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
Comment on lines 211 to 217
if self.is_exposed():
for port in (6446, 6447):
with socket.socket() as s:
assert s.connect_ex(("localhost", port)) == 0
else:
assert self._container.path("/run/mysqlrouter/mysql.sock").exists()
assert self._container.path("/run/mysqlrouter/mysqlro.sock").exists()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be in sync. When m-r related to data-integrator, users should expose data-integrator and not m-r.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taurus-forever how will data-integrator know which ports to expose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @carlcsaposs-canonical. Isn't it ideal that mysqlrouter exposes its own ports and then unexpose when the relation with data integrator is removed? Since they are on the same machine, it is easiest if router exposes it own ports

src/machine_charm.py Outdated Show resolved Hide resolved
Copy link
Contributor

@paulomach paulomach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the tests. Minor comments

src/snap.py Show resolved Hide resolved
src/relations/database_provides.py Outdated Show resolved Hide resolved
src/machine_charm.py Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
.github/workflows/ci.yaml Outdated Show resolved Hide resolved
.github/workflows/ci.yaml Outdated Show resolved Hide resolved
poetry.lock Show resolved Hide resolved
src/relations/tls.py Outdated Show resolved Hide resolved
src/relations/tls.py Outdated Show resolved Hide resolved
[data.get("external-node-connectivity") == "true" for data in relation_data.values()]
)

def reconcile_ports(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this doesn't belong in database_provides since it's not part of the database_provides relation

also, this should maybe be merged with https://github.com/canonical/mysql-router-k8s-operator/blob/481b23de75a1662210201294ab1efbd952fe4bd0/src/kubernetes_charm.py#L250-L252
also, do ports 6448 and 6449 need to be opened?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo ports 6448 and 6449 are not going to be used externally (they're related to X protocol) and so should not be exposed

removed from database_provides in ee4b098

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/relations/tls.py Outdated Show resolved Hide resolved
src/machine_workload.py Outdated Show resolved Hide resolved
Comment on lines 211 to 217
if self.is_exposed():
for port in (6446, 6447):
with socket.socket() as s:
assert s.connect_ex(("localhost", port)) == 0
else:
assert self._container.path("/run/mysqlrouter/mysql.sock").exists()
assert self._container.path("/run/mysqlrouter/mysqlro.sock").exists()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taurus-forever how will data-integrator know which ports to expose?

pyproject.toml Outdated Show resolved Hide resolved
[data.get("external-node-connectivity") == "true" for data in relation_data.values()]
)

def reconcile_ports(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poetry.lock Outdated
@@ -989,19 +1073,22 @@ signedtoken = ["cryptography (>=3.0.0)", "pyjwt (>=2.0.0,<3)"]

[[package]]
name = "ops"
version = "2.6.0"
version = "2.12.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ops is 2.12, not 2.8

#119 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to 2.12.0 in 0cf8b71

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was canonical/operator#1091 (comment) addressed? from comment linked above

if not, please revert to <2.10

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted to < 2.10.0 in 0a36768. the linked comment seems like a separate issue than the goal of this PR and should be addressed separately

pyproject.toml Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
src/machine_charm.py Show resolved Hide resolved
src/machine_charm.py Outdated Show resolved Hide resolved
src/relations/database_provides.py Outdated Show resolved Hide resolved
src/relations/database_provides.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
src/workload.py Outdated Show resolved Hide resolved
Copy link
Contributor

@carlcsaposs-canonical carlcsaposs-canonical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks great! few minor potential issues

pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
Comment on lines 211 to 215
@abc.abstractmethod
def wait_until_mysql_router_ready(self) -> None:
"""Wait until a connection to MySQL Router is possible.

Retry every 5 seconds for up to 30 seconds.
"""
logger.debug("Waiting until MySQL Router is ready")
self.unit.status = ops.MaintenanceStatus("MySQL Router starting")
try:
for attempt in tenacity.Retrying(
reraise=True,
stop=tenacity.stop_after_delay(30),
wait=tenacity.wait_fixed(5),
):
with attempt:
for port in (6446, 6447):
with socket.socket() as s:
assert s.connect_ex(("localhost", port)) == 0
except AssertionError:
logger.exception("Unable to connect to MySQL Router")
raise
else:
logger.debug("MySQL Router is ready")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: perhaps leave this as concrete implementation (if can be used in k8s)

and then override in vm with super() call if exposed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd rather have them be methods in the subclass if they are separate in implementation (the vm charm needs to use sockets to test if the charm is not exposed)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the check is identical to k8s if the charm is exposed, right?

then you can do something like this

# abstract_charm.py
        logger.debug("Waiting until MySQL Router is ready")
        self.unit.status = ops.MaintenanceStatus("MySQL Router starting")
        try:
            for attempt in tenacity.Retrying(
                reraise=True,
                stop=tenacity.stop_after_delay(30),
                wait=tenacity.wait_fixed(5),
            ):
                with attempt:
                    for port in (6446, 6447):
                        with socket.socket() as s:
                            assert s.connect_ex(("localhost", port)) == 0
        except AssertionError:
            logger.exception("Unable to connect to MySQL Router")
            raise
        else:
            logger.debug("MySQL Router is ready")
# machine_charm.py
if self.is_exposed():
    super().wait_until_mysql_router_ready()
else:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will address in a followup

src/abstract_charm.py Outdated Show resolved Hide resolved
src/snap.py Show resolved Hide resolved
if not self._container.mysql_router_service_enabled:
is_charm_exposed = self._charm.is_exposed
socket_file_exists = self._container.path("/run/mysqlrouter/mysql.sock").exists()
require_rebootstrap = is_charm_exposed == socket_file_exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really nice!

src/workload.py Show resolved Hide resolved
Comment on lines 71 to 72
if not self._charm.is_exposed:
self._update_configured_socket_file_locations_and_bind_address()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this affect future re-bootstraps if charm is bootstrapped with is_exposed = False and then with is_exposed = True?

see other comment about deleting files before re-bootstrap

Comment on lines 205 to 208
relation_data = self._interface.fetch_relation_data(fields=["external-node-connectivity"])
return any(
[data.get("external-node-connectivity") == "true" for data in relation_data.values()]
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please sync with k8s code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

synced in 1b0bb19

@shayancanonical shayancanonical changed the title [DPE-3689] Expose read-write and read-only endpoints when related to data-integrator + TLS support [DPE-3689, DPE-4179] Expose read-write and read-only endpoints when related to data-integrator + TLS support Apr 24, 2024
pyproject.toml Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
src/abstract_charm.py Outdated Show resolved Hide resolved
src/snap.py Show resolved Hide resolved
src/workload.py Outdated
Comment on lines 325 to 354
def _enable_router(self, *, tls: bool, unit_name: str) -> None:
"""Enable router after setting up all the necessary prerequisites."""
logger.debug("Enabling MySQL Router service")
self._cleanup_after_upgrade_or_potential_container_restart()
# create an empty credentials file, if the file does not exist
self._container.create_router_rest_api_credentials_file()
self._bootstrap_router(tls=tls)
self.shell.add_attributes_to_mysql_router_user(
username=self._router_username, router_id=self._router_id, unit_name=unit_name
)
self._container.update_mysql_router_service(enabled=True, tls=tls)
self._logrotate.enable()
logger.debug("Enabled MySQL Router service")
self._charm.wait_until_mysql_router_ready()

def _enable_exporter(
self, *, tls: bool, exporter_config: "relations.cos.ExporterConfig"
) -> None:
"""Enable the mysqlrouter exporter."""
logger.debug("Enabling MySQL Router exporter service")
self.setup_monitoring_user()
self._container.update_mysql_router_exporter_service(
enabled=True,
config=exporter_config,
tls=tls,
key_filename=str(self._tls_key_file),
certificate_filename=str(self._tls_certificate_file),
certificate_authority_filename=str(self._tls_certificate_authority_file),
)
logger.debug("Enabled MySQL Router exporter service")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: could this be simplified by moving back into reconcile?

and then reconcile would look like

if require reboostrap:
    disable
if router not enabled:
    enable
if exporter not enabled:
    enabled

and upgrade would look like

disable
self.reconcile()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that this simplification should be handled separately (if at all) -- likely when we add integration tests for upgrade (which i hope to add as soon as this PR is merged)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add this in a follow-up PR? I think moving back into reconcile will make the code easier to maintain, specifically it alleviates concerns with _enable_router getting called without "all the necessary prerequisites"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will address in a followup

Comment on lines 113 to 114
def is_externally_accessible(self, event=None) -> typing.Optional[bool]:
"""Whether router is externally accessible"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is this only used by vm charm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is only used by the vm charm. but needs to be defined in abstract_charm because reconcile in workload calls to it. in vm, it will return True/False, in k8s it will return None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you mention that it's only used by machine charm in docstring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 0978187

src/machine_charm.py Outdated Show resolved Hide resolved
src/machine_logrotate.py Outdated Show resolved Hide resolved
"""The exposed read-only endpoint"""

@abc.abstractmethod
def is_externally_accessible(self, event=None) -> typing.Optional[bool]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please remove default None since event should always be passed (even if in reconcile, the value of event is none because it's not a relation-breaking event, that value should always be passed here—this function should never be called without an event arg I believe)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 0978187

Comment on lines 86 to 87
def is_externally_accessible(self, event=None) -> typing.Optional[bool]:
return self._database_provides.external_connectivity(event)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here with event should always be passed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 0978187

wait=tenacity.wait_fixed(5),
):
with attempt:
if self.is_externally_accessible():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does event need to be passed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 0978187

src/workload.py Outdated
Comment on lines 325 to 354
def _enable_router(self, *, tls: bool, unit_name: str) -> None:
"""Enable router after setting up all the necessary prerequisites."""
logger.debug("Enabling MySQL Router service")
self._cleanup_after_upgrade_or_potential_container_restart()
# create an empty credentials file, if the file does not exist
self._container.create_router_rest_api_credentials_file()
self._bootstrap_router(tls=tls)
self.shell.add_attributes_to_mysql_router_user(
username=self._router_username, router_id=self._router_id, unit_name=unit_name
)
self._container.update_mysql_router_service(enabled=True, tls=tls)
self._logrotate.enable()
logger.debug("Enabled MySQL Router service")
self._charm.wait_until_mysql_router_ready()

def _enable_exporter(
self, *, tls: bool, exporter_config: "relations.cos.ExporterConfig"
) -> None:
"""Enable the mysqlrouter exporter."""
logger.debug("Enabling MySQL Router exporter service")
self.setup_monitoring_user()
self._container.update_mysql_router_exporter_service(
enabled=True,
config=exporter_config,
tls=tls,
key_filename=str(self._tls_key_file),
certificate_filename=str(self._tls_certificate_file),
certificate_authority_filename=str(self._tls_certificate_authority_file),
)
logger.debug("Enabled MySQL Router exporter service")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add this in a follow-up PR? I think moving back into reconcile will make the code easier to maintain, specifically it alleviates concerns with _enable_router getting called without "all the necessary prerequisites"

@carlcsaposs-canonical
Copy link
Contributor

only blocking comment is #119 (comment); otherwise approved

@shayancanonical shayancanonical dismissed carlcsaposs-canonical’s stale review April 24, 2024 19:45

Addressed all pending feedback and also received Carl's written approval

@shayancanonical shayancanonical merged commit cecb229 into main Apr 24, 2024
30 checks passed
@shayancanonical shayancanonical deleted the feature/external_connectivity branch April 24, 2024 19:45
@shayancanonical shayancanonical mentioned this pull request May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants