Adapt testcontainers to be testing framework agnostic #82

Merged: 10 commits, Nov 24, 2023
141 changes: 126 additions & 15 deletions cratedb_toolkit/testing/testcontainers/cratedb.py
@@ -10,6 +10,10 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# This is for Python 3.7 and 3.8 to support generic types
# like `dict` instead of `typing.Dict`
from __future__ import annotations

import logging
import os
from typing import Optional
@@ -19,6 +23,7 @@
from testcontainers.core.waiting_utils import wait_container_is_ready, wait_for_logs

from cratedb_toolkit.testing.testcontainers.util import KeepaliveContainer, asbool
from cratedb_toolkit.util import DatabaseAdapter

logger = logging.getLogger(__name__)

@@ -38,6 +43,7 @@
>>> import sqlalchemy

>>> cratedb_container = CrateDBContainer("crate:5.2.3")
>>> cratedb_container.start()
>>> with cratedb_container as cratedb:
... engine = sqlalchemy.create_engine(cratedb.get_connection_url())
... with engine.begin() as connection:
@@ -51,44 +57,91 @@
CRATEDB_PASSWORD = os.environ.get("CRATEDB_PASSWORD", "")
CRATEDB_DB = os.environ.get("CRATEDB_DB", "doc")
KEEPALIVE = asbool(os.environ.get("CRATEDB_KEEPALIVE", os.environ.get("TC_KEEPALIVE", False)))
CMD_OPTS = {
"discovery.type": "single-node",
"node.attr.storage": "hot",
Member

This looks like a partial hot/cold targeted custom setup which should be irrelevant for generic test clusters and especially in a single-node setup.

@amotl Any reason for this configuration? (as it was in the codebase already).

Suggested change
"node.attr.storage": "hot",

Member
@amotl, Nov 24, 2023

Yeah, well spotted. It is indeed a configuration detail specific to the test cases for cratedb_toolkit.retention, for which this code was originally conceived.

In this spirit, it should not be part of the generic startup parameters, but at the same time, it shows we need the capacity to configure those details when needed.

@pilosus: Do you think we can improve this spot, so that the corresponding configuration settings can be defined by the snippet in conftest.py? This time, it will probably not be so easy, because the test adapter will already need this information at startup time. Maybe you have an idea how to handle this elegantly?

While we are at it: Of course, this would not only concern the specific node.attr.storage parameter, but any other parameters as well.

Member
@amotl, Nov 24, 2023

> This time, it will probably not be so easy, because the test adapter will already need this information at startup time. Maybe you have an idea how to handle this elegantly?
>
> While we are at it: Of course, this would not only concern the specific node.attr.storage parameter, but any other parameters as well.

Ah, I see you already added cmd_opts to the constructor. 🙇

https://github.com/crate-workbench/cratedb-toolkit/blob/661370ffb0619e2a4c698c52627c99a1fb726bad/cratedb_toolkit/testing/testcontainers/cratedb.py#L94-L95

So, {"node.attr.storage": "hot", "path.repo": "/tmp/snapshots"} would just need to be moved over to the caller.

Contributor Author

Exactly. And since we use a dict merge, you can even override the default settings:

In [1]: from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBContainer

In [2]: c = CrateDBContainer(cmd_opts={"node.attr.storage": "cold"})

In [3]: c._command
Out[3]: '-Cdiscovery.type=single-node -Cnode.attr.storage=cold -Cpath.repo=/tmp/snapshots'

I'll remove node.attr.storage from the defaults, though.
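
The override behaviour described above can be reproduced without a container. The following is a minimal sketch, assuming the defaults shown in the diff at this point; `DEFAULT_OPTS`, `render_opts` and `merged_command` are illustrative names, not part of the toolkit, and the renderer is simplified (no boolean handling).

```python
# Illustrative defaults, copied from the CMD_OPTS dict shown in the diff.
DEFAULT_OPTS = {
    "discovery.type": "single-node",
    "node.attr.storage": "hot",
    "path.repo": "/tmp/snapshots",
}


def render_opts(opts: dict) -> str:
    """Render each option as a `-Ckey=value` flag, joined by spaces."""
    return " ".join(f"-C{key}={val}" for key, val in opts.items())


def merged_command(overrides=None) -> str:
    """Merge caller overrides into the defaults and render the command string."""
    # In a dict merge, later values win, but an overridden key keeps the
    # position of its first insertion; new keys are appended at the end.
    return render_opts({**DEFAULT_OPTS, **(overrides or {})})


print(merged_command({"node.attr.storage": "cold"}))
```

Because an overridden key keeps its original position, the rendered command stays stable in order, which is what the expected strings in the parametrized test rely on.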

Member
@amotl, Nov 24, 2023

Thanks. What do you think about path.repo, @seut? It could be convenient for testing to have it configured by default. Because you didn't mention it in your request, do you think it can stay?

Contributor Author
@pilosus, Nov 24, 2023

@seut @amotl Hmm, when I delete node.attr.storage=hot, the following tests hang:

  • tests/retention/test_cli.py::test_run_delete_basic
  • tests/retention/test_cli.py::test_run_delete_dryrun
  • tests/retention/test_cli.py::test_run_reallocate

Here are the logs:

2023-11-24 17:39:14,494 [cratedb_toolkit.retention.setup.schema] INFO    : Installing retention policy bookkeeping table at database 'crate://crate:REDACTED@localhost:33018', table TableAddress(schema='testdrive-ext', table='retention_policy')
2023-11-24 17:39:14,902 [cratedb_toolkit.retention.store     ] INFO    : Connecting to database crate://crate:REDACTED@localhost:33018, table "testdrive-ext"."retention_policy"
Waiting to be ready...
2023-11-24 17:39:15,366 [testcontainers.core.waiting_utils   ] INFO    : Waiting to be ready...

Interestingly enough, test_run_delete_with_tags_match works well.

Member

Yeah, the test cases defined for cratedb_toolkit.retention need this setting to be configured, so it will need to go into tests/conftest.py somehow. On the other hand, it should not be part of the generic configuration. That's yet another cliff we need to take.

Contributor Author

Given the whole chain of nested fixtures with different scopes, this will either require some more time from me next week, or a simpler solution: a cratedb fixture override in tests/retention/conftest.py that sets node.attr.storage=hot.

Member
@amotl, Nov 24, 2023

We can easily handle that on a later iteration, and/or discuss possible solutions beforehand. Thank you.

"path.repo": "/tmp/snapshots",
}

# TODO: Dual-port use with 4200+5432.
def __init__(
self,
image: str = "crate/crate:nightly",
port: int = 4200,
ports: Optional[dict] = None,
user: Optional[str] = None,
password: Optional[str] = None,
dbname: Optional[str] = None,
dialect: str = "crate",
cmd_opts: Optional[dict] = None,
**kwargs,
) -> None:
"""
:param image: docker hub image path with optional tag
:param ports: optional dict that maps a port inside the container to a port on the host machine;
`None` as a map value generates a random port;
Dicts are ordered. By convention the first key-val pair is designated to the HTTP interface.
Example: {4200: None, 5432: 15432} - port 4200 inside the container will be mapped
to a random port on the host, internal port 5432 for PSQL interface will be mapped
to the 15432 port on the host.
:param user: optional username to access the DB; if None, try `CRATEDB_USER` environment variable
:param password: optional password to access the DB; if None, try `CRATEDB_PASSWORD` environment variable
:param dbname: optional database name to access the DB; if None, try `CRATEDB_DB` environment variable
:param cmd_opts: an optional dict with CLI arguments to be passed to the DB entrypoint inside the container
:param kwargs: misc keyword arguments
"""
super().__init__(image=image, **kwargs)

self._name = "testcontainers-cratedb" # -{os.getpid()}
self._command = "-Cdiscovery.type=single-node -Ccluster.routing.allocation.disk.threshold_enabled=false"
# TODO: Generalize by obtaining more_opts from caller.
self._command += " -Cnode.attr.storage=hot"
self._command += " -Cpath.repo=/tmp/snapshots"
self._name = "testcontainers-cratedb"

cmd_opts = cmd_opts or {}
self._command = self._build_cmd({**self.CMD_OPTS, **cmd_opts})

self.CRATEDB_USER = user or self.CRATEDB_USER
self.CRATEDB_PASSWORD = password or self.CRATEDB_PASSWORD
self.CRATEDB_DB = dbname or self.CRATEDB_DB

self.port_to_expose = port
self.dialect = dialect

def _configure(self) -> None:
self.with_exposed_ports(self.port_to_expose)
self.port_mapping = ports if ports else {4200: None}
self.port_to_expose, _ = list(self.port_mapping.items())[0]

@staticmethod
def _build_cmd(opts: dict) -> str:
"""
Return a string with command options concatenated and optimised for ES5 use
"""
cmd = []
for key, val in opts.items():
if isinstance(val, bool):
val = str(val).lower()

Codecov (codecov/patch): added line #L112 in cratedb_toolkit/testing/testcontainers/cratedb.py was not covered by tests.
cmd.append(f"-C{key}={val}")
return " ".join(cmd)

def _configure_ports(self) -> None:
"""
Bind all the ports exposed inside the container to the same port on the host
"""
# If host_port is `None`, a random port will be generated
for container_port, host_port in self.port_mapping.items():
self.with_bind_ports(container=container_port, host=host_port)

def _configure_credentials(self) -> None:
self.with_env("CRATEDB_USER", self.CRATEDB_USER)
self.with_env("CRATEDB_PASSWORD", self.CRATEDB_PASSWORD)
self.with_env("CRATEDB_DB", self.CRATEDB_DB)

def get_connection_url(self, host=None) -> str:
def _configure(self) -> None:
self._configure_ports()
self._configure_credentials()

def get_connection_url(self, dialect: str = "crate", host: Optional[str] = None) -> str:
"""
Return a connection URL to the DB

:param host: optional string
:param dialect: a string with the dialect name to generate a DB URI
:return: string containing a connection URL to the DB
"""
# TODO: When using `db_name=self.CRATEDB_DB`:
# Connection.__init__() got an unexpected keyword argument 'database'
return super()._create_connection_url(
dialect=self.dialect,
dialect=dialect,
username=self.CRATEDB_USER,
password=self.CRATEDB_PASSWORD,
host=host,
@@ -101,3 +154,61 @@
# In `testcontainers-java`, there is the `HttpWaitStrategy`.
# TODO: Provide a client instance.
wait_for_logs(self, predicate="o.e.n.Node.*started", timeout=MAX_TRIES)


class CrateDBTestAdapter:
"""
A little helper wrapping Testcontainer's `CrateDBContainer` and
CrateDB Toolkit's `DatabaseAdapter`, agnostic of the test framework.
"""

def __init__(self, crate_version: str = "nightly", **kwargs):
self.cratedb: Optional[CrateDBContainer] = None
self.database: Optional[DatabaseAdapter] = None
self.image: str = "crate/crate:{}".format(crate_version)

def start(self, **kwargs):
"""
Start testcontainer, used for tests set up
"""
self.cratedb = CrateDBContainer(image=self.image, **kwargs)
self.cratedb.start()
self.database = DatabaseAdapter(dburi=self.get_connection_url())

def stop(self):
"""
Stop testcontainer, used for tests tear down
"""
if self.cratedb:
self.cratedb.stop()

def reset(self, tables: Optional[list] = None):
"""
Drop tables from the given list, used for tests set up or tear down
"""
if tables and self.database:
for reset_table in tables:
self.database.connection.exec_driver_sql(f"DROP TABLE IF EXISTS {reset_table};")

def get_connection_url(self, *args, **kwargs):
"""
Return a URL for SQLAlchemy DB engine
"""
if self.cratedb:
return self.cratedb.get_connection_url(*args, **kwargs)
return None

Codecov (codecov/patch): added line #L199 in cratedb_toolkit/testing/testcontainers/cratedb.py was not covered by tests.

def get_http_url(self, **kwargs):
"""
Return a URL for CrateDB's HTTP endpoint
"""
return self.get_connection_url(dialect="http", **kwargs)

Codecov (codecov/patch): added line #L205 in cratedb_toolkit/testing/testcontainers/cratedb.py was not covered by tests.

@property
def http_url(self):
"""
Return a URL for CrateDB's HTTP endpoint.

Used to stay backward compatible with the downstream code.
"""
return self.get_http_url()

Codecov (codecov/patch): added line #L214 in cratedb_toolkit/testing/testcontainers/cratedb.py was not covered by tests.
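
The adapter above exposes plain `start()`, `stop()` and `reset()` methods, so it can in principle be driven from any test framework. As a rough sketch only, the following wires those three calls into `unittest` hooks. `StandInAdapter` is a hypothetical stand-in that merely records calls so the example runs without Docker; in real use one would presumably construct `CrateDBTestAdapter` in its place.

```python
import unittest


class StandInAdapter:
    """Hypothetical stand-in mirroring the adapter's start/stop/reset method names."""

    def __init__(self):
        self.calls = []

    def start(self, **kwargs):
        self.calls.append("start")

    def stop(self):
        self.calls.append("stop")

    def reset(self, tables=None):
        self.calls.append(("reset", tuple(tables or ())))


RESET_TABLES = ['"testdrive"."demo"']


class ExampleTest(unittest.TestCase):
    adapter = None

    @classmethod
    def setUpClass(cls):
        # Start the service once for the whole test class.
        cls.adapter = StandInAdapter()
        cls.adapter.start()

    @classmethod
    def tearDownClass(cls):
        # Stop the service after all tests of the class have run.
        cls.adapter.stop()

    def setUp(self):
        # Give each test case a fresh canvas.
        self.adapter.reset(tables=RESET_TABLES)

    def test_service_was_started(self):
        self.assertIn("start", self.adapter.calls)


def run_example():
    """Run the example test case and return the result and the recorded calls."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result, ExampleTest.adapter.calls
```

The class-level hooks play the role of a session-scoped fixture, and `setUp` plays the role of a function-scoped one.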
2 changes: 1 addition & 1 deletion doc/sandbox.md
@@ -18,7 +18,7 @@ source .venv/bin/activate

Install project in sandbox mode.
```shell
pip install --editable='.[develop,test]'
pip install --editable='.[io,test,develop]'
```

Run tests. `TC_KEEPALIVE` keeps the auxiliary service containers running, which
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -235,6 +235,8 @@ extend-ignore = [
"RET504",
# Unnecessary `elif` after `return` statement
"RET505",
# Probable insecure usage of temporary file or directory
"S108",
]

extend-exclude = [
45 changes: 9 additions & 36 deletions tests/conftest.py
@@ -3,16 +3,14 @@
import pytest
import responses

from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBContainer
from cratedb_toolkit.util import DatabaseAdapter
from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBTestAdapter
from cratedb_toolkit.util.common import setup_logging

# Use different schemas for storing the subsystem database tables, and the
# test/example data, so that they do not accidentally touch the default `doc`
# schema.
TESTDRIVE_EXT_SCHEMA = "testdrive-ext"
TESTDRIVE_DATA_SCHEMA = "testdrive-data"

TESTDRIVE_EXT_SCHEMA = "testdrive-ext"
RESET_TABLES = [
f'"{TESTDRIVE_EXT_SCHEMA}"."retention_policy"',
f'"{TESTDRIVE_DATA_SCHEMA}"."raw_metrics"',
@@ -25,34 +23,8 @@
'"testdrive"."demo"',
]


class CrateDBFixture:
"""
A little helper wrapping Testcontainer's `CrateDBContainer` and
CrateDB Toolkit's `DatabaseAdapter`, agnostic of the test framework.
"""

def __init__(self):
self.cratedb = None
self.database: DatabaseAdapter = None
self.setup()

def setup(self):
# TODO: Make image name configurable.
self.cratedb = CrateDBContainer("crate/crate:nightly")
self.cratedb.start()
self.database = DatabaseAdapter(dburi=self.get_connection_url())

def finalize(self):
self.cratedb.stop()

def reset(self):
# TODO: Make list of tables configurable.
for reset_table in RESET_TABLES:
self.database.connection.exec_driver_sql(f"DROP TABLE IF EXISTS {reset_table};")

def get_connection_url(self, *args, **kwargs):
return self.cratedb.get_connection_url(*args, **kwargs)
CRATEDB_HTTP_PORT = 44209
CRATEDB_SETTINGS = {"http.port": CRATEDB_HTTP_PORT}


@pytest.fixture(scope="session", autouse=True)
@@ -71,18 +43,19 @@ def cratedb_service():
"""
Provide a CrateDB service instance to the test suite.
"""
db = CrateDBFixture()
db.reset()
db = CrateDBTestAdapter()
db.start(ports={CRATEDB_HTTP_PORT: None}, cmd_opts=CRATEDB_SETTINGS)
db.reset(tables=RESET_TABLES)
yield db
db.finalize()
db.stop()
amotl marked this conversation as resolved.


@pytest.fixture(scope="function")
def cratedb(cratedb_service):
"""
Provide a fresh canvas to each test case invocation, by resetting database content.
"""
cratedb_service.reset()
cratedb_service.reset(tables=RESET_TABLES)
amotl marked this conversation as resolved.
yield cratedb_service
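
The fixture passes `ports={CRATEDB_HTTP_PORT: None}`, relying on the convention from the container's docstring that the first key of the mapping is the HTTP port and that `None` requests a random host port. A minimal sketch of that lookup, assuming only what the diff shows; `resolve_http_port` is an illustrative helper name, not a toolkit function.

```python
from typing import Optional


def resolve_http_port(ports: Optional[dict] = None) -> int:
    """Return the container-side HTTP port, i.e. the first key of the mapping."""
    # Fall back to CrateDB's default HTTP port when no mapping is given.
    port_mapping = ports if ports else {4200: None}
    # Dicts preserve insertion order, so the first item designates the
    # HTTP interface by convention; the host port may be None (random).
    container_port, _host_port = next(iter(port_mapping.items()))
    return container_port


print(resolve_http_port({4200: None, 5432: 15432}))
```

Since the choice depends on ordering, a caller who lists the PostgreSQL port first would end up with the wrong port being treated as the HTTP interface.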


33 changes: 33 additions & 0 deletions tests/testing/test_testcontainers.py
@@ -0,0 +1,33 @@
import pytest

from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBContainer


@pytest.mark.parametrize(
"opts, expected",
[
pytest.param(
{"indices.breaker.total.limit": "90%"},
(
"-Cdiscovery.type=single-node "
"-Cnode.attr.storage=hot "
"-Cpath.repo=/tmp/snapshots "
"-Cindices.breaker.total.limit=90%"
),
id="add_cmd_option",
),
pytest.param(
{"discovery.type": "zen", "indices.breaker.total.limit": "90%"},
(
"-Cdiscovery.type=zen "
"-Cnode.attr.storage=hot "
"-Cpath.repo=/tmp/snapshots "
"-Cindices.breaker.total.limit=90%"
),
id="override_defaults",
),
],
)
def test_build_command(opts, expected):
db = CrateDBContainer(cmd_opts=opts)
assert db._command == expected
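
The Codecov annotation on `_build_cmd` reports that the boolean branch is not exercised by the parametrized test above. One possible way to cover that logic is sketched below against a standalone copy of the rendering loop, so that it runs without Docker; `build_cmd` here is an illustrative copy, and a real test would presumably call `CrateDBContainer._build_cmd` instead.

```python
def build_cmd(opts: dict) -> str:
    """Standalone copy of the option-rendering logic, for illustration only."""
    cmd = []
    for key, val in opts.items():
        if isinstance(val, bool):
            # Python renders booleans as "True"/"False"; lowercase them
            # before placing them on the command line.
            val = str(val).lower()
        cmd.append(f"-C{key}={val}")
    return " ".join(cmd)


def test_build_cmd_lowercases_booleans():
    opts = {"cluster.routing.allocation.disk.threshold_enabled": False}
    assert build_cmd(opts) == "-Ccluster.routing.allocation.disk.threshold_enabled=false"
```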