Adapt testcontainers to be testing framework agnostic #82

Merged
10 commits merged on Nov 24, 2023
112 changes: 104 additions & 8 deletions cratedb_toolkit/testing/testcontainers/cratedb.py
Expand Up @@ -10,6 +10,10 @@
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# This is for Python 3.7 and 3.8 to support generic types
# like `dict` instead of `typing.Dict`
from __future__ import annotations

import logging
import os
from typing import Optional
Expand All @@ -19,6 +23,7 @@
from testcontainers.core.waiting_utils import wait_container_is_ready, wait_for_logs

from cratedb_toolkit.testing.testcontainers.util import KeepaliveContainer, asbool
from cratedb_toolkit.util import DatabaseAdapter

logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -51,8 +56,13 @@
CRATEDB_PASSWORD = os.environ.get("CRATEDB_PASSWORD", "")
CRATEDB_DB = os.environ.get("CRATEDB_DB", "doc")
KEEPALIVE = asbool(os.environ.get("CRATEDB_KEEPALIVE", os.environ.get("TC_KEEPALIVE", False)))
CMD_OPTS = {
"discovery.type": "single-node",
"cluster.routing.allocation.disk.threshold_enabled": False,
"node.attr.storage": "hot",
Member

This looks like a partial hot/cold targeted custom setup which should be irrelevant for generic test clusters and especially in a single-node setup.

@amotl Any reason for this configuration? (as it was in the codebase already).

Suggested change
- "node.attr.storage": "hot",

Member

@amotl amotl Nov 24, 2023

Yeah, well spotted. It is indeed a configuration detail which is specific to the test cases for cratedb_toolkit.retention, for which this code was initially conceived.

In this spirit, it should not be part of the generic startup parameters, but at the same time, it shows we need the capacity to configure those details when needed.

@pilosus: Do you think we can improve this spot, so that corresponding configuration settings can be defined on behalf of the snippet in conftest.py? This time, it will probably not be so easy, because the test adapter will already need this information at startup time. Maybe you have an idea how to handle this elegantly?

While being at it: Of course, it would not just be about the specific node.attr.storage parameter, but about any other parameters as well.

Member

@amotl amotl Nov 24, 2023

This time, it will probably not be so easy, because the test adapter will already need this information at startup time. Maybe you have an idea how to handle this elegantly?

While being at it: Of course, it would not just be about the specific node.attr.storage parameter, but about any other parameters as well.

Ah, I see you already added cmd_opts to the constructor. 🙇

https://github.com/crate-workbench/cratedb-toolkit/blob/661370ffb0619e2a4c698c52627c99a1fb726bad/cratedb_toolkit/testing/testcontainers/cratedb.py#L94-L95

So, {"node.attr.storage": "hot", "path.repo": "/tmp/snapshots"} would just need to be moved over to the caller.

Contributor Author

Exactly. And given that we use dict merge, you can even override the default settings:

In [1]: from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBContainer

In [2]: c = CrateDBContainer(cmd_opts={"node.attr.storage": "cold"})

In [3]: c._command
Out[3]: '-Cdiscovery.type=single-node -Cnode.attr.storage=cold -Cpath.repo=/tmp/snapshots'

I'll remove the node.attr.storage from the default though
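
A minimal, standalone sketch of the dict-merge behaviour discussed in this thread. It mirrors the `_build_cmd` logic from the diff, but `DEFAULT_OPTS`, `build_cmd`, and `merged_command` are illustrative names chosen here, not part of the package, and the defaults assume `node.attr.storage` has already been dropped as proposed:

```python
from typing import Optional

# Assumed generic defaults, without the retention-specific `node.attr.storage`.
DEFAULT_OPTS = {
    "discovery.type": "single-node",
    "cluster.routing.allocation.disk.threshold_enabled": False,
    "path.repo": "/tmp/snapshots",
}


def build_cmd(opts: dict) -> str:
    """Render options as `-Ckey=value` flags, lower-casing Python booleans."""
    parts = []
    for key, val in opts.items():
        if isinstance(val, bool):
            val = str(val).lower()
        parts.append(f"-C{key}={val}")
    return " ".join(parts)


def merged_command(cmd_opts: Optional[dict] = None) -> str:
    """Merge caller options over the defaults; later keys win on conflict."""
    return build_cmd({**DEFAULT_OPTS, **(cmd_opts or {})})


if __name__ == "__main__":
    # A caller such as a retention test suite adds its own setting.
    print(merged_command({"node.attr.storage": "hot"}))
```

Because the caller's dict is unpacked last, it can both add new settings and override a default, while an overridden key keeps its original position in the command line.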

Member

@amotl amotl Nov 24, 2023

Thanks. What do you think about path.repo, @seut? It could be convenient for testing to have it configured by default. Because you didn't mention it in your request, do you think it can stay?

Contributor Author

@pilosus pilosus Nov 24, 2023

@seut @amotl hmm, when I delete node.attr.storage=hot, these tests hang:

  • tests/retention/test_cli.py::test_run_delete_basic
  • tests/retention/test_cli.py::test_run_delete_dryrun
  • tests/retention/test_cli.py::test_run_reallocate

Here are the logs:

2023-11-24 17:39:14,494 [cratedb_toolkit.retention.setup.schema] INFO    : Installing retention policy bookkeeping table at database 'crate://crate:REDACTED@localhost:33018', table TableAddress(schema='testdrive-ext', table='retention_policy')
2023-11-24 17:39:14,902 [cratedb_toolkit.retention.store     ] INFO    : Connecting to database crate://crate:REDACTED@localhost:33018, table "testdrive-ext"."retention_policy"
Waiting to be ready...
2023-11-24 17:39:15,366 [testcontainers.core.waiting_utils   ] INFO    : Waiting to be ready...

Interestingly enough, test_run_delete_with_tags_match works well.

Member

Yeah, the test cases defined for cratedb_toolkit.retention need this setting to be configured, so it will need to go into tests/conftest.py somehow. On the other hand, it should not be part of the generic configuration. That's yet another cliff we need to take.

Contributor Author

Given the whole chain of nested fixtures with different scopes, this will either require some more time from me next week, or a simpler solution: a cratedb override in tests/retention/conftest.py that has node.attr.storage=hot in it.
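
One way the "simpler solution" could look, as a hedged sketch only. `FakeAdapter` is a hypothetical stand-in for the real test adapter so that the snippet runs without Docker, and `retention_cratedb_service` is an illustrative name. In an actual tests/retention/conftest.py, the generator would presumably be decorated with `pytest.fixture` and construct the real adapter, which forwards `cmd_opts` to `CrateDBContainer` via `**kwargs`:

```python
from typing import Iterator, Optional

# Settings only the retention test cases need; kept out of the generic defaults.
RETENTION_CMD_OPTS = {"node.attr.storage": "hot"}


class FakeAdapter:
    """Hypothetical stand-in for the real test adapter; starts no container."""

    def __init__(self, cmd_opts: Optional[dict] = None) -> None:
        self.cmd_opts = dict(cmd_opts or {})
        self.running = True

    def finalize(self) -> None:
        self.running = False


def retention_cratedb_service(adapter_cls=FakeAdapter) -> Iterator[FakeAdapter]:
    """
    Generator in the shape of a pytest fixture body: set up, yield, tear down.
    """
    adapter = adapter_cls(cmd_opts=RETENTION_CMD_OPTS)
    try:
        yield adapter
    finally:
        # Runs when the consumer is done, like fixture teardown.
        adapter.finalize()
```

The trade-off is a second database container for the retention tests, in exchange for not having to thread configuration through the existing chain of nested fixtures.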

Member

@amotl amotl Nov 24, 2023

We can easily handle that on a later iteration, and/or discuss possible solutions beforehand. Thank you.

"path.repo": "/tmp/snapshots",
}

# TODO: Dual-port use with 4200+5432.
def __init__(
self,
image: str = "crate/crate:nightly",
Expand All @@ -61,34 +71,58 @@
password: Optional[str] = None,
dbname: Optional[str] = None,
dialect: str = "crate",
cmd_opts: Optional[dict] = None,
extra_ports: Optional[list] = None,
**kwargs,
) -> None:
super().__init__(image=image, **kwargs)

self._name = "testcontainers-cratedb" # -{os.getpid()}
self._command = "-Cdiscovery.type=single-node -Ccluster.routing.allocation.disk.threshold_enabled=false"
# TODO: Generalize by obtaining more_opts from caller.
self._command += " -Cnode.attr.storage=hot"
self._command += " -Cpath.repo=/tmp/snapshots"

cmd_opts = cmd_opts if cmd_opts else {}
Member

Just a nit: I usually write this like that. Is it a good idea?

Suggested change
- cmd_opts = cmd_opts if cmd_opts else {}
+ cmd_opts = cmd_opts or {}

Contributor Author

Done
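
For reference, a small sketch showing that the two spellings behave the same, and why the parameter defaults to `None` rather than `{}`. The function names are illustrative only:

```python
from typing import Optional


def normalize_verbose(cmd_opts: Optional[dict] = None) -> dict:
    # Conditional expression: falls back to a fresh dict for any falsy input.
    return cmd_opts if cmd_opts else {}


def normalize_short(cmd_opts: Optional[dict] = None) -> dict:
    # `or` returns its right operand when the left one is falsy, so this is
    # equivalent for None and for an empty dict alike.
    return cmd_opts or {}


if __name__ == "__main__":
    for candidate in (None, {}, {"node.attr.storage": "hot"}):
        print(normalize_verbose(candidate) == normalize_short(candidate))
```

Defaulting to `None` and normalizing inside the function avoids sharing a single mutable `{}` default object across calls.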

self._command = self._build_cmd({**self.CMD_OPTS, **cmd_opts})

self.CRATEDB_USER = user or self.CRATEDB_USER
self.CRATEDB_PASSWORD = password or self.CRATEDB_PASSWORD
self.CRATEDB_DB = dbname or self.CRATEDB_DB

self.port_to_expose = port
self.extra_ports = extra_ports or []
self.dialect = dialect

@staticmethod
def _build_cmd(opts: dict) -> str:
"""
Return a string with command options concatenated and optimised for ES5 use
"""
cmd = []
for key, val in opts.items():
if isinstance(val, bool):
val = str(val).lower()
cmd.append("-C{}={}".format(key, val))
Member

Is there a reason not to use f-strings already?

Suggested change
- cmd.append("-C{}={}".format(key, val))
+ cmd.append(f"-C{key}={val}")

Contributor Author

@pilosus pilosus Nov 24, 2023

Fixed.
That said, I personally am not a big fan of f-strings, because people sometimes abuse their flexibility with larger expressions and thus damage readability. But it's a personal choice rather than anything else. In this particular case, it's just fine to use f-strings.

return " ".join(cmd)

def _configure_ports(self) -> None:
"""
Bind all the ports exposed inside the container to the same port on the host
"""
ports = [*[self.port_to_expose], *self.extra_ports]
for port in ports:
# If random port is needed on the host, use host=None
# or invoke self.with_exposed_ports
self.with_bind_ports(container=port, host=port)

def _configure(self) -> None:
self.with_exposed_ports(self.port_to_expose)
self._configure_ports()
self.with_env("CRATEDB_USER", self.CRATEDB_USER)
self.with_env("CRATEDB_PASSWORD", self.CRATEDB_PASSWORD)
self.with_env("CRATEDB_DB", self.CRATEDB_DB)

def get_connection_url(self, host=None) -> str:
def get_connection_url(self, host=None, dialect=None) -> str:
# TODO: When using `db_name=self.CRATEDB_DB`:
# Connection.__init__() got an unexpected keyword argument 'database'
return super()._create_connection_url(
dialect=self.dialect,
dialect=dialect or self.dialect,
username=self.CRATEDB_USER,
password=self.CRATEDB_PASSWORD,
host=host,
Expand All @@ -101,3 +135,65 @@
# In `testcontainers-java`, there is the `HttpWaitStrategy`.
# TODO: Provide a client instance.
wait_for_logs(self, predicate="o.e.n.Node.*started", timeout=MAX_TRIES)


class TestDrive:
"""
Use different schemas for storing the subsystem database tables, and the
test/example data, so that they do not accidentally touch the default `doc`
schema.
"""

EXT_SCHEMA = "testdrive-ext"
DATA_SCHEMA = "testdrive-data"

RESET_TABLES = [
f'"{EXT_SCHEMA}"."retention_policy"',
f'"{DATA_SCHEMA}"."raw_metrics"',
f'"{DATA_SCHEMA}"."sensor_readings"',
f'"{DATA_SCHEMA}"."testdrive"',
f'"{DATA_SCHEMA}"."foobar"',
f'"{DATA_SCHEMA}"."foobar_unique_single"',
f'"{DATA_SCHEMA}"."foobar_unique_composite"',
# cratedb_toolkit.io.{influxdb,mongodb}
'"testdrive"."demo"',
]
Member

@amotl amotl Nov 22, 2023

First things first: I like that you bundled those bits of information into a little container class.

On the other hand, contrary to CrateDBFixture, this section is specific to the test suite for cratedb_toolkit, and is not meant to be shipped with cratedb_toolkit.testing.

Do you see a chance to decouple this and let it be configured in tests/conftest.py, maybe on behalf of just a bit more pytest .request / _conf / -fixture magic, but loosely coupled, so that there is no dependency path going from cratedb_testing to tests/conftest.py, and the configuration could be somehow elegantly inverted instead?

I am not sure if I am asking for too much here, or if you can follow my thoughts easily, or if the implementation would be too complicated. Please let me know.

Contributor Author

If reset tables aren’t needed in testing module and only needed in conftest, I can simply move them there. But in this case shall I still try to remove explicit import of testing/testcontainers? It’s needed for CrateDBFixture. I probably don’t see the whole picture, but utils module is still explicitly imported there, so I don’t quite get why it’s different for testing module

Member

@amotl amotl Nov 22, 2023

If reset tables aren’t needed in testing module and only needed in conftest, I can simply move them there.

This specific set of tables is only meant to be reset on behalf of the cratedb-toolkit test suite, which is not being shipped as part of the package. On the other hand, cratedb_toolkit.testing intends to bundle generic testing helpers/utilities/fixtures.

In this spirit, you made the right choice to put CrateDBFixture there, but I think the table definition / test suite configuration itself, now excellently bundled into the TestDrive container class, should stay in tests/conftest.py. However, it can't be there in isolation, because it will need to be picked up by the generic CrateDBFixture in some way, because this one actually orchestrates the container lifecycle. Can you figure out a way to make that happen elegantly?

But in this case shall I still try to remove explicit import of testing/testcontainers? It’s needed for CrateDBFixture.

Are you referring to one of those? I think both are fine in general. It should be free for every module to use generic utilities from cratedb_toolkit.testing, but not the other way round, at least import-wise.

# File: cratedb_toolkit/testing/testcontainers/cratedb.py
from cratedb_toolkit.testing.testcontainers.util import KeepaliveContainer, asbool
# File: tests/conftest.py
from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBFixture

I probably don’t see the whole picture, but utils module is still explicitly imported there, so I don’t quite get why it’s different for testing module

Are you referring to this import?

# File: cratedb_toolkit/testing/testcontainers/cratedb.py
from testcontainers.core.waiting_utils import wait_container_is_ready, wait_for_logs

I think it is also perfectly fine. You can pull in all desired utilities into conftest.py, but the tricky part will be to define the test suite database connectivity configuration (TestDrive) there, and let it be picked up / consumed by the generic CrateDBFixture to be used properly at runtime, because it can't "import" something from tests/conftest.py.

Contributor Author

@pilosus pilosus Nov 22, 2023

So, basically, I've made the CrateDBFixture.reset method take the reset tables as a parameter instead of hardcoding them. That means CrateDBFixture can still live in cratedb_toolkit/testing/testcontainers/cratedb.py, but the reset tables are defined in tests/conftest.py, along with the fixtures that use CrateDBFixture. I hope this addresses your concern about loose coupling.

Member

@amotl amotl Nov 23, 2023

Yeah this was exactly what I was dreaming up regarding separation of concerns, and where I haven't been able to make any progress so far. Thank you very much for resolving that.
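
The decoupling described above can be sketched roughly as follows. This is an assumption-laden illustration, not the merged code: `RecordingConnection` and `GenericTestAdapter` are hypothetical names, and the connection only records SQL so that the snippet runs without a database:

```python
from typing import List, Optional


class RecordingConnection:
    """Hypothetical stand-in for a database connection; records SQL only."""

    def __init__(self) -> None:
        self.statements: List[str] = []

    def exec_driver_sql(self, sql: str) -> None:
        self.statements.append(sql)


class GenericTestAdapter:
    """Shipped with the package; knows nothing about any specific test suite."""

    def __init__(self, connection: RecordingConnection) -> None:
        self.connection = connection

    def reset(self, tables: Optional[List[str]] = None) -> None:
        # The caller decides which tables to drop; nothing is hardcoded here.
        for table in tables or []:
            self.connection.exec_driver_sql(f"DROP TABLE IF EXISTS {table};")


# Test-suite side (e.g. tests/conftest.py): owns the table list and passes it in.
RESET_TABLES = [
    '"testdrive-ext"."retention_policy"',
    '"testdrive-data"."raw_metrics"',
]

adapter = GenericTestAdapter(RecordingConnection())
adapter.reset(tables=RESET_TABLES)
```

The import direction stays one-way: the test suite imports from the generic testing module and hands its configuration over as an argument, so the generic module never needs to import anything from tests/conftest.py.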



class CrateDBFixture:
"""
A little helper wrapping Testcontainer's `CrateDBContainer` and
CrateDB Toolkit's `DatabaseAdapter`, agnostic of the test framework.
"""
Member

I appreciate very much bringing this into the generic cratedb_toolkit.testing module namespace, to make it a re-usable component for other packages.

Member

@amotl amotl Nov 23, 2023

Talking about "naming things": If you have a better idea for the name, in order to better disambiguate from pytest's notion of "fixtures", because it is actually a testframework-agnostic adapter/wrapper around, well, DatabaseAdapter [1] and CrateDBContainer, please let me know [2].

Footnotes

  [1] Also eventually to be renamed to CrateDBClientAdapter or something different.

  [2] As with my other suggestions, the change itself can easily be done on a subsequent iteration. I am just taking the chance to talk with someone about it ;].

Member

Would CrateDBTestAdapter be a more sensible name?

Contributor Author

Given that we have DatabaseAdapter, I like the CrateDBTestAdapter name!

Member

Wonderful. Let us rename it on behalf of a subsequent patch.

Member

I see you already renamed it properly, thanks.

I proposed to change it later, because I feared this would add too much noise to this patch because the symbol would need to be changed at too many places. But, of course, it turns out that it already has been nicely decoupled from the test cases on behalf of the cratedb and cratedb_service pytest fixtures, so my fears were unfounded.


def __init__(self, crate_version: str = "nightly", **kwargs):
self.cratedb: Optional[CrateDBContainer] = None
self.image: str = "crate/crate:{}".format(crate_version)
self.database: Optional[DatabaseAdapter] = None
self.setup(**kwargs)

def setup(self, **kwargs):
self.cratedb = CrateDBContainer(image=self.image, **kwargs)
self.cratedb.start()
self.database = DatabaseAdapter(dburi=self.get_connection_url())

def finalize(self):
if self.cratedb:
self.cratedb.stop()

def reset(self, tables: Optional[list] = TestDrive.RESET_TABLES):
if tables and self.database:
for reset_table in tables:
self.database.connection.exec_driver_sql(f"DROP TABLE IF EXISTS {reset_table};")

def get_connection_url(self, *args, **kwargs):
if self.cratedb:
return self.cratedb.get_connection_url(*args, **kwargs)
return None

Codecov / codecov/patch warning: added line #L192 in cratedb_toolkit/testing/testcontainers/cratedb.py was not covered by tests.

@property
def http_url(self):
"""
Return a URL for HTTP interface
"""
return self.get_connection_url(dialect="http")

Codecov / codecov/patch warning: added line #L199 in cratedb_toolkit/testing/testcontainers/cratedb.py was not covered by tests.
2 changes: 2 additions & 0 deletions pyproject.toml
Expand Up @@ -235,6 +235,8 @@ extend-ignore = [
"RET504",
# Unnecessary `elif` after `return` statement
"RET505",
# Probable insecure usage of temporary file or directory
"S108",
]

extend-exclude = [
53 changes: 4 additions & 49 deletions tests/conftest.py
Expand Up @@ -3,56 +3,11 @@
import pytest
import responses

from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBContainer
from cratedb_toolkit.util import DatabaseAdapter
from cratedb_toolkit.testing.testcontainers.cratedb import CrateDBFixture, TestDrive
from cratedb_toolkit.util.common import setup_logging

# Use different schemas for storing the subsystem database tables, and the
# test/example data, so that they do not accidentally touch the default `doc`
# schema.
TESTDRIVE_EXT_SCHEMA = "testdrive-ext"
TESTDRIVE_DATA_SCHEMA = "testdrive-data"

RESET_TABLES = [
f'"{TESTDRIVE_EXT_SCHEMA}"."retention_policy"',
f'"{TESTDRIVE_DATA_SCHEMA}"."raw_metrics"',
f'"{TESTDRIVE_DATA_SCHEMA}"."sensor_readings"',
f'"{TESTDRIVE_DATA_SCHEMA}"."testdrive"',
f'"{TESTDRIVE_DATA_SCHEMA}"."foobar"',
f'"{TESTDRIVE_DATA_SCHEMA}"."foobar_unique_single"',
f'"{TESTDRIVE_DATA_SCHEMA}"."foobar_unique_composite"',
# cratedb_toolkit.io.{influxdb,mongodb}
'"testdrive"."demo"',
]


class CrateDBFixture:
"""
A little helper wrapping Testcontainer's `CrateDBContainer` and
CrateDB Toolkit's `DatabaseAdapter`, agnostic of the test framework.
"""

def __init__(self):
self.cratedb = None
self.database: DatabaseAdapter = None
self.setup()

def setup(self):
# TODO: Make image name configurable.
self.cratedb = CrateDBContainer("crate/crate:nightly")
self.cratedb.start()
self.database = DatabaseAdapter(dburi=self.get_connection_url())

def finalize(self):
self.cratedb.stop()

def reset(self):
# TODO: Make list of tables configurable.
for reset_table in RESET_TABLES:
self.database.connection.exec_driver_sql(f"DROP TABLE IF EXISTS {reset_table};")

def get_connection_url(self, *args, **kwargs):
return self.cratedb.get_connection_url(*args, **kwargs)
TESTDRIVE_DATA_SCHEMA = TestDrive.DATA_SCHEMA
TESTDRIVE_EXT_SCHEMA = TestDrive.EXT_SCHEMA


@pytest.fixture(scope="session", autouse=True)
Expand All @@ -63,7 +18,7 @@ def configure_database_schema(session_mocker):

If not configured otherwise, the test suite currently uses `testdrive-ext`.
"""
session_mocker.patch("os.environ", {"CRATEDB_EXT_SCHEMA": TESTDRIVE_EXT_SCHEMA})
session_mocker.patch("os.environ", {"CRATEDB_EXT_SCHEMA": TestDrive.EXT_SCHEMA})


@pytest.fixture(scope="session")