[DPE-4856][MISC] Port some test_self_healing CI fixes + update check for invalid extra user credentials #546
Conversation
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #546   +/-   ##
=======================================
  Coverage   70.89%   70.89%
=======================================
  Files          11       11
  Lines        3024     3024
  Branches      535      535
=======================================
  Hits         2144     2144
  Misses        764      764
  Partials      116      116
```

☔ View full report in Codecov by Sentry.
```diff
@@ -259,7 +259,7 @@ def check_for_invalid_extra_user_roles(self, relation_id: int) -> bool:
         for data in relation.data.values():
             extra_user_roles = data.get("extra-user-roles")
             if extra_user_roles is None:
-                break
+                continue
```
Necessary to correctly set/keep the blocked status due to invalid extra user roles; see the typo fixed in the test below.
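For context, a minimal sketch of the corrected scan (a hypothetical standalone function, not the charm's exact code, which lives in a class and reads Juju relation databags): with `break`, the loop stopped at the first databag lacking `extra-user-roles`, so roles set in later databags were never validated; `continue` merely skips that databag.

```python
# Minimal sketch: validate extra-user-roles across all relation databags
# instead of stopping at the first databag that does not set the field.
def has_invalid_extra_user_roles(databags: list[dict], valid_roles: set[str]) -> bool:
    for data in databags:
        extra_user_roles = data.get("extra-user-roles")
        if extra_user_roles is None:
            continue  # skip this databag; `break` silently ignored the rest
        roles = {role.strip().lower() for role in extra_user_roles.split(",")}
        if not roles.issubset(valid_roles):
            return True
    return False

# E.g. the invalid role in the second databag is now caught:
assert has_invalid_extra_user_roles(
    [{}, {"extra-user-roles": "admin,cluster-admin"}], valid_roles={"admin"}
)
```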
```diff
@@ -563,7 +563,7 @@ async def test_invalid_extra_user_roles(ops_test: OpsTest):
         f"{DATABASE_APP_NAME}:database", f"{DATA_INTEGRATOR_APP_NAME}:postgresql"
     )
     await ops_test.model.wait_for_idle(apps=[DATABASE_APP_NAME])
-    ops_test.model.block_until(
+    await ops_test.model.block_until(
```
There was a missing `await` here, which hid a bug in the charm behavior :( Thankfully @dragomirp caught this typo while looking at the logs when debugging the Nextcloud error.
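To illustrate why the missing `await` hid the bug (a toy stand-in, not libjuju's actual implementation): calling an async function without `await` only creates a coroutine object, so the `block_until` condition was never actually checked and the test sailed past it.

```python
import asyncio
import time

async def block_until(predicate, timeout: float = 5.0) -> None:
    """Toy stand-in for libjuju's Model.block_until: poll until predicate holds."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() > deadline:
            raise asyncio.TimeoutError("condition never met")
        await asyncio.sleep(0.1)

async def main() -> None:
    ready_at = time.monotonic() + 0.5
    block_until(lambda: time.monotonic() > ready_at)        # bug: coroutine created, never run
    await block_until(lambda: time.monotonic() > ready_at)  # fix: actually waits ~0.5 s

asyncio.run(main())  # the un-awaited call also emits a "never awaited" RuntimeWarning
```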
```diff
@@ -583,29 +583,40 @@ async def test_invalid_extra_user_roles(ops_test: OpsTest):
     )
 
 
-@pytest.mark.group(1)
+@pytest.mark.group(2)
 @markers.amd64_only  # nextcloud charm not available for arm64
 async def test_nextcloud_db_blocked(ops_test: OpsTest, charm: str) -> None:
```
Since a recent update to the nextcloud charm, its deploy procedure takes longer, and the fast-forwarding of the update-status hook was causing the charm to never enter blocked status. To avoid making the test group run too long, this test was moved to a different group.
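Roughly what the moved test now looks like (a sketch only; the deploy parameters and wait conditions are assumptions, and `pytest.mark.group` is the repo's own marker): the deploy runs outside pytest-operator's `fast_forward()` so update-status fires at its normal cadence and the charm can settle into blocked status.

```python
import pytest
from pytest_operator.plugin import OpsTest

@pytest.mark.group(2)  # isolated group: the nextcloud deploy is now too slow for group 1
async def test_nextcloud_db_blocked(ops_test: OpsTest, charm: str) -> None:
    await ops_test.model.deploy("nextcloud", num_units=1)  # channel omitted: illustrative
    # Deliberately not wrapped in `async with ops_test.fast_forward():` --
    # the shortened update-status interval kept the charm from reaching blocked.
    await ops_test.model.wait_for_idle(
        apps=["nextcloud"], status="blocked", timeout=2000
    )
```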
```diff
@@ -261,6 +244,9 @@ async def test_full_cluster_restart(
 
     # Change the loop wait setting to make Patroni wait more time before restarting PostgreSQL.
     initial_loop_wait = await get_patroni_setting(ops_test, "loop_wait")
+    initial_ttl = await get_patroni_setting(ops_test, "ttl")
+    # loop_wait parameter is limited by ttl value, thus we should increase it first
+    await change_patroni_setting(ops_test, "ttl", 600, use_random_unit=True)
```
We need to increase the ttl configuration before increasing loop_wait due to a constraint from Patroni; see the warning in https://patroni.readthedocs.io/en/latest/dynamic_configuration.html
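Under the hood this maps to Patroni's REST API: dynamic settings are changed with a PATCH to `/config`, and Patroni constrains the values so that (roughly) `loop_wait + 2 * retry_timeout <= ttl`, which is why ttl has to be raised first. A hedged sketch of the ordering (the endpoint address and the loop_wait value are illustrative, and the test's real helper goes through the charm, not raw HTTP):

```python
import requests

PATRONI_URL = "http://10.0.0.10:8008"  # any cluster member; address is illustrative

def change_patroni_setting(name: str, value: int) -> None:
    # PATCH /config updates Patroni's dynamic configuration cluster-wide.
    response = requests.patch(f"{PATRONI_URL}/config", json={name: value}, timeout=10)
    response.raise_for_status()

# ttl bounds loop_wait, so lift the ceiling first; on cleanup, restore the
# settings in the reverse order (lower loop_wait, then lower ttl).
change_patroni_setting("ttl", 600)
change_patroni_setting("loop_wait", 300)  # illustrative value
```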
```diff
@@ -145,8 +148,9 @@ async def test_storage_re_use(ops_test, continuous_writes):
 @pytest.mark.group(1)
 @pytest.mark.abort_on_fail
 @pytest.mark.parametrize("process", DB_PROCESSES)
+@pytest.mark.parametrize("signal", ["SIGTERM", pytest.param("SIGKILL", marks=markers.juju2)])
 async def test_kill_db_process(
     ops_test: OpsTest, process: str, continuous_writes, primary_start_timeout
```
Refactor akin to what was previously done on K8s.
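Stacking the two parametrize decorators expands the test into a (process, signal) matrix, with the SIGKILL cases carrying the juju2 marker. A self-contained sketch of the mechanism (the `DB_PROCESSES` values and the skip marker standing in for `markers.juju2` are assumptions):

```python
import pytest

DB_PROCESSES = ["postgresql", "patroni"]  # illustrative values

@pytest.mark.parametrize("process", DB_PROCESSES)
@pytest.mark.parametrize(
    "signal",
    ["SIGTERM", pytest.param("SIGKILL", marks=pytest.mark.skip(reason="stand-in for markers.juju2"))],
)
def test_kill_db_process(process: str, signal: str) -> None:
    # pytest generates one test per pair: (postgresql, SIGTERM), (patroni, SIGTERM),
    # plus the SIGKILL pairs, which the marker restricts (here: skips).
    assert signal in {"SIGTERM", "SIGKILL"}
```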
```diff
@@ -642,15 +642,13 @@ async def get_primary(ops_test: OpsTest, app, down_unit: str = None) -> str:
     """
     for unit in ops_test.model.applications[app].units:
         if unit.name != down_unit:
-            break
```
Change this helper function to match the behavior of the K8s variant: run the action on each unit once, instead of retrying multiple times on the same unit.
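A hedged sketch of the reworked helper (the `get-primary` action name and `primary` result key are assumptions based on the charm's actions; the exact body may differ): iterate over the units, skip the down one, and ask each unit once until a valid answer comes back.

```python
from pytest_operator.plugin import OpsTest

async def get_primary(ops_test: OpsTest, app: str, down_unit: str = None) -> str:
    """Try each alive unit once instead of retrying a single unit repeatedly."""
    for unit in ops_test.model.applications[app].units:
        if unit.name == down_unit:
            continue  # skip the unit that is known to be down
        action = await unit.run_action("get-primary")
        action = await action.wait()
        primary = action.results.get("primary", "None")
        if primary != "None":
            return primary
    raise Exception("Primary not found.")
```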
```diff
     await send_signal_to_process(ops_test, primary_name, process, signal)
 
+    # Wait some time to elect a new primary.
+    sleep(MEDIAN_ELECTION_TIME * 6)
```
This and the other added sleeps allow time for a new primary to be elected, much like we do on K8s.
Advanced sleep() programming!
- test_self_healing: ported CI fixes previously done on K8s, improving overall test stability (but still not stabilizing it completely)
- test_new_relations: adapted the logic of checking for invalid user roles to make the test pass (credits to @dragomirp)

Issues encountered and follow-up tickets: