[DPE-5827] Set all nodes to synchronous replicas #672

Open. Wants to merge 15 commits into main.

dragomirp
Contributor

@dragomirp dragomirp commented Nov 14, 2024

Set all replicas to sync

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.83%. Comparing base (4c4b810) to head (3cdf0ba).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #672   +/-   ##
=======================================
  Coverage   71.82%   71.83%           
=======================================
  Files          13       13           
  Lines        3219     3220    +1     
  Branches      477      476    -1     
=======================================
+ Hits         2312     2313    +1     
  Misses        791      791           
  Partials      116      116           

@dragomirp dragomirp marked this pull request as ready for review November 15, 2024 20:40
@@ -629,7 +629,7 @@ def render_patroni_yml_file(
stanza=stanza,
restore_stanza=restore_stanza,
version=self.get_postgresql_version().split(".")[0],
minority_count=self.planned_units // 2,
synchronous_node_count=self.planned_units - 1,
Contributor Author

The -1 accounts for the leader.
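
To make the arithmetic concrete, here is a minimal sketch comparing the previous `// 2` expression with the new `- 1` expression from the diff. The function names `old_sync_count` and `new_sync_count` are hypothetical helpers for illustration only; they do not exist in the charm.

```python
def old_sync_count(units: int) -> int:
    """Previous expression in the diff: half of the planned units, rounded down."""
    return units // 2


def new_sync_count(units: int) -> int:
    """New expression in the diff: every unit except the leader."""
    return units - 1


# Print both values side by side for a few cluster sizes.
for units in (1, 2, 3, 4, 5):
    print(f"units={units} old={old_sync_count(units)} new={new_sync_count(units)}")
```

With three units the old expression gives one synchronous node and the new one gives two, so every non-leader unit is expected to become a sync standby.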

@@ -846,7 +846,7 @@ def update_synchronous_node_count(self, units: int | None = None) -> None:
with attempt:
r = requests.patch(
f"{self._patroni_url}/config",
- json={"synchronous_node_count": units // 2},
+ json={"synchronous_node_count": units - 1},
Contributor Author

REST calls should also use the new value.
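
As a rough illustration of the request this hunk sends, the sketch below builds (but does not send) a PATCH to Patroni's `/config` endpoint carrying the new value. It swaps in the standard library's `urllib.request` for the `requests` call and retry loop used in the charm, so it runs without a network. The helper name `build_sync_count_request` and the address `10.0.0.1` are assumptions for the example.

```python
import json
import urllib.request


def build_sync_count_request(patroni_url: str, units: int) -> urllib.request.Request:
    """Build a PATCH request for Patroni's /config endpoint (hypothetical helper)."""
    # Same payload as the new line in the diff: all units except the leader.
    body = json.dumps({"synchronous_node_count": units - 1}).encode("utf-8")
    return urllib.request.Request(
        f"{patroni_url}/config",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# The address is a placeholder; nothing is sent over the network here.
req = build_sync_count_request("http://10.0.0.1:8008", units=3)
print(req.get_method(), req.full_url, req.data)
```

Passing the resulting object to `urllib.request.urlopen` would send it; in the charm the equivalent call is wrapped in a retry loop, which this sketch leaves out.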

@dragomirp dragomirp requested review from a team, taurus-forever, marceloneppel and lucasgameiroborges and removed request for a team November 15, 2024 21:09
Member

@lucasgameiroborges lucasgameiroborges left a comment

Code LGTM, but why are we setting all nodes to be sync replicas? From the mentioned ticket it should come from a config value?

@dragomirp
Contributor Author

> Code LGTM, but why are we setting all nodes to be sync replicas? From the mentioned ticket it should come from a config value?

What was discussed was to switch the default behaviour to all sync and to add a config option for controlling the number of sync replicas. The config option needs a spec. I'll comment on the ticket to clarify.

Member

@marceloneppel marceloneppel left a comment

Cool! In my testing it works fine in almost all scenarios.

The scenario where it's not working yet is the upgrade:

juju deploy postgresql --channel 14/stable -n 3

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, one sync-standby and one replica, as expected.

juju run postgresql/leader pre-upgrade-check

juju refresh --path ./*.charm postgresql

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, one sync-standby and one replica, but we should have one leader and two sync-standby only.

juju add-unit postgresql -n 1

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, and three sync-standby, as expected.
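
The manual `curl ... | jq` checks above could be automated along these lines. This is a sketch only: it assumes the `/cluster` response is a JSON object with a `members` list whose entries carry a `role` field, and that the role strings are `leader`, `sync_standby` and `replica`. The sample payload and the helper names are made up for illustration.

```python
from collections import Counter


def count_roles(cluster: dict) -> Counter:
    """Count how many members report each role in a /cluster payload."""
    return Counter(member["role"] for member in cluster.get("members", []))


def all_standbys_sync(cluster: dict) -> bool:
    """True when there is exactly one leader and no asynchronous replica."""
    roles = count_roles(cluster)
    return roles["leader"] == 1 and roles["replica"] == 0


# Illustrative payload mirroring the state observed after the refresh.
after_refresh = {
    "members": [
        {"name": "postgresql-0", "role": "leader"},
        {"name": "postgresql-1", "role": "sync_standby"},
        {"name": "postgresql-2", "role": "replica"},
    ]
}

print(count_roles(after_refresh))
print(all_standbys_sync(after_refresh))
```

For the post-refresh state described above the check returns `False`, because one member is still a plain replica.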

Contributor

@taurus-forever taurus-forever left a comment

LGTM, but I need some time to try it out without rushing.
Let's merge it after releasing the current PostgreSQL VM to stable.

@dragomirp
Contributor Author

> The scenario where it's not working yet is the upgrade:

5082317 should fix the upgrade. Not adding an assertion to the integration tests because functionality will drift when the changes land on edge and stable.

Member

@marceloneppel marceloneppel left a comment

> The scenario where it's not working yet is the upgrade:
>
> 5082317 should fix the upgrade. Not adding an assertion to the integration tests because functionality will drift when the changes land on edge and stable.

Thanks, Dragomir! LGTM!
