Raft multi version testing #559

Merged · 28 commits merged into hashicorp:main on Mar 25, 2024

Conversation

@dhiaayachi (Contributor) commented on Jun 26, 2023

In the past, when introducing features that present a compatibility risk in the raft library, our testing strategy relied on putting the feature behind a feature flag and testing a mix of nodes with the feature activated or deactivated. While this strategy has the benefit of being simple and avoids the need for multi-process tests that run an older release of the raft library against a newer one, it presents a risk: we rely on the premise that, when the feature flag is off, the raft library still behaves exactly like an older version of raft, and any deviation from that creates a testing gap.

This PR is a proposal to introduce a new class of tests in the raft library to cover multi-version testing. The library added here makes it possible to instantiate the new version of raft alongside the previously released version, and gives the ability to create test scenarios such as upgrade tests (demonstrated in this PR as well).

From an implementation perspective, the test suite is its own module. The module introduces two dependencies on raft (a go.mod sketch follows the list):

  • A dependency on the new version, representing the tip of the current branch (using a replace directive pointing to the raft module in the current repo).
  • A dependency on the previously released version, representing the latest released tag (using a replace directive pointing to a submodule pinned to the latest released version, currently 1.5.0).
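
A minimal sketch of what that layout could look like in go.mod; the test module's path and the alias used for the previous version are assumptions for illustration, not necessarily the exact names in this PR:

// go.mod of the multi-version test module (hypothetical module path and alias)
module github.com/hashicorp/raft/raft-compat

go 1.20

require (
	github.com/hashicorp/raft v1.5.0                  // the new version, resolved to the branch tip via the replace below
	github.com/hashicorp/raft-previous-version v1.5.0 // the previously released version (hypothetical alias)
)

// The new version resolves to the raft module at the root of this repo.
replace github.com/hashicorp/raft => ../

// The previous version resolves to a git submodule pinned to the latest release tag (currently v1.5.0).
replace github.com/hashicorp/raft-previous-version => ./raft-previous-version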

The test library introduces a testcluster package that provides the glue for creating and operating the two versions of raft in the same test.

A useful use case for this type of test is the implementation of raft pre-vote. Using the same test strategy we could write a multi-version test (before prevote and after prevote) and verify that raft continues to operate safely when transitioning between the two versions.
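
To give a feel for the shape such a test takes, here is a minimal sketch of a rolling-upgrade scenario; the testcluster helpers used below (NewPreviousVersionCluster, RestartNodeWithLatestVersion, and so on) are hypothetical names for illustration, not the exact API added in this PR:

package raft_compat_test

import (
	"testing"
	"time"

	"github.com/hashicorp/raft/raft-compat/testcluster" // hypothetical import path
)

// TestRollingUpgrade: start a cluster on the previous release, replace nodes
// one by one with the new version, and check that the cluster keeps electing
// a leader and keeps committing entries throughout.
func TestRollingUpgrade(t *testing.T) {
	// Hypothetical helper: three nodes, all running the previously released raft.
	cluster := testcluster.NewPreviousVersionCluster(t, 3)
	defer cluster.Close()

	cluster.WaitForLeader(t, 10*time.Second)

	for _, id := range cluster.NodeIDs() {
		// Stop the old-version node and start a node with the same ID on the
		// new version, reusing the same log and stable stores.
		cluster.RestartNodeWithLatestVersion(t, id)

		// The cluster should keep (or re-elect) a leader and still apply entries.
		cluster.WaitForLeader(t, 10*time.Second)
		cluster.ApplyAndWait(t, []byte("probe"))
	}
}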

TODO:

  • Run tests in CI
  • Automate updating the git submodule after a new release tag is created

@dhiaayachi dhiaayachi requested a review from a team as a code owner June 26, 2023 19:48
@dhiaayachi dhiaayachi requested review from loshz and removed request for a team June 26, 2023 19:48
require.Equal(t, a, leader)

// Remove the node from the cluster configuration before it is restarted on the new raft version.
getLeader.GetRaft().(*raftprevious.Raft).RemoveServer(raftprevious.ServerID(rLatest.ID(i)), 0, 0)

Member:

Hmm I'm in two minds about this step... Consul's default mode for servers is to have leave_on_terminate = false which I think is correct - generally killing a server is either due to an in-place upgrade or a crash. In either case the intent is not to reduce the size of the quorum so not leaving makes sense.

The confusing part is that our docs have, since forever, actively recommended manually performing a consul leave on the server before stopping it, which I've never really liked for the above reasons. I think the benefit of doing so is that when the leader leaves gracefully, leadership can transfer more smoothly, but we now have Leadership Transfer that should explicitly handle that without messing with the quorum.

I think there was some discussion recently about how leaving first also increases the chances of leader disruption because the node leaves but then still interacts with the other raft nodes somehow but I don't recall the details.

I think it would be a better test if we didn't re-configure raft here as that is closer to how a straight replacement upgrade (without manual leave steps) would work.

Member:

FWIW those upgrade docs are being changed and in newer versions the user could force a leadership transfer when upgrading the leader instead of performing a leave.

Bad things can happen when a server leaves the cluster but will come back. Mainly, after leaving, if raft on the node isn't shut down quickly enough it will attempt to kick off leader elections, running up the term. The behavior we observed with a recent issue is that if that node runs up the term by X, then when it rejoins it will trigger X leader elections until all other nodes agree on the term. This has the effect of destabilizing the cluster during the upgrade.
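
For reference, the library already exposes a way to drain a leader without touching the cluster configuration. A minimal sketch of using it before an in-place upgrade; the helper and its surrounding wiring are assumptions, only the raft.Raft calls (State, LeadershipTransfer, Shutdown) are existing API:

package upgradehelper

import (
	"log"

	"github.com/hashicorp/raft"
)

// drainForUpgrade is a sketch: before restarting a node on the new version,
// hand off leadership (if this node is the leader) instead of removing it from
// the configuration, so the quorum size never changes.
func drainForUpgrade(r *raft.Raft) {
	if r.State() == raft.Leader {
		if err := r.LeadershipTransfer().Error(); err != nil {
			log.Printf("leadership transfer failed: %v", err)
		}
	}
	// The process can now be stopped and started again on the new raft version.
	if err := r.Shutdown().Error(); err != nil {
		log.Printf("shutdown error: %v", err)
	}
}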

Contributor Author:

I think, independently of the upgrade procedure documented in Consul, we should have tests for all of those scenarios, as they may happen during an upgrade for different reasons and we need to ensure that the cluster ends up stable in the end. I can set up a table test that includes those scenarios.

@mkeeler The issue you are referring to is related to not having raft prevote implemented. Actually in the test there is an extra call to wait for leadership because of that and I added a comment about it not being necessary when we have prevote.

@dhiaayachi (Contributor Author) commented Jun 27, 2023:

Also, that makes me realize that I'm changing the names of the nodes when upgrading, which does not really reflect an upgrade scenario; I will fix that.

store := rLatest.Store(getLeader.GetLocalID())

// Remove and shut down the leader node
fr := getLeader.GetRaft().(*raftprevious.Raft).RemoveServer(raftprevious.ServerID(getLeader.GetLocalID()), 0, 0)

Member:

As above. Not sure if this is the "right" thing to do here 🤔

@dhiaayachi (Contributor Author) commented Aug 21, 2023:

I added multiple upgrade scenarios as discussed; we now test the following upgrade paths:

  • A node leaves the cluster before rejoining with a new version.
  • A node restarts with a new version without leaving the cluster.
  • A node, if it is the leader, performs a leadership transfer before restarting with a new version.

I think this covers all the possible scenarios, including what we advise in Consul (a table-test sketch follows below).
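
A rough sketch of how those scenarios could be laid out as a table-driven test; the upgradeCluster interface and the constructor below are hypothetical stand-ins for the testcluster glue, not the exact helpers in this PR:

package raft_compat_test

import "testing"

// upgradeCluster is a hypothetical stand-in for the testcluster glue; each
// method performs one of the upgrade paths on the node with the given ID.
type upgradeCluster interface {
	NodeIDs() []string
	LeaveAndRejoinWithNewVersion(t *testing.T, id string)
	RestartInPlaceWithNewVersion(t *testing.T, id string)
	TransferLeadershipThenRestart(t *testing.T, id string)
	WaitForStableLeader(t *testing.T)
	Close()
}

// newPreviousVersionCluster would start n nodes on the previously released
// raft version; hypothetical constructor, left unimplemented in this sketch.
var newPreviousVersionCluster func(t *testing.T, n int) upgradeCluster

func TestUpgradeScenarios(t *testing.T) {
	scenarios := []struct {
		name    string
		upgrade func(t *testing.T, c upgradeCluster, id string)
	}{
		{"leave-then-rejoin", func(t *testing.T, c upgradeCluster, id string) { c.LeaveAndRejoinWithNewVersion(t, id) }},
		{"restart-in-place", func(t *testing.T, c upgradeCluster, id string) { c.RestartInPlaceWithNewVersion(t, id) }},
		{"leader-transfer-then-restart", func(t *testing.T, c upgradeCluster, id string) { c.TransferLeadershipThenRestart(t, id) }},
	}
	for _, sc := range scenarios {
		sc := sc
		t.Run(sc.name, func(t *testing.T) {
			c := newPreviousVersionCluster(t, 3)
			defer c.Close()
			// Upgrade every node in turn; the cluster must stay healthy after each step.
			for _, id := range c.NodeIDs() {
				sc.upgrade(t, c, id)
				c.WaitForStableLeader(t)
			}
		})
	}
}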

@banks @mkeeler could you please give this another round of review? I rebased the raft prevote branch on top of this, as I'm adding some upgrade tests for prevote too and would like to get both merged in the near term.

@dhiaayachi dhiaayachi requested review from a team, dekimsey and randyhdev and removed request for a team March 19, 2024 14:20

@mkeeler (Member) left a comment:

This seems fine to me but I wonder if longer term it would be easier to maintain a container based approach to multi-version testing (or if not containers then multiple compiled test binaries)

For example, if we had a testing only CLI which exposed all of the methods on the Raft via a gRPC API, then you could run these server testing CLIs and then communicate with any cluster generically through the gRPC API. This would avoid needing submodules, importing two versions of the raft library etc. It would come with the complexity of exposing a testing only gRPC API for managing Raft. This API could be relatively small to start with though to only expose just the methods that the tests need and only be expanded upon in the future.
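
As a very rough illustration of that idea (entirely hypothetical; no such API exists in the repo), the testing-only surface could start small and only grow as the tests need it, along these lines:

package rafttestapi

import "context"

// RaftTestService sketches a minimal testing-only API that a per-version test
// binary could expose (for example over gRPC); hypothetical, for illustration only.
type RaftTestService interface {
	// Bootstrap starts a raft node with the given ID and peer addresses.
	Bootstrap(ctx context.Context, id string, peers []string) error
	// Apply submits a command to the cluster through this node.
	Apply(ctx context.Context, cmd []byte) error
	// Leader reports the address of the current leader as seen by this node.
	Leader(ctx context.Context) (string, error)
	// LeadershipTransfer asks this node to hand off leadership if it is the leader.
	LeadershipTransfer(ctx context.Context) error
	// Shutdown stops the raft node.
	Shutdown(ctx context.Context) error
}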

@dhiaayachi (Contributor Author):

> This seems fine to me but I wonder if longer term it would be easier to maintain a container based approach to multi-version testing […]

@mkeeler I considered an approach similar to the test-container approach we use in Consul. The main reason I decided to go with the approach in this PR is debuggability. Especially in a library like raft, being able to see the full system state (all raft instances and their call stacks, code instrumentation using breakpoints...) is important for understanding what's going on while debugging. Also, from a fast-iteration perspective, the approach here is quicker to iterate on while adding features and bug fixes (no need to build a container image...). Add to that the fact that observing the system state through the current API is not always ideal and, as you mentioned, we would need to write code to package the library into a standalone executable, which could be as involved as the wrappers added here.

That said, this approach comes with some drawbacks related to Go's static typing and the need to convert types between the multiple versions. I tried to abstract as much of that as I could, but it's still involved and could be limiting. I suggest we try it and see how far this approach gets us; if the limits become too inconvenient we can switch to a container-based testing solution in the future.

@mkeeler (Member) commented Mar 21, 2024:

Actually, if you defined the gRPC API you could potentially do in-memory gRPC to each Raft instance while not having to care about types, and while still keeping single-instance debugging easy.

Anyways, that's probably too future-thinking for now.

One other drawback to the current single-process approach is that it requires updating git submodules to test with other versions, instead of being able to parameterize the tests with an arbitrary version or versions to test against.

@banks (Member) left a comment:

This looks awesome. Thanks Dhia!

@dhiaayachi dhiaayachi merged commit cc2bb08 into hashicorp:main Mar 25, 2024
12 checks passed