Implement raft-wal #21460

raskchanky · 2023-06-26T22:04:30Z

This provides a config option called raft_wal, which can be set to "true" or "false" (raft backend config is map[string]string) for optionally enabling the use of https://github.com/hashicorp/raft-wal for raft storage instead of BoltDB.

If Vault is configured to use raft-wal but it detects raft.db in the normal spot, it continues to use raft.db and logs a warning that it's ignoring the raft-wal config, similar to what Consul does.
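
To make the two paragraphs above concrete, here is a minimal sketch of the selection logic, not the code in this PR. `parseRaftWAL` and `chooseLogStore` are hypothetical names, treating a missing key as "disabled" is an assumption, and the existence of raft.db is passed in as a plain bool so the sketch does not claim to show how the file check is really done.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseRaftWAL reads the "raft_wal" key from the raft backend config map.
// Assumption: a missing key means raft-wal is disabled.
// Note that strconv.ParseBool also accepts forms like "1" and "T",
// which is broader than the "true"/"false" described above.
func parseRaftWAL(conf map[string]string) (bool, error) {
	raw, ok := conf["raft_wal"]
	if !ok {
		return false, nil
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("invalid raft_wal value %q: %w", raw, err)
	}
	return enabled, nil
}

// chooseLogStore is a hypothetical helper. If raft-wal is requested but an
// existing raft.db was found, it keeps BoltDB and signals that a warning
// should be logged about the ignored raft-wal setting.
func chooseLogStore(walEnabled, boltFileExists bool) (backend string, warn bool) {
	if walEnabled && boltFileExists {
		return "boltdb", true
	}
	if walEnabled {
		return "raft-wal", false
	}
	return "boltdb", false
}

func main() {
	enabled, err := parseRaftWAL(map[string]string{"raft_wal": "true"})
	if err != nil {
		panic(err)
	}
	backend, warn := chooseLogStore(enabled, true)
	fmt.Println("backend:", backend, "warn:", warn)
}
```

Keeping the decision separate from the filesystem check makes it easy to cover all combinations in a table driven test.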

I modified the raft test helpers to allow a config map to be passed in, and then made most of the raft tests table driven, so they can exercise both boltdb and raft-wal. Most of the changes in that test file are mechanical, just moving existing test code into a table driven setup and adding a bit of error checking.

On the subject of "why are we doing this":
There are 2 main motivations for using raft-wal instead of raft-boltdb for raft storage: performance and stability. In microbenchmarks I've done, raft-wal is roughly 10% faster than raft-boltdb for normal operations. That's nice. On the stability front, since it is a data store that's designed specifically for raft storage, there's no freelist to contend with. Which means that, unlike raft-boltdb, there's no cruft to build up over time as the raft log is repeatedly truncated. When using raft-boltdb, eventually you're going to need to rotate your nodes out in order to compact boltdb, otherwise the freelist will grow large and negatively impact write performance, which will negatively impact the stability of your cluster.

raskchanky · 2023-08-08T23:04:08Z

@banks If you get a few free moments, I wonder if you could take a quick peek at this and see if it looks reasonable. I know you also did some work on a verifier as part of integrating raft-wal into Consul, but I wasn't sure 1) if that was also appropriate here and 2) if so, how to actually implement that.

banks

Looking great Josh. Couple of comments inline including the file existence check which I don't think is quite right.

For the verifier, yes it's just as applicable to us as it is to Consul. It basically provides some confidence that the LogStore is not corrupting data (whether it is BoltDB or raft-wal), which otherwise is extremely hard to detect.

Implementing requires a few things:

1. Config to enable and set up how often to write checkpoints (see consul docs).
2. The current active node (raft leader) needs to use that config to periodically apply a special Raft entry. This part could be a little more complex in Vault because the code that runs on the "active node" in general is not aware of rafty things, and the raft storage is not directly aware of whether it is the leader or not, or at least doesn't run different operations if it is the leader right now IIRC. We should be able to figure out some way to make this work, and I hope without breaking too many abstractions (ideally it would be confined to the Raft backend, not spread through Vault's HA code when it's raft specific, but we can see how that goes). That code needs to do something like this: https://github.com/hashicorp/consul/blob/e235c8be3c67ed1389af017a76b29a8452b86453/agent/consul/leader_log_verification.go
3. We need code that defines how to classify a checkpoint operation in Raft vs any other one and how to report success or failures to logs. This should probably live in the raft backend package and look something like this: https://github.com/hashicorp/consul/blob/e235c8be3c67ed1389af017a76b29a8452b86453/agent/consul/server_log_verification.go
4. Plumbing to wire that all up: https://github.com/hashicorp/consul/blob/e235c8be3c67ed1389af017a76b29a8452b86453/agent/consul/server.go#L1076-L1083
5. This is the most subtle bit: the way we record checksums on the checkpoints may or may not play nicely with Vault's existing usage of chunking and/or FSM. In Consul I had to add a Shim to the FSM because of the way raft-chunking expects the raft extra data field to be used. Since we didn't fix that yet, and since we have to be compatible with older Vault builds to avoid crashes during upgrade anyway, we may need a similar shim and some careful testing of the mixed-version as well as mixed-enabled vs disabled configs. See https://github.com/hashicorp/consul/blob/e235c8be3c67ed1389af017a76b29a8452b86453/agent/consul/fsm/log_verification_chunking_shim.go#L18
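
As a rough illustration of the leader loop and checkpoint classification described in the list above, here is a self-contained sketch. It does not use the real raft or raft-wal verifier APIs: `Log` is a local stand-in for a raft log entry, `checkpointMarker` is a made-up one-byte payload, and `runCheckpoints`, `isLeader` and `apply` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Log is a local stand-in for a raft log entry (not the real raft.Log).
type Log struct {
	Index uint64
	Data  []byte
}

// checkpointMarker is a hypothetical payload that marks a checkpoint entry.
const checkpointMarker byte = 0xFF

// isCheckpoint classifies a log as a checkpoint or a regular entry.
func isCheckpoint(l *Log) bool {
	return l != nil && len(l.Data) == 1 && l.Data[0] == checkpointMarker
}

// runCheckpoints applies a checkpoint entry on every tick while this node
// reports itself as leader, and returns once ctx is cancelled.
func runCheckpoints(ctx context.Context, interval time.Duration, isLeader func() bool, apply func([]byte) error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if !isLeader() {
				continue // followers never write checkpoints
			}
			if err := apply([]byte{checkpointMarker}); err != nil {
				fmt.Println("checkpoint apply failed:", err)
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	applied := 0
	runCheckpoints(ctx, 10*time.Millisecond,
		func() bool { return true },
		func(b []byte) error {
			if isCheckpoint(&Log{Data: b}) {
				applied++
			}
			return nil
		})
	fmt.Println("checkpoints applied:", applied)
}
```

Passing `isLeader` and `apply` in as functions is one way to keep the loop confined to the raft backend without it knowing about the HA code.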

physical/raft/raft.go

physical/raft/raft_test.go

github-actions · 2023-08-23T18:47:47Z

CI Results:
All Go tests succeeded! ✅

raskchanky · 2023-08-29T00:43:00Z

@banks Thanks for all the tips. I think I got items 1-4 on your list implemented (modulo some better acceptance tests). Item 5 on your list has me a bit puzzled still, in terms of where it goes.

banks · 2023-08-29T09:39:43Z

@raskchanky item 5 may not be needed though I suspect it will. The way you tell is:

1. Start up a Vault cluster where the leader is using this branch with verifier enabled but at least one follower is using current HEAD or a previous release version
2. See if it crashes

😄

The problem is that the verifier relies on being able to write additional data into the Extensions field in each raft log. go-raftchunking was the original user of this field and even tried to design itself in a way that lets other extensions use it too, because it re-encodes existing Extension data into its own and vice versa.

The problem though is that the log verifier needs to be able to write to the Extension field at a lower level than go-raftchunking. go-raftchunking is middleware that wraps the entire Raft Commit -> FSM Apply flow from the outside, while the log verifier needs to have the checksums computed and written to the log from inside the leader's LogStore abstraction at the lowest level of the Raft stack.

The good news is that the verifier only cares about Extension on Checkpoint log entries, which by definition will never need to be chunked by go-raftchunking, so the two things can operate independently.

The bad news is that go-raftchunking's FSM layer assumes that if any log has a non-nil Extensions field, it must have been encoded by raftchunking.Apply on the other side, so it tries to decode it as its own protobuf state and errors if it can't. An error during apply causes a panic since Raft can't really make progress if it can't apply a log.

Long term it would be nicer to change raftchunking so that it had its own magic prefix and ignored others, but I didn't do that for Consul because even if we did, we'd also need some sort of migration process because logs on disk would have been persisted the old way etc. It was simpler to just add the FSM shim I linked to before, which used a heuristic that I could guarantee was always safe to "do the right thing" and intercept the new checkpoint log entries before they reach the raftchunking.FSM.Apply and cause an error. Actually they should be a no-op at the FSM layer since all the validation happens at the VerifyingLogStore layer, so we just need to intercept them to prevent the error.
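
The shim idea can be sketched in a few lines. This is not the Consul shim or the real raftchunking FSM: `Log`, `FSM`, `strictFSM`, `checkpointShim` and `looksLikeCheckpoint` are all hypothetical stand-ins, and the heuristic shown is only an illustration of "recognise the checkpoint, make it a no-op, pass everything else through".

```go
package main

import (
	"errors"
	"fmt"
)

// Log is a local stand-in for a raft log entry.
type Log struct {
	Data       []byte
	Extensions []byte
}

// FSM is a minimal stand-in for the apply side of a raft FSM.
type FSM interface {
	Apply(l *Log) interface{}
}

// strictFSM mimics the behavior described above: any non-nil Extensions
// is assumed to be its own encoded state, and anything else is an error.
type strictFSM struct{}

func (strictFSM) Apply(l *Log) interface{} {
	if l.Extensions != nil {
		return errors.New("cannot decode extensions as chunking state")
	}
	return "applied"
}

// looksLikeCheckpoint is a made-up heuristic for this sketch only.
func looksLikeCheckpoint(l *Log) bool {
	return len(l.Data) == 1 && l.Data[0] == 0xFF && l.Extensions != nil
}

// checkpointShim wraps another FSM and swallows checkpoint entries so the
// wrapped FSM never sees Extensions it cannot decode.
type checkpointShim struct {
	next         FSM
	isCheckpoint func(*Log) bool
}

func (s checkpointShim) Apply(l *Log) interface{} {
	if s.isCheckpoint(l) {
		return nil // no-op: validation already happened in the log store layer
	}
	return s.next.Apply(l)
}

func main() {
	shim := checkpointShim{next: strictFSM{}, isCheckpoint: looksLikeCheckpoint}
	checkpoint := &Log{Data: []byte{0xFF}, Extensions: []byte("checksum")}
	regular := &Log{Data: []byte("put foo")}
	fmt.Println(shim.Apply(checkpoint), shim.Apply(regular))
}
```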

Does that make sense?

Not reviewed the other changes here yet, happy to chat about this if you want to figure out the best approach together.

banks

This looks super close @raskchanky! 🎉

I think the one thing we should review is the encoding of checkpoints so that we can avoid the overhead of double-parsing every single log on all servers at different FSM layers! I don't think that's too gross from a quick look but we can talk about it. It would mean bypassing or slightly refactoring applyLogs to let us write the raw raft.Log somehow.

physical/raft/raft.go

go.mod

physical/raft/raft.go

physical/raft/raft_test.go

banks

Looking awesome @raskchanky!

I'll submit this now although I have noted a couple things I'm going to come back to later. Mostly minor nits but one or two places where what you have works but in theory could possibly hit edge cases now or in the future so probably best to tighten those up!

go.mod

physical/raft/fsm.go

physical/raft/raft.go

vault/raft.go

banks

Lots of nits in here that really don't matter, so I'll leave them to you if you think they are worth fiddling with.

The one thing I think we should look at before merge is the BatchApply handling of the empty log. It seems to work now but seems alarmingly brittle, near-missing panics and bad violations of assumptions by pure chance in several places that have no explicit intention to allow this behavior! More inline. I don't think this should be too hard, and at worst I'd consider accepting it by just adding inline comments to all those places where we happen to do the "right" thing by chance now, so it's at least harder to accidentally break later!
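
For illustration only, here is what "explicit" handling of an empty log in a batch apply could look like. This is not the code in physical/raft/fsm.go: `Log` is a local stand-in and `applyBatch` is a hypothetical helper. The one property it leans on is that a batch apply should return one response per input log, in the same order.

```go
package main

import "fmt"

// Log is a local stand-in for a raft log entry.
type Log struct {
	Index uint64
	Data  []byte
}

// applyBatch returns exactly one response per input log. Logs with no data
// are skipped on purpose and keep a nil response, instead of relying on
// downstream code happening to tolerate them.
func applyBatch(logs []*Log, applyOne func(*Log) interface{}) []interface{} {
	responses := make([]interface{}, len(logs))
	for i, l := range logs {
		if l == nil || len(l.Data) == 0 {
			// Explicit no-op for empty logs; responses[i] stays nil.
			continue
		}
		responses[i] = applyOne(l)
	}
	return responses
}

func main() {
	logs := []*Log{
		{Index: 1, Data: []byte("a")},
		{Index: 2}, // empty log, e.g. an entry with nothing for the FSM to do
		{Index: 3, Data: []byte("b")},
	}
	out := applyBatch(logs, func(l *Log) interface{} { return l.Index })
	fmt.Println(len(out), out)
}
```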

physical/raft/fsm.go

physical/raft/raft.go

banks · 2024-01-23T20:23:33Z

physical/raft/raft.go

+	if boltStore, ok := b.stableStore.(*raftboltdb.BoltStore); ok {
+		bss := boltStore.Stats()
+		logStoreStats = &bss
+	}
+


I wonder if we have tests that cover these metrics being produced? I don't see any usages of CollectMetrics in raft_test.go at least. If not it's probably worth (at least) adding to a TODO list for manual testing to be sure we don't regress those here.

raskchanky · 2024-01-23

I'm not sure I'm following what kind of test you're looking for. CollectMetrics is called here as part of the main metrics collection loop. Can you be more specific?

banks · 2024-01-25

We talked offline but for the sake of GH history, I just meant that I don't think we have unit tests that assert that boltdb metrics are actually reported when this is called either locally or in general in Vault.

So it would be possible to typo this change and not fail any tests but break our actual metrics around boltdb.

Ideally we'd have some sort of unit test there but since there is no precedent we can plan to add one later, but it would be wise to at least manually verify that when using BoltDB on this branch we still get stats output as before!
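
A sketch of the kind of check meant here, using stand-in types rather than the real raftboltdb.BoltStore or Vault's CollectMetrics: `boltLikeStore`, `walLikeStore` and `collectLogStoreStats` are hypothetical. It mirrors the type assertion in the diff above so a typo in that branch would fail a test.

```go
package main

import "fmt"

// boltStats is a stand-in for the stats struct a bolt-backed store returns.
type boltStats struct {
	FreePageN int
}

// boltLikeStore stands in for a BoltDB-backed stable store.
type boltLikeStore struct {
	stats boltStats
}

func (s *boltLikeStore) Stats() boltStats { return s.stats }

// walLikeStore stands in for a store that has no bolt stats to report.
type walLikeStore struct{}

// collectLogStoreStats returns stats only when the store is bolt-backed,
// following the same type-assertion pattern as the diff above.
func collectLogStoreStats(store interface{}) *boltStats {
	if bs, ok := store.(*boltLikeStore); ok {
		st := bs.Stats()
		return &st
	}
	return nil
}

func main() {
	bolt := &boltLikeStore{stats: boltStats{FreePageN: 7}}
	fmt.Println(collectLogStoreStats(bolt) != nil)            // stats present
	fmt.Println(collectLogStoreStats(&walLikeStore{}) == nil) // stats absent
}
```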

physical/raft/raft.go

Co-authored-by: Paul Banks <[email protected]>

banks

We talked offline but here's the other thing we should fix to get the verifier working again.

physical/raft/raft.go

raskchanky

@banks 293f027 looks good to me

banks

Amazing job @raskchanky

This looks good to go!

mladlow · 2024-03-11T21:51:45Z

changelog/21460.txt

@@ -0,0 +1,3 @@
+```release-note:feature
+storage/raft: Add experimental support for raft-wal, a new backend engine for integrated storage.


@raskchanky next time please use the correct new feature formatting for new features in the changelog.

raskchanky · 2024-03-11

Thanks for the reminder. I did correct this in a subsequent PR.

raskchanky · 2024-03-11

This seems like a good candidate for a new CI check, so that we don't have to rely on humans remembering to always do the right thing in several different scenarios.

mladlow · 2024-03-12

@raskchanky I've added an agenda item to discuss new requirements for the changelog checking tooling.

raskchanky added 4 commits June 26, 2023 14:56

Implement raft-wal

ca1c92f

go mod tidy

c45413f

add metrics, fix a panic

27aede1

fix the panic for real this time

3f0c104

raskchanky requested review from banks, ncabatoff, mpalmi and hghaf099 June 30, 2023 20:28

VioletHynes added the hashicorp-contributed-pr label Jul 6, 2023

banks reviewed Aug 9, 2023

Merge branch 'main' into raft-wal

d669640

vercel bot deployed to Preview August 23, 2023 18:32 View deployment

raskchanky added 4 commits August 23, 2023 13:07

PR feedback

fd567dd

refactor tests to use a helper and reduce duplication

567243f

add a test to verify we don't use raft-wal if raft.db exists

3d03e8c

Merge branch 'main' into raft-wal

bee2d0b

vercel bot deployed to Preview August 24, 2023 16:05 View deployment

raskchanky added 4 commits August 28, 2023 14:14

add config to enable the verifier

766c53b

add tests for parsing verification intervals

b00ae66

run the verifier in the background

bed9e34

wire up the verifier

555afb4

raskchanky added 2 commits August 30, 2023 13:39

go mod tidy

1fb11f4

Merge branch 'main' into raft-wal

3110582

vercel bot deployed to Preview August 30, 2023 20:46 View deployment

banks requested changes Sep 8, 2023

raskchanky marked this pull request as ready for review September 8, 2023 17:55

Merge branch 'main' into raft-wal

a7b31cc

vercel bot deployed to Preview January 5, 2024 01:24 View deployment

Merge branch 'main' into raft-wal

9f567cd

vercel bot deployed to Preview January 8, 2024 22:55 View deployment

Merge branch 'main' into raft-wal

d70b266

vercel bot deployed to Preview January 9, 2024 16:36 View deployment

banks self-requested a review January 11, 2024 15:17

banks requested changes Jan 11, 2024

PR feedback

6071268

raskchanky added this to the 1.16.0-rc1 milestone Jan 12, 2024

hghaf099 and others added 2 commits January 16, 2024 16:27

Vault 20270 docker test raft wal (#24463)

1bb29b3

* adding a migration test from boltdb to raftwal and back adding a migration test using snapshot restore * feedback

Merge branch 'main' into raft-wal

4c5f434

vercel bot deployed to Preview January 23, 2024 19:39 View deployment

banks requested changes Jan 23, 2024

Update physical/raft/raft.go

fa90807

Co-authored-by: Paul Banks <[email protected]>

banks requested changes Jan 24, 2024

raskchanky and others added 7 commits January 24, 2024 09:12

PR feedback

395984b

change verifier function

54448fb

Merge branch 'main' into raft-wal

af9b42f

make this shorter

7d42da4

Merge branch 'main' into raft-wal

b9e433e

add changelog

83ea124

Fix Close behavior

293f027

raskchanky commented Jan 25, 2024

raskchanky added 2 commits January 25, 2024 09:28

make supporting empty logs more explicit

fab75fe

add some godocs

dfd68eb

banks approved these changes Jan 25, 2024

raskchanky merged commit ef26498 into main Jan 25, 2024
110 checks passed

raskchanky deleted the raft-wal branch January 25, 2024 18:08

mladlow reviewed Mar 11, 2024
