Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raftlog: introduce & use in loqrecovery, kvserver #76126

Merged
merged 2 commits into from
Nov 12, 2022

Conversation

tbg
Copy link
Member

@tbg tbg commented Feb 6, 2022

This commit introduces a new package raftlog. Aspirationally, this
will at some point become the de facto way of encapsulating the raft log
encoding and may play a role in programmatically constructing and
inspecting raft logs (e.g. for testing).

For now, we introduce two concepts:

  • raftlog.Entry, which wraps a raftpb.Entry and all of the
    information derived from it, such as the command ID, the
    kvserverpb.RaftCommand, the configuration change (if any), etc.

  • raftlog.Iterator, a way to iterate over the raft log in terms of
    raftlog.Entry (as opposed to raftpb.Entry which requires lots of
    manual processing).

Both are then applied across the codebase, concretely:

  • loqrecovery is simplified via raftlog.Iterator to pull commit
    triggers out of the raft log.
  • debug pretty-printing is simpler thanks to use of raftlog.Entry.
  • decodedRaftEntry is now structurally a raftpb.Entry, and again
    lots manual custom unmarshaling code evaporates.

It's currently difficult to create "interesting" raft log entries if
trying to stay away from manual population of large datastructures
(which is prone to rotting), so there's zero unit testing of
raftlog.Iterator. However, the code is not new, instead it was
deduplicated from a few places, and is now tested through all of them;
so I don't feel to bad about it. I still think it is a priority to be
able to "comfortably" create at least "simple" raft logs, meaning we
need to be able to string together batcheval and entry creation at
least in a rudimentary fashion. I intend to look into this next and
add comprehensive unit tests for raftlog.{Entry,Iterator}.

Touches #75729.

Epic: CRDB-220
Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg tbg marked this pull request as ready for review February 6, 2022 22:23
@tbg tbg requested a review from a team as a code owner February 6, 2022 22:23
@tbg tbg requested review from erikgrinaker, a team and aliher1911 February 6, 2022 22:24
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see this cleanup!

pkg/kv/kvserver/loqrecovery/collect.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/entry.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/entry.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica_application_cmd.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/entry.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/entry.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive-by comments. Thanks for doing this!

Reviewed 8 of 12 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, and @tbg)


pkg/kv/kvserver/raftlog/entry.go, line 50 at r1 (raw file):

}

// Load populates the Entry from the Ent field. Will reset the Meta field.

wondering why this comment mentions the Meta field. If it populates Entry doesn't that imply that everything will be reset?
I guess it is because of this weird behavior where sometimes we have to unmarshal MVCCMetadata to get a raftpb.Entry and sometimes we are handed a raftpb.Entry directly. Is the latter case the one where we have it in the cache?


pkg/kv/kvserver/raftlog/entry.go, line 98 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

nit: consider typing this out as ConfChange, ditto with the fields. It's too terse when encountered in the code.

+1


pkg/kv/kvserver/raftlog/iterator.go, line 52 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Seek shouldn't take a bound. Put the iterator construction and bounds in NewIterator, and have Seek take a single index.

And rename it SeekGE just to be consistent with other code.


pkg/kv/kvserver/raftlog/iterator.go, line 80 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Seems iffy to do this as a side-effect of Valid(). This iterator pattern that we generally use is pretty awkward in terms of error handling and such (particularly decoding errors), but what we usually end up doing is to call into this decoding in Seek and Next, and set an error in Iterator.err which is surfaced from Valid().

Making UnsafeEntry() return a decoding error might be the saner choice here -- I think that's what we want to do with e.g. MVCCIterator down the road too.

MVCCIterator will probably move to a model where Next (and seeks) will return (valid bool, err error), since the interface call overhead of calling Valid every time is not insignificant (there is a TODO in engine.go). We may as well adopt a similar approach here, which means we would decode in Next and SeekGE.

@tbg tbg force-pushed the raftloglib branch 3 times, most recently from 1bbb2af to ec57fc4 Compare February 11, 2022 16:59
Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed the comments, ready for another round. I still need to do a little perf work here (left a comment there as well).

Dismissed @erikgrinaker and @sumeerbhola from 5 discussions.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, and @sumeerbhola)


pkg/kv/kvserver/replica_application_cmd.go, line 45 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Remove the comment -- ent no longer exists.

Done.


pkg/kv/kvserver/replica_application_cmd.go, line 187 at r2 (raw file):

	var err error
	d.Entry, err = raftlog.MakeEntry(*e)

I need to look at this one again, I think as is I'm allocating more than before because MakeEntry will internally put the return value on the heap.


pkg/kv/kvserver/raftlog/iterator.go, line 52 at r1 (raw file):

Previously, sumeerbhola wrote…

And rename it SeekGE just to be consistent with other code.

Done.


pkg/kv/kvserver/raftlog/iterator.go, line 80 at r1 (raw file):

Previously, sumeerbhola wrote…

MVCCIterator will probably move to a model where Next (and seeks) will return (valid bool, err error), since the interface call overhead of calling Valid every time is not insignificant (there is a TODO in engine.go). We may as well adopt a similar approach here, which means we would decode in Next and SeekGE.

I've implemented Sumeer's suggestion.

Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a few hours figuring out why once I pushed the new code all packages were timing out. I haven't figured it out yet, but ultimately it seems to be caused by a rebase on master I carried out. I seem to be able to pass all tests when rebased on 4ea464d (the initial commit this PR started out on top of) but on 18e4193 it's broken.

Since I need to figure this out anyway, no need to review again yet. I will likely end up making some changes anyway.

Reviewed 2 of 10 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911, @erikgrinaker, and @sumeerbhola)

@tbg

This comment was marked as resolved.

@tbg

This comment was marked as resolved.

@tbg tbg force-pushed the raftloglib branch 2 times, most recently from 679cc04 to 0237730 Compare February 21, 2022 11:01
@tbg tbg force-pushed the raftloglib branch 2 times, most recently from 7302e83 to 2960aed Compare August 19, 2022 13:17
@tbg
Copy link
Member Author

tbg commented Aug 19, 2022

Worked on this a bit today (thanks, Flex Friday!) and this should actually work now. Let's hold off on reviews until the stability period stuff is a bit less hectic, though.

tbg added a commit to tbg/cockroach that referenced this pull request Sep 9, 2022
This is more legible; these are effectively random bytes.

Extracted from cockroachdb#76126.

Release note: None
Copy link
Collaborator

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, ta! Left a few nits, and questions about the semantics.

I think this PR could have been easier to review if split so that one commit introduced the Entry type, and another one for the Iterator.

pkg/kv/kvserver/raftlog/entry.go Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica_application_cmd.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica_application_cmd.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/replica_application_cmd.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Outdated Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Show resolved Hide resolved
pkg/kv/kvserver/raftlog/iterator.go Show resolved Hide resolved
Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR could have been easier to review if split so that one commit introduced the Entry type, and another one for the Iterator.

Thanks for the feedback, I'll be a bit more aggressive about splitting next time. I find that it does slow me down quite a bit so it needs to be worth it.

I'd like to add some more unit testing to this so this needs some more work, but please PTAL at the fixup commits. Additional testing will come in follow-up commits (I don't want to merge before we have this), so by establishing a checkpoint here you don't have to revisit it later.

Reviewed 7 of 7 files at r9.
Dismissed @erikgrinaker from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @pavelkalinnikov, and @sumeerbhola)


pkg/kv/kvserver/replica_application_cmd.go line 45 at r10 (raw file):

Previously, pavelkalinnikov (Pavel Kalinnikov) wrote…

How about removing the decodedRaftEntry wrapper, and directly inlining entry *raflog.Entry? There seems to be no reason to keep it?

Well spotted, done!


pkg/kv/kvserver/replica_application_cmd.go line 84 at r10 (raw file):

Previously, pavelkalinnikov (Pavel Kalinnikov) wrote…

ctx seems unused all the way down the stack from here

Done.


pkg/kv/kvserver/raftlog/entry.go line 86 at r10 (raw file):

Previously, pavelkalinnikov (Pavel Kalinnikov) wrote…

ConfChangeV1

The type is raftpb.ConfChange, and that's what we're unmarshalling (into).


pkg/kv/kvserver/raftlog/iterator.go line 89 at r10 (raw file):

Previously, pavelkalinnikov (Pavel Kalinnikov) wrote…

Can there be gaps in the log indices? I.e. isn't the first log index >= idx equal to idx (or none)?

I imagine a gap can happen when the index is compacted away? Will this be an error then, or depends on who iterates?

Maybe would be worth elaborating nuances around entries with consecutive indices, somewhere in the Iterator comment.

There are never any gaps. I'll make this clear on the Iterator comment.


pkg/kv/kvserver/raftlog/iterator.go line 101 at r10 (raw file):

Previously, pavelkalinnikov (Pavel Kalinnikov) wrote…

Will iteration ever jump by more than +1 index? What precisely does "next entry" mean?

Index+1. I added a comment.


pkg/kv/kvserver/raftlog/iterator.go line 130 at r10 (raw file):

Previously, pavelkalinnikov (Pavel Kalinnikov) wrote…

Is reading one extra entry unavoidable here? E.g. could the storage iterator have an upper bound that it more efficiently terminates at?

Yes, see the TODO in NewIterator. Actually I went ahead and implemented it.

@tbg tbg force-pushed the raftloglib branch 2 times, most recently from 9092822 to 396d847 Compare November 10, 2022 22:11
@tbg
Copy link
Member Author

tbg commented Nov 10, 2022

Added another commit with unit testing, which can also be reviewed.

tbg added a commit to tbg/cockroach that referenced this pull request Nov 11, 2022
We should no longer need it. Also, it introduces the complication
of "empty but ID'ed raft commands", which is annoying. We could
probably change this to propose a truly `nil` command (mimicking
what raft does internally) but it is better to avoid this problem
altogether.

If there truly is a case in which we need to nudge raft, it likely
also exists outside of the `splitDelayHelper`, and the sooner we
find out about it the better.

Extracted from cockroachdb#76126.

Release note: None
craig bot pushed a commit that referenced this pull request Nov 12, 2022
87710: kvserver: remove ProposeEmptyCommand r=erikgrinaker a=tbg

We should no longer need it. Also, it introduces the complication
of "empty but ID'ed raft commands", which is annoying. We could
probably change this to propose a truly `nil` command (mimicking
what raft does internally) but it is better to avoid this problem
altogether.

If there truly is a case in which we need to nudge raft, it likely
also exists outside of the `splitDelayHelper`, and the sooner we
find out about it the better.

Extracted from #76126.

Epic: None
Release note: None


Co-authored-by: Tobias Grieger <[email protected]>
This commit introduces a new package `raftlog`. Aspirationally, this
will at some point become the de facto way of encapsulating the raft log
encoding and may play a role in programmatically constructing and
inspecting raft logs (e.g.  for testing).

For now, we introduce two concepts:

- `raftlog.Entry`, which wraps a `raftpb.Entry` and all of the
  information derived from it, such as the command ID, the
`kvserverpb.RaftCommand`, the configuration change (if any), etc.

- `raftlog.Iterator`, a way to iterate over the raft log in terms of
  `raftlog.Entry` (as opposed to `raftpb.Entry` which requires lots of
manual processing).

Both are then applied across the codebase, concretely:

- `loqrecovery` is simplified via `raftlog.Iterator` to pull commit
  triggers out of the raft log.
- debug pretty-printing is simpler thanks to use of `raftlog.Entry`.
- `decodedRaftEntry` is now structurally a `raftpb.Entry`, and again
  lots manual custom unmarshaling code evaporates.

It's currently difficult to create "interesting" raft log entries if
trying to stay away from manual population of large datastructures
(which is prone to rotting), so there's zero unit testing of
`raftlog.Iterator`. However, the code is not new, instead it was
deduplicated from a few places, and is now tested through all of them;
so I don't feel to bad about it. I still think it is a priority to be
able to "comfortably" create at least "simple" raft logs, meaning we
need to be able to string together `batcheval` and entry creation at
least in a rudimentary fashion. I intend to look into this next and
add comprehensive unit tests for `raftlog.{Entry,Iterator}`.

Touches cockroachdb#75729.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request Nov 12, 2022
This is cockroachdb#76126, not to be reviewed in this PR. It will be rebased away
once that PR has merged.
Protobuf makes it hard to do well here.

```
BenchmarkNewEntry/fromRawValue=false,release=false-10   1319 ns/op   4890 B/op  10 allocs/op
BenchmarkNewEntry/fromRawValue=false,release=true-10    1128 ns/op   4442 B/op   9 allocs/op
BenchmarkNewEntry/fromRawValue=true,release=false-10    2611 ns/op  13213 B/op  13 allocs/op
BenchmarkNewEntry/fromRawValue=true,release=true-10     2536 ns/op  12766 B/op  12 allocs/op
```

Release note: None
@tbg
Copy link
Member Author

tbg commented Nov 12, 2022

TFTR!

bors r=pavelkalinnikov

@craig
Copy link
Contributor

craig bot commented Nov 12, 2022

Build succeeded:

@craig craig bot merged commit fa82973 into cockroachdb:master Nov 12, 2022
@tbg tbg deleted the raftloglib branch November 13, 2022 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants