
rangefeed: create catchup iterators eagerly #111045

Merged: 1 commit into cockroachdb:master on Sep 22, 2023

Conversation

@aliher1911 (Contributor) commented on Sep 21, 2023:

Previously, catch-up iterators were created in the main rangefeed processor work loop. This negatively affects scheduler-based processors, since the operation can be slow.
This commit makes iterator creation eager, simplifying error handling and reducing rangefeed delays.

Epic: CRDB-26372

Fixes: #111060
Fixes: #111040

Release note: None
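To make the before/after shape of the change concrete, here is a minimal, hypothetical Go sketch; registerLazy, registerEager, and catchUpIter are illustrative stand-ins, not the actual CockroachDB APIs:

package main

import "fmt"

// catchUpIter stands in for rangefeed.CatchUpIterator.
type catchUpIter struct{}

func (i *catchUpIter) Close() {}

// Before: the processor received a constructor and ran it inside its work
// loop, so a slow iterator creation stalled the (possibly shared) scheduler.
func registerLazy(newIter func() (*catchUpIter, error)) error {
	iter, err := newIter() // executed later, in the processor work loop
	if err != nil {
		return err
	}
	defer iter.Close()
	return nil
}

// After: the caller builds the iterator eagerly, while it still holds the
// relevant locks, and hands the ready iterator to the processor; the work
// loop never blocks on iterator setup and errors surface at the call site.
func registerEager(iter *catchUpIter) {
	defer iter.Close()
}

func main() {
	iter := &catchUpIter{} // created up front, before registration
	registerEager(iter)
	fmt.Println("registered with an eagerly created iterator")
	_ = registerLazy(func() (*catchUpIter, error) { return &catchUpIter{}, nil })
}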

@blathers-crl (bot) commented on Sep 21, 2023:

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity (Member) commented: This change is Reviewable

@miretskiy (Contributor) left a comment:

This is actually quite nice, small and focused. I just have a few nits, but @erikgrinaker should have final approval.

pkg/kv/kvserver/rangefeed/processor.go (outdated, resolved)
pkg/kv/kvserver/rangefeed/processor.go (outdated, resolved)
iter, err := rangefeed.NewCatchUpIterator(r.store.TODOEngine(), rSpan.AsRawSpanWithNoLocals(),
	args.Timestamp, iterSemRelease, pacer)
if err != nil {
	r.raftMu.Unlock()
Contributor:

do we want to keep AssertHeld?

Contributor Author (@aliher1911):

I don't think we do. The only reason for it was that we passed a closure to a method that executed it within its worker. But all that work must be done while we are still holding the lock (in this method above). Now everything is done within the method body under the lock, so there's no point in having the assert.
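A minimal sketch of the error path being discussed, assuming hypothetical names (replica, newCatchUpIter, and rangefeedWithIter stand in for the real replica_rangefeed code):

package main

import "sync"

type catchUpIter struct{}

func (i *catchUpIter) Close() {}

// newCatchUpIter stands in for rangefeed.NewCatchUpIterator.
func newCatchUpIter() (*catchUpIter, error) { return &catchUpIter{}, nil }

type replica struct {
	raftMu sync.Mutex
}

// rangefeedWithIter sketches the call site: the iterator is created inline
// while raftMu is held, and on error the lock is released before returning.
// Because creation no longer happens in a closure run later by a worker, an
// AssertHeld on raftMu inside the constructor would not add anything.
func (r *replica) rangefeedWithIter() error {
	r.raftMu.Lock()
	iter, err := newCatchUpIter()
	if err != nil {
		r.raftMu.Unlock()
		return err
	}
	defer r.raftMu.Unlock()
	// ... hand iter to the registration, which now owns closing it ...
	iter.Close() // placeholder for the hand-off in this sketch
	return nil
}

func main() {
	r := &replica{}
	_ = r.rangefeedWithIter()
}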

// out of container, it is safe to close it regardless of startup success or
// failure.
type CatchupIteratorContainer struct {
	syncutil.Mutex
Contributor:

I'm not sure who we are trying to serialize with... presumably registration/filter work running and racing with some other error encountered elsewhere.

(Totally optional:) maybe type CatchupIteratorContainer atomic.Pointer[CatchUpIterator].

The Get method is then simply a swap.
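A sketch of this suggestion, assuming Go 1.19's sync/atomic.Pointer; the Set/Detach method names here are illustrative, not the actual code:

package main

import (
	"fmt"
	"sync/atomic"
)

type CatchUpIterator struct{}

func (i *CatchUpIterator) Close() {}

// CatchupIteratorContainer holds the iterator until either the registration
// takes ownership of it or an error path closes it. Swap makes both
// operations race-free without a mutex.
type CatchupIteratorContainer struct {
	p atomic.Pointer[CatchUpIterator]
}

// Set stores the iterator into the container.
func (c *CatchupIteratorContainer) Set(iter *CatchUpIterator) { c.p.Store(iter) }

// Detach moves the iterator out of the container (nil if already taken);
// after that, Close on the container will no longer close it.
func (c *CatchupIteratorContainer) Detach() *CatchUpIterator { return c.p.Swap(nil) }

// Close releases the iterator if nobody detached it. Safe on an empty container.
func (c *CatchupIteratorContainer) Close() {
	if iter := c.p.Swap(nil); iter != nil {
		iter.Close()
	}
}

func main() {
	var c CatchupIteratorContainer
	c.Set(&CatchUpIterator{})
	fmt.Println("detached:", c.Detach() != nil)
	c.Close() // no-op: already detached
}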

Contributor Author (@aliher1911):

I think this can happen if the context is cancelled while the processor is handling the registration request. So I'd be uneasy mutating it from two places. I also think that the linter will eventually flake on a data race even if it's safe to do.

Contributor Author (@aliher1911):

I love the idea of atomic pointer + swap. I'll change that.


// Get moves iterator out of container. Calling Close on container won't close
// the iterator after that. Safe to call on empty container.
func (c *CatchupIteratorContainer) Get() (iter *CatchUpIterator) {
Contributor:

nit: Maybe rename to Detach() or Move() or Release()?

@aliher1911 force-pushed the rangefeed_eager_iterator branch from 136c17d to d0bad64 on September 21, 2023 15:19
@aliher1911 (Contributor, Author) left a comment:

Let me address the nits real quick!

@aliher1911 marked this pull request as ready for review on September 21, 2023 15:26
@aliher1911 requested a review from a team on September 21, 2023 15:26
@aliher1911 force-pushed the rangefeed_eager_iterator branch from d0bad64 to 483b6aa on September 21, 2023 15:48
@aliher1911 self-assigned this on Sep 21, 2023
@aliher1911 force-pushed the rangefeed_eager_iterator branch 2 times, most recently from 420ad31 to a5d7730 on September 21, 2023 16:56
@miretskiy (Contributor) left a comment:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aliher1911 and @erikgrinaker)

@miretskiy (Contributor) commented on Sep 21, 2023 via email.

@aliher1911 force-pushed the rangefeed_eager_iterator branch 2 times, most recently from f0cfab1 to 3257904 on September 22, 2023 09:21
@erikgrinaker (Contributor) left a comment:

Did you look at the original motivation for plumbing through the constructor in #69613? I think this is fine, because all the lifecycle events appear to synchronize with raftMu, but I wonder if we're missing something.

pkg/kv/kvserver/rangefeed/scheduled_processor.go (outdated, resolved)
pkg/kv/kvserver/rangefeed/processor.go (outdated, resolved)
@aliher1911 force-pushed the rangefeed_eager_iterator branch from 3257904 to cf35678 on September 22, 2023 12:06
@aliher1911 (Contributor, Author) commented:
The original PR you mentioned tried to move iterator creation out of the registration work loop, which is guaranteed to run after the current raft lock has been released, by which point some data might get through. But it did so by passing a function instead of constructing the iterator upfront. Since all the relevant locks are held while we are waiting for the registration, I don't see why we can't create the iterator upfront, other than trying to avoid creating it on the failure path.

The container was trying to do the same, but just handling it on the error paths might be better. The only difference is that it was handled implicitly by a defer when nobody had started using the iterator.

Moving construction into registerWithRangefeedRaftMuLocked() seems ugly, as we would have to pull the limiter callbacks down the stack because they are attached to the iterator. So passing the iterator down and letting registerWithRangefeedRaftMuLocked() handle closing seems no uglier to me than passing a callback just for the sake of iterator creation.
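A rough sketch of that ownership hand-off, with placeholder names (registerWithRangefeed stands in for registerWithRangefeedRaftMuLocked; the fail flag is purely illustrative):

package main

import (
	"errors"
	"fmt"
)

type CatchUpIterator struct{ closed bool }

func (i *CatchUpIterator) Close() { i.closed = true }

// registerWithRangefeed sketches the ownership hand-off: it receives the
// already-constructed iterator (with limiter callbacks attached at creation
// time by the caller) and owns closing it on every failure path, so no
// constructor callback needs to be plumbed down the stack.
func registerWithRangefeed(iter *CatchUpIterator, fail bool) error {
	if fail {
		// Registration never started; close the iterator here rather than
		// relying on a deferred closure at the caller.
		iter.Close()
		return errors.New("registration failed")
	}
	// On success the registration takes ownership and closes it later.
	return nil
}

func main() {
	iter := &CatchUpIterator{}
	err := registerWithRangefeed(iter, true)
	fmt.Println(err, "iterator closed:", iter.closed)
}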

@erikgrinaker (Contributor) left a comment:

LGTM, thanks for the simplification.

Moving construction into registerWithRangefeedRaftMuLocked() seems ugly, as we would have to pull the limiter callbacks down the stack because they are attached to the iterator. So passing the iterator down and letting registerWithRangefeedRaftMuLocked() handle closing seems no uglier to me than passing a callback just for the sake of iterator creation.

Yeah, this is better, thanks.

pkg/kv/kvserver/replica_rangefeed.go (resolved)
pkg/kv/kvserver/replica_rangefeed.go (outdated, resolved)
pkg/kv/kvserver/rangefeed/registry.go (resolved)
pkg/kv/kvserver/rangefeed/registry_test.go (outdated, resolved)
Previously, catch-up iterators were created in the main rangefeed
processor work loop. This negatively affects scheduler-based
processors, since the operation can be slow.
This commit makes iterator creation eager, simplifying error handling
and reducing rangefeed delays.

Epic: none

Release note: None
@aliher1911 force-pushed the rangefeed_eager_iterator branch from b50a6f0 to 54b8a73 on September 22, 2023 15:30
@aliher1911 (Contributor, Author) commented:
bors r=erikgrinaker

@craig (bot) commented on Sep 22, 2023:

Build succeeded.

@craig (bot) merged commit 235babd into cockroachdb:master on Sep 22, 2023. 3 checks passed.