Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Push pagination for EndTxn into the MVCC layer #94939

Merged
merged 2 commits into from
Feb 26, 2023

Conversation

KaiSun314
Copy link
Contributor

@KaiSun314 KaiSun314 commented Jan 9, 2023

Informs: #77228

This PR consists of two parts: adding MVCCPagination and integrating MVCCPagination to push pagination of EndTxn into the MVCC layer.

Part 1: Add MVCCPagination

Implement MVCCPagination which calls a user-inputted function f() until
it returns an error (note that the iterutil.StopIteration() error means
we have iterated through all elements), or until the number of keys hits
the maxKeys limit or the number of bytes hits the targetBytes limit.

MVCCPagination is a general framework that enables key and byte
pagination.

Part 2: Integrate MVCCPagination to Push Pagination of EndTxn Into The MVCC Layer

In this PR, we push key pagination for EndTxn (currently located in
cmd_end_transaction.go) into the MVCC layer (mvcc.go) by integrating
MVCCPagination from the previous commit into EndTxn.

Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Jan 9, 2023

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

I was unable to automatically find a reviewer. You can try CCing one of the following members:

  • A person you worked with closely on this PR.
  • The person who created the ticket, or a CRDB organization member involved with the ticket (author, commenter, etc.).
  • Join our community slack channel and ask on #contributors.
  • Try find someone else from here.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-untriaged blathers was unable to find an owner labels Jan 9, 2023
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from f07e0e9 to 7155de9 Compare January 9, 2023 19:36
@blathers-crl
Copy link

blathers-crl bot commented Jan 9, 2023

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

I was unable to automatically find a reviewer. You can try CCing one of the following members:

  • A person you worked with closely on this PR.
  • The person who created the ticket, or a CRDB organization member involved with the ticket (author, commenter, etc.).
  • Join our community slack channel and ask on #contributors.
  • Try find someone else from here.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch 2 times, most recently from b826111 to dd1ff90 Compare January 9, 2023 21:41
@blathers-crl
Copy link

blathers-crl bot commented Jan 9, 2023

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the X-blathers-triaged blathers was able to find an owner label Jan 9, 2023
@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from dd1ff90 to 107a923 Compare January 10, 2023 00:30
@blathers-crl
Copy link

blathers-crl bot commented Jan 10, 2023

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@KaiSun314 KaiSun314 marked this pull request as ready for review January 10, 2023 02:48
@KaiSun314 KaiSun314 requested review from a team as code owners January 10, 2023 02:48
@KaiSun314 KaiSun314 requested a review from a team January 10, 2023 02:48
@erikgrinaker erikgrinaker removed their request for review January 10, 2023 12:30
@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch 2 times, most recently from 59b4b73 to 3718d04 Compare January 11, 2023 01:07
@blathers-crl
Copy link

blathers-crl bot commented Jan 11, 2023

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from 3718d04 to fef0d8a Compare January 12, 2023 02:12
@blathers-crl
Copy link

blathers-crl bot commented Jan 12, 2023

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from fef0d8a to cae61a6 Compare January 12, 2023 04:01
@KaiSun314 KaiSun314 changed the title Push end txn pagination into mvcc storage: Push end txn pagination into mvcc Jan 12, 2023
@KaiSun314 KaiSun314 changed the title storage: Push end txn pagination into mvcc storage: Push pagination for EndTxn into the MVCC layer Jan 12, 2023
@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from cae61a6 to 3788544 Compare January 13, 2023 02:54
@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from 3788544 to 2d95a9d Compare January 30, 2023 17:53
@blathers-crl
Copy link

blathers-crl bot commented Jan 30, 2023

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @KaiSun314)


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 557 at r2 (raw file):

		span := args.LockSpans[i]
		i++
		if err := func() error {

This anonymous function was used to group the error-handling path so we could wrap all errors with "resolving lock at ..". Now that we're calling storage.MVCCPagination, I don't think we need it. Instead, we can just perform the error wrapping on the error return path after the call to storage.MVCCPagination.


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 574 at r2 (raw file):

				// Note that the underlying pebbleIterator will still be reused
				// since readWriter is a pebbleBatch in the typical case.
				ok, _, _, err := storage.MVCCResolveWriteIntent(ctx, readWriter, ms, update,

MVCCResolveWriteIntent returns a resume span. Should we be consulting it?


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 635 at r2 (raw file):

	}

	externalLocks = append(externalLocks, args.LockSpans[i:]...)

Instead of maintaining i separately and reaching into args.LockSpans multiple times, what do you think about something like:

remainingLockSpans := args.LockSpans
f := func(maxKeys, targetBytes int64) (numKeys int64, numBytes int64, resumeSpan *roachpb.Span, err error) {
    if len(remainingLockSpans) == 0 {
        return 0, 0, nil, iterutil.StopIteration()
    }
    span := remainingLockSpans[0]
    remainingLockSpans = remainingLockSpans[1:]
    ...
    if ... {
        externalLocks = append(externalLocks, span)
    }
    ...
}

externalLocks = append(externalLocks, remainingLockSpans...)

pkg/storage/mvcc.go line 4089 at r1 (raw file):

}

// MVCCPagination invokes f() until it returns done (i.e. we have iterated

The functions in this package are typically verbs (MVCCScan, MVCCIterate, etc.), so to match, we'll want to name this MVCCPaginate.


pkg/storage/mvcc.go line 4090 at r1 (raw file):

// MVCCPagination invokes f() until it returns done (i.e. we have iterated
// through all elements) or an error, or until the number of keys hits the

We should include some of the commentary from MVCCIterate about iterutil.StopIteration.

EDIT: it looks like you added some of this in the second commit. We'll want to move that back to the first.


pkg/storage/mvcc.go line 4091 at r1 (raw file):

// MVCCPagination invokes f() until it returns done (i.e. we have iterated
// through all elements) or an error, or until the number of keys hits the
// maxKeys limit or the number of bytes hits the targetBytes limit.

It would be worthwhile calling out that it's up to f() whether it wants to allow the numBytes to exceed targetBytes by up to one entry or whether it wants to terminate iteration before numBytes exceeds targetBytes. See the AllowEmpty option.

Having written that out, I wonder how that choice interacts with the resumeSpan assertion that you added.


pkg/storage/mvcc.go line 4105 at r1 (raw file):

		}
		addedKeys, addedBytes, resumeSpan, err := f(maxKeys, targetBytes)
		if err != nil {

Is the assumption that addedKeys, addedBytes, and resumeSpan are all empty in this case? If so, should we assert that?


pkg/storage/mvcc.go line 4127 at r1 (raw file):

			}
		}
		if resumeSpan != nil {

I think we had talked about returning the resumeSpan in the past. It's definitely better than nothing because it allows for the assertion, but if we're just going to drop it after the assertion, I wonder whether we should return the resumeReason instead. If we do that, we can tighten up the assertion further. There's also something nice about returning an integer instead of a pointer.

@nvanbenschoten nvanbenschoten removed the X-blathers-untriaged blathers was unable to find an owner label Feb 23, 2023
@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from 4b0d350 to c661e1e Compare February 23, 2023 23:44
Copy link
Contributor Author

@KaiSun314 KaiSun314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 574 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

MVCCResolveWriteIntent returns a resume span. Should we be consulting it?

MVCCResolveWriteIntent will only return a resume span with byte pagination, not key pagination. Byte pagination will be added in #95141, and this PR's goal is to just do refactoring with no behaviour changes


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 635 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Instead of maintaining i separately and reaching into args.LockSpans multiple times, what do you think about something like:

remainingLockSpans := args.LockSpans
f := func(maxKeys, targetBytes int64) (numKeys int64, numBytes int64, resumeSpan *roachpb.Span, err error) {
    if len(remainingLockSpans) == 0 {
        return 0, 0, nil, iterutil.StopIteration()
    }
    span := remainingLockSpans[0]
    remainingLockSpans = remainingLockSpans[1:]
    ...
    if ... {
        externalLocks = append(externalLocks, span)
    }
    ...
}

externalLocks = append(externalLocks, remainingLockSpans...)

Done. Agreed, that is cleaner!


pkg/storage/mvcc.go line 4091 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It would be worthwhile calling out that it's up to f() whether it wants to allow the numBytes to exceed targetBytes by up to one entry or whether it wants to terminate iteration before numBytes exceeds targetBytes. See the AllowEmpty option.

Having written that out, I wonder how that choice interacts with the resumeSpan assertion that you added.

Hmm that's true. I have implemented one possible solution: MVCCPaginate will take in an allowEmpty parameter. If allowEmpty is true, then the resumeReason assertion will not be checked for the targetBytes limit (the maxKeys limit will still be checked regardless of allowEmpty). I was wondering if this would be okay?


pkg/storage/mvcc.go line 4127 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I think we had talked about returning the resumeSpan in the past. It's definitely better than nothing because it allows for the assertion, but if we're just going to drop it after the assertion, I wonder whether we should return the resumeReason instead. If we do that, we can tighten up the assertion further. There's also something nice about returning an integer instead of a pointer.

Done. Agreed, changed to resumeReason, which also works perfectly with the above design to resolve the resumeReason assertion conflicting with the user wishing to terminate iteration before numBytes exceeds targetBytes.

@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from c661e1e to f426a48 Compare February 24, 2023 00:00
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3, 1 of 2 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @KaiSun314)


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 573 at r5 (raw file):

	remainingLockSpans := args.LockSpans
	var span roachpb.Span

Why declare span with this scope? It doesn't seem to be used outside of f, unless I'm missing something.

EDIT: I see that we're using this for the error wrapping. If that's the case, then I think we should just push the errors.Wrapf into the closure. There are only 4 error-handling paths. This gives us the opportunity to specialize the error text for each.


pkg/kv/kvserver/batcheval/cmd_end_transaction.go line 648 at r5 (raw file):

	}

	numKeys, _, _, err := storage.MVCCPaginate(ctx, maxKeys, 0, false, f)

Could you add /* <param name> */ comments for the literal values passed in here?


pkg/storage/mvcc.go line 4091 at r1 (raw file):

Previously, KaiSun314 (Kai Sun) wrote…

Hmm that's true. I have implemented one possible solution: MVCCPaginate will take in an allowEmpty parameter. If allowEmpty is true, then the resumeReason assertion will not be checked for the targetBytes limit (the maxKeys limit will still be checked regardless of allowEmpty). I was wondering if this would be okay?

This seems reasonable to me.


pkg/storage/mvcc.go line 4148 at r3 (raw file):

			}
		}
		if resumeReason == kvpb.RESUME_KEY_LIMIT {

Consider changing this to a switch-case statement so that you can exhaustively test all of the handled resume reasons (RESUME_KEY_LIMIT, RESUME_BYTE_LIMIT, 0) and then assert against the unhandled reasons.

Implement MVCCPagination which calls a user-inputted function f() until
it returns an error (note that the iterutil.StopIteration() error means
we have iterated through all elements), or until the number of keys hits
the maxKeys limit or the number of bytes hits the targetBytes limit.

MVCCPagination is a general framework that enables key and byte
pagination.

Informs: cockroachdb#77228

Release note: None
In this PR, we push key pagination for EndTxn (currently located in
cmd_end_transaction.go) into the MVCC layer (mvcc.go) by integrating
MVCCPagination from the previous commit into the EndTxn command.

Informs: cockroachdb#77228

Release note: None
@KaiSun314 KaiSun314 force-pushed the push-end-txn-pagination-into-mvcc branch from f426a48 to b103e22 Compare February 25, 2023 01:33
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: thanks Kai!

bors r=nvanbenschoten

Reviewed 3 of 3 files at r6, 2 of 2 files at r7, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @KaiSun314)

@craig
Copy link
Contributor

craig bot commented Feb 26, 2023

Build succeeded:

@craig craig bot merged commit 4e6df7b into cockroachdb:master Feb 26, 2023
@KaiSun314
Copy link
Contributor Author

Thank you so much for the review and for merging, Nathan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants