Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: have sequenced reads use the intent history #33244

Merged
merged 4 commits into from
Feb 5, 2019

Conversation

ridwanmsharif
Copy link

Reads can now use the intent history to serve reads from
values written at a lower sequence. Reads no longer read
all writes in a transaction but instead read all writes
in the transaction with a higher sequence than it. If a key
is written to multiple times within a transaction, the read
can now get the appropriate value using the intent history.

Release note: None

@ridwanmsharif ridwanmsharif requested a review from a team December 18, 2018 19:30
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ridwanmsharif ridwanmsharif force-pushed the sequenced-reads branch 5 times, most recently from f782b0c to c185bf0 Compare December 19, 2018 20:50
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 5 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 169 at r1 (raw file):

  }

  bool getFromIntentHistory(cockroach::storage::engine::enginepb::MVCCMetadata meta, int32_t seq) {

It's strange that we pass two fields into this method but then set another field - so it's not a fully static method. I'd recommend passing no fields in here.

@petermattis might also have other comments about the C++ style.


c-deps/libroach/mvcc.h, line 170 at r1 (raw file):

  bool getFromIntentHistory(cockroach::storage::engine::enginepb::MVCCMetadata meta, int32_t seq) {
    int32_t largestSeq = -1;

We should use a binary_search to accomplish this lookup.


c-deps/libroach/mvcc.h, line 290 at r1 (raw file):

    if (txn_epoch_ == meta_.txn().epoch()) {
      if (txn_sequence_ >= meta_.txn().sequence()) {

We're going to need to make this logic conditional on a cluster version. To do this, we'll need to go through a few steps, outlined in cockroach_versions.go.

We'll then want to check whether this version is "active" somewhere in or above the mvcc layer. This is going to be a little tricky because we'll need the right dependencies in a few places. Here's my suggestion:

  1. Add a new IgnoreSeq field to MVCCGetOptions and MVCCScanOptions
  2. Comment with a TODO(nvanbenschoten) that the field can go away in 2.3.
  3. Set this field when the new cluster version is NOT active in cmd_get.go, cmd_scan.go, and cmd_reverse_scan.go (I just counted and I think we'll need to set this in 5 places, but please double check). Here's an example of how to check whether a cluster setting is active.
  4. Pass this option down through C++
  5. Make this condition (ignore_seq_ || (txn_sequence_ >= meta_.txn().sequence()))

c-deps/libroach/mvcc.h, line 291 at r1 (raw file):

    if (txn_epoch_ == meta_.txn().epoch()) {
      if (txn_sequence_ >= meta_.txn().sequence()) {
        // 8. We're reading our own txn's intent at a higher sequence.

"an equal or higher sequence"


c-deps/libroach/mvcc.h, line 299 at r1 (raw file):

        // 9. We're reading our own txn's intent at a lower sequence.
        // This means the intent we're seeing was written after the read.
        // So we try and read a value from the intent history that is

s/So//


c-deps/libroach/mvcc.h, line 300 at r1 (raw file):

        // This means the intent we're seeing was written after the read.
        // So we try and read a value from the intent history that is
        // appropriate instead. If no value is found still, read the value

s/still//


c-deps/libroach/mvcc.h, line 303 at r1 (raw file):

        // as though the intent was not present.
        bool found = getFromIntentHistory(meta_, txn_sequence_);
        if (!found) {

nit: invert this condition so that we can separate this into a separate "steps" (e.g. step 10.)


pkg/kv/txn_interceptor_sequence_nums.go, line 69 at r1 (raw file):

// populateMetaLocked is part of the txnInterceptor interface. Since reads now
// require the sequence number, we must pass it through as well.

No need for this comment.


pkg/kv/txn_interceptor_sequence_nums.go, line 77 at r1 (raw file):

// augmentMetaLocked is part of the txnInterceptor interface. Since reads now
// require the sequence number, we must ensure we don't ever underestimate
// the sequence.

No need for this comment.

Copy link
Author

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 169 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It's strange that we pass two fields into this method but then set another field - so it's not a fully static method. I'd recommend passing no fields in here.

@petermattis might also have other comments about the C++ style.

Done.


c-deps/libroach/mvcc.h, line 170 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We should use a binary_search to accomplish this lookup.

Done. Using std:lower_bound that does almost exactly what we want (internally uses binary search).


c-deps/libroach/mvcc.h, line 290 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We're going to need to make this logic conditional on a cluster version. To do this, we'll need to go through a few steps, outlined in cockroach_versions.go.

We'll then want to check whether this version is "active" somewhere in or above the mvcc layer. This is going to be a little tricky because we'll need the right dependencies in a few places. Here's my suggestion:

  1. Add a new IgnoreSeq field to MVCCGetOptions and MVCCScanOptions
  2. Comment with a TODO(nvanbenschoten) that the field can go away in 2.3.
  3. Set this field when the new cluster version is NOT active in cmd_get.go, cmd_scan.go, and cmd_reverse_scan.go (I just counted and I think we'll need to set this in 5 places, but please double check). Here's an example of how to check whether a cluster setting is active.
  4. Pass this option down through C++
  5. Make this condition (ignore_seq_ || (txn_sequence_ >= meta_.txn().sequence()))

Adding a Commit to the PR that'll add a cluster setting.


c-deps/libroach/mvcc.h, line 291 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"an equal or higher sequence"

Done.


c-deps/libroach/mvcc.h, line 299 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/So//

Done.


c-deps/libroach/mvcc.h, line 300 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/still//

Done.


pkg/kv/txn_interceptor_sequence_nums.go, line 69 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No need for this comment.

Done.


pkg/kv/txn_interceptor_sequence_nums.go, line 77 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No need for this comment.

Done.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r2, 11 of 11 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 29 at r2 (raw file):

// compareIntents compares two sequenced intents to check if the first one
// has a sequence equal or lower than the second.
bool compareIntents(cockroach::storage::engine::enginepb::MVCCMetadata_SequencedIntent intent1,

make the args const &.


c-deps/libroach/mvcc.h, line 31 at r2 (raw file):

bool compareIntents(cockroach::storage::engine::enginepb::MVCCMetadata_SequencedIntent intent1,
    cockroach::storage::engine::enginepb::MVCCMetadata_SequencedIntent intent2) {
  if (intent1.sequence() <= intent2.sequence()) {

This can be return intent1.sequence() <= intent2.sequence();, right?

I think it should also be strictly less, based on https://en.cppreference.com/w/cpp/algorithm/lower_bound.


c-deps/libroach/mvcc.h, line 195 at r2 (raw file):

      return false;
    }
    const auto intent = *(low - 1);

std::lower_bound returns " the first element in the range [first, last) that is not less than value". So does this logic work if we query at the same sequence as an intent? I think we might want std::upper_bound.


c-deps/libroach/mvcc.h, line 312 at r2 (raw file):

        // This means the intent we're seeing was written after the read.
        // If there exists a value in the intent history that has a sequence
        // number lower than the read sequence, read that value.

equal to or lower than!


c-deps/libroach/mvcc.h, line 317 at r2 (raw file):

          return true;
        }
        // 10. If no value in the intent history has a sequence number lower

ditto

Copy link
Author

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR! Your last review was too chill @nvanbenschoten. You gave me some hope that this gets through before I leave cause there was nothing to iterate on. Give it another 👀 to see if you find anything else, pretty excited now!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 29 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

make the args const &.

Done.


c-deps/libroach/mvcc.h, line 31 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This can be return intent1.sequence() <= intent2.sequence();, right?

I think it should also be strictly less, based on https://en.cppreference.com/w/cpp/algorithm/lower_bound.

Well that isn't really necessary because making it <= would allow for intents written at the same seq to be read as well. But I changed it anyway as I'm now using upper_bound.


c-deps/libroach/mvcc.h, line 195 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

std::lower_bound returns " the first element in the range [first, last) that is not less than value". So does this logic work if we query at the same sequence as an intent? I think we might want std::upper_bound.

Actually lower_bound used to work correctly even in that case as the compare predicate function would consider <= instead of just <. Since, upper_bound achieves the same thing, and is the convention we should stick with it. Done.


c-deps/libroach/mvcc.h, line 312 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

equal to or lower than!

Done.


c-deps/libroach/mvcc.h, line 317 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

ditto

Done.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ridwanmsharif Have you run any of the MVCC benchmarks (in storage/engine) to see the impact of this change? This is performance sensitive code and extra care needs to be taken. I haven't had a chance to look at this change yet, so perhaps there is nothing of concern.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS The relevant benchmarks for this change are the various MVCCScan benchmarks.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@ridwanmsharif ridwanmsharif requested a review from a team December 21, 2018 16:03
Copy link
Author

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran a few benchmarks and there was virtually no hit in the benchmarks you mention. I believe that makes sense as the extra logic for reading the right value from history would apply on when there are a few writes on the same key by the same transaction. We've not been writing history for a while and the history is expected to be pretty small as well. I defer judgement to @nvanbenschoten now. Doesn't seem like I will be able to iterate on this anymore. I trust you will review this and push this through. I'll check back on this from my flight. Great working with you all. Happy Halloween! And Happy Holidays!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@nvanbenschoten
Copy link
Member

@ridwanmsharif sorry for letting this sit. I rebased this on master, fixed a small bug, and cleaned it up a bit. We're close to being ready to merge! The only issue is that I don't have permission to push to the pull request. Do you mind following these steps: https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/?

@ridwanmsharif
Copy link
Author

That's great to hear! I do see that this PR already checks the "Allow edits from maintainers" option. I don't quite understand why you can't push to it. Also sent you an invitation to be a collaborator on the fork, maybe that fixes this? @nvanbenschoten

@nvanbenschoten
Copy link
Member

I think the issue was actually that I was pushing to the wrong branch name. Pushing to the correct one fixed it! I left the first two commits alone (other than a rebase) and added two more. @ridwanmsharif you might be interested in looking them over. Otherwise, @petermattis, mind reviewing those two changes when you get a chance?

I'm gone for the next two weeks, so @petermattis feel free to merge this if everything checks out on your end. I'm also happy to merge when I return in case there's any fallout.

@nvanbenschoten nvanbenschoten dismissed their stale review January 11, 2019 16:31

The third commit addressed the rest of my feedback.

Copy link
Author

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM. Thanks for fixing the bug too! I'll defer to @petermattis for the final ✔️

Reads can now use the intent history to serve reads from
values written at a lower sequence. Reads no longer read
all writes in a transaction but instead read all writes
in the transaction with a higher sequence than it. If a key
is written to multiple times within a transaction, the read
can now get the appropriate value using the intent history.

Release note: None
Ridwan Sharif and others added 3 commits February 5, 2019 16:35
We need a cluster version for this change. We make sure
to set this newly created `IgnoreSequence` to true each time we see a
cluster version is too low.

Release note: None
This also fixes one bug, where MVCCScan would get stuck reading the
same key and never advancing. The added test cases caught this.

Release note: None
To do this, it needed to read at the previous sequence number before
writing at the current sequece number.

Release note: None
@nvanbenschoten
Copy link
Member

Ok, now that #34025 has calmed down, I'm going to go ahead and merge this. Thanks again for all the hard work on this @ridwanmsharif!

bors r+

craig bot pushed a commit that referenced this pull request Feb 5, 2019
33244: storage: have sequenced reads use the intent history r=nvanbenschoten a=ridwanmsharif

Reads can now use the intent history to serve reads from
values written at a lower sequence. Reads no longer read
all writes in a transaction but instead read all writes
in the transaction with a higher sequence than it. If a key
is written to multiple times within a transaction, the read
can now get the appropriate value using the intent history.

Release note: None

Co-authored-by: Ridwan Sharif <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig
Copy link
Contributor

craig bot commented Feb 5, 2019

Build succeeded

@craig craig bot merged commit 20bf018 into cockroachdb:master Feb 5, 2019
@nvanbenschoten nvanbenschoten deleted the sequenced-reads branch February 6, 2019 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants