Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: disallow parallel commit of batch with read-only requests #44608

Merged

Conversation

nvanbenschoten
Copy link
Member

Fixes #44315.

This commit fixes an issue where KV batches with read-only requests were allowed
to be issued as parallel commits. This should not have been allowed because the
outcome of these read-only requests, notably Get and Scan requests, is not taken
into consideration by the status resolution process for STAGING transactions.

This would have caused serious atomicity issues if parallel commit batches
(non-1PC committing batches) were issued with read-only requests by users of the
KV API. Luckily, in practice that wouldn't actually happen as we never issue
batches like that so there's no concern about production clusters hitting this
bug. Still, this is a very good catch as this was a serious footgun and fixing
this hardens the KV API. We should also still backport this to v19.2 just to be
safe.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@danhhz
Copy link
Contributor

danhhz commented Feb 3, 2020

I could rubber stamp this, the change looks fine and like it's doing what it says it does, but would we prefer a real review?

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but to be honest I'm not sure what semantics we want for reads in the commit batch. I think you can argue for committing even if these reads fail. But we can punt until we have a better error recovery story.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @danhhz, and @nvanbenschoten)


pkg/kv/txn_interceptor_committer.go, line 279 at r1 (raw file):

// canBatchCommitInParallel determines whether the batch can issue its
// committing EndTxn in parallel with the rest of its requests.
func (tc *txnCommitter) canBatchCommitInParallel(

The new name doesn't mean much to me. How about just canCommitInParallel?
And I'd rephrase the comment too; it's not just about committing in parallel with the other requests in this batch, but also with the in-flight writes.

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/parCommitReadOnly branch from b66c6c3 to 0f03869 Compare February 4, 2020 01:04
Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @danhhz)


pkg/kv/txn_interceptor_committer.go, line 279 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

The new name doesn't mean much to me. How about just canCommitInParallel?
And I'd rephrase the comment too; it's not just about committing in parallel with the other requests in this batch, but also with the in-flight writes.

Done.

@craig
Copy link
Contributor

craig bot commented Feb 4, 2020

Build failed (retrying...)

@craig
Copy link
Contributor

craig bot commented Feb 4, 2020

Build failed

@nvanbenschoten
Copy link
Member Author

no space left on device then TestRandomKeyAndTimestampExport flake (#44589).

bors r+

Fixes cockroachdb#44315.

This commit fixes an issue where KV batches with read-only requests were allowed
to be issued as parallel commits. This should not have been allowed because the
outcome of these read-only requests, notably Get and Scan requests, is not taken
into consideration by the status resolution process for STAGING transactions.

This would have caused serious atomicity issues if parallel commit batches
(non-1PC committing batches) were issued with read-only requests by users of the
KV API. Luckily, in practice that wouldn't actually happen as we never issue
batches like that so there's no concern about production clusters hitting this
bug. Still, this is a very good catch as this was a serious footgun and fixing
this hardens the KV API. We should also still backport this to v19.2 just to be
safe.

Release note: None
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/parCommitReadOnly branch from 0f03869 to 56a9934 Compare February 4, 2020 03:45
@nvanbenschoten
Copy link
Member Author

Hello bors?

{{:badmatch,
  {:error, :push, 422,
   "{\"message\":\"Required status check \\\"license/cla\\\" is expected.\",\"documentation_url\":\"https://help.github.com/articles/about-protected-branches\"}"}},
 [
   {BorsNG.Worker.Batcher, :complete_batch, 3,
    [file: 'lib/worker/batcher.ex', line: 475]},
   {BorsNG.Worker.Batcher, :maybe_complete_batch, 1,
    [file: 'lib/worker/batcher.ex', line: 456]},
   {BorsNG.Worker.Batcher, :handle_cast, 2,
    [file: 'lib/worker/batcher.ex', line: 83]},
   {:gen_server, :try_dispatch, 4,
    [file: 'gen_server.erl', line: 637]},
   {:gen_server, :handle_msg, 6,
    [file: 'gen_server.erl', line: 711]},
   {:proc_lib, :init_p_do_apply, 3,
    [file: 'proc_lib.erl', line: 249]}
 ]}

bors r+

craig bot pushed a commit that referenced this pull request Feb 4, 2020
44608: kv: disallow parallel commit of batch with read-only requests r=nvanbenschoten a=nvanbenschoten

Fixes #44315.

This commit fixes an issue where KV batches with read-only requests were allowed
to be issued as parallel commits. This should not have been allowed because the
outcome of these read-only requests, notably Get and Scan requests, is not taken
into consideration by the status resolution process for STAGING transactions.

This would have caused serious atomicity issues if parallel commit batches
(non-1PC committing batches) were issued with read-only requests by users of the
KV API. Luckily, in practice that wouldn't actually happen as we never issue
batches like that so there's no concern about production clusters hitting this
bug. Still, this is a very good catch as this was a serious footgun and fixing
this hardens the KV API. We should also still backport this to v19.2 just to be
safe.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig
Copy link
Contributor

craig bot commented Feb 4, 2020

Build succeeded

@craig craig bot merged commit 56a9934 into cockroachdb:master Feb 4, 2020
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/parCommitReadOnly branch February 4, 2020 18:14
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Feb 4, 2020
This was missed in cockroachdb#44608. I noticed while backporting that change in cockroachdb#44703.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv: unexpected REASON_TXN_COMMITTED from CommitInBatch
4 participants