Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: batch double committing #46374

Closed
danhhz opened this issue Mar 20, 2020 · 5 comments
Closed

kv: batch double committing #46374

danhhz opened this issue Mar 20, 2020 · 5 comments
Assignees
Labels
A-kv-transactions Relating to MVCC and the transactional model.

Comments

@danhhz
Copy link
Contributor

danhhz commented Mar 20, 2020

With the following diff on top of 8eff499cc20cd4b8e6e35b985405d16d22f7cbdb (HEAD is also fine assuming the diff applies cleanly.) We trigger the invariant violation: value v-<foo> was written by two operations assertion in kvnemesis. kvnemesis only writes each value 1 time, but this assertion shows that it came out of rangefeed twice.

The following logs happen to show it interleaving with another operation that wrote to the same key, but I've seen cases where this didn't happen (so the value and the prev value of the rangefeed log line are the same). It only seems to happen when a kv.Batch is committed with kv.DB.Run and only when there are both Puts and Gets present in the batch.

I200320 18:07:36.531269 1571 kv/kvnemesis/watcher.go:162  rangefeed Put /Table/50/"17d0b1ef" 1584727656.527476000,0 -> /BYTES/v-20 (prev /BYTES/v-19)
...
I200320 18:07:36.531326 1571 kv/kvnemesis/watcher.go:162  rangefeed Put /Table/50/"17d0b1ef" 1584727656.527570000,0 -> /BYTES/v-21 (prev /BYTES/v-20)
...
I200320 18:07:36.531539 1456 kv/kvnemesis/kvnemesis.go:63
  0 OP  {
  0 OP    b := &Batch{}
  0 OP    b.Put(ctx, /Table/50/"17d0b1ef", v-20) // nil
  0 OP    b.Get(ctx, /Table/50/"007dc6b7") // (nil, nil)
  0 OP    b.Get(ctx, /Table/50/"c97aa241") // (nil, nil)
  0 OP    db0.Run(ctx, b) // nil
  0 OP  }
I200320 18:07:36.531580 1571 kv/kvnemesis/watcher.go:162  rangefeed Put /Table/50/"17d0b1ef" 1584727656.528733000,0 -> /BYTES/v-20 (prev /BYTES/v-21)
...
--- FAIL: TestKVNemesisSingleNode (0.41s)
panic: invariant violation: value v-20 was written by two operations /Table/50/"17d0b1ef"/1584727656.528733000,0 and /Table/50/"17d0b1ef"/1584727656.527476000,0 [recovered]

This repros very quickly (seconds) with make stress PKG=./pkg/kv/kvnemesis TESTS=TestKVNemesisSingleNode.

diff --git a/pkg/kv/kvnemesis/generator.go b/pkg/kv/kvnemesis/generator.go
index e9692ead69..9e594a6b09 100644
--- a/pkg/kv/kvnemesis/generator.go
+++ b/pkg/kv/kvnemesis/generator.go
@@ -182,7 +182,7 @@ func NewDefaultConfig() GeneratorConfig {
 	//
 	// TODO(dan): Make this `config.Ops.Batch.Ops.PutExisting = 0` once #46081 is
 	// fixed.
-	config.Ops.Batch = BatchOperationConfig{}
+	// config.Ops.Batch = BatchOperationConfig{}
 	return config
 }
 
diff --git a/pkg/kv/kvnemesis/kvnemesis_test.go b/pkg/kv/kvnemesis/kvnemesis_test.go
index f0c7839666..dbc892c5d1 100644
--- a/pkg/kv/kvnemesis/kvnemesis_test.go
+++ b/pkg/kv/kvnemesis/kvnemesis_test.go
@@ -46,9 +46,10 @@ func TestKVNemesisSingleNode(t *testing.T) {
 	failures, err := RunNemesis(ctx, rng, ct, config, db)
 	require.NoError(t, err, `%+v`, err)
 
-	for _, failure := range failures {
-		t.Errorf("failure:\n%+v", failure)
-	}
+	_ = failures
+	// for _, failure := range failures {
+	// 	t.Errorf("failure:\n%+v", failure)
+	// }
 }
 
 func TestKVNemesisMultiNode(t *testing.T) {

Found by kvnemesis.

@nvanbenschoten
Copy link
Member

For continuity, this was discussed and diagnosed in this thread: #46017 (comment).

The problem is that a batch with reads and writes is split at the DistSender level (BatchRequest.Split) and issued as completely separate batches. We should probably be returning an OpRequiresTxnError error in this case.

@danhhz
Copy link
Contributor Author

danhhz commented Mar 23, 2020

Naive question: Are these the same issue? That link is to two parts of a split batch committing at different timestamps. This is one key (so one part) committing twice at two different timestamps.

@nvanbenschoten
Copy link
Member

Oh, you're right, I misunderstood the issue and thought it was the same as #46081. At least on the surface, this does seem to be a separate issue. It doesn't seem to be explained by the problem discussed in that link, but interestingly it would be addressed by the solution from there if this truly only occurs with Puts and Gets present in the same batch.

This would be an interesting thread for @OwenQian to pull on once we get more clarity on #45586.

@lunevalex lunevalex added the A-kv-transactions Relating to MVCC and the transactional model. label Jul 29, 2020
@ajwerner
Copy link
Contributor

@nvanbenschoten is this still a thing?

@nvanbenschoten
Copy link
Member

I have not seen this since. #46081 is still a thing, but I'll close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-transactions Relating to MVCC and the transactional model.
Projects
None yet
Development

No branches or pull requests

5 participants