Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv/kvclient/kvcoord: TestTransactionUnexpectedlyCommitted failed #110187

Closed
cockroach-teamcity opened this issue Sep 7, 2023 · 5 comments · Fixed by #111090
Closed

kv/kvclient/kvcoord: TestTransactionUnexpectedlyCommitted failed #110187

cockroach-teamcity opened this issue Sep 7, 2023 · 5 comments · Fixed by #111090
Assignees
Labels
A-testing Testing tools and infrastructure branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. skipped-test T-kv KV Team
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Sep 7, 2023

kv/kvclient/kvcoord.TestTransactionUnexpectedlyCommitted failed with artifacts on master @ cd2e06079050eb1d9df004ea2e2d22ce03dd3b3a:

    defer srv.Stopper().Stop(...)
    ts := srv.ApplicationLayer()
    dist_sender_ambiguous_test.go:421: first range: r65:/Table/100/"{a"-b"} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    dist_sender_ambiguous_test.go:422: second range: r66:/{Table/100/"b"-Max} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
=== CONT  TestTransactionUnexpectedlyCommitted
    dist_sender_ambiguous_test.go:308: [op 1] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a'"], [txn: 8935bddf]}, meta={id=8935bddf key=/Table/100/"a'" iso=Serializable pri=0.03919778 epo=0 ts=1694103748.489040650,0 min=1694103748.489040650,0 seq=2}
=== CONT  TestTransactionUnexpectedlyCommitted
    dist_sender_ambiguous_test.go:1570: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/b55e16144cf4d6269c46ab547b210b8a/logTestTransactionUnexpectedlyCommitted229717589
--- FAIL: TestTransactionUnexpectedlyCommitted (7.46s)
=== RUN   TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
    dist_sender_ambiguous_test.go:308: [op 1] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:395: valid lease info for r65: repl=(n1,s1):1 seq=1 start=0,0 exp=1694103754.951317626,0 pro=1694103748.951317626,0
    dist_sender_ambiguous_test.go:395: valid lease info for r66: repl=(n2,s2):3 seq=12 start=1694103749.951152063,0 exp=1694103755.951066568,0 pro=1694103749.951066568,0
    dist_sender_ambiguous_test.go:308: [op 2] (txn1) n1->n2:r66/3 batchReq={Get [/Table/100/"b",/Min), [txn: 907867c0]}, meta={id=907867c0 key=/Table/100/"a" iso=Serializable pri=0.00026319 epo=0 ts=1694103749.952588735,0 min=1694103749.952588735,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 3] (txn1) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 907867c0]}, meta={id=907867c0 key=/Table/100/"a" iso=Serializable pri=0.00026319 epo=0 ts=1694103749.952588735,0 min=1694103749.952588735,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 4] (txn1) n1->n2:r66/3 batchReq={Put [/Table/100/"b",/Min), [txn: 907867c0]}, meta={id=907867c0 key=/Table/100/"a" iso=Serializable pri=0.00026319 epo=0 ts=1694103749.952588735,0 min=1694103749.952588735,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 5] (txn1) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 907867c0]}, meta={id=907867c0 key=/Table/100/"a" iso=Serializable pri=0.00026319 epo=0 ts=1694103749.952588735,0 min=1694103749.952588735,0 seq=0}
    dist_sender_ambiguous_test.go:70: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:511: Transferring r66 lease to n1
    dist_sender_ambiguous_test.go:72: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:70: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:308: [op 6] (txn2) n1->n1:r66/1 batchReq={Get [/Table/100/"b",/Min), [txn: 04af2d39]}, meta={id=04af2d39 key=/Table/100/"a" iso=Serializable pri=0.00261301 epo=0 ts=1694103749.955502739,0 min=1694103749.955502739,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 7] (txn2) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 04af2d39]}, meta={id=04af2d39 key=/Table/100/"a" iso=Serializable pri=0.00261301 epo=0 ts=1694103749.955502739,0 min=1694103749.955502739,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 8] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,04af2d39->907867c0) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 9] (_) n1->n1:r65/1 batchReq={RecoverTxn(907867c0, commit) [/Table/100/"a"]}, meta={<nil>}
    dist_sender_ambiguous_test.go:70: [op 9] (_) n1->n1:r65/1 ‹1 RecoverTxn› paused after send
    dist_sender_ambiguous_test.go:308: [op 10] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,04af2d39->907867c0) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 11] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:72: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:1062: [op 4]  - injected RPC error
    dist_sender_ambiguous_test.go:308: [op 12] (txn1) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 907867c0], [protect-ambiguous-replay]}, meta={id=907867c0 key=/Table/100/"a" iso=Serializable pri=0.00026319 epo=0 ts=1694103749.952588735,0 min=1694103749.952588735,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 13] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,907867c0->04af2d39) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:72: [op 9] (_) n1->n1:r65/1 ‹1 RecoverTxn› unpaused
    dist_sender_ambiguous_test.go:1091: txn1 completed with err: result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))
    dist_sender_ambiguous_test.go:1081: [op 9]  - complete, resp={header:<> recovered_txn:<meta:<id:907867c0-4675-4567-8cd5-b3036039795d key:"\354\022a\000\001" write_timestamp:<wall_time:1694103749952588735 > min_timestamp:<wall_time:1694103749952588735 > priority:5652 sequence:3 coordinator_node_id:1 > status:COMMITTED last_heartbeat:<wall_time:1694103749952588735 > read_timestamp:<> global_uncertainty_limit:<> lock_spans:<key:"\354\022a\000\001" > lock_spans:<key:"\354\022b\000\001" > > }
    dist_sender_ambiguous_test.go:308: [op 14] (_) n1->n1:r65/1 batchReq={ResolveIntent [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 15] (txn2) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 04af2d39]}, meta={id=04af2d39 key=/Table/100/"a" iso=Serializable pri=0.00261301 epo=0 ts=1694103749.955502739,0 min=1694103749.955502739,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 16] (txn2) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 04af2d39]}, meta={id=04af2d39 key=/Table/100/"a" iso=Serializable pri=0.00261301 epo=0 ts=1694103749.955502739,0 min=1694103749.955502739,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 17] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a"], [txn: 04af2d39]}, meta={id=04af2d39 key=/Table/100/"a" iso=Serializable pri=0.00261301 epo=0 ts=1694103749.955502739,0 min=1694103749.955502739,0 seq=3}
    dist_sender_ambiguous_test.go:256: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:256
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:1097
        	Error:      	Error "result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))" does not contain "replay protection"
        	Test:       	TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
        	Messages:   	expected AmbiguousResultError to include message "replay protection"
    dist_sender_ambiguous_test.go:308: [op 18] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:449: key: /Table/100/"a", value: /INT/30
    dist_sender_ambiguous_test.go:449: key: /Table/100/"b", value: /INT/70
    --- FAIL: TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease (0.55s)
Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/kv

This test on roachdash | Improve this report!

Jira issue: CRDB-31311

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team labels Sep 7, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.2 milestone Sep 7, 2023
@cockroach-teamcity
Copy link
Member Author

kv/kvclient/kvcoord.TestTransactionUnexpectedlyCommitted failed with artifacts on master @ 9d3115d1030ee7526818768913285dd0426dd9b5:

    defer srv.Stopper().Stop(...)
    ts := srv.ApplicationLayer()
    dist_sender_ambiguous_test.go:421: first range: r65:/Table/100/"{a"-b"} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    dist_sender_ambiguous_test.go:422: second range: r66:/{Table/100/"b"-Max} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
=== CONT  TestTransactionUnexpectedlyCommitted
    dist_sender_ambiguous_test.go:308: [op 1] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a'"], [txn: e77e076f]}, meta={id=e77e076f key=/Table/100/"a'" iso=Serializable pri=0.01484393 epo=0 ts=1694170299.317812471,0 min=1694170299.317812471,0 seq=2}
=== CONT  TestTransactionUnexpectedlyCommitted
    dist_sender_ambiguous_test.go:1570: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/40ec041d13327ceafaefe721fe51f834/logTestTransactionUnexpectedlyCommitted2182142855
--- FAIL: TestTransactionUnexpectedlyCommitted (22.80s)
=== RUN   TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
    dist_sender_ambiguous_test.go:308: [op 1] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:395: valid lease info for r65: repl=(n1,s1):1 seq=1 start=0,0 epo=1 pro=1694170288.535973576,0
    dist_sender_ambiguous_test.go:395: valid lease info for r66: repl=(n2,s2):3 seq=23 start=1694170299.828702195,0 epo=1 pro=1694170299.837638697,0
    dist_sender_ambiguous_test.go:308: [op 2] (txn1) n1->n2:r66/3 batchReq={Get [/Table/100/"b",/Min), [txn: 1ad65eb0]}, meta={id=1ad65eb0 key=/Table/100/"a" iso=Serializable pri=0.00300221 epo=0 ts=1694170299.891221207,0 min=1694170299.891221207,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 3] (txn1) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 1ad65eb0]}, meta={id=1ad65eb0 key=/Table/100/"a" iso=Serializable pri=0.00300221 epo=0 ts=1694170299.891221207,0 min=1694170299.891221207,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 4] (txn1) n1->n2:r66/3 batchReq={Put [/Table/100/"b",/Min), [txn: 1ad65eb0]}, meta={id=1ad65eb0 key=/Table/100/"a" iso=Serializable pri=0.00300221 epo=0 ts=1694170299.891221207,0 min=1694170299.891221207,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 5] (txn1) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 1ad65eb0]}, meta={id=1ad65eb0 key=/Table/100/"a" iso=Serializable pri=0.00300221 epo=0 ts=1694170299.891221207,0 min=1694170299.891221207,0 seq=0}
    dist_sender_ambiguous_test.go:70: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:511: Transferring r66 lease to n1
    dist_sender_ambiguous_test.go:72: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:70: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:308: [op 6] (txn2) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 067730f7]}, meta={id=067730f7 key=/Table/100/"a" iso=Serializable pri=0.01239958 epo=0 ts=1694170299.910746295,0 min=1694170299.910746295,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 7] (txn2) n1->n1:r66/1 batchReq={Get [/Table/100/"b",/Min), [txn: 067730f7]}, meta={id=067730f7 key=/Table/100/"a" iso=Serializable pri=0.01239958 epo=0 ts=1694170299.910746295,0 min=1694170299.910746295,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 8] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,067730f7->1ad65eb0) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 9] (_) n1->n1:r65/1 batchReq={RecoverTxn(1ad65eb0, commit) [/Table/100/"a"]}, meta={<nil>}
    dist_sender_ambiguous_test.go:70: [op 9] (_) n1->n1:r65/1 ‹1 RecoverTxn› paused after send
    dist_sender_ambiguous_test.go:72: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:1062: [op 4]  - injected RPC error
    dist_sender_ambiguous_test.go:308: [op 10] (txn1) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 1ad65eb0], [protect-ambiguous-replay]}, meta={id=1ad65eb0 key=/Table/100/"a" iso=Serializable pri=0.00300221 epo=0 ts=1694170299.891221207,0 min=1694170299.891221207,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 11] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,067730f7->1ad65eb0) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 12] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 13] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,1ad65eb0->067730f7) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:72: [op 9] (_) n1->n1:r65/1 ‹1 RecoverTxn› unpaused
    dist_sender_ambiguous_test.go:1091: txn1 completed with err: result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))
    dist_sender_ambiguous_test.go:1081: [op 9]  - complete, resp={header:<> recovered_txn:<meta:<id:1ad65eb0-94ef-4e4f-845d-1d333ca0088d key:"\354\022a\000\001" write_timestamp:<wall_time:1694170299891221207 > min_timestamp:<wall_time:1694170299891221207 > priority:64472 sequence:3 coordinator_node_id:1 > status:COMMITTED last_heartbeat:<wall_time:1694170299891221207 > read_timestamp:<> global_uncertainty_limit:<> lock_spans:<key:"\354\022a\000\001" > lock_spans:<key:"\354\022b\000\001" > > }
    dist_sender_ambiguous_test.go:308: [op 14] (_) n1->n1:r65/1 batchReq={ResolveIntent [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:308: [op 15] (txn2) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 067730f7]}, meta={id=067730f7 key=/Table/100/"a" iso=Serializable pri=0.01239958 epo=0 ts=1694170299.910746295,0 min=1694170299.910746295,0 seq=0}
    dist_sender_ambiguous_test.go:308: [op 16] (txn2) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 067730f7]}, meta={id=067730f7 key=/Table/100/"a" iso=Serializable pri=0.01239958 epo=0 ts=1694170299.910746295,0 min=1694170299.910746295,0 seq=0}
    dist_sender_ambiguous_test.go:256: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:256
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:1097
        	Error:      	Error "result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))" does not contain "replay protection"
        	Test:       	TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
        	Messages:   	expected AmbiguousResultError to include message "replay protection"
    dist_sender_ambiguous_test.go:308: [op 17] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a"], [txn: 067730f7]}, meta={id=067730f7 key=/Table/100/"a" iso=Serializable pri=0.01239958 epo=0 ts=1694170299.910746295,0 min=1694170299.910746295,0 seq=3}
    dist_sender_ambiguous_test.go:308: [op 18] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:449: key: /Table/100/"a", value: /INT/30
    dist_sender_ambiguous_test.go:449: key: /Table/100/"b", value: /INT/70
    --- FAIL: TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease (0.34s)

Parameters: TAGS=bazel,gss,deadlock , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

@rickystewart
Copy link
Collaborator

@cockroach-teamcity
Copy link
Member Author

kv/kvclient/kvcoord.TestTransactionUnexpectedlyCommitted failed with artifacts on master @ 28552c8876e8d6764c2102effbb8d9842185123d:

    defer ts.Stopper().Stop(...)
to:
    srv, ... := serverutils.StartServer(t, ...)
    defer srv.Stopper().Stop(...)
    ts := srv.ApplicationLayer()

See also: https://go.crdb.dev/p/testserver-and-cluster-virtualization
    dist_sender_ambiguous_test.go:422: first range: r65:/Table/100/"{a"-b"} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    dist_sender_ambiguous_test.go:423: second range: r66:/{Table/100/"b"-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
--- FAIL: TestTransactionUnexpectedlyCommitted (22.24s)
=== RUN   TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
    dist_sender_ambiguous_test.go:309: [op 1] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:396: valid lease info for r65: repl=(n1,s1):1 seq=1 start=0,0 epo=1 pro=1694770909.485538583,0
    dist_sender_ambiguous_test.go:396: valid lease info for r66: repl=(n2,s2):2 seq=23 start=1694770929.574685684,0 epo=1 pro=1694770929.579057564,0
    dist_sender_ambiguous_test.go:309: [op 2] (txn1) n1->n2:r66/2 batchReq={Get [/Table/100/"b",/Min), [txn: 0c9120d8]}, meta={id=0c9120d8 key=/Table/100/"a" iso=Serializable pri=0.02445360 epo=0 ts=1694770929.597431152,0 min=1694770929.597431152,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 3] (txn1) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 0c9120d8]}, meta={id=0c9120d8 key=/Table/100/"a" iso=Serializable pri=0.02445360 epo=0 ts=1694770929.597431152,0 min=1694770929.597431152,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 4] (txn1) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 0c9120d8]}, meta={id=0c9120d8 key=/Table/100/"a" iso=Serializable pri=0.02445360 epo=0 ts=1694770929.597431152,0 min=1694770929.597431152,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 5] (txn1) n1->n2:r66/2 batchReq={Put [/Table/100/"b",/Min), [txn: 0c9120d8]}, meta={id=0c9120d8 key=/Table/100/"a" iso=Serializable pri=0.02445360 epo=0 ts=1694770929.597431152,0 min=1694770929.597431152,0 seq=0}
    dist_sender_ambiguous_test.go:71: [op 5] (txn1) n1->n2:r66/2 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:512: Transferring r66 lease to n1
    dist_sender_ambiguous_test.go:73: [op 5] (txn1) n1->n2:r66/2 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:71: [op 5] (txn1) n1->n2:r66/2 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:309: [op 6] (txn2) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: dcf6ea76]}, meta={id=dcf6ea76 key=/Table/100/"a" iso=Serializable pri=0.00814204 epo=0 ts=1694770929.612200970,0 min=1694770929.612200970,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 7] (txn2) n1->n1:r66/1 batchReq={Get [/Table/100/"b",/Min), [txn: dcf6ea76]}, meta={id=dcf6ea76 key=/Table/100/"a" iso=Serializable pri=0.00814204 epo=0 ts=1694770929.612200970,0 min=1694770929.612200970,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 8] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,dcf6ea76->0c9120d8) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 9] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,dcf6ea76->0c9120d8) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 10] (_) n1->n1:r65/1 batchReq={RecoverTxn(0c9120d8, commit) [/Table/100/"a"]}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 11] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:71: [op 10] (_) n1->n1:r65/1 ‹1 RecoverTxn› paused after send
    dist_sender_ambiguous_test.go:73: [op 5] (txn1) n1->n2:r66/2 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:1063: [op 5]  - injected RPC error
    dist_sender_ambiguous_test.go:309: [op 12] (txn1) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 0c9120d8], [protect-ambiguous-replay]}, meta={id=0c9120d8 key=/Table/100/"a" iso=Serializable pri=0.02445360 epo=0 ts=1694770929.597431152,0 min=1694770929.597431152,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 13] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,0c9120d8->dcf6ea76) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:73: [op 10] (_) n1->n1:r65/1 ‹1 RecoverTxn› unpaused
    dist_sender_ambiguous_test.go:1082: [op 10]  - complete, resp={header:<> recovered_txn:<meta:<id:0c9120d8-5187-4be8-ba18-91f2c7d54cd7 key:"\354\022a\000\001" write_timestamp:<wall_time:1694770929597431152 > min_timestamp:<wall_time:1694770929597431152 > priority:525137 sequence:3 coordinator_node_id:1 > status:COMMITTED last_heartbeat:<wall_time:1694770929597431152 > read_timestamp:<> global_uncertainty_limit:<> lock_spans:<key:"\354\022a\000\001" > lock_spans:<key:"\354\022b\000\001" > > }
    dist_sender_ambiguous_test.go:1092: txn1 completed with err: result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))
    dist_sender_ambiguous_test.go:309: [op 14] (_) n1->n1:r65/1 batchReq={ResolveIntent [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 15] (txn2) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: dcf6ea76]}, meta={id=dcf6ea76 key=/Table/100/"a" iso=Serializable pri=0.00814204 epo=0 ts=1694770929.612200970,0 min=1694770929.612200970,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 16] (txn2) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: dcf6ea76]}, meta={id=dcf6ea76 key=/Table/100/"a" iso=Serializable pri=0.00814204 epo=0 ts=1694770929.612200970,0 min=1694770929.612200970,0 seq=0}
    dist_sender_ambiguous_test.go:257: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:257
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:1098
        	Error:      	Error "result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))" does not contain "replay protection"
        	Test:       	TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
        	Messages:   	expected AmbiguousResultError to include message "replay protection"
    dist_sender_ambiguous_test.go:309: [op 17] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a"], [txn: dcf6ea76]}, meta={id=dcf6ea76 key=/Table/100/"a" iso=Serializable pri=0.00814204 epo=0 ts=1694770929.612200970,0 min=1694770929.612200970,0 seq=3}
    dist_sender_ambiguous_test.go:309: [op 18] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:450: key: /Table/100/"a", value: /INT/30
    dist_sender_ambiguous_test.go:450: key: /Table/100/"b", value: /INT/70
    --- FAIL: TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease (0.29s)

Parameters: TAGS=bazel,gss,deadlock , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

@cockroach-teamcity
Copy link
Member Author

kv/kvclient/kvcoord.TestTransactionUnexpectedlyCommitted failed with artifacts on master @ 49eef749235f15d57153cb00f48dfd3b27be6c54:

    ts, ... := serverutils.StartServer(t, ...)
    defer ts.Stopper().Stop(...)
to:
    srv, ... := serverutils.StartServer(t, ...)
    defer srv.Stopper().Stop(...)
    ts := srv.ApplicationLayer()

See also: https://go.crdb.dev/p/testserver-and-cluster-virtualization
    dist_sender_ambiguous_test.go:422: first range: r65:/Table/100/"{a"-b"} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    dist_sender_ambiguous_test.go:423: second range: r66:/{Table/100/"b"-Max} [(n1,s1):1, (n3,s3):2, (n2,s2):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
--- FAIL: TestTransactionUnexpectedlyCommitted (21.52s)
=== RUN   TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
    dist_sender_ambiguous_test.go:396: valid lease info for r65: repl=(n1,s1):1 seq=1 start=0,0 epo=1 pro=1695292181.573924282,0
    dist_sender_ambiguous_test.go:396: valid lease info for r66: repl=(n2,s2):3 seq=23 start=1695292199.862951883,0 epo=1 pro=1695292199.866499382,0
    dist_sender_ambiguous_test.go:309: [op 1] (txn1) n1->n2:r66/3 batchReq={Get [/Table/100/"b",/Min), [txn: cbdd9f46]}, meta={id=cbdd9f46 key=/Table/100/"a" iso=Serializable pri=0.00099172 epo=0 ts=1695292199.874581382,0 min=1695292199.874581382,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 2] (txn1) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: cbdd9f46]}, meta={id=cbdd9f46 key=/Table/100/"a" iso=Serializable pri=0.00099172 epo=0 ts=1695292199.874581382,0 min=1695292199.874581382,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 3] (txn1) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: cbdd9f46]}, meta={id=cbdd9f46 key=/Table/100/"a" iso=Serializable pri=0.00099172 epo=0 ts=1695292199.874581382,0 min=1695292199.874581382,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 4] (txn1) n1->n2:r66/3 batchReq={Put [/Table/100/"b",/Min), [txn: cbdd9f46]}, meta={id=cbdd9f46 key=/Table/100/"a" iso=Serializable pri=0.00099172 epo=0 ts=1695292199.874581382,0 min=1695292199.874581382,0 seq=0}
    dist_sender_ambiguous_test.go:71: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:512: Transferring r66 lease to n1
    dist_sender_ambiguous_test.go:73: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:71: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:309: [op 5] (txn2) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 2cfb00a9]}, meta={id=2cfb00a9 key=/Table/100/"a" iso=Serializable pri=0.01168265 epo=0 ts=1695292199.894786830,0 min=1695292199.894786830,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 6] (txn2) n1->n1:r66/1 batchReq={Get [/Table/100/"b",/Min), [txn: 2cfb00a9]}, meta={id=2cfb00a9 key=/Table/100/"a" iso=Serializable pri=0.01168265 epo=0 ts=1695292199.894786830,0 min=1695292199.894786830,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 7] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,2cfb00a9->cbdd9f46) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 8] (_) n1->n1:r65/1 batchReq={RecoverTxn(cbdd9f46, commit) [/Table/100/"a"]}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 9] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,2cfb00a9->cbdd9f46) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:71: [op 8] (_) n1->n1:r65/1 ‹1 RecoverTxn› paused after send
    dist_sender_ambiguous_test.go:73: [op 4] (txn1) n1->n2:r66/3 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:1063: [op 4]  - injected RPC error
    dist_sender_ambiguous_test.go:309: [op 10] (txn1) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: cbdd9f46], [protect-ambiguous-replay]}, meta={id=cbdd9f46 key=/Table/100/"a" iso=Serializable pri=0.00099172 epo=0 ts=1695292199.874581382,0 min=1695292199.874581382,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 11] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 12] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,cbdd9f46->2cfb00a9) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:73: [op 8] (_) n1->n1:r65/1 ‹1 RecoverTxn› unpaused
    dist_sender_ambiguous_test.go:1092: txn1 completed with err: result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))
    dist_sender_ambiguous_test.go:1082: [op 8]  - complete, resp={header:<> recovered_txn:<meta:<id:cbdd9f46-3dbd-43c6-a91e-0f8b1ec99522 key:"\354\022a\000\001" write_timestamp:<wall_time:1695292199874581382 > min_timestamp:<wall_time:1695292199874581382 > priority:21297 sequence:3 coordinator_node_id:1 > status:COMMITTED last_heartbeat:<wall_time:1695292199874581382 > read_timestamp:<> global_uncertainty_limit:<> lock_spans:<key:"\354\022a\000\001" > lock_spans:<key:"\354\022b\000\001" > > }
    dist_sender_ambiguous_test.go:309: [op 13] (_) n1->n1:r65/1 batchReq={ResolveIntent [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 14] (txn2) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 2cfb00a9]}, meta={id=2cfb00a9 key=/Table/100/"a" iso=Serializable pri=0.01168265 epo=0 ts=1695292199.894786830,0 min=1695292199.894786830,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 15] (txn2) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 2cfb00a9]}, meta={id=2cfb00a9 key=/Table/100/"a" iso=Serializable pri=0.01168265 epo=0 ts=1695292199.894786830,0 min=1695292199.894786830,0 seq=0}
    dist_sender_ambiguous_test.go:257: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:257
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:1098
        	Error:      	Error "result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))" does not contain "replay protection"
        	Test:       	TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
        	Messages:   	expected AmbiguousResultError to include message "replay protection"
    dist_sender_ambiguous_test.go:309: [op 16] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a"], [txn: 2cfb00a9]}, meta={id=2cfb00a9 key=/Table/100/"a" iso=Serializable pri=0.01168265 epo=0 ts=1695292199.894786830,0 min=1695292199.894786830,0 seq=3}
    dist_sender_ambiguous_test.go:309: [op 17] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:450: key: /Table/100/"a", value: /INT/30
    dist_sender_ambiguous_test.go:450: key: /Table/100/"b", value: /INT/70
    --- FAIL: TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease (0.25s)

Parameters: TAGS=bazel,gss,deadlock , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

@nvanbenschoten nvanbenschoten added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure labels Sep 21, 2023
@cockroach-teamcity
Copy link
Member Author

kv/kvclient/kvcoord.TestTransactionUnexpectedlyCommitted failed with artifacts on master @ b1a641f1d27653a4e93154926be57d0cae864bd6:

    defer ts.Stopper().Stop(...)
to:
    srv, ... := serverutils.StartServer(t, ...)
    defer srv.Stopper().Stop(...)
    ts := srv.ApplicationLayer()

See also: https://go.crdb.dev/p/testserver-and-cluster-virtualization
    dist_sender_ambiguous_test.go:422: first range: r65:/Table/100/"{a"-b"} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    dist_sender_ambiguous_test.go:423: second range: r66:/{Table/100/"b"-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
--- FAIL: TestTransactionUnexpectedlyCommitted (8.29s)
=== RUN   TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
    dist_sender_ambiguous_test.go:309: [op 1] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:396: valid lease info for r65: repl=(n1,s1):1 seq=1 start=0,0 epo=1 pro=1695463971.009874805,0
    dist_sender_ambiguous_test.go:396: valid lease info for r66: repl=(n2,s2):2 seq=23 start=1695463975.951690161,0 epo=1 pro=1695463975.953238591,0
    dist_sender_ambiguous_test.go:309: [op 2] (txn1) n1->n2:r66/2 batchReq={Get [/Table/100/"b",/Min), [txn: 41b1943f]}, meta={id=41b1943f key=/Table/100/"a" iso=Serializable pri=0.00915090 epo=0 ts=1695463975.955155801,0 min=1695463975.955155801,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 3] (txn1) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 41b1943f]}, meta={id=41b1943f key=/Table/100/"a" iso=Serializable pri=0.00915090 epo=0 ts=1695463975.955155801,0 min=1695463975.955155801,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 4] (txn1) n1->n2:r66/2 batchReq={Put [/Table/100/"b",/Min), [txn: 41b1943f]}, meta={id=41b1943f key=/Table/100/"a" iso=Serializable pri=0.00915090 epo=0 ts=1695463975.955155801,0 min=1695463975.955155801,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 5] (txn1) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 41b1943f]}, meta={id=41b1943f key=/Table/100/"a" iso=Serializable pri=0.00915090 epo=0 ts=1695463975.955155801,0 min=1695463975.955155801,0 seq=0}
    dist_sender_ambiguous_test.go:71: [op 4] (txn1) n1->n2:r66/2 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:512: Transferring r66 lease to n1
    dist_sender_ambiguous_test.go:73: [op 4] (txn1) n1->n2:r66/2 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:71: [op 4] (txn1) n1->n2:r66/2 ‹1 Put› paused after send
    dist_sender_ambiguous_test.go:309: [op 6] (txn2) n1->n1:r66/1 batchReq={Get [/Table/100/"b",/Min), [txn: 58e79336]}, meta={id=58e79336 key=/Table/100/"a" iso=Serializable pri=0.00793156 epo=0 ts=1695463975.960713900,0 min=1695463975.960713900,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 7] (txn2) n1->n1:r65/1 batchReq={Get [/Table/100/"a",/Min), [txn: 58e79336]}, meta={id=58e79336 key=/Table/100/"a" iso=Serializable pri=0.00793156 epo=0 ts=1695463975.960713900,0 min=1695463975.960713900,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 8] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,58e79336->41b1943f) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 10] (_) n1->n1:r65/1 batchReq={RecoverTxn(41b1943f, commit) [/Table/100/"a"]}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 9] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,58e79336->41b1943f) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:71: [op 10] (_) n1->n1:r65/1 ‹1 RecoverTxn› paused after send
    dist_sender_ambiguous_test.go:73: [op 4] (txn1) n1->n2:r66/2 ‹1 Put› unpaused
    dist_sender_ambiguous_test.go:1063: [op 4]  - injected RPC error
    dist_sender_ambiguous_test.go:309: [op 11] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 12] (txn1) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 41b1943f], [protect-ambiguous-replay]}, meta={id=41b1943f key=/Table/100/"a" iso=Serializable pri=0.00915090 epo=0 ts=1695463975.955155801,0 min=1695463975.955155801,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 13] (_) n1->n1:r65/1 batchReq={PushTxn(PUSH_ABORT,41b1943f->58e79336) [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:73: [op 10] (_) n1->n1:r65/1 ‹1 RecoverTxn› unpaused
    dist_sender_ambiguous_test.go:1082: [op 10]  - complete, resp={header:<> recovered_txn:<meta:<id:41b1943f-0eba-4263-9c39-1f5d14e01f30 key:"\354\022a\000\001" write_timestamp:<wall_time:1695463975955155801 > min_timestamp:<wall_time:1695463975955155801 > priority:196514 sequence:3 coordinator_node_id:1 > status:COMMITTED last_heartbeat:<wall_time:1695463975955155801 > read_timestamp:<> global_uncertainty_limit:<> lock_spans:<key:"\354\022a\000\001" > lock_spans:<key:"\354\022b\000\001" > > }
    dist_sender_ambiguous_test.go:1092: txn1 completed with err: result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))
    dist_sender_ambiguous_test.go:309: [op 14] (_) n1->n1:r65/1 batchReq={ResolveIntent [/Table/100/"a",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:309: [op 15] (txn2) n1->n1:r66/1 batchReq={Put [/Table/100/"b",/Min), [txn: 58e79336]}, meta={id=58e79336 key=/Table/100/"a" iso=Serializable pri=0.00793156 epo=0 ts=1695463975.960713900,0 min=1695463975.960713900,0 seq=0}
    dist_sender_ambiguous_test.go:309: [op 16] (txn2) n1->n1:r65/1 batchReq={Put [/Table/100/"a",/Min), EndTxn(parallel commit) [/Table/100/"a"], [txn: 58e79336]}, meta={id=58e79336 key=/Table/100/"a" iso=Serializable pri=0.00793156 epo=0 ts=1695463975.960713900,0 min=1695463975.960713900,0 seq=0}
    dist_sender_ambiguous_test.go:257: 
        	Error Trace:	github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:257
        	            				github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord_test/pkg/kv/kvclient/kvcoord/dist_sender_ambiguous_test.go:1098
        	Error:      	Error "result is ambiguous: error=grpc: response jammed on n1<-n2 [code 14/Unavailable] [propagate] (last error: TransactionStatusError: already committed (REASON_TXN_COMMITTED))" does not contain "replay protection"
        	Test:       	TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
        	Messages:   	expected AmbiguousResultError to include message "replay protection"
    dist_sender_ambiguous_test.go:309: [op 17] (txn2) n1->n1:r65/1 batchReq={EndTxn(commit) [/Table/100/"a"], [txn: 58e79336]}, meta={id=58e79336 key=/Table/100/"a" iso=Serializable pri=0.00793156 epo=0 ts=1695463975.960713900,0 min=1695463975.960713900,0 seq=3}
    dist_sender_ambiguous_test.go:309: [op 18] (_) n1->n1:r66/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:450: key: /Table/100/"a", value: /INT/30
    dist_sender_ambiguous_test.go:450: key: /Table/100/"b", value: /INT/70
    --- FAIL: TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease (0.19s)

Parameters: TAGS=bazel,gss , stress=true

Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

arulajmani added a commit to arulajmani/cockroach that referenced this issue Sep 25, 2023
craig bot pushed a commit that referenced this issue Sep 25, 2023
111205: kvcoord: skip TestTransactionUnexpectedlyCommitted r=nvanbenschoten a=arulajmani

Flaky test.

Informs #110187

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Sep 27, 2023
The `TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease`
test, introduced to test cockroachdb#107658, has been flaky (particularly under
deadlock builds) due to a race condition between a retry of a write and
intent resolution. While both orderings in this test result in a correct
`AmbiguousResultError` for the client, when intent resolution wins the
race, the retried write will attempt to push away the current
lockholder; since it is illegal for a committed transaction to perform a
push, this results in a different secondary error attached to the
`AmbiguousResultError`. This change ensures a predefined ordering of
these operations so that the secondary error is consistent across runs
of the test.

Fixes: cockroachdb#110187

Release note: None
craig bot pushed a commit that referenced this issue Sep 27, 2023
109781: spanconfig: pass SpanConfig by pointer r=arulajmani a=andrewbaptist

Note: Only the last 2 commits are only in this PR.

SpanConfig can be a large object and it is more efficient to pass by
pointer now that it goes through multiple layers of code.

Epic: none

Release note: None


111090: kvcoord: fix flake in `TestTransactionUnexpectedlyCommitted` r=AlexTalks a=AlexTalks

The `TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease` test, introduced to test #107658, has been flaky (particularly under deadlock builds) due to a race condition between a retry of a write and intent resolution. While both orderings in this test result in a correct `AmbiguousResultError` for the client, when intent resolution wins the race, the retried write will attempt to push away the current lockholder; since it is illegal for a committed transaction to perform a
push, this results in a different secondary error attached to the `AmbiguousResultError`. This change ensures a predefined ordering of these operations so that the secondary error is consistent across runs of the test.

Fixes: #110187

Release note: None

111369: clusterversion: fix formatting of versions comment r=RaduBerinde a=RaduBerinde

Some of the formatting in this big comment is getting screwed up by
gofmt. This change fixes it - the trick was to make sure the "sub
lists" are all indented by a tab so it's all treated like a big code
block.

Epic: none
Release note: None

111372: roachtest: fix leaked goroutines in c2c roachtests r=lidorcarmel a=lidorcarmel

Without this PR, c2c roachtests have almost 30 messages like these at the end:
```
18:04:19 leaktest.go:161: Leaked goroutine: goroutine 1879 [select, 2 minutes]:
database/sql.(*DB).connectionOpener(0xc003f2bc70, {0x10812e000, 0xc00059d220})
	GOROOT/src/database/sql/sql.go:1218 +0x8d
created by database/sql.OpenDB
	GOROOT/src/database/sql/sql.go:791 +0x18d
```

This PR cleans that up.

Epic: none

Release note: None

Co-authored-by: Andrew Baptist <[email protected]>
Co-authored-by: Alex Sarkesian <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
@craig craig bot closed this as completed in fdb6e86 Sep 27, 2023
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Oct 5, 2023
The `TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease`
test, introduced to test cockroachdb#107658, has been flaky (particularly under
deadlock builds) due to a race condition between a retry of a write and
intent resolution. While both orderings in this test result in a correct
`AmbiguousResultError` for the client, when intent resolution wins the
race, the retried write will attempt to push away the current
lockholder; since it is illegal for a committed transaction to perform a
push, this results in a different secondary error attached to the
`AmbiguousResultError`. This change ensures a predefined ordering of
these operations so that the secondary error is consistent across runs
of the test.

Fixes: cockroachdb#110187

Release note: None
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Oct 6, 2023
The `TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease`
test, introduced to test cockroachdb#107658, has been flaky (particularly under
deadlock builds) due to a race condition between a retry of a write and
intent resolution. While both orderings in this test result in a correct
`AmbiguousResultError` for the client, when intent resolution wins the
race, the retried write will attempt to push away the current
lockholder; since it is illegal for a committed transaction to perform a
push, this results in a different secondary error attached to the
`AmbiguousResultError`. This change ensures a predefined ordering of
these operations so that the secondary error is consistent across runs
of the test.

Fixes: cockroachdb#110187

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this issue Oct 6, 2023
The `TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease`
test, introduced to test cockroachdb#107658, has been flaky (particularly under
deadlock builds) due to a race condition between a retry of a write and
intent resolution. While both orderings in this test result in a correct
`AmbiguousResultError` for the client, when intent resolution wins the
race, the retried write will attempt to push away the current
lockholder; since it is illegal for a committed transaction to perform a
push, this results in a different secondary error attached to the
`AmbiguousResultError`. This change ensures a predefined ordering of
these operations so that the secondary error is consistent across runs
of the test.

Fixes: cockroachdb#110187

Release note: None
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testing Testing tools and infrastructure branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. skipped-test T-kv KV Team
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants