pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test: TestTenantLogic_partial_index failed [TransactionRetryWithProtoRefreshError on index creation] #126763

Closed
cockroach-teamcity opened this issue Jul 5, 2024 · 10 comments · Fixed by #133400
Labels
branch-master Failures and bugs on the master branch. branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. P-2 Issues/test failures with a fix SLA of 3 months T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Jul 5, 2024

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_partial_index failed with artifacts on master @ 295d09a88895a69e5cc9149fb8165acf78e39e61:

=== RUN   TestTenantLogic_partial_index
    test_log_scope.go:170: test logs captured to: /artifacts/tmp/_tmp/f82bd4d116da0800a539c75f49940d6f/logTestTenantLogic_partial_index3175470877
    test_log_scope.go:81: use -show-logs to present logs inline
[05:36:23] setting distsql_workmem='17395B';
[05:36:23] setting distsql_workmem='17395B';
[05:36:25] --- progress: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4074/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index: 82 statements
[05:36:31] --- done: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4074/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index with config 3node-tenant: 99 tests, 0 failures
    logic.go:4135: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4074/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: error while processing
    logic.go:4135: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4074/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_ABORT_SPAN): "sql txn" meta={id=a8969e98 key=/Tenant/10/Table/7/1/0/0 iso=Serializable pri=0.06129593 epo=0 ts=1720157788.872089549,2 min=1720157786.829626719,0 seq=28} lock=true stat=ABORTED rts=1720157786.829626719,0 wto=false gul=1720157787.329626719,0
    panic.go:626: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/f82bd4d116da0800a539c75f49940d6f/logTestTenantLogic_partial_index3175470877
--- FAIL: TestTenantLogic_partial_index (11.19s)
Help

See also: How To Investigate a Go Test Failure (internal)

This test on roachdash | Improve this report!

Jira issue: CRDB-40112

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-sql-queries SQL Queries Team labels Jul 5, 2024
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Jul 5, 2024
@cockroach-teamcity
Member Author

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_partial_index failed with artifacts on master @ 485975b3a824c68c07340e6a336c7864c00d3c6d:

=== RUN   TestTenantLogic_partial_index
    test_log_scope.go:170: test logs captured to: /artifacts/tmp/_tmp/f82bd4d116da0800a539c75f49940d6f/logTestTenantLogic_partial_index943042731
    test_log_scope.go:81: use -show-logs to present logs inline
[05:37:33] --- progress: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4076/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index: 58 statements
[05:37:39] --- done: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4076/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index with config 3node-tenant: 99 tests, 0 failures
    logic.go:4135: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4076/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: error while processing
    logic.go:4135: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4076/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_ABORT_SPAN): "sql txn" meta={id=bca53adc key=/Tenant/10/Table/7/1/0/0 iso=Serializable pri=0.02161860 epo=0 ts=1720330654.386492632,0 min=1720330654.386492632,0 seq=36} lock=true stat=ABORTED rts=1720330654.386492632,0 wto=false gul=1720330654.886492632,0
    panic.go:626: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/f82bd4d116da0800a539c75f49940d6f/logTestTenantLogic_partial_index943042731
--- FAIL: TestTenantLogic_partial_index (11.55s)

@cockroach-teamcity
Member Author

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_partial_index failed with artifacts on master @ 485975b3a824c68c07340e6a336c7864c00d3c6d:

=== RUN   TestTenantLogic_partial_index
    test_log_scope.go:170: test logs captured to: /artifacts/tmp/_tmp/f82bd4d116da0800a539c75f49940d6f/logTestTenantLogic_partial_index2024852354
    test_log_scope.go:81: use -show-logs to present logs inline
[05:35:34] setting distsql_workmem='55512B';
[05:35:34] setting distsql_workmem='55512B';
[05:35:36] --- progress: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4069/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index: 78 statements
[05:35:43] --- done: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4069/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index with config 3node-tenant: 99 tests, 0 failures
    logic.go:4135: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4069/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: error while processing
    logic.go:4135: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4069/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_CLIENT_REJECT): "sql txn" meta={id=161c3cea key=/Tenant/10/Table/7/1/0/0 iso=Serializable pri=0.00467417 epo=0 ts=1720416937.910588096,0 min=1720416937.910588096,0 seq=33} lock=true stat=PENDING rts=1720416937.910588096,0 wto=false gul=1720416938.410588096,0
    panic.go:626: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/f82bd4d116da0800a539c75f49940d6f/logTestTenantLogic_partial_index2024852354
--- FAIL: TestTenantLogic_partial_index (11.18s)

@yuzefovich
Member

I think the first time we saw this was in #125354 about a month ago, and it appears that we hit a txn retry when committing an explicit txn within which we create a secondary index. Re-assigning it to Foundations for further triage.

@yuzefovich yuzefovich added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-queries SQL Queries Team labels Jul 10, 2024
@yuzefovich yuzefovich removed this from SQL Queries Jul 10, 2024
@yuzefovich yuzefovich changed the title pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test: TestTenantLogic_partial_index failed pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test: TestTenantLogic_partial_index failed [TransactionRetryWithProtoRefreshError on index creation] Jul 10, 2024
@rafiss rafiss removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Jul 10, 2024
@exalate-issue-sync exalate-issue-sync bot added the P-2 Issues/test failures with a fix SLA of 3 months label Jul 18, 2024
@rimadeodhar
Collaborator

rimadeodhar commented Jul 23, 2024

The statements that fail within the partial_index logic test (partial_index:581 in the logs above):

BEGIN;
CREATE TABLE i (a INT, b enum);
INSERT INTO i VALUES (1, 'foo'), (2, 'bar');
CREATE INDEX a_b_foo_idx ON i (a) WHERE b = 'foo';
COMMIT;

I'm struggling to reproduce this failure even though the test fails routinely on CI. The same statements run fine in the SQL shell, and running the test under --stress did not reproduce it either. Any other ideas for a good way to reproduce this?

cc @fqazi, @rafiss

@rafiss
Collaborator

rafiss commented Aug 12, 2024

Let's try breaking up the statement into multiple transactions. Using a larger transaction with multiple schema changes makes it more likely for txn retry errors to occur.
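One way to picture this suggestion is the following minimal Go sketch, which issues each statement on its own instead of wrapping them all in one explicit BEGIN/COMMIT. It is an illustration under assumptions, not the change that was made to the test: `execer`, `runSeparately`, and `recorder` are hypothetical names, and `recorder` only stands in for a real connection so the example runs without a database. A `*sql.DB` from the standard `database/sql` package satisfies `execer`, and statements sent through it outside a `sql.Tx` each run as their own implicit transaction.

```go
package main

import (
	"database/sql"
	"fmt"
)

// execer is the subset of *sql.DB used here (hypothetical helper interface).
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// runSeparately sends each statement on its own, with no surrounding
// BEGIN/COMMIT, so every statement is its own implicit transaction.
func runSeparately(db execer, stmts []string) error {
	for _, stmt := range stmts {
		if _, err := db.Exec(stmt); err != nil {
			return fmt.Errorf("executing %q: %w", stmt, err)
		}
	}
	return nil
}

// recorder is a stand-in for a database connection: it only records the
// statements it is given, in order.
type recorder struct{ seen []string }

func (r *recorder) Exec(query string, _ ...any) (sql.Result, error) {
	r.seen = append(r.seen, query)
	return nil, nil
}

func main() {
	// The statements from the failing test, minus BEGIN and COMMIT.
	stmts := []string{
		"CREATE TABLE i (a INT, b enum)",
		"INSERT INTO i VALUES (1, 'foo'), (2, 'bar')",
		"CREATE INDEX a_b_foo_idx ON i (a) WHERE b = 'foo'",
	}
	rec := &recorder{}
	if err := runSeparately(rec, stmts); err != nil {
		panic(err)
	}
	for _, s := range rec.seen {
		fmt.Println(s)
	}
}
```

With a real connection, a retry error would then affect a single small statement rather than the whole multi-statement transaction.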


github-actions bot commented Oct 7, 2024

We have marked this test failure issue as stale because it has been
inactive for 1 month. If this failure is still relevant, removing the
stale label or adding a comment will keep it active. Otherwise,
we'll close it in 5 days to keep the test failure queue tidy.

@michae2
Collaborator

michae2 commented Oct 18, 2024

This is still happening: #132866

@cockroach-teamcity
Member Author

pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test.TestTenantLogic_partial_index failed with artifacts on master @ b13f76063a555af796f5df7243ecdd894ff0ff51:

=== RUN   TestTenantLogic_partial_index
    test_log_scope.go:165: test logs captured to: /artifacts/tmp/_tmp/6b89db816bf074ebea6fff85a8db7616/logTestTenantLogic_partial_index1883979
    test_log_scope.go:76: use -show-logs to present logs inline
[20:44:27] setting distsql_workmem='79082B';
[20:44:27] setting distsql_workmem='79082B';
[20:44:30] --- progress: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/16193/execroot/com_github_cockroachdb_cockroach/bazel-out/aarch64-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index: 83 statements
[20:44:36] --- done: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/16193/execroot/com_github_cockroachdb_cockroach/bazel-out/aarch64-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index with config 3node-tenant: 99 tests, 0 failures
    logic.go:4229: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/16193/execroot/com_github_cockroachdb_cockroach/bazel-out/aarch64-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: error while processing
    logic.go:4229: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/16193/execroot/com_github_cockroachdb_cockroach/bazel-out/aarch64-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/partial_index:581: 
        expected success, but found
        (40001) restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_CLIENT_REJECT): "sql txn" meta={id=97f5d411 key=/Tenant/10/Table/7/1/0/0 iso=Serializable pri=0.01370334 epo=0 ts=1729629870.928312511,1 min=1729629870.853385993,0 seq=27} lock=true stat=PENDING rts=1729629870.853385993,0 wto=false gul=1729629871.353385993,0
    panic.go:626: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/6b89db816bf074ebea6fff85a8db7616/logTestTenantLogic_partial_index1883979
--- FAIL: TestTenantLogic_partial_index (10.39s)

@rafiss
Collaborator

rafiss commented Oct 24, 2024

key=/Tenant/10/Table/7/1/0/0 looks like it corresponds to the sequence used to generate descriptor IDs:

DescIDSequenceID = 7

There is probably increased contention on this key, since logic tests use a transactional descriptor ID generator:

UseTransactionalDescIDGenerator: true,

Perhaps using a high priority txn here could help avoid the transaction being aborted.
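To make the reading of the key concrete, here is a small Go sketch that splits the pretty-printed key from the error message into its leading components and compares the table ID with the constant quoted above. `parseTenantTableKey` and `keyParts` are hypothetical helpers written for this illustration, not CockroachDB APIs, and treating the component after the table ID as an index ID is an assumption about the /Tenant/<tenant>/Table/<table>/<index>/... layout.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// descIDSequenceID mirrors the constant quoted above (DescIDSequenceID = 7).
const descIDSequenceID = 7

// keyParts holds the leading components of a pretty-printed key.
type keyParts struct {
	TenantID int
	TableID  int
	IndexID  int // assumed meaning of the component after the table ID
}

// parseTenantTableKey parses keys printed as
// /Tenant/<tenant>/Table/<table>/<index>/... and ignores the remainder.
func parseTenantTableKey(key string) (keyParts, error) {
	fields := strings.Split(strings.TrimPrefix(key, "/"), "/")
	if len(fields) < 5 || fields[0] != "Tenant" || fields[2] != "Table" {
		return keyParts{}, fmt.Errorf("unexpected key format: %q", key)
	}
	ids := make([]int, 3)
	for i, f := range []string{fields[1], fields[3], fields[4]} {
		n, err := strconv.Atoi(f)
		if err != nil {
			return keyParts{}, fmt.Errorf("component %q is not numeric: %w", f, err)
		}
		ids[i] = n
	}
	return keyParts{TenantID: ids[0], TableID: ids[1], IndexID: ids[2]}, nil
}

func main() {
	p, err := parseTenantTableKey("/Tenant/10/Table/7/1/0/0")
	if err != nil {
		panic(err)
	}
	fmt.Printf("tenant=%d table=%d index=%d isDescIDSequence=%t\n",
		p.TenantID, p.TableID, p.IndexID, p.TableID == descIDSequenceID)
}
```

Run against the key from the failures above, this prints `tenant=10 table=7 index=1 isDescIDSequence=true`, which matches the observation that every aborted transaction in this thread was anchored on the same key in table 7 of tenant 10.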


blathers-crl bot commented Oct 24, 2024

Based on the specified backports for linked PR #133400, I applied the following new label(s) to this issue: branch-release-24.1, branch-release-24.2, branch-release-24.3. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 labels Oct 24, 2024