v22.1.1: sql: alter table attempted to update job for mutation 2, but job already exists with mutation 1 #82921
Comments
This happened during an attempt to run an automated integration test. Running under Windows, from a PowerShell session:
The query being run was a large set of |
Smallest repro case I've got so far.
Then immediately run these statements as a Npgsql command:
That seems to trigger the fault reliably. If you run these statements interactively through the cockroach sql client, the fault is not triggered.
Running each of the statements as individual Npgsql commands does not trigger the issue. i.e.
then
then
with no delays between. This is a viable workaround for me right now.
Thanks for the repro! This definitely helps.
The root cause here is this code: cockroach/pkg/sql/conn_executor_prepare.go Lines 460 to 467 in 3f87f2f
The code is wrong in two ways:
The tl;dr is that that code block has fully rotted. It even knew it was a mistake when it was added.
92300: sql: fix bug with multi-statement implicit txn schema changes and Bind r=ajwerner a=ajwerner

For legacy reasons, we were resetting the descriptor collection state in Bind if we thought we were not in a transaction. Since #76792, we're always in a transaction. You might think that'd mean that the logic would not run. Sadly, for other still unclear reasons, when in an implicit transaction `(*connExecutor).getTransactionState()` returns `NoTxnStateStr`. The end result was that we'd erroneously reset our descriptor state in the middle of a multi-statement implicit transaction if bind was invoked.

Fixes #82921

Release note (bug fix): Fixed a bug which could lead to errors when running multiple schema change statements in a single command using a driver that uses the extended pgwire protocol internally (Npgsql in .NET as an example). These errors would have the form "attempted to update job for mutation 2, but job already exists with mutation 1".

Co-authored-by: Andrew Werner
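To make the failure mode in the commit message easier to picture, here is a toy model in Go. It is not CockroachDB code: `txnState`, `schemaChange`, and `bind` are hypothetical names, and the way the lost state turns into the job error is an assumed simplification. It only shows the general shape: per-transaction state is dropped by a bind step that wrongly believes no transaction is open, and the next schema change then collides with the job recorded earlier in the same transaction.

```go
package main

import "fmt"

// txnState is a toy stand-in for per-transaction bookkeeping.
// All fields and methods are hypothetical, for illustration only.
type txnState struct {
	cachedJob map[string]int // table -> mutation ID of the job this txn is tracking
	stored    map[string]int // table -> mutation ID of the job already written
	lastMut   map[string]int // table -> last mutation ID handed out
}

func newTxnState() *txnState {
	return &txnState{
		cachedJob: map[string]int{},
		stored:    map[string]int{},
		lastMut:   map[string]int{},
	}
}

// schemaChange queues one mutation on the table and attaches it to a job.
func (s *txnState) schemaChange(table string) error {
	s.lastMut[table]++
	mut := s.lastMut[table]
	if _, ok := s.cachedJob[table]; ok {
		// The txn still remembers its job, so the new mutation joins it.
		return nil
	}
	if prev, ok := s.stored[table]; ok {
		// The cache was lost, but a job was already written earlier.
		return fmt.Errorf(
			"attempted to update job for mutation %d, but job already exists with mutation %d",
			mut, prev)
	}
	s.cachedJob[table] = mut
	s.stored[table] = mut
	return nil
}

// bind models the faulty step: it drops the cached state whenever it
// believes no transaction is open.
func (s *txnState) bind(believesNoTxn bool) {
	if believesNoTxn {
		s.cachedJob = map[string]int{}
	}
}

func main() {
	s := newTxnState()
	fmt.Println(s.schemaChange("tbl")) // first statement in the implicit txn
	s.bind(true)                       // bind wrongly assumes no txn is open
	fmt.Println(s.schemaChange("tbl")) // second statement hits the conflict
}
```

In this model, calling `bind(false)` between the two statements leaves the cache intact and the second schema change succeeds, which corresponds to not resetting state inside the implicit transaction.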
This issue was autofiled by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.
Sentry link: https://sentry.io/organizations/cockroach-labs/issues/3350717600/?referrer=webhooks_plugin
Panic message:
Stacktrace (expand for inline code snippets):
cockroach/pkg/sql/table.go
Lines 201 to 203 in 242d13b
cockroach/pkg/sql/table.go
Lines 247 to 249 in 242d13b
cockroach/pkg/sql/drop_index.go
Lines 553 to 555 in 242d13b
cockroach/pkg/sql/drop_index.go
Lines 161 to 163 in 242d13b
cockroach/pkg/sql/plan.go
Lines 515 to 517 in 242d13b
cockroach/pkg/sql/walk.go
Lines 111 to 113 in 242d13b
cockroach/pkg/sql/walk.go
Lines 296 to 298 in 242d13b
cockroach/pkg/sql/walk.go
Lines 78 to 80 in 242d13b
cockroach/pkg/sql/walk.go
Lines 42 to 44 in 242d13b
cockroach/pkg/sql/plan.go
Lines 518 to 520 in 242d13b
cockroach/pkg/sql/plan_node_to_row_source.go
Lines 145 to 147 in 242d13b
cockroach/pkg/sql/colexec/columnarizer.go
Lines 157 to 159 in 242d13b
cockroach/pkg/sql/colflow/stats.go
Lines 89 to 91 in 242d13b
cockroach/pkg/sql/colflow/flow_coordinator.go
Lines 234 to 236 in 242d13b
cockroach/pkg/sql/colexecerror/error.go
Lines 90 to 92 in 242d13b
cockroach/pkg/sql/colflow/flow_coordinator.go
Lines 233 to 235 in 242d13b
cockroach/pkg/sql/colflow/flow_coordinator.go
Lines 278 to 280 in 242d13b
cockroach/pkg/sql/colflow/vectorized_flow.go
Lines 259 to 261 in 242d13b
cockroach/pkg/sql/distsql_running.go
Lines 596 to 598 in 242d13b
cockroach/pkg/sql/distsql_running.go
Lines 1444 to 1446 in 242d13b
cockroach/pkg/sql/conn_executor_exec.go
Lines 1466 to 1468 in 242d13b
cockroach/pkg/sql/conn_executor_exec.go
Lines 1142 to 1144 in 242d13b
cockroach/pkg/sql/conn_executor_exec.go
Lines 685 to 687 in 242d13b
cockroach/pkg/sql/conn_executor_exec.go
Lines 142 to 144 in 242d13b
cockroach/pkg/sql/conn_executor_exec.go
Lines 230 to 232 in 242d13b
cockroach/pkg/sql/conn_executor.go
Lines 1951 to 1953 in 242d13b
cockroach/pkg/sql/conn_executor.go
Lines 1953 to 1955 in 242d13b
cockroach/pkg/sql/conn_executor.go
Lines 1799 to 1801 in 242d13b
cockroach/pkg/sql/conn_executor.go
Lines 747 to 749 in 242d13b
cockroach/pkg/sql/pgwire/conn.go
Lines 723 to 725 in 242d13b
src/runtime/asm_amd64.s#L1580-L1582 in runtime.goexit
v22.1.1
Jira issue: CRDB-16731