Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdc: expressions with exist clauses cause a panic #90416

Closed
HonoreDB opened this issue Oct 21, 2022 · 2 comments · Fixed by #85177
Closed

cdc: expressions with exist clauses cause a panic #90416

HonoreDB opened this issue Oct 21, 2022 · 2 comments · Fixed by #85177
Assignees
Labels
A-cdc Change Data Capture A-cdc-expressions Features related to changefeed projections and filters branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-cdc

Comments

@HonoreDB
Copy link
Contributor

HonoreDB commented Oct 21, 2022

Describe the problem

CREATE CHANGEFEED WITH schema_change_policy='stop' AS SELECT 1 FROM foo WHERE EXISTS (SELECT true);

Kills the cockroach. Tested on release-22.2 branch.

Jira issue: CRDB-20757

Epic CRDB-17161

@HonoreDB HonoreDB added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-cdc Change Data Capture T-cdc GA-blocker A-cdc-expressions Features related to changefeed projections and filters labels Oct 21, 2022
@blathers-crl
Copy link

blathers-crl bot commented Oct 21, 2022

cc @cockroachdb/cdc

@blathers-crl
Copy link

blathers-crl bot commented Oct 21, 2022

Hi @HonoreDB, please add branch-* labels to identify which branch(es) this release-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@HonoreDB HonoreDB added branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-22.2.0 labels Oct 21, 2022
HonoreDB added a commit to HonoreDB/cockroach that referenced this issue Oct 25, 2022
SQL is complicated. If we hit a panic while parsing or evaluating SQL,
we need to fail the changefeed, not crash the node.

Informs cockroachdb#90416.

Release note (sql change): Fixed a bug that could cause crashes when parsing malformed changefeed expressions.
HonoreDB added a commit to HonoreDB/cockroach that referenced this issue Oct 26, 2022
SQL is complicated. If we hit a panic while parsing or evaluating SQL,
we need to fail the changefeed, not crash the node.

Informs cockroachdb#90416.

Release note (sql change): Fixed a bug that could cause crashes when parsing malformed changefeed expressions.
craig bot pushed a commit that referenced this issue Oct 28, 2022
90638: changefeedccl: add high-level recovers in cdceval r=[miretskiy] a=HonoreDB

SQL is complicated. If we hit a panic while parsing or evaluating SQL, we need to fail the changefeed, not crash the node.

Informs #90416.

Release note (sql change): Fixed a bug that could cause crashes when parsing malformed changefeed expressions.

Co-authored-by: Aaron Zinger <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Oct 28, 2022
SQL is complicated. If we hit a panic while parsing or evaluating SQL,
we need to fail the changefeed, not crash the node.

Informs #90416.

Release note (sql change): Fixed a bug that could cause crashes when parsing malformed changefeed expressions.
blathers-crl bot pushed a commit that referenced this issue Oct 28, 2022
SQL is complicated. If we hit a panic while parsing or evaluating SQL,
we need to fail the changefeed, not crash the node.

Informs #90416.

Release note (sql change): Fixed a bug that could cause crashes when parsing malformed changefeed expressions.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 16, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 16, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 17, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 29, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 29, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 30, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 30, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 1, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 1, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 2, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 2, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90455
Informs cockroachdb#90442
Informs CRDB-18978
Informs CRDB-17161

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
craig bot pushed a commit that referenced this issue Dec 2, 2022
85177: changefeedccl: Rely on optimizer and distSQL when evaluating CDC expressions. r=miretskiy a=miretskiy

Previous PRs (#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev.*) AS prevJson FROM tbl
```

Fixes #90416
Fixes #90714
Fixes #90455
Informs #90442
Informs CRDB-18978
Informs CRDB-17161

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.

92951: ui: show txn idle time in txn detail page r=matthewtodd a=matthewtodd

Part of #86667

|Before|After|
|--|--|
|<img width="1628" alt="Screen Shot 2022-12-02 at 3 16 08 PM" src="https://user-images.githubusercontent.com/5261/205379330-e1990261-9ca4-4d73-878f-a21ed0a90412.png">|<img width="1628" alt="Screen Shot 2022-12-02 at 3 16 15 PM" src="https://user-images.githubusercontent.com/5261/205379338-956a1801-07db-48e8-9a9c-bbe9e84fb5b6.png">|

Release note (ui change): The "Transaction resource usage" card on the transaction fingerprint page now includes an "Idle latency" row, representing the time spent by the application performing other work while holding this transaction open.

Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cdc Change Data Capture A-cdc-expressions Features related to changefeed projections and filters branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-cdc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants