Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdc: cannot use UDT cast in changefeed expression #90714

Closed
jayshrivastava opened this issue Oct 26, 2022 · 1 comment · Fixed by #85177
Closed

cdc: cannot use UDT cast in changefeed expression #90714

jayshrivastava opened this issue Oct 26, 2022 · 1 comment · Fixed by #85177
Assignees
Labels
A-cdc Change Data Capture C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-cdc

Comments

@jayshrivastava
Copy link
Contributor

jayshrivastava commented Oct 26, 2022

When you create a changefeed expression which references UDTs, some weird behavior occurs:

CREATE TYPE status AS ENUM ('inactive', 'active');

CREATE TYPE priority AS ENUM ('high', 'low');

CREATE TABLE foo (
    a INT PRIMARY KEY,
    b status,
    c priority,
    FAMILY only_b (b),
    FAMILY only_c (c)
);      


root@localhost:26257/defaultdb> CREATE CHANGEFEED WITH split_column_families AS SELECT 'inactive'::status, c FROM foo;
ERROR: undefined object with OID 100113
SQLSTATE: 42704

Also this:

root@localhost:26257/defaultdb> CREATE CHANGEFEED WITH split_column_families, schema_change_policy='stop'  AS SELECT 'inactive'::status from foo;
ERROR: expressions can't reference columns from more than one column family
SQLSTATE: 22023

Neither of these errors should happen since the changefeed expressions are valid.

Jira issue: CRDB-20905

Epic CRDB-17161

@jayshrivastava jayshrivastava added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-cdc Change Data Capture labels Oct 26, 2022
@blathers-crl blathers-crl bot added the T-cdc label Oct 26, 2022
@blathers-crl
Copy link

blathers-crl bot commented Oct 26, 2022

cc @cockroachdb/cdc

@miretskiy miretskiy self-assigned this Nov 11, 2022
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 16, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 16, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 17, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 29, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 29, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 30, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 30, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 1, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 1, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 2, 2022
Previous PRs (cockroachdb#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev) AS prevJson FROM tbl
```

Fixes cockroachdb#90416
Fixes cockroachdb#90714
Fixes cockroachdb#90421
Fixes cockroachdb#90455
Informs cockroachdb#90442

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward-incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.
craig bot pushed a commit that referenced this issue Dec 2, 2022
85177: changefeedccl: Rely on optimizer and distSQL when evaluating CDC expressions. r=miretskiy a=miretskiy

Previous PRs (#82562) introduced CDC expressions.

This PR builds on that and replaces majority of hand written
evaluation logic in favor of tighter integration with
optimizer and dist SQL processors.

CDC expression, which is really a simplified `SELECT` statement,
is now  planned by the optimizer `sql.PlanCDCExpression`.
The resulting plan is then fed to the distSQL, to produce
a specialized CDC execution plan (`sql.RunCDCEvaluation`).

The execution plan is special in that it is guaranteed to be
a local execution plan, and changefeed is expected to "feed"
the data (encoded row) directly into the execution pipeline,
with change aggregators consuming resulting projection.

The benefit of this approach is that expression optimization,
and evaluation is now handled by optimizer and distSQL.
The responsibility of CDC evaluation package is to make sure
that CDC expression is "sane" and to setup CDC specific functions.

Since the whole expression is not yet fully planned
by distSQL (i.e. we don't have changefeed operator implemented yet),
the integration between CDC expressions and optimizer/execInfra
is not yet complete.  In particular, this
PR does not replace current distSQL execution for CDC --
namely, we still keep the same execution model using hand planned
`ChangeFrontierProcessor` and `ChangeAggretagorProcessor`.
It augments existing model, while tightening the integration.

Still, this is an improvement over previous state.  The follow on work
will complete integration.

Some changes enabled by this implementation include the replacement
of `cdc_prev()` function which returned JSONb representation of the
previous row, with a `cdc_prev` tuple.  This makes changefeed
expressions more natural since tuples are strongly typed:

```
SELECT * FROM tbl WHERE col != cdc_prev.col`
```

In addition, by using tuple to represent the state of the previous row,
we can now leverage existing SQL functions.  For example, to emit
previus row as JSONb we can do:

```
SELECT *, row_to_json(cdc_prev.*) AS prevJson FROM tbl
```

Fixes #90416
Fixes #90714
Fixes #90455
Informs #90442
Informs CRDB-18978
Informs CRDB-17161

Release note (enterprise change): CDC expressions are now planned and
evaluated using SQL optimizer and distSQL execution. The state
of the previous row is now exposed as `cdc_prev` tuple.

Release note (backward incompatible change): The replacement of
cdc_prev() function in favor a cdc_prev tuple is an incompatible
change that may break changefeeds that use old cdc_prev() function.

92951: ui: show txn idle time in txn detail page r=matthewtodd a=matthewtodd

Part of #86667

|Before|After|
|--|--|
|<img width="1628" alt="Screen Shot 2022-12-02 at 3 16 08 PM" src="https://user-images.githubusercontent.com/5261/205379330-e1990261-9ca4-4d73-878f-a21ed0a90412.png">|<img width="1628" alt="Screen Shot 2022-12-02 at 3 16 15 PM" src="https://user-images.githubusercontent.com/5261/205379338-956a1801-07db-48e8-9a9c-bbe9e84fb5b6.png">|

Release note (ui change): The "Transaction resource usage" card on the transaction fingerprint page now includes an "Idle latency" row, representing the time spent by the application performing other work while holding this transaction open.

Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
@craig craig bot closed this as completed in b5cb990 Dec 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cdc Change Data Capture C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-cdc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants