Changefeed in Avro format does not include mvcc_timestamp when option is specified #123078

Closed
jonstjohn opened this issue Apr 25, 2024 · 3 comments · Fixed by #129840
Labels
A-cdc Change Data Capture branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.1.8-rc branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-cdc

Comments

@jonstjohn
Collaborator

jonstjohn commented Apr 25, 2024

Describe the problem

When a changefeed is created using WITH mvcc_timestamp and Avro format, the mvcc_timestamp never gets emitted in the message. Using WITH updated works for emitting the updated timestamp, but mvcc_timestamp is required to accurately include the timestamp during an initial scan.

To Reproduce

Set up a Confluent Schema Registry.

Run cockroach demo.

Run the following sinkless changefeed:

CREATE CHANGEFEED FOR TABLE movr.users
WITH
    updated,
    full_table_name,
    mvcc_timestamp,
    format = avro,
    envelope = wrapped,
    confluent_schema_registry = 'http://127.0.0.1:8081/',
    schema_change_events = column_changes,
    schema_change_policy = nobackfill,
    kafka_sink_config = '{"RequiredAcks": "ALL", "Compression": "GZIP"}',
    initial_scan = 'no';

Insert a row into the movr.users table. Notice that the changefeed message does not have the mvcc_timestamp field.

Expected behavior
The mvcc_timestamp field is included, similar to how it is included when using json format.
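
For anyone checking this on the consumer side, the expectation can be sketched roughly as below. This is a minimal illustration, not CockroachDB code: the helper name `missing_timestamp_fields` and both sample payloads are hypothetical, and it assumes the message has already been decoded into a plain dictionary with `updated` and `mvcc_timestamp` as top-level envelope fields.

```python
import json


def missing_timestamp_fields(message, expected=("updated", "mvcc_timestamp")):
    """Return the expected top-level envelope fields absent from a decoded message."""
    return [name for name in expected if name not in message]


# Hypothetical payloads for illustration only; the values are made up.
# Assumed shape of a wrapped-envelope message that carries both timestamps.
json_style = json.loads(
    '{"after": {"id": "a1", "name": "Ada"},'
    ' "updated": "1714060800000000000.0000000000",'
    ' "mvcc_timestamp": "1714060800000000000.0000000000"}'
)

# Assumed shape of the decoded Avro message described in this issue:
# the "updated" field is present, but "mvcc_timestamp" is not.
avro_style = {
    "after": {"id": "a1", "name": "Ada"},
    "updated": "1714060800000000000.0000000000",
}

print(missing_timestamp_fields(json_style))
print(missing_timestamp_fields(avro_style))
```

With these sample payloads, the first call reports nothing missing and the second reports `mvcc_timestamp`, which is the symptom described above.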

Environment:

  • CockroachDB version 23.2.2
  • Server OS: Linux
  • Client app: cockroach sql

Additional context
The only other option is to use the updated field, which does not accurately reflect the mvcc_timestamp during initial scan or backfill.

Jira issue: CRDB-38190

Epic CRDB-41784

@jonstjohn jonstjohn added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Apr 25, 2024
@keljopap

keljopap commented Apr 25, 2024

Just noting that we are seeing this with the experimental changefeed on CockroachDB v23.1.17 as well.

@wenyihu6 wenyihu6 added the A-cdc Change Data Capture label May 22, 2024
@blathers-crl blathers-crl bot added the T-cdc label May 22, 2024

blathers-crl bot commented May 22, 2024

cc @cockroachdb/cdc

craig bot pushed a commit that referenced this issue Aug 31, 2024
129840: changefeedccl: emit mvcc_timestamp for avro format r=andyyang890 a=rharding6373

This PR adds support for the mvcc_timestamp option with the avro format. Before this change, changefeeds using avro would not fail if mvcc_timestamp was specified, but would ignore the option. Now avro supports the mvcc_timestamp by adding mvcc_timestamp to the schema and emitting the mvcc value with the row data.

Epic: none
Fixes: #123078

Release note (enterprise change): Adds changefeed support for the mvcc_timestamp option with the avro format. If both options are specified, the avro schema includes an mvcc_timestamp metadata field and emits the row's mvcc timestamp with the row data.

Co-authored-by: rharding6373 <[email protected]>
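
One way to confirm the change from outside the database is to inspect the value schema registered for the topic and see whether it declares the new field. The sketch below assumes the schema has already been fetched as a JSON string; the helper names and both sample schemas are hypothetical and heavily simplified (the real envelope schema has more fields and a full record type for `after`), so treat it as an illustration rather than the exact schema CockroachDB emits.

```python
import json


def envelope_field_names(schema_json):
    """List the top-level field names declared by an Avro record schema."""
    schema = json.loads(schema_json)
    return [field["name"] for field in schema.get("fields", [])]


def has_mvcc_timestamp(schema_json):
    """Report whether the record schema declares an mvcc_timestamp field."""
    return "mvcc_timestamp" in envelope_field_names(schema_json)


# Hypothetical, simplified schemas; "after" uses a placeholder string type.
schema_without_field = json.dumps({
    "type": "record",
    "name": "users_envelope",
    "fields": [
        {"name": "updated", "type": ["null", "string"], "default": None},
        {"name": "after", "type": ["null", "string"], "default": None},
    ],
})

schema_with_field = json.dumps({
    "type": "record",
    "name": "users_envelope",
    "fields": [
        {"name": "updated", "type": ["null", "string"], "default": None},
        {"name": "mvcc_timestamp", "type": ["null", "string"], "default": None},
        {"name": "after", "type": ["null", "string"], "default": None},
    ],
})

print(has_mvcc_timestamp(schema_without_field))
print(has_mvcc_timestamp(schema_with_field))
```

The check only looks at top-level record fields, which matches the release note's description of `mvcc_timestamp` as a metadata field on the schema rather than a column in the row data.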
@craig craig bot closed this as completed in 597bf39 Aug 31, 2024
@rharding6373 rharding6373 added branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 labels Nov 22, 2024
blathers-crl bot pushed a commit that referenced this issue Nov 22, 2024
blathers-crl bot pushed a commit that referenced this issue Nov 22, 2024
blathers-crl bot pushed a commit that referenced this issue Nov 22, 2024

blathers-crl bot commented Dec 2, 2024

Based on the specified backports for linked PR #129840, I applied the following new label(s) to this issue: branch-release-24.1.8-rc. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl bot pushed a commit that referenced this issue Dec 2, 2024
4 participants