feature request: support for collection types (LIST, SET, MAP) and UDT #9
Comments
I’d be interested in assisting with this if no one else is.
The main difficulty in supporting collection types is supporting non-frozen types. In Scylla there are two kinds of collections/UDTs: frozen and non-frozen. When you update a frozen collection, its entire contents after the update are stored in the CDC log. Non-frozen collections, on the other hand, can be partially updated (for example, by appending items to a list), and in that case only the added/removed elements are saved in the CDC log. We (cc: @haaawk) have decided not to overcomplicate the generated Kafka message to accommodate the different operations on non-frozen collections (appending, removing, overwriting), especially since this is not what the Debezium model expects and most Sink Connectors would not support it. However, if we implemented support for postimages (#8, which we plan to do), the state of a non-frozen collection/UDT after an update would be known (with the additional requirement that you enable postimages on your CDC table), which would make it possible to support non-frozen collection types. (See https://docs.scylladb.com/using-scylla/cdc/cdc-advanced-types/ for more info.) In the meantime, I have pushed a (very early) implementation of support for frozen collections: #12. To support post-images, we plan to implement a higher-level abstraction in the scylla-cdc-java repo that combines pre-image, delta and post-image rows and parses the delta information of non-frozen collection updates.
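To make the frozen vs. non-frozen distinction above concrete, here is a minimal sketch of how a pre-image and a delta for a non-frozen set could be combined into a post-update state. The `SetDelta` class and its field names (`added`, `removed`, `overwritten`) are hypothetical and only loosely modelled on the description above; they are not the actual CDC log columns or the scylla-cdc-java API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SetDelta:
    """Simplified, hypothetical view of one CDC delta for a non-frozen set column."""
    added: frozenset = frozenset()     # elements added by the write
    removed: frozenset = frozenset()   # elements removed by the write
    overwritten: bool = False          # True if the write replaced the whole set


def apply_set_delta(preimage: Optional[frozenset], delta: SetDelta) -> frozenset:
    """Reconstruct the post-update state of a set from its pre-image and a delta."""
    # An overwrite (or a missing pre-image) means we start from an empty set.
    base = frozenset() if delta.overwritten or preimage is None else preimage
    # Drop the removed elements, then add the new ones.
    return (base - delta.removed) | delta.added


before = frozenset({"a", "b"})

# Partial updates: only the changed elements are present in the delta.
after_append = apply_set_delta(before, SetDelta(added=frozenset({"c"})))
after_remove = apply_set_delta(after_append, SetDelta(removed=frozenset({"a"})))

# Overwrite: behaves like a frozen collection, the delta carries the full value.
after_replace = apply_set_delta(
    after_remove, SetDelta(added=frozenset({"x"}), overwritten=True)
)
```

A consumer that only sees the delta of a partial update cannot know the full set without the pre-image, which is why post-image support would sidestep this reconstruction on the consumer side.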
(apologies for issue title rename, wrong browser tab -> please ignore)
Hi @avelanarius, is there an ETA for post-image support?
@avelanarius is this already in the making? Are you also looking for contributors?
@avelanarius @hartmut-co-uk
I made more code changes on my fork last week to support using UDTs with Avro, but haven't had time to test them yet.
@avelanarius and @Lorak-mmk are working on support for frozen and non-frozen collections.
hi @hartmut-co-uk @avelanarius
track
Hi, is there a plan to merge #21? We could really use this feature.
+1. This is an important feature.
+1
As a consumer of my CDC event stream (Kafka topic), with CDC enabled on the table and collection types (LIST, SET, MAP) and UDTs in use, I'd like to receive change data for all columns of the *_cdc_log record, including collection-type and UDT fields. This would allow me to use the change event for stream processing, as no data is omitted.
Example use cases: