Stateless pipeline resuming #2343
Closed
chubei
started this conversation in
Feature Requests
How should a connector behave when ingestion resumes from the middle of a source transaction? Currently, every connector needs its own logic that drops those events before sending them to the pipeline. It might be better for the pipeline to drop those messages instead of doing that in every connector.
Stateless pipeline resuming
For stateless pipelines, we store the pipeline state in the sink.
The pipeline state is a collection of connection states.
A connection state consists of two parts: a connection level state and a record level state. The record level state is an OpIdentifier, which is always two u64s.
Connection level state is stored in a separate table in the sink, and is written only once. Record level state is written as part of the record.
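The state layout described above could be modeled roughly as follows. This is only a sketch: the field names (txid, seq_in_tx), the opaque byte representation of the connection level state, and the lexicographic ordering of the two u64s are assumptions for illustration, not something this document specifies.

```rust
/// Record level state. The document only says it is "always two u64s";
/// the field names and their meaning are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct OpIdentifier {
    pub txid: u64,
    pub seq_in_tx: u64,
}

/// Connection level state. Its format is not described in the document,
/// so it is modeled here as opaque bytes (an assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConnectionLevelState(pub Vec<u8>);

/// State of one connection: written-once connection level state plus the
/// record level state recovered from the sink (None if no record has one).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConnectionState {
    pub connection_level: ConnectionLevelState,
    pub record_level: Option<OpIdentifier>,
}

/// The pipeline state is a collection of connection states.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PipelineState {
    pub connections: Vec<ConnectionState>,
}
```

Deriving Ord compares txid first and seq_in_tx second, and Option orders None below any Some, which makes "no record level state" the smallest possible value when taking a maximum.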
To be more specific, every record in the sink has a __dozer_record_state field which stores the OpIdentifier. We should be able to get the maximum OpIdentifier of a sink. (How a specific sink stores and aggregates the field is out of scope for this document.)
For inserts, the field is set to the record's OpIdentifier if it has one (during replication), or None if it doesn't (during snapshotting).
For deletes, the field is deleted with the record.
For updates, the field is updated to the new OpIdentifier of the update.
Upon restart, Dozer first aggregates the sink for the maximum OpIdentifier of the sink. If it's None, Dozer starts moving the whole source tables. Otherwise, Dozer passes the OpIdentifier and the connection level state to the connector to continue replication.
Why this is correct
If the last operation that was written to the sink is an insert or update, its OpIdentifier is the maximum in the sink, so the connector restarts from the correct position.
If the last one or more operations are deletes, the connector will send these deletes again, but the deletes will not modify the sink because these records are already deleted.
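The argument can be checked against a small in-memory model. The sketch below is not Dozer's sink API: Sink, SinkRecord, Op, Restart, and replay_after are hypothetical names, the record_state field stands in for __dozer_record_state, updates are modeled as upserts, and the connection level state is left out for brevity.

```rust
use std::collections::HashMap;

/// Record level state: two u64s (field names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct OpIdentifier {
    pub txid: u64,
    pub seq_in_tx: u64,
}

/// A record as stored in the sink; `record_state` stands in for
/// the __dozer_record_state field.
#[derive(Debug, Clone, PartialEq)]
pub struct SinkRecord {
    pub value: String,
    pub record_state: Option<OpIdentifier>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Insert { key: u64, value: String },
    Update { key: u64, value: String },
    Delete { key: u64 },
}

/// Hypothetical in-memory sink keyed by primary key.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Sink {
    records: HashMap<u64, SinkRecord>,
}

impl Sink {
    /// Applies one operation. `id` is None during snapshotting.
    pub fn apply(&mut self, op: &Op, id: Option<OpIdentifier>) {
        match op {
            // Inserts and updates store the operation's identifier with the record.
            Op::Insert { key, value } | Op::Update { key, value } => {
                let record = SinkRecord { value: value.clone(), record_state: id };
                self.records.insert(*key, record);
            }
            // Deletes drop the record and its state; deleting a missing key is a no-op.
            Op::Delete { key } => {
                self.records.remove(key);
            }
        }
    }

    /// Maximum OpIdentifier over all records, or None if no record has one.
    pub fn max_op_identifier(&self) -> Option<OpIdentifier> {
        self.records.values().filter_map(|r| r.record_state).max()
    }
}

#[derive(Debug, PartialEq)]
pub enum Restart {
    /// No record level state found: move the whole source tables again.
    Snapshot,
    /// Continue replication after this identifier.
    Resume(OpIdentifier),
}

pub fn restart_decision(sink: &Sink) -> Restart {
    match sink.max_op_identifier() {
        None => Restart::Snapshot,
        Some(id) => Restart::Resume(id),
    }
}

/// Replays every logged operation strictly after `from`, as a connector
/// would when asked to continue replication from that identifier.
pub fn replay_after(sink: &mut Sink, log: &[(OpIdentifier, Op)], from: OpIdentifier) {
    for (id, op) in log.iter().filter(|(id, _)| *id > from) {
        sink.apply(op, Some(*id));
    }
}
```

With a log that ends in a delete, max_op_identifier returns the identifier of the last insert or update, so replay_after sends the trailing delete again; since that key is already gone, the sink is unchanged.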