This repository has been archived by the owner on Sep 23, 2024. It is now read-only.

Primary key updates handled incorrectly in LOG_BASED replication #92

Open
iljau opened this issue May 13, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@iljau

iljau commented May 13, 2021

Describe the bug
Primary key updates are handled incorrectly in LOG_BASED replication, so stale rows are kept in target tables.

To Reproduce
Steps to reproduce the behavior:

-- table definition
create table table_name
(
	a_id integer not null,
	b_id integer not null,
	c integer not null,
	constraint pkey
		primary key (a_id, b_id, c)
);

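-- update the primary key of one row in the table, for example:
update table_name set b_id = 10 where a_id = 2 and b_id = 5 and c = 1;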

In consume_message(streams, state, msg, time_extracted, conn_info), the following wal2json payload is read:

# wal2json payload
PAYLOAD = {
    'kind': 'update',
    'schema': 'public',
    'table': 'table_name',
    'columnnames': ['a_id', 'b_id', 'c'],
    'columntypes': ['integer', 'integer', 'integer'],
    # new primary key
    'columnvalues': [2, 10, 1],
    'oldkeys': {
         'keynames': ['a_id', 'b_id', 'c'],
         'keytypes': ['integer', 'integer', 'integer'],
         # old primary key
         'keyvalues': [2, 5, 1]
    }
}

# record emitted by the tap
RECORD_MESSAGE = {
    'type': 'RECORD', 'stream': 'public-table_name',
    'record':
        {'a_id': 2, 'b_id': 10, 'c': 1, '_sdc_deleted_at': None},
    'version': 1,
    'time_extracted': '2021-05-13T09:20:31.892225Z'
}

Expected behavior
In the target table, the row with PK [2, 5, 1] is updated to [2, 10, 1].

Actual result
In the target table, the row with PK [2, 5, 1] is kept and a row with PK [2, 10, 1] is added.
The target table now contains row [2, 5, 1], which no longer exists in the source.
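
One possible way to handle this, as a minimal sketch and not the tap's actual code: compare oldkeys with the new column values and, when the key changed, emit a delete-style record for the old key before the record for the new row. The helper name records_for_update is hypothetical. The sketch assumes the target treats a non-null _sdc_deleted_at as a delete marker and accepts a record that carries only the key columns.

```python
# wal2json payload from the report above
PAYLOAD = {
    'kind': 'update',
    'schema': 'public',
    'table': 'table_name',
    'columnnames': ['a_id', 'b_id', 'c'],
    'columntypes': ['integer', 'integer', 'integer'],
    'columnvalues': [2, 10, 1],
    'oldkeys': {
        'keynames': ['a_id', 'b_id', 'c'],
        'keytypes': ['integer', 'integer', 'integer'],
        'keyvalues': [2, 5, 1],
    },
}


def records_for_update(payload, time_extracted):
    """Hypothetical helper: build the record bodies for one 'update' payload.

    Returns one record when the primary key is unchanged, and two records
    (delete marker for the old key, then the new row) when it changed.
    """
    new_row = dict(zip(payload['columnnames'], payload['columnvalues']))

    old = payload.get('oldkeys', {})
    old_key = dict(zip(old.get('keynames', []), old.get('keyvalues', [])))
    # Values of the same key columns in the new version of the row
    new_key = {name: new_row.get(name) for name in old_key}

    records = []
    if old_key and old_key != new_key:
        # Assumption: a non-null _sdc_deleted_at marks the old row as deleted
        records.append({**old_key, '_sdc_deleted_at': time_extracted})
    records.append({**new_row, '_sdc_deleted_at': None})
    return records
```

With the payload above, the helper returns a delete marker for [2, 5, 1] followed by the row [2, 10, 1]. Whether the old row is then removed or only flagged depends on how the target handles _sdc_deleted_at.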

Your environment

  • Version of tap: [1.7.1]
  • Version of python [3.7]
@Samira-El
Contributor

Hey,

This is a known edge case. Primary keys are usually not expected to change during the lifecycle of a record, so handling for this case is not currently implemented.

@JustMaris

Is there a plan to implement handling of this edge case?

@Samira-El
Contributor

I'm afraid not, but if you're up for it, you can send a PR and we will review it. The change isn't that complex.
