
[BUG] Missing data in targets due to key properties in events #34

Open
diegobatt opened this issue May 4, 2021 · 0 comments

All the Events-based streams have the same problem: when dumped into a target (Postgres in my case), data is missing. I explored the issue and it does not seem to be in the data retrieval part, because all the data is in the state.json outputted by the tap. However, the schema is not properly reproduced in these streams. For instance, in feature_events, the table keys are:

"visitor_id", "account_id", "server", "remote_ip"

But this is not how the stream really works: we can have more than one row for that combination of values (we usually do, actually). For example, the same user with the same IP on the same server could generate events in two different features, but feature_id is not a table key. As a result, once one event for this combination is added to the table, the primary key constraint blocks the rest, resulting in missing events.
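To make the collision concrete, here is a minimal sketch that simulates a target keeping one row per primary key. The `load_into_table` helper and the sample values are hypothetical and only stand in for what the target does. Whether the first or the last row survives depends on the target; either way, only one row per key is kept.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def load_into_table(
    records: List[Record], key_properties: List[str]
) -> Dict[Tuple[str, ...], Record]:
    """Simulate a table with a primary key: one row per key tuple."""
    table: Dict[Tuple[str, ...], Record] = {}
    for record in records:
        key = tuple(record[name] for name in key_properties)
        # The first row with a given key is kept; later rows are blocked.
        table.setdefault(key, record)
    return table


# Hypothetical sample data: same visitor, account, server and IP,
# but events in two different features on the same day.
base = {
    "visitor_id": "v1",
    "account_id": "a1",
    "server": "app.example.com",
    "remote_ip": "203.0.113.7",
    "day": "2021-05-03",
}
events = [
    {**base, "feature_id": "feature-A"},
    {**base, "feature_id": "feature-B"},
]

CURRENT_KEYS = ["visitor_id", "account_id", "server", "remote_ip"]
PROPOSED_KEYS = CURRENT_KEYS + ["day", "feature_id"]

rows_current = load_into_table(events, CURRENT_KEYS)
rows_proposed = load_into_table(events, PROPOSED_KEYS)

print(len(rows_current), len(rows_proposed))  # prints: 1 2
```

With the current keys the second event is lost; once day and feature_id are part of the key, both rows are kept.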
Changing these table keys to:

"visitor_id", "account_id", "server", "remote_ip", "day", "feature_id"

seems to solve the problem. From the code perspective, adding day might be a problem: the key properties are a class property, so at the time you define them you don't know what the period will be. I'm sure there is a workaround though, maybe adding both day and hour as key properties.
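One possible workaround, sketched below, is an alternative to listing both day and hour: compute the key properties per instance from the configured period instead of fixing them on the class. This is only a sketch under assumptions. The `FeatureEvents` class shape, the `period` config key and its `"dayRange"` / `"hourRange"` values are hypothetical and may not match the tap's actual code.

```python
from typing import Dict, List


class FeatureEvents:
    """Hypothetical stream class; not the tap's real implementation."""

    # Keys that do not depend on the aggregation period.
    BASE_KEY_PROPERTIES = [
        "visitor_id",
        "account_id",
        "server",
        "remote_ip",
        "feature_id",
    ]

    def __init__(self, config: Dict[str, str]) -> None:
        # Assumed config key and values; the default is also an assumption.
        self.period = config.get("period", "dayRange")

    @property
    def key_properties(self) -> List[str]:
        # Pick the time column that matches the configured period.
        time_key = "hour" if self.period == "hourRange" else "day"
        return self.BASE_KEY_PROPERTIES + [time_key]


daily = FeatureEvents({"period": "dayRange"})
hourly = FeatureEvents({"period": "hourRange"})
print(daily.key_properties[-1], hourly.key_properties[-1])  # prints: day hour
```

Resolving the time column at instance level keeps a single time key in the table, so no key column is left empty for the period that is not in use.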
