All the Events-based streams have the same problem: when dumped into a target (Postgres in my case), data is missing. I explored the issue and it does not seem to be due to the data retrieval part, because all the data is in the state.json outputted by the tap. Nevertheless, the schema is not properly reproduced in these streams. For instance, in feature_events, the table-keys are:
"visitor_id", "account_id", "server", "remote_ip"
But this is not how the stream really works: we can have more than one row for that combination of values (we usually do, actually). For example, the same user with the same IP on the same server could generate events in two different features, but feature_id is not a table-key. As a result, when one event for this combination is added to the table, it blocks the rest of them due to primary key constraints, resulting in missing events.
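To make the collision concrete, here is a minimal sketch that uses an in-memory SQLite table as a stand-in for the Postgres target. It is not the tap's or the target's actual code: the `rows_kept` helper, the sample values, and the `INSERT OR IGNORE` behaviour are assumptions chosen to mimic "the first row blocks the rest" (a real target may upsert instead, which loses data in a different way). The extended key, with `feature_id` and `hour` added, is only an illustration of a key that distinguishes the two events.

```python
import sqlite3

# Table-keys as currently declared for feature_events (from the issue).
ORIGINAL_KEYS = ["visitor_id", "account_id", "server", "remote_ip"]
# Hypothetical wider key, used here only to illustrate the effect.
EXTENDED_KEYS = ORIGINAL_KEYS + ["feature_id", "hour"]

COLUMNS = ["visitor_id", "account_id", "server", "remote_ip", "feature_id", "hour"]


def rows_kept(key_columns, records):
    """Load records into a table with the given primary key and count survivors."""
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f"{name} TEXT" for name in COLUMNS)
    primary_key = ", ".join(key_columns)
    conn.execute(
        f"CREATE TABLE feature_events ({column_defs}, PRIMARY KEY ({primary_key}))"
    )
    placeholders = ", ".join("?" for _ in COLUMNS)
    for record in records:
        # OR IGNORE silently drops rows that collide on the primary key.
        conn.execute(
            f"INSERT OR IGNORE INTO feature_events VALUES ({placeholders})",
            [record[name] for name in COLUMNS],
        )
    count = conn.execute("SELECT COUNT(*) FROM feature_events").fetchone()[0]
    conn.close()
    return count


# Same visitor, account, server and IP, but two different features (made-up values).
records = [
    {"visitor_id": "v1", "account_id": "a1", "server": "s1",
     "remote_ip": "10.0.0.1", "feature_id": "feature-a", "hour": "2021-01-01T10"},
    {"visitor_id": "v1", "account_id": "a1", "server": "s1",
     "remote_ip": "10.0.0.1", "feature_id": "feature-b", "hour": "2021-01-01T10"},
]

print(rows_kept(ORIGINAL_KEYS, records))  # one event is lost
print(rows_kept(EXTENDED_KEYS, records))  # both events are kept
```

With the original four keys only one of the two events survives; once the key includes a column that differs between them, both are stored.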
Changing these table-keys to:
seems to solve the problem. From the code perspective, it might be a problem to add day, because the table key properties are a class property and the period is not yet known when they are defined. I'm sure there is a workaround though, maybe adding both day and hour as key properties.
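One possible shape for such a workaround, sketched under assumptions: instead of fixing the keys at class level, the stream could resolve them once the period is known. The class names, the `base_key_properties` attribute, and the `"day"`/`"hour"` period values below are hypothetical and are not taken from the tap's code.

```python
class EventsStream:
    """Hypothetical base class for Events-based streams."""

    # Keys that do not depend on the period (assumed list, from the issue).
    base_key_properties = ["visitor_id", "account_id", "server", "remote_ip"]

    def __init__(self, period):
        # "day" and "hour" are assumed period names for this sketch.
        if period not in ("day", "hour"):
            raise ValueError(f"unsupported period: {period!r}")
        self.period = period

    @property
    def key_properties(self):
        # Resolved per instance, so the period column can be part of the key.
        return self.base_key_properties + [self.period]


class FeatureEvents(EventsStream):
    """Hypothetical feature_events stream that also keys on feature_id."""

    base_key_properties = EventsStream.base_key_properties + ["feature_id"]


print(FeatureEvents("hour").key_properties)
print(FeatureEvents("day").key_properties)
```

Compared with listing both day and hour as keys, this keeps the key free of a column that is absent for the configured period, at the cost of the keys no longer being a plain class attribute.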