
When running upsert jobs that take over 5 minutes, the state gets moved forward without data being written #97

Closed
loveeklund-osttra opened this issue Sep 13, 2024 · 1 comment


@loveeklund-osttra
Contributor

The `_handle_max_record_age()` method
https://github.com/meltano/sdk/blob/6708cb995c68ab6f74d4874dfc8f978c3b054ceb/singer_sdk/target_base.py#L284
gets called every 5 minutes. It in turn calls `drain_all()` in
target-bigquery/target_bigquery/target.py,
which writes to the target table and writes out a state.

If you are running upsert, the target table that `drain_all()` writes to is a temporary table. Since the merge into the "real" target table doesn't happen until the end of the job, the emitted state is out of sync with the content of the real target table. This can be problematic and lead to unexpected behavior if a job for any reason doesn't reach its end: the state claims progress that the real table never received.
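A minimal, self-contained sketch of the timing problem described above (hypothetical classes, not the actual target-bigquery code): upsert rows are staged in a temporary table, but the periodic `drain_all()` emits state immediately, so the state outruns the end-of-job merge.

```python
from __future__ import annotations


class UpsertTarget:
    """Hypothetical model of an upsert target; not the real implementation."""

    def __init__(self) -> None:
        self.temp_table: list[dict] = []    # staging area for upsert rows
        self.real_table: list[dict] = []    # only updated by the final MERGE
        self.emitted_states: list[int] = []

    def write_record(self, record: dict) -> None:
        # Upsert rows land in the temporary table first.
        self.temp_table.append(record)

    def drain_all(self, latest_state: int) -> None:
        # Called roughly every 5 minutes via _handle_max_record_age():
        # state is emitted even though nothing has been merged yet.
        self.emitted_states.append(latest_state)

    def merge(self) -> None:
        # End-of-job MERGE: only now does the real table catch up.
        self.real_table.extend(self.temp_table)
        self.temp_table.clear()
```

If the job dies after `drain_all()` but before `merge()`, a restart trusts a state bookmark that the real table never actually reached.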

I created PR #96 with a possible solution using the `pre_state_hook`.

@z3z1ma
Owner

z3z1ma commented Sep 16, 2024

I also think #96 will resolve this. I will close this for now, but we can re-open it if the behavior persists. Will cut a new PyPI release soon.

@z3z1ma z3z1ma closed this as completed Sep 16, 2024