
When running upsert jobs that take over 5 minutes, the state gets moved forward without data being written #97

Closed
loveeklund-osttra opened this issue Sep 13, 2024 · 1 comment


@loveeklund-osttra
Contributor

The `_handle_max_record_age()` method
https://github.com/meltano/sdk/blob/6708cb995c68ab6f74d4874dfc8f978c3b054ceb/singer_sdk/target_base.py#L284
gets called every 5 minutes. It in turn calls `drain_all()` in
target-bigquery/target_bigquery/target.py,
which writes to the target table and writes out a state.

If you are running upsert, the target table that `drain_all()` writes to is a temporary table. Since the merge into the "real" target table doesn't happen until the end of the job, the emitted state is out of sync with the content of the real target table. This can be problematic and lead to unexpected behavior if a job for any reason doesn't reach its end: the state claims progress that the real table never received.
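A minimal, self-contained sketch of the timing problem described above (hypothetical classes, not the actual target-bigquery code): upsert rows are staged in a temporary table, but the periodic `drain_all()` emits state immediately, so the state outruns the end-of-job merge.

```python
from __future__ import annotations


class UpsertTarget:
    """Hypothetical model of an upsert target; not the real implementation."""

    def __init__(self) -> None:
        self.temp_table: list[dict] = []    # staging area for upsert rows
        self.real_table: list[dict] = []    # only updated by the final MERGE
        self.emitted_states: list[int] = []

    def write_record(self, record: dict) -> None:
        # Upsert rows land in the temporary table first.
        self.temp_table.append(record)

    def drain_all(self, latest_state: int) -> None:
        # Called roughly every 5 minutes via _handle_max_record_age():
        # state is emitted even though nothing has been merged yet.
        self.emitted_states.append(latest_state)

    def merge(self) -> None:
        # End-of-job MERGE: only now does the real table catch up.
        self.real_table.extend(self.temp_table)
        self.temp_table.clear()
```

If the job dies after `drain_all()` but before `merge()`, a restart trusts a state bookmark that the real table never actually reached.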

I created PR #96 with a possible solution using the `pre_state_hook`.

@z3z1ma
Owner

z3z1ma commented Sep 16, 2024

I also think #96 will resolve this. I will close this for now, but we can re-open it if the behavior persists. Will cut a new PyPI release soon.

@z3z1ma z3z1ma closed this as completed Sep 16, 2024