State gets moved forward without data being written when batch load fails #101
Comments
I've looked into this some more, and I don't think it is `fail_fast` that is causing the issue. From what I've been able to see, the issue arises when this load_job https://github.com/z3z1ma/target-bigquery/blob/main/target_bigquery/batch_job.py#L63 fails for some reason.
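For context, here is a minimal sketch of the kind of load-job call involved, assuming the worker pushes an in-memory NDJSON buffer through the standard google-cloud-bigquery client. The buffer and table names are illustrative, not the actual target-bigquery code:

```python
import io

from google.cloud import bigquery

client = bigquery.Client()
table = "my-project.my_dataset.my_table"  # hypothetical destination
buf = io.BytesIO(b'{"id": 1}\n')          # hypothetical NDJSON batch

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
)
job = client.load_table_from_file(buf, table, job_config=job_config)

# result() blocks until the server-side load finishes and raises
# (e.g. google.api_core.exceptions.BadRequest on a bad row) when the
# job fails -- this is the failure mode described above.
job.result()
```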
Accidentally closed the issue... I think I somewhat understand what happens: it's something with the parallelization not waiting properly when it goes to requeue the job in `BatchJobWorker.run`. I'll try to get some more details soon.
If you want to replicate the error, you can check out this commit: https://github.com/loveeklund-osttra/target-bigquery/tree/308859d93da38135a30433edb523c970f4bdb371

I've tried the other loading methods as well, and all of them hit the same error except gcs_stage, which actually fails because it triggers the load of data into BigQuery in cleanup rather than in the worker's run.

I added some logging statements to get some clarity into why it fails, and I think the problem is that the requeueing logic in `BatchJobWorker.run` causes the pipeline to not wait for the job to finish properly. I'm going to see if I can fix it by removing the retry logic in the workers' run methods.
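As a hypothetical sketch of the bug class being described (this is not the actual `BatchJobWorker.run` code; the queue names and tuple protocol are assumptions): if a worker both requeues a failed batch and signals completion for it, a coordinator that expects one completion per enqueued batch stops waiting too early:

```python
import queue

def run(jobs: queue.Queue, done: queue.Queue) -> None:
    # Illustrative worker loop; names and structure are assumptions.
    while True:
        job = jobs.get()
        try:
            job()                     # submit and wait on the load job; may raise
            done.put(("ok", job))
        except Exception:
            jobs.put(job)             # requeue the batch for a retry...
            done.put(("retry", job))  # ...while also reporting it as handled

# If the coordinator drains by collecting one `done` entry per batch it
# enqueued, a failed-and-requeued batch is counted as finished: the target
# finalizes its sinks and emits state while the retry is still pending
# (or never runs), matching the behaviour above.
```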
I think it is the behaviour described here: https://github.com/z3z1ma/target-bigquery/blob/9d1d0b08606a716a5a36f53b3388cbd6055535a8/target_bigquery/target.py#L544C9-L549C79
I suspect what happened was that one of my workers failed on a bad row while the other was able to write out data, resulting in state being moved forward without any data from the failed sink being written.
What is the upside vs. downside referenced in that comment? Is it that data gets read from the source but not written to the target?
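My reading of the tradeoff, sketched under assumptions (this is not the actual `target.py` code; `drain()` and `emit_state` are hypothetical helpers): withholding state on any sink failure is safe against data loss but lets one bad row block the whole run, whereas emitting state regardless means records the tap already read for the failed sink are never retried:

```python
from typing import Iterable

def drain_all(sinks: Iterable, latest_state: dict, emit_state) -> None:
    # Hypothetical guard: only advance the bookmark if every sink drained.
    failed = [s for s in sinks if not s.drain()]  # drain() -> bool, assumed
    if failed:
        # Failing hard means the tap replays from the old state next run:
        # safe against data loss, but one bad row wedges the pipeline.
        raise RuntimeError(f"{len(failed)} sink(s) failed; state withheld")
    # All sinks flushed successfully; moving state forward is now safe.
    emit_state(latest_state)
```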