BigQuery destinations: print BQ error when sync fails #7938

Closed
sherifnada opened this issue Nov 12, 2021 · 0 comments · Fixed by #8788
sherifnada commented Nov 12, 2021

Tell us about the problem you're trying to solve

Currently, when a BigQuery sync fails, the destination logs look like the following:

destination - 2021-11-12 06:25:07 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:07 INFO i.a.i.b.FailureTrackingAirbyteMessageConsumer(close):60 - {} - Airbyte message consumer: succeeded.
destination - 2021-11-12 06:25:07 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:07 INFO i.a.i.d.b.BigQueryRecordConsumer(close):163 - {} - Started closing all connections
destination - 2021-11-12 06:25:11 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:11 INFO i.a.i.d.b.BigQueryRecordConsumer(closeNormalBigqueryStreams):278 - {} - Waiting for jobs to be finished/closed
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:22 ERROR i.a.i.d.b.BigQueryRecordConsumer(lambda$closeNormalBigqueryStreams$6):284 - {} - Failed to process a message for job: Job{job=JobId{project=alphaproject, job=4b4851c5-90e7-4544-b639-aa32abf53sf, location=europe-west1}, status=JobStatus{state=RUNNING, error=null, executionErrors=null}, statistics=LoadStatistics{creationTime=1636698310929, endTime=null, startTime=1636698311078, numChildJobs=null, parentJobId=null, scriptStatistics=null, reservationUsage=null, inputBytes=null, inputFiles=null, outputBytes=null, outputRows=null, badRecords=null}, [email protected], etag=vXRqor+IpSRnDyoUtMeNLw==, generatedId=alphaproject:europe-west1.4b4851c5-90e7-4544-b639-aa32abf53sf, selfLink=https://www.googleapis.com/bigquery/v2/projects/alphaproject/jobs/4b4851c5-90e7-4544-b639-aa32abf53sf?location=europe-west1, configuration=LoadJobConfiguration{type=LOAD, destinationTable=GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=airbyte, projectId=alphaproject, tableId=_airbyte_tmp_nrf_ms_ads_campaigns}}, decimalTargetTypes=null, destinationEncryptionConfiguration=null, createDisposition=CREATE_IF_NEEDED, writeDisposition=null, formatOptions=FormatOptions{format=NEWLINE_DELIMITED_JSON}, nullMarker=null, maxBadRecords=null, schema=Schema{fields=[Field{name=Id, type=FLOAT, mode=NULLABLE, description=null, policyTags=null}, Field{name=Name, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=Status, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=SubType, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=BudgetId, type=FLOAT, mode=NULLABLE, description=null, policyTags=null}, Field{name=Settings, type=RECORD, mode=NULLABLE, description=null, policyTags=null}, Field{name=TimeZone, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=Languages, type=RECORD, mode=NULLABLE, description=null, policyTags=null}, Field{name=BudgetType, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=DailyBudget, type=FLOAT, mode=NULLABLE, description=null, policyTags=null}, Field{name=CampaignType, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=ExperimentId, type=FLOAT, mode=NULLABLE, description=null, policyTags=null}, Field{name=BiddingScheme, type=RECORD, mode=NULLABLE, description=null, policyTags=null}, Field{name=FinalUrlSuffix, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=TrackingUrlTemplate, type=STRING, mode=NULLABLE, description=null, policyTags=null}, Field{name=UrlCustomParameters, type=RECORD, mode=NULLABLE, description=null, policyTags=null}, Field{name=ForwardCompatibilityMap, type=RECORD, mode=null, description=null, policyTags=null}, Field{name=AudienceAdsBidAdjustment, type=FLOAT, mode=NULLABLE, description=null, policyTags=null}, Field{name=AdScheduleUseSearcherTimeZone, type=BOOLEAN, mode=NULLABLE, description=null, policyTags=null}, Field{name=_airbyte_ab_id, type=STRING, mode=null, description=null, policyTags=null}, Field{name=_airbyte_emitted_at, type=TIMESTAMP, mode=null, description=null, policyTags=null}]}, ignoreUnknownValue=null, sourceUris=null, schemaUpdateOptions=null, autodetect=null, timePartitioning=null, clustering=null, useAvroLogicalTypes=null, labels=null, jobTimeoutMs=null, rangePartitioning=null, hivePartitioningOptions=null}},
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - Streams numbers: 10,
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - SyncMode: WRITE_APPEND,
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - TableName: GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=airbyte, tableId=ms_ads_campaigns}},
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - TmpTableName: GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=airbyte, tableId=_airbyte_tmp_nrf_ms_ads_campaigns}}
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:22 INFO i.a.i.d.b.BigQueryRecordConsumer(printHeapMemoryConsumption):431 - {} - Initial Memory (xms) mb = 126
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:22 INFO i.a.i.d.b.BigQueryRecordConsumer(printHeapMemoryConsumption):432 - {} - Max Memory (xmx) : mb + 5966
destination - 2021-11-12 06:25:22 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:22 INFO i.a.i.d.b.BigQueryRecordConsumer(closeNormalBigqueryStreams):312 - {} - Removing tmp tables...
destination - 2021-11-12 06:25:23 INFO () DefaultAirbyteStreamFactory(lambda$create$0):61 - 2021-11-12 06:25:23 INFO i.a.i.d.b.BigQueryRecordConsumer(closeNormalBigqueryStreams):314 - {} - Finishing destination process...completed
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - Exception in thread "main" java.lang.RuntimeException: com.google.cloud.bigquery.BigQueryException: Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 45; errors: 1. Please look into the errors[] collection for more details.
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryRecordConsumer.lambda$closeNormalBigqueryStreams$6(BigQueryRecordConsumer.java:289)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.commons.lang.Exceptions.castCheckedToRuntime(Exceptions.java:52)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.commons.lang.Exceptions.toRuntime(Exceptions.java:39)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryRecordConsumer.lambda$closeNormalBigqueryStreams$7(BigQueryRecordConsumer.java:279)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at java.base/java.util.HashMap$Values.forEach(HashMap.java:1068)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryRecordConsumer.closeNormalBigqueryStreams(BigQueryRecordConsumer.java:279)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryRecordConsumer.close(BigQueryRecordConsumer.java:169)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryDenormalizedRecordConsumer.close(BigQueryDenormalizedRecordConsumer.java:81)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.base.FailureTrackingAirbyteMessageConsumer.close(FailureTrackingAirbyteMessageConsumer.java:62)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.base.IntegrationRunner.consumeWriteStream(IntegrationRunner.java:152)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.base.IntegrationRunner.run(IntegrationRunner.java:128)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryDenormalizedDestination.main(BigQueryDenormalizedDestination.java:209)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - Caused by: com.google.cloud.bigquery.BigQueryException: Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 45; errors: 1. Please look into the errors[] collection for more details.
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at com.google.cloud.bigquery.Job.reload(Job.java:411)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at com.google.cloud.bigquery.Job.waitFor(Job.java:248)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.bigquery.BigQueryRecordConsumer.lambda$closeNormalBigqueryStreams$6(BigQueryRecordConsumer.java:282)
destination - 2021-11-12 06:25:23 ERROR () LineGobbler(voidCall):82 - 	... 11 more
2021-11-12 06:25:23 INFO () DefaultReplicationWorker(run):124 - Destination thread complete.
2021-11-12 06:25:23 ERROR () DefaultReplicationWorker(run):128 - Sync worker failed.
io.airbyte.workers.WorkerException: Destination process exit with code 1. This warning is normal if the job was cancelled.
	at io.airbyte.workers.protocols.airbyte.DefaultAirbyteDestination.close(DefaultAirbyteDestination.java:114) ~[io.airbyte-airbyte-workers-0.30.36-alpha.jar:?]
	at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:126) ~[io.airbyte-airbyte-workers-0.30.36-alpha.jar:?]
	at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:32) ~[io.airbyte-airbyte-workers-0.30.36-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:167) ~[io.airbyte-airbyte-workers-0.30.36-alpha.jar:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
2021-11-12 06:25:23 INFO () DefaultReplicationWorker(run):152 - sync summary: io.airbyte.config.ReplicationAttemptSummary@4dec3487[status=failed,recordsSynced=3022738,bytesSynced=1617174416,startTime=1636692715347,endTime=1636698323998]
2021-11-12 06:25:23 INFO () DefaultReplicationWorker(run):161 - Source did not output any state messages
2021-11-12 06:25:23 WARN () DefaultReplicationWorker(run):172 - State capture: No state retained.
2021-11-12 06:25:23 INFO () TemporalAttemptExecution(get):137 - Stopping cancellation check scheduling...

This doesn't actually tell us what the problem was; it only says "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 45; errors: 1. Please look into the errors[] collection for more details."

This makes it very difficult to fix the issue quickly, because we have to go back to the user and ask them for more information.

Describe the solution you’d like

Whenever the sync fails, we should read the information in the errors[] collection and print it as part of the logs.
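
As a rough sketch of what this could look like (not the actual Airbyte implementation; the class name, helper method, and logger below are illustrative only): the BigQuery Java client exposes the errors[] collection of a finished job through JobStatus.getExecutionErrors(), so after Job.waitFor() returns (or throws), the destination could log each BigQueryError before surfacing the failure.

import com.google.cloud.bigquery.BigQueryError;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobStatus;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative helper, not existing Airbyte code.
public class BigQueryJobErrorLogger {

  private static final Logger LOGGER = LoggerFactory.getLogger(BigQueryJobErrorLogger.class);

  // Prints every entry of the job's errors[] collection so it ends up in the destination logs.
  public static void logJobErrors(final Job job) {
    final JobStatus status = job.getStatus();
    if (status == null) {
      return;
    }
    // The errors[] collection from the BigQuery API is exposed as the job status's execution errors.
    final List<BigQueryError> executionErrors = status.getExecutionErrors();
    if (executionErrors != null) {
      for (final BigQueryError error : executionErrors) {
        LOGGER.error("BigQuery job {} failed: reason={}, location={}, message={}",
            job.getJobId(), error.getReason(), error.getLocation(), error.getMessage());
      }
    }
    // The top-level error (if any) is worth printing as well.
    if (status.getError() != null) {
      LOGGER.error("BigQuery job {} top-level error: {}", job.getJobId(), status.getError());
    }
  }
}

In the consumer, something like this could be called from the failure path around Job.waitFor() (e.g. where closeNormalBigqueryStreams currently logs "Failed to process a message for job"), so the actual BigQuery error details appear in the sync logs instead of only the generic "look into the errors[] collection" message.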

Describe the alternative you’ve considered or used

Ask the user to look at their own BigQuery instance to find the issue.
