Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛Destination-Bigquery: Added an explicit error message if sync fails due to a config issue #21144

Merged
Merged
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,15 @@
import com.amazonaws.services.s3.AmazonS3;
import com.fasterxml.jackson.databind.JsonNode;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.Schema;
import com.google.cloud.bigquery.TableDataWriteChannel;
import com.google.cloud.bigquery.TableId;
import com.google.cloud.bigquery.WriteChannelConfiguration;
import io.airbyte.commons.exceptions.ConfigErrorException;
import io.airbyte.integrations.destination.bigquery.BigQueryUtils;
import io.airbyte.integrations.destination.bigquery.formatter.BigQueryRecordFormatter;
import io.airbyte.integrations.destination.bigquery.uploader.config.UploaderConfig;
Expand All @@ -30,6 +32,26 @@

public class BigQueryUploaderFactory {

private static final String CONFIG_ERROR_MSG = """
\n
********************************************************************************************
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm..., so I like the formatting in the expanded log view, but I think I would prefer removing the long line of stars so that the message is easier to read in the non-expanded view (judging by the screenshot attached to the PR).

Copy link
Contributor Author

@etsybaev etsybaev Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed stars, thanks. Previously just added it to make the error message more noticeable and fancy :)


Failed to write to destination schema.

1. Make sure you have all required permissions for writing to the schema.

2. Make sure that the actual destination schema's location corresponds to location provided
in connector's config.

3. Try to change the "Destination schema" from "Mirror Source Structure" (if it's set) tp the
"Destination Default" option.

*******************************************************************************************

More details:

""";

public static AbstractBigQueryUploader<?> getUploader(final UploaderConfig uploaderConfig)
throws IOException {
final String schemaName = BigQueryUtils.getSchema(uploaderConfig.getConfig(), uploaderConfig.getConfigStream());
Expand Down Expand Up @@ -141,7 +163,13 @@ private static BigQueryDirectUploader getBigQueryDirectUploader(
.setProject(bigQuery.getOptions().getProjectId())
.build();

final TableDataWriteChannel writer = bigQuery.writer(job, writeChannelConfiguration);
final TableDataWriteChannel writer;

try{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space between try and {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks. Please ignore the code-style issues for now. Once everything is approved I usually run the gradle's "format" command along with version bumping. So such issues would be automatically fixed.

writer = bigQuery.writer(job, writeChannelConfiguration);
}catch (final BigQueryException e){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spaces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks. Please ignore the code-style issues for now. Once everything is approved I usually run the gradle's "format" command along with version bumping. So such issues would be automatically fixed.

throw new ConfigErrorException(CONFIG_ERROR_MSG + e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to make this catch clause more specific, so that only permission and 404 errors result in ConfigErrorException, but other problems (whatever they may be) are not translated into ConfigErrorException?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To tell the truth, this is what I did from the very beginning - i.e. added a catch to check the only exceptions with 404 in message. But after playing a bit more I noticed that all other possible issues were also related to config issues. So then updated code to catch everything as ConfigError. At this part of code, we just create a new connection. So whatever will go wrong - will be more or less related to the config issue. Should I anyway update it to catch the 404 only? Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we'd catch specific error codes and provide error messages specific to those codes. Here is the list of error codes (https://cloud.google.com/bigquery/docs/error-messages). I don't think we can run into all of these here. We can probably assume that we will mostly run into 404 and 403 which we should report as config errors. Occasionally, we might run into 500 and 503, which we should not report as config errors and we should provide a different error message when we run into 500 and 503

}

// this this optional value. If not set - use default client's value (15MiG)
final Integer bigQueryClientChunkSizeFomConfig =
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -78,11 +78,13 @@ class BigQueryDestinationTest {
Path.of("secrets/credentials-standard-no-dataset-creation.json");
protected static final Path CREDENTIALS_NON_BILLABLE_PROJECT_PATH =
Path.of("secrets/credentials-standard-non-billable-project.json");
protected static final Path CREDENTIALS_NO_EDIT_PUBLIC_SCHEMA_ROLE_PATH =
Path.of("secrets/credentials-no-edit-public-schema-role.json");
protected static final Path CREDENTIALS_WITH_GCS_STAGING_PATH =
Path.of("secrets/credentials-gcs-staging.json");

protected static final Path[] ALL_PATHS = {CREDENTIALS_WITH_GCS_STAGING_PATH, CREDENTIALS_BAD_PROJECT_PATH, CREDENTIALS_NO_DATASET_CREATION_PATH,
CREDENTIALS_NON_BILLABLE_PROJECT_PATH, CREDENTIALS_WITH_GCS_STAGING_PATH};
CREDENTIALS_NO_EDIT_PUBLIC_SCHEMA_ROLE_PATH,CREDENTIALS_NON_BILLABLE_PROJECT_PATH, CREDENTIALS_WITH_GCS_STAGING_PATH};
private static final Logger LOGGER = LoggerFactory.getLogger(BigQueryDestinationTest.class);
private static final String DATASET_NAME_PREFIX = "bq_dest_integration_test";

Expand Down Expand Up @@ -116,6 +118,7 @@ class BigQueryDestinationTest {
protected static JsonNode configWithProjectId;
protected static JsonNode configWithBadProjectId;
protected static JsonNode insufficientRoleConfig;
protected static JsonNode noEditPublicSchemaRoleConfig;
protected static JsonNode nonBillableConfig;
protected static JsonNode gcsStagingConfig; //default BigQuery config. Also used for setup/teardown
protected BigQuery bigquery;
Expand Down Expand Up @@ -144,6 +147,7 @@ private Stream<Arguments> failCheckTestConfigProvider() {
private Stream<Arguments> failWriteTestConfigProvider() {
return Stream.of(
Arguments.of("configWithBadProjectId", "User does not have bigquery.datasets.create permission in project"),
Arguments.of("noEditPublicSchemaRoleConfig", "Failed to write to destination schema."), // (or it may not exist)
Arguments.of("insufficientRoleConfig", "Permission bigquery.tables.create denied")
);
}
Expand Down Expand Up @@ -178,6 +182,8 @@ public static void beforeAll() throws IOException {
insufficientRoleConfig = BigQueryDestinationTestUtils.createConfig(CREDENTIALS_NO_DATASET_CREATION_PATH, datasetId);
//config that tries to write to a project with disabled billing (free tier)
nonBillableConfig = BigQueryDestinationTestUtils.createConfig(CREDENTIALS_NON_BILLABLE_PROJECT_PATH, "testnobilling");
//config that has no privileges to edit anything in Public schema
noEditPublicSchemaRoleConfig = BigQueryDestinationTestUtils.createConfig(CREDENTIALS_NO_EDIT_PUBLIC_SCHEMA_ROLE_PATH, "public");
//config with GCS staging
gcsStagingConfig = BigQueryDestinationTestUtils.createConfig(CREDENTIALS_WITH_GCS_STAGING_PATH, datasetId);

Expand All @@ -199,6 +205,7 @@ public static void beforeAll() throws IOException {
put("configWithProjectId", configWithProjectId);
put("configWithBadProjectId", configWithBadProjectId);
put("insufficientRoleConfig", insufficientRoleConfig);
put("noEditPublicSchemaRoleConfig", noEditPublicSchemaRoleConfig);
put("nonBillableConfig", nonBillableConfig);
put("gcsStagingConfig", gcsStagingConfig);
}};
Expand Down