Add BigTable source format in BigQuery tables #4155

Merged
Changes from 2 commits
3 changes: 2 additions & 1 deletion products/bigquery/api.yaml
@@ -722,7 +722,8 @@ objects:
description: |
The format of the data files. For CSV files, specify "CSV". For datastore backups, specify "DATASTORE_BACKUP".
For newline-delimited JSON, specify "NEWLINE_DELIMITED_JSON". For Avro, specify "AVRO". For parquet, specify "PARQUET".
For orc, specify "ORC". The default value is CSV.
For orc, specify "ORC". [Beta] For Bigtable, specify "BIGTABLE".
The default value is CSV.
default_value: 'CSV'
- !ruby/object:Api::Type::Boolean
name: 'allowJaggedRows'
4 changes: 2 additions & 2 deletions third_party/terraform/resources/resource_bigquery_table.go
@@ -153,9 +153,9 @@ func resourceBigQueryTable() *schema.Resource {
"source_format": {
Type: schema.TypeString,
Required: true,
Description: `The data format. Supported values are: "CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", and "DATSTORE_BACKUP". To use "GOOGLE_SHEETS" the scopes must include "googleapis.com/auth/drive.readonly".`,
Description: `The data format. Supported values are: "CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", "DATSTORE_BACKUP", and "BIGTABLE". To use "GOOGLE_SHEETS" the scopes must include "googleapis.com/auth/drive.readonly".`,
Contributor
I wonder if ORC can also be included in the source_format list? We have an open issue (hashicorp/terraform-provider-google#7691) for that type.

Contributor Author
@jmthvt Nov 10, 2020

Hi, sorry, but that's out of scope for this PR. Adding it to the list is trivial, but it requires writing the tests.

ValidateFunc: validation.StringInSlice([]string{
"CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "DATSTORE_BACKUP", "PARQUET",
"CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "DATSTORE_BACKUP", "PARQUET", "BIGTABLE",
}, false),
},
// SourceURIs [Required] The fully-qualified URIs that point to your data in Google Cloud.
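The `ValidateFunc` in the diff above rejects unknown formats at plan time rather than at the API. A minimal sketch of what `validation.StringInSlice` produces (this is a simplified reimplementation for illustration, not the real helper from the SDK's `helper/validation` package):

```go
package main

import (
	"fmt"
	"strings"
)

// validSourceFormats mirrors the slice passed to the ValidateFunc in the diff,
// including the newly added "BIGTABLE" value.
var validSourceFormats = []string{
	"CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "DATSTORE_BACKUP", "PARQUET", "BIGTABLE",
}

// stringInSlice returns a validator that accepts a value only if it appears in
// valid; ignoreCase corresponds to the `false` argument in the schema above,
// which makes the comparison case-sensitive.
func stringInSlice(valid []string, ignoreCase bool) func(v interface{}, k string) ([]string, []error) {
	return func(v interface{}, k string) ([]string, []error) {
		s, ok := v.(string)
		if !ok {
			return nil, []error{fmt.Errorf("expected type of %q to be string", k)}
		}
		for _, want := range valid {
			if s == want || (ignoreCase && strings.EqualFold(s, want)) {
				return nil, nil
			}
		}
		return nil, []error{fmt.Errorf("expected %q to be one of %v, got %q", k, valid, s)}
	}
}

func main() {
	check := stringInSlice(validSourceFormats, false)
	if _, errs := check("BIGTABLE", "source_format"); len(errs) == 0 {
		fmt.Println("BIGTABLE accepted")
	}
	if _, errs := check("ORC", "source_format"); len(errs) > 0 {
		fmt.Println("ORC rejected")
	}
}
```

This illustrates why the ORC question in the review thread matters: even though the API may accept a value, the provider refuses it unless it is in this list.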
55 changes: 55 additions & 0 deletions third_party/terraform/tests/resource_bigquery_table_test.go
@@ -416,6 +416,30 @@ func TestAccBigQueryDataTable_sheet(t *testing.T) {
})
}

func TestAccBigQueryDataTable_bigtable(t *testing.T) {
Contributor

This test should only be run for the beta provider, right? This file may need to be renamed to end in .go.erb, and this test wrapped in if/else tags to check the version.

An example can be seen here: https://github.com/GoogleCloudPlatform/magic-modules/blob/master/third_party/terraform/tests/resource_binaryauthorization_policy_test.go.erb#L41

Contributor Author
@jmthvt Oct 30, 2020

That's quite a weird one, actually. As I understand it, you can set the format to BIGTABLE in the stable API (search for sourceFormat in https://bigquery.googleapis.com/discovery/v1/apis/bigquery/v2/rest). However, only 2 regions are supported for the queries: https://cloud.google.com/bigquery/external-data-bigtable#supported_regions_and_zones.

From the BigQuery API point of view, I don't think it requires the beta provider, but from the Bigtable side it does. Not sure of the preferred approach here.

t.Parallel()

context := map[string]interface{}{
"random_suffix": randString(t, 10),
}

vcrTest(t, resource.TestCase{
PreCheck: func() { testAccPreCheck(t) },
Providers: testAccProviders,
CheckDestroy: testAccCheckBigQueryTableDestroyProducer(t),
Steps: []resource.TestStep{
{
Config: testAccBigQueryTableFromBigtable(context),
Contributor

Looks like this doesn't match the function declaration, causing this error:
google-beta/resource_bigquery_table_test.go:432:13: undefined: testAccBigQueryTableFromBigtable

Contributor Author

oops, thanks!
Should be fixed now.

},
{
ResourceName: "google_bigquery_table.table",
ImportState: true,
ImportStateVerify: true,
},
},
})
}
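The review thread above suggests gating this test so it only runs against the beta provider by renaming the file to `.go.erb`. A sketch of that version-gating pattern, assuming the same ERB convention used by the linked binaryauthorization test template in magic-modules (the exact tag syntax should be checked against that file):

```erb
<% unless version == 'ga' -%>
func TestAccBigQueryDataTable_bigtable(t *testing.T) {
	// beta-only test body goes here
}
<% end -%>
```

When the GA provider is generated, the templating step drops the guarded block entirely, so the test is compiled only into the beta provider.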

func testAccCheckBigQueryExtData(t *testing.T, expectedQuoteChar string) resource.TestCheckFunc {
return func(s *terraform.State) error {
for _, rs := range s.RootModule().Resources {
@@ -1048,6 +1072,37 @@ func testAccBigQueryTableFromSheet(context map[string]interface{}) string {
`, context)
}

func testAccBigQueryTableFromBigTable(context map[string]interface{}) string {
return Nprintf(`
resource "google_bigquery_table" "table" {
dataset_id = google_bigquery_dataset.dataset.dataset_id
table_id = "tf_test_bigtable_%{random_suffix}"

external_data_configuration {
autodetect = true
source_format = "BIGTABLE"
ignore_unknown_values = true

source_uris = [
      "https://googleapis.com/bigtable/projects/project_id/instances/instance_id/tables/table_name",
]
}
}

resource "google_bigquery_dataset" "dataset" {
dataset_id = "tf_test_ds_%{random_suffix}"
friendly_name = "test"
description = "This is a test description"
location = "EU"
default_table_expiration_ms = 3600000

labels = {
env = "default"
}
}
`, context)
}
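The fixture above builds its HCL with `Nprintf`, which substitutes `%{key}` placeholders from the context map (so `%{random_suffix}` becomes the random string generated in the test). A minimal sketch of that style of interpolation, assuming the helper simply does named string replacement:

```go
package main

import (
	"fmt"
	"strings"
)

// nprintf is a simplified stand-in for the Nprintf test helper: it replaces
// each %{key} placeholder in format with the corresponding value from params.
func nprintf(format string, params map[string]interface{}) string {
	for key, val := range params {
		format = strings.ReplaceAll(format, "%{"+key+"}", fmt.Sprintf("%v", val))
	}
	return format
}

func main() {
	out := nprintf(`table_id = "tf_test_bigtable_%{random_suffix}"`, map[string]interface{}{
		"random_suffix": "abc123",
	})
	fmt.Println(out) // table_id = "tf_test_bigtable_abc123"
}
```

Named placeholders keep the large HCL fixtures readable compared to positional `fmt.Sprintf` verbs, since the same suffix is spliced into several resource names.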

var TEST_CSV = `lifelock,LifeLock,,web,Tempe,AZ,1-May-07,6850000,USD,b
lifelock,LifeLock,,web,Tempe,AZ,1-Oct-06,6000000,USD,a
lifelock,LifeLock,,web,Tempe,AZ,1-Jan-08,25000000,USD,c
@@ -180,7 +180,7 @@ The `external_data_configuration` block supports:

* `source_format` (Required) - The data format. Supported values are:
"CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET",
and "DATSTORE_BACKUP". To use "GOOGLE_SHEETS"
"DATSTORE_BACKUP", and "BIGTABLE". To use "GOOGLE_SHEETS"
the `scopes` must include
"https://www.googleapis.com/auth/drive.readonly".
