BigQuery Table Hive Partitioning not Working with Explicit Schema (autodetect=false) #6693
Comments
@LoekL it would help if you could run this with |
@ffung, dropped a table & then:
Other one:
@ffung It appears |
if you look at the Besides that I currently don't see any differences between the
{
  "tableReference": {
    "tableId": "events_playfab_hive_partitioned_batched",
    "datasetId": "external",
    "projectId": "tidal-turbine-222414"
  },
  "externalDataConfiguration": {
    "autodetect": true,
    "hivePartitioningOptions": {
      "mode": "CUSTOM",
      "sourceUriPrefix": "gs://tidal-turbine-222414-analytics-testing-batched/data_type=jsonl/event_schema=playfab/{event_category:STRING}/{event_environment:STRING}/{event_date:DATE}/{event_hour:STRING}/{event_minute:STRING}"
    },
    "ignoreUnknownValues": true,
    "schema": {
      "fields": [
        {"name": "AnalyticsEnvironment", "type": "STRING"},
        {"name": "PlayFabEnvironment", "type": "STRING"},
        {"name": "SourceType", "type": "STRING"},
        {"name": "Source", "type": "STRING"},
        {"name": "EventNamespace", "type": "STRING"},
        {"name": "TitleId", "type": "STRING"},
        {"name": "GroupBatchId", "type": "STRING"},
        {"name": "BatchId", "type": "STRING"},
        {"name": "EventId", "type": "STRING"},
        {"name": "EventName", "type": "STRING"},
        {"name": "EntityType", "type": "STRING"},
        {"name": "EntityId", "type": "STRING"},
        {"name": "Timestamp", "type": "TIMESTAMP"},
        {"name": "ReceivedTimestamp", "type": "TIMESTAMP"},
        {"name": "BatchedTimestamp", "type": "TIMESTAMP"},
        {"name": "BatchJobName", "type": "STRING"},
        {"name": "EventAttributes", "type": "STRING"}
      ]
    },
    "sourceFormat": "NEWLINE_DELIMITED_JSON",
    "sourceUris": [
      "gs://tidal-turbine-222414-analytics-testing-batched/data_type=jsonl/event_schema=playfab/*"
    ]
  }
}
@ffung see first post, bq mkdef seems to always set this to Using
I also just tried adding |
@LoekL thanks for the info, I had a closer look at the payloads and there's a difference, |
I see, so it should work if we move |
yes, the terraform schema for |
Seems like the docs for this didn't make it to the Terraform docs site? It's not showing it's possible to specify schema under external_data_configuration block |
It's not clear what it was but it's there now. Thanks for looking into it. |
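For anyone landing here later, this is a rough sketch of the configuration the thread appears to converge on: the schema goes inside the external_data_configuration block rather than at the top level of the table resource. It assumes a provider version that includes the nested schema argument. The resource name and the shortened field list are illustrative only; the project, dataset, table, and bucket values are copied from the JSON payload earlier in the thread.

```hcl
# Sketch only, not a verified reproduction of the reporter's setup.
# "events_playfab" is a hypothetical resource name; the schema below is
# trimmed to three of the fields from the JSON payload above.
resource "google_bigquery_table" "events_playfab" {
  project    = "tidal-turbine-222414"
  dataset_id = "external"
  table_id   = "events_playfab_hive_partitioned_batched"

  external_data_configuration {
    autodetect            = false
    source_format         = "NEWLINE_DELIMITED_JSON"
    ignore_unknown_values = true

    source_uris = [
      "gs://tidal-turbine-222414-analytics-testing-batched/data_type=jsonl/event_schema=playfab/*",
    ]

    hive_partitioning_options {
      mode              = "CUSTOM"
      source_uri_prefix = "gs://tidal-turbine-222414-analytics-testing-batched/data_type=jsonl/event_schema=playfab/{event_category:STRING}/{event_environment:STRING}/{event_date:DATE}/{event_hour:STRING}/{event_minute:STRING}"
    }

    # Explicit schema nested in the external data configuration, so it is
    # sent alongside the hive partitioning options.
    schema = jsonencode([
      { name = "EventId", type = "STRING" },
      { name = "EventName", type = "STRING" },
      { name = "Timestamp", type = "TIMESTAMP" },
    ])
  }
}
```

The partition keys (event_category, event_date, and so on) are deliberately left out of the schema, since they are declared in source_uri_prefix.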
Community Note
If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.
Terraform Version
Affected Resource(s)
Terraform Configuration Files
Debug Output
Terraform did not produce an error.
Expected Behavior
I should have gotten tables with Hive partition columns, however the tables that were successfully created did not show the hive partitions.
Actual Behavior
This is an example table that Terraform produced:
Steps to Reproduce
I have tried applying the same without passing it an explicit schema, and set autodetect = true. Now the tables produced do have the Hive partitions, however:
For these reasons I'd like to pass it an explicit schema.
I get it working correctly via:
The definition file looks like this:
Note that in bq mkdef I did not set --autodetect, yet the file still has "autodetect": true (other bug -- Google Cloud SDK 298.0.0 & bq 2.0.58?). Changing this to false, or removing it altogether, doesn't change anything though, it still works:
Finally,
Compression = GZIP, whereas the working one does not have this set. But I'm not sure how this would translate to the missing partitions. Also note I am using decompressive transcoding (all files in Cloud Storage have Content-Encoding: gzip set).
bq mk) and is overriding the hive path partitions?
References