Data Catalog Entry #3532

Merged
200 changes: 200 additions & 0 deletions products/datacatalog/api.yaml
@@ -72,6 +72,206 @@ objects:
name: description
description: |
Entry group description, which can consist of several sentences or paragraphs that describe entry group contents.
- !ruby/object:Api::Resource
name: Entry
base_url: '{{entry_group}}/entries'
create_url: '{{entry_group}}/entries?entryId={{entry_id}}'
self_link: "{{name}}"
update_verb: :PATCH
update_mask: true
description: |
Entry Metadata. A Data Catalog Entry resource represents another resource in Google Cloud Platform
(such as a BigQuery dataset or a Pub/Sub topic) or outside of Google Cloud Platform. Clients can use
the linkedResource field in the Entry resource to refer to the original resource ID of the source system.

An Entry resource contains resource details, such as its schema. An Entry can also be used to attach
flexible metadata, such as a Tag.
references: !ruby/object:Api::Resource::ReferenceLinks
guides:
'Official Documentation': https://cloud.google.com/data-catalog/docs
api: https://cloud.google.com/data-catalog/docs/reference/rest/v1/projects.locations.entryGroups.entries
parameters:
- !ruby/object:Api::Type::String
name: entryGroup
required: true
url_param_only: true
input: true
description: |
The name of the entry group this entry is in.
- !ruby/object:Api::Type::String
name: entryId
required: true
url_param_only: true
input: true
description: |
The ID of the entry to create.
properties:
- !ruby/object:Api::Type::String
name: name
description: |
The Data Catalog resource name of the entry in URL format.
Example: projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}.
Note that this Entry and its child resources may not actually be stored in the location in this name.
output: true
- !ruby/object:Api::Type::String
name: linkedResource
description: |
The resource this metadata entry refers to.
For Google Cloud Platform resources, linkedResource is the full name of the resource.
For example, the linkedResource for a table resource from BigQuery is:
//bigquery.googleapis.com/projects/projectId/datasets/datasetId/tables/tableId
Output only when the Entry is of a type in the EntryType enum. For entries with userSpecifiedType,
this field is optional and defaults to an empty string.
- !ruby/object:Api::Type::String
name: displayName
description: |
Display information such as title and description. A short name to identify the entry,
for example, "Analytics Data - Jan 2011".
- !ruby/object:Api::Type::String
name: description
description: |
Entry description, which can consist of several sentences or paragraphs that describe entry contents.
- !ruby/object:Api::Type::String
# This is a string instead of a NestedObject because schemas contain ColumnSchemas, which can contain nested ColumnSchemas.
# We'll have people provide the JSON blob for the schema instead.
name: schema
description: |
Schema of the entry (e.g. BigQuery, GoogleSQL, Avro schema), as a JSON string. An entry might not have any schema
attached to it. See
https://cloud.google.com/data-catalog/docs/reference/rest/v1/projects.locations.entryGroups.entries#schema
for what fields this schema can contain.
- !ruby/object:Api::Type::Enum
name: type
description: |
The type of the entry. Only used for entries with types in the EntryType enum.
Currently, only the FILESET enum value is allowed. All other entries created through Data Catalog must use userSpecifiedType.
values:
- :FILESET
input: true
exactly_one_of:
- type
- user_specified_type
- !ruby/object:Api::Type::String
name: userSpecifiedType
description: |
Entry type if it does not fit any of the input-allowed values listed in the EntryType enum above.
When creating an entry, users should check the enum values first. If nothing matches the entry
to be created, provide a custom value, for example "my_special_type".
userSpecifiedType strings must begin with a letter or underscore and can only contain letters,
numbers, and underscores; are case insensitive; must be at least 1 character and at most 64 characters long.
exactly_one_of:
- type
- user_specified_type
- !ruby/object:Api::Type::String
name: integratedSystem
description: |
This field indicates the entry's source system that Data Catalog integrates with, such as BigQuery or Pub/Sub.
output: true
- !ruby/object:Api::Type::String
name: userSpecifiedSystem
description: |
This field indicates the entry's source system that Data Catalog does not integrate with.
userSpecifiedSystem strings must begin with a letter or underscore and can only contain letters, numbers,
and underscores; are case insensitive; must be at least 1 character and at most 64 characters long.
- !ruby/object:Api::Type::NestedObject
name: gcsFilesetSpec
description: |
Specification that applies to a Cloud Storage fileset. This is only valid on entries of type FILESET.
properties:
- !ruby/object:Api::Type::Array
name: filePatterns
description: |
Patterns to identify a set of files in Google Cloud Storage.
See [Cloud Storage documentation](https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames)
for more information. Note that bucket wildcards are currently not supported. Examples of valid filePatterns:

* gs://bucket_name/dir/*: matches all files within bucket_name/dir directory.
* gs://bucket_name/dir/**: matches all files in bucket_name/dir spanning all subdirectories.
* gs://bucket_name/file*: matches files prefixed by file in bucket_name.
* gs://bucket_name/??.txt: matches files with two characters followed by .txt in bucket_name.
* gs://bucket_name/[aeiou].txt: matches files that contain a single vowel character followed by .txt in bucket_name.
* gs://bucket_name/[a-m].txt: matches files that contain a, b, ... or m followed by .txt in bucket_name.
* gs://bucket_name/a/*/b: matches all files in bucket_name that match a/*/b pattern, such as a/c/b, a/d/b.
* gs://another_bucket/a.txt: matches gs://another_bucket/a.txt.
required: true
item_type: Api::Type::String
- !ruby/object:Api::Type::Array
name: sampleGcsFileSpecs
description: |
Sample files contained in this fileset. Not all files contained in this fileset are represented here.
output: true
item_type: !ruby/object:Api::Type::NestedObject
properties:
- !ruby/object:Api::Type::String
name: filePath
description: |
The full file path.
output: true
- !ruby/object:Api::Type::Integer
name: sizeBytes
description: |
The size of the file, in bytes.
output: true
- !ruby/object:Api::Type::NestedObject
name: bigqueryTableSpec
description: |
Specification that applies to a BigQuery table. This is only valid on entries of type TABLE.
output: true
properties:
- !ruby/object:Api::Type::String
name: tableSourceType
description: |
The table source type.
output: true
- !ruby/object:Api::Type::NestedObject
name: viewSpec
description: |
Table view specification. This field should only be populated if tableSourceType is BIGQUERY_VIEW.
output: true
properties:
- !ruby/object:Api::Type::String
name: viewQuery
description: |
The query that defines the table view.
output: true
- !ruby/object:Api::Type::NestedObject
name: tableSpec
description: |
Spec of a BigQuery table. This field should only be populated if tableSourceType is BIGQUERY_TABLE.
output: true
properties:
- !ruby/object:Api::Type::String
name: groupedEntry
description: |
If the table is a dated shard, i.e., with name pattern [prefix]YYYYMMDD, groupedEntry is the
Data Catalog resource name of the date sharded grouped entry, for example,
projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}.
Otherwise, groupedEntry is empty.
output: true
- !ruby/object:Api::Type::NestedObject
name: bigqueryDateShardedSpec
description: |
Specification for a group of BigQuery tables with name pattern [prefix]YYYYMMDD.
Context: https://cloud.google.com/bigquery/docs/partitioned-tables#partitioning_versus_sharding.
output: true
properties:
- !ruby/object:Api::Type::String
name: dataset
description: |
The Data Catalog resource name of the dataset entry the current table belongs to, for example,
projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}.
output: true
- !ruby/object:Api::Type::String
name: tablePrefix
description: |
The table name prefix of the shards. The name of any given shard is [tablePrefix]YYYYMMDD,
for example, for shard MyTable20180101, the tablePrefix is MyTable.
output: true
- !ruby/object:Api::Type::Integer
name: shardCount
description: |
Total number of shards.
output: true

# Blocked on b/155304495
# - !ruby/object:Api::Resource
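
Review note on the userSpecifiedType and userSpecifiedSystem descriptions above: the stated rule (first character a letter or underscore, then letters, numbers, or underscores, 1 to 64 characters in total) can be checked with a single anchored pattern. The sketch below is only an illustration of that rule; `identifierRE` and `validIdentifier` are hypothetical names and are not part of the provider or the API. Note that an ASCII range written as `A-z` would also admit `[`, `\`, `]`, `^`, and the backtick, so the letter ranges are spelled out separately here.

```go
package main

import (
	"fmt"
	"regexp"
)

// identifierRE encodes the documented rule: the first character is a letter
// or underscore, the remaining 0 to 63 characters are letters, digits, or
// underscores. The name is hypothetical, chosen for this sketch.
var identifierRE = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]{0,63}$`)

// validIdentifier reports whether s satisfies the documented constraint.
// It is a hypothetical helper, not a function from the provider.
func validIdentifier(s string) bool {
	return identifierRE.MatchString(s)
}

func main() {
	for _, s := range []string{"my_special_type", "_x", "9lives", "has-dash", "a[b", ""} {
		fmt.Printf("%q valid: %v\n", s, validIdentifier(s))
	}
}
```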
42 changes: 41 additions & 1 deletion products/datacatalog/terraform.yaml
@@ -14,6 +14,7 @@
--- !ruby/object:Provider::Terraform::Config
overrides: !ruby/object:Overrides::ResourceOverrides
EntryGroup: !ruby/object:Overrides::Terraform::ResourceOverride
import_format: ["{{name}}"]
examples:
- !ruby/object:Provider::Terraform::Examples
name: "data_catalog_entry_group_basic"
@@ -36,7 +37,46 @@ overrides: !ruby/object:Overrides::ResourceOverrides
required: false
default_from_api: true
custom_code: !ruby/object:Provider::Terraform::CustomCode
custom_import: templates/terraform/custom_import/self_link_as_name.erb
custom_import: templates/terraform/custom_import/data_catalog_entry_group.go.erb
Entry: !ruby/object:Overrides::Terraform::ResourceOverride
import_format: ["{{name}}"]
supports_indirect_user_project_override: true
examples:
- !ruby/object:Provider::Terraform::Examples
name: "data_catalog_entry_basic"
primary_resource_id: "basic_entry"
vars:
entry_id: "my_entry"
entry_group_id: "my_group"
- !ruby/object:Provider::Terraform::Examples
name: "data_catalog_entry_fileset"
primary_resource_id: "basic_entry"
vars:
entry_id: "my_entry"
entry_group_id: "my_group"
- !ruby/object:Provider::Terraform::Examples
name: "data_catalog_entry_full"
primary_resource_id: "basic_entry"
vars:
entry_id: "my_entry"
entry_group_id: "my_group"
properties:
linkedResource: !ruby/object:Overrides::Terraform::PropertyOverride
default_from_api: true
schema: !ruby/object:Overrides::Terraform::PropertyOverride
custom_expand: 'templates/terraform/custom_expand/json_schema.erb'
custom_flatten: 'templates/terraform/custom_flatten/json_schema.erb'
state_func: 'func(v interface{}) string { s, _ := structure.NormalizeJsonString(v); return s }'
validation: !ruby/object:Provider::Terraform::Validation
function: 'validation.ValidateJsonString'
userSpecifiedSystem: !ruby/object:Overrides::Terraform::PropertyOverride
validation: !ruby/object:Provider::Terraform::Validation
regex: '^[A-Za-z_][A-Za-z0-9_]{0,63}$'
userSpecifiedType: !ruby/object:Overrides::Terraform::PropertyOverride
validation: !ruby/object:Provider::Terraform::Validation
regex: '^[A-Za-z_][A-Za-z0-9_]{0,63}$'
custom_code: !ruby/object:Provider::Terraform::CustomCode
custom_import: templates/terraform/custom_import/data_catalog_entry.go.erb
# TagTemplate: !ruby/object:Overrides::Terraform::ResourceOverride
# examples:
# - !ruby/object:Provider::Terraform::Examples
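
Review note on the `schema` override above: the `state_func` stores a normalized form of the JSON string, so two schema strings that differ only in whitespace or key order should not show up as a diff. The sketch below illustrates that idea with a plain decode and re-encode round trip. It is an assumption about the general technique, not the implementation of `structure.NormalizeJsonString`; `normalizeJSON` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeJSON is a hypothetical stand-in for a JSON-normalizing state
// function. It decodes the input and re-encodes it; encoding/json writes
// object keys in sorted order and without extra whitespace, so equivalent
// documents end up with the same string form.
func normalizeJSON(s string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(s), &v); err != nil {
		return "", err
	}
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	a, errA := normalizeJSON(`{"columns": [{"type": "STRING", "column": "first_name"}]}`)
	b, errB := normalizeJSON(`{"columns":[{"column":"first_name","type":"STRING"}]}`)
	fmt.Println(a == b, errA, errB)
}
```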
4 changes: 2 additions & 2 deletions products/healthcare/terraform.yaml
@@ -107,8 +107,8 @@ overrides: !ruby/object:Overrides::ResourceOverrides
creationTime: !ruby/object:Overrides::Terraform::PropertyOverride
exclude: true
parserConfig.schema: !ruby/object:Overrides::Terraform::PropertyOverride
custom_expand: 'templates/terraform/custom_expand/healthcare_hl7_v2_store_schema.erb'
custom_flatten: 'templates/terraform/custom_flatten/healthcare_hl7_v2_store_schema.erb'
custom_expand: 'templates/terraform/custom_expand/json_schema.erb'
custom_flatten: 'templates/terraform/custom_flatten/json_schema.erb'
state_func: 'func(v interface{}) string { s, _ := structure.NormalizeJsonString(v); return s }'
validation: !ruby/object:Provider::Terraform::Validation
function: 'validation.ValidateJsonString'
17 changes: 17 additions & 0 deletions templates/terraform/custom_import/data_catalog_entry.go.erb
@@ -0,0 +1,17 @@
config := meta.(*Config)

// current import_formats can't import fields with forward slashes in their value
if err := parseImportId([]string{"(?P<name>.+)"}, d, config); err != nil {
	return nil, err
}

name := d.Get("name").(string)
egRegex := regexp.MustCompile("^(projects/[^/]+/locations/[^/]+/entryGroups/[^/]+)/entries/([^/]+)$")

parts := egRegex.FindStringSubmatch(name)
if len(parts) != 3 {
	return nil, fmt.Errorf("entry name does not fit the format %s", egRegex)
}
d.Set("entry_group", parts[1])
d.Set("entry_id", parts[2])
return []*schema.ResourceData{d}, nil
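
Review note on the entry import above: the import splits a full entry name into its entry group and entry ID. The standalone sketch below shows the same split so it can be run outside the provider. It assumes a stricter, anchored pattern in which no path segment contains a slash; `entryNameRE` and `splitEntryName` are hypothetical names, and the sample name in `main` is made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// entryNameRE is anchored at both ends and restricts every segment to
// non-slash characters (an assumption made for this sketch).
var entryNameRE = regexp.MustCompile(
	`^(projects/[^/]+/locations/[^/]+/entryGroups/[^/]+)/entries/([^/]+)$`)

// splitEntryName is a hypothetical helper that returns the entry group name
// and the entry ID contained in a full entry name.
func splitEntryName(name string) (entryGroup, entryID string, err error) {
	parts := entryNameRE.FindStringSubmatch(name)
	if len(parts) != 3 {
		return "", "", fmt.Errorf("entry name %q does not fit the format %s", name, entryNameRE)
	}
	return parts[1], parts[2], nil
}

func main() {
	group, id, err := splitEntryName(
		"projects/my-project/locations/us-central1/entryGroups/my_group/entries/my_entry")
	fmt.Println(group, id, err)
}
```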
18 changes: 18 additions & 0 deletions templates/terraform/custom_import/data_catalog_entry_group.go.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
config := meta.(*Config)

// current import_formats can't import fields with forward slashes in their value
if err := parseImportId([]string{"(?P<name>.+)"}, d, config); err != nil {
	return nil, err
}

name := d.Get("name").(string)
egRegex := regexp.MustCompile("^projects/([^/]+)/locations/([^/]+)/entryGroups/([^/]+)$")

parts := egRegex.FindStringSubmatch(name)
if len(parts) != 4 {
	return nil, fmt.Errorf("entry group name does not fit the format %s", egRegex)
}
d.Set("project", parts[1])
d.Set("region", parts[2])
d.Set("entry_group_id", parts[3])
return []*schema.ResourceData{d}, nil
11 changes: 11 additions & 0 deletions templates/terraform/examples/data_catalog_entry_basic.tf.erb
@@ -0,0 +1,11 @@
resource "google_data_catalog_entry" "<%= ctx[:primary_resource_id] %>" {
  entry_group = google_data_catalog_entry_group.entry_group.id
  entry_id    = "<%= ctx[:vars]['entry_id'] %>"

  user_specified_type   = "my_custom_type"
  user_specified_system = "SomethingExternal"
}

resource "google_data_catalog_entry_group" "entry_group" {
  entry_group_id = "<%= ctx[:vars]['entry_group_id'] %>"
}
14 changes: 14 additions & 0 deletions templates/terraform/examples/data_catalog_entry_fileset.tf.erb
@@ -0,0 +1,14 @@
resource "google_data_catalog_entry" "<%= ctx[:primary_resource_id] %>" {
  entry_group = google_data_catalog_entry_group.entry_group.id
  entry_id    = "<%= ctx[:vars]['entry_id'] %>"

  type = "FILESET"

  gcs_fileset_spec {
    file_patterns = ["gs://fake_bucket/dir/*"]
  }
}

resource "google_data_catalog_entry_group" "entry_group" {
  entry_group_id = "<%= ctx[:vars]['entry_group_id'] %>"
}
54 changes: 54 additions & 0 deletions templates/terraform/examples/data_catalog_entry_full.tf.erb
@@ -0,0 +1,54 @@
resource "google_data_catalog_entry" "<%= ctx[:primary_resource_id] %>" {
  entry_group = google_data_catalog_entry_group.entry_group.id
  entry_id    = "<%= ctx[:vars]['entry_id'] %>"

  user_specified_type   = "my_user_specified_type"
  user_specified_system = "Something_custom"
  linked_resource       = "my/linked/resource"

  display_name = "my custom type entry"
  description  = "a custom type entry for a user specified system"

  schema = <<EOF
{
  "columns": [
    {
      "column": "first_name",
      "description": "First name",
      "mode": "REQUIRED",
      "type": "STRING"
    },
    {
      "column": "last_name",
      "description": "Last name",
      "mode": "REQUIRED",
      "type": "STRING"
    },
    {
      "column": "address",
      "description": "Address",
      "mode": "REPEATED",
      "subcolumns": [
        {
          "column": "city",
          "description": "City",
          "mode": "NULLABLE",
          "type": "STRING"
        },
        {
          "column": "state",
          "description": "State",
          "mode": "NULLABLE",
          "type": "STRING"
        }
      ],
      "type": "RECORD"
    }
  ]
}
EOF
}

resource "google_data_catalog_entry_group" "entry_group" {
  entry_group_id = "<%= ctx[:vars]['entry_group_id'] %>"
}