Add support for providing additional experiments to Dataflow job #6196

Merged (4 commits), Apr 27, 2020

Contributor

@wazim commented Apr 24, 2020

An example use case is receiving worker VM metrics from the monitoring agent, as described at https://cloud.google.com/dataflow/docs/guides/using-cloud-monitoring#receive_worker_vm_metrics_from_the_agent. This requires configuring an additional experiment: enable_stackdriver_agent_metrics.

An example looks like:

resource "google_dataflow_job" "with_additional_experiments" {
  name = "dataflow-job"

  additional_experiments = ["enable_stackdriver_agent_metrics"]

  template_gcs_path = "gs://testing/template"
  temp_gcs_location = "gs://testing/tmp"
  parameters = {
    inputFile = "gs://testing/input"
    output    = "gs://testing/output"
  }
  on_delete = "cancel"
}

@ghost ghost added the size/m label Apr 24, 2020
@ghost ghost requested a review from danawillow April 24, 2020 12:11
@@ -151,6 +151,15 @@ func resourceDataflowJob() *schema.Resource {
        ValidateFunc: validation.StringInSlice([]string{"WORKER_IP_PUBLIC", "WORKER_IP_PRIVATE", ""}, false),
    },

    "additional_experiments": {
        Type: schema.TypeList,
Contributor

Is the order here meaningful? If not, I'd recommend using a TypeSet instead.

Contributor Author

Good idea! I've updated it to TypeSet. Thanks!
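
For context on the TypeList vs. TypeSet suggestion: a list is compared position by position, while a set ignores ordering, so a set avoids a spurious diff if the same experiments come back in a different order. The following is a minimal standard-library sketch of that difference only. It does not use the Terraform SDK, and the helper names `equalAsLists`, `equalAsSets`, and `uniqueSorted` are hypothetical, not taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// equalAsLists compares two string slices element by element, so order matters.
func equalAsLists(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// uniqueSorted returns a sorted copy of in with duplicates removed.
func uniqueSorted(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, s := range in {
		if _, ok := seen[s]; !ok {
			seen[s] = struct{}{}
			out = append(out, s)
		}
	}
	sort.Strings(out)
	return out
}

// equalAsSets compares two string slices ignoring order and duplicates.
func equalAsSets(a, b []string) bool {
	return equalAsLists(uniqueSorted(a), uniqueSorted(b))
}

func main() {
	// Same experiments, different order (illustrative values).
	configured := []string{"enable_stackdriver_agent_metrics", "shuffle_mode=service"}
	returned := []string{"shuffle_mode=service", "enable_stackdriver_agent_metrics"}

	fmt.Println("list comparison reports equal:", equalAsLists(configured, returned))
	fmt.Println("set comparison reports equal:", equalAsSets(configured, returned))
}
```

Since the order of experiments is not meaningful here, set semantics match the intent of the field.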

randStr := randString(t, 10)
bucket := "tf-test-dataflow-gcs-" + randStr
job := "tf-test-dataflow-job-" + randStr
additionalExperiments := []string{"enable_stackdriver_agent_metrics"}
Contributor

Want to add a few more values to this list just for extra confidence?

Contributor Author

I've added shuffle_mode=service to the list, as per https://cloud.google.com/blog/products/gcp/introducing-cloud-dataflow-shuffle-for-up-to-5x-performance-improvement-in-data-analytic-pipelines. I can add some more if you think there is value in it. Thanks!
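
A test that keeps its experiments in a Go slice needs to splice them into the Terraform config string somehow. One possible way is sketched below; `hclStringList` is a hypothetical helper, not necessarily what this PR's test does. It relies on `strconv.Quote`, which emits Go string syntax; that matches HCL quoting for plain ASCII values like the ones here, but is an assumption for anything more exotic.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hclStringList renders values as a list literal such as ["a", "b"].
// Hypothetical helper for illustration; quoting uses Go string syntax.
func hclStringList(values []string) string {
	quoted := make([]string, len(values))
	for i, v := range values {
		quoted[i] = strconv.Quote(v)
	}
	return "[" + strings.Join(quoted, ", ") + "]"
}

func main() {
	additionalExperiments := []string{"enable_stackdriver_agent_metrics", "shuffle_mode=service"}
	// Emit the attribute line as it would appear in the resource block.
	fmt.Printf("additional_experiments = %s\n", hclStringList(additionalExperiments))
}
```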

@ghost ghost added the documentation label Apr 27, 2020
Contributor

@danawillow left a comment

Looks good, thanks @wazim!

Since the source of truth for our resources is at https://github.com/GoogleCloudPlatform/magic-modules, I'll take care of upstreaming this change there.
