
WIP: Add "internal IP only" support for Dataproc clusters #837

Merged · 10 commits · Dec 22, 2017

Conversation

@nickjacques (Contributor) commented Dec 10, 2017

This PR creates an internal_ip_only boolean argument, nested in the gce_cluster_config schema. By default the value is false, consistent with the current behavior of the API and CLI: Dataproc clusters are created with internal IPs plus ephemeral external IPs. When the value is true, it passes through to the API and creates a Dataproc cluster with internal IP addresses only (no ephemeral external IP addresses).

This requires use of subnetworks, as well as Private Google Access enabled in the destination subnetwork. This should be added to the provider docs.

Note: the internalIpOnly API argument is available in the Dataproc v1 API (as well as v1beta2)
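For reference, a minimal sketch of how the new argument would appear in a cluster configuration (resource and network names here are illustrative, and the subnetwork must have Private Google Access enabled):

```hcl
resource "google_dataproc_cluster" "internal" {
  name   = "dproc-internal-only"
  region = "us-central1"

  cluster_config {
    gce_cluster_config {
      # internal_ip_only requires a subnetwork; it defaults to false,
      # matching existing API/CLI behavior.
      subnetwork       = "${google_compute_subnetwork.dataproc_subnetwork.name}"
      internal_ip_only = true
    }
  }
}
```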

TODO:

  • create feature by adding argument to gce_cluster_config schema (737fd7a)
  • add test cases
    • default test case: internal_ip_only is false (a679b18)
    • test case: internal_ip_only is true (dd4521c)
  • validate test cases
  • add argument to documentation (fecee25)
  • augment resource schema argument to conflict when not using subnets?

@nickjacques (Contributor, Author)

@danawillow Hi Dana! Could you give dd4521c a glance? I am not familiar with running the full test suite and don't feel comfortable running it without guidance (I don't want to impact my production projects due to an oversight or missed env var). I believe a679b18 is pretty straightforward and should get the job done, but I'm less confident about dd4521c. Would love your thoughts on this PR as a whole so far. Thanks!!

@paddycarver (Contributor)

Fix for #864.

@danawillow danawillow self-requested a review December 19, 2017 00:31
@danawillow danawillow self-assigned this Dec 19, 2017
@danawillow (Contributor) left a comment

Thanks @nickjacques! This looks great, just a few small comments on the tests.

@@ -107,6 +107,9 @@ func TestAccDataprocCluster_basic(t *testing.T) {
// Default behaviour is for Dataproc to autogen or autodiscover a config bucket
resource.TestCheckResourceAttrSet("google_dataproc_cluster.basic", "cluster_config.0.bucket"),

// Default behavior is for Dataproc to not use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.gce_cluster_config.internal_ip_only", "false"),
@danawillow (Contributor):

This should be cluster_config.0.gce_cluster_config.0.internal_ip_only because of the way nested resources are stored in the Terraform state
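For context, nested blocks in Terraform state are stored as lists, so each level of nesting contributes a numeric index. A tiny illustrative sketch (not provider code) of how such state keys are built:

```go
package main

import (
	"fmt"
	"strings"
)

// flatmapKey joins nested block names with a list index at each level,
// mimicking how Terraform's legacy flatmap state addresses nested blocks.
func flatmapKey(attr string, blocks ...string) string {
	parts := []string{}
	for _, b := range blocks {
		parts = append(parts, b, "0") // first (and only) element of each block list
	}
	parts = append(parts, attr)
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(flatmapKey("internal_ip_only", "cluster_config", "gce_cluster_config"))
	// cluster_config.0.gce_cluster_config.0.internal_ip_only
}
```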

@nickjacques (Author):

Fixed in 365df40!

testAccCheckDataprocClusterExists("google_dataproc_cluster.basic", &cluster),

// Testing behavior for Dataproc to use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.gce_cluster_config.internal_ip_only", "true"),
@danawillow (Contributor):

likewise

@nickjacques (Author):

Fixed in 365df40!

name = "dproc-cluster-test-allow-internal"
description = "Firewall rules for dataproc Terraform acceptance testing"
network = "${google_compute_network.dataproc_network.name}"

@danawillow (Contributor):

Does it make sense to also include source_ranges here to more closely match the default allow-internal firewall rule?
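For reference, the default allow-internal rule restricts traffic by source range. A sketch of what the test firewall could look like with source_ranges added (the CIDR value is illustrative and should match the test subnetwork's range):

```hcl
resource "google_compute_firewall" "dataproc_network_firewall" {
  name          = "dproc-cluster-test-allow-internal"
  description   = "Firewall rules for dataproc Terraform acceptance testing"
  network       = "${google_compute_network.dataproc_network.name}"
  source_ranges = ["10.128.0.0/20"]

  allow {
    protocol = "icmp"
  }

  allow {
    protocol = "tcp"
    ports    = ["0-65535"]
  }

  allow {
    protocol = "udp"
    ports    = ["0-65535"]
  }
}
```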

@nickjacques (Author):

Fixed in 3e67813!

func testAccDataprocCluster_basicWithInternalIpOnlyTrue(rnd string) string {
return fmt.Sprintf(`
resource "google_compute_network" "dataproc_network" {
name = "dataproc-internalip-network"
@danawillow (Contributor):

Can you also add randomness to this and the subnetwork? This helps out with our automated testing so that if the test fails before the resource is deleted, it doesn't collide with the existing one when trying to create it the next time. Feel free to reuse the same rnd var
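The idea is to thread one random suffix through every resource name the test creates. A minimal sketch (the helper and the exact name prefixes are illustrative):

```go
package main

import "fmt"

// renderNames shows how a single random suffix can be interpolated into
// every resource name in an acceptance-test config, so a failed run whose
// resources were not cleaned up does not collide with the next run.
func renderNames(rnd string) (network, subnetwork, cluster string) {
	network = fmt.Sprintf("dataproc-internalip-network-%s", rnd)
	subnetwork = fmt.Sprintf("dataproc-internalip-subnetwork-%s", rnd)
	cluster = fmt.Sprintf("dproc-cluster-test-%s", rnd)
	return
}

func main() {
	n, s, c := renderNames("abc123")
	fmt.Println(n, s, c)
}
```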

@nickjacques (Author):

Fixed in 853a9ee!

resource "google_dataproc_cluster" "basic" {
name = "dproc-cluster-test-%s"
region = "us-central1"
depends_on = ["google_compute_firewall.dataproc_network_firewall","google_compute_subnetwork.dataproc_subnetwork"]
@danawillow (Contributor):

You're already depending on the subnetwork below, so no need to include it in the depends_on

@nickjacques (Author):

Fixed in 7d45f5f!

@@ -107,6 +107,9 @@ func TestAccDataprocCluster_basic(t *testing.T) {
// Default behaviour is for Dataproc to autogen or autodiscover a config bucket
resource.TestCheckResourceAttrSet("google_dataproc_cluster.basic", "cluster_config.0.bucket"),

// Default behavior is for Dataproc to not use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.0.gce_cluster_config.internal_ip_only", "false"),
@danawillow (Contributor):

You'll still need another .0 after gce_cluster_config

testAccCheckDataprocClusterExists("google_dataproc_cluster.basic", &cluster),

// Testing behavior for Dataproc to use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.0.gce_cluster_config.internal_ip_only", "true"),
@danawillow (Contributor):

same

resource "google_dataproc_cluster" "basic" {
name = "dproc-cluster-test-%s"
region = "us-central1"

@danawillow (Contributor):

Whoops, sorry: you'll still want the explicit depends_on for the firewall rule, just not for the subnetwork.
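In other words, the subnetwork dependency is already implicit through interpolation, while the firewall rule is never referenced and so needs an explicit depends_on. A sketch of the resulting shape (other cluster attributes elided):

```hcl
resource "google_dataproc_cluster" "basic" {
  name       = "dproc-cluster-test-%s"
  region     = "us-central1"
  depends_on = ["google_compute_firewall.dataproc_network_firewall"]

  cluster_config {
    gce_cluster_config {
      # Interpolating the subnetwork name creates an implicit dependency,
      # so the subnetwork does not also need to appear in depends_on.
      subnetwork       = "${google_compute_subnetwork.dataproc_subnetwork.name}"
      internal_ip_only = true
    }
  }
}
```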

@nickjacques (Author)

Right you are @danawillow, sorry about that! I think the last two commits should tidy things up.

@danawillow (Contributor)

Looks good!

--- PASS: TestAccDataprocCluster_basic (101.89s)
--- PASS: TestAccDataprocCluster_basicWithInternalIpOnlyTrue (206.42s)

Was there anything else you were hoping to add to this? I'm ready to merge if you are.

@nickjacques (Author)

Awesome, thanks Dana! The only other thing I'm pondering is whether there should be some sort of conflict when the subnetwork property is not set (since internal_ip_only requires deploying to a subnetwork rather than a legacy network), or whether we should just document that subnetworks are required for this option (which is implied in the added documentation, but maybe should be stated more directly).

@danawillow (Contributor)

There's no way to say on plan-time "if X field is set, Y field must also be set" so either way, a user running Terraform would get the error when they ran terraform apply. In that case, we could catch the error early to save an API call, or we could let the API return an error to save some code. I'd probably lean towards catching the error early, especially if the error the API returns isn't descriptive enough (I haven't checked whether it is or not), but I don't think it's super important to do so.
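An early, client-side check along these lines could save the API round trip. A hypothetical sketch (the validateInternalIpOnly helper and its wiring are illustrative, not the provider's actual code):

```go
package main

import (
	"errors"
	"fmt"
)

// validateInternalIpOnly mirrors the API-side constraint: internal_ip_only
// may only be true when the cluster is placed in a subnetwork. Running a
// check like this before the create call surfaces the error without an
// API round trip.
func validateInternalIpOnly(internalIPOnly bool, subnetwork string) error {
	if internalIPOnly && subnetwork == "" {
		return errors.New("internal_ip_only requires a subnetwork; " +
			"set subnetwork or set internal_ip_only to false")
	}
	return nil
}

func main() {
	if err := validateInternalIpOnly(true, ""); err != nil {
		fmt.Println("validation error:", err)
	}
}
```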

@nickjacques (Author)

@danawillow looks like the API does a pretty good job of handling this itself - here's me trying to use internal_ip_only without specifying a subnetwork:

Error: Error applying plan:

1 error(s) occurred:

* module.dataproc-cluster.google_dataproc_cluster.dataproc: 1 error(s) occurred:

* google_dataproc_cluster.dataproc: googleapi: Error 400: Network 'default' doesn't contain subnetworks, but 'internal_ip_only' is true. Specify a network which contains subnetworks to enable 'internal_ip_only', or set 'internal_ip_only' to 'false'., badRequest

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

I think this is ready for merge if the above error is acceptable!

@danawillow (Contributor)

Works for me, thanks!

@danawillow danawillow merged commit 46cc5b7 into hashicorp:master Dec 22, 2017
@nickjacques (Author)

Thanks Dana!! Appreciate all your help. Happy holidays!

chrisst pushed a commit to chrisst/terraform-provider-google that referenced this pull request Nov 9, 2018

* Add internalIpOnly support for Dataproc clusters

* Add internal_ip_only to dataproc cluster docs

* Add default/basic dataproc internal ip test case

* Add test for dataproc internal_ip_only=true

* fixup cluster_config.gce_cluster_config to include .0.

* Remove redundant depends_on

* Add %s rnd to network and subnetwork

* Use variable for subnet CIDR and reference via source_ranges

* Add depends_on back to dataproc cluster test

* Fix cluster attribute refs (.0. again)
@ghost commented Mar 30, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 30, 2020
flozzone pushed a commit to flozzone/terraform-provider-google that referenced this pull request Oct 17, 2024
…p#840)

* Added support for multi-project GKE Hub registration

Added an optional variable HUB_PROJECT_ID which allows you to specify a separate
GCP project for the GKE Hub than the project the cluster is deployed to.

This included updating the 3 examples that leveraged the hub module.

Issue: hashicorp#837

* Adding service identity resource for multi project deployments

Ensuring that the Hub default Service Account exists
when adding a cluster from outside the hub project

Issue: hashicorp#837

* Fixing bharathkkb comments

Fixes hashicorp#837

* Fix linting issue

* Removed Google project data source

Issue: hashicorp#837

* Adding upgrade documentation for this change.

* Lint updates on readme.

Co-authored-by: James Duncan <[email protected]>
Co-authored-by: Bharath KKB <[email protected]>