WIP: Add "internal IP only" support for Dataproc clusters #837
Conversation
@danawillow Hi Dana! Could you give dd4521c a glance? I am not familiar with running the full test suite and don't feel comfortable running it without guidance (I don't want to impact my production projects due to an oversight or missed env var). I believe a679b18 is pretty straightforward and should get the job done, but I'm less confident about dd4521c. Would love your thoughts on this PR as a whole so far. Thanks!!
Fix for #864.
Thanks @nickjacques! This looks great, just a few small comments on the tests.
@@ -107,6 +107,9 @@ func TestAccDataprocCluster_basic(t *testing.T) {
// Default behaviour is for Dataproc to autogen or autodiscover a config bucket
resource.TestCheckResourceAttrSet("google_dataproc_cluster.basic", "cluster_config.0.bucket"),

// Default behavior is for Dataproc to not use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.gce_cluster_config.internal_ip_only", "false"),
This should be `cluster_config.0.gce_cluster_config.0.internal_ip_only`, because of the way nested resources are stored in the Terraform state.
Fixed in 365df40!
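For anyone following along, Terraform flattens nested configuration blocks into list-indexed keys in the state, which is why every nesting level needs a `.0.` in the test assertions. A minimal illustrative sketch (required arguments omitted for brevity):

```hcl
# A nested block such as this one...
resource "google_dataproc_cluster" "basic" {
  cluster_config {
    gce_cluster_config {
      internal_ip_only = false
    }
  }
}

# ...is stored in the Terraform state with a zero-based list index
# at each nesting level, so state checks must address the attribute as:
#   cluster_config.0.gce_cluster_config.0.internal_ip_only
```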
testAccCheckDataprocClusterExists("google_dataproc_cluster.basic", &cluster),

// Testing behavior for Dataproc to use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.gce_cluster_config.internal_ip_only", "true"),
likewise
Fixed in 365df40!
name = "dproc-cluster-test-allow-internal"
description = "Firewall rules for dataproc Terraform acceptance testing"
network = "${google_compute_network.dataproc_network.name}"
Does it make sense to also include `source_ranges` here, to more closely match the default allow-internal firewall rule?
Fixed in 3e67813!
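For reference, a hedged sketch of what the firewall rule might look like with `source_ranges` limited to the test subnetwork's range (the variable name and CIDR are illustrative, not necessarily what the final commit uses; the allow rules mirror GCP's default allow-internal rule):

```hcl
variable "subnetwork_cidr" {
  # Illustrative range; should match the test subnetwork's ip_cidr_range.
  default = "10.0.0.0/16"
}

resource "google_compute_firewall" "dataproc_network_firewall" {
  name          = "dproc-cluster-test-allow-internal"
  description   = "Firewall rules for dataproc Terraform acceptance testing"
  network       = "${google_compute_network.dataproc_network.name}"

  # Restrict traffic to the subnetwork's own range, as the default
  # allow-internal rule does for the default network.
  source_ranges = ["${var.subnetwork_cidr}"]

  allow {
    protocol = "tcp"
    ports    = ["0-65535"]
  }

  allow {
    protocol = "udp"
    ports    = ["0-65535"]
  }

  allow {
    protocol = "icmp"
  }
}
```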
func testAccDataprocCluster_basicWithInternalIpOnlyTrue(rnd string) string {
return fmt.Sprintf(`
resource "google_compute_network" "dataproc_network" {
name = "dataproc-internalip-network"
Can you also add randomness to this and the subnetwork? This helps out with our automated testing so that if the test fails before the resource is deleted, it doesn't collide with the existing one when trying to create it the next time. Feel free to reuse the same rnd var
Fixed in 853a9ee!
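A sketch of the suggested change, reusing the same `rnd` value for both the network and subnetwork names (the exact naming and CIDR are illustrative):

```hcl
resource "google_compute_network" "dataproc_network" {
  # %s is filled from the test's rnd var via fmt.Sprintf, so repeated
  # test runs don't collide with leftover resources.
  name                    = "dataproc-internalip-network-%s"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "dataproc_subnetwork" {
  name          = "dataproc-internalip-subnetwork-%s"
  ip_cidr_range = "10.0.0.0/16" # illustrative
  region        = "us-central1"
  network       = "${google_compute_network.dataproc_network.self_link}"

  # Required for internal-IP-only Dataproc clusters.
  private_ip_google_access = true
}
```

In the Go test helper, `rnd` would be passed once per `%s` placeholder in the `fmt.Sprintf` call.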
resource "google_dataproc_cluster" "basic" {
name = "dproc-cluster-test-%s"
region = "us-central1"
depends_on = ["google_compute_firewall.dataproc_network_firewall","google_compute_subnetwork.dataproc_subnetwork"]
You're already depending on the subnetwork below, so no need to include it in the depends_on
Fixed in 7d45f5f!
@@ -107,6 +107,9 @@ func TestAccDataprocCluster_basic(t *testing.T) {
// Default behaviour is for Dataproc to autogen or autodiscover a config bucket
resource.TestCheckResourceAttrSet("google_dataproc_cluster.basic", "cluster_config.0.bucket"),

// Default behavior is for Dataproc to not use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.0.gce_cluster_config.internal_ip_only", "false"),
You'll still need another .0 after gce_cluster_config
testAccCheckDataprocClusterExists("google_dataproc_cluster.basic", &cluster),

// Testing behavior for Dataproc to use only internal IP addresses
resource.TestCheckResourceAttr("google_dataproc_cluster.basic", "cluster_config.0.gce_cluster_config.internal_ip_only", "true"),
same
resource "google_dataproc_cluster" "basic" {
name = "dproc-cluster-test-%s"
region = "us-central1"
whoops sorry you'll still want to have the explicit depends on for the firewall rule, just not for the subnetwork
Right you are @danawillow, sorry about that! I think the last two commits should tidy things up.
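To summarize the dependency discussion: the subnetwork is interpolated directly in the cluster config, which gives Terraform an implicit dependency, while the firewall rule is not referenced anywhere in the resource and therefore still needs an explicit `depends_on`. A hedged sketch:

```hcl
resource "google_dataproc_cluster" "basic" {
  name   = "dproc-cluster-test-%s"
  region = "us-central1"

  # Not referenced anywhere else in this resource, so the ordering
  # must be declared explicitly.
  depends_on = ["google_compute_firewall.dataproc_network_firewall"]

  cluster_config {
    gce_cluster_config {
      # This interpolation already creates an implicit dependency on
      # the subnetwork, so no depends_on entry is needed for it.
      subnetwork       = "${google_compute_subnetwork.dataproc_subnetwork.name}"
      internal_ip_only = true
    }
  }
}
```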
Looks good!
Was there anything else you were hoping to add to this? I'm ready to merge if you are.
Awesome, thanks Dana! The only other thing I'm pondering is whether there should be some sort of conflict if the "subnet" property is not set (since using the internal IP setting requires deploying to a subnetwork rather than a legacy network), or whether we should just document that using subnetworks is required for this option (which is implied in the added documentation, but maybe should be stated more directly?)
There's no way to say at plan time "if X field is set, Y field must also be set," so either way a user running Terraform would get the error when they ran `terraform apply`.
@danawillow looks like the API does a pretty good job of handling this itself - here's me trying to use
I think this is ready for merge if the above error is acceptable!
Works for me, thanks!
Thanks Dana!! Appreciate all your help. Happy holidays!
* Add internalIpOnly support for Dataproc clusters
* Add internal_ip_only to dataproc cluster docs
* Add default/basic dataproc internal ip test case
* Add test for dataproc internal_ip_only=true
* fixup cluster_config.gce_cluster_config to include .0.
* Remove redundant depends_on
* Add %s rnd to network and subnetwork
* Use variable for subnet CIDR and reference via source_ranges
* Add depends_on back to dataproc cluster test
* Fix cluster attribute refs (.0. again)
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
This PR creates an `internal_ip_only` boolean argument, nested in the `gce_cluster_config` schema. By default, this value is set to `false`, which is consistent with the current behavior of the API and CLI: Dataproc clusters are created with internal IPs and ephemeral external IPs. When the value is `true`, this passes through to the API and creates a Dataproc cluster with internal IP addresses only (no ephemeral external IP addresses). This requires use of subnetworks, as well as Private Google Access enabled in the destination subnetwork. This should be added to the provider docs.

Note: the `internalIpOnly` API argument is available in the Dataproc v1 API (as well as v1beta2).

TODO:
- Add `internal_ip_only` to the `gce_cluster_config` schema (737fd7a)
- Test the default behavior, where `internal_ip_only` is `false` (a679b18)
- Test setting `internal_ip_only` to `true` (dd4521c)
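Putting the pieces together, a hedged sketch of what an end-user configuration for this option might look like (resource names and CIDR are illustrative; the subnetwork must have Private Google Access enabled):

```hcl
resource "google_compute_network" "default" {
  name                    = "dataproc-network"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "default" {
  name          = "dataproc-subnetwork"
  ip_cidr_range = "10.0.0.0/16"
  region        = "us-central1"
  network       = "${google_compute_network.default.self_link}"

  # Required so internal-only cluster nodes can still reach Google
  # APIs (e.g. the Dataproc staging bucket in GCS).
  private_ip_google_access = true
}

resource "google_dataproc_cluster" "internal_only" {
  name   = "internal-only-cluster"
  region = "us-central1"

  cluster_config {
    gce_cluster_config {
      subnetwork       = "${google_compute_subnetwork.default.name}"
      internal_ip_only = true
    }
  }
}
```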