Not able to create GKE-based Dataproc cluster #2127
Comments
There was a bug in v29. Please try with HEAD.
Ok, thanks for the response. I don't understand what "HEAD" means here — do we need to use an older version such as 28 or 27?
Yes, just use the latest version from the master branch.
Yes, I am using the latest version (v29) from the master branch, but it throws an error. Can you please tell us which version works for creating a GKE-based Dataproc cluster, and whether there is an example? None is given in https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/v29.0.0/modules/dataproc/README.md
Try removing the `?ref=v29.0.0` from the module source.
After removing the `?ref=v29.0.0` from the module's source it works, but I am building my own modules and pinning to `ref=v29.0.0` is a must. Please let us know which lower version (28, 27, 26, etc.) works for creating a GKE-based Dataproc cluster.
Use one of the daily tags, perhaps daily-2024.03.04.
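For reference, pinning a module to a daily tag looks like the sketch below. The double-slash source path follows the usual GitHub module-source convention; the exact arguments shown are placeholders, not a complete configuration:

```hcl
module "processing-dp-cluster" {
  # Pin to a daily tag instead of the broken v29.0.0 release.
  source     = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/dataproc?ref=daily-2024.03.04"
  project_id = var.project_id
  name       = "my-cluster"
  region     = var.region
}
```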
Hi juliocc, per your recommendation I am using daily-2024.03.04 to create the Dataproc cluster on GKE, following the given example. After replacing the code as suggested, it now throws an error in which the node pool path appears twice:

Error: Error creating Dataproc cluster: googleapi: Error 400: GKE Node Pool name 'projects/sampleproject-1/locations/europe-west3/clusters/simple-std-cluster-1/nodePools/projects/sampleproject-1/locations/europe-west3/clusters/simple-std-cluster-1/nodePools/test-node-pool-1', must conform to pattern 'projects/([^/]+)/(?:locations|zones)/([^/]+)/clusters/([^/]+)/nodePools/([^/]+)', badRequest

Please let us know what values need to go in node_pool.
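The doubled path in the error suggests a full node pool resource name was passed where only the short pool name is expected: the API composes the `projects/…/locations/…/clusters/…/nodePools/<name>` path itself, so passing the full path gets it prefixed a second time. A hedged sketch of the relevant block (pool name is illustrative):

```hcl
gke_cluster_config = {
  # The cluster target takes the full resource id...
  gke_cluster_target = module.cluster-1.id
  node_pool_target = {
    # ...but node_pool takes only the short name; the API builds the
    # full projects/.../nodePools/<name> path, so a full path here
    # produces the duplicated name seen in the error above.
    node_pool = "dataproc-nodepool"
    roles     = ["DEFAULT"]
  }
}
```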
There are a few issues that may play a role here.
I also used a custom IAM configuration, as otherwise everything binds to the Compute Engine default service account. I managed to successfully create a Dataproc cluster on GKE using the following config:

locals {
dataproc_namespace = "foobar"
}
module "cluster-1" {
source = "./fabric/modules/gke-cluster-standard"
project_id = var.project_id
name = "cluster"
location = "${var.region}-b"
vpc_config = {
network = var.vpc.self_link
subnetwork = var.subnet.self_link
secondary_range_names = {} # use default names "pods" and "services"
master_authorized_ranges = {
internal-vms = "10.0.0.0/8"
}
master_ipv4_cidr_block = "192.168.0.0/28"
}
private_cluster_config = {
enable_private_endpoint = true
master_global_access = false
}
enable_features = {
dataplane_v2 = true
fqdn_network_policy = true
workload_identity = true
}
labels = {
environment = "dev"
}
}
module "cluster-1-nodepool-1" {
source = "./fabric/modules/gke-nodepool"
project_id = var.project_id
cluster_name = module.cluster-1.name
location = "${var.region}-b"
name = "nodepool-1"
nodepool_config = {
autoscaling = {
max_node_count = 2
min_node_count = 1
}
}
}
module "service-account" {
source = "./fabric/modules/iam-service-account"
project_id = var.project_id
name = "dataproc-worker"
iam = {
"roles/iam.workloadIdentityUser" = [
"serviceAccount:${var.project_id}.svc.id.goog[${local.dataproc_namespace}/agent]",
"serviceAccount:${var.project_id}.svc.id.goog[${local.dataproc_namespace}/spark-driver]",
"serviceAccount:${var.project_id}.svc.id.goog[${local.dataproc_namespace}/spark-executor]"
]
}
iam_project_roles = {
(var.project_id) = ["roles/dataproc.worker"]
}
}
module "processing-dp-cluster" {
source = "./fabric/modules/dataproc"
project_id = var.project_id
name = "my-cluster"
region = var.region
dataproc_config = {
virtual_cluster_config = {
kubernetes_cluster_config = {
kubernetes_namespace = local.dataproc_namespace
kubernetes_software_config = {
component_version = {
"SPARK" : "3.1-dataproc-14"
}
properties = {
"dataproc:dataproc.gke.agent.google-service-account" = module.service-account.email
"dataproc:dataproc.gke.spark.driver.google-service-account" = module.service-account.email
"dataproc:dataproc.gke.spark.executor.google-service-account" = module.service-account.email
}
}
gke_cluster_config = {
gke_cluster_target = module.cluster-1.id
node_pool_target = {
node_pool = "dataproc-nodepool"
roles = ["DEFAULT"]
}
}
}
}
}
}

I'll be updating the README for the Dataproc module with this example.
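The config above references a handful of input variables. A minimal sketch of the variable definitions it assumes (the object shapes are inferred from the `self_link` attributes used above and are an assumption, not taken from the module):

```hcl
variable "project_id" {
  type = string
}

variable "region" {
  type = string
}

variable "vpc" {
  type = object({ self_link = string })
}

variable "subnet" {
  type = object({ self_link = string })
}
```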
Describe the bug
Not able to define values for the dataproc module's kubernetes_cluster_config while creating a GKE-based Dataproc cluster.
Environment
v29.0.0 of the dataproc module
To Reproduce
Use kubernetes_cluster_config to create a Dataproc cluster of the GKE type.
Expected behavior
It should create a GKE-based Dataproc cluster.
Result