
Not able to create a GKE-based Dataproc cluster #2127

Closed
ajitAcc opened this issue Mar 4, 2024 · 9 comments
ajitAcc commented Mar 4, 2024

Describe the bug
Unable to set values for the following block while creating a GKE-based Dataproc cluster:

kubernetes_software_config = {
  component_version = {
    "SPARK" : "3.1-dataproc-7"
  }
  properties = {
    "spark:spark.kubernetes.container.image" : "europe-west3-docker.pkg.dev/cloud-dataproc/dpgke/sparkengine:dataproc-14"
  }
}

Environment

output from `terraform -version`

v29 version of dataproc module

output from `git rev-parse --short HEAD`

To Reproduce
Use kubernetes_cluster_config to create a Dataproc cluster of the GKE type.

Expected behavior
It should create a GKE-based Dataproc cluster.

Result

The given value is not suitable for module.cluster_config_gce.var.dataproc_config declared at ../modules/dataproc/variables.tf:1,1-27: attribute
│ "virtual_cluster_config": attribute "kubernetes_cluster_config": attribute "kubernetes_software_config": attribute "component_version": list of
│ map of string required.


@juliocc
Collaborator

juliocc commented Mar 4, 2024

There was a bug in v29. Please try with HEAD

@ajitAcc
Author

ajitAcc commented Mar 4, 2024

OK, thanks for the response. I don't follow what "HEAD" means here — do we need to use another, older version such as v28 or v27?

@juliocc
Copy link
Collaborator

juliocc commented Mar 4, 2024

Yes, just use the latest version from the master branch

@ajitAcc
Author

ajitAcc commented Mar 4, 2024

Yes, I am using the latest version, v29, from the main branch:
source = "git::https://github.com/GoogleCloudPlatform/cloud-foundation-fabric.git//modules/dataproc?ref=v29.0.0"
In variables.tf I am trying to create a Dataproc cluster using GKE:
virtual_cluster_config = {
  staging_bucket = local.gke_staging_bucket
  kubernetes_cluster_config = {
    kubernetes_namespace = "foobar"
    kubernetes_software_config = {
      component_version = [{ "SPARK" = "3.1-dataproc-7" }]
      properties        = [{ "spark:spark.kubernetes.container.image" = "europe-west3-docker.pkg.dev/cloud-dataproc/dpgke/sparkengine:dataproc-14" }]
    }
  }
}

but it throws this error:
Error: Incorrect attribute value type

│ on .terraform/modules/cluster_config_gce.dataproc/modules/dataproc/main.tf line 246, in resource "google_dataproc_cluster" "cluster":
│ 246: component_version = var.dataproc_config.virtual_cluster_config.kubernetes_cluster_config.kubernetes_software_config.component_version
│ ├────────────────
│ │ var.dataproc_config.virtual_cluster_config.kubernetes_cluster_config.kubernetes_software_config.component_version is list of map of string with 1 element

│ Inappropriate value for attribute "component_version": map of string required.

Can you please tell us which version works for creating a GKE-based Dataproc cluster? Is there an example anywhere? The README at https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/v29.0.0/modules/dataproc/README.md has no example for creating a GKE-based Dataproc cluster.

@juliocc
Collaborator

juliocc commented Mar 4, 2024

Try removing the ?ref=v29.0.0 from the module's source
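For example, something like this (a sketch only; the module name is illustrative, and without a ref Terraform fetches the repository's default branch, i.e. HEAD, which contains the fix):

```hcl
# Unpinned source: Terraform tracks the default branch of the repository.
# Module name "dp-cluster" is a placeholder, not from the thread's config.
module "dp-cluster" {
  source = "git::https://github.com/GoogleCloudPlatform/cloud-foundation-fabric.git//modules/dataproc"
  # ... remaining arguments unchanged ...
}
```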

@ajitAcc
Author

ajitAcc commented Mar 4, 2024

After removing the ?ref=v29.0.0 from the module's source it works. However, I am building my own modules and pinning to a ref such as v29.0.0 is a must. Please let us know which earlier version (v28, v27, v26, etc.) works for creating a GKE-based Dataproc cluster.

@juliocc
Collaborator

juliocc commented Mar 4, 2024

use one of the daily tags, perhaps daily-2024.03.04
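That would look something like this (a sketch; the tag name comes from the suggestion above, and the module name is illustrative):

```hcl
# Pin the module to a daily tag instead of the v29.0.0 release tag.
module "dp-cluster" {
  source = "git::https://github.com/GoogleCloudPlatform/cloud-foundation-fabric.git//modules/dataproc?ref=daily-2024.03.04"
  # ... remaining arguments unchanged ...
}
```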

@juliocc juliocc closed this as completed Mar 4, 2024
@ajitAcc
Author

ajitAcc commented Mar 5, 2024

Hi juliocc,

As per your recommendation, I am using daily-2024.03.04 to create a Dataproc cluster on GKE, following the example given:
module "cluster_config_gke" {
  source          = "../modules/dataprocgke"
  name            = "dev-gke-cluster-test"
  project_id      = local.project_id
  region          = local.region
  service_account = local.service_account
  labels          = local.labels
  dataproc_config = {
    virtual_cluster_config = {
      staging_bucket = local.gke_staging_bucket
      kubernetes_cluster_config = {
        kubernetes_namespace = "foobar"
        kubernetes_software_config = {
          component_version = {
            "SPARK" = "3.1-dataproc-7"
          }
          properties = {
            "spark:spark.kubernetes.container.image" = "europe-west3-docker.pkg.dev/cloud-dataproc/dpgke/sparkengine:dataproc-14"
          }
        }
        gke_cluster_config = {
          gke_cluster_target = "projects/sampleproject-1/locations/europe-west3/clusters/simple-std-cluster-1"
          node_pool_target = {
            node_pool = "test-node-pool-1"
            roles     = ["DEFAULT"]
          }
        }
      }
    }
  }
}
but it throws an error about an empty pool name:
Error: Error waiting for creating Dataproc cluster: Error code 3, message: GKE Node Pool name '', must conform to pattern 'projects/([^/]+)/(?:locations|zones)/([^/]+)/clusters/([^/]+)/nodePools/([^/]+)'

│ with module.cluster_config_gke.module.dataprocgke.google_dataproc_cluster.cluster,
│ on .terraform/modules/cluster_config_gke.dataprocgke/modules/dataproc/main.tf line 23, in resource "google_dataproc_cluster" "cluster":
│ 23: resource "google_dataproc_cluster" "cluster" {

and if I change the value as recommended by the error message:
node_pool = "projects/sampleproject-1/locations/europe-west3/clusters/simple-std-cluster-1/nodePools/test-node-pool-1"

now it throws an error with the cluster path duplicated:

Error: Error creating Dataproc cluster: googleapi: Error 400: GKE Node Pool name 'projects/sampleproject-1/locations/europe-west3/clusters/simple-std-cluster-1/nodePools/projects/sampleproject-1/locations/europe-west3/clusters/simple-std-cluster-1/nodePools/test-node-pool-1', must conform to pattern 'projects/([^/]+)/(?:locations|zones)/([^/]+)/clusters/([^/]+)/nodePools/([^/]+)', badRequest

│ with module.cluster_config_gke.module.dataprocgke.google_dataproc_cluster.cluster,
│ on .terraform/modules/cluster_config_gke.dataprocgke/modules/dataproc/main.tf line 23, in resource "google_dataproc_cluster" "cluster":
│ 23: resource "google_dataproc_cluster" "cluster" {

Please let us know what value needs to go in node_pool.
Also, can you tell us which is the latest stable version for creating a Dataproc cluster on GKE, or when a stable version supporting this will be available?

@wiktorn
Collaborator

wiktorn commented Mar 8, 2024

There are a few issues that may play a role here:

  • you need to have a node pool within the cluster where kube-system workloads can run
  • there is an issue in the provider: it doesn't remove the node pool from the cluster, and it doesn't recognize an existing node pool as not requiring any change, so you need to manually remove the old pool before retrying

I also used a custom IAM configuration, as otherwise everything binds to the Compute Engine default service account.

I managed to successfully create Dataproc on GKE using the following config:

locals {
  dataproc_namespace = "foobar"
}

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster"
  location   = "${var.region}-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
    master_authorized_ranges = {
      internal-vms = "10.0.0.0/8"
    }
    master_ipv4_cidr_block = "192.168.0.0/28"
  }
  private_cluster_config = {
    enable_private_endpoint = true
    master_global_access    = false
  }
  enable_features = {
    dataplane_v2        = true
    fqdn_network_policy = true
    workload_identity   = true
  }
  labels = {
    environment = "dev"
  }
}

module "cluster-1-nodepool-1" {
  source       = "./fabric/modules/gke-nodepool"
  project_id   = var.project_id
  cluster_name = module.cluster-1.name
  location     = "${var.region}-b"
  name         = "nodepool-1"
  nodepool_config = {
    autoscaling = {
      max_node_count = 2
      min_node_count = 1
    }
  }
}

module "service-account" {
  source     = "./fabric/modules/iam-service-account"
  project_id = var.project_id
  name       = "dataproc-worker"
  iam = {
    "roles/iam.workloadIdentityUser" = [
      "serviceAccount:${var.project_id}.svc.id.goog[${local.dataproc_namespace}/agent]",
      "serviceAccount:${var.project_id}.svc.id.goog[${local.dataproc_namespace}/spark-driver]",
      "serviceAccount:${var.project_id}.svc.id.goog[${local.dataproc_namespace}/spark-executor]"
    ]
  }
  iam_project_roles = {
    (var.project_id) = ["roles/dataproc.worker"]
  }
}

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = var.project_id
  name       = "my-cluster"
  region     = var.region
  dataproc_config = {
    virtual_cluster_config = {
      kubernetes_cluster_config = {
        kubernetes_namespace = local.dataproc_namespace
        kubernetes_software_config = {
          component_version = {
            "SPARK" : "3.1-dataproc-14"
          }
          properties = {
            "dataproc:dataproc.gke.agent.google-service-account"          = module.service-account.email
            "dataproc:dataproc.gke.spark.driver.google-service-account"   = module.service-account.email
            "dataproc:dataproc.gke.spark.executor.google-service-account" = module.service-account.email
          }
        }
        gke_cluster_config = {
          gke_cluster_target = module.cluster-1.id
          node_pool_target = {
            node_pool = "dataproc-nodepool"
            roles     = ["DEFAULT"]
          }
        }
      }
    }
  }
}

I'll be updating README for Dataproc module with this example.
