flextemplatecomposition
svetakvsundhar committed Dec 20, 2024
1 parent 687c2c0 commit e8dee87
Showing 5 changed files with 147 additions and 0 deletions.
45 changes: 45 additions & 0 deletions experiments/compositions/samples/DataflowFlexTemplates/README.md
@@ -0,0 +1,45 @@
# Dataflow Flex Template Deployment on Kubernetes

This directory provides a KCC Compositions approach to deploying Dataflow Flex Templates on Kubernetes. It leverages a custom resource definition (CRD) and a facade file to simplify the deployment process.

**Key Components**

* **KCC Composition:** Manages the creation of the Dataflow Flex Template job and associated resources (e.g., a staging Cloud Storage bucket).
* **Custom Resource Definition (CRD):** Defines the `DataflowFlexTemplateConfig` kind, which holds the configuration parameters for the Dataflow job (project name, region, template path, and pipeline file path).
* **Facade File:** A short YAML file containing a `DataflowFlexTemplateConfig` custom resource. Applying it triggers the Composition to deploy the Dataflow job.

**Prerequisites**

* **Kubernetes Cluster:** With Config Connector installed.
* **Service Account:** With necessary IAM permissions to create and manage Dataflow jobs, Cloud Storage buckets, and other related resources.
* **Dataflow Flex Template:** The Flex Template you want to deploy, available in a Google Cloud Storage location.
* **Pipeline Definition:** Your Beam YAML pipeline definition file (e.g., `beam.yaml`), uploaded to a Google Cloud Storage bucket **before proceeding**. Its `gs://` path is the value you set for `yamlPipelineFilePath` in `facade.yaml`.
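
The upload mentioned in the last prerequisite could look roughly like the sketch below. The bucket name `my-pipeline-bucket` is a hypothetical placeholder, not something defined in this sample, and the script only attempts the upload when `gsutil` and the local file are both present.

```bash
# Assumed placeholders: replace with your own file and bucket names.
PIPELINE_FILE="${PIPELINE_FILE:-beam.yaml}"
BUCKET="${BUCKET:-my-pipeline-bucket}"   # hypothetical bucket name
DEST="gs://${BUCKET}/${PIPELINE_FILE}"

if command -v gsutil >/dev/null 2>&1 && [ -f "${PIPELINE_FILE}" ]; then
  # Copy the local pipeline definition to Cloud Storage, then list it to confirm.
  gsutil cp "${PIPELINE_FILE}" "${DEST}"
  gsutil ls "${DEST}"
else
  echo "Skipping upload (gsutil or ${PIPELINE_FILE} not found)."
fi

# This is the value to use for yamlPipelineFilePath in facade.yaml.
echo "yamlPipelineFilePath: ${DEST}"
```

The printed path is what goes into the `yamlPipelineFilePath` field of the facade file.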

**Deployment Steps**

1. **Apply `dataflowflextemplates-crd.yaml`:** Creates the CRD for `DataflowFlexTemplateConfig`.

```bash
kubectl apply -f dataflowflextemplates-crd.yaml -n config-control
```

2. **Apply `dataflowflextemplates-composition.yaml`:** Creates the KCC Composition for managing the Dataflow deployment.

```bash
kubectl apply -f dataflowflextemplates-composition.yaml -n config-control
```

3. **Apply `facade.yaml`:** Applies the facade file, which triggers the Composition to create the Dataflow job and any associated resources.

```bash
kubectl apply -f facade.yaml -n config-control
```
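
After the three apply steps, one way to check what was created is sketched below. It assumes the rendered resources land in the `config-control` namespace (the job template in the Composition does not set a namespace explicitly, so this is an assumption). The short name `dftc` comes from the CRD and the job name from the Composition template.

```bash
NAMESPACE="config-control"
JOB_NAME="dataflowflextemplatejob-sample-batch"   # name used in the Composition template

if command -v kubectl >/dev/null 2>&1; then
  # The facade resource itself ("dftc" is the short name declared in the CRD).
  kubectl get dftc -n "${NAMESPACE}"
  # Resources rendered by the Composition's two expanders.
  kubectl get storagebucket,dataflowflextemplatejob -n "${NAMESPACE}"
  # Status conditions and events for the Dataflow job resource.
  kubectl describe dataflowflextemplatejob "${JOB_NAME}" -n "${NAMESPACE}"
else
  echo "kubectl not found; skipping checks."
fi
```

If the job or bucket does not appear, the output of `kubectl describe` on the facade resource is a reasonable first place to look.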

**Deletion Steps**

To delete the resources, run `kubectl delete` against the YAML files in the reverse order of deployment:

```bash
kubectl delete -f facade.yaml -n config-control
kubectl delete -f dataflowflextemplates-composition.yaml -n config-control
kubectl delete -f dataflowflextemplates-crd.yaml -n config-control
```
@@ -0,0 +1,13 @@
pipeline:
  type: chain
  transforms:
    - type: ReadFromBigQuery
      name: read-from-bigquery
      config:
        table_spec: test-gcp-tse.zq.s3
    - type: Sql
      config:
        query: "SELECT * from PCOLLECTION"
    - type: WriteToJson
      config:
        path: gs://samples-beamyamldemo/output.json
@@ -0,0 +1,47 @@
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


apiVersion: composition.google.com/v1alpha1
kind: Composition
metadata:
  name: complete-dataflowflextemplate-composition
  namespace: config-control
spec:
  inputAPIGroup: dataflowflextemplateconfigs.idp.mycompany.com
  expanders:
  - type: jinja2 # create-gcs-bucket
    name: create-gcs-bucket
    template: |
      apiVersion: storage.cnrm.cloud.google.com/v1beta1
      kind: StorageBucket
      metadata:
        name: {{ dataflowflextemplateconfigs.spec.projectName }}-dataflowflextemplatejob-dep-batc
        namespace: config-control
      spec:
        uniformBucketLevelAccess: true
  - type: jinja2 # launch-dataflow-template
    name: launch-dataflow-template
    template: |
      apiVersion: dataflow.cnrm.cloud.google.com/v1beta1
      kind: DataflowFlexTemplateJob
      metadata:
        annotations:
          cnrm.cloud.google.com/on-delete: "cancel"
        name: dataflowflextemplatejob-sample-batch
      spec:
        region: {{ dataflowflextemplateconfigs.spec.region }}
        containerSpecGcsPath: {{ dataflowflextemplateconfigs.spec.pathToTemplate }}
        parameters:
          yaml_pipeline_file: {{ dataflowflextemplateconfigs.spec.yamlPipelineFilePath }}
@@ -0,0 +1,32 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: dataflowflextemplateconfigs.idp.mycompany.com
spec:
  group: idp.mycompany.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              projectName:
                type: string
              region:
                type: string
              yamlPipelineFilePath:
                type: string
              pathToTemplate:
                type: string
  scope: Namespaced
  names:
    plural: dataflowflextemplateconfigs
    singular: dataflowflextemplateconfig
    kind: DataflowFlexTemplateConfig
    shortNames:
    - dftc
10 changes: 10 additions & 0 deletions experiments/compositions/samples/DataflowFlexTemplates/facade.yaml
@@ -0,0 +1,10 @@
apiVersion: idp.mycompany.com/v1
kind: DataflowFlexTemplateConfig
metadata:
  name: dataflowflextemplate-job
  namespace: config-control
spec:
  projectName: test-gcp-tse # Or your desired project name
  region: us-central1 # Or your desired region
  yamlPipelineFilePath: gs://test-gcp-tse/beam-batch-sql.yaml # Or your path to Beam YAML file
  pathToTemplate: gs://dataflow-templates-us-central1/latest/flex/Yaml_Template # DO NOT CHANGE if deploying Beam YAML.
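
The project and path values in the facade above are specific to the sample author's environment. As a minimal sketch, a customized copy could be generated from shell variables like this. The file name `facade-custom.yaml`, the project `my-project`, and the bucket `my-pipeline-bucket` are all hypothetical placeholders, while `pathToTemplate` is kept unchanged, as the comment in the facade requests.

```bash
# Assumed placeholder values: replace with your own.
PROJECT_NAME="my-project"                            # hypothetical project name
REGION="us-central1"
PIPELINE_PATH="gs://my-pipeline-bucket/beam.yaml"    # hypothetical GCS path

# Write a facade with the same four spec fields the CRD defines.
cat > facade-custom.yaml <<EOF
apiVersion: idp.mycompany.com/v1
kind: DataflowFlexTemplateConfig
metadata:
  name: dataflowflextemplate-job
  namespace: config-control
spec:
  projectName: ${PROJECT_NAME}
  region: ${REGION}
  yamlPipelineFilePath: ${PIPELINE_PATH}
  pathToTemplate: gs://dataflow-templates-us-central1/latest/flex/Yaml_Template
EOF

# Show the generated file for review before applying it.
cat facade-custom.yaml
```

The generated file can then be applied in place of `facade.yaml` with `kubectl apply -f facade-custom.yaml -n config-control`.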
