feat(helm): add helm chart
unknowntpo committed Oct 31, 2024
1 parent d64b86a commit 4bd5321
Showing 35 changed files with 1,154 additions and 58 deletions.
Empty file added .helmignore
Empty file.
55 changes: 46 additions & 9 deletions README.md
@@ -26,6 +26,7 @@ Depending on your network and computer, startup time may take 3-5 minutes. Once
## Prerequisites

Install Git, Docker, Docker Compose.
Docker Desktop (or OrbStack) with Kubernetes enabled and the Helm CLI are required if you deploy the services with the Helm chart.

## System Resource Requirements

@@ -35,14 +36,14 @@ Install Git, Docker, Docker Compose.

The playground runs a number of services. The TCP ports used may clash with existing services you run, such as MySQL or Postgres.

| Docker container | Ports used |
|-----------------------|----------------------|
| playground-gravitino | 8090 9001 |
| playground-hive | 3307 19000 19083 60070 |
| playground-mysql | 13306 |
| playground-postgresql | 15342 |
| playground-trino | 18080 |
| playground-jupyter | 18888 |
| Docker container | Ports used |
| --------------------- | ---------------------------- |
| playground-gravitino | 8090 9001 |
| playground-hive | 3307 19000 19083 20000 60070 |
| playground-mysql | 13306 |
| playground-postgresql | 15342 |
| playground-trino | 18080 |
| playground-jupyter | 18888 |
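
Before starting the playground, it can help to check whether any of these ports are already taken. The following is a minimal sketch, not part of the repository: the `PLAYGROUND_PORTS` array and the `port_in_use` helper are hypothetical names, the port list is copied from the table above, and the check only probes the loopback address through bash's `/dev/tcp` redirection.

```shell
#!/bin/bash
# Ports copied from the table above (assumption: the table is complete).
PLAYGROUND_PORTS=(8090 9001 3307 19000 19083 20000 60070 13306 15342 18080 18888)

# Hypothetical helper: returns 0 if something accepts TCP connections on
# 127.0.0.1:<port>. /dev/tcp is a bash redirection feature, not a real file.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Report every playground port that already has a listener.
for port in "${PLAYGROUND_PORTS[@]}"; do
  if port_in_use "$port"; then
    echo "port $port is already in use"
  fi
done
```

A service bound only to a non-loopback interface would not be detected by this probe, so treat an empty result as a hint rather than a guarantee.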

## Playground usage

@@ -98,7 +99,7 @@ docker exec -it playground-spark bash
2. Open the Spark SQL client in the container.
```shell
spark@container_id:/$ cd /opt/spark && /bin/bash bin/spark-sql
spark@container_id:/$ cd /opt/spark && /bin/bash bin/spark-sql
```

## Example
Expand Down Expand Up @@ -295,8 +296,44 @@ os.environ["OPENAI_API_KEY"] = ""
os.environ["OPENAI_API_BASE"] = ""
```

## Kubernetes

Enable Kubernetes in Docker Desktop or Orbstack.

From the project root directory, run:

```shell
helm upgrade --install gravitino-playground ./helm-chart/ --create-namespace --namespace gravitino-playground --set projectRoot="$(pwd)"
```

1. Log in to the Gravitino playground Trino pod using the following commands:

```shell
TRINO_POD=$(kubectl get pods --namespace gravitino-playground -l app=trino -o jsonpath="{.items[0].metadata.name}")
kubectl exec "$TRINO_POD" -n gravitino-playground -it -- /bin/bash
```

2. Log in to the Gravitino playground Spark pod using the following commands:

```shell
SPARK_POD=$(kubectl get pods --namespace gravitino-playground -l app=spark -o jsonpath="{.items[0].metadata.name}")
kubectl exec "$SPARK_POD" -n gravitino-playground -it -- /bin/bash
```

3. Forward the Gravitino service port so that you can access it at `localhost:8090`:

```shell
kubectl port-forward svc/gravitino -n gravitino-playground 8090:8090
```

4. Forward the Jupyter Notebook service port so that you can access it at `localhost:8888`:

```shell
kubectl port-forward svc/jupyternotebook -n gravitino-playground 8888:8888
```
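
The `kubectl exec` and `port-forward` steps above only work once the pods are ready. A small sketch of waiting for readiness with `kubectl wait` follows; the `wait_for_pod` function, the `KUBECTL` override, and the 300s default timeout are assumptions made for illustration, while the namespace and the `app=trino` / `app=spark` labels are taken from the commands above.

```shell
#!/bin/bash
NAMESPACE="gravitino-playground"
# KUBECTL can be overridden (for example with `echo`) to preview the command
# without touching a cluster. This override is a convenience of this sketch.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: block until the pod labelled app=<name> reports Ready.
wait_for_pod() {
  local app="$1" timeout="${2:-300s}"
  "$KUBECTL" wait --for=condition=Ready pod -l "app=${app}" -n "$NAMESPACE" --timeout="$timeout"
}

# Usage (requires a running cluster with the chart installed):
#   wait_for_pod trino && wait_for_pod spark
```

Note that `kubectl wait` may return an error if no pod matching the selector exists yet, so it is best run after `helm upgrade --install` has completed.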

## ASF Incubator disclaimer

Apache Gravitino is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

<sub>Apache®, Apache Gravitino&trade;, Apache Hive&trade;, Apache Iceberg&trade;, and Apache Spark&trade; are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.</sub>

33 changes: 24 additions & 9 deletions docker-compose.yaml
@@ -44,10 +44,13 @@ services:
- "8090:8090"
- "9001:9001"
container_name: playground-gravitino
environment:
- MYSQL_HOST_IP=mysql
- HIVE_HOST_IP=hive
depends_on:
hive :
hive:
condition: service_healthy
mysql :
mysql:
condition: service_healthy
volumes:
- ./healthcheck:/tmp/healthcheck
@@ -70,15 +73,17 @@ services:
- GRAVITINO_HOST_PORT=8090
- GRAVITINO_METALAKE_NAME=metalake_demo
- HIVE_HOST_IP=hive
entrypoint: /bin/bash /tmp/trino/init.sh
- MYSQL_HOST_IP=mysql
- POSTGRES_HOST_IP=postgresql
entrypoint: /bin/bash /tmp/trino/init.sh
volumes:
- ./init/trino:/tmp/trino
- ./init/common:/tmp/common
- ./healthcheck:/tmp/healthcheck
depends_on:
hive :
hive:
condition: service_healthy
gravitino :
gravitino:
condition: service_healthy
healthcheck:
test: ["CMD", "/tmp/healthcheck/trino-healthcheck.sh"]
@@ -112,14 +117,15 @@ services:
- "13306:3306"
volumes:
- ./init/mysql:/docker-entrypoint-initdb.d/
command:
- ./healthcheck:/tmp/healthcheck
command:
--default-authentication-plugin=mysql_native_password
--character-set-server=utf8mb4
--collation-server=utf8mb4_general_ci
--explicit_defaults_for_timestamp=true
--lower_case_table_names=1
healthcheck:
test: ["CMD-SHELL", "mysqladmin ping -h localhost -pmysql"]
test: ["CMD", "/bin/bash", "/tmp/healthcheck/mysql-healthcheck.sh"]
interval: 5s
timeout: 60s
retries: 5
@@ -131,6 +137,10 @@ services:
entrypoint: /bin/bash /tmp/spark/init.sh
environment:
- HADOOP_USER_NAME=root
- GRAVITINO_HOST_IP=gravitino
- GRAVITINO_HOST_PORT=8090
- HIVE_HOST_IP=hive
- TRINO_HOST_IP=trino
ports:
- "14040:4040"
volumes:
@@ -140,15 +150,20 @@ services:
jupyter:
image: jupyter/pyspark-notebook:spark-3.4.1
container_name: playground-jupyter
environment:
- GRAVITINO_HOST_IP=gravitino
- HIVE_HOST_IP=hive
- TRINO_HOST_IP=trino
- POSTGRES_HOST_IP=postgresql
ports:
- 18888:8888
volumes:
- ./init/jupyter:/tmp/gravitino
entrypoint: /bin/bash /tmp/gravitino/init.sh
depends_on:
hive :
hive:
condition: service_healthy
gravitino :
gravitino:
condition: service_healthy

volumes:
6 changes: 4 additions & 2 deletions healthcheck/gravitino-healthcheck.sh
@@ -23,9 +23,11 @@ max_attempts=3
attempt=0
success=false

while [ $attempt -lt $max_attempts ]; do
response=$(curl -X GET -H "Content-Type: application/json" http://127.0.0.1:8090/api/version)
HOST_IP=${GRAVITINO_HOST_IP:-localhost}

while [ $attempt -lt $max_attempts ]; do
response=$(curl -X GET -H "Content-Type: application/json" http://${HOST_IP}:8090/api/version)

if echo "$response" | grep -q "\"code\":0"; then
success=true
break
36 changes: 36 additions & 0 deletions healthcheck/hive-healthcheck.sh
@@ -0,0 +1,36 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
set -ex

# Set Hive connection details
HOST_IP=${HIVE_HOST_IP:-localhost}
HIVE_PORT="10000"

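# Attempt to connect to Hive using curl. The command is tested directly by `if`
# because `set -e` would otherwise abort the script before `$?` could be checked.
# Note that this checks curl's exit status, not the HTTP status code.
if curl -s -o /dev/null -w "%{http_code}" "http://${HOST_IP}:${HIVE_PORT}"; then
  echo "Hive connection successful"
  exit 0
else
  echo "Hive connection failed"
  exit 1
fi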
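
A note on combining `set -e` with a later `$?` test, as this script does: under `set -e`, a failing command on its own line terminates the script at that point, so a failure branch keyed on `$?` is never reached. Testing the command directly in the `if` condition avoids this, because commands in an `if` condition are exempt from `set -e`. The sketch below illustrates the pattern; `probe` is a hypothetical helper, not something defined in the repository.

```shell
#!/bin/bash
set -e

# Hypothetical helper: run a command and report on its exit status.
# The command is the `if` condition, so `set -e` does not abort on failure
# and the else branch stays reachable.
probe() {
  if "$@" >/dev/null 2>&1; then
    echo "connection successful"
  else
    echo "connection failed"
    return 1
  fi
}

probe true
probe false || echo "probe reported a failure, script still running"
```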
30 changes: 30 additions & 0 deletions healthcheck/mysql-healthcheck.sh
@@ -0,0 +1,30 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
set -ex

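HOST_IP=${MYSQL_HOST_IP:-localhost}
# Test the command directly: under `set -e`, a failing `mysqladmin` would abort
# the script before a separate `$?` check could run.
if mysqladmin ping -h "${HOST_IP}" -p"${MYSQL_ROOT_PASSWORD}"; then
  echo "MySQL container started successfully."
  exit 0
else
  echo "MySQL container has not started yet."
  exit 1
fi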
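
The `${MYSQL_HOST_IP:-localhost}` expansion used in this script falls back to `localhost` when the variable is unset or empty, which lets the same healthcheck run with or without the variable being exported. A minimal sketch of that behaviour, using a hypothetical `resolve_host` function:

```shell
#!/bin/bash
# Hypothetical helper: return the configured host, or "localhost" when the
# argument is empty. The `:-` form treats unset and empty values the same way.
resolve_host() {
  local configured="$1"
  echo "${configured:-localhost}"
}

resolve_host ""       # falls back to the default
resolve_host "mysql"  # keeps the configured service name
```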
2 changes: 1 addition & 1 deletion healthcheck/trino-healthcheck.sh
@@ -20,7 +20,7 @@
set -ex

# Because trino-connector must first synchronize a default metalake from the Gravitino server
response=$(trino --execute "SHOW CATALOGS LIKE 'catalog_hive'")
response=$(trino --server "${TRINO_HOST_IP:-localhost}:8080" --execute "SHOW CATALOGS LIKE 'catalog_hive'")
if echo "$response" | grep -q catalog_hive; then
echo "Gravitino Trino connector has finished synchronizing metadata"
else
26 changes: 26 additions & 0 deletions helm-chart/.helmignore
@@ -0,0 +1,26 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

# Ignore init/jupyter/data because it is too large; a local-path PV mounts it into the Pod instead.
init/jupyter/data/
9 changes: 9 additions & 0 deletions helm-chart/Chart.yaml
@@ -0,0 +1,9 @@
apiVersion: v2
name: gravitino-playground
description: A Helm chart for Gravitino Playground
type: application
version: 0.1.0
appVersion: "1.0.0"
maintainers:
- name: Your Name
email: [email protected]
1 change: 1 addition & 0 deletions helm-chart/healthcheck
1 change: 1 addition & 0 deletions helm-chart/init
24 changes: 24 additions & 0 deletions helm-chart/templates/NOTES.txt
@@ -0,0 +1,24 @@
1. Log in to the Gravitino playground Trino pod using the following command:

```
TRINO_POD=$(kubectl get pods --namespace gravitino-playground -l app=trino -o jsonpath="{.items[0].metadata.name}")
kubectl exec $TRINO_POD -n gravitino-playground -it -- /bin/bash
```
2. Log in to the Gravitino playground Spark pod using the following command:

```
SPARK_POD=$(kubectl get pods --namespace gravitino-playground -l app=spark -o jsonpath="{.items[0].metadata.name}")
kubectl exec $SPARK_POD -n gravitino-playground -it -- /bin/bash
```

3. Forward the Gravitino service port so that you can access it at `localhost:8090`:

```
kubectl port-forward svc/gravitino -n gravitino-playground 8090:8090
```

4. Forward the Jupyter Notebook service port so that you can access it at `localhost:8888`:

```
kubectl port-forward svc/jupyternotebook -n gravitino-playground 8888:8888
```
62 changes: 62 additions & 0 deletions helm-chart/templates/_helpers.tpl
@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "gravitino-playground.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "gravitino-playground.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "gravitino-playground.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "gravitino-playground.labels" -}}
helm.sh/chart: {{ include "gravitino-playground.chart" . }}
{{ include "gravitino-playground.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "gravitino-playground.selectorLabels" -}}
app.kubernetes.io/name: {{ include "gravitino-playground.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "gravitino-playground.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "gravitino-playground.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
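
To make the naming logic of `gravitino-playground.fullname` easier to follow, here is a rough bash emulation. It is a sketch only and not part of the chart: the `fullname` function is hypothetical, it ignores `fullnameOverride` and `nameOverride`, and it takes the release name and chart name as plain arguments.

```shell
#!/bin/bash
# Hypothetical emulation of the fullname helper:
#   - if the release name contains the chart name, use the release name
#   - otherwise join them as "<release>-<chart>"
#   - truncate to 63 characters and strip one trailing "-"
fullname() {
  local release="$1" name="$2" full
  if [[ "$release" == *"$name"* ]]; then
    full="$release"
  else
    full="${release}-${name}"
  fi
  full="${full:0:63}"
  echo "${full%-}"
}

# Release and chart names as used in the README's helm command.
fullname gravitino-playground gravitino-playground
```

With the release name from the README, the release already contains the chart name, so the helper yields the release name unchanged rather than doubling it.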