[#50] Add support for helm chart (#56)
This PR allows us to deploy `gravitino-playground` with a Helm chart.
unknowntpo authored Nov 8, 2024
1 parent e40c1eb commit 8f08175
Showing 36 changed files with 1,192 additions and 86 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,4 @@
**/.idea
**/.DS_Store
**/packages
**/*.log
85 changes: 77 additions & 8 deletions README.md
@@ -26,6 +26,7 @@ Depending on your network and computer, startup time may take 3-5 minutes. Once
## Prerequisites

Install Git (optional), Docker, Docker Compose.
Docker Desktop (or OrbStack) with Kubernetes enabled and the Helm CLI are required if you deploy the services with the Helm chart.

## System Resource Requirements

@@ -36,7 +37,7 @@ Install Git (optional), Docker, Docker Compose.
The playground runs a number of services. The TCP ports used may clash with existing services you run, such as MySQL or Postgres.

| Docker container | Ports used |
|-----------------------|------------------------|
| --------------------- | ---------------------- |
| playground-gravitino | 8090 9001 |
| playground-hive | 3307 19000 19083 60070 |
| playground-mysql | 13306 |
@@ -48,27 +49,94 @@ The playground runs a number of services. The TCP ports used may clash with exis

## Playground usage



### Launch the playground with one curl command
```shell
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/apache/gravitino-playground/HEAD/install.sh)"
```

### Use git download and launch playground
### Use git to download and launch playground

```shell
git clone git@github.com:apache/gravitino-playground.git
cd gravitino-playground
./playground.sh start
```

### Check status
#### Docker

##### Start

```
./playground.sh docker start
```

##### Check status
```shell
./playground.sh status
./playground.sh docker status
```
### Stop playground
##### Stop playground
```shell
./playground.sh stop
./playground.sh docker stop
```

#### Kubernetes

Enable Kubernetes in Docker Desktop or OrbStack.

In the project root directory, run this command:

```
helm upgrade --install gravitino-playground ./helm-chart/ --create-namespace --namespace gravitino-playground --set projectRoot=$(pwd)
```
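
The command above passes `$(pwd)` unquoted, so the shell splits it if the project path contains spaces. A minimal sketch of a wrapper that quotes the value follows; `deploy_playground` is a hypothetical helper name, not something the repository provides, and the chart path and release name are copied from the command above.

```shell
# Sketch only: deploy_playground is a hypothetical helper, not part of the repo.
# It installs or upgrades the release, quoting the project root so that
# paths containing spaces survive word splitting.
deploy_playground() {
  local root="${1:-$(pwd)}"
  helm upgrade --install gravitino-playground "${root}/helm-chart/" \
    --create-namespace --namespace gravitino-playground \
    --set "projectRoot=${root}"
}

# Usage, from the project root:
#   deploy_playground
```

Note that `--set` treats commas as separators, so a path containing commas would still need escaping.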

##### Start

```
./playground.sh k8s start
```

##### Check status
```shell
./playground.sh k8s status
```

##### Port Forwarding

To access pods or services at `localhost`, follow these steps:

1. Log in to the Gravitino playground Trino pod using the following command:

```
TRINO_POD=$(kubectl get pods --namespace gravitino-playground -l app=trino -o jsonpath="{.items[0].metadata.name}")
kubectl exec $TRINO_POD -n gravitino-playground -it -- /bin/bash
```
2. Log in to the Gravitino playground Spark pod using the following command:

```
SPARK_POD=$(kubectl get pods --namespace gravitino-playground -l app=spark -o jsonpath="{.items[0].metadata.name}")
kubectl exec $SPARK_POD -n gravitino-playground -it -- /bin/bash
```

3. Port-forward the Gravitino service so that you can access it at `localhost:8090`.

```
kubectl port-forward svc/gravitino -n gravitino-playground 8090:8090
```

4. Port-forward the Jupyter Notebook service so that you can access it at `localhost:8888`.

```
kubectl port-forward svc/jupyternotebook -n gravitino-playground 8888:8888
```
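
The lookup-then-exec pattern in steps 1 and 2 can be wrapped in small shell functions. This is a sketch under assumptions: `pod_name` and `pod_shell` are hypothetical helper names, and the `app=<name>` labels and namespace are taken from the commands above.

```shell
# Sketch only: these helpers are hypothetical and not part of the repo.
NAMESPACE="${NAMESPACE:-gravitino-playground}"

# Print the name of the first pod carrying the label app=<name>.
pod_name() {
  kubectl get pods --namespace "$NAMESPACE" -l "app=$1" \
    -o jsonpath="{.items[0].metadata.name}"
}

# Open an interactive bash session in that pod.
pod_shell() {
  local pod
  pod="$(pod_name "$1")" || return 1
  kubectl exec "$pod" -n "$NAMESPACE" -it -- /bin/bash
}

# Usage:
#   pod_shell trino
#   pod_shell spark
```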

##### Stop playground
```shell
./playground.sh k8s stop
```




## Experiencing Apache Gravitino with Trino SQL

@@ -105,7 +173,7 @@ docker exec -it playground-spark bash
2. Open the Spark SQL client in the container.
```shell
spark@container_id:/$ cd /opt/spark && /bin/bash bin/spark-sql
spark@container_id:/$ cd /opt/spark && /bin/bash bin/spark-sql
```

## Monitoring Gravitino
@@ -315,3 +383,4 @@ os.environ["OPENAI_API_BASE"] = ""
Apache Gravitino is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

<sub>Apache®, Apache Gravitino&trade;, Apache Hive&trade;, Apache Iceberg&trade;, and Apache Spark&trade; are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.</sub>

33 changes: 24 additions & 9 deletions docker-compose.yaml
@@ -44,10 +44,13 @@ services:
- "8090:8090"
- "9001:9001"
container_name: playground-gravitino
environment:
- MYSQL_HOST_IP=mysql
- HIVE_HOST_IP=hive
depends_on:
hive :
hive:
condition: service_healthy
mysql :
mysql:
condition: service_healthy
volumes:
- ./healthcheck:/tmp/healthcheck
@@ -70,15 +73,17 @@ services:
- GRAVITINO_HOST_PORT=8090
- GRAVITINO_METALAKE_NAME=metalake_demo
- HIVE_HOST_IP=hive
entrypoint: /bin/bash /tmp/trino/init.sh
- MYSQL_HOST_IP=mysql
- POSTGRES_HOST_IP=postgresql
entrypoint: /bin/bash /tmp/trino/init.sh
volumes:
- ./init/trino:/tmp/trino
- ./init/common:/tmp/common
- ./healthcheck:/tmp/healthcheck
depends_on:
hive :
hive:
condition: service_healthy
gravitino :
gravitino:
condition: service_healthy
healthcheck:
test: ["CMD", "/tmp/healthcheck/trino-healthcheck.sh"]
@@ -112,14 +117,15 @@ services:
- "13306:3306"
volumes:
- ./init/mysql:/docker-entrypoint-initdb.d/
command:
- ./healthcheck:/tmp/healthcheck
command:
--default-authentication-plugin=mysql_native_password
--character-set-server=utf8mb4
--collation-server=utf8mb4_general_ci
--explicit_defaults_for_timestamp=true
--lower_case_table_names=1
healthcheck:
test: ["CMD-SHELL", "mysqladmin ping -h localhost -pmysql"]
test: ["CMD", "/bin/bash", "/tmp/healthcheck/mysql-healthcheck.sh"]
interval: 5s
timeout: 60s
retries: 5
@@ -131,6 +137,10 @@ services:
entrypoint: /bin/bash /tmp/spark/init.sh
environment:
- HADOOP_USER_NAME=root
- GRAVITINO_HOST_IP=gravitino
- GRAVITINO_HOST_PORT=8090
- HIVE_HOST_IP=hive
- TRINO_HOST_IP=trino
ports:
- "14040:4040"
volumes:
@@ -140,15 +150,20 @@ services:
jupyter:
image: jupyter/pyspark-notebook:spark-3.4.1
container_name: playground-jupyter
environment:
- GRAVITINO_HOST_IP=gravitino
- HIVE_HOST_IP=hive
- TRINO_HOST_IP=trino
- POSTGRES_HOST_IP=postgresql
ports:
- "18888:8888"
volumes:
- ./init/jupyter:/tmp/gravitino
entrypoint: /bin/bash /tmp/gravitino/init.sh
depends_on:
hive :
hive:
condition: service_healthy
gravitino :
gravitino:
condition: service_healthy

prometheus:
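
The `depends_on` entries with `condition: service_healthy` in this file delay a service until the health check of its dependency passes. The same status can be polled from the host. The sketch below assumes the Docker CLI is on the `PATH`; `wait_healthy` is a hypothetical helper, not part of the commit.

```shell
# Sketch only: wait_healthy is a hypothetical helper.
# Poll a container's health status until it reports "healthy"
# or the attempts run out.
wait_healthy() {
  local container="$1"
  local attempts="${2:-30}"
  local delay="${3:-5}"
  local i=0
  local status
  while [ "$i" -lt "$attempts" ]; do
    # An inspect failure (for example, container not created yet) counts as not healthy.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)" || status=""
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_healthy playground-mysql && echo "MySQL is ready"
```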
6 changes: 4 additions & 2 deletions healthcheck/gravitino-healthcheck.sh
@@ -23,9 +23,11 @@ max_attempts=3
attempt=0
success=false

while [ $attempt -lt $max_attempts ]; do
response=$(curl -X GET -H "Content-Type: application/json" http://127.0.0.1:8090/api/version)
HOST_IP=${GRAVITINO_HOST_IP:-localhost}

while [ $attempt -lt $max_attempts ]; do
response=$(curl -X GET -H "Content-Type: application/json" http://${HOST_IP}:8090/api/version)

if echo "$response" | grep -q "\"code\":0"; then
success=true
break
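
The `${GRAVITINO_HOST_IP:-localhost}` expansion in this hunk falls back to `localhost` when the variable is unset or empty, which lets the same script run inside the container and from another pod. A minimal sketch of that pattern, extended to the port, follows. The helper names are hypothetical, and reading `GRAVITINO_HOST_PORT` here is an assumption: the script in this commit hard-codes 8090.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical and not part of the commit.

# Build the version endpoint, falling back to defaults when the
# environment does not provide a host or port.
gravitino_version_url() {
  local host="${GRAVITINO_HOST_IP:-localhost}"
  local port="${GRAVITINO_HOST_PORT:-8090}"
  echo "http://${host}:${port}/api/version"
}

# Succeed only when the response body contains "code":0.
is_healthy_response() {
  echo "$1" | grep -q '"code":0'
}

# Usage (assumes curl is installed in the container):
#   response="$(curl -s "$(gravitino_version_url)")"
#   is_healthy_response "$response" && echo "Gravitino is up"
```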
36 changes: 36 additions & 0 deletions healthcheck/hive-healthcheck.sh
@@ -0,0 +1,36 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
set -ex

# Set Hive connection details
HOST_IP=${HIVE_HOST_IP:-localhost}
HIVE_PORT="10000"

# Attempt to connect to Hive using curl
curl -s -o /dev/null -w "%{http_code}" http://${HOST_IP}:${HIVE_PORT}

# Check curl's exit status (this is not the HTTP status code printed by -w)
if [ $? -eq 0 ]; then
echo "Hive connection successful"
exit 0
else
echo "Hive connection failed"
exit 1
fi
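
A note on the script above: it runs under `set -e`, so a failing `curl` ends the script before the `$?` test and the failure message is never printed, although the probe still exits non-zero. Testing the command directly in the `if` avoids this. The sketch below shows that form; `probe` is a hypothetical helper, not part of the commit.

```shell
# Sketch only: probe is a hypothetical helper.
# probe NAME CMD...: run CMD and report the outcome for NAME.
# A command tested by `if` does not trigger `set -e`, so both
# branches stay reachable.
probe() {
  local name="$1"
  shift
  if "$@"; then
    echo "${name} connection successful"
    return 0
  fi
  echo "${name} connection failed"
  return 1
}

# Usage (assumes curl is installed in the container):
#   probe Hive curl -s -o /dev/null "http://${HIVE_HOST_IP:-localhost}:10000"
```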
30 changes: 30 additions & 0 deletions healthcheck/mysql-healthcheck.sh
@@ -0,0 +1,30 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
set -ex

HOST_IP=${MYSQL_HOST_IP:-localhost}
mysqladmin ping -h ${HOST_IP} -p${MYSQL_ROOT_PASSWORD}
if [ $? -eq 0 ]; then
echo "MySQL container started successfully."
exit 0
else
echo "MySQL container has not started yet."
exit 1
fi
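
If `MYSQL_ROOT_PASSWORD` is unset, the `-p` option above is passed without a value and `mysqladmin` prompts for a password, which a health check cannot answer. A sketch of a guarded variant follows; `mysql_ping` is a hypothetical helper, and the return code 2 for a missing password is an arbitrary choice.

```shell
# Sketch only: mysql_ping is a hypothetical helper.
# Ping MySQL, refusing to run when the root password is missing.
mysql_ping() {
  local host="${MYSQL_HOST_IP:-localhost}"
  if [ -z "${MYSQL_ROOT_PASSWORD:-}" ]; then
    echo "MYSQL_ROOT_PASSWORD is not set" >&2
    return 2
  fi
  mysqladmin ping -h "$host" "-p${MYSQL_ROOT_PASSWORD}"
}

# Usage:
#   if mysql_ping; then echo "MySQL container started successfully."; fi
```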
2 changes: 1 addition & 1 deletion healthcheck/trino-healthcheck.sh
@@ -20,7 +20,7 @@
set -ex

# Because trino-connector must first synchronize a default metalake from the Gravitino server
response=$(trino --execute "SHOW CATALOGS LIKE 'catalog_hive'")
response=$(trino --server ${TRINO_HOST_IP}:8080 --execute "SHOW CATALOGS LIKE 'catalog_hive'")
if echo "$response" | grep -q catalog_hive; then
echo "Gravitino Trino connector has finished synchronizing metadata"
else
29 changes: 29 additions & 0 deletions helm-chart/.helmignore
@@ -0,0 +1,29 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

# Ignore these directories because they are too large; we use a local-path PV to mount them into the Pod
init/*/data/
init/*/packages/


9 changes: 9 additions & 0 deletions helm-chart/Chart.yaml
@@ -0,0 +1,9 @@
apiVersion: v2
name: gravitino-playground
description: A Helm chart for Gravitino Playground
type: application
version: 0.1.0
appVersion: "1.0.0"
maintainers:
- name: Your Name
email: [email protected]
1 change: 1 addition & 0 deletions helm-chart/healthcheck
1 change: 1 addition & 0 deletions helm-chart/init
24 changes: 24 additions & 0 deletions helm-chart/templates/NOTES.txt
@@ -0,0 +1,24 @@
1. Log in to the Gravitino playground Trino pod using the following command:

```
TRINO_POD=$(kubectl get pods --namespace gravitino-playground -l app=trino -o jsonpath="{.items[0].metadata.name}")
kubectl exec $TRINO_POD -n gravitino-playground -it -- /bin/bash
```
2. Log in to the Gravitino playground Spark pod using the following command:

```
SPARK_POD=$(kubectl get pods --namespace gravitino-playground -l app=spark -o jsonpath="{.items[0].metadata.name}")
kubectl exec $SPARK_POD -n gravitino-playground -it -- /bin/bash
```

3. Port-forward the Gravitino service so that you can access it at `localhost:8090`.

```
kubectl port-forward svc/gravitino -n gravitino-playground 8090:8090
```

4. Port-forward the Jupyter Notebook service so that you can access it at `localhost:8888`.

```
kubectl port-forward svc/jupyternotebook -n gravitino-playground 8888:8888
```