[#50] Add support for helm chart #56

Merged
merged 9 commits into from
Nov 8, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,2 +1,4 @@
**/.idea
**/.DS_Store
**/packages
**/*.log
85 changes: 77 additions & 8 deletions README.md
@@ -26,6 +26,7 @@ Depending on your network and computer, startup time may take 3-5 minutes. Once
## Prerequisites

Install Git (optional), Docker, Docker Compose.
Docker Desktop (or OrbStack) with Kubernetes enabled and the Helm CLI are required if you deploy the services with the Helm chart.
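
A quick way to confirm the tools are installed is to look each one up on `PATH`. The sketch below is not part of the playground scripts; `missing_commands` is a hypothetical helper name, and the list of tools to check (`docker`, `helm`, `kubectl`) is an assumption based on the commands used in this README.

```shell
#!/bin/bash
# Print the name of every given command that cannot be found on PATH.
# (Hypothetical helper, not part of playground.sh.)
missing_commands() {
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
    fi
  done
}

# Example use before a Helm-based deployment:
#   missing=$(missing_commands docker helm kubectl)
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```

An empty result means every listed tool was found; it does not verify versions or that Kubernetes is actually enabled.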

## System Resource Requirements

@@ -36,7 +37,7 @@ Install Git (optional), Docker, Docker Compose.
The playground runs a number of services. The TCP ports used may clash with existing services you run, such as MySQL or Postgres.

| Docker container | Ports used |
|-----------------------|------------------------|
| --------------------- | ---------------------- |
| playground-gravitino | 8090 9001 |
| playground-hive | 3307 19000 19083 60070 |
| playground-mysql | 13306 |
@@ -48,27 +49,94 @@ The playground runs a number of services. The TCP ports used may clash with exis
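
One way to check for port clashes before starting is to probe each port from the table. This is a sketch rather than part of the playground scripts: `port_in_use` and `report_ports` are hypothetical helper names, and the probe relies on bash's `/dev/tcp` redirection, so it needs bash rather than plain `sh`.

```shell
#!/bin/bash
# Return 0 if something accepts TCP connections on 127.0.0.1:<port>.
# Uses bash's /dev/tcp pseudo-device; the connection closes when the subshell exits.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print one status line per port.
report_ports() {
  local port
  for port in "$@"; do
    if port_in_use "$port"; then
      echo "$port in use"
    else
      echo "$port free"
    fi
  done
}

# Example with the ports listed for the first three containers:
#   report_ports 8090 9001 3307 19000 19083 60070 13306
```

Extend the argument list with the remaining ports from the table. A service bound only to a non-loopback interface would not be detected by this probe.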

## Playground usage



### Launch the playground with a single curl command
```shell
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/apache/gravitino-playground/HEAD/install.sh)"
```

### Use git download and launch playground
### Use git to download and launch playground

```shell
git clone git@github.com:apache/gravitino-playground.git
cd gravitino-playground
./playground.sh start
```

### Check status
#### Docker

##### Start

```shell
./playground.sh docker start
```

##### Check status
```shell
./playground.sh status
./playground.sh docker status
```
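
Because startup can take a few minutes, a script may prefer to wait for Gravitino instead of polling the status by hand. The sketch below assumes the Docker port mapping `8090:8090` and reuses the success test from `healthcheck/gravitino-healthcheck.sh` (a reply containing `"code":0` from `/api/version`). `wait_until` and `gravitino_ready` are hypothetical helper names, not part of `playground.sh`.

```shell
#!/bin/bash
# Run a command up to <attempts> times, sleeping <delay> seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # No need to sleep after the final attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Assumed readiness probe: same check as the Gravitino healthcheck script.
gravitino_ready() {
  curl -fsS "http://localhost:8090/api/version" 2>/dev/null | grep -q '"code":0'
}

# Example: try for up to 5 minutes (30 attempts, 10 seconds apart).
#   wait_until 30 10 gravitino_ready && echo "Gravitino is up"
```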
### Stop playground
##### Stop playground
```shell
./playground.sh stop
./playground.sh docker stop
```

#### Kubernetes

Enable Kubernetes in Docker Desktop or OrbStack.

In the project root directory, run this command:

```shell
helm upgrade --install gravitino-playground ./helm-chart/ --create-namespace --namespace gravitino-playground --set projectRoot="$(pwd)"
```

##### Start

```shell
./playground.sh k8s start
```

##### Check status
```shell
./playground.sh k8s status
```

##### Port Forwarding

To open a shell in a pod or reach a service at `localhost`, follow these steps:

1. Log in to the Gravitino playground Trino pod using the following command:

```shell
TRINO_POD=$(kubectl get pods --namespace gravitino-playground -l app=trino -o jsonpath="{.items[0].metadata.name}")
kubectl exec "$TRINO_POD" -n gravitino-playground -it -- /bin/bash
```
2. Log in to the Gravitino playground Spark pod using the following command:

```shell
SPARK_POD=$(kubectl get pods --namespace gravitino-playground -l app=spark -o jsonpath="{.items[0].metadata.name}")
kubectl exec "$SPARK_POD" -n gravitino-playground -it -- /bin/bash
```

3. Port-forward the Gravitino service so that you can access it at `localhost:8090`:

```shell
kubectl port-forward svc/gravitino -n gravitino-playground 8090:8090
```

4. Port-forward the Jupyter Notebook service so that you can access it at `localhost:8888`:

```shell
kubectl port-forward svc/jupyternotebook -n gravitino-playground 8888:8888
```
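
The pod lookups in steps 1 and 2 index `.items[0]`, which makes `kubectl` fail with an "array index out of bounds" error when no pod matches the selector, for example while the pods are still being created. A possible alternative, sketched here under the assumption that a plain error message is preferable, is to list pods with `-o name` (which prints `pod/<name>` lines) and pick the first one. `first_pod_name` is a hypothetical helper name.

```shell
#!/bin/bash
# Read `kubectl get pods -o name` output on stdin and print the first pod's
# name without the "pod/" prefix. Fails with a message if the list is empty.
first_pod_name() {
  local line
  IFS= read -r line || true
  if [ -z "$line" ]; then
    echo "no pod matched the selector" >&2
    return 1
  fi
  printf '%s\n' "${line#pod/}"
}

# Example:
#   TRINO_POD=$(kubectl get pods -n gravitino-playground -l app=trino -o name | first_pod_name) || exit 1
#   kubectl exec "$TRINO_POD" -n gravitino-playground -it -- /bin/bash
```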

##### Stop playground
```shell
./playground.sh k8s stop
```




## Experiencing Apache Gravitino with Trino SQL

@@ -105,7 +173,7 @@ docker exec -it playground-spark bash
2. Open the Spark SQL client in the container.

```shell
spark@container_id:/$ cd /opt/spark && /bin/bash bin/spark-sql
spark@container_id:/$ cd /opt/spark && /bin/bash bin/spark-sql
```

## Monitoring Gravitino
@@ -315,3 +383,4 @@ os.environ["OPENAI_API_BASE"] = ""
Apache Gravitino is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

<sub>Apache®, Apache Gravitino&trade;, Apache Hive&trade;, Apache Iceberg&trade;, and Apache Spark&trade; are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.</sub>

33 changes: 24 additions & 9 deletions docker-compose.yaml
@@ -44,10 +44,13 @@ services:
- "8090:8090"
- "9001:9001"
container_name: playground-gravitino
environment:
- MYSQL_HOST_IP=mysql
- HIVE_HOST_IP=hive
depends_on:
hive :
hive:
condition: service_healthy
mysql :
mysql:
condition: service_healthy
volumes:
- ./healthcheck:/tmp/healthcheck
@@ -70,15 +73,17 @@ services:
- GRAVITINO_HOST_PORT=8090
- GRAVITINO_METALAKE_NAME=metalake_demo
- HIVE_HOST_IP=hive
entrypoint: /bin/bash /tmp/trino/init.sh
- MYSQL_HOST_IP=mysql
- POSTGRES_HOST_IP=postgresql
entrypoint: /bin/bash /tmp/trino/init.sh
volumes:
- ./init/trino:/tmp/trino
- ./init/common:/tmp/common
- ./healthcheck:/tmp/healthcheck
depends_on:
hive :
hive:
condition: service_healthy
gravitino :
gravitino:
condition: service_healthy
healthcheck:
test: ["CMD", "/tmp/healthcheck/trino-healthcheck.sh"]
@@ -112,14 +117,15 @@ services:
- "13306:3306"
volumes:
- ./init/mysql:/docker-entrypoint-initdb.d/
command:
- ./healthcheck:/tmp/healthcheck
command:
--default-authentication-plugin=mysql_native_password
--character-set-server=utf8mb4
--collation-server=utf8mb4_general_ci
--explicit_defaults_for_timestamp=true
--lower_case_table_names=1
healthcheck:
test: ["CMD-SHELL", "mysqladmin ping -h localhost -pmysql"]
test: ["CMD", "/bin/bash", "/tmp/healthcheck/mysql-healthcheck.sh"]
interval: 5s
timeout: 60s
retries: 5
@@ -131,6 +137,10 @@ services:
entrypoint: /bin/bash /tmp/spark/init.sh
environment:
- HADOOP_USER_NAME=root
- GRAVITINO_HOST_IP=gravitino
- GRAVITINO_HOST_PORT=8090
- HIVE_HOST_IP=hive
- TRINO_HOST_IP=trino
ports:
- "14040:4040"
volumes:
@@ -140,15 +150,20 @@ services:
jupyter:
image: jupyter/pyspark-notebook:spark-3.4.1
container_name: playground-jupyter
environment:
- GRAVITINO_HOST_IP=gravitino
- HIVE_HOST_IP=hive
- TRINO_HOST_IP=trino
- POSTGRES_HOST_IP=postgresql
ports:
- "18888:8888"
volumes:
- ./init/jupyter:/tmp/gravitino
entrypoint: /bin/bash /tmp/gravitino/init.sh
depends_on:
hive :
hive:
condition: service_healthy
gravitino :
gravitino:
condition: service_healthy

prometheus:
6 changes: 4 additions & 2 deletions healthcheck/gravitino-healthcheck.sh
@@ -23,9 +23,11 @@ max_attempts=3
attempt=0
success=false

while [ $attempt -lt $max_attempts ]; do
response=$(curl -X GET -H "Content-Type: application/json" http://127.0.0.1:8090/api/version)
HOST_IP=${GRAVITINO_HOST_IP:-localhost}

while [ $attempt -lt $max_attempts ]; do
response=$(curl -X GET -H "Content-Type: application/json" http://${HOST_IP}:8090/api/version)

if echo "$response" | grep -q "\"code\":0"; then
success=true
break
36 changes: 36 additions & 0 deletions healthcheck/hive-healthcheck.sh
@@ -0,0 +1,36 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
set -ex

# Set Hive connection details
HOST_IP=${HIVE_HOST_IP:-localhost}
HIVE_PORT="10000"

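# Attempt to connect to Hive using curl. Under `set -e` a failing command
# aborts the script before a later `$?` test can run, so curl is tested
# directly. Its exit status (not the HTTP status code) decides the result.
if curl -s -o /dev/null -w "%{http_code}" "http://${HOST_IP}:${HIVE_PORT}"; then
  echo "Hive connection successful"
  exit 0
else
  echo "Hive connection failed"
  exit 1
fi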
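
A note on `set -e` in healthcheck scripts like these: while it is active, a command that fails outside a condition ends the script at once, so a `$?` test placed on the following line only ever runs after a success. The sketch below contrasts the two forms, using `false` as a stand-in for a failing probe. The function names are made up for the demonstration.

```shell
#!/bin/bash
# `false` stands in for a failing probe such as curl or mysqladmin.
# Each variant runs in its own bash process so `set -e` starts from a clean state.

# Variant 1: separate `$?` test. `set -e` ends the script at `false`,
# so neither echo runs and nothing is printed.
status_check_style() {
  bash -c '
    set -e
    false
    if [ $? -eq 0 ]; then echo "ok"; else echo "failed"; fi
  '
}

# Variant 2: the probe is the `if` condition, where `set -e` does not apply,
# so the else branch runs and reports the failure.
if_style() {
  bash -c '
    set -e
    if false; then echo "ok"; else echo "failed"; exit 1; fi
  '
}
```

Both variants exit non-zero, so a container healthcheck would still be marked unhealthy either way; the difference is only whether the failure message is printed.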
30 changes: 30 additions & 0 deletions healthcheck/mysql-healthcheck.sh
@@ -0,0 +1,30 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
set -ex

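HOST_IP=${MYSQL_HOST_IP:-localhost}
# Test mysqladmin directly: under `set -e` a failed ping would end the script
# before a separate `$?` check could report it.
if mysqladmin ping -h "${HOST_IP}" -p"${MYSQL_ROOT_PASSWORD}"; then
  echo "MySQL container started successfully."
  exit 0
else
  echo "MySQL container has not started yet."
  exit 1
fi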
2 changes: 1 addition & 1 deletion healthcheck/trino-healthcheck.sh
@@ -20,7 +20,7 @@
set -ex

# Because trino-connector must first synchronize a default metalake from the Gravitino server
response=$(trino --execute "SHOW CATALOGS LIKE 'catalog_hive'")
response=$(trino --server "${TRINO_HOST_IP:-localhost}:8080" --execute "SHOW CATALOGS LIKE 'catalog_hive'")
if echo "$response" | grep -q catalog_hive; then
echo "Gravitino Trino connector has finished synchronizing metadata"
else
29 changes: 29 additions & 0 deletions helm-chart/.helmignore
@@ -0,0 +1,29 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/

# Ignore these directories because they are too large; a local-path PV mounts them into the Pod instead
init/*/data/
init/*/packages/


9 changes: 9 additions & 0 deletions helm-chart/Chart.yaml
@@ -0,0 +1,9 @@
apiVersion: v2
name: gravitino-playground
description: A Helm chart for Gravitino Playground
type: application
version: 0.1.0
appVersion: "1.0.0"
maintainers:
- name: Your Name
email: [email protected]
1 change: 1 addition & 0 deletions helm-chart/healthcheck
1 change: 1 addition & 0 deletions helm-chart/init
24 changes: 24 additions & 0 deletions helm-chart/templates/NOTES.txt
Contributor

The command printed to the console appears to be incorrect:

ubuntu@ip-10-0-4-171:~$ kubectl get pods --namespace gravitino-playground -l "app.kubernetes.io/name=gravitino-playground,app.kubernetes.io/instance=gravitino-playground" -o jsonpath="{.items[0].metadata.name}"
error: error executing jsonpath "{.items[0].metadata.name}": Error executing template: array index out of bounds: index 0, length 0. Printing more information for debugging the template:
	template was:
		{.items[0].metadata.name}
	object given to jsonpath engine was:
		map[string]interface {}{"apiVersion":"v1", "items":[]interface {}{}, "kind":"List", "metadata":map[string]interface {}{"resourceVersion":""}}

Contributor Author

fixed.

@@ -0,0 +1,24 @@
1. Log in to the Gravitino playground Trino pod using the following command:

```
TRINO_POD=$(kubectl get pods --namespace gravitino-playground -l app=trino -o jsonpath="{.items[0].metadata.name}")
kubectl exec $TRINO_POD -n gravitino-playground -it -- /bin/bash
```
2. Log in to the Gravitino playground Spark pod using the following command:

```
SPARK_POD=$(kubectl get pods --namespace gravitino-playground -l app=spark -o jsonpath="{.items[0].metadata.name}")
kubectl exec $SPARK_POD -n gravitino-playground -it -- /bin/bash
```

3. Port-forward the Gravitino service so that you can access it at `localhost:8090`:

```
kubectl port-forward svc/gravitino -n gravitino-playground 8090:8090
```

4. Port-forward the Jupyter Notebook service so that you can access it at `localhost:8888`:

```
kubectl port-forward svc/jupyternotebook -n gravitino-playground 8888:8888
```