Fix Demo
Presto split into Presto and Trino, and the Presto containers appear to be gone from Docker Hub, so switch the demo to Trino.

Related to Netflix#1164
tgianos committed Mar 28, 2022
1 parent e7c9739 commit 7b835f6
Showing 12 changed files with 74 additions and 74 deletions.
28 changes: 14 additions & 14 deletions genie-demo/src/docs/asciidoc/index.adoc
@@ -43,7 +43,7 @@ https://netflix.github.io/genie/docs/{revnumber}/rest[API Guide].
** 8080 (Genie)
** 8088, 19888, 50070, 50075, 8042 (YARN Prod Cluster)
** 8089, 19889, 50071, 50076, 8043 (YARN Test Cluster)
** 9090 (Presto Cluster)
** 9090 (Trino Cluster)

=== Development Environment

@@ -108,7 +108,7 @@ Sometimes if you click a link in the UI and it doesn't work try swapping in loca
| `http://localhost:8043`
|===

.Presto Interfaces
.Trino Interfaces
|===
|Endpoint
|URL
@@ -148,11 +148,6 @@ Sometimes if you click a link in the UI and it doesn't work try swapping in loca
|`./run_hdfs_job.py {sla\|test}`
|Runs a `dfs -ls` on the input directory on HDFS and stores results in stdout

|Presto
|`./run_presto_job.py`
|Sends query (`select * from tpcds.sf1.item limit 100;`) as attachment file to Presto cluster and dumps results to
stdout

|Spark Shell
|`./run_spark_shell_job.py {sla\|test}`
|Simply prints the Spark Shell help output to stdout
@@ -165,6 +160,11 @@ stdout
|`./run_spark_submit_job.py {sla\|test} 3.0.0`
|Runs the SparkPi example for Spark 3.0.x with input of 10. Results stored in stdout

|Trino
|`./run_trino_job.py`
|Sends query (`select * from tpcds.sf1.item limit 100;`) as attachment file to Trino cluster and dumps results to
stdout

|YARN
|`./run_yarn_job.py {sla\|test}`
|Lists all yarn applications from the resource manager into stdout
@@ -186,7 +186,7 @@ stdout
.... https://hub.docker.com/r/netflixoss/genie-demo-apache[netflixoss/genie-demo-apache:{revnumber}]
.... https://hub.docker.com/r/netflixoss/genie-client[netflixoss/genie-demo-client:{revnumber}]
.... https://hub.docker.com/r/sequenceiq/hadoop-docker[sequenceiq/hadoop-docker:2.7.1]
.... https://hub.docker.com/r/prestosql/presto[prestosql/presto:337]
.... https://hub.docker.com/r/trinodb/trino[trinodb/trino:374]
... This will use docker compose to bring up 6 containers
.... `genie_demo_app_{revnumber}`
..... Instantiation of `netflixoss/genie-app:{revnumber}`
@@ -202,9 +202,9 @@ stdout
..... Instantiations of `sequenceiq/hadoop-docker:2.7.1`
..... Simulates having two clusters available and registered with Genie with roles as a production and a test cluster
..... See `Hadoop Interfaces` table for list of available ports
.... `genie_demo_presto_{revnumber}`
..... Instantiation of `prestosql/presto:337`
..... Single node Presto cluster
.... `genie_demo_trino_{revnumber}`
..... Instantiation of `trinodb/trino:374`
..... Single node Trino cluster
..... Web UI bound to `localhost` port `9090`
. Wait for all services to start
.. Verify Genie UI and both Resource Manager UI's are available via your browser
@@ -228,10 +228,10 @@ stdout
... `./run_yarn_job.py test`
... `./run_hdfs_job.py test`
... `./run_spark_submit_job.py sla 2.1.3`
... `./run_presto_job.py`
... `./run_trino_job.py`
.. Replace `test` with `sla` to run the jobs against the Prod cluster
.. If any of the Docker containers crash, you may need to increase the default memory available in the Docker preferences.
The current default for a fresh install is 2GB, which is not sufficient for this demo.
The current default for a fresh installation is 2GB, which is not sufficient for this demo.
Use `docker stats`
to verify the limit is 4GB or higher.
. For each of these jobs you can see their status, output and other information via the UI's
@@ -266,7 +266,7 @@ Move the `sched:sla` tag back
. Verify the agent can connect to the local Genie server
.. `java -jar /usr/local/bin/genie-agent.jar ping --serverHost localhost --serverPort 9090`
. Launch a Genie job, similar to the ones above
.. `java -jar /usr/local/bin/genie-agent.jar exec --serverHost localhost --serverPort 9090 --jobName 'Genie Demo CLI Presto Job' --commandCriterion 'TAGS=type:presto' --clusterCriterion 'TAGS=sched:adhoc,type:presto' -- --execute 'select * from tpcds.sf1.item limit 100;'`
.. `java -jar /usr/local/bin/genie-agent.jar exec --serverHost localhost --serverPort 9090 --jobName 'Genie Demo CLI Trino Job' --commandCriterion 'TAGS=type:trino' --clusterCriterion 'TAGS=sched:adhoc,type:trino' -- --execute 'select * from tpcds.sf1.item limit 100;'`
.. `java -jar /usr/local/bin/genie-agent.jar exec --serverHost localhost --serverPort 9090 --jobName 'Genie Demo CLI Spark Shell Interactive Job' --commandCriterion 'TAGS=type:spark-shell' --clusterCriterion 'TAGS=sched:sla,type:yarn' --interactive`
... This starts an interactive Spark shell. Hit `ctrl-d` to exit gracefully
. In the http://localhost:8080[Genie UI], explore the two jobs
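
The long `genie-agent.jar exec` invocation in the step above is easier to read when assembled from its parts. The following is a minimal sketch, not part of the demo itself: the helper name `build_agent_exec` is hypothetical, and only the flags already shown in the documentation (`--serverHost`, `--serverPort`, `--jobName`, `--commandCriterion`, `--clusterCriterion`) are used.

```python
import shlex
from typing import List


def build_agent_exec(
    job_name: str,
    command_tags: List[str],
    cluster_tags: List[str],
    job_args: List[str],
    host: str = "localhost",
    port: int = 9090,
    jar: str = "/usr/local/bin/genie-agent.jar",
) -> List[str]:
    """Assemble the argument list for a non-interactive agent `exec` call.

    Hypothetical helper: it only mirrors the command line from the demo docs.
    """
    argv = [
        "java", "-jar", jar, "exec",
        "--serverHost", host,
        "--serverPort", str(port),
        "--jobName", job_name,
        # Criteria are passed as a single comma-separated TAGS=... value.
        "--commandCriterion", "TAGS=" + ",".join(command_tags),
        "--clusterCriterion", "TAGS=" + ",".join(cluster_tags),
    ]
    if job_args:
        # Everything after `--` is handed to the selected command (the Trino CLI here).
        argv += ["--", *job_args]
    return argv


trino_argv = build_agent_exec(
    job_name="Genie Demo CLI Trino Job",
    command_tags=["type:trino"],
    cluster_tags=["sched:adhoc", "type:trino"],
    job_args=["--execute", "select * from tpcds.sf1.item limit 100;"],
)
# Print a shell-quoted form that can be pasted into the client container.
print(shlex.join(trino_argv))
```

Keeping the arguments as a list avoids shell-quoting mistakes around the job name and the SQL string; `shlex.join` is only used to display the command.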
4 changes: 2 additions & 2 deletions genie-demo/src/main/docker/apache/Dockerfile
@@ -6,8 +6,8 @@ MAINTAINER NetflixOSS <[email protected]>
# and Travis still is on too old a version of docker to enable BuildKit
RUN mkdir -p /usr/local/apache2/htdocs/applications/hadoop/2.7.1/ && \
wget -P /usr/local/apache2/htdocs/applications/hadoop/2.7.1/ https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz && \
mkdir -p /usr/local/apache2/htdocs/applications/presto/337/ && \
wget -P /usr/local/apache2/htdocs/applications/presto/337/ https://repo1.maven.org/maven2/io/prestosql/presto-cli/337/presto-cli-337-executable.jar && \
mkdir -p /usr/local/apache2/htdocs/applications/trino/374/ && \
wget -P /usr/local/apache2/htdocs/applications/trino/374/ https://repo1.maven.org/maven2/io/trino/trino-cli/374/trino-cli-374-executable.jar && \
mkdir -p /usr/local/apache2/htdocs/applications/spark/2.0.1/ && \
wget -P /usr/local/apache2/htdocs/applications/spark/2.0.1/ https://archive.apache.org/dist/spark/spark-2.0.1/spark-2.0.1-bin-hadoop2.7.tgz && \
mkdir -p /usr/local/apache2/htdocs/applications/spark/2.1.3/ && \
@@ -4,10 +4,10 @@ set -o errexit -o nounset -o pipefail

START_DIR=`pwd`
cd `dirname ${BASH_SOURCE[0]}`
PRESTO_BASE=`pwd`
cd ${START_DIR}
TRINO_BASE=`pwd`
cd "${START_DIR}"

chmod 755 ${PRESTO_BASE}/dependencies/presto-cli-337-executable.jar
chmod 755 "${TRINO_BASE}"/dependencies/trino-cli-374-executable.jar

# Set the CLI path for the commands to use when they invoke Trino using this Application
export PRESTO_CLI="${PRESTO_BASE}/dependencies/presto-cli-337-executable.jar"
export TRINO_CLI="${TRINO_BASE}/dependencies/trino-cli-374-executable.jar"

This file was deleted.

@@ -0,0 +1,13 @@
name: trino
user: genieDemo
status: ACTIVE
description: Trino CLI
setupFile: http://genie-apache/applications/trino/374/setup.sh
configs: []
version: 374
type: trino
tags:
- type:trino
- ver:374
dependencies:
- http://genie-apache/applications/trino/374/trino-cli-374-executable.jar
9 changes: 0 additions & 9 deletions genie-demo/src/main/docker/client/example/clusters/presto.yml

This file was deleted.

9 changes: 9 additions & 0 deletions genie-demo/src/main/docker/client/example/clusters/trino.yml
@@ -0,0 +1,9 @@
name: GenieDemoTrino
user: genieDemo
version: 374
status: UP
tags:
- sched:adhoc
- type:trino
- ver:374
configs: []
15 changes: 0 additions & 15 deletions genie-demo/src/main/docker/client/example/commands/presto337.yml

This file was deleted.

15 changes: 15 additions & 0 deletions genie-demo/src/main/docker/client/example/commands/trino374.yml
@@ -0,0 +1,15 @@
name: trino
user: genieDemo
description: Trino CLI
status: ACTIVE
setupFile:
configs: []
executable: ${TRINO_CLI} --server genie-trino:8080
version: 374
tags:
- type:trino
- ver:374
checkDelay: 500
clusterCriteria:
- tags:
- type:trino
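
When a resource is renamed and bumped in the same change, the `version` field and the `ver:` tag are easy to leave out of step. The sketch below is one possible sanity check, assuming the YAML has already been loaded into a plain dict (for example by the demo's `load_yaml`); the function name `version_tag_mismatches` and the sample dicts are hypothetical.

```python
from typing import List


def version_tag_mismatches(resource: dict) -> List[str]:
    """Return a message for every `ver:` tag that disagrees with `version`."""
    version = str(resource.get("version"))
    problems = []
    for tag in resource.get("tags") or []:
        key, _, value = tag.partition(":")
        if key == "ver" and value != version:
            problems.append(
                f"{resource.get('name')}: version {version} does not match tag {tag}"
            )
    return problems


# Hypothetical resources, shaped like the demo's application/command/cluster files.
consistent = {"name": "trino", "version": 374, "tags": ["type:trino", "ver:374"]}
stale = {"name": "trino", "version": 337, "tags": ["type:trino", "ver:374"]}

for resource in (consistent, stale):
    for problem in version_tag_mismatches(resource):
        print(problem)
```

A check like this could run over each loaded file before the resources are registered with Genie, so a stale version number is reported early rather than surfacing as a confusing tag lookup.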
16 changes: 8 additions & 8 deletions genie-demo/src/main/docker/client/example/init_demo.py
@@ -74,14 +74,14 @@ def create_spark_version(genie_client: Genie, version: str, hadoop_app_id: str)
genie.set_application_for_command(yarn_command_id, [hadoop_application_id])
LOGGER.info(f"Set applications for Yarn command to = {hadoop_application_id}")

presto_application_id: str = genie.create_application(load_yaml("applications/presto337.yml"))
LOGGER.info(f"Created Presto 337 application with id = {presto_application_id}")
trino_application_id: str = genie.create_application(load_yaml("applications/trino374.yml"))
LOGGER.info(f"Created Trino 374 application with id = {trino_application_id}")

presto_command_id: str = genie.create_command(load_yaml("commands/presto337.yml"))
LOGGER.info(f"Created Presto 337 command with id = {presto_command_id}")
trino_command_id: str = genie.create_command(load_yaml("commands/trino374.yml"))
LOGGER.info(f"Created Trino 374 command with id = {trino_command_id}")

genie.set_application_for_command(presto_command_id, [presto_application_id])
LOGGER.info(f"Set applications for presto command to = {presto_application_id}")
genie.set_application_for_command(trino_command_id, [trino_application_id])
LOGGER.info(f"Set applications for Trino command to = {trino_application_id}")

create_spark_version(genie, "201", hadoop_application_id)
create_spark_version(genie, "213", hadoop_application_id)
@@ -96,5 +96,5 @@ def create_spark_version(genie_client: Genie, version: str, hadoop_app_id: str)
test_cluster_id = genie.create_cluster(load_yaml("clusters/test.yml"))
LOGGER.info(f"Created test yarn cluster with id = {test_cluster_id}")

presto_cluster_id = genie.create_cluster(load_yaml("clusters/presto.yml"))
LOGGER.info(f"Created presto cluster with id = {presto_cluster_id}")
trino_cluster_id = genie.create_cluster(load_yaml("clusters/trino.yml"))
LOGGER.info(f"Created Trino cluster with id = {trino_cluster_id}")
@@ -29,25 +29,25 @@
pygenie.conf.DEFAULT_GENIE_URL = "http://genie:8080"

# Create a job instance and fill in the required parameters
# TODO: The Presto executable ends up executing in a different working directory and can't find the script file
# TODO: The Trino executable ends up executing in a different working directory and can't find the script file
# too tired to fix it right now so just go back to using the --execute for now
# job = pygenie.jobs.PrestoJob() \
# .job_name("Genie Demo Presto Job") \
# .job_name("Genie Demo Trino Job") \
# .genie_username("root") \
# .job_version("3.0.0") \
# .script("select * from tpcds.sf1.item limit 100;")

job = pygenie.jobs.PrestoJob() \
.job_name("Genie Demo Presto Job") \
.job_name("Genie Demo Trino Job") \
.genie_username("root") \
.job_version("3.0.0") \
.command_arguments("--execute \"select * from tpcds.sf1.item limit 100;\"")

# Set cluster criteria which determine the cluster to run the job on
job.cluster_tags(["sched:adhoc", "type:presto"])
job.cluster_tags(["sched:adhoc", "type:trino"])

# Set command criteria which will determine what command Genie executes for the job
job.command_tags(["type:presto"])
job.command_tags(["type:trino"])

# Submit the job to Genie
running_job = job.execute()
8 changes: 4 additions & 4 deletions genie-demo/src/main/docker/docker-compose.yml
@@ -8,7 +8,7 @@ services:
- genie-hadoop-prod
- genie-hadoop-test
- genie-apache
- genie-presto
- genie-trino
tty: true
container_name: genie_demo_app_${GENIE_VERSION}
genie-apache:
@@ -43,9 +43,9 @@ services:
- "8043:8042"
tty: true
container_name: genie_demo_hadoop_test_${GENIE_VERSION}
genie-presto:
image: prestosql/presto:337
genie-trino:
image: trinodb/trino:374
ports:
- "9090:8080"
tty: true
container_name: genie_demo_presto_${GENIE_VERSION}
container_name: genie_demo_trino_${GENIE_VERSION}
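
The compose file publishes the Trino container's port 8080 on host port 9090, and the demo instructions ask the reader to wait for all services to start. A minimal sketch of that wait, using only the Python standard library, is shown here; the function name `wait_for_port` and its default timings are assumptions for illustration and are not part of the demo.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that the port is open,
            # not that the service behind it has finished initializing.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 9090)` could be called after `docker compose up` to check the Trino web UI port, and `wait_for_port("localhost", 8080)` for the Genie port listed in the documentation.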
