pyspark_learning_stack

Repo to start learning PySpark. Includes a docker-compose setup with a Spark master and workers with GCS connectors, plus an optional standalone Airflow and an inconsistent Jupyter notebook. The intention is to mirror the work done in https://github.com/HybridNeos/comp653_final using Spark DataFrames and possibly Spark ML, with Airflow instead of dbt as the orchestrator.
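
As a rough illustration of how a notebook or Airflow task might attach to the Spark master from the compose stack, here is a minimal sketch. The service name `spark-master` and the helper names `master_url`, `session_conf`, and `build_session` are assumptions made for this example, not taken from this repo. Port 7077 is Spark's default standalone master port, and the `fs.gs.impl` class is the one shipped with the GCS connector; check the compose file for the actual values.

```python
from typing import Dict


def master_url(host: str = "spark-master", port: int = 7077) -> str:
    """Build a standalone-master URL.

    "spark-master" is a hypothetical compose service name;
    7077 is Spark's default standalone master port.
    """
    return f"spark://{host}:{port}"


def session_conf(app_name: str = "pyspark_learning_stack") -> Dict[str, str]:
    """Collect the settings a session might need (assumed values, adjust to the stack)."""
    return {
        "spark.app.name": app_name,
        "spark.master": master_url(),
        # Filesystem class provided by the GCS connector jar.
        "spark.hadoop.fs.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    }


def build_session(conf: Dict[str, str]):
    """Create (or reuse) a SparkSession from a plain dict of settings."""
    # Imported lazily so the helpers above work without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With the stack running, `build_session(session_conf())` would return a session bound to the master. Keeping the settings in a plain dict makes them easy to inspect or override from an Airflow task.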