Below is a list of all labs and exercises, along with a suggested schedule. Labs that have a description include an exercise.
- Day1
- 1 Course Introduction
- 2 Introduction to Big Data
- 3 Introduction to Scala
- Lab-3.1 - Introduction to Scala
- 4 Hadoop and Spark Introduction
- Day2
- 5 RDD API
- Lab-5.1 - RDD Transformations and Actions
- Lab-5.2 - RDD Word Frequency
- Word frequency count with spark-intro.txt
- Lab-5.3 - RDD Cities
- Read cities.csv, perform analysis and write to grid
- Lab-5.4 - RDD Pair RDDs
- Read data_transactions.txt as PairRDDs and perform analysis
- 6 Structured API and DataFrame
- Lab-6.1 - DataFrame Loading and Saving Data
- Work with different file types: CSV, JSON and Parquet
- Lab-6.2 - DataFrame Operations
- Work with query-like methods integrated with DataFrames
- Lab-6.3 - DataFrame Using SQL
- Use Spark SQL on a DataFrame
- Lab-6.4 - DataFrame Operations and SQL Usage
- Basic DataFrame operations using video-games-sales.csv
- Lab-6.5 - DataFrame Save and Load DataFrame from Grid
- Build and run the example project
- 7 Architectural Approaches
- 8 InsightEdge
- Lab-8.1 - SQL Query Benchmark Spark vs InsightEdge
- 9 Administration and Deployment
- Day3
- 10 Structured Streaming
- Lab-10.1 - Structured Streaming
- Initialize stream and perform computation
- 11 Microbatch Streaming
- Lab-11.1 - Save Stream to Grid
- From Twitter to Grid project
- Lab-11.2 - Kafka and Geospatial
- Instructions and links to the GitHub demo
- Lab-11.3 - Spark Streaming Examples
- 12 Machine Learning
- Lab-12.1 - Machine Learning
- Machine learning
- Day4
- 13 GraphX
- Lab-13.1 - GraphX Creation Structural Operators
- Lab-13.2 - GraphX Connected Components
- Find connected components using GraphX
- Lab-13.3 - GraphX Neighbourhood Aggregation
- Perform neighborhood aggregation
- Lab-13.4 - GraphX Airline Demo
- 14 Event Containers
- 15 MemoryXtend
- 16 Kubernetes
- Lab-16.1 - Kubernetes
- Deploy InsightEdge on minikube
- Day5
- 17 Flight Delay Demo
- Lab-17.1 - Flight Delay Demo
- Run the example project and the notebook
- 18 Jupyter Notebook
- Lab-18.1 - Jupyter Demo
- 19 Tableau Integration
- Lab-19.1 - Tableau setup
- 20 Data Lake Acceleration
- Lab-20.1 - Query Speed Batch
- Exam