Table of Contents

Below is a listing of all labs and exercises, along with a suggested schedule. Labs that have a numbered description include an exercise.

  • Day1
    • 1 Course Introduction
    • 2 Introduction to Big Data
    • 3 Introduction to Scala
      • Lab-3.1 - Introduction to Scala
    • 4 Hadoop and Spark Introduction
  • Day2
    • 5 RDD API
      • Lab-5.1 - RDD Transformations and Actions
      • Lab-5.2 - RDD Word Frequency
        1. Word frequency count with spark-intro.txt
      • Lab-5.3 - RDD Cities
        1. Read cities.csv, perform analysis and write to grid
      • Lab-5.4 - RDD Pair RDDs
        1. Read data_transactions.txt as PairRDDs and perform analysis
    • 6 Structured API and DataFrame
      • Lab-6.1 - DataFrame Loading and Saving Data
        1. Work with different file types: csv, json and parquet
      • Lab-6.2 - DataFrame Operations
        1. Work with query-like methods integrated with DataFrames
      • Lab-6.3 - DataFrame Using SQL
        1. Use Spark SQL on a DataFrame
      • Lab-6.4 - DataFrame Operations and SQL Usage
        1. Basic DataFrame operations using video-games-sales.csv
      • Lab-6.5 - DataFrame Save and Load DataFrame from Grid
        1. Build and run example project
    • 7 Architectural Approaches
    • 8 InsightEdge
      • Lab-8.1 - SQL Query Benchmark Spark vs InsightEdge
    • 9 Administration and Deployment
  • Day3
    • 10 Structured Streaming
      • Lab-10.1 - Structured Streaming
        1. Initialize stream and perform computation
    • 11 Microbatch Streaming
      • Lab-11.1 - Save Stream to Grid
        1. From Twitter to Grid project
      • Lab-11.2 - Kafka and Geospatial
        1. Instructions and links to GitHub demo
      • Lab-11.3 - Spark Streaming Examples
    • 12 Machine Learning
      • Lab-12.1 - Machine Learning
        1. Machine learning
  • Day4
    • 13 GraphX
      • Lab-13.1 - GraphX Creation Structural Operators
      • Lab-13.2 - GraphX Connected Components
        1. Find connected components using GraphX
      • Lab-13.3 - GraphX Neighbourhood Aggregation
        1. Perform neighborhood aggregation
      • Lab-13.4 - GraphX Airline Demo
    • 14 Event Containers
    • 15 MemoryXtend
    • 16 Kubernetes
      • Lab-16.1 - Kubernetes
        1. Deploy InsightEdge on minikube
  • Day5
    • 17 Flight Delay Demo
      • Lab-17.1 - Flight Delay Demo
        1. Run the example project and the notebook
    • 18 Jupyter Notebook
      • Lab-18.1 - Jupyter Demo
    • 19 Tableau Integration
      • Lab-19.1 - Tableau setup
    • 20 Data Lake Acceleration
      • Lab-20.1 - Query Speed Batch
    • Exam