This project is the final exam project for the Big Data course, 2019/2020. The goal is to implement two jobs (one MapReduce, one Spark) on a Hadoop cluster.
The dataset is illustrated here.
Both jobs are based on the same query:
- Rank the best airlines by a KPI defined as the ratio between the total flight delay in minutes and the distance in km (total delay minutes / distance km)
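
The KPI in the query above can be sketched in plain Python before moving to MapReduce or Spark. This is only an illustration under stated assumptions, not the project's actual job code: the function name `rank_airlines`, the `(airline, delay_minutes, distance_km)` record layout, and the sample values are all hypothetical. It also assumes that delays and distances are summed per airline before dividing, and that a lower KPI means a better airline.

```python
from collections import defaultdict


def rank_airlines(flights):
    """Rank airlines by total delay minutes / total distance km.

    `flights` is an iterable of (airline, delay_minutes, distance_km)
    tuples; this record layout is an assumption made for the sketch.
    """
    # "Map + reduce" step: accumulate per-airline totals.
    totals = defaultdict(lambda: [0.0, 0.0])
    for airline, delay_minutes, distance_km in flights:
        totals[airline][0] += delay_minutes
        totals[airline][1] += distance_km

    # KPI step: skip airlines with no recorded distance to avoid
    # dividing by zero.
    kpis = {
        airline: delay / distance
        for airline, (delay, distance) in totals.items()
        if distance > 0
    }

    # Assumption: fewer delay minutes per km is better, so sort ascending.
    return sorted(kpis.items(), key=lambda item: item[1])


# Hypothetical sample records, not taken from the real dataset.
sample = [
    ("AA", 30, 1000),
    ("AA", 10, 1000),
    ("DL", 5, 500),
    ("UA", 60, 1000),
]
ranking = rank_airlines(sample)
for airline, kpi in ranking:
    print(airline, kpi)
```

In a MapReduce job the accumulation loop would roughly correspond to a mapper emitting the airline as key and a reducer summing both values, while the final sort would need a second stage or a single reducer.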