The objective of this project is to develop a MapReduce job that reports, for each department, the details of the male employee and of the female employee with the highest CTC (cost to company).
The input data is tab-separated, with the following fields:
empid name age gender dept CTC
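
As a rough illustration of the record layout above, the sketch below parses one tab-separated line into a small holder object. The class name `EmployeeRecord`, the field types (integer age, numeric CTC) and the sample values in the usage note are assumptions made for illustration; they are not taken from the project's data set.

```java
// Sketch only: a plain-Java holder for one input record.
// The class name and the field types are assumptions, not part of the project.
final class EmployeeRecord {
    final String empId;
    final String name;
    final int age;
    final String gender;
    final String dept;
    final double ctc;

    private EmployeeRecord(String empId, String name, int age,
                           String gender, String dept, double ctc) {
        this.empId = empId;
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.dept = dept;
        this.ctc = ctc;
    }

    // Parses one line with the fields: empid, name, age, gender, dept, CTC.
    static EmployeeRecord parse(String line) {
        // The -1 limit keeps trailing empty fields so the count check is reliable.
        String[] f = line.split("\t", -1);
        if (f.length != 6) {
            throw new IllegalArgumentException(
                "expected 6 tab-separated fields, got " + f.length);
        }
        return new EmployeeRecord(
            f[0].trim(), f[1].trim(), Integer.parseInt(f[2].trim()),
            f[3].trim(), f[4].trim(), Double.parseDouble(f[5].trim()));
    }
}
```

For example, `EmployeeRecord.parse("1001\tAsha\t29\tF\tIT\t850000")` would yield a record whose `dept` is `IT`. The sample line is made up, and the real data may encode gender differently (for instance `Female` instead of `F`).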
- Use Ubuntu or any other flavour of Linux with the following installed:
  - Hadoop 2.x or above
  - JDK
The idea here is to write a custom partitioner class that partitions the data by gender before it is sent to the reducers.
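
The partitioning idea can be sketched in plain Java without any Hadoop dependency. This is a simplified sketch, not the project's actual implementation. It assumes the mapper emits the department as the key and the full tab-separated record as the value, that the job runs with two reducers, and that gender values starting with `F` mean female. The class name `GenderPartitionSketch` and both method names are hypothetical.

```java
// Plain-Java sketch of the partition and reduce logic; it does not depend on Hadoop.
// In a real job, getPartition would live in a subclass of
// org.apache.hadoop.mapreduce.Partitioner<Text, Text>.
final class GenderPartitionSketch {

    // Assumed map output: key = dept, value = the full tab-separated record.
    // Assumed gender encoding: values starting with "F" or "f" are female;
    // anything else is treated as male.
    static int getPartition(String deptKey, String recordValue, int numPartitions) {
        String gender = recordValue.split("\t", -1)[3].trim();
        int partition = gender.toUpperCase().startsWith("F") ? 0 : 1;
        // With a single reducer, every record falls into partition 0.
        return partition % numPartitions;
    }

    // Reduce step for one department within one partition:
    // keep the record with the highest CTC (sixth field).
    static String highestCtc(Iterable<String> records) {
        String best = null;
        double bestCtc = Double.NEGATIVE_INFINITY;
        for (String record : records) {
            double ctc = Double.parseDouble(record.split("\t", -1)[5].trim());
            if (ctc > bestCtc) {
                bestCtc = ctc;
                best = record;
            }
        }
        return best; // null when no records were supplied
    }
}
```

Because each reducer receives records of only one gender, grouped by department, taking the maximum CTC per key yields the top female and top male employee of every department in separate output files. In an actual Hadoop job the partitioner would be registered with `job.setPartitionerClass(...)` and the reducer count set with `job.setNumReduceTasks(2)`.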