The data used for this project comes from Kaggle. It is saved on my system as MCVA.csv, under the same name shown in the repo. The data can be downloaded from:
https://www.kaggle.com/pankajjsh06/ibm-watson-marketing-customer-value-data

Sample rows:

State | Customer Lifetime Value | Response | Coverage | Education | Effective To Date | Employment | Gender | Marital Status | Number of Policies | Policy | Sales Channel | Total Claim | Vehicle Class | Vehicle Size |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Washington | 2763.519 | No | Basic | Bachelor | 2/24/2011 | Employed | F | Married | 1 | Corporate L3 | Agent | 384.8111 | Two-Door Car | Medsize |
Arizona | 6979.536 | No | Extended | Bachelor | 1/31/2011 | Unemployed | F | Single | 8 | Personal L3 | Agent | 1131.465 | Four-Door Car | Medsize |

Once the data and the Scala scripts are downloaded into the same directory, run loadproject.scala in the Spark shell with the command:

    :load loadproject.scala
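
Before running the full script, it can help to check that Spark can read the CSV at all. The snippet below is only a minimal sketch and is not the contents of loadproject.scala. It assumes Spark 2.x or later, that MCVA.csv sits in the directory the Spark shell was started from, and that the file has a header row as in the sample above. The app name `mcva-preview` is a made-up placeholder.

```scala
import org.apache.spark.sql.SparkSession

// Inside spark-shell, getOrCreate() returns the session the shell already started.
// "mcva-preview" is a hypothetical app name; local[*] assumes local mode.
val spark = SparkSession.builder().appName("mcva-preview").master("local[*]").getOrCreate()

// Assumption: MCVA.csv is in the current working directory and has a header row.
// inferSchema makes Spark scan the file to guess the column types.
val df = spark.read.option("header", "true").option("inferSchema", "true").csv("MCVA.csv")

// Print the inferred column names and types, then the first two rows untruncated.
df.printSchema()
df.show(2, truncate = false)
```

If the schema lists columns such as State and Customer Lifetime Value, the file is in the right place and loadproject.scala should be able to find it.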