Our main idea is to take the image URL and text from a user's profile and pass them through an image-analysis model and a text-analysis model to estimate the probability that the person exhibits each of the personality traits of the Big Five model, namely:
- Extraversion
- Agreeableness
- Conscientiousness
- Neuroticism
- Openness
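As a rough illustration of what a per-trait result could look like, the sketch below holds one score per Big Five trait for each model and averages them. The averaging rule, the `combine_scores` helper, and the sample numbers are assumptions made for illustration only; this README does not state how (or whether) the two models' outputs are merged.

```python
from statistics import mean

# The five traits listed above, in the same order.
TRAITS = ["Extraversion", "Agreeableness", "Conscientiousness", "Neuroticism", "Openness"]


def combine_scores(image_scores, text_scores):
    """Average the per-trait scores of the two models.

    Hypothetical fusion rule: a plain mean is assumed here, not taken
    from the project code.
    """
    return {trait: mean([image_scores[trait], text_scores[trait]]) for trait in TRAITS}


# Made-up scores in [0, 1], one per trait, for each model.
image_scores = dict(zip(TRAITS, [0.75, 0.5, 0.25, 0.5, 1.0]))
text_scores = dict(zip(TRAITS, [0.25, 0.5, 0.75, 0.0, 0.5]))

combined = combine_scores(image_scores, text_scores)
for trait, score in combined.items():
    print(f"{trait}: {score:.2f}")
```

A weighted mean would be a natural variation, given that the two models are reported with different accuracies.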
For the text-analysis part, we use a pretrained BERT model for preprocessing and a smaller BERT model from tensorflow_hub, trained with mean squared loss, BinaryCrossentropy loss, and the AdamW optimizer. The overall accuracy of our model for predicting personality traits is about 50%.
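To make the two loss names concrete, here is a small pure-Python sketch of mean squared error and binary cross-entropy computed over five trait outputs. It is only a worked example of the standard formulas; the function names, the `eps` clipping value, and the sample targets are assumptions, and the notebooks themselves use the TensorFlow implementations.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical targets for the five traits and a maximally unsure prediction.
y_true = [1.0, 0.0, 1.0, 0.0, 1.0]
y_pred = [0.5, 0.5, 0.5, 0.5, 0.5]

mse = mean_squared_error(y_true, y_pred)   # every error is 0.5, so MSE = 0.25
bce = binary_crossentropy(y_true, y_pred)  # equals ln(2) for p = 0.5 everywhere
print(f"MSE = {mse:.4f}, BCE = {bce:.4f}")
```

Mean squared error treats each trait score as a regression target, while binary cross-entropy treats it as a probability, which is why the choice matters for how the output is read.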
Model weights link: https://drive.google.com/drive/folders/1U1fuZryj3R4rj6E7Umwbt9qefjMckCjA?usp=sharing
For the image-analysis part, we use a custom convolutional network trained with mean squared loss (a regression loss) and the Adam optimizer. The overall accuracy of our model for predicting personality traits is about 80%.
Dataset pickle files link: https://drive.google.com/drive/folders/1U1fuZryj3R4rj6E7Umwbt9qefjMckCjA?usp=sharing
The training code for the image-analysis part is in the personality_image.ipynb file; you can open it directly in Colab and run it.
- Run pip install -r requirements.txt
- Run python configurator.py; it will prompt you for your LinkedIn username (email) and password
- Download the weights from https://drive.google.com/drive/folders/1r8rIBKtnzF91gaxSdjFyYLn3Du4hpiiE?usp=sharing and store them in the project directory
- Run test.py as python test.py -u [linkedin user url]. Example: python test.py -u https://www.linkedin.com/in/sahil-nare-b96694179/
- Run text_analsysis.pynb on Colab
- Load the model weights
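For readers curious how a command such as python test.py -u [linkedin user url] might be handled, the following is a minimal sketch using the standard argparse module. It is not the actual contents of test.py: the `build_parser` and `is_linkedin_profile` helpers and the URL check are hypothetical, and only the `-u` flag and the example URL come from the steps above.

```python
import argparse
from urllib.parse import urlparse


def build_parser():
    """Parser with the single -u option used in the steps above."""
    parser = argparse.ArgumentParser(description="Predict personality traits for a profile")
    parser.add_argument("-u", dest="url", required=True, help="LinkedIn profile URL")
    return parser


def is_linkedin_profile(url):
    """Hypothetical sanity check: host is linkedin.com and the path starts with /in/."""
    parts = urlparse(url)
    return parts.netloc in ("linkedin.com", "www.linkedin.com") and parts.path.startswith("/in/")


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-u", "https://www.linkedin.com/in/sahil-nare-b96694179/"])
valid = is_linkedin_profile(args.url)
print(args.url, "looks valid" if valid else "is not a LinkedIn profile URL")
```

Checking the URL before scraping gives an early, readable error instead of a failure deep inside the login or download step.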
Just run sentiment_analysis.ipynb. Training data link: https://www.kaggle.com/cosmos98/twitter-and-reddit-sentimental-analysis-dataset
Work in progress: model saving and inference code will be added soon, and this will be automated with a Twitter scraper that scrapes users' comments on Twitter and gives the sentiment of each comment.