NLP Emotion Classification Project App
Welcome to the NLP Emotion Classification project! This project focuses on classifying text data into six distinct emotion categories using various machine learning models. Below you'll find detailed information on how to set up the project, use the provided CLI tool, understand the project's overall functionality and results, and explore the Streamlit web application.
- Project Overview
- Dataset
- Preprocessing
- Models Used
- Results
- CLI Tool
- Streamlit Web Application
- Setup and Installation
- Contact
This project classifies text into six emotions (sadness, joy, love, anger, fear, and surprise) using machine learning techniques. The dataset is sourced from Kaggle, and each text entry is labeled with one of these six categories.
- Source: Kaggle Emotion Dataset
- Size: 416,808 entries
- Features: Text data and corresponding emotion labels
Label | Emotion |
---|---|
0 | Sadness |
1 | Joy |
2 | Love |
3 | Anger |
4 | Fear |
5 | Surprise |
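In code, the mapping above amounts to a small lookup table. The following is a minimal sketch; the names `ID_TO_EMOTION`, `EMOTION_TO_ID`, and `decode_label` are illustrative and not taken from the repository.

```python
# Label ids and emotion names as listed in the table above.
ID_TO_EMOTION = {
    0: "sadness",
    1: "joy",
    2: "love",
    3: "anger",
    4: "fear",
    5: "surprise",
}

# Reverse lookup, e.g. for encoding string labels before training.
EMOTION_TO_ID = {name: idx for idx, name in ID_TO_EMOTION.items()}


def decode_label(label_id: int) -> str:
    """Return the emotion name for a numeric label (hypothetical helper)."""
    try:
        return ID_TO_EMOTION[label_id]
    except KeyError:
        raise ValueError(f"unknown label id: {label_id!r}") from None
```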
The preprocessing steps include:
- Text Cleaning: Removing unnecessary characters, punctuation, and URLs, and converting text to lowercase.
- Tokenization: Splitting text into words or tokens for analysis.
- Vectorization: Converting text data into numerical representations using Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF).
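The three steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the project's actual code: the function names, the URL pattern, and the plain `tf * log(N / df)` weighting are all chosen here for clarity. Library vectorizers such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and normalize each vector by default, so their numbers will differ.

```python
import math
import re
import string
from collections import Counter

# Assumed URL pattern: anything starting with http(s):// or www.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Lowercase, strip URLs and ASCII punctuation, collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())


def tokenize(text: str) -> list[str]:
    """Clean the text, then split it on whitespace."""
    return clean_text(text).split()


def bag_of_words(tokens: list[str]) -> Counter:
    """Raw term counts for one document."""
    return Counter(tokens)


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Unsmoothed TF-IDF: tf = count / len(doc), idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

With this weighting, a term that occurs in every document gets a weight of zero, which is the intuition behind TF-IDF: words shared by all texts carry no information about the emotion.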
The project implements several models to predict emotions:
- Naive Bayes:
  - Accuracy: 74%
  - Simple probabilistic model
- Logistic Regression:
  - Accuracy: 89.44%
  - Statistical model for binary outcomes, extended here to six classes
- XGBoost:
  - Accuracy: 89.39%
  - Optimized gradient boosting algorithm
- Custom Neural Network:
  - Accuracy: 88.66%
  - Multi-layer perceptron with Leaky ReLU and Sigmoid activation functions
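For reference, the two activation functions named for the neural network can be written as below. This is a sketch of the standard definitions only. The negative slope of 0.01 is a common default and an assumption here, and the source does not say which layers use which activation.

```python
import math


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Pass positive inputs through; scale negative ones by a small slope."""
    return x if x > 0 else negative_slope * x


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)), branched to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)
```

Unlike plain ReLU, Leaky ReLU keeps a small nonzero gradient for negative inputs, while the sigmoid squashes any real value into the range 0 to 1.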
Model | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
Naive Bayes | 0.74 | 0.79 | 0.74 | 0.69 |
Logistic Regression | 0.8944 | 0.89 | 0.89 | 0.89 |
XGBoost | 0.8939 | 0.90 | 0.89 | 0.90 |
Neural Network | 0.8866 | 0.89 | 0.89 | 0.89 |
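The table does not state how precision, recall, and F1 were averaged across the six classes. Recall matches accuracy in every row, which is consistent with support-weighted averaging, but that is an inference rather than something the source confirms. Under that assumption, the metrics can be computed as in this sketch (`weighted_metrics` is a hypothetical helper name):

```python
from collections import Counter


def weighted_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy plus support-weighted precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    total = len(y_true)
    support = Counter(y_true)    # true examples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        # A class that is never predicted gets precision 0 by convention.
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / n_true
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n_true / total  # each class counts by its share of the data
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

With this kind of averaging, weighted recall always equals accuracy, because each class's recall is weighted by exactly the number of examples it was computed over.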
The CLI (Command-Line Interface) tool allows users to input a sentence and get an emotion prediction using trained machine learning models.
The tool interacts with the following models to predict emotions:
- XGBoost
- Logistic Regression
- Naive Bayes
- Text Preprocessing: Cleans input text.
- Vectorization: Converts text into numerical format.
- Prediction: Uses models to predict emotions.
- User Interaction: Accepts user input and returns predicted emotions.
- Input: "I am so happy and surprised you did this!"
  - XGBoost: Joy
  - Logistic Regression: Joy
  - Naive Bayes: Surprise
- Input: "I am already feeling frantic."
  - XGBoost: Fear
  - Logistic Regression: Fear
  - Naive Bayes: Fear
- Run the script using Python.
- Enter a sentence to get emotion predictions.
- The tool displays predictions from each model.
- Type `exit` to quit the tool.
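The interaction described above boils down to a read-predict-print loop. The sketch below shows one way to structure it. It is not the contents of `cliapp.py`: `run_cli`, the prompt text, and the idea of passing models in as plain callables are all assumptions made for illustration.

```python
from typing import Callable, Dict

# A predictor takes a sentence and returns an emotion name.
Predictor = Callable[[str], str]


def run_cli(
    models: Dict[str, Predictor],
    read_line: Callable[[str], str] = input,
    write_line: Callable[[str], None] = print,
) -> None:
    """Prompt for sentences and print each model's prediction until 'exit'."""
    while True:
        try:
            sentence = read_line("Enter a sentence (or 'exit' to quit): ").strip()
        except EOFError:
            break  # end of input behaves like 'exit'
        if sentence.lower() == "exit":
            break
        if not sentence:
            continue  # ignore blank lines
        for name, predict in models.items():
            write_line(f"{name}: {predict(sentence)}")
```

A real tool would build `models` from the trained XGBoost, Logistic Regression, and Naive Bayes classifiers, each wrapped so that it cleans and vectorizes the sentence before predicting. Passing `read_line` and `write_line` as parameters keeps the loop testable without a terminal.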
The Emotion Classifier app is hosted on Streamlit Cloud. This interactive web application lets users enter text and receive a predicted emotion (sadness, joy, love, anger, fear, or surprise) from the machine learning models.
- Input: Enter a sentence to predict the corresponding emotions.
- Models Used: Naive Bayes, Logistic Regression, and XGBoost.
- Visualization: Displays predictions, model performance comparison, confusion matrices, class distributions, word clouds, and ROC curves.
- Clone the repository:

      git clone https://github.com/parthrastogicoder/NLP_EMOTION_CLASSIFIER.git
      cd NLP_EMOTION_CLASSIFIER

- Install dependencies:

      pip install -r requirements.txt

- Download the dataset from Kaggle.

- Run the CLI tool:

      python cliapp.py

- Explore the Streamlit app:

      streamlit run app.py
For any inquiries or suggestions regarding this project, please contact me at [email protected].