Sentiment Analysis with Logistic Regression

This repository contains code for sentiment analysis using Logistic Regression on a movie review dataset. The project includes data loading, preprocessing, feature engineering using TF-IDF vectorization, model training, and performance evaluation.

Overview

Sentiment analysis is a natural language processing task that involves classifying text into predefined categories based on the expressed sentiment. In this project, we use a Logistic Regression model to predict the sentiment of movie review phrases.

Dataset

The dataset used for this project consists of two main files:

train.tsv: Training data containing movie review phrases with corresponding sentiment labels.
test.tsv: Test data for evaluating the trained model's performance.
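
As a rough illustration of loading and inspecting these files, the sketch below reads tab-separated data with Pandas and checks for missing values. The column names "Phrase" and "Sentiment", the sample rows, and the helper load_reviews are assumptions made for the example; they are not confirmed by this README. An inline sample stands in for train.tsv so the snippet runs on its own.

```python
import io

import pandas as pd

# Tiny inline stand-in for train.tsv so the sketch runs without the real file.
# The column names and rows here are assumptions, not taken from the dataset.
SAMPLE_TSV = (
    "Phrase\tSentiment\n"
    "a gripping and heartfelt story\t4\n"
    "dull and far too long\t1\n"
    "an average thriller\t2\n"
)


def load_reviews(source):
    """Read a tab-separated review file (path or file-like object) into a DataFrame."""
    return pd.read_csv(source, sep="\t")


# With the real data this would be: load_reviews("train.tsv")
train_df = load_reviews(io.StringIO(SAMPLE_TSV))

print(train_df.shape)                        # number of rows and columns
print(train_df.isnull().sum())               # missing values per column
print(train_df["Sentiment"].value_counts())  # label distribution
```

Passing sep="\t" is what makes read_csv treat the file as TSV; the rest of the exploration is ordinary DataFrame inspection.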

Technologies Used

Python
Pandas: Data manipulation and analysis.
NLTK (Natural Language Toolkit): Text preprocessing, including tokenization, stemming, and stop word removal.
Scikit-Learn: Machine learning library used for TF-IDF vectorization and the Logistic Regression model.
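
To make the NLTK preprocessing named above more concrete, here is a minimal sketch of tokenization, stop word removal, and stemming. The preprocess function, the regular-expression tokenizer, and the small hand-picked stop word set are assumptions chosen so the example needs no corpus download; the notebook may use different NLTK components.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Hand-picked stop words used as a stand-in so the sketch needs no download.
# NLTK's own list is available via nltk.corpus.stopwords.words("english")
# after running nltk.download("stopwords").
STOP_WORDS = {"a", "an", "and", "the", "is", "was", "of", "to", "in", "it", "this"}

tokenizer = RegexpTokenizer(r"[a-z']+")  # keeps runs of lowercase letters and apostrophes
stemmer = PorterStemmer()


def preprocess(phrase):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = tokenizer.tokenize(phrase.lower())
    kept = [tok for tok in tokens if tok not in STOP_WORDS]
    return " ".join(stemmer.stem(tok) for tok in kept)


print(preprocess("The acting was surprisingly moving"))
```

Returning a single space-joined string keeps the output compatible with vectorizers that expect raw text rather than token lists.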

Workflow

Data Loading and Exploration:
- Read and explore the dataset using Pandas.
- Check for missing values and understand the data structure.
Text Preprocessing:
- Tokenize phrases, perform stemming, and remove stop words using NLTK.
Feature Engineering:
- Use TF-IDF (Term Frequency-Inverse Document Frequency) vectorization to convert text data into numerical features.
Model Training:
- Split the dataset into training and validation sets.
- Train a Logistic Regression model on the TF-IDF transformed features.
Model Evaluation:
- Evaluate the model's performance on both training and validation datasets using accuracy scores.
- Optionally, visualize a confusion matrix to analyze prediction errors.
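
The feature engineering, training, and evaluation steps above can be sketched as follows. The phrases and binary labels are made up for illustration, and the parameter choices (test_size, random_state, max_iter) are assumptions rather than the notebook's actual settings. This sketch splits the data before fitting the vectorizer so that validation text does not influence the TF-IDF vocabulary, which is one common way to order these steps.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Made-up phrases and binary labels (1 = positive, 0 = negative), for illustration only.
phrases = [
    "a wonderful heartfelt film", "brilliant acting and great story",
    "truly enjoyable and moving", "a great wonderful experience",
    "boring and far too long", "terrible plot and awful acting",
    "a dull lifeless mess", "awful boring waste of time",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a validation set; stratify keeps both classes in each split.
X_train_text, X_val_text, y_train, y_val = train_test_split(
    phrases, labels, test_size=0.25, random_state=42, stratify=labels
)

# Fit TF-IDF on the training text only, then reuse that vocabulary for validation.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_val = vectorizer.transform(X_val_text)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
val_acc = accuracy_score(y_val, model.predict(X_val))
print(f"train accuracy: {train_acc:.2f}, validation accuracy: {val_acc:.2f}")

# Rows are true labels, columns are predicted labels.
val_cm = confusion_matrix(y_val, model.predict(X_val), labels=[0, 1])
print(val_cm)
```

Comparing training and validation accuracy side by side gives a quick signal of overfitting; the confusion matrix then shows which classes are being mistaken for one another.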

Usage

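Clone the repository and run the provided Jupyter notebook (sentiment_analysis_logistic_regression.ipynb) to reproduce the sentiment analysis task. Python, Jupyter, and the libraries listed under Technologies Used (Pandas, NLTK, Scikit-Learn) must be installed first. The repository's file listing shows the notebook as "DSP Assignment .ipynb", so adjust the filename in the last command if the name above is not found.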

git clone https://github.com/pragati9998/Sentimental-Analysis.git
cd Sentimental-Analysis
jupyter notebook sentiment_analysis_logistic_regression.ipynb
