Skip to content

ItssPrachi/ResumeAnalyser

Repository files navigation

##Step-by-Step Guide

  1. Setting Up the Environment Definition: Environment setup involves creating an isolated Python environment and installing necessary dependencies to ensure project consistency and reproducibility.

###Detailed Explanation:

A virtual environment is a self-contained directory that contains a Python installation for a particular version of Python, plus a number of additional packages. Using a virtual environment allows you to manage project-specific dependencies, regardless of what dependencies every other project has.

-python3 -m venv venv
-source venv/bin/activate
-pip install -r requirements.txt

-python3 -m venv venv: This command creates a new virtual environment named "venv" in the current directory. -source venv/bin/activate: This activates the virtual environment. On Windows, use venv\Scripts\activate. -pip install -r requirements.txt: This installs all the Python packages listed in the requirements.txt file.

#The requirements.txt file should include all necessary libraries, such as:

-PyMuPDF==1.19.0
-nltk==3.6.2
-spacy==3.1.0
-scikit-learn==0.24.2
-pandas==1.3.0
-matplotlib==3.4.2
-seaborn==0.11.1

##Summary of Project Components -Project Structure: A well-organized directory layout for data, code, and documentation.

-Environment Setup: Instructions for creating a virtual environment and installing dependencies.

-Data Extraction: Techniques for extracting text from PDF resumes using libraries like PyMuPDF.

-Data Preprocessing: NLP methods such as tokenization, stopword removal, and lemmatization using spaCy and NLTK.

-Exploratory Data Analysis (EDA): Visualizations of data distribution, word frequencies, and text length using matplotlib and seaborn.

-Feature Extraction: Conversion of text data to numerical features using TF-IDF vectorization.

-Model Training: Implementation of multiple classification algorithms (Naive Bayes, SVM, Random Forest) using scikit-learn.

-Visualization: Advanced visualizations for model evaluation, including confusion matrices, ROC curves, and feature importance plots.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published