- Scope of Learning
- Deployed Link and Repo Link
- Ideas
- Vision
- Innovative Ideas Description
- Prerequisites
- LLM (Gen AI)
- Index of Content
- List of Contents
- Contributing
- License
This repository is aimed at providing hands-on learning experiences in the following areas:
- Data Analysis
- Machine Learning
- Deep Learning
- LLM (Gen AI)
Index | Project | Deployed Link | Repository Link | Tools Used |
---|---|---|---|---|
1 | Car Price Prediction | Deployed Link | Repo Link | Streamlit, Scikit-learn, Pandas, NumPy |
2 | Car Price Prediction | Deployed Link | Repo Link | Flask, Scikit-learn, Pandas, NumPy |
3 | Loan Approval Prediction | Deployed Link | Repo Link | Flask, Scikit-learn, Pandas, NumPy |
4 | Diwali Sales Analysis | Not Deployed | Repo Link | Pandas, NumPy, PyPlot, Seaborn |
5 | Cat Vs Dog Image Classification | Not Deployed | Repo Link | TensorFlow, Keras, Matplotlib |
6 | Advanced Resume Tracking System | Deployed Link | Repo Link | LLM, Generative-AI, PyPDF, Streamlit |
The project ideas covered in this repository are summarized in the table below:
Project Idea | Description | Domain |
---|---|---|
Indian Economy Analysis | Analyze various economic indicators and trends to understand the current state and predict future scenarios. | Economics, Data Analysis |
Diwali Sales Analysis | Analyze sales data before, during, and after Diwali to identify trends, patterns, and optimize marketing strategies. | Retail, Sales Analysis |
Car Price Prediction | Develop a machine learning model to predict the price of cars based on various features such as mileage, brand, etc. | Machine Learning, Automotive |
Loan Approval Prediction | Build a machine learning model to predict whether a loan application will be approved or rejected by a financial institution. | Machine Learning, Finance |
Cat vs Dog Classification | Create a deep learning model to classify images of cats and dogs accurately. | Deep Learning, Computer Vision |
Advanced Resume Tracking System | Implement a comprehensive system using LLM techniques to track and analyze resumes for job matching and recruitment. | LLM (Gen AI), Human Resources |
Our vision is to facilitate learning and exploration in the field of data science by providing well-documented code, tutorials, and resources. We aim to empower individuals to understand and apply data science techniques to real-world problems.
We strive to incorporate innovative approaches and ideas in our projects, pushing the boundaries of traditional data science methodologies. Some of the innovative ideas explored in this repository include:
- Novel feature engineering techniques
- Advanced model architectures
- Cutting-edge visualization methods
Before running the code in this repository, ensure you have the following dependencies installed:
- pandas
- numpy
- scikit-learn (sklearn)
- seaborn
- matplotlib
- plotly
Additionally, for deep learning models, you will need:
- TensorFlow
- Keras
For LLM (Gen AI) models, you will also need:
- OpenAI library
- Gen AI libraries
You can install the required dependencies using pip:
pip install pandas numpy scikit-learn seaborn matplotlib plotly tensorflow keras openai gen_ai
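After installing, it can be useful to confirm which packages are actually importable in the current environment. The following is a minimal sketch using only the standard library; the `REQUIRED` and `OPTIONAL` lists and the `missing_packages` helper are illustrative names, not part of this repository. Note that scikit-learn is imported as `sklearn`, so the import name differs from the pip name.

```python
from importlib.util import find_spec

# Import names for the packages listed above (assumed mapping:
# the pip package "scikit-learn" is imported as "sklearn").
REQUIRED = ["pandas", "numpy", "sklearn", "seaborn", "matplotlib", "plotly"]
OPTIONAL = ["tensorflow", "keras", "openai"]


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_packages(names)
        status = ", ".join(missing) if missing else "none"
        print(f"Missing {label} packages: {status}")
```

Any package reported as missing can then be installed individually with pip.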
The LLM (Gen AI) section applies large language models and other generative AI techniques to generate new text, images, and other data, and explores the possibilities of AI-driven creativity.
Each section contains detailed notebooks, code, and explanations for specific projects and concepts.
- `data_analysis`: Contains notebooks and code for data analysis projects.
- `machine_learning`: Includes notebooks and code for machine learning projects.
- `deep_learning`: Consists of notebooks and code for deep learning projects.
- `LLM`: Includes notebooks and code for LLM (Gen AI) projects.
Feel free to explore each section and dive into the projects to enhance your understanding of data science concepts.
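To get a quick overview of what each section contains, the notebooks can be listed per directory. This is a rough sketch that assumes the section directories named above sit at the repository root; the `notebooks_by_section` helper is hypothetical and not part of the repository.

```python
from pathlib import Path

# Section directories as named in this README (assumed to be at the repo root).
SECTIONS = ["data_analysis", "machine_learning", "deep_learning", "LLM"]


def notebooks_by_section(repo_root="."):
    """Map each section directory to a sorted list of its notebook paths."""
    root = Path(repo_root)
    result = {}
    for section in SECTIONS:
        section_dir = root / section
        # Search subdirectories too; a missing directory yields an empty list.
        found = section_dir.rglob("*.ipynb") if section_dir.is_dir() else []
        result[section] = sorted(
            str(path.relative_to(root))
            for path in found
            # Skip Jupyter's autosave copies.
            if ".ipynb_checkpoints" not in path.parts
        )
    return result


if __name__ == "__main__":
    for section, notebooks in notebooks_by_section().items():
        print(f"{section}: {len(notebooks)} notebook(s)")
        for path in notebooks:
            print(f"  {path}")
```

Run it from the repository root, or pass the path to a local clone as `repo_root`.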
We would like to express our gratitude to the developers of the data science tools, libraries, and models that have been instrumental in the creation of this repository:
- pandas: Developed by Wes McKinney and contributors, pandas is a powerful data manipulation and analysis library for Python.
- NumPy: Created by Travis Oliphant, NumPy is the fundamental package for scientific computing with Python.
- scikit-learn: Developed by a community of contributors, scikit-learn is a versatile machine learning library for Python.
- seaborn: Developed by Michael Waskom and contributors, seaborn is a Python visualization library based on matplotlib for statistical graphics.
- matplotlib: Developed by John D. Hunter (and later Michael Droettboom and contributors), matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
- plotly: Developed by Plotly Technologies, plotly is a graphing library for Python that creates interactive, publication-quality graphs online.
- TensorFlow: Developed by the Google Brain team and contributors, TensorFlow is an open-source platform for machine learning and deep learning.
- Keras: Developed by François Chollet and contributors, Keras is an open-source neural network library written in Python that serves as a high-level API for TensorFlow.
- OpenAI library: Developed by OpenAI, an artificial intelligence research organization, the `openai` Python package provides access to the OpenAI API.
- Gen AI libraries: Developed by their respective maintainers, these libraries provide tools and frameworks for generative AI techniques, enabling the generation of novel data, images, text, etc.
- XGBoost: Developed by a community of contributors, XGBoost is an optimized distributed gradient boosting library designed for speed and performance.
- LightGBM: Developed by Microsoft, LightGBM is a gradient boosting framework that uses tree-based learning algorithms.
- CatBoost: Developed by Yandex, CatBoost is an open-source gradient boosting library that provides state-of-the-art results out of the box.
- SciPy: Developed by a community of contributors, SciPy is a scientific computing library that builds on NumPy and provides additional functionality.
- StatsModels: Developed by a community of contributors, StatsModels is a Python module that provides classes and functions for the estimation of many different statistical models.
- PyTorch: Developed by Facebook's AI Research lab (FAIR) and contributors, PyTorch is an open-source machine learning library based on the Torch library.
- fastai: Developed by fast.ai, fastai is a deep learning library built on top of PyTorch that provides high-level abstractions for training and deploying deep learning models.
We extend our sincere appreciation to these developers and the broader open-source community for their invaluable contributions to the field of data science.
Contributions to this repository are welcome! Whether it's fixing a bug, adding a new project, or improving documentation, your contributions help make this resource better for everyone.
Please refer to the contribution guidelines before submitting your contributions.
This repository is licensed under the MIT License. See the LICENSE file for details.