This project focuses on predicting the prices of avocados using various regression algorithms. The dataset was sourced from Kaggle and includes relevant features to facilitate the prediction process.
- Utilizes various regression algorithms for avocado price prediction.
- Dataset collected from Kaggle, containing information about avocado prices and characteristics.
- Linear Regression
- SVR
- Random Forest
- Gradient Boosting Models (GBM)
- Extreme Gradient Boosting (XGBoost)
- AdaBoostRegressor
- Decision Tree
- KNeighborsRegressor (KNN)
- Artificial Neural Networks (ANN)
- LSTM (Long Short-Term Memory)
The dataset used in this project is sourced from Kaggle and includes information about avocado prices, types, and characteristics. It contains features such as average price, total volume, type, region, etc.
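A minimal sketch of how the features described above might be prepared for a regression model. The column names (`AveragePrice`, `Total Volume`, `type`, `region`), the helper `prepare_features`, and the sample rows are assumptions made for illustration; they are not taken from the project's notebooks, so check them against the actual CSV header.

```python
import pandas as pd

# Assumed column names; adjust to match the real dataset header.
TARGET_COLUMN = "AveragePrice"
CATEGORICAL_COLUMNS = ["type", "region"]


def prepare_features(df):
    """Split a frame into a one-hot encoded feature matrix and the price target."""
    features = df.drop(columns=[TARGET_COLUMN])
    # One-hot encode the text columns so regressors receive numeric input.
    features = pd.get_dummies(features, columns=CATEGORICAL_COLUMNS)
    target = df[TARGET_COLUMN]
    return features, target


# Made-up placeholder rows, not real avocado data.
sample = pd.DataFrame(
    {
        "AveragePrice": [1.0, 1.5, 2.0],
        "Total Volume": [1000.0, 2000.0, 3000.0],
        "type": ["TypeA", "TypeB", "TypeA"],
        "region": ["RegionA", "RegionA", "RegionB"],
    }
)

X, y = prepare_features(sample)
print(X.columns.tolist())
```

One-hot encoding is only one way to handle `type` and `region`; tree-based models may work equally well with ordinal codes.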
- `data/`: Contains the dataset files.
- `notebooks/`: Jupyter notebooks with the code for data exploration, preprocessing, and model training.
- `src/`: Python source code for the project.
- `requirements.txt`: List of dependencies needed to run the project.
- Install dependencies using `pip install -r requirements.txt`.
- Execute the notebooks in the `notebooks/` folder in the given order.
- Run the scripts in the `src/` folder for further analysis or model training.
The algorithms are numbered in the following sequence:
1. Linear Regression
2. SVR
3. Random Forest
4. Gradient Boosting Models (GBM)
5. Extreme Gradient Boosting (XGBoost)
6. AdaBoostRegressor
7. Decision Tree
8. KNeighborsRegressor (KNN)
9. Artificial Neural Networks (ANN)
10. LSTM (Long Short-Term Memory)
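The comparison of several regressors listed above can be sketched with scikit-learn as follows. This is an illustrative outline, not the project's actual training code: the helpers `build_models` and `compare_models` are hypothetical, the data is synthetic (generated with `make_regression`) rather than the avocado dataset, all hyperparameters are library defaults, and XGBoost, ANN, and LSTM are left out because they need libraries beyond scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def build_models(seed=0):
    """Return the scikit-learn regressors from the list, keyed by name."""
    return {
        "Linear Regression": LinearRegression(),
        "SVR": SVR(),
        "Random Forest": RandomForestRegressor(random_state=seed),
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
        "AdaBoostRegressor": AdaBoostRegressor(random_state=seed),
        "Decision Tree": DecisionTreeRegressor(random_state=seed),
        "KNeighborsRegressor": KNeighborsRegressor(),
    }


def compare_models(X, y, seed=0):
    """Fit every model on one train split and report RMSE on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    scores = {}
    for name, model in build_models(seed).items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        scores[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))
    return scores


# Synthetic stand-in data; replace with the prepared avocado features and prices.
X_demo, y_demo = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)
rmse_by_model = compare_models(X_demo, y_demo)
for name, score in sorted(rmse_by_model.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.3f}")
```

Using a single shared train/test split keeps the scores comparable across models; cross-validation would give a more stable ranking at a higher runtime cost.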
The accuracy of each of the 10 regression algorithms listed above is provided below:
The RMSE of each of the 10 regression algorithms listed above is provided below:
The MAE of each of the 10 regression algorithms listed above is provided below:
The MAPE of each of the 10 regression algorithms listed above is provided below:
The precision of each of the 10 regression algorithms listed above is provided below:
The recall of each of the 10 regression algorithms listed above is provided below:
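For reference, the error metrics named above can be computed as in the sketch below, which uses only the Python standard library. The function names and sample numbers are illustrative assumptions. The source does not say how "accuracy" was calculated for these regression models; the R² score shown here is one common choice and is included only as an assumption. Precision and recall are omitted because their definition for a regression task is not given.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the price."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; requires non-zero true values."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r2_score(y_true, y_pred):
    """Coefficient of determination (assumed stand-in for 'accuracy')."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up prices and predictions, used only to exercise the functions.
true_prices = [1.0, 2.0, 4.0]
predicted_prices = [1.5, 2.0, 3.0]
print(mae(true_prices, predicted_prices))   # 0.5
print(mape(true_prices, predicted_prices))  # 25.0
```

Lower RMSE, MAE, and MAPE indicate a better fit, while a higher R² indicates a better fit, so the metrics should be read in opposite directions when ranking models.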
- This project is a basic comparison of a few selected regression algorithms on an avocado dataset.
- A dashboard could also be built on the same dataset to visualize the sales data.
- The accuracy chart shows that algorithms 1 (Linear Regression) and 6 (AdaBoostRegressor) have the highest accuracies.
- The RMSE chart shows that algorithm 1 has a lower RMSE than algorithm 6.
- Thus we conclude that algorithm 6, AdaBoostRegressor, has performed well on the given dataset.
Rohit Dubey
This project is licensed under the MIT License.