Sales prediction and visualization of demand patterns
- SalesTrendsAnalyzer
This project aims to analyze sales data for a sushi restaurant chain. The data includes daily sales for a variety of sushi dishes, as well as information on the type and EAN (European Article Number) code for each dish. The analysis focuses on daily/weekly sales for each dish.
Requirements: List any dependencies or requirements for running the Python scripts, such as specific Python versions, libraries, or external tools.
To use this project, you will need to install the following packages:
- datetime
- IPython
- matplotlib
- numpy
- pandas
- seaborn
- tensorflow
You can install these packages using pip:
pip install -r requirements.txt
Use clear headings or bullet points to describe each script in the repository, and include the following information for each script:
Name: The name of the script.
Description: A brief description of the script's purpose and functionality.
Inputs: A description of the input parameters or data required by the script, including their format and any default values.
Outputs: A description of the output produced by the script, including its format and destination (e.g., file, console, etc.).
Usage: Provide an example of how to run the script, including any necessary command-line arguments or options.
Detailed description: see bellow.
Inputs:
- "Data/merged_cleaned_FE_imputed(v)_w.csv"
Outputs:
- The trained models mentioned in Introduction with comparison
The data used in this project is sales data for a sushi restaurant over a period of time. The data contains information about the products sold, the date of sale, and the quantity sold.
The merged_cleaned_FE_imputed(v).csv file has the following columns:
- Date: The date of the day for which the data is recorded.
- Month: The month of the year for which the data is recorded.
- DayoftheMonth: The day of the month for which the data is recorded.
- WeekoftheMonth: The week of the month for which the data is recorded.
- DayoftheWeek: The day of the week for which the data is recorded.
- WeekoftheYear: The week of the year for which the data is recorded.
- DayoftheYear: The day of the year for which the data is recorded.
- isWeekend: A binary column that indicates whether the day is a weekend or not.
- isWeekStart: A binary column that indicates whether the day is the start of a week or not.
- isWeekEnd: A binary column that indicates whether the day is the end of a week or not.
- isMonthStart: A binary column that indicates whether the day is the start of a month or not.
- isMonthEnd: A binary column that indicates whether the day is the end of a month or not.
- Season_Autumn, Season_Spring, Season_Summer, Season_Winter: Binary columns that indicate the season for which the data is recorded.
- isHoliday: A binary column that indicates whether the day is a holiday or not.
- StoreID: The ID of the store for which the data is recorded.
- 4260705920003 to 4260705920638: These columns represent the ProductID of different products sold at the store. The values in these columns indicate the quantity of each product sold on the given day.
- total_quantity_day: The total quantity of products sold at the store on the given day.
The Sushi Menu.csv file has the following columns:
- Nummer (unique dish ID)
- EAN (European Article Number) code
- Type
- sub_type
- Count
- main_ingredient
- sub_ingredient1
- sub_ingredient2
- sub_ingredient3
- sub_ingredient4
- sub_ingredient5
- sub_ingredient6
- sub_ingredient7
- diet_type
- meat_type
- is_new
- Price_per_100g
- Price_per_stuck
- Weight(g)
- Brennwert(KJ)
- Brennwert(Kcal)
- Fett(g)
- gesättigte Fettsäuren(g)
- Kohlenhydrate(g)
- Zucker(g)
- Ballaststoffe(g)
- Eiweiß(g)
- Salz(g)
- Enthält
- Allergic_material
- Religious_kosher
- Zutaten
The results of the analysis are visualizations that show the sales trends for different products over time. The visualizations are generated using matplotlib and show the cumulative sales for each product on a weekly basis.
Contributions to this project are welcome. Please open an issue or pull request if you have any suggestions or would like to contribute code. It would be helpful to add the possibility to choose the products from the dropdown menu to improve the user experience.
- Builds on the top of the training(h).ipynb and training(h)_w.ipynb including the changes:
- not use early stopping as this is not working well with shuffled batches
- specify shapes and some „Flatten“-Layers in the model to better control what is happening.
- removed some of the features that were like an „index“ of the data, to prevent overfitting
- reduce the window-width (to 3) as overfitting will happen always when we have more input features then training samples
- Reduce the „label“ width, as if it is bigger then „shift“, values can just be passed through from the input