This repository is the official implementation of [PREDICTING MOTION PICTURE BOX-OFFICE PERFORMANCE]. Features and performance figures have been obfuscated because the paper is pending publication.
Product demand for motion pictures is hard to gauge, which makes producing them a risky endeavor. The movie industry has grown tremendously over the past few decades and makes billions of dollars for stakeholders, yet the financial success of any individual movie remains largely uncertain. Investors want some assurance of a return on their investment. To give stakeholders an estimate of the return a movie might generate, this research proposes a framework for predicting the degree of box-office success for motion pictures, using an ensemble model and gradient boosted decision trees at the pre-production and post-release stages. We evaluated our approach with a wide selection of classifiers and regressors: Extremely Randomized Trees, Histogram Gradient Boosting, Logistic Regression, Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), CatBoost, Stacking Ensemble, Hard Voting Ensembles, and Soft Voting Ensembles. Our proposed models achieve state-of-the-art performance; detailed results are withheld pending publication.
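As a rough illustration of how the hard and soft voting ensembles named above differ, here is a minimal standard-library sketch. It is not the repository's implementation: the function names `hard_vote` and `soft_vote` are hypothetical, and the probabilities are made-up numbers for a single movie, not results from the paper.

```python
from collections import Counter


def hard_vote(prob_rows):
    """Majority vote over each model's most likely class index."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in prob_rows]
    # most_common orders equal counts by first occurrence, so ties go
    # to the class voted for earliest in the model list.
    return Counter(votes).most_common(1)[0][0]


def soft_vote(prob_rows):
    """Class index with the highest probability averaged across models."""
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    avg = [sum(p[c] for p in prob_rows) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


# Hypothetical class probabilities from three base models for one movie.
# Assumption for illustration only: class 0 = "flop", class 1 = "hit".
probs = [
    [0.55, 0.45],  # model A leans slightly towards class 0
    [0.60, 0.40],  # model B leans slightly towards class 0
    [0.10, 0.90],  # model C is very confident in class 1
]

print("hard vote:", hard_vote(probs))
print("soft vote:", soft_vote(probs))
```

In this example two of the three models pick class 0, so hard voting returns class 0, while soft voting returns class 1 because the third model's confidence dominates the averaged probabilities.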
To install requirements:
conda env create -f environment.yml
conda activate thesis
Paper pending publication; details TBA.