ODI-Cricket-Match-Prediction

Overview

With the advent of statistical modeling in sports, predicting the outcome of a game has been established as a fundamental problem. Cricket is one of the most popular team games in the world. Moreover cricket betting is a multi-billion dollar market. Therefore, there is a strong incentive for models that can predict the outcomes of games and beat the odds provided by bookers. The aim of this study is to investigate to what degree it is possible to predict the outcome of cricket matches.

Cricket matches are quite unpredictable. Considering only ODI matches, we will try to predict who is going to be the winner of the match, before the match begins. Suppose we know that there is going to be an ODI match between Team A and Team B in near future. Before the match begins, our ML model should predict which team will win the match.

Data Collection

Ball by ball records of every international ODI match is available on the web, in computer readable format. The data has been gathered from Cricsheet.Entire dataset has been built from scratch and every feature has been normalised before feeding into the model.

Feature engineering

The Main Approach

We experimented with various classification models like Naive Bayes, Logistic Regression, Random Forest, SVC etc.

Results

Linear Regression : 67%
Support Vector Classifier : 67%
Gaussian Naive bayes : 57%
Random Forest : 49%

Conclusion

It is possible to predict the winner of ODI cricket games in more than two thirds of instances. This is an improvement upon levels present in the gambling industry today and implies a potential financial opportunity. However, the overall level of accuracy is lower than that observed in many other sports and undergoes more significant fluctuations. This suggests that ODI cricket has a relatively high level of instability or randomness within it, which should come as no surprise to those familiar with the game.
Of the various methods tried, the most effective classification method was a Logistic Regression model combined with significant data preprocessing, feature selection and complex hierarchical features.
With regards to future work, further avenues could be explored either in the form of new algorithms or in the form of new features. This work has focused entirely on using team performance based statistics as features. Whereas previous investigations use predominantly categorical features, such as ‘home team’, ‘who won the toss’ or ‘venue’ for the same task. Combining these feature types could perhaps yield performance gains, however algorithm choice would be limited to those capable of dealing with both categorical and numerical features concurrently. In addition, external data such as the weather or social media comments could be mined to provide even more features for further experimentation.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
.ipynb_checkpoints		.ipynb_checkpoints
aggregator		aggregator
.gitignore		.gitignore
README.md		README.md
dataset.csv		dataset.csv
models.ipynb		models.ipynb
team_features.png		team_features.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ODI-Cricket-Match-Prediction

Overview

Data Collection

Feature engineering

The Main Approach

Results

Conclusion

About

Releases

Packages

Contributors 2

Languages

amoriqbal/CricketODIPrediction

Folders and files

Latest commit

History

Repository files navigation

ODI-Cricket-Match-Prediction

Overview

Data Collection

Feature engineering

The Main Approach

Results

Conclusion

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages