amoriqbal/CricketODIPrediction

ODI-Cricket-Match-Prediction

Overview

With the advent of statistical modeling in sports, predicting the outcome of a game has become a fundamental problem. Cricket is one of the most popular team games in the world, and cricket betting is a multi-billion dollar market. There is therefore a strong incentive to build models that can predict the outcomes of games and beat the odds offered by bookmakers. The aim of this study is to investigate to what degree it is possible to predict the outcome of cricket matches.

Cricket matches are quite unpredictable. Considering only ODI (One Day International) matches, we try to predict the winner before the match begins. Suppose an ODI match between Team A and Team B is scheduled for the near future: our ML model should predict which team will win before the first ball is bowled.

Data Collection

Ball-by-ball records of every international ODI match are available on the web in a machine-readable format. The data was gathered from Cricsheet. The entire dataset was built from scratch, and every feature was normalised before being fed into the model.
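The README does not say which normalisation was used. As an illustration only, the sketch below assumes z-score standardisation (zero mean, unit variance per feature). The function name `zscore_columns` and the feature values are hypothetical and are not taken from the repository.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardise each column of a list of equal-length numeric rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    normalised = []
    for row in rows:
        normalised.append([
            # A constant column carries no information, so map it to 0.0
            (value - m) / s if s > 0 else 0.0
            for value, m, s in zip(row, means, stds)
        ])
    return normalised


# Made-up placeholder features per team: [average runs scored, win rate]
raw = [[250.0, 0.60], [230.0, 0.40], [270.0, 0.50]]
scaled = zscore_columns(raw)
```

After scaling, features measured on very different ranges (runs versus a rate between 0 and 1) contribute on a comparable scale, which matters for models such as Logistic Regression and SVC.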

Feature engineering

[Figure: team_features]

The Main Approach

  • We experimented with various classification models, including Gaussian Naive Bayes, Logistic Regression, Random Forest and a Support Vector Classifier (SVC).
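The training code is not shown in this README, so the following is only a minimal sketch of one of the listed models, Logistic Regression, trained by batch gradient descent using the Python standard library. It is not the authors' implementation. The data is synthetic, and the two features (differences in recent form and in average runs between Team A and Team B) are hypothetical stand-ins for the real team features.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if Team A is predicted to win, else 0."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


def accuracy(weights, bias, X, y):
    correct = sum(predict(weights, bias, xi) == yi for xi, yi in zip(X, y))
    return correct / len(y)


# Synthetic, already-normalised data: the label is a noisy function of the features
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    form_diff, runs_diff = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([form_diff, runs_diff])
    y.append(1 if form_diff + 0.5 * runs_diff + rng.gauss(0, 1) > 0 else 0)

# Hold out the last 100 matches to estimate accuracy on unseen data
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
weights, bias = train_logistic_regression(X_train, y_train)
test_accuracy = accuracy(weights, bias, X_test, y_test)
```

Because the label noise is deliberately large, even a well-fitted model cannot reach perfect accuracy here, which mirrors the kind of ceiling reported in the results.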

Results

  • Logistic Regression: 67%
  • Support Vector Classifier: 67%
  • Gaussian Naive Bayes: 57%
  • Random Forest: 49%

Conclusion

It is possible to predict the winner of ODI cricket games in about two thirds of instances (67% for the best models). This is an improvement upon levels present in the gambling industry today and implies a potential financial opportunity. However, the overall level of accuracy is lower than that observed in many other sports and fluctuates more. This suggests that ODI cricket has a relatively high level of instability or randomness within it, which should come as no surprise to those familiar with the game.
Of the various methods tried, the most effective classification method was a Logistic Regression model combined with significant data preprocessing, feature selection and complex hierarchical features.
With regard to future work, further avenues could be explored in the form of either new algorithms or new features. This work has focused entirely on team-performance statistics as features, whereas previous investigations predominantly use categorical features such as ‘home team’, ‘who won the toss’ or ‘venue’ for the same task. Combining these feature types could perhaps yield performance gains; however, algorithm choice would be limited to those capable of dealing with both categorical and numerical features concurrently. In addition, external data such as the weather or social media comments could be mined to provide even more features for further experimentation.
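One common way to feed categorical and numerical features to the same model is to one-hot encode the categorical ones first. The sketch below illustrates that idea only. It is not part of this work, and the field names (`team_a_win_rate`, `toss_winner`, `home_team`, `venue`) and all values are made-up placeholders.

```python
def one_hot(value, categories):
    """Encode a category as a list with a single 1.0 at its position."""
    return [1.0 if value == c else 0.0 for c in categories]


def build_feature_vector(match, venues):
    """Concatenate numerical team statistics with encoded categorical features."""
    numeric = [match["team_a_win_rate"], match["team_b_win_rate"]]
    # Binary flags expressed from Team A's point of view
    toss = [1.0 if match["toss_winner"] == match["team_a"] else 0.0]
    home = [1.0 if match["home_team"] == match["team_a"] else 0.0]
    return numeric + toss + home + one_hot(match["venue"], venues)


# Hypothetical venue vocabulary and match record
venues = ["Lord's", "Eden Gardens", "MCG"]
match = {
    "team_a": "India",
    "team_b": "Australia",
    "team_a_win_rate": 0.62,
    "team_b_win_rate": 0.55,
    "toss_winner": "India",
    "home_team": "Australia",
    "venue": "MCG",
}
vector = build_feature_vector(match, venues)
```

The resulting vector is purely numerical, so it could in principle be passed to any of the classifiers listed earlier. A venue that is missing from the vocabulary encodes as all zeros in this sketch.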
