Scrapy-based web crawler that collects all data for a specific NBA season from basketball-reference.com.
Requires the scrapy and pandas Python packages to be installed.
The webcrawler can be started from the project directory using the command
scrapy crawl basketball-reference -a season=2020
where the season for which data should be collected is given by the season
argument (default is current season).
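As a rough illustration of how the season argument could map to the page being crawled, here is a minimal sketch. It reuses the schedule URL pattern listed under "season data" in this README; the helper names `current_season` and `schedule_url` are hypothetical and not taken from br_spider.py, and the rule that a new season starts in October and is labeled by its ending year is an assumption.

```python
from datetime import date
from typing import Optional

# URL pattern quoted in this README; {season} stands in for the spider's season value.
URL_TEMPLATE = "https://www.basketball-reference.com/leagues/NBA_{season}_games.html"


def current_season(today: Optional[date] = None) -> int:
    """Guess the current season label.

    Assumption: a season is labeled by the year it ends in,
    and a new season starts in October.
    """
    today = today or date.today()
    return today.year + 1 if today.month >= 10 else today.year


def schedule_url(season: Optional[str] = None) -> str:
    """Build the schedule URL, falling back to the current season."""
    year = int(season) if season is not None else current_season()
    return URL_TEMPLATE.format(season=year)


# Mirrors `-a season=2020` on the command line.
print(schedule_url("2020"))
```

Scrapy passes `-a name=value` arguments to the spider as strings, which is why the sketch converts the value with `int()`.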
Run the command below in your terminal. Adjust the paths to the CSV or .py files if your setup differs.
python3 run.py -xgb -odds=fanduel
(the arguments select the model and the odds data)
odds data
: Odds data is collected with sbrscrape, which scrapes FanDuel odds. ➡️ test.py
  The odds data also gives access to tomorrow's betting info. ➡️ bet_api.py, accessible with your own key (https://api.the-odds-api.com/v4/sports)
  python3 test.py
  year = ["2023", "2024"], season = ["2023-24"] ➡️ change year and season to the period you want to collect.
season data
: Seasonal data is collected from https://www.basketball-reference.com/leagues/NBA_{self.season}_games.html. ➡️ br_spider.py
  scrapy crawl basketball-reference -a season=2020 ➡️ crawl command; change the season argument to the season you want.
merged data
: Merges the odds data and season data on [date, home, away]. ➡️ data_preprocess.py
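
The merge step can be pictured with a small pandas sketch. This is not the code in data_preprocess.py; the function name `merge_games`, the columns other than date, home, and away, and all row values are placeholders made up for illustration.

```python
import pandas as pd

# Placeholder rows; only the key columns (date, home, away) come from the README.
odds = pd.DataFrame({
    "date": ["2023-10-24", "2023-10-25"],
    "home": ["Team A", "Team C"],
    "away": ["Team B", "Team D"],
    "home_odds": [-150, 120],  # hypothetical column
})
season = pd.DataFrame({
    "date": ["2023-10-24", "2023-10-25"],
    "home": ["Team A", "Team C"],
    "away": ["Team B", "Team D"],
    "home_pts": [110, 98],  # hypothetical columns
    "away_pts": [102, 105],
})


def merge_games(odds_df: pd.DataFrame, season_df: pd.DataFrame) -> pd.DataFrame:
    """Join season results with odds on the shared game keys."""
    keys = ["date", "home", "away"]
    # validate="one_to_one" raises if a game appears twice in either table.
    return season_df.merge(odds_df, on=keys, how="inner", validate="one_to_one")


merged = merge_games(odds, season)
print(merged)
```

An inner join keeps only games present in both tables. If the two sources spell team names or format dates differently, the keys would need to be normalized first, otherwise rows silently drop out of the result.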