Scrapy-based web crawler that collects all data for a specific NBA season from basketball-reference.com.
Requires the scrapy and pandas Python packages to be installed.
The webcrawler can be started from the project directory using the command
scrapy crawl basketball-reference -a season=2020
where the season for which data should be collected is given by the season
argument (default is current season).
odds data
: Odds data is collected with sbrscrape, scraping FanDuel odds data ➡️ test.py
: In odds data, you can also access tomorrow's betting info ➡️ bet_api.py, accessible with your own API key (https://api.the-odds-api.com/v4/sports)
: Run python3 test.py; year = ["2023", "2024"], season = ["2023-24"] ➡️ change these values to the years and season you want to collect
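
The year and season values shown for test.py appear to follow a fixed pattern: a season label such as "2023-24" spans the calendar years "2023" and "2024". A minimal sketch of deriving both lists from a single start year, assuming that pattern holds; season_config is a hypothetical helper and is not part of test.py:

```python
def season_config(start_year: int) -> tuple[list[str], list[str]]:
    """Return (year, season) lists in the shape shown for test.py.

    Hypothetical helper: assumes a season label is the start year,
    a hyphen, and the last two digits of the following year.
    """
    end_year = start_year + 1
    year = [str(start_year), str(end_year)]
    season = [f"{start_year}-{end_year % 100:02d}"]
    return year, season


year, season = season_config(2023)
print(year, season)
```

Keeping the two lists derived from one number avoids editing them out of sync when switching seasons.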
season data
: Season data is collected from https://www.basketball-reference.com/leagues/NBA_{self.season}_games.html ➡️ br_spider.py
: scrapy crawl basketball-reference -a season=2020 ➡️ crawl command; change the season argument to the season you want
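
How the season argument could be turned into the schedule URL can be sketched as below. This is not the code in br_spider.py: games_url and current_season are hypothetical names, and the October cutoff used for the default season is an assumption. The URL template is the one given above, where the season is labelled by the year in which it ends.

```python
from datetime import date
from typing import Optional

GAMES_URL = "https://www.basketball-reference.com/leagues/NBA_{season}_games.html"


def current_season(today: Optional[date] = None) -> int:
    """Guess the current season, labelled by the year in which it ends.

    Assumption: a new season starts in October, so dates from October
    to December belong to the season ending in the following year.
    """
    today = today or date.today()
    return today.year + 1 if today.month >= 10 else today.year


def games_url(season: Optional[str] = None) -> str:
    """Build the schedule URL; Scrapy passes -a arguments as strings."""
    season_year = int(season) if season is not None else current_season()
    return GAMES_URL.format(season=season_year)


print(games_url("2020"))
```

Calling games_url() without an argument falls back to the guessed current season, mirroring the default described for the crawl command.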
merged data
: Merges odds data and season data on the keys [date, home, away] ➡️ data_preprocess.py
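
A minimal sketch of such a merge with pandas, assuming both tables carry date, home and away columns. All other column names, the placeholder team codes, the sample values and the choice of an inner join are hypothetical and may differ from data_preprocess.py.

```python
import pandas as pd

# Placeholder rows for illustration only; these are not real games or odds.
season_df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01"],
    "home": ["AAA", "CCC"],
    "away": ["BBB", "DDD"],
    "home_pts": [100, 95],
    "away_pts": [90, 99],
})
odds_df = pd.DataFrame({
    "date": ["2024-01-01"],
    "home": ["AAA"],
    "away": ["BBB"],
    "home_odds": [-150],
})

# An inner join keeps only games that appear in both tables.
merged = season_df.merge(odds_df, on=["date", "home", "away"], how="inner")
print(merged)
```

With how="left" instead, games without odds would be kept and their odds columns filled with NaN, which can help when checking for key mismatches such as differing team abbreviations.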
This program predicts the outcome of NBA games using Hadoop and Spark.
The first commit can be ignored.