Within the data folder are 3 folders containing .csv files to download. To download a play-by-play dataset, click on it then right click on the Download button, and use save link as to save the file.
For the team and player season/game stats, click on it then right click on the Raw button, , and use save link as to save the file.
- 2009 play-by-play
- 2010 play-by-play
- 2011 play-by-play
- 2012 play-by-play
- 2013 play-by-play
- 2014 play-by-play
- 2015 play-by-play
- 2016 play-by-play
- 2017 play-by-play
The season_play_by_play folder contains play-by-play datasets for each regular season game in each season from 2009-2016. The files are separated by season. Each dataset contains 100 columns (see the nflscrapR game_play_by_play() and season_play_by_play() function documentation for descriptions).
Through list manipulation using the do.call and rbind functions a 13 column dataframe with basic information populates directly from the NFL JSON API. These columns include the following:
"Drive" - Drive number
"sp" - Whether the play resulted in a score (any kind of score)
"qtr" - Quarter of Game
"down" - Down of the given play
"time" - Time at start of play
"yrdln" - Between 0 and 50
"ydstogo" - Yards to go for a first down
"ydsnet" - Total yards gained on a given drive
"posteam" - The team on offense
"AirYards" - Number of yards the ball was thrown in the air for both complete and incomplete pass attempts (negative means behind line of scrimmage)
"YardsAfterCatch" - Number of yards receiver gained after catch
"QBHit" - Binary: 1 if the QB was knocked to the ground else 0
"desc" - A detailed description of what occured during the play
Through string manipulation and parsing of the description column using base R and stringR, columns were added to the original dataframe allowing the user to have a detailed breakdown of the events of each play. Also provided are calculations for the expected points and win probability for each play using models built entirely on nflscrapR data. The added variables are specified below:
"Date" - Date of game
"GameID" - The ID of the specified game
"TimeSecs" - Time remaining in game in seconds
"PlayTimeDiff" - The time difference between plays in seconds
"DefensiveTeam" - The defensive team on the play (for punts the receiving team is on defense, for kickoffs the receiving team is on offense)
"TimeUnder" - Minutes remaining in half
"SideofField" - The side of the field that the line of scrimmage is on
yrdline100 - Distance to opponents endzone, ranges from 1-99 situation
GoalToGo - Binary: 1 if the play is in a goal down situation else 0
"FirstDown" - Binary: 1if the play resulted in a first down conversion else 0
"PlayAttempted" - A variable used to count the number of plays in a game (should always be equal to 1)
"Yards.Gained" - Amount of yards gained on the play
"Touchdown" - Binary: 1 if the play resulted in a TD else 0
"ExPointResult" - Result of the extra-point: Made, Missed, Blocked, Aborted
"TwoPointConv" - Result of two-point conversion: Success or Failure
"DefTwoPoint" - Result of defensive two-point conversion: Success or Failure
"Safety" - Binary: 1 if safety was recorded else 0
"Onsidekick" - Binary: 1 if the kickoff was an onside kick
"PuntResult - Result of punt: Clean or Blocked
"PlayType" - The type of play that occured, potential values are:
Kickoff, Punt
Pass, Sack, Run
Field Goal, Extra Point
Quarter End, Two Minute Warning, Half End, End of Game
No Play, QB Kneel, Spike, Timeout
"Passer" - The passer on the play if it was a pass play
"Passer_ID" - NFL GSIS player ID for the passer
"PassAttempt" - Binary: 1 if a pass was attempted else 0
"PassOutcome" - Pass Result: Complete or Incomplete Pass
"PassLength" - Categorical variable indicating the length of the pass: Short or Deep
"PassLocation" - Location of the pass: left, middle, right
"InterceptionThrown" - Binary: 1 if an interception else 0
"Interceptor" - The player who intercepted the ball
"Rusher" - The runner on the play if it was a running play
"Rusher_ID" - NFL GSIS player ID for the rusher
"RushAttempt" - Binary: 1 if a run was attempted else 0
"RunLocation" - Location of the run: left, middle, right
"RunGap" - Gap of the run: guard, tackle, end
"Receiver" - The targeted receiver of a play
"Receiver_ID" - NFL GSIS player ID for the receiver
"Reception" - Binary: 1 if a reception was recorded else 0
"ReturnResult" - Result of a punt, kickoff, interception, or fumble return: Fair Catch, Touchback, Touchdown
"Returner" - The punt or kickoff returner
"BlockingPlayer" - The player who blocked the extra point, field goal, or punt
"Tackler1" - The primary tackler on the play
"Tackler2" - The secondary tackler on the play
"FieldGoalResult" - Outcome of a fieldgoal: No Good, Good, Blocked
"FieldGoalDistance" - Field goal length in yards
"Fumble" - Binary: 1 if a fumble occured else no
"RecFumbTeam" - Team that recovered the fumble
"RecFumbPlayer" - Player that recovered the fumble
"Sack" - Binary: 1 if a sack was recorded else 0
"Challenge.Replay" - Binary: 1 if play was reviewed by the replay official else 0
"ChalReplayResult" - Result of the replay review: Upheld or Reversed
"Accepted.Penalty" - Binary: 1 if penalty was accepted else 0
"PenalizedTeam" - The penalized team on the play
"PenaltyType" - Type of penalty on the play, alues include:
Unnecessary Roughness, Roughing the Passer
Illegal Formation, Defensive Offside
Delay of Game, False Start, Illegal Shift
Illegal Block Above the Waist, Personal Foul
Unnecessary Roughness, Illegal Blindside Block
Defensive Pass Interference, Offensive Pass Interference
Fair Catch Interference, Unsportsmanlike Conduct
Running Into the Kicker, Illegal Kick
Illegal Contact, Defensive Holding
Illegal Motion, Low Block
Illegal Substitution, Neutral Zone Infraction
Ineligible Downfield Pass, Roughing the Passer
Illegal Use of Hands, Defensive Delay of Game
Defensive 12 On-field, Offensive Offside
Tripping, Taunting, Chop Block
Interference with Opportunity to Catch, Illegal Touch Pass
Illegal Touch Kick, Offside on Free Kick
Intentional Grounding, Horse Collar
Illegal Forward Pass, Player Out of Bounds on Punt
Clipping, Roughing the Kicker, Ineligible Downfield Kick
Offensive 12 On-field, Disqualification
"PenalizedPlayer" - The penalized player
"Penalty.Yards" - The number of yards that the penalty resulted in
"PosTeamScore" - The score of the possession team (offensive team)
"DefTeamScore" - The score of the defensive team
"ScoreDiff" - The difference in score between the offensive and defensive teams (offensive.score - def.score)
"AbsScoreDiff" - Absolute value of the score differential
"HomeTeam" - The home team
"AwayTeam" - The away team
"Timeout_Indicator" - Binary: 1 if a timeout was charge else 0
"Timeout_Team" - Team charged with penalty (None if no timeout)
"posteam_timeouts_pre" - Number of timeouts remaining for possession team at the start of the play
"HomeTimeouts_Remaining_Pre" - Number of timeouts remaining for home team at the start of the play
"AwayTimeouts_Remaining_Pre" - Number of timeouts remaining for away team at the start of the play
"HomeTimeouts_Remaining_Post" - Number of timeouts remaining for home team at the end of the play (handles loss of timeout from lost challenge)
"AwayTimeouts_Remaining_Post" - Number of timeouts remaining for away team at the end of the play (handles loss of timeout from lost challenge)
"No_Score_Prob" - Probability of no score occurring within the half
"Opp_Field_Goal_Prob" - Probability of the defensive team scoring a field goal next
"Opp_Safety_Prob" - Probability of the defensive team scoring a safety next
"Opp_Touchdown_Prob" - Probability of the defensive team scoring a touchdown next
"Field_Goal_Prob" - Probability of the possession team scoring a field goal next
"Safety_Prob" - Probability of the possession team scoring a safety next
"Touchdown_Prob" - Probability of the possession team scoring a touchdown next
"ExPoint_Prob" - Probability of the possession team making the PAT
"TwoPoint_Prob" - Probability of the possession team converting the two-point conversion
"ExpPts" - The expected points for the possession team at the start of the play
"EPA" - Expected points added with respect to the possession team considering the result of the play
"airEPA" - Expected points added from air yards
"yacEPA" - Expected points added from yards after catch
"Home_WP_pre" - The win probability for the home team at the start of the play
"Away_WP_pre" - The win probability for the away team at the start of the play
"Home_WP_post" - The win probability for the home team at the end of the play
"Away_WP_post" - The win probability for the away team at the end of the play
"Win_Prob" - The win probability added for team with possession
"WPA" - The win probability added with respect to the possession team
The season_player_stats folder contains datasets with player statistics at the season level, while the game_player_stats contains datasets with player statistics at the game level. There are 3 types of datasets in each folder:
- Passing
- Receiving
- Rushing
Both game and season level have a Team column with the abbreviation for which team the player was on. The game level datasets contain a column for the opponent's team.
The season_team_stats and game_team_stats folders contain the same statistics but at the team level, with the first player list being the most prevalent player.
The datasets contain traditional stats as well as expected points added (EPA) based metrics. See here for which stats are more predictable on a seasonal basis (HINT: not all yards are created equal)
- Attempts - number of pass attempts
- Completions - number of completed passes
- Comp_Perc - completion percentage
- Total_Yards - total yards gained from passing
- Total_Raw_AirYards - total air yards thrown (both complete and incomplete passes)
- Yards_per_Att - yards gained from passing per attempt
- Yards_per_Comp - yards gained from passing per completion
- Raw_AirYards_per_Att - air yards thrown per pass attempt (all pass attempts)
- Comp_AirYards_per_Comp - air yards thrown per completion (only completed passes)
- TimesHit - number of times the QB was knocked down (includes being hit on pass attempts)
- Interceptions - number of interceptions
- TDs - number of touchdowns thrown
- Air_TDs - number of touchdowns thrown with 0 yards after catch
- Air_TD_Rate - percentage of touchdowns thrown in the air
- TD_to_Int - TD to INT ratio
- Total_EPA - total expected points added from pass attempts
- Success_Rate - percentage of pass attempts with positive EPA
- EPA_per_Att - EPA per pass attempt
- EPA_per_Comp - EPA per completion
- EPA_Comp_Perc - completion percentage weighted by the EPA for each play
- TD_per_Att - TDs thrown per pass attempt
- Int_per_Att - INTs thrown per pass attempt
- TD_per_Comp - TDs thrown per completion
- Total_WPA - total win probability added from pass attempts
- Win_Success_Rate - percentage of pass attempts with positive WPA
- WPA_per_Att - WPA per pass attempt
- WPA_per_Comp - WPA per completion
- WPA_Comp_Perc - completion percentage weighted by the WPA for each play
- Total_Clutch_EPA - total EPA from pass attempts weighted by each play's WPA
- airEPA_Comp - total EPA in the air for completed passes
- airEPA_Incomp - total EPA that could've been from incomplete passes
- Total_airEPA - total EPA in the air from all passes
- airEPA_per_Att - total airEPA per pass attempt
- airEPA_per_Comp - total airEPA from completions per completion
- air_Success_Rate - percentage of pass attempts with positive airEPA
- air_Comp_Success - percentage of pass attempts with positive airEPA for completions
- Drives - number of drives involved in
- Total_Comp_AirYards - total air yards from completions
- Targets - number of targets
- Receptions - number of receptions
- Total_Yards - total yards gained from receiving
- Total_Raw_YAC - total of yards after catch (YAC) as well as yards missed from incomplete passes
- Yards_per_Rec - yards gained per reception
- Yards_per_Target - yards gained per target
- Caught_YAC_per_Rec - yards gained after catch per reception
- Rec_Percentage - reception percentage
- Fumbles - number of fumbles
- TDs - number of TD receptions
- AC_TDs - number of touchdowns receptions with positive yards after catch
- AC_TD_Rate - percentage of touchdown receptions with positive yards after catch
- TD_to_Fumbles - number of interceptions
- Total_EPA - total expected points added from all targets
- Success_Rate - percentage of targets with positive EPA
- EPA_per_Rec - EPA per reception
- EPA_per_Target - EPA per target
- EPA_Rec_Perc - reception percentage weighted by the EPA for each play
- TD_per_Targets - TDs caught per target
- Fumbles_per_Rec - fumbles per reception
- TD_per_Rec - TDs caught per reception
- Total_WPA - total win probability added from all targets
- Win_Success_Rate - percentage of targets with positive WPA
- WPA_per_Target - WPA per target
- WPA_per_Rec - WPA per reception
- WPA_Rec_Perc - reception percentage weighted by the WPA for each play
- Total_Clutch_EPA - total EPA from targets weighted by each play's WPA
- AirYards_per_Target - air yards per target
- airEPA_per_Target - airEPA per target providing the value of the air yards
- yacEPA_Rec - EPA from yards after catch for receptions
- yacEPA_Drop - the cost of dropping/missing the pass
- Total_yacEPA - total EPA from yards after catch including cost of drops
- yacEPA_per_Target - total EPA from yards after catch per target
- yacEPA_per_Rec - total EPA from yards after catch per reception
- yac_Success_Rate - percentage of targets with positive yacEPA
- yac_Rec_Success_Rate - percentage of targets with positive yacEPA for receptions
- Total_Caught_YAC - total yards gained after catch for receptions
- Total_Dropped_YAC - total yards missed from incomplete passes
- Carries - number of carries
- Total_Yards - total yards gained from rushing
- Yards_per_Car - yards gained per carry
- Fumbles - number of fumbles
- TDs - number of TD rushes
- TD_to_Fumbles - number of interceptions
- Total_EPA - total expected points added from all carries
- Success_Rate - percentage of carries with positive EPA
- EPA_per_Car - EPA per reception
- EPA_Ratio - ratio of EPA for carries with positive EPA to total EPA
- TD_per_Car - TDs per carry
- Fumbles_per_Car - fumbles per carry
- Total_WPA - total win probability added from all carries
- Win_Success_Rate - percentage of carries with positive WPA
- WPA_per_Car - WPA per carry
- WPA_Ratio - ratio of WPA for carries with positive WPA to total WPA
- Total_Clutch_EPA - total EPA from carries weighted by each play's WPA