Tutorial for non programmers: Installation and examples
nflgame is a convenient tool that can be used to programmatically analyze statistics from NFL games. The data is retrieved directly from NFL.com. In fact, it's the same data used to power NFL.com's live updating Game Center. (For the curious, the data is not scraped—it is taken directly from a JSON feed.)
Even if you aren't a programmer, I believe nflgame should still be simple and intuitive enough to at least play around with. At the very least, you can export all of the data to Excel where you might be more comfortable.
Quick, who led the league in rushing between weeks 10 and 14 of the 2010 regular season?
I won't make you hunt down the answer. Here are five lines of code that list the top ten rushers in weeks 10-14 of the 2010 season:
>>> import nflgame
>>> games = nflgame.games(2010, week=[10, 11, 12, 13, 14])
>>> players = nflgame.combine(games)
>>> for p in players.rushing().sort("rushing_yds").limit(10):
...     print p, p.rushing_yds
...
M.Jones-Drew 632
M.Turner 480
A.Foster 466
F.Jackson 462
K.Moreno 462
J.Charles 458
P.Hillis 426
C.Johnson 416
S.Jackson 405
B.Green-Ellis 401
By the end of this tutorial, you'll be able to construct your own answers—on your own computer—to equally arbitrary questions!
This tutorial is particularly targeted at people with little or no programming experience. Those with programming experience should feel encouraged to skip ahead and skim the examples for how to use nflgame's API. (However, you may find nflgame's API documentation to be more appropriately condensed.)
If you're a Mac user and need help installing Python and nflgame, please see the Mac Installation Tutorial. Once you've installed nflgame, please come back here and skip to the Using IDLE and nflgame to get NFL statistics section.
nflgame is written in Python, which is a popular programming language. Python can run on many platforms, including Windows, Linux and Mac. In order for you to be able to use nflgame, you'll first have to install Python. Python can be downloaded at python.org. If you're using Windows, you'll want to download either the "Python 2.7.3 Windows Installer" or "Python 2.7.3 Windows X86-64 Installer" file. If you know you have a 64-bit system, choose the second. Otherwise, there is no harm in using the first.
(Note: It is possible that Python is already installed on your system for other programs. If it is, please make sure it is at least version 2.7 and not 3.x.x. If it isn't, just proceed with these instructions as normal—installing a different version shouldn't interfere with existing installations.)
(Note for Mac/Linux users: You very likely already have Python on your system. Please make sure that Python 2.7.x is installed, and if it isn't, use your distribution's package manager to install it. Once that's done, you should be able to install nflgame by simply running sudo pip-2.7 install nflgame. You can now skip ahead to the next section.)
After it has downloaded, run the installer and hit next through the install screens. After that's done, there should be a new entry in your start menu called something like "Python2.7". Remember that, because we're going to use a program in that Python2.7 folder in a bit. Python is now installed on your system.
The easiest way to install and maintain nflgame is with pip. If you are not familiar with this package manager, see the installation instructions on the nfldb wiki.
Now all you have to do is install nflgame. If pip is installed properly, this is as simple as running pip install git+https://github.com/BurntSushi/nflgame.git from a command prompt.
(64-bit woes: Some users have reported that this error occurs when installing nflgame with 64-bit versions of Python on Windows: "No Python installation found in the registry." I am unsure of what's causing that, assuming they have a 64-bit CPU, but uninstalling the 64-bit Python and installing the "Python 2.7.3 Windows Installer" seems to fix it. Alternatively, another user has suggested installing Python with the "Only available to me" option checked instead of "available to all Windows users.")
Once that's complete, everything should be installed and ready to go. Open your start menu, find the Python2.7 folder, click it, and then click on the "IDLE" program.
(Note for Mac/Linux users: Instead of using IDLE, simply open a terminal and type python2.7, or if that doesn't work, python2, and then hit enter.)
If you're following along, you should have the IDLE program open. IDLE is an interpreter which allows you to run Python code on the fly. You can do a little experimenting and try some basic math:
(The lines starting with ">>>" are prompts where you type. After you type a command, press enter. The lines that don't start with a ">>>" are output from the previous command.)
>>> 5 * 5
25
>>> 1 + 2 + 3 + 4 + 5
15
>>> 10 / 5
2
>>> 1 - 10
-9
However, I suspect you're reading this so you can learn how to use nflgame. (If you do want to learn more about Python, you can either go with a gentle introduction or if you're more ambitious, you can learn it the hard way.) To get started using nflgame, you have to tell Python that you want to use it:
>>> import nflgame
If no errors are reported, then nflgame has been successfully imported and is ready to be used. (If there is an error—usually called an ImportError—then it means nflgame has not been installed or had a problem while trying to install. Whatever the case, please ask for help and provide as many details as you can.) To get our feet wet, let's check out who played in last year's week 17 game when the Patriots smacked the Bills:
>>> game = nflgame.one(2011, 17, "NE", "BUF")
>>> print game.players
[B.Hoyer, T.Brady, B.Green-Ellis, A.Hernandez, J.Edelman, S.Ridley, D.Woodhead, R.Gronkowski, W.Welker, S.Gostkowski, Z.Mesko, M.Slater, K.Arrington, M.Anderson, J.Mayo, A.Molden, N.Jones, P.Chung, B.Deaderick, D.Fletcher, D.McCourty, V.Wilfork, N.Koutouvides, R.Ninkovich, K.Love, L.Polite, B.Spikes, S.Moore, R.Fitzpatrick, C.Spiller, G.Wilson, T.Choice, R.Martin, N.Roosevelt, D.Nelson, S.Chandler, D.Hagan, K.Brock, St.Johnson, C.Brown, B.Coutu, B.Moorman, J.Rogers, L.McKelvin, D.Florence, J.Byrd, N.Barnett, L.Dotson, Sp.Johnson, M.Dareus, K.Heard, C.Kelsay, A.Carrington, K.Morrison, A.Davis, K.Moore, B.Scott, K.Sheppard, A.Moats, A.Williams, D.Edwards]
We first had to tell nflgame which game we want to inspect. We do that by calling a function called nflgame.one, which always returns a single game. A game can be specified by the year, week number and the home and away teams (where the home team always comes first). The game returned by nflgame.one is now stored in the game variable. (In Python, the = sign means "assign the thing on the right to the thing on the left.")
Using the game stored in the game variable, we can access every player in the game using the players property, which is accessed by game.players. We then print it using the print statement, which simply echoes a list of player names in the game.
Let's get a little more interesting. What if we wanted to see who threw passes in the game? We can search our list of players using methods that filter the data. Assuming the game variable still holds that week 17 NE vs. BUF game:
>>> print game.players.passing()
[B.Hoyer, T.Brady, R.Fitzpatrick]
We can do the same thing for rushing, receiving, defense or kicking:
>>> print game.players.rushing()
[B.Hoyer, B.Green-Ellis, A.Hernandez, J.Edelman, S.Ridley, D.Woodhead, R.Fitzpatrick, C.Spiller, G.Wilson, T.Choice]
>>> print game.players.receiving()
[B.Green-Ellis, A.Hernandez, D.Woodhead, R.Gronkowski, W.Welker, C.Spiller, T.Choice, R.Martin, N.Roosevelt, D.Nelson, S.Chandler, D.Hagan, K.Brock, St.Johnson]
>>> print game.players.defense()
[J.Edelman, K.Arrington, M.Anderson, J.Mayo, A.Molden, N.Jones, P.Chung, B.Deaderick, D.Fletcher, D.McCourty, V.Wilfork, N.Koutouvides, R.Ninkovich, K.Love, L.Polite, B.Spikes, S.Moore, G.Wilson, J.Rogers, D.Florence, J.Byrd, N.Barnett, L.Dotson, Sp.Johnson, M.Dareus, K.Heard, C.Kelsay, A.Carrington, K.Morrison, A.Davis, K.Moore, B.Scott, K.Sheppard, A.Moats, A.Williams, D.Edwards]
>>> print game.players.kicking()
[S.Gostkowski, B.Coutu]
To close out this first section, let's see how we can look at more than a player's name. In order to do this, we need something called a loop—which is simply a way to walk through each player in the lists we printed above, and do something with each player. For example, we could print each passer's completions, attempts and yards:
>>> for p in game.players.passing():
...     print p, p.passing_cmp, p.passing_att, p.passing_yds
...
B.Hoyer 1 1 22
T.Brady 23 35 338
R.Fitzpatrick 29 46 307
Here we use Python's for loop to walk through each player that has a passing statistic. We store each player in the variable p. Finally, since we restricted our list of players to players that have passed the ball, we can access passing statistics such as passing_cmp, passing_att and passing_yds, which are properties of the player stored in p.
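If you'd like to practice loops without nflgame, here is a small sketch in plain Python that walks through the same three passing lines and works out each passer's completion percentage. The passers list and the completion_pct name are made up for this illustration; the numbers are the ones printed above. It is written so that it should run the same way on Python 2.7 (used in this tutorial) and on Python 3.

```python
# Passing lines from the game above: (name, completions, attempts, yards).
passers = [
    ("B.Hoyer", 1, 1, 22),
    ("T.Brady", 23, 35, 338),
    ("R.Fitzpatrick", 29, 46, 307),
]

completion_pct = {}
for name, completions, attempts, yards in passers:
    # Multiplying by 100.0 (a decimal number) first avoids whole-number
    # division, which Python 2.7 would otherwise use for 23 / 35.
    pct = 100.0 * completions / attempts
    completion_pct[name] = round(pct, 1)
    print("%s completed %d of %d passes (%.1f%%) for %d yards"
          % (name, completions, attempts, pct, yards))
```

The loop body runs once per passer, just like the loop over game.players.passing() above.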
In the previous section, we saw how to get lists of players with certain statistics like passing, rushing or receiving. But what if we want to filter those players even more? Perhaps we're only interested in players on the defense that have two interceptions:
>>> print game.players.defense().filter(defense_int=2)
[S.Moore]
The filter method here filters only the players with defensive stats since we've used game.players.defense(). The filter says, "Take only players in the list whose property defense_int is equal to 2."
We can also use filter to look at only the home team players:
>>> print game.players.filter(home=True)
[B.Hoyer, T.Brady, B.Green-Ellis, A.Hernandez, J.Edelman, S.Ridley, D.Woodhead, R.Gronkowski, W.Welker, S.Gostkowski, Z.Mesko, M.Slater, K.Arrington, M.Anderson, J.Mayo, A.Molden, N.Jones, P.Chung, B.Deaderick, D.Fletcher, D.McCourty, V.Wilfork, N.Koutouvides, R.Ninkovich, K.Love, L.Polite, B.Spikes, S.Moore]
In this case, New England is the home team, so only players on the Patriots are returned.
A more advanced use of filter is to use functions to determine whether a particular stat should be filtered or not. For example, here we look at every player in the game with at least one interception:
>>> print game.players.defense().filter(defense_int=lambda x: x >= 1)
[A.Molden, D.McCourty, S.Moore, N.Barnett]
And finally, filter attributes can be combined:
>>> print game.players.defense().filter(home=True, defense_int=lambda x: x >= 1)
[A.Molden, D.McCourty, S.Moore]
This returns only players on the home team that have at least one interception.
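If you're curious how a filter can accept both plain values and functions, here is a rough sketch in plain Python. It is not nflgame's actual code: the keep function and the list of dictionaries are made up for illustration, filled in with the rushing numbers from this same game. It should run on Python 2.7 and Python 3 alike.

```python
# Rushing lines from this game. New England was the home team.
rushers = [
    {"name": "B.Hoyer", "home": True, "rushing_att": 3, "rushing_yds": -2},
    {"name": "B.Green-Ellis", "home": True, "rushing_att": 7, "rushing_yds": 22},
    {"name": "A.Hernandez", "home": True, "rushing_att": 2, "rushing_yds": 26},
    {"name": "J.Edelman", "home": True, "rushing_att": 1, "rushing_yds": 6},
    {"name": "S.Ridley", "home": True, "rushing_att": 15, "rushing_yds": 81},
    {"name": "D.Woodhead", "home": True, "rushing_att": 1, "rushing_yds": 5},
    {"name": "R.Fitzpatrick", "home": False, "rushing_att": 5, "rushing_yds": 36},
    {"name": "C.Spiller", "home": False, "rushing_att": 13, "rushing_yds": 60},
    {"name": "G.Wilson", "home": False, "rushing_att": 1, "rushing_yds": 6},
    {"name": "T.Choice", "home": False, "rushing_att": 1, "rushing_yds": 4},
]


def keep(players, **criteria):
    """Return the players that satisfy every keyword criterion.

    A criterion is either a plain value (the stat must equal it) or a
    function (it must return True when given the stat).
    """
    kept = []
    for player in players:
        matches_all = True
        for field, wanted in criteria.items():
            value = player[field]
            if callable(wanted):
                matches = wanted(value)
            else:
                matches = (value == wanted)
            if not matches:
                matches_all = False
                break
        if matches_all:
            kept.append(player)
    return kept


# Plain value: exactly one carry.
one_carry = keep(rushers, rushing_att=1)
# Combined criteria: home team AND at least 20 rushing yards.
home_20_plus = keep(rushers, home=True, rushing_yds=lambda x: x >= 20)

print([p["name"] for p in one_carry])
print([p["name"] for p in home_20_plus])
```

The lambda x: x >= 20 part is just a tiny unnamed function: it receives one stat value and answers True or False.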
One of the most important aspects of viewing statistics is the ability to sort them. Sorting works much like everything else we've seen so far. It is a method that can be used on any list of players.
For example, we might want to see a list of rushing leaders in the game by yards:
>>> for p in game.players.rushing().sort("rushing_yds"):
...     print p, p.rushing_att, p.rushing_yds
...
S.Ridley 15 81
C.Spiller 13 60
R.Fitzpatrick 5 36
A.Hernandez 2 26
B.Green-Ellis 7 22
J.Edelman 1 6
G.Wilson 1 6
D.Woodhead 1 5
T.Choice 1 4
B.Hoyer 3 -2
Or we could sort rushers by attempts:
>>> for p in game.players.rushing().sort("rushing_att"):
...     print p, p.rushing_att, p.rushing_yds
...
S.Ridley 15 81
C.Spiller 13 60
B.Green-Ellis 7 22
R.Fitzpatrick 5 36
B.Hoyer 3 -2
A.Hernandez 2 26
J.Edelman 1 6
D.Woodhead 1 5
G.Wilson 1 6
T.Choice 1 4
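Notice that both listings put the largest value first. For comparison, the same orderings can be reproduced with Python's built-in sorted function. This is only a sketch with the rushing lines typed in by hand as a list of tuples (the rushers, by_yards, by_attempts and top_three names are made up here); it says nothing about how nflgame sorts internally. It should run on Python 2.7 and Python 3.

```python
# (name, rushing_att, rushing_yds), in the order game.players.rushing() listed them.
rushers = [
    ("B.Hoyer", 3, -2),
    ("B.Green-Ellis", 7, 22),
    ("A.Hernandez", 2, 26),
    ("J.Edelman", 1, 6),
    ("S.Ridley", 15, 81),
    ("D.Woodhead", 1, 5),
    ("R.Fitzpatrick", 5, 36),
    ("C.Spiller", 13, 60),
    ("G.Wilson", 1, 6),
    ("T.Choice", 1, 4),
]

# key picks the number to sort on; reverse=True puts the largest first.
by_yards = sorted(rushers, key=lambda r: r[2], reverse=True)
by_attempts = sorted(rushers, key=lambda r: r[1], reverse=True)

# Slicing keeps only the first few entries of a sorted list.
top_three = by_yards[:3]

for name, attempts, yards in by_attempts:
    print("%s %d %d" % (name, attempts, yards))
```

Players tied on the sort value (for example, the four rushers with one attempt) stay in their original relative order, because Python's sort is stable.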
Perhaps you only care about the statistics of a few star players. nflgame provides a way to look up statistics by player name:
>>> tom_brady = game.players.name("T.Brady")
>>> print tom_brady, "\n", tom_brady.formatted_stats()
T.Brady
passing_twoptm: 0, passing_twopta: 0, passing_att: 35, passing_ints: 1, passing_tds: 3, passing_yds: 338, passing_cmp: 23, fumbles_lost: 0, fumbles_trcv: 1, fumbles_rcv: 1, fumbles_tot: 1, fumbles_yds: 0
Here we use the name method, which works on any list of players. It searches the current list of players for a player that has a name matching the one provided. (Note: Names are typically first initial, followed by a ".", followed by the last name with no spaces. The names are this way because it is how the NFL formats their GameCenter data.) If no player is found, a special value called None is returned.
In this example, we load a player into the aptly named tom_brady variable. We then print the player's name and a roughly formatted list of all statistics available for the player in this particular game. The formatted statistics are accessed with the formatted_stats method, which works on every player. (There is also a stats attribute, which holds a dictionary of statistics.)
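Because a failed lookup gives back None, it is worth checking for it before asking for any statistics. The sketch below shows that pattern in plain Python. The find_by_name and describe functions and the two-player list are made up for illustration and only imitate the behavior described above; they are not part of nflgame. It should run on Python 2.7 and Python 3.

```python
def find_by_name(players, wanted):
    """Return the first player whose name matches, or None if nobody does."""
    for player in players:
        if player["name"] == wanted:
            return player
    return None


def describe(players, wanted):
    """Build a one-line summary, handling the 'not found' case explicitly."""
    player = find_by_name(players, wanted)
    if player is None:
        return "No player named %s in this list" % wanted
    return "%s threw for %d yards" % (player["name"], player["passing_yds"])


# Passing yards taken from the game above.
players = [
    {"name": "T.Brady", "passing_yds": 338},
    {"name": "R.Fitzpatrick", "passing_yds": 307},
]

print(describe(players, "T.Brady"))
# A full name does not match the first-initial format, so nothing is found.
print(describe(players, "Tom Brady"))
```

Without the "is None" check, asking a missing player for a statistic would stop the program with an error.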
Another convenience method that works on any list of players is the touchdowns method. It narrows the list to only those players who have hit pay dirt. For example, we can look at every player who scored in this game:
>>> for p in game.players.touchdowns():
...     print p, p.tds
...
T.Brady 3
B.Green-Ellis 2
A.Hernandez 1
R.Gronkowski 2
R.Fitzpatrick 2
C.Spiller 1
T.Choice 1
St.Johnson 1
For each player, we print out the player's name and the total number of touchdowns credited to the player across all statistical categories.
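To make "across all statistical categories" concrete, here is a guess at the idea in plain Python: add up every statistic whose name ends in _tds. That naming pattern is an assumption based on the field names seen in this tutorial (passing_tds, rushing_tds, receiving_tds), and total_touchdowns is a made-up helper, not nflgame's own code. The dictionary holds Tom Brady's statistics as printed earlier. It should run on Python 2.7 and Python 3.

```python
# Tom Brady's statistics from this game, as printed by formatted_stats above.
brady_stats = {
    "passing_twoptm": 0, "passing_twopta": 0, "passing_att": 35,
    "passing_ints": 1, "passing_tds": 3, "passing_yds": 338,
    "passing_cmp": 23, "fumbles_lost": 0, "fumbles_trcv": 1,
    "fumbles_rcv": 1, "fumbles_tot": 1, "fumbles_yds": 0,
}


def total_touchdowns(stats):
    """Add up every statistic whose name ends in '_tds' (an assumed convention)."""
    total = 0
    for field, value in stats.items():
        if field.endswith("_tds"):
            total += value
    return total


print("T.Brady %d" % total_touchdowns(brady_stats))
```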
There is a host of information related to the game itself that may also be interesting to see. Such things include the current game time, the score, the quarter, scoring plays, etc. Here are a couple of examples, again using the game variable from before:
>>> print game.winner
NE
>>> print game.game_over()
True
>>> print game.score_home, game.score_away
49 21
And of course, checking out all of the scoring in this game is particularly startling:
>>> for score_play in game.scores:
...     print score_play
...
BUF - Q1 - TD - T.Choice 4 yd. run (B.Coutu kick is good) Drive: 8 plays, 80 yards in 3:42
BUF - Q1 - TD - St.Johnson 18 yd. pass from R.Fitzpatrick (B.Coutu kick is good) Drive: 10 plays, 70 yards in 4:33
BUF - Q1 - TD - C.Spiller 15 yd. pass from R.Fitzpatrick (B.Coutu kick is good) Drive: 6 plays, 82 yards in 3:07
NE - Q2 - TD - B.Green-Ellis 1 yd. run (S.Gostkowski kick is good) Drive: 9 plays, 77 yards in 3:50
NE - Q2 - TD - A.Hernandez 39 yd. pass from T.Brady (S.Gostkowski kick is good) Drive: 7 plays, 81 yards in 3:17
NE - Q3 - FG - S.Gostkowski 47 yd. Field Goal Drive: 9 plays, 50 yards in 2:41
NE - Q3 - FG - S.Gostkowski 20 yd. Field Goal Drive: 9 plays, 71 yards in 3:25
NE - Q3 - TD - R.Gronkowski 17 yd. pass from T.Brady (D.Woodhead run) Drive: 5 plays, 25 yards in 2:22
NE - Q4 - TD - B.Green-Ellis 3 yd. run (S.Gostkowski kick is good) Drive: 2 plays, 47 yards in 0:48
NE - Q4 - TD - R.Gronkowski 7 yd. pass from T.Brady (S.Gostkowski kick is good) Drive: 14 plays, 88 yards in 7:21
NE - Q4 - TD - S.Moore 21 yd. interception return (S.Gostkowski kick is good)
Every list of players can be exported to a comma-separated values (CSV) file, which Excel can open, by using the csv method. Indeed, it can be applied to any of the aforementioned examples.
To export all players with passing statistics, sorted by passing yards:
>>> game.players.passing().sort("passing_yds").csv("passers.csv")
If nothing appears after you hit enter, that means the command executed successfully. The data is saved to passers.csv. In the above example you could change "passers.csv" to something like "C:/Users/YourUsername/Desktop/passers.csv" to have the file saved to your desktop. You should then be able to open it with Excel, Google Docs, OpenOffice, LibreOffice, etc.
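If you'd rather not type out your username, Python can build the desktop path for you. This is a small sketch using the standard os module. It assumes your desktop folder is named "Desktop" and sits directly inside your user folder, which is common but not guaranteed; the desktop_csv name is made up here. It should run on Python 2.7 and Python 3.

```python
import os

# "~" is shorthand for the current user's home folder.
home_folder = os.path.expanduser("~")

# Join the pieces with the right separator for this operating system.
desktop_csv = os.path.join(home_folder, "Desktop", "passers.csv")

print(desktop_csv)
```

You could then give desktop_csv to the csv method in place of "passers.csv".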
Up until this point, we've focused on examining statistics for just a single game. But what if we wanted to examine statistics for an entire week, or even an entire season? In fact, we can examine statistics for any number of games using nflgame.combine, which takes a list of games and returns a sequence of players in every game. And the best part is, player statistics are automatically added together for you if they've played in more than one of those games.
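To picture what "added together" means, here is a sketch in plain Python that merges two games' worth of rushing numbers. Everything in it is hypothetical: the players, the numbers and the combine_games function are invented for illustration and are not nflgame's implementation. It should run on Python 2.7 and Python 3.

```python
from collections import defaultdict

# Hypothetical per-game numbers for made-up players, for illustration only.
game_one = {
    "Player A": {"rushing_att": 20, "rushing_yds": 95},
    "Player B": {"rushing_att": 8, "rushing_yds": 41},
}
game_two = {
    "Player A": {"rushing_att": 15, "rushing_yds": 60},
    "Player C": {"rushing_att": 3, "rushing_yds": 12},
}


def combine_games(games):
    """Merge several games into one set of totals per player."""
    totals = {}
    for game in games:
        for name, stats in game.items():
            # Start a player at zero for every stat the first time we see them.
            player_totals = totals.setdefault(name, defaultdict(int))
            for field, value in stats.items():
                player_totals[field] += value
    return totals


combined = combine_games([game_one, game_two])
for name in sorted(combined):
    print("%s %d %d" % (name, combined[name]["rushing_att"],
                        combined[name]["rushing_yds"]))
```

Player A appears in both games, so that player's attempts and yards are summed; the other two players keep the numbers from their single game.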
Enough blabbing. How about the top ten rushers in week 2 of the 2009 season? First, let's get all of the players that played in week 2 of 2009:
>>> week2 = nflgame.games(2009, 2)
>>> players = nflgame.combine(week2)
We use nflgame.games to automatically retrieve statistics for all games in week 2 of the 2009 season. We then use nflgame.combine to group all of the players in those games into a single searchable list of players. Now, we only need to apply what we've already learned from previous examples:
>>> for p in players.rushing().sort("rushing_yds").limit(10):
...     print p, p.rushing_att, p.rushing_yds, p.rushing_tds
...
F.Gore 16 207 2
C.Johnson 16 197 2
F.Jackson 28 163 0
C.Benson 29 141 0
R.Brown 24 136 2
M.Barber 18 124 1
M.Turner 28 105 1
S.Jackson 17 104 0
F.Jones 7 96 1
A.Peterson 15 92 1
What if you wanted to see who passed for the most touchdowns in the first five weeks of the 2011 season?
>>> games1_5 = nflgame.games(2011, week=[1, 2, 3, 4, 5])
>>> players = nflgame.combine(games1_5)
>>> for p in players.passing().sort("passing_tds").limit(20):
...     print p, p.passing_tds
...
T.Brady 13
A.Rodgers 12
M.Stafford 11
D.Brees 10
R.Fitzpatrick 9
M.Hasselbeck 8
E.Manning 8
K.Orton 8
J.Flacco 7
M.Schaub 7
M.Ryan 6
M.Vick 6
M.Sanchez 6
C.McCoy 5
J.Cutler 5
R.Grossman 5
K.Kolb 5
C.Newton 5
T.Jackson 5
P.Rivers 5
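Typing out every week number gets tedious for longer stretches. Python's built-in range can build the same list of numbers for you. The variable names below are made up for this sketch, which should run on Python 2.7 and Python 3.

```python
# range(start, stop) counts from start up to, but not including, stop.
first_five_weeks = list(range(1, 6))
weeks_10_to_14 = list(range(10, 15))

print(first_five_weeks)
print(weeks_10_to_14)
```

Since first_five_weeks holds exactly the list [1, 2, 3, 4, 5], writing week=first_five_weeks hands nflgame.games the same value as the longhand version above.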
Or perhaps how many touchdowns Brady threw at home in 2010?
>>> nehome = nflgame.games(2010, home="NE")
>>> players = nflgame.combine(nehome)
>>> brady = players.name("T.Brady")
>>> print brady, brady.passing_tds
T.Brady 16
And for away games in 2010?
>>> neaway = nflgame.games(2010, away="NE")
>>> players = nflgame.combine(neaway)
>>> brady = players.name("T.Brady")
>>> print brady, brady.passing_tds
T.Brady 15
Or how about the receiving leaders for the entire 2009 season?
(Note: The first two prompts might take a few seconds to complete, depending upon the speed of your computer. It has to read, load and merge 256 games!)
>>> season2009 = nflgame.games(2009)
>>> players = nflgame.combine(season2009)
>>> for p in players.receiving().sort("receiving_yds").limit(15):
...     print p, p.receiving_yds, p.receiving_rec, p.receiving_tds
...
A.Johnson 1504 95 9
W.Welker 1336 122 4
S.Holmes 1243 78 4
R.Wayne 1243 95 10
M.Austin 1230 74 11
S.Rice 1200 78 6
R.Moss 1189 78 13
S.Smith 1163 97 7
A.Gates 1145 78 7
D.Jackson 1120 60 9
H.Ward 1106 87 6
V.Jackson 1097 63 9
G.Jennings 1091 66 4
R.White 1087 79 10
B.Marshall 1081 93 10
Remember, with any of the above examples, you can export the statistics to a CSV file that can be read by Excel. For example, to export the entire 2011 season in just a single line:
>>> nflgame.combine(nflgame.games(2011)).csv("2011.csv")
This tutorial has covered the essentials of what nflgame has to offer. nflgame's convenient API, coupled with the ability to get live updates as games are being played, makes the possibilities of what you can do with nflgame limitless. (Perhaps a well-built, open source piece of fantasy football software?)
If you think your Python-fu is up to snuff, check out nflgame's API, which offers a complete look at what nflgame has to offer.
Have fun!