Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculate Defense Player Stats #288

Merged
merged 31 commits into from
Sep 12, 2022
Merged

Conversation

christianlohr9
Copy link
Contributor

Please note that I just saw in line 618 still is a static year. Let @mrcaseb fix this, he has nothing else to do anyways :)

This script only works season long if the season is pre defined. Working on some stuff and also to get my noob code ready for seasons and weeks.

…ing on some stuff and also to get my noob code ready for seasons and weeks.
@TheMathNinja

This comment was marked as off-topic.

@christianlohr9

This comment was marked as off-topic.

@tanho63
Copy link
Member

tanho63 commented Sep 30, 2021

more globally, every NSE variable will need .data$ prefixed to it for CRAN purposes, e.g. every time you use a column name without quotations

@mrcaseb
Copy link
Member

mrcaseb commented Sep 30, 2021

The principle way calculate_player_stats() works is that we compute all stats on a per game base. And if the argument weekly is set to FALSE we summarise this per player across the complete dataset, i.e. we summarise multiple seasons if the argument pbp holds multiple seasons.

We need a way to write the pivot_longer, pivot_wider calls (which are nice btw) to do it per game rather than directly across the complete dataset

@mrcaseb
Copy link
Member

mrcaseb commented Oct 25, 2021

OK I started updating this. I have removed irrelevant stuff and added a function that can be used to sum across columns using tidyselect variable delection to avoid crashes if a stat is missing.

I also redid the principle counting of tackling stats and implemented a weekly summary. I'd like to use this method for all other stats as well so to be able to summarise over weeks optionally.

@tanho63
Copy link
Member

tanho63 commented Nov 14, 2021

Spent a stream working through this.

TODO:

  • fumbles review/refactor -> DONE, Seb, 2022/08/10
  • penalties review/refactor (I notice at the top it filters only to run/pass etc - that would theoretically drop out all the no-play penalties, right?)
  • touchdowns review/refactor
  • similarly, I notice at the top it pulls out just a data object that only includes run/pass - do special teams and 2pt conv tackles get included in defensive stats?

@mrcaseb mrcaseb marked this pull request as draft August 10, 2022 12:44
@mrcaseb
Copy link
Member

mrcaseb commented Aug 10, 2022

Fumble stats refactored for season week distinction.
Player position now joined via load_players()
Still needs the option to summarise over all seasons and weeks similar to offensive stats

@TheMathNinja
Copy link
Contributor

TheMathNinja commented Aug 19, 2022

Thanks for writing this, @christianlohr9 !
I think we'll also need a penalty_yards = sum(.data$penalty_yards) at the end, yeah?

If either @mrcaseb or @tanho63 can push a commit updating sack_yards and adding penalty_yards in the next couple hours, I could run some pretty extensive testing between now and 2:30 p.m. ET.

@tanho63
Copy link
Member

tanho63 commented Aug 19, 2022

I'm unavailable, but @christianlohr9 is the branch owner and should be able to add it to the branch directly!

@christianlohr9
Copy link
Contributor Author

I just finished work on that function and tested it.
It should work fine now.

Tell me how I submit anything new as branch owner (sorry Tan, I'm a noob :( ) and I will do so :D

@TheMathNinja
Copy link
Contributor

My biggest question about this function is at the highest level: does this function compute only defensive stats while the player is on defense, or does it compute the stats whenever?

Player stats to me are "all of X stat for the given week" - i.e. what shows up on the box score. I understand wanting "on-defense" stats but that's not in scope, imo. Let's keep it simple and get this data out the door - more detailed computations can be done externally from the base data calculation

Quite a few variables here, like penalty, own_fumble_recoveries and even tackles happen on offense and defense

Further to this: we've classified the stats themselves as belonging to either offense or defense or kicking, not "when the player is on the field"

So @tanho63 are you in agreement in saying that this function needs a decent philosophical upgrade/re-write? Right now I'm seeing like 20 references to defteam, but based on what you're saying here, we need to remove those filters so that any stat we are counting here is counting for a player on offense, defense, special teams. So Own Fumble Recoveries and Own Fumble Recovery Yards here is going to apply to a QB picking up their own fumble on offense, etc. Is that right? You want to remove all those references to defteam where it's filtering only plays on the defensive side of the ball?

@TheMathNinja
Copy link
Contributor

TheMathNinja commented Aug 31, 2022

I think before the next update gets pushed, it will be helpful to clarify what we should do with the side-of-the-ball issue, especially for those variables that are presently repeats of calculate_player_stats() used for offensive players (TD, fumbles, etc.). I'm thinking since the work was already done on those, we should re-name the calculate_player_stats_def() variables to something like td_def and fumbles_def, while making every new category (tackles, tackle_assists, penalties, etc.) ignorant of side of the ball.

@mrcaseb
Copy link
Member

mrcaseb commented Sep 7, 2022

Most recent commit

  • fixes bugs according to @christianlohr9 (he added some group_by > summarise > ungroup statements that might be unnecessary but who knows where we actually need them, so I kept all of them),
  • adds team names to intermediate data frames to make sure every player is listed with a team (smart approach @tanho63),
  • adds more player information analog calculate_player_stats

To-Do

@tanho63
Copy link
Member

tanho63 commented Sep 7, 2022

* [ ]  review actual function name?

I don't care what it's named that much but think the current name is fine since it comes up in autocomplete immediately after the current (offense) version.
Most people will use the load_player_stats function anyway which is better ergonomics (stat_type arg)

* [ ]  review [Calculate Defense Player Stats #288 (comment)](https://github.com/nflverse/nflfastR/pull/288#discussion_r961771483)

In hindsight we can be lazy and solve this by naming everything with def_ prefixing it, e.g. def_td, def_int, def_tackle. This disambiguates if user ever tries to rowbind offense and defense stats together. Also it leaves the special teams tackles and such for future exploration and doesn't necessitate a rewrite - I am very very much in fuck it ship it mode since this PR is a year old.

@TheMathNinja
Copy link
Contributor

FWIW I like putting def_ in front of everything that is already included in the current calculate_player_stats() but for things that aren't in that function, (like tackles and penalties, I think there's wisdom in keeping it universal so that we can use this function to populate all of Lamar Jackson's tackles and all of T.J. Watt's penalties in one df). The penalties formula that @christianlohr9 last shipped did not recognize side of ball.

@mrcaseb
Copy link
Member

mrcaseb commented Sep 7, 2022

Hm I think we'll put def_ everywhere to help ourselves recognize which function creates the variable. It is always up to the user to run both of them and combine what user thinks it's necessary to combine.

@tanho63
Copy link
Member

tanho63 commented Sep 7, 2022

also delineates def_tackles from st_tackles later. Penalty code currently checks for penalty_team = defteam which to me means defensive penalties only.

@TheMathNinja
Copy link
Contributor

also delineates def_tackles from st_tackles later. Penalty code currently checks for penalty_team = defteam which to me means defensive penalties only.

Yeah I guess it could just get a bit confusing when that happens on "special teams plays". I knew of specific instances where it was counting special teams penalties when the player was on the defensive side of the ball.

@tanho63 tanho63 marked this pull request as ready for review September 12, 2022 12:05
@tanho63
Copy link
Member

tanho63 commented Sep 12, 2022

Naming finalized as discussed.

Yeah I guess it could just get a bit confusing when that happens on "special teams plays". I knew of specific instances where it was counting special teams penalties when the player was on the defensive side of the ball.

Problem to resolve sometime later, let's ship this. Waiting on @mrcaseb review.

@mrcaseb
Copy link
Member

mrcaseb commented Sep 12, 2022

@tanho63 we will merge another PR soon that will cause conflicts with NEWS. Can you merge when it is in?

@tanho63
Copy link
Member

tanho63 commented Sep 12, 2022

Yep, just gimme a PR content review when you get a chance

NEWS.md Outdated Show resolved Hide resolved
R/aggregate_game_stats_def.R Show resolved Hide resolved
mrcaseb
mrcaseb previously approved these changes Sep 12, 2022
Copy link
Member

@mrcaseb mrcaseb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with this now

@tanho63 tanho63 merged commit 090140c into nflverse:master Sep 12, 2022
@tanho63
Copy link
Member

tanho63 commented Sep 12, 2022

Thanks for all of the hard work and feedback, everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants