Author Topic: Lots of data to look at  (Read 4809 times)

Offline mocat

  • Pak'r Élitaire
  • ****
  • Posts: 39169
    • View Profile
Re: Lots of data to look at
« Reply #25 on: October 16, 2012, 02:13:21 PM »
well be my guest. have fun compiling that much data. it's much harder to find out how many possessions a team has than simply adding passing attempts + rushing attempts

I know... it would be nice if the stat services had it available though.  Not like it would be hard or unheard of, they do it for basketball.

possessions in basketball are essentially the same as plays in football; teams usually average somewhere in the 60-80 range

Offline 8manpick

  • Pak'r Élitaire
  • ****
  • Posts: 19132
  • A top quartile binger, poster, and friend
    • View Profile
Re: Lots of data to look at
« Reply #26 on: October 16, 2012, 02:20:24 PM »
well be my guest. have fun compiling that much data. it's much harder to find out how many possessions a team has than simply adding passing attempts + rushing attempts

I know... it would be nice if the stat services had it available though.  Not like it would be hard or unheard of, they do it for basketball.

possessions in basketball are essentially the same as plays in football; teams usually average somewhere in the 60-80 range

Oh boy, tell me more :dubious:

The point is that the possession is the smallest unit you can break a football game into to get the best idea of an offense's (or defense's) effectiveness.  It eliminates the pts/play bias that makes a team averaging 12 plays per possession appear half as good as one averaging 6 plays per possession when they score at the same rate.
:adios:
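
To put numbers on that bias, here is a quick sketch with two made-up teams (the point totals and possession counts are hypothetical, not real stats). Both score at the same rate per possession, but one needs twice as many plays to do it:

```python
# Hypothetical numbers: both teams score 35 points on 10 possessions,
# but team A grinds out 12 plays per possession and team B needs only 6.
teams = {
    "A": {"points": 35, "possessions": 10, "plays_per_possession": 12},
    "B": {"points": 35, "possessions": 10, "plays_per_possession": 6},
}


def efficiency(team):
    """Return (points per possession, points per play) for one team."""
    plays = team["possessions"] * team["plays_per_possession"]
    return team["points"] / team["possessions"], team["points"] / plays


for name, team in teams.items():
    per_poss, per_play = efficiency(team)
    print(f"Team {name}: {per_poss:.2f} pts/possession, {per_play:.3f} pts/play")
```

Points per possession comes out identical for the two teams, while points per play for team A is half of team B's.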

Offline SwiftCat

  • #LIFE
  • Pak'r Élitaire
  • ****
  • Posts: 3618
  • Depth Charge
    • View Profile
Re: Lots of data to look at
« Reply #27 on: October 16, 2012, 02:34:03 PM »
Exactly. I'm sure the charts would still have similar results as points per play, but I still think it'd be a better indicator of a team's overall efficiency.

Offline mocat

  • Pak'r Élitaire
  • ****
  • Posts: 39169
    • View Profile
Re: Lots of data to look at
« Reply #28 on: October 16, 2012, 02:46:03 PM »
well be my guest. have fun compiling that much data. it's much harder to find out how many possessions a team has than simply adding passing attempts + rushing attempts

I know... it would be nice if the stat services had it available though.  Not like it would be hard or unheard of, they do it for basketball.

possessions in basketball are essentially the same as plays in football; teams usually average somewhere in the 60-80 range

Oh boy, tell me more :dubious:

The point is that the possession is the smallest unit you can break a football game into to get the best idea of an offense's (or defense's) effectiveness.  It eliminates the pts/play bias that makes a team averaging 12 plays per possession appear half as good as one averaging 6 plays per possession when they score at the same rate.

yeah i get what youre saying, except, k-state is absolutely off the charts on PPP, even with our 12 plays per possession

Offline SleepFighter

  • Katpak'r
  • ***
  • Posts: 1965
  • I'll wait here for my Cherry Coke Zero.
    • View Profile
Re: Lots of data to look at
« Reply #30 on: October 16, 2012, 04:12:49 PM »
To summarize for the straight to the bottom crowd, K-State is #1 in the country in both raw and adjusted points per possession.

Offline Rage Against the McKee

  • Pak'r Élitaire
  • ****
  • Posts: 37111
    • View Profile
Re: Lots of data to look at
« Reply #31 on: October 16, 2012, 04:26:56 PM »
To summarize for the straight to the bottom crowd, K-State is #1 in the country in both raw and adjusted points per possession.

Also #1 in adjusted points per play

Offline Stevesie60

  • Fattyfest Champion
  • Pak'r Élitaire
  • *****
  • Posts: 17146
    • View Profile
Re: Lots of data to look at
« Reply #32 on: October 16, 2012, 04:51:56 PM »

Offline CHONGS

  • Master of the Atom
  • Administrator
  • Pak'r Élitaire
  • *****
  • Posts: 19427
    • View Profile
    • goEMAW.com
Re: Lots of data to look at
« Reply #33 on: October 16, 2012, 05:09:47 PM »
I think there is a slight difference of "philosophy" as it were on how to compare teams.  My goal is to use as few parameters as possible.  I have no doubts a model with 27 fit parameters will fit the data better than a model with 3.  I appreciate the limitations of using only 2+1 statistics (points per play scored, points per play given up, and losses), but in fact that is my goal.  There is a strong correlation between the Pythagorean win % calculated with OE and DE and the actual win %. 

Another limitation is the availability of statistics in computable form.  Possessions are not an easy stat to extract for every team for every game, but I agree I would love to have it.   
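
For anyone who hasn't seen the Pythagorean win % before, a minimal sketch of the general formula follows. The exponent actually used for these charts isn't stated in this thread, so the default of 2.37 (a value often quoted for pro football) is purely an assumption, and the function name is made up for illustration. OE and DE are taken here to be the points-per-play scored and given up.

```python
def pythagorean_win_pct(oe, de, exponent=2.37):
    """Pythagorean expectation: oe**x / (oe**x + de**x).

    oe: offensive efficiency (assumed: points per play scored)
    de: defensive efficiency (assumed: points per play given up)
    exponent: assumed value; the thread does not say which one is used
    """
    if oe < 0 or de < 0:
        raise ValueError("efficiencies must be non-negative")
    if oe == 0 and de == 0:
        raise ValueError("at least one efficiency must be positive")
    scored = oe ** exponent
    allowed = de ** exponent
    return scored / (scored + allowed)


# Hypothetical team scoring 0.50 pts/play and allowing 0.30 pts/play.
print(f"{pythagorean_win_pct(0.50, 0.30):.3f}")
```

A team with equal OE and DE lands at exactly 0.5 whatever the exponent; the exponent only controls how fast the expected win % moves away from 0.5 as the two diverge.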

Offline michigancat

  • Contributor
  • Pak'r Élitaire
  • *****
  • Posts: 53786
  • change your stupid avatar.
    • View Profile
Re: Lots of data to look at
« Reply #34 on: October 16, 2012, 05:48:56 PM »
I think there is a slight difference of "philosophy" as it were on how to compare teams.  My goal is to use as few parameters as possible.  I have no doubts a model with 27 fit parameters will fit the data better than a model with 3.  I appreciate the limitations of using only 2+1 statistics (points per play scored, points per play given up, and losses), but in fact that is my goal.  There is a strong correlation between the Pythagorean win % calculated with OE and DE and the actual win %. 

Another limitation is the availability of statistics in computable form.  Possessions are not an easy stat to extract for every team for every game, but I agree I would love to have it.   

seems like extracting possessions would be easy. Punts, turnovers, scores, or ends of halves signify ends of possession. Is that in the wolfram download thingy?

Offline CHONGS

  • Master of the Atom
  • Administrator
  • Pak'r Élitaire
  • *****
  • Posts: 19427
    • View Profile
    • goEMAW.com
Re: Lots of data to look at
« Reply #35 on: October 16, 2012, 06:02:40 PM »
I think there is a slight difference of "philosophy" as it were on how to compare teams.  My goal is to use as few parameters as possible.  I have no doubts a model with 27 fit parameters will fit the data better than a model with 3.  I appreciate the limitations of using only 2+1 statistics (points per play scored, points per play given up, and losses), but in fact that is my goal.  There is a strong correlation between the Pythagorean win % calculated with OE and DE and the actual win %. 

Another limitation is the availability of statistics in computable form.  Possessions are not an easy stat to extract for every team for every game, but I agree I would love to have it.   

seems like extracting possessions would be easy. Punts, turnovers, scores, or ends of halves signify ends of possession. Is that in the wolfram download thingy?
Hmmm, I would have to see how close the sum of punts + turnovers + FG attempts + fourth downs not converted + safeties is to the number of offensive possessions; it might just be close enough.  I will miss drives stopped by the half, and multiple-turnover plays might muck it up, but those should be statistically small.
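
A rough sketch of that count, assuming the season totals are already in hand (the class and field names are hypothetical, not from any stat service). One caveat: the sum above doesn't list touchdowns, and a drive that ends in a TD wouldn't show up in any of the other columns, so the sketch adds an optional offensive-TD field. That addition is my assumption.

```python
from dataclasses import dataclass


@dataclass
class DriveEndCounts:
    """Box-score totals for one team (hypothetical field names)."""
    punts: int
    turnovers: int
    fg_attempts: int
    failed_fourth_downs: int
    safeties: int
    offensive_tds: int = 0  # assumption: not part of the sum in the post


def estimate_possessions(c):
    """Approximate offensive possessions by counting ways a drive can end.

    Drives cut off by the end of a half are not counted, and plays with
    multiple turnovers may be miscounted, so this is only an estimate.
    """
    return (c.punts + c.turnovers + c.fg_attempts
            + c.failed_fourth_downs + c.safeties + c.offensive_tds)


# Made-up single-game line, not real data.
game = DriveEndCounts(punts=4, turnovers=2, fg_attempts=2,
                      failed_fourth_downs=1, safeties=0, offensive_tds=4)
print(estimate_possessions(game))
```

Comparing this estimate against a handful of hand-counted games would show whether it really is close enough.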

Offline SwiftCat

  • #LIFE
  • Pak'r Élitaire
  • ****
  • Posts: 3618
  • Depth Charge
    • View Profile
Re: Lots of data to look at
« Reply #36 on: October 16, 2012, 06:03:16 PM »
Is a turnover on downs listed as a turnover? What about if a fumble is returned for a TD?

Offline michigancat

  • Contributor
  • Pak'r Élitaire
  • *****
  • Posts: 53786
  • change your stupid avatar.
    • View Profile
Re: Lots of data to look at
« Reply #37 on: October 16, 2012, 06:06:19 PM »
I think there is a slight difference of "philosophy" as it were on how to compare teams.  My goal is to use as few parameters as possible.  I have no doubts a model with 27 fit parameters will fit the data better than a model with 3.  I appreciate the limitations of using only 2+1 statistics (points per play scored, points per play given up, and losses), but in fact that is my goal.  There is a strong correlation between the Pythagorean win % calculated with OE and DE and the actual win %. 

Another limitation is the availability of statistics in computable form.  Possessions are not an easy stat to extract for every team for every game, but I agree I would love to have it.   

seems like extracting possessions would be easy. Punts, turnovers, scores, or ends of halves signify ends of possession. Is that in the wolfram download thingy?
Hmmm, I would have to see how close the sum of punts + turnovers + FG attempts + fourth downs not converted + safeties is to the number of offensive possessions; it might just be close enough.  I will miss drives stopped by the half, and multiple-turnover plays might muck it up, but those should be statistically small.

Is there not a timestamp on plays?

And I think a multiple turnover play should probably count as a new possession - the next play is 1st and 10 (or a score) no matter what.

Offline CHONGS

  • Master of the Atom
  • Administrator
  • Pak'r Élitaire
  • *****
  • Posts: 19427
    • View Profile
    • goEMAW.com
Re: Lots of data to look at
« Reply #38 on: October 16, 2012, 06:11:47 PM »
Is a turnover on downs listed as a turnover? What about if a fumble is returned for a TD?
I don't think it officially is, but I could be wrong.

The trouble again will be compiling all of these stats.  While in principle this could maybe be extracted from the website SleepFighter mentioned, it would take downloading/scraping at least 700 webpages by the end of the year.  I think they would greatly frown upon that.

Right now I extract my stats from NCAA and I only have to scrape 5 or so pages.  It is through my own record keeping, in fact, that I can break it down into game-by-game stats.  There is also the process of building the schedule matrix, which can be a pain in the ass.

Offline michigancat

  • Contributor
  • Pak'r Élitaire
  • *****
  • Posts: 53786
  • change your stupid avatar.
    • View Profile
Re: Lots of data to look at
« Reply #39 on: October 16, 2012, 06:16:44 PM »
Is a turnover on downs listed as a turnover? What about if a fumble is returned for a TD?
I don't think it officially is, but I could be wrong.

The trouble again will be compiling all of these stats.  While in principle this could maybe be extracted from the website SleepFighter mentioned, it would take downloading/scraping at least 700 webpages by the end of the year.  I think they would greatly frown upon that.

Right now I extract my stats from NCAA and I only have to scrape 5 or so pages.  It is through my own record keeping, in fact, that I can break it down into game-by-game stats.  There is also the process of building the schedule matrix, which can be a pain in the ass.

can you link to the pages you scrape?

Offline CHONGS

  • Master of the Atom
  • Administrator
  • Pak'r Élitaire
  • *****
  • Posts: 19427
    • View Profile
    • goEMAW.com
Re: Lots of data to look at
« Reply #40 on: October 16, 2012, 06:18:50 PM »
I think there is a slight difference of "philosophy" as it were on how to compare teams.  My goal is to use as few parameters as possible.  I have no doubts a model with 27 fit parameters will fit the data better than a model with 3.  I appreciate the limitations of using only 2+1 statistics (points per play scored, points per play given up, and losses), but in fact that is my goal.  There is a strong correlation between the Pythagorean win % calculated with OE and DE and the actual win %. 

Another limitation is the availability of statistics in computable form.  Possessions are not an easy stat to extract for every team for every game, but I agree I would love to have it.   

seems like extracting possessions would be easy. Punts, turnovers, scores, or ends of halves signify ends of possession. Is that in the wolfram download thingy?
Hmmm, I would have to see how close the sum of punts + turnovers + FG attempts + fourth downs not converted + safeties is to the number of offensive possessions; it might just be close enough.  I will miss drives stopped by the half, and multiple-turnover plays might muck it up, but those should be statistically small.

Is there not a timestamp on plays?

And I think a multiple turnover play should probably count as a new possession - the next play is 1st and 10 (or a score) no matter what.

The trouble is getting this data and being able to compute with it.  It's the trouble almost every company has: all this data and no efficient, plausible way to use it.    In my opinion, the work required does not impart a big enough benefit.  Averaged over a whole game and across a whole season and against all teams I imagine points per play and points per possession will differ merely by a scaling factor.   This scaling factor is irrelevant if you normalize the data in a consistent manner and should not affect the overall correlation with actual winning %.
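
A small sketch of the scaling argument, using made-up numbers (none of these are real team stats). If every team ran the same number of plays per possession, points per possession would just be points per play times a constant, and the correlation with win % would not move. The last part shows the case people are worried about, where plays per possession varies from team to team.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical five-team sample.
pts_per_play = [0.58, 0.45, 0.39, 0.31, 0.27]
win_pct = [1.00, 0.83, 0.67, 0.50, 0.33]

# Case 1: every team averages the same plays per possession (assumed 6).
k = 6.0
pts_per_poss_const = [k * p for p in pts_per_play]

# Case 2: plays per possession differs by team (assumed values).
plays_per_poss = [12, 6, 7, 5, 6]
pts_per_poss_var = [p * n for p, n in zip(pts_per_play, plays_per_poss)]

r_play = pearson(pts_per_play, win_pct)
r_const = pearson(pts_per_poss_const, win_pct)
r_var = pearson(pts_per_poss_var, win_pct)
print(r_play, r_const, r_var)
```

With a constant factor the two correlations agree up to rounding; once plays per possession differs by team, they no longer have to. How much they differ on real data is the open question.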

Offline CHONGS

  • Master of the Atom
  • Administrator
  • Pak'r Élitaire
  • *****
  • Posts: 19427
    • View Profile
    • goEMAW.com
Re: Lots of data to look at
« Reply #41 on: October 16, 2012, 06:23:17 PM »
This is an example:
http://statistics.ncaafootball.com/merge/tsnform.aspx?c=ncaa-football&page=cfoot/stat/ncaa-team-totaloff.htm

(Note: only 120 teams are reported; the newest 4 are left out, but they don't really matter much anyway.)