In Defense of Statistics: In Defense of Polls
It's been a long while since I posted an In Defense of Statistics post. The idea behind these was to explain exactly how the statistical rankings work. I know, crazy people (including the BCS) call them computer rankings, but the computers aren't evaluating games. For the most part, the computers are just solving a system of equations that simple statistics ("the better team is likely to win") sets up.
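For anyone who wants to see what "solving a system of equations" can look like, here's a minimal sketch of one textbook approach, a Bradley-Terry model. Each team gets a strength, and the chance that one team beats another depends only on those two strengths. This is an illustration under my own assumptions, not the method any particular BCS ranking uses: the function names, the teams, and the results below are all made up.

```python
def bradley_terry(games, iterations=500):
    """Fit Bradley-Terry strengths from a list of (winner, loser) pairs.

    Model: P(a beats b) = s[a] / (s[a] + s[b]).
    Assumes no team (or group of teams) is unbeaten or winless against
    the rest; otherwise some strengths drift toward 0 or infinity.
    """
    teams = sorted({team for game in games for team in game})
    wins = {team: 0 for team in teams}
    for winner, _loser in games:
        wins[winner] += 1

    strength = {team: 1.0 for team in teams}
    for _ in range(iterations):
        updated = {}
        for team in teams:
            # Sum 1 / (s_team + s_opponent) over every game this team played.
            denom = 0.0
            for winner, loser in games:
                if team == winner:
                    denom += 1.0 / (strength[team] + strength[loser])
                elif team == loser:
                    denom += 1.0 / (strength[team] + strength[winner])
            updated[team] = wins[team] / denom
        # Rescale so the strengths average 1; only their ratios matter.
        scale = len(teams) / sum(updated.values())
        strength = {team: value * scale for team, value in updated.items()}
    return strength


def win_probability(strength, a, b):
    """Modelled chance that team a beats team b."""
    return strength[a] / (strength[a] + strength[b])


# Hypothetical results, (winner, loser), for four invented teams.
games = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"),
         ("B", "D"), ("D", "C"), ("A", "D")]
strength = bradley_terry(games)
ranking = sorted(strength, key=strength.get, reverse=True)
```

Sorting by fitted strength gives the ranking. Notice that nothing in here "watches" a game: the only inputs are who beat whom.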
This one is a little backwards. I'm defending the human polls. That's right. The horribly biased, terribly flawed, human polls. You know them, you hate them, right? I mean, there's that crazy love for the SEC and USC, there's the weird preseason bias where if you're highly ranked at the beginning, you've got an easier path, and then there's just the all-around stupidity that comes from the Harris poll. It's nuts, right?
Strangely, no. Believe it or not: the polls are surprisingly good, and you can prove it. With math.
Yes, People Actually Study This Stuff
There's actually been a fair amount of research into the AP/AFCA (Coaches') polls, mainly because they're a good example of human decision-making in aggregate - the "wisdom of crowds" - and this is, of course, interesting to people in economics, where failures in the "wisdom of crowds" cost actual money.
AP Voters Do Overreact
A lot of the most interesting stuff just came out this year. D.F. Stone, a professor at Oregon State, researched how AP voters make decisions, and found that voters base their decisions primarily on the scoring margin of the game, but tend to overreact. The "money quote," in my opinion, is this:
I find that the voters do not respond sufficiently to relatively subtle aspects of game results (the signals), namely, home status and margin of victory over unranked opponents. The voters respond in a more Bayesian way to margin of victory over ranked opponents and margin of loss against all opponents.
This, to me, is interesting, because I think this is how almost all of us would like teams to be ranked. We don't care how you pummel creampuffs. We care how badly you lose to a creampuff, and how good your victories over good teams are. This may not be the most predictive result, but ranking teams isn't about predicting performance; it's about evaluating a team's season, and I don't think anyone would really argue with that methodology.
Stone also found evidence that voters tend to overreact immediately after a game - this is almost certainly where the "it's better to lose early" belief, which seems to be real, comes from. That overreaction is common in human decision making. His conclusions appear to support the hypothesis that in general, the polls work well, but the problem is that the AP voters are overtaxed - ranking accurately takes too much work, so they take the easy route out.
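To make "Bayesian" versus "overreacting" concrete, here's a small sketch using the standard normal-normal update, where a prior opinion of a team is blended with a noisy new result. This is the textbook formula, not Stone's actual model, and every number, name, and the idea of boiling a game down to a single "rating signal" is an assumption for illustration.

```python
def bayes_update(prior_mean, prior_var, signal, signal_var):
    """Normal-normal update: blend a prior rating with one noisy game signal."""
    # The noisier the signal relative to the prior, the smaller the weight.
    weight = prior_var / (prior_var + signal_var)
    post_mean = prior_mean + weight * (signal - prior_mean)
    post_var = (1.0 - weight) * prior_var
    return post_mean, post_var, weight


def fixed_weight_update(prior_mean, signal, weight):
    """A voter who moves a fixed fraction of the way toward the latest result."""
    return prior_mean + weight * (signal - prior_mean)


# Hypothetical numbers: a team rated 10, and one blowout that "looks like" 30.
bayes_mean, bayes_var, bayes_weight = bayes_update(10.0, 25.0, 30.0, 100.0)
overreaction = fixed_weight_update(10.0, 30.0, weight=0.8)
```

With these made-up inputs the Bayesian weight is 0.2, so one big game moves the rating from 10 to 14. A voter who puts 0.8 of the weight on the latest game lands at 26 instead. That gap is what "overreacting" means here.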
The question, though, is how bad is the problem? If the 'easy route' works almost perfectly, is it really that bad? When you consider that the BCS uses a merger of human polls and statistical rankings, maybe the "easy route" is all you really need.
... but there's no preseason bias, and Penn State readers are about to kill me
The other interesting paper that came out recently, by Ross, Larson, and Wall, looked for conference biases in the AP/AFCA polls, as well as a possible preseason bias. They looked at the AP/AFCA polls from 2003-2008, so the sample size is pretty large.
The results may shock you. First off, again, "the polls are pretty good." The higher-ranked opponent wins ~80% of the time. But there are a few biases, and they're not what you would expect.
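That ~80% figure is the kind of number you can check yourself from a season of results. Here's a minimal sketch, assuming you have each game's winner and loser rank at kickoff and that both teams were ranked. The function name and the five sample games are invented for illustration; they are not data from the paper.

```python
def higher_ranked_win_rate(games):
    """Fraction of games won by the better-ranked team.

    games: iterable of (winner_rank, loser_rank), where a lower number
    means a better rank. Games between equally ranked teams are skipped.
    """
    decided = [(w, l) for w, l in games if w != l]
    if not decided:
        raise ValueError("no games with distinct ranks")
    correct = sum(1 for w, l in decided if w < l)
    return correct / len(decided)


# Hypothetical games: (winner's rank, loser's rank).
sample = [(1, 12), (3, 20), (15, 4), (7, 9), (2, 25)]
rate = higher_ranked_win_rate(sample)
```

In the sample, the better-ranked team wins four of the five games, so `rate` comes out to 0.8.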
The first was presented elsewhere: Campbell, Rogers, and Finney had shown that there's a bias in the AP/AFCA polls towards television coverage. This isn't surprising - when you can see the game, you have a stronger reaction towards the victor and against the loser because you understand the game more. That bias gets confirmed via a proxy of population size, although the real source is almost certainly television coverage.
The other interesting thing is this: there's no weekly effect. That is, the poll was roughly equally accurate at all points in the season. Teams that are ranked high early don't tend to linger and be overrated throughout the season - there is no preseason bias. This agrees with Stone's findings: the polls overreact, and so the preseason rankings don't have a real effect. Also note that this semi-contradicts a finding elsewhere, but the study there was fairly simplistic, and even that author noted that the effect was very small. If preseason rankings are not entirely inaccurate, adding them to a model to predict a final ranking will add information that might not otherwise be present. This does not indicate a preseason bias - it indicates that voters aren't stupid. You could likely get a similar result by adding a ranking midway through the season.
As for conferences: the conference that the poll was most biased for? The MAC. Yes, the MAC. I'm not kidding. When the polls rank a MAC team, it tends to be overrated. This is probably not really a bias for the MAC. The MAC's had only two teams that ended the year ranked, back in 2003, and the lower-ranked team actually upset the higher-ranked team. Other than that, the MAC's ranked teams have proved to be unworthy of that ranking. The authors acknowledge this - it's akin to the "long shot bias" in betting, where heavy favorites win more than you would expect.
But there was a major conference with a statistically-significant bias, and you wouldn't expect sample size to be an issue here: AP/AFCA polls were biased against the SEC, at about a 5-10% likelihood level.
I know, I know. I'm dead to you now. But don't kill the messenger - this is just statistics here. And the size of the bias isn't that large. To be honest, I can come up with an explanation that might further get me killed - quite a few SEC teams have been good over that period, and rankings are ordinal - that is, you can't tie teams. Since there are fewer "top" spots, that'll produce a downward bias in the rankings.
It should also be noted that this might be cyclical - that is, the SEC was underrated early this decade, and now it may be being overrated because it had been underrated for a while. Might the Big Ten experience the same thing when its big-money teams finally stop being mediocre (cough Michigan cough)? Quite possibly.
Can we make them better?
Now it's the opinion part. Clearly, the polls aren't that bad. There's no preseason bias, there's negligible conference bias, and in general voters seem to be prioritizing what we'd like them to do.
Can we do better? I can't say for certain. I can say that the authors of the research made an important point: the AP Poll acted as you would expect if the voters were overworked. Makes sense - they have other jobs. So it might make sense to create a panel of voters who have the time (that is, they're paid for it), who can see any of the games they wish (free ESPN GamePlan!), and whose performance is periodically monitored to avoid bias or poor voting. It might make sense to do the same for coaches, too - instead of using active coaches, use, say, retired coaches.
Will this work? I have no idea. It might not be feasible. It might be more inaccurate than what we currently have. And finally, it might not be worth it, because as noted above, surprisingly, things aren't that bad.
Comments
Yay!
Bout time, dude!
"For me the game wasn’t grounded in reality. It was about the uniform you put on that turned you into a warrior. It was about the mythology of the battle, the victory, the defeat, the struggle." - Mike Reid, PSU '69
by jtothep on Nov 2, 2009 4:35 PM EST
Yeah, well
It took me a while to find something interesting to write about. Most of those articles were published this year, and I’ve been meaning to write something on them.
by Bleed Blue 'n White on Nov 2, 2009 4:38 PM EST
I'm just curious here
In an effort to avoid rehashing what you wrote, I’ll ask this in a big-picture kind of way- Aren’t the ‘good’ aspects of human polls basically what Billingsley is trying to capture in his, ahem, “computer model”?
You gave your analysis with a healthy dose of ’don’t shoot the messenger’, and I’d appreciate a response in kind.
"I thought the kid we were using had the potential to be a good quarterback, and I blew that one." - Joseph V. Paterno
by leeharvey418 on Nov 2, 2009 4:39 PM EST
Nope
For two reasons. One, voters use margin of victory, whereas Billingsley can’t because he’s grouped with the sane statistical rankings.
Second, Billingsley has two completely indefensible statements: first, he has a bias towards defense and rewards teams who shut out or nearly shut out opponents. That’s just idiotic. He also rewards teams based on their opponents rankings at the time of the game. This, also, is brain-dead stupid – you’re boosting a team because you had incomplete information on its opponent. No sense. Voters don’t do that. If they did, we’d see a form of preseason bias, because they’d be clinging to early, inaccurate, rankings.
If Billingsley were using his model to determine a vote in the Harris poll, though, I really wouldn’t mind much. His model is based on his opinion, and the aggregate of human opinions on football tends to be what football is.
It isn’t, though. It’s grouped with other rankings which are “ideally unbiased.” So all it does is serve to dilute their benefit.
by Bleed Blue 'n White on Nov 2, 2009 4:53 PM EST
Pre-season Bias
I didn’t read through the papers you link to, and I’m taking the easy way out and just asking you.
What exactly are you defining as pre-season bias? It seems to me like it is being defined here as “teams that are overrated early on soon drop to where they should be, and vice versa” since you say “teams that are ranked high don’t linger”.
Wouldn’t this be more a function of the fact that the pollsters (as a group, not individually, as this definitely is not true of all of them) are not idiots, and if a team loses they’ll drop them down to the rest of the 1-loss, 2-loss, 3-loss, etc. teams? But did it look into whether or not a trend exists where if teams finish with (basically) the same win-loss record at the end of the year, the team that started out ranked higher tends to finish ranked higher? It would appear this trend is magnified more so for undefeated teams (as they never had a loss to use for dropping them).
I would assume that the effect of losing later would drown this out for the other tiers of ranked teams, so it might be moot, but I am curious, especially when I think about years (such as 2005 if we hadn’t lost to Michigan) where there was zero chance of us going to the MNC game unless USC or Texas lost, purely because we started out the season unranked, while they started out 1 & 2, respectively.
Finally, they talk about how the higher ranked team wins ~80% of the time, but did they look at the difference in the ranking? I would guess that there are more matchups with large differences in ranking than small differences in ranking, and so while the #1 team will probably beat the #15 team 80% of the time, I don’t think the same could be said for the #14 team beating the #15 team. I would actually be interested to see a chart with % time higher ranked team wins on the Y-axis, and difference in ranking on the X-axis.
The reason that I bring this up is 2-fold. First, I think that it is generally pretty obvious that the top 5 teams are better than teams 5-10, who are better than teams 10-15, etc., but where the really critical differences arise (at least in the BCS system) is the question of which team is better, the #2 team or the #3 team.
by The JuggerNitt on Nov 2, 2009 5:25 PM EST
But did it look into whether or not a trend exists where if teams finish with (basically) the same win-loss record at the end of the year, the team that started out ranked higher tends to finish ranked higher?
First, I think this year is a great counterexample. Florida was a commanding #1 at the start of the season. They have the same W/L as Alabama two weeks ago, yet Alabama leaped them. Florida’s back at #1 again, but that’s because of what they just did – after all Florida is at the top of a lot of statistical rankings again.
But did it look into whether or not a trend exists where if teams finish with (basically) the same win-loss record at the end of the year, the team that started out ranked higher tends to finish ranked higher?
The way they did it would’ve seen a bias like that for non-undefeated teams, and it isn’t there – basically the higher initially ranked team tends to lose with the same frequency as the lower initially ranked team.
Now, this method wouldn’t’ve been able to easily detect a preseason bias that persists due to a team that goes undefeated. But, of course, it’s impossible to tell from a prediction-based system how good a team that’s undefeated is. Looking at recent history, the only time where the rankings “got it wrong” based on preseason order between 2 undefeated teams is 2005.
(Look at last year’s 1 loss teams, incidentally: the preseason order is noticeably different than the prebowl order. It ‘kinda’ looks like there might be a bias, but looking at the bowl results, the order made sense – #1 beat #2, #5 beat #6, #3 won, #4 lost)
where there was zero chance of us going to the MNC game unless USC or Texas lost, purely because we started out the season unranked, while they started out 1 & 2, respectively.
That’s not true at all. If we hadn’t lost to Michigan in 2005 we could’ve easily gotten in. I mean, again, look at this year. Florida isn’t stuck at #1 because of the preseason.
I know what you’re thinking, of course – if you look at the order of undefeated teams, it’s always mirrored preseason rankings. The problem is that except for 2005, it also mirrored the statistical rankings – that is, the order was the same as preseason, but it made sense.
Finally, they talk about how the higher ranked team wins ~80% of the time, but did they look at the difference in the ranking?
That number is just a nice, catchall average. The likelihood that the higher-ranked team wins obviously drops as the difference in ranking shrinks. At one extreme (A much, much better than B) it approaches 100%, at the other extreme (B much, much better than A) it approaches 0%, so there’s got to be a region where it’s intermediate.
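One common way to picture that shape is a logistic curve in the ranking gap. The sketch below is only an illustration of the idea: the function name and the `steepness` value are assumptions, not something fitted to poll data or taken from the paper.

```python
import math


def higher_ranked_win_chance(rank_gap, steepness=0.1):
    """Illustrative chance that team A beats team B.

    rank_gap: how many spots A sits above B (negative if A is ranked lower).
    steepness: hypothetical parameter for how fast the gap matters.
    """
    return 1.0 / (1.0 + math.exp(-steepness * rank_gap))


# Chance for gaps of 1 through 24 spots, e.g. #1 vs #2 up to #1 vs #25.
curve = {gap: higher_ranked_win_chance(gap) for gap in range(1, 25)}
```

A gap of zero gives exactly 50%, big positive gaps head toward 100%, and big negative gaps head toward 0%. An overall average like ~80% would come from mixing many different gaps together.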
by Bleed Blue 'n White on Nov 2, 2009 6:53 PM EST
that's what I figured
but as I said, I was lazy and didn’t want to look it up ;-)
As for this year being a counterexample, it definitely is, and I did have that in mind while I was writing, but I’m a bit curious if this is more of an exception to the rule. I suppose there’s no way to really prove it otherwise, as the bowls don’t always match up perfectly to get the data you’d want, but with the data that does exist, it pretty much does show little bias.
by The JuggerNitt on Nov 3, 2009 12:30 PM EST
The thing is that there are few years that provide any ability to actually test the idea. Even 2005 is a bad example: yeah, USC was a preseason #1 and stuck there, and Texas probably should’ve been ranked ahead, but they were very, very close.
I should point out that there have been studies of the AP poll that do show a path dependence, which would indicate some preseason bias. However, that study was done back in 1996, and studied poll behavior from 1980-1989. It’s also not available freely online, so I didn’t mention it because I’m not sure how relevant it is to today. Back then the poll didn’t actually ‘mean’ anything.
Another point is that even the whole idea of a “preseason bias” is hard to even quantify. It’s a bad thing if a team is ranked above another team for no valid reason, right? But before 2005, voters placed USC #1 and Texas #2 for valid reasons, based on the performance they saw in the previous year and what they knew about who was returning.
Go through 2005, and Texas had a stronger slate of opponents and beat them all, but in the end, USC won all of their games, too, and if you rank them above Texas because of what you know about how Leinart/Bush performed the previous year in addition to what you saw this year… isn’t that a perfectly valid reason?
In other words, the problem with inferring that there’s a preseason bias, and that it’s a bad thing, is that the logic that goes into the preseason polls may still be valid later in the year. And so if a team ends up slightly higher than you might expect based strictly on wins/losses, and that correlates with preseason position, that’s not going to be fixed by, say, eliminating preseason polls.
So yeah, I think a minor apparent preseason bias probably isn’t a bad thing, so long as the position stays justifiable. And based on the predictivity staying constant, it is.
by Bleed Blue 'n White on Nov 3, 2009 1:52 PM EST
Very interesting
As far as improving them, I think you’re right about not using current coaches and having some sort of monitoring system. Particularly with the Harris voters, anything that boosts accountability is a good thing.
I’m surprised statistics show that we’re actually in pretty good shape, but we can always strive to improve.
It never gets to be easy
by chitownhawkeye on Nov 2, 2009 9:14 PM EST
I don't care what
some biased statistician does to corrupt numbers into an argument I can’t agree with.
I will not stand for a bunch of eggheads stealing from me my right to complain about biased polls voted on by a bunch of PSU hating communists.
One man doing the work of 100's for the good of 1000's
by rahpsu92 on Nov 3, 2009 9:50 AM EST
Way to stick to your too country guns!
"For me the game wasn’t grounded in reality. It was about the uniform you put on that turned you into a warrior. It was about the mythology of the battle, the victory, the defeat, the struggle." - Mike Reid, PSU '69
by jtothep on Nov 3, 2009 10:24 AM EST
The first was presented elsewhere: Campbell, Rogers, and Finney had shown that there’s a bias in the AP/AFCA polls towards television coverage.
So it doesn’t seem like a far leap to say the Big Ten Network being on basic or extended cable gives us a competitive advantage that e.g. Big East and ACC teams do not enjoy, in terms of eyeballs. Maybe the columnist that gets bored with Florida scoring 31 on Kentucky in the first quarter on ESPN2 and flips over to watch the Illinois game?
by gumbercules on Nov 3, 2009 8:05 PM EST
Maybe
I’m not sure it’s really a huge advantage, but I’m sure that was some of the thought process going into the Big Ten Network – more publicity is always a good thing.
I think the Big Ten’s still at a big disadvantage compared to the SEC, for instance. That contract with CBS results in a lot of eyeballs.
by Bleed Blue 'n White on Nov 4, 2009 3:39 PM EST
Same goes for
Notre Dame on NBC. Even more so there, perhaps, being one single team on TV every week. However, I wonder if voters tend to underrate Notre Dame simply because they don’t want to be accused of overrating them. Food for thought.
by jimbo2psu on Nov 4, 2009 4:24 PM EST
