In Defense of Statistics: In Defense of Polls

It's been a long while since I posted an In Defense of Statistics post. The idea behind these was to explain exactly how the statistical rankings work. I know, crazy people (including the BCS) call them computer rankings, but the computers aren't evaluating games. For the most part, the computers are just solving a system of equations that a simple statistical assumption ("the better team is likely to win") sets up.
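
To make that concrete, here's a minimal sketch of one such system - a Massey-style least-squares rating, which is just one simple member of this family, not any particular BCS computer's actual method. The game results are made up; each game contributes one equation saying the winner's rating minus the loser's rating should roughly equal the margin.

    import numpy as np

    # A Massey-style least-squares rating: one equation per game,
    # rating(winner) - rating(loser) ~= margin of victory.
    # Hypothetical toy results: (winner, loser, margin).
    games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]

    teams = sorted({t for g in games for t in g[:2]})
    idx = {t: i for i, t in enumerate(teams)}

    X = np.zeros((len(games), len(teams)))
    y = np.zeros(len(games))
    for row, (w, l, margin) in enumerate(games):
        X[row, idx[w]] = 1.0
        X[row, idx[l]] = -1.0
        y[row] = margin

    # Ratings are only defined up to an additive constant, so pin
    # the mean rating to zero with one extra equation.
    X = np.vstack([X, np.ones(len(teams))])
    y = np.append(y, 0.0)

    ratings, *_ = np.linalg.lstsq(X, y, rcond=None)
    for t in sorted(teams, key=lambda t: -ratings[idx[t]]):
        print(f"{t}: {ratings[idx[t]]:+.2f}")

That's it - no evaluation of the games themselves, just solving the system of equations that the games set up.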

This one is a little backwards. I'm defending the human polls. That's right. The horribly biased, terribly flawed, human polls. You know them, you hate them, right? I mean, there's that crazy love for the SEC and USC, there's the weird preseason bias where if you're highly ranked at the beginning, you've got an easier path, and then there's just the all-around stupidity that comes from the Harris poll. It's nuts, right?

Strangely, no. Believe it or not: the polls are surprisingly good, and you can prove it. With math.

Yes, People Actually Study This Stuff

There's actually been a fair amount of research into the AP and AFCA (Coaches') polls, mainly because they're a good example of human decision-making in aggregate - the "wisdom of crowds" - and this is, of course, interesting to people in economics, where failures in the "wisdom of crowds" cost actual money.

AP Voters Do Overreact

A lot of the most interesting stuff just came out this year. D.F. Stone, a professor at Oregon State, studied AP voters' decision-making methodology and found that voters base their decisions primarily on a game's scoring margin, but tend to overreact. The "money quote," in my opinion, is this:

    I find that the voters do not respond sufficiently to relatively subtle aspects of game results (the signals), namely, home status and margin of victory over unranked opponents. The voters respond in a more Bayesian way to margin of victory over ranked opponents and margin of loss against all opponents.

This, to me, is interesting, because I think this is how almost all of us would like teams to be ranked. We don't care how you pummel creampuffs. We care how badly you lose to a creampuff, and how convincing your victories over good teams are. This may not be the most predictive approach, but ranking teams isn't about predicting performance - it's about evaluating a team's season - and I don't think anyone would really argue with that methodology.
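
If "responds in a more Bayesian way" sounds abstract, here's a toy sketch of what it means: treat a team's strength as a prior estimate, treat a game margin as a noisy signal of that strength, and weight the signal by how informative it is. The numbers below are invented for illustration - this is the textbook normal-normal update, not Stone's actual model.

    # Normal-normal Bayesian update: a team's strength has a prior
    # estimate, and a game margin is a noisy signal of it. The posterior
    # is a precision-weighted average of prior and signal.
    def update(prior_mean, prior_var, signal, signal_var):
        k = prior_var / (prior_var + signal_var)  # weight on the new signal
        return prior_mean + k * (signal - prior_mean), (1 - k) * prior_var

    # A 40-point pummeling of a creampuff is a weak signal (huge noise),
    # so it barely moves the estimate of the team's strength...
    print(update(5.0, 16.0, 40.0, 400.0))   # -> (~6.3, ...)

    # ...while a 10-point win over a ranked team is far more informative.
    print(update(5.0, 16.0, 10.0, 36.0))    # -> (~6.5, ...)

In these terms, Stone's finding is that voters use roughly the right weight for margins against ranked opponents and for losses, but effectively the wrong weight for home status and for blowouts of unranked teams.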

Stone also found evidence that voters tend to overreact immediately after a game - this is almost certainly where the "it's better to lose early" belief, which does seem to be real, comes from. That kind of overreaction is common in human decision-making. His conclusions appear to support the hypothesis that, in general, the polls work well, but that the AP voters are overtaxed - ranking accurately takes too much work, so they take the easy way out.

The question, though, is: how bad is the problem? If the "easy route" works almost perfectly, is it really that bad? When you consider that the BCS uses a merger of human polls and statistical rankings, maybe the "easy route" is all you really need.

... but there's no preseason bias, and Penn State readers are about to kill me

The other interesting paper that came out recently was one by Ross, Larson, and Wall, which looked for conference biases in the AP and AFCA polls, as well as a possible preseason bias. They looked at the polls from 2003-2008, so the sample size is pretty large.

The results may shock you. First off, again, "the polls are pretty good": the higher-ranked team wins roughly 80% of the time. But there are a few biases, and they're not what you would expect.
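
For flavor, here's roughly how you'd check a claim like that - and, broken out by week, it's the same kind of computation behind the "no weekly effect" finding discussed below. The handful of game records is made up; the paper obviously uses the full 2003-2008 schedule and a more careful methodology.

    from collections import defaultdict

    # Made-up game records: (week, winner's rank, loser's rank),
    # None = unranked. Only games involving a ranked team count.
    games = [(1, 3, 12), (1, 8, None), (2, 1, 5), (2, None, 4), (3, 2, 9)]

    UNRANKED = 999  # treat unranked as worse than any ranked team
    wins, total = defaultdict(int), defaultdict(int)
    for week, w_rank, l_rank in games:
        w = w_rank if w_rank is not None else UNRANKED
        l = l_rank if l_rank is not None else UNRANKED
        if w == l == UNRANKED:
            continue
        total[week] += 1
        if w < l:  # lower number = higher rank
            wins[week] += 1

    for week in sorted(total):
        print(f"week {week}: higher-ranked team won {wins[week]}/{total[week]}")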

The first had been presented elsewhere: Campbell, Rogers, and Finney had shown that there's a bias in the AP and AFCA polls towards television coverage. This isn't surprising - when you can see the game, you have a stronger reaction towards the victor and against the loser, because you understand the game better. Ross, Larson, and Wall confirm that bias using population size as a proxy, although the real source is almost certainly television coverage.

The other interesting thing is this: there's no weekly effect. That is, the poll was roughly equally accurate at all points in the season. Teams ranked high early don't linger, overrated, through the rest of the season - there is no preseason bias. This agrees with Stone's findings: because the polls overreact, the preseason rankings don't have a lasting effect.

Note that this semi-contradicts a finding elsewhere, but that study was fairly simplistic, and even its author noted that the effect was very small. As long as preseason rankings aren't entirely inaccurate, adding them to a model that predicts the final ranking will add information the model doesn't otherwise have. That doesn't indicate a preseason bias - it indicates that voters aren't stupid. You could likely get a similar result by adding a ranking from midway through the season.

As for conferences: which conference was the poll most biased for? The MAC. Yes, the MAC. I'm not kidding. When the polls rank a MAC team, it tends to be overrated. This probably isn't really a bias for the MAC. The MAC has had only two teams finish a year ranked, back in 2003, and the lower-ranked one actually upset the higher-ranked one. Other than that, the MAC teams that have appeared in the polls have proved unworthy of the ranking. The authors acknowledge this - it's akin to the "longshot bias" in betting, where heavy favorites win more often than the odds imply.

But there was a major conference with a statistically significant bias, and you wouldn't expect sample size to be an issue here: the AP and AFCA polls were biased against the SEC, significant at roughly the 5-10% level.

I know, I know. I'm dead to you now. But don't kill the messenger - this is just statistics. And the size of the bias isn't that large. To be honest, I can come up with an explanation that might get me killed even faster: quite a few SEC teams have been good over that period, and rankings are ordinal - you can't tie teams. If, say, five SEC teams all deserve top-ten spots in the same year, they can't all be ranked fifth; somebody has to slide down the list. Since there are fewer "top" spots than deserving teams, that produces a downward bias in a strong conference's rankings.

It should also be noted that this might be cyclical - that is, the SEC was underrated early this decade, and it may now be overrated precisely because it had been underrated for a while. Might the Big Ten experience the same thing when its big-money teams finally stop being mediocre (cough Michigan cough)? Quite possibly.

Can we make them better?

Now for the opinion part. Clearly, the polls aren't that bad. There's no preseason bias, the conference bias is negligible, and in general voters seem to prioritize what we'd like them to prioritize.

Can we do better? I can't say for certain. I can say that the research points out something important: the AP poll behaves exactly as you'd expect if the voters were overworked. That makes sense - they have other jobs. So it might make sense to create a panel of voters who have the time - that is, voters who are paid for it, who can watch any of the games they wish (free ESPN GamePlan!), and whose performance is periodically reviewed to catch bias or poor voting. It might make sense to do the same for the coaches' poll, too - instead of using active coaches, use, say, retired coaches.

Will this work? I have no idea. It might not be feasible. It might be less accurate than what we currently have. And finally, it might not be worth it, because as noted above, surprisingly, things aren't that bad.
