Why the Election Polls Missed the Mark

In the days following an election in which his organization's polls proved to be inaccurate, Gallup Editor in Chief Frank Newport published a blog post warning of "a collective mess."

The result of the election--President Obama's 4-point victory--was not the only indication that Gallup's polls were biased in favor of Republican Mitt Romney. Websites that average and aggregate polls showed, on balance, that Obama was in a stronger position than Gallup's polls did, which allowed some observers to paint the longtime pollster as an outlier, both before and after the votes were tallied.

Newport, in his blog post three days after the election, saw these aggregators as a threat--not only to the Gallup Organization but to the entire for-profit (and nonprofit) public-opinion industry. "It's not easy nor cheap to conduct traditional random sample polls," Newport wrote. "It's much easier, cheaper, and mostly less risky to focus on aggregating and analyzing others' polls. Organizations that traditionally go to the expense and effort to conduct individual polls could, in theory, decide to put their efforts into aggregation and statistical analyses of other people's polls in the next election cycle and cut out their own polling. If many organizations make this seemingly rational decision, we could quickly be in a situation in which there are fewer and fewer polls left to aggregate and put into statistical models."

Newport's hypothetical--that because the aggregators, which average polls or use them to model election results, predicted the outcome more accurately than his traditional phone polling, organizations might quit conducting polls in favor of aggregating them--sounds a little paranoid on its face. But it underscores the effects that increasing costs and shrinking budgets are having on media organizations that cover politics and typically pay for this kind of survey work.

It also reopens a long-standing debate over poll aggregation. Some pollsters and media organizations think the practice of averaging polls that survey different universes or are conducted using different methodologies is bunk. They warn that placing cheaper, less rigorous polling on the same plane as live-caller polls that randomly contact landline and cell-phone respondents allows the averages to be improperly influenced by less accurate surveys. And, ultimately, while the poll averages and poll-based forecasts correctly picked the winner, they significantly understated the margin of Obama's victory.

But others, including the poll aggregators themselves, maintain that averaging polls, or using poll results as part of a predictive model, produces a more accurate forecast than considering any one individual poll. Before an election, it's difficult to predict which polls will be more accurate and which polls will miss the mark. Averaging results together also provides important context to media and consumers of political information when every new poll is released, proponents argue.

Ultimately, this is a debate that also goes beyond the statistical questions about averaging polls. It touches on the nature of horse-race journalism and the way in which we cover campaigns.

The First Number Crunchers

Real Clear Politics began the practice of averaging polls before the 2002 midterm elections. RCP was joined by Pollster.com--which is now part of The Huffington Post--four years later. "Pollster started in 2006, and we were really building on what Real Clear Politics did," founding Coeditor Mark Blumenthal said. The statistician Nate Silver began a similar practice in 2008, and his site, FiveThirtyEight, was acquired by The New York Times shortly thereafter. More recently, the left-leaning website Talking Points Memo started its PollTracker website before the 2012 election.

Each of these organizations differs in its approach. Real Clear Politics does a more straightforward averaging of the most recent polls. TPM's PollTracker is an aggregation involving regression analysis that uses the most recent polls to project a trajectory for the race. FiveThirtyEight and HuffPost Pollster use polls, adjusting them for house effects--the degree to which a survey house's polls lean consistently in one direction or another. FiveThirtyEight also uses non-survey data to project the election results.
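
To make those distinctions concrete, here is a minimal sketch in Python of the two simplest approaches: a straightforward average of the most recent polls, in the spirit of Real Clear Politics, and a least-squares trend line projected to election day, in the spirit of PollTracker's regression approach. The poll numbers and function names are invented for illustration; this is not any outlet's actual code or data.

```python
# Illustrative sketch of two aggregation styles; not any outlet's actual code.
# Each poll is (days_before_election, obama_margin_in_points). Invented data.
polls = [(12, 1.0), (10, -1.0), (9, 2.0), (7, 0.0), (5, 1.5), (3, 0.5), (1, 1.0)]

def simple_average(polls, window=5):
    """RCP-style: average the margins of the most recent `window` polls."""
    recent = sorted(polls, key=lambda p: p[0])[:window]  # fewest days out = most recent
    return sum(margin for _, margin in recent) / len(recent)

def trend_estimate(polls):
    """PollTracker-style (simplified): fit a least-squares line to margin
    vs. time and read off its value at election day (days_before == 0)."""
    n = len(polls)
    xs = [-days for days, _ in polls]   # time axis: more recent = larger x
    ys = [margin for _, margin in polls]
    mx, my = sum(xs) / n, sum(ys) / n
    denom = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / denom
    return my + slope * (0 - mx)        # projected margin at election day

print(simple_average(polls))  # average of the five most recent margins: 1.0
print(trend_estimate(polls))  # trajectory projected to election day
```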

All four of these outlets underestimated Obama's margin of victory. Both Real Clear Politics and PollTracker had Obama ahead by only 0.7 percentage points in their final measurements. HuffPost Pollster had Obama leading by 1.5 points, while FiveThirtyEight was closest, showing Obama 2.5 points ahead of Romney in the last estimate. The aggregators that came closest to Obama's overall winning margin were the ones that attempted to account for pollsters' house effects.

"The polls, on balance, understated President Obama's support," said John McIntryre, cofounder of Real Clear Politics. "Our product is only as good as the quality and the quantity of the polls that we use."

These sorts of house effects were why HuffPost Pollster moved to a model that attempted to control for them, but its average still understated Obama's margin of victory by a sizable amount. "One of the main reasons why we moved to using a more complex model that controlled for house effects was precisely to prevent that phenomenon from happening," Blumenthal said. "Our goal is to minimize that to next to zero."
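
The core idea of a house-effect correction can be sketched in a few lines: estimate how far each firm's polls run, on average, from the all-poll consensus, then subtract that lean before averaging. The sketch below uses invented firms and numbers, and the real HuffPost Pollster and FiveThirtyEight models are considerably more elaborate, but it shows why the correction matters when a polling window happens to be dominated by one leaning firm.

```python
from collections import defaultdict

# Toy data: (pollster, days_before_election, obama_margin). Invented numbers.
history = [("Firm_A", 20, 1.0), ("Firm_A", 15, 0.5), ("Firm_B", 18, -1.5),
           ("Firm_B", 14, -2.0), ("Firm_C", 16, 2.0), ("Firm_C", 12, 2.5)]
recent  = [("Firm_B", 3, -1.0), ("Firm_B", 2, -1.5), ("Firm_A", 2, 1.0)]

# Step 1: estimate each firm's "house effect" as its average deviation
# from the all-poll consensus over the longer history.
overall = sum(m for _, _, m in history) / len(history)
by_firm = defaultdict(list)
for firm, _, margin in history:
    by_firm[firm].append(margin)
effect = {f: sum(ms) / len(ms) - overall for f, ms in by_firm.items()}

# Step 2: average the recent window naively, then with each firm's lean removed.
naive = sum(m for _, _, m in recent) / len(recent)
adjusted = sum(m - effect[f] for f, _, m in recent) / len(recent)
print(naive)     # about -0.5: dragged down by the Romney-leaning Firm_B
print(adjusted)  # about +0.8: Firm_B's lean has been discounted
```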

Pros and Cons

John Sides, a political-science professor at George Washington University and the coauthor of the blog The Monkey Cage, is one of the more prominent proponents of using polling averages--and a critic of press coverage that doesn't. "You're better off looking at averages, because any individual poll may be different from the truth because of sampling error and any idiosyncratic decisions that pollsters make," he said.
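
Sides's point rests on a basic statistical fact: if each poll is an unbiased but noisy estimate of the true margin, an average of several polls has a much smaller standard error than any one of them--roughly smaller by the square root of the number of polls averaged. A quick simulation, with invented parameters, illustrates it:

```python
import random

random.seed(0)
TRUE_MARGIN = 4.0   # hypothetical true margin, in points
POLL_SD = 3.0       # rough sampling error of one ~1,000-person poll (assumed)

def one_poll():
    """One simulated poll: the truth plus random sampling error."""
    return random.gauss(TRUE_MARGIN, POLL_SD)

def poll_average(k):
    """Average of k independent simulated polls."""
    return sum(one_poll() for _ in range(k)) / k

# Compare the typical miss of single polls vs. averages of 10 polls.
singles = [one_poll() for _ in range(10_000)]
averages = [poll_average(10) for _ in range(10_000)]
rmse = lambda xs: (sum((x - TRUE_MARGIN) ** 2 for x in xs) / len(xs)) ** 0.5
print(rmse(singles))   # about 3.0 points
print(rmse(averages))  # about 0.95 points, i.e. 3.0 / sqrt(10)
```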

Sides believes that news coverage of campaigns tends to overemphasize some polls at the expense of others. In some cases, a poll is considered newsier because it shows something different from the balance of other polling in the race. In other words, polls that are outliers are given more attention than polls that hew more closely to the average, and those outlier polls are more likely to be inaccurate, Sides argues.

"I'm not overly optimistic that the averages are going to become a more important factor in news coverage. I think there are still strong incentives to seek drama where you can find it. And that may mean chasing an outlier," Sides said. "I would like to think that it would start to creep in at the margins. So instead of saying, 'Some polls say ___,' it may say, 'A new poll showed ___, but other polls haven't showed that yet.' "

But, as this year's results show, the averages aren't perfect, and they all showed a closer race than the actual outcome. Comparing polls that accurately predicted the election--the final poll from the Pew Research Center, for example--to the poll averages at the time would have made those more accurate polls appear to be outliers.

Part of that problem, at least when it comes to the national presidential race, was the daily tracking polls from Gallup and automated pollster Rasmussen Reports. Both firms reported results that were biased in favor of Romney this cycle, but by publishing a new result every day, their polls could be overrepresented in the averages. "The one sort of Achilles' heel of the regression trend line that we've done classically on our charts, there are two pollsters that contribute most of the data points," said Pollster's Blumenthal. "Not only does that make the overall aggregate off, it can also create apparent turns in the trend line that are [because] we've had nothing but Gallup and Rasmussen polls for the last 10 days."
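
One remedy Blumenthal's comment points toward is to keep any single prolific firm from flooding a window: collapse each firm's polls to one entry before averaging, so a daily tracker counts the same as a pollster that releases a single survey. A toy sketch of that idea, with invented firms and numbers:

```python
from collections import defaultdict

# Toy ten-day window: (pollster, obama_margin). One daily tracker releases
# many results; three other firms release one apiece. Numbers are invented.
window = [("Tracker", m) for m in (-2.0, -1.5, -2.5, -1.0, -2.0, -1.5, -2.0)]
window += [("Firm_A", 2.0), ("Firm_B", 1.5), ("Firm_C", 1.0)]

# Naive average: the tracker contributes 7 of 10 data points.
naive = sum(margin for _, margin in window) / len(window)

# Balanced average: collapse each firm to its own mean first, so no single
# firm contributes more than one data point to the final number.
by_firm = defaultdict(list)
for firm, margin in window:
    by_firm[firm].append(margin)
balanced = sum(sum(ms) / len(ms) for ms in by_firm.values()) / len(by_firm)

print(naive)     # about -0.8: dominated by the leaning tracker
print(balanced)  # about +0.7: one vote per firm
```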

In addition to the ubiquity of surveys from some firms, poll aggregators also worry that partisans may try to game the system by releasing polls with greater frequency, or by skewing or fabricating results. A prominent Democratic strategist told National Journal last month that some outside groups conducted polls in presidential swing states and released them to the public as a means of countering polls from Rasmussen Reports that were less favorable to President Obama. Blumenthal told National Journal that his biggest fear was not the increase in partisan polls but the possibility that groups would release rigged or fabricated results, as outfits like Strategic Vision and Research 2000 apparently have over the past handful of years, to influence the averages. "My biggest concern over the last two or three years has been the potential for the repeat of something like that," Blumenthal said.

"Champagne" of Polls

The blog post written by Gallup's Newport after the election demonstrates another source of opposition to poll averages: the pollsters themselves. Pollsters all make choices about how best to sample the probable or likely electorate, and those choices vary. Additionally, more expensive live-caller polls compete in the averages with cheaper automated-phone and Internet polls that may not make the same efforts to obtain random samples of voters; merits aside, those live-caller pollsters surely want to protect their businesses from less expensive competitors. Moreover, news organizations that spend tens of thousands of dollars to conduct a poll are likely to report and trumpet their poll's results over other surveys.

"If you're merely an information aggregator, it's very hard for me to see how you're adding value to the proposition," Gary Langer, whose firm Langer Research Associates produces polls for ABC News, told National Journal in a phone interview last month. "Averaging polls is like averaging champagne, Coca-Cola, and turpentine," Langer added.

Overall, 2012 brought more attention than ever to poll aggregators, and their methods grew more sophisticated. But where do they go from here?

"I don't think there are great advances in averaging or modeling horse-race polling data," Blumenthal said. "We are ultimately reliant on the quality of the data that's collected."

There is evidence that data are becoming less reliable, but supporters of using polling averages argue the underlying changes that are leading to more variable polls bolster their case for using averages instead of individual poll results. "It's a powerful piece of information, and it's a very good piece of information, and I think it's better than any one single poll in terms of using it as a data point to analyze a race," said RCP's McIntyre.

Questions remain over how much attention the averages will get in the next election--and how much attention they deserve. But for the horse-race media, 2016 is right around the corner, and some aggregators have already started keeping score.