Charting a Future for the Polling Industry

By Bradley Honan & Trevor Smith  |  01/26/2022

The recent polling misses in off-year races and elsewhere have reaffirmed for many political and public affairs professionals that polling isn't to be trusted.

After all, the polling misses this past November come on top of other recent big misses, notably the 2016 and 2020 presidential elections and, to a lesser extent, the 2018 midterm elections. A report by the American Association for Public Opinion Research found the 2020 election polling errors to be of an "unusual magnitude." To wit, it was the highest error in 40 years of public opinion polls.

All of this has created another chorus of headlines suggesting that polling, as an industry, is on the decline.

Ross Baker’s USA Today op-ed blared: “Don’t trust the polls: Pollsters are increasingly unable to predict American thought.” And the director of Monmouth University’s Polling Institute, Patrick Murray, wrote an op-ed titled “I blew it. Maybe it’s time to get rid of election polls,” in which he apologized to the candidates in the 2021 New Jersey gubernatorial race for failing to show, in his public-facing polling, how close the race actually was.

This analysis is the first of a two-part series about what’s wrong with contemporary public opinion polling and how it can be fixed. 

We came together — two pollsters with two very different political ideologies — to write this series, because we are firmly united in the belief that the polling industry needs to adopt important methodological changes or continue to get black eyes and weather further hits to its reputation.

First, let us lay out what we see as being some of the most common missteps. Our subsequent piece will focus on our ideas for how the industry can evolve its methodology and approach to consistently produce more accurate results.

Problem #1: An Inaccurate Polling Frame

What often produces a significant error in polling data is a mistake in the so-called “sampling frame”: basically, the decision about who is eligible to be polled and who is excluded from participating. This is, unfortunately, not a new problem for the polling industry.

Indeed, in 1936 The Literary Digest, which had accurately predicted the winner of every presidential election going back to 1916, forecast an overwhelming win by Kansas Governor Alf Landon over sitting President Roosevelt. But rather than a Landon landslide, it was Roosevelt who won 46 states, with the Republican winning only Maine and Vermont.

While The Literary Digest polled 10 million people, and 2.7 million responded, who it polled and who responded were not representative of who turned out to vote on Election Day in 1936. The error that caused this mistake? A flaw in the sampling frame.

The Literary Digest primarily polled its own readers and subscribers – people who, in the midst of the Great Depression, had the financial and intellectual wherewithal to subscribe to a literary magazine. In addition, the magazine supplemented its polling outreach with a list of car owners and people in homes with telephones – hardly a representative sampling of the population in 1936.

While that methodology had previously been accurate, the economic stratification of those who supported Roosevelt versus Landon was significantly different from that of prior elections.

In other words, the divide between rich voters and middle-class and poor voters, in terms of whom they supported, was more pronounced than it had ever been before, as a result of opinions about the New Deal.

Despite the huge number of polling interviews that were tabulated and analyzed, the results were way off the mark because of who had been polled.

A more contemporary example is the 2016 polling, much of which undercounted white, non-college-educated voters, particularly in the industrial Midwest. Those voters turned out to be a larger segment of the electorate than many pollsters imagined, and they broke heavily for Donald Trump over Hillary Clinton, compounding the impact of the polling error.

Thus, as you would expect, who gets polled — and who doesn’t get polled — has an enormous influence on what the polling data concludes. The multitude of ways of contacting people through landlines, cell phones, IVR, online surveys, SMS surveys, and SMS-to-web surveys only compounds the possibility of polling errors and bias.
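To make the frame problem concrete, here is a minimal, purely illustrative Python sketch; the group shares and support rates are invented, not taken from any real poll. It shows how a frame that over-covers one group skews the topline no matter how many interviews are completed.

```python
# Minimal sketch of sampling-frame bias; all shares and support rates below are
# invented for illustration and are not drawn from any real poll.

import random

random.seed(1936)

def simulate(share_group_a, n=100_000):
    """Simulate n interviews where group A makes up share_group_a of whoever gets reached.
    Hypothetical support for candidate X: 60% in group A, 40% in group B."""
    supporters = 0
    for _ in range(n):
        in_group_a = random.random() < share_group_a
        support_rate = 0.60 if in_group_a else 0.40
        supporters += random.random() < support_rate
    return supporters / n

# True electorate: group A is 30% of voters, so true support is about 46%.
# Biased frame: the frame reaches group A 90% of the time, so the estimate lands near 58%.
print(f"True electorate support: {simulate(0.30):.1%}")
print(f"Biased-frame estimate:   {simulate(0.90):.1%}")
```

Even with 100,000 completed interviews, the biased frame in this toy example lands roughly a dozen points away from the true number, which is essentially what happened to The Literary Digest.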

Problem #2: Polling American Adults vs. Likely Voters

The second error we most often see is on the part of the news media, which frequently polls American adults or registered voters rather than the most likely voters — those who indicate they will absolutely turn out to vote.

Part of this is an issue of cost: it’s far cheaper to poll the larger universe of all adults than the smaller, more targeted universe of likely voters. The costs of survey data collection and fieldwork continue to increase, in part due to labor shortages and wage inflation, and screening down to reach the more niche audience of those who are highly likely to vote means making many more phone calls. And that adds to the cost of polling.

And it’s not enough to reach likely voters; you often need to reach voters who are absolutely certain to vote, some of whom may be “surge voters” who did not vote in the prior election, which makes a sampling frame harder to develop.

When the media polls only American adults or registered voters, many of whom will never vote on Election Day, the results may be interesting from a civics standpoint, but they frequently differ from what Election Day actually produces — and in doing so harm the reputation of the polling discipline. In sports terms, this is what is referred to as an unforced error.

Simply put, being an American adult or even being registered to vote doesn’t equate to making the effort to show up and vote on Primary Day or Election Day. Indeed, there are quite a lot of registered voters who don’t vote. Even in the highly competitive 2020 presidential election, the Pew Research Center found that 1 out of every 3 registered voters didn’t actually participate in the election.
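The snippet below is a bare-bones illustration of a likely-voter screen; the respondents, field names, and cutoffs are hypothetical and do not reflect any organization’s actual turnout model. The point is simply that the topline can move once unlikely voters are screened out of an adults-or-registered sample.

```python
# Minimal sketch of a likely-voter screen. Field names, cutoffs, and respondents
# are hypothetical and do not reflect any organization's actual turnout model.

respondents = [
    # likelihood: self-reported, 1 (definitely not voting) to 5 (absolutely certain)
    {"registered": True,  "likelihood": 5, "voted_last": True,  "candidate": "A"},
    {"registered": True,  "likelihood": 2, "voted_last": False, "candidate": "B"},
    {"registered": False, "likelihood": 4, "voted_last": False, "candidate": "B"},
    {"registered": True,  "likelihood": 5, "voted_last": False, "candidate": "A"},  # possible "surge" voter
    {"registered": True,  "likelihood": 5, "voted_last": True,  "candidate": "B"},
]

def is_likely_voter(r):
    # Keep registered respondents who say they are almost certain to vote.
    # Past vote history is not required, so first-time "surge" voters stay in.
    return r["registered"] and r["likelihood"] >= 4

likely = [r for r in respondents if is_likely_voter(r)]

def share(sample, candidate):
    return sum(r["candidate"] == candidate for r in sample) / len(sample)

print(f"Candidate A, all adults:     {share(respondents, 'A'):.0%}")  # 40%
print(f"Candidate A, likely voters:  {share(likely, 'A'):.0%}")       # ~67%
```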

Problem #3: Not Using a Voter File

The third most common error is not using a voter file, or using one that isn’t up to date. Obviously, only registered voters can participate in an election, and the only way to know for sure whether someone is registered is to verify that their name appears on the voter file.

The voter file can also tell you not only that someone is registered, but that they are an active voter, meaning they have actually voted in recent elections.

But acquiring the voter file costs money, and getting a file from a state party for free or at low cost usually means getting one with a significant number of wrong or missing phone numbers.

Thus, you may not be able to reach as many voters as a commercial file would allow. Skip this step if you want to save money, but doing so means you are taking people’s word that they are registered to vote, and again taking their word that they have some past history of voting – and thus a likelihood of doing so again in the near future. These assumptions can skew the results.
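For illustration, here is a minimal sketch of what matching respondents against a voter file looks like; the records and field layout are made up, and no real file vendor or format is assumed. Verification replaces self-reported registration and vote history with what the file actually shows.

```python
# Minimal sketch of voter-file verification; all records below are made up.
# Self-reported registration and vote history are checked against the file
# rather than taken at face value.

voter_file = {
    # keyed by (name, zip): registration status and recent vote history
    ("JANE DOE", "10001"): {"registered": True, "voted": ["2018 general", "2020 general"]},
    ("JOHN ROE", "11201"): {"registered": True, "voted": []},
}

def verify(respondent):
    key = (respondent["name"].upper(), respondent["zip"])
    record = voter_file.get(key)
    if record is None or not record["registered"]:
        return "not found on file"            # self-reported registration unverified
    if not record["voted"]:
        return "registered, no vote history"
    return "registered, active voter"

respondents = [
    {"name": "Jane Doe",  "zip": "10001", "says_registered": True},
    {"name": "John Roe",  "zip": "11201", "says_registered": True},
    {"name": "Pat Smith", "zip": "07030", "says_registered": True},  # not on the file
]

for r in respondents:
    print(r["name"], "->", verify(r))
```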

Problem #4: Socially Desirable Answers

While it’s very common for voters to want to honestly share their opinions about public policy issues, that isn’t always the case when it comes to hot-button issues like race, gender, ethnicity, and whom they are voting for.

It’s well documented in social science research that people won’t always want to share controversial viewpoints with a stranger who’s polling them, or honestly answer questions about whether they vote regularly.

As an example, most Americans report that they certainly don’t have any racial biases, but the neighbor next door might. This reality has given rise to the idea of the “Shy Trump” voter: someone who supports Trump but doesn’t want to tell an unknown pollster that they support a candidate they perceive to be unpopular within their social circle.

Research and polling that don’t take this reality into account, especially in a highly polarized electoral climate, will distort the insights gleaned from the biased survey answers they collect.

Problem #5: Landlines Versus Cell Phones Versus Online Surveys

More and more voters are “cutting the cord,” meaning they are canceling their cable TV subscription and, with it, their landline service, and instead relying solely on their cell phones for both a phone number and an internet connection.

Current FCC regulations allow landline phone numbers to be mass-dialed by a computer dialer, so dialing thousands of numbers can be done quickly and efficiently, but they prohibit the same practice for cell phones, which must be individually hand-dialed, one by one.

That means accurately sampling voters via their cell phones takes longer, and thus costs more money. But failing to include a sufficient number of cell phones in a polling sample ensures your poll will be out of whack. As an example, Jerry Skurnik of Engage Voters U.S. confirms that 22 percent of New York State voters only have a cell phone listed on the basic, un-enhanced voter file.

Don’t dial cell phones — or don’t dial a sufficient number of them — and your poll numbers will be off, as cell-phone-only voters tend to be younger, more likely to be people of color, and thus more progressive and Democratic-leaning.

Not surprisingly, their voting behaviors are different from those of the rest of the electorate. In addition, many pollsters are relying on online surveys instead of making phone calls, and because this is not a probability-based approach, it introduces a high degree of potential error and bias into the results.
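One common, if imperfect, correction is to weight the sample back to known population targets. Below is a minimal post-stratification sketch with invented population and sample shares; it is not any firm’s actual weighting scheme, and weighting cannot fully repair a non-probability, opt-in sample.

```python
# Minimal sketch of post-stratification weighting; all shares are hypothetical.
# When a mode under-represents a group (here, younger, cell-phone-only voters),
# weighting respondents to known population shares can partially correct the skew,
# though it cannot fully fix a non-probability sample.

# Hypothetical population shares vs. shares actually achieved in the raw sample.
population_share = {"18-34": 0.28, "35-54": 0.34, "55+": 0.38}
sample_share     = {"18-34": 0.12, "35-54": 0.33, "55+": 0.55}

# Hypothetical candidate support within each age group in the raw sample.
support_in_sample = {"18-34": 0.62, "35-54": 0.50, "55+": 0.41}

unweighted = sum(sample_share[g] * support_in_sample[g] for g in sample_share)
weighted   = sum(population_share[g] * support_in_sample[g] for g in population_share)

print(f"Unweighted estimate: {unweighted:.1%}")  # skewed toward older respondents
print(f"Weighted estimate:   {weighted:.1%}")    # rebalanced to population shares
```

In this toy example, the unweighted estimate sits about three points below the weighted one purely because younger, cell-phone-only respondents were under-sampled.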

In our forthcoming article, we’ll look at how the polling industry needs to adapt to consistently produce better results and earn back trust.

Bradley Honan is CEO and President of the Honan Strategy Group, a Democratic polling and data analytics firm. Trevor Smith, Ph.D., is the Chief Research Officer of WPA Intelligence, a Republican polling firm.