Vision2 poll's real margin of error
There are several serious flaws with the Vision2 poll that was published today in the Tulsa World and conducted by SoonerPoll.com. The results weren't too surprising, and they may very well match what we see on election day, but the results should nevertheless be taken with a big grain of salt.
I commend SoonerPoll.com for being transparent enough about their approach that the limitations imposed by that approach can be discussed. Here's their disclosure:
A poll of 440 likely voters was conducted by SoonerPoll.com, using a random digit dialing technique that included both cell phone and landline telephone numbers.
Interviewers collected the data Oct. 25-Nov. 1. Results were weighted by gender and phone status (cell phone only/landline only/both). The poll was sponsored by the Tulsa World.
The margin of error is plus or minus 4.67 percentage points. This poll conforms to the Standards of Disclosure of the National Council on Public Polls. A complete description of the methodology can be found here.
In SoonerPoll.com's detailed description of their methodology, they write:
A detailed methodology that discloses sampling errors and statistical tests of significance will be made available for every survey conducted by SoonerPoll.com. Among other data, SoonerPoll.com's methodology reports will include sample sizes and disposition reports.
So far, they don't seem to have published sample size and disposition reports for this survey.
The biggest problem first: You can't simply add up samples drawn from different populations and then compute the margin of error from the combined total. In their disclosure, SoonerPoll.com effectively tells us that they sampled six separate populations, then weighted the results according to their estimate of the share each subset contributes to the total electorate.
Margin of error is mathematically tied to sample size. (You can find a margin of error calculator here.) As your sample size gets bigger, the margin of error decreases. The MOE for a sample of 440 from a single homogeneous population is indeed plus or minus 4.67 percentage points at the 95 percent confidence level.
But what we have here are samples of six separate populations (men with cell phones, men with land lines, men with both, women with cell phones, women with land lines, women with both) that together add up to 440 respondents.
In the best-case scenario, they got about 73 responses from each group (440 divided by 6). That's an MOE of plus or minus 11.47 percentage points for each of the six samples. You can't simply combine six separate samples, each with an MOE of 11.47, and magically get a lower MOE for the aggregate. At best, your MOE is plus or minus 11.47 percentage points, and on top of that there's the potential for error in the weight you assign to each sample.
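Both figures above come from the standard margin-of-error formula for a simple random sample, MOE = z·√(p(1−p)/n), using the worst-case proportion p = 0.5 and z = 1.96 for 95 percent confidence. A few lines of Python confirm the arithmetic:

```python
import math

def moe(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of size n,
    using the worst-case proportion p = 0.5."""
    return z * math.sqrt(p * (1 - p) / n)

# The poll's headline figure: one homogeneous sample of 440.
print(round(moe(440) * 100, 2))  # 4.67 percentage points

# Best case for each of the six gender/phone-status cells: 440 / 6 ~ 73.
print(round(moe(73) * 100, 2))   # 11.47 percentage points
```

Note that both numbers match the figures discussed above: 4.67 for the full sample, 11.47 for each 73-person subgroup.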
Legal Insurrection has a good overview of polling sample size and margin of error.
The next problem: How do you know your respondents are really likely voters?
Let me acknowledge that it is far more difficult for candidates to get information to the voters and for pollsters to get information from the voters than it was in the days when everyone had a landline, no one had a mobile phone, no one had caller ID, and almost everyone had a listed number. Today, you might get a phone to ring, but no one will pick up because they don't recognize the number. I hear that the ratio of completed responses to dialing attempts is typically less than 10%.
There is a great deal of debate in the polling community as to how to compensate for these challenges. One approach, which SoonerPoll.com has taken here, is to dial numbers randomly, then ask a series of screening questions to determine whether the respondent is likely to vote. Each pollster has his own set of screening questions. Some may be as simple as asking the voter to rate his own likelihood of voting; some may involve asking the voter if he knows where his polling place is.
The other approach is to use past voting history to identify likely voters. In Oklahoma and in most other states, the election board keeps a record of the elections in which you voted and by what method (in person on election day, absentee by mail, absentee in-person at the election board). A pollster can match voter records with phone numbers and then call only registered voters, or screen more tightly based on past voting frequency and recency of registration. There are problems with that approach too -- voters without landlines, voters who move but continue to vote at their old address, ambiguities in matching voter names to phone subscriber names.
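As a sketch of how a voter-history screen might work (the field names and the two-of-the-last-three-elections rule below are my own illustrative assumptions, not any particular pollster's actual method), a pollster might score each matched registrant like this:

```python
# Hypothetical voter-file screen. The record layout and the
# "voted in 2 of the last 3 elections" threshold are invented
# for illustration only.

def is_likely_voter(record,
                    recent_elections=("2010-11", "2011-04", "2012-03"),
                    min_votes=2):
    """Treat a registrant as 'likely' if they voted in at least
    min_votes of the listed recent elections. Recently registered
    voters pass automatically, since they have no history to judge."""
    if record.get("registered_since", "") >= "2012-01":
        return True
    votes = sum(1 for e in recent_elections if e in record["voted_in"])
    return votes >= min_votes

voter = {"name": "J. Smith", "registered_since": "2004-06",
         "voted_in": {"2010-11", "2012-03"}}
print(is_likely_voter(voter))  # True: voted in 2 of the last 3
```

A real screen would be more elaborate, but the sketch shows the trade-off: the rule is objective and based on actual behavior, yet it inherits every flaw in the underlying file, including the phone-matching and stale-address problems noted above.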
Pollster John McLaughlin (not the TV pundit) explained the problems with random-digit-dial polling about a month ago in the context of the presidential race:
So recently it was revealed by the Daily Caller that Obama's most senior campaign strategist David Axelrod has been lobbying Gallup Poll staffers saying that their polls were "saddled with some methodological problems". Dick Morris reported that Axelrod was upset at Gallup for "generating polling data negative to the President." Gallup didn't change their methods and by coincidence found the Justice Department suing them with an unrelated lawsuit. You only have to wonder if these other media pollsters received emails, calls and visits about the correct Axelrod methodology.
So what's the common Axelrod methodology that causes the media polls to under count Republicans? Are they calling registered voters from the publicly available lists with actual voter history? Those lists easily reflect the 130 million voters who turned out in 2008, or 2010, or have registered since those elections. They truly represent the actual voter population. Good scientific sampling would say pull a random sample of voters from the actual population of voters.
However, David Axelrod has been urging pollsters to randomly dial phone exchanges and cell phone exchanges and merge them somehow without regard to voter affiliation. The 2010 Census said that the American Voting Age Population was over 230 million adults. About 40% don't vote. Calling the 100 million eligible adults who choose not to register, or are registered, but don't vote, waters down enthusiastic Republicans. Who knows if the person who is talking to the NBC pollster is really registered to vote? Overall there's about a quarter of a million landlines in the United States that could be called. Plenty more than actual voters. However, if that doesn't dilute the Republicans enough, there's over 330 million wireless cell phone connections in the United States that can be randomly dialed.
So these swing state media pollsters are just randomly dialing the phone book and cell phone listings to water down Republican votes. The deck is stacked. Regardless how Mitt Romney does tonight he can't win the post debate polls - unless they call voter lists and make sure the demographics match the real voter file for age, gender, race geography and even party.
Then there's the duration of the survey period: it took a full week to collect the sample, by which time the first people contacted may have changed their minds.
Finally, they haven't disclosed how the questions were worded. Voters won't see words like "Vision2," "deal-closing fund," or "low-water dam" on the ballot, and they might have a different reaction to the pollster's question than they would to the actual ballot language.
It may be that all these flaws cancel each other out, and I don't mean to cast blame on SoonerPoll.com, which is no doubt doing its best to gauge public sentiment in an increasingly difficult environment. We'll find out on Tuesday.
Why does it matter? Poll results can be used to create a bandwagon effect, particularly when an issue isn't strongly partisan. Without any strong sense of what to do, some voters will go along with whichever side they see in the majority. That's why, if you're in opposition to Vision2, you need to post it on your Facebook wall, put a sign in your yard, and send an email to friends explaining why you're voting no.