OK, here’s my first take on what a poll’s “margin of error” really means. (Title of this post shamelessly cribbed from the Car Talk guys, of course!)
First, you don’t actually get enough information just by knowing what the margin of error is. You also need to know the “confidence level.” Apparently, a confidence level of 95% is common in polls, but other levels can be used (99% is also commonly used in some kinds of studies), so you need to know. Fortunately, Survey USA tells us: 95% it is. (Aside to statistics geeks: a 95% confidence level corresponds to about two standard deviations (1.96, to be precise) on either side of the mean; a 99% level corresponds to about 2.6, and three standard deviations gets you to roughly 99.7%, all assuming a normal distribution.)
So: a result that 55% of the people in an appropriate random sample support candidate X, with a 5% MOE at 95% confidence, means that we can be 95% certain that the actual support for candidate X in the relevant community is somewhere between 50% and 60%. (95% certain? What does that mean? As I understand it, it means that if you ran this survey 100 times, each with a fresh random sample, the range you computed would contain the “real” number in about 95 of them.)
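To make the “run the survey many times” idea concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the “real” support of 55%, the sample size of 400 (which gives a margin of error of about 5%), and the function names `poll_interval` and `coverage`. It uses the standard normal-approximation interval, which may or may not be exactly what any particular pollster uses.

```python
import math
import random

def poll_interval(p_hat, n, z=1.96):
    """Return (low, high) for a sample share p_hat from n respondents."""
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

def coverage(true_p=0.55, n=400, trials=2000, seed=0):
    """Fraction of simulated polls whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each simulated respondent supports candidate X with probability true_p.
        supporters = sum(rng.random() < true_p for _ in range(n))
        low, high = poll_interval(supporters / n, n)
        if low <= true_p <= high:
            hits += 1
    return hits / trials

print(f"share of intervals containing the true value: {coverage():.3f}")
```

If the interpretation above is right, the printed share should come out somewhere near 0.95.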
Applying this to the results from the last two Survey USA Dem primary polls, here’s what we actually know.
We can be 95% certain that the support for these candidates among likely Democratic primary voters falls somewhere in the following ranges (previous poll results in parentheses):
- Patrick: 30.4-39.6 (32.1-41.9)
- Gabrieli: 25.4-34.6 (22.1-31.9)
- Reilly: 22.4-31.6 (21.1-30.9)
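For anyone who wants to check the arithmetic, the ranges are just each candidate’s poll number plus or minus the margin of error. In the sketch below, the poll numbers and the half-widths (4.6 for the current poll, 4.9 for the previous one) are backed out from the ranges listed above rather than quoted from the poll itself, and `support_range` is a made-up helper name.

```python
def support_range(share, moe):
    """Return (low, high) in percentage points, rounded to one decimal."""
    return round(share - moe, 1), round(share + moe, 1)

# Midpoints inferred from the listed ranges, not quoted from the poll.
current = {"Patrick": 35.0, "Gabrieli": 30.0, "Reilly": 27.0}
previous = {"Patrick": 37.0, "Gabrieli": 27.0, "Reilly": 26.0}

for name in current:
    lo, hi = support_range(current[name], 4.6)    # inferred half-width
    plo, phi = support_range(previous[name], 4.9)  # inferred half-width
    print(f"{name}: {lo}-{hi} ({plo}-{phi})")
```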
We also know, from a spiffy little calculator that Kevin Drum set up, that in the last poll there was a probability of 99.5% that Patrick was actually ahead of Gabrieli, a probability of 99.8% that he was actually ahead of Reilly, and a probability of 61% that Gabrieli was actually ahead of Reilly. (Note that this is just the probability that candidate X is ahead of candidate Y by some unspecified amount – as little as 1 vote would do it.) In the current poll, those numbers have changed a bit: the probability that Patrick is “really” ahead of Gabrieli is now down to 91.3%; the probability that Patrick is ahead of Reilly is down to 98.7%; and the probability that Gabrieli is ahead of Reilly has increased to 80.8%.
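I don’t know exactly what the calculator does under the hood, so what follows is only a sketch of one standard way to get this kind of number: a normal approximation to the difference between two shares from the same sample. The assumptions are mine: the sample size is backed out from the margin of error (treating it as the worst-case figure for a 50% share), the poll percentages are the midpoints of the ranges above, and the function names are made up.

```python
import math

Z95 = 1.96  # z-score for a 95% confidence level

def sample_size_from_moe(moe, z=Z95):
    """Back out n, assuming the reported MOE is the worst case (50% share)."""
    return (z * 0.5 / moe) ** 2

def prob_ahead(p1, p2, n):
    """P(candidate 1 really leads candidate 2), normal approximation."""
    # Variance of the difference of two shares measured in the same sample.
    var = (p1 + p2 - (p1 - p2) ** 2) / n
    z = (p1 - p2) / math.sqrt(var)
    # Standard normal CDF written with the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n_prev = sample_size_from_moe(0.049)  # assumed MOE of the previous poll
n_curr = sample_size_from_moe(0.046)  # assumed MOE of the current poll

print(f"previous: Patrick over Gabrieli {prob_ahead(0.37, 0.27, n_prev):.1%}")
print(f"current:  Patrick over Gabrieli {prob_ahead(0.35, 0.30, n_curr):.1%}")
```

Under these assumptions the results should land close to the calculator’s figures, though not necessarily right on them, since the real sample sizes and the calculator’s exact method may differ.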
Note how the probability calculations reflect the margins of error. Since the margin of error means that we can say, with 95% certainty, that the candidate’s “real” support is within the expected range, one would expect that, if one candidate’s range has no overlap with another candidate’s range, the probability of that candidate being ahead would exceed 95%. And that is exactly what we see. Looking just at Patrick and Gabrieli, for example, the most recent poll shows an overlap in the expected (i.e., estimated with 95% confidence) “real” ranges of support – that is, the lower limit of Patrick’s expected “real” support is 30.4%, and the upper limit of Gabrieli’s is 34.6% – though neither of those extremes is all that likely. Accordingly, the probability that Patrick is “really” ahead of Gabrieli is 91.3% – pretty good, but not 95%. In the previous poll, on the other hand, there is no overlap in the expected ranges, and accordingly the likelihood that Patrick was “really” ahead of Gabrieli at that time was 99.5%.
So I was wrong to say that there’s no meaningful difference between the last poll and this one. There is: the likelihood that Patrick is “really” ahead of the closest of his rivals is down from over 99% to just over 90%. He’s still looking good, but the difference is meaningful.
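However, it is also wrong to describe the current poll as a “statistical dead heat,” as is commonly done when the margin of error allows overlap between two candidates. All an overlap between the expected ranges for the two candidates really means is that you can’t tell from the ranges alone whether the “real” likelihood that one is ahead of the other reaches 95%. It may fall short, as with Patrick and Gabrieli (91.3%), or it may not: Patrick’s and Reilly’s ranges in the current poll overlap too (30.4% at the bottom of Patrick’s, 31.6% at the top of Reilly’s), yet the probability that Patrick is ahead of Reilly is 98.7%. So how likely is it? Use the spiffy calculator referenced above to find out – all you need to know are the percentages reflected in the poll and the sample size.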
Another fun fact: when comparing the lead of candidate X over candidate Y, you cannot simply add the two margins of error. The actual calculation is complicated, but a good approximation is to multiply the poll’s reported margin of error by 1.7 if you want to figure the margin of error for one candidate’s lead over another. So the current SurveyUSA poll’s margin of error of +/- 4.9% becomes an 8.3% margin of error when comparing two candidates.
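Here is a small sketch of where a multiplier like 1.7 can come from. It assumes the reported margin of error is the worst-case figure for a 50% share, and it uses the standard formula for the variance of the difference between two shares from the same sample. The function names are made up, and the 35% and 30% shares are the Patrick and Gabrieli midpoints inferred from the ranges earlier in the post.

```python
import math

def lead_moe_multiplier(p1, p2):
    """Ratio of the MOE on the lead (p1 - p2) to the poll's reported MOE."""
    # Reported MOE is assumed to be the worst case: z * sqrt(0.25 / n).
    # MOE on the lead is z * sqrt((p1 + p2 - (p1 - p2)**2) / n).
    # Both z and n cancel in the ratio.
    return 2 * math.sqrt(p1 + p2 - (p1 - p2) ** 2)

def rule_of_thumb(reported_moe, factor=1.7):
    """The quick approximation: reported MOE times a fixed factor."""
    return reported_moe * factor

print(f"rule of thumb on 4.9: {rule_of_thumb(4.9):.1f}")
print(f"multiplier for 35% vs 30%: {lead_moe_multiplier(0.35, 0.30):.2f}")
```

For these particular shares the multiplier comes out a bit under 1.7, so the rule of thumb errs slightly on the cautious side. The multiplier grows as the two candidates’ combined share grows, reaching 2 when the two of them split the whole sample evenly.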
That’s all for now. If I’ve got some of this wrong (and I wouldn’t be surprised if I have), feel free to correct me. And more to come, no doubt!