This, published by Markos at Daily Kos, is quite worrisome.
I have just published a report by three statistics wizards showing, quite convincingly, that the weekly Research 2000 State of the Nation poll we ran the past year and a half was likely bunk….
We contracted with Research 2000 to conduct polling and to provide us with the results of their surveys. Based on the report of the statisticians, it’s clear that we did not get what we paid for. We were defrauded by Research 2000, and while we don’t know if some or all of the data was fabricated or manipulated beyond recognition, we know we can’t trust it. Meanwhile, Research 2000 has refused to offer any explanation….
While the investigation didn’t look at all of Research 2000 polling conducted for us, fact is I no longer have any confidence in any of it, and neither should anyone else. I ask that all poll tracking sites remove any Research 2000 polls commissioned by us from their databases. I hereby renounce any post we’ve written based exclusively on Research 2000 polling.
Very unpleasant business. We commissioned and published three (IIRC) Research 2000 polls over the last couple of years, including one in the Scott Brown-Martha Coakley Senate race that got a lot of attention. We will be following what happens between Daily Kos and Research 2000 with great interest. Where things are now:
the lawyers will soon take over, as Daily Kos will be filing suit within the next day or two.
If any statistics wizards care to look over the report to which Markos refers, it is available here.
UPDATE: This is getting really ugly. Daily Kos’s lawyer, Adam Bonin, has told TPM that Research 2000 “handed Daily Kos fiction.” Research 2000, for its part, has retained the large Howrey law firm, which has both threatened to sue Daily Kos and sent a threatening cease and desist letter to Nate Silver at FiveThirtyEight.com.
jconway says
Was not going with Car Talk’s pollster, Paul Murky of Murky Research.
heartlanddem says
Are reliable too.
peter-porcupine says
couves says
Awkward.
lasthorseman says
the publicity out of said poll you wanted. What, BTW, is the point of asking morons what they think, outside of trying to market something? You may want to Google Mark Dice and find the YouTube video about Doc Holliday and Wyatt Earp signing the Declaration of Independence.
mark-bail says
reptilian humanoids for a conspiracy theory that at least has entertainment value.
dcsohl says
Because our system of governance is based on asking those same “morons”* to decide who’s going to lead the commonwealth and nation?
* Your word, not mine.
amberpaw says
While “figures don’t lie,” “liars can figure,” and it is all in how the numbers are set up and analyzed.

Not being a “numbers person” and having to use a CPA, like I do, I have learned that the presentation seems to matter more than the data most of the time.
stomv says
This isn’t about figuring out the combination of phraseology and resulting numbers which offer a rosy picture.
This is about the raw data being so statistically unlikely that error — willful or not — is the only explanation.
Let’s say I ask every BMGer to go collect all the coins in their couches, under their car seats, in their pockets, and in their purses and wallets, and count it up. If every single BMGer reported that the pennies found always came in sets of three, that the number of quarters was always even, that the people with more sofas and cars always found more coins than those with fewer sofas and only bicycles, etc… that’s what this data looks like. The raw data itself has statistical patterns which simply don’t happen in actual sampling — they could only be explained by a measurement error.
somervilletom says
It looks like manufactured data to me — like they didn’t do the poll at all and instead conjured up some guesses and took the money to the bank.
stomv says
it might not be. I won’t suggest one or the other.
An example of the problems is the even/odd pattern. When counts were broken down by gender, the two numbers were almost always both odd or both even. This data isn’t natural — but is it pencil-whipped, or is it just some bug or rounding error in software? I won’t speculate.
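Here’s a quick Python sketch of why that even/odd pattern is so damning — a toy simulation with parameters I made up for illustration, not R2K’s actual data or sample sizes. In honest, independently sampled gender cross-tabs, the two rounded percentages should agree in parity (both even or both odd) only about half the time:

```python
import random

def parity_match_rate(n_polls: int, n_per_group: int = 300,
                      p: float = 0.5, seed: int = 1) -> float:
    """Simulate honest polls; return the fraction in which the men's and
    women's rounded percentages are both even or both odd."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_polls):
        men = round(100 * sum(rng.random() < p for _ in range(n_per_group)) / n_per_group)
        women = round(100 * sum(rng.random() < p for _ in range(n_per_group)) / n_per_group)
        matches += (men % 2) == (women % 2)
    return matches / n_polls

# Honest sampling hovers near 50% parity agreement;
# R2K's cross-tabs agreed nearly 100% of the time.
print(parity_match_rate(1000))
```

Getting near-100% agreement from data like this is the statistical equivalent of flipping a coin a thousand times and always calling it right.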
Another example is that the week-to-week changes in Obama’s ratings in the tracking poll were almost NEVER +0. He’d get + (or -) 1 or 2 or 3 in a way which mapped to a normal distribution centered on 0, but instead of 0 being the most frequent value, it almost never occurred. Fraud, or the artifact of some sort of moving average? I won’t speculate.
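A similar toy simulation (again, my own illustrative parameters, not R2K’s actual methodology or sample sizes) shows why a legitimate weekly tracker should produce plenty of +0 weeks:

```python
import random

def zero_change_rate(n_weeks: int = 1000, n_resp: int = 1000,
                     p: float = 0.5, seed: int = 7) -> float:
    """Simulate an honest weekly tracking poll and return the fraction of
    week-to-week changes in the rounded topline that are exactly zero."""
    rng = random.Random(seed)
    pcts = [round(100 * sum(rng.random() < p for _ in range(n_resp)) / n_resp)
            for _ in range(n_weeks)]
    deltas = [b - a for a, b in zip(pcts, pcts[1:])]
    return deltas.count(0) / len(deltas)

# With honest sampling, 0 is the modal weekly change (roughly one week
# in six at this sample size); in the R2K tracker it almost never appeared.
print(zero_change_rate())
```

The exact rate depends on sample size, but the point is that zero should be the single most common change, not a vanishingly rare one.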
ryepower12 says
That’s pretty amazing. Stunning, actually.
cater68 says
I predicted a Brown victory through clenched teeth. Anyone with a pulse knew something extraordinary was afoot. It would be interesting to re-post the poll and affiliated comments….
stomv says
The R2K polls are concerning not because their results were out of line with expectations, but instead because they were too much in line with expectations.
Check pollster. The R2K polls weren’t coming up with statistically different top line results than the other pollsters (save Ras).
It’s like this: go to a shopping mall in the middle of the day. I’m going to pay you $100 to count all the cars. I don’t know how many cars are in the parking lot, and neither do you. We both know that three other folks have counted, and have come up with 825, 832, and 803.
Now, you have a few choices:
1. Count the cars, and try to do a really good job.
2. Count the cars, but if you make a few mistakes, who cares?
3. Don’t count the cars — just return a number that sounds close.
(1) is the hardest, and (3) is the easiest. It really looks like what R2K did is akin to (3). Their results were believable in each single-poll analysis because they fell in line with the others. So let’s say hypothetically that R2K were hired to count cars in 5,000 parking lots. While their numbers always seemed right, over the span of thousands of counts, they looked like this:
832
23658
8258
14
770
336
.
.
.
2224
Notice something about those numbers? They’re all even. You’d never notice this unless you looked at their results across many, many polls. Once you’ve noticed it, it’s easy to do statistical tests to see how likely that pattern would exist “in real life”… and it turns out that some of the patterns turn up on the order of one in gajillions.
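The arithmetic behind “one in gajillions” is easy to sketch (with illustrative counts of my own, not the report’s actual figures): if an honest count is even with probability 1/2, the chance that every one of n independent counts comes out even collapses exponentially.

```python
def odds_all_even(n_counts: int) -> float:
    """Chance that n independent honest counts all come out even,
    taking each count to be even with probability 1/2."""
    return 0.5 ** n_counts

print(odds_all_even(8))    # 1 in 256 -- already suspicious
print(odds_all_even(200))  # about 6e-61 -- "one in gajillions" territory
```

That’s why no single poll looks wrong, but the pattern across thousands of counts is impossible to explain by honest sampling.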
The problem with the R2K numbers is not that they’re wrong. The top lines are consistently reasonably accurate. The problem with the R2K numbers is that the underlying data demonstrates statistical properties which suggest that it isn’t the correct raw data — that either (a) it’s made up, or (b) it’s been fudged in a way which is systematic. Not fudged to get a different overall outcome, but fudged to make the numbers more clean — like getting a regular trim doesn’t make your hairdo shorter, it just makes the longest hairs a bit shorter.
af says
make people question the whole field of political polling. What do they hope to accomplish, get an idea where the electorate is on a given issue or race, move support for or against a candidate or issue by a desired poll result, or reinforce a chosen position by promoting a poll that supports it? I think the political news industry is far too hooked on the opiate of polls to withdraw, but withdraw they must. Polls have become like the scoop or exclusive to the news business. Do they inform the viewers, or are they just a selling feature that says “I’m doing my job better than them, buy from me”?
peter-porcupine says
Said this many times – the ONLY polls that count are taken in Novembers.
sabutai says
So will you sign this letter that says the poll taken last January that put Scott Brown into the Senate doesn’t count?
peter-porcupine says
johnt001 says
It’s on the rec list at this link:
http://www.dailykos.com/story/…
It has a very interesting chart showing an analysis of the cross-tabs for R2K vs. PPP – the R2K results are all highly correlated, while PPP’s show randomness in the correlation. Highly correlated numbers are not random; they are artificial, made up. Checking the correlation in this manner is a method for detecting accounting fraud.
stomv says
D,C,or B:
Drop me an email and I may be able to give this an hour or two this weekend. At the very least, I may be able to identify if the problematic raw data trends in the kos polls also show up in the raw data for which y’all paid.
I’m no statistics guru, but I do have a relatively high comfort level when taking a dip in raw data.
sleeples says
Check out Nate Silver’s fantastic breakdown of the nonrandom results.
Amazing. There is so little verification of pollsters that we now have at LEAST two companies just raking in money off phony data. How many other polling firms are scams?
kate says
I was reminded the other day that you won the prediction contest I did back in the Coakley race. I didn’t post the “winner” until the thread was dead. Please contact me off-line at KateDonaghue AT aol DOT com. I don’t remember what I promised the winner!
stomv says
what was interesting about the R2K polls is that the problem wouldn’t have been discoverable had kos not commissioned continuous polling. If R2K had done polling the way lots of other firms do — a poll here, a poll there, without consistent time intervals and samples — it wouldn’t have been visible at all.
sabutai says
But if you were going to defraud clients of thousands of dollars as a polling operation, wouldn’t you be a little better at it than this?
It would take about two hours to write a quick program (even in BASIC) to plug in random deviations from some baseline numbers. Run the program every week and voilà, who would know?
shillelaghlaw says
10 A = INT(RND(1) * 100)
20 PRINT A
30 GOTO 10
medfieldbluebob says
Or, their random number generator is seriously flawed. Which is why they eventually got caught (if indeed they’ve been “caught”). There is enough non-randomness in their data to be very suspicious, and a good random number generator wouldn’t have generated that.
It strikes me that a fair amount of thought and effort went into faking this data. You need randomness, yes. But the results also have to pass muster with a whole lot of people, some of them very good statisticians/pollsters like Nate at FiveThirtyEight.com. And the results had to compare, somehow, with a few dozen other pollsters, who are also your competitors and would love to expose you as a fraud.
Thousands of eyeballs were looking at this data, and two years later they’re getting caught. The Obama tracking poll differences are the most compelling, for me anyway. If a simple random number generator was used, there would have been many 0’s in that data. There are almost none.
Like Madoff, if they’d put as much thought and effort into actually doing the work they got paid to do, they might have actually produced decent results.
kbusch says
When cryptographers go about breaking codes they look very carefully for things that are non-random. Research 2000 appears to have been playing a game at which they could easily be caught.