In my beloved home state of Maryland, this year's governor's race is a rematch of the contest four years ago. Most polls show a close race, with current Gov. Martin O'Malley (D) up a few points over former Gov. Bob Ehrlich (R) but at or below the crucial 50 percent mark.
Enter the Washington Post, which two days ago released a poll that shows O'Malley up by 11 points, breaking the 50 percent mark. As might be expected, Post journalists are hyping the results, casting the race as possibly starting to break decisively in O'Malley's direction.
In an online chat, the Post's Chris Cillizza vouched for the poll by stating that pollster "Jon Cohen is the best in the business, so yes," O'Malley has indeed opened up a wide lead over Ehrlich. Today, the Post's Mike DeBonis penned a column about how O'Malley is "right now, in a place where a lot of his fellow Democrats around the country sure wish they were."
Eh, not so fast, veteran Maryland political observer Blair Lee argues in an October 1 article for Gazette.net.
The Post poll oversamples O'Malley-friendly demographic groups and ignores both the heightened energy among Maryland Republicans and this year's depressed Democratic primary turnout, Lee contends (emphasis mine):
[S]ince July virtually every national expert, professional handicapper and pundit has called Maryland's governor's race a neck-and-neck tossup. So how can O'Malley be ahead by 3 points (Rasmussen poll) on Sept. 15 and by 11 points (Post poll) less than two weeks later?
Well, not surprisingly, it's all in the methodology. Specifically, it's the different ways pollsters arrive at their sample.
On Election Day, more than 1.5 million Maryland voters will go to the polls. But obviously, pollsters can't call up 1.5 million voters, so, instead, they consult a small group of voters (730 likely voters in The Post poll) that is designed to reflect, as closely as possible, the same characteristics (race, gender, age, party affiliation, geographic residence, etc.) as the 1.5 million voters expected to actually vote.
This weighted microcosm of the electorate is called the sample, and a poll is only as accurate as its sample. In other words, if 52 percent of the voters on Election Day are women, then 52 percent of the 730 voters in the pollster's sample should be women. Likewise with every other category — race, party affiliation and so on. To the extent that a pollster's sample fails to accurately mirror the characteristics of Election Day voters, the poll is flawed.
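A quick aside from me rather than from Lee: the "mirror the electorate" idea he describes can be sketched in a few lines of Python. This is only an illustration of simple one-variable weighting, not the Post's or anyone else's actual procedure. The raw counts below are hypothetical; only the 730-person sample size and the 52 percent figure come from Lee's example, and the function names are my own.

```python
def poststratify_weights(sample_counts, target_shares):
    """Return one weight per group so weighted shares match target_shares."""
    total = sum(sample_counts.values())
    return {
        group: target_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


def weighted_share(group, sample_counts, weights):
    """Share of the weighted sample that belongs to `group`."""
    weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return sample_counts[group] * weights[group] / weighted_total


# Hypothetical raw counts for a 730-person sample (not the Post's data).
sample_counts = {"women": 350, "men": 380}
# Lee's illustrative target: 52 percent of Election Day voters are women.
target_shares = {"women": 0.52, "men": 0.48}

weights = poststratify_weights(sample_counts, target_shares)
women_share = weighted_share("women", sample_counts, weights)
print(weights)
print(round(women_share, 2))
```

Under these made-up counts, women are slightly underrepresented in the raw sample, so each woman's response counts for a bit more than one and each man's for a bit less. Real polls weight on several variables at once, which is where the judgment calls Lee goes on to describe come in.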
The first red flag in The Post's poll was O'Malley's 4-point lead among male voters. In every poll since 2005 (and in the 2006 election itself) Ehrlich wins among men. The latest Gonzales Poll has Ehrlich ahead by 8 points with men. The Post poll is the first time O'Malley has led among male voters, ever.
The next alarm bell was The Post poll's finding that in Montgomery County, Ehrlich gets only 27 percent of the vote. Impossible. In the 2006 election, he got 36 percent; in 2002, he got 38 percent.
Then, I saw O'Malley's favorable/unfavorable ratings. Post poll: 64 percent favorable, 26 percent unfavorable (+38 differential); Rasmussen's Sept. 15 poll: 54 percent favorable, 38 percent unfavorable (+16 differential). Too big a difference — something's wrong.
To The Post's credit, its pollster, Jon Cohen, shared his backup data and methodology with me. Here's what I discovered:
Most pollsters build their sample by looking at actual past voter turnout and exit polls from prior elections. Then they feed in current trends, what they're seeing in the field and, finally, their educated estimates based on experience and hunches. This is the art of polling.
The Post poll builds its sample much differently. Instead of using past elections, trends and hunches, it uses Census data. So, if blacks are 29.7 percent of the population, they are 29.7 percent of the Post poll's initial sample. If Republicans are 28 percent of the registered voters, they become 28 percent of the sample. Then, after building its sample based on this generic data, The Post refines its sample to final form by asking respondents a series of questions about their voting likelihood and intensity.
As you might expect, this difference in methodology leads to different samples, which, in turn, lead to different poll results.
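Another aside from me, not from Lee: here is a rough Python sketch of why the makeup of the sample matters so much. Every number in it is invented for illustration, including the party splits and the support levels within each party, so the margins it prints say nothing about the real race. The only point is that the same answers from the same kinds of voters produce different toplines when the groups are mixed in different proportions.

```python
def topline_margin(composition, support):
    """Candidate A's lead over candidate B, in percentage points.

    composition: share of the sample in each group (shares sum to 1).
    support: per group, a (share for A, share for B) tuple.
    """
    a_total = sum(share * support[g][0] for g, share in composition.items())
    b_total = sum(share * support[g][1] for g, share in composition.items())
    return (a_total - b_total) * 100


# Hypothetical support within each party group: (candidate A, candidate B).
support = {
    "dem": (0.85, 0.10),
    "rep": (0.08, 0.88),
    "ind": (0.40, 0.45),
}

# Two hypothetical samples that differ only in party mix.
sample_one = {"dem": 0.58, "rep": 0.26, "ind": 0.16}
sample_two = {"dem": 0.53, "rep": 0.31, "ind": 0.16}

margin_one = topline_margin(sample_one, support)
margin_two = topline_margin(sample_two, support)
print(round(margin_one, 1), round(margin_two, 1))
```

Shifting five points of the sample from one party to the other moves the lead by several points in this toy setup, even though nobody in it changed their mind.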
My problem with The Post poll's Census-based approach is that it misses a lot of political data, such as September's primary election returns. The Post poll says that O'Malley's and Ehrlich's supporters are equally enthusiastic, yet this year's low-turnout primary saw a Republican surge (37,901 more voters) and a big Democratic drop (125,628 fewer voters) compared with the 2006 primary. Something's going on.
The bottom line? (emphasis mine):
I believe The Post's sample over-counts African-Americans (who vote 9 to 1 Democratic) and undercounts Republicans, which accounts for O'Malley's startling lead. I further believe that O'Malley is leading Ehrlich by 3 to 4 points, primarily because O'Malley, with three times more campaign cash than Ehrlich, has had the airwaves to himself since June. But, trust me, O'Malley's lead is not 11 points. That may be the way some folks want it to be, but it's not the way it is.