Chapter 5: Three Easy Ways to Spot Spin

By Alan F. Kay, PhD
© 2004, (fair use with attribution and copy to author)
July 9, 2004

A little explanation of how polling works will lead us into easy ways to spot spin.  Regardless of the question, some people will give a “non-substantive” answer, called a “DK” in Chapter 2.  They may say, “I’m not sure,” “I dunno,” “That’s a tough question,” “What?,” “I can’t answer that” (surprisingly rare), and maybe they will say nothing at all – for a long time. After all, if the interview is in a shopping mall, they might turn away to chat with a passing friend, or, at home, they might have laid the phone down to let someone in, stop the dog from chewing up the curtains, take Johnny to the bathroom, etc. All of these non-substantive responses are dumped into the DK bin and called the DKs.

The number of DKs can vary all over the place, from less than 1% to well over 50%.  That broad range turns out to be a clue for spotting spin.

Small DKs are associated with a well-designed question that:

  1. Is clear, unambiguous, tightly phrased in standard language with no unnecessary words.
  2. Offers a range of response choices sufficiently wide so that people find one that they have no difficulty in agreeing to.
  3. Follows closely, naturally and logically from preceding questions in the poll or from a short preamble that serves as a stage-setting introduction.
  4. Is one that people feel is important.
  5. Has a lot of meaning and significance: most people know it is pertinent to things that affect their lives deeply and significantly.

A question scoring high on all of these counts will have less than 2% DKs.

An important kind of question, easily meeting criteria 4 and 5, is a question on what people want for governance – policy, legislation, regulation and government actions.

In contrast, questions that typically have high DKs, usually 10% to over 50%, ask for:

  1. Opinions on a breaking news story.
  2. An evaluation of a one time situation or an event.
  3. A prediction of the outcome of an event or a policy.
  4. Information on a subject that a significant fraction of the populace is unaware of.
  5. Factual knowledge (like quiz-show questions), and/or are unclear, ambiguous, or confusing. (Most people will not approve what they do not understand.)

Statistical analysts and academic pollsters shun criteria like these that do not lend themselves to mathematical precision.  As I write, there are PhD political scientists trying to show that there is a correlation between length (the number of words in a question) and the size of the DKs.  That these two correlate is not an unreasonable conjecture.  If one has never paid much attention to poll question responses, it is easy to imagine that respondents, fatigued from hearing a lengthy question, get confused and opt for a DK.  However, the reality is that all of the above criteria, which came from the careful study of tens of thousands of question responses, show that a question with few DKs must generally be quite long, not short (more on this in Chapter 9).  Readers who have followed this presentation now know an important thing about polling that many academic pollsters, who have to do fancy statistical analyses for their theses, simply do not know.  Since we are only on the second page of this chapter, that’s not a bad record of accomplishment.  You get a good grade.

But now it’s time to move on and link the size of the DKs to spin.  When you hear or read a poll result reported by the media, the DKs are usually not mentioned at all.  Let’s look at a typical example.  Assume that the substantive responses actually offered in the poll were the four choices: “strongly agree,” “somewhat agree,” “somewhat disagree,” and “strongly disagree.”  The media usually collapse them, by combining the two agrees and the two disagrees to get, for example: 25% agree and 65% disagree.  The missing 10% is the size of the DKs.  It is the shortfall of 25 plus 65 from 100.  As we said before, it’s like making change.  25 + 65 + 10 pennies give you a dollar.

So, if a poll result comes from the media and doesn’t add up to 100%, just count the pennies, see how you are being short-changed, and estimate the size of the DKs.  An added simplification is that you don’t have to get the arithmetic exactly right.  Because of sampling error, none of these numbers is exact, so being off by a few pennies does not matter.
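
If it helps to see the penny-counting spelled out, here is a minimal sketch in Python, using the hypothetical 25%/65% example above rather than any real poll:

```python
# Minimal sketch: estimate the DK share from a media-reported result that has
# been collapsed to "agree" vs. "disagree" (hypothetical numbers from the text).

def estimate_dk(reported_percentages):
    """Return the implied DK share: whatever is missing from 100%."""
    return 100 - sum(reported_percentages)

# The example above: 25% agree, 65% disagree.
print(estimate_dk([25, 65]))   # -> 10, the missing "pennies"
```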

Sometimes the media skirt DKs altogether by reporting a result this way:  “A new poll shows that the American people approve of school vouchers by three to two.”  You cannot figure the DKs from that.  To show the significance of this tricky way of reporting, consider two possibilities: (1) DK is 10% and (2) DK is 50%.  The numbers for the three-to-two split then become: with (1), favor 54% and opposed 36%; and with (2), favor 30% and opposed 20%.  Quite a difference!  In one case a majority is in favor, and in the other not even one-third of the public is in favor.  Similar results will happen for splits other than three to two.  We cannot tell what the situation is without more information.  On a school voucher question, for example, would the DKs be as large as 50%?  Ah, there’s the rub.  It may well depend on the exact wording of the question, and the media seldom show us the complete question.  When many people are very unfamiliar with a new subject — such as globalization more than 10 years ago — there can be remarkably high DKs, such as the 45% DKs in Chapter 4.
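
To see how much difference the assumed DK makes, here is a minimal sketch that splits the non-DK share of the sample in the reported ratio; the 10% and 50% DKs are the two assumed possibilities just discussed, not reported figures:

```python
# Minimal sketch: what a reported "three to two" result implies once a DK size
# is assumed.  The ratio alone does not pin down the actual percentages.

def split_from_ratio(favor_part, oppose_part, dk_percent):
    """Split the non-DK share of the sample in the reported ratio."""
    substantive = 100 - dk_percent
    parts = favor_part + oppose_part
    return (substantive * favor_part / parts, substantive * oppose_part / parts)

print(split_from_ratio(3, 2, 10))   # -> (54.0, 36.0): a clear majority in favor
print(split_from_ratio(3, 2, 50))   # -> (30.0, 20.0): less than a third in favor
```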

If the media do not give the percentages favoring and opposed, they are telling the public very little.  Is that spin?  Since most people do not even notice the DKs and don’t miss them when not given (spin by “omission”), there is a widespread presumption that DKs are unimportant.  Yes, a small DK is not very important.  But when omitted, its size is unknown, and since it could be large, not knowing is unfair and misleading.  I call that spin.  It is often not the reporter who is at fault.  It could be the editor, the station owner, a PR flack, or the pollster.  But someone has put a spin on it.  Don’t trust a poll result that does not give the substantive percentages or the DK explicitly.

The DK bin acts like a safety valve.  If there are no other clues that the public does not like any of the choices offered very much, or cannot deal with the question itself, a large DK is the best clue that spin is present.

Good pollsters often give the public other opportunities to opt out of the choices presented.  They allow volunteered responses.  What that means may not seem obvious until you think about the way surveys are taken.  Data entry is always into a computerized system like CATI (Computer-Assisted Telephone Interviewing) or by hand on a pre-printed paper form.  For in-person or field interviews, there are software systems functionally equivalent to CATI.  This recording process is essential for quality control, to make certain that all interviewers read the same questions and offer the same choices, especially in highly interactive questionnaires, where answers to one question change what the following questions will be.  Only good software makes highly interactive polls practical.

We saw in Chapter 2 that many CATI-based polls are set up to accept only choice A, choice B, and DK.  No other responses are recorded.  Thus, if the respondent, R, gives any substantive response except A or B, that volunteered response cannot appear in the poll data.  By contrast, the question in the following example allows the volunteered response into the poll data.  It does so by asking the question of two different samples of the public, in two slightly different but important ways.  For one sample, the volunteered response possibilities were not mentioned by the interviewer; for the other they were.  When that is done, here is how it works out:

Whatever number of respondents volunteer, there are also a large number who would make that same choice — but only if the choice were explicitly read by the interviewer in the same way as all the other choices.  Why is that?  A lot of people are not proactive.  The interview may remind them of school tests or job application interviews.  To pass the test or get the job, they learned not to question the questioners, to choose only from what they were offered.  For long after, such people are not likely to rock the boat.  Case in point:  almost everyone drafted into the army in  WWII learned quickly, “never volunteer.”  Here is a real question example:

Question (ATI#5, ’88):  In combating the drug problem, which do you think our government should concentrate MORE on: (ROTATE choices)

                                                            Volunteered        Volunteered
                                                            NOT Mentioned      Mentioned

  Stopping Americans from USING illegal drugs?                   39%               30%
  Stopping other countries from PRODUCING illegal drugs?         33%               27%
  Both equally (volunteered)                                     24%               36%
  Neither (volunteered)                                           3%                5%
  DK                                                              2%                2%

Note that, compared to the results in the “Mentioned” column, when the interviewer did not mention that “Both” and “Neither” were acceptable answers, fewer respondents gave those choices and hence more gave the two substantive choices.  (Both columns have to add to approximately 100, including the DKs, which are negligible in this example.  Yes, the first column adds to 101, due to round-off error.)

Allowing all reasonable choices to be heard can be extremely important.  In this example, when the volunteered choice was mentioned, the third-place choice shot up from 24% to 36%, the top of the four choices, a 50% increase.
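
As a quick check of the arithmetic in this example, here is a minimal sketch that verifies the column totals and the size of the jump in the “Both equally” choice:

```python
# Minimal sketch: checking the drug-policy example above.  Each column,
# including DK, should total roughly 100 (rounding can add a point).

not_mentioned = {"using": 39, "producing": 33, "both": 24, "neither": 3, "dk": 2}
mentioned     = {"using": 30, "producing": 27, "both": 36, "neither": 5, "dk": 2}

print(sum(not_mentioned.values()))   # -> 101 (round-off error, as noted above)
print(sum(mentioned.values()))       # -> 100

# "Both equally" rises from 24% to 36% when read aloud: a 50% increase.
rise = (mentioned["both"] - not_mentioned["both"]) / not_mentioned["both"]
print(f"{rise:.0%}")                 # -> 50%
```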

Here is the lesson for spin spotters: if a poll result is presented by the media with some volunteered choices allowed, first be pleased that this is a fair and well-designed question (compared to those where “volunteered” is locked out) and then, more important, recognize that the share of the sample choosing that volunteered choice would have been much larger if it had been an offered choice.  “Volunteered” can be even more of a safety valve than DK, but the cases where the absence of volunteered choices can be used as a sign of spin are less frequent.

In Chapter 2, we considered the simplest question, the “binary” or “either-or” question, with two substantive answers, A and B.  There is something that seems simpler than the either-or question, but in reality seldom is.  That is an “A or Not-A” question.  You know:  “Just tell me whether you favor A or not!”  This kind of question is inherently unfair.  It is like a fork in the road with only one branch in the fork – the A branch.  There is no other. You are being asked to either take A or what?  Stop dead, I suppose.

Politicians do this all the time.  They favor a proposal and state conclusively and in a heartfelt positive way, “This great nation has no other choice.”  Sometimes they say this after making the case for the proposal.  Sometimes out of the blue.  But, for all practical purposes, there is always at least one other choice and often many others.  Politicians want you to think that the policy choice that they and their sponsors want is “the only choice.”  Margaret Thatcher, former prime minister of the United Kingdom, by the time she had been driven from office, had such a reputation for labeling her major proposals with, “There Is No Alternative!” that her opponents in the public began to reply, “Not TINA again!  There are better choices.”

Now, we have to be fair to politicians and explain that the better ones do sincerely believe TINA, “There Is No Alternative!”  Here is their rationale.  Throughout their political lives, they have sought the solution to a political problem with the least possible change from something that has worked before.  That approach is the easiest to explain, understand and get supported.  Successful politicians can make their solutions appear to work well enough that their careers are not hurt, at least in the near term. But the truth is “There Is An Alternative!”  Often there are many simple alternatives that would work better than what a particular politician claims is without alternative.  Even when this is not the case, there is an alternative, and those who work closely with heads of state know it.  When within earshot of an intimate audience not including the head of state, they relax and sometimes succumb to this moment of truth: “Heads of state always do the right thing – after all else fails.”

Several, even many, “last resort” alternatives can be explored by using what is called the “systems” approach.  Instead of trying to affect their constituents, or indeed the whole world, in the smallest possible way, problem solvers should consider embedding their problem in a larger context – that is, to consider the “system” in which the problem occurred.  For example, there are hundreds of different ways of reducing hunger by directly supplying people with more food or educating them about how to get more – that is, become a more successful beggar, obtain microcredit, etc.  But the problem of hunger could be considered embedded in the larger problem of poverty.  If poverty, or some aspects of poverty, could be properly tackled it might be easier to reduce hunger than by doing so directly — and such an approach might improve things in many other ways, too.  Since people without money are considered poor, the problem of poverty is thereby embedded in the money system, which in turn is embedded in the financial system.  In principle, a better overall solution may be found in tackling one of the larger systems.  The example illustrates that there is always an embedded nest of systems that could be tackled to solve the problem in the smallest system.  In truth, it is not clear in advance how large of a system is the appropriate level to look into to fix the problem.  The larger the system considered, the more complex it is and the more possibilities for failure.  The “systems” approach should be looked into only when all politicians are screaming TINA at us.  Then, we gain more confidence that a larger system needs to be reviewed in order to have some hope of making real progress.

“There is no other choice” is reminiscent of the “choiceless choice” made famous by Nazi bullyboys, who sometimes formed two lines for concentration camp inmates and allowed each person to choose to wait in one.  Then both lines were led into the gas chambers by different entrances. Everybody died.  Adolescent Nazis could imagine the victims blaming themselves, just as the gas ovens were turned on, “If only I had chosen the other line.”  In reality, most Nazis, and their victims too, knew from the beginning that it was to be a choiceless choice.

Pollsters hired by, or otherwise favoring, a candidate for political office, know somewhat more sophisticated ways of offering choiceless choices.  Assume the pollster wants the public to choose A.  He can set up B to be close to “not-A” and build up the desirability of A by stressing its strong points.  Then B, appearing to be essentially not-A, seems weak without the pollster having to say or imply any weakness in B.  Alternatively, if B is quite different from not-A, the pollster can describe B as a weak and inadequate proposal.  Most respondents will then still choose A over B.  To use more subtlety, the bias in B can be made slightly negative, and the pollster may still get the result he seeks.  When a bad poll question does favor A over B, while the public itself favors B, people, surprisingly often, dig in their heels and the majority will still express its preference for B.  An example, “One-Sided Arguments,” appeared in Chapter 4.

Public interest polling resolves this difficulty in a clean way.  We use a small balanced team of polling and issue experts who design the poll, with at least one team member favoring each of the choices, in this case A and B.  Before the final acceptance of wording, the whole team has to agree that the descriptions of A and of B both are as strong as they can be made without bias.  If there is any question about this, then several versions of A and of B are tested in different surveys (with different samples) until this point is absolutely clear.  (And the same is true for C, etc, when there are three or more substantive choices offered).

But typically, commercial pollsters are willing to go with the strong version of A and the weak version of B.  This puts spin on the outcome.  How can you spot it?  Common sense helps.  Does the description of A have positive-sounding phrases, or that of B weak, dubious, or confusing phrases?  Sometimes the spin can be spotted because the A choice simply has many more words in it than the B choice.

Before leaving this subject, there is one more thing that we have to say about DKs.  Remember when the respondent was not responding at all to a pollster’s question?  Well, re-asking the question still may not produce a substantive answer.  There is one thing that is a no-no for the pollster: saying anything that suggests a specific substantive answer, that is, anything other than re-reading the entire question with all the allowed choices given equal weight and emphasis.  It would be totally unprofessional and out-of-bounds for the pollster to suggest or imply that a particular substantive answer is preferred.  Still, there may be more that the pollster can do that will make a difference.

Years ago, I had sent one survey to a field house and a second survey to a different field house at about the same time.  Although they were designed for different purposes and had many different questions, both surveys had some important identical questions — a few with a large number of choices offered.

I was quite upset when I saw the results.  For questions with the same wording, the two surveys differed by as much as 10%.  I queried the two field houses and learned that the difference arose because one used a procedure called “probing the DKs,” which got DKs down to about 2% or 3% depending on the question, and the other did not.  I was relieved when I discovered that if I imagined the extra DK percentage from a question in the unprobed survey being distributed to the substantive choices, it was possible to allocate the distribution so that, post-allocation, the responses to the same question in the two surveys were in agreement with each other, often within ±1%.  Moreover, this procedure of allocating the DKs worked for all the questions that were identical in both surveys.  For example, if the original results for a question asked identically in both surveys, with the same five substantive choices offered in each, were:

                     DK      A      B      C      D      E
Unprobed Survey:    11%    36%    27%    16%     6%     4%
Probed Survey:       2%    40%    30%    19%     6%     4%
Difference:          9%    -4%    -3%    -3%     0%     0%

The responses to all five substantive choices in the two surveys can be made identical if we allocate 10 of the unprobed survey’s 11 DK points (4 to A, 3 to B, and 3 to C), reducing its DK to 1%.  The DKs of the two surveys then differ by only 1% and all other response differences are zero.
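
Here is a minimal sketch of that reconciliation, using the numbers above: it computes how many DK points each substantive choice would need to absorb for the unprobed survey to match the probed one, and what is left in the DK bin afterward.

```python
# Minimal sketch: can the unprobed survey's extra DK points be handed out to
# the substantive choices so that the two surveys agree?  (Numbers from the
# example above.)

unprobed = {"DK": 11, "A": 36, "B": 27, "C": 16, "D": 6, "E": 4}
probed   = {"DK": 2,  "A": 40, "B": 30, "C": 19, "D": 6, "E": 4}

# Points each substantive choice must gain to match the probed survey.
needed = {choice: probed[choice] - unprobed[choice] for choice in "ABCDE"}
print(needed)                        # -> {'A': 4, 'B': 3, 'C': 3, 'D': 0, 'E': 0}

leftover_dk = unprobed["DK"] - sum(needed.values())
print(leftover_dk)                   # -> 1: the unprobed DK drops to 1%, within
                                     #    1% of the probed survey's 2%
```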

A shift of response of 10% or more is much greater than the error due to sample size, less than ±3%, but it is only mentioned by pollsters when dealing with a customer who knows about this problem.  The pollster can ask, “Do you want to probe the DKs?” and do whatever the customer wants.  But time is money.  Pollsters do not do this unless the customer really wants it and will pay.

Allocating the DKs is not a very accurate rule-of-thumb, but it can tell us something important in some cases that could not be found any other way.  A remarkable study using data current on Nov. 11, 2002, gives us an example that depends on the responses to just one question, the most frequently asked question in polling, one that many different field houses ask regularly:

“Do you approve or disapprove of the way George W. Bush is handling his job as president?”

Table 1 shows the field dates of the most recent asking of this single question by 11 different highly regarded and prominent polling organizations in the United States, followed by the number of times they have each asked this question in the past 22 months (311 times altogether).  The organizations are often a collaboration of three: pollster/print news/TV news.  The field house used may or may not be independent of the collaboration.

Table 1. In order of most recent asking

      Pollsters/Print news/TV news      Most recent      # of times asked this same
                                        field dates      question in past 22 months

  1.  Gallup/USA Today/CNN              11/8-10                    68
  2.  PrincetonSRA/Newsweek             11/7-8                     32
  3.  NYT/CBS News                      11/2-4                     33
  4.  Wash.Post/ABC News                10/31-11/2                 26
  5.  Ipsos-Reid/Cook                   10/28-31                   17
  6.  Zogby/Reuters                     10/26-29                   27
  7.  Harris/Time/CNN                   10/23-24                   17
  8.  OpinionDyn/FOX news               10/22-23                   36
  9.  Hart-Teeter/WSJ/NBC news          10/18-21                   13
 10.  PrincetonSRA/Pew                  10/17-27                   23
 11.  Harris                            10/15-21                   19
                                                        Total:    311

Table 2 shows the responses to the most recent asking by each organization (listed in the same order as in Tables 1 and 3).  The percent of the public who give no answer, say they are not sure, or, as in (5), have “mixed feelings,” is recorded under the catch-all, DK.

Table 2. Responses to most recent asking (percentages). Organizations in same order as in Table 1.

  Organizations   Approve   Disapprove   DK    Population Sampled     Wording Variations

  1.                 68         27        5    Adults
  2.                 60         30       10    Adults
  3.                 61         30        9    Likely Voters
  4.                 67         32        1    Likely Voters
  5.                 64         34        2    Adults                 “mixed feelings”
  6.                 64         35        1    Likely Voters
  7.                 61         33        6    Adults                 “In general”
  8.                 60         30       10    Likely Voters
  9.                 63         31        6    Registered Voters      “In general”
 10.                 59         29       12    Adults
 11.                 64         35        1    Adults

In all 11 surveys, the populations sampled are as indicated: either all “adults” (over 18), people who say they are “likely to vote,” or “registered voters.”

There also were slight variations in question wording among the most recent askings of the 11 organizations.  In surveys (7) and (9) the phrase “In general” was included as the first words of the question.  In the Harris poll (11) the responses allowed were: “excellent,” “pretty good,” “only fair,” or “poor.”  As is frequently done, “excellent” and “pretty good” were combined under “approve,” while “only fair” and “poor” were combined under “disapprove.”

A clue to the importance of probing the DKs was uncovered by further research, which revealed that organizations (4), (6), and (11), the ones with 1% DK in Table 2, generally obtained very low DKs when they asked the question earlier.  The four organizations with DKs ranging from 9% to 12% similarly had large DKs when they asked the same question earlier.

A polling organization that wants small DKs asks its field house to “probe the DKs,” which requires them to use a number of techniques to increase either “approve” or “disapprove” responses and reduce DKs.  Techniques include: (1) waiting patiently for an answer, (2) after a long time encouraging a substantive, not a DK answer, by saying a neutral colloquial phrase like, “Well, whad’ya think?”  (3) being willing to call back later if respondent feels rushed, (4) reminding respondents that the survey results are important in determining what kind of governance we’ll all get in the future.

Those pollsters who need to meet short deadlines, or put their interviewers on hourly quotas, or want their costs as low as possible (the shorter the survey, the less it costs the sponsor) want an interviewer to give the respondent almost no thinking time.  After a second or two, the interviewer impatiently says, “Don’t know? That’s OK.”  Most respondents assent, and it’s on to the next question.  Another reason for encouraging a quick DK response is ideology.  Many organizations sincerely believe that the general public knows little about anything as important as politics.  For some or all of these reasons, certain organizations prefer large DKs.

In each of the 11 rows of Table 2, the ratio of “approve” to “disapprove” is close to two to one.  Table 3 results are derived from Table 2 values by splitting the percentage points of each DK in that two-to-one ratio and adding those points to the “approve” and “disapprove” percentages in the same row, rounding off fractional points for best fit.  For example, the third-row DK is nine.  Nine, split two to one, is six and three. Six is added to “approve” and three is added to “disapprove,” making them 67 and 33 with DK reduced to zero.  If the DK is not divisible by three without fractions, as in row one where it is five, splitting five two to one gives 3.33 and 1.67, which round off to three and two when added to the “approve” and “disapprove” percentages, giving exactly what is shown in Table 3.  This procedure, called allocating the DKs, is based on the idea that those who do not give a substantive response have roughly the same preference ratio as those who explicitly choose either “approve” or “disapprove.”  Each case in which allocating the DKs works adds empirical evidence for the validity of that idea.  (A short sketch of this arithmetic follows Table 3.)

Table 3.  Responses, most recent asking, “Don’t Knows” allocated to 0.

Organizations Approve Disapprove
  1. 71 29
  2. 66 34
  3. 67 33
  4. 67 33
  5. 65 35
  6. 65 35
  7. 65 35
  8. 66 34
  9. 67 33
  10. 67 33
  11. 65 35
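
For readers who want to check the arithmetic, here is a minimal sketch of the allocation just described: each row’s DK is split two to one, added to “approve” and “disapprove,” and rounded.  Because Table 3 was rounded by hand “for best fit,” a few rows there can differ from this mechanical version by a single point.

```python
# Minimal sketch: allocate each row's DK two to one and round, reproducing
# Table 3 from Table 2 to within one rounding point per row.

table2 = [  # (approve, disapprove, dk) for rows 1-11 of Table 2
    (68, 27, 5), (60, 30, 10), (61, 30, 9), (67, 32, 1), (64, 34, 2),
    (64, 35, 1), (61, 33, 6), (60, 30, 10), (63, 31, 6), (59, 29, 12),
    (64, 35, 1),
]

for row, (approve, disapprove, dk) in enumerate(table2, start=1):
    new_approve = round(approve + dk * 2 / 3)   # two thirds of the DK to "approve"
    new_disapprove = 100 - new_approve          # the rest; each row sums to 100
    print(f"{row:2d}. {new_approve} {new_disapprove}")
```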

Look at Table 3.  At the top, (1) is anomalous: four points (or more) higher for “approve,” and four points (or more) lower for “disapprove,” than any of the other 10 cases.  The 10 are each within ±1 point of 66% “approve” and 34% “disapprove.”  Was there something momentous that happened before (1) was in the field and after (2)-(11) had been completed?  Yes, there was. The UN Security Council unanimously approved the Iraq resolution wanted by Bush to make it clear that the “whole world was behind the U.S.,” a very important forward step for Bush’s evolving preemptive policy.  Bush’s approval rating went up significantly after his big UN win.  If we had not allocated the DKs, the distinction of the Gallup result would have been almost unnoticeable in Table 2.

We have shown empirically that the variation in responses in Table 2 that theoretically might be due to differences among the 11 surveys can be ascribed to a sampling error of only ±1%, not the usual ±3%.  Further, within that small error, the facts that (A) the field dates of the 11 surveys were not exactly the same, but stretched over a short 24-day period; (B) the populations sampled varied; and (C) the exact wording varied somewhat, are all immaterial.

Allocating DKs is useful, makes polling results more consistent, and sometimes tells us something important that would be totally lost.  I’m for it.

There is no standard behavior in the polling industry about whether or how to probe the DKs.  Pollsters feel they have enough trouble fending off hecklers and critics who question many of the weaknesses of bad polling, such as high discontinuance rates.  They do not relish trying to explain what they do about “probing the DKs,” let alone why it is a problem.  A famous, honest pollster, Fred Steeper, who can be credited with the election of George W. Bush and his father before him, called the problem of probing the DKs “the dirty little secret of polling.”

Now, the reader who has completed this chapter is way ahead of political junkies and leading politicians, too.  Both of these groups, with very few exceptions, don’t know about “the dirty little secret.”


Summarizing our findings in this chapter:  The three easy ways that help spot spin are these:

1. There is spin if the media don’t report enough for you to figure the size of the DKs.

2. There is spin if the DKs are large, particularly for the kind of question being asked.

3. There is spin if the “Volunteered” is large, but you can correct for it.

4. There is spin (a) if only one choice is offered, (b) if only two choices are offered while “neither” and/or “both” are omitted, and, (c) if any choice is omitted that common sense tells you should be there, even if it does not score well.
