Changes

Directory:Akahele/Survey says...

Revision as of 01:36, 23 October 2010

837 bytes removed, 01:36, 23 October 2010

"Om nom nom nom"? (partial)

Line 1: Line 1:

−

Those old enough to remember the Carter and Reagan administrations are likely to have enjoyed the highly popular game show, ~~<a title="Family Feud (funny clips)" href="~~http://www.youtube.com/watch?v=_oxt9e5B4bE~~" target="_blank">~~Family Feud~~</a>~~, if not for the spectacle of two extended families competing against each other, then for the "play along at home" aspect of matching wits with those families, or (if anything) counting to see how many times host Richard Dawson would plant a (too often unwelcome) kiss (or two) on the lips of any female contestant.

+

Those old enough to remember the Carter and Reagan administrations are likely to have enjoyed the highly popular game show, [http://www.youtube.com/watch?v=_oxt9e5B4bE ''Family Feud''], if not for the spectacle of two extended families competing against each other, then for the "play along at home" aspect of matching wits with those families, or (if anything) counting to see how many times host Richard Dawson would plant a (too often unwelcome) kiss (or two) on the lips of any female contestant.

−

~~~~A survey we trusted~~~~

+

'''A survey we trusted'''

−

The most intellectually viable aspect of ~~~~Family Feud~~~~ was the core of the program -- the response data from a survey of 100 people answering questions that tend to cluster common answers: "Name something you buy on every visit to the grocery store" or "Give a slang term for a policeman".

+

The most intellectually viable aspect of ''Family Feud'' was the core of the program -- the response data from a survey of 100 people answering questions that tend to cluster common answers: "Name something you buy on every visit to the grocery store" or "Give a slang term for a policeman".

<tbody>

Line 13: Line 13:

</tr>

</tbody></table>

−

As a practitioner in the field of marketing research, I know darn well that a sample of 100 respondents (~~<a title="Family Feud survey panel theory" href="~~http://sg.answers.yahoo.com/question/index;_ylt=Ana0kDXsKamqprcApIm6RfQh4wt.;_ylv=3?qid=20060606190510AAzZnok~~" target="_blank">~~heaven knows~~</a>~~ how they were selected for participation in the survey) is practically bunk. But the methodology seemed to work out just fine for a family game show. There were never any scandals or disputes centered on the answers to that survey. We knew we were about to come face to face with a reliable-enough "fact" when Dawson would turn to that big board behind him and shout, "Survey says...!"

+

As a practitioner in the field of marketing research, I know darn well that a sample of 100 respondents ([http://sg.answers.yahoo.com/question/index;_ylt=Ana0kDXsKamqprcApIm6RfQh4wt.;_ylv=3?qid=20060606190510AAzZnok heaven knows] how they were selected for participation in the survey) is practically bunk. But the methodology seemed to work out just fine for a family game show. There were never any scandals or disputes centered on the answers to that survey. We knew we were about to come face to face with a reliable-enough "fact" when Dawson would turn to that big board behind him and shout, "Survey says...!"

−

Today, in the world of overnight web-panel-based consumer data collection, I'm not nearly as comfortable as I was at a young age with trusty Richard Dawson and his big, flashing incandescent board on ~~~~Family Feud~~~~.

+

Today, in the world of overnight web-panel-based consumer data collection, I'm not nearly as comfortable as I was at a young age with trusty Richard Dawson and his big, flashing incandescent board on ''Family Feud''.

−

~~~~My experience with Internet surveys

+

'''My experience with Internet surveys

−

~~~~

+

'''

I'm hardly new to the practice of conducting survey research via the Internet. In fact, e-mail-borne surveys were an important part of my business practice as far back as 1993 -- respondents would "edit" the reply e-mail text with their answers, send it back, and the software would detect the answers within the confines of pre-formatted response spaces within the e-mail text. Crude in retrospect, but these techniques worked fairly well, especially when targeting a highly selective sample (such as the customer list of a business-class laser printer manufacturer).

−

About four or five years later, true web-based survey platforms were well established, but how to populate these questionnaires with ~~<a title="Probability sampling" href="~~http://www.socialresearchmethods.net/kb/sampprob.php~~" target="_blank">~~representative, diverse respondents~~</a>~~ was becoming a hot potato. Everyone seemed to acknowledge that web panels attracted non-typical consumers, but the low cost of execution and speed of turn-around were just so damn tempting. Of course, the major web panel vendors did their best to come up with various techniques (and white papers) that demonstrated ways to "balance" web samples, so that they might pass muster with executives on the client side. But, remaining at the crux of all survey research and not just web-based sampling, is the question of self-selection bias. People who willingly spend 15 minutes of their time to complete a questionnaire are not "normal", in the sense that they sometimes fail to represent the attitudes and behaviors of people who prefer not to spend their time that way. It appears that, simply, this problem is accentuated among Internet populations.

+

About four or five years later, true web-based survey platforms were well established, but how to populate these questionnaires with [http://www.socialresearchmethods.net/kb/sampprob.php representative, diverse respondents] was becoming a hot potato. Everyone seemed to acknowledge that web panels attracted non-typical consumers, but the low cost of execution and speed of turn-around were just so damn tempting. Of course, the major web panel vendors did their best to come up with various techniques (and white papers) that demonstrated ways to "balance" web samples, so that they might pass muster with executives on the client side. But, remaining at the crux of all survey research and not just web-based sampling, is the question of self-selection bias. People who willingly spend 15 minutes of their time to complete a questionnaire are not "normal", in the sense that they sometimes fail to represent the attitudes and behaviors of people who prefer not to spend their time that way. It appears that, simply, this problem is accentuated among Internet populations.

−

~~~~Losing faith

+

'''Losing faith

−

~~~~

+

'''

−

Between about 2001 and the present day, I've gradually been losing faith in the entire premise of reliable Internet-sampled and Internet-fielded marketing research. Last month, a presentation at the ~~<a title="CTAM Research Conference" href="~~http://www.ctam.com/conferences/Research/index.html~~" target="_blank">~~CTAM Research Conference~~</a>~~ in Washington, DC, practically sealed the deal for me. ~~<a title="Mktg, Inc." href="~~http://www.mktginc.com/ourteam.asp~~" target="_blank">~~Dr. Steven Gittelman~~</a>~~ conducted a meta audit of 17 different U.S. web panels. His research found that on nine of these panels, well over 15% of the participants were completing more than thirty Internet surveys per month. Furthermore, on most U.S. panels, anywhere from 40% to 55% of members are also enrolled in at least ~~~~four other~~~~ survey research panels!

+

Between about 2001 and the present day, I've gradually been losing faith in the entire premise of reliable Internet-sampled and Internet-fielded marketing research. Last month, a presentation at the [http://www.ctam.com/conferences/Research/index.html CTAM Research Conference] in Washington, DC, practically sealed the deal for me. [http://www.mktginc.com/ourteam.asp Dr. Steven Gittelman] conducted a meta audit of 17 different U.S. web panels. His research found that on nine of these panels, well over 15% of the participants were completing more than thirty Internet surveys per month. Furthermore, on most U.S. panels, anywhere from 40% to 55% of members are also enrolled in at least '''four other''' survey research panels!

<div>

−

~~~~Things that make you go, "Hmm..."

+

'''Things that make you go, "Hmm..."

−

~~~~

+

'''

My research team recently fielded a quick online survey with a San Diego vendor I implicitly trust to have one of the best panels in the online research business. The sampling was intended to be nationally representative of Internet households who had either cut wire-line telephone service in the past 12 months, or were strongly intending to do so in the next 12 months, and guess what? It’s rather clear that a lot of respondents weren’t paying attention by the end of the survey: nearly 32% of the respondents said they were Hispanic or Latino. There is no way that's a true statistic, especially considering how Hispanics under-index for Internet penetration and English fluency.

Line 37: Line 37:

Granted, some of this particular over-reporting was due to the way the question was asked (in a format usually intended for a telephone survey, where I’m sure the live interviewer does a better job of getting the right answer):

−

~~~~To ensure proper ethnic representation, please answer; are you of Hispanic or Latino ethnicity or background?~~~~

+

''To ensure proper ethnic representation, please answer; are you of Hispanic or Latino ethnicity or background?''

−

<div>~~~~1 Yes (white Hispanic)

+

<div>''1 Yes (white Hispanic)

2 Yes (non-white Hispanic)

3 No

−

R Prefer not to say~~~~</div>

+

R Prefer not to say''</div>

My guess is that a significant number of white non-Hispanics and black non-Hispanics selected punch 1 and punch 2, semi-consciously reacting to the words “white” and “non-white” to inform their response, rather than the question text itself.

Line 56: Line 56:

Yeah, right. Maybe if the respondents are time travelers, reporting back to us their household characteristics from the year 2019. Why do we tolerate "findings" like these? In a word, because the data can be collected quickly and cost-efficiently, and (thankfully) these behavioral measures were not a key objective of what was essentially an attitudinal survey.</div>

−

~~~~Setting the trap

+

'''Setting the trap

−

~~~~

+

'''

−

Over the past year, I have taken to using a simple technique to "trap" respondents who are not paying attention to (or lying about) survey questions. By adding "tripwire" questions to the beginning of a survey, I am able to diagnose respondents who are more likely blithely clicking check-boxes ("~~<a title="Jon Krosnick on satisficing in surveys" href="~~http://www3.interscience.wiley.com/journal/112415330/abstract?CRETRY=1&SRETRY=0~~" target="_blank">~~satisficing~~</a>~~" a questionnaire) than actually paying attention. I provide a list of relatively uncommon products or experiences, then terminate from the survey anyone who answers that an ~~~~extremely~~~~ unlikely number of these items apply to them -- that is, it's far more likely the respondent is lazily or deceptively completing the questionnaire than it is that they are attentively and truthfully responding. Some examples may help illustrate the principle.

+

Over the past year, I have taken to using a simple technique to "trap" respondents who are not paying attention to (or lying about) survey questions. By adding "tripwire" questions to the beginning of a survey, I am able to diagnose respondents who are more likely blithely clicking check-boxes ("[http://www3.interscience.wiley.com/journal/112415330/abstract?CRETRY=1&SRETRY=0 satisficing]" a questionnaire) than actually paying attention. I provide a list of relatively uncommon products or experiences, then terminate from the survey anyone who answers that an ''extremely'' unlikely number of these items apply to them -- that is, it's far more likely the respondent is lazily or deceptively completing the questionnaire than it is that they are attentively and truthfully responding. Some examples may help illustrate the principle.

In a recent survey, I asked which of the following items were in the respondent's home, and these were the results:

Line 65: Line 65:

<tbody>

−

<td class="xl26" style="height: 13.5pt; width: 193pt;" width="257" height="18">~~~~PRESENT IN HOUSEHOLD~~~~</td>

+

<td class="xl26" style="height: 13.5pt; width: 193pt;" width="257" height="18">'''PRESENT IN HOUSEHOLD'''</td>

−

+

</tr>

Line 105: Line 105:

</tr>

</tbody></table>

−

Never mind that as of February 2007, only about ~~<a title="Scientific American on Segway" href="~~http://www.scientificamerican.com/article.cfm?id=power-walker~~" target="_blank">~~24,000 Segway units~~</a>~~ had ever been sold, and many of them to corporate and law enforcement clients, not residential households. So, we may choose between lazy and/or lying survey respondents (1.6 million), or we have realistic transactional data to guide us (24,000).

+

Never mind that as of February 2007, only about [http://www.scientificamerican.com/article.cfm?id=power-walker 24,000 Segway units] had ever been sold, many of them to corporate and law enforcement clients, not residential households. So, we may choose between lazy and/or lying survey respondents (1.6 million) and realistic transactional data (24,000).

Do you see my frustration with web-based data collection?

Line 113: Line 113:

<tbody>

−

<td class="xl28" style="border-color: -moz-use-text-color black -moz-use-text-color -moz-use-text-color; height: 13.5pt; width: 193pt;" width="257" height="18">~~~~PARTICIPATION LAST 3 MONTHS~~~~</td>

+

<td class="xl28" style="border-color: -moz-use-text-color black -moz-use-text-color -moz-use-text-color; height: 13.5pt; width: 193pt;" width="257" height="18">'''PARTICIPATION LAST 3 MONTHS'''</td>

−

+

</tr>

Line 147: Line 147:

On this panel, we terminated anyone who affirmed at least 4 of these items -- a near impossibility. What is the likelihood, for example, of a person selected at random who is on unemployment, stayed in a Ramada Inn, rolls in a bowling league, and coaches a youth baseball or soccer team? But, we "caught" four such respondents out of 504. Pro-rated to the population, this nearly impossible configuration would hold true for about 1,785,700 Americans. That is, 4 divided by 504, times about 225,000,000 adults.
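As a rough illustration only (this is not the screening software actually used), the tripwire rule and the pro-rating arithmetic above could be sketched in Python as follows. The function names, the item labels, and the three sample respondents are hypothetical; the threshold of 4, the 504 completes, and the 225,000,000 adults come from the text.

```python
# Illustrative sketch only; function names and sample respondents are hypothetical.
ADULT_POPULATION = 225_000_000   # approximate U.S. adult count used in the text
TRIPWIRE_THRESHOLD = 4           # terminate anyone affirming 4 or more items


def is_tripped(affirmed_items, threshold=TRIPWIRE_THRESHOLD):
    """Return True when a respondent affirms an implausible number of items."""
    return len(set(affirmed_items)) >= threshold


def prorate(flagged, sample_size, population=ADULT_POPULATION):
    """Scale a count observed in the sample up to the full population."""
    return flagged / sample_size * population


# Hypothetical respondents: each list holds the tripwire items that person checked.
respondents = [
    ["bowling league"],
    [],
    ["unemployment", "Ramada Inn", "bowling league", "youth coach"],
]
flagged = [r for r in respondents if is_tripped(r)]
print(len(flagged), "of", len(respondents), "hypothetical respondents flagged")

# Figures from the text: 4 flagged respondents out of 504 completes.
print(round(prorate(4, 504)))
```

The second printed figure is the unrounded source of the "about 1,785,700" quoted above.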

−

This same data shows that 2.4% of adults are in a bowling league within the past three months, or 5.4 million adults. This is about two times the known count of adults ~~~~and~~~~ children (combined) participating annually in a bowling league, ~~<a title="2.3 million league bowlers" href="~~http://www.bowl.com/usbowler/about.aspx~~" target="_blank">~~according to the USBC~~</a>~~. From corporate reports, I estimate that Ramada has about 50,000 rooms in the United States. Over three months, that's about 4.5 million room-nights possible. According to the above survey screener, 6.7 million adults have stayed in a Ramada room at some point in the past 3 months. Even with 2 adults per room, that's an amazing occupancy rate -- Monday through Sunday, every week of the past three months, if we are to believe this sample. I conclude that we cannot believe the sample. The duplicate bridge stat is interesting -- web panels skew younger, and bridge skews older. According to the ACBL, there are about 11 million people in the U.S. who play ~~<a title="ACBL study (1986)" href="~~http://homepage.mac.com/bridgeguys/pdf/Newspaper/RecreationSpecialization.pdf~~" target="_blank">~~contract bridge~~</a>~~. According to our screener, though, it's only 2.25 million -- under-reported by a factor of perhaps five.

+

These same data show that 2.4% of adults were in a bowling league within the past three months, or 5.4 million adults. This is about two times the known count of adults ''and'' children (combined) participating annually in a bowling league, [http://www.bowl.com/usbowler/about.aspx according to the USBC]. From corporate reports, I estimate that Ramada has about 50,000 rooms in the United States. Over three months, that's about 4.5 million possible room-nights. According to the above survey screener, 6.7 million adults have stayed in a Ramada room at some point in the past 3 months. Even with 2 adults per room, that's an amazing occupancy rate -- Monday through Sunday, every week of the past three months, if we are to believe this sample. I conclude that we cannot believe the sample. The duplicate bridge stat is interesting -- web panels skew younger, and bridge skews older. According to the ACBL, there are about 11 million people in the U.S. who play [http://homepage.mac.com/bridgeguys/pdf/Newspaper/RecreationSpecialization.pdf contract bridge]. According to our screener, though, it's only 2.25 million -- under-reported by a factor of perhaps five.
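The sanity checks in the paragraph above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, under stated assumptions: a three-month window is treated as 90 nights, every claimed Ramada stay is counted as a single night (the most forgiving case for the survey), and the 2.3 million league bowlers figure is taken from the title of the USBC link in the earlier revision. All variable names are illustrative.

```python
# Sketch of the plausibility arithmetic; variable names are illustrative.
ADULTS = 225_000_000                 # approximate U.S. adult population used in the text

# Bowling: 2.4% of adults claim league play in the past three months.
bowling_claimed = 0.024 * ADULTS     # about 5.4 million adults
usbc_league_bowlers = 2_300_000      # adults and children combined (USBC link title)
bowling_ratio = bowling_claimed / usbc_league_bowlers

# Ramada: about 50,000 rooms (the author's estimate) over an assumed 90 nights.
room_nights = 50_000 * 90            # about 4.5 million possible room-nights
ramada_claimed = 6_700_000           # adults claiming a stay in the past 3 months
# Assumption: two adults share each room and every stay lasts exactly one night,
# which gives the lowest occupancy the survey figure could possibly imply.
min_occupancy = (ramada_claimed / 2) / room_nights

# Bridge: ACBL estimate versus the screener's projection.
bridge_factor = 11_000_000 / 2_250_000

print(f"bowling over-report: {bowling_ratio:.1f}x")
print(f"minimum implied Ramada occupancy: {min_occupancy:.0%}")
print(f"bridge under-report: {bridge_factor:.1f}x")
```

Any stay longer than one night, or any room occupied by a single adult, pushes the implied occupancy higher still.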

−

~~~~Can they pass the test?~~~~

+

'''Can they pass the test?'''

When showing respondents a description of a new product or service concept (sometimes even with an informative video clip), we've made a habit of giving them a short, three-question "true or false" quiz about the concept they've just read about (and/or watched). These are not very difficult questions for a sentient, attentive person of even less-than-average IQ to answer. Consistently, we are finding that between 20% and 35% of respondents will fail this quiz, even though it immediately follows presentation of the concept. My conclusion: perhaps a third of web survey respondents aren't paying any attention to the communications we're putting before them in surveys.

−

~~~~Akahele~~~~ is presenting you data, both anecdotal and quantitative, each and every week. What conclusions are you drawing about the key theme of ~~~~trust ~~~~and the~~~~ Internet~~~~? We look forward to your joining us with personal comments below.

+

''Akahele'' is presenting you data, both anecdotal and quantitative, each and every week. What conclusions are you drawing about the key theme of '''trust''' and the '''Internet'''? We look forward to your joining us with personal comments below.

−

~~~~Image credits:~~~~

+

'''Image credits:'''

<ul>

−

<li>Richard Dawson (Mark Goodson-Bill Todman Productions), ~~<a title="Fair use" href="~~http://www.copyright.gov/title17/92chap1.html#107~~" target="_blank">~~fair use doctrine~~</a>~~.</li>

+

<li>Richard Dawson (Mark Goodson-Bill Todman Productions), [http://www.copyright.gov/title17/92chap1.html#107 fair use doctrine].</li>

−

<li>Segway personal transporter, ~~<a title="Fair use" href="~~http://www.copyright.gov/title17/92chap1.html#107~~" target="_blank">~~fair use doctrine~~</a>~~.</li>

+

<li>Segway personal transporter, [http://www.copyright.gov/title17/92chap1.html#107 fair use doctrine].</li>

</ul>

</div>

Seurat
