Data Quality Metrics for Online Samples: Considerations for Study Design & Analysis


Sampling from online panels, recruited either using probability-based sampling techniques or opt-in web-based data collection procedures, has become an increasingly common methodology for survey researchers over the last decade.

In 2016, an AAPOR task force issued a report titled “Evaluating Survey Quality in Today’s Complex Environment” which outlined 17 questions that users of survey data should ask to help them make judgments about the survey’s results, regardless of the survey methodology. These questions work in tandem with the guidelines outlined by the AAPOR Transparency Initiative for survey disclosure, providing consumers of survey data with an excellent framework for better understanding and assessing the possible error associated with a particular survey project. Yet there is still a gap in guidance on assessing the quality of online panels prior to data collection. In addition, the existing AAPOR task force reports on online survey panels and nonprobability sampling are aging and largely do not address probability-based online panels, which have become increasingly common since 2016. This task force examines the characteristics of online survey panels and gives guidelines for evaluating the quality of various online panel methodologies.

The goal of this task force is to provide audiences who have a basic understanding of survey methodology with an overview of the various types of online survey sampling methodologies currently being employed by survey researchers and major survey firms, as of the release of this report. Specifically, the report provides an overview of the landscape of online survey data collection, focusing mainly on probability and nonprobability online panels. We discuss how alternative methodologies for initial recruitment, decisions around panel freshening, respondent attrition, and missing data may impact sampling and data quality. Finally, we outline some ways to assess the quality of online samples, including well-known measures such as cumulative response rates and cooperation rates, as well as newer metrics that can be applied to online samples to evaluate representativeness and inferential reliability. We conclude with some key questions that researchers designing studies based on online panels may wish to ask panel vendors as they develop their research design.
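As context for the cumulative response rate mentioned above: for panel studies this metric is commonly computed as the product of the rates at each stage of the panel lifecycle (recruitment, profile completion, and completion of the specific study), following the decomposition popularized by Callegaro and DiSogra (2008). The sketch below illustrates that multiplication; the function name and stage labels are illustrative, not drawn from this report.

```python
def cumulative_response_rate(recruitment_rate, profile_rate,
                             completion_rate, retention_rate=1.0):
    """Cumulative response rate for an online panel study: the product
    of the stage-specific rates from initial recruitment through
    completion of the specific survey. An optional retention rate can
    account for panel attrition between profiling and sampling."""
    for rate in (recruitment_rate, profile_rate, completion_rate, retention_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each stage rate must be a proportion in [0, 1]")
    return recruitment_rate * profile_rate * completion_rate * retention_rate

# Hypothetical example: 15% of invited households join the panel, 60% of
# joiners complete the profile survey, and 40% of sampled panelists
# complete this particular study.
print(round(cumulative_response_rate(0.15, 0.60, 0.40), 4))  # 0.036
```

The example makes concrete why cumulative response rates for online panels are typically far lower than single-survey response rates: even healthy stage-level rates multiply down to a few percent.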

Specific objectives:

  1. Develop a clear, concise, updated explanation of survey sample sources for online panels.
  2. Describe the representativeness and fitness for use of common sampling strategies that providers use to construct and replenish online panels.
  3. In particular, assess coverage error of online panels, including systematic error related to recruitment methods, self-selection, and coverage of internet non-users.
  4. Propose alternative metrics of sample quality, beyond completion rates and cumulative response rates, that measure the underlying representativeness and utility of online samples.
  5. Discuss whether sample quality metrics developed for use with probability-based panels can be applied to samples from panels that do not recruit using probability-based methods.
  6. Discuss the application of AAPOR’s Code of Ethics’ reporting guidelines to studies using online panels and outline important issues regarding methodological transparency.