Short Courses

How to Use NLP and Generative AI in Survey Research

Speakers: Joshua Y. Lerner and Soubhik Barari

Thursday, October 10th, 2024

2:00 pm – 5:30 pm ET

$100 member/$125 non-member

Description:

This course covers the principles and application of Natural Language Processing (NLP) and Generative AI for survey research. As advanced AI methodologies become increasingly integrated into survey research, this course aims to equip participants with the necessary theoretical and practical knowledge to harness these methods effectively.

The course begins with an introduction to the foundational concepts of NLP and Generative AI, establishing a theoretical framework for their application within survey research. This foundational understanding is crucial for appreciating subsequent discussions on the utility and limitations of these methods.

Following this, we will provide an overview of the key developments in the integration of NLP and Generative AI into surveys, focusing on innovations in the recent methodological literature.

The course then delves into the areas where NLP and Generative AI demonstrate exceptional utility, including analyzing open-ended survey responses, dynamic survey probing, conversational interviewing, iterative survey design, tailored survey experiments, and missing value imputation. These applications underscore the transformative potential of AI in enhancing the precision, efficiency, and adaptability of survey research.

Conversely, the course will critically assess the limitations and challenges inherent in applying NLP and Generative AI to survey research. Specific focus will be placed on the complexities associated with synthetic respondents, the generation of survey questions de novo, the prediction of public opinion, and the application of these technologies in cross-cultural contexts. Understanding these limitations is essential for the responsible and informed application of AI in survey research.

By the conclusion of this course, participants will have acquired a rigorous understanding of both the capabilities and constraints of NLP and Generative AI in survey research, enabling them to apply these tools with greater sophistication and discernment in their professional practice.

 

Learn More & Register

Qualitative In-depth Interview Method: A Quality Approach to Data Collection and Analysis

Wednesday, October 25, 2023

2:00 PM – 5:30 PM ET

Speaker: Margaret R. Roller MA

As the most frequently used qualitative research method, the in-depth interview enables researchers to explore complex issues and gain a contextually rich understanding of participants’ lived experiences. However, the complexities associated with conducting qualitative in-depth interviews present unique challenges to researchers who strive to develop qualitative research designs that result in meaningful contextual data and analysis while incorporating quality measures that maximize the ultimate usefulness of their research.

This short course will discuss an approach that is focused on rigorous in-depth interview method design that does not stifle the unique attributes of qualitative research and the creative approaches utilized by skilled qualitative researchers. With respect to data collection, this course will discuss the strengths and limitations of the in-depth interview method, scope (sampling, sample size, and cooperation), data gathering (researcher and participant effects, guide development, interviewer skills and techniques), and mode considerations. The analysis portion of the short course will discuss the association between the unique attributes of qualitative research and the skills required to analyze qualitative data, analytical approaches, data formats, an eight-step analysis process, and a review of CAQDAS (computer-assisted qualitative data analysis software). Practical examples from Margaret’s work and the literature will be used throughout the course to illustrate points of discussion. As an interactive course, attendees will be asked for their input on topic areas and encouraged to ask questions.

Weighting and Analyzing Nonprobability Samples for Population-Based Inferences

Virtual  |  May 7, 2024  |  10:00 am – 1:30 pm ET

Instructors: Lingxiao Wang, University of Virginia and Yan Li, University of Maryland

Description:

Making valid population-level inferences is a central goal in survey research. While probability samples remain the gold standard for design-based inference about a target finite population, they have faced substantial challenges in recent decades, such as high costs and declining response rates. As a remedy, nonprobability samples have been increasingly collected in many areas, including education, medical studies, and public opinion research. Nevertheless, because they are not randomly selected, nonprobability samples often represent the target population poorly. Consequently, naïve estimates from nonprobability samples can be biased for the target population quantities.

Quasi-randomization methods are among the most widely used approaches for improving the representativeness of nonprobability samples. These methods create “pseudoweights” for individuals in the nonprobability sample, using contemporaneous probability surveys as references, which can substantially reduce selection bias in estimating target population quantities.

This course will first provide a comprehensive review of the framework for making finite population inferences from nonprobability samples, covering various pseudoweighting methods published in the recent literature. Attendees will then be guided through the specific steps of constructing pseudoweights. To ensure practical application, the course will include software and real-data examples illustrating how to construct pseudoweights and analyze nonprobability samples to estimate finite population means and associations.
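
As a rough illustration of the pseudoweighting idea described above, the following Python sketch handles the simplest possible case. It is not the instructors' method: it assumes a single categorical covariate, so the propensity model is saturated and can be estimated from cell counts. In each cell the pseudoweight is the weighted reference-survey count divided by the number of nonprobability respondents. The function names and toy data are hypothetical.

```python
from collections import defaultdict


def cell_pseudoweights(nonprob_cells, ref_cells, ref_weights):
    """Return one pseudoweight per nonprobability unit.

    Assumption: one categorical covariate, saturated propensity model.
    The estimated odds of being in the nonprobability sample within
    cell c are n_c / N_hat_c, where N_hat_c is the weighted count from
    the reference probability survey; the pseudoweight is the inverse.
    """
    n_c = defaultdict(int)
    for cell in nonprob_cells:
        n_c[cell] += 1

    n_hat = defaultdict(float)
    for cell, weight in zip(ref_cells, ref_weights):
        n_hat[cell] += weight

    missing = set(n_c) - set(n_hat)
    if missing:
        raise ValueError(f"cells absent from reference survey: {sorted(missing)}")

    return [n_hat[cell] / n_c[cell] for cell in nonprob_cells]


def weighted_mean(values, weights):
    """Weighted mean of values, normalized by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data (hypothetical): the nonprobability sample over-represents "young".
nonprob_age = ["young", "young", "young", "old"]
nonprob_y = [1, 1, 0, 0]

# Reference probability survey with its design weights.
ref_age = ["young", "old", "old"]
ref_w = [100.0, 150.0, 150.0]

pw = cell_pseudoweights(nonprob_age, ref_age, ref_w)
naive = sum(nonprob_y) / len(nonprob_y)
adjusted = weighted_mean(nonprob_y, pw)
print("naive mean:", naive)
print("pseudoweighted mean:", adjusted)
```

With continuous or multiple covariates, the cell counts would typically be replaced by a fitted propensity model, such as a logistic regression on the stacked samples.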

Register Here

Using Qualitative Inquiry to Inform Quantitative Data Collection and Analysis

Virtual  |  May 8, 2024  |  10:00 am – 1:30 pm ET

Instructor: Lila Rabinovich, University of Southern California

Description:

Qualitative methods are increasingly recognized as a valuable addition to quantitative research, including survey and experimental studies. There are several channels through which qualitative inquiry can contribute to quantitative research. For instance, qualitative data can help explain the causal mechanisms behind the observed effects of an intervention. It can also provide insights into particular study populations that are difficult to capture well through traditional recruitment methods, thus improving methods of recruitment, retention, and measurement. Finally, it can help develop and improve quantitative data collection instruments and processes.

This course aims to provide guidance on when and how best to implement these methods. Specifically, the course will explore the ways in which qualitative data collection can support and inform quantitative research design, data collection, and analysis. In particular, the course will focus on actionable recommendations for incorporating qualitative data collection into study designs, including, among others:
• Formative qualitative research to understand a topic more deeply to inform development of hypotheses and survey instruments;
• Formative qualitative research to understand a hard-to-reach population;
• Cognitive testing to check clarity, burden and interpretation of existing survey instruments;
• Developing and pre-testing experimental interventions;
• Understanding/probing survey results.

Participants will be encouraged to bring questions from their own projects for discussion with the group during the workshop.

Register Here

Questionnaire Design 101

Virtual  |  May 9, 2024  |  10:00 am – 1:30 pm ET

Instructor: Pam Campanelli, The Survey Coach

Description:
Are you new to questionnaire design? Did you learn on the job without formal training? Or did you study it in the past and would now like a refresher? This course is for you. It covers both well-known rules and lesser-known but important ones. Topics include creating a new questionnaire, trade-offs (clear versus short and simple), the 4 cognitive stages in survey response, question wording guidelines, issues with factual and subjective questions, and problematic question formats to beware of or avoid. An ‘end-of-course’ appendix includes tips for demographic/socio-economic questions and business facts for establishment surveys, the visual side of web surveys, aids to improve respondent recall and reduce question sensitivity, and how these last two concerns differ between surveys of individuals/households and surveys of establishments.

Register Here

Navigating the Privacy-Utility Tradeoff: An Introduction to Data Privacy Techniques

Virtual  |  May 9, 2024  |  2:00 pm – 5:30 pm ET

Instructors: Claire Bowen, Maddie Pickens, and Gabe Morrison, Urban Institute

Description:
In an increasingly connected and surveilled world, high-quality datasets can be constructed more easily but are also more vulnerable to abuse than ever. Although collecting more and better data can provide great benefits to society, for example by furthering medical research or targeting public investments to help those most in need, data privacy concerns surface when that information can be de-anonymized and used maliciously. This half-day course will provide an overview of current data privacy methodology, focusing on the generation of synthetic data and the application of differentially private methods. Through examinations of case studies and hands-on exercises, you will learn to apply data privacy techniques and evaluate the resulting disclosure risk and data utility. Attendees should have basic R programming experience.
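
The description does not say which differentially private mechanisms the course covers. As one standard example, the Python sketch below applies the Laplace mechanism to a counting query. It assumes a sensitivity of 1, meaning that adding or removing one record changes the count by at most 1. The function names and toy data are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential draws with rate
    1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng):
    """Epsilon-differentially private count of records matching predicate.

    Assumption: a counting query has sensitivity 1, so the Laplace
    noise scale is 1 / epsilon. Smaller epsilon means more noise and
    stronger privacy, at the cost of utility.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy usage (hypothetical data): noisy count of ages 65 and over.
rng = random.Random(2024)
ages = [23, 67, 45, 71, 88, 34]
print(private_count(ages, lambda age: age >= 65, epsilon=0.5, rng=rng))
```

Running the query at several values of epsilon is a quick way to see the privacy-utility tradeoff named in the course title.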

Register Here

A Practical Introduction to Sensitivity Analyses for Non-ignorable Selection Bias in Surveys

In-Person  |  May 14, 2024  |  2:00 pm – 5:30 pm ET

Instructors: Rebecca Andridge, Ohio State University and Brady West, University of Michigan

Description:
This short course will provide a hands-on introduction to a set of recently developed indices (SMUB, MUBP) for the assessment of potentially non-ignorable selection bias in estimates computed from nonprobability samples and surveys with low response rates. The course will begin with a non-mathematical overview of the indices, including their theoretical underpinnings and model assumptions, as well as the data needed to use them effectively. Hands-on computing exercises using publicly available data and R software will walk participants through the use of these indices for a sensitivity analysis. The focus will be on calculation and interpretation of these indices in real-world applications, including data from U.S. pre-election polls, public opinion polls, and large-scale probability surveys with low response rates. The importance of high-quality, population-level auxiliary data for the computation of these indices will be discussed, along with specific recommendations for such data sources. Registered participants will receive electronic instructions for installing and starting the RStudio software on their personal laptops prior to the short course. Prior experience with R is not necessary. Participants will also receive electronic versions of annotated, working code and data sets prior to the short course, so that they can follow along without having to type code during the session.

Cross-cultural Survey Research: Considerations and Best Practices

In-Person  |  May 14, 2024  |  2:00 pm – 5:30 pm ET

Instructor: Emilia Peytcheva, RTI

Description:
This course will provide an introduction to survey research methods for designing multinational and multicultural surveys. It will focus on measurement error in cross-cultural surveys, but will briefly touch on other sources of error. The course will begin with a theoretical background for cross-cultural differences drawing on cross-cultural psychology and psycholinguistic theories. The discussion of known mechanisms, differences and challenges will be presented within the survey response formation framework.
The second part of this course will focus on translation approaches and best practices. The course concludes with a case study, demonstrating the effect of language on survey responding.

Previous Short Courses

Presenter(s): Eric Plutzer, Penn State University

Description:
Web survey technology makes it easy to ask open-ended questions, and their use has risen sharply in the last decade. This course provides a hands-on overview of the ways that open-ended questions can be effectively utilized to provide valid and reliable measures of opinions, preferences, and values.

The course will begin with a review of the four main ways that survey methodologists have used open-ended questions: a) Probing to improve question wording and questionnaire design. b) Elaboration of answers to prior forced-choice questions. c) Exploratory opinion research allowing respondents to answer “in their own words” rather than in terms posed by the investigator. d) Measurement of system-1 (thinking fast) considerations and associations through “top-of-head” responses.

In the middle third of the course, attendees will work through practical issues and challenges concerning question design, eliciting thoughtful responses, qualitative data analysis, quantitative content analysis, and computational approaches to analysis (e.g., topic models).

The final third of the class will focus on research ethics, effective reporting, and how researchers can apply AAPOR and professional norms of transparency and replicability to open-ended question data.

Presenter(s): Sixia Chen, University of Oklahoma Health Sciences Center

Description:
Non-probability samples are used frequently in practice, including in education, medical studies, and public opinion research. Because of selection bias, naïve estimates from non-probability samples that are not adjusted may be misleading. This course covers the following topics: 1. Introduction to probability samples, non-probability samples, and their applications in practice; 2. Calibration weighting approach; 3. Propensity score weighting approaches; 4. Mass imputation approaches; 5. Hybrid approaches that combine propensity score weighting and mass imputation. For each topic, we will provide hands-on exercises with real data applications, including the National Health and Nutrition Examination Survey, the Behavioral Risk Factor Surveillance System, and the National Health Interview Survey, using SAS and R code. The code will be made publicly available for attendees to use.

Presenter(s): Thomas Young, Youngceltic

Description:
The course covers the entire process of using cell phone tracking data to model political polling. The initial steps use R and Python to extract large data sets and transform them into useful structures for modeling at Census Block Groups. We will also cover visualization and modeling of political affiliation.
