Cell Phone Task Force
in RDD Cell Phone Survey
WEIGHTING IN RDD CELL PHONE SURVEYS
Very often weights are needed in the analysis of RDD telephone surveys. Reasons that weighting is required include (1) differential probabilities of selection, (2) differential propensities to respond, and (3) sampling frame coverage problems among various groups in the population.
The emergence of households that have residents with cell phone service, but no landline service (i.e., the cell-only population) affects the way weights are constructed for RDD telephone surveys in the U.S. that sample cell phone numbers. Researchers have also recognized two other types of households that affect weighting: (1) the cell-mostly/mainly and (2) the landline-mostly/mainly households.1 As previously noted, whereas each of these groups has both cell and landline service, it has been observed that the cell-mostly/mainly have a greater propensity to respond if called on a cell phone and the landline-mostly/mainly have a greater propensity to respond if called on a landline.
Currently, weighting that accounts for cell phones is mostly done for surveys that use two sampling frames to obtain coverage of the cell-only population. However, there remain the possibilities of surveys for which the sampling frame is made up only of cell phone numbers, and surveys conducted with landline sampling frames where the sample is weighted to a target population that includes (at least in theory) the cell-only population. In addition, some surveys have used Address Based Sampling (ABS) to obtain coverage of the cell-only population. Surveys that use ABS frames present weighting issues different from those encountered in RDD surveys and are not addressed in these guidelines. (See Appendix A for more information about ABS.)
In the remainder of this section, considerations related to weighting are discussed. As was the case in the first AAPOR Cell Phone Task Force report in 2008, the discussion below applies specifically to surveys where:
- The samples for the survey are selected from RDD landline sampling frames, and/or from RDD cell phone frames; and
- The population being studied comprises households, units within the household such as families, or members of households.
Although single frame designs may be employed, the main focus of these guidelines is on general population telephone surveys that are currently more common – those that sample both from landline and from cell RDD sampling frames.
Despite the fact that much remains to be learned about how to weight data obtained from cell phone household surveys, some important considerations have been identified. Two such considerations that greatly affect decisions about weighting are:
- Geography – is the study national in scope, multi-state (e.g. a census region), for a single state, or for an area within a state (county, city, etc.)? A study may be concerned with multiple levels; a national survey may also require regional estimates or a statewide survey may also require estimates at the city or county level.
- Dual service users – for a dual frame survey, whether those households/persons with both cell and landline service are accepted from either sampling frame, or whether dual users are screened out from one frame (e.g., dual users identified on the cell phone sampling frame are screened out and only the cell-only group is screened in).
Initial Questions About Weighting RDD Cell Phone Samples
Those planning RDD telephone surveys in the U.S. that will include cell phone numbers in the sample, as well as researchers planning to analyze data from such surveys, should ask (and answer to their own satisfaction) a number of questions, including:
- Are weights needed?
- If weights are required, how should the approach to weighting differ for different sample designs?
- If weights are constructed, what steps are needed?
- If post-stratification is part of the weighting, what variables should be used?
- If weights are to be used, what data does the questionnaire need to gather to facilitate weighting? What other data may need to be collected from secondary sources?
- What other issues must be dealt with in weighting?
What follows is a discussion of these questions to try to aid researchers in making informed decisions.
Factors Affecting Answers to These Questions
Answers to the above questions depend on the population being studied (defined by telephone usage, geography or both) and on the sample design used. These considerations apply when the study population comprises households, families, other subunits within a household, persons living in households or subsets of those populations.
The U.S. household population can be divided into at least four groups based on telephone service:
- No telephone service,
- Landline service only,
- Cell service only, or
- Both landline and cell service.
The first group is currently very small. The last group may be divided further into (a) those who mostly rely on their cell phones and (b) those who rely mostly on their landline phones. The target population may include all or only a subset of these groups. In addition to telephone service and/or usage, the population may be defined by its location; it may include the entire U.S., a subset of states, a single state, one or more counties or some smaller geopolitical unit.
Four different sampling designs that affect the approach to weighting include two dual frame and two single frame designs:
- Nonoverlapping dual frame design: Samples are selected for the survey from landline and cell phone RDD frames, but screening is done so that any member of the target population has a nonzero probability of selection from only one of the frames.
- Overlapping dual frame design: Independent samples are selected from RDD frames that overlap in their coverage (e.g., a landline frame and a cell phone frame) and there is no screening; thus some members of the study population (e.g., those with both cell and landline service) have a nonzero probability of selection from more than one frame.
- Landline frame: Studies in which only a landline RDD frame is used, but weighting adjustments are desired to account for the fact that the frame excludes the cell-only group.
- Cell phone frame: Studies in which the sample is selected only from a cell phone RDD frame, but weighting adjustments may be desired to account for the fact that the frame excludes the landline-only group.
In addition to other weighting adjustments for these four designs, it may be desirable to adjust to account for those with no telephone service.
The remainder of this section on Weighting will focus primarily on the two dual frame designs since in 2010 they are the designs most commonly used in telephone surveys of the general population in the United States. The examples given focus on two variations within each design: one where households are sampled and the other where adults are sampled within households.
When Are Weights Required?
Weights would almost always be required if both cell and landline RDD frames are used, especially if respondents having both types of service are interviewed from both frames (i.e., the dual frame without screening design).
However, there are a few instances when it may be permissible not to use weights. For example, weights might not be needed in a sample that uses only one frame and no attempt is made to generalize about those who could only be contacted via the other frame. But even in these surveys, weights usually should be constructed if there are non-ignorable differences in the probabilities of selection or if there is differential nonresponse across various groups of the population.
Another occasion when weights may not be required arises when a new mode of survey administration – one that arises from advances in telecommunication technology – is being used. In this case, comparing unweighted data across the old and new modes becomes a logical first step in determining how findings may differ and whether or not weighting methods, particularly for post-stratification, need to be substantially revised.
The Need for Disclosure of Weighting Procedures. The survey research community of scholars and practitioners is still in a period of uncertainty and experimentation in surveying cell phone numbers in the U.S. Thus, it remains vitally important for researchers to clearly describe (disclose) how they constructed any weights used in their analyses or to describe the basis on which they decided not to weight, if that was their decision. By comparing results across studies that use different weighting procedures, the survey research community can begin to determine which procedures most effectively adjust for the kinds of errors that inevitably occur during the survey process but that can be addressed by weighting.
Steps in Weighting Process for Different Types of RDD Sample Designs
For each sample design that follows, a discussion is provided on the steps that would typically be used to weight the data. Single frame designs are covered for completeness, but most attention is given to the overlapping and nonoverlapping dual frame designs.
Overlapping dual frame designs typically require the use of compositing weight adjustment factors when the samples from the two frames are combined in order to account for the adults/households that can be sampled through either frame. Two basic types of designs are illustrated: (1) a sample design that randomly selects one adult from the household, and (2) a sample of households (also covers a sample of family households, a sample of unrelated individual households, a sample of households with a specific characteristic, etc.).
The random selection of one adult from a landline RDD sample is a widely used approach. For a cell phone sample three approaches can be considered:
1. Treat the cell phone as a personal-use device,
2. Allow for sharing of a single cell phone by two or more adults in the household, and
3. Treat the sample cell phone number similar to a landline telephone number and randomly select one adult from among all adults in the household.
To date, the third approach is not often used in the U.S. and we concentrate on the first two approaches. However, this may change, and researchers should monitor future developments.
For a sample of households one must account for the linkages or associations between the household and the landline and cell phone telephone numbers that can be used to reach that household. In other words, within the cell phone frame a household may contain more than one personal-use adult cell telephone number and within the landline frame a household may have more than one voice-use landline telephone number. Finally, a discussion of weighting national versus state and local samples is also provided.
Weighting for Single Frame Designs. Consider, for example, a survey using only an RDD cell phone frame that does not seek to make inferences about adults not having cell phone service. For this survey, one would weight to reflect any differences in the probability of selection of sample telephone numbers. The cell phone is either treated as a personal communication device or one adult is randomly selected from the adults in the household that share the cell phone. If one adult sharing the cell phone is selected then this needs to be accounted for in the weight calculations. Ratio adjustments to account for unit nonresponse can also be considered. For a national sample of adults one could post-stratify to population control totals from the latest NHIS public-use data file for adults living in households with cell phones. The weighting steps would be similar if one had a national sample of cell-only adults. For a sample of households one needs to add a step in the weighting process to account for the number of personal-use adult cell telephone numbers in the household.
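The ratio adjustment for unit nonresponse mentioned above can be sketched as follows. This is a minimal illustration, not a procedure prescribed by the Task Force: the adjustment classes, the field names (`base_weight`, `cls`, `responded`), and the sample values are all hypothetical, and the sketch assumes every class contains at least one respondent.

```python
from collections import defaultdict

def nonresponse_adjust(cases):
    """Weighting-class ratio adjustment for unit nonresponse.

    Each case is a dict with hypothetical keys:
      base_weight - inverse of the selection probability
      cls         - adjustment class label (assumed known for every case)
      responded   - True if the case completed the interview
    Returns adjusted weights for respondents only, in input order.
    """
    total = defaultdict(float)  # sum of base weights over all eligible cases
    resp = defaultdict(float)   # sum of base weights over respondents only
    for c in cases:
        total[c["cls"]] += c["base_weight"]
        if c["responded"]:
            resp[c["cls"]] += c["base_weight"]
    # Respondents in each class carry the weight of that class's nonrespondents.
    return [c["base_weight"] * total[c["cls"]] / resp[c["cls"]]
            for c in cases if c["responded"]]

# Hypothetical sample: class A has 1 of 2 responding, class B has 2 of 3.
cases = [
    {"base_weight": 100.0, "cls": "A", "responded": True},
    {"base_weight": 100.0, "cls": "A", "responded": False},
    {"base_weight": 200.0, "cls": "B", "responded": True},
    {"base_weight": 200.0, "cls": "B", "responded": True},
    {"base_weight": 200.0, "cls": "B", "responded": False},
]
adjusted = nonresponse_adjust(cases)
```

After the adjustment, the respondents' weights sum to the total base weight of all sampled cases in each class, which is the property the ratio adjustment is meant to preserve.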
Overlapping Dual Frame Design. A design that employs both the cell phone and landline frames without any screening presents more difficulty in weighting. So-called “overlap designs,” such as these, can be used for various units of analysis: adults or children living in households, households themselves, or some combination. Weighting for overlap designs must account for the fact that some of the adults/individuals had a chance of selection from both the cell and landline frames.
Landline frame adjustments. Weighting adjustments for the landline frame have three components: (1) phone selection probability (measured from the sampling frame), (2) number of voice-use landlines (measured via the questionnaire); and (3) for the random selection of one adult from the household, a within-household selection probability (measured via the questionnaire).
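The three landline components can be combined into a design weight along the following lines. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the numbers of landlines and adults are taken as reported in the questionnaire without any capping or imputation.

```python
def landline_household_weight(p_number, n_voice_landlines):
    """Design weight for a household sampled through the landline frame.

    p_number          - selection probability of the sampled number (from the frame)
    n_voice_landlines - voice-use landlines reaching the household (questionnaire)
    """
    if not 0 < p_number <= 1:
        raise ValueError("p_number must be in (0, 1]")
    if n_voice_landlines < 1:
        raise ValueError("a household reached by landline has at least one line")
    # More lines means more chances of selection, so the weight is divided down.
    return (1.0 / p_number) / n_voice_landlines

def landline_adult_weight(p_number, n_voice_landlines, n_adults):
    """Design weight when one adult is randomly selected within the household."""
    if n_adults < 1:
        raise ValueError("n_adults must be at least 1")
    # Within-household selection probability is 1 / n_adults; its inverse is n_adults.
    return landline_household_weight(p_number, n_voice_landlines) * n_adults

# Hypothetical case: number sampled at 1 in 1,000, two landlines, three adults.
w_household = landline_household_weight(0.001, 2)
w_adult = landline_adult_weight(0.001, 2, 3)
```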
Cell phone frame adjustments. For the cell phone sample the weighting adjustment to account for differential selection probabilities depends on the sampling strategy used. If the cell phone number that is called is linked to only one adult (e.g. the adult who owns or is the main user of that number) the adjustments will differ from instances where that number is linked to multiple adults (e.g., the entire household or those adults who share that phone). The cell phone sample weighting adjustments can include:
- Phone number selection probability (measured from the sampling frame);
- For adults, the number of cell phones that can be linked to each sampled adult (measured via the questionnaire);
- For households, the number of personal-use adult cell phones attached to the household, e.g., the number of cell phones for all adults in household (measured via the questionnaire); and
- If sampling from multiple adults linked to a cell phone, the within cell-phone selection probability of the adult (measured via the questionnaire).
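A parallel sketch for the cell phone frame is given below. The function and parameter names are hypothetical. The defaults encode the personal-use-device assumption (one phone, one adult), and the other arguments cover an adult with several cell phones or a phone shared among several adults.

```python
def cell_adult_weight(p_number, n_cells_for_adult=1, n_adults_sharing=1):
    """Design weight for an adult sampled through the cell phone frame.

    p_number          - selection probability of the sampled number (from the frame)
    n_cells_for_adult - cell phones linked to this adult (questionnaire)
    n_adults_sharing  - adults linked to the sampled phone, when one of them is
                        randomly selected (questionnaire); 1 means personal use
    """
    if not 0 < p_number <= 1:
        raise ValueError("p_number must be in (0, 1]")
    if n_cells_for_adult < 1 or n_adults_sharing < 1:
        raise ValueError("counts must be at least 1")
    return (1.0 / p_number) / n_cells_for_adult * n_adults_sharing

def cell_household_weight(p_number, n_adult_cells_in_household):
    """Design weight for a household sampled through the cell phone frame."""
    if not 0 < p_number <= 1:
        raise ValueError("p_number must be in (0, 1]")
    if n_adult_cells_in_household < 1:
        raise ValueError("count must be at least 1")
    # Each personal-use adult cell phone is a separate route into the household.
    return (1.0 / p_number) / n_adult_cells_in_household

# Hypothetical cases, all with a 1-in-2,000 number selection probability.
w_personal = cell_adult_weight(0.0005)                      # one phone, one adult
w_two_phones = cell_adult_weight(0.0005, n_cells_for_adult=2)
w_shared = cell_adult_weight(0.0005, n_adults_sharing=2)
w_household = cell_household_weight(0.0005, 4)
```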
The number of cell phones that can be linked to a person within the same household may be different for children than for adults. Although this adjustment depends on the sampling design, it may be the number of cell phones owned (or shared) by an adult, whereas for a child – if the survey is including minors as respondents – in the same household it may include the number of cell phones for each guardian who can grant permission and provide access to the child (or speak for the child if serving as a proxy).
Many, perhaps most, recent cell phone survey designs in the U.S. appear to assume the linkage of a cell phone to one and only one adult. It is easy to understand why this is appealing. However, one phone could be shared by multiple people and some surveys have selected from those sharing a cell phone. Multiple people could share multiple phones with multiple different people (cf. Fuchs and Busse, 2010). In the U.S., this is assumed to be rare and would be very burdensome to measure if there is a complex weave of multi-person cell phone usage. (See Best and Hugick (2010) and Wolter, Smith and Blumberg (in press) for a discussion of the linkage between telephone numbers and the individuals in the household.)
Nonresponse adjustments (i.e., weighting for nonresponse) in an overlap design should take into account differential response between and within the frames. First, there may be differential response between the cell phone and landline frames. In addition, there may be differential response within either frame between various types of telephone users. For example, as noted previously, the cell-mainly group has been observed to be more likely than the landline-mainly group to respond when contacted through the cell phone frame; and the reverse appears to be the case for the landline-mainly group. In addition, within the cell phone frame in an overlap design, cell-only users consistently have been found to be more likely to respond than the dual phone users.
Combining Samples Where There is Overlap. To combine the samples, researchers need items included in the questionnaire for telephone group classification. These questions should determine whether the respondent could have been selected in the other frame. In a dual frame with overlap design, the weight adjustments for probabilities of selection and nonresponse will likely need to be checked and adjusted to fit external estimates by telephone usage and demographics. Post-stratification methods can be used for groups where there are external data sources. However, depending on the geography of the survey, external estimates of phone usage groups may be unavailable or of low quality (i.e., poor reliability).
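The telephone group classification described above can be sketched as a small function. The frame labels, group labels, and the `has_other_service` flag are hypothetical names. In practice the flag would be derived from whichever questionnaire items a survey uses to ask about the other type of service.

```python
def telephone_group(frame, has_other_service):
    """Classify a respondent by telephone status.

    frame             - "landline" or "cell": the frame the respondent was reached on
    has_other_service - True if the respondent also has the other type of service
    Returns (group, in_overlap), where in_overlap is True when the respondent
    could also have been selected through the other frame.
    """
    if frame not in ("landline", "cell"):
        raise ValueError("frame must be 'landline' or 'cell'")
    if has_other_service:
        return "dual", True
    # Being reached on a frame implies having that frame's service.
    group = "landline_only" if frame == "landline" else "cell_only"
    return group, False

# Hypothetical respondents.
a = telephone_group("cell", has_other_service=False)
b = telephone_group("cell", has_other_service=True)
c = telephone_group("landline", has_other_service=False)
```

Only respondents flagged as being in the overlap need a compositing adjustment when the two samples are combined.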
Observations for those who have a chance of selection from both frames (i.e., dual users) may be combined by the use of composite weights. One aspect of overlapping dual frame telephone samples that has received considerable discussion is the use of composite weights for the dual users from the landline sample and the cell phone sample (cf. Hartley, 1962; 1974). In combining the dual user samples, one typically selects two compositing factors that sum to one. Many researchers currently are setting the two compositing factors to 0.5. Another approach involves calculating the effective sample sizes for the two dual user samples, and then using the effective sample sizes to determine the compositing factors. Brick et al. (forthcoming) discuss the use of compositing factors equal to 0.5 and propose an alternative approach that calculates the compositing factors by inferring the dual user response rate in each of the two samples from external information, typically from the NHIS. Regardless of the choice of the compositing factors, the researcher should make an effort to assemble control totals2 of landline-only, dual user, and cell-only adults/households, etc., for use in post-stratification or raking (along with socio-demographic variables such as age, gender, education, etc.). For state and sub-state surveys where the control totals may not be accurate, the researcher should consider conducting a sensitivity analysis3 to assess the impact on the survey estimates of errors in the control totals for the geographic area.
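The two simpler compositing choices can be illustrated as below. This sketch rests on two assumptions that are not specified in the text: the effective sample size is computed with Kish's approximation, (sum of weights) squared divided by the sum of squared weights, and the compositing factors are set proportional to the effective sample sizes. The function names and weights are hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum w)^2 / sum(w^2)."""
    s = sum(weights)
    s2 = sum(w * w for w in weights)
    return s * s / s2

def compositing_factors(landline_dual_weights, cell_dual_weights):
    """Factors proportional to effective sample size; they sum to one."""
    n_ll = effective_sample_size(landline_dual_weights)
    n_cell = effective_sample_size(cell_dual_weights)
    lam = n_ll / (n_ll + n_cell)
    return lam, 1.0 - lam

def apply_factor(weights, factor):
    """Scale the dual users' weights from one frame by its compositing factor."""
    return [w * factor for w in weights]

# Hypothetical dual-user weights from each frame.
ll_dual = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # effective sample size 6
cell_dual = [1.0, 1.0]                      # effective sample size 2

lam_ll, lam_cell = compositing_factors(ll_dual, cell_dual)
ll_composited = apply_factor(ll_dual, lam_ll)
cell_composited = apply_factor(cell_dual, lam_cell)

# The simpler alternative noted in the text: equal factors of 0.5.
ll_equal = apply_factor(ll_dual, 0.5)
cell_equal = apply_factor(cell_dual, 0.5)
```

Landline-only and cell-only respondents are left untouched by this step, since they could be selected through one frame only.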
Once the weighted respondents from each frame have been merged together, a second stage of weighting, known as sample balancing or raking, is usually applied to balance the sample to selected population or household demographic parameters. To help compensate for differential response/nonresponse between the two frames, some researchers include a telephone status and usage parameter in addition to their standard socio-demographic adjustments.
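Raking can be sketched as iterative proportional fitting over the chosen margins. The example below is a minimal illustration that rakes to a telephone status margin and an age margin. The variable names, categories, and control totals are hypothetical and are not actual population figures. It assumes every category has at least one respondent and that the margins sum to the same total.

```python
def rake(records, weights, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting of weights to marginal control totals.

    records - list of dicts mapping variable name -> category
    weights - starting weights, one per record
    margins - {variable: {category: control total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for r, wi in zip(records, w):
                totals[r[var]] = totals.get(r[var], 0.0) + wi
            # Scale each record so its category matches the control total.
            for i, r in enumerate(records):
                factor = targets[r[var]] / totals[r[var]]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return w

# Hypothetical merged sample: one respondent per phone-status-by-age cell.
records = [
    {"phone": p, "age": a}
    for p in ("landline_only", "dual", "cell_only")
    for a in ("18-44", "45+")
]
margins = {
    "phone": {"landline_only": 20.0, "dual": 50.0, "cell_only": 30.0},
    "age": {"18-44": 40.0, "45+": 60.0},
}
raked = rake(records, [1.0] * len(records), margins)

def weighted_total(var, category):
    return sum(w for r, w in zip(records, raked) if r[var] == category)
```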
Non-Overlapping Dual Frame Design. A dual frame survey with screening for cell-only adults or households eliminates overlaps and the weighting process is somewhat simpler. Typically, an RDD landline frame is used to interview the landline-only group and those with both types of service (i.e., the dual users), and an RDD cell phone frame is screened and those with only a cell phone (i.e., no landline service) are interviewed.
For weighting and estimation purposes, this design can be considered a stratified sample. The study defines three nonoverlapping telephone usage groups (strata) for households or adults:
- Landline only;
- Landline and cell (or dual users, which include cell-mainly and landline-mainly); and
- Cell only.
In weighting, the researchers perform design weight calculations for the landline sample and for the cell phone sample, and then combine them. The weighting of the landline sample must account for differential nonresponse by telephone usage group (including dual cell phone mostly/mainly versus dual landline mostly/mainly). In this type of design, dual users are included only if they are reached via the landline frame, and among these, evidence to date suggests that cell phone mostly/mainly users will be underrepresented.
For its part, weighting for differential nonresponse by telephone usage group is affected by the two different cell samples (i.e., the cell phone only group and the screened out dual service group). Different weighting procedures for the combined sample depend on whether control totals by telephone usage group are available. When they are not, raking/post-stratification to socio-demographic control totals may be the only approach available. When control totals by telephone usage group are available, the cell phone and landline samples can be adjusted to the appropriate totals, but raking/post-stratification to socio-demographic control totals may still be needed.
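When control totals by telephone usage group are available, the adjustment amounts to post-stratification with the three strata as cells. The following is a minimal sketch with hypothetical group labels, weights, and control totals, assuming every stratum contains at least one respondent.

```python
def poststratify(groups, weights, control_totals):
    """Scale weights so each group's weighted total equals its control total."""
    sums = {}
    for g, w in zip(groups, weights):
        sums[g] = sums.get(g, 0.0) + w
    return [w * control_totals[g] / sums[g] for g, w in zip(groups, weights)]

# Hypothetical combined sample: landline-only and dual users come from the
# landline frame, cell-only respondents from the screened cell phone frame.
groups = ["landline_only", "dual", "dual", "cell_only"]
design_weights = [10.0, 20.0, 20.0, 5.0]
control_totals = {"landline_only": 15.0, "dual": 60.0, "cell_only": 25.0}

final_weights = poststratify(groups, design_weights, control_totals)
```

A further raking step to socio-demographic totals may still be applied afterwards, as noted above.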
For this type of design as with others, the approach to weighting will be affected by the choice of sampling, reporting and analysis of units. Telephone surveys may sample individuals, or may use a “most knowledgeable” respondent to provide information about the household and its members. For a sample of households one needs to account for the number of voice-use landline telephone numbers in the household and the number of personal-use adult cell telephone numbers in the household.
Variance Estimation in Single Frame Versus Dual Frame Designs. Of note, variance estimation for dual frame sample designs is somewhat more complex than for single frame designs. Thus, the Task Force suggests researchers work with a survey statistician who has experience with variance estimation for complex sample designs.
Considerations for National and State/Local Surveys
The procedures for weighting national and state/local surveys in the U.S. share many similarities, but there are two issues that may result in notable differences. The first is related to geographic eligibility and the second to accuracy and availability of control totals by telephone status and usage. Both of these issues are discussed briefly.
Geographic eligibility refers to whether the respondent lives within the geographic boundaries of the target population. For national surveys, respondents are all geographically eligible since essentially all sampled cell phone numbers (and landline numbers) are residents of the United States.
At the state/local level, the picture is very different, especially for cell phone numbers, which are not associated with local geographies even to the extent that landline numbers are, a situation due in part to the portability and interoperability of cell phones across state/local boundaries. As a result, some geographically eligible numbers are not sampled (resulting in undercoverage) and some sampled numbers reach persons residing in households outside the target geography (resulting in overcoverage). The overcoverage can usually be minimized by asking potential respondents to confirm their area of residence and then screening out any who do not reside within the targeted geographic area.4 Weighting procedures for state/local surveys therefore must try to deal with a potential coverage error that does not arise in national surveys. The choice of different auxiliary variables for post-stratification control totals is one of the most common ways to deal with this.
At the state/local level, the external control totals that are available, accurate, reliable, and/or consistent over time may be limited. In a particular state/local area, control totals by telephone status (cell-only, dual user, landline-only) and telephone usage among dual users (cell-mainly or landline-mainly) may be neither available nor accurate enough to be used in post-stratification adjustments. Control totals by telephone status and usage are frequently used in weighting telephone surveys to reduce potential biases due to differential response rates by these characteristics. As previously noted, the NHIS collects information on cell phone status and usage and reports estimates of these quantities at both the national and census region level every six months. Currently, no other federal face-to-face survey provides reliable estimates for these characteristics. The National Center for Health Statistics has used the data from the NHIS in conjunction with other data sources to produce model-based estimates of the prevalence of cell-only households and adults at the state level, but these estimates are for the cell-only population (not the full telephone status) and are subject to substantially larger errors than the national estimates.5
Researchers may develop their own model-based estimates using the NHIS and other sources of information such as the American Community Survey Public Use Microdata Sample (ACS PUMS). Battaglia et al. (2008), Battaglia et al. (2010), and Blumberg et al. (2009) should be consulted for examples of this approach.
As a result, national telephone surveys can use more reliable and up-to-date data for weighting than are available for state/local surveys. The availability of data for post-stratification may affect the choice of design (screening versus full overlap) and the weighting procedures. Both of these have consequences for the potential for bias of some of the estimates. In the absence of data to use as control totals in local surveys, Guterbock (2009) has suggested an approach that adjusts the phone usage distribution from the realized local samples by applying response-rate differentials calculated from comparison of national or regional samples with the appropriate NHIS control totals. ZuWallack and Conrey (2010) have proposed a response propensity approach to weighting state and sub-state surveys when external telephone service control totals cannot be obtained.
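The arithmetic behind an adjustment of this general kind can be sketched as follows. This is only one simplified reading of the description above, offered for illustration. It should not be taken as Guterbock's published procedure. All shares are hypothetical and are not actual NHIS or survey figures, and the function and variable names are invented for the example.

```python
def adjusted_local_shares(local_sample_shares, reference_sample_shares,
                          reference_control_shares):
    """Estimate local phone-group shares from a realized local sample.

    local_sample_shares      - unweighted group shares in the local sample
    reference_sample_shares  - group shares in a national/regional sample
    reference_control_shares - control shares for the same reference area
    """
    # Differential: how over- or under-represented each group tends to be
    # in a realized sample relative to the known reference population.
    differential = {
        g: reference_sample_shares[g] / reference_control_shares[g]
        for g in reference_control_shares
    }
    # Undo that differential in the local sample, then renormalize to one.
    raw = {g: local_sample_shares[g] / differential[g] for g in differential}
    total = sum(raw.values())
    return {g: v / total for g, v in raw.items()}

# Hypothetical shares for the three telephone status groups.
reference_sample = {"landline_only": 0.20, "dual": 0.60, "cell_only": 0.20}
reference_control = {"landline_only": 0.15, "dual": 0.60, "cell_only": 0.25}
local_sample = {"landline_only": 0.25, "dual": 0.60, "cell_only": 0.15}

local_estimate = adjusted_local_shares(local_sample, reference_sample,
                                       reference_control)
```

In this hypothetical case the cell-only group is under-represented in the reference sample, so its estimated local share is adjusted upward from the share observed in the local sample. The resulting shares could then serve as approximate local control totals.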
Finally, it may be possible that state/local level information that could be used in post-stratification adjustments is available from several smaller surveys serving varying constituencies. In this case state/local level adjustments may require selection of a “best available” control total or a method for combining all control totals together to provide a more accurate measure for the area of interest via a small area estimation procedure or some other type of linking method.
Gathering Data Within Dual Frame Surveys to Determine Telephone Service Usage and for Post-Stratification
At a minimum, dual frame RDD telephone surveys require items to be added to both the cell phone and landline questionnaires that permit classification of respondents on the basis of telephone ownership and usage. In addition, for accurately weighting telephone surveys that include cell phone samples, certain data must be available about the target populations’ parameters and the survey samples’ characteristics. Thus, U.S. telephone researchers need to gather data in their questionnaire to facilitate this process by measuring those sample characteristics needed for proper weighting to be possible.
As previously noted there is no consensus regarding how RDD cell phone samples should be weighted, especially when combining them with RDD landline samples. As such, there also is no consensus on exactly what survey items need be asked of respondents to support this process.
To date, a mix of measures has been employed in RDD cell phone surveys for this purpose, including:
Has the respondent been reached on a landline or a cell phone?
Do the respondents reached on a cell phone also have a landline telephone?
Do respondents reached on a landline also have a cell phone?
Is the cell phone on which the respondent was reached used or answered only or mostly by the respondent?
- If not, how many other eligible persons use/answer the phone?
Does the respondent have other cell phones?
- If so, considering all of the respondent’s personal telephone usage, how much does the respondent use each of them?
For respondents with both a cell phone and a landline phone, what proportion of all of their incoming telephone calls are taken via each type of phone service?
What portion of a typical day is their cell phone turned on (e.g., number of hours a day)?
Is the cell phone used primarily for business purposes?
- If so, what portion of their usage is for incoming business versus incoming personal calls?
Does the respondent use alternate forms of communication via their cell phone (e.g., text messages, SMS, e-mail)?
Appendix B includes examples of the questions used for these purposes by several major survey organizations. This appendix should not be considered an endorsement of these questions, but rather is offered as a resource to researchers looking for examples of survey variables that could be gathered and how the questions have been worded by other survey organizations.
Factors to Consider in Selecting Questions for Weighting Purposes. These factors include the sample design (cell phone only, dual frame with screening for cell-only households/persons, or dual frame without screening), as well as the weighting parameters.
Weighting to external estimates is most effective when items in the questionnaire replicate as closely as possible the manner in which the data were gathered in the external survey. Researchers conducting national (and in some cases regional) surveys may consider using the most current telephone service estimates from the NHIS.6 The NHIS features a large, national area probability sample that covers both telephone and nontelephone households. Data collection for the NHIS is continuous throughout the year, and parameter estimates for telephone service are published twice yearly (in December for interviews completed from January - June, and in May for interviews completed from July - December). For surveys making inference to smaller geographic areas, satisfactory parameter estimates may not be readily available, though state level cell phone only estimates are becoming available. As such, researchers who are conducting non-national telephone surveys of the general population must recognize that weighting to inappropriate parameter estimates may not improve survey estimates and, in some cases, may increase error.
There also are concerns about the reliability of many telephone service and usage questions. For example, the term, “landline telephone,” is not a familiar term to everyone, and there is potential for some respondents to confuse cordless landline telephones with “wireless” cell phones. Furthermore, estimating the proportion of calls made on a cell phone versus a landline phone may be very difficult for some respondents, and in many cases their answers will be unreliable. Of note, Villar, Krosnick and DeBelle (2010) report that more reliable data will be gathered when respondents are asked about the proportion of their “personal” calls that are made and received via cell phone or landline rather than the data that are gathered without using the word “personal” (which is now the case in the NHIS survey items).
The questions used by the NHIS to measure telephone use are posed at the family level, are then aggregated to the household level (for households with multiple families or unrelated persons living together), and finally the status of all individuals in the household is based on the household-level measures. This matches the practice of most telephone surveys with respect to the presence of a landline telephone, which is assumed to be available to and used by all members of the household. For cell phones, the NHIS asks if anyone in the household has a cell phone. If the answer is “Yes,” all members of the household are assumed to have access to that device. They are then assigned cell phone only status if there is no landline in the home, or dual phone status if there is a landline. Although the sharing of cell phones in a household, at least on occasion, is not an uncommon practice, the vast majority of people in the U.S. are thought not to share their phones with others in their household.7 In addition, even when sharing goes on, it is likely that not all members of a household are equally accessible to incoming calls on a cell phone. Yet, as a practical matter, most surveys using cell phones do not ask about cell phone sharing or the accessibility of all household members via cell phone. As a consequence, there is a potential mismatch between the parameter estimates of telephone status from the NHIS and the telephone status of household members as measured by most telephone surveys.8
The battery of potential telephone usage measures (see Appendix B) presents a number of practical concerns. A nontrivial amount of interviewing time would be required to be able to include all or many of these in a questionnaire, which in many surveys would likely necessitate a reduction in the number of substantive questions. These items are also apt to be uninteresting and potentially sensitive to many respondents, thus raising the chances for item nonresponse and even breakoffs occurring.
These are matters that must be worked out in the coming years so that valid standardized measures can be used to gather the variables needed for weighting RDD cell phone respondents in the U.S. Likewise, this must be done in ways that are reasonably cost effective for researchers who need to conduct telephone interviews with those reached on cell phones.
1 Of note, important subgroups may later be identified within each of these groups.
2 External population estimates for the survey target population, referred to as control totals, may be available from a previous census, the American Community Survey, the National Health Interview Survey, etc. If the sample can be divided into subgroups, for example, age by gender, and external control totals are available for the subgroups, then the sample in each subgroup can be weighted to the population total for that subgroup.
3 See Battaglia, Eisenhower, Immerwahr and Konty (2010).
4 As discussed in the section on Operational Issues, geographic screening must be crafted in a careful manner and interviewers must be well trained to administer it accurately to avoid Errors of Omission (false negatives) and Errors of Commission (false positives).
5 See Blumberg, S., Luke, J., Davidson, G., Davern, M., Yu, T. and Soderberg, K. 2009. Wireless Substitution: State-level Estimates from the National Health Interview Survey, January - December 2007. http://www.cdc.gov/nchs/data/nhsr/nhsr014.htm
7 As of 2010, there appears to be no reliable national estimate of the proportion of U.S. cell phone owners who share their cell phone with someone else, but most estimates put it in the 10%-20% range (cf. Link et al., 2007).
8 When cell phones are shared, it may be useful to determine the proportion of time the survey respondent uses this device as compared to other users of the same device. This can be applied in adjusting selection probabilities for having reached the respondent on this device. For example, a person using the cell phone only half the time would have a 0.50 chance of being included in the survey and thus the inverse of this probability would be used to correctly adjust their selection for inclusion.