Saturday, January 27, 2007

Longitudinal Survey Method

Barriers in Use of Longitudinal Survey

The ESRC has examined the most common barriers that prevent social scientists from making more frequent use of longitudinal survey data resources. It found that the most important challenges are:
• Lack of appropriate software skills and good habits in software programming. In response, the LDA online materials include a number of introductory resources to working with relevant software through textual ('syntax') programming. Most resources refer to the packages SPSS and Stata.
• Lack of confidence in undertaking data management tasks in the handling of complex combinations of longitudinal data files and variables. Our resources are heavily oriented to training in the data management tasks common to longitudinal survey resources, such as merging data between different files and summarising 'key' variables in a longitudinal context.
• Lack of appreciation of the qualities of appropriate longitudinal survey data resources. Our materials use illustrative analyses of secondary survey resources, and feature links to numerous information resources on relevant survey data.
• Lack of confidence in statistical techniques for the analysis of complex survey data. Our materials feature general training in issues of working with complex survey data as well as links to further training resources ('complex' survey data is not necessarily longitudinal, but longitudinal survey data is usually complex).
• Lack of a balanced array of skills in the statistical techniques of quantitative longitudinal data analysis. Our materials attempt to demonstrate a wide range of methods for the analysis of longitudinal data. We also try to point readers to other resources which can further develop such skills.

Source: www.longitudinal.stir.ac.uk/materials_summary.html




Study of Survey Methods
A number of studies have outlined the relative advantages and disadvantages of online research. The key advantages nearly always quoted first are greater speed and lower cost. In a number of circumstances these are going to be significant – particularly for multinational research and research with specialist audiences. There are also general cost advantages, although the cost of building and maintaining a panel can be quite substantial at the outset. The gains can be considerable, as it is possible to accumulate very large volumes of interviews in a short space of time. Having said this, a minimum fieldwork period is often recommended for online surveys to ensure good coverage. Another advantage suggested is that online surveys do not require interviewers to be present, so interviewer effects are avoided. A prominent example is the higher admission of undesirable behaviour in online surveys than in interviewer-administered surveys (Comley 2003). Within political polls, the anonymity afforded by internet-based approaches is particularly highlighted as a way around the problem of the ‘spiral of silence’. The increasing individualism and selectiveness of potential respondents, and their use of new technology such as voicemail and caller ID to avoid telephone surveys, also work in favour of online research. Online surveys get around this by fitting in with respondents’ lives: they can be filled in at the respondent’s convenience, and can be partially completed and returned to later. It is argued that this may help explain the more ‘socially liberal’ attitudes seen in many online surveys, as respondents on average tend to lead less home-based lives and so are less cautious (Kellner 2003b). It is also suggested that online interviewing reaches ‘busy people – often educated and well-off – who systematically repel or ignore cold callers but are willing to answer questions posted on their computer screen’ (Kellner 2004).

As for reliability, it is known that online respondents use scales differently from respondents in other modes. The research on this is conflicting: some studies show that online respondents are more likely to choose midpoints in scales and ‘don’t know’ options, while others suggest they favour extreme options. It is possible to correct for this to an extent through modelling. Unlike face-to-face surveys, which can be sampled from reasonably comprehensive databases, online surveys are most often conducted among respondents from a panel who have agreed to be contacted for market research. No simple database of everyone who is online exists, and it looks unlikely that one will exist for the foreseeable future. Furthermore, even if there were such a list, prohibitions against ‘spamming’ online users would prevent it from being used as a sampling frame.

There are therefore three main issues relating to coverage bias or selection error raised by the sampling approach of online panels: first, of course, they can reach only those who are online; second, they can reach only those who agree to become part of the panel; and, third, not all those who are invited respond (Terhanian 2003). What makes online surveys different from other survey approaches, such as telephone in the USA and face to face in the UK, is that such a large proportion of the population is excluded before the survey begins, and that these people are known to be different from those who are included. Although internet access in the UK stands at around six in ten of the adult population and rising, the demographic profile of internet users is not representative of the UK adult population as a whole, tending towards younger age groups. Those who choose to sign up for online panels may also have a younger, more male profile (Terhanian 2005).

It has been observed that online data tend to paint a more active picture of the population: online survey respondents tend to be more politically active, more likely to be early adopters of technology, and tend to travel and eat out more than face-to-face survey respondents.

The technique behind propensity score weighting – propensity score matching (Rosenbaum & Rubin 1984) – has been used since the early 1980s, most commonly in evaluations of social policy, to ensure that experiment and control groups have similar characteristics (where random assignment is not possible).

The propensity score matching process is as follows.
• Parallel online and telephone or face-to-face surveys are conducted, in which the same questions are asked at the same time in the different modes.
• Logistic regression is then employed to develop a statistical model that estimates the probability that each respondent, conditional on his or her characteristics, participated in the telephone or face-to-face study rather than the online one. The probability, or ‘estimated propensity score’, is based on answers to several socio-demographic, behavioural, opinion and attitudinal questions.
• Next, in the ‘propensity score adjustment’ step, respondents are grouped by propensity score within the survey group (telephone/face-to-face or online) they represent.
Statistical theory (Rosenbaum & Rubin 1984) shows us that when the propensity score groupings are developed methodically, the distribution of characteristics within each internet grouping will be asymptotically the same as the distribution of characteristics within each corresponding telephone or face-to-face grouping.
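To make these steps concrete, here is a minimal sketch in Python of how propensity scores might be estimated and used to weight online respondents towards a face-to-face reference sample. It assumes a combined respondent-level data set with a mode indicator and a handful of attitudinal and behavioural covariates; all variable names are illustrative rather than those used in the study described below.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def propensity_weight_online(df, covariates, mode_col="mode", n_groups=5):
    """Weight online respondents towards the face-to-face sample.

    `df` holds both samples; `mode_col` is 1 for face-to-face, 0 for online.
    """
    X = df[covariates].to_numpy()
    y = df[mode_col].to_numpy()

    # Step 1: logistic regression estimates each respondent's probability of
    # having been interviewed face to face, given their characteristics.
    model = LogisticRegression(max_iter=1000).fit(X, y)
    df = df.assign(pscore=model.predict_proba(X)[:, 1])

    # Step 2: group respondents into propensity-score strata (quintiles here).
    df["stratum"] = pd.qcut(df["pscore"], q=n_groups, labels=False)

    # Step 3: within each stratum, weight online cases so their share of the
    # online sample matches the face-to-face share in the same stratum.
    is_ftf = df[mode_col] == 1
    ftf_share = df[is_ftf].groupby("stratum").size() / is_ftf.sum()
    online_share = df[~is_ftf].groupby("stratum").size() / (~is_ftf).sum()
    stratum_weight = (ftf_share / online_share).rename("weight")

    # Return the online cases with their propensity-based weights attached.
    return df[~is_ftf].join(stratum_weight, on="stratum")
```

In practice the choice of covariates, the number of strata and the handling of extreme weights all need care; the sketch simply illustrates the estimate-then-stratify logic described above.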

One of the first major UK studies comparing online and face-to-face data, as opposed to online and telephone research, was designed as a pair of parallel surveys comparing an online panel survey (Harris Interactive) with a face-to-face CAPI omnibus survey (MORI). Five ‘propensity score’ questions asked on each survey covered issues such as online purchasing behaviour, views on the amount of information respondents receive, and personal attitudes towards risk, social pressure and rules.

Question wordings on both surveys were kept as similar as possible, but some adaptations were required to reflect the different interviewing methods. Show cards were used in the face-to-face survey for all questions except for those with a simple ‘yes/no’ or numerical response, and the order of response scales and statements was rotated in both surveys.

The objective of the study was to establish whether data from an online panel survey can be successfully matched to data from a nationally representative face-to-face survey. Specifically, the study aimed to make comparisons at a number of levels.

Once the surveys had been completed, both sets of data were weighted to the correct demographic profile (UK adults aged 15+). In the case of the omnibus survey this involved applying simple rim weights on region, social class, car ownership, and age and work status within gender. For the online survey the demographic weights that were applied were age within gender, ITV region, education level, income level and internet usage (ranging from high to low, measured in number of hours per week).
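As an illustration of the rim weighting mentioned above, the sketch below implements a simple form of iterative proportional fitting (raking) in Python. The weighting variables and target proportions are invented for illustration; they are not the targets used in either survey.

```python
import pandas as pd

def rim_weight(df, targets, max_iter=50, tol=1e-6):
    """Rake weights so weighted margins match the target proportions."""
    w = pd.Series(1.0, index=df.index)
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            # Current weighted distribution of this variable.
            current = w.groupby(df[var]).sum() / w.sum()
            # Adjustment factor per category, mapped back to each respondent.
            factor = (pd.Series(target) / current).reindex(df[var]).to_numpy()
            w = w * factor
            max_change = max(max_change, abs(factor - 1).max())
        if max_change < tol:
            break
    # Scale so the weights average 1 across the sample.
    return w * len(df) / w.sum()

# Hypothetical usage with invented targets:
# targets = {"sex": {"male": 0.49, "female": 0.51},
#            "age": {"15-34": 0.33, "35-54": 0.35, "55+": 0.32}}
# df["weight"] = rim_weight(df, targets)
```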

Several questions were placed on both surveys, with the target questions covering voting intention, socio-political activism, knowledge of/attitudes towards cholesterol, views of immigration and access to technology. These questions were selected to provide a relatively stern test of how close an online survey can get to a face-to-face survey, given that there are likely to be significant mode effects (particularly interviewer effects) and a noticeable impact from any attitudinal bias in the online sample.

The first question area looked at was voting intention. Comparison of unweighted face-to-face and online data shows us what previous studies of online research methodologies have suggested: online respondents are more likely to say they would vote Liberal Democrat or Conservative than their face-to-face counterparts. This is likely to be because of two competing effects seen throughout the study.

It has been hypothesised, and shown to some degree, that online panels tend to achieve samples that are more educated and active. The application of demographic weighting to both sets of data does serve to close the gap between online and face-to-face results. While the face-to-face weighting had very little effect on the data (increasing Conservative and Liberal Democrat support by just one percentage point), the propensity score weighting had a significant impact on the online data (for example, increasing Labour support by eight percentage points) (see Table 1).
It should also be noted that the design effect of propensity score weighting has nearly halved the effective sample size of the fully weighted online data, whereas demographic weighting has very little effect on the face-to-face effective sample size. However, as the original online sample was very large, comparisons are still relatively robust (a difference greater than +/– three percentage points would be significant).
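The effective sample size point can be made concrete with Kish's approximation, n_eff = (Σw)² / Σw². The short sketch below applies it to a hypothetical set of heavy weights; the weight distribution and sample size are invented, not the study's actual figures.

```python
import numpy as np

def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / sum of w^2."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# Example: 2,000 online interviews with a hypothetical skewed weight distribution.
rng = np.random.default_rng(0)
w = rng.lognormal(mean=0.0, sigma=0.8, size=2000)
n_eff = effective_sample_size(w)
moe = 1.96 * np.sqrt(0.25 / n_eff)   # worst-case 95% margin of error
print(f"effective n = {n_eff:.0f}, margin of error = +/-{moe:.1%}")
```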

Attitudes towards immigration have been surveyed by MORI a number of times, and findings have varied greatly by education, social class and general world-view. Further, these questions cover sensitive issues and are likely to be susceptible to eliciting socially desirable responses, particularly when an interviewer is present. These questions were therefore interesting to repeat in the online vs face-to-face experiment, as large differences could be expected. Weighting does not have much effect on either online or face-to-face survey data, and the key finding from these questions is that online survey respondents seem much more inclined to select the neutral point (‘neither agree nor disagree’) than face-to-face respondents. It could therefore be argued that the face-to-face results artificially emphasise opinions, when actually there are few strongly held views on these sensitive, complex issues.

The results on understanding of issues surrounding cholesterol (Tables 10–15) appear to confirm that online respondents are generally better informed than face-to-face samples, with a significantly higher number of online respondents correctly saying that cholesterol is ‘a type of fat that circulates in the bloodstream’. The rating of the seriousness of cholesterol as a health risk clearly illustrates the pattern seen in other studies, where online respondents are less likely to choose extreme options.

Conclusion
Some theories were put forward as to why data from online and face-to-face surveys might differ; however, we need to understand more about why weighting by both demographics and attitudes has varying degrees of success. There seem to be two main competing effects at play when comparing online and face-to-face methodologies. Online research using panel approaches appears to attract a more knowledgeable, viewpoint-orientated sample than face-to-face surveys. This could be because this is a prior characteristic of those with access to the internet or those who join online panels, or it could be a learned behaviour from taking part in a number of surveys. However, face-to-face respondents are more susceptible to social desirability bias due to the presence of an interviewer. Sometimes these effects appear to balance, bringing the outcomes from the two methodologies together, but sometimes they do not. Voting intention is an example of a question area that has been successfully matched online, suggesting that, for some areas of study, well-designed internet-based surveys with appropriate weighting strategies can produce similar results to well-designed face-to-face surveys. However, a number of other question areas are not so encouraging, particularly where the issues are sensitive.

A further note of caution when applying relatively heavy weighting to data sets, such as the propensity score weights used in this study, relates to the design effect and its impact on effective sample size. If such weights are to be used, the sample must be large enough to ensure that the resulting effective sample size will stand up to significance testing. Of course, as the cost per interview is low when a large number of online interviews are conducted, this may not be a problem. Despite these limitations it seems likely that online surveys will grow substantially over the next few years. This is partly because there are doubts over either the capacity for, or the methodological advantages of, traditional methods. First, face-to-face interviewing resources are limited and increasingly expensive. Landline telephone penetration is dropping, with 7% of households currently having no phone or having a mobile only. This proportion is likely to grow fairly significantly and, more importantly, there is significant bias involved, with young households in particular much more likely to rely on mobiles alone. It is therefore important to continue to think about the circumstances in which, and how, internet-based methodologies can be used for data collection, and to develop approaches that will be as robust and representative as possible.

Source: Bobby Duffy and Kate Smith (MORI Online), George Terhanian and John Bremer (Harris Interactive), 'Comparing data from online and face-to-face surveys'