Do previous survey experience and participating due to an incentive affect response quality? Evidence from the CRONOS panel

As ever more surveys are conducted, recruited respondents are more likely to already have previous survey experience. Furthermore, it has become more difficult to convince individuals to participate in surveys, and thus, incentives are increasingly used. Both previous survey experience and participation in surveys due to incentives have been discussed in terms of their links with response quality. This study aims to assess whether previous web survey experience and survey participation due to incentives are linked with three indicators of response quality: item non‐response, primacy effect and non‐differentiation. Analysing data of the probability‐based CROss‐National Online Survey panel covering Estonia, Slovenia and Great Britain, we found that previous web survey experience is not associated with item non‐response and the occurrence of a primacy effect but is associated with non‐differentiation. Participating due to the incentive is not associated with any of the three response quality indicators assessed. Hence, overall, we find little evidence that response quality is linked with either previous web survey experience or participating due to the incentive.


INTRODUCTION
Ever more surveys have been conducted in recent decades, and thus, the share of individuals who have experience with answering surveys is growing (ESOMAR & WAPOR, 2014). For example, in the German context, it was reported that between 2006 and 2009 the share of individuals with previous survey experience rose from 54% to 63% (Vehre, 2011). Based on the data we use for this research, we estimate the share of the adult population who reported having survey experience in 2016-2017 to be at least 70% in Estonia, at least 83% in Slovenia and at least 53% in Great Britain (CROss-National Online Survey panel, 2018). To exclude experience with the survey that was used to recruit respondents for the CROss-National Online Survey (CRONOS) panel, respondents' mentions of surveys in face-to-face mode were disregarded. For the United States, Leeper (2019) reports that the estimated share of individuals who had participated in at least one survey in the year prior to being asked rose from 40% in the 1990s to 60% in 2001. For Germany, based on data from the general population GESIS panel, we arrive at estimates of 14% in 2013 and 21% in 2016 (GESIS, 2020). As individuals are confronted with more and more survey requests, survey fatigue can develop, which is often cited as one reason for declining response rates (Callegaro et al., 2015). Researchers and data collecting entities therefore increasingly employ material incentives to motivate individuals to participate in surveys (Singer & Couper, 2008). Such material incentives mostly consist of cash or shopping vouchers, but also include entries in lottery draws and bonus points that can later be exchanged for gifts. Thus, the expansion of survey data collection has two important implications: Respondents are more likely to (1) have previously participated in surveys and (2) participate in surveys due to the incentive they receive.
Both previous survey experience and participating due to material incentives have been hypothesised to be associated with the quality of respondents' answers. However, for neither factor is it clear whether we should expect a positive or negative association with response quality. On the one hand, previous survey experience could result in survey fatigue and thus lower response quality (Schonlau & Toepoel, 2015). On the other hand, there might also be benefits of previous experience because respondents have familiarised themselves with the survey answering process (Struminskaya, 2016). Such mechanisms could play out in two distinct settings, the first being a setting where respondents are part of a longitudinal study or an opt-in panel and therefore collect experience responding to surveys. Longitudinal studies typically ask respondents the same questions repeatedly over time and are carried out by scientific research organisations, while opt-in panels are prominently employed in market research and are databases of individuals who have agreed to be contacted for future survey data collection. Considerable research has focused on the panel setting, looking at so-called 'panel conditioning', that is, the learning effects that occur due to repeated participation in a particular panel (see e.g. Bach & Eckman, 2018; Struminskaya, 2016). Our research is interested in a different setting, namely investigating what consequences it could have if respondents more broadly happened to collect any previous survey experience before participating in a given survey.
Participating due to an incentive could also have a positive or a negative association with response quality. On the one hand, respondents' extrinsic motivation to obtain the incentive might be associated with lower intrinsic motivation, leading to investing less effort into answering survey questions and thus lower response quality (Matthijsse et al., 2015). On the other hand, respondents who participate due to the incentive might regard survey participation in terms of an economic or social exchange situation and invest more effort, as they feel they are being compensated in exchange for engaging seriously in the task or that they should do so based on norms of reciprocity (Dillman et al., 2014). Some prior empirical research has focused on the association of previous survey participation and incentives with response quality, but the results have been mixed so far (see e.g. Halpern-Manners & Warren, 2012; Medway & Tourangeau, 2015; Petrolia & Bhattacharjee, 2009). Against this background, we argue that further research is needed on the links of previous survey experience and incentives with response quality.
Previous research on these links suffers from various gaps. It has so far only focused on survey experience obtained by answering surveys on the same panel, as opposed to by answering surveys in general. Furthermore, the link between incentives and response quality has mostly been studied with experiments manipulating the incentive amount and/or type. The motivational mechanisms often assumed for effects of providing incentives on response quality have received almost no empirical investigation. The rare studies on this topic use exclusively data from non-probability panels (Achimescu et al., 2017; Matthijsse et al., 2015). As one is unlikely to be able to manipulate respondents' reasons for participation experimentally, we argue that observational studies looking at the association between self-reported reasons for participation and response quality are needed to complement experimental studies manipulating the incentive amount and/or type. Another research gap is that some response quality indicators, such as non-differentiation and primacy effects indicators, remain understudied.
To start filling these gaps, we investigate whether respondents with previous web survey experience provide responses of significantly lower or higher quality than those without experience. Furthermore, we investigate whether respondents who participate due to an incentive provide responses of significantly lower or higher quality than those participating for other reasons. We aim to answer these questions by analysing data of the probability-based CRONOS panel, established across three countries (Estonia, Great Britain and Slovenia) and employing moderate prepaid incentives (around 5€). We use three response quality indicators: item non-response, occurrence of primacy effects and non-differentiation, and assess through regression models whether these quality indicators are affected by having previous web survey experience and by participating due to the incentive.

The link between previous survey experience and response quality
The effects of previous survey experience on response quality are usually discussed with reference to members of online panels, as worries arise that their responses could be influenced by prior surveys they answered on the panel (see e.g. Bach & Eckman, 2018). This is referred to as panel conditioning (Struminskaya, 2020). It is frequently discussed with reference to commercial opt-in (non-probability) panels, as respondents in such contexts are likely to respond to large numbers of surveys (Hillygus et al., 2014; Matthijsse et al., 2015). However, there are also discussions of survey conditioning more broadly. This includes discussions referring to probability-based panels (Schonlau & Toepoel, 2015; Struminskaya, 2016) as well as referring not just to experience with surveys from the same panel but also experience with other surveys (Whitsett, 2013). In any case, even surveys taken in the context of the same panel can differ substantially, especially in opt-in panels, where panelists tend to be invited to answer surveys conducted for different client companies (Revilla, 2017).
The survey experience obtained through answering previous surveys can have negative effects on response quality. For example, Struminskaya (2016) argues that through previous experience with surveys, respondents might have learned how to answer filter questions strategically in order to avoid follow-up questions, thus facilitating satisficing behaviour (see also Duan et al., 2007; Eckman et al., 2014). Halpern-Manners and Warren (2012) find empirical evidence of experienced respondents systematically avoiding follow-up questions. They compare fresh panelists to those who had participated in a previous wave and find that the latter were less likely to report having two or more jobs. This could be due to the respondents' perception that doing so would lead to more follow-up questions. Schonlau and Toepoel (2015) investigate negative consequences of repeatedly answering surveys in the Dutch probability-based Longitudinal Internet Studies for the Social Sciences (LISS) panel because they suspect that it could lead to lower response quality due to fatigue. Indeed, they find that during the first 3 years of panel tenure, non-differentiation (measured by pure straightlining, i.e. respondents giving the exact same answer to different items on a grid type of question) increases. Struminskaya (2016) posits that, next to disadvantageous effects, previous survey experience in a panel can also have advantageous effects on response quality because respondents understand the survey procedure better and are more familiar with the survey process, which induces trust. She finds empirical support for this in a field experiment showing that more experienced respondents in a probability-based online panel of internet users in Germany provided more 'don't know' responses to knowledge questions, which is attributed to respondents giving more honest answers. Further studies find positive effects of repeated interviewing in previous panel waves on response quality.
For example, Uhrig (2012) finds that social desirability effects decrease if respondents had already been asked the same sensitive question one year before in a previous panel wave. Waterton and Lievesley (1989) find that respondents who had answered three previous panel waves respond more honestly to sensitive questions. Besides, these more experienced respondents show fewer refusals to answer income questions and, generally, opt for 'don't know' answers less often than fresh panelists.
The reasoning for why positive and negative panel conditioning can occur is also plausible when considering previous survey experience more broadly, not just survey experience collected on the same panel (Whitsett, 2013). The previous completion of any survey can cause survey fatigue. If the previously taken surveys contain filter and follow-up questions, they can also enable respondents to avoid follow-up questions in the future. We thus have reasons to expect a negative association of response quality with previous survey experience in general. At the same time, previous survey experience with any survey can also increase familiarity with and trust in the survey process. We could therefore also expect a positive association of response quality with previous survey experience collected with any survey. Both negative and positive conditioning are expected to be increasingly pronounced if previous survey experience was collected in an increasingly similar setting. For example, if previous surveys contained the same questions, the expected effects of both fatigue and trust should be more pronounced. Similarly, if previous surveys were taken in the same mode, we would expect respondents to be more able to manipulate the survey instrument to reduce burden.

The link between participating due to an incentive and response quality

The argument has been made that response quality should decrease in surveys that use incentives because those individuals who would not have participated without the incentive are likely to provide lower quality responses (Singer & Ye, 2013). This argument refers to selection effects in the presence of conditional incentives. More broadly, various scholars have expressed concerns that providing incentives could lead to a lack of genuine interest in the survey, resulting in lower response quality, because such rewards represent external motivators to participate (Cole et al., 2015; Keusch et al., 2014; Tourangeau, 2007). Psychologists have studied whether the presence of extrinsic motivation, defined as doing an activity in order to attain some outcome, leads to a decrease in intrinsic motivation, defined as doing an activity for the inherent satisfaction of the activity itself (Ryan & Deci, 2000). More specifically, intrinsic motivation here denotes motivation coming from within a person, arising from their own interests and values without external pressure. Extrinsic motivation, in contrast, represents the willingness to do something due to external factors, such as rewards (Deci et al., 1999; Ryan & Deci, 2000). Deci et al.'s (1999) meta-analysis of 128 studies shows that extrinsic rewards indeed tend to decrease intrinsic motivation. However, if those rewards are provided unconditionally, their presence does not affect intrinsic interest. Following this reasoning, one would expect that incentives provided conditional on participation decrease genuine interest in the survey and therefore also response quality, while incentives provided unconditionally do not affect respondents' genuine interest. In contrast with this, theoretical arguments exist for why we could expect higher response quality through incentives.
Singer and Ye (2013) state that providing incentives could increase response quality because respondents feel rewarded and thus invest more effort into the survey. Based on the norm of reciprocity (Gouldner, 1960), a sense of obligation caused by a reward can push respondents to put in greater effort, respond more honestly and provide more personal information (Medway et al., 2011). Dillman et al. (2014) distinguish between conditional and unconditional incentives and state that they invoke economic and social exchange situations, respectively. They point out that, in situations where incentives are conditional on survey completion, survey participation can be perceived in terms of an economic exchange. We thus speculate that respondents who participate due to the incentive should be more likely to regard the survey situation in terms of such an economic exchange than respondents who participate for other reasons. This could lead to extrinsically motivated respondents putting in more effort and providing higher response quality, as they perceive the situation such that they are compensated for a service they provide. Unconditional incentives invoke social exchange situations in which norms of reciprocity could lead to a sense of obligation to invest effort into the survey task (Dillman et al., 2014; Medway et al., 2011). Again, for respondents participating due to the incentive, this social exchange situation is likely to be more salient than for those with other reasons for participation, and they might thus invest more effort into answering the survey. Hence, unconditional incentives might also result in higher response quality. Whether any of these mechanisms is triggered by the provision of incentives is, however, likely to differ between respondents. Some respondents will become extrinsically motivated when incentives are provided, whereas others will still participate out of intrinsic motivation.
In the survey methodology literature, scholars have previously lamented that there is not enough empirical evidence on the link between the provision of incentives and response quality (Cole et al., 2015; Singer & Ye, 2013). Yet, since then, hardly any studies have been added to the body of literature. Moreover, the few existing studies show mixed findings. While some find a negative effect of incentives on response quality (Barge & Gehlbach, 2012; Tzamourani & Lynn, 1999), others find a positive effect (Cole et al., 2015; Singer et al., 2000). In yet others, findings depend on survey questions (Petrolia & Bhattacharjee, 2009), response quality indicators (James & Bolstein, 1990) or the type of incentive employed (Bosnjak & Tuten, 2003). Furthermore, there are many studies that find no significant effect of providing incentives on response quality (Davern et al., 2003; Göritz, 2004; Heerwegh, 2006; Medway, 2012; Medway et al., 2011; Porter & Whitcomb, 2003; Sánchez-Fernández et al., 2010; Shettle & Mooney, 1999). Singer et al. (1999) also draw mixed conclusions from their review of 13 studies investigating the impact of either cash or gift incentives on response quality. In seven of the 13 studies, no effect of incentives on response quality is found. In the remaining six, incentives lead to higher quality responses. The mentioned studies differ in various aspects. They were conducted in face-to-face, mail, telephone and web mode and used unconditional as well as conditional incentives in the form of vouchers or cash with values between 0.25 and 20 USD as well as lottery and bonus point incentives. The response quality measures include item non-response, inconsistent responses, deviating from instructions in the questionnaire, length of open-ended answers, number of responses in check-all-that-apply questions, rounding responses, rushing and dropout.
From our review of the literature above, we could not deduce any systematic differences in findings conditional on any of these aspects.
While the above review considers studies investigating the link between the provision of incentives and response quality, our focus is somewhat different. Namely, we aim to investigate the link between participating due to an incentive and response quality. We identified only two studies that looked at precisely this link. Matthijsse et al. (2015) use self-reported reason for participation as one variable in their latent class analysis to classify opt-in online panel members into four groups ranging from 'professional respondents', who are extrinsically motivated by conditional incentives to participate in online panel surveys, to 'altruistic respondents', who are intrinsically motivated. They find little evidence that professional respondents provide lower response quality, measured in terms of acquiescence, extreme responding, tendency to choose the middle category and non-differentiation. Achimescu et al. (2017) study opt-in panelists' self-reports of participating in a survey due to the conditional incentives. They use this as a predictor for response quality, measured by five indicators: item non-response, non-differentiation, speeding, invalid text inputs and failing an instructional manipulation check. They find that respondents who are motivated by the incentive show higher response quality on all indicators.

Limitations of previous research and contribution
In terms of the effects of previous survey experience on response quality, studies have so far mostly focused on experience collected on the same panel. We argue that, as ever more individuals answer surveys, it becomes relevant to also investigate the effects that previous survey experience more broadly, that is, not just on the same panel, can have on response quality. Regarding incentives, studies on the effect of providing incentives on response quality exist, but the link between incentives being the reason for participation and response quality has remained understudied. Furthermore, most of the existing studies on the link between providing incentives and response quality have used experimental designs to assess differences between providing an incentive versus not providing one and, in some cases, also between providing incentives of different value. The implicit or explicit assumption in such studies is that differences would be due to differently motivated respondents in the samples. This might be either because different types and levels of motivation can be triggered by providing an incentive and/or because differently motivated respondents choose to participate in the first place if an incentive is offered. The latter is often discussed as a potential source of selection bias, as differently motivated respondents might differ on a range of other (unobserved) characteristics as well. However, empirical measures of the reasons for participation, type of motivation and strength of motivation are scarce (Hillygus et al., 2014). Hence, there is a lack of empirical evidence on the association of either of these factors with response quality. Little empirical evidence exists on the specific link between the reasons for participation and response quality, with the few existing studies focusing exclusively on non-probability panels (Achimescu et al., 2017; Matthijsse et al., 2015).
Furthermore, while item non-response is commonly used as an indicator of response quality, empirical evidence using other response quality indicators, such as non-differentiation, is rarer. On yet other indicators, such as the occurrence of primacy effects, empirical studies are still completely lacking in the literature reviewed here.
Our aim is to address the identified gaps by adding more empirical evidence to the scarce literature on the links between, firstly, previous survey experience and response quality and, secondly, participating due to an incentive and response quality. Our two research questions are thus: (1) Do respondents with previous survey experience provide responses of significantly lower or higher quality? and (2) Do respondents who participate due to an incentive provide responses of significantly lower or higher quality? We study these links in the first survey of a probability-based online panel of adults living in Estonia, Great Britain and Slovenia. In addition to the commonly used response quality indicator item non-response, we also use non-differentiation and the occurrence of primacy effects.

Data
We use data from the CRONOS panel (CROss-National Online Survey panel, 2018), a probability-based online panel encompassing three countries: Estonia, Great Britain and Slovenia. The target population comprised adults (18+) living in private households in these countries (Villar & Sommer, 2017). Participants were recruited during the European Social Survey (ESS) Round 8 interviews fielded between August 2016 and December 2017. CRONOS was established with the aim to create the first cross-national probability-based input-harmonised panel. Estonia, Great Britain and Slovenia were the three countries to participate in this pilot project, with the aim to extend the panel to further countries in the future. The countries were selected to maximise variation in factors that could be decisive for the success of this approach, such as internet penetration and population size (Villar & Sommer, 2017). At the very end of the ESS Round 8 face-to-face interviews, respondents in the three participating countries were asked to join an online panel. Specifically, they were invited to participate in 20-min online surveys every 2 months for about a year. Respondents were told that they would receive a 5€ (Estonia)/5£ (Great Britain)/7€ (Slovenia) voucher for every survey to which they were invited as a token of appreciation. The amount was higher in Slovenia due to practical issues in the implementation. To account for these differences, country dummies were used in the analysis (see Section 4.3).
The incentive was not conditional on participation. Participants who had previously indicated that they did not have internet access for personal use and were willing to participate were provided with a free tablet with internet access (Villar & Sommer, 2017). We use data predominantly from CRONOS wave zero (welcome survey), which was conducted between December 2016 and April 2017. The response rate for this wave was 23% for Estonia, 12% for Great Britain and 29% for Slovenia (Villar et al., 2018). It was calculated as the sum of complete and partial interviews over the total number of eligible sample units, corresponding to AAPOR response rate six (The American Association for Public Opinion Research, 2016). The breakoff rate was 3% for Estonia, 5% for Great Britain and 2% for Slovenia. It was calculated as the number of people who dropped off during the survey over the number of people who started the survey. For panelists who were given a tablet in order to participate, this wave was used as a training instance: Respondents completed the survey while the interviewer was present and ready to assist (Villar & Sommer, 2017). The wave zero survey dealt with topics such as (societal) well-being, social inequality, personality traits, attitudes about science and technology and living conditions. The mean number of questions asked to respondents was 52 and the median completion time was 13 min. One of our variables, participation due to incentive, was constructed based on a question only available in wave five, which was fielded in November and December 2017, hence between 7 and 12 months after wave zero. However, as the question underlying this variable asked explicitly what respondents' initial motivation was to join the panel (see Section 3.2.1), we opted for assessing response quality only in wave zero, since the reasons for participation could have changed later on. 
The analytic sample after listwise deletion of observations with missing values on the covariates for the multivariate regression analysis consisted of n = 1367 respondents. The (non-weighted) sample had the following characteristics: Respondents had a mean age of 48 years and 58% of them were female. Furthermore, 36% had obtained at least a bachelor's degree, while 12% indicated having finished only primary or lower secondary education. The Estonian sample made up 36% of the overall sample, the Slovenian sample 38% and the British sample 26%. Moreover, 7% of the sample received a tablet in order to participate in the panel and thus had an interviewer present during the wave zero survey (see also Supplement S1).

Main independent variables of interest
Our main independent variables are web survey experience and participation due to incentive. We limit our analysis to the previous survey experience which respondents obtained in the same mode (i.e. web). The reasons for this are twofold: firstly, because we expect experience in the same mode to be more relevant for the expected consequences in terms of learning, trust or fatigue; and, secondly, because data on the exact extent of previous survey experience were collected only concerning the web mode in CRONOS.
Web survey experience is a variable indicating how many web surveys respondents had taken at any point prior to joining the CRONOS panel. The questions underlying the web survey experience variable were: (1) 'In which type of surveys have you previously participated? Please select all types of surveys that apply' and (2) for respondents indicating they had previously participated in web surveys, a follow-up question: 'In how many web surveys have you participated before joining this panel? Please enter the number of web surveys in this box. If you are not sure, please give your best estimate'.
Participation due to incentive is a dummy variable that is coded '1' for those respondents who, in response to the check-all-that-apply question 'Why did you INITIALLY agree to participate in this study?', checked the answer option 'because I wanted the gift card', irrespective of whether it was chosen in combination with other answer options or not. For a robustness check, we coded as '1' only those cases who were exclusively motivated by the incentive (see Section 4.3).
For the original formulations of questions that the main independent variables are based on including answer options, see Supplement S2.

Other independent variables
As control variables, we use sex, age, education (three levels) and country dummies for Estonia and Slovenia, with Great Britain as the reference category. We use three further control variables: political interest, frequency of internet use and tablet, capturing whether respondents received a tablet in order to participate in the survey. Having received a tablet is an important control variable because for those respondents who received a tablet, the interviewer was present during the interview for the first wave (wave zero), which is likely to affect response quality. We control for political interest and the frequency of internet use to guard against potential spurious effects, as we think these variables could affect both previous web survey experience and response quality. Politically interested individuals might be more likely to participate in surveys, as they might have stronger opinions on various issues and be more willing to share them. We would also expect politically interested individuals to be more likely to put effort into answering a survey, as they might be more motivated to share their opinions, thus providing higher response quality. Respondents who use the internet more frequently are more likely to have been invited to web surveys previously. Furthermore, their internet skills could make it easier for them to provide high quality responses, as filling in a web survey should be easier for them. In contrast, their skills might also enable them to manipulate the survey instrument to decrease burden, thus potentially leading to lower response quality.

Dependent variables: Response quality indicators
We constructed three response quality indicators capturing three response patterns which indicate low response quality: (1) item non-response, (2) occurrence of primacy effects and (3) non-differentiation (Krosnick, 1991; Lugtig & Toepoel, 2016; Tourangeau et al., 2000). We chose these response quality indicators because they were the most suitable ones given the questions asked in CRONOS wave zero. For example, no open questions were asked, so we could not assess the length of answers to open questions. The item non-response indicator proportion item missing was computed as the proportion of all applicable questions which a respondent skipped without providing any answer. For most questions, when respondents tried to skip a question, they received a prompt asking them to answer to the best of their ability even if unsure, and a 'don't know' and a 'prefer not to answer' option were added to the displayed answer categories. However, respondents could proceed to the next question without selecting any answer (ESS ERIC, 2018). Hence, our variable proportion item missing contains instances in which respondents tried to skip a question, ignored the prompt and proceeded without providing any answer. We also ran a robustness check using an item non-response indicator for which 'don't know' and 'prefer not to answer' responses were also coded as item missing.
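As an illustration, the computation behind proportion item missing can be sketched as follows. This is a minimal sketch, not the actual CRONOS processing code: the data layout and function name are our own assumptions, with None marking a question skipped without any answer even after the prompt.

```python
def proportion_item_missing(answers):
    """Share of applicable questions a respondent skipped without any answer.

    `answers` holds one entry per applicable question; None marks a question
    skipped even after the prompt (illustrative layout, not the CRONOS data).
    """
    if not answers:
        return 0.0
    return sum(a is None for a in answers) / len(answers)

# Example: 2 of 8 applicable questions skipped -> 0.25
print(proportion_item_missing([3, None, 1, 2, None, 4, 5, 2]))
```
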
Primacy effects occur when respondents show a tendency to select the first answer option (Krosnick, 1999; Krosnick & Alwin, 1987). For the primacy effect indicator proportion first answer option, the proportion of all applicable questions for which a respondent selected the first answer option was computed. On its own, this number does not carry much meaning, but when compared between groups while using other variables as controls, that is, in a multivariate regression, it becomes a meaningful response quality indicator.
For non-differentiation, we identified sets of at least five consecutive items using the same answer scale and containing reversely phrased items. Answer patterns with little variance on these items are implausible, an important condition for distinguishing invalid from valid non-differentiation (Reuning & Plutzer, 2020). We identified two suitable sets: one set of six consecutive items about attitudes towards science and technology, and another set of 12 consecutive items assessing personality traits (see Supplement S3 for question formulations). We constructed an indicator taking the mean of the variance of respondents' answers in these two sets of items (non-differentiation indicator = (var(set1) + var(set2)) / 2). To account for the fact that valid non-differentiation could still lead to low variance when respondents opt for the neutral middle category (Reuning & Plutzer, 2020), we did a robustness check employing a different indicator. Based on the aforementioned two sets of items, we constructed a dummy coded '1' for respondents who provided answers on only one side of the scale. Answers on the middle category were ignored here, as they could plausibly occur alongside answers on either side of the scale.
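Both the main indicator and the robustness-check dummy can be sketched as below. This is an illustrative sketch only: the paper does not specify the variance formula, so the use of the sample variance here is an assumption, as are the example scale values.

```python
from statistics import variance

def nondifferentiation(set1, set2):
    """Mean of the answer variances across the two item sets.

    Higher values mean more differentiation, i.e., higher response
    quality. Sample variance is an assumption of this sketch.
    """
    return (variance(set1) + variance(set2)) / 2

def one_sided(answers, midpoint):
    """Robustness-check dummy: 1 if all answers fall on only one side
    of the scale, ignoring the neutral middle category."""
    sides = {a > midpoint for a in answers if a != midpoint}
    return int(len(sides) <= 1)

# A pure straightliner gets the minimum value of 0 on the main indicator
print(nondifferentiation([3] * 6, [2] * 12))  # 0.0
```

Under this sketch, a respondent answering only 4s and 5s on a 1-5 scale (with a few neutral 3s) would be flagged by `one_sided`, while a respondent using both ends of the scale would not.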

Analyses
First, we look at descriptive statistics to see how much web survey experience respondents have, how many of them are participating due to the incentive and what the distributions of the response quality indicators show about the occurrence of undesirable response behaviour. Then, we look at bivariate relationships. Firstly, we inspect the correlations between number of web surveys previously taken and the response quality indicators. We use the Spearman correlation because number of web surveys previously taken is a skewed variable, so the relationships might not be linear. Secondly, we inspect differences between those respondents participating due to the incentive and those participating exclusively due to other reasons on all three response quality indicators. We use t-tests to test whether the weighted means (see description of weights below) differ between groups.
To investigate the association of the independent variables of interest, web survey experience and participation due to incentive, with the three response quality indicators in the presence of control variables, we use three separate multivariate linear regression models. The same predictor variables are used in all models: web survey experience and participation due to incentive as well as the control variables (see Section 3.2.2). We also checked whether the independent variables moderate each other's effects, looking at interactions between each of our main independent variables of interest (web survey experience and participation due to incentive) and the socio-demographic control variables, as well as at the interaction between the two main variables of interest themselves. We checked whether the assumptions of normality of residuals and homoscedasticity hold and tested for multicollinearity. The results of these diagnostics and their discussion are provided in Supplement S4. Indications of violations were found for the model with dependent variable proportion item missing; for this model, an additional analysis using a logged version of the dependent variable was run.
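The analyses were run in Stata, but the structure of a weighted linear regression of one quality indicator on the predictors can be sketched in Python. The synthetic data, the variable names and the direct solution of the weighted normal equations are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Hypothetical predictors: number of previous web surveys and incentive dummy
experience = rng.poisson(2, n).astype(float)
incentive = rng.integers(0, 2, n).astype(float)
X = np.column_stack([np.ones(n), experience, incentive])  # intercept first
# Synthetic outcome: small positive effect of experience, none of incentive
y = 1.0 + 0.01 * experience + 0.0 * incentive + rng.normal(0, 0.1, n)
w = rng.uniform(0.5, 1.5, n)  # survey weights (hypothetical)

def weighted_ols(y, X, w):
    """Weighted least squares: solve the normal equations (X'WX)b = X'Wy,
    where W is the diagonal matrix of observation weights."""
    XtW = X.T * w  # weight each observation's contribution
    return np.linalg.solve(XtW @ X, XtW @ y)

beta = weighted_ols(y, X, w)
print(beta)  # coefficients close to [1.0, 0.01, 0.0]
```

With all weights equal to one, this reduces to ordinary least squares; in the actual analyses, the wave-zero weights described below play the role of `w` and the full set of control variables would be added as further columns of `X`.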
For all analyses, we weight the data with the wave-specific weights provided for wave zero, as recommended by Villar et al. (2018). The weights were constructed by adjusting the post-stratified design weights for ESS Round 8 for non-response at CRONOS wave zero. Furthermore, we use population size weights for all analyses pooling the data from several countries, as recommended in the ESS weighting guide (European Social Survey, 2014). All analyses were done using Stata version 14.1 (StataCorp., 2015).

Distributions of variables of interest
Table 1 presents the distributions of our main variables of interest. Looking at our main independent variables of interest in our final (weighted) sample (n = 1367), the median number of previously taken web surveys is 0. This means that at least half of the sample had never taken a web survey before joining the panel. Furthermore, 20.6% indicated that at least one of their initial motivating factors to join the panel was obtaining the incentive (i.e. the gift card).
Regarding the response quality indicators, there is a low occurrence of item non-response, with at least half of the respondents having skipped none (median = 0) of the applicable questions. We also calculated the mean here for comparability with other studies. It shows that respondents skipped a mean of 0.2% of the applicable questions. We can compare this, for example, with Tzamourani and Lynn (1999), who found item non-response to range between 1% and 4% in a probability-based general population face-to-face survey in Great Britain, though arguably that survey included more sensitive items than CRONOS wave zero. Furthermore, CRONOS respondents chose the first available response option for a mean of 14.8% of all questions they were asked. The mean of our non-differentiation indicator (i.e. the mean variance across the two relevant item sets) is 1.2.

Bivariate results
Table 2 presents the bivariate results: firstly, weighted Spearman correlations between the three quality indicators and the number of web surveys previously taken and, secondly, weighted differences in the three quality indicators between respondents who participated due to the incentive and those who reported exclusively other reasons for participation.
To assess the bivariate relationships between web survey experience (i.e. the number of web surveys previously taken) and the three response quality indicators, we inspect the (weighted) Spearman correlations. We find no significant associations. Having taken more web surveys previously thus does not seem to be associated with any of the assessed response quality indicators (research question one). Looking at differences between individuals who indicated that they initially participated in the panel because of the incentive and those who did not, we find that the non-differentiation indicator, mean variance, is significantly higher among those who participated due to the incentive (p < 0.05). However, substantively, the difference in mean variance is rather small at 0.1. It indicates that those respondents who participate due to the incentive show slightly higher response quality. In terms of bivariate results, we thus find an association between participation due to incentive and response quality when measured as non-differentiation (research question two).

Multivariate results
Table 3 presents the results of the three regression models predicting response quality.
In the model with outcome variable proportion item missing, none of the independent variables of main interest has a significant effect. The Estonia country dummy is significant, showing that respondents in Estonia show less item non-response, on average, than those in Great Britain. As the variable proportion item missing is very skewed, we ran the same model using a logged version of the variable as a robustness check. To do this, we added '1' to all values of the variable proportion item missing to deal with the zero values, as the logarithm of zero is not defined. This did not change the results concerning our main independent variables of interest. We also ran a robustness check in which the item non-response indicator was computed such that 'don't know' and 'prefer not to answer' responses were also counted as item non-response. The effects of the independent variables of main interest remained largely unchanged. A robustness check was also run using a log-transformed version of the independent variable web survey experience to account for the fact that this variable is skewed. The results were not substantively altered.
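The log transformation used in this robustness check, adding 1 before taking the logarithm, can be written out as a one-liner (the function name here is illustrative):

```python
import math

def log_item_missing(p):
    """Logged version of a proportion; adding 1 keeps zero
    proportions defined, since log(0) does not exist."""
    return math.log(p + 1)

print(log_item_missing(0.0))  # 0.0 -> straightlining respondents stay at zero
```

This maps a proportion of 0 to exactly 0 and compresses the right tail of the skewed distribution, which is the purpose of the transformation.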
Regarding proportion first answer option, neither web survey experience nor participation due to incentive are associated with the tendency to choose the first answer option. Age turns out to be a significant positive predictor of the tendency to choose first answer options (p = 0.017). Furthermore, female respondents show less of a tendency to choose first answer options than male respondents (p = 0.010). Moreover, respondents in both Estonia and Slovenia were, on average, significantly more likely to choose first answer options than respondents in Great Britain (both p < 0.001). Respondents who had been provided with a tablet and filled in the survey in the presence of an interviewer also chose on average significantly more first answer options (p < 0.001).
Again, a robustness check was run using a log-transformed version of the independent variable web survey experience. The results were not substantively altered. Finally, the model for non-differentiation shows that having more previous web survey experience is a significant positive predictor of this response quality indicator (p < 0.001). The coefficient of 0.01 shows that respondents with more previous web survey experience tend to have a higher mean variance in answers to the consecutive item sets using the same scale. More precisely, the mean variance increases on average by 0.01 for each additional web survey a respondent has previously taken. Hence, there is evidence from the multivariate analysis concerning research question one, indicating that previous web survey experience is positively associated with response quality when measured as non-differentiation.
Participation due to incentive has no significant effect on this response quality indicator either. There is thus overall no evidence from the multivariate analysis suggesting that participating due to the incentive is associated with response quality (research question two).
Furthermore, more politically interested respondents and respondents who had received a tablet showed on average higher response quality according to the non-differentiation indicator (respectively, β = 0.10, p = 0.012 and β = 0.33, p = 0.034). Female respondents showed on average lower response quality, according to this indicator (β = −0.27, p < 0.001).
Again, we ran a robustness check using a log-transformed version of the independent variable web survey experience. The results were not substantively altered. We also checked if the model would change under another operationalisation of web survey experience, namely a dummy only distinguishing between those who had any previous web survey experience and those who had none. The results show that the web survey experience dummy is not associated with non-differentiation. For response quality measured as non-differentiation, it thus seems to matter how much previous web survey experience a respondent has; having any versus having none does not, on average, make a difference.
We also conducted a robustness check looking at an alternative indicator for non-differentiation (see Section 3.2.3). The results of this model differ from the model in Table 3 in that web survey experience is not a significant predictor, nor is being female or being more politically interested. Instead, here differences between Estonia and Great Britain appear, with Estonians showing on average lower response quality (p < 0.05). Furthermore, the low education dummy is associated with non-differentiation in this model, showing that respondents in this category provide lower response quality than those in the medium education category (p < 0.05). This illustrates that a different operationalisation of non-differentiation can affect which covariates this response quality indicator is associated with.
Furthermore, we conducted robustness checks for all models introducing a dummy predictor participation only due to incentive instead of participation due to incentive (whether in combination with other reasons for participation or not). Results of the models were not substantively altered. The results of all robustness checks are presented in Supplement S5.
Moreover, we looked at interaction effects between our main independent variables of interest and the control variables to find out whether effects might be moderated by them. We find interaction effects between web survey experience and internet use on proportion first answer option as well as between web survey experience and internet use on non-differentiation. Looking at the marginal effects of having web survey experience on proportion first answer option at different levels of internet use shows that the less frequently respondents usually use the internet, the stronger the negative effect that web survey experience has on proportion first answer option (see Figure 1), hence the stronger the positive association of web survey experience and response quality. Inspecting the marginal effects of web survey experience for the model with outcome variable non-differentiation at different levels of internet use shows that web survey experience has a significant positive effect on response quality only for respondents who use the internet every day (see Figure 2). A look at the distribution of web survey experience among the different internet use groups suggests that this could be due to the fact that a large share of respondents (more than 70%) fall into the group who use the internet every day. Furthermore, unsurprisingly, among those using the internet every day, there is a much higher variance in the number of web surveys previously taken.
We also checked for the presence of an interaction effect between our two main independent variables of interest web survey experience and participation due to incentive to see if being extrinsically motivated might only impact response quality for experienced respondents. We did not find this to be the case.
FIGURE 2 Marginal effects of having previous web survey experience on quality indicator non-differentiation among different values of internet use (interaction between internet use and previous web survey experience)
An additional analysis we conducted was running the models separately for each of the three countries. The results are provided in Supplement S6. Findings concerning the role of web survey experience are consistent with the pooled model in all country-specific models. Interestingly, participation due to incentive seems to play a role for response quality in Slovenia, though showing contradictory directions of association for different response quality indicators. We find participation due to incentive to be negatively associated with proportion first answer option (and hence positively associated with response quality). At the same time, we find participation due to incentive to be negatively associated with response quality when using the indicator non-differentiation.

CONCLUSION AND DISCUSSION
The aim of this study was to investigate whether response quality is associated with two factors that are increasingly present as the number of surveys conducted grows. More precisely, our first research question asks whether previous web survey experience is positively or negatively associated with response quality, and our second research question asks whether participating due to an incentive is positively or negatively associated with response quality. Concerning the latter, no evidence was found that participating due to the incentive is linked with either higher or lower response quality. Concerning research question one, the results of the multivariate analyses show that having more previous web survey experience seems to be linked with higher response quality when measured as non-differentiation but not when measured as proportion item missing or proportion first answer options chosen. Thus, we find some, but limited, evidence that previous web survey experience is positively associated with response quality. More specifically, looking at the interaction effect between web survey experience and internet use in the model predicting the proportion of first answer options chosen, it seems that previous web survey experience might further learning of the survey answering process and/or increase trust in survey settings among those respondents who do not usually interact frequently with internet technology. This would also mean that, as frequent internet use becomes ever more widespread, the positive link between previous web survey experience and response quality might disappear. The positive association between previous web survey experience and response quality in terms of non-differentiation is detected particularly for the group of very frequent internet users.
Different dynamics thus seem to be at play here concerning the moderating role that the frequency of internet use can have in the relationship between web survey experience and response quality. Future studies should account for this moderating variable and study these dynamics in more depth.
The results of this study also show that the operationalisation of response quality indicators can make a difference. Specifically, our results showed to be sensitive to a different operationalisation of non-differentiation. Further research about the validity of response quality indicators is thus needed.
Our results concerning both research questions are not in line with previous studies. The finding that previous web survey experience leads to a decrease in non-differentiation behaviour stands in contrast to the findings of Schonlau and Toepoel (2015) who find that experienced panelists showed more non-differentiation behaviour. Possibly, the fatigue the authors hypothesised to be the reason for non-differentiation behaviour is more likely to occur for panelists, who receive the exact same questions repeatedly, than for respondents who took part in different surveys before, as have the respondents in the survey we investigated. Furthermore, the LISS Panel investigated by Schonlau and Toepoel (2015) is a longer survey, taking 15-30 min (Centerdata, n.d.) as compared to a median completion time of 13 min in the survey investigated here. This could also account for the differences in findings. In terms of our other independent variable of interest, participation due to incentive, this study adds to the very sparse empirical literature on its links with response quality. In our study, participation due to incentive was overall not associated with response quality, whereas in the two identified previous studies on this link, one finds a clear positive association (Achimescu et al., 2017) and one a weak negative link (Matthijsse et al., 2015).
Utilising the cross-national character of our study, we showed that response quality differed by country. Both in Estonia and in Slovenia, the tendency to choose the first answer option was more pronounced than in Great Britain. Furthermore, respondents in Estonia showed lower proportions of item missing answers than those in Great Britain. Respondents in Slovenia tended to show less non-differentiation behaviour compared to those in Great Britain. As differences in response quality between countries are salient even in a harmonised panel such as CRONOS, future studies would be well advised to investigate, or at least take into account, national and regional differences in response quality. The finding that participation due to incentive is associated with two response quality indicators for Slovenia is interesting. Slovenia happened to be the country where, due to practical considerations, the incentive was worth a slightly higher amount than originally planned (7€ instead of 5€). Possibly, a 5€/£ incentive is perceived as a token of appreciation, while an incentive of a higher amount might be perceived as compensation for respondents' efforts. It might thus be given higher importance by respondents and have more of an impact on their survey behaviour. Future research would be well advised to investigate the effect of higher incentives, using amounts that are less likely to be perceived as tokens of appreciation and more likely to be perceived as compensation. Furthermore, the fact that response quality was associated with participation due to incentive in opposite directions when considering two different indicators in Slovenia leads us to conclude that the internal consistency of these indicators should be further assessed. It seems worthwhile to investigate if they possibly measure distinct dimensions of response quality.
It should be noted that this study differs from previous studies in terms of various features, and that this could account for the different findings. Variations in the following features could also be observed among the studies reviewed in the background section and could account for the mixed findings there: (a) different survey modes, (b) different types of incentives, (c) different values of incentives, (d) use of conditional versus unconditional incentives, (e) probability versus non-probability samples, (f) different geographical coverage, (g) different response quality indicators, (h) different survey topics and (i) different levels of effort required to complete surveys in terms of length and question difficulty. With an increasing volume of studies on the topic, future research will hopefully be able to systematically take these study features into account.
Our findings are not in line with either of the two opposing theoretical arguments concerning the link between response quality and extrinsic motivation: that response quality is lower where extrinsic motivation is present, as it undermines intrinsic motivation, or that response quality is higher where extrinsic motivation is present, as the perceived economic or social exchange situation should result in taking the task more seriously. To get a better idea of how different reasons for participation affect response quality, future research could employ survey questions targeting the relevant aspects in more detail, for example, by asking about the strength of intrinsic and extrinsic types of motivation as well as about perceptions of the economic and/or social exchange situation in which respondents frame the survey. In this study, we have already taken a closer look at one possible mechanism that could mediate the effect of providing an incentive on response quality, namely that participating due to an incentive, and thus being extrinsically motivated, might be linked with response quality. Future studies would be well advised to zoom in on this and other mechanisms even more closely.
Clearly, this approach has the disadvantage that it has little potential to be studied experimentally. The literature investigating effects of providing incentives on response quality has a stronger basis for causal claims, resting on experimental research, than our observational study. However, a direct experimental manipulation of the type of motivation respondents have to participate in a survey seems difficult to implement in practice. Thus, using an observational approach to investigate a mechanism that is often assumed to underlie the link between providing incentives and response quality is a necessary step if we are to make sense of the mixed findings that experimental studies on the effects of incentives on response quality have produced. We need to acknowledge the tentative nature of our findings due to the observational research design also in terms of research question one. To make definitive causal claims, experimental designs would be needed. For example, an experimental group could take several surveys before response quality is measured and then be compared to a group that was not asked to complete additional surveys. Yet, as participation in surveys cannot be forced, such designs will likely remain suboptimal. For example, selective drop-out or attrition from such studies by respondents who would also otherwise not engage in surveys is to be expected. We thus argue that there is merit in using an observational study like ours, which contains respondents who naturally have different levels of survey experience.
There are further limitations to this study. Firstly, we did not look at response quality indicators for different types of questions (such as knowledge questions, attitude questions or behavioural questions) separately because the data were not suited for this, with, for example, no knowledge-type questions being present. This would be an important aspect for future research to investigate, as some theoretical arguments are question-type specific. For example, we would expect beneficial effects of previous survey experience particularly on response quality in knowledge questions (Struminskaya, 2016). Furthermore, in terms of item non-response, we observed a low incidence of behaviours indicating low response quality. This is likely due to the fact that CRONOS respondents are relatively engaged in the survey task, keeping in mind that they already agreed to take part in the ESS survey and, subsequently, agreed to take part in the CRONOS panel as well. Moreover, for most questions, CRONOS respondents received prompts when trying to skip a question. Future research using data with more variation in question type and in response quality would thus be useful.
Secondly, the variable participation due to incentive was only collected in a later wave, between 7 and 12 months after the wave for which we analyse response quality. However, the relevant question did explicitly ask for respondents' initial motivation to join the panel. Survey questions asking for retrospective assessments so long after the period they refer to are likely to contain more measurement error, due to limits in respondents' recall ability (Bradburn et al., 1987). This could, at least partly, account for our null findings concerning the second research question. Another consideration is that social desirability might result in respondents not answering the question about their reason for participation truthfully, which could also explain the null findings. Studies show that social desirability bias can affect web surveys, but that it tends to be lower in this mode than in others (Kreuter et al., 2008; Persson & Solevid, 2014). Still, future studies could aim to reduce social desirability further, for example, by including a question introduction communicating to respondents that researchers consider all kinds of reasons for participation to be legitimate. Moreover, future studies could also look at behavioural measures of participating due to the incentive, such as respondents cashing in an incentive they received in the form of a voucher, to reduce the problem of social desirability in the measurement of reason for participation.
Another limitation arises due to the fact that a total of 385 cases had to be dropped from the analysis because of missing data on one of the variables used in the multivariate analysis. For most of these cases (n = 300), we were lacking the wave five measure for participation due to incentive. This could be a reason for the null findings. For example, those respondents who participated due to the incentive and then delivered low response quality might have dropped out of the panel before wave five at higher rates such that those cases would be systematically missing in our analyses. The relatively high number of cases with missing data as well as the suspicion that these cases could be missing systematically also reduces the generalisability of the results to the populations under study.
Lastly, our three response quality indicators only measure some aspects of response quality. Ideally, future research would look at a larger set of indicators, especially including the quality of open answers as well. This was not possible in this study because such questions were not asked in the CRONOS wave zero survey. Moreover, it would be interesting to investigate more sophisticated indicators of measurement quality. For example, Medway (2012) employed measurement invariance testing to check if the same latent traits were measured in the experimental group receiving the incentive and in the control group. Comparing measurement quality, defined as the product of reliability and validity (Saris & Andrews, 1991), between these two groups would also provide deeper insights. However, this requires the implementation of Multitrait-Multimethod experiments which were not present in CRONOS wave zero.
To conclude, this study shows that in a panel like the probability-based CRONOS panel covering Estonia, Great Britain and Slovenia and using incentives of moderate value (around 5€), researchers seem to have little reason to worry about extrinsic participation motivation affecting response quality. Nor does the ever more widespread previous web survey experience seem to pose any threat to response quality, measured as item non-response, occurrence of primacy effect or non-differentiation, on such a panel. In contrast, previous web survey experience might have the potential to enhance response quality by reducing non-differentiation. Such learning effects deserve further investigation. Moreover, more data should be collected on respondents' reasons to participate, strength of motivation and the way in which they frame the exchange situation with the data collection entity to advance our understanding of which survey design features are conducive to obtaining high response quality.