
This article presents the results of an experiment designed to compare item-level nonresponse rates on paper- and web-based versions of a survey questionnaire, focusing on attitudinal variables, that was administered to highly accomplished teachers. A sample of teachers reported their perceptions of professional community; half were assigned to a web-based version of the questionnaire, and the other half to a paper-based version. In both groups, the survey implementation procedures reflected Dillman’s (2007) Tailored Design Method. Item-level nonresponses were compared between groups for overall rates and for differential response rates by demographics, item position, item format (i.e., branching items and respondent-provided text), and item content. Results revealed small differences in item-level nonresponse rates, both overall and in comparisons between demographic groups. In addition, there was no evidence of differential item-level nonresponse by item position and item content. However, item-level nonresponse rates were considerably higher for fill-in-the-blank items on the web-based questionnaire.





Increasingly, survey researchers and practitioners are opting to use web-administered questionnaires due to the potential for reduced administration cost, shorter duration of data collection, and automation of data collection, scoring, and reporting. As this shift occurs, however, potential concerns arise about the comparability of results, specifically the reliability or consistency of measures, between administration media. In particular, considerable attention has been directed toward concerns that web-based surveys, although they can be broadcast more widely, may achieve lower unit-level response rates (e.g., higher noncontact and nonreturn rates) (Cook, Heath, & Thompson, 2000; Fricker & Schonlau, 2002; Sheehan & McMillan, 1999). By contrast, few studies have addressed other potential sources of noncomparability between mailed and web-based surveys. The purpose of our study is to go beyond the literature reporting general nonresponse rates and more specifically determine whether respondents who choose to respond to web- or paper-administered surveys differ with respect to the frequency with which they leave items unanswered and the types of items to which responses are not provided.

Prior research indicates that item-level nonresponse rates (i.e., items that are left unanswered on returned questionnaires) are not high, although unanswered items in returned questionnaires may impact item- and scale-level inferences. In one of the more extreme examples, Bosnjak and Tuten (2001) determined that up to 36% of respondents may leave at least one item unanswered. Further, Wolfe (2003) found that, taking into account potential item nonresponses, marginal percentages of estimated drug purchasing rates among teens could range from as low as 1% to as high as 11%. However, it is clear from previous research concerning item-level non-responses that several features of the questionnaire may increase the likelihood of item-level nonresponse. Questions that request personal and sensitive information have higher rates of nonresponse—item-level nonresponse rates of 7% have been observed for questions relating to sexual orientation and income (Gruskin, Geiger, Gordon, & Ackerson, 2001) and ranging from 7% to 14% for questions focusing on illicit drug use and purchases (Kadushin, Reber, Saxe, & Livert, 1998; Wolfe, 2003). Questions that follow branching instructions are also frequently left unanswered (Messmer & Seymour, 1982). Item-level nonresponses also vary across item formats, particularly in web-based questionnaires (Healey, 2007; Smyth, Dillman, Christian, & Stern, 2006). The format and content of the response options provided for a question may also influence item-level nonresponse rates. For example, item nonresponse varies with the number of scale points (Leigh & Martin, 1987), allowance of multiple selections (e.g., to race/ethnicity questions) (Brener, Kann, & McManus, 2003), and provision of “don’t know” options (Johanson, Gips, & Rich, 1993).

Other research indicates that characteristics of the respondent may impact item nonresponse. Most research in this area indicates that those with less education and lower social class tend to produce higher item-level nonresponse (Alvik, Haldorsen, & Lindemann, 2005; Craig & McCann, 1978; Gruskin, Geiger, Gordon, & Ackerson, 2001; Guadagnoli & Cleary, 1992; Kupek, 1998, 1999). Also, older respondents tend to leave more questions unanswered (Colsher & Wallace, 1989; Craig & McCann, 1978; Gruskin, Geiger, Gordon, & Ackerson, 2001; Messmer & Seymour, 1982). Results have been mixed concerning gender effects (Colsher & Wallace, 1989; Guadagnoli & Cleary, 1992; Messmer & Seymour, 1982) and race effects (Gruskin, Geiger, Gordon, & Ackerson, 2001) for item-level nonresponse. Although the literature discusses these characteristics separately, it is likely that these issues are dependent on one another and are not as easily separated in operational settings (i.e., they interact to influence item-level nonresponse).

An important issue that is not clear from prior research is the nature of item-level nonresponse rates in web-based administration of surveys. While several studies exist that compare unit-level response rates between paper- and web-based surveys, few studies exist that directly compare item-level response rates between these two survey media. Research studies report only small differences in overall item-level nonresponse rates between telephone- or paper- and web-based surveys (Bongers & van Oers, 1998; Fricker, Galesic, Tourangeau, & Yan, 2005; Mehta & Sivadas, 1995; Pettit, 2002; Stanton, 1998; Tse, 1998). Detailed studies of differential item nonresponse rates for paper- and web-based surveys with respect to demographic groups, item format, and item content have not been undertaken. Our study provides a direct comparison of the nature of item-level nonresponses for web-based and paper-based versions of a questionnaire.

Although previous research has not focused on the extent to which item-level response rates vary by demographic group, item position (i.e., the serial position of the item), item format (e.g., items requesting respondent-supplied text), and item content (e.g., attitudinal items) on web-based versus paper-based questionnaires, there is reason to be concerned about whether these aspects of the survey context are associated with differential item-level nonresponses. Specifically, parameter estimates may be biased if survey respondents differ in the decisions that they make about responding to survey items between the two media as a function of these contextual features. Given previous evidence of demographic differences in item-level nonresponse (e.g., Gruskin, Geiger, Gordon, & Ackerson, 2001), it appears useful to examine whether these effects differ between media. Similarly, practical, perceptual, and motivational factors may impact how respondents approach the task of answering questions in different survey administration media (e.g., Dillman, 2007), suggesting that item characteristic effects may also be important to examine. For example, prior research already indicates that, within survey administration media, questions elicit different levels of item-level nonresponse as a function of the sensitivity of the question’s content, whether the item is associated with branching, and the question or response formatting. Hence, an important extension of this research is to determine whether item-level nonresponse rates vary between survey administration media, in order to address concerns about such bias in the survey results. In our study, we focus on whether the effects of item position, format, and content on item-level nonresponse differ between web- and paper-administered questionnaires.

To examine these issues, the present study addresses the following research questions:

Research Question 1 (Overall): Do overall item-level response rates differ between web-based and paper-based instruments?

Research Question 2 (Demographic): Do item-level response rates vary across demographic groups on web-based versus paper-based instruments?

Research Question 3 (Item Position): Do item-level response rates vary by item position on the questionnaire on web-based versus paper-based instruments?

Research Question 4 (Item Format): Do item-level response rates vary by item format on web-based versus paper-based instruments?

Research Question 5 (Item Content): Do item-level response rates vary by item content on web-based versus paper-based instruments?


Method

Our study focused on responses of a sample of award-winning elementary and secondary teachers who replied to a survey examining their perceptions of professional community in either a paper- or web-administered questionnaire.


We drew a pair of simple random samples, 750 teachers each from the states of Ohio and South Carolina (in the United States), from a roster of teachers (Pre-K through 12th grade) who had been awarded certification by the National Board for Professional Teaching Standards (NBPTS). These states were purposely sampled because of the high number of National Board Certified Teachers (NBCTs) in these states as well as substantive differences in the manner in which NBPTS certification is rewarded in these states (i.e., one provides remuneration while the other grants prestige). The roster contained the mailing and e-mail addresses of each teacher. Of the original sample, four individuals were not NBCTs (i.e., our nonmember error rate was less than 1%), one indicated he/she did not wish to participate in the survey (i.e., our refusal rate was less than 1%), and 558 did not respond to the survey request, for an overall response rate of 62%. Of these nonrespondents, about half were nondeliverables, and these rates were comparable between survey media. An important outcome concerns differential response rates for the two contact media. Specifically, we obtained a response rate of 81% for those asked to complete the paper questionnaire and 44% for those asked to complete the web questionnaire. Table 1 summarizes the demographic characteristics of the two samples. Generally, the samples were similar on the measured variables.

Table 1.  Respondent Characteristics

Characteristic    Level                    Paper     Web
State             South Carolina           48.8%     48.9%
Sex               No Response              0.5%      2.1%
Race              No Response              1.7%      3.6%
                  Native American          0.6%      0.6%
Education         No Response              1.1%      1.5%
                  Master’s + additional    72.5%     73.6%
Years Teaching    Mean                     18.99     19.91

Note: Percentages represent 609 paper-based and 328 web-based respondents.
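
As a quick arithmetic check, the response rates reported for the two media can be recomputed from the counts given in the text. The sketch below assumes 750 teachers per medium (half of each state's 750-teacher sample, following the randomization into two equal-sized groups) and assumes that the overall rate is computed against all 1,500 sampled teachers; the variable and function names are illustrative only.

```python
# Counts taken from the text; the per-medium denominator of 750 is an assumption
# based on each state's sample of 750 being split evenly between the two media.
SAMPLED_PER_MEDIUM = 750
TOTAL_SAMPLED = 2 * SAMPLED_PER_MEDIUM

paper_respondents = 609
web_respondents = 328
non_members = 4        # sampled individuals who were not NBCTs
refusals = 1           # declined to participate
nonrespondents = 558   # never replied (about half were nondeliverable)


def percent(numerator, denominator):
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Every sampled teacher should fall into exactly one of the reported outcomes.
accounted_for = (paper_respondents + web_respondents
                 + non_members + refusals + nonrespondents)

paper_rate = percent(paper_respondents, SAMPLED_PER_MEDIUM)
web_rate = percent(web_respondents, SAMPLED_PER_MEDIUM)
overall_rate = percent(paper_respondents + web_respondents, TOTAL_SAMPLED)

print(f"accounted for: {accounted_for} of {TOTAL_SAMPLED}")
print(f"paper: {paper_rate:.1f}%  web: {web_rate:.1f}%  overall: {overall_rate:.1f}%")
```

Under these assumptions the rounded values agree with the 81%, 44%, and 62% reported in the text, and the outcome categories sum to the full sample of 1,500.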


Participants responded to a questionnaire that focused on their perceptions of, and engagement in, activities relating to the strength of the professional community of teachers in their schools and states, along with some demographic variables. Two versions of the questionnaire were created. First, a paper-based version of the questionnaire was created (shown in Appendix A). Next, a web-based version of the same questionnaire was created (shown in Appendix B) with effort to preserve the appearance—both in form and in content—of the paper-based questionnaire, so that we could determine the comparability of item nonresponse rates in the absence of web-based features designed to deter item-level nonresponse. Still, the two questionnaires were not identical—they differed in the following ways: (a) coloring of banding within tabular sets of questions, (b) border style of tabular sets of questions, (c) use of bracketed and underlined “response boxes” on the paper-based version versus the use of clickable response circles or boxes on the web-based version, (d) page numbering and page breaks on the paper-based version versus continuous, scrolled text on the web-based version, (e) branching questions on the paper-based version used arrows to indicate visual paths in addition to the verbal indicators (e.g., “Go to question X”) that appeared on both versions, and (f) conclusion of the paper-based version with “thank you” versus conclusion of the web-based version with a “submit” box (the latter not including a ‘thank you’ was an oversight by the researchers). In addition, depending on the screen width, the number of lines of text for each question may have been different, potentially resulting in more line wrapping in the web-based version.

The reliability of measures from this instrument with this population is supported in multiple ways by Chard (2005) and Chard and Wolfe (2007). Those studies reveal that: (a) the intended subscale structure is realized in confirmatory factor analyses, (b) the subscale reliabilities are suitable for research purposes (i.e., they range from .66 to .86), (c) rating scale category use is consistent with the intent of the instrument authors, (d) the items function as intended (i.e., item-total correlations and model-data fit statistics are appropriate), and (e) items function comparably between gender groups (i.e., there is no differential item functioning).


Participants within each state were randomized into two equal-sized groups. Half of the sample in each state received only paper-based contacts and questionnaires via postal mail, and the other half received contacts via email that directed them to a web-based questionnaire. Data collection took place during October and November of 2004. Survey implementation followed Dillman’s (2007) Tailored Design Method. Specifically, using that method, the paper-based group received the following contacts via U.S. Mail: (1) a personalized prenotice letter that briefly explained that a survey would be arriving soon, along with a $2 bill incentive, (2) (one week later) a questionnaire, personalized letter of instruction, and a preaddressed stamped envelope for returning the questionnaire, (3) (one week later) a personalized thank you/reminder, and (4) (two weeks later) a second questionnaire in an envelope with a letter similar to the initial questionnaire mailing. The web-based group received the following contacts via email, with the exception of the prenotice, which was sent via U.S. Mail: (1) the same prenotice letter sent to the paper-based sample¹, (2) (one week later) personalized instructions for the web-based questionnaire with an embedded link to it, (3) (one week later) a personalized thank you/reminder with an embedded link to the web questionnaire, and (4) (one week later) a second personalized reminder with an embedded link to the web questionnaire. The fourth contact was only utilized for those who did not respond after the first attempt.


Our analyses focused on the demographic variables and 80 items that were not answered conditionally, based on a response to a branching item. For each item, observed codes were converted to dummy codes indicating whether the individual responded or did not respond to that item. For analyses that employed linear modeling, an arcsine transformation was applied to nonresponse proportions, so that proportions are distributed more normally as is appropriate to linear modeling assumptions (Kirk, 1995). Eta-squared (η²) was computed as an effect size index by dividing the Type III sum of squares for each effect by the total sum of squares. Model assumptions were checked and seemed reasonable for each of these analyses.
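
The data preparation described above can be illustrated with a minimal sketch. It is not the authors' code: the toy responses are hypothetical, the arcsine transformation is assumed to take the common variance-stabilizing form 2·arcsin(√p) (the text does not state the exact form), and eta-squared is computed for a model with a single medium factor, where the effect sum of squares is simply the between-group sum of squares.

```python
import math


def nonresponse_indicators(answers):
    """Dummy-code one respondent's answers: 1 = item unanswered, 0 = answered."""
    return [1 if answer is None else 0 for answer in answers]


def arcsine_transform(proportion):
    """Assumed form of the arcsine transformation: 2 * asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(proportion))


def eta_squared(groups):
    """Effect sum of squares divided by total sum of squares (one factor).

    `groups` maps a group label to a list of scores.
    """
    all_scores = [score for scores in groups.values() for score in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    ss_effect = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_effect / ss_total


def transformed_rates(respondents):
    """Arcsine-transformed proportion of unanswered items per respondent."""
    rates = []
    for answers in respondents:
        flags = nonresponse_indicators(answers)
        rates.append(arcsine_transform(sum(flags) / len(flags)))
    return rates


# Hypothetical toy data: each inner list is one respondent's answers to 5 items,
# with None marking an item that was left blank.
paper = [[3, 4, None, 2, 1], [1, 2, 3, 4, 5], [2, None, None, 1, 3]]
web = [[None, None, 2, 1, None], [4, 4, None, 2, 2], [1, None, 3, None, 5]]

scores = {"paper": transformed_rates(paper), "web": transformed_rates(web)}
effect_size = eta_squared(scores)
print(f"eta squared for medium = {effect_size:.3f}")
```

With more than one factor in the model, Type III sums of squares would normally come from a statistics package rather than from this hand computation.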

The analyses relating to Research Questions 1 (Overall) and 2 (Demographic) focused on responses to all 80 items. Follow-up analyses for Research Question 1 omitted numerical fill-in-the-blank questions (e.g., estimated number of students in a school). The dependent variable for these analyses was the arcsine proportion of item-level nonresponses across these items. For Research Question 1, a general linear model containing a medium (web vs. paper) main effect was fitted to the data. For Research Question 2, a main and interaction effect was added for each demographic variable of interest: (a) teaching experience (in years), (b) gender, (c) education level, and (d) mother’s education. Analyses were not performed for race because sample sizes for subgroups were small.

Analyses relating to Research Question 3 (Item Position) focused on the same set of items in Research Questions 1 and 2 but were conducted at the descriptive level only. We examined the Pearson correlation between item position and the arcsine transformed proportion of item-level nonresponses. We also examined the proportion and average length of noncompletion (i.e., those with a string of nonresponses to items that extended to the end of the questionnaire).

The analyses addressing Research Question 4 (Item format) focused on only factual and demographic items. Because of the small number of items within each item type (only three items and four items, respectively) only descriptive statistics are reported for between-media differences concerning errors in routing following conditional questions (i.e., answering conditional questions that were not supposed to be answered, or not answering conditional questions that were supposed to be answered) and provision of respondent-supplied text (e.g., items requesting a written fill-in-the-blank).

Analyses relating to Research Question 5 (Item Content) focused on the item type-by-survey medium interactions from a linear model of the arcsine of item-level nonresponse rates. Items (n = 77) were categorized according to question focus (i.e., measures of attitude—48 items, factual information—23 items, or demographics—6 items.). Attitudinal items were subclassified according to the type of rating requested of the respondents: agreement with statements, importance of issues, and influence that the respondent can exert. The attitudinal items were also subclassified according to whether or not respondents were provided with a forced choice (i.e., provision or nonprovision of a “do not know” or “not applicable” option). The 23 factual items were subclassified according to their type of rating (i.e., check one, numerical, fill-in-the-blank, or Yes / No). Demographic items were not subjected to subclassification.


Results

Overall Item-Level Nonresponse

Descriptive statistics concerning overall item-level nonresponse rates, which address Research Question 1 (Overall), are shown in Table 2. Item completion rates were high, with respondents in both media answering about 98% of the questions. However, item-level nonresponse rates were slightly higher for web surveys than for paper surveys overall, a statistically significant difference with a small effect size [F(1,935) = 14.92, p < .01, η² = .02]. Similarly, a slightly higher percentage of the paper respondents completed all of the items. On the other hand, removing numerical fill-in-the-blank items from consideration reveals a slightly higher item-level nonresponse rate for paper surveys rather than web surveys, another statistically significant difference with a small effect size [F(1,935) = 9.54, p < .01, η² = .01]. Finally, the Pearson correlation between the arcsine transformation of the percentages of item-level nonresponses across items for the web- and paper-based forms equals .24, indicating that the pattern of item nonresponse tended to differ across forms.

Table 2.  Overall Item-Level Nonresponse Rates for Paper and Web Surveys

Statistic                                     Paper     Web
Average % of Items Answered                   98.6%     98.0%
Average % of Items Not Answered               1.4%      2.0%
Average N of Items Not Answered               1.1       1.7
% Respondents Answering All Items             59.4%     47.3%
Average % of Non-NFITB Items Not Answered     1.5%      0.9%
Average N of Non-NFITB Items Not Answered     1.1       0.6

Note: NFITB = Numerical fill-in-the-blank. N(items) = 80 and N(non-NFITB items) = 71. Percentages are averaged across 609 paper-based and 328 web-based respondents.

The results relating to Research Question 2 (Demographic) indicate that item-level response rates do not vary in a meaningful manner across demographic groups. Specifically, none of the two-way interactions between survey medium and demographic groups, nor the main effects for demographic group, were statistically significant for respondent gender, education level, or mother’s education level, and all effect sizes were small (i.e., none of the η² values were greater than .005).

Item Sequencing, Noncompletion, and Item-Level Nonresponse Runs

The results associated with Research Question 3 (Item Position) provide little evidence of a relationship between item position on the questionnaire and the probability of item-level nonresponse. Specifically, the Pearson correlations between item position on the questionnaire and the arcsine transformed item-level nonresponse proportions equal −.13 for the paper survey (i.e., paper respondents tended to distribute their item-level nonresponses fairly uniformly across items; r² = .02) and −.26 for the web survey (i.e., items close to the end of the web questionnaire were slightly less likely to be answered than were items close to the beginning of the web questionnaire; r² = .07). Indeed, noncompletion rates were slightly higher for the web survey than for the paper survey (1% vs. 0.5%), with the relative risk of noncompletion for web respondents compared to paper respondents being 2.44 (i.e., web respondents were 2.4 times more likely to be noncompleters). However, this difference was not statistically significant (χ²(1) = 1.72, p = .19). On the other hand, the web respondent noncompleters left a greater portion of the items unanswered (39% vs. 15% for respondents who took the paper survey but did not complete it). This difference was not subjected to a hypothesis test because there were only four noncompletions in each group.
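
Several of the quantities in this paragraph follow from simple formulas, sketched here as an illustration rather than as the authors' procedure. The p-value uses the identity that, for one degree of freedom, the upper-tail chi-square probability equals erfc(√(x/2)). The counts passed to `relative_risk` are hypothetical, because the text does not report the underlying noncompletion counts in enough detail to reproduce the value of 2.44.

```python
import math


def chi_square_p_value_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    Uses P(X > x) = erfc(sqrt(x / 2)), which holds only for df = 1.
    """
    return math.erfc(math.sqrt(statistic / 2.0))


def relative_risk(events_a, n_a, events_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (events_a / n_a) / (events_b / n_b)


# Reported test statistic for the noncompletion comparison.
p_value = chi_square_p_value_df1(1.72)
print(f"p = {p_value:.2f}")

# Squared correlations (shared variance) for the reported Pearson r values.
r_paper, r_web = -0.13, -0.26
r2_paper, r2_web = r_paper ** 2, r_web ** 2
print(f"r^2 paper = {r2_paper:.2f}, r^2 web = {r2_web:.2f}")

# Hypothetical counts, for illustration only: 10 of 100 versus 5 of 100.
example_rr = relative_risk(10, 100, 5, 100)
print(f"example relative risk = {example_rr:.2f}")
```

The computed p-value and squared correlations round to the reported values of .19, .02, and .07.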

Conditional Errors and Supplied Text

Table 3 displays statistics relating to the exploratory analysis of the tendency for respondents to make errors when responding to conditional items and to supply respondent-specified text (Research Question 4—Item format). These statistics indicate a higher probability of routing errors on conditional items for paper surveys—the relative risk of such a mistake for paper surveys versus web surveys equals 2.13. It is important to note that two conditional items extended over a page break on the paper survey (items #33 and 42 – see Appendix A), which may have caused some respondent confusion. On the other hand, paper and web respondents were equally likely to provide responses to respondent-supplied text items; the relative risk for providing such text was approximately 1.04 (i.e., nearly equal proportions of responses) between the two survey media.

Table 3.  Item-Level Conditional Error and Supplied Text Omission Rates for Paper and Web Surveys

Trend                 Item Type    N items    Paper     Web
Conditional Errors    Factual      2          3.5%      1.1%
Supplied Text         Factual      2          85.4%     79.7%

Note: Percentages indicate the percentage of the sample that either committed a routing error or chose not to provide text to fill-in-the-blank options, averaged across items.

Item Types

Table 4 displays statistics relating to Research Question 5 (Item Content). With respect to the content focus of the questions, web respondents exhibited higher levels of item nonresponse to factual items while paper respondents exhibited (slightly) higher nonresponse to attitude and demographic items, differences that are both statistically significant and moderate in size according to Cohen’s (1988) guidelines [F(2,1870) = 58.05, p < .01, η² = .04]. Analysis of the attitudinal items indicated no meaningfully large differences in item-level nonresponse rate for paper versus web respondents with respect to either the focus of the rating [F(2,1870) = 2.49, p = .08, η² = .002] or the inclusion of an “NA/DNK” option [F(2,935) = 5.87, p = .02, η² = .006]. Although the latter comparison is statistically significant, the effect size is small. Finally, in response to factual items, web respondents were considerably more likely to exhibit item nonresponse to numerical fill-in-the-blank items (perhaps due to leaving an item blank when meaning zero²) while paper respondents were slightly less likely to respond to “Yes / No” and “Check One of the Following” question formats [F(2,1832) = 169.04, p < .01, η² = .08].

Table 4.  Item-Level Nonresponse Rate Comparisons for Item Types Between Paper and Web Surveys

Comparison            Item Type    N items    Paper     Web
Attitudinal Focus     Agreement    27         1.6%      0.7%
Attitudinal Forced    Forced       30         1.5%      0.5%
Factual               Yes / No     12         2.1%      1.4%
                      Check One    2          1.2%      0.9%

Notes: FITB = fill-in-the-blank. Percentages indicate the percentage of the sample that did not respond to the questions of each type, averaged across questions.


Discussion

Our results lead to several conclusions concerning populations of highly competent teachers responding to questionnaires that focus on their beliefs about teaching. First, the results relating to Research Question 1 (Overall) indicate that there are only small differences between the overall rates of item-level nonresponse when questionnaires are administered in a paper- versus a web-based medium to this population; only about 2% of the items were not answered in this study in each medium. These numbers are consistent with previous comparisons of overall item-level nonresponses between these two media (Mehta & Sivadas, 1995; Stanton, 1998; Tse, 1998) and are much lower than those observed in surveys focusing on more sensitive topics (Gruskin, Geiger, Gordon, & Ackerson, 2001; Kadushin, Reber, Saxe, & Livert, 1998; Wolfe, 2003). Similarly, our results indicate that a slightly higher percentage of paper respondents may answer all items and that the percentages in our study are about double those observed elsewhere (Bosnjak & Tuten, 2001), perhaps because of a combination of our selective sample of NBCTs and our incentive for responding.

Concerning Research Question 2 (Demographics), unlike research results that have focused on other populations, there were no large demographic differences with respect to item-level nonresponse rates for these accomplished teachers. Our study focuses on a fairly well-educated population, so it is not surprising that we observed no differences relating to the level of education of our teachers. Other studies that have observed education differences have tended to focus on populations that were more variable in terms of education level (Alvik, Haldorsen, & Lindemann, 2005; Craig & McCann, 1978; Gruskin, Geiger, Gordon, & Ackerson, 2001; Guadagnoli & Cleary, 1992; Kupek, 1998, 1999). Prior research concerning item-level nonresponse levels by gender has produced mixed results (Colsher & Wallace, 1989; Guadagnoli & Cleary, 1992; Messmer & Seymour, 1982). Our study adds to this body of literature indicating that, for populations of accomplished teachers, administration medium has no meaningfully large differential effects on item-level nonresponse rates for education or gender.

Concerning Research Question 3 (Item Position), the results of our exploratory analyses indicate a slightly higher tendency for web respondents to leave unanswered items toward the end of the questionnaire and to leave a greater proportion of the questionnaire unfinished than was the case for paper respondents. However, noncompletion rates were only slightly greater for web respondents. Prior research has not investigated the potential of differential effects of item position on item nonresponse between web- and paper-based questionnaires.

Concerning Research Question 4 (Item Format), routing errors were more common on conditional items among paper respondents. In our study, this outcome may have been due to the positioning of a page break for two conditional items (an important practical design consideration for paper questionnaires), which is consistent with other studies of branching errors on paper-based questionnaires (Messmer & Seymour, 1982). Because our web-based questionnaire did not use features that could have reduced the number of branching errors (e.g., explicit redirection after a response to a trigger question), our results suggest that the use of a single scrolled page may by itself decrease the rate of item-level nonresponse due to routing errors. We also found no statistically significant differences between paper and web respondents with respect to the likelihood of providing respondent-supplied text for items requesting such optional information.
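Routing errors on a conditional item can be identified by comparing the response to the trigger question with the response to its follow-up. The following Python sketch shows one possible classification; the labels, the "yes"/"no" coding, and the function name are assumptions made for illustration and do not describe how responses were coded in this study.

```python
from typing import Optional


def classify_routing(trigger: Optional[str], follow_up: Optional[str]) -> str:
    """Classify one trigger/follow-up pair.

    Assumes the follow-up should be answered only after a "yes" trigger response.
    """
    if trigger is None:
        return "unclassifiable"  # trigger itself unanswered
    if trigger == "yes" and follow_up is None:
        return "omission"  # follow-up skipped although it applied
    if trigger == "no" and follow_up is not None:
        return "commission"  # follow-up answered although it did not apply
    return "consistent"


# Hypothetical responses to one conditional item.
pairs = [("yes", "weekly"), ("yes", None), ("no", None), ("no", "monthly"), (None, None)]
print([classify_routing(t, f) for t, f in pairs])
```

Only the omission case adds to item-level nonresponse; a blank follow-up after a "no" response is a legitimate skip and should not be counted as missing.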

Results relating to Research Question 5 (Item Content) indicate that web respondents were considerably less likely to respond to numerical fill-in-the-blank items (e.g., items that require the respondent to identify the number of students or teachers in the school belonging to a particular group). We believe this occurred because web respondents chose to leave these items blank when the answer was “zero,” and the fact that the reported averages for these items were always greater for web respondents supports that notion. An alternative explanation could be that these items place a higher cognitive demand on the respondent, and some feature of the response medium increased the probability that web-based respondents would choose to leave these items unanswered (e.g., the paper version could be carried by the respondent to a physical location to look up the necessary information, while those at a computer would be more inconvenienced by finding that information). On attitudinal items, item-level nonresponse rates were comparable for items asking for different types of judgments (e.g., agreement with a statement vs. importance of a statement) and for items forcing a choice versus items allowing for “not applicable” or “do not know” responses, a result that is consistent with prior research comparing item-level nonresponses between paper and web media (Smyth, Dillman, Christian, & Stern, 2006).

There are important limitations that should be kept in mind when interpreting these results. First, although we attempted to make the two versions of the questionnaire as comparable as possible, the two versions neither perfectly replicate one another nor are perfect implementations of the questionnaire in their respective media. One potentially important difference relates to the space provided for several of the items: The paper version involved a wider item column than the web version, and thus the items carried over to a second or third line less often for the paper version. This created a difference in item appearance, with the web version formatting creating greater density of text, which may have impacted item nonresponse. For instance, this greater density may have reduced respondent willingness to read and answer each item, leading to the slightly higher item-level nonresponse rate observed for the web survey. On the other hand, this increase in the number of lines per item meant the response options for adjacent items were separated more in the web version, potentially reducing the likelihood of accidentally skipping an item for the web survey (given that it would be easier to detect a missing response). Thus, the effect of this difference is not entirely clear, but it should be considered in interpreting the current results. In addition, conditional items were presented somewhat differently for the two versions. In the paper version, arrows were used to direct respondents to subitems (and in one case no specific guidance was given following a “No” response); in the web version, short written descriptions were used. This difference may have also contributed to the conditional item results, in addition to the page break issue discussed above.

The second issue concerns the appropriate population for generalizing these findings. Our study seeks to generalize its results to a population of accomplished teachers, that is, individuals who are fairly highly educated, experienced in developing assessments, and likely to be computer-savvy. These teachers represent a fairly homogeneous group with respect to gender and race. Thus, perhaps it is no surprise that demographic differences between media were relatively minor in our study, although more generally speaking, differences have been found to exist in the demographic characteristics of paper- versus web-based unit-level respondents (Yun & Trumbo, 2000). But given the homogeneous population we were studying, there perhaps were other individual-difference variables that would indicate which teachers decided to respond. For instance, perhaps those with more computer experience and access to a computer would be more likely to respond to a web survey request, and those teachers who are more conscientious would be more likely to respond to a mail survey request. Third, the information requested in our questionnaire contained relatively innocuous questions, leaving unaddressed the issue of the degree to which requests for sensitive information may elicit differential item-level nonresponse rates across survey administration media. Fourth, most of the results presented here are exploratory: the comparisons are based on questions from an existing questionnaire rather than on items that were designed to address group differences hypothesized a priori based on sample and item characteristics (in fact, substantive theory on group-by-measure interactions is completely lacking). As a result, the implications that we present here are tentative, but are guided by a useful framework that could be helpful in conducting future research of this nature.

Within these limitations, our results suggest several potential implications for survey researchers who desire to utilize web-based questionnaires that focus on populations and attitudinal measures similar to those examined in this study. First, existing concerns regarding the comparability of item-level response rates between paper- and web-based questionnaires, an important indicator of the reliability of measures made in these two survey media, may be unwarranted. For example, our study suggests that there are only small differences in overall item-level nonresponse rates between web- and paper-based surveys and that these rates are consistent across demographic groups within the relatively educated, professionally accomplished, and computer-savvy population of teachers. Similarly, noncompletion rates may be comparable between these two survey administration media.

However, a second implication may be that the choice of administration medium has a differential impact on respondent motivation. For example, our study suggests that there may be differences in the number of omitted items for surveys that were not fully completed. Not only are web respondents less likely to complete all items than paper respondents, but when web respondents lose motivation, they may be more likely to leave larger portions of the questionnaire blank. In our study, because we utilized a single “scrolled” page displaying the items, the observed nonresponse patterns may have been due to the fact that respondents could not “save” work and return to it at a later time, leading web respondents to be more likely to abandon the questionnaire. However, it is unclear whether presenting the items on separate “pages” allowing for a partial save of the data would improve the situation because respondents could potentially abandon the survey due to uncertainty of how many questions remain to be answered. Future research could modify web-based questionnaires to manipulate some of these factors that might affect respondent motivation. Designers of web-based survey instruments may need to take these issues into account during instrument development.

A third implication of our results concerns the formatting of routing instructions for conditional questions. Web-based questionnaires may make it easier to cue respondents to the proper routing through conditional questions. In this study, this was true even though explicit controls were not programmed into the web-based questionnaire (e.g., “pop-up questions” or using text “graying” to conceal questions that should not be answered). Survey instrument developers may want to take advantage of the greater flexibility that the web screen offers in the coloring and placement of arrows and text, so that respondents can navigate conditional questions more easily and make fewer routing errors.

The fourth implication of our results concerns fill-in-the-blank questions that request counted estimates. One of the most apparent outcomes of this study is the fact that numerical fill-in-the-blank items produced a considerably higher item-level nonresponse rate on web-based questionnaires than on paper-based questionnaires. We believe that this is because web respondents left those blanks empty when the implied answer was “zero,” while paper respondents wrote in the numeral zero. Hence, web formats with numerical fill-in-the-blank items may need to provide a direct prompt for the individual to enter zero when that is the intended response, so that these values are not treated as item-level nonresponses, which would upwardly bias the resulting numerical estimates.
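The direction of this bias can be illustrated with a small worked example. The Python sketch below uses hypothetical counts, not data from this study, and assumes that `None` marks a blank entry: when some respondents leave a blank in place of zero and blanks are treated as missing, the mean of the remaining values is larger than the mean of the intended values.

```python
from statistics import mean
from typing import Optional, Sequence


def mean_of_answered(values: Sequence[Optional[int]]) -> float:
    """Mean over non-blank entries, treating blanks as item-level nonresponse."""
    answered = [v for v in values if v is not None]
    if not answered:
        raise ValueError("no answered entries")
    return mean(answered)


def mean_with_blank_as_zero(values: Sequence[Optional[int]]) -> float:
    """Mean when every blank is assumed to be an intended zero."""
    if not values:
        raise ValueError("no entries")
    return mean(0 if v is None else v for v in values)


# Hypothetical case: five respondents whose intended answers are 0, 0, 0, 2, 4.
paper_entries = [0, 0, 0, 2, 4]        # zeros written in
web_entries = [None, None, 0, 2, 4]    # two intended zeros left blank

print(mean_of_answered(paper_entries))       # mean of the intended values
print(mean_of_answered(web_entries))         # inflated by dropping blank zeros
print(mean_with_blank_as_zero(web_entries))  # recovers the intended mean
```

Recoding blanks as zero is defensible only if every blank reflects an intended zero; an explicit prompt on the form avoids having to make that assumption after the fact.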

  1. The prenotice with incentive was sent to the web-based group via U.S. Mail in order to minimize potential differences in response rate due to receiving the incentive.

  2. Follow-up analyses support this notion because the numerical fill-in-the-blank items with the largest differences in item-level nonresponses between the two survey media produce the largest mean difference in the reported numerical values between web and paper survey responses, with those values always being greater for the web responses. For example, when responding to Question 37 concerning how many NBPTS certified teachers they discuss instructional issues with each week via list servers, chat sessions, or discussion boards (a fairly low frequency activity), only about 83% of the web respondents entered a zero (compared to 97% for the paper respondents).


References
  • Alvik, A., Haldorsen, T., & Lindemann, R. (2005). Consistency of reported alcohol use by pregnant women: Anonymous versus confidential questionnaires with item nonresponse differences. Alcoholism: Clinical and Experimental Research, 29, 1444–1449.
  • Bongers, I. M. B., & Van Oers, J. A. M. (1998). Mode effects on self-reported alcohol use and problem drinking: Mail questionnaires and personal interviewing compared. Journal of Studies on Alcohol, 59, 280–285.
  • Bosnjak, M., & Tuten, T. L. (2001). Classifying response behaviors in web-based surveys. Journal of Computer-Mediated Communication, 6, 114.
  • Brener, N. D., Kann, L., & McManus, T. (2003). A comparison of two survey questions on race and ethnicity among high school students. Public Opinion Quarterly, 67, 227–236.
  • Chard, L. A., & Wolfe, E. W. (2007). Assessing teachers’ collective responsibility for student learning. Unpublished Research Report. ETS.
  • Chard, L. B. (2005). Using multidimensional item response theory to examine measurement equivalence: A Monte Carlo investigation. Unpublished doctoral dissertation, Michigan State University, East Lansing, MI.
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.
  • Colsher, P. L., & Wallace, R. B. (1989). Data quality and age: Health and psychobehavioral correlates of item nonresponse and inconsistent responses. Journals of Gerontology, 44, 45–52.
  • Cook, C., Heath, F., & Thompson, R. L. (2000). A meta-analysis of response rates in web- or internet-based surveys. Educational and Psychological Measurement, 60, 821–836.
  • Craig, C. S., & McCann, J. M. (1978). Item nonresponse in mail surveys: Extent and correlates. Journal of Marketing Research, 15, 285–289.
  • Dillman, D. A. (2007). Mail and internet surveys: The tailored design method (2nd ed.). Toronto: John Wiley & Sons, Inc.
  • Fricker, R. D., & Schonlau, M. (2002). Advantages and disadvantages of Internet research surveys: Evidence from the literature. Field Methods, 14, 347–367.
  • Fricker, S., Galesic, M., Tourangeau, R., & Yan, T. (2005). An experimental comparison of web and telephone surveys. Public Opinion Quarterly, 69, 370–392.
  • Gruskin, E. P., Geiger, A. M., Gordon, N., & Ackerson, L. (2001). Characteristics of nonrespondents to questions on sexual orientation and income in a HMO survey. Journal of the Gay & Lesbian Medical Assn, 5, 21–24.
  • Guadagnoli, E., & Cleary, P. D. (1992). Age-related item nonresponse in surveys of recently discharged patients. Journals of Gerontology, 47, 206–212.
  • Healey, B. (2007). Drop downs and scroll mice: The effect of response option format and input mechanism employed on data quality in web surveys. Social Science Computer Review, 25, 111–128.
  • Johanson, G. A., Gips, C. J., & Rich, C. E. (1993). “If you can’t say something nice”: A variation on the social desirability response set. Evaluation Review, 17, 116–122.
  • Kadushin, C., Reber, E., Saxe, L., & Livert, D. (1998). The substance use system: Social and neighborhood environments associated with substance use and misuse. Substance Use and Misuse, 33, 1681–1710.
  • Kirk, R. E. (1995). Experimental design: Procedures for the behavioral sciences. Pacific Grove, CA: Brooks/Cole Publishing.
  • Kupek, E. (1998). Determinants of item nonresponse in a large national sex survey. Archives of Sexual Behavior, 27, 581–594.
  • Kupek, E. (1999). Estimation of the number of sexual partners for the nonrespondents to a large national survey. Archives of Sexual Behavior, 28, 233–242.
  • Leigh, J. H., & Martin, C. R. (1987). “Don’t know” item nonresponse in a telephone survey: Effects of question form and respondent characteristics. Journal of Marketing Research, 24, 418–424.
  • Mehta, R., & Sivadas, E. (1995). Comparing response rates and response content in mail versus electronic mail surveys. Journal of the Market Research Society, 37, 429–439.
  • Messmer, D. J., & Seymour, D. T. (1982). The effects of branching on item nonresponse. Public Opinion Quarterly, 46, 270–277.
  • Pettit, F. A. (2002). A comparison of world-wide web and paper-and-pencil personality questionnaires. Behavior Research Methods, Instruments & Computers, 34, 50–54.
  • Sheehan, K. B., & McMillan, S. (1999). Response variation in e-mail surveys: An exploration. Journal of Advertising Research, 39, 45–54.
  • Smyth, J. D., Dillman, D. A., Christian, L. M., & Stern, M. J. (2006). Comparing check-all and forced-choice question formats in web surveys. Public Opinion Quarterly, 70, 66–77.
  • Stanton, J. M. (1998). An empirical assessment of data collection using the internet. Personnel Psychology, 51, 709–725.
  • Tse, A. C. B. (1998). Comparing the response rate, response speed and response quality of two methods of sending questionnaires: e-mail vs. mail. Journal of the Market Research Society, 40, 353–361.
  • Wolfe, E. W. (2003). Using logistic regression to detect item-level non-response bias in surveys. Journal of Applied Measurement, 4, 234–248.
  • Yun, G. W., & Trumbo, C. W. (2000). Comparative response to a survey executed by post, e-mail, and web form [Electronic Version]. Journal of Computer-Mediated Communication, 6. Retrieved March 3, 2007, from
About the Authors
  1. Edward W. Wolfe is an Associate Professor in the Educational Research and Evaluation Program at Virginia Tech. His research focuses on measurement applications, particularly in the areas of computer-based measurement and rater effects. Address: 313 East Eggleston Hall, Virginia Tech, Blacksburg, VA 24061.

  2. Patrick D. Converse is an Assistant Professor in the School of Psychology at the Florida Institute of Technology. His research areas include self-regulation, personnel selection, and survey methodology. Address: School of Psychology, Florida Institute of Technology, 150 W. University Blvd., Melbourne, FL 32901-6975.

  3. Frederick L. Oswald is an Associate Professor in the Department of Psychology at Rice University. His research focuses on measurement, modeling and prediction of job performance, development and refinement of personality and other non-cognitive measures, and meta-analysis. Address: Department of Psychology, Rice University, 6100 Main St – MS205, Houston, TX 77007.



Appendix A

Paper Questionnaire

[Figure: images of the paper questionnaire (eight images)]

Appendix B

Web Questionnaire

[Figure: images of the web questionnaire (eight images)]