The Internet and Social Participation: Contrasting Cross-Sectional and Longitudinal Analyses
A doctoral student in the Human Computer Interaction Institute at Carnegie Mellon University. Her research interests include residential mobility, everyday use of information communication technologies and their impact on sociability and psychological well-being, and issues of survey methodology as it relates to Internet research. She recently spent a summer working with the Intel People and Practices research group studying long distance movers and their patterns of technology use. Her research is supported by an NSF graduate fellowship. More information is available at http://www.cs.cmu.edu/~irinas
Herbert A. Simon Professor of Human-Computer Interaction at Carnegie Mellon University. He received his Ph.D. in Social Psychology from Yale University in 1973, and has previously taught at the University of Pennsylvania and Cornell University. He was a research scientist at AT&T Bell Laboratories and Bell Communications Research for twelve years. Dr. Kraut has broad interests in the design and social impact of computing and conducts research on everyday use of the Internet, technology and conversation, collaboration in small work groups, computing in organizations and contributions to online communities. More information is available at http://www.cs.cmu.edu/~kraut.
the founding Director of the Pew Internet & American Life Project. Since December 1999, the Washington D.C. research center has examined how people's Internet use affects their families, communities, health care, education, civic and political life, and workplaces. It has issued more than 80 reports based on surveys and other research on these social issues and important public policy questions such as trust and privacy online, e-government, intellectual property, broadband adoption, and the digital divides. All the Project's reports and data are available at: http://www.pewinternet.org. Prior to receiving the Pew grant, Rainie was managing editor of U.S. News & World Report. He is a graduate of Harvard College and has a master's degree in political science from Long Island University.
Address: Pew Internet & American Life Project, 1100 Connecticut Ave., Suite 710, Washington, DC 20036. Tel: (202)557-3463
The Internet opens new options for communication and may change the extent to which people use older communication media. Changes in the way people communicate are important, because communication is the mechanism people use to develop and maintain social relationships, so valuable for their physical and mental health. This paper uses data from a national panel survey conducted in 2000 and 2001 to examine the influence of Internet use on communication and on social involvement. In doing so, it contrasts the conclusions one can draw from cross-sectional and longitudinal data on these issues. Longitudinal analyses provide stronger evidence of the causal effects of using the Internet than do the cross-sectional ones. The longitudinal data show that heavy use of the Internet is associated with reductions in the likelihood of visiting family or friends on a randomly selected day. Cross-sectional analyses show high correlations between the frequency with which respondents communicate with specific family members by visits, phone calls and email, suggesting that communication in one medium stimulates the others. In contrast, longitudinal analyses suggest that the links between communication media are asymmetric: visits drive more email communication and phone calls drive more visits, but email drives neither phone calls nor visits.
If communication dominates Internet use for a majority of its users, there is good reason to expect that the Internet will have a positive social impact, both in terms of its users' social integration in a network of family, friends and community and the benefits that flow from this integration. However, there is controversy in the research literature about whether use of the Internet increases or decreases users' social participation and the psychological and health benefits people generally receive from this participation. Some optimistic reports claim that using the Internet leads to the emergence of a new social circle (e.g., Kraut et al., 2002; Turkle, 1997) and the development of deep and long-lasting social relationships on-line (McKenna, Green & Gleason, 2002) and that it augments involvement in existing communities by providing new social spaces for communication (e.g., Katz & Aspden, 1997; Wellman et al., 2001). In contrast, other analyses suggest that frequent Internet use has negative social outcomes. In these studies, frequency of Internet use was associated with increases in depression and social isolation (Kraut et al., 1998) and declines in spending time with family and friends and in attending social events (Nie et al., 2002).
Most of the claims, both positive and negative, about the way the Internet changes social participation are based on evidence from cross-sectional surveys, comparing individuals who have Internet access to those who do not have it, comparing heavier users of the Internet with lighter users, or comparing earlier adopters to later ones. One should not draw conclusions about causal relationships from this research. As Singer and Willett (2003) argue, “[t]o model change, you need longitudinal data that describe how each person in the sample changes over time. We begin with this apparent tautology because too many empirical researchers seem willing to leap from cross-sectional data that describe differences among individuals … to making generalizations about change over time (p. 9).” Applying this logic when reviewing the literature on the social impact of the Internet, Nie argues, “Internet users do not become more sociable; rather, they already display a higher degree of social connectivity and engagement, due to the fact that they are better educated, better off financially, and less likely to be among the elderly” (Nie, 2001).
This paper has two goals. The first is to bring new data to bear on the question of whether use of the Internet changes people's level of social participation and some of the benefits that flow from social participation. The second is to illustrate the different conclusions that come from cross-sectional versus longitudinal data on this topic.
Consequences of Using the Internet
One goal of this paper is to bring new data to bear on the question of how use of the Internet is changing social participation. Current research has shown conflicting evidence about the effects of the Internet on communication and social relationships. Several arguments have been advanced concerning the impact of Internet use on social involvement. One, dating back to early concerns of anti-social effects of Internet use, is that Internet use takes time away from positive social interactions thus having a negative effect on social relationships and psychological well-being (Kraut et al., 1998; Nie et al., 2002). This argument, dubbed the hydraulic or crowding-out effect of Internet use, had been supported by results by Nie and colleagues (Nie et al., 2002; Nie & Hillygus, 2002), showing that Internet users spent significantly less time interacting face-to-face with family and friends than non-users. Many writers have worried that the ease of Internet communication might encourage people to spend more time alone, talking online with strangers or forming superficial “drive by” relationships, at the expense of deeper discussion and companionship with existing friends and family (e.g., Putnam, 2000, p. 179). Given the constraints of a 24-hour day, the inefficiency of online communication may cause heavy users of the Internet to maintain under-developed social relationships with their online communication partners (e.g., Parks & Roberts, 1998) or to maintain a smaller stock of relationships. A further concern is that even when conversing with close friends and family, lower quality, online conversations might displace higher quality, face-to-face and telephone conversations (Cummings, Butler & Kraut, 2002). For example, the impoverished nature of online communication can cause people to omit the social niceties that promote or maintain social relationships (e.g., Brennan, 1991; Cummings et al., 2002; Sproull & Kiesler, 1991).
In contrast, other researchers have argued that Internet use has important positive social effects on individuals (McKenna et al., 2002; McKenna & Bargh, 2000), groups and organizations (Sproull & Kiesler, 1991), communities (e.g., Hampton & Wellman, 2001), and society at large (e.g., Hiltz & Turoff, 1978). Much of the research literature favors the proposal that Internet-based modes of communication augment existing modes of communication, providing more facets for social interaction and expanding our ability to communicate and keep in touch (Cole & Robinson, 2002; Katz & Rice, 2002; Kestnbaum, Robinson, Neustadtl & Alvarez, 2002). Robinson and his colleagues (Robinson, Kestnbaum, Neustadtl & Alvarez, 2000), for example, argue that Internet users actually spend more time socializing with family and friends when compared to non-users. They reported that compared to non-users, Internet users spent more time communicating face-to-face and over the phone and less time watching TV and sleeping. Because the Internet permits social contact across time, distance, and personal circumstances, it allows people to connect with distant as well as local family and friends, co-workers, business contacts, and strangers who share similar interests. Broad social access could increase people's social involvement, as the telephone did in an earlier time (e.g., Fischer, 1992).
Whether Internet use influences social participation and relationships is an important policy question. A long empirical research literature in psychology shows that social participation, including contact with neighbors, friends, and family, and participation in social groups, generally improves people's social support, their probability of having fulfilling personal relationships, their sense of meaning in life, their self-esteem, their commitment to social norms and to their communities, and their psychological and physical well-being (e.g., Cohen & Wills, 1985; Diener, Suh, Lucas & Smith, 1999; Thoits, 1983; Williams, Ware & Donald, 1981). Some researchers have suggested that Internet use might have different effects on social involvement depending on personality or base-line levels of social participation. For example, McKenna & Green (2002) proposed that introverts gain more socially from Internet use than do extroverts. They argued that anonymity and privacy in text-based communication provide shy and introverted individuals with a way to overcome difficulties in communication, thus compensating for difficulties fostered by personality (McKenna, 1999; McKenna & Bargh, 2000). Sproull and Kiesler (1991) have suggested that distant or marginal members gain most by electronic participation in groups and organizations.
In an effort to look more closely at whether personality moderates the effects of Internet use on social involvement, Kraut et al. (2002) examined extroversion, a stable personality characteristic that is itself associated with many forms of social interaction. They found that extroverts benefited more socially from higher levels of Internet use than did introverts. Few other studies, however, have collected personality measures along with measures of Internet use and social involvement, and so this result has not yet been replicated. One of the goals of the present research is to examine whether effects of Internet use on social involvement vary with extroversion.
To examine these issues, we describe the results of a national panel survey carried out in 2000 and 2001 by the Pew Internet & American Life Project (http://www.pewinternet.org/). This survey, developed jointly with the HomeNet project at Carnegie Mellon University (http://homenet.hcii.cs.cmu.edu/), collected data about Internet use and social participation from a national sample at two time periods. These data allow us to examine whether the use of the Internet changes such basic elements of social participation as the methods that people use to communicate with their social partners and the off-line social activities they engage in together. We also examine the impact of Internet use on respondents' perceptions of the social support they receive. Because introverts and extroverts may use the Internet differently and using the Internet may have different consequences for them, we will also examine whether this personality variable moderates any impact that Internet use has on social participation.
Although theory predicting the impact of the Internet on social participation has stressed its role as a communication tool, most studies have measured Internet use as a whole, not distinguishing communication from other uses (Cole & Robinson, 2002; Katz & Rice, 2002; Kraut et al., 2002; Kraut et al., 1998; Nie & Hillygus, 2002). At the base of both the “hydraulic” and the “augmentation” hypotheses is a supposition that communication over the Internet will influence the amount or quality of communication using other modalities. The hydraulic hypothesis is that Internet communication substitutes for communication by telephone or in person, while the augmentation hypothesis is that Internet communication stimulates these other modes of communication. In the first part of this paper, we will examine the impact of Internet use as a whole on social participation. In the second part, we examine the extent to which the main Internet communication service, electronic mail, inhibits or stimulates communication by telephone and in person with particular social partners, and reciprocally, how these other communication modalities influence email use.
A Comparison of Survey Research Methods
The methodological goal of this paper is to illustrate how research design can influence the conclusions one can draw about the effects of Internet use. To do so, we contrast cross-sectional analyses and hierarchical linear growth models for each dependent variable collected at two time periods.
As discussed earlier, most claims about the social impact of Internet use are based on evidence from cross-sectional analyses. As Nie and Hillygus (2002) note, pre-existing differences among heavy and light users may account for differences among them in social participation. It is well known, for example, that Internet users and non-users differ in demographics, attitudes, values, and other factors (see United States Department of Commerce, 2002 for evidence on demographic differences). Although some recent research suggests that the disparity between users and non-users is becoming smaller on some dimensions (United States Department of Commerce, 2002; Cummings & Kraut, 2002), important demographic differences still exist. For example, the online population is still younger, better educated and richer (Howard, Rainie & Jones, 2001). These are demographic characteristics that are also associated with social participation.
Researchers use multivariate regression techniques to attempt to control statistically for pre-existing differences among Internet users and non-users in cross-sectional samples. However, these statistical controls are invariably inadequate. Because of errors in measurement, statistical methods never completely remove the effects of demographic and other control variables included in the analyses. In addition, many relevant variables, such as extroversion and other relatively stable predispositions toward social behavior, are rarely measured and hence are unavailable for statistical control. As a result, associations between Internet use and social outcomes found in cross-sectional studies can be spurious, attributable to pre-existing differences in third variables that are associated with both Internet use and the social outcome. For example, Carroll and his colleagues (Carroll, Rosson, Kavanaugh, Dunlap, Schafer & Snook, in press) have shown that extroversion is associated with both Internet use and a variety of types of social behavior. As noted earlier, most cross-sectional research does not control for extroversion.
Even if it were feasible to demonstrate convincingly a non-spurious link between Internet use and social participation in cross-sectional data, it is impossible to establish the causal direction of such a link from data collected at a single point in time. Some researchers have used structural equation modeling to claim causal relationships based on cross-sectional data (e.g., LaRose, Eastin & Gregg, 2001). However, this technique does not test whether the modeler's assumptions about causal direction are correct.
Longitudinal analyses of panel data offer more solid ground for making causal claims. Panels consist of data collected multiple times from the same individuals. Very few studies have used panel data to examine the impact of Internet use, although they are becoming more frequent (e.g., Gershuny, 2002; Kraut et al., 2002; Kraut et al., 1998). Appropriate analysis of panel data examines the way a putative causal variable is associated with changes in a putative outcome. The hypothetical data plotted in Figure 1, showing the probability of heavy and light Internet users engaging in some social behavior at two times, illustrates the fundamental premise, that differences among groups at one time tell little about changes in the groups over time. In Figure 1A, heavy Internet users initially participate more than light users, and their participation grows faster than that of light users. In Figure 1B, heavier users participate more than light users in the first period, but they increase more slowly, so that by the second time period light users have surpassed them in social participation. In Figure 1C, the groups do not differ initially, but over time heavy Internet users increase their social participation while light users decrease theirs.
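The three scenarios described for Figure 1 can be made concrete with a small sketch. The probabilities below are invented for illustration (they are not taken from Figure 1 or from the survey), and the function name `summarize` is hypothetical. For each scenario the sketch computes the gap between heavy and light users at the first time period, which is what a cross-sectional analysis sees, and the difference between the two groups in change over time, which is what a longitudinal analysis sees.

```python
# Hypothetical probabilities (time 1, time 2) of engaging in a social behavior.
# These values are illustrative assumptions, not data from the paper.
scenarios = {
    "A": {"heavy": (0.50, 0.70), "light": (0.30, 0.40)},
    "B": {"heavy": (0.50, 0.55), "light": (0.30, 0.60)},
    "C": {"heavy": (0.40, 0.50), "light": (0.40, 0.30)},
}


def summarize(groups):
    """Return (time-1 gap, difference in change), both heavy minus light."""
    heavy_t1, heavy_t2 = groups["heavy"]
    light_t1, light_t2 = groups["light"]
    cross_sectional_gap = heavy_t1 - light_t1
    change_gap = (heavy_t2 - heavy_t1) - (light_t2 - light_t1)
    return round(cross_sectional_gap, 2), round(change_gap, 2)


results = {name: summarize(groups) for name, groups in scenarios.items()}
for name, (gap, change) in results.items():
    print(f"Figure 1{name}: time-1 gap = {gap:+.2f}, difference in change = {change:+.2f}")
```

In this sketch, scenarios A and B have the same time-1 gap but changes of opposite sign, and scenario C has no time-1 gap yet a clear difference in change, which is the point the figure is meant to convey.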
The ability to evaluate the same people over time significantly reduces at least one threat to causal inference—that pre-existing differences among individuals account for differences in the outcome variable. Because the same people are examined more than once, they bring the same demographic, personality and other cross-sectional differences to each data collection episode, effectively controlling for variation among individuals in both measured and unmeasured variables.
We acknowledge at the outset that longitudinal designs are not panaceas. They are still subject to validity threats. Other events co-varying with time may drive both changes in Internet use and changes in outcomes. These covariates can be internal to the individuals, such as learning or maturation, or external, such as the business cycle or change in popular culture. In addition, pre-existing differences among individuals may lead some of them to be more susceptible to change. Also, because of errors of measurement, pre-existing differences among participants are never fully statistically controlled in longitudinal designs. Only experimental research, in which participants are randomly forced to use the Internet or are prevented from using it, can lead to pure inferences about causation. However, true experiments are difficult to perform when one is seeking to examine broad social effects in the population or when examining the impact of technology on phenomena, such as the development of friendship, that are likely to emerge only after long periods of time. In addition, as Kraut et al. (2002) demonstrated, random assignment of participants to use the Internet may no longer be possible, at least in the United States among people who own a personal computer. In their study, over 80% of individuals randomly assigned to a non-Internet control group subscribed to Internet service on their own.
In the rest of this paper, we will first consider the question of the influence of overall Internet use on sociability in general, comparing results obtained from cross-sectional and longitudinal analyses. We will then focus on a particular relationship, investigating whether communicating in one modality with a selected individual changes communication with that partner in other modalities.
Data and Methods
In this paper we re-examine panel data originally collected by the Pew Internet and American Life Project — a daily tracking survey on Americans' use of the Internet conducted in March 2000 and a callback survey conducted among the same group in March 2001. Three thousand five hundred and thirty three adults, 18 or older, completed the 2000 interview, representing 35% of those who were in the original random-digit dial sample. Telephone interviewers attempted follow-ups a year later with all respondents who completed the original interview. Successful follow-up interviews were conducted with 1,501 (42%) of the original respondents. This is a panel dataset — a one-year study of America's Internet use and social habits from a large and diverse sample of adults in the United States1.
Data were collected at both times via telephone interviews conducted in English. Among other topics, the interviews asked about respondents' amount and style of Internet use, their evaluation of Internet use, the extent to which they engaged in social activities and the activities they participated in during the day prior to the interview. The interview also asked specific questions about the quality of relationships and amount of social support derived from one family member and one friend with whom participants communicate most heavily by electronic mail. For the purposes of this paper, we are most interested in questions that assess how extensively respondents used the Internet, the extent to which they engaged in social activities, the level of social support they had available, and the extent to which they communicated with a family member by email, telephone and in person.
Most questions were asked of all respondents at both time periods, although a subset of questions was added in 2001. In addition, the telephone interviews employed a skip pattern, in which the occurrence of some questions was based on prior answers. In particular, only respondents who used electronic mail were asked to think of a family member they “email most often,” because these questions were relevant only to them. Respondents then reported levels of social support they received from their email partner, how close they felt to this person and how frequently they communicated with this person employing different modes of communication (visiting, phone, email). Because they were asked about the same family member in the two surveys, but not the same friend, we restrict our analyses of communication modes to the family member.
The Impact of Internet Use on Social Participation
In this section, we examine the impact of Internet use on social participation, contrasting conclusions from cross-sectional and longitudinal analyses.
In both 2000 and 2001, respondents indicated the number of people they could turn to for support when they needed help (response options: many people, just a few people, hardly any people). This is a measure of perceived availability of social support. In general, measures that assess perceived social support have been found to be reasonable predictors of the individual's ability to utilize their social support as a buffer against potential adverse effects of stressful events (Cohen, Mermelstein, Kamarck & Hoberman, 1984). Although this was a single item measure and, hence, likely to be unreliable, it was asked at both time periods and thus allowed for an assessment of changes in social support.
In both 2000 and 2001, respondents were asked about their social activities on the previous day, including whether they visited a friend or a family member yesterday or called them on the phone just to talk. These measures were binary, assessing only the presence or absence of the behavior. In 2001, respondents were also asked about the frequency of various aspects of social participation, such as calling friends or relatives just to talk, visiting with family or friends, doing volunteer work or attending religious service (from never to every day, on a 5-point Likert scale).
The major independent variable for this research is the extent to which respondents use the Internet. Some empirical studies have simply compared users and non-users of the Internet (e.g., Katz & Aspden, 1997), while others have used a measure of time spent online, among active Internet users (e.g., Goget, Yamauchi & Suman, 2002). Because the first of these approaches uses a dichotomous measure, it is insensitive. The second approach uses a measure that can only be calculated for those who actively use the Internet, truncating the distribution of Internet use and limiting the statistical power available for analysis.
We constructed three measures of Internet use— a breadth measure, a frequency measure, and a history measure—which combine these two approaches. The Pew interview first asked respondents if they “ever go online to access the Internet or World Wide Web or to send and receive email?” Across the two time periods, 52.8% of respondents reported using the Internet at least occasionally.
Frequency of Internet use.
We calculated a frequency of Internet use measure, based on respondents' estimates of the time and frequency with which they went online, used email, or used chat services (see the left side of Table 1). To create the index, we assigned zero values to respondents who never use the Internet, standardized responses to convert them to a common scale, and took their mean. The scale is internally consistent (Cronbach's alpha of .79; .68 for Internet users only) and relatively stable over a 12-month period, with test-retest reliability of .66.
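The three construction steps (zero-fill non-users, standardize, average) can be sketched as follows. The item names and response values are hypothetical and do not reproduce the items in Table 1, and the use of the population standard deviation for standardizing is an assumption, since the text does not say which formula was used.

```python
from statistics import mean, pstdev

# Hypothetical item responses; None marks a respondent who never goes online.
# Item names and values are illustrative assumptions, not the Pew items.
responses = [
    {"hours_online": 10, "email_freq": 5, "chat_freq": 2},
    {"hours_online": 2, "email_freq": 3, "chat_freq": 0},
    None,
    {"hours_online": 6, "email_freq": 4, "chat_freq": 1},
]
items = ["hours_online", "email_freq", "chat_freq"]


def frequency_index(responses, items):
    # Step 1: respondents who never use the Internet get zero on every item.
    filled = [r if r is not None else {item: 0 for item in items} for r in responses]
    # Step 2: convert each item to z-scores so all items share a common scale.
    z_scores = {}
    for item in items:
        values = [r[item] for r in filled]
        item_mean, item_sd = mean(values), pstdev(values)
        z_scores[item] = [(v - item_mean) / item_sd for v in values]
    # Step 3: the index is each respondent's mean z-score across the items.
    return [mean(z_scores[item][i] for item in items) for i in range(len(filled))]


index = frequency_index(responses, items)
```

Because every item is standardized over the full sample, the index averages to zero across respondents, and non-users fall at the bottom of the scale rather than being dropped from the analysis.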
Breadth of Internet use.
The breadth of Internet use measure examined the range of applications for which respondents used the Internet. Among those who used the Internet, the interview asked whether respondents performed each of 27 Internet activities (listed on the right side of Table 1). These included use of the Internet for communication (e.g., send or read email, send “instant messages” to someone who is online at the same time, take part in “chat rooms” or online discussions with other people), for information (e.g., research online for your job, looking for information about a hobby or interest, get financial information, check sports scores), for entertainment (e.g., listen to or download music, play a lottery or gamble online, play a game online), and for consumer purchases (e.g., buy a product online, buy or make a reservation for a travel service, do banking online). The Internet breadth index is the sum of the activities they participated in. This index was highly reliable (Cronbach's alpha of .95; .79 for Internet users only) and stable over time (test-retest correlation = .72).
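As a minimal sketch of how a summed index and its Cronbach's alpha are computed, the example below uses invented yes/no answers for four activities rather than the 27 in the interview. The function names are hypothetical. Alpha is computed with the standard formula, k/(k-1) times one minus the ratio of summed item variances to the variance of the total score.

```python
from statistics import pvariance

# Hypothetical answers (1 = did the activity, 0 = did not) for four activities.
# These rows are illustrative assumptions, not the Pew data.
answers = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],  # a non-user scores zero on every activity
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]


def breadth_index(rows):
    """Breadth is the count of activities each respondent reported."""
    return [sum(row) for row in rows]


def cronbach_alpha(rows):
    """Standard Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(rows[0])
    item_variances = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


breadth = breadth_index(answers)
alpha = cronbach_alpha(answers)
```

Including non-users as all-zero rows adds variance to the total score that is shared by every item, which is one reason alpha is higher for the full sample than for Internet users only.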
History of Internet use.
Because Rainie, Howard, and Jones (2002) have shown that people who recently started to use the Internet differ systematically from those who have used the Internet for a long time, we included a measure of Internet history as an independent variable. This is the answer to a single survey item, “When did you first start going online: was it within the last six months, a year ago, two or three years ago, or more than three years ago?” The variable was coded zero for respondents who had never gone online and one for those who had.
These measures of Internet use overlapped conceptually, but were sufficiently distinct to warrant conducting separate analyses using them. Table 2 shows the intercorrelations among the measures. Because respondents who had never gone online were given zero values for all three variables, inflating the correlations among these measures, we show the intercorrelations separately for all respondents and only among Internet users.
To examine whether extroversion moderated any association between Internet use and social participation, we included John, Donahue and Kentle's (1991) measure of extroversion. This was the mean of six 4-point Likert items, in which respondents indicated how much statements such as the following described them: “I am talkative”, “I have an assertive personality”, and “I am reserved” (reversed). A zero indicated the statement did not describe them at all and a 3 indicated it described them very much. The extroversion scale was only asked in 2001 (Cronbach's Alpha=.78).
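Scoring a scale with a reversed item can be sketched as below. The respondent's answers are invented, the function name is hypothetical, and only the “I am reserved” item is treated as reversed because the text does not say which of the other items, if any, were reverse-keyed.

```python
from statistics import mean

SCALE_MAX = 3  # items run from 0 (does not describe me at all) to 3 (very much)


def extroversion_score(item_scores, reversed_items):
    """Mean of the items after flipping reverse-keyed ones (0<->3, 1<->2)."""
    adjusted = [
        SCALE_MAX - score if position in reversed_items else score
        for position, score in enumerate(item_scores)
    ]
    return mean(adjusted)


# Hypothetical respondent: six answers, with position 2 standing in for
# the reverse-keyed "I am reserved" item (an assumption for illustration).
score = extroversion_score([3, 2, 0, 3, 2, 2], reversed_items={2})
```

Here a respondent who says “I am reserved” does not describe them at all (0) is credited with the maximum (3) on that item, so higher scores consistently mean more extroversion.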
Respondents indicated their gender, age, level of education, and race. Because these variables are associated with both Internet use and social behavior, we include them as control variables in all analyses that follow. We also included a measure of “Time” that indicated data collection period (time 1 or time 2).
Data analysis methods
The goal of this paper is partially substantive—to understand the impact of Internet use on social support and social activities—and partially methodological—to compare conclusions one would draw from cross-sectional and longitudinal data. While we have argued earlier that panel data are essential for testing causal claims about the impact of Internet use, we conducted both cross-sectional and longitudinal analyses in support of the methodological goal. For the cross-sectional analyses, we constructed a dataset consisting of a single record per respondent, where each variable was the value recorded in 2001, unless it was asked only in 2000 (income or race, for example). We selected 2001 as the base year, because extroversion and the frequency of performing several of the social participation questions were only asked in 2001 (see Table 3). The cross-sectional analyses use ordinary least squares regression to examine whether the three measures of Internet use predict social behavior, feelings of closeness, communication with family and friends, and perceived social support. These analyses indicate whether people who use the Internet more broadly, more frequently, or for a longer time period differ in their social behavior and exchanges with others, controlling for demographic variables and extroversion.
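The rule for building the cross-sectional record (take the 2001 value unless the item was asked only in 2000) can be sketched as a simple merge. The variable names and values are hypothetical and are not drawn from the survey codebook.

```python
# Hypothetical answers from one respondent in each wave; names are assumptions.
wave_2000 = {"id": 17, "race": "white", "income": 45000, "internet_breadth": 8}
wave_2001 = {"id": 17, "internet_breadth": 11, "extroversion": 2.5}


def cross_sectional_record(wave_2000, wave_2001):
    """Use the 2001 value when the item was asked in 2001, else the 2000 value."""
    record = dict(wave_2000)   # start from the 2000 answers
    record.update(wave_2001)   # 2001 answers override wherever they exist
    return record


record = cross_sectional_record(wave_2000, wave_2001)
```

The resulting record holds one value per variable, so each respondent contributes a single row to the ordinary least squares regressions.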
For social participation variables of interest measured at two time periods, we conducted longitudinal analyses to assess whether more or less extensive use of the Internet was associated with changes in social involvement or social support. We conducted a separate analysis for each measure of Internet use—breadth, frequency, and history.
We use hierarchical linear growth models to estimate how respondents' use of the Internet influenced their social participation (Bryk & Raudenbush, 1987, 1992; Singer & Willett, 2003). The logic of this analysis follows. An outcome of interest (e.g., whether a respondent visited friends or family yesterday) is measured in at least two time periods. There are two records per respondent, one from data collected in 2000 and one from 2001. The predictors include static characteristics of the respondent (e.g., gender, race, or extroversion), time (whether the data was collected in 2000 or 2001), and characteristics of the respondent that vary with time (e.g., the amount the respondent used the Internet during either 2000 or 2001). OLS regression assumes that errors are independent, normally distributed, and have constant variance. In contrast, hierarchical linear modeling recognizes that responses from the same respondent are not independent of each other. This analysis separates the error variance associated with the respondent from the error variance associated with the questionnaire administration nested within respondent. It calculates the correct degrees of freedom associated with each level of the analysis (respondent or questionnaire) and provides more appropriate estimates of the standard errors than does OLS regression.
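The data layout this analysis assumes, two records per respondent with static and time-varying predictors, can be sketched as follows. The respondents, variable names, and the 0/1 coding of time are illustrative assumptions. The sketch only restructures the data; fitting the model itself would require a mixed-model routine from a statistics package.

```python
# Hypothetical respondents in wide form; names and values are assumptions.
respondents = [
    {"id": 1, "female": 1,
     "internet_use": {2000: 0.4, 2001: 0.9},
     "visited": {2000: 1, 2001: 0}},
    {"id": 2, "female": 0,
     "internet_use": {2000: 0.0, 2001: 0.0},
     "visited": {2000: 0, 2001: 1}},
]


def to_long(respondents):
    """Stack the two waves into one record per respondent per year."""
    long_rows = []
    for person in respondents:
        for time, year in enumerate((2000, 2001)):
            long_rows.append({
                "id": person["id"],                            # grouping level
                "time": time,                                  # 0 = 2000, 1 = 2001
                "female": person["female"],                    # static predictor
                "internet_use": person["internet_use"][year],  # time-varying predictor
                "visited": person["visited"][year],            # outcome
            })
    return long_rows


long_rows = to_long(respondents)
```

The shared `id` on the two rows for each person is what allows the model to separate respondent-level error from error at the level of the questionnaire administration.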
The Internet use variables indicate whether respondents who used the Internet more broadly, more frequently, or for a longer time period differ in the initial time period on the dependent variable from those who use the Internet less. The Time variable in the model provides an estimate of change in the dependent variable over time (e.g., whether respondents are more likely to visit in 2000 or 2001). The statistical interaction of Time with other variables indicates whether these variables moderate the effects of time. In this research we are especially interested in the interaction of Internet use and Time. For example, a negative coefficient for the Internet use × Time interaction would suggest that heavier Internet users decreased visiting more than did lighter users (or increased it less).
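One way to write a growth model of this kind for a continuous outcome is sketched below. The notation is ours, not the authors': we assume Time is coded 0 for 2000 and 1 for 2001, that the vector x_i collects the static controls, and that a single respondent-level random intercept u_i captures the dependence between a respondent's two records.

```latex
Y_{it} = \beta_0
       + \beta_1\,\mathrm{Use}_{it}
       + \beta_2\,\mathrm{Time}_{t}
       + \beta_3\,(\mathrm{Use}_{it}\times\mathrm{Time}_{t})
       + \boldsymbol{\gamma}^{\top}\mathbf{x}_{i}
       + u_i + e_{it},
\qquad u_i \sim N(0,\sigma_u^2),\quad e_{it} \sim N(0,\sigma_e^2)
```

Under this coding, \beta_1 describes how heavier and lighter users differ in the initial period, \beta_2 describes change over time for a respondent with Use equal to zero, and \beta_3 is the Internet use × Time interaction: a negative \beta_3 means the outcome declined more (or rose less) for heavier users. For a binary outcome, the same linear predictor would sit inside a logit link and the term e_{it} would be dropped.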
In early 2000, the Pew Internet & American Life Project estimated that roughly 55 million Americans went online on a typical day. Table 3 shows the means, Ns, and standard deviations of the variables used in current analyses for the respondents who participated in the interviews in both 2000 and 2001. The sample was 43% male, 87% white, with an average age of 50 years, an average education of “some college,” and an average income of approximately $45,000. Forty-eight percent had reported using the Internet at least once when asked in 2000, increasing to 57% in 2001. On average, those who used the Internet in 2000 reported having used it at least once for 8.2 distinct purposes; this breadth of use increased to 10.9 distinct purposes in 2001. Over 90% of all Internet users reported using email, while the next most popular Internet activity, looking for information about a hobby or interest, was reported by 78% of the respondents.
Social activities
Cross-sectional analyses
Table 4 presents results of ordinary least squares regressions of the relationship between frequency of Internet use and frequency of social activities, controlling for a respondent's demographic characteristics and extroversion. Because the results were qualitatively the same when measuring Internet use in terms of breadth of use and history of use, these results are not reported separately. Frequency of Internet use was positively associated with the frequency with which respondents went out to dinner with friends and family, but with no other social activity. People who used the Internet more broadly reported a higher frequency of going out to dinner with friends or family. From the cross-sectional data, however, we cannot tell whether this positive association comes about because use of the Internet facilitates dining out (by helping to maintain a relationship or aiding logistics) or because people who use the Internet most broadly have other characteristics that predispose them to dine out. Gershuny (2002), using longitudinal methods in a large British sample, found little evidence that Internet use changed the likelihood of dining out overall, although he did report a small gender interaction in which women who were heavy Internet users increased dining out. Frequency of Internet use was not associated with the frequency with which respondents reported calling friends or relatives just to talk, visiting with family or friends, doing volunteer work, attending religious services, or with their likelihood of visiting a friend or family member or calling one just to talk on the day preceding the interview.
In contrast to the paucity of associations with Internet use, other variables, especially gender, age, and extroversion, predicted participation in social activities. Women in general participated in more social activities than men (calling, visiting, and calling yesterday), and older adults visited less than younger ones. In general, extroverts were more likely to engage in a variety of social activities (calling just to talk, visiting, going out to dinner, doing volunteer work, and calling yesterday). The negative interaction of Internet use with extroversion for the dining out dependent variable shows that the association of extroversion with going out to dinner was weaker for respondents who used the Internet more frequently. A similar pattern occurs with the calling friends and relatives just to talk dependent variable: the association of extroversion with phone-calling was weaker for respondents who used the Internet more frequently.
The Pew survey asked about only three social activities in both 2000 and 2001: visiting a friend or relative yesterday, calling a friend or relative just to talk yesterday, and perceived social support. Table 5 shows repeated measures analysis predicting changes in these three measures from breadth of Internet use. Because the call and visit measures are binary, they were analyzed through repeated measures logit analysis, using the xtlogit procedure in Stata (StataCorp, 2003). The support measure is continuous, and was analyzed using the xtreg procedure in Stata. Although Table 5 reports results of logistic regressions for calling a friend or relative just to talk yesterday and visiting a friend or relative yesterday as logits, we discuss results in terms of the probability of occurrence of each event as a more understandable metric.
In each analysis, the coefficient of interest is the Time by Internet use interaction. For the visit dependent measure, the intercept shows that white women respondents of average age, education, and extroversion visited with a friend or family member on about 70% of days. The main effect of Internet breadth shows cross-sectional results. Like the OLS results in Table 4, the main effect of Internet breadth is non-significant, indicating that those who used the Internet for a wider variety of purposes were no more likely to visit friends or relatives than those who did not use the Internet at all. The negative main effect for time shows that respondents were 4% less likely to report visiting in 2001 than in 2000. The negative interaction of Internet use and time shows that the decline in visiting was steepest for those using the Internet most broadly, i.e., using it for the largest number of applications. Figure 2 shows the fitted results from the model graphically. There was no drop in likelihood of visiting among the 47% of the sample who did not use the Internet at all. There was a 4% drop among those in the 60th percentile of Internet use, a 6% drop among those in the 70th percentile of Internet use, an 11% drop among those in the 90th percentile of Internet use, and a 21% drop among the heaviest Internet users. That is, the heaviest users dropped their probability of visiting on the day in question from about 70% to 49%. Results using other measures of Internet use were similar, although significance levels varied (for the frequency measure: beta=−.027, stderr=.015, p=.07 and for the history of use measure: beta=−.15, stderr=.06, p=.02). The significant, negative time by Internet use by extroversion interaction indicates that the decline in visiting associated with Internet use was largest for the most extroverted respondents.
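The translation from logit coefficients to probabilities can be illustrated with a short Python sketch. The coefficient values below are hypothetical, chosen only so that the baseline probability is 70% and the heaviest users end up near 49%; they are not the estimates in Table 5, and the breadth levels 0, 5, 10, and 18 are likewise illustrative. Time is assumed to be coded 0 for 2000 and 1 for 2001.

```python
import math


def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Probability corresponding to a log-odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients on the logit scale (NOT the paper's estimates).
B0 = logit(0.70)        # baseline: 70% chance of visiting in 2000
B_USE = 0.0             # no cross-sectional difference by breadth of use
B_TIME = 0.0            # no change over time for non-users
B_USE_X_TIME = -0.05    # negative Internet use x Time interaction


def visit_probability(use, time):
    """Fitted probability of visiting for a given breadth of use and year."""
    eta = B0 + B_USE * use + B_TIME * time + B_USE_X_TIME * use * time
    return inv_logit(eta)


def drop_in_probability(use):
    """Decline in the visiting probability between time 0 and time 1."""
    return visit_probability(use, 0) - visit_probability(use, 1)


if __name__ == "__main__":
    for use in (0, 5, 10, 18):
        print(use, round(visit_probability(use, 0), 3),
              round(visit_probability(use, 1), 3),
              round(drop_in_probability(use), 3))
```

Because the interaction enters the linear predictor as use × time, the decline is zero for non-users and grows with breadth of use, which is the pattern a negative interaction coefficient implies.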
For the call yesterday dependent variable, all of the time coefficients were negative, but none was significantly different from zero, indicating that respondents reported calling as frequently in 2001 as they did in 2000. The interactions between time and the Internet use measures were all negative, but none was significantly different from zero (for time by Internet frequency beta=−.07, stderr=.17, p=.62; time by Internet breadth beta=−.005, stderr=.015, p=.91; time by Internet history beta=−.08, stderr=.06, p=.23).
Similarly, for the perceived social support dependent variable, none of the time by Internet use interactions approached statistical significance (Internet frequency beta=.003, stderr=.04, p=.95; Internet breadth beta=−.007, stderr=.019, p=.75; Internet history beta=−.001, stderr=.01, p=.93).
Relationships Among Modes of Communication
In this section, we examine potential reciprocal causation among different ways of communicating with a family member. We are interested in whether communication in one modality changed communication in the others. The results give insight into whether different modes of communication substitute for or stimulate each other.
Selecting the family member
In 2000, Internet users were asked to identify a family member with whom they communicated most through electronic mail. These family members (N=432) consisted primarily of siblings (36%), cousins (26%), aunts and uncles (15%), parents and children (8%), and in-laws (6%). In 2001 they were again asked to describe their communication with the same family member.
In both 2000 and 2001, respondents were also asked how frequently they communicated with the same family member through telephone calls and visits.2
• visits (“How frequently do you get together with this person?”)
• phone (“How frequently do you speak to this person by phone?”)
• email (“How often do you email this person?”)
All responses were on a 5-point Likert scale, with response options of every day, about once a week, about once a month, several times per year, and less often.
Because the relationships among the communication variables were potentially recursive and mutually interdependent, we used two stage least squares analysis with instrumental variables to test causal relationships among them (Greene, 1997; Kennedy, 2001). Two-stage least squares regression analysis uses a system of simultaneous equations to deal with the recursive nature of the relationship among predictor variables. It employs instrumental variables to deal with the non-independence of independent variables and the error in a dependent variable. Instrumental variables are ones that correlate with the independent variable for which they are an instrument, but are contemporaneously uncorrelated with the error on the dependent variable.
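The core idea of two-stage least squares can be sketched in Python with simulated data. This is a deliberately simplified illustration, not the authors' specification: it uses one endogenous predictor, one instrument, and no controls, whereas the paper's models form a system of equations with several instruments each. All variable names and parameter values are hypothetical; the instrument `z` plays the role of a lagged measure, `x` the endogenous predictor, `y` the outcome, and `u` an unmeasured factor that biases ordinary least squares.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols(x, y):
    """Slope and intercept from regressing y on a single predictor x."""
    slope = cov(x, y) / cov(x, x)
    return slope, mean(y) - slope * mean(x)


def two_stage_least_squares(y, x, z):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted x."""
    b1, a1 = ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    slope, _ = ols(x_hat, y)
    return slope


def simulate(n=5000, true_effect=0.5, seed=0):
    """Simulated data in which x is confounded with y through u."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)            # instrument: affects y only via x
        ui = rng.gauss(0, 1)            # unmeasured common cause of x and y
        xi = zi + ui + rng.gauss(0, 1)
        yi = true_effect * xi + ui + rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z


if __name__ == "__main__":
    y, x, z = simulate()
    print("OLS slope: ", round(ols(x, y)[0], 3))
    print("2SLS slope:", round(two_stage_least_squares(y, x, z), 3))
```

In this setup the OLS slope is pulled away from the true effect of 0.5 because `u` drives both `x` and `y`, while the two-stage estimate recovers it approximately, since the instrument is correlated with `x` but not with the error in `y`.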
We modeled each communication modality as an endogenous variable, so that it was influenced by and could potentially influence the other endogenous variables in the model. Thus, we modeled the extent to which a respondent communicated with a family member via one modality in 2001 as a function of their communication via the other two modalities in 2001. For each communication modality, we included instrumental variables, which would plausibly predict the extent to which respondents used that modality, but not necessarily the use of the other communication modalities.
In predicting visits with the family member in 2001, we included lagged visits (i.e., frequency of visits with the family member in 2000) as an instrumental variable. Because face-to-face visits are highly dependent upon geographic distance (Allan, 1979; Allen, 1966), we included distance between the respondent and family member as another instrumental variable for the visit variable. We also included respondents' propensity to visit with others, including their estimate of their annual frequency of visits with family or friends, collected in 2001.
In predicting 2001 phone calls with the family member, we included lagged calls (i.e., frequency of phone calls with the family member in 2000) as an instrumental variable. We also included as other instrumental variables respondents' propensity to phone others, including their estimate of their annual frequency of phoning with family or friends, collected in 2001, and their estimates of the usefulness of phoning others (“How useful to you is the telephone for communicating with … (a) members of your family and (b) your friends?”), averaged over the two years.
In predicting 2001 email with the family member, we included lagged email (i.e., the frequency of emailing the family member in 2000) as an instrumental variable. We also included as other instrumental variables respondents' propensity to use email for communication, including an estimate of the number of friends and relatives they report emailing and their estimate of their annual frequency of emailing particular friends, averaged over their 2000 and 2001 answers.
In addition to the instrumental variables, we included as controls in each equation, demographic variables (age, age squared, race, and education), extroversion, and respondents' reported psychological closeness to their family member (“How close do you feel towards…?”).
Respondents who communicate with a family member tend to do so using all the modalities available to them. The correlations among frequency of visits, phone and email communication with the family member were all moderate. The more respondents visited a family member the more they phoned (r (418) = .54, p<.001) and emailed that person (r (418) = .21, p<.001). Similarly, the more they phoned, the more they emailed (r (418) = .36, p<.001).
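As a quick arithmetic check on the reported significance levels, a correlation r with df degrees of freedom can be converted to a t statistic using the standard formula t = r·sqrt(df / (1 − r²)). The helper name below is ours; the inputs are the three correlations reported above.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


if __name__ == "__main__":
    for label, r in (("visit-phone", 0.54),
                     ("visit-email", 0.21),
                     ("phone-email", 0.36)):
        print(label, round(t_from_r(r, 418), 2))
```

With 418 degrees of freedom, the two-tailed critical value for p<.001 is roughly 3.3, and all three statistics exceed it, consistent with the significance levels reported in the text.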
Of the 432 individuals who reported emailing a family member in 2000, 83% reported having email contact in 2001, while 16% (70 individuals) reported no longer emailing this person. The reasons respondents gave for dropping email contact suggest that many of them believe that communications media substitute for each other. Thirty-two percent claimed to have dropped email communication because in 2001 they talked on the phone more frequently or lived closer to each other and hence communicated more in other modalities. Others claimed to drop email contact because one member of the pair lost a computer or Internet access (27%), because they had lost touch (14%) or had no time (5%).
Table 6 shows results from two-stage least squares models, predicting the frequency of respondent's communication with a designated family member in one modality from communication in other modalities. Many of the control and instrumental variables have their expected effects. Respondents communicated more in all modalities with those with whom they felt psychologically closer, although, consistent with research by Cummings et al. (2002), the association of communication frequency and psychological closeness was stronger for visits and phone calls than for email. As expected, respondents were more likely to visit in 2001 those they visited in the prior year and were less likely to visit family members who lived far away. However, the other instrumental variable for visits — overall frequency of visiting family and friends — did not predict visits with the target family member. For phone calls, those who were younger, further from middle age, and more educated were more likely to call a family member. Of the instrumental variables, phoning in 2000 and overall frequency of phoning friends and family predicted phoning the target family member in 2001, but the perceived usefulness of telephone calls did not. For email, those who were younger and those who were further from middle age were more likely to send email to the selected family member. All the instrumental variables predicted email frequency.
The main purpose of this analysis was to identify whether communicating with a designated family member by one modality changed the frequency of communication via the other modalities. Results of the 2-stage least squares analysis suggest that relationships among the media are asymmetric. In predicting visits, the significant coefficient for phoning suggests that phoning the family member stimulated personal visits with that person. In predicting email, the significant coefficient for visiting suggests that visiting the family member stimulated email communication with that person. However, email with the family member stimulated neither visits nor phone calls.
This research examined the role of the Internet in changing users' social behavior and their communication with a particular communication partner.
Overall social participation
Substantial controversy exists as to whether sustained use of the Internet improves people's social participation, harms it, or is neutral. Substantively, our results provide modest evidence that use of the Internet depresses some social interaction. In particular, using the Internet was associated with a substantial decline in the probability of visiting a friend or family member. Because of limitations in the data, we were able to conduct longitudinal analyses on only three social variables—visiting friends or relations yesterday, calling friends or relations yesterday just to talk, and perceived availability of social support. The sample as a whole visited friends and family less in 2001 than in 2000. Although there were virtually no changes in the likelihood of visits among non-Internet users, among respondents who used the Internet most the probability of visiting dropped from 70% to 49%. This decline was largest for extroverts. There was no evidence from this research, however, that Internet use was associated with changes in the probability of phoning friends and family just to talk, or with changes in the social support respondents perceived they had.
Although the evidence that using the Internet affects social participation is narrow, strongly supported only for the case of in-person visits, it was robust across the alternate measures of Internet use: respondents' frequency of using the Internet, the breadth of services they used online, and the number of years they used the Internet. Because the visit measure asked respondents about a specific event yesterday, it is less subject to recall biases than global self-reports that ask respondents to estimate a frequency of visiting, for example. The quasi-behavioral nature of the outcome measure makes the result especially convincing.
Communication with a family member
An additional goal of this research was to determine whether adding email to people's repertoire of communication techniques changes their likelihood of using richer communication modalities. A hydraulic model of communication holds that the sum of communication with a particular partner stays roughly constant, and adding communication via one medium causes declines in using others. Some respondents clearly believed this model, explaining that they stopped emailing a family member because they were phoning or visiting more. The alternative, stimulation model, however, is more plausible. The stimulation model holds that communication in one modality stimulates communication via others (e.g., Kraut & Attewell, 1997). People might phone or email to arrange a visit or to continue a conversation started when visiting. Similarly, an email conversation with a partner may keep a relationship alive and lead to subsequent telephone calls and visits with that partner (McKenna et al., 2002).
The longitudinal, two-stage least squares analysis showed support for the stimulation model, but the effects were asymmetrical. Visiting a family member appeared to increase the frequency of emailing that person, and calling them on the phone increased visits to them. In contrast, emailing a family member seemed to have little influence on the frequency of communicating with that person via other media, neither increasing nor decreasing the frequency of communicating by phone or in person. These findings are consistent with other research suggesting that email is a weaker communication modality than either face-to-face visits or phone calls (Cummings et al., 2002), and is less likely to maintain the social relationships among those who communicate using it than is communicating either in person or by telephone.
Most research purporting to analyze the social impact of the Internet uses cross-sectional data (e.g., Katz & Rice, 2002; LaRose et al., 2001). The cross-sectional analyses in this paper suggest that Internet use is associated with more social involvement. Cross-sectional analyses show that people who used the Internet more extensively were also more likely to go out to dinner with others. They also show that communication with a family member by electronic mail was moderately correlated with visiting and phoning that person. Katz and Rice (2002) have also analyzed cross-sectional data from the 2000 Pew sample to reach the conclusion that Internet use enhances social involvement.
We reached different conclusions, however, from longitudinal analyses. Analysis of change, using hierarchical linear modeling, is consistent with the hypothesis that greater use of the Internet caused respondents to decrease their likelihood of visiting with a friend or family member on a randomly selected day. More accurately, respondents who used the Internet more frequently, for a broader range of purposes or for a longer time period, had larger declines in visiting than those who did not use the Internet at all, used it more narrowly, or for a briefer period. In addition, the two-stage least squares analysis is consistent with the hypothesis that the effects of communication in one modality on communication in others are asymmetric. Visiting a family member seems to drive emailing them, and phoning that person stimulates visiting them, but not vice versa. Importantly, for the purposes of the present paper, emailing that partner stimulates neither visits nor phone calls.
As we discuss below, one cannot automatically grant causal claims based on the types of longitudinal statistical analyses we used here. Inferring causation depends upon accepting several strong assumptions. However, we believe these longitudinal analyses provide better evidence of causation than do cross-sectional analyses using the same variables. To infer causation from a cross-sectional correlation between two variables, one must believe that no third variables can account for the observed association. As we've discussed in the introduction, cross-sectional research on the social impact of the Internet controls only for demographic variables that themselves are only weak causes of social behavior (e.g., Robinson et al., 2000). The research reported in this paper also controls for extroversion, which correlates with other measures of social participation (see Table 4) and has low but significant correlations with all three measures of Internet use. In predicting communication with a particular family member, we also control for the respondent's psychological closeness to that family member, which has moderate correlations with all three measures of communication.
Inferring causation from cross-sectional data is also based on an assumption that if a causal relationship exists between two variables, it runs in the direction specified by the investigator. The assumption of one-way causation in most cross-sectional research on the social impact of the Internet, however, is generally untenable. The correlations between Internet use and communication found in many cross-sectional studies do not necessarily imply that Internet use drives communication. Many people get Internet access in order to communicate with others. Their prior involvement with friends, family, or community organizations drives Internet use as well as potentially facilitating these social activities.
The longitudinal analyses used in the current research partially solve these problems, by controlling for both prior Internet use and prior social involvement. The hierarchical linear growth models show that those who used the Internet broadly dropped their likelihood of visiting between 2000 and 2001 more than those who did not use the Internet at all or used it less broadly. This repeated-measures analysis controls for the co-variation of errors between respondents' initial likelihood of visiting and change in likelihood of visiting. It provides separate estimates of the cross-sectional co-variation between Internet use and visits and for the co-variation between Internet use and changes in visits. The two-stage, least squares analysis, examining the causal relationships among modes of communicating with a family member, controls for prior communication by phone, visits, and email and other factors that could uniquely influence these outcomes.
Longitudinal analyses, of course, depend upon assumptions that can be challenged. Hierarchical linear growth models take into account individual differences in the social outcomes and Internet use at the initial time period and the co-variation between these individual differences and change. However, the use of hierarchical growth models to assess causation rests upon an assumption that all the relevant variables have been measured. It is still possible that some unmeasured variable that co-varies with changes in Internet use may explain why Internet users had larger declines in their likelihood of visiting than did non-users.
The validity of two-stage, least squares analyses for assessing the causal influence among modes of communication depends upon the selection of appropriate instrumental variables. Instrumental variables for a predictor should correlate with the predictor but not cause the variables to be predicted nor be correlated with their error. For example, to determine whether email communication with a family member causes phone communication with that partner, we want instrumental variables for email that correlate with email use, but do not cause phone use and are not correlated with its error. We attempted to use instrumental variables for each of the communication modalities that theory, prior research, or common sense suggest should be associated strongly with one of the communication modalities and not the others. As is common in the econometrics literature (Kennedy, 2001, p. 142), we included the lagged value of the predictor variable in the preceding time period as an instrument for each predictor. We also included as instruments for a particular communication modality respondents' propensity to use that modality with other communication partners besides the family member (e.g., the frequency of visits or phone calls to friends, and the proportion of all relatives emailed). Some of these instruments failed (e.g., although frequency of phoning family and friends in general predicted the likelihood of phoning a particular family member, the frequency of visiting family and friends in general did not predict the likelihood of visiting the focal family member). The consequence of using instrumental variables that are insufficiently correlated with the predictor they instrument is that the variance of the instrumental-variable estimator will be high, making it difficult to discern the association of the predictor with the dependent variable.
Besides the assumptions upon which the causal analyses rest, there are limitations in the dataset itself. The Pew dataset had many good attributes. It is based on a large national sample. Because it was originally designed to provide a broad description of how Americans use the Internet, it was possible to construct several highly reliable, multi-item measures of Internet use as independent variables. However, the Pew survey was not originally designed to measure change. As a result, the surveys included data at multiple time periods on only three measures of social involvement (visiting yesterday, calling yesterday, and perceived social support) and three measures of engagement with a particular family member (i.e., visiting, calling, and emailing). Many of the most interesting variables measuring a respondent's social activities (e.g., dining out, volunteering, attendance at religious services) or their social capital (e.g., their psychological closeness and the amount of social support they received from a particular friend or family member) were asked at only a single time period and could not be used for causal analyses. In addition, hierarchical linear models are more powerful for assessing changes with more rounds of data collection. The Pew study measured the relevant outcomes only twice. Although the “yesterday questions” are likely to be valid reports on behavior, they describe respondents' behavior on only a single, randomly selected day in their lives. Because time use varies widely day-to-day, these “yesterday” measures may be unreliable. The result is that real impacts of using the Internet will be more difficult to uncover because of the instability in the outcome measures. Finally, the questions about communication with a particular family member have flaws. In particular, the focal family member was selected as the one with whom the respondent communicated most by email. 
As a result, there is limited variability on communication frequency with this family member (e.g., respondents could not report a failure to communicate with this person by email in 2000, but could fail to communicate with him or her by phone or visits).
Although prior research has tended to measure Internet use as an undifferentiated whole (Katz & Rice, 2002; Kraut et al., 2002; Kraut et al., 1998; Nie et al., 2002), the wide range of services available on the Internet suggests that different ways of using it may have different impacts on social participation. The Pew data set collected information on the general breadth of Internet use, but those questions merely assessed whether respondents had ever used the Internet for a particular purpose. This type of data did not allow us to assess particular current patterns of use.
The Internet has become common in American homes. It is used for a wide range of purposes — communication, information, entertainment, and commerce. By all estimates communication with friends and family is among the most common and important. There is controversy about the impact that the widespread diffusion of the Internet is having on the social lives of its users.
Longitudinal analyses from a large national panel of Americans suggest that using the Internet may lead to declines in visiting with friends and family. This effect is largest for those who initially had the most social contact, i.e., the extroverts. In addition, the data suggest that while visiting a family member stimulates exchanging email with that person and phoning him or her stimulates visiting, emailing does not affect the likelihood of either visiting or phoning.
Because of the assumptions necessary when conducting statistical analyses and because of other limitations in the data, however, we treat these conclusions as tentative. We need more data and analyses of the type described here before accepting the conclusion that Internet use is degrading off-line social involvement. Despite these limitations, the analyses presented here represent some of the strongest evidence to date about the consequences of using the Internet on social involvement. They suggest that cross-sectional methods for evaluating the impact of using the Internet should be treated with caution and augmented or even replaced with longitudinal designs.
This research was supported by National Science Foundation grant IRI-9900449 and a grant from the Pew Internet & American Life Project. We thank Sara Kiesler for helpful substantive and editorial advice.
Like many national surveys, the Pew sample started with random digit dialing to select a sample representative of the United States. However, attrition at multiple points in the survey process means that the sample from which data were actually collected systematically differs from a random sample. The Pew sample was also limited to English speakers, because English was the only language in which interviews were conducted.
The survey also asked about communication with a key friend, but did not ask about the same friend at the two time periods. Therefore, we do not analyze the friend data further.