Studying Online Social Networks


  • Laura Garton,

    Corresponding author
    1. Candidate in the Department of Sociology at the University of Toronto. Her dissertation is based on a whole network study of computer-mediated communication within an organizational context. Applying a social network perspective, she examines how the introduction of a multimedia space technology creates new opportunities and at the same time new constraints on the relations and interaction patterns among geographically separated work groups.
  • Caroline Haythornthwaite,

    Corresponding author
    1. Caroline Haythornthwaite is Assistant Professor at the Graduate School of Library and Information Science at the University of Illinois, Urbana-Champaign. Her research and teaching centers on information, information systems, and organizations. Her current work uses social network analysis to examine the way in which computer-mediated communication supports information exchange.
  • Barry Wellman

    Corresponding author
    1. Barry Wellman, Professor of Sociology, University of Toronto, founded the International Network for Social Network Analysis in 1976, headed it until 1988, and continues to serve as its International Coordinator. His research, based at the Centre for Urban and Community Studies, has studied communities and workgroups as social networks, both on- and off-line. Among his recent publications are Social Structures: A Network Approach (coedited with S.D. Berkowitz, 1997, JAI Press), “Net Surfers Don't Ride Alone” (with Milena Gulia), “An Electronic Group is Virtually a Social Network,” and “Computer Networks as Social Networks” (with Garton, Haythornthwaite, Gulia, Janet Salaff and Dimitrina Dimitrova).

Laura Garton, Centre for Urban and Community Studies, University of Toronto, 455 Spadina Avenue, 4th Floor, Toronto, Canada M5S 1A1.

Prof. Caroline Haythornthwaite, Graduate School of Library and Information Science, 501 East Daniel Street, University of Illinois at Urbana-Champaign, Urbana, IL 61820.

Prof. Barry Wellman, Centre for Urban and Community Studies, University of Toronto, 455 Spadina Avenue, 4th Floor, Toronto, Canada M5S 1A1.


When a computer network connects people or organizations, it is a social network. Yet the study of such computer-supported social networks has not received as much attention as studies of human-computer interaction, online person-to-person interaction, and computer-supported communication within small groups. We argue the usefulness of a social network approach for the study of computer-mediated communication. We review some basic concepts of social network analysis, describe how to collect and analyze social network data, and demonstrate where social network data can be, and have been, used to study computer-mediated communication. Throughout, we show the utility of the social network approach for studying computer-mediated communication, be it in computer-supported cooperative work, in virtual community, or in more diffuse interactions over less bounded systems such as the Internet.

What is Social Network Analysis?

The Social Network Approach

When a computer network connects people or organizations, it is a social network. Just as a computer network is a set of machines connected by a set of cables, a social network is a set of people (or organizations or other social entities) connected by a set of social relationships, such as friendship, co-working or information exchange. Much research into how people use computer-mediated communication (CMC) has concentrated on how individual users interface with their computers, how two persons interact online, or how small groups function online. As widespread communication via computer networks develops, analysts need to go beyond studying single users, two-person ties, and small groups to examining the computer-supported social networks (CSSNs) that flourish in areas as diverse as the workplace (e.g. [Fulk & Steinfield, 1990]; [Wellman, Salaff, Dimitrova, Garton, Gulia & Haythornthwaite, 1996]) and virtual communities, e.g., [Wellman & Gulia, 1997]. This paper describes the use of the social network approach for understanding the interplay between computer networks, CMC, and social processes.

Social network analysis focuses on patterns of relations among people, organizations, states, etc. ([Berkowitz, 1982]; [Wellman, 1988b]; [Wasserman & Faust, 1994]). This research approach has rapidly developed in the past twenty years, principally in sociology and communication science. The International Network for Social Network Analysis (INSNA) is a multidisciplinary scholarly organization, which publishes a refereed journal, Social Networks, and an informal journal, Connections.

Social network analysts seek to describe networks of relations as fully as possible, tease out the prominent patterns in such networks, trace the flow of information (and other resources) through them, and discover what effects these relations and networks have on people and organizations. They treat the description of relational patterns as interesting in its own right (e.g., is there a core and periphery?) and examine how involvement in such social networks helps to explain the behavior and attitudes of network members (e.g., do peripheral people send more email and do they feel more involved?). They use a variety of techniques to discover a network's densely-knit clusters and to look for similar role relations. When social network analysts study two-person ties, they interpret their functioning in the light of the two persons' relations with other network members. This approach is quite different from the standard CMC assumption that relations can be studied as totally separate units of analysis. “To discover how A, who is in touch with B and C, is affected by the relation between B and C … demands the use of the [social] network concept” [Barnes, 1972, p. 3].

There are times when the social network itself is the focus of attention. If we term network members egos and alters, then each tie not only gives egos direct access to their alters but also indirect access to all those network members to whom their alters are connected. Indirect ties link in compound relations (e.g., friend of a friend) that fit network members into larger social systems. The social network approach facilitates the study of how information flows through direct and indirect network ties, how people acquire resources, and how coalitions and cleavages operate.

Although a good deal of CMC research has investigated group interaction online, a group is only one kind of social network, one that is tightly-bounded and densely-knit. Not all relations fit neatly into tightly-bounded solidarities. Indeed, limiting descriptions to groups and hierarchies oversimplifies the complex social networks that computer networks support. If Novell had not trademarked it already, we would more properly speak of “netware” and not “groupware” to describe the software, hardware, and peopleware combination that supports computer-mediated communication.

Comparisons with Other Approaches to the Study of CMC: Much CMC research concentrates on how the technical attributes of different communication media might affect what can be conveyed via each medium. These characteristics include the richness of cues a medium conveys (for example, whether a medium conveys text, or whether it includes visual and auditory cues), the visibility or anonymity of the participants (e.g., video-mail versus voice mail; whether communications identify the sender by name, gender, title), and the timing of exchanges (e.g., synchronous or asynchronous communication). A reduction in cues has been cited as responsible for uninhibited exchanges (e.g., flaming), more egalitarian participation across gender and status, increased participation of peripheral workers, decreased status effects and lengthier decision processes ([Eveland & Bikson, 1988]; [Finholt & Sproull, 1990]; [Garton & Wellman, 1995]; [Huff, Sproull, & Kiesler, 1989]; [Eveland, 1993]; [Rice, 1994]; [Sproull & Kiesler, 1991]).

Studies of group communication are somewhat closer to the social network approach because they recognize that the use of CMC is subject to group and organizational influences ([Contractor & Eisenberg, 1990]; [Poole & DeSanctis, 1990]). The group communication approach includes CMC theories such as social influence [Fulk, Schmitz & Steinfield, 1990], social information processing [Fulk, Schmitz, Steinfield & Power, 1987], symbolic interactionism [Trevino, Daft & Lengel, 1990], critical mass [Markus, 1990], and adaptive structuration [Poole & DeSanctis, 1990]. These theoretical approaches recognize that group norms contribute to the development of a critical mass and influence the particular form of local usage ([Connolly & Thorn, 1990]; [Markus, 1990], [1994a], [1994b]; [Markus, Bikson, El-Shinnawy & Soe, 1992]). Yet this focus on the group leads analysts away from some of the most powerful social implications of CMC in computer networks: its potential to support interaction in unbounded, sparsely-knit social networks (see also discussions in [Rice, Grant, Schmitz, & Torobin, 1990]; [Haythornthwaite, 1996b]).

Units of Analysis

Social network analysis reflects a shift from the individualism common in the social sciences towards a structural analysis. This method suggests a redefinition of the fundamental units of analysis and the development of new analytic methods. “The unit is [now] the relation, e.g., kinship relations among persons, communication links among officers of an organization, friendship structure within a small group. The interesting feature of a relation is its pattern: it has neither age, sex, religion, income, nor attitudes; although these may be attributes of the individuals among whom the relation exists…. A structuralist may ask whether and to what degree friendship is transitive. He [sic] may examine the logical consistency of a set of kin rules, the circularity of hierarchy, or the cliquishness of friendship” [Levine & Mullins, 1978, p. 17].

Social network analysts look beyond the specific attributes of individuals to consider relations and exchanges among social actors. Analysts ask about exchanges that create and sustain work and social relationships. The types of resources can be many and varied; they can be tangibles such as goods and services, or intangibles, such as influence or social support [Wellman, 1992b]. In a CMC context, the resources are those that can be communicated to others via textual, graphical, animated, audio, or video-based media, for example sharing information (news or data), discussing work, giving emotional support, or providing companionship [Haythornthwaite, Wellman & Mantei, 1995].


Relations (sometimes called strands) are characterized by content, direction and strength. The content of a relation refers to the resource that is exchanged. In a CMC context, pairs exchange different kinds of information, such as communication about administrative, personal, work-related or social matters. CMC relations include sending a data file or a computer program as well as providing emotional support or arranging a meeting. With the rise of electronic commerce (e.g., Web-based order-entry systems, electronic banking), information exchanged via CMCs may also correspond to exchanges of money, goods or services in the “real” world.

A relation can be directed or undirected. For example, one person may give social support to a second person. There are two relations here: giving support and receiving support. Alternately, actors may share an undirected friendship relationship, i.e., they both maintain the relationship and there is no specific direction to it. However, while they both share friendship, the relationship may be unbalanced: one actor may claim a close friendship and the other a weaker friendship, or communication may be initiated more frequently by one actor than the other. Thus, while the relationship is shared, its expression may be asymmetrical.

Relations also differ in strength. Such strength can be operationalized in a number of ways ([Marsden & Campbell, 1984]; [Wellman & Wortley, 1990]). With respect to communication, pairs may communicate throughout the work day, once a day, weekly or yearly. They may exchange large or small amounts of social capital: money, goods, or services. They may supply important or trivial information. Such aspects of relationships measure different types of relational strength. The types of relations important in CMC research have included the exchange of complex or difficult information [Fish, Kraut, Root & Rice, 1992]; emotional support ([Fish, Kraut, Root & Rice, 1992]; [Haythornthwaite, Wellman & Mantei, 1995]; [Rice & Love, 1987]); uncertain or equivocal communication ([Daft & Lengel, 1986]; [Van de Ven, Delbecq & Koenig, 1979]); and communication to generate ideas, create consensus ([Kiesler & Sproull, 1992]; [McGrath, 1984], [1990], [1991]), support work, foster sociable relations ([Haythornthwaite, 1996a]; [Haythornthwaite & Wellman, 1996]; [Garton & Wellman, 1995]), or support virtual community [Wellman & Gulia, 1997].


A tie connects a pair of actors by one or more relations. Pairs may maintain a tie based on one relation only, e.g., as members of the same organization, or they may maintain a multiplex tie, based on many relations, such as sharing information, giving financial support and attending conferences together. Thus ties also vary in content, direction and strength. Ties are often referred to as weak or strong, although the definition of what is weak or strong may vary in particular contexts [Marsden & Campbell, 1984]. Ties that are weak are generally infrequently maintained, non-intimate connections, for example, between co-workers who share no joint tasks or friendship relations. Strong ties include combinations of intimacy, self-disclosure, provision of reciprocal services, frequent contact, and kinship, as between close friends or colleagues.

Both strong and weak ties play roles in resource exchange networks. Pairs who maintain strong ties are more likely to share what resources they have ([Festinger, Schacter & Back, 1950]; [Wellman & Wortley, 1990]; [Lin & Westcott, 1991]). However, what they have to share can be limited by the resources entering the networks to which they belong ([Burt, 1992]; [McPherson & Smith-Lovin, 1986], [1987]; [Liebow, 1967]; [Stack, 1974]; [Espinoza, 1997]). Weakly-tied persons, while less likely to share resources, provide access to more diverse types of resources because each person operates in different social networks and has access to different resources. The cross-cutting “strength of weak ties” also integrates local clusters into larger social systems ([Granovetter, 1974], [1982]).

The strength of weak ties has been explored in research suggesting that CMC reduces the social overhead associated with contacting people who are not well known to message senders, i.e., people to whom they are weakly electronically tied ([Constant, Sproull & Kiesler, 1996]; [Feldman, 1987]; [Pickering & King, 1995]). Thus, an electronic tie combined with an organizational tie is sufficient to allow the flow of information between people who may never have met face-to-face. Connectivity among previously unacquainted people is a well established finding in the CMC research literature [Garton & Wellman, 1995]. Examples of this form of connectivity are documented in studies of large international organizations ([Constant, Kiesler & Sproull, 1994]; [Constant, Sproull & Kiesler, 1996]) as well as in dispersed occupational communities such as oceanographers [Hesse, Sproull, Kiesler & Walsh, 1993], “invisible colleges” of academics in the same field ([Hiltz & Turoff, 1993]; [Plishkin & Romm, 1994]), and members of the computer underground [Meyer, 1989].


The more relations (or strands) in a tie, the more multiplex (or multistranded) is the tie. Social network analysts have found that multiplex ties are more intimate, voluntary, supportive and durable ([Wellman & Wortley, 1990]; [Wellman, 1992b]). Yet some analysts have feared that email, the Internet, and other reduced-cues CMCs are unable to sustain broadly-based, multiplex relations (see the review in [Wellman et al., 1996]; [Garton & Wellman, 1995]). These fears are extended by the boutique approach to online offerings which fosters a specialization of ties within any one of thousands of topic-oriented news groups ([Kling, 1995]; [Kollock & Smith, 1996]). However, this tendency toward specialization is counter-balanced by the ease of forwarding online communication to multiple others. Through personal distribution lists Internet participants can sustain broad, multiplex, supportive relationships ([Wellman & Gulia, 1997]; [Wellman, 1997]). As yet, there has been little research into the extent to which specialized, online, single relations grow into multiplex ties over time.
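The definition of multiplexity can be illustrated with a minimal sketch that counts the distinct relations making up each tie. The actors and relation names below are hypothetical, and treating every relation as undirected is a simplifying assumption made only for this illustration.

```python
from collections import defaultdict

# Hypothetical relational data: (actor, actor, relation content).
relations = [
    ("ann", "bob", "shares information"),
    ("bob", "ann", "gives emotional support"),
    ("ann", "bob", "arranges meetings"),
    ("ann", "cat", "shares information"),
]

def multiplexity(relations):
    """Map each pair of actors to the number of distinct relations in their tie."""
    strands = defaultdict(set)
    for a, b, content in relations:
        pair = frozenset((a, b))  # assumption: direction is ignored
        strands[pair].add(content)
    return {pair: len(contents) for pair, contents in strands.items()}

ties = multiplexity(relations)
```

In this toy data the tie between "ann" and "bob" carries three strands, while the tie between "ann" and "cat" is single-stranded.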


The composition of a relation or a tie is derived from the social attributes of both participants: for example, is the tie between different or same sex dyads, between a supervisor and an underling or between two peers. CMC tends to underplay the social cues of participants by focusing on the content of messages rather than on the attributes of senders and receivers. By reducing the impact of social cues, CMC supports a wider range of participants and participation. Hence, CMC in organizations may help to transcend hierarchical or other forms of status barriers ([Sproull & Kiesler, 1991]; [Eveland & Bikson, 1988]) and to increase involvement of spatially and organizationally peripheral persons in social networks ([Constant, Kiesler & Sproull, 1994]; [Huff, Sproull & Kiesler, 1989]).

Beyond the Tie – Social Networks

Two Views: Ego-centered and Whole Networks

A set of relations or ties reveals a social network. By examining patterns of relations or ties, analysts are able to describe social networks. Typically analysts approach social networks in two ways. One approach considers the relations reported by a focal individual. These ego-centered (or “personal”) networks provide Ptolemaic views of their networks from the perspective of the persons (egos) at the centers of their networks. Members of the network are defined by their specific relations with ego. Analysts can build a picture of the network by counting the number of relations, the diversity of relations, and the links between alters named in the network. This ego-centered approach is particularly useful when the population is large, or the boundaries of the population are hard to define ([Laumann, Marsden & Prensky, 1983]; [Wellman, 1982]). For example, Wellman and associates ([Wellman, 1988a]; [Wellman & Wortley, 1990]) used ego-centered network analysis to explore how a sense of community is maintained through ties, rather than through geographical proximity, among Toronto residents. They built a picture of the typical person as having about a dozen active ties outside of their household and workplace, including “at least 4 ties with socially close intimates, enough to fill the dinner table and at least 3 ties with persons routinely contacted three times a week or more” [Wellman, Carrington & Hall, 1988, p.140]. This approach was also used by [Granovetter (1973)] to explore what types of actors in people's networks provided information important for finding new jobs and by [Lee (1969)] to explore how individuals found information about access to abortions. It is well suited to the study of how people use CMC to maintain wide-ranging relations on the Internet.
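The three counts named above (number of relations, diversity of relations, and links between alters) can be sketched for a single ego as follows. The alters, relation types, and data layout are hypothetical assumptions for illustration, not the instruments used in the studies cited.

```python
# Hypothetical ego-centered data: ego's reported relations with each alter,
# plus the ties ego reports between pairs of alters.
ego_relations = {
    "alter1": {"advice", "socializing"},
    "alter2": {"advice"},
    "alter3": {"emotional support"},
}
alter_ties = {frozenset(("alter1", "alter2"))}

def summarize_ego_network(ego_relations, alter_ties):
    """Return size, relation counts, and alter-alter ties for one ego."""
    alters = set(ego_relations)
    n_relations = sum(len(r) for r in ego_relations.values())
    kinds = set().union(*ego_relations.values())
    # Keep only alter-alter ties whose two members were both named by ego.
    linked = [t for t in alter_ties if t <= alters and len(t) == 2]
    return {
        "alters": len(alters),
        "relations": n_relations,
        "relation_types": len(kinds),
        "alter_ties": len(linked),
    }

summary = summarize_ego_network(ego_relations, alter_ties)
```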

The second, more Copernican, approach considers a whole network based on some specific criterion of population boundaries such as a formal organization, department, club or kinship group. This approach considers both the occurrence and non-occurrence of relations among all members of a population. A whole network describes the ties that all members of a population maintain with all others in that group. Ideally, this approach requires responses from all members on their relations with all others in the same environment, such as the extent of email and video communication in a workgroup [Haythornthwaite, Wellman & Mantei, 1995]. Although methods are available for handling incomplete data sets (see [Stork & Richards, 1992]), this requirement places limits on the size of networks that can be examined. The number of possible ties is equal to the size of the population (n) multiplied by (n-1), and divided by 2 if the tie is undirected. For a population of size 20, there are 380 possible directed links for each specific relation, or 190 if the relation is undirected.
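The count of possible ties can be checked with a short function. With n = 20, n(n-1) gives 380 ordered pairs, which halves to 190 when direction is ignored. The function name and its `directed` flag are illustrative choices, not part of any established package.

```python
def possible_ties(n, directed=True):
    """Number of possible ties among n actors for a single relation."""
    ordered_pairs = n * (n - 1)
    return ordered_pairs if directed else ordered_pairs // 2

directed_links = possible_ties(20)
undirected_links = possible_ties(20, directed=False)
```

Because the count grows roughly with the square of n, doubling the population roughly quadruples the number of pairs about which respondents must report.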

In CMC research, ego-centered and whole network views provide two ways of examining the communication links among people. Ego-centered network analysis can show the range and breadth of connectivity for individuals and identify those who have access to diverse pools of information and resources. Whole network analysis can identify those members of the network who are less connected by CMC as well as those who emerge as central figures or who act as bridges between different groups. These roles and positions emerge through analysis of the network data rather than through prior categorization.

Network Characteristics

Range: Social networks can vary in their range: i.e., in their size and heterogeneity. Larger social networks have more heterogeneity in the social characteristics of network members and more complexity in the structure of these networks [Wellman & Potter, 1997]. Small, homogeneous networks are characteristic of traditional work groups and village communities; they are good for conserving existing resources. These networks are often the norm against which pundits unfavorably compare computer-supported cooperative work networks and virtual communities (e.g., [Stoll, 1995]; [Slouka, 1995]), or praise computer-supported social networks for unlocking social relations from traditional molds (e.g., [Rheingold, 1993]; [Barlow, Birkets, Kelly & Slouka, 1995]; see also the review in [Wellman & Gulia, 1997]). Yet large, heterogeneous networks (such as those often found online) are good for obtaining new resources.

Centrality: In the CMC context, it may be important to examine who is central or isolated in networks maintained by different media. Thus, the manager who does not adopt email becomes an isolate in the email network while retaining a central role in the organizational network. Information exchanged via email will not reach this manager while information exchanged in face-to-face executive meetings will not reach lower-level workers. In a situation such as this, another person may play a broker role, bridging between the email network and the face-to-face executive network and conveying information from one network to the other. Social network analysis has developed measures of centrality which can be used to identify network members who have the most connections to others (high degree) or those whose departure would cause the network to fall apart (cut-points; see [Freeman, 1979]; [Bonacich, 1987]; [Wasserman & Faust 1994]).
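The two measures just mentioned, degree and cut-points, can be sketched on a small undirected network. The network below is hypothetical, the sketch assumes the network is connected to begin with, and cut-points are found by the simplest possible method (removing each member in turn and testing whether the rest remain connected) rather than by an efficient algorithm.

```python
# Hypothetical undirected network as an adjacency mapping.
network = {
    "manager": {"broker"},
    "broker": {"manager", "worker1", "worker2"},
    "worker1": {"broker", "worker2"},
    "worker2": {"broker", "worker1"},
}

def degree(network):
    """Number of direct connections held by each member."""
    return {member: len(neighbours) for member, neighbours in network.items()}

def is_connected(network, excluded=None):
    """True if all members other than `excluded` can reach one another."""
    members = [m for m in network if m != excluded]
    if not members:
        return True
    seen, stack = {members[0]}, [members[0]]
    while stack:
        current = stack.pop()
        for neighbour in network[current]:
            if neighbour != excluded and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(members)

def cut_points(network):
    """Members whose removal disconnects the remaining network."""
    return {m for m in network if not is_connected(network, excluded=m)}

degrees = degree(network)
cuts = cut_points(network)
```

Here the "broker" has the highest degree and is also the only cut-point: without the broker, the manager is separated from the workers.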

Roles: Similarities in network members' behavior suggest the presence of a network role. Teachers fill the same network role with respect to students: giving instruction, giving advice, giving work, receiving completed work, and assigning grades. Regularities in the patterns of relations (known as structural equivalence) across networks or across behaviors within a network allow the empirical identification of network roles. For example, the “technological gatekeeper” [Allen, 1977] is a role that may be filled by any member of a network according to what resources they bring in to the network. At the same time, the role is not identified by a title and cannot be found on organization charts.

Partitioning Networks


In social network analysis, a group is an empirically discovered structure. By examining the pattern of relationships among members of a population, groups emerge as highly interconnected sets of actors known as cliques and clusters. In network analytic language, they are densely-knit (most possible ties exist) and tightly-bounded, i.e., most relevant ties stay within the defined network (see [Scott, 1991]; [Wasserman & Faust, 1994]; [Wellman, 1997]). Social network analysts want to know who belongs to a group, as well as the types and patterns of relations that define and sustain such a group.

Network density is one of the most widely used measures of social network structure: i.e., the number of actually-occurring relations or ties as a proportion of the number of theoretically-possible relations or ties. Densely-knit networks (i.e., groups) have considerable direct communication among all members: this is the classic case of a small village or workgroup. Much traditional groupware has been designed for such workgroups. By contrast, few members of sparsely-knit networks communicate directly and frequently with each other. As in the Internet, sparsely-knit networks provide people with considerable room to act autonomously and to switch between relationships. However, the resulting lack of mutual communication means that a person must work harder to maintain each relation separately; the group that would keep things going is not present.
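Density as defined above is a simple ratio, sketched here for a hypothetical five-person workgroup. The function name, its parameters, and the example ties are assumptions made for illustration.

```python
def density(n_members, ties, directed=False):
    """Observed ties as a proportion of theoretically possible ties."""
    possible = n_members * (n_members - 1)
    if not directed:
        possible //= 2
    if possible == 0:
        return 0.0
    return len(ties) / possible

# Hypothetical five-person workgroup with undirected ties.
workgroup_ties = {
    frozenset(pair)
    for pair in [("a", "b"), ("a", "c"), ("b", "c"),
                 ("c", "d"), ("d", "e"), ("a", "d")]
}
workgroup_density = density(5, workgroup_ties)
```

Six of the ten possible undirected ties are present, so the density is 0.6.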

By examining relations to identify network groups, CMC researchers can track the beginnings of what may become more formal groups or identify coalitions and alliances that influence others and affect social outcomes. They can link research findings on commonly held beliefs to the regular patterns of interactions among people using CMC. By identifying the group prior to its formalization, social network analysis can be used to follow the growth of CMC network phenomena. For already-defined email groups, the social network approach can be used to examine what specific kinds of exchanges define the groups. For example, online groups may be formed initially based on socioeconomic characteristics and the vague notion of access to information, such as SeniorNet for senior citizens [Furlong, 1989] or Systers for female computer scientists [Sproull & Faraj, 1995]. Analysts can examine these email or bulletin board networks for the kinds of information exchange that sustain the network.

The social network approach can also be used to see where relations and ties cross media lines. Which kinds of groups maintain ties via multiple media, and which communicate only by means of a single medium? For example, a luncheon group might coordinate meeting times through email, coordinate food delivery by phone, with final consumption face-to-face. Other network groups, such as remotely-located technicians, might exchange information about only one topic and use only one medium, such as email.

Positional Analysis

As well as partitioning social network members by groups, analysts also partition members by similarities in the set of relations they maintain. Such members occupy similar positions within an organization, community or other type of social network ([Burt, 1992]; [Wasserman & Faust, 1994]). Those who share empirically-identified positions are likely to share similar access to informational resources. Some central positions have greater access to diverse sources of information, while other positions may have a limited pool of new ideas or information on which to draw. For example, why assume that managers always give orders and subordinates always take them when an analysis of email traffic may show otherwise? Thus our study (of university computer scientists) found that faculty did not always give orders and students did not always receive orders. The actual practice was more a function of specific work collaborations among network members ([Haythornthwaite, 1996a]; [Haythornthwaite & Wellman, 1996]).

One social network method, blockmodeling, inductively uncovers such underlying role structures by juxtaposing multiple indicators of relationships in analytic matrices ([White, Boorman & Breiger, 1974]; [Boorman & White, 1976]; [Wasserman & Faust, 1994]). It might place in one block all those in the structurally-equivalent position of giving (and not receiving) orders, even if these order-givers have no ties with each other. A second block might consist of those who only receive orders, while a third block might consist of those who both give and receive orders. This is but a simplified example of blockmodeling: blockmodeling can partition social network members while simultaneously taking into account role relationships such as giving or receiving orders, socializing, collaborating, and giving or receiving information.
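The simplified three-block example above can be sketched directly. This is not a blockmodeling algorithm: it uses a single hypothetical "gives orders to" relation and sorts actors only by whether they give orders, receive them, or both.

```python
# Hypothetical directed "gives orders to" relation: (giver, receiver).
orders = [
    ("dean", "prof1"), ("dean", "prof2"), ("chair", "prof1"),
    ("prof1", "student1"), ("prof2", "student2"),
]

def order_blocks(orders):
    """Partition actors into give-only, receive-only, and give-and-receive blocks."""
    givers = {giver for giver, _ in orders}
    receivers = {receiver for _, receiver in orders}
    return {
        "give_only": givers - receivers,
        "receive_only": receivers - givers,
        "give_and_receive": givers & receivers,
    }

blocks = order_blocks(orders)
```

Note that "dean" and "chair" land in the same block even though they have no tie with each other, which is the point made in the text.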

Networks of Networks

The “web [network] of group affiliations” [Simmel, 1922] identifies the range of opportunities as well as the constraints within which people operate. Hence the study of relationships does not end with the identification of groups (or blocks). The concept of networks is scalable on a whole network level to a “network of networks” [Craven & Wellman, 1973]: network groups connected to other network groups by actors sharing membership in these groups. This operates in a number of ways. People are usually members of a number of different social networks, each based on different types of relationships and, perhaps, different communication media. For example, a scholar may belong to one network of CMC researchers and also belong to a network of friends. This person's membership in these two networks links the two networks: there is now a path between CMC researchers and the scholar's friends.

Not only do people link groups, but groups link people; there is a “duality of persons and groups” [Breiger 1974, p.181]. The group of CMC researchers brings together people who are themselves members of different groups. Their interpersonal relations are also intergroup relations (Figure 1). For example, the ties of this paper's coauthors link the University of Illinois with the University of Toronto and the disciplines of information science and sociology. Such cross-cutting ties structure flows of information, coordination and other resources and help to integrate social systems.

Figure 1.

A network of networks. (a) Ties between individuals; (b) ties between network clusters
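The duality of persons and groups can be illustrated with a minimal sketch in which one membership list yields both views: ties between groups (through shared members) and ties between persons (through shared groups). The people and groups are hypothetical.

```python
from itertools import combinations

# Hypothetical memberships: person -> groups they belong to.
memberships = {
    "scholar": {"cmc_researchers", "friends"},
    "colleague": {"cmc_researchers"},
    "neighbour": {"friends"},
}

def group_links(memberships):
    """Count shared members for each pair of groups (ties between groups)."""
    links = {}
    for groups in memberships.values():
        for pair in combinations(sorted(groups), 2):
            links[pair] = links.get(pair, 0) + 1
    return links

def person_links(memberships):
    """Count shared groups for each pair of persons (ties between persons)."""
    links = {}
    for a, b in combinations(sorted(memberships), 2):
        shared = len(memberships[a] & memberships[b])
        if shared:
            links[(a, b)] = shared
    return links

between_groups = group_links(memberships)
between_persons = person_links(memberships)
```

The scholar's dual membership is the only link between the two groups, while the two groups in turn connect the scholar to a colleague and a neighbour who share no group with each other.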

Recognizing the nature of the Internet as a network of networks opens up interesting questions for CMC research:

  1. There are questions about the multiplexity of CSSNs. For example, what types of interest groups maintain their single-stranded make-up, and which change to maintain more multiplex relationships?
  2. There are questions about overlap of membership in specialized CSSNs, such as the extent of similarities in newsgroup memberships ([Schwartz & Wood, 1993]; [Smith, 1997]).
  3. There are questions about how co-membership affects the resources flowing into and out of specialized CSSNs. For example, how does the composition of a newsgroup affect the types of information flowing to this group? From what other groups are messages forwarded, and how different are they in content from the newsgroup's self-definition?
  4. There are questions about how computer-supported social networks link organizations. Just as trade and airline traffic flow differentially among countries, Internet traffic flows differentially among universities and other organizations [Schwartz, 1992]. To what extent are such flows correlated with the existing power and size of organizations, or does the Internet diminish differences between the core and the social (or spatial) periphery?

Placing CMC in Context

The preceding discussion has largely focused on a computer network as the only arena of activity. Yet this is a trap, deliberately walked into for heuristic purposes. Computer networks are only one method of maintaining ties, and social networks are not restricted to one medium. Ties may be maintained by face-to-face contact, meetings, telephone, email, writing, and other means of communication. When examining CMCs, it is often useful to distinguish between the types of resource exchange occurring via a particular medium and the resource exchange occurring between actors in a social network who happen to be using these media. We suggest adding to the term “Computer-Supported Social Networks” the notion of “Computer-Assisted Social Networks” (CASNs) to acknowledge that social networks often use both computerized and non-computerized media to sustain relations and ties.

Collecting Network Data for CMC Studies

Selecting a Sample

Ego-centered Networks: Social network analysts gather relational data at different levels of analysis, such as individuals, ties, clusters, or whole networks [Wasserman and Faust, 1994]. In an ego-centered network study, a set of people (selected on the basis of some sampling criteria) are asked questions in order to generate a list of people (alters) who are the members of their personal social network. For example, a person may be asked to report on the people they go to for advice about work matters and the people they go to for advice about personal matters. When the naming of alters is not restricted to a specific group, ego-centered approaches can help identify the different social pools on which people draw for different resources (e.g., [Wellman, 1982]; [Wellman and Wortley, 1990]).

Ego-centered social network studies have almost never collected information about all the relations that people have with all the 1,500 or so members [Kochen, 1989] of their social network. Such an effort would be prohibitively expensive; one heroic researcher took a year to identify all the interactions in the networks of only two persons [Boissevain, 1974]. Thus studies purporting to be of “the social network” are engaging in literary reduction at best. What they really are doing is observing people's specified relations with a sample of their network members, e.g., socially-close network members who provide social support.

Software logging may make it technically feasible for scholars to collect data about all those with whom a person is in contact online, although substantial coding and privacy-invasion questions remain for dealing with the content of these communications. With more resources, the U.S. Federal Bureau of Investigation routinely uses who-to-whom mail covers, wiretaps and email logs to discover all those with whom a person is communicating and to identify organized crime clusters [Davis, 1981], while espionage agencies routinely do the same thing with traffic analyses of who sends messages to whom.

Whole Networks: In a whole network study, people are often given a roster of all the people in a specific group, and asked to identify a connection of some specific content. Every person in the group is surveyed about every other person, which gives an overall snapshot of the structure of relations, revealing disconnections as well as connections. This approach is particularly useful for identifying the relative positioning of members in a network as well as the partitioning of subgroups ([Haythornthwaite, Wellman & Mantei, 1995]; [Haythornthwaite, 1996a]; [Haythornthwaite & Wellman, 1996]). It is also feasible to automate the collection of who-to-whom online contact data within a group.

Before collecting data about either ego-centered or whole networks, researchers must consider where they are going to draw the boundaries or limitations of the sample. Since indirect as well as direct relations can become data, the boundary expands exponentially. For example, people can also be asked to report on relations among the alters named in their network ([Wellman, 1979]; [Wellman, Carrington & Hall, 1988]). Or the alters can be asked for their own list of network members to reveal indirect relations between different networks [Shulman, 1972]. Such data about alter-alter relations can provide information about the interconnectivity of the network, indicating how quickly information might flow among network members, how well the network might coordinate its activity, and how much social control it might exert. One “small world” study investigated the number of steps or ties it took for a note sent by one person to reach an unknown person in an entirely different geographic and social location. The links quickly extended well beyond the original network into the friends of friends and then to their friends. This study suggested that it took no more than six links for information to flow through the United States ([Milgram, 1967]; see also [White, 1970]; [Rapoport, 1979]).
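The idea of counting the steps or ties between two people can be illustrated with a small sketch. This is not the procedure used in the “small world” study, which traced real notes passed between acquaintances; it is only a breadth-first search over a made-up list of ties. The names, the `ties` list, and the function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected ties among egos and alters (illustrative names only).
ties = [
    ("Ann", "Bob"), ("Bob", "Cara"), ("Cara", "Dev"),
    ("Ann", "Eve"), ("Eve", "Dev"), ("Dev", "Fay"),
]

def build_neighbors(pairs):
    """Turn a list of undirected ties into a person -> set-of-contacts mapping."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors

def steps_between(neighbors, source, target):
    """Return the smallest number of ties linking source to target, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        person, dist = queue.popleft()
        for other in neighbors.get(person, ()):
            if other == target:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None  # the two people are not connected in the data collected

neighbors = build_neighbors(ties)
print(steps_between(neighbors, "Ann", "Fay"))  # shortest chain: Ann, Eve, Dev, Fay
```

A result of None is itself informative: it means the boundary drawn around the sample did not capture any chain between the two people.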

Collecting Data

Information about social networks is gathered by questionnaires, interviews, diaries, observations and more recently through computer monitoring. In both whole and ego-centered network studies of CMC, people are often asked to identify the frequency of communication with others as well as the medium of interaction. Questions may refer to a specific relational content such as “socialize with” or “give advice to” within a given time frame. In our studies of communication patterns, respondents were asked to think about each member of their team and to identify the means of communication for each type of relation. For example, they were asked to give an account of their work communication with each person in unscheduled face-to-face meetings, scheduled face-to-face meetings, by telephone, fax, email, paper letters or memos, audioconferencing, and videoconferencing (see Figure 2 for an example of the questionnaire format; [Haythornthwaite, Wellman and Mantei, 1995]; [Haythornthwaite, 1996a]).

Figure 2.


Respondents are often asked to recall behavior that took place over a broad time frame in order to capture as much information as possible. If the time frame is too long, or the amount of information too detailed, reliability and accuracy are jeopardized. This can be a problem for some network communication studies where respondents are expected to recall not only the content of the interaction but also the frequency and the media of communication. There is some concern among social network analysts that data based on recall, although widely used, may be less reliable than data gathered by observation ([Bernard, Killworth and Sailor, 1981]; [Bernard, Killworth, Kronenfield and Sailor, 1984]). Although people are able to rank the relative frequency of communication with others [Romney and Faust, 1982], results may be biased because not all interactions are equally memorable [Christensen and King, 1983]. However, accuracy is not the only concern. Data gathered by self-reporting may be tapping into a different meaning of a communication episode than data gathered by observation. Thus, recall may be better for perceptions of media use, while observation or electronic data gathering may be better for measuring actual use.

Most network researchers agree that the best approach is to use a combination of methods including questionnaires, interviews, observation, and artifacts [Rogers, 1987]. In addition to survey questionnaires, our own research has made use of qualitative data gathered through in-depth interviews and observations. Software applications are useful to organize ethnographic data and to investigate patterns among persons, activities and attitudes towards new media. This process provides a way of integrating the analysis of social networks of persons and offices with cognitive networks of meaning.

Social network questionnaires need not be restricted to asking about relations between people since researchers can also examine intersections between people and their group memberships. In some cases the research question is to discover the crosscutting pattern of memberships in electronic news groups or distribution lists ([Breiger, 1974]; [Finholt & Sproull, 1990]; [Kiesler & Sproull, 1988]). Or investigators may want to find out how CMC has changed the overall structure of membership in face-to-face as well as electronic committees [Eveland & Bikson, 1988]. Online groups attract those with similar interests, and friends may be drawn from these types of focused affiliations [Feld, 1981]. CMC's potential for bringing diversity into group membership may be countered by its efficiency as a tool for finding and maintaining relations with others who share similar narrow sets of attitudes and behaviors. Network data can reveal the structure of these person-group relations and the implications for social behaviors.
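As a minimal sketch of how such person-group data might be examined, the following example derives two views from one membership list: how many members each pair of groups shares, and how many groups each pair of people shares. The person codes, group names, and function names are hypothetical, and this is only one simple way to express the crosscutting pattern of memberships.

```python
from itertools import combinations

# Hypothetical person-by-group memberships (who subscribes to which list).
memberships = {
    "p1": {"news.a", "list.x"},
    "p2": {"news.a", "news.b"},
    "p3": {"news.a", "news.b", "list.x"},
}

def group_overlap(memberships):
    """Count the members shared by every pair of groups."""
    members_of = {}
    for person, groups in memberships.items():
        for group in groups:
            members_of.setdefault(group, set()).add(person)
    return {
        (g1, g2): len(members_of[g1] & members_of[g2])
        for g1, g2 in combinations(sorted(members_of), 2)
    }

def shared_groups(memberships):
    """Count the groups shared by every pair of people."""
    return {
        (a, b): len(memberships[a] & memberships[b])
        for a, b in combinations(sorted(memberships), 2)
    }

print(group_overlap(memberships))
print(shared_groups(memberships))
```

The same membership data thus yields a group-by-group network and a person-by-person network, which is the duality that makes affiliation data useful for studying focused online groups.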

People linked to people and groups are not the only sources of network data. Network analysts also look at other types of structural arrangements. Electronic text, including CMC, can be analysed for patterns of relations between words or phrases ([Carley, 1996]; [Danowski, 1982]; [Rice & Danowski, 1993]). This type of data reveals cognitive maps and identifies people who hold similar conceptual orientations. It has been used to help identify emerging scientific fields and the diffusion of new ideas and innovations ([Carley & Wendt, 1991]; [Valente, 1995]).
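A deliberately simplified illustration of relations between words follows. It counts how often two words appear in the same message, which is only a rough stand-in for the more refined text-network methods cited above; the messages and the function name are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical message texts; real studies would use much larger corpora.
messages = [
    "network data analysis",
    "network analysis software",
    "privacy data",
]

def cooccurrence(texts):
    """Count, for each pair of words, the number of messages containing both."""
    counts = Counter()
    for text in texts:
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts

pairs = cooccurrence(messages)
print(pairs.most_common(3))
```

Word pairs with high counts can then be treated as ties in a network of concepts, in the same way that frequent contacts are treated as ties between people.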

Gathering data electronically replaces issues of accuracy and reliability with issues of data management, interpretation, and privacy. Electronic monitoring can routinely collect information on whole networks or selected subsamples. Time frames are flexible, and any form of computerized communication is potential data. Constraints are the amount of server storage space and the ingenuity of researchers and programmers in their study design. All commands entered into a system are available for monitoring, making it possible to gather information on the form of media used, the frequency of use, the timing and direction of messaging, the subject of the message, and even the content of the message itself.
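To suggest how a monitoring log might be reduced to who-to-whom data, the sketch below tallies directed contacts from a list of log records, optionally restricted to one medium. The record layout (sender, receiver, medium) and the function name are assumptions made for illustration; actual system logs will differ.

```python
from collections import Counter

# Hypothetical log records: (sender, receiver, medium).
log = [
    ("A", "B", "email"),
    ("A", "B", "email"),
    ("B", "A", "video"),
    ("C", "A", "email"),
]

def who_to_whom(records, medium=None):
    """Tally directed contacts, for all media or for a single medium."""
    counts = Counter()
    for sender, receiver, used in records:
        if medium is None or used == medium:
            counts[(sender, receiver)] += 1
    return counts

print(who_to_whom(log))
print(who_to_whom(log, medium="email"))
```

Because direction is preserved, the tally for A to B is kept separate from the tally for B to A, which matters when relations are not reciprocated.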

The amount of information that can be gathered through automated means can be so overwhelming as to pose challenges for interpretation and analysis. Moreover, it is difficult to assess the relative importance of electronic interactions captured in a log, causing researchers to look for other ways to separate trivial communication from significant interactions. In some cases the ‘Subject Header’ is captured along with the who-to-whom data. However, headers may be misleading because they often remain in place long after a topic has been abandoned in the to-and-fro of messaging. Full texts of a message offer more possibilities for sorting out issues of significance and interpretation, but even within a message there may be a sentence or phrase that carries specific meanings known only to the sender and receiver.

Since electronic data can be collected unobtrusively, it is more difficult for people to maintain control over what information is gathered and how it will be used in the future. Sensitive topics may be avoided when people know their mail is being monitored. Capturing electronic communication can reveal alliances and information that may jeopardize employment or work relations. To alleviate these concerns researchers can randomly assign codes so that individuals cannot be identified [Rice, 1994]. However, privacy protections are often less prevalent and less comprehensive in private organizations than in public or government institutions [Rice & Rogers, 1984]. This issue is important for studies of institutional intranets, but even more important for researchers who want to study a larger public on the Internet. How will people know when they are the subjects of a study in online public fora when the researchers do not identify themselves? Must researchers identify themselves if they are only participating in the electronic equivalent of hanging out on street corners or in doughnut shops, where they would never think of wearing large signs identifying themselves as “Researchers”?
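The random assignment of codes mentioned above might be sketched as follows. The names, the code format, and the function names are hypothetical, and the sketch addresses only the replacement of names: message content, and the safekeeping or destruction of the name-to-code key, would need separate attention.

```python
import random

def assign_codes(people, seed=None):
    """Give each person a code in an order unrelated to their name."""
    rng = random.Random(seed)
    shuffled = sorted(people)
    rng.shuffle(shuffled)
    return {person: f"P{i:03d}" for i, person in enumerate(shuffled, start=1)}

def recode(records, codes):
    """Replace names with codes in (sender, receiver) records."""
    return [(codes[sender], codes[receiver]) for sender, receiver in records]

# Hypothetical who-to-whom records containing identifying names.
records = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]
codes = assign_codes({name for pair in records for name in pair})
print(recode(records, codes))
```

The structure of the network is unchanged by recoding, so the analyses described below can proceed on the coded data.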

How are Network Data Analysed?

Ego-centered Analysis

Ego-centered data are often analysed using standard computer packages for statistical analysis (e.g., SAS, SPSS). If the aim is tie-level analysis, then all ties from all networks are analysed as if they were from one grand sample of ties. For example, our research group recently found that work role and friendship level each independently predicted the multiplexity of computer scientists' ties, online as well as offline ([Haythornthwaite, 1996a]; [Haythornthwaite & Wellman, 1996]). If the aim is network-level analysis, summary measures of each network's composition can be calculated using these packages, e.g., the percent who give social support; the percent who are women; mean frequency of contact; median multiplexity [Wellman, 1992a]. In such ego-centered network analysis, information about network members, such as their age or gender, is most conveniently stored in the same dataset as information about the tie between that network member and the ego at the center of a network. Such operations can provide information that, for example, networks with more contact (or higher percentages of women) tend to be more multiplex. Merge procedures can link tie or network data with information about the ego at the center of a network, facilitating the analysis of such questions as: do supervisors (or women) have more multiplex (or supportive) social networks?
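The network-level summary and merge steps described above can be sketched outside a statistical package. The example below is not SAS or SPSS code; it uses plain Python, an invented tie dataset in which each row carries the alter's gender, a contact count, and the number of relations in the tie (taken here as a simple measure of multiplexity), and hypothetical field and function names.

```python
from statistics import mean, median

# Hypothetical tie-level records: one row per ego-alter tie.
ties = [
    {"ego": "e1", "alter": "a1", "alter_gender": "F", "contacts": 10, "relations": 3},
    {"ego": "e1", "alter": "a2", "alter_gender": "M", "contacts": 2, "relations": 1},
    {"ego": "e1", "alter": "a3", "alter_gender": "F", "contacts": 6, "relations": 2},
    {"ego": "e1", "alter": "a4", "alter_gender": "F", "contacts": 2, "relations": 1},
    {"ego": "e2", "alter": "a5", "alter_gender": "M", "contacts": 4, "relations": 1},
    {"ego": "e2", "alter": "a6", "alter_gender": "F", "contacts": 8, "relations": 2},
]

# Hypothetical ego-level attributes, stored separately from the ties.
ego_attributes = {"e1": {"role": "supervisor"}, "e2": {"role": "staff"}}

def summarize_networks(rows):
    """Compute composition measures for each ego's network."""
    by_ego = {}
    for row in rows:
        by_ego.setdefault(row["ego"], []).append(row)
    summary = {}
    for ego, members in by_ego.items():
        women = sum(1 for m in members if m["alter_gender"] == "F")
        summary[ego] = {
            "size": len(members),
            "pct_women": 100 * women / len(members),
            "mean_contacts": mean(m["contacts"] for m in members),
            "median_multiplexity": median(m["relations"] for m in members),
        }
    return summary

def merge_with_egos(summary, attributes):
    """Attach each ego's own attributes to the summary of their network."""
    return {ego: {**attributes.get(ego, {}), **stats} for ego, stats in summary.items()}

merged = merge_with_egos(summarize_networks(ties), ego_attributes)
for ego, row in merged.items():
    print(ego, row)
```

Once merged, each row describes one ego together with the composition of that ego's network, which is the form needed to ask whether, say, supervisors have more multiplex networks.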

The whole network analytic procedures described below can be used to analyze the structure of each ego-centered network. However, this is laborious to do for large samples of egos, because existing software requires that an analytic run be performed separately for each ego. For manageable samples, the resultant structural data can be merged (via SAS or SPSS) with the datasets describing each ego's attributes (e.g., gender) and the composition of each ego-centered network (e.g., median multiplexity).

Whole Network Analysis

Whole network studies examine the structure of social networks (including groups or blocks), as well as the networks' composition, functioning, and links to external environments. For example, our research group is interested in assessing the role of email and desktop videoconferencing within the context of overall communication. This has meant examining such questions as:

  • 1. Who talks to whom? [the composition of ties]
  • 2. About what? [the content of ties and relations; the composition of ties].
  • 3. Which media do they use to talk (a) to whom and (b) about what?
  • 4. How do ties and relations maintained by CMC change over time?
  • 5. How do interpersonal relations such as friendship, work role and organizational position affect CMC?
  • 6. How do computer-mediated communications differ from face-to-face communications in terms of (a) who uses them and (b) what people communicate about?
  • 7. Do computer-mediated communications describe different social networks than face-to-face communications?

Several microcomputer programs have been especially designed to analyze social network structure: UCINET, Multinet, Negopy, Krackplot, and Gradap, with the combination of UCINET and Krackplot being the most widely used. To use these applications, data often must be transformed into a matrix with rows and columns representing the units of analysis. These units can be people, events, groups or other entities that are related to one another. In a person-by-person whole network study, the columns and rows represent the respondents. In a directed matrix, rows represent the initiators and columns the receivers of specific relations. For example, Person B gives advice to Person D but Person D may not reciprocate.

Each relation is represented by one matrix. For example, in our whole network study of members of a distributed work group, we constructed one matrix for frequency of “overall work interaction” by totaling their communication by each medium (see Figure 3). Individual matrices are constructed for separate media. In a longitudinal study there are matrices for each time period as well as for each relation. Managing data in matrix format can be a challenge if there are many different relations or, as in the case of communication media studies, several types of media for each relation.
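The totaling of separate media matrices into one overall matrix can be sketched as follows. The three-person example, the media names, and the counts are invented, and the nested-list layout is only a convenient stand-in for the input formats that network analysis programs expect. Rows are initiators and columns are receivers.

```python
# Hypothetical labels and per-medium directed matrices of contact counts.
people = ["A", "B", "C"]
by_medium = {
    "email": [
        [0, 2, 0],
        [1, 0, 0],
        [0, 3, 0],
    ],
    "telephone": [
        [0, 1, 1],
        [0, 0, 0],
        [0, 0, 0],
    ],
}

def total_matrix(matrices):
    """Add same-sized square matrices cell by cell."""
    matrices = list(matrices)
    n = len(matrices[0])
    total = [[0] * n for _ in range(n)]
    for matrix in matrices:
        for i in range(n):
            for j in range(n):
                total[i][j] += matrix[i][j]
    return total

overall = total_matrix(by_medium.values())
for label, row in zip(people, overall):
    print(label, row)
```

In a longitudinal design the same step would be repeated for each time period, which is where the data management burden noted above begins to grow.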

Figure 3.

Data Matrix of Overall Work Interaction at Time 1.

The matrix in Figure 3 is formatted for UCINET. The first line includes a specification of the number of nodes: in this case the number of people included in the study (n=9). The labels represent the abbreviated codes assigned to each person. The cells of the matrix represent the number of times a respondent communicated with another about work-related matters over a two-week period. (The zero diagonals are not usually used in analyses.) Networks can be described mathematically in a variety of ways. Since a review of graph theory and algebraic notation is not within the scope of this paper we provide only a brief overview of some of the results that can be generated. (For further reading on the mathematical theory underlying network data analysis see [Wasserman & Faust, 1994].)

Analyses of interaction frequency identify the connections between people, which can be used to build network models of resource flows or influence. They can also provide information on the overall density of interactions within a whole network or frequency of exchange among specific ties. Subgroups such as cliques are identified through partitioning the network into clusters of relative interaction density. Communication positions such as “isolate”, “bridge” or “star” emerge from an analysis of matrix data. Visual representations of relational matrices are generated by establishing coordinates through multidimensional scaling and importing these into a drawing program such as Krackplot. They can also be generated from within Krackplot by importing raw matrix data from UCINET or another compatible program. Visual representations of a network help identify the overall structure of positions and changes over time.
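A few of these measures can be illustrated on a small directed matrix. The sketch below is not how UCINET computes them; it uses common textbook definitions (density as the share of possible directed ties that are present, an isolate as someone with no ties in either direction, and in-degree as the number of people who initiate contact with a person). The four-person matrix and the function names are hypothetical.

```python
# Hypothetical directed matrix: rows initiate, columns receive.
labels = ["A", "B", "C", "D"]
matrix = [
    [0, 3, 1, 0],
    [1, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 0],
]

def density(m):
    """Share of possible directed ties (diagonal excluded) that are present."""
    n = len(m)
    present = sum(1 for i in range(n) for j in range(n) if i != j and m[i][j] > 0)
    return present / (n * (n - 1))

def isolates(m, names):
    """People with no ties sent or received."""
    n = len(m)
    return [
        names[i]
        for i in range(n)
        if all(m[i][j] == 0 and m[j][i] == 0 for j in range(n) if j != i)
    ]

def in_degree(m, names):
    """Number of others who initiate contact with each person."""
    n = len(m)
    return {
        names[j]: sum(1 for i in range(n) if i != j and m[i][j] > 0)
        for j in range(n)
    }

print(round(density(matrix), 2))
print(isolates(matrix, labels))
print(in_degree(matrix, labels))
```

In this toy matrix the person with the highest in-degree is a candidate “star”, while the person with no ties at all is an isolate. Identifying bridges and cliques requires more elaborate procedures than are shown here.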

For example, in our study of media use within an organizational context we were interested in whether the introduction of a desktop video-conferencing system would produce new patterns of communication and increase collaboration between geographically distributed work groups. We collected data on work relations and media use both before the implementation of the system, referred to here as CMS, and at six-month intervals afterward. Seven members of the organization, including the President, worked from headquarters. The Vice President and his office coordinator operated from a satellite office 100 km away. The following sociograms were generated by Krackplot and depict the organizational communication structure at different times over the twenty-month study. In these visual representations people are displayed as points and arranged in relation to the relative frequency of their interaction. People who communicated more with each other are placed closer together. The connecting lines indicate communication direction and frequency level (above average).

The sociograms of work interaction (by all media) before and after the introduction of CMS (Figures 4 and 5) indicate that few changes took place in the overall structure of communication patterns. The Headquarters Office Coordinator remained a central communication star despite changes in staff and job descriptions. Furthermore, the Satellite Office Coordinator remained a relative isolate, connected to the others primarily through a link with the Vice President. This interpretation of the sociograms is reinforced by statements made in interviews with individual organizational members. They reported a continued preference to organize their work activities with others who were physically proximate despite the addition of CMS. The exception was the Vice President, who reported increased connectivity with all members of the organization and in particular with the President.

Figure 4.

Work Interaction by all media prior to the introduction of CMS

Figure 5.

Work Interaction by all media eighteen months after the introduction of CMS

Figures 6 and 7 are sociograms showing the work-related communication networks that operated via CMS. The data are drawn from an electronic log of all interactions on CMS by all members of the organization over eighteen months.

Figure 6.

Work Interaction by CMS six months after Introduction.

Figure 7.

Work Interaction by CMS eighteen months after Introduction

In Figure 6 the Vice President and President are directly connected via CMS with an above average use in the direction from the President to the Vice President. In interviews both parties reported that CMS was helpful in supporting collaboration and decision-making. The system saved the Vice President travel time between sites. More importantly, it allowed him to take part in unscheduled meetings as well as spontaneous consultations. The Satellite Office Coordinator was also pleased with the system since there was no pressure from headquarters to take on more work responsibilities, and yet CMS provided an added visibility within the work group.

Eighteen months after the initial implementation of CMS there were changes in the relative positions and linkages between organizational members who used the system to communicate about work (see Figure 7). For example, the Vice President and President were no longer directly connected nor even close to each other in their use of the system. CMS had blurred boundaries between the two office sites which made it more difficult for the Vice President to unobtrusively control frequency and timing of interruptions by others. Since work between sites could be handled without visual synchronicity, CMS had become less useful as a tool for collaboration. However, CMS was used among members of the support staff. This group did not distribute work to each other and consequently did not report CMS interaction as disruptive or problematic. Media use in this case study was dependent upon the nature of the relation between the users as well as the features of the technology and the distribution of tasks.

Sociograms provide snapshots of organizational interaction structures which can indicate how static or dynamic these structures are over time. From these types of diagrams we can visually identify emergent positions and clusters of interaction. The nature and content of relations may be part of the initial construction of the sociogram or determined later through surveys and interviews with members of the network. Visual depictions of whole networks can highlight both linkages and non-linkages, revealing ‘structural holes’ [Burt, 1992]. By examining these patterns of mediated and unmediated interaction we gain an added perspective on communication structures that underpin explicit work processes as well as those that support affective, less instrumental behaviors.


Because computer networks often are social networks, the social network approach gives important leverage for understanding what goes on in computer-mediated communication: how CMC affects the structure and functioning of social systems (be they organizations, workgroups or friendship circles) and how social structures affect the way computer-mediated communication is used.

Initial studies of computer-mediated communication developed from studies of human-computer interactions. Such studies focused on how individuals interfaced with various forms of “groupware”: software and hardware adapted for computer-mediated communication [Johnson-Lenz and Johnson-Lenz, 1994]. The obvious analytic expansion beyond the individual has been to the tie, i.e., how two persons interact through CMC. Not only is this a natural expansion, it is analytically tractable, and it has fit the expertise of those social scientists who have pioneered CMC research: psychologists and psychologically-inclined communication scientists and information scientists.

A need for new ways of analyzing CMC has developed with the spread of computer networks and the realization that social interactions online are not simply scaled-up individuals and ties. Analysts want to know how third parties affect communications, how relations offline affect relations online, and how CMC intersects with the structure and functioning of social systems. For example, have organizations flattened their hierarchy, are virtual communities rebuilding social trust online, and have personal attributes become less relevant on the Internet where “nobody knows you are a dog” (to quote a legendary New Yorker cartoon)? Given the network nature of computer-mediated communication, the social network approach is a useful way to address such questions.


Research for this paper has been supported by the Social Science and Humanities Research Council of Canada, the Information Technology Research Center, and Industry Canada. We appreciate the advice of Joanne Marshall and Marilyn Mantei and the assistance of our Web guru, Keith Hampton.