Representation, Interaction, and Intersubjectivity

Authors

  • Richard Alterman

    Corresponding author
    1. Computer Science Department, Brandeis University
      Computer Science Department, Center for Complex Systems, Brandeis University, P.O. Box 549110, Waltham, MA, 02454-9110. E-mail: alterman@cs.brandeis.edu

Abstract

What the participants share, their common “sense” of the world, creates a foundation, a framing, an orientation that enables human actors to see and act in coordination with one another. For recurrent activities, the methods the participants use to understand each other as they act change, making the intersubjective space in which actors operate richer and easier to produce. This article works through some of the issues that emerge from a close examination of intersubjectivity as it is managed through representation and interaction. The data that are presented document, in detail, a sequence of related interactions, within and across episodes of cooperation, where continuity and change can be observed. The emergence of conversational structure and the emergence of coordinating representations are significant milestones in the long-term development of a representational practice that supports the runtime co-construction of intersubjective space. Conversational structures emerge interactively to mediate recurrent points of coordination in the domain activity, and only secondarily the conversation itself. Coordinating representations change the representational practice of the participants by making it easier to manage their “shared view” of the collective work, enabling the participants to make progress, expand the field of the common activity, while exhibiting more control of if and when explicit grounding occurs.

1. Introduction

There is an unspoken commonness to any situation that provides the background, the framing that enables humans to understand each other and their shared field of activity, and to work together effectively. The participants' common “sense” of the situation creates a foundation, a framing, an orientation that enables human actors to see and act in coordination with one another. Without intersubjectivity there is no human communication, no accumulation of knowledge within a community across generations, and no emergence of complex patterns of social interaction. Intersubjectivity (literally “between subjects”) is arguably the organic structure of human cognition (Clark, 1996; Cole & Engeström, 1993; Hutchins, 1995a; Leont'ev, 1972; Schegloff, 1992; Schutz, 1967; Tomasello, Kruger, & Ratner, 1993; Vygotsky, 1978).

A representational account of intersubjectivity features an analysis of either the functioning of a set of representations (Hutchins, 1995a) or the content of, and correlation among, the participants' prior and emergent understandings of the situation as they are internally represented (Clark, 1996; Lewis, 1969). An interactionist account locates intersubjectivity in the production and use of representations, not the content itself (Clark, 1996; Garfinkel, 1967; Schegloff, 1992). A cultural historic account of intersubjectivity closely examines the historical character of the participants' shared activities and the role of mediation (Cole & Engeström, 1993; Hutchins, 1995a; Leont'ev, 1972; Vygotsky, 1978).

The study reported in this article examines how a “common” sense of the situation emerges among the participants while engaged in a cooperative activity in a work-like context. The participants are working in a rich representational environment. The participants are multitasking, sometimes working in parallel, and other times closely with a joint focus. The dynamic nature of the participants' task to share the world as they act makes each occasion of cooperation different (Lave, 1991; Suchman, 1986). Thus, at various points of the cooperation, the participants must make adjustments that enable them to align their private views of the situation.

The main task of this article is to work through some of the issues that emerge from a close examination of intersubjectivity as it is managed through representation and interaction. By embedding “conversation” in the context of the overall activity, some important features of how interaction and representation produce a runtime commonality to the situations will be revealed. The interaction among the actors, and their common representational activities, enable the participants to multitask, work in parallel, divide their tasks by role and expertise, and vary their level and type of participation both within the activity and across teams. The data will show that in work-like environments, the co-construction of intersubjective space is as much about how the participants develop their representational practice to avoid jointly focused, sequential grounding interactions, as it is about the grounding itself.

The data document, in detail, a sequence of related interactions, within and across episodes of cooperation, where continuity and change can be observed. The analysis of the data shows that methods used by participants to “stay on the same page” in a conversation are not the same as those that are used in more work-like domains where multitasking and forms of interaction other than face-to-face are common. Because of the dynamics of the activity, whatever common procedures emerge, there remain points of coordination, moments of interaction between the participants, where the participants mark, confirm, or negotiate the progress of their private understandings of the shared endeavor (Alterman & Garland, 2001). Conversational structures emerge interactively to mediate recurrent points of coordination in the domain activity, and only secondarily the conversation itself. Coordinating representations change the representational practice of the participants by making it easier to manage their “shared view” of the collective work, enabling the participants to make progress, expand the field of the common activity, while exhibiting more control of if and when explicit grounding occurs. Both conversational structure and coordinating representations are significant elements of the common representational practice that emerges within the community in support of managing the development of common understanding at runtime, making the intersubjective space in which the actors operate richer and easier to produce and reducing the number of occasions when explicit sequential grounding interactions must occur.

2. Representation and interaction

Suppose there are multiple actors approaching a stop sign at an intersection in the road. The stop sign is a physical object at the scene of the activity and both parties attend to it. The stop sign has meaning. The meaning assigned to it has bearing on the coordination of the cooperative activity. As sense is made of the situation, the stop sign mediates the interaction.

In a situation like this, each individual brings to bear a tremendous amount of knowledge. Although each participant, on a standardized test, could identify the traffic laws, the internal representation of each participant is not likely to be a rote memorization of the law. In addition to basic information about the traffic laws on stopping, each participant is also familiar with conventions for acting, under various conditions, when a stop sign is in force at an intersection in the road. Other kinds of relevant knowledge for which each actor has a mental representation concern the types of participants in a traffic situation (other drivers, cyclists, and pedestrians) and expectations about typical behaviors, rolling stops versus legal stops, rush hour traffic, and so forth.

Only a selection of general knowledge is directly relevant to the sense made of a particular encounter. Road constraints, the heaviness of traffic, and time constraints are all nuances of the situation-at-hand that influence how and what sense is individually made of the situation by each of the actors.

For a traditional cognitive scientist, explaining what is common in the sense that is made by each of the participants depends on an accounting of mental content. Individual beliefs about the collaborative field of action compose the intersubjective space in which the actors operate. The predispositions and expectations of the participants, their quality, number, and correlation, characterize the richness of the intersubjective space prior to the activity. To coordinate behavior, each actor's beliefs about the structure of the behavior must be aligned. The commonness of the sense that is made depends on the degree to which the individuals' understandings of the situation, as they are internally represented, correlate; or alternatively, the commonness depends on the amount of work it takes to align the private understandings of the individuals to accomplish some cooperative task.

A simple version of the representational viewpoint might argue that the commonness of each participant's assessment of their shared domain of activity is an intersection between their individual internal representations of the situation: what is in common is either located in the intersection of each actor's general knowledge, or in the intersection of the set of beliefs of each of the participants.

Interactionist accounts begin with the idea that a simple representational account of intersubjectivity cannot work (Garfinkel, 1967). Intersecting sets of internal representations cannot account for the commonality of the situation. Rather, the focus should be on the organization and flow of social interaction that produces a common understanding. “The appropriate image of a common understanding is therefore an operation rather than a common intersection of overlapping sets” (p. 30).

From this perspective, an explanation of intersubjectivity should focus on the procedures by which “doers of action” produce shared knowledge (Schegloff, 1992, p. 1299):

Instead, what seemed programmatically promising was a procedural sense of “common” or “shared,” a set of practices by which actions and stances could be composed in a fashion which displayed grounding in, and orientation to, “knowledge held in common”—knowledge that might thereby be reconfirmed, modified, expanded and so on.

(p. 1298)

The participants can never directly compare their mental representations of their individual sense of the situation. Intersubjectivity is located in the procedure the participants use to display their orientation toward the collaboration. The organization of the interaction provides the participants with opportunities to display, repair, and orient themselves as they proceed with their activity.

The organization of ordinary conversation provides opportunities for the interactants to display their understanding of the situation-at-hand and also recognize and repair breakdowns of intersubjectivity (Schegloff, 1992). Conversation is sequential; the interactants take turns. In the first position, a speaker presents a contribution to the conversation. In the second position, other participants have an opportunity to display a response. In the third position, the initial speaker can amend her presentation if it did not invoke a preferred response. In this manner, it is the organization of the conversation, the organization of repair in conversation—the interaction, not representation—that forms the basis, the framework of analysis, for the intersubjective.

In a similar fashion, at the stop sign, the participants never really know exactly what the other actors believe about the situation, but their actions can display an orientation to, a stance toward, what is their presupposed common knowledge of the situation.

A representational account is concerned with characteristics of the representations that are produced. An interactionist account is concerned with how the representation is produced. The intersubjective space in which actors operate is located in both the production and product of their work to share an understanding of the situation of engagement.

2.1. Common ground

Common ground provides a basis for collaborators to coordinate their joint activities (Clark, 1996). It is composed of three parts. The initial common ground is the set of background facts, assumptions, and beliefs that are shared by the participants. The current state of the activity is a second part of common ground. The third is the public events during the current activity that the participants have witnessed.

Grounding is the method by which participants add new content to common ground (Clark, 1996; Clark & Brennan, 1991). Grounding occurs when “The contributor and his or her partners mutually believe that the partners have understood what the contributor meant to a criterion sufficient for current purposes” (Clark & Brennan, 1991, p. 129). Grounding includes both presentation and acceptance phases and may require interaction to achieve acceptance. One actor, A, presents an utterance u, with the expectation that during the acceptance phase B will provide evidence that B understood u (Clark & Brennan, 1991). There is a range of evidence that B can contribute that varies in the strength of evidence the contribution provides (Clark & Schaefer, 1989). In general, the participants try to minimize the total effort spent on their contribution in both the presentation and acceptance phases of the interaction (principle of least collaborative effort).

Common ground is defined in terms of a belief about some proposition p: p is a part of common ground for a set of actors if they all believe p and they believe that the other actors also believe p and that those other actors believe that they believe p and so on. A more formal definition of the grounding criteria introduces the notion that there is a basis b for the participants to believe that p is a part of common ground (Clark & Marshall, 1981; Lewis, 1969).
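The iterated definition above can be written compactly. The following is one common way to formalize it, offered as a sketch rather than as the notation of the cited sources; the symbols B_i (actor i believes), E_G (everyone in group G believes), and C_G (common ground for G) are assumptions introduced here for illustration.

```latex
% E_G p: every actor i in the group G believes p
E_G\,p \;\equiv\; \bigwedge_{i \in G} B_i\,p

% C_G p: p is part of common ground for G when belief in p is iterated
% to every depth (all believe p, all believe that all believe p, ...)
C_G\,p \;\equiv\; \bigwedge_{k \ge 1} E_G^{\,k}\,p
```

Read this way, the basis b mentioned above plays the role of a finite piece of evidence that licenses the whole infinite conjunction, so the actors do not have to verify each level of belief separately.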

In the situation at the stop sign, there are any number of propositions p that could be grounded and there are points in the interaction where the participants take turns sequentially. However, the participants also work in parallel and are multitasking. At some point in the interaction the proposition p that “Joe is going first” is believed by each of the participants. The grounding criteria are met when Joe and Sally mutually believe that “Joe is going first.” The mutual belief of this proposition can come before either actor enters the intersection or it could come after Joe has already entered the intersection.

Within the larger context of a work-like environment, there are various reasons why grounding may need to be avoided, delayed, or both. The driving issue here is the cost of grounding. In the context of a work-like environment, actors are working in parallel and dividing the labor by kind. Both of these improve the performance of the group as a whole. As a consequence of these divisions each of the actors ends up multitasking. From an efficiency point of view, the problem with explicit grounding is that it halts the parallel efforts of the group. Enabling participants to delay grounding improves performance by reducing the number of costly interruptions in the activity. Even when sequential grounding interactions occur they are mediated by conversational structures that are engineered by the participants to mediate “the conversation” at expected points of recurrent coordination. These conversational structures primarily organize the coordination of the domain activity, and only secondarily the “conversation” itself.

2.2. Distributed cognition

Within the work context, the set of participants and the internal and external representations that mediate their activity form a historically conditioned and functional system. It emerges from the modifications, changes, and improvements that have developed within a community over time. Distributed cognition frames cognition in terms of this larger unit of analysis: the entire collection of representational devices, content, and methods employed by the participants at the scene of the activity (Hutchins, 1995a).

A representational system in which a collaborative activity develops has three parts:

  1. A set of representational media available to the participants.
  2. A set of internal or external, private or shared, representations including those provided in the design of the task environment and ones created at runtime.
  3. A set of procedures for communicating, recording, modifying, transcribing, and aligning multiple, partial representations of the shared context.

On a given occasion of activity, each event triggers a propagation of representational changes (Hutchins, 1995b; Hutchins & Klausen, 1992). An aircraft is ultimately controlled by the movement of information across the various representational media available in the cockpit. On board a Navy ship, navigation into the harbor is guided by a propagation of information through various representational media (Hutchins, 1993; Hutchins, 1995a).

Some of the representational activity of the participants is directly relevant to the maintenance of an intersubjective space, but not all. If one of the actors uses a calculator to compute a sum, the activity is part of how the representational system functions, but not directly relevant to the co-construction of intersubjective space.

In the classroom, during a lecture, the slides from the teacher's presentation, what was written on the chalkboard, and the student's notes are all part of the representational system in which the participants cooperate. Throughout the semester representations are being propagated within the system. Decisions about whether to hand out printed versions of a lecture before the lecture have significant impact on the distribution, amount, and kinds of representational work that are done. Learning can be modeled by the transformation of information from one representation (the notes on the chalkboard) to other ones (the student's notes, the internal memory of the student). During the semester some of the representational activities primarily serve the function of keeping the participants “on the same page,” but not all.

The stop sign is part of the representational system that the participants use to negotiate the flow of traffic at a busy intersection in the road. With increasing amounts of traffic, an intersection that once could be traveled without incident may require the addition of a stop sign (a change to the representational system) to enable the drivers and pedestrians to more efficiently and effectively construct at runtime an intersubjective space in which to operate.

The participants in a work-like environment continue to reengineer their representational activity so as to better support the co-construction of the intersubjective space. A focus of this article will be on the coordinating representations, why they are introduced, and how they function. The basic story is that, with coordinating representations, the actors are better able to manage when and how the intersubjective space emerges, enabling the participants to work in parallel, delay and reduce the number and size of costly sequential interactions and interruptions, while continuing to “stay on the same page.” With these innovations, individual understandings may diverge for a period of time but common representational activities evolve that support the recalibration of sense among the actors at a propitious time.

3. Case study

Collecting data that depends on recording a runtime interaction of the participants is not an easy task. Detailed note taking is incomplete, labor intensive to collect, and by its very nature interpretive. Technology has been used to collect interactional data that is more complete and less dependent on the subjective interpretation of the author. In conversational analysis, transcripts of recorded telephone conversations are used as data for analysis (Sacks, Schegloff, & Jefferson, 1974). Video technology has also been used to collect detailed interactional data (Suchman & Trigg, 1991). Both of these kinds of technology achieve greater fidelity in the recording of the interaction.

There are problems, however, with using either of these technologies to collect data of the sort that was needed to study the reciprocal and dynamic relationship between representation and interaction in the runtime construction of an intersubjective space. Both kinds of technology have very high transcription costs. Recorded telephone conversations would not be sufficient for a study that analyzes how the design of a task environment mediates cooperation. No matter how many videotapes are collected, there may still be relevant activity that is occurring outside the purview of the camera. Collecting multiple videotapes alleviates some of this problem but it also introduces a new one: The correlation of multiple tapes is technically complicated and time consuming. Both of these technologies work best by capturing a single episode of interaction. Neither of these technologies can be easily used to conduct a study that strings together several snapshots of cooperative behavior in order to capture the flow, growth, and development of intersubjective space for a set of recurrent activities within a community of actors.

Over an extended period, my group at Brandeis has been experimenting with a same-time, different-place groupware system (VesselWorld) as a platform for analyzing real-time, computer-mediated collaborations. All events that occur during a VesselWorld problem-solving session are recorded in a log file by the system. Every mouse click, every event, and every shared bit of information is recorded without bias in the transcript for a session. Each transcript automatically includes markings for different types of events, for example, a “planning event” or a “chat event.” A VCR-like program (called SAGE) was built to review the decision making of each group and examine how the participants in a VesselWorld session coordinate their activities and the exchange of information (Landsman & Alterman, 2003). Because the saved data have an inherent structure, the analyst can search through the data using any number of criteria; for example, he or she can move forward to the next communication, round, plan action, or other such event within the system, allowing for an easier review of the bulky data logs. VesselWorld was demonstrated at CSCW 2000 (Landsman, Alterman, Feinman, & Introne, 2001).
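As a rough illustration of how a typed event log supports this kind of navigation, the following Python sketch steps forward to the next event of a given type. It is not the SAGE implementation; the Event fields, the type tags, the chat text, and the next_event function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One logged event (hypothetical record layout, not the VesselWorld format)."""
    round_no: int  # round in which the event occurred
    kind: str      # type tag, e.g. "chat" or "plan"
    actor: str     # e.g. "Crane1", "Crane2", "Tug"
    detail: str    # free-text payload


def next_event(log: List[Event], start: int, kind: str) -> Optional[int]:
    """Return the index of the first event of `kind` after position `start`."""
    for i in range(start + 1, len(log)):
        if log[i].kind == kind:
            return i
    return None  # no later event of that kind


# A tiny invented transcript, used only to exercise the search.
log = [
    Event(1, "plan", "Crane1", "move toward 265/318"),
    Event(1, "chat", "Crane1", "small waste at 265/318"),
    Event(1, "plan", "Tug", "move toward barge"),
    Event(2, "chat", "Tug", "ok, marked"),
]
```

Because every event carries its type, a reviewer can jump from one chat event to the next without reading the intervening plan events, which is the property the paragraph above attributes to structured logs.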

A study based on a transcript from an online collaboration is more complete than one based on videotaping because the transcript automatically captures everything the participants shared with one another. Because events are automatically marked on the transcript as being of a certain type, it is significantly less time consuming to analyze the data. See Landsman (2006) for a further discussion of this technology.

4. The base version of VesselWorld

We have built several versions of VesselWorld and collected over 100 hr of data. Several formal studies comparing teams of participants using different versions of the VesselWorld platform have been done. This article compares only two versions of the representational system of VesselWorld: the basic system (VesselWorld) and VesselWorld+.

4.1. Task

In VesselWorld, three users, situated at three physically separate locations, engage in a set of cooperative tasks that require the coordination of behavior in a simulated environment. In the simulated world, each participant is the captain of a ship. Their joint task is to find and remove barrels of toxic waste from a harbor and load them onto a large barge. Two of the users operate cranes that can be used to lift toxic waste from the floor of the harbor. The third user is the captain of a tugboat that can be used to drag small barges from one place to another.

Segments of activity are divided into rounds; it takes at least six rounds of activity to move from one end of the harbor to the other. During a round of activity, participants plan out their future actions explicitly and then submit them to the system. They also chat with one another and can access and store various kinds of information. Once a participant has submitted his or her next action, he or she can no longer change it. When all three participants have submitted actions, the round ends, the system updates the state of the world, and the next round begins.
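The round structure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the VesselWorld code: the Round class, its method names, and the action strings are hypothetical, and the world update that follows a completed round is left out.

```python
class Round:
    """One round: each participant submits one action, then the round closes."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.submitted = {}  # participant -> submitted action

    def submit(self, who, action):
        """Record an action; return True once every participant has submitted."""
        if who not in self.participants:
            raise ValueError(f"unknown participant: {who}")
        if who in self.submitted:
            # Once submitted, an action can no longer be changed.
            raise RuntimeError(f"{who} has already submitted this round")
        self.submitted[who] = action
        return self.is_complete()

    def is_complete(self):
        """The round ends only when all participants have submitted."""
        return set(self.submitted) == self.participants
```

The point of the sketch is that the only enforced synchronization is the end of the round; within a round, planning, chatting, and information lookup can proceed in any order and in parallel.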

There are many complications in clearing the harbor. The participants have limited (and non-identical) areas of perception, and the harbor must be searched to discover the toxic waste. Some barrels are large and require the two cranes to join together and lift them simultaneously. What equipment is needed to retrieve a particular barrel can only be determined by the Tug operator, and only when he is next to the barrel.

4.2. The representational system

A portion of the interface for the base system of VesselWorld is shown in Fig. 1. The WorldView (the large window in Fig. 1) graphically represents several kinds of information about the location and status of objects, from the perspective of an individual participant. It depicts the harbor from the participant's point of view; only a limited region of the whole harbor, the shaded region in the figure, is visible at any one time. When two or more vessels have overlapping radiuses of perception, the participants can “see” each other to the extent that they know the other vessel(s) are nearby, but there is not sufficient detail to determine in what sort of activity the other ship is engaged. The participants can “mark” their map with labeled markers, but they cannot see each other's markers.

Figure 1. The interface for the basic system.

A second window of information is used for editing and displaying the user's current plan. A third window allows the user to access more detailed information about visible objects. Textual chat is used as the primary method for participants to communicate with one another during their cooperative activity.

Each event in VesselWorld triggers a propagation of representational changes (Hutchins & Klausen, 1992). When a waste is first discovered, it is represented in the actor's WorldView. The size of the waste and the coordinates of its location are re-represented in the chat window and the marker lists of each participant. When the tug reports whether special equipment is needed to remove the waste, there is another propagation of representation.

Fig. 2 shows a sampling of the rules each of the groups used to propagate representational state in support of the handling of barrels of toxic waste and the representational work it entails. Suppose Crane1 discovers a small waste (Waste19) at a particular location, 265/318. When Crane1 reports her discovery via the chat channel to the other participants, knowledge of the newly discovered domain object is shared. The location and size of the waste are information represented in the WorldView. The discovery of the waste is reported in the chat window; this requires a transcription from the WorldView to the chat window (Rule 1). It is the responsibility of the other two actors to record this information by marking their private maps (Rule 3).

Figure 2. Propagating representational state about wastes.

Fig. 3 gives some examples of the kind of chatting in which the participants engage. Most of the participant dialogue is centered on the barrels of waste and how effort can be coordinated in removing the barrels from the harbor and transporting them to the large barge. The participants must also keep track of what areas of the harbor have (or have not) been searched. The participants must discover and then keep track of the location of wastes. Initially this is the location of a waste in the harbor; later this includes whether a waste has been moved—and if so, where—and if it is on a small barge, in what order it was stacked. References to the wastes must be shared; these references can change depending on the circumstances.

Figure 3. Examples of chatting.

5. What is grounded? How is it grounded?

Suppose the two cranes are about to remove an extra large waste (Waste25) from the harbor. There are several coordination problems involved in the removal of the waste:

  • Both cranes must be in close proximity to the waste.
  • One of the cranes may need to deploy equipment before the lifting begins.
  • During the same round of activity, the cranes must join together.
  • During the same round of activity, the cranes must jointly lift the waste.
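The coordination requirements listed above can be made concrete with a small check over two per-round plans. The sketch below is illustrative only: the plan encoding (one action string per round), the action names, and the coordination_problems function are assumptions, and "same round" is read here as both cranes planning the same action at the same round index.

```python
def first_round(plan, action):
    """Index of the first round in which `action` is planned, or None."""
    return plan.index(action) if action in plan else None


def coordination_problems(plan1, plan2, near1, near2, needs_equipment=False):
    """List the coordination requirements that the two plans fail to meet."""
    problems = []
    if not (near1 and near2):
        problems.append("both cranes must be next to the waste")
    # Joining and lifting each require both cranes to act in the same round.
    for action in ("join", "lift"):
        r1, r2 = first_round(plan1, action), first_round(plan2, action)
        if r1 is None or r2 is None or r1 != r2:
            problems.append(
                f"'{action}' must be planned by both cranes in the same round"
            )
    if needs_equipment:
        # At least one crane must deploy equipment before its lift.
        deployed_in_time = False
        for plan in (plan1, plan2):
            d, l = first_round(plan, "deploy"), first_round(plan, "lift")
            if d is not None and l is not None and d < l:
                deployed_in_time = True
        if not deployed_in_time:
            problems.append("equipment must be deployed before the lift")
    return problems


# Hypothetical plans: only Crane1 deploys equipment, so the plans differ.
crane1_plan = ["deploy", "join", "lift"]
crane2_plan = ["wait", "join", "lift"]
```

Note that the two example plans pass the check even though they are not identical: only the rounds in which the cranes join and lift have to match.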

Each crane believes, in general, that an understanding of these requirements exists as a part of their common knowledge. The common knowledge of the participants, general knowledge about the structure of recurrent behaviors and expectations about certain actors or kinds of actors, frames the intersubjective space in which the actors operate. The degree to which the expectations and assumptions align is predictive of how smoothly the cooperation will run. A specific individual sense of how that structure plays out in the current situation is constructed during the course of the activity.

For activity theory there are three levels to the analysis: activity level (the motive), the action level (a set of goals to be achieved), and the operation level (how the actions are achieved; Engeström, 1992; Kuutti, 1996). In VesselWorld, the activity level is to clear the harbor, the actions are finding and removing the individual wastes, and the operations are the component actions and how they are carried out using the interface.

When the cranes jointly lift a large waste from the harbor, what parts of the sense that the participants individually make of the situation are grounded? Their motives are likely grounded: this applies throughout the entire activity. Their specific goal to jointly remove a particular waste at this particular time is grounded in chat. However, at the operational level, much of what the actors are doing is neither specifically nor necessarily grounded. Even when grounding occurs during a closely coupled action like a joint lift, it does not necessarily occur independently, prior to the action; rather, it is achieved by the completion of the action itself. In other words, the grounding occurs implicitly; it does not require an explicit grounding activity vis-à-vis a conversation. This arrangement is economical because explicit grounding vis-à-vis chat is labor intensive, requiring the actors to stop working in parallel, jointly focus, and interact “sequentially” to achieve common ground for a particular p.

On a specific occasion of lifting an extra large waste (Waste25), each of the cranes formulates a plan for lifting that waste that draws on each crane's individual knowledge of such a situation. Each actor's plan for the situation frames the sense that they make of the situation. These plans are partially represented in the planning window. Whatever sense the individual actors make of that situation will be framed by their intent as represented by the plan. There are also points of coordination where the cranes' intents must match, or else the execution of their individual plans will result in a breakdown. However, it is not necessary that the plans in their entirety be identical. During a cooperative lift, if one crane operator needs to deploy equipment and the other does not, the plans will not be identical. During the course of action, only the points of coordination are necessarily grounded. Other parts of each actor's sense of the situation are functionally equivalent, but not necessarily, and not likely to be, identical.

Much of the joint lift may be accomplished without the participants explicitly chatting to align their plans. Each of the cranes continues to act as the cooperative lift-and-carry is achieved. Only if one or the other crane thinks they are “no longer on the same page” will the actors jointly focus, engaging in a collaborative replanning vis-à-vis the chat. Because of the representational system that mediates VesselWorld activity, there may never be a specific point in time where the actors are jointly focused on a cooperative lift. There is a period of time where they are both reasoning about the cooperative lift, but not necessarily a specific moment when they are both tuned to the operation. In a conversation, the situation is just the opposite: The participants are jointly focused on each contribution to the interaction, moment by moment, in sequence.

During the period of time that an extra large waste is removed, any number of other tasks overlap and are interleaved with the execution of plans for other individual and cooperative activities like adding a marker to a map. During the course of action, the participants are monitoring all of their “open” plans to make sure things are proceeding as expected. Because the actors are multitasking, whatever monitoring they do of a particular cooperation is interspersed with other activities, and consequently there is no “official” second position or next turn in which to confirm or initiate repair. Each actor must continuously monitor the situation, looking for evidence as to whether the current situation is running as expected or can be explained given each of the individual's current plans for the situation. Therefore, if Crane2 stops along the way to pick up a small waste, Crane1 can interpret that as consistent with her plan to jointly lift an extra large waste with Crane2, or she may choose to initiate a repair. Because of the multitasking nature of the VesselWorld domain, there is no official point in the sequence of activity, a second position, in which she may choose to initiate a repair.

In general, Crane1 believes that she can use Plan1 as a basis to continue to act if Plan1 accounts for (explains) the actions of Crane2 up to the current point of the interaction with Crane2. If Crane2's actions do not fit into the plan that mediates Crane1's behavior, either a new plan is independently created by Crane1 to internally mediate her behavior or a communicative interaction is invoked to align private representations of the shared activity. In either case, the newly constructed plan, as Crane1 conceives it, must both achieve Crane1's goal and account for Crane2's actions.
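The decision rule in this paragraph can be sketched in code. What follows is a minimal illustration and not a description of how VesselWorld or its participants actually work: the names `Plan`, `accounts_for`, and `next_step`, the action labels, and the assumption that a plan lists the partner actions it expects in order plus a set of side actions it can tolerate are all introduced here for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical private plan held by one crane (illustrative only)."""
    goal: str
    # Partner actions the plan expects, in order (assumed representation).
    expected_partner_actions: list
    # Partner actions the plan can explain without being disturbed (assumed).
    tolerated_side_actions: set = field(default_factory=set)


def accounts_for(plan, observed):
    """Return True if the plan explains every observed partner action so far."""
    position = 0
    for action in observed:
        if (position < len(plan.expected_partner_actions)
                and action == plan.expected_partner_actions[position]):
            position += 1  # the action advances the expected sequence
        elif action in plan.tolerated_side_actions:
            continue  # e.g. stopping to pick up a small waste along the way
        else:
            return False  # the action does not fit the plan
    return True


def next_step(plan, observed):
    """Continue with the plan if it still explains the partner; otherwise repair."""
    if accounts_for(plan, observed):
        return "continue"
    # Either replan privately or open a chat exchange to realign the plans.
    return "replan_or_communicate"


plan1 = Plan(
    goal="jointly lift Waste25",
    expected_partner_actions=["move_to_waste", "join", "lift"],
    tolerated_side_actions={"lift_small_waste"},
)
print(next_step(plan1, ["move_to_waste", "lift_small_waste"]))
print(next_step(plan1, ["move_to_waste", "load_barge"]))
```

In this sketch the choice between private replanning and a chat exchange is left open, as it is in the text; the point is only that the check runs continuously against whatever has been observed so far, rather than at a fixed next turn.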

6. Progress without grounding

For conversation it is the sequential nature of turn taking that dictates the procedural infrastructure of how the participants maintain the social order (Schegloff, 1991). During a conversation the interactants are explicitly engaged in a joint sense-making task. The pace at which sense is jointly made is one turn at a time, one after another. In a conversation, each turn, every action, passes through an interactive process that cooperatively confirms some sense for the contribution p (Clark, 1996). Where in a face-to-face conversation the sequential nature of turn taking provides a basis for knitting together a common view of the shared field of activity, for a multitasking, mixed sequential and parallel cooperation like VesselWorld the conventional representational activities of the participants are also part of the procedural infrastructure for maintaining the social order.

Suppose Crane1 discovers a new waste (Waste19), adds a marker to her map, and reports it on the chat channel.

When the discovery of Waste19 is reported, the other participants may or may not acknowledge they received the information. It sometimes happens that one of the actors fails to record the information in a marker. Other times, all the participants mark their maps but their marks are different. The data show that in the normal case the marker lists of the participants at best approximate one another. Many times those approximations are sufficient for the actors to continue as if their markers were identical. There are also numerous occasions, however, where interaction is required to clarify and align the private marker representations of the actors.

When Crane1 reports the discovery of Waste19 on the chat channel, she could believe:

  • The other two participants add markers to their maps.
  • The markers are the same.

Because of multitasking, these criteria for making progress are too stringent. An alternate set of requirements for continuing the action and making progress can be expressed in terms of the representational system that mediates the ongoing collaboration:

  • By adding a marker to her own map, the waste she discovered is adequately represented in the representational system.
  • To become a part of the intersubjective space, it is sufficient that the waste is “adequately represented” in the representational system.

The addition of information about a newly discovered waste to the representational system is a mark of progress but not necessarily an indication that the common ground of the participants has accumulated. These requirements are sufficient for Crane1 to continue with her other work with the knowledge that, even in the problematic scenario, the situation, although not preferable, is workable.

The reason that Crane1 believes that Waste19 is part of the intersubjective space depends on her understanding of the community's convention for distributing information about a newly discovered waste within the representational system. For newly discovered wastes, the distributed representational activity for maintaining an intersubjective space in which to operate is:

  • 1. The person who discovered the waste adds a marker to her or his map.
  • 2. The waste is reported on the chat line by the person who discovered it.
  • 3. Each of the other actors adds a marker for the newly discovered waste to his or her marker list.

Crane1's expectation is that if she does her part, the other participants will do theirs, and if the others do not do their part the situation will be retrievable at a later point in the situation. However, Crane1 also believes that Condition 1 is sufficient to make progress.

When Crane1 discovers Waste19, adds a marker to her map, and reports Waste19 on the chat channel, she may move on to other activities without waiting for confirmation from the other participants. Waste19 becomes a part of the intersubjective space of the VesselWorld actors but does not necessarily become a part of common ground.
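The difference between the stringent criteria and the weaker "adequately represented" requirement can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the `Marker` fields, the coordinates and size values, and the two predicate names are invented here, and comparing markers for equality is only a stand-in for the stringent criteria, not a model of common ground.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical marker on a participant's private map."""
    waste: str
    location: tuple
    size: str


def adequately_represented(maps, waste):
    """True if at least one participant's map carries a marker for the waste."""
    return any(waste in markers for markers in maps.values())


def markers_identical(maps, waste):
    """True only if every participant holds the same marker for the waste."""
    found = [markers.get(waste) for markers in maps.values()]
    return None not in found and len(set(found)) == 1


# Private maps keyed by participant; each maps a waste name to a Marker.
maps = {"Crane1": {}, "Crane2": {}, "Tug": {}}

# Step 1: the discoverer marks her own map (made-up coordinates and size).
maps["Crane1"]["Waste19"] = Marker("Waste19", (12, 40), "large")
# Step 2: she reports the waste on chat (the chat itself is not modeled here).
# Step 3: one participant records a slightly different marker, another none.
maps["Crane2"]["Waste19"] = Marker("Waste19", (12, 41), "large")

print(adequately_represented(maps, "Waste19"))
print(markers_identical(maps, "Waste19"))
```

In this scenario the first predicate holds as soon as step 1 is done, which is all Crane1 needs in order to move on, while the second fails because the private markers have diverged.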

7. Inventing conversational structure

Initially, in response to a breakdown, a conversational interaction occurs that realigns the private understandings of the participants. In future situations, where one or another actor anticipates the problem may recur, the actors will create a conversational structure to organize the flow of the activity. Over time the actors expect that structure as an organization of their activity at that point of the interaction. These conversational structures emerge at recurrent points of coordination. Their primary function is to organize a domain activity. Only secondarily do they organize the communication task.

Our everyday recurrent behaviors include conversational structure that is produced to mediate routine conversational situations (Schegloff, 1986). For example, there is a core opening sequence during the initial stages of a telephone conversation. Each participant has knowledge of the core opening sequence in a telephone conversation. The expected structure of the core opening sequence mediates the interaction at an anticipated point of interaction in the opening of a telephone conversation, making it easier to more effectively initiate the conversation. Each utterance serves a dual function: it communicates content and it helps the actors to synchronize their activity as they step through the opening sequence. Thus, when a secretary in the office picks up the phone and answers, “This is the computer science department,” the content of his or her utterance identifies the receiver of the call, and it also marks the progress of the participants through the opening core sequence. With the VesselWorld data, we see the emergence of these kinds of conversational structures as a part of the procedural infrastructure and representational practice that develop to handle complex interactions that develop for recurrent domain activities.

In VesselWorld, the participants developed a procedural structure for the domain task of jointly lifting an extra large waste. As a part of this common procedural structure there emerged a conversational structure to align private views of the situation at difficult points of coordination during the joint lift.

To successfully lift, carry, and load onto the barge an extra large waste, the participants need to mutually point to several aspects of the situation:

  • 1. The cranes must both intend to cooperatively lift the same waste.
  • 2. The correct crane must deploy the equipment necessary to lift the waste.
  • 3. During the same round of activity, the cranes must join together.
  • 4. During the same round of activity, the cranes must jointly lift the waste.
  • 5. During the same round(s) of activity, the cranes must jointly carry the large waste to the barge, if necessary.
  • 6. During the same round of activity, the cranes must jointly load the large waste onto the small barge.

Errors in coordination result in failure and the spillage of toxic waste.

As they prepared to do a joint lift, the participants using the base system could “see” each other, but their perceptual capabilities were not sufficient to see what the other actors were doing in any detail. Each participant had a plan, but the participants could not see each other's plans. The problems inherent in jointly lifting or moving a large or extra large toxic waste made for a recurring source of difficulty.

Because managing the removal of extra large wastes was a recurrent source of difficulty, the cranes invented a conversational structure to organize operations on large and extra large wastes at each point of coordination. A set of adjacency pairs (Schegloff & Sacks, 1973) was used by the participants to mediate the private understandings of these tightly coupled actions.

The first part of the adjacency pair was for one actor to propose to take a given joint action on the next round. The second part of the adjacency pair was for the other actor to confirm that he would take the corresponding action. Therefore, if Crane1 proposes to do a joint load, Crane2 can confirm. For joint actions requiring multiple steps, each of the steps is proposed and confirmed using the adjacency pair structure. In the formal study discussed later, all the teams of participants developed this kind of conversational structure.
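The propose-and-confirm structure can be sketched as a check over a coded chat log. This is an illustrative sketch only: the `ChatLine` record, the three act labels, and the action names are assumptions introduced here, and the study's transcripts were not necessarily coded this way. The check allows unrelated lines to be interposed between the two parts of a pair, since in textual chat the parts do not always arrive back to back.

```python
from dataclasses import dataclass


@dataclass
class ChatLine:
    """Hypothetical coded chat line; the coding scheme is an assumption."""
    speaker: str
    act: str      # "propose", "confirm", or "other"
    action: str   # joint action being discussed, e.g. "joint_load"


def confirmed_actions(chat):
    """Pair each proposal with a later confirmation from a different speaker.

    Unrelated lines may be interposed between the two parts of a pair.
    """
    open_proposals = {}  # action -> speaker who proposed it
    confirmed = []
    for line in chat:
        if line.act == "propose":
            open_proposals[line.action] = line.speaker
        elif line.act == "confirm":
            proposer = open_proposals.get(line.action)
            # A confirmation only counts if someone else made the proposal.
            if proposer is not None and proposer != line.speaker:
                confirmed.append(line.action)
                del open_proposals[line.action]
    return confirmed


chat = [
    ChatLine("Crane1", "propose", "joint_carry"),
    ChatLine("Tug", "other", ""),
    ChatLine("Crane2", "confirm", "joint_carry"),
    ChatLine("Crane1", "propose", "joint_load"),
]
print(confirmed_actions(chat))
```

Here the joint carry is confirmed despite the interposed line from the tug, while the joint load remains an open proposal until Crane2 answers it.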

Fig. 4 shows a sample of dialogue where the participants used adjacency pairs to coordinate the handling of a large barrel of toxic waste. At 1 and 2, after jointly lifting a large barrel, Crane1 and Crane2 agree to do a joint carry followed by a joint load onto a barge. It will take three moves to reach their destination. In lines 3, 4, and 5, they tell each other they submitted their first move. At 8, the tug suggests a convention to simplify coordination. At 9 and 10, Crane1 and Crane2 tell each other they are ready to do the second part of the move. At 14, Crane1 states she is doing the third move. At 15 through 18 they plan and then they submit actions to do the joint load. At 19 and 20, they celebrate. Because the conversation of the users is mediated through textual chat, adjacency pairs do not strictly speaking occur one after the other; their positioning sometimes depends on the typing speed of the users. Other kinds of comments may end up interposed along the way.

Figure 4.

A conversational structure.

After this conversational structure became a part of the group's common knowledge, only some of the progress of the interaction it produced was specifically marked in the chat window.

The adjacency pair structures the participants use to help achieve joint lifts are part of the representational practice that emerges. This conversational structure reduces errors by making it easier to time when to initiate each phase of the action (Clark, 1996, pp. 83–86). It improves the performance of the actors by providing a mediating structure to guide the participants during selected points in the interaction. During these points of coordination the actors are explicitly engaged in a grounding activity, more closely paying attention to one another, but as a consequence reducing the parallelism and multitasking dimensions of the collaboration as they converse. Alternate methods exist for mending the representational practice that enable the participants to more freely work in parallel and multitask despite the coupling constraints of tightly coupled actions like the joint lift: coordinating representations.

8. Coordinating representations

For a recurrent activity the intersubjective space in which the participants operate has a historical character. The emergent structure of the activity, and the context in which it occurs, are conditioned by the prior history of the activity within the community and for the individual (Cole & Engeström, 1993; Hutchins, 1995a; Tomasello et al., 1993; Vygotsky, 1978). Mediating artifacts play a central role in organizing, structuring, and making sense of the activity as it develops; these artifacts are an outgrowth of prior efforts to adjust and improve the performance of a behavior. Work contexts are specifically designed to support highly predictable activities (Nardi, 1996; Suchman & Trigg, 1991).

The set of representations used within the work context is a significant part of how the work environment can be predesigned to support expected behaviors. If the representational system is a poor match for the domain tasks of the users, it becomes necessary to redesign and reorganize it and thereby embed alternate preferences for how the users should structure, organize, explain, and frame their coordinated and collaborative field of action.

Hutchins documents two examples of the progress of representational function within a system of activity: the airline cockpit (Hutchins, 1995b; Hutchins & Klausen, 1992) and the navigational bridge of a Navy vessel (Hutchins, 1995a). The use of speed bugs on the airspeed indicator in the cockpit and the Mercator projection chart on the navigational bridge are significant factors in the “cognition” and performance of the participants. These representational artifacts emerge from a history of reengineering prior representational systems.

The addition of some artifacts reduces errors or makes the participants more efficient and effective in their performance. Typewriters and then word processors are examples of the kind of progress that achieves these sorts of effects. Other artifacts are primarily introduced to mediate communication at an expected recurrent point of coordination, thereby serving the same function as a conversational structure; the stop sign is an example of this sort. Artifacts that are primarily introduced to mediate at an expected recurrent point of coordination will be referred to as coordinating representations (cf. Suchman & Trigg, 1991). Like the conversational structures that are created by the participants, the coordinating representation mends the representational practice so as to mediate the efforts of actors to align their differing views of the situation. Unlike conversational structure, it achieves this effect without making the participants explicitly engage in grounding activity. The coordinating representation enables the actors to make progress and to delay or avoid the face time required for explicit grounding, thereby enabling more loosely coupled, in-parallel, multitasked forms of interaction.

In the airline operations room, the day is divided up into complexes. Each complex is a period of time, roughly an hour, when, for a given airline, incoming planes arrive, transfers are made, and outgoing planes leave. All the information needed to coordinate work during each of these periods is represented in a matrix that is referred to as a complex sheet, which is a coordinating representation. The complex sheet is a “transparent artifact that stands in for situations out on the ramp and provides a shared object for communication between people during the course of the complex” (Suchman & Trigg, 1991, p. 208). From the perspective of the analysis in this article, the complex sheet enables the participants to make progress, “stay on the same page,” while they work in parallel and multitask without explicitly having to ground a specific sense of their shared activity.

A clock in the classroom is a coordinating representation that mediates a point of coordination at the beginning and end of class. An appointment slip helps a patient to return to the dentist's office on the right day at the right time. A mail-order catalogue helps the customer and the sales office reach agreement on purchase items, sizes, and prices. Tax forms help to coordinate citizens and IRS personnel in their efforts to exchange information. At the airport a passenger's printed itinerary, the departure monitor, signs identifying the JetBlue™ ticket counter, and baggage claim tickets are also examples of coordinating representations that have been designed into the environment.

All artifacts can be used to mediate the co-construction of a shared understanding, but not all artifacts are designed to do that. A chair could mediate a point of understanding, but the chair was not designed with that purpose in mind.

Artifacts have both a tool and sign function (Vygotsky, 1978). The tool function makes it easier to accomplish some task. The sign function affects how we think about the task. For a coordinating representation, the sign and tool function coincide: The tool function of a coordinating representation is that it is a sign designed to mediate an interaction at a recurrent point of coordination.

Not all external representations are intended to mediate a point of coordination between collaborating actors; therefore, not all external representations are coordinating representations. A photograph is not a coordinating representation. The earlier drafts of this article helped me to work out what I want to say, but they were not coordinating representations. A scratch piece of paper that is used to do multiplication problems is not a coordinating representation. A personal diary is not a coordinating representation, even if somebody other than its author reads it.

At many locales, media such as the whiteboard are available, which the participants can use to construct external representations that coordinate the activity of a group. These representations become coordinating representations, in the sense that is meant here, only if their usage continues beyond a single episode of cooperation.

Coordinating representations enable the participants to more effectively multitask and work asynchronously in both collocated and non-collocated environments. The addition of coordinating representations enables the participants to make progress without always directly attending to one another.

Coordinating representations increase the pace and effectiveness at which an intersubjective space emerges for a recurrent activity. In itself the coordinating representation does not add to the intersubjective space in which the actors operate, but it expedites the interaction at certain points of coordination, while entirely removing others. Coordinating representations give the participants more control of if and when they explicitly ground. Embedding into the design of the representational system some preferences for organizing conventional behaviors is potentially more effective than the use of conversational structure at runtime. This “pre-computes” some of the runtime work of actors (Norman, 1991). It also enables the distribution of work across people (i.e., engineers and designers vs. the runtime performance of the participants). The reformulation of mediating structure from one whose external representation interactively emerges (the conversational structure) to one that is predesigned into the representational system (the coordinating representation) is a significant mark of progress that simultaneously expands the intersubjective space in which actors operate and transforms the vocabulary they use to make sense of the situation.

Thus, the representational system for that cooperative task is designed and redesigned. Each cycle converts some of the runtime work into more externally structured kinds of interaction that are specifically designed into the system to match the emerging practice of the participants. The overall effect is a reduction in runtime representational work to maintain a common sense of a recurrent cooperative activity, enabling the participants to work in a more loosely coupled fashion.

9. Adding coordinating representations to VesselWorld

The analysis of transcripts from usage of the base version of VesselWorld was used to develop a second version of the system, VesselWorld+, that includes three coordinating representations (CRs). Each of these coordinating representations is designed to address some issue that had emerged at a point of coordination during a recurrent activity. The participants using VesselWorld+ will sometimes be referred to as the coordinating representation groups.

One issue that was identified in the transcripts from the base system concerned the exchange of information about the name, location, size, and properties of wastes. There were numerous occasions of repair work instigated because the users' private representations of the state of the shared field of activity had diverged. The object list CR (see Fig. 5) was designed to mediate these kinds of points of coordination among the participants.

Figure 5.

The object list.

The object list is a coordinating representation that mediates the efforts of participants to construct intersubjective space for shared domain objects, and it potentially mediates the interaction at any number of points of coordination. A list of objects (with relevant properties) allows users to more systematically keep track of objects in the domain. This information is visible to all users and can be edited by any user. When a user discovers a waste, he or she can note it in the object list using a point-and-click operation. Entries in the object list can be displayed on the WorldView as markers. All of the teams that had access to the object list used it to mediate their interactions.

The users of the base system also had difficulties in coordinating tightly coupled actions involving the manipulation of large and extra large wastes. If one crane started to lift before the other crane, the waste spilled and leaked toxic materials into the environment. A shared planning CR was designed for the VesselWorld+ groups to handle these kinds of situations, converting their coordination into a more efficient form of representational interaction (see Fig. 6). The shared planning CR allows a user to compare his projected actions to those of the other participants. The next few planned steps for each actor are displayed in a labeled column for each participant. The actions are listed in order from top to bottom. (So, the next few planned steps of Crane1 are to deploy equipment and then lift some waste.)

Figure 6.

Timing of joint actions.
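The kind of comparison that the shared planning CR makes visible can be sketched as a check over two columns of planned steps. This is a hypothetical illustration rather than the VesselWorld+ implementation: the function name, the action labels (including a "wait" step), and the representation of a plan as a list indexed by round offset are assumptions made here.

```python
from itertools import zip_longest


def misaligned_rounds(plan_a, plan_b, joint_actions):
    """Return round offsets where one plan schedules a joint action
    that the other plan does not match in the same round."""
    problems = []
    for offset, (a, b) in enumerate(zip_longest(plan_a, plan_b)):
        # A round is a problem if either crane plans a joint action there
        # and the two planned steps differ.
        if (a in joint_actions or b in joint_actions) and a != b:
            problems.append(offset)
    return problems


# Assumed labels for the tightly coupled steps of a joint lift.
joint = {"join", "lift Waste25", "carry Waste25", "load Waste25"}

crane1 = ["deploy equipment", "join", "lift Waste25"]
crane2 = ["join", "lift Waste25"]
print(misaligned_rounds(crane1, crane2, joint))

# Crane2 adjusts by waiting one round while Crane1 deploys equipment.
crane2_adjusted = ["wait", "join", "lift Waste25"]
print(misaligned_rounds(crane1, crane2_adjusted, joint))
```

With the two columns side by side, a crane can spot that a lift would start one round early and shift its own plan, without a chat exchange.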

The analysis of the base system transcripts also revealed that keeping track of multiple open plans was a recurrent activity among the participants. Sometimes repair work was triggered because there was not an adequate representation of the multiple tasks and ordering of tasks; for example, one participant would be waiting for another, not realizing that the other actor was doing something else first.

A third coordinating representation was designed to allow the users to manage multiple plans. The idea was to create a structured space where the participants could rapidly sketch a high-level plan that would help them to manage multiple open tasks.

9.1. Changes in the runtime construction of intersubjective space

A study to assess the difference in performance between participants that used the initial version of VesselWorld (the base groups) and participants that had access to the three coordinating representations introduced in VesselWorld+ (the CR groups) was conducted (Alterman, Feinman, Introne, & Landsman, 2001). These two representational systems have a chronological order: VesselWorld+ includes coordinating representations that were specifically designed to mediate a recurrent point of coordination that existed for users of the base representational system VesselWorld. This set-up enables us to consider, in detail, a sequence of related interactions within and across episodes of cooperation such that continuity and change can be observed.

The participants for the study were a mix of students and local-area professionals, with varying degrees of computer proficiency. Participants were organized into teams of three. Each team worked with one of two representational systems; three teams used VesselWorld, and three teams used VesselWorld+. Each team was trained together for 2 hr in use of the system, and then solved randomly chosen VesselWorld problems for approximately 10 hr. To alleviate fatigue concerns, the experiment was split into four 3-hr sessions. Participants were also asked to fill out entrance surveys to obtain population data and exit surveys where they could give feedback about their experience with the system and the coordination issues arising in their team.

A set of random problems was produced, and participants were given a succession of problems drawn from this set. Groups did not necessarily see the same problems or in the same order; because of differences in performance, groups did not complete the same number of problems over their 10 hr of problem solving. To account for this, a general measure of the complexity of a particular problem was devised, taking into account the quantity and type of the wastes in the harbor, their distance from the large barge, and the number of small barges available to the respondents. This metric was used to normalize results.

The primary interest in collecting these data was to use the replay device to do a detailed analysis of the transcripts of participant behavior. Some quantitative analysis was also performed.

9.2. Data analysis

All teams, regardless of which platform they used, improved their average performance. Fig. 7 compares the first 5 hr of problem solving (after training) to their second 5 hr of problem solving. All teams saw significant decreases, over time, in the number of chat lines they produced and the elapsed clock time it took to achieve their goals: The participants talked less and took less time to accomplish their tasks.

Figure 7.

Within-group improvement over time (first 5 hr vs. last 5 hr).

All of the groups developed conversational structure to mediate certain recurrent points of coordination:

  • Each of the base teams independently invented an adjacency pair structure (Schegloff & Sacks, 1973) to organize the aligning of their private representations of these kinds of situations. None of the CR groups developed the adjacency pair structure; they all used the shared planning CR to coordinate closely coupled actions.
  • Each of the groups, regardless of the platform they used, developed shorthand notational conventions for describing various features of the wastes. These conventions of naming allowed the participants to rapidly describe, in a few keystrokes, the relevant information about a particular waste. Thereafter, the participants had a useful handle that reminded them of many of the relevant properties of the waste.
  • In anticipation of the breakdowns that resulted from keeping separate representations of the wastes and their locations, one of the base groups invented a conversational structure, a “marker check,” which they used to periodically compare private representations. During a marker check one of the participants would list all the wastes that she had marked on her map, one quadrant of the map at a time. The other participants would compare her marks to their own, making repairs as they went along.

The coordinating representations both reduced the number of points of coordination and expedited the interaction at other points of coordination:

  • The object list CR reduced the number of points of coordination. For the base group, because the participants kept separate representations, each time a waste was discovered, or any other kind of information exchange about a waste occurred, there was a point of coordination. With VesselWorld+ some of these points of coordination were removed. For example, with the representational system of VesselWorld, if the tug reported he found a barrel at a specific location, the other actors needed to add a marker to their map for them to keep track of the fact that it exists. With the introduction of the object list that point of coordination no longer exists.
  • The shared planning CR was used to mediate closely coupled actions. To submit an action to the system the users needed to add it to their plan anyway. So, from the point of view of the users who had access to the shared planning window, having to talk about their immediate plan was just extra representational work. The CR groups also used this shared external representation to bypass communicating this information via chat. Exchanging representations of timing information for closely coupled actions via the shared planning CR was more efficient and precise, and it was less error prone.
  • On more than one occasion it was observed that one of the cranes would use the shared planning CR to adjust his plan to match the plan of the other crane within the same turn, without any discussion.

The introduction of a CR changes how the participants produce intersubjective space at runtime and, consequently, the content of their common sense of the situation.

Fig. 8 shows the opening dialogue in a VesselWorld problem-solving session where users had access to coordinating representations. This dialogue ensues before all of the participants have submitted their actions to the system for the first round of action. At line 1, Crane1 ecstatically declares that he can see an extra large waste. At line 2, the Tug expresses his “envy.” At line 3, Crane2 expresses his excitement that he can see both an extra large and a large waste. The rest of the opening dialogue is mostly concerned with planning.

Figure 8.

Opening dialogue.

Fig. 5 shows the object list that is constructed by the time all the participants have submitted their first action. Only three of the entries into the object list were explicitly mentioned in the opening dialogue, and none of these were explicitly named. Without the object list CR, a team works closely together to manage the discovery of a waste, jointly focused, explicitly grounding each of the seven newly discovered wastes. For the CR groups the discovery of a large set of new wastes happens en masse versus one at a time for the non-CR groups. There is also a change to the content of the participants' common sense of the situation: The CR groups spend relatively more time planning, whereas the non-CR groups spend relatively more time cooperatively managing information about wastes. For the non-CR groups, with the discovery of each waste the participants try to engage in a grounding activity. For the CR groups, the object list CR enables team members to add to the intersubjective space in a productive manner without grounding.

The CR groups used the shared planning CR to mediate closely coupled actions. To submit an action to the system the users needed to add it to their plan anyway. Therefore, from the point of view of the participants who had access to the shared planning window, having to talk about their immediate plan was just extra representational work. The removal of a need for redundant descriptions of the users' plans reduced clock time, interface work, and confusion among the users about the details of each other's plans. On more than one occasion it was observed that one crane would adjust his plan to match the plan of the other crane within the same turn, without any discussion, thus indicating a richer intersubjective space than the one produced by the non-CR groups. Thus, although the Cranes talked less about manipulating extra large wastes, the CR enabled them to work more closely together with a better understanding of each other's intent without explicitly engaging in a tightly coupled sequential interaction.

The high-level planning coordinating representation was not used by any of the CR groups. The participants did not use the high-level planning window because the extra representational work needed to construct a representation of a high-level plan was not warranted. An analysis of the discourse showed that the plans the participants created had a relatively short period of average relevance (Feinman & Alterman, 2003). Thus, information about the plan was readily accessible from the short history of prior chat that was already available in the chatting window.

The results presented in Fig. 9 show the improvement in performance of CR groups over the base groups for the final 5 hr of play for each team; after 5 hr the performance of the teams had stabilized. The most significant effect is the 57% reduction in communication generated. This reduction reflects the decrease in “face time” for the participants to maintain a common sense of the shared activity, working more in parallel, making it easier to multitask. A 49% reduction in clock time was another highly significant result. There was also a reduction in system events (mouse clicks, etc.), down 38%. Overall domain errors (errors in performing domain actions that led to a toxic spill) were reduced by 61%. The variance of this measure was quite high due to the low frequency of errors; this reduced its confidence below statistical significance (p < .20). All of these measures are a reflection of differences in intersubjective space in which the participants operate.

Figure 9. Improvement of CR groups over base groups; final 5 hr of play.

There was also a reduction in the rounds of activity, but it was not as significant (p < .35): In terms of domain action, both communities achieved similar levels of performance. Almost all of the extra rounds are accounted for by a reduction in the number of errors between base and CR groups. Because the two groups achieved similar levels of performance, the reductions that were seen in clock time, system events, and chat are attributable to a reduction in the amount of representational work to maintain a common sense of the shared activity. Overall, these differences in performance reflect the accumulation of cultural practice over time.

10. Concluding remarks

What the participants share, their common “sense” of the world, creates a foundation, a framing, an orientation that enables human actors to see and act in coordination with one another. For recurrent activities, the methods the participants use to understand each other as they act change, making the intersubjective space in which they operate richer and easier to produce. Changing the representational practice of the participants enables them to work in a more loosely coupled fashion, working in parallel, multitasking, and yet continuing to maintain a common sense of the shared activity. These changes to how intersubjective space is produced at runtime change its content, the speed at which it is produced, and its effectiveness.

Individual and shared beliefs about the collaborative field of action that are generated at runtime compose the intersubjective space. The predispositions and expectations of the participants (their quality, number, and correlation), together with the effectiveness and appropriateness of the representational system available at the scene, characterize the potential richness of the intersubjective space. A specific individual sense of how the expected structure of the activity plays out in the current situation is constructed during the course of the activity. Conventional representational activities enable the participants to add to the intersubjective space in which they operate, continuing the action without specifically grounding. Changing how the intersubjective space is produced changes what is produced.

The study reported in this article examined how participants come to understand each other sufficient to their task in a work-like context. A key assumption was that the participants are multitasking, working sometimes in parallel, other times closely with a joint focus. The main task was to work through some of the issues that emerge from a close examination of intersubjectivity as it is managed through representation and interaction. The focus was on adjustments that are made for recurrent cooperative activities that emerge in these kinds of work contexts. The data that were presented document, in detail, a sequence of related interactions, within and across episodes of cooperation, where continuity and change can be observed.

Despite the emergence of the “common procedures” for recurrent cooperative situations, there are always points of coordination: moments in the interaction where the participants must mark, confirm, or navigate their progress through their private expectations of how the collaboration will unfold.

Initially, in response to a breakdown, a conversational interaction occurs that realigns the private understandings of the participants. In future situations, where one or another actor anticipates the problem may reoccur, the actors will create a conversational structure to organize the flow of the activity. This mediating structure becomes a part of the procedural infrastructure for the interaction.

A coordinating representation serves the same function as the creation of conversational structure. Both provide mediating structure for a recurrent point of coordination. The shift from activities that are organized by mediating structures interactively (conversational structure) to recurring activities that have a predesigned organizational structure as part of the representational system for the task (coordinating representations) simultaneously expands the intersubjective space in which actors operate and transforms the vocabulary they use to make sense of the situation.

Coordinating representations enable the participants to more effectively multitask and work in parallel. The addition of coordinating representations enables the participants to make progress without always directly attending to one another. In itself the coordinating representation does not add to the intersubjective space in which the actors operate, but it expedites the interaction at certain points of coordination, while entirely removing others.

Acknowledgment

This research was supported by the Office of Naval Research under grants No. N00014-96-1-0440 and N66001-00-1-8965. Additional support came from NSF grant EIA-0082393.
