Undoing decontextualization or how scientists come to understand their own data/graphs


Correspondence to: Wolff-Michael Roth; e-mail: wolffmichael.roth@gmail.com


The sciences have been so successful in the course of recent human history because the (mathematical) representations they use articulate laws and relations independent of contextual particulars and contingencies of concrete situations. This allows verification anywhere and at any time, and, therefore, the objectivity of scientific phenomena. Decontextualization, however, may make interpretation difficult even for scientists. This ethnographic study of a scientific lab investigating the absorption of light in the eyes of salmonid fish was designed to investigate the role of context in the understanding of data and graphs in science. Drawing on data from a 5-year ethnographic study of laboratory science, I exhibit the effort scientists mobilize to learn by reconstructing the context from which their data have been abstracted. Without recontextualization, scientists struggle to make sense of the study results that emerge from their work. Scientists require familiarity with the settings from which the data derive and with the entire transformation process that produces graphical representations to be able to interpret the data. This has considerable implications for teaching graphs and graphing and for using graph interpretation tasks. Rather than being a decontextualized basic process skill, graphing competency is a function of familiarity with both the scientific object and the research process as a whole.


[A] concrete conception of commodity . . . coincides with the theoretical understanding of the entire totality of the interacting forms of economic life. (Il'enkov, 1982, p. 105, emphasis added)

Graphs and graphing, which emerged during the Renaissance period, are quintessential to the nature of science: Without these, the sciences as we know them today would not exist (Edgerton, 1985). In the production of scientific knowledge, graphs play an important role because they depict, at a sufficiently abstract level, general tendencies in the relation of two or more variables (Latour, 1987). It comes as little surprise, then, to find graphs and graphing among the fundamentals to be taught in science (e.g., National Research Council, 1996). From prekindergarten to high school, curricula are to enable students to formulate questions, collect and organize data, and display them such as to provide answers to the questions posed. Students are to learn how to develop and evaluate inferences and make predictions based on their data. Curricula should enable students to “create and use representations to organize, record, and communicate mathematical ideas; select, apply, and translate among mathematical representations to solve problems; use representations to model and interpret physical, social, and mathematical phenomena” (National Council of Teachers of Mathematics, 2000, p. 66).

Educational research and practice tend to assume that scientists are experts at graphs and graphing. But scientists are frequently at a loss even when asked to interpret introductory-level graphs in their own field (Roth & Bowen, 1999a, 2003). The same scientists, however, tend to be knowledgeable when it comes to graphs from their own work. This raises questions about the nature of the expertise that tends to be ascribed to scientists when science (and mathematics) educators list graphs and graphing among the core “scientific process skills.” This study was designed to better understand the process by means of which scientists come to understand their data and associated graphs, for the purpose of informing science educators about implications for research and practice. The purpose of this paper is to show how scientists, prior to their ultimate discoveries, indeed struggle to understand their own data and the graphs that they give rise to when these are apparently independent of context. I present an exemplifying analysis of scientists before they have arrived at the final understanding of their data and graphs that they ultimately publish. At this early point in their work, scientists still articulate many things that do not hold up in the end.


Previous research on scientists’ interpretation of graphs shows that they have trouble even when the graphs are from introductory university courses in their own domain (Roth & Bowen, 2003); the frequency of trouble increases when they are confronted with graphs from another discipline (Roth, 2009). In contrast, when scientists talk about their own graphs, they do not just know graphs but the graphs deriving from their work also appear to bear part–whole (i.e., metonymical) relations to their work (Roth & Bowen, 2001). Thus, “as a metonymy, the graph, as one small part of a rather complex situation (workplace, research), has come to stand for the situation as a whole” (Roth, 2004, p. 87). The sense of the graph—and, therefore, its signification—has the possibility to be articulated in and structured by the explanation by virtue of the practical understanding and situational familiarity that precedes, accompanies, and concludes the explicatory process.

One important problem with existing studies of graphing is that these do not tease apart familiarity and processes that function independent of it. For example, one major study of graph interpretation asked H. A. Simon, a Nobel Prize–winning economist and a cognitive scientist, to read/explain a supply/demand graph that economics students encounter in one of their first classes (Tabachneck-Schijf, Leonardo, & Simon, 1997). In this graph (Figure 1a), price is plotted against quantity of the product for both supply (price rises linearly with increasing quantity) and demand (price falls linearly with increasing quantity). Simon, however, after nearly half a century in the field, is very familiar with this economics graph. Without hesitating, he highlights the intersection of the supply and demand graphs and suggests that the price would settle at this point (stable equilibrium). His response to the task is very similar to the ones provided by those biologists who teach undergraduate courses and who are asked to explain birthrate/death rate graphs (Figure 1b) typical of undergraduate courses (Roth & Bowen, 2003). These graphs featured two intersections, one of which should be interpreted as giving rise to a stable equilibrium, the other to an unstable equilibrium. But nonuniversity research scientists were less successful at a statistically significant level than their university-based peers, who are familiar with the kind of graphs used in the study.

Figure 1.

Two structurally equivalent graphs used in graphing studies with scientists: (a) an economics graph and (b) population dynamics graph.

A related investigation in which a total of 33 scientists (16 biologists, 17 physicists) were asked to interpret graphs from introductory biology courses shows that even with tremendous levels of training, only 9 (27%) of the scientists gave correct answers on a graph that bears structural similarity to the oxygen–shrimp frequency graph (Roth, 2009). This graph features the distribution of three types of plants, distinguished by their photosynthetic mechanism (C3, C4, and CAM [Crassulacean acid metabolism]), along an elevation gradient. These mechanisms differentially adapt the plants to the reigning climate, CAM plants being able to regulate their water losses by opening stomata at night when evaporation losses are minimized. When the 8 biologists who teach at the undergraduate level are not considered, only 2 of the remaining 25 scientists (8%) correctly answered the question; by implication, 7 of the 8 teaching biologists did. As in the preceding comparison, some scientists successfully interpret the graph but others are less successful.

Graphing difficulties are often attributed to the “abstract” nature of graphs (Leinhardt, Zaslavsky, & Stein, 1990). One possible way of understanding the relation between abstract and concrete representations (inscriptions) has arisen from two studies of biologists at work (Latour, 1993; Roth & Bowen, 1999b). These show how in scientific work an increasing level of contextual detail is deleted until the scientists arrive at their stated claims. The result is a chain of representations, each of which is separated by an ontological gap from all other representations; these gaps are bridged only in and through scientific practice. Thus, there is no natural and necessary relationship between soil samples placed in stacked drawers and line graphs representing the horizontal and vertical distribution of soil types: Practices are responsible for the equivalence of the two. Scientific research thereby moves from the concrete world, where specimens are taken (abstracted), and conducts measurements that are represented by increasingly abstract sign systems (graphs, equations), and these are used to substantiate verbally articulated claims. In interpretation, the scientists move in the opposite direction (Latour, 1993), which is made difficult because each gap between two representations in the chain constitutes a black box that has to be reopened before the gap can be bridged (Latour, 1987). For those familiar with the entire process of abstraction—scientists, technicians, and students alike—it appears to be easy to go all the way back from the representation to the natural world from which the abstract representations were derived.

In educational research, students’ difficulties with graphs tend to be attributed to cognitive factors. Given that the scientists in the studies reviewed were highly competent individuals, most with Ph.D.s and most with long research and funding records, it is evident that a cognitive deficit does not explain the results. In this study, I followed the recommendation to study graphs and graphing anthropologically: “Before attributing any special quality to the mind or to the method of people, let us examine first the many ways through which inscriptions are gathered, combined tied together and sent back” (Latour, 1987, p. 258). Researchers are advised to start speaking of (invisible) cognitive factors “only if there is something unexplained once the networks have been studied” (Latour, 1987, p. 258). An opportunity for studying the appearance of graph-related competencies exists when scientists first come to understand phenomena and their mathematical representation, that is, during a process of mathematization and before the individuals can rationalize the process of their learning.


Ethnographic Context

This study of how scientists come to understand their data derives from a larger study designed to understand how new knowledge is produced in the discovery sciences. For this purpose, I simultaneously conducted two 5-year ethnographies. One ethnographic study focused on a scientific laboratory interested in understanding the life history of coho salmon; and the other one took place in the context where the coho salmon were raised. In both instances, I used apprenticeship as my field method (e.g., Coy, 1989). Thus, in one of the fish hatcheries I studied, I worked alongside the resident fish culturists to learn all aspects of their work, beginning with the culling of eggs and milt in the fall through the raising of the brood (fry and parr stages) to the ultimate release of the fish into the river at the time the coho salmon had reached the smolt stage and were physiologically ready for the migration to the saltwater environment. In the scientific laboratory, I became an integral member of the team, contributed to the design of experiments, mathematically modeled the data production process, collected measurements, interpreted results, and contributed to two scientific publications. My graduate degree in physics and doctoral minor in physical chemistry had prepared me to deal with many aspects of the data collection, including the production of absorption spectra, fast Fourier transformation (FFT) and its inverse, polynomial curve fitting, mathematical modeling of absorption and reflection processes, and mathematical processes used in “cleaning up” data and in identifying maxima and half-maximum bandwidth of approximately Gaussian absorption curves that were lodged on nonhorizontal background noise. The other aspects I learned on the job included dissecting fish and extracting retina from the excised eyes, preparing microscopic slides, mounting slides into high-powered microscopes, and collecting absorption spectra.

Scientific Context

During the preparation of the proposal for the project, the head of the laboratory (Gregg) and I had talked about testing a method for identifying the readiness of hatchery-raised coho salmon smolt for their migration to the ocean. (Pseudonyms are used.) In the 1940s, a subsequent winner of the Nobel Prize (George Wald) had established the canon with respect to the physiological changes that those animals undergo that live, as part of their life cycle, in both marine and freshwater environments. Accordingly, it was said that a shift occurred in the photoreceptors of the retina from a freshwater vitamin A2–based pigment porphyropsin to the vitamin A1–based pigment rhodopsin dominant in the saltwater of marine environments. For the coho salmon of interest in our scientific study, the latter pigment was said, based on the results of previous research, to maximally absorb at a wavelength λmax = 503 nm whereas the former was said to maximally absorb at a wavelength of λmax = 527 nm. The gathered absorption spectra are used to determine the amount of porphyropsin (A2) in a particular photoreceptor rod from the fish retina.

Each year, the different Pacific salmon species returning from their migration in the ocean are awaited with great anticipation. Large returns mean substantial incomes for the commercial fishers, food for the First Nations bands on the West Coast, income for the tourism industry that caters to sports fisheries, and benefits to the local economy more broadly. From year to year, there are large variations in the return rates of the salmon. Not only are there counts that the scientists from the national Department for Fisheries and Oceans determine, but also the individual hatcheries keep records about how many smolts they release and how many adult fish from that year return into their compounds. At present, scientists do not understand these variations. The scientific study was designed to better understand the relationship between time of release and return rates. This required a marker of stage of development, which, in this study, was investigated in the form of the changeover between the two pigment forms.

Although there were some conflicting results in the scientific literature concerning the amount of change between the two pigments in the course of the different stages in the life cycle, the team of scientists took as its orientation the canon confirmed by an investigation on coho salmon that scientists from another nearby university had published (Alexander, Sweeting, & McKeown, 1994). As the graph that the doctoral student working on the project (Shelley) used in his presentation shows, the pigment porphyropsin dominates during the months of January through early May, while the young coho salmon is in freshwater (between 60 and 80 molar% porphyropsin) (Figure 2). The porphyropsin levels then change dramatically just before the coho are ready for migration in early June. When the coho are retained in freshwater, as the researchers of the earlier study had done, then the porphyropsin levels appear to increase again. The scientific canon suggested that the maxima and minima could be as much as 95% and 5% porphyropsin, respectively. The biology research was based on the idea that if this pattern panned out, the release date could be optimized based on the relationship between porphyropsin levels at release date and return rates. Because the head scientist had developed new instrumentation, which made it feasible to increase the amount of data by several orders of magnitude over previous studies, the study was designed to investigate the porphyropsin levels at different life stages of coho salmon.

Figure 2.

This graph, which the doctoral student (Shelley) has produced based on published data and which he projects in the different talks he gives over a 3-year period, shows two variables (porphyropsin, plasma sodium) as a function of time of year.

The data collection constitutes a process of decontextualization in which the object of study, individual photocells from the retina, comes to be prepared. It begins with keeping the fish in the dark overnight; scientists then sacrifice the specimens in a dark lab where there is only near infrared light (which requires that we spend an hour to adapt to see anything at all). After the eyeball is extracted, it is cut in half (“hemisected”) and the retina is removed, most of which is kept in minimal essential medium (“MEM”) so that further tissue can be taken when needed in the course of the daylong data collection. The blob of tissue is macerated and mounted on a microscopic slide. Once this is mounted properly, cells can be seen on the computer monitor (Figure 3a) via a CCD-based camera (similar recording mechanism as that used in handycams, digital video, and photographic cameras). The light beam falls onto the slide where the crosshair is marked on the screen. One measurement (“ref[erence]”) is taken with the sampling beam next to a photoreceptor, the other one with the sampling beam going through the photoreceptor. The difference in light intensity between the two measurements is the absorption spectrum (Figure 3b). In the ideal case (i.e., a “beauty”), the spectrum looks more or less like a Gaussian even though the background might not be a horizontal line (Figure 3c). These spectra subsequently are processed mathematically to make it possible to extract the location of their peaks along the wavelength spectrum (λmax) by fitting them with polynomials (see Figure 3c; Munz & Beatty, 1965) or from the bandwidth of the curve found by cleaning up the curve using FFT and iFFT procedures. Porphyropsin levels are found using least squares fit procedures relative to two published templates relating λmax and porphyropsin levels.
In the laboratory meeting from which the fragments below derive, the distributions and mean porphyropsin levels for fish caught at the same time in the same location (creek or hatchery) are analyzed.
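
The processing steps described above can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the laboratory's actual software: the spectrum is synthetic, and the Gaussian band shape, its 40 nm width, the 515 nm center, the number of retained Fourier coefficients, and the degree of the polynomial are all hypothetical choices. The function names are introduced here for illustration only. The sketch takes the absorption spectrum as the difference between the reference and sample measurements, smooths it with an FFT followed by an inverse FFT, locates λmax by a polynomial fit near the peak, and reads off the half-maximum bandwidth. It assumes a flat background, so no detrending step is included.

```python
import numpy as np


def absorption(reference, sample):
    """Absorption spectrum as the difference between the two measurements."""
    return np.asarray(reference, dtype=float) - np.asarray(sample, dtype=float)


def fft_lowpass(spectrum, keep=12):
    """Smooth by keeping only the `keep` lowest Fourier coefficients (FFT, then iFFT)."""
    coeffs = np.fft.rfft(spectrum)
    coeffs[keep:] = 0.0  # discard high-frequency content, treated here as noise
    return np.fft.irfft(coeffs, n=len(spectrum))


def lambda_max(wavelengths, spectrum, degree=4, window=30.0):
    """Estimate the peak wavelength from a polynomial fitted around the raw maximum."""
    peak = wavelengths[np.argmax(spectrum)]
    mask = np.abs(wavelengths - peak) <= window
    x = wavelengths[mask] - peak  # centre x so the fit stays well conditioned
    coeffs = np.polyfit(x, spectrum[mask], degree)
    fine = np.linspace(x.min(), x.max(), 2001)
    return peak + fine[np.argmax(np.polyval(coeffs, fine))]


def half_max_bandwidth(wavelengths, spectrum):
    """Width of the wavelength range where the spectrum reaches half its maximum."""
    above = wavelengths[spectrum >= spectrum.max() / 2.0]
    return above.max() - above.min()


# Synthetic measurement (assumption): one Gaussian band at 515 nm plus noise.
rng = np.random.default_rng(0)
wl = np.arange(400.0, 651.0, 1.0)
band = np.exp(-0.5 * ((wl - 515.0) / 40.0) ** 2)
reference = np.ones_like(wl)
sample = reference - band + rng.normal(0.0, 0.005, wl.size)

raw = absorption(reference, sample)
smooth = fft_lowpass(raw)
peak_nm = lambda_max(wl, smooth)
bandwidth_nm = half_max_bandwidth(wl, smooth)
```

Each function corresponds to one of the transformations that separate the photoreceptor on the slide from the number that eventually enters a histogram, which is the sense in which every step removes a further layer of context.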

Figure 3.

Stages in the production of data. (a) Visual image of a photoreceptor. (b) An absorption spectrum in raw form. (c) An absorption spectrum after “detrending” together with a reference curve that will be fitted.

Figure 4.

Five members of the research group sit around a table and use both computer projections (right) and chalkboard as part of their meeting. From the right: Gregg, Shelley, Elmo, Tiêu, and I (head in foreground). (Faces were smudged using Photoshop.)

Context of the Meeting

To exhibit the troubles that decontextualization during the data collection produces, I selected for analysis the first of a series of meetings held over a 2-year period. In this meeting, with about 3,000 usable measures from seven sampling sites already available, the team attempts to make sense of the trends that might be observed in the data. By selecting this meeting rather than all the others that subsequently occurred (e.g., when the study was almost completed), my analysis preempts any possibility that the scientists’ interpretations would “rewrite” the history of their work in the face of the results they did obtain. Historians of the natural sciences have described this tendency of scientists to rewrite their own discovery process (Kuhn, 1970), and this tendency was observed here when the doctoral student later began to explain to me that he knew all along there were problems with the canon that framed the team's analyses. However, during this first meeting, there is no evidence that the scientific canon would eventually be overthrown.

At this point in the research project, the team is almost convinced that it is reproducing the results that Alexander et al. (1994) had published 8 years before. It turns out that the originally published data and the graph that the doctoral student Shelley reproduced for the team, as well as the related knowledge, are important for understanding the unfolding talk that the scientists generate in the course of their meetings. Even my personal knowledge, gathered during the ethnographies of the two participating hatcheries that supplied the wild and hatchery-raised coho and made available to other team members during these meetings, became an important resource for understanding the conditions under which the young salmon were raised and, therefore, some of the possible reasons for similarities and differences that would explain the variation (or lack thereof) in the data. In fact, the information I (author) provided constituted an affordance in the sense that without having this background, the team would have been at a loss in understanding the significance of their data. At the same time, knowing about the Alexander et al. (1994) study turned out to be a constraint, for it became a lens through which the team saw its data; and this, ultimately, led them up a garden path. It took the team a long time to realize that the data they collected actually undid the canon established by the Nobel Prize–winning research done some 60 years earlier.

Members of the Research Team in the Meeting

At the time represented in the data, the research team working on the problem included the lead scientist (Gregg) with 30 years of research experience on salmonid fish vision, a postdoctoral fellow who had done his Ph.D. thesis on salmonid fish (Elmo), a research associate (Tiêu) who had a background in physics, data analysis, and software development, a doctoral student (Shelley) doing his thesis work on the very topic, and me. The team members sat in a seminar room around a set of tables at one of the ends of which the contents of Shelley's laptop computer were projected (Figure 4). Both Tiêu and Gregg previously produced drawings on the chalkboard behind Shelley or walked up to the projection to talk about and point to specific trends and data points. Shelley used the cursor as a means to point to specific parts of a projected image or to move it along, in iconic fashion, some graphical feature.

Data Collection and Analysis

As part of the two ethnographic studies, I collected about 100 hours of videotape in addition to my ethnographic notes. These data allowed me to select the first rather than any other meeting because it exhibited the kinds of trouble that the scientists subsequently disavowed having had. I digitized all videotapes in QuickTime format. Initially, I produced a table of topics with the time of the event; then I produced the rough transcripts of the video. Finally, I produced high fidelity transcriptions suitable for conversation analysis (e.g., Sacks, Schegloff, & Jefferson, 1974), which retained speech features such as emphases, phrasal intonations, unfinished sound words, overlapping speech, and so on. (Transcription conventions are provided in the Appendix.) In a first step of the analysis, I annotated the transcriptions, inserted commentaries into the transcriptions, and highlighted relevant places in the transcription. In a second step of the analysis, I moved second by second, frame by frame, through each tape. My aim was to recover from the records a perspective on the conversations through participants’ perspective at the time of the meeting. I produced relevant offprints for inclusion in the transcription, or produced drawings that feature relevant gestures. During this phase, I also produced the high fidelity versions of transcripts of relevant fragments.

During subsequent passes, I sought to disconfirm preceding hypotheses, commentaries, and annotations, which were taken as hypotheses to be tested in the entire database. The goal of my analysis was to arrive at the inner dynamic of the conversation. This dynamic is recoverable from the conversation itself. For the analysis, all I have available is precisely the same talk that the participants have available. As long as analysts have the same social competencies, allowing them to understand the conversation in the way other participants do, they can reproduce its sense (Garfinkel, 1967). Just as participants do not act upon any hidden thoughts other participants might have, the analyst does not seek recourse to unobserved ideas, forms of thought, or mental frameworks to explain the inner dynamic of the conversation (Latour, 1987). From this perspective, it is legitimate to say that a person is happy or that a person thinks something if the speaker actually formulates this otherwise inaccessible state of affairs. For example, Gregg said during the meeting, “I'm happy with what I see there,” which makes it legitimate to say: “he states that he is happy.” The verb to formulate is a technical term from conversation analysis used to describe that participants do something to make salient what they do or intend at the moment (Garfinkel & Sacks, 1986). “See, what I am saying is . . .,” “I think . . .,” or “I mean . . .” all are formulations by means of which the current activity itself is explicated. By means of formulating, participants make available to each other the structures of the practical action that they are currently engaged in with others.

Whereas analysts may not make assumptions about what goes on in the mind of individual participants, they do have to be familiar, as I was, with the knowledge that the participants to a setting know themselves to be sharing. This knowledge “goes without saying,” and therefore is not articulated. To use a somewhat hyperbolic example, a participant will not say “I am reading off the screen” because to others present it is evident when something is read off the screen. Pertinent to the meeting, everybody knows that this research project is done to test whether the findings of a previously published paper (i.e., Alexander et al., 1994)—changes in the retina as predictors of smoltification—can be used to determine a fish population's current point in its life cycle and its readiness for migration and for the challenges of their future saltwater environments.


This study was designed to understand the learning process by means of which scientists come to interpret their data and graph at a point in time when their ultimate results are not yet known. As the analyses that follow show, scientists are out on a limb with their understandings, their conjectures; and it is precisely because their struggle occurs in and as part of sequentially ordered interactions that their interpretation processes can be studied. What is relevant information to understanding emerges unpredictably from the (societal) relations that are characteristic of laboratory meetings. This is especially important because these ultimate findings were very different from what they initially had thought to find.

In this first of the two results sections, I present scientists' initial work at understanding their data. In the following four subsections, I exhibit scientists' (a) initial, positive impressions and acceptance of data, (b) need to see the data in context with other data, (c) expressions of uncertainty, and (d) postponement of more definitive interpretation of their data. I summarize the scientists' efforts to understand their data by discussing how context, which has been eliminated in the research process, creeps back into it.

First Impressions and Acceptance: “This Is the Means of Each Batch so You Can Get a Sense”

In this first data analysis meeting, Shelley uses his laptop and a projector to share the results from the first five batches of fish that Tiêu and he have processed in the laboratory. Together they have obtained the absorption spectra, which Tiêu has subsequently cleaned up and transformed to extract λmax and half-max bandwidth, the information needed to determine the relative amount of porphyropsin (i.e., “perc[ent] A2”) present in the photoreceptors (Figure 5). Tiêu returned the results to Shelley, who analyzed them using the then current version of the SPSS software package. The data collected from the coho salmon sourced from the Kispiox Hatchery and the nearby creeks and river repeatedly became the topic in this 2-hour meeting.

Figure 5.

Results of the first five “batches” that the doctoral student had processed. Percent A2 has been obtained from the original spectra by means of curve fitting, determination of λmax, and published relation between λmax and A1/A2 ratio.
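
How a percent A2 value might be derived from a single spectrum can be illustrated with a short sketch. The laboratory used published templates relating λmax to the A1/A2 ratio, which are not reproduced here. The Python fragment below substitutes a different and simpler technique: a linear least-squares fit of two hypothetical Gaussian-shaped bands centered at the 503 nm (rhodopsin, A1) and 527 nm (porphyropsin, A2) values cited earlier. The band shape, the 40 nm width, the function names, and the synthetic spectrum with 70% A2 are assumptions made for illustration only.

```python
import numpy as np

# The two peak wavelengths come from the text; the Gaussian shape and the
# 40 nm width are stand-ins for the published templates (assumption).
LAMBDA_A1, LAMBDA_A2, WIDTH = 503.0, 527.0, 40.0


def template(wavelengths, center, width=WIDTH):
    """Hypothetical absorption band with unit peak height."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)


def fraction_a2(wavelengths, spectrum):
    """Least-squares share of the A2 band in a two-band mixture."""
    design = np.column_stack([
        template(wavelengths, LAMBDA_A1),
        template(wavelengths, LAMBDA_A2),
    ])
    weights, *_ = np.linalg.lstsq(design, spectrum, rcond=None)
    weights = np.clip(weights, 0.0, None)  # negative pigment amounts are not meaningful
    return weights[1] / weights.sum()


# Synthetic photoreceptor spectrum (assumption): 30% A1, 70% A2, plus noise.
rng = np.random.default_rng(0)
wl = np.arange(400.0, 651.0, 1.0)
spectrum = 0.3 * template(wl, LAMBDA_A1) + 0.7 * template(wl, LAMBDA_A2)
spectrum = spectrum + rng.normal(0.0, 0.005, wl.size)

percent_a2 = 100.0 * fraction_a2(wl, spectrum)
```

Because the two bands overlap almost completely, small changes in a noisy spectrum can shift the estimated share noticeably, which is one reason why a histogram of percent A2 values taken alone says little about the fish from which the values derive.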

As Shelley presents the distribution of porphyropsin from each batch of fish, he provides an overall assessment: “these ones here seem to be slowly shifting toward the right,” and then he self-corrects, “to the left rather” (turn 001). That is, he expresses seeing the distribution maxima to be shifting to lower values of A2, even though we might hear him not to be all too confident (“I'm, you know, not . . .”) about the data in the first panel when he says that it represents their “first day.” He describes the data (“this”) as “looking pretty nice, almost” (turn 001). Gregg not only agrees that the data are nice but also provides an emotionally laden assessment: He formulates being happy about what he sees (turn 002). This is the kind of data that they often denote—in the laboratory and during meetings—by the noun “beauty.”


Shelley then continues to provide descriptive statistics about the means and standard deviations, on the basis of which he notes that the distributions are wide and that “there is no question here.” In any other context, this might have been an innocuous statement, but in the present context, where the scientists work on establishing a method for determining release dates based on the A1/A2 ratio, having wide distributions is less than ideal. This is an important question as the within-fish range is from 75% to 6% A2. The scientific canon at the time states that the A1/A2 ratio is a function of the physiological changes in the fish during the process of getting ready for migration into the saltwater environment. If the within-fish variations are so large, this might be used to question the entire model. But the scientists do not question the model at this point. They accept the data, even though in another part of the meeting, Gregg requests that the high peak in the first bin (leftmost frequency) and the data in the rightmost bin be removed because they do not “meet the criteria.”

In the Context of Other Data: “Do You Have Comparisons?”

Biologically of interest are the changes in percent A2 as the season progresses. At the time, the scientists are collecting data in 2-week intervals to capture what they believe, based on the scientific canon, to be the physiologically driven change from one to another form of photosensitive material: rhodopsin (vitamin A1 analogue) and porphyropsin (vitamin A2 analogue). A few minutes after having considered the data from their main participant hatchery (Robertson Creek; Figure 6), Gregg requests to look at the second hatchery. The Robertson Creek data show, on the one hand, a decreasing porphyropsin level in the fish that migrated to the sea (Figure 6a) but a renewed increase in those fish that the laboratory retained for research in fresh water (Figure 6b). Shelley pushes a sheet of paper toward Gregg, saying that he “got both of [them] there in small histograms.” But, as he does not have the means as numbers available, he requests the batch numbers and then produces, while apparently talking to himself, the requested information so that everyone can see it on the projection screen. Shelley says that he has plotted the distributions in the same manner as the preceding figures he presented and now refers to a table that presents the means and standard deviations for two measurement episodes (batches) and the processing summary (offprint in turn 12). He reads the relative amount of A2 in each of the two batches from the table he has generated just now: 72% and 67%. He immediately provides a commentary: “They are pretty high in porphyropsin” (turn 014). He elaborates: “these are wild fish that are on their way to the wild, these are migrating to the sea” (turn 014).

Figure 6.

(a) The expected decrease in porphyropsin levels at the time that coho (smolt) migrate toward the ocean. (b) When the coho are held back in freshwater, then their porphyropsin levels are expected to increase again as a consequence of their freshwater environment. (Video offprint, slightly enhanced using Photoshop.)

To understand this conversation in the way that these scientists do, the statements need to be considered with respect to the established canon, confirmed by the Alexander et al. (1994) study in reference to which the present research is conducted. Elmo makes reference to that study, where maximum porphyropsin levels were around 80% (Figure 2). Shelley responds using the adversative conjunction “but” followed by the statement that the fish “are high up the river,” which he then elaborates: “they do not seem to be shifting that much in advance” (turn 018). He formulates having found this a surprising result given “previous work,” which proposed that the fish “seemed to shift a lot in advance” (turn 022). He then modalizes the comment by adding that there could be another cause, which would make the instant of changes shift from year to year. Gregg subsequently suggests looking at the data from the other hatchery in the study (i.e., those presented earlier in the meeting and reproduced in Figure 6), the wild fish caught near the hatchery, and those that have already migrated to the estuary (about 30 km away). He articulates his sense that these data (hatchery and estuary) will not make a difference.

The scientists indicate surprise about their specimens' high porphyropsin amount. This is in contrast to the coho from the Kispiox Hatchery itself, where porphyropsin levels are at 30%. Based on the data from the second hatchery and based on the published data (Figure 2), this value would mean that the fish are ready to go out to the ocean. But the data have been collected when the fish still are in the hatchery. On the other hand, the wild fish from the same geographical location have very high levels of porphyropsin, much higher than the fish from the Robertson Creek Hatchery and have started, as the latter, to migrate. Shelley articulates this as a surprise, which Gregg confirms (“yea”).

In this situation, the scientists do not just interpret and make sense of the data. They already begin to mobilize their familiarity with the geographical contexts where the fish were caught to interpret the numbers. Their familiarity is a lens through which they see the data—a familiarity that the students asked to interpret the oxygen–shrimp frequency graph in the Preece and Janvier (1992) study do not have. This lens is made from what scientists currently know about other hatcheries, published data, and the scientific canon. Through this lens, the data do not make sense. It is precisely because of their background understanding and familiarity with the life cycle of the coho salmon and with the literature that the data do not make sense. The porphyropsin (A2) levels are high given that fish on the way to the ocean should have much lower porphyropsin levels. One possibility is, as Shelley suggests (turn 018), that the shifting does not occur much in advance of the migration, especially not “a lot in advance” in the way this has been suggested in the scientific literature.

Uncertainty: “Okay, This Is Reversion Data?!”

One may note that the scientists, though initially looking at the large variations in the retina from the individual fish—which, according to Shelley, fall between 6% and 78% A2—now are looking at the mean values only, no longer considering possible explanations related to the large variations or differences in variations. They already have seen several plots, including the one that they refer to in the opening of Fragment 3 (Figure 6). The dogma states that when the coho salmon get ready to head for the ocean, their porphyropsin levels drop (Figure 6a), whereas these levels are thought to increase again if the coho were held back in freshwater beyond the normal time of migration (Figure 6b). The “reversion data” Gregg refers to (turn 001) are from the Kispiox fish that the team keeps in freshwater tanks at the university for subsequent testing.


While looking at the data, Gregg does not articulate some visible trend to offer his interpretation subsequently. Rather, he sees the mean values of A2 decrease and then increase again as “reversion data” (turn 001). This expresses considerable familiarity—based on his work in the field and reading of the literature—that reversion data look like this. No special “interpretation” is necessary just as we do not need to engage in interpretation to know that it is a sunny day today. Shelley confirms, articulating why “reversion” is a reasonable descriptor: The fish have been held in freshwater past the date when these would normally be released to begin their migration. That is, he no longer looks at the data on their own but contributes to the interpretation through the lens of what has been done to the fish; this therefore can become part of a shared explanation for the actual values of the measurements.

Elmo, who had done his Ph.D. work on olfactory imprinting of a salmon species, suggests that the hatchery fish may not have the stimulus for beginning the physiological changes and that they need something “to get off in the hatchery environment” (turn 005). Shelley adds that the fish are in freshwater and that the mean level of A2 has dropped to 27%, which is similar to the values they have measured in the fish from the other hatchery (turn 006). He suggests that this level “is not bad” given the environment in which the fish are raised in the Kispiox Hatchery, which, as all those present know from visits to the place, is in an enclosed hall with low light conditions and with a water temperature that is constant throughout the year rather than varying as it would in nature and in the comparison hatchery. Moreover, Shelley adds, this value is good given that other salmon have a higher value at the point when they are “going into sea”; the Kispiox hatchery fish are “already at the bottom,” that is, at a point that corresponds to the lower parts of the published graph. Shelley continues by elaborating the reference for his assessment of the Kispiox data: the Robertson Creek Hatchery releases its fish at a point so that when they arrive in the ocean, their A2 values are 44% or 47%. Gregg and Shelley produce a confirmation that indeed the release value lies around a mean porphyropsin level of A2 = 44%. Elmo comments that this is “pretty high,” and adds—using the oppositive conjunction “but”—that “they [fish, hatchery] are close to the ocean.” That is, he can be heard saying that the value of A2 is high compared to some implicit, nonarticulated norm but that this makes sense given that the hatchery is close to the ocean.

Postponement: “Any Kind of Ability for Us to Predict Comes From Watching [That] Curve”

Shelley, in a long turn at talk without interruption, then elaborates for the benefit of the remaining team members present the sense he is making of the data in the face of the existing theoretical canon and published research. The fish do not appear to “shift entirely [from A2 to A1] before they go,” that is, leave the breeding grounds and hatchery to head downriver and toward the ocean. The team's ability to predict the proper release time—i.e., the objective of this research—comes from “watching the curve” and making a decision to release the fish from a given hatchery when the curve is at a yet-to-be-determined point. The problem that the scientists currently face is that they “don't have the initial stuff,” that is, the measurements that constitute the early part of the graph. Obtaining these would have meant starting “back in March,” when they might have been able to observe “that really nice eighty [or] seventy percent line,” that is, the early part in the Alexander et al. graph (Figure 2) where measured A2 values are high and nearly constant.

Someone “in the know” can hear that Shelley is taking the lens of the Alexander et al. graph. This is so because, while talking, he enacts features of the graph by means of gestures (Figure 7) and utterance forms that bring into this meeting the currently absent graph from the Alexander et al. paper that everyone else in the room is familiar with. In a first iteration, Shelley describes what they might have observed had they begun taking measurements in March. He produces the baseline, which is actually high in A2. His right hand, which heretofore participated in gesturing the level of the baseline, moves backward and then takes a pointer configuration. The left hand iconically shows the level in reference to which the subsequent measurements exhibit change. That he refers to measurements can be seen from the fact that he points to imaginary locations on the graph, then moves and makes another forward pointing gesture as if he were plotting data points. Each time his hand moves forward pointing to some location on the virtual graph, he utters “dip,” as if he were making a dot to represent a data point. But at the same time, each point corresponds to a particular state of the real living fish. His iconic gestures thereby bridge between a representational world and the real world, much as has been described for physicists (Ochs, Jacoby, & Gonzales, 1996). This means that the scientists have to be familiar in and with both worlds so that their body can combine these in the way a mathematical function relates domain and range.

Figure 7.

Shelley produces gestures that bear an iconic relation to the graph that is the reference for this entire research project. He marks each hypothetical measurement by a slightly forward-oriented indexical gesture.

Following the first presentation, Shelley reproduces what is expected, though now in terms of a continuous graph, whereby the right hand moves along a continuous trajectory from the reference point toward the bottom. In these two presentations, we thereby first see the individual measurements that would be expected and then something like a best-fitting curve drawn through all the measurements—in the way it would be done in the resulting publication. When Shelley enacts the gestures for a third time, he verbally coarticulates the first several points by uttering repeated “t's” and then marks the three points associated with large downward distances saying “dip, dip, dip.” The right hand then disappears completely below the table, becoming invisible just at the instant when he voices the words “bang down.” Finally, he directly names what he has been talking about: “the curve.” He suggests that they have to start by looking at the beginning of the curve, which leaves about 3 to 4 weeks for predicting when the downturn will commence.

Summary: From Decontextualization to Recontextualization

Science is a successful endeavor because its concepts are applicable to a wider range of phenomena the less they are tied to the specifics of any context, and representations become more inclusive and abstract the fewer contextual particulars they include. But decontextualization and the abstraction it implies, while increasing the power of thought, simultaneously come with a loss: “Every abstraction is nothing other than a sublation of certain clear ideas/representations [Vorstellungen], which is generally done to more clearly imagine/represent [vorstellen] that which remains” (Kant, 1956, p. 803). Learning by abstraction, however, does not mean that we should do away with context but implies “negative attention, that is, a real doing and acting that is opposed to that action by means of which an idea/representation [Vorstellung] becomes clear” (p. 803, original emphasis). Out of this negative attention, “the Zero, or the lack of a clear idea/representation [Vorstellung] is brought about” (p. 803). Otherwise—e.g., “simply a negation or lack”—there would be no difference between the intention to know and the intention not to know.

In the meeting fragments analyzed in this section, it is evident that the scientists do not see the data points abstracted from everything else. Rather than looking at the data independent of context, simply articulating relationships and mathematical patterns, they “look through” the measurement points seeing, on the one hand, the graphical representation, and, on the other hand, the real events in the hatchery, where they get their fish, and in the river and estuary, where they capture additional specimens for sampling purposes. They see their currently available measurements in terms of (a) the theoretical canon, which relates the A1/A2 ratio and other physiological aspects (e.g., response to “salt water challenge”) to the freshwater and saltwater habitat and (b) the graph that has the Alexander et al. published data on A2 ratios during a 12-week period covering the migration period plus some weeks before and after. In a way and not unlike what can be observed among science students, the scientists' preconceptions determine what they see and how they interpret the graphs. Assessments and evaluative commentaries are provided in terms of the graph they anticipate obtaining as their outcome. An explanation is sought for the deviations by constructing possible reasons that are based on their familiarity with the context from which the fish derive. Thus, the Kispiox measurements do not make sense in themselves; rather, these data are seen through the lens of the scientists' familiarity with the conditions in which these fish are raised and the differences between these fish and the nearby wild fish living under very different conditions. One possible reason for the deviation is that the timing may not be quite as reported or that timing might differ from year to year. What is left untouched here is the nature of the graph itself: It constitutes the paradigm within which this group of scientists operates and looks at the data.

At this early stage of the research, then, the scientists do not “let the data speak for themselves.” Moreover, it is not just the reigning paradigm (theory, empirical work) that determines what there is to be seen. There already is a lot of embodied experience with the particulars of where the specimens were caught and under which conditions they have grown up that scientists bring to their effort of understanding the representations that they are looking at. That is, what can be learned from and what is the signification of the data does not derive from the abstract properties and relations between the data points but depends on contextual particulars of the data sources. If this is the case, then one might hypothesize those individuals to have difficulties reading data who are not intimately familiar with the data sources and their acquisition methods—a situation typical of education research and practice on graphing and addressed in research on the role of data generation on interpretation (Cobb & Tzou, 2009).


Or, perhaps, while sleeping, I returned without effort to an age of my primitive life, forever gone, found again such as my childhood terrors . . . I had forgotten this event in my sleep; I remembered it again as soon as I had managed to wake up. (Proust, 1913, p. 13)

In his celebrated seven-volume novel In Search of Lost Time, Marcel Proust discovers near the end that only involuntary memory is capable of resuscitating what time has made it lose. It is in the seventh volume that time is found again. In analogy to the novel, scientists lose much of the context as they produce their data in the process of research. I show in this section that in their interpretive process, they reconstitute precisely this context that they have earlier left behind as part of abstraction. That is, returning to the earlier quotation by Kant, the scientists have first applied negative attention to the contingent aspects of their fish and then spend a lot of effort to bring these aspects back into focus as part of the meeting. They undo the decontextualization that their scientific method has produced. In the following three subsections, I exemplify the reconstruction of context in scientists' discussion of (a) an ensemble of environmental factors, (b) the age of hatchery-released fish, and (c) the weight–porphyropsin link they had previously constituted.

An Ensemble of Environmental Factors

The scientists seemingly struggle with the fact that in one setting (Robertson Creek), the wild and hatchery-raised fish exhibit similar mean levels of porphyropsin; in the other setting (Kispiox Hatchery), there are vast differences between the wild fish and those in captivity. In this fragment from the data analysis session, those who collected the measurements (Shelley, Tiêu) mention size differences. The lead scientist links these differences to differences in “life history strategies” (turn 007), that is, he draws on a particular concept in biology to explain why the data might differ. It is not that the data tell him about differences in life history strategies; it is the strategies that tell him about differences in the data—much in the way data coders did not arrive at hospital practices from the hospital data, as they were tasked to do, but used their knowledge of hospital practices to code the data (Garfinkel, 1967). Gregg then offers another concept: The fish that they are dealing with represent two different age classes—lower modal (i.e., early) second-year coho versus upper modal (i.e., late) first year—which may be due to differences in life history strategies (turn 007).


Shelley introduces another “quick point” by asking Elmo whether he knew where the fish originally were sourced. Elmo responds that they are from two rivers. This appears to surprise Shelley (“Oh it is?” [turn 010]). Rather than pursuing the idea of the two rivers, Shelley then suggests that it would make a big difference if the brood stock used in the hatchery were from a river closer to the ocean. That is, at that moment they do not appear to know where the Kispiox coho had been caught as the source of the roe (eggs) and milt (sperm) that produced the current brood. If these coho were from a river closer to the ocean, then they would be more similar to the fish from Robertson Creek, which is but 30 km from the estuary, whereas Kispiox Hatchery is nearly 300 km from the ocean. Here, Shelley draws on geographical factors that might distinguish the fish and their physiology. This is consistent with the observations in a study of fish biologists and hatchery workers, who had trouble categorizing a specimen and ultimately could not do so but, in trying to come up with criteria, drew on geographical factors that would differently affect the sheen on two species (Roth, 2005).

Gregg asks Shelley to write down “the question,” and then restates that they might be dealing with two age classes of fish based on the fact that the hatchery coho salmon were of different size. Elmo suggests that they would have to look at the [fish] scales, which, because there are annual growth rings on them, are a means for biologists to read the age of a fish. He suggests that this would be important given that they received on that very day a batch of fish from the Kispiox Hatchery with very differently sized specimens. Gregg picks up on the size differences, for which he offers two hypotheses: The fish either are exposed to an ensemble of environmental factors or are being held over (i.e., older fish still in the river).

Elmo then says where the fish had come from, which provides possible answers to both Shelley's and Gregg's contentions (turn 023). The fish are from different creeks, and these creeks constitute very different ecological niche conditions. It turns out that Elmo knows not only that the fish came from different river systems but also about the nature of these systems (turn 023). In one instance, it is a quick flowing creek, whereas the other creek flows through many ponds and even a lake. It is a slow-flowing creek. At this point, he also coarticulates a typical disease that comes from living in such a system, which is the disease that research team members had detected in some of the specimens (turn 023). Insiders can hear his response to Gregg's question concerning the origin of the big fish as pertaining to the Skatsnat River, where the productivity is higher and therefore leads to bigger fish. This fish is also diseased.


In this exchange with Gregg about the mortality, Shelley articulates yet another concrete observation that is relevant to the age problem: “one of them had a fish in its stomach” and he adds an assessment, “that was unusual” (turn 029). Thus, Shelley provides an explanation why he did not keep the dead fish; he and Elmo then look at each other, each responding with a subdued “no,” as if they were children caught doing something inappropriate. Shelley says that he found a fish in the stomach of one of his specimens, which he formulates as having found unusual. Gregg comments even more strongly: “Holy shit” (turn 030). While Shelley repeats the unusual nature of this observation, Gregg states that this “does sound like a lower modal year two” (turn 032). Elmo confirms: “they are definitely year-two, different release class” and therefore different from those that were “supplied from the hatchery” (turn 032).

Later, Elmo also suggests that the specimens are from different age classes and adds that “they had an auxiliary clip at that time,” which Shelley confirms. “The clip” refers to the fish hatchery practice of removing the adipose fin—a fin on the back of a fish, which is believed not to be necessary for efficient swimming. Fish found without an adipose fin are definitively hatchery-raised coho. Gregg then takes the other fish, the smaller ones, to be coho released during the present year from the hatchery. Elmo responds negatively, suggesting that the small fish are wild based on the fact that they do not lack their adipose fin (“they have no clips, no ad[ipose fin]”). Gregg then formulates what he is in the process of doing: he is trying to figure out whether the smaller fish are 1 year and the larger fish 2 years old. Shelley offers two possible explanations: (a) The stream where these small, 9-gram coho have been caught has a low productivity (which does not allow the fish to put on weight as they normally would) or (b) these fish migrate toward the ocean during their first year.

The biologists in this study grapple with identifying the age class of their specimens. The coho salmon they analyzed were of different sizes, which could be because of a low-productivity environment or because of a different age class. The wild fish have come from different streams and are of different sizes. An additional piece of information is the gut content, which assists them in narrowing down the possible age class. In the course of grappling with their problem, the scientists bring up and discuss various possibilities. These possibilities derive from their concrete understanding of the river and creek system in the area, their familiarity with fish according to which only year-two coho would eat smaller fish, diseases that exist only in slower rather than faster flowing waters, and so on. Following the conversation, one can see these possibilities arise as something new. This contingent articulation of new information gives the learning that occurs an emergent, unpremeditated quality, as certain knowledge of the specifics of the origin of the fish or their state is articulated in some discursive context only to become significant to another aspect.

How Old Are Hatchery-Released Fish?

The age of the fish has been introduced as a possible confounding factor. Whether it is a confounding factor is neither certain nor confirmed at this stage. Knowing the biology of the species involved is an integral aspect of interpreting the data—an important aspect given that the students in the above-mentioned Preece and Janvier (1992) study were not likely to have such knowledge about the shrimp in their graph interpretation task. The scientists draw on their knowledge of the source of the data to lay out possible interpretation scenarios. They do not just interpret the numbers to make some claim by inferential reasoning. In this meeting, the scientists suggest that the data should not be presented in the way they currently are, as there are differences that make the data from the different rivers or age classes incompatible. In effect, the scientists have found reasons for not interpreting the data as is, but have introduced information that allows them to defer interpretation.

In the preceding excerpts, an undetected confusion about the fish may be noticeable, a confusion that would become evident in the subsequent discussion. The hatchery-supplied coho are less than 20 months old—because fertilization occurs in October, the fish are about 8 months old—whereas the coho migrating toward the sea are older than 20 months (i.e., the release age). The size of the fish is a function of the living conditions and temperature; in wintertime the fish hardly feed and therefore hardly gain weight (and, as my ethnographic work in the hatchery showed, they even may lose weight), whereas the fish increase substantially in weight during the summer months (hatchery and wild). This is why hatchery workers, who model the growth of the fish, feed less in the winter than they do in the summer months.

The following segment shows that some of the scientists articulate presuppositions, whereas the others know from the time they have spent in the hatcheries that the truth is different—hatcheries release fish when they are “one plus” (i.e., fish that are more than 1 year old) rather than “zero plus” (i.e., fish in their first year). In fact, the lead scientist is unfamiliar with the precise release date and with the age of the wild coho salmon when these appear to be leaving the river. The hatchery ethnography shows that there are differences between the different species of salmonids that are raised in the hatcheries. For example, three species are raised at Robertson Creek: coho (Oncorhynchus kisutch), steelhead (O. mykiss), and chinook salmon (O. tshawytscha). The hatchery releases these at different stages in their life cycle and at their respective optimum weight: at 20–25 grams (about 20 months following fertilization), 60 grams (∼18 months), and 6 grams (∼6 months), respectively. In the following fragment, two individuals turn out to know the timing of the release because of their extensive time in fish hatcheries: Elmo and me (author). Gregg states that the coho salmon are released during their first year and that the wild fish are of the same age (turns 074, 076). Elmo, however, contradicts him: The fish are “one plus” (1 year + [unspecified months]), that is, they are released during their second year of life. Shelley concludes that the “little guys” “are the abnormal ones” that might go out early, and he offers a causal connection (“that's why”) between the similarities of “everything” and “them” and for “getting more of them” (turns 084, 086). He states that he has “two comparisons of fish that are quite similar,” that is, the wild and hatchery-raised coho from Robertson Creek and the wild and hatchery-raised fish from Kispiox Hatchery (turn 086).


Gregg then suggests that they need to document “that scenario” and adds that they also need to document the Robertson Creek scenario, which he formulates as what he assumes to be a zero-plus release. Elmo disagrees and proposes that it is “the same thing.” There are several brief exchanges at the end of which I contribute the fact that the coho eggs are fertilized in October and the fish are released in May (19 months later). Elmo later (incorrectly) suggests “[a year and] nine months,” which I confirm in a constative utterance. It is only at this point that Gregg realizes that the coho are released in their second year in both hatcheries, and he explicitly formulates what he just has learned: “Okay, now then, that's consistent.”
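The age arithmetic at stake in this exchange can be made explicit in a minimal sketch. The function names and the use of year offsets (0, 1, 2) in place of calendar years are assumptions introduced for illustration only; the source gives only the months of fertilization (October) and release (May) and the labels “zero plus” and “one plus.”

```python
def months_between(start_year, start_month, end_year, end_month):
    """Whole calendar months elapsed between two (year, month) pairs."""
    return (end_year - start_year) * 12 + (end_month - start_month)


def age_class(age_months):
    """Label a fish by completed years of life: 'zero plus' is the first
    year, 'one plus' the second year, and so on (hypothetical helper)."""
    names = ["zero plus", "one plus", "two plus"]
    years = age_months // 12
    return names[years] if years < len(names) else f"{years} plus"


# Hypothetical year offsets: fertilization in October (month 10) of year 0,
# release in May (month 5) of year 2.
release_age = months_between(0, 10, 2, 5)
print(release_age, age_class(release_age))
```

Under these assumptions, an October fertilization followed by a May release in the second calendar year thereafter places the fish in their second year of life, whereas the roughly 8-month-old hatchery-supplied fish mentioned earlier fall into the first-year class.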

Shelley then has a long turn during which he summarizes the results presented and raises the central question: Why are the wild coho near the Kispiox Hatchery so different from the ones raised in the facility? He offers up the possibility that the ones with the lower values are actually released too late “so that they travel for another 50 km with the wrong pigment,” that is, with the rhodopsin pigment typical of the marine environment. In this case, then, the wild coho from Kispiox would be the normal ones—because they have the porphyropsin pigment typical of freshwater—and all the other fish are abnormal. He ends by saying that this “is possibly but speculation probably.”

Elmo changes topic, which, while not allowing us to assess the relevance of what Shelley has articulated, introduces a new piece of information relevant to the question of age. He formulates having had a thought: Two years prior to the meeting, there was a “terrible run” and only 80 returning coho were counted in that river in that year (turn 100). As a result, very few offspring—one-plus coho smolt—would be in the river, and most of the specimens that they received from that system would therefore be “zero-plus.” He adds that only the scale analysis would tell them the age of the fish and that this kind of analysis is not easy to do (turn 104).


In this instance, Elmo knows about the biology and that there could not have been many young fish in the age class where there were only 80 returning adults. As a consultant to fish hatcheries in that geographical area—and based on what he has learned about salmon during his dissertation—he is very familiar with the hatchery practices and with the life cycle of Pacific salmon (Oncorhynchus) species. Because of the ethnographic study of fish hatcheries conducted simultaneously, I, too, am very familiar with the practices in these institutions and the parts of the life cycle that the salmon spend in these institutions (beginning and end of their life cycle). What Elmo and I know from our extensive experience in the fish hatcheries does not come to be mobilized up front, as a starting point for the process of making sense of the data. Rather, it is in the course of the meeting and in relations with others that relevant aspects of our familiarity with the concrete settings of the hatcheries—their practices, the natural environment, climatic and geographical conditions—emerge as possibly important pieces of information that bear on the situation at hand. The learning process is the result of the dialogical relation from which the new knowledge emerges.

What has emerged, then, is the fact that small fish other than those attributed to Clifford Creek also have a likely age of zero plus, as the run that would have given rise to one-plus coho smolts was nearly wiped out. Gregg comes to realize some consistency, and Shelley articulates his sense that there are two good comparisons (wild vs. hatchery-raised, and geographical location) that their data afford. The mystery, then, as Shelley articulates, arises from the difference between the fish sourced at Robertson Creek Hatchery and those sourced at Kispiox Hatchery. In the former situation wild and hatchery-raised fish exhibit similar porphyropsin levels, but in the latter the porphyropsin levels are very different. Something other than weight has to give rise to the difference. Not articulated here—but which would become salient only years later—is the fact that there is little difference between the zero-plus and the one-plus wild coho. This could have led this team to pursue alternatives to the pattern exhibited in the Alexander et al. graph much earlier (Figure 2). But already in this meeting, Shelley presents evidence for the possible independence of porphyropsin levels from weight or age class.

Undoing the Weight–Porphyropsin Link?

The scientists have spent—in search of a possible explanation for the differences in porphyropsin (A2) levels between the hatchery-raised and wild fish from the area—a considerable amount of time on the question of the age class and weight of the coho fish that they have received from the Kispiox Hatchery. Some 15 minutes later in this meeting, Shelley actually contributes an analysis that suggests an independence of porphyropsin levels and weight. Following a discussion of the conditions under which the fish are kept in the university aquarium facility, Shelley changes topic when suggesting that “there is a quick analysis on the size” that he has conducted; it shows that “there seems to be absolutely zero trend.” He projects a graph (Figure 8)—to which Elmo immediately responds by shaking his head in what we might see as an expression of puzzlement and then utters, with rising intonation and using an interrogative: “What is this?”

Figure 8.

The “quick analysis” that Shelley projects as part of the meeting plots porphyropsin levels versus fish weight.

Although one might have assumed that this additional analysis should have settled the issue, the scientists do not discuss it in this manner. Nobody present picks up on the potential relevance of this analysis. In fact, Elmo appears to be a little incredulous, as he laughs after stating that all data from all locations have been plotted. Shelley elaborates that he wanted “to get a sense of some trend,” and this is why he plotted “all three thousand [data points]” using “different colors [for the different locations where the coho had been captured].” He concludes that the data fill “the whole blank,” “the whole box,” which means that they “are getting size-unrelated data to porphyropsin” and that is something he “actually expected.” He expected this result, and that is why he brought the plot into the discussion. There is a long pause, which ends when Gregg begins to summarize what they have learned during this meeting.

Here, Shelley introduces a graph. It apparently shows that the porphyropsin levels are independent of fish size. Although this information appears to contradict what they have been discussing earlier, namely a relationship between the weight/age class of the fish (at least at Kispiox Hatchery) and the porphyropsin levels, these two aspects of the meeting are not brought together and the implications are not discussed. The similarity in the data from zero-plus and one-plus coho provided a possibility for recognizing that it is not smoltification and the associated physiological changes that drive the difference in the visual system but some other factor. The paper in which the team ultimately reports its results suggests high correlations between porphyropsin levels and temperature or length of day. The sense that the relation between these variables exists and may be periodic emerges over the course of the following 2 years, with the accumulation of relevant data. At this stage of the analysis, however, what is later recognized as evidence (independence from weight and age class) is not yet seen as such. What Shelley has just presented stands on its own and, because Gregg continues with a summary of what he sees as emerging, this fact has no implications for the present discussion of the data. For months to come, the team continues to pursue the hypothesis that the changes in porphyropsin levels are related to saltwater and migration rather than to the seasons.

Summary: The Past Recaptured

In analogy with Proust's last sequel (one translation of which is entitled The Past Recaptured), where the protagonist recaptures the past in finding time again, the foregoing subsections show how, in the interpretive effort, the lost context reemerges for scientists as a matter of course in the process of establishing the signification of the data. But there is nothing in their meeting that would suggest that these scientists consciously, deliberately, or in a premeditated fashion seek to recontextualize the decontextualized data. It is in and through their effort to understand that they “wake [themselves] up” to remember what they have left behind. No individual member of the research group has all the information required, knows what it takes to understand, or even realizes what the relevant information might be. All of this, the nature of what constitutes information and what is required to understand the phenomenon, emerges from the context. Emergence here means that the end result could not have been predicted on the basis of adding up all preexisting knowledge available across the group; rather, (parts of) what emerges exceeds and is in surplus of the sum total of what existed before. In the process, the ultimate graph is constituted as a synecdoche (part) of the research process (whole), as scientists learn to connect the data with the original settings (contexts) from which they extracted these. Confronted with the data that have been decontextualized in the process of research, the scientists now can be observed in the process of rebuilding the context that they had stripped away.


This study was designed to better understand the relationship between abstract representations and the concrete contexts from which the former emerge in the course of scientific discovery work. It focuses on how scientists come to understand data and graphs not only because it shows how they think and learn at work but also because it allows us to better “judge the authenticity of classroom activities and their potential to prepare students for out-of-school problem solving” (Gainsburg, 2006, p. 3). The analyses show that in the data analysis meeting (the present one being documentary evidence of what happened in the early meetings generally) the scientists struggle to understand the data that are the result of many steps in the research process that has taken them downstream and away from the source of the data. In this process, context literally is lost as pieces of material are extracted and then subjected to measurements, which themselves undergo extraction and abstraction. These steps are part of a chain of translations that take a natural object and transform it into scientific knowledge claims, where any two chain links are separated by an ontological gap that is bridged by means of concrete practice (Latour, 1993). That is, the scientists do not just “construct” data-related knowledge. Their understanding of data and graphs is a process of undoing the decontextualization that they previously enacted. Understanding the graphs means moving upstream toward the original context of the data. What they can do with a graph is related to their intimate familiarity (a) with the situation where the data have been collected and (b) with the transformations that the things at hand undergo in the laboratory and subsequent computer models. Without this (often tacit) intimate knowledge (ground), the data (figure) mean nothing.
In the following, I discuss the results under four aspects: (a) the process of interpreting data and graphs, (b) understanding what scientists do when they have little to go by, (c) the role of background understanding in interpretation, and (d) the light thrown on existing research on the role of context in students' data and graph interpretations.

Understanding the Process of Interpreting Data: Induction Versus Abduction

Science is concerned with producing concepts and theories that are valid independent of particular locations and situations. This inherently requires disattending to particulars and, as Kant states in the introductory quotations, calls for negative attention. As a result, scientists end up with data points, graphs, and means. The foregoing analyses show that scientists do not move from their specific observations (measurement) to the coho population in general and to a universal law (e.g., in terms of inherently mathematical properties). This constructive process would be referred to as induction (Eco, 1984). In the present study, the relationship between explanandum, the “sentences describing the phenomenon to be explained,” and explanans, the “class of those sentences which are adduced to account for the phenomenon” (Hempel & Oppenheim, 1948, p. 137), comes to be reversed. Scientists use what is to be explained (the description of the difference between the porphyropsin in coho) to look for differences that would explain it (their description of the differences between situations and the coho that derive from them). It is only in the ultimate write-up that the logic of induction comes to be observed, whereby statements of antecedent conditions and general laws derived from the observations are used to explain the phenomenon.

This study shows how an existing graph serves as a lens that frames the new data in terms of their contribution to values and relations already exhibited in the existing graph. In the course of their efforts, the team members mobilize their familiarity with and concrete understanding of the situations from which their specimens have been taken. That is, in their effort to explain some natural phenomenon, the scientists mobilize their understanding of the natural environment. In this study, coming to understand the data takes the form of abduction, defined as the “tentative and hazardous tracing of a system of signification rules which will allow the sign to acquire its meaning” (Eco, 1984, p. 40). The very structure of abduction, where results are explained through the hypothesizing of rules applied to specific cases leading to tentative results that are compared with actual results, which has the structure


makes it reasonable to hypothesize that knowledge of specific cases and their context supports rather than hinders abduction. It is a type of reasoning that also has been observed at work in the form of statistical process control (Bakker, Kent, Derry, Noss, & Hoyles, 2008). In this study, given the decontextualized nature of the data they are looking at, scientists attempt to arrive at an unknown rule such that they arrive at their concrete results when the rule is applied to their specific context. That is, it is not only students who use informal inferential reasoning; highly competent and successful scientists, too, draw on their familiarity and personal knowledge of cases to learn (about) general laws (rules).
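
The structure described in words above can be written out as a rough sketch. It is not a reproduction of the original diagram: the labels, the arrows, and the primed symbol for the tentative result are assumptions made here for illustration, following the triad of rule, case, and result that Eco (1984) takes over from Peirce.

```latex
% Hedged sketch of abduction; layout and symbols are illustrative assumptions.
% Requires the amsmath package (\text, \xrightarrow, \overset).
\[
\underbrace{\text{Result}}_{\text{observed}}
\;\longrightarrow\;
\underbrace{\text{Rule}}_{\text{hypothesized}}
\;\xrightarrow{\;\text{applied to Case}\;}\;
\underbrace{\text{Result}'}_{\text{tentative}}
\;\overset{?}{=}\;
\text{Result}
\]
```

Read from left to right, the sketch moves from the observed result to a hypothesized rule, and from that rule, by way of the specific case, to a tentative result that is checked against the observed one. On this reading, familiarity with the case is what makes the middle step possible.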

The diagram (1) suggests that coming to know graphs involves a double ascension: from abstract to concrete and from concrete to abstract. This double movement is depicted in (1) from rule to result via the case and from result to rule. The two movements are manifestations of one and the same process, abduction, much as thinking and speaking are processes that manifest the process of signification (Vygotskij, 2005). Learning to read a graph means coming to understand the corresponding context, and coming to know a context means coming to know the graph.

It may appear strange that a team of highly competent scientists did not inductively “construct” knowledge beginning with and based on their data. But similar observations have been made some time ago in a sociology research project where team members were tasked with coding outpatient clinic records for the purpose of producing knowledge about treatment criteria (Garfinkel, 1967). It was noted that in the same way as in the present study “coders were assuming knowledge of the very organized ways of the clinic that their coding procedures were intended to produce descriptions of” (p. 20). Moreover, familiarity with the clinic, as with the natural and social settings (hatcheries) in this study, “seemed necessary and was most deliberately consulted whenever, for whatever reasons, the coders needed to be satisfied that they had coded ‘what had really happened’” (p. 20). This is consistent with a philosophical analysis of the relation of the concrete and abstract in scientific understanding: “If we single out the phenomenon . . . and consider it in the abstract, that is, leaving aside all the circumstances that do not flow from its immanent laws, we shall understand nothing in its motion” (Il'enkov, 1982, p. 104, emphasis added). “Concrete understanding,” on the other hand, “coincides with taking into account all those influences exerted upon [the phenomenon] by all the developed and increasingly complicated forms” (p. 105).

What Scientists Do When They Have Little to Go By

This study of graph interpretation among scientists draws on fragments from a team meeting as illustrative material to exhibit the processes by means of which scientists come to grips with relations in their data that they do not immediately understand. The scientists had engaged in this study to test whether changes in the visual system of salmon can be used as a predictor of smoltification, the transformation into the stage at which the coho go to sea. To see what scientists do with data and graphs when they do not yet know the outcome of their experiment, the presented materials focus on an analysis session in the early stages of scientific discovery work rather than at the later stages, when they might be tempted to say that they have had a hunch (or knowledge) all along. The fragments exhibit that what will become evidence concerning the age- and weight-independent nature of porphyropsin levels is not considered at the time. What might have been reason to pursue other routes to understanding the data is not considered during this meeting and therefore is nothing at all. The team did not attend to the similarity as similarity between the porphyropsin levels of different age classes. This finding is interesting in the light of a study among college students in ecology that identifies continued difficulties when there are interactions among multiple variables (Picone, Rhode, Hyatt, & Parshall, 2007). As the present analyses show, even experienced scientists may not recognize a variable as a variable until, for one or another reason, it emerges as something salient.

In the course of producing their data, the scientists initially have little to go by to interpret what they have in hand. It is only because of their intimate knowledge of the transformation that the retina undergoes and of the trajectory of the data that follows, as well as their understanding of the natural environment from which the fish have been collected and of the particulars of the setting, that they can gain some foothold for articulating what the data mean. In the course of the analysis, the scientists are shown to express trouble understanding why the data from the Kispiox Hatchery and the Kispiox wild coho are so different, whereas in the simultaneously conducted study at the Robertson Creek Hatchery, the data from wild and hatchery fish are very similar. Scientists are confronted with data that in the process of the research have been decontextualized, and the whole meeting brings about the emergence of relevant context that helps scientists explain what they have lost or given up earlier.

The Role of Background Understanding

In the course of their interpretive work, scientists exhibit to each other a great deal of biological content knowledge next to the mathematics involved in understanding their data and the graphs that they give rise to. This is indicative of the background understanding that scientists have to bring to the task of interpretation to make sense of their data and that they voluntarily articulate, prior to the actual explanation, when asked to explain a graph from their own research (Roth & Bowen, 2003). It is not only indicative but also symptomatic. When scientists do not have this background information, they struggle to make sense, to the point of suggesting that the graphical representations are poor. The present study shows how the scientists draw on their (a) intimate knowledge of the natural phenomenon and (b) detailed knowledge of the transformation processes that produce the representations. It is their familiarity with the entire research context that comes to be denoted by the graph in synecdochical fashion. When asked, scientists unfold however much is necessary to explain what a graph means or what particular features can be attributed to. The question of how “authentic practices” become supportive of mathematical reasoning (Dierdorp, Bakker, Eijkelhof, & van Maanen, 2011) can be answered in this way: “Authentic practices” allow students to become very familiar with a particular context so that they can draw on this familiarity to produce cases that support their abductive reasoning (see also Cobb & Tzou, 2009). “Authentic” does not denote some special practice (e.g., what scientists or mathematicians do) but means familiarity with specific settings that supports “positive attention” to context and, thereby, supports understanding of general, non–context-specific mathematical and scientific concepts as well.

In this study, we see that the existing canon “biases” what the scientists see in their data and how they see them, which is not surprising given that confirmation bias has been observed in the sciences (Hanson, 1958; Nickerson, 1998). Formulated hyperbolically, scientists do not give their data a chance to speak for themselves. In the early parts of this study, each data point was seen in terms of its possible location on an already published graph. It would take the scientists several years until they found that the porphyropsin levels correlated highly with temperature levels and day length. Sinusoidal functions with temperature or day length as independent variables best fitted the data collected over the period of several years. That is, just as Shelley suggested in the meeting, there is no correlation between body weight and porphyropsin levels; and just as Tiêu hypothesized during the meeting, temperature would become a significant factor. It therefore does not come as a surprise that Shelley, in a move that reconstructed the history of the discovery, would say in an interview after the paper had been published that he had had a sense all along that the data did not fit the dogma.

It has been suggested that the study of a science (including technology, engineering, mathematics) in action requires us to follow practitioners “either before the facts and machines are black-boxed or we follow the controversies that reopen them” (Latour, 1987, p. 258). But studying such a practice during the controversy is not so easy because the existing paradigm (canon) has such a strong hold on the scientists that they have a difficult time seeing their data in a way that would reopen a controversy. In fact, the individuals on the research team who raised doubts were those who did not have histories in the field as long as that of the lead scientist: the doctoral student and the research associate with a background in physics. A second aspect is the canon, which directs them to see the data through a particular lens. The scientists anticipated getting something like the Alexander et al. data and, therefore, see what they get in terms of that study. Thus, the scientists' intimate familiarity with theory and previous research was both an affordance and a constraint and, therefore, their learning process has characteristics not unlike those found among eighth-grade students attempting to mathematically model a winch (Izsák, 2000).

It also has been suggested that without practical familiarity, there would be no interpretation possible (Ricœur, 1986). This is so because, as the analysis of everyday cognition shows, the development of practical understanding is interpretation. It is in interpretation that practical understanding comes to understand itself in an understanding way: “In interpretation, practical understanding does not become something different but becomes itself” (Heidegger, 1927/1977, p. 148). In interpretation, the possibilities only implied in existing practical understanding come to be developed. Interpretation articulates significations, which are already prefigured as possibilities in practical understanding. This study shows that the scientists present to and for each other what they have learned and are familiar with: two different types of creek systems, the windows in the closed rearing facility, the constant water temperature in the Kispiox Hatchery that is much higher than the near freezing temperatures of the creeks, and the fact that the fish culturists release coho 20 months after egg fertilization. This is why the possibilities embodied in practical understanding make sense. It does make sense that the coho in the Kispiox Hatchery are different from those in the wild, and it does make sense that the wild and hatchery-raised coho are similar at Robertson Creek because they are reared in ponds that directly receive their water from the neighboring creek. In and through their effort, the scientists learn about an aspect of the concrete world as it manifests itself in their practical understanding and in an abstracted way in their data. In this approach, new ideas arise from material praxis so that any higher psychological function, including understanding, exists first as a societal and material relation before it exists as conscious thought for the individual (Vygotskij, 2005).

New Light on Existing Studies of Students Interpreting Data and Graphs

The findings in the present study shed new light on and relativize findings of an earlier study on learning graphs in science lessons. It had provided pairs of eighth-grade students with maps divided into plots, each of which contained two measures: one for the amount of light falling onto the plot and the other for the amount of brambles growing on it (Roth, 1996). Students were asked to indicate what patterns they were seeing (if any), what claims they would make based on the data, and to provide evidence that supported their claims. The study provides evidence for the “exasperating” requests for additional information about the contexts from which the data were extracted—even though the story problem had used the students' own curriculum as the frame. The study concludes: “[B]y abstracting the problem from the environment, we had limited the range of options students had available during their field work. A meaningful setting was changed into a puzzle with few options” (p. 518). Students in another study articulated similar exasperating requests, and the students refused to engage in data interpretation until they were provided with more background information (Cobb & Tzou, 2009).

The present study shows that scientists, in the attempt to understand the data from their own research and the graphs that these give rise to, reconstruct the natural environment from which their fish had been taken. It is not as if they were merely looking at mathematical relations. As biologists, these scientists are interested in the natural world, attempting to provide models that explain why nature presents itself to observation in the way it does. Thus, it is precisely the story that is of relevance: some coho salmon living in a creek that includes many ponds and a lake, which gives rise to disease of the kind that the scientists have observed in their fish. In her study of statistics in classrooms, Pfannkuch (2011) may be heard as complaining: “The historical context of prior knowledge about interpreting plots seemed to remain a strong influence when [students] were learning [informal inferential reasoning]. The teacher attempted to sway them from making up stories about the data” (p. 39). It turns out that the scientists in this study have to make up stories, and any attempt to sway them from it might have appeared as irrational, for they discussed environmental conditions and specifics, temperature, age class, weight, time of release, and geographical location as possible causes for the differences (a) between wild fish and fish raised in the hatchery and (b) between two hatchery/river systems. One may concede that because of the difference in object/motives of mathematical/statistical and other activities, including natural science and fish hatching, there should be different forms of practical understanding that have to be brought to the interpretation.


The present study exhibits the difficulties scientists have interpreting decontextualized data. If scientists experience trouble reading their own graphs because of prior decontextualization, this has considerable consequences for science education, where students need more than cursory introductions to graphing to make them display the competencies that science students are expected to display, for example, on international tests such as PISA (Programme for International Student Assessment). This study has at least three types of implications for science education research and practice: (a) the assessment of graphing competencies, (b) the explanation of difficulties students exhibit in interpreting and understanding graphs, and (c) the design of environments where students learn how to interpret data and graphs.

First, science education research tends to expose students to unfamiliar graphs and story contexts. Thus, in research or international achievement studies, students (including middle and high school and university) tend to be asked to interpret data presented in the form of some representation, including histograms or line graphs. A typical task might include a graph featuring two curves, one for the amount of oxygen in a river and another for the number of shrimp found in it (Preece & Janvier, 1992) or featuring the number of deaths per 100 deliveries from puerperal fever that occurred in “Semmelweis' Diary” (Organization for Economic Cooperation and Development, 2006, p. 16 [Item S195]). The present study would suggest that without knowing the context, students could be anticipated to fail. Authentic assessment of graph-related competencies may require practical settings where students actually produce the data, which they interpret after having plotted them. Experimental studies could be designed that investigate graph interpretations when students generated data and graphs on their own versus when the graphs are presented in a story context.

The present study implies that science educators should be thinking about how to present graphing tasks. Graphs tend to be presented in a story context. We may call it con-text, for it tends to be additional text (e.g., a story) that provides a certain naturalistic (societal, material) setting of the data. If the present research findings are generalizable, then the con-text for graph-related activity has to be sought in the understandings that students bring to it: It is not a story problem that will provide this additional (con-) text. Without considerable familiarity with the domain modeled by a graph, students will not likely exhibit the desired competencies. How might science education researchers assess graph-related competencies (e.g., interpretation)? The present study appears to suggest that only when individuals are highly familiar with the source of the data will they produce appropriate interpretations.

Second, this study also implies that science educators may have to rethink how to understand the problems students might have interpreting graphs. In research, there is a tendency to focus on what students do not know, attributing the problems they have in interpreting graphs to misconceptions and other (cognitive) deficits (e.g., Asher, Nasser, Ganaim, & Tabak, 2010). As the present study shows, cognitive deficits and misconceptions may not be the best explanations, because even successful and experienced scientists are not quite as competent as often presupposed. To understand their own graphs, scientists had to recontextualize them in a variety of ways. The reasons for not reading graphs in the accepted way may derive from sources other than cognitive deficits. For example, the quality of graph interpretations may depend on the degree of familiarity that a person has with both the representation and the phenomenon inscribed (e.g., Triantafillou & Potari, 2010). Further research among students needs to be conducted to understand the extent to which lack of familiarity rather than cognitive deficit explains difficulties in interpreting graphs.

Third, in the present study, scientists came to understand their data only when they recontextualized these in the context from which they had abstracted them. If learning how to interpret requires successful interpretation experiences and if successful interpretation requires familiarity with the source of the data, then students need to have opportunities to generate the data that they graph. That is, rather than teaching graphing by providing students with data sets or finished graphs, science educators may have to go all the way: teaching graphs and graphing in the course of inquiry. As there is some evidence that preparing graphs for the purpose of convincing peers about facts and relations observed supports learning (e.g., Cobb & Tzou, 2009; Roth, 1996), providing students with opportunities to argue about their graphs may further enhance opportunities for learning how to interpret graphs.


The transcriptions were produced following the conventions established in conversation analysis (Selting et al., 1998) with the exception of timing, which, for the purpose of economy, has not been reproduced here. The entire transcription uses lowercase letters, and the specific notations include the following:

G: eh n[ice]          Brackets indicate extent of overlap
S: [yea]
i = m                 Equal sign indicates latching, that is, sounds flowing into each other or two speakers speak without pause or overlap
((Figure 4))          Transcriber's comments in italics and within double parentheses
((00:21:50))          Running time on the digitized video
,?;.                  Punctuation is used to indicate intonation of the utterance as a whole: slightly upward, strongly upward, slightly falling, strongly falling
(2) ()                Pause in seconds; pause of unmeasured length
e:h                   Each colon indicates the lengthening of phoneme by about 1/10th of a second
<<acc>jst so you>     Accelerando (acc) denotes increased speech rate
<<p>and so>           Piano (p) and pianissimo (pp) denote lower and much lower than normal volume by this speaker
<<pp>and so>
<<f>okay>             Forte (f) denotes louder than normal speech of this speaker
KISpiox               Capital letters denote special emphasis compared to normal speech of the speaker
*                     Asterisk marks coincidence between speech and offprint shown
(:E)                  Denotes participant directly addressed by the speaker
hh                    Clearly marked exhalation
.hh                   Clearly audible inhalation


This study was supported by a joint grant from the Social Sciences and Humanities and Natural Sciences and Engineering Research Councils of Canada. All opinions are my own.