The Challenges of Qualitatively Coding Ancient Texts


Correspondence should be sent to Edward Slingerland, Department of Asian Studies, University of British Columbia, Asian Centre, 403-1871 West Mall, Vancouver, BC V6T 1Z2, Canada. E-mail:


We respond to several important and valid concerns raised by Klein and Klein about our study (“The Prevalence of Folk Dualism in Early China,” Cognitive Science 35: 997–1007), defending our interpretation of our data. We also argue that, despite the undeniable challenges involved in qualitatively coding texts from ancient cultures, the standard tools used throughout the cognitive sciences—large quantities of data, coders as blind to the hypothesis as possible, intercoder reliability measures, and statistical analysis—allow the noise of randomly distributed interpretative differences to be distinguished from the signal of genuine historical patterns.

Klein and Klein raise several important and valid concerns about our study.

Challenge 1: Genre. As Klein and Klein note, we do consider the possibility that the trend that we observed is driven by genre. Considering their interesting demonstration that our effect disappears if the Shi Jing data are removed, this hypothesis should, and will, be explored by looking at later periods where a more balanced corpus exists. One reason to provisionally reject the genre hypothesis is that the late-Warring States conception of xin seems to persist throughout later Chinese literature, even once lyric poetry again becomes an important genre. Also, the Shi Jing is not entirely composed of lyric poetry: It contains substantial sections of state hymns, rhymed accounts of formal ritual performances, and other material where the emotion-bias would not be expected, so simply eliminating all of these data does not isolate the lyric-poetry effect. Finally, in a field where data are scarce, we worry that removing this substantial quantity of data produces a far less representative picture of the past; after all, the Shi Jing materials were most likely composed and compiled in the pre-Warring States period. We believe that the most rigorous approach was for us to quantify these sources and draw the most general conclusion from the complete data set. We made our data publicly available so that other scholars could make empirically informed arguments for alternative conclusions by considering special subsets of the data (and we are very pleased that Klein and Klein have done just that). Thus, we consider this point less a criticism of our initial conclusions and more the sort of healthy, empirically quantified debate at the intersection of cognitive science and the humanities that we hoped our contribution would spur.

Challenge 2: Inference from contrasts. Klein and Klein argue that:

An analysis done on this journal would probably show that the term “brain” is most often associated with cognition. That hardly shows that its contributors are closet dualists . . . Contrasts between the cognitive functions of xin and those of other body parts could well serve only to emphasize these differences in function, rather than establish a difference in kind.

First, we agree that early Chinese writers were likely describing differences in function rather than explicitly insisting on differences in kind—even in “the West” only rare philosophers insist on differences in kind. We merely think that people whose intuitions incline them to think about minds and bodies separately would consequently be particularly inclined to notice and record just these mind-body functional contrasts and not others, just as the ancient Chinese did. Second, drawing empirically grounded historical inferences is an inherently inductive endeavor. We believe that the combination of several lines of evidence—xin’s being exclusively contrasted with the body, its becoming more associated with higher cognitive abilities as written language spreads, experimental evidence of dualist intuitions in diverse contemporary adult and child populations (Chudek et al., unpublished data; Cohen, Burdett, Knight, & Barrett, 2011)—does paint a compelling picture. If contributors to Cognitive Science consistently associated “brain” with higher cognitive functions and consistently contrasted “brain” (and only “brain” among all the organs) with the body and behaved like dualists in experiments, this would be reasonable evidence that they are closet (or not so closet) dualists.

Challenge 3: The qualitative coding problem. The authors are absolutely correct that there are many challenges involved in both comprehending and coding classical Chinese texts, that assumptions will shape how one reads a given passage, and that informed experts might very well disagree on the coding of a particular passage. We originally discussed this issue with some examples (later cut because of space limitations) of where our coding decisions differ from interpretations offered in the expert literature: particularly tricky are the “meta-codes” involving implicit versus explicit contrasts or identifications of xin and the other organs or the body, where insight into the rhetorical background could result in entirely opposite codes being given to a passage. This is an issue that the first author intends to address specifically in a longer, follow-up article presenting this study to his colleagues in Asian and Religious Studies—a group steeped in the challenges of the interpretative process (Slingerland, unpublished data). Here, we will merely note that our goal in pioneering such large-scale quantitative coding of historical data is precisely to gain some empirical traction on this challenge. Large samples and statistical inference allow the noise of randomly distributed interpretative differences to be distinguished from the signal of genuine historical patterns. These methods also quantify qualitative disagreements, providing measures of intercoder reliability that specify just how much difference in interpretation exists. Finally, they provide a path out of endless cycles of disagreement by specifying precisely documented techniques for resolving it, techniques that can be replicated, systematically altered, and statistically analyzed. While Klein and Klein seem more wary than we are about the challenges, we are more enthusiastic about the potential of quantitative empiricism to meet them.
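To make concrete what it means for an intercoder reliability measure to "specify just how much difference in interpretation exists," the following is a minimal sketch of Cohen's kappa, one widely used chance-corrected agreement statistic for two coders. It is offered as an illustration only: the text above does not say which reliability measure the study used, and the function name `cohens_kappa`, the code labels, and the ten example passages are all hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assigned one code per passage."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of passages")
    n = len(codes_a)
    # Observed agreement: share of passages given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of the coders' marginal rates, summed over codes.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes for ten passages (illustrative only, not data from the study).
coder_1 = ["cognition", "emotion", "cognition", "body", "emotion",
           "cognition", "cognition", "emotion", "body", "cognition"]
coder_2 = ["cognition", "emotion", "emotion", "body", "emotion",
           "cognition", "cognition", "cognition", "body", "cognition"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the coders agree on eight of ten passages, but kappa is lower than 0.8 because some of that agreement would be expected by chance given how often each coder uses each code.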

The need for interpretation is an inevitable issue in any sort of qualitative coding. As recent well-publicized controversies attest, coding in various branches of the cognitive sciences (say, primate behavior) is anything but unproblematic; there is no principled reason for thinking that classical Chinese texts (or written texts more generally) are unique in this regard. Any sort of qualitative coding involves interpretation, influence of background assumptions, and subjective judgments (hence the adjective “qualitative”). We attempt to respond to these potential problems with the standard tools used throughout the cognitive sciences: coders as blind to the hypothesis as possible, intercoder reliability measures, and statistical analysis. We are currently running a follow-up study that employs an entirely new team of coders with very different intellectual and cultural backgrounds, and preliminary results support the trends we report in our study. But of course, another check would be ideal: independent replication by separate laboratories. We hope that our study will inspire other researchers to attempt to replicate our results, as well as to extend our methods to new data.
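The idea that randomly distributed interpretative differences behave like noise in a large sample can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions, not a model of the study: every passage has a single "true" binary code, coders misread it independently with a fixed probability, and all rates and sample sizes below are invented.

```python
import random


def observed_rate(true_rate, n_passages, error_rate, rng):
    """Share of passages coded 'cognitive' when each code is flipped
    independently with probability error_rate."""
    coded = 0
    for _ in range(n_passages):
        truly_cognitive = rng.random() < true_rate
        misread = rng.random() < error_rate
        coded += truly_cognitive != misread  # a misreading flips the code
    return coded / n_passages


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# All numbers are invented for illustration; none come from the study.
early = observed_rate(true_rate=0.30, n_passages=2000, error_rate=0.20, rng=rng)
late = observed_rate(true_rate=0.60, n_passages=2000, error_rate=0.20, rng=rng)
difference = late - early
print(f"early = {early:.2f}, late = {late:.2f}, difference = {difference:.2f}")
```

Under these assumptions, unbiased random misreadings shrink the observed difference between periods (here toward roughly 0.18 rather than the true 0.30) but do not erase or reverse it. The sketch says nothing about systematic bias shared by all coders, which is the problem that blind coding and independent replication are meant to address.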

Challenge 4: Dating of texts. Again, the authors are correct that the dating—even rough—of texts from the pre-Qin period is controversial, not least because, like most pre-printing-press texts, they are rather permeable, taking in material from different time periods and subject to scribal and editorial whims. It is difficult to respond to this challenge in this venue because this is a matter for expert dispute (which is why we did not bring it up in a piece aimed at cognitive scientists). On this topic, there are various factions within the field of early Chinese studies, ranging from scholars who still defend a very clear and “traditional” chronology of early texts to what we would characterize as a “radical fringe” that has argued for extreme textual indeterminacy in all pre-Han texts (e.g., Brooks & Brooks, 1998). We place ourselves somewhere in the middle and would stand by the claim that the three-part periodization that we employ is defensible on both philological and philosophical grounds (see Goldin, 2011; Slingerland, 2000). In any case, very few would deny that the contrast between the pre-Warring States texts (at least the Shi Jing and Shu Jing, and much of the Zuo Zhuan) on the one side and the early- and late-Warring States texts on the other is well founded, on both linguistic and historical grounds.

We are pleased that our preliminary study has already inspired a quantitative response from humanists. Like Klein and Klein, we hope that it inspires a host of follow-up studies—not only on our own data and the absolutely massive corpus from later periods of Chinese history but also on the entire panoply of “data from dead minds” from past cultures around the world.