I appreciate de Witt's and Ploeg's efforts to find both criteria and language more suitable for evaluating interpretive phenomenological research. This is an especially worthy goal as phenomenological inquiry is probably the most difficult to communicate of the qualitative research traditions. Yet, I object to the way they have used my work to accomplish this task.

To frame their discussion of the inadequacies of appraisal criteria for phenomenological inquiry and to advance their own framework, de Witt and Ploeg chose my paper on rigor in qualitative research (Sandelowski 1986). They argue that my 1986 discussion of rigor – a word that I later regretted as conveying methodological rigidity rather than quality in craftsmanship (Sandelowski 1993) – is inadequate and inappropriate for interpretive phenomenology, which requires language that better expresses quality criteria unique to interpretive phenomenology.

The problem here is that what I proposed 20 years ago was not intended to be used to evaluate any one study in any one qualitative research tradition, but rather to draw general lines between qualitative and quantitative research at a time when qualitative research was less understood but increasingly being used in nursing. By drawing these lines, I had hoped to offer qualitative researchers a way to communicate differences when the comparison with quantitative research seemed the best vehicle to communicate them. The influence of postmodern ideas (especially concerning difference), the increasing emphasis on the narrative quality inherent in all inquiry (Elliott 2005), and the turn away from qualitative vs. quantitative to qualitative and quantitative discourses, have made this objective seem naïve and ill-advised in 2006. I too now lament the continuing use of quantitative research as a comparative reference point for qualitative research and the lack of attention to diversity within qualitative research. And I have long wished that researchers would stop using my 1986 paper in their grant proposals and reports of qualitative research. Perusal of my publications since then will show that I no longer subscribe to many of the ideas presented in my 1986 paper, and a vast literature now exists offering more sophisticated treatments of quality in qualitative research.

2006 is now and 1986 is then. Albeit with the best of intentions, and despite the disclaimer that they selected my 1986 paper because many of the ideas presented in it are still debated today, de Witt and Ploeg commit the cardinal error of removing a work from its historical context and, from a present-oriented vantage point, criticizing it for not meeting the needs of the present. I have seen this error committed repeatedly with my own and other scholars’ work whereby they are charged with failing to do something they never intended; failing to know then what any enlightened person knows now; failing to know then what only the critic knows now; or failing to adhere then to an academic discourse that is in fashion now.

Hindsight is truly the best sight of all. Yet 20 years ago, with my now admittedly old-fashioned goal of drawing lines between qualitative and quantitative research, I still stated (Sandelowski 1986, p. 28) that given the diversity in qualitative methods, one generic framework was inadequate for evaluating rigor in all qualitative research. I even mentioned phenomenology among a host of methods for which the application of any single set of criteria would not be suitable. Yet, despite this clear statement of the very argument that de Witt and Ploeg advance in their ‘critical appraisal’ paper, they chose to use the 1986 paper as a foil to argue against a practice that I was already against then and I am still against now: namely, the indiscriminate use of one set of criteria for evaluating all qualitative research.

The central focus of de Witt's and Ploeg's argument is on the inadequacies of language to express phenomenological imperatives. Their premise is that the language in my 1986 framework masks and obscures the unique characteristics of interpretive phenomenological inquiry. The words credibility and confirmability are especially poor because, as they see it, these words convey positivist notions of a single truth, whereas phenomenology is about multiple truths. Yet, in my 1986 paper, I specifically (albeit briefly) addressed the various understandings of ‘truth’ in qualitative research. And I addressed it again in my 1993 paper (which de Witt and Ploeg cited) and in other papers I have written since then (e.g. Sandelowski 1996, 2006; Sandelowski & Barroso 2003). In addition, who today would argue against the idea – no matter what method is addressed – that there are many truths and many meanings of truth? Yet the authors do nothing to explicate phenomenological truth and how it diverges from other truths, except to say that it is concealed in the language of the 1986 framework. I also specifically discussed the subjectivity involved in ‘qualitative’ notions of confirmability.

The words I used in my 1986 publication were drawn from the Guba and Lincoln (1981) work and, again, intended to appeal to audiences of qualitative researchers trying to communicate with quantitative researchers for whom these words would have meaning, and to audiences of quantitative researchers trying to understand what to them was perceived as a new and even disturbing turn of events in nursing research. Yes, of course, words are important; words create worlds, and words obscure and mask. The entire academic enterprise rests on debating the uses and meanings of words, and discourse analyses are now in vogue. But the integrity of the academic enterprise depends on getting scholars’ words right (i.e. accurately representing them) before they are charged with getting the ideas conveyed by them wrong (i.e. erring in their views). Without this foundation in descriptive validity (Maxwell 1992), scholarly criticism has no ‘credible’ foundation. Words are certainly open to different interpretations and their meanings are by no means always so transparent as to leap off the pages of a work. But there is no excuse for the kind of mistake whereby an author explicitly wrote in plain words that ‘the sky is blue’, but a writer quotes that author as saying that ‘the sky is green’ and then proceeds to argue for the position that the sky is indeed blue. Contemporary discourses legitimately trouble reading and writing practices and illuminate the complicated relationship between readers and writers of texts. But these discourses cannot be used to defend flagrant errors in reading.

Yet, incredibly, having found the language of my 1986 framework too concealing, de Witt and Ploeg themselves advance a series of new words that are, arguably, even more concealing, three of which fail to differentiate phenomenology from other qualitative and even quantitative inquiry. These words include balanced integration (their word for ensuring congruence between philosophy and method, and for juxtaposing participant and researcher voices, concerns in all qualitative research); openness (their word for making study procedures transparent, a concern in any inquiry, and for being open to the target phenomenon, a concern in all qualitative inquiry); and concreteness (their word for descriptive vividness and detail, and for utility, concerns in other qualitative projects). Only in their discussion of resonance (a version of vicarious experience) and actualization (an interesting idea they read off my work, but the temporal significance of which I did not recognize at the time, that experiential validation of findings is sometimes achieved long after the finding is read), do the authors approach meeting their objective to address quality issues distinctive to interpretive phenomenology. Moreover, the examples offered from de Witt's own research do not serve to show the distinctiveness of the new words they propose. Especially troubling is the comment preceding the example of resonance that ‘additional explanation of resonance is deliberately withheld…to facilitate its experiential effect upon the reader’ (p. 226). Substituting the hope of having an ‘experiential effect’ on the reader for a clear explanation of method will hardly clarify phenomenological inquiry to those interested in conducting it. Nor will it advance the phenomenological research enterprise to the ‘publishers and funding agencies’ (p. 216) to whom de Witt and Ploeg (2006) hope their framework will appeal.

de Witt and Ploeg charge me with muddling methods because, as they see it, I referred to a kind of sampling used in grounded theory (i.e. theoretical) that is not appropriate for every kind of research and especially phenomenological research. The sentence in my 1986 paper reads (p. 31): ‘…sampling is often (emphasis added) theoretical rather than statistical’. The use of the word often was to signal my understanding that not all sampling in qualitative research was theoretical. The line drawn here was between informationally representative (purposeful) sampling as opposed to statistically representative (probability) sampling, not between varieties of purposeful sampling. The word theoretical here was intended to convey something generic about sampling in qualitative research. The sentence containing this word is embedded in two columns of text about sampling, intended to further the comparative lines I wanted to draw between qualitative and quantitative research, not between phenomenology and any other qualitative method. Yet de Witt and Ploeg chose to ignore the context for my use of the word, a remarkable error given their devotion to the importance of context in their own paper.

Of greater concern than my so-called muddling of methods is de Witt's and Ploeg's claim that issues of typicality and variation do not matter in phenomenological research and that the only sampling criteria are willingness and ability to articulate an experience. Not only is this claim uncomfortably close to defending convenience sampling as the strategy of choice in phenomenological research, but it denies that the selection of participants always implies the use of criteria (even if unrecognized) beyond willingness to participate. Do de Witt and Ploeg really want to argue that all the different kinds of variations attendant to the Alzheimer's patients whose words they feature in quotations in their paper do not matter in phenomenological inquiry?

All efforts to name, to categorize, and to frame both clarify and obscure. Depending on the purposes of the writer, differences may be minimized or maximized, and generic (in this case, English-language) meanings of words may be intended over specific technical meanings. Words like validity, bias and objectivity are not owned by any one research community: indeed, they have meaning outside the research enterprise as a whole. Although the greater onus is on writers to clarify their meanings, the onus is also on the reader to understand the diverse ways in which the same words can be used to signal different meanings, and the way that the same meaning can be signalled by different words. After all, de Witt's and Ploeg's entire paper is about substituting one set of words for another, and, in three of five of their cases of substitution, to signify the same meanings for words that they found inadequate.

de Witt and Ploeg also seem to conflate the words I used in the 1986 paper with how others have used or might use them. For example, they acknowledged that I had explained the nuances of credibility well in the text of my paper, but then claimed that the future impact of credibility was concealed by the word. Apart from supporting the point I have already made here about the importance of the context of words, what do the authors mean by this: that the word is not good because others may not use it as a writer defined it, that readers may not accept the definition a writer gave but just use the word to mean something else? How responsible are writers for how others might use their words?

Had de Witt and Ploeg simply stated that appraisal frameworks, such as my 1986 framework, continue to be indiscriminately applied to all qualitative research and that such a use is no longer appropriate, I would have applauded them. The 1986 paper belongs to its time and it should be treated that way: even better, it should be retired.


References
  • De Witt L. & Ploeg J. (2006) Critical appraisal of rigor in interpretive phenomenological nursing research. Journal of Advanced Nursing 55(2), 215–229.
  • Elliott J. (2005) Using Narrative in Social Research: Qualitative and Quantitative Approaches. Sage, London.
  • Guba E. & Lincoln Y.S. (1981) Effective Evaluation. Jossey-Bass, San Francisco, CA.
  • Maxwell J.A. (1992) Understanding and validity in qualitative research. Harvard Educational Review 62, 279–300.
  • Sandelowski M. (1986) The problem of rigor in qualitative research. Advances in Nursing Science 8(3), 27–37.
  • Sandelowski M. (1993) Rigor, or rigor mortis: the problem of rigor in qualitative research revisited. Advances in Nursing Science 16(2), 1–8.
  • Sandelowski M. (1996) Truth/storytelling in nursing inquiry. In Truth in Nursing Inquiry (Kikuchi J.F., Simmons H. & Romyn D.M., eds), Sage, Thousand Oaks, CA, pp. 111–124.
  • Sandelowski M. (2006) ‘Meta-jeopardy’: the crisis of representation in qualitative metasynthesis. Nursing Outlook 54, 10–16.
  • Sandelowski M. & Barroso J. (2003) Classifying the findings in qualitative studies. Qualitative Health Research 13, 905–923.