Data collection and sampling in qualitative research: does size matter?


Correspondence to M. Cleary:


In this editorial we discuss issues relating to gaining quality data from individual and focus group interviewees. The primary strength of qualitative research is its potential to explore a topic in depth (Carlsen & Glenton 2011). Qualitative research is epistemologically grounded in social constructivist, symbolic interactionist, or other interpretive conceptual frameworks or perspectives. It can draw on a range of information-gathering and analysis methods, which affect how participants are selected and when data collection should stop.

Participant selection should have a clear rationale and fulfil a specific purpose related to the research question, which is why qualitative methods are commonly described as ‘purposive’ (Collingridge & Gantt 2008). Nevertheless, it is important to determine the extensiveness of data collection processes – breadth, depth and scope – when planning a qualitative project (Sobal 2001). Who and how many participants will depend on ‘what you want to know, the purpose of the inquiry, what's at stake, what will be useful, what will have credibility’ (Patton 1990, p. 184, cited in Sobal 2001). Broadly, informants are selected because of their personal experience or knowledge of the topic under study.

Key participant selection principles are as follows:

  • Small numbers are studied intensively.
  • Participants are chosen purposefully.
  • Selection is conceptually driven by the theoretical framework (or in the case of grounded theory – evolving concepts that arise from interview material).
  • It is commonly sequential rather than pre-determined.
  • A rationale for selection is necessary (Curtis et al. 2000, Tuckett 2004, Walsh & Downe 2006).

First, to evaluate qualitative research approaches, participant selection must be congruent with the conceptual framework. Participants should be likely to generate rich, dense, focused information on the research question to allow the researcher to provide a convincing account of the phenomenon (Curtis et al. 2000, Walsh & Downe 2006). Second, ethical conduct feeds into locating informants, the ability to justify the inclusion and exclusion of potential interviewees, and the nature of the relationship and interactions between interviewers and interviewees (Curtis et al. 2000). Given that the researcher and the interviewee are the interactive source of information in interview-based qualitative studies, verbal fluency, clarity, and explicatory and analytical abilities are central to the possibility of gathering in-depth information (Patton 1990, Sandelowski 1995, Sobal 2001).

The adequacy of participant numbers involves thoughtful decision-making; too few may put adequate depth and breadth at risk, but too many may produce superficial or unwieldy volumes of data (Sandelowski 1995, p. 179). For example, an experienced interviewer, with a clearly defined research topic, and a small number of well-selected homogeneous interviewees (with adequate exposure to or experience of the phenomenon) can produce highly relevant information for analysis. An inexperienced interviewer with a variable and very large sample could produce superficial data, providing a false sense of security and/or generating large amounts of information non-conducive to in-depth analysis. It is, therefore, important that qualitative researchers justify their sample size on the grounds of quality data – something that should be clearly reflected in the presentation of the study's findings. It is also important that at least the majority of participants are represented in the presentation of data. It is not uncommon to see papers that report a sample of 20, but on close inspection data are actually presented from fewer than half of the sample – did the others not say anything?

If sample size is not central to qualitative research as such, what criteria are used to cease interviewing? The key difference between quantitative (where the term sample size originates) and qualitative research is that the latter is an interactive process via which information – some of which will become ‘data’ – is likely to emerge (e.g. individual and focus group interviews).

Stopping information gathering is dependent on ‘redundancy’ of information or ‘saturation’. Redundancy is ‘the process of sequentially conducting interviews until all concepts are repeated multiple times without new concepts or themes emerging’ (Trotter 2012, p. 399). That is, analysis is carried out after each interview and when the researcher finds the conceptual wellspring has dried up and interviewees reiterate each other’s ideas, one way or another, redundancy has been achieved. Saturation is reached when ‘all questions have been thoroughly explored in detail [and] no new concepts or themes emerge in subsequent interviews’ (Trotter 2012, p. 399). The term saturation initially arose from grounded theory, which is a complex and demanding qualitative research methodology; however, in recent years a significant proportion of qualitative researchers have taken up the term without adhering to its stringent methods of concurrent data collection and analysis for theory development (Carlsen & Glenton 2011).
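For researchers who keep a running record of the codes assigned after each interview, the redundancy check described above can be tracked with a simple tally. The following is only an illustrative sketch, not a procedure taken from Trotter (2012) or from grounded theory: the function names, the `run_length` threshold and the example theme labels are all hypothetical, and the threshold in particular is an assumption that would need its own justification in any real study.

```python
from typing import Iterable, List, Optional, Set


def new_codes_per_interview(coded_interviews: Iterable[Set[str]]) -> List[int]:
    """Count how many previously unseen codes each interview contributes."""
    seen: Set[str] = set()
    counts: List[int] = []
    for codes in coded_interviews:
        fresh = set(codes) - seen  # codes not raised by any earlier interview
        counts.append(len(fresh))
        seen |= fresh
    return counts


def redundancy_point(
    coded_interviews: Iterable[Set[str]], run_length: int = 2
) -> Optional[int]:
    """Return the 1-based number of the interview that completes
    `run_length` consecutive interviews with no new codes, or None.

    `run_length` is a hypothetical threshold chosen for illustration.
    """
    run = 0
    counts = new_codes_per_interview(coded_interviews)
    for index, fresh in enumerate(counts, start=1):
        run = run + 1 if fresh == 0 else 0
        if run >= run_length:
            return index
    return None


# Hypothetical theme labels, coded after each of five interviews.
interviews = [
    {"stigma", "access"},
    {"access", "cost"},
    {"stigma", "cost"},
    {"access"},
    {"trust"},
]

print(new_codes_per_interview(interviews))  # [2, 1, 0, 0, 1]
print(redundancy_point(interviews))         # 4
```

In this invented example the tally would suggest stopping after the fourth interview, yet the fifth raises a new theme. A count of this kind can document the decision to stop, but it cannot show that no further important ideas would have emerged.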

When to stop interviewing more candidates or groups is guided by the principles of adequacy and appropriateness as well as analytical redundancy whereby, given previous interview material, one or many more will not make further contributions or provide additional insights (Sobal 2001). Curtis and colleagues (2000, p. 1012) summarize it this way: ‘what is most difficult in designing accountable sampling strategies is finding the right balance between conflicting criteria for subject selection’. Because qualitative research is a process seeking to generate a ‘product’, there are inevitable challenges during various phases in all studies that include – but are not confined to – issues around how many people to interview and how one can be confident in this case that no more important ideas will emerge.

Another sample size issue in qualitative research relates to the growing use of focus group research. Presumably such methods of collecting information are called focus groups because the discussion is focused, not meandering or tangential (Twohig & Putnam 2002). Convention has it that more than one focus group is necessary unless the methods are mixed, in which case the qualitative group-derived data invariably become the inferior branch of the project (Carlsen & Glenton 2011). Why must there be more than one? The idea of numbers and plentiful data must not distract qualitative researchers and lead them to be uncritical when considering – what is the best way to gain the in-depth information required to address the research question?

Another question to be asked during the planning phase is: what is the advantage of a focus group? From a commonsense qualitative perspective, focus groups are not just a method to add numerical weight to the project: they take advantage of interactions between participants that allow reciprocation, exploration and elaboration of ideas that may not have occurred outside the group (Bender & Ewbank 1994, Kitzinger 1995). This potential for synergistic ‘sparking-off’ between group members cannot occur in one-on-one interviews. If the main determinant of the quality of qualitative research is the skills and characteristics of the interviewer and the interviewee, then this is even more crucial for focus groups. Another important factor in both individual interviews and focus groups is the interview guide, which should be piloted (Halcomb et al. 2007); as Twohig and Putnam (2002) point out, a ‘lame duck’ question in one focus group may precipitate spirited disputation in another. In addition, qualitative researchers should address the choice of focus group over interview – or vice versa. Inappropriate use of focus groups could detract from quality data – especially in topic areas that are emotive and personally sensitive – yet not using focus groups to explore an issue of clinical or professional debate is hard to justify.

If numbers are relevant to qualitative research using focus groups, it may well be the number of people in the group that is more important. According to Carlsen and Glenton's (2011) review of 220 focus group research publications, almost half failed to mention the minimum and maximum number of participants. How could a focus group with 2–4 people generate sufficiently extensive relevant information; benefit from group interactivity; prevent socially correct responses; avoid non-conducive activities such as the researcher holding the floor; or stop participants feeling shy about contradicting a fellow interviewee? At the other end of the scale, who amongst us has the skills to manage in-depth focused information gathering from more than twelve participants (even with a digital recorder/note taker) who are unknown to each other, within a timeframe of 90 minutes?

Fundamentally, for focus group methods of data gathering, the phenomenon being investigated should determine how many participants each group needs to optimize interactivity, and the research question itself guides researchers in deciding how many focus groups are needed and why. The use of 3–5 focus groups is well-documented (Twohig & Putnam 2002), but this may turn out to be another oft-repeated rule-of-thumb that avoids rigorous questions about what, who and why. Where adequate data collection and the emergence of themes across and between groups are sought, using fewer than two groups raises the question of true representation (Halcomb et al. 2007), if extensiveness and representativeness (borrowed from quantitative research principles) are pertinent aims.

Accurate reporting of qualitative research methods during all stages is important for reader comprehension and transparency. When discussing focus group interviews, Twohig and Putnam (2002) consider that recruitment details, the range of the groups and a description of the participants, along with inclusion and exclusion criteria should all be clearly documented. Readers and reviewers should be able to work out what was done, how it was done and why, from the published manuscript. This includes reporting unexpected practical adaptations. In their review, Carlsen and Glenton (2011) found inconsistent method and analysis documentation in published qualitative research, as well as a lack of evidence of rigour. Consequently they suggest that journal editorial teams could benefit from using guidelines for reporting qualitative methods, such as RATS (see

In this editorial we have explored a range of issues around qualitative research planning and participant selection for individual and focus group interviews. Interviewer skill, reasons for including and excluding participants, the connection between data analysis and the number of interviewees, and transparent reporting of decision-making have also been addressed.