Corresponding author: Mahmood F. Bhutta, Nuffield Department of Surgical Sciences, University of Oxford, Level 6, John Radcliffe Hospital, Headley Way, Headington, Oxford OX3 9DU, UK. Tel: +44 (0)1865 220532; Fax: +44 (0)1865 768876; E-mail: email@example.com
Collection of saliva for DNA extraction has created new opportunities to recruit participants from the community for genetic association studies. However, sample return rates are variable. No prior study has specifically addressed how study design impacts sample return. Using data from three large-scale genetic association studies we compared recruitment strategy and sample return rates. We found highly significant differences in sample return rates between the studies. In studies that recruited retrospectively, overall returns were much lower from families with a self-limiting condition who provided samples at a research centre or home visit than from elderly adults with a chronic disease who provided samples by post (59% vs. 84%). Prospective recruitment was associated with high agreement to participate (72%), but low subsequent return of actual saliva samples (42%). A telephone call had a marginal effect on recruitment in a retrospective family study, but significantly improved returns in a prospective family study. We found no effect upon DNA yield comparing observed versus unobserved sample collection, or between male and female adult participants. Overall, study design significantly impacts upon response rates for genetic association studies recruiting from the community. Our findings will help researchers in constructing and costing a recruitment protocol.
The ability to collect DNA from saliva has created new opportunities for recruitment to large-scale genetic association studies. Saliva collection is noninvasive, can be self-administered, and returns high yields of good quality DNA (Rylander-Rudqvist et al., 2006; Rogers et al., 2007; Bahlo et al., 2010). A key requirement for any genetic association study is to maximise sample size and recruitment efficiency. In studies recruiting from the community, saliva sample returns have been shown to be higher than those of blood (Hansen et al., 2007), but with reported return rates varying between 52% and 80% (Etter et al., 1998; Etter et al., 2005; Nishita et al., 2009).
An understanding of the factors that underlie this variable return rate can inform the design, costing, and analysis of genetic association studies. In particular, this knowledge may influence the choice of prospective versus retrospective identification of participants and the demographic of the populations studied, suggest alterations to the protocol to target populations less likely to return samples, or prompt consideration of differential return as a source of ascertainment bias. No prior study has specifically addressed the question of how study design might impact sample returns.
We had access to recruitment and return data from three large-scale genetic association studies (Table 1). Each of these studies recruited from the community, but used different recruitment strategies. The first study concerns Dupuytren's Disease, a disease that affects the middle aged and elderly and leads to palmar fibrosis, often causing long-term disability (Dolmans et al., 2011). The second study (http://www.ichr.uwa.edu.au/om; Rye et al., 2011) looks at recurrent acute otitis media (rAOM), a disease that is prevalent in infancy and can lead to significant morbidity, but which has usually disappeared by late childhood. The final study (http://www.har.mrc.ac.uk/geneticsomstudy) also looks at childhood otitis media, but focuses on chronic otitis media with effusion (COME), a disease that can cause persistent childhood hearing loss, but which again has usually resolved by late childhood. All three studies collected saliva using the OG-250 pot (Oragene, DNA Genotek Inc, Ontario, Canada), but recruitment protocol and participant demographics differed. We demonstrate a significant variation in returns between these studies, and suggest factors that may underlie this.
Table 1. Study characteristics for each of the three studies
[Table body not recovered. Surviving labels: mean age; recurrent acute otitis media; chronic otitis media with effusion.]
Study A was approved by Oxfordshire Research Ethics Committee. Study B was approved by the Human Research Ethics Committee at the Princess Margaret Hospital for Children, Subiaco, Western Australia. Study C was approved by Oxfordshire Research Ethics Committee. Written consent was obtained from either the participant or their legal guardian in all studies.
Study A was a UK multicentre study looking at the genetics of Dupuytren's Disease. Adult probands were defined as those who underwent surgery for Dupuytren's disease between 2003 and 2009 at a participating centre, and were identified by a retrospective review of operative logbooks. All living patients were sent a letter of invitation with an information sheet, a reply slip, and a prepaid return envelope. Those who returned the reply slip were sent a recruitment pack, which included a saliva collection kit and a prepaid envelope for return of samples. A single reminder letter was sent to those who did not return their recruitment pack after a minimum of two months. This study was a genome-wide association study, and so samples were only requested from the proband. With regard to ethnicity, 99.8% of probands described themselves as Caucasian (Fig. 1).
Study B (Rye et al., 2011, http://www.ichr.uwa.edu.au/om) is a Western Australian study looking at the genetics of rAOM. Again, probands were identified by a retrospective review of patient records from 2003 to 2008 identifying as eligible those who had had three or more physician-diagnosed episodes of otitis media prior to three years of age, with recommendation for ventilation tube (grommet) insertion. This was a family-based study and so parents and affected siblings were also recruited. Eligible families were invited to participate by letter signed by the lead researcher and the family's surgeon. Families were also provided with an information leaflet designed in consultation with a Community Reference Group, and could respond by prepaid reply slip, telephone, or email. Telephone contact was attempted at least once for those who did not initially respond after a minimum of two weeks; the use of a phone call and time-frame was outlined in the invitation letter sent to all participants. All interested families were invited to the research centre or offered a home visit if they were unable to attend the centre, where they were given a full description of the study, completed a questionnaire, and underwent collection of saliva samples by research staff. Where a family member was not present at the collection visit, a saliva collection kit was provided with written instructions for use and a prepaid reply envelope for self-collection. Postal collection was only used for entire families who lived >100 km from the research centre or for those in rural/remote regions of Western Australia. Saliva collection sponges (DNA Genotek Inc, Ontario, Canada) were provided for those too young to spit into the pot. The families recruited ranged in size from three (i.e. trios) to seven (i.e. parents plus five siblings) members; 92% of families described themselves as Caucasian.
Study C (http://www.har.mrc.ac.uk/geneticsomstudy) was a UK multicentre study looking at the genetics of otitis media, both rAOM and COME. This study was prospective, and between 2009 and 2011 recruited children up to the age of 10 who were undergoing ventilation tube insertion for these disorders at one of 20 centres from across England and Scotland. This was also a family-based study, and aimed to recruit the proband, the parents, and all full siblings. Participants were given written information about the study, either at the time the decision was made for surgical intervention, or by post some time prior to the operation. Participants were invited to join the study in person on the day of surgery. Those who agreed were provided with saliva self-collection kits with verbal and written instructions on their use. Families were asked to complete the samples at home and return them in a prepaid reply envelope. Saliva collection sponges were provided for those too young to spit. To try to improve sample returns, a phone call reminder was instituted seven months into the study (November 2009). This call occurred approximately two to seven days after enrolment. The families recruited ranged in size from two (i.e. parent plus child) to six (i.e. parents plus four siblings) members; 94.2% of families described themselves as Caucasian.
Outcomes and Statistical Analysis
We compared the studies on recruitment (agreement to participate), and on saliva sample return rates. We compared study A to study B to ascertain the effect of participant demographics or study protocol on recruitment, and we compared study B to study C to ascertain the effect of retrospective versus prospective recruitment. We compared the effect of the phone call reminder on outcome within study B and within study C. We used Fisher's exact test to determine statistical differences between groups.
We compared available demographic variables between those who did and did not agree to participate in study B and those who did and did not return saliva samples in study C (the same demographic details were not available in all studies). Specifically we tested the hypotheses that families living a greater distance from the research centre may be less likely to participate and that families with young children or with a large family size may be less likely to return saliva samples. We compared DNA yields between men and women in all studies, and between adults and children in studies B and C. We compared saliva volume and DNA yields between observed and unobserved sample collection in study B, separately for adult males, adult females, and children (studies A and C only used unobserved sample collection). For these latter statistical comparisons we used the Mann–Whitney U-test.
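As an illustration of the rank-based comparison described above, the following is a minimal sketch of a Mann–Whitney U calculation in plain Python. It is not the authors' analysis code. The function name and the two small yield lists are hypothetical, and the p-value uses a normal approximation without tie or continuity correction, which is only a rough guide for samples this small.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U for sample x, approximate two-sided p-value).

    Assumption: normal approximation with no tie or continuity
    correction; an exact method is preferable for small samples.
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs in which the x value exceeds the y value; ties count half.
    u1 = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u1, p_two_sided


# Hypothetical DNA yields in micrograms (illustrative only, not study data).
observed_yields = [40.1, 55.3, 38.7, 61.0, 47.2]
unobserved_yields = [52.4, 44.9, 58.8, 49.5, 63.1]

u_observed, p_value = mann_whitney_u(observed_yields, unobserved_yields)
u_unobserved, _ = mann_whitney_u(unobserved_yields, observed_yields)
print(u_observed, u_unobserved, round(p_value, 3))
```

The two U statistics always sum to the product of the two sample sizes, which gives a quick check on the calculation.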
Recruitment and Sample Return
The total number of probands invited to participate, the number who agreed to participate, and the number who subsequently returned saliva samples in each study are shown in Table 2. We found highly significant differences between the studies. The highest overall saliva sample return was seen in study A, where 41% of all those invited to participate provided a sample. When comparing the two retrospective studies, those invited to participate in study A (UK study of adult individuals with a chronic condition, samples returned by post) were more likely to participate (49%) than those invited in study B (Australian study of children with a self-limiting condition, samples collected at research centre or on home visit; 21%). Those who agreed to participate in study A were also more likely to return their sample (84%) compared to those in study B (59%). When comparing the two family-based association studies addressing otitis media, those in study C (recruited prospectively) were more likely than those in study B (recruited retrospectively) to agree to participate (72% vs. 21%; P < 0.0001), but fewer of the families who agreed to participate subsequently returned their saliva samples (42% vs. 59%; P < 0.0001). Nevertheless, a greater percentage of invitations resulted in a sample being returned in study C than study B (30% vs. 12%; P < 0.0001).
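The overall return per invitation quoted above is the product of the agreement rate and the return rate among those who agreed. The sketch below reproduces that arithmetic from the percentages reported in the text and, as a costing aid, estimates how many invitations a target sample size would require. The function names and the target of 1000 samples are hypothetical, and because the inputs are rounded percentages the outputs are approximate.

```python
from math import ceil

# (agreement rate, return rate among those who agreed), as reported in the text.
reported_rates = {
    "A": (0.49, 0.84),
    "B": (0.21, 0.59),
    "C": (0.72, 0.42),
}


def samples_per_invitation(agree_rate, return_rate):
    """Fraction of all invitations that end in a returned sample."""
    return agree_rate * return_rate


def invitations_needed(target_samples, agree_rate, return_rate):
    """Invitations required to expect `target_samples` returned samples."""
    return ceil(target_samples / samples_per_invitation(agree_rate, return_rate))


overall = {
    study: round(samples_per_invitation(*rates), 2)
    for study, rates in reported_rates.items()
}
print(overall)

# Hypothetical planning target, not a figure from any of the three studies.
target = 1000
for study, rates in reported_rates.items():
    print(study, invitations_needed(target, *rates))
```

Multiplying the cost per invitation by the invitation count, and the cost per kit by the expected number of agreements, would extend this into a rough budget.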
Table 2. Comparison of recruitment and sample return rates in each of the three studies
[Table body not recovered. Surviving entries follow.]
Study period: study A, Jan 2003–Dec 2009; study B, Jan 2003–Dec 2008; study C, Apr 2009–Jan 2011.
Footnote 1: Numbers relate to numbers of probands in study A, and numbers of families in studies B and C.
Footnote 2: Fisher's exact test.
Effect of Telephone Call
In study B, a large number of families did not respond to the initial written invitation to participate (2311/2782; 83%). In view of this, at least one telephone call was made to those who did not respond to the initial invitation, to encourage participation. Among the families that could be contacted by phone (322/2311; 14%) a modest level of additional recruitment was achieved, although sample returns from this group were much lower (69/224; 31%) than from those who had sought to participate after first contact (271/357; 76%; P < 0.0001). It is of note that the vast majority of those families that did not respond to the initial written invitation in study B could also not be contacted by telephone (86%). It is likely that an indeterminable number of these families had moved away from the only contact address available to the researchers.
In study C, the telephone call served a different purpose: to encourage sample return from those who had already agreed to participate and who had recently provided up-to-date contact details. Here sample returns were 51/195 (26%) before the telephone call reminder was instituted, compared to 263/554 (47%) afterwards (P < 0.0001).
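The two telephone-call comparisons can be rechecked from the reported counts. The following is a minimal sketch of a two-sided Fisher's exact test in plain Python, given as an illustration and not as the authors' analysis code. The function names are hypothetical. The two-sided p-value is taken as the sum of the probabilities of all tables no more likely than the observed one, which is one common convention and may differ slightly from the software the authors used.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    log_denominator = log_comb(row1 + row2, col1)

    def log_prob(x):
        # Hypergeometric log-probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_denominator

    log_observed = log_prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_prob(x)
        # Small tolerance so tables tied with the observed one are included.
        if lp <= log_observed + 1e-7:
            p_value += exp(lp)
    return min(1.0, p_value)


# Study B: returned vs. not returned, phone-recruited vs. first-contact families.
p_study_b = fisher_exact_two_sided(69, 224 - 69, 271, 357 - 271)
# Study C: returned vs. not returned, before vs. after the reminder call.
p_study_c = fisher_exact_two_sided(51, 195 - 51, 263, 554 - 263)
print(p_study_b < 0.0001, p_study_c < 0.0001)
```

Working in log space with `lgamma` avoids overflow in the binomial coefficients at these sample sizes.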
Effect of Demographics on Sample Return
Study B was carried out in Western Australia, a state that covers 2.6 million km² and is the second largest administrative division in the world. Whilst most families invited to participate lived in the Perth Metropolitan Region (i.e. less than 100 km from the research centre) a small percentage (9%) of invited families lived in rural/remote regions (greatest distance from research centre = 2180 km). Comparing the average distance of all families invited to participate with those that agreed to participate shows that the distance a family lived from the research centre had no influence on participation rates in the Metropolitan population (average distance all invited = 21.6 km; average distance all agreed = 21.2 km) but had a slight, albeit nonsignificant, influence on participation rates in the rural/remote population (average distance all invited = 493 km; average distance all agreed = 345 km; P > 0.05).
Using data available from study C we also looked at family demographics that might influence sample return rates. We found no evidence that age of the proband or having a younger family (defined by mean age of all children) had an effect on sample return (Table 3; P > 0.05). The mean number of children was higher in families returning samples than in those not returning samples (1.97 vs. 1.88; P < 0.03), but after Bonferroni adjustment this could not be regarded as significant.
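The Bonferroni adjustment mentioned above divides the significance level by the number of tests performed. The sketch below illustrates the reasoning under two loudly stated assumptions that are not in the source: that three demographic comparisons were made (age of proband, mean age of children, number of children), and that the reported bound of 0.03 is treated as if it were the p-value itself.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni adjustment."""
    return alpha / n_tests


def significant_after_bonferroni(p_value, alpha, n_tests):
    """True if p_value falls below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


ALPHA = 0.05
N_TESTS = 3      # assumption: three demographic comparisons in study C
P_FAMILY = 0.03  # assumption: the reported bound "P < 0.03" used as the p-value

threshold = bonferroni_threshold(ALPHA, N_TESTS)
print(round(threshold, 4), significant_after_bonferroni(P_FAMILY, ALPHA, N_TESTS))
```

Under these assumptions the adjusted threshold is roughly 0.017, so a p-value near 0.03 would not reach it, which is consistent with the conclusion stated in the text.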
Table 3. Demographic variables of families that did or did not return saliva samples in study C
DNA Yield and Effect of Observed Collection of Specimen
All three studies collected large numbers of saliva samples for purposes of DNA extraction over the included study periods (study A, 2677 samples; study B, 1143 samples; study C, 1289 samples). According to the manufacturer the saliva collection kit utilised by all three studies should provide a median DNA yield of 100 μg (Birnboim, 2004; Rylander-Rudqvist et al., 2006; Rogers et al., 2007; Nishita et al., 2009). Average DNA yield in study A was 124.5 μg (124.7 μg in men and 123.3 μg in women, P = 0.45). In study B it was 76.7 μg for adults (83.3 μg for men and 62.8 μg for women, P < 0.005) and 43.3 μg for children. However, in study B only half the saliva volume collected was extracted, with the remainder archived, so total DNA yield values are likely to be about double those reported. In study C DNA yield was 51.5 μg for adults (47.8 μg for men and 55.9 μg for women, P = 0.56) and 18.1 μg for children.
Whilst the lower DNA yield seen in study C may be due to differences in extraction protocol rather than study protocol, the recruitment strategy employed in study B afforded the opportunity to directly compare DNA yield between recruiter-collected (i.e. observed) and self-collected (i.e. unobserved) saliva samples. We found no statistical difference in DNA yield between observed and unobserved sample collection in adults (data not shown) but there was a marginally significant difference in children (P < 0.05), with unobserved collections giving higher yields. However, the mean age of children in unobserved samples was higher than that in observed samples (13.8 vs. 6.4 years). Given that there is a positive correlation between age and DNA yield in children (linear regression r² = 0.138), this is likely to account for the result.
Sample size is a major determinant of the power and ability of a genetic association study to dissect susceptibility to common diseases. Our results confirm those of others (Birnboim, 2004; Rylander-Rudqvist et al., 2006; Rogers et al., 2007; Nishita et al., 2009), that DNA yields from saliva are adequate for downstream applications. One of the main advantages of saliva over blood is that it enables noninvasive sample collection from and in the community, and so offers the potential to increase participation and sample size. However, both our study and others (Etter et al., 1998; Etter et al., 2005; Nishita et al., 2009) report significant variation in sample return rates. Previous studies show that most people say they would donate biological samples to research if asked (Wendler, 2006), but here we have tried to dissect factors that may explain variation in participation and sample return.
Our main aim was to compare three large-scale genetic association studies to understand how study protocol may influence sample return. Both study A and study B identified potential participants from a retrospective review of patients who previously attended secondary care for treatment, and made initial contact by mail. That those approached by study A were much more likely to agree to participate and to return their saliva sample than those approached in study B is likely due to a combination of the population being studied and the recruitment protocol. Study A used self-collection of saliva from an elderly population with a chronic condition, whereas in study B saliva collection was from families whose child had a self-resolving condition and used a researcher-administered protocol.
Studies B and C were matched in that they both recruited families in whom the proband was suffering from a similar, self-limiting condition. However, study B recruited retrospectively with researcher-administered sample collection, usually at the research centre, whilst study C recruited prospectively with self-administered sample collection at home. We found that in study C many more of those prospectively invited agreed to take part. This is reflected in the overall sample return rate from all those invited, which is significantly higher in study C than in study B. However, a potential confounding factor with a retrospective recruitment protocol (such as in study B) may be that those families from whom no response to the initial invitation was received may not actually have received the invitation if they have moved away from the contact address provided at the time of their contact with secondary care. In this instance, an undetermined percentage of families have not actually been “contacted”. Data from the Australian Bureau of Statistics for the 2007–2008 period show that up to 59% of families with young children had moved within the last five years whilst <20% of those aged 70 and over had moved within the same time-frame (http://www.abs.gov.au/socialtrends). As such, this issue could have a greater impact on recruitment of younger families, especially if the proband's last contact with secondary care was several years ago. It is also possible that other, currently undescribed, factors contribute to the nonresponse rate seen in retrospective study B. However, as the nonresponding families have not consented to participate in any study it is not possible to delineate what these factors might be.
We also found that whilst more families agreed to participate in study C they were subsequently less likely to return their saliva sample than those who agreed to participate in study B. It is perhaps not surprising that those who are required to actively respond to contact in order to participate (study B) are more likely to ultimately provide a DNA sample than those who somewhat passively agree to take part when asked face to face (study C).
A reminder telephone call appeared to give little additional benefit to the retrospective recruitment protocol of study B. This suggests that those who have proactively agreed to participate in a study are more likely to see it through and return their saliva samples than those who have to be asked again if they would like to take part. Ultimately, in study B the telephone call was deemed not to be a cost-effective addition to the retrospective recruitment protocol and was abandoned. Conversely, in the prospective recruitment protocol of study C, a reminder telephone call to those who had previously agreed to participate nearly doubled sample returns, and was deemed cost-effective.
We also tried to discover factors that may influence sample return within studies. This has been looked at previously in adults, where older participants have been shown to be more likely to return samples, but findings on the effect of gender, ethnicity or years of schooling have been contradictory (Kozlowski et al., 2002; Nishita et al., 2009). Here we found that having a younger family has no effect on sample return in a family-based association study. We also found that the distance a family lived from the research centre had no significant influence on participation rates. Incentives of gifts or money have been reported to give a modest improvement in DNA sample returns (Etter et al., 1998; Bhatti et al., 2009), and such incentives are also known to improve response to questionnaires (Edwards et al., 2009). None of the studies included here offered such incentives, mainly due to budget constraints given these are all large-scale studies. Study B did offer reimbursement of travel costs for families who travelled to the research centre.
Because our study is observational in nature, it is susceptible to threats to validity, including confounding. Some of the observed differences in recruitment may be attributable both to the measured differences described above and to other unmeasured confounding variables. Nonetheless, we have identified several factors that could influence the choice of a retrospective or prospective protocol for recruitment of participants from the community to genetic association studies. For recruitment of participants who are older or have a chronic condition, a retrospective protocol may be appropriate. For recruitment of young families or those with a self-limiting condition a prospective strategy may prove more successful, especially if a follow-up call to encourage sample return is utilised. Our results also confirm that self-collection of saliva samples yields high quantities of DNA; however, researcher-administered collection can be beneficial in limiting costs associated with the nonreturn of collection kits. Future research avenues should include randomised controlled trials of different recruitment strategies nested within large genetic association studies. This will help to clarify the effects of the variables and interventions we have described.