Falling between the cracks: The effect of using different levels of suicide risk exclusion criteria on sample characteristics when recruiting for an online intervention for depression

Received: 12 August 2020 | Accepted: 13 January 2021 DOI: 10.1111/sltb.12761


INTRODUCTION
It has generally become accepted that Internet interventions for depression and other mental disorders can be effective, especially when some form of human guidance is provided. A literature review of mobile apps and web-based therapies for depression concluded that comparable treatments appear to show little difference in efficacy when delivered via different formats (e.g., online vs. face-to-face) and furthermore that completely self-guided online interventions show small yet significant positive effects (Cuijpers et al., 2017). While these findings are promising, much of the existing evidence relies on web-based intervention studies that have overlooked an important subgroup of depressed individuals: those who may be at risk for suicide.

Funding information
This research is funded by the Canadian Institutes of Health Research grant number PJT 153 324; however, the findings and conclusions of this paper are solely of the authors and do not necessarily represent the views of the Canadian Institutes of Health Research. The funding body has no role or influence on the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Abstract
Background: Despite a strong link between suicide risk and depression, a recent literature review found that many effectiveness studies for online depression interventions exclude individuals at risk of suicide. This study examines how different suicide risk exclusion criteria affect recruitment rates and final sample characteristics.
Materials and Methods: Two recruitment periods for an online depression intervention trial used suicide risk exclusion cutoffs that differed by one point on the last item of the Personal Health Questionnaire (i.e., excluding scores above 0 [Not at all] vs. above 1 [Several days]). All other eligibility criteria and recruitment strategies remained consistent. Bivariate statistics were used to assess differences in recruitment rates and sample characteristics between the two recruitment periods.
Results: The recruitment period using the less restrictive suicide risk exclusion criterion yielded twice as many participants; however, the recruited samples did not differ significantly in demographic or clinical characteristics, despite observable trends.
Discussion: Researchers should carefully select suicide risk exclusion criteria that balance recruitment rates, study budgets, and sample selection biases, while minimizing participant harm. Moreover, researchers are urged to report suicide risk exclusion rates and to consider these exclusions when interpreting results. Limitations of the results are also discussed.
In a recent review of randomized controlled trials (RCTs) of online interventions for depression, Sander et al. (2020) found that the majority of studies, over 70%, excluded individuals at risk of suicide. Furthermore, an independent review of these studies by the authors revealed that fewer than 60% of researchers who exclude those at risk of suicide report the number of excluded individuals. When it was reported, suicide risk accounted for a substantial share of excluded screeners (between 5% and 51%) and up to one third of all eligible participants.
While researchers may have important reasons for excluding these individuals, such as a lack of resources for supporting them (e.g., time, expertise, expense) and resistance from ethics committees, it is unknown how their exclusion may confound study findings or alter sample characteristics. Given that depressed individuals may be at a 25 times greater risk of suicide (American Association of Suicidology, 2014) and that suicidal ideation has been found to be a predictor of treatment seeking for major depression (Magaard et al., 2017), excluding this potentially large subgroup can both limit the external validity of the findings and artificially inflate or mute the interventions' efficacy (Andriessen et al., 2019; Lakeman & Fitzgerald, 2009; Sisti & Joffe, 2018; Zimmerman et al., 2005). Consequently, it is imperative that researchers better understand how the exclusion of individuals at risk of suicide impacts the sample characteristics and recruitment rate of individuals into online depression interventions, which is what this paper aims to address.
In an ongoing study testing the effectiveness of an online intervention for depression with and without the provision of a brief alcohol treatment, the authors used the last item of the Personal Health Questionnaire (PHQ) to determine suicide risk. During the first four months of recruitment, participants scoring greater than zero (Not at all) on the last item were excluded; however, due to recruitment difficulties, this criterion was relaxed to exclude only those scoring greater than one (Several days). As all other parts of the recruitment process and eligibility criteria were identical, this provided a unique opportunity to examine how a small change in suicide risk criteria impacts recruitment rates and, subsequently, sample characteristics. Due to the exploratory nature of this paper, no a priori hypotheses were made.

Study design
The sample analyzed in this paper is part of a larger ongoing research study for which a study protocol has been published elsewhere (Cunningham et al., 2018). Briefly, individuals currently feeling persistent low mood or depression were invited to participate in a study to "help improve an online intervention for depression" via online advertisements, which commenced in April 2019. Eligibility criteria for the study consisted of being 18 years or older, scoring 10 or more on the PHQ (Kroenke et al., 2001), scoring 8 or more on the Alcohol Use Disorders Identification Test (AUDIT) (Saunders et al., 1993), and agreeing to be followed up for 6 months. Additionally, potential participants were screened for suicidal ideation using the last item of the PHQ-9, "Over the last 2 weeks, how often have you been bothered by thoughts that you would be better off dead or of hurting yourself in some way," with the response options Not at all (0), Several days (1), More than half the days (2), and Nearly every day (3). Those recruited between April and July 2019 were excluded if they scored greater than zero on this item, while those recruited from August 2019 onward were excluded only if they scored greater than one.
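The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual screening software: the class name, field names, and the `max_item9` parameter are assumptions introduced here, and the example respondent is hypothetical. The text does not state how the PHQ total was scored relative to the last item, so the sketch simply takes the total as a given input.

```python
from dataclasses import dataclass


@dataclass
class ScreenerResponse:
    """One completed screening survey (hypothetical field names)."""
    age: int
    phq_total: int             # total PHQ depression score
    phq_item9: int             # last PHQ-9 item: 0 (Not at all) to 3 (Nearly every day)
    audit_total: int           # total AUDIT score
    agrees_to_follow_up: bool  # agrees to 6 months of follow-up


def is_eligible(r: ScreenerResponse, max_item9: int) -> bool:
    """Apply the eligibility criteria with a configurable suicide risk cutoff.

    max_item9 is the highest score allowed on the last PHQ-9 item:
    0 for the April to July 2019 period, 1 from August 2019 onward.
    """
    return (
        r.age >= 18
        and r.phq_total >= 10
        and r.audit_total >= 8
        and r.agrees_to_follow_up
        and r.phq_item9 <= max_item9
    )


# Hypothetical respondent who answered "Several days" (1) on the last item.
example = ScreenerResponse(age=30, phq_total=14, phq_item9=1,
                           audit_total=12, agrees_to_follow_up=True)
print(is_eligible(example, max_item9=0))  # excluded under the stricter cutoff
print(is_eligible(example, max_item9=1))  # included under the relaxed cutoff
```

Expressing the cutoff as a single parameter makes it explicit that the two recruitment periods differ only in this one criterion.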
Those found eligible were also required to provide the study team with contact information (e.g., email, mailing address, telephone number) for verification purposes and complete an online consent form. Once consent was provided, eligible participants were then asked to complete a baseline questionnaire measuring both clinical (i.e., depression, anxiety, alexithymia, past treatment use, quality of life) and demographic characteristics and were randomized to receive an online intervention for depression (i.e., MoodGYM) with or without the additional provision of a brief intervention for alcohol use. Only those who accessed the online intervention(s) were officially enrolled into the study and compensated with a $10 Amazon.ca gift card. In addition, participants who completed the three- and six-month follow-up surveys were compensated with $20 and $30 Amazon.ca gift cards, respectively. It is important to note that individuals who were not found eligible to participate in the study were provided free access to the online depression intervention (i.e., MoodGYM) for the duration of the study. Ethics approval for the research methods used in this study was provided by the standing ethics review committee of the Centre for Addiction and Mental Health (CAMH).

Data analysis
As advertising campaigns, including monthly budgets and ad content, were comparable throughout the sampling, individuals who were screened for inclusion into the study during the first four months of recruitment (April to July 2019) and those screened during the subsequent four months (August to October 2019) were compared using rates of enrollment. A series of chi-square tests were conducted to determine whether the proportion of eligible screeners, completed consents, and intervention registrations differed between the two groups. In addition, those who were recruited into the study from each recruitment period were compared on demographic and clinical characteristics using independent t tests and chi-square tests. All analyses were performed using SPSS, version 25.0.
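For readers unfamiliar with the test, the comparison of proportions between two recruitment periods can be sketched as a Pearson chi-square test on a 2 x 2 table. The sketch below is an illustration only: the analyses in this study were run in SPSS, the function name is an assumption, no continuity correction is applied, and the counts in the example are hypothetical rather than the study's data.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test of independence for the table [[a, b], [c, d]].

    Rows could be the two recruitment periods and columns (eligible, not eligible).
    Returns (statistic, p_value) for 1 degree of freedom, without continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only, NOT the study's data.
stat, p = chi_square_2x2(50, 950, 100, 900)
print(f"chi-square(1, N = 2000) = {stat:.2f}, p = {p:.4g}")
```

The closed-form p-value is valid only for a 2 x 2 table (one degree of freedom); larger tables would need a general chi-square distribution function from a statistics library.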

RESULTS
A total of 5341 screening surveys were completed between April 2019 and October 2019, of which 478 were found eligible to participate, and 316 went on to provide contact information. Overall, 220 potential participants provided consent, and 170 signed up for the intervention and were officially enrolled into the study. The proportions of individuals who were excluded at different stages of recruitment are compared between the first and subsequent four months in Table 1. Overall, a smaller proportion of individuals were found eligible during the first four months of recruitment as compared to the subsequent four months, χ²(1, N = 5341) = 142.03, p < 0.001. In particular, individuals completing the screener during the first four months were more likely to be screened out for being less than 18 years of age (p < 0.001), experiencing suicidal ideation (p < 0.001), and having a score less than 8 on the AUDIT (p < 0.001), as compared to those screened within the subsequent four months. However, post-eligibility, differences observed between the two recruitment periods did not carry forward to the proportion of those who subsequently completed a consent form (p = 0.677), or accessed the online intervention and were officially enrolled into the study (p = 0.631). In addition, the recruited samples from the different recruitment periods did not display any significant differences in their demographic or clinical characteristics (Table 2).
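The overall recruitment funnel reported above can be expressed as stage-to-stage retention rates. The short sketch below uses only the pooled counts given in this paragraph (both recruitment periods combined); the stage labels and function name are assumptions introduced for illustration.

```python
# Pooled counts across both recruitment periods, as reported in the text.
funnel = [
    ("completed screener", 5341),
    ("found eligible", 478),
    ("provided contact information", 316),
    ("provided consent", 220),
    ("enrolled", 170),
]


def stage_retention(stages):
    """Return (stage name, share retained from the previous stage) pairs."""
    return [
        (name, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


rates = stage_retention(funnel)
for name, rate in rates:
    print(f"{name}: {rate:.1%} of previous stage")

# Share of all completed screeners who were ultimately enrolled.
overall = funnel[-1][1] / funnel[0][1]
print(f"enrolled out of all screeners: {overall:.1%}")
```

Roughly 9% of screeners were eligible and about 3% were ultimately enrolled, which illustrates why the eligibility stage, where the suicide risk criterion applies, dominates recruitment yield.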

DISCUSSION
The majority of research trials of online interventions for depression exclude individuals at risk of suicide (Sander et al., 2020); this paper therefore sought to determine how different suicide risk exclusion criteria may impact recruitment rates and overall sample characteristics. In general, we found that less restrictive suicide risk exclusion criteria more than doubled the proportion of eligible participants; however, this did not appear to have any effect on the proportion of individuals who completed subsequent stages of recruitment, such as providing informed consent or accessing the study intervention. Moreover, no significant differences were observed between the two recruited groups in other mental health symptoms (e.g., depression, anxiety), alcohol use, past treatment use, or demographic characteristics. Indeed, it was quite surprising that the two recruitment groups did not differ on depression symptoms as measured by the PHQ-8, despite fewer restrictions being placed on one of the groups. It would have been expected that allowing participants to have some suicidal ideation would have increased that recruited group's overall depression symptoms, as arguably it should have been a more severely depressed sample. The expected trend, although not significant, was observed in the anxiety and quality of life scores, as loosening the PHQ criterion led to recruiting a group that reported slightly greater anxiety and a lower quality of life. As this study was somewhat underpowered to detect these differences, future research should address this in more detail and disentangle the relationship between loosening the suicidal ideation criterion on the PHQ and the characteristics of the recruited sample.
While these findings alone cannot establish what suicide risk level is optimal for use as an exclusion criterion, or whether individuals should be excluded at all, they provide a learning opportunity for researchers. Potentially excluding nearly half of all eligible participants as a result of a one-point difference on a 4-point suicide risk item can have important implications for the validity of the data collected, as well as the results of the research, as it can introduce sample selection biases. Moreover, the exclusion of such a large proportion of individuals can also compromise study budgets and timelines, as recruiting individuals for online depression interventions can be expensive, time-consuming, and challenging, particularly when subsamples of depressed individuals are being recruited, as was the case in this study (i.e., those with hazardous drinking). Therefore, we recommend that researchers carefully weigh the use of less restrictive suicide risk criteria, while ensuring that the necessary resources are in place for those who may be at risk of suicide. It is important that researchers balance recruitment goals and data validity, all while minimizing potential harm to participants. In addition, we strongly encourage researchers to be transparent about how many individuals are excluded from their research for being at risk of suicide and how this risk was measured, and to consider possible sample selection biases when interpreting results.
While this study provided a unique opportunity to examine how different suicide risk cutoff scores impact recruitment rates and sample characteristics, some important limitations should be noted. Firstly, some have questioned the use of the last item of the PHQ-9 for determining suicide risk in recent literature (Na et al., 2018). Although it remains one of the most popular methods of assessing suicide risk, we recommend that this research also be replicated using other suicide risk measures that can be administered online, such as item 9 of the Beck Depression Inventory or suicide questionnaires (see Beck et al., 1961, 1988; Gega et al., 2005). Secondly, eligibility criteria for this study included hazardous drinking; therefore, results are not generalizable to other depression samples recruited online. Thirdly, despite efforts to maintain similar recruitment strategies throughout the project, it is unclear why significantly more individuals were screened out for being under 18 during the first four months of recruitment and whether this impacted other observed outcomes.
TABLE 2 Sample characteristic differences between participants recruited during first four months vs. subsequent four months.