Data from published trials, trials presented at scientific meetings, and unpublished clinical trials of SSRIs and SNRIs were reviewed. Table 2 reports selected efficacy outcomes for each trial. In each case, the results based on the pre-defined primary outcome variable are provided. In addition, most studies report a response outcome based on the Clinical Global Impressions-Improvement scale (CGI-I), with response defined as ‘much’ or ‘very much’ improved (a rating of 1 or 2). The table also reports the change scores on the primary continuous measure for each trial. In some cases, the additional outcomes provided in the table (e.g., CGI-Improvement or change score) are the same as the primary outcome. Finally, the table provides remission rates (i.e., ‘well’ or symptom-free) for the studies.
Table 2. Efficacy of antidepressant trials
| Study | N | Age (years) | Primary outcome | CGI-I response (1 or 2) | Change score | Remission |
| Fluoxetine 1997 (Emslie et al., 1997) | 96 (48 flx, 48 pb) | 7–17 | CGI: 56% vs. 33% (p = .02); CDRS-R: −20.1 vs. −10.5 (p = .001) | 56% vs. 33% (p = .02) | CDRS-R: −20.1 vs. −10.5 (p = .001) | CDRS-R ≤28: 31% vs. 23% |
| Fluoxetine 2002 (Emslie et al., 2002) | 219 (109 flx, 110 pb) | 8–17 | Response, as defined by 30% reduction in CDRS-R: 65% vs. 53% (p = .093) | 52.3% vs. 36.8% (p = .028) | CDRS-R: −22.0 vs. −14.9 (p = .001) | CDRS-R ≤28: 41.3% vs. 19.8% (p = .01) |
| Fluoxetine TADS (March et al., 2004) | 221 (109 flx, 112 pb) | 12–17 | CDRS-R (end score): 36.30 ± 8.18 vs. 41.77 ± 7.99; CGI-I: 61% vs. 35% | 61% vs. 35% | CDRS-R: −22.64 vs. −19.41 | Not yet available |
| Paroxetine 2001 (Keller et al., 2001) | 275 (93 par, 95 imipramine, 87 pb) | 12–18 | ≤8 and/or ≥50% reduction on HAM-D from baseline to endpoint via LOCF: 66.7% par vs. 58.5% imipramine vs. 55.2% pb (p = .11 par vs. pb); Change from baseline in HAM-D total score via LOCF: −10.7 par, −8.9 imp, −9.1 pb (p = .13 par vs. pb) | 66% vs. 52% vs. 48% (p = .02 par vs. pb) | HAM-D: −10.7 par vs. −8.9 imp vs. −9.1 pb (p = .13 par vs. pb) | HAM-D ≤8: 63.3% vs. 50.0% vs. 46.0% (p = .02 par vs. pb) |
| Sertraline 2003 (Wagner et al., 2003) | 376 (189 sert, 187 pb) | 6–17 | Response, as defined by 40% decrease on CDRS-R (adjusted CDRS-R score, 17 points subtracted) via LOCF: 69% vs. 59% (p = .05) | 63% vs. 53% (p = .05) | CDRS-R: −22.84 vs. −20.19 (p = .007) | Not available |
| Citalopram 2004 (Wagner et al., 2004) | 174 (89 cit, 85 pb) | 7–17 | Response, as defined by ≤28 on CDRS-R via LOCF: 36% vs. 24% (p = .05) | 47% vs. 45% (NS) | CDRS-R: −21.7 vs. −16.5 (p = .038) (taken from the MHRA report, December 2003) | CDRS-R ≤28: 36% vs. 24% (p = .05) |
| Paroxetine Study#377 (Emslie et al., 2004b) | 275 ITT (182 par, 93 pb) | 13–18 | Response, as defined by ≥50% reduction in MADRS via LOCF: 60.5% vs. 58.2% (p = .702); Mean change in K-SADS-L depression subscale score: −9.33 vs. −8.92 (p = .616) | 69.2% vs. 57.3% | MADRS: −13.60 vs. −12.80 (p = .520) | Not available |
| Paroxetine Study#701 (GSK website) | 203 ITT (101 par, 102 pb) | 7–17 | Change from baseline in CDRS-R total score via LOCF: −22.6 vs. −23.4 (p = .684) | 49% vs. 46% (p = .563) | CDRS-R: −22.6 vs. −23.4 (p = .684) | Not available |
| Nefazodone (Emslie et al., 2002a) | 195 (99 nef, 96 pb) | 12–17 | CDRS-R at Week 8 via LOCF: Significant (p = .03) | 65% vs. 46% (p = .005) | CDRS-R: −26.5 vs. −22.5 (p = .055) | Not available |
| Venlafaxine Study#382 (Emslie et al., 2004) | 161 ITT (78 ven, 83 pb) | 8–17 | CDRS-R at Week 8 via LOCF: −18.1 vs. −16.1 (p = .338) | NS | CDRS-R: −18.1 vs. −16.1 (p = .338) | Not available |
| Venlafaxine Study#394 (Emslie et al., 2004) | 193 ITT (101 ven, 92 pb) | 8–17 | CDRS-R at Week 8 via LOCF: −24.3 vs. −22.6 (p = .386) | NS | CDRS-R: −24.3 vs. −22.6 (p = .386) | Not available |
| Citalopram (MHRA Report, Dec. 2003) | 233 ITT | 13–18 | K-SADS-P total score over time: −12.4 vs. −12.7 (no significance reported by MHRA report, December 2003) | Not available | K-SADS-P: −12.4 vs. −12.7 | Not available |
| Mirtazapine (Data on File, Organon) | 126 ITT | 7–17 | CDRS-R total score: 35.1 vs. 37.2 (p = .421) | 59.8% vs. 56.8% (p = .75) | HAM-D21: −12.22 mir vs. −11.07 pb (p = .419) | Not available |
| Mirtazapine (Data on File, Organon) | 124 ITT | 7–17 | CDRS-R total score: 35.4 vs. 38.8 (p = .19) | 53.7% vs. 41.5% (p = .20) | HAM-D21: −11.82 mir vs. −9.76 pb (p = .107) | Not available |
| Nefazodone (noted in Emslie et al., 2002b) | UNK | 7–17 | NS | NS | NS | Not available |
Fluoxetine. Three large double-blind, placebo-controlled trials have been conducted with fluoxetine (Emslie et al., 1997, 2002a; March et al., 2004). The first was a single-site trial of 96 outpatients, age 7–17 years, with Major Depressive Disorder (MDD). Following a 2-week evaluation and 1-week single-blind placebo run-in, subjects were randomized to fluoxetine, 20 mg/day (n = 48) or placebo (n = 48) for 8 weeks. Primary outcome measures were global improvement on the Clinical Global Impression Scale (CGI-I) and change in severity based on the Children's Depression Rating Scale-Revised (CDRS-R). Based on CGI-Improvement of 1 or 2 (very much or much improved), 56% of subjects receiving fluoxetine and 33% of subjects receiving placebo responded to treatment at exit from the study (p = .02). Weekly CDRS-R scores were also significantly different between the two groups by week 5 (p = .03) and continuing through week 8 (fluoxetine 38.4 ± 14.8 versus placebo 47.1 ± 17.0; p < .008). The slope of change in CDRS-R score was also significantly different between groups, with the fluoxetine group improving 2.75 units per week, compared to only 1.27 units in the placebo group (p = .04). No significant differences were found between the two groups on a general psychiatric scale (Brief Psychiatric Rating Scale for Children), the Children's Global Assessment Scale (CGAS), or depression self-reports such as the Beck Depression Inventory (BDI) and the Weinberg Screening Affective Scale (WSAS).
In a multi-site study replicating Emslie et al. (1997), 219 children and adolescents with MDD were randomized to fluoxetine (n = 109) or placebo (n = 110). As in the original study, subjects underwent a 2-week evaluation phase, followed by a 1-week placebo run-in. Subjects randomized to fluoxetine received 10 mg per day for the 1st week, and then 20 mg per day for 8 weeks, for a total of 9 weeks of acute treatment. The prospectively defined response criterion of a 30% reduction in CDRS-R score did not reach significance (p = .093). However, multiple other outcome variables did significantly favor fluoxetine over placebo, including percentage decreases in CDRS-R symptoms of 20%, 40%, 50%, 60%, and 70%. In addition, the mean CDRS-R score at endpoint was significantly lower in the fluoxetine group (35.1 ± 13.5) than in the placebo group (40.2 ± 13.5; p < .001). Rate of improvement based on the mean change in CDRS-R was also significant at week 1 and remained so for the remainder of the study (p < .05). Based on CGI-Improvement of 1 or 2, 52.3% of fluoxetine-treated subjects were considered responders compared with 36.8% of placebo-treated subjects (p = .028). CGI severity and MADRS scores were also significantly better for fluoxetine-treated patients. As in the previous fluoxetine study, there were no significant differences between groups in global assessment of functioning (GAF) scores or on the self-report measures (BDI or CDI).
A more recent multi-site trial, TADS (Treatment for Adolescent Depression Study) (March et al., 2004), compared outcomes in subjects treated with 12 weeks of fluoxetine (10 to 40 mg/day), cognitive behavioral therapy (CBT), combination treatment of fluoxetine and CBT, and placebo in 439 patients aged 12 to 17 years. Placebo and fluoxetine alone were administered double-blind, while CBT alone and CBT with fluoxetine were administered single-blind (the independent evaluator rating the outcome measures was blind to treatment conditions). This study confirmed the efficacy of fluoxetine both alone and in combination with CBT for treating adolescent depression. Rates of response, based on CGI-I of 1 or 2, were 71.0% (95% CI, 62–80%) for fluoxetine with CBT; 60.6% (95% CI, 51–70%) for fluoxetine alone; 43.2% (95% CI, 34–52%) for CBT alone; and 34.8% (95% CI, 26–44%) for placebo.
Finally, an earlier study by Simeon et al. examined 40 patients aged 13 to 18 years in a placebo-controlled, double-blind trial of fluoxetine. Fluoxetine was not statistically superior to placebo on any of the outcome measures. However, the results of this study are difficult to interpret because of the small sample size and incompletely described methodology (Simeon et al., 1990).
Thus, in the three large studies of fluoxetine most of the outcome measures were positive, though self-report measures did not differ significantly between groups. However, self-reports, even at baseline, showed wide variability, which may account for the difficulty of interpreting them at endpoint. These results differ from those of the earlier trial by Simeon et al., in which the fluoxetine group showed more improvement than the placebo group but the differences were not significant. Two-thirds of that sample (on both fluoxetine and placebo) showed marked or moderate clinical global improvement; however, the study was small and its methodology is not clearly explained, making the results difficult to interpret.
Paroxetine. Three double-blind, placebo-controlled trials have been conducted with paroxetine. The only published report of these trials involved a study of 275 adolescents (age 12–18) with MDD at 12 centers across the US and Canada (Keller et al., 2001). Adolescents were randomized to paroxetine, imipramine, or placebo for 8 weeks. Dosing of paroxetine was initiated at 20 mg/day, with optional increase to 30–40 mg/day after at least 4 weeks of treatment. Imipramine was initiated at 50 mg/day and gradually increased to 250–300 mg/day after week 4. The two primary outcome measures were 1) HAM-D ≤8, or 50% decrease from baseline, and 2) change in HAM-D total score. On the first primary outcome (HAM-D ≤8 or 50% decrease), no statistical difference was found between paroxetine, imipramine, and placebo (66.7% vs. 58.5% vs. 55.2%, respectively). Change in the HAM-D total score was also not significant (−12.2 ± .88 vs. −10.6 ± .97 vs. −10.5 ± .88).
Secondary outcomes, however, suggested positive effects of paroxetine. Based on CGI-I of 1 or 2, response rates were 65.6% in the paroxetine group, compared to 52.1% for imipramine and 48.3% for placebo (p = .02). HAM-D and K-SADS-L depressed mood items were also significantly better in the paroxetine group (p = .001, p = .05, respectively). Finally, remission rates, defined as HAM-D ≤8 (part of the first primary outcome), were significantly higher in subjects treated with paroxetine (63.3%, p = .02) compared with imipramine (50.0%) and placebo (46.0%). Differences in mean CGI scores, K-SADS-L depression sub-scores, and HAM-D totals at endpoint were not statistically significant between groups, however.
Two other double-blind, placebo-controlled trials of paroxetine have been conducted, but the data remain unpublished. However, GSK has posted the study reports for these trials on its website (http://www.gsk.com/media/paroxetine.htm).
In a study of adolescents only (study#377), conducted in 33 centers internationally, 275 (ITT) adolescents (age 13–18) were randomized to paroxetine, 20–40 mg/day (n = 182), or placebo (n = 93) for 12 weeks. No differences were found between paroxetine and placebo on the primary outcome variables (≥50% decrease from baseline MADRS and change from baseline in K-SADS-L depression subscale). At the Week 12 endpoint, 60.5% of paroxetine subjects and 58.2% of placebo patients had responded (based on 50% decrease in MADRS). Although not statistically significant, older adolescents (>16) tended to show greater improvements with paroxetine than placebo, while younger subjects (≤16) had high placebo response rates. No treatment differences were seen on other outcome variables (CGI Severity, CGI Improvement, BDI, and Mood and Feelings Questionnaire).
A final study of children and adolescents (study#701) was an 8-week trial conducted in 41 centers across the US and Canada. Subjects (N = 203) were children and adolescents (age 7–17) randomized to paroxetine 10–50 mg/day (n = 101) or placebo (n = 102). The primary outcome measure (change from baseline on CDRS-R total score) did not show paroxetine to be more efficacious than placebo. Secondary variables (CGI-Severity, CGI-Improvement, and GAF) were also negative. There was evidence of a treatment by age group interaction. Children (age 7–11) in the placebo group had greater improvements on the CDRS-R than children on paroxetine (p = .054). No differences were found between paroxetine and placebo in the adolescent group. Thus, only the published paroxetine study showed some positive outcomes, while the two unpublished studies did not.
Sertraline. Two identical studies were conducted in 53 centers in the US, India, Canada, Costa Rica, and Mexico to compare sertraline and placebo. Wagner and colleagues report that the studies were combined a priori, providing results on 376 children and adolescents (age 6–17) with MDD. Subjects were randomized to sertraline (n = 189; 50–200 mg/day) or placebo (n = 187) for 10 weeks. Based on the primary outcome measure (change from baseline in CDRS-R), sertraline subjects showed significantly greater improvement (−22.84) than placebo subjects (−20.19; p = .007) over the course of the study. Those completing all 10 weeks of treatment showed even greater differences in decrease of CDRS-R scores (−30.24 versus −25.83, respectively; p = .001). Mean change in CGI-Severity scores showed similar differences (−1.99 versus −1.58, respectively; p = .001), and mean CGI-Improvement scores also favored sertraline (2.02 versus 2.3, respectively; p = .009). Although the mean CGI scores are statistically different between sertraline and placebo, the range of the scale is so small (1–7) that it is difficult to interpret the clinical significance of these findings.
Responders were defined using 2 measures. First, based on a 40% decrease in CDRS-R at end of study using the last observation carried forward, 69% of sertraline-treated subjects and 59% of placebo-treated subjects were considered responders (p = .05). Second, based on CGI-Improvement of ‘much’ or ‘very much’ improved, 63% of sertraline-treated subjects and 53% of placebo-treated subjects were considered responders (p = .05). Although the difference between active medicine and placebo was only 10 percentage points, it was statistically significant because of the large number of subjects in this trial. Other significant outcomes include several individual items on the CDRS-R: low self-esteem, excessive weeping, listless speech, and hypo-activity. Although numerically greater improvements were seen in the sertraline group on the Multidimensional Anxiety Scale for Children (MASC), Pediatric Quality of Life Enjoyment and Satisfaction Questionnaire (PQ-LES-Q), and Children's Global Assessment Scale (CGAS), these differences were not significant.
Of interest is that greater differences were noted in adolescents than in children. For example, the CDRS-R mean change noted between drug and placebo in adolescents was −21.55 (sertraline) versus −18.20 (placebo; p = .01), while in children the difference was not significant (−24.05 versus −22.20, respectively, p = .19). However, because the study was not powered to detect differences between age groups, these findings only serve to highlight areas of interest for future research.
One area of debate surrounding the sertraline data is that the 2 studies were pooled for the analyses. Individually, both studies were negative on the primary outcome. That is, the primary efficacy variable (change score in CDRS-R from baseline) did not differ significantly between drug and placebo in either study (Study 1: −25.9 vs. −22.1, p = .084; Study 2: −28.8 vs. −25.6, p = .17). The response rates (based on 40% decrease in CDRS-R) in Study 1 were 62.4% vs. 56.8% for sertraline and placebo (p = .46); response rates for Study 2 were 75% vs. 60.4%, respectively, a significant difference (p = .033) (data on the individual studies were obtained from the MHRA report). The fact that the primary outcome variables in the individual trials were not positive could be due in part to the high placebo response rate and to the individual studies being underpowered to detect a difference. The authors suggest several possible reasons for the high placebo rate, including large numbers of participating sites, few subjects within each site, and multinational sites. Although no site-by-site differences were found, these factors could easily have affected the results. Nonetheless, the studies were combined a priori, and the fact remains that statistically significant improvements with sertraline over placebo were found on the majority of the outcome measures.
Citalopram. Two studies have been conducted comparing citalopram and placebo. However, only one has been published (Wagner et al., 2004). This study was an 8-week multi-site study of 178 child and adolescent outpatients (ages 7–17) with MDD. Following a one-week placebo run-in, subjects were randomized to citalopram (20–40 mg/day) or placebo. Four citalopram subjects were lost to follow-up and did not receive study medication. Thus, the analysis included 89 subjects on citalopram and 85 subjects on placebo. The primary outcome was change in CDRS-R score from baseline to week 8 or early termination. Citalopram showed significant improvement over placebo as early as week 1 (F = 6.58, p < .05), and the difference persisted throughout the study. Although not reported in the paper, the change scores were −21.7 vs. −16.5 (p = .038; MHRA report). Additionally, more citalopram-treated subjects than placebo-treated subjects met the defined response criterion of CDRS-R ≤28 (36% versus 24%, respectively, p < .05). Although these response rates appear quite low, the defined response criterion used was full remission (i.e., ‘well’), and the reported rates of remission are similar to those found in other studies (Emslie et al., 1997, 2002a). No differences were found between citalopram and placebo on CGI-Improvement of 1 or 2 (much or very much improved; 47% versus 45%, respectively) or mean CGI Severity (4.4 versus 4.3, respectively). Other outcomes were not reported. It would have been of interest to know other response rates (e.g., percent change on the CDRS-R), because the CGI-Improvement response rates were negative and the pre-defined response criterion of CDRS-R ≤28 is more often considered a remission criterion. Of note, it appears unusual that the CGI-Improvement rates were not significantly different, while the continuous measure (CDRS-R) did show a difference both on change scores and on cut-offs. Thus, it is unclear why these outcomes are so different within a single trial.
The second citalopram study was a 12-week trial of citalopram (10–40 mg/day) or placebo (MHRA website). Two hundred and thirty-three adolescents (age 13–18), both inpatients and outpatients, who received at least 1 dose of medicine were included in the analysis. Significant differences were not found on the primary outcome measure (change from baseline on K-SADS-P total score) or any other outcome measures. Because this study has not been published and full data are not available for review, it is difficult to interpret the results. Nonetheless, two clear limitations of the study stand out. First, both inpatients and outpatients were included, which suggests increased illness severity in subjects; second, only 74 (60%) of subjects in the citalopram group and 79 (66%) in the placebo group completed the study. These issues alone cause some hesitation in evaluating the results.
Venlafaxine. Two double-blind, controlled trials of venlafaxine have been conducted in children and adolescents (age 7–17) with MDD. The studies showed no differences between drug and placebo on any outcomes. In Study 1 (#382), the mean decrease on the CDRS-R was −18.1 for venlafaxine and −16.1 for placebo (p = .338); in Study 2 (#394), the mean decrease on the CDRS-R was −24.3 for venlafaxine and −22.6 for placebo (p = .386) (MHRA website; Emslie et al., 2004). Based on CGI-Improvement of 1 or 2, response rates for Study 1 were 50% for the venlafaxine group and 41% for the placebo group (p = .314). In Study 2, 68% of the venlafaxine subjects and 61% of the placebo subjects were considered responders based on CGI-Improvement (p = .295).
The data from these two studies were pooled and presented at the American Psychiatric Association in 2004 (Emslie et al., 2004). Three hundred and thirty-four subjects were randomized to venlafaxine 37.5–225 mg/day (n = 169, ITT) or placebo (n = 165, ITT) for 8 weeks in 52 centers across the US. No difference was found between venlafaxine and placebo on the primary outcome (CDRS-R endpoint). Analyses were also conducted based on age groups (7–11 and 12–17). In the child group, no differences were found; however, in the adolescent group, significant differences were found between drug and placebo on the CDRS-R change score (−24.4 vs. −19.9; p = .02), suggesting that adolescents may show more response to venlafaxine treatment than younger children. The individual studies were underpowered to evaluate age group differences, however.
As mentioned previously, one other underpowered study (n = 40) has compared venlafaxine to placebo, with no differences found between active treatment and placebo (Mandoki et al., 1997).
Nefazodone. One double-blind, placebo-controlled trial comparing nefazodone to placebo has been reported. This trial involved 195 adolescents (aged 12–17) with MDD at 15 different sites in the US (Emslie et al., 2002a). Following a 2–4-week baseline phase, subjects were randomized to receive 8 weeks of either nefazodone (n = 99) or placebo (n = 96). Those randomized to nefazodone began at 100 mg/day, with a dosage increase of 100 mg/week to reach the target dose of 300–400 mg/day. The primary outcome measure was a comparison of mean CDRS-R scores from baseline to week 8 between those taking nefazodone and those taking placebo. CDRS-R scores over the entire 8-week trial showed a statistically significant change in favor of nefazodone (p = .03). At week 7, nefazodone showed a significantly greater change in CDRS-R scores than placebo (−26.7 vs. −21.3, respectively, p = .006). At week 8, the difference was 4.0 points in favor of nefazodone but just failed to reach significance (−26.5 vs. −22.5, respectively, p = .055).
Likewise, secondary outcomes suggested possible effectiveness of nefazodone over placebo. At week 8, nefazodone was associated with a higher CGI response rate (65% vs. 46%, respectively, p = .005), as well as better CGI Improvement (2.3 vs. 2.8, respectively, p = .012) and a greater change in CGI Severity (−1.7 vs. −1.3, respectively, p = .022). Change in HAM-D scores was also significantly greater in the nefazodone group (−10.0 vs. −8.2, respectively, p = .023), as was change in CGAS scores (17.2 vs. 13.0, respectively, p = .020). Although this study seemed to demonstrate that nefazodone is safe and effective in acute treatment of adolescents with MDD, an unpublished second depression trial in pediatric patients (mentioned in Emslie et al., 2002b) found no significant differences between nefazodone and placebo.
Mirtazapine. Two multi-center trials of children and adolescents (age 7–17) with MDD were conducted to compare mirtazapine and placebo. Participants were randomized to 8 weeks of mirtazapine 15–45 mg/day or placebo. In the first study, 126 youth (82 mirtazapine, 44 placebo) were randomized, and included in the efficacy (ITT group) and safety (AST group) analysis. No significant differences were found on any of the outcome variables. The primary outcome variable, CDRS-R total score at endpoint, was similar for the 2 groups: 35.1 ± 1.6 for mirtazapine and 37.2 ± 2.2 for placebo (p = .421). Other depression outcome measures were similar, including mean change in CGI-Severity (−1.71 ± .14 vs. −1.48 ± .19; p = .322), and change in HAMD-21 (−12.22 ± .84 vs. −11.07 ± 1.15; p = .419). Rates of response (defined as CGI-Improvement of 1 or 2) were also similar between active treatment and placebo (59.8% vs. 56.8%; p = .75).
In the second study, 133 youth (88 mirtazapine, 44 placebo) were randomized; however, only 132 were included in the safety analysis (AST group), and 124 were included in the efficacy analysis (ITT group). As in the initial study, none of the outcome measures were significantly different between the two groups. The CDRS-R total scores (primary outcome measure) were 35.4 ± 1.5 for mirtazapine, compared to 38.8 ± 2.1 for placebo (p = .19). Other depression outcomes were as follows: mean change in CGI Severity (−1.51 ± .14 vs. −1.15 ± .19; p = .127), and change in HAMD-21 (−11.82 ± .73 vs. −9.76 ± 1.04; p = .107). Of interest is that in this second study, 53.7% of those on mirtazapine and only 41.5% of those on placebo were considered responders based on CGI-Improvement (1 or 2). The numerical difference between the two groups (over 12 percentage points) was larger than the difference found in the sertraline trial (only 10 percentage points). However, because of the smaller sample size, the difference was not significant (p = .20).
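The contrast above between the mirtazapine and sertraline response rates can be illustrated with a pooled two-proportion z-test. This is only a sketch: the test actually used in each trial report is not stated here, the sertraline arm sizes are those given in the text, and the mirtazapine ITT arm sizes (83 and 41) are an assumption made for illustration, chosen to sum to the 124 ITT subjects in roughly the 2:1 randomization ratio.

```python
import math


def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal
    return math.erfc(abs(z) / math.sqrt(2))


# Sertraline pooled analysis: CGI-I response 63% vs. 53%, arm sizes from the text.
p_sertraline = two_proportion_p_value(0.63, 189, 0.53, 187)

# Second mirtazapine study: CGI-I response 53.7% vs. 41.5%.
# ASSUMPTION: ITT arm sizes of 83 and 41 are hypothetical, not from the source.
p_mirtazapine = two_proportion_p_value(0.537, 83, 0.415, 41)

print(f"sertraline:  p = {p_sertraline:.3f}")
print(f"mirtazapine: p = {p_mirtazapine:.3f}")
```

Under these assumptions the smaller trial yields the larger p-value despite the larger numerical difference, which is the point made in the text about sample size.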
Methodological issues regarding efficacy outcomes
In reviewing the efficacy data for antidepressants in adults and children, it does not appear that the response of adolescents to SSRIs differs from that of adults. Regulatory agencies generally require 2 positive trials to declare a medication effective. With up to 50% of adult trials failing, it frequently takes 4–5 studies to achieve the required 2 positive trials. It is unlikely that more than 2 studies will ever be completed in the pediatric population with a single medication, as pharmaceutical sponsors are only required to conduct 2 pediatric trials. In fact, in disorders such as OCD only 1 pediatric trial was required for efficacy.
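The arithmetic behind the "2 positive trials" argument can be sketched as follows. The sketch assumes, purely for illustration, that trials are independent and that each one fails with probability 0.5 (the upper figure quoted above for adult trials); the function names are hypothetical.

```python
from math import comb


def prob_at_least_two_positive(n_trials, p_positive=0.5):
    """Probability that at least 2 of n independent trials are positive."""
    p_zero = (1 - p_positive) ** n_trials
    p_one = comb(n_trials, 1) * p_positive * (1 - p_positive) ** (n_trials - 1)
    return 1 - p_zero - p_one


def expected_trials_for_two_positive(p_positive=0.5):
    """Mean number of trials needed to reach 2 positives (negative binomial mean r / p)."""
    return 2 / p_positive


for n in (2, 3, 4, 5):
    print(n, "trials:", prob_at_least_two_positive(n))
print("expected trials needed:", expected_trials_for_two_positive())
```

With only 2 trials, both must succeed, so under these assumptions an effective medication would meet the 2-positive-trial standard only a quarter of the time.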
A second consideration is whether individual SSRIs differ from one another. Such a conclusion is difficult to reach, as no agents have been compared directly. Likewise, there is no evidence in adults that the medications are particularly different from each other. Relatedly, placebo response rates in the different clinical trials vary from 33% to 60%, yet we do not assume that the placebo pills have different levels of efficacy. Thus, it is more likely that the differences seen in outcomes across the pediatric studies are a result of methodological differences. Table 3 lists some of the methodological issues that likely contribute to the varying rates of response across trials. In terms of efficacy, these issues involve site selection, study population, study design, and outcome measures. More specific details about the methodology of these studies can be found in a review of the study designs used by the FDA Advisory Committee to evaluate the studies in September 2004 (http://www.fda.gov/ohrms/dockets/ac/04/briefing/2004-4065b1-08-TAB06-Dubitsky-Review.pdf).
Table 3. Factors influencing study outcome
| Category | Factor |
| Site selection | Number of sites |
| Site selection | Number of subjects per site |
| Site selection | Experience of sites and investigators |
| Study population | Inclusion/exclusion criteria: age, diagnosis, prior and current comorbid conditions, severity of illness, etc. |
| Study population | Recruitment strategies |
| Study design | Duration of assessment |
| Study design | Use of placebo run-in period |
| Study design | Duration of acute treatment |
| Outcome measures | Adult scales vs. child-specific scales |
| Outcome measures | Categorical vs. continuous |
| Outcome measures | Cut-off scores vs. percent change |
Site selection. The first difference in trials is simply the number of sites selected for the study. Obviously, the more sites used, the more variability in the conduct of the study, which impacts outcome. Table 4 shows the difference between active medication and placebo in the 6 published trials of SSRIs (Emslie et al., 1997, 2002a; Keller et al., 2001; March et al., 2004; Wagner et al., 2003, 2004). The studies with the highest number of subjects per site had the largest drug–placebo differences, compared with the smaller differences in studies with more sites and fewer numbers of subjects per site. The single site fluoxetine trial had 96 subjects recruited within that site, while the average number of subjects per site in the sertraline trial was approximately 8 subjects (and in many cases, sites only recruited 1–2 subjects). Clearly there may be other factors that contributed, which will be reviewed next, but number of subjects per site may play an important role in outcome.
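The subjects-per-site comparison can be made concrete with a short calculation. The sketch below uses only the subject and site counts quoted in the narrative of this review (analyzed subjects, which may be fewer than enrolled subjects), so the averages are approximate and need not match Table 4; the dictionary keys are labels chosen for illustration.

```python
# (analyzed subjects, number of sites), as quoted in the text of this review
trials = {
    "Fluoxetine 1997 (single site)": (96, 1),
    "Paroxetine 2001 (Keller et al.)": (275, 12),
    "Nefazodone (Emslie et al., 2002a)": (195, 15),
    "Paroxetine Study#377": (275, 33),
    "Sertraline 2003 (pooled)": (376, 53),
    "Venlafaxine (pooled)": (334, 52),
    "Paroxetine Study#701": (203, 41),
}


def subjects_per_site(n_subjects, n_sites):
    """Average number of subjects contributed by each site."""
    return n_subjects / n_sites


per_site = {name: subjects_per_site(n, sites) for name, (n, sites) in trials.items()}

# List trials from most to fewest subjects per site
for name, value in sorted(per_site.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.1f} subjects per site")
```

An average hides the spread across sites: as noted above, many sertraline sites recruited only 1–2 subjects.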
Another site difference is the experience of the site and investigators in 1) assessing and treating pediatric depression, and 2) utilizing rating scales for outcome measures. The authors of the sertraline study, for example, suggest that the high placebo response rate may have been due in part to the wide variability in sites, including experience of investigators and low numbers of subjects recruited at each site.
Study population. The 2nd area of trial variability is the subject population, including inclusion/exclusion criteria and recruitment strategies. In each of the studies listed here, all subjects were required to meet DSM criteria for MDD with at least moderate severity (DSM-IV; American Psychiatric Press, 1994). However, other factors, such as age, severity of illness, co-morbid disorders, treatment history, and ongoing psychotropic and psychotherapeutic interventions, varied across studies. For example, most of these trials included only outpatients, but the unpublished citalopram study allowed inpatients as well (MHRA website). It is likely that the severity of depression was substantially higher in this study. Another example is that some trials excluded subjects with certain comorbid disorders, while others did not; the published paroxetine study, for instance, excluded patients with eating disorders (Keller et al., 2001). Likewise, some trials allowed ongoing supportive psychotherapy, which was strictly prohibited in others.
Another variable factor across trials is age. In some trials, only adolescents were included, while in others children were also included. Some of the negative studies have shown a difference between active medication and placebo in the adolescents, but not in the children (Emslie et al., 2004a; Wagner et al., 2003). However, it should be noted that some of the trials that have included children have been positive (Emslie et al., 1997, 2002a; Wagner et al., 2003, 2004).
Source of recruitment may also be a factor explaining differences in trial outcomes. Because no studies have explicitly examined this, the question of variability of severity across different types of sites remains unanswered. Thus, are subjects seen in psychiatric offices different (i.e., more severe) than those seen in general practitioner offices? If so, are less severe cases more likely to be placebo responders? In a study of CBT with depressed teens, Brent and colleagues reported higher response rates in subjects coming in from advertisements than clinical referrals (Brent et al., 1998), raising the question of whether clinical referrals may be more severe cases than those coming in through advertisements.
Study design. The 3rd area affecting outcomes of trials is the study design. Duration of assessment, use of placebo run-in, duration of acute treatment, and dosing all impact results. Rintelmann et al. reported that an extended evaluation period (i.e., 2 weeks) led to improvement in some subjects during the course of the evaluation (Rintelmann et al., 1996). Similarly, placebo run-ins are also useful in eliminating some of the placebo responders. The 2 fluoxetine studies (Emslie et al., 1997, 2002a), 1 paroxetine study (#377), 1 citalopram study (Wagner et al., 2004), and both venlafaxine studies used placebo run-in periods, while the others did not.
Duration of the acute trial must also be adequate. In most of the trials described here, the duration of acute treatment was 8 weeks (a few trials were longer). Of interest, however, is that most subjects do not reach clinical remission by 8 weeks of treatment (Emslie et al., 1997). Thus, if remission (e.g., CDRS-R ≤28) is the primary outcome, most subjects will not have received adequate treatment to achieve this level of remission.
Finally, dosing of medication in all trials has been based on extrapolation from adults, without the benefit of dose-finding studies. This appears to be less of an issue for adolescents, but in children it is possible that lower doses could have been effective with fewer side effects. An example of this is the sertraline trial. The discontinuation rate due to adverse events was substantially higher in the children than in the adolescents. The mean dose of sertraline in the study was 131 mg/d. The report does not specify the mean dose for children versus adolescents, but the authors comment that there was a question of whether the dosing was too high in the children considering the high dropout rate in this subgroup (Wagner et al., 2003). It is therefore possible that dosing affected treatment outcomes, whether through inadequate dosing or through overdosing that led to premature discontinuation from side effects.
Outcome measures. Research in pediatric depression has advanced significantly over the past 2 decades. However, rating scales for depression severity specific to this age group were limited initially. As such, many trials utilized rating scales designed for adults. In some cases, these scales may not have yielded adequate clinical information. For example, in children and adolescents, irritability may be the primary mood symptom, even in the absence of depressed mood itself (DSM-IV; American Psychiatric Press, 1994). Thus, scales that do not contain a rating for this symptom are missing an important aspect of pediatric depression. Currently, the Children’s Depression Rating Scale-Revised (CDRS-R) (Poznanski et al., 1984) is considered the most appropriate and widely used severity measure for this population. However, even when an appropriate continuous measure is used, the way that measure is analyzed affects the outcome of the trial. Some trials have used cut-off scores (e.g., CDRS-R ≤28), while others use percent change from baseline. The clearest picture of the impact of outcome measure selection can be seen in the multi-site fluoxetine trial (Emslie et al., 2002a). As Table 5 (Emslie et al., 2002a) depicts, depending on the percent change selected, remarkably different response rates for placebo can be seen (21–69%). Even the 30% decrease in the CDRS-R was significantly different between drug and placebo, but the reported rate in the published manuscript was calculated incorrectly. The CDRS-R has a minimum score of 17, so the response rate should be calculated with a correction for this non-zero minimum. Unfortunately, in this study, 30% change in CDRS-R score was the primary outcome variable and was reported as negative in the manuscript based on the incorrect calculation. As such, some scientists may consider this trial negative, even though the corrected calculation and multiple other outcome measures were positive.
Table 5. Calculations of response based on CDRS-R varying percent change
| CDRS-R decrease from baseline||Fluoxetine, n (%)||Placebo, n (%)||p value|
| ≥20%||95 (87.2)||69 (68.3)||.001|
| ≥30%||86 (78.9)||62 (61.4)||.006|
| ≥40%||77 (70.6)||56 (55.4)||.031|
| ≥50%||63 (57.8)||41 (40.6)||.014|
| ≥60%||56 (51.4)||29 (28.7)||.001|
| ≥70%||40 (36.7)||21 (20.8)||.015|
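The effect of correcting for the non-zero minimum of the CDRS-R can be sketched in a few lines of Python. This is an illustration only: it assumes the corrected percent reduction is computed on the score above the scale minimum of 17 (in line with the "17 points subtracted" adjustment noted for the sertraline trial in Table 2), and the function names and the example scores are hypothetical rather than taken from any trial.

```python
CDRS_R_MIN = 17  # lowest possible CDRS-R total score


def percent_reduction(baseline, endpoint, floor=0):
    """Percent reduction from baseline, measured on the score above `floor`.

    floor=0 gives the uncorrected calculation; floor=CDRS_R_MIN gives the
    calculation corrected for the non-zero minimum of the scale.
    """
    if baseline <= floor:
        raise ValueError("baseline must exceed the scale floor")
    return 100.0 * (baseline - endpoint) / (baseline - floor)


def is_responder(baseline, endpoint, threshold_pct, floor=0):
    """True if the reduction meets the chosen response threshold."""
    return percent_reduction(baseline, endpoint, floor) >= threshold_pct


# Hypothetical subject: CDRS-R of 57 at baseline and 41 at endpoint.
uncorrected = percent_reduction(57, 41)                  # 16 / 57 of the raw score
corrected = percent_reduction(57, 41, floor=CDRS_R_MIN)  # 16 / 40 of the score above 17

print(f"uncorrected: {uncorrected:.1f}%  corrected: {corrected:.1f}%")
print("responder at 30% (uncorrected):", is_responder(57, 41, 30))
print("responder at 30% (corrected):  ", is_responder(57, 41, 30, floor=CDRS_R_MIN))
```

In this hypothetical case the same 16-point improvement falls short of a 30% response criterion when computed on the raw score but meets it once the minimum is subtracted, which is the kind of discrepancy that can change whether a primary outcome is reported as positive or negative.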
Clearly, special attention must be paid to the selection of the primary outcome variable, as this is the focus of regulatory agencies. However, in reviewing the studies, a decision about whether a study is positive or negative cannot always be made strictly based on the primary outcome measure. All outcome information should be synthesized so that clinicians can make informed decisions about the integrity of the study, and thus the efficacy of the medication. Future studies should also consider having standard outcomes for each trial which would allow for easier comparison and interpretation. Of course, comparison would also be made easier if there is rapid disclosure and publication of trials data such as with a trials’ registry.
As with other pharmacological treatments, the indication for the use of antidepressants in children and adolescents is determined by weighing the risks and benefits of introducing an active treatment. Therefore, with the growing questions about the efficacy of antidepressants in children and adolescents for the treatment of depression, the safety profile of this class of medications has also become a topic of great interest for consumers, clinicians and researchers. Further fueling this interest is the possible link between the use of antidepressants and the increased risk for suicide-related events in children and adolescents with depression.
In this section, we review safety data from the same clinical trials as those reviewed in the efficacy section. However, given the extensive review of the safety data by the FDA, we considered the FDA data as the most reliable in this section. As with the efficacy data, there were only limited safety data from the Simeon et al. and Mandoki et al. studies. In the venlafaxine study conducted by Mandoki et al., nausea and increased appetite were more likely to occur in subjects treated with venlafaxine compared to placebo (Mandoki et al., 1997). There were no other reports of adverse events. Similarly, the Simeon study did not report differences in rates of adverse events between subjects on fluoxetine or placebo (Simeon et al., 1990). Therefore, these 2 studies are omitted from the majority of the sections below.
Serious adverse events (SAEs) or adverse events that lead to discontinuation of treatment make up only a small percentage of the total number of adverse events in both placebo and active treatment groups in the clinical trials. It is important to keep in mind that reports of adverse events leading to discontinuation and SAEs may or may not overlap. In other words, a subject who has an SAE may or may not discontinue from the study. The converse is also true: an adverse event that leads a subject to discontinue may not be an SAE. Similarly, suicidal events may or may not be classified as SAEs or reasons for discontinuation. For example, a subject who has suicidal ideation may be maintained in the study, so that event would not count as a discontinuation, and if it is not life threatening or does not lead to hospitalization, it is not an SAE. On the other hand, a serious attempt that requires hospitalization and removal from the study would count as both an SAE and a reason for discontinuation. In all of these cases, the determination of how to categorize an event is made at the site level by the investigators. Thus, the following sections provide details of the studies related to those categories (adverse events, adverse events leading to discontinuation, serious adverse events, and suicidality), and subjects may be listed in one or more areas. Finally, with the recent FDA warning, details regarding behavioral activation, hostility and switch to mania are addressed in a separate section.
Adverse events. Adverse events are common in antidepressant trials with children and adolescents. Methods for eliciting adverse events, however, vary widely across trials, as there are currently no systematic methods to ascertain adverse events. In most studies, adverse events are collected from spontaneous reports from patients and families. Only the multi-site fluoxetine trial used an additional method to obtain adverse events (standardized Side Effects Checklist; Emslie et al., 2002a). Furthermore, the spontaneous events collected may or may not be related to the medication, and this determination is generally made by the investigator.
The most common adverse effects are generally physical rather than the more controversial and serious adverse events such as hostility, mania or suicide-related events. In addition, although adverse events are common in pediatric trials, adverse events are seen frequently in subjects on both active medication and placebo. Few side effects occur statistically more frequently in the active treatment groups. The most common physical adverse events seen with SSRIs and SNRIs in children and adolescents are similar to those reported in adults in the Physicians’ Desk Reference (2004), but may occur slightly less frequently than in adults.
In addition to physical adverse events, some patients may experience behavioral symptoms, such as aggression, hostility, mania, behavioral activation, or motor restlessness. Although not as common as physical symptoms, these adverse events are serious problems when they occur. Patients receiving antidepressant treatment should be monitored for such symptoms, and health care providers should adjust treatment interventions as needed. Most of the antidepressant trials reviewed did not list spontaneously reported adverse events (physical or behavioral) unless they met a certain level of difference (e.g., ≥5% and at greater incidence than placebo). However, if the events led to discontinuation or were considered a serious adverse event, they were specifically listed in most cases. These will be reviewed in the Discontinuation and SAE sections of this review.
Adverse events leading to discontinuation. One measure of general safety is the rate of discontinuation due to adverse events. In all of the placebo-controlled trials of SSRIs and in some of the SNRIs, rates of discontinuation due to adverse events were reported. Table 6 shows the rates of discontinuation due to adverse events in each of the placebo-controlled trials for both SSRIs and SNRIs. The rates of discontinuation are consistent across all studies ranging from 2 to 12%. In general, the rates of discontinuation were higher in the active medication groups versus the placebo groups with all medications, although the differences, where reported, were not statistically significant.
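Table 6. Rates of discontinuation due to adverse events and rates of SAEs
| Trial||Discontinuation due to adverse events, active % (n)||Discontinuation due to adverse events, placebo % (n)||p value||SAEs, active % (n)||SAEs, placebo % (n)||Source|
| Fluoxetine 1997||10.4% (5)||4.2% (2)||NR||2% (1)||2% (1)||Emslie et al., 1997|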
| Fluoxetine 2002||4.6% (5)||8.2% (9)||.408||1% (1)||4% (4)||Emslie et al., 2002a|
| Fluoxetine 2004||UNK||UNK||UNK||UNK||UNK||March et al., 2004|
| Paroxetine 2001||9.7% (9)||6.9% (6)||NR||12% (11)||2% (2)||Keller et al., 2001|
| Sertraline 2003||9% (17)||2% (4)||NR||4% (7)||3% (6)||Wagner et al., 2003|
| Citalopram 2004||5.6% (5)||5.9% (5)||NS||0% (0)||NR||Wagner et al., 2004|
| Paroxetine (Study#701)||8.9% (9)||2% (2)||NR||5.9% (6)||1% (1)||Emslie et al., 2004b|
| Paroxetine (Study#377)||11.8% (22)||7% (7)||NR||11.8% (22)||6.5% (6)||GSK website|
| Nefazodone||3% (3)||3% (3)||NR||None||NR||Emslie et al., 2002b|
| Venlafaxine (combined)||10% (18)||3% (5)||NR||8% (14)||3% (5)||Emslie et al., 2004a|
| Citalopram||10.7% (13)||8% (9)||UNK||UNK||UNK||MHRA Report (Dec. 2003)|
| Nefazodone||UNK||UNK||UNK||UNK||UNK||Emslie et al., 2002b|
| Mirtazapine||5.3% (9?)||3.4% (3?)||UNK||1.2% (2)||1.1%(1)||Organon data on File|
In the single site fluoxetine trial (Emslie et al., 1997), 2 subjects (4%) on placebo and 5 on fluoxetine (10%) discontinued the study due to adverse events. One subject on placebo developed mania and one had a suicide-related event. Three of the fluoxetine subjects developed manic symptoms, one developed a rash, and one had a suicide-related event. In the multi-site fluoxetine trial (Emslie et al., 2002a), rates of discontinuation were not statistically different between the active treatment group (4.6%) and the placebo group (8.2%), but were slightly higher in the placebo group. In the fluoxetine group, 1 each discontinued for rash, agitation, constipation, hyperkinesias and manic reaction (n = 5). In the placebo group, 1 each discontinued for rash, abdominal pain, alopecia, anxiety, dizziness, headache, kidney infection, aggressive behavior, and self-mutilatory behavior (n = 9). Three of the placebo subjects who discontinued treatment were also considered SAEs because the event required hospitalization (kidney infection, aggressive behavior, and self-mutilatory behavior). Data are unavailable for the rates of discontinuation due to adverse events for the TADS study.
In the published paroxetine study, discontinuation due to adverse events occurred in 9.7% (n = 9) of subjects on paroxetine and 6.9% (n = 6) of subjects on placebo (Keller et al., 2001).
The GSK website (http://www.gsk.com/media/paroxetine.htm) also provides details about adverse events for the 2 unpublished trials. In the international adolescent only trial (#377), discontinuation due to adverse events occurred in 11.8% (22/187) of subjects in the paroxetine group and 7% (7/99) of subjects on placebo. The difference between these rates is not statistically significant.
In the child and adolescent trial (#701), 9 subjects (8.9%) in the paroxetine group and 2 subjects (2%) in the placebo group discontinued due to adverse events. Four paroxetine subjects had worsened depression and 1 had a suicide attempt (‘emotional lability’). In addition, 1 discontinued due to agitation and irritability (‘nervousness’), 1 for hostility, and 2 for medical problems. In contrast, 2 subjects on placebo were discontinued: 1 for suicidality (‘emotional lability’) and 1 for mood swings (‘emotional lability’), insomnia, and restlessness (‘nervousness’). No statistically significant differences in discontinuation due to adverse events were found between the two groups.
Nine percent (17/189) of subjects in the sertraline group discontinued from the study due to adverse effects; 2% (4/187) of those on placebo discontinued due to adverse events (MHRA website). The majority (13/17) of subjects who discontinued from the sertraline arm due to adverse effects were children (Wagner et al., 2003). The commonest reason for discontinuation was psychiatric adverse events. These were suicidal thoughts (3 sertraline), suicide attempt (2 sertraline, 2 placebo), aggressive behavior (1 sertraline), agitation (3 sertraline), and hyperkinesias (2 sertraline) (Wagner et al., 2003; MHRA website).
In the published trial of citalopram, the rate of discontinuation due to adverse events was similar for citalopram and placebo (5.6% vs. 5.9%). Discontinuation due to agitation was reported in 2 subjects and discontinuation due to worsening depression was reported in 2 subjects. All 4 of these subjects were on citalopram (Wagner et al., 2004). In the unpublished trial reported in the MHRA report, there were 13/121 (10.7%) discontinuations in the treatment group and 9/112 (8.0%) in the placebo group. Five of the discontinuations in the treatment group were due to suicide-related events compared to 2 in the placebo group. As mentioned previously, this study included both inpatients and outpatients, so it is possible that the baseline severity of this population was greater. Until more data are available on the study design and other outcomes, it is difficult to interpret these results.
In the combined trials of venlafaxine, 18 (10%) on venlafaxine and 5 (3%) on placebo discontinued due to adverse events. Reasons for discontinuation from venlafaxine were physical symptoms (n = 8), psychiatric symptoms (n = 5: 2 hostility/aggression, 2 mania, 1 hallucination), and suicidal behavior (n = 5: 4 suicidal ideation, 1 attempt). Reasons for discontinuation from placebo were medical (n = 2) and psychiatric symptoms (n = 3: 2 nervousness/hostility, 1 mania). In addition, 5 subjects discontinued during the placebo run-in period for adverse events (1 with depression, 1 hostility, 1 suicidal ideation, 1 accidental injury, and 1 unexpected pregnancy) (Emslie, personal communication).
The discontinuation rate due to adverse events in the nefazodone study was 3% (n = 3) for both active medication and placebo. Specific details about the reasons for discontinuation are not reported in the abstract (Emslie et al., 2002b).
In the two placebo-controlled trials of mirtazapine, adverse events were reported for the combined sample. The rate of discontinuation due to adverse events was similar for mirtazapine and placebo (5.3% vs. 3.4%, respectively).
Serious adverse events. The reporting of SAEs has been consistent in clinical trials with SSRIs and SNRIs. Adverse events are considered to be SAEs if any of the following occurs:
1. Death: The adverse event resulted in the death of the subject.
2. Life threatening: The subject was at substantial risk of dying at the time of the adverse event.
3. Hospitalization (initial or prolonged): Admission to the hospital or prolongation of a hospital stay results because of the adverse event.
4. Disability: The adverse event resulted in a significant, persistent, or permanent change, impairment, damage or disruption in the patient's body function/structure, physical activities or quality of life.
5. Congenital anomaly/birth defect: There are suspicions that exposure to a medical product prior to conception or during pregnancy resulted in an adverse outcome in the child.
6. Event requires intervention to prevent permanent impairment or damage: A condition that requires medical or surgical intervention to preclude permanent impairment or damage to a subject.
7. Other important medical events: Events that may not result in death, be life threatening, or require hospitalization may be considered a serious adverse drug experience if they may jeopardize the subject or may require medical or surgical intervention (e.g., failed suicide attempts).
The determination of whether an event is an SAE is made at the site level by the investigator, as is the determination of the relationship of the event to the medication. Psychiatric adverse events, such as worsening of mood or suicidal behavior, are often difficult to categorize, as it is unclear if the event is related to the medication or to the illness. Suicide-related events are reported as SAEs if they meet any of the 7 SAE criteria; however, some suicidal behaviors would not qualify as an SAE. For example, in some cases of suicidal gestures (e.g., cutting) or suicidal ideation, the event may not be documented as an SAE. As mentioned, classification is made more difficult because many of the psychiatric adverse events or SAEs may be a result of the illness, rather than the medication. Placebo-controlled trials are needed to evaluate whether the incidence of such events is higher on active medication than on placebo. Table 6 shows the rates of SAEs in the included trials.
In the single-site fluoxetine trial, no SAEs were reported in the manuscript. However, 1 subject on fluoxetine and 1 subject on placebo were hospitalized due to suicide-related events, which would qualify as SAEs (Emslie, personal communication). In the multi-site trial, 5 SAEs were reported: 1 in a fluoxetine subject and 4 in placebo subjects. The fluoxetine subject was hospitalized for swollen tonsils; the 4 placebo subjects were hospitalized, 1 each, for kidney infection, aggressive behavior, abdominal pain/appendicitis, and self-mutilation. Data on the SAEs occurring in the TADS study are not yet available.
In the published report of paroxetine, 11 (12%) patients on paroxetine and 2 (2%) patients on placebo were reported to have SAEs (Keller et al., 2001). Adverse events in the study were defined as serious if they resulted in hospitalization, were associated with suicidal gestures, or were determined by the treating physician to be serious. Of the 11 SAEs with paroxetine, 1 was a severe headache and the other 10 were various psychiatric events, including worsening of depression (2), ‘emotional lability’ (e.g., suicidal ideation/gestures; 5), hostility or conduct problems (2), and euphoria (1). Seven of these patients were hospitalized. Only the subject with headache was considered to have a drug-related SAE. In the placebo group, 1 subject developed ‘emotional lability’ and 1 had a worsening of depression.
In the international adolescent trial (#377), 11.8% (22) of paroxetine and 6.8% (6) of placebo subjects had SAEs. Three of the paroxetine subjects had agitation and 6 had ‘emotional lability’. In contrast, none of the placebo subjects had agitation, 3 had ‘emotional lability’, and 1 had nervousness.
In the child and adolescent trial (#701), SAEs occurred in 6/101 (5.9%) subjects in the paroxetine group and 1/102 (1%) patient in the placebo group. In the paroxetine group, 3 subjects had worsening of depression, 1 subject had suicidal ideation (termed ‘emotional lability’), and 2 had a suicide attempt (termed ‘emotional lability’) with subsequent medical conditions (1 hypertension, 1 arm laceration). Four of these subjects were discontinued following the SAEs. In the placebo group, 1 subject had suicidality (termed ‘emotional lability’), and was discontinued from the study.
Seven subjects (4%) in the sertraline group had SAEs: 2 with a suicide attempt, 3 with suicidal ideation, 1 with aggressive behavior, and 1 requiring hospital admission for medical problems. In contrast, 6 (3%) of subjects in the placebo group had SAEs: 2 with suicide attempt and 4 with hospitalization due to medical conditions.
In the published trial of citalopram, no SAEs occurred in patients treated with citalopram (Wagner et al., 2004). The report does not mention if there were any SAEs in the placebo group. In the unpublished trial of citalopram, specific information about SAEs was not provided in the MHRA report (MHRA website). However, 15 (12.4%) subjects in the treatment group versus 9 (8%) in the placebo group were hospitalized during the study for psychiatric disorders.
In the combined venlafaxine trials (Emslie et al., 2004a), 14 (8%) on venlafaxine and 5 (3%) on placebo had SAEs. Medical SAEs occurred in 4 subjects on venlafaxine. Other SAEs in the venlafaxine group were due to psychiatric symptoms (1 agitation/hostility, 1 mania, 2 worsening of depression, 1 hallucinations) and suicidal behaviors (4 suicidal ideation and 1 suicide attempt). Types of SAEs in the placebo group were similar: 1 medical, 3 psychiatric (2 hostility, 1 mania), and 1 suicidal behavior (self-injurious behavior).
No SAEs occurred in the nefazodone study for patients receiving nefazodone. The report did not include information about SAEs in the placebo group.
In the mirtazapine trials, only 3 SAEs were reported following randomization (2 on mirtazapine and 1 on placebo). Only 1 of these involved worsening of depression and suicidal ideation (mirtazapine), and in the opinion of the investigator, this event was unlikely to be due to the study drug. The other 2 SAEs involved medical events: 1 overdose on Depakote after being ‘dared’ (mirtazapine), which required emergency room treatment and was considered by the investigator to be unlikely to be related to study drug, and 1 increased temperature with possible viral infection (placebo), which required hospitalization and was considered by the investigator to be unrelated to study drug.
From published and unpublished trials of SSRIs and SNRIs in children and adolescents, there are reports of increased suicide-related events in patients treated with medication versus placebo (Table 7). One difficulty in interpreting suicidal behaviors in depression studies is that suicidal behavior and suicidal ideation are themselves symptoms of depression. Suicide is the 3rd leading cause of death in adolescents. Approximately 19% of teenagers (age 15–19) in the general population think about suicide and nearly 9% of teenagers make an actual suicide attempt (MMWR, 2002). These rates are even higher in patients receiving some type of care for depression. Studies find that 35–50% of these youth have made, or will make, a suicide attempt. Thus, when an individual patient makes a suicide attempt during the course of treatment, it is simply not possible to know if the event was related to the medication or if it was a part of the illness itself. In many cases, patients have suicidal ideation prior to treatment, as part of the illness. Continuation or worsening of such symptoms could be related to medication, but could also simply reflect lack of improvement. Only when there are sufficient numbers of patients in placebo-controlled trials will we be able to determine if medication treatment is associated with increased suicidal behavior.
Table 7. Overall relative risk for suicide-related events in clinical trials of youth by drug
|Drug||MDD trials, relative risk (95% CI)||All indications, relative risk (95% CI)|
|Celexa||1.37 (.53, 3.50)||1.37 (.53, 3.50)|
|Luvox||No MDD trials||5.52 (.27, 112.55)|
|Paxil||2.15 (.71, 6.52)||2.65 (1.00, 7.02)|
|Prozac||1.53 (.74, 3.16)||1.52 (.75, 3.09)|
|Zoloft||2.16 (.48, 9.62)||1.48 (.42, 5.24)|
|Effexor XR||8.84 (1.12, 69.51)||4.97 (1.09, 22.72)|
|Remeron||1.58 (.06, 38.37)||1.58 (.06, 38.37)|
|Serzone||No events||No events|
|Wellbutrin||No MDD trials||No events|
|Total||1.66 (1.02, 2.68)||1.95 (1.28, 2.98)|
The other major difficulty in interpreting data on suicidal behavior in these studies is terminology. Suicidal behavior is not the same as suicidal ideation. Different studies use varying terminology. For example, in the paroxetine trial, ‘emotional lability’ included suicidal ideation and suicidal gestures. Furthermore, investigators within a single study may use different terminology. An event such as cutting may be labeled several different things: suicidal gesture, self-mutilatory behavior, suicide attempt, etc. Clearly, hospitalization due to an actual attempt is not equal to hospitalization due to increased ideation. Thus, analyzing the events is somewhat difficult. Furthermore, within events that are actual suicidal behaviors (i.e. actually doing something that could lead to death), there are wide degrees of difference. Self-harm with no ideation or intent is not the same as a suicidal gesture, which is not the same as an attempt. Therefore, the FDA recently completed a project to re-evaluate each of these types of events across the pediatric antidepressant RCTs, and re-categorize them consistently across all studies.
Table 7 presents the results of the FDA re-analyses of suicide-related events in youth from antidepressant trials for MDD and for all indications. Across trials for all disorders, the overall relative risk of suicide-related events (including suicidal ideation, self-harm, or attempts) on active medication versus placebo was 1.95 (95% CI = 1.28–2.98). As seen in Table 7, with the exception of venlafaxine, subjects on individual antidepressants were not statistically more likely to experience suicide-related adverse events compared to placebo, possibly because of the low rate of such occurrences. However, the overall relative risk for suicide-related events was statistically significant.
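Why a low event rate yields wide confidence intervals for a single drug can be illustrated with a short Python sketch. It uses the standard log-transformed (Katz) interval for the relative risk from one 2×2 table, which is an assumption for illustration and not necessarily the pooling method used in the FDA re-analyses. The function name and the event counts are hypothetical.

```python
import math


def relative_risk(events_drug, n_drug, events_placebo, n_placebo, z=1.96):
    """Relative risk (drug vs. placebo) with an approximate 95% CI.

    Uses the log-transformed (Katz) interval. Zero-event arms are rejected
    because the log relative risk is undefined without a continuity correction.
    """
    if events_drug == 0 or events_placebo == 0:
        raise ValueError("zero events in one arm; a continuity correction is needed")
    rr = (events_drug / n_drug) / (events_placebo / n_placebo)
    se_log_rr = math.sqrt(
        1 / events_drug - 1 / n_drug + 1 / events_placebo - 1 / n_placebo
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical single trial: 8/200 events on drug vs. 4/200 on placebo.
rr, lower, upper = relative_risk(8, 200, 4, 200)
print(f"RR = {rr:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")

# Same event rates with ten times as many subjects: the interval narrows.
rr_big, lower_big, upper_big = relative_risk(80, 2000, 40, 2000)
print(f"RR = {rr_big:.2f}, 95% CI = {lower_big:.2f}-{upper_big:.2f}")
```

With only a handful of events, a doubling of risk is compatible with an interval that includes 1, whereas the same rates in a much larger sample give an interval that excludes 1. This mirrors the pattern in Table 7, where most individual drug estimates are not significant but the pooled estimate is.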
One interesting trend emerged from the FDA re-analyses. There was a large numeric difference between the increased risks for suicide ideation compared with actual suicide attempts for subjects on venlafaxine. Most of the risk-ratio for venlafaxine was driven by suicidal ideation events, rather than by actual suicidal behavior events (all suicide-related events 8.84, 95% CI = 1.12–69.51; suicidal ideation 7.89, 95% CI = .99–62.59; suicidal behaviors 2.77, 95% CI = .11–67.10). This was also true for sertraline. However, in subjects treated with paroxetine, citalopram or fluoxetine, the risk was higher for suicidal behaviors compared to suicidal ideation, although none of these differences were significant between medication and placebo, either by event or for the total.
The overall risk difference between the drug and placebo groups was 2–3%. However, even after extensive sub-analyses, no predictive factors were identified which distinguished between subjects with treatment emergent suicide-related adverse events and those with suicide-related events related to their depressive disorder.
Behavioral activation, hostility or switch to mania
Although behavioral activation adverse events and manic episodes are generally captured as adverse events leading to discontinuation or as SAEs, with the recent FDA emphasis on this possible adverse event, the reports of these events are reviewed specifically in this section. Table 8 presents the results from the FDA report regarding the risk of developing these adverse events on medication compared to placebo. As for spontaneous reports of behavioral adverse events (excluding those that led to discontinuation or were classified as SAEs), only a few reports provide this information. For example, there were no spontaneous reports of behavioral adverse events mentioned for fluoxetine (1997, 2002a), citalopram, or nefazodone. In contrast, in the TADS study (March et al., 2004), 6 (5%) subjects on fluoxetine (either alone or in combination with CBT) experienced agitation/hostility/irritability versus 4 (4%) subjects on placebo. In the published report of paroxetine, 7 (7.5%) paroxetine subjects (1 was reported as an SAE) and no placebo subjects were listed as having hostility (the report included any adverse events at ≥5% on any of the treatments). In one of the unpublished study reports for paroxetine (#377), all adverse events were listed. In this study, 5 subjects on paroxetine had agitation or hostility. Three of the subjects with agitation were discontinued from the study. In study #701, only adverse events that occurred in ≥5% of subjects were reported. One subject had hostility and was withdrawn from the study. The sertraline report listed all spontaneous adverse events that occurred in ≥5% of the sertraline group and at least two times the incidence of placebo. Within the children, agitation was more frequently seen on sertraline than placebo (8.1% vs. 2.3%); no behavioral events were mentioned for adolescents. Finally, the presented data for venlafaxine reported 5 (3%) subjects on venlafaxine, compared with 2 (1%) placebo subjects, who had hostility adverse events.
Table 8. Overall relative risk for treatment emergent agitation or hostility in MDD trials of youth by drug
|Drug||Relative risk (95% CI)|
|Celexa||1.87 (.34, 10.13)|
|Paxil||7.69 (1.80, 32.99)|
|Prozac||*1.01 (.40, 2.55)|
|Zoloft||2.92 (.31, 27.83)|
|Effexor XR||2.86 (.78, 10.44)|
|Remeron||.52 (.03, 8.27)|
|Serzone||1.09 (.53, 2.25)|
|All drugs||1.79 (1.16, 2.76)|
Specifically relating to mania, in the citalopram and nefazodone trials, none of the subjects developed mania (Wagner et al., 2004; MHRA website). In the sertraline trial, there was no mention of the risk of switching to hypomania/mania (Wagner et al., 2003). In the fluoxetine trials, 1 subject (1%) on fluoxetine in the 2002 study, and 3 subjects (6%) on fluoxetine and 1 subject (2%) on placebo in the 1997 study developed manic symptoms (Emslie et al., 1997; Emslie et al., 2002a; Emslie, personal communication). In the TADS study, mania occurred in 1% of both the fluoxetine alone and placebo groups. However, 3% of subjects on fluoxetine (either alone or in combination with CBT) developed hypomanic symptoms versus 1% of subjects on placebo. In the published study of paroxetine, 2 subjects on paroxetine developed euphoria (GSK website). In the unpublished paroxetine studies, there are only limited data on switching. One subject on placebo discontinued due to mood swings in study #701, and there were no reports of switching in study #377 (GSK website). Finally, 2 subjects (1%) developed manic symptoms during the venlafaxine trial. No information is available on mania for the mirtazapine trials.
Methodology issues regarding safety outcomes
With the increased focus on adverse events associated with antidepressants, researchers have also focused on how these events are evaluated and reported in clinical trials. There are several issues that have been raised, including the method of evaluation of adverse events and the method of reporting adverse events in these trials.
As mentioned above, the process of evaluating adverse events in the placebo-controlled trials has been variable and inconsistent. The different methods used to evaluate adverse events in these trials may account for some of the differences in the rates of adverse events. The different methods used included 1) standardized side effect scales, 2) required serious adverse events reporting, and 3) general inquiry or the dependence on spontaneous reporting by patients. Although the majority of the studies simply depended on the spontaneous reporting of side effects by patients on general inquiry, one of the studies included the use of a standardized checklist to elicit adverse events (Emslie et al., 2002a). Clearly, the latter method would increase the number of adverse effects reported in these trials and may account for the lack of difference between the treatment and placebo groups in that trial (Rabkin & Markowitz, 1986). In fact, the lack of standardized and valid tools for assessing the safety of medications in youth is an issue that remains controversial (Greenhill et al., 2003).
Aside from the method of elicitation, there are also issues with how adverse events are classified and described once they are elicited. This is of special concern in the reporting of suicide-related events, because it is unclear how such events are classified or defined in clinical trials. For example, one concern that has been previously mentioned is the use of other terms such as ‘emotional lability’ to describe adverse events that would be classified as suicide-related events in other studies. Furthermore, there is a need for better descriptions of the severity of adverse events, particularly suicide-related events; the intent or lethality of a suicide-related event is a frequent example. Better descriptions of severity will allow users of the literature to understand more fully the potential risk of adverse events with a particular treatment. Clearer and more consistent guidelines are therefore needed to help researchers classify and define adverse events.
Several methodological issues also arise from the reporting of adverse events in these trials. First, adverse events have been reported in isolation from the subjects’ previous behaviors or history. Since prior behaviors are sometimes the best predictor of future behaviors, it is crucial for studies to report both the adverse event and any known history of similar behaviors in these subjects. This is particularly important in the reporting of suicide-related events, which are generally highly associated with previous suicidal behaviors (Shaffer & Waslick, 2002). In our review, only one trial collected and reported this important information (Emslie et al., 2002a).
Second, as discussed above, the selection criteria for the publication of adverse events are also quite variable. Some studies reported adverse events only in the treatment groups, while others reported adverse events only if there were significant differences between the active treatment and placebo groups. Finally, some studies reported adverse events only if they occurred in more than 5% of the subjects and with an incidence in the treatment group at least twice that in the placebo group (Wagner et al., 2003; Wagner et al., 2004). This inconsistency in reporting has made it difficult to compare adverse events across studies, and difficult for users of the literature to appreciate the full range of adverse events that may occur in patients treated with either active medication or placebo in different studies.
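To make the effect of such a reporting threshold concrete, the following minimal sketch applies the "more than 5% and at least twice placebo" rule to a few events. The function name, the event labels, and all counts are hypothetical and are not taken from any of the trials reviewed here. The sketch also assumes that the 5% cutoff applies to incidence in the treatment group, which the published descriptions do not state explicitly.

```python
from fractions import Fraction


def meets_reporting_threshold(drug_events, drug_n, placebo_events, placebo_n,
                              min_rate=Fraction(5, 100), min_ratio=2):
    """Return True if an adverse event would be listed under the rule.

    Assumed rule: treatment-group incidence is above min_rate AND is at
    least min_ratio times the placebo-group incidence.
    """
    drug_rate = Fraction(drug_events, drug_n)
    placebo_rate = Fraction(placebo_events, placebo_n)
    return drug_rate > min_rate and drug_rate >= min_ratio * placebo_rate


# Hypothetical counts: (events on drug, n on drug, events on placebo, n on placebo)
hypothetical_events = {
    "event_a": (12, 100, 5, 100),  # 12% vs. 5%: above 5% and more than twice placebo
    "event_b": (4, 100, 0, 100),   # 4% vs. 0%: below the 5% cutoff
    "event_c": (10, 100, 8, 100),  # 10% vs. 8%: less than twice placebo
}

reported = [name for name, counts in hypothetical_events.items()
            if meets_reporting_threshold(*counts)]
print(reported)
```

Under these assumptions only the first event would appear in a publication. The second, which occurred only on active medication, and the third, which was common in both groups, would both go unreported, which illustrates why readers cannot reconstruct the full adverse event profile from studies that apply such a filter.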