Review of the efficacy and safety of antidepressants in youth depression


Amy H. Cheung, 33 Russell St., 3rd Floor Tower, Toronto, Ontario M5S 2S1, Canada; Tel: 1-416-535-8501


Background:  Depression in children and adolescents is a cause of substantial morbidity and mortality in this population. It is a common disorder that affects 2% of children and up to 6% of adolescents. Although antidepressants are used frequently for the treatment of this disorder, there has been recent controversy about the efficacy and safety of these medications in this population. This review examined the available evidence from clinical trials of antidepressants in adolescents and children with depression.

Methods:  Clinical trial data reviewed were obtained from published reports, including peer-reviewed journals and meeting abstracts, as well as unpublished data in the public domain. Clinical trials in this review included large RCTs of antidepressants in youth under the age of 19 with depression. Studies were identified in 2 stages: 1) all RCTs included in the 2004 FDA safety report were reviewed; and 2) to ensure that studies not reported to the FDA were not missed, MEDLINE and PsycINFO were searched from inception until December 2004. A total of 8 published studies and 9 unpublished studies were identified and reviewed.

Results:  Efficacy and safety results from each study are reviewed in detail. There are significant differences in remission and response rates not only between different antidepressants but also between placebo groups across studies. Adverse events are common in clinical trials involving children and adolescents with depression. Due to lack of access to full data sets, effect sizes could not be calculated.

Conclusions:  With the variability in trial methodology and the variation in the drug/placebo response rates within a single trial, clinicians need to be judicious in their interpretation of research data on pediatric antidepressant trials. Significant methodological issues may also have affected the efficacy and safety results from these clinical trials.

Depression in children and adolescents is a cause of substantial morbidity and mortality in this population. Suicide remains the 3rd leading cause of death in adolescents, and is often associated with depression (National Center for Health Statistics, 2001). The prevalence of depression is around 2% in children (equal proportion in males and females), and increases to 4–6% in adolescents (where the ratio progresses to 2:1 female to male) (Emslie, Weinberg, Rush, Adams, & Rintelmann, 1990). There is substantial continuity with adult depression, with approximately 50% having recurrent depression in adulthood (Weissman et al., 1999).

Given the serious nature of the disorder, along with the fact that the peak onset is in adolescence and young adulthood, early recognition and treatment are important. However, only in the past 5–10 years have substantial data been available for both psychopharmacology and psychotherapy specific to children and adolescents. In the 1990s, it was recognized that there were limited data available on a variety of medications that were used in children and adolescents but had only been approved for adults. This was an issue for all of pediatrics, not just psychiatry, and the Food and Drug Administration (FDA) Modernization Act was enacted to address this imbalance by requiring pharmaceutical companies to provide pediatric data on new compounds, as well as providing an incentive (6-month extension on patent life) for existing medications. This resulted in a substantial increase in available information in children and adolescents, particularly in psychiatry. Specifically with antidepressants, prior to 1995, data were only available for about 250 children and adolescents who had been in small double-blind randomized controlled trials (RCTs) (Jensen, Ryan, & Prien, 1992). Now the number is over 4,400. The increase in available data has also brought difficulties in interpreting the data, leading to recent controversies. Prior to 1997, there had been no published reports of any antidepressant being better than placebo for treatment of depression in childhood. Therefore, this paper will review the efficacy and safety of antidepressants in youth with depression based on RCTs, examining published and unpublished data. Methodological differences between the trials will be examined to evaluate their impact on the assessment of efficacy and safety.

The recent controversies about the use of antidepressants hinge primarily on issues of efficacy and safety. In the absence of efficacy, any risk is of concern. However, it is important to recognize that failure to prove efficacy is not proof of lack of efficacy. The standard to determine efficacy for regularly approved treatments of depression in children and adolescents is generally 2 positive placebo-controlled trials (Jensen et al., 1999). While in adults 50% of trials conducted for registration fail to prove efficacy, it is unlikely that more than 2 studies will be done in the pediatric age group with any medication, because the FDA Modernization Act only required that 2 studies be conducted in the pediatric population (for medications that may be used in youth). Therefore, it is unlikely that most medications will be approved. However, clinicians must make their own judgment of available data in determining how to evaluate the benefits and risks in treating depression in children and adolescents.

The controversy about the safety and efficacy of antidepressants, particularly selective serotonin reuptake inhibitors (SSRIs) and newer antidepressants (serotonin-norepinephrine reuptake inhibitors, SNRIs), began in June 2003, when Great Britain's Medicines and Healthcare products Regulatory Agency (MHRA) received all the pediatric data (both depression and anxiety trials) for paroxetine submitted by GlaxoSmithKline (GSK). Great Britain's Department of Health then issued a statement that paroxetine should not be used to treat depressed youth under 18. The statement reported that paroxetine was not efficacious in treating depressed adolescents, and that it may cause increased suicidal behavior. Shortly thereafter, the US FDA issued a similar warning, but stated that it had not yet completed a full evaluation. Then, in August 2003, Wyeth Pharmaceuticals, maker of venlafaxine and venlafaxine XR, distributed letters to physicians stating that venlafaxine was not recommended for treating depressed youth due to lack of efficacy and potential increased hostility and suicidal behavior. Great Britain's Department of Health followed suit, stating that venlafaxine should not be used in the pediatric population. In October 2003, the FDA issued a ‘talk paper’ stating that it was conducting investigations of antidepressants, including citalopram, fluoxetine, fluvoxamine, mirtazapine, nefazodone, sertraline, and venlafaxine, but that at present, the data were insufficient to suggest that these medications caused increased suicidal or aggressive behavior in youth. The statement clearly advised physicians to monitor subjects for side effects, particularly increased suicidal risk, but did not unequivocally state there were data to support this concern. In December 2003, the MHRA stated, ‘The majority of SSRI's … are not suitable to be used by under 18s … [and that] risks … outweigh benefits ….’ Only fluoxetine was considered to have an acceptable risk–benefit ratio.
In March 2004, the FDA issued a statement requesting a warning label that recommends close observation of adult and pediatric patients treated with antidepressants (fluoxetine, sertraline, paroxetine, fluvoxamine, citalopram, escitalopram, bupropion, venlafaxine, nefazodone, and mirtazapine) for worsening of depression or the emergence of suicidality. The label specifically warns patients, families, and healthcare providers to watch for signs of worsening of depression, increased suicidality, anxiety, agitation, panic attacks, insomnia, irritability, hostility, impulsivity, akathisia (severe restlessness), hypomania, and mania in all patients (children, adolescents, and adults). Finally, after lengthy re-analyses of all suicide-related data from 26 RCTs (all disorders), in October 2004 the FDA issued a black box warning describing an increased risk of worsening of depression and suicidality for all current and future antidepressants used in those under the age of 18. Of note, not all of the data from the 26 trials are available to the public, and no completed suicides have been reported in any of the RCTs of children and adolescents with these medications.

The aim of this paper is to review the efficacy and safety of antidepressants in youth with depression based on recent large RCTs, examining published, presented, and unpublished data. Differences in methodology between trials will be explored to evaluate their impact on the assessment of both efficacy and safety.


Clinical trial data reviewed in this paper were obtained from published reports, including peer-reviewed journals and meeting abstracts, as well as unpublished data in the public domain (FDA website, MHRA website). Clinical trials in this review included large RCTs of antidepressants in youth under the age of 19 with depression. Studies included in this review were identified in 2 stages. First, we reviewed all depression RCTs included in the FDA safety report. Second, to ensure that other studies not reported to the FDA were not missed, we searched MEDLINE and PsycINFO from inception until December 2004. Searches were based on keywords/terms ‘antidepressant, placebo, child, youth, adolescent, depression, trial, double blind, comparative study’. Among 5,333 potentially relevant published articles initially screened for the review, 8 met the criteria for inclusion (newer classes of antidepressants, i.e., SSRIs and SNRIs; randomized placebo-controlled trials; youth under the age of 19; acute treatment). These included four trials with fluoxetine (Simeon et al., 1990; Emslie et al., 1997, 2002a; March et al., 2004), one with sertraline (Wagner et al., 2003), one with citalopram (Wagner et al., 2004), one with paroxetine (Keller et al., 2001), and one with venlafaxine (Mandoki, Tapia, Tapia, Sumner, & Parker, 1997). Although the Mandoki study was underpowered (N = 40) and was a combination trial (both treatment groups also received psychotherapy), it was included. However, neither the Mandoki study nor the study by Simeon et al. is described in most sections of this review because of the lack of information in the published manuscripts (Mandoki et al., 1997; Simeon, Dinicola, Ferguson, & Copping, 1990). These 2 studies were also excluded from the reviews by the FDA and MHRA, and therefore no additional efficacy and safety information was available on these studies.
In addition, older antidepressants (i.e., MAOIs, TCAs) were not included in this review because 1) the current controversy and the recent FDA review only involved newer classes of antidepressants, and 2) the older classes have either a known lack of efficacy (i.e., TCAs) or a lack of clinical trial data (i.e., MAOIs) (Hazell, O'Connell, Heathcote, & Henry, 2002).

Finally, due to the continuing controversy around the disclosure of unpublished clinical research data, unpublished studies included in the FDA review were also reviewed for this paper: two each with paroxetine, venlafaxine, nefazodone, and mirtazapine, and one with citalopram. Where multiple sources were used for a single study, there were occasional discrepancies between the various reports. In these cases, for efficacy data the published source was considered the accurate presentation, followed by data from personal communication with the investigators, the MHRA report (MHRA website), and the FDA website. With paroxetine, the actual study reports published on the GSK website carried the greatest weight for the 2 unpublished trials (GSK website). Given the FDA reclassification and re-analyses of the safety data, the FDA report was considered the accurate presentation for safety data. Table 1 is a summary of the included trials and sources of data, and describes general methodological information about each of the studies. For further details regarding the FDA review, please refer to the FDA website.

Table 1.  Methodology for SSRI and SNRI trials
| Study | # Sites | N | Age | Gender (% female) | Duration | Diagnostic measures | Outcome measures | Source |
|---|---|---|---|---|---|---|---|---|
| Fluoxetine 1997 | 1 | 96 | 7–17 | 46% | 8 weeks | DICA, K-SADS Affective, DSM-IIIR | CDRS-R, CGI, CGAS, BDI/CDI, WSAS, BPRS-C | Emslie et al., 1997 |
| Fluoxetine 2002 | 15 | 219 | 8–18 | 49% | 9 weeks | DICA, DSM-IV | CDRS-R, CGI, GAF, BDI/CDI, MADRS, HAM-A | Emslie et al., 2002a |
| Fluoxetine 2004 | 15 | 221 (flx/pb) | 12–17 | 54% | 12 weeks | K-SADS-PL, DSM-IV | CDRS-R, CGI, CGAS, RADS, self-reports | March et al., 2004 |
| Paroxetine 2001 | 12 | 275 | 12–18 | 62% | 8 weeks | K-SADS-L, DSM-IV | HAM-D, CGI, K-SADS-L depression | Keller et al., 2001 |
| Sertraline 2003 | 53 | 376 | 6–17 | 51% | 10 weeks | K-SADS-PL, DSM-IV | CDRS-R, CGI, CGAS, MASC, PQ-LES-Q | Wagner et al., 2003 |
| Citalopram 2004 | 21 | 174 | 7–17 | 53% | 8 weeks | K-SADS-PL, DSM-IV | CDRS-R, CGI | Wagner et al., 2004 |
| Venlafaxine 1997 | 1 | 40 | 8–18 | 63% | 6 weeks | HAM-D, CDRS, CBCL, CDI | HAM-D, CDRS, CBCL, CDI | Mandoki et al., 1997 |
| Paroxetine (Study#701) | 41 | 203 (ITT) | 7–17 | 47% | 8 weeks | K-SADS-PL, DSM-IV | CDRS-R, CGI, CGAS, KADS | Emslie et al., 2004b |
| Paroxetine (Study#377) | 33 | 275 (ITT) | 13–18 | 67% | 12 weeks | DSM-IV (Unknown diagnostic measure) | MADRS, K-SADS-L depression, CGI, CGAS | GSK website*; MHRA Report (Dec. 2003) |
| Nefazodone | 15 | 195 | 12–17 | 59% | 8 weeks | K-SADS, DSM-IV | CDRS-R, HAM-D, CGI, CGAS | Emslie et al., 2002b |
| Venlafaxine (Study#382) | 16 h(2 w/ 0) | 161 (ITT) | 7–17 | 50% (n = 165) | 8 weeks | K-SADS-PL, DSM-IV | CDRS-R, HAM-D, MADRS, CGI | Emslie et al., 2004 |
| Venlafaxine (Study#394) | 37 | 193 (ITT) | 7–17 | 42% (n = 196) | 8 weeks | K-SADS-PL, DSM-IV | CDRS-R, HAM-D, MADRS, CGI | Emslie et al., 2004 |
| Citalopram | UNK | 233 | 13–18 | UNK | 12 weeks | K-SADS-P | K-SADS-P | MHRA Report (Dec. 2003) |
| Mirtazapine | 17 | 126 | 7–17 | 51% | 8 weeks | K-SADS-PL, DSM-IV | CDRS-R, CGI, CGAS, HAM-D21, SCARED, Conner's Global Index | Organon data on File |
| Mirtazapine | 17 | 124 | 7–17 | 53% | 8 weeks | K-SADS-PL, DSM-IV | CDRS-R, CGI, CGAS, HAM-D21, SCARED, Conner's Global Index | Organon data on File |
| Nefazodone | UNK | UNK | UNK | UNK | UNK | UNK | UNK | Emslie et al., 2002b |

*GSK website.

In each trial, where efficacy data was available, we examined the following: 1) study population (including recruitment and inclusion/exclusion criteria), 2) site selection, 3) drug and dosages, and 4) primary and secondary outcomes. Adverse events were reported in all of the trials. However, due to the variability in the published and unpublished reports, not all data on adverse events were available for review. We examined the following safety issues in each of the studies where information was available: 1) rates of physical side effects, 2) rates of discontinuation due to adverse events, 3) rates of suicide-related events, 4) rates of hostility and/or behavioral activation, 5) rates of switching to mania, 6) method of elicitation of side effects/adverse events, and 7) reporting of adverse events and serious adverse events (SAEs) in both treatment and placebo groups.



Data from published trials, trials presented at scientific meetings, and unpublished clinical trials of SSRIs and SNRIs were reviewed. Table 2 reports on some of the efficacy outcomes for each of the trials. In each case, the results based on the pre-defined primary outcome variable are provided. In addition, most studies report response outcome based on the Clinical Global Improvement (CGI-I), which is defined as ‘much’ or ‘very much’ improved (a rating of a 1 or 2). The table also reports the change scores on the primary continuous measure for each trial. In some cases, the additional outcomes provided in the table (e.g., CGI-Improvement or change score) are the same as the primary outcome. Finally, the table provides remission rates (e.g., ‘well’ or symptom-free) for the studies.
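The response and remission definitions used across these trials can be made concrete with a minimal sketch. The function names and the example scores are hypothetical and are not taken from any trial data set. The thresholds follow the definitions reported in Table 2: a CGI-I rating of 1 or 2, a percentage reduction in CDRS-R (optionally computed after subtracting 17 points, as in the sertraline analysis), and CDRS-R ≤28 for remission.

```python
def cgi_responder(cgi_improvement: int) -> bool:
    """Responder if CGI-I is 1 ('very much improved') or 2 ('much improved')."""
    return cgi_improvement in (1, 2)


def percent_reduction(baseline: float, endpoint: float, floor: float = 0.0) -> float:
    """Percentage reduction from baseline, after subtracting an optional floor.

    The floor (for example 17 points for the adjusted CDRS-R score) is an
    illustrative parameter; with floor=0 the raw score is used.
    """
    adjusted_baseline = baseline - floor
    if adjusted_baseline <= 0:
        raise ValueError("baseline must exceed the floor")
    return 100.0 * (baseline - endpoint) / adjusted_baseline


def cdrs_responder(baseline: float, endpoint: float,
                   threshold_pct: float, floor: float = 0.0) -> bool:
    """Responder if the percentage reduction meets the chosen threshold."""
    return percent_reduction(baseline, endpoint, floor) >= threshold_pct


def cdrs_remitter(endpoint: float) -> bool:
    """Remission defined as an endpoint CDRS-R total score of 28 or less."""
    return endpoint <= 28


# Hypothetical subject: CDRS-R falls from 60 at baseline to 40 at endpoint.
baseline, endpoint = 60, 40
print(cdrs_responder(baseline, endpoint, 30))            # 30% reduction, raw score
print(cdrs_responder(baseline, endpoint, 40))            # 40% reduction, raw score
print(cdrs_responder(baseline, endpoint, 40, floor=17))  # 40% reduction, adjusted score
print(cdrs_remitter(endpoint))                           # remission (CDRS-R <= 28)
```

In this hypothetical case the same subject counts as a responder under the 30% criterion (a raw reduction of about 33%) and under the adjusted 40% criterion (about 47%), but not under a raw 40% criterion or the remission cut-off. This is one reason response rates are hard to compare across trials that define response differently.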

Table 2.  Efficacy of antidepressant trials
| Individual study | N | Ages | Primary outcome measure(s) (Drug vs. placebo) | Clinical global improvement (CGI) (Drug vs. placebo) | Change scores (Drug vs. placebo) | Remission (Drug vs. placebo) |
|---|---|---|---|---|---|---|
| Fluoxetine 1997 (Emslie et al., 1997) | 96 (48 flx, 48 pb) | 7–17 | CGI: 56% vs. 33% (p = .02); CDRS-R: −20.1 vs. −10.5 (p = .001) | 56% vs. 33% (p = .02) | CDRS-R: −20.1 vs. −10.5 (p = .001) | CDRS-R ≤28: 31% vs. 23% |
| Fluoxetine 2002 (Emslie et al., 2002) | 219 (109 flx, 110 pb) | 8–17 | Response, as defined by 30% reduction in CDRS-R: 65% vs. 53% (p = .093) | 52.3% vs. 36.8% (p = .028) | CDRS-R: −22.0 vs. −14.9 (p = .001) | CDRS-R ≤28: 41.3% vs. 19.8% (p = .01) |
| Fluoxetine TADS (March et al., 2004) | 221 (109 flx, 112 pb) | 12–17 | CDRS-R (end score): 36.30 ± 8.18 vs. 41.77 ± 7.99; CGI-I: 61% vs. 35% | 61% vs. 35% | CDRS-R: −22.64 vs. −19.41 | Not yet available |
| Paroxetine 2001 (Keller et al., 2001) | 275 (93 par, 95 imipramine, 87 pb) | 12–18 | ≤8 and/or ≥50% reduction on HAM-D from baseline to endpoint via LOCF: 66.7% par vs. 58.5% imipramine vs. 55.2% pb (p = .11 par vs. pb); Change from baseline in HAM-D total score via LOCF: −10.7 par, −8.9 imp, −9.1 pb (p = .13 par vs. pb) | 66% vs. 52% vs. 48% (p = .02 par vs. pb) | HAM-D: −10.7 par vs. −8.9 imp vs. −9.1 pb (p = .13 par vs. pb) | HAM-D ≤8: 63.3% vs. 50.0% vs. 46.0% (p = .02 par vs. pb) |
| Sertraline 2003 (Wagner et al., 2003) | 376 (189 sert, 187 pb) | 6–17 | Response, as defined by 40% decrease on CDRS-R (adjusted CDRS-R score, 17 points subtracted) via LOCF: 69% vs. 59% (p = .05) | 63% vs. 53% (p = .05) | CDRS-R: −22.84 vs. −20.19 (p = .007) | Not available |
| Citalopram 2004 (Wagner et al., 2004) | 174 (89 cit, 85 pb) | 7–17 | Response, as defined by ≤28 on CDRS-R via LOCF: 36% vs. 24% (p = .05) | 47% vs. 45% (NS) | CDRS-R: −21.7 vs. −16.5 (p = .038) (taken from the MHRA report, December 2003) | CDRS-R ≤28: 36% vs. 24% (p = .05) |
| Paroxetine Study#377 (Emslie et al., 2004b) | 275 ITT (182 par, 93 pb) | 13–18 | Response, as defined by ≥50% reduction in MADRS via LOCF: 60.5% vs. 58.2% (p = .702); Mean change in K-SADS-L depression subscale score: −9.33 vs. −8.92 (p = .616) | 69.2% vs. 57.3% | MADRS: −13.60 vs. −12.80 (p = .520) | Not available |
| Paroxetine Study#701 (GSK website) | 203 ITT (101 par, 102 pb) | 7–17 | Change from baseline in CDRS-R total score via LOCF: −22.6 vs. −23.4 (p = .684) | 49% vs. 46% (p = .563) | CDRS-R: −22.6 vs. −23.4 (p = .684) | Not available |
| Nefazodone (Emslie et al., 2002a) | 195 (99 nef, 96 pb) | 12–17 | CDRS-R at Week 8 via LOCF: Significant (p = .03) | 65% vs. 46% (p = .005) | CDRS-R: −26.5 vs. −22.5 (p = .055) | Not available |
| Venlafaxine Study#382 (Emslie et al., 2004) | 161 ITT (78 ven, 83 pb) | 8–17 | CDRS-R at Week 8 via LOCF: −18.1 vs. −16.1 (p = .338) | NS | CDRS-R: −18.1 vs. −16.1 (p = .338) | Not available |
| Venlafaxine Study#394 (Emslie et al., 2004) | 193 ITT (101 ven, 92 pb) | 8–17 | CDRS-R at Week 8 via LOCF: −24.3 vs. −22.6 (p = .386) | NS | CDRS-R: −24.3 vs. −22.6 (p = .386) | Not available |
| Citalopram (MHRA Report, Dec. 2003) | 233 ITT | 13–18 | K-SADS-P total score over time: −12.4 vs. −12.7 (no significance reported by MHRA report, December 2003) | Not available | K-SADS-P: −12.4 vs. −12.7 | Not available |
| Mirtazapine (Data on File, Organon) | 126 ITT | 7–17 | CDRS-R total score: 35.1 vs. 37.2 (p = .421) | 59.8% vs. 56.8% (p = .75) | HAM-D21: −12.22 mir vs. −11.07 pb (p = .419) | Not available |
| Mirtazapine (Data on File, Organon) | 124 ITT | 7–17 | CDRS-R total score: 35.4 vs. 38.8 (p = .19) | 53.7% vs. 41.5% (p = .20) | HAM-D21: −11.82 mir vs. −9.76 pb (p = .107) | Not available |
| Nefazodone (noted in Emslie et al., 2002b) | UNK | 7–17 | NS | NS | NS | Not available |

Fluoxetine.  Three large double-blind, placebo-controlled trials have been conducted with fluoxetine (Emslie et al., 1997, 2002a; March et al., 2004). The first was a single-site trial of 96 outpatients, age 7–17 years, with Major Depressive Disorder (MDD). Following a 2-week evaluation and 1-week single-blind placebo run-in, subjects were randomized to fluoxetine, 20 mg/day (n = 48) or placebo (n = 48) for 8 weeks. Primary outcome measures were global improvement on the Clinical Global Impression Scale (CGI-I) and change in severity based on the Children's Depression Rating Scale-Revised (CDRS-R). Based on CGI-Improvement of 1 or 2 (very much or much improved), 56% of subjects receiving fluoxetine and 33% of subjects receiving placebo responded to treatment at exit from the study (p = .02). Weekly CDRS-R scores were also significantly different between the two groups by week 5 (p = .03) and continuing through week 8 (fluoxetine 38.4 ± 14.8 versus placebo 47.1 ± 17.0; p < .008). Change in CDRS-R score slope was also significantly different between groups, with the fluoxetine group improving 2.75 U per week, compared to only 1.27 U in the placebo group (p = .04). No significant differences were found between the two groups on a general psychiatric scale (Brief Psychiatric Rating Scale for Children), the Children's Global Assessment Scale (CGAS), or depression self-reports such as the Beck Depression Inventory (BDI) and the Weinberg Screening Affective Scale (WSAS).

In a multi-site study replicating Emslie et al. (1997), 219 children and adolescents with MDD were randomized to fluoxetine (n = 109) or placebo (n = 110). As in the original study, subjects underwent a 2-week evaluation phase, followed by a 1-week placebo run-in. Subjects randomized to fluoxetine received 10 mg per day for the 1st week, and then 20 mg per day for 8 weeks, for a total of 9 weeks of acute treatment. The difference on the prospectively defined response criterion, a 30% reduction in CDRS-R, did not reach significance (p = .093). However, multiple other outcome variables did significantly favor fluoxetine over placebo, including percentage decreases in CDRS-R symptoms of 20%, 40%, 50%, 60%, and 70%. In addition, the mean CDRS-R score at endpoint was significantly lower in the fluoxetine group (35.1 ± 13.5) than in the placebo group (40.2 ± 13.5; p < .001). Rate of improvement based on the mean change in CDRS-R was also significant at week 1 and continued for the remainder of the study (p < .05). Based on CGI-Improvement of 1 or 2, 52.3% of fluoxetine-treated subjects were considered responders compared with 36.8% of placebo-treated subjects (p = .028). CGI severity and MADRS scores were also significantly better for fluoxetine-treated patients. As in the previous fluoxetine study, there were no significant differences in global assessment of functioning (GAF) scores or on the self-report measures (BDI or CDI).

A more recent multi-site trial, TADS (Treatment for Adolescent Depression Study) (March et al., 2004), compared outcomes in subjects treated with 12 weeks of fluoxetine (10 to 40 mg/day), cognitive behavioral therapy (CBT), combination treatment of fluoxetine and CBT, and placebo in 439 patients between the ages of 12 and 17 years. Placebo and fluoxetine alone were administered double-blind, while CBT alone and CBT with fluoxetine were administered single-blind (the independent evaluator rating the outcome measures was blind to treatment conditions). This study confirmed the efficacy of fluoxetine both alone and in combination with CBT for treating adolescent depression. Rates of response, based on CGI-I of 1 or 2, were 71.0% for fluoxetine with CBT (95% CI, 62–80%); 60.6% for fluoxetine alone (95% CI, 51–70%); 43.2% for CBT alone (95% CI, 34–52%); and 34.8% for placebo (95% CI, 26–44%).
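As a rough illustration of how confidence intervals of this width arise from arm sizes of about 110 subjects, the sketch below computes a normal-approximation (Wald) 95% interval for the fluoxetine-alone and placebo response rates. The arm sizes (109 and 112) are taken from Table 2. The choice of the Wald method is an assumption made for illustration, as the interval method used by the trial authors is not stated in this review.

```python
import math


def wald_ci(rate: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion.

    rate is the observed proportion (0 to 1) and n the number of subjects.
    This is an illustrative method, not necessarily the one used in TADS.
    """
    se = math.sqrt(rate * (1.0 - rate) / n)
    return rate - z * se, rate + z * se


# Response rates from the text; arm sizes from Table 2.
arms = [("fluoxetine alone", 0.606, 109), ("placebo", 0.348, 112)]

for label, rate, n in arms:
    low, high = wald_ci(rate, n)
    print(f"{label}: {100 * rate:.1f}% (95% CI, {100 * low:.0f} to {100 * high:.0f}%)")
```

Under these assumptions the intervals come out at roughly 51 to 70% for fluoxetine alone and 26 to 44% for placebo, in line with the reported values. The two intervals do not overlap, which is consistent with the reported superiority of fluoxetine over placebo on this measure.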

Finally, an earlier study by Simeon et al. studied 40 patients aged 13 to 18 years in a placebo-controlled double-blind study of fluoxetine. Fluoxetine was not statistically superior to placebo on any of the outcome measures. However, the results of this study are difficult to explain due to the small sample size and incompletely described methodology (Simeon et al., 1990).

Thus, in these studies of fluoxetine most of the outcome measures were positive, though self-report measures did not differ significantly between groups. However, self-reports, even at baseline, showed wide variability, which may account for the difficulty in interpreting them at endpoint. These trials differ from the earlier controlled trial by Simeon et al. in 40 subjects, in which the fluoxetine group showed more improvement than the placebo group but the differences were not significant. While two-thirds of that sample (on both fluoxetine and placebo) showed marked or moderate clinical global improvement, the study was small and its methodology is not clearly explained, making the results difficult to interpret.

Paroxetine.  Three double-blind, placebo-controlled trials have been conducted with paroxetine. The only published report of these trials involved a study of 275 adolescents (age 12–18) with MDD at twelve centers across the US and Canada (Keller et al., 2001). Adolescents were randomized to paroxetine, imipramine, or placebo for 8 weeks. Dosing of paroxetine was initiated at 20 mg/day, with optional increase to 30–40 mg/day after at least 4 weeks of treatment. Imipramine was initiated at 50 mg/day and gradually increased to 250–300 mg/day after week 4. The two primary outcome measures were 1) HAM-D ≤8, or 50% decrease from baseline, and 2) change in HAM-D total score. On the first primary outcome (HAM-D ≤8 or 50% decrease), no statistical difference was found between paroxetine, imipramine, and placebo (66.7% vs. 58.5% vs. 55.2%, respectively). Change in the HAM-D total score was also not significant (−12.2 ± .88 vs. −10.6 ± .97 vs. −10.5 ± .88).

Secondary outcomes, however, suggested positive effects of paroxetine. With CGI-I of 1 or 2, response rates were 65.6% in the paroxetine group, compared to 52.1% for imipramine and 48.3% for placebo (p = .02). HAM-D and K-SADS-L depressed mood items were also significantly better in the paroxetine group (p = .001, p = .05, respectively). Finally, remission rates, defined as HAM-D ≤8 (part of the first primary outcome), were significantly higher in subjects treated with paroxetine (66%, p = .02) compared with imipramine (50%) and placebo (46%). Differences in mean CGI scores, K-SADS-L depression sub-scores, and HAM-D totals at endpoint were not statistically significant between groups, however.

Two other double-blind, placebo-controlled trials of paroxetine have been conducted, but the data remain unpublished. However, GSK has posted the study reports for these trials on its website.

In a study of adolescents only (study#377), conducted in 33 centers internationally, 275 (ITT) adolescents (age 13–18) were randomized to paroxetine, 20–40 mg/day (n = 182), or placebo (n = 93) for 12 weeks. No differences were found between paroxetine and placebo on the primary outcome variables (≥50% decrease from baseline MADRS and change from baseline in K-SADS-L depression subscale). At the Week 12 endpoint, 60.5% of paroxetine subjects and 58.2% of placebo patients had responded (based on 50% decrease in MADRS). Although not statistically significant, older adolescents (>16) tended to show greater improvements with paroxetine than placebo, while younger subjects (≤16) had high placebo response rates. No treatment differences were seen on other outcome variables (CGI Severity, CGI Improvement, BDI, and Mood and Feelings Questionnaire).

A final study of children and adolescents (study#701) was an 8-week trial conducted in 41 centers across the US and Canada. Subjects (N = 203) were children and adolescents (age 7–17) randomized to paroxetine 10–50 mg/day (n = 101) or placebo (n = 102). The primary outcome measure (change from baseline on CDRS-R total score) did not show paroxetine to be more efficacious than placebo. Secondary variables (CGI-Severity, CGI-Improvement, and GAF) were also negative. There was evidence of a treatment by age group interaction. Children (age 7–11) in the placebo group had greater improvements on the CDRS-R than children on paroxetine (p = .054). No differences were found between paroxetine and placebo in the adolescent group. Thus, only the published paroxetine study showed some positive outcomes, while the two unpublished studies did not.

Sertraline.  Two identical studies were conducted in 53 centers in the US, India, Canada, Costa Rica, and Mexico to compare sertraline and placebo. Wagner and colleagues report that the studies were combined a priori, providing results on 376 children and adolescents (age 6–17) with MDD. Subjects were randomized to sertraline (n = 189; 50–200 mg/day) or placebo (n = 187) for 10 weeks. Based on the primary outcome measure (change from baseline in CDRS-R), sertraline subjects showed significantly greater improvement (−22.84) than placebo subjects (−20.19; p = .007) over the course of the study. Those completing all 10 weeks of treatment showed even greater differences in decrease of CDRS-R scores (−30.24 versus −25.83, respectively; p = .001). Mean change in CGI-Severity scores showed similar differences (−1.99 versus −1.58, respectively; p = .001), and mean CGI-Improvement scores also favored sertraline (2.02 versus 2.3, respectively; p = .009). Although the mean CGI scores are statistically different between sertraline and placebo, the range of the scale is so narrow (1–7) that it is difficult to interpret the clinical significance of these findings.

Responders were defined using 2 measures. First, based on a 40% decrease in CDRS-R at end of study using the last observation carried forward, 69% of sertraline-treated subjects and 59% of placebo-treated subjects were considered responders (p = .05). Second, based on CGI-Improvement of ‘much’ or ‘very much’ improved, 63% of sertraline-treated subjects and 53% of placebo-treated subjects were considered responders (p = .05). Although the difference between active medicine and placebo was only 10 percentage points, it is statistically significant because of the large number of subjects in this trial. Other significant outcomes include several individual items on the CDRS-R: low self-esteem, excessive weeping, listless speech, and hypo-activity. Although numerically greater improvements were seen in the sertraline group on the Multidimensional Anxiety Scale for Children (MASC), Pediatric Quality of Life Enjoyment and Satisfaction Questionnaire (PQ-LES-Q), and Children's Global Assessment Scale (CGAS), these differences were not significant.
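The dependence of statistical significance on sample size can be illustrated with a minimal sketch using an unadjusted two-proportion z-test. This test is an assumption chosen for illustration and is not the analysis used in the trial; it also works from the rounded percentages reported above rather than from subject counts. The smaller trial of 48 subjects per arm is hypothetical.

```python
import math


def two_proportion_z_test(p1: float, n1: int, p2: float, n2: int) -> tuple[float, float]:
    """Unadjusted two-sided z-test for the difference between two proportions.

    Returns (z, p_value). Uses the pooled standard error under the null
    hypothesis of equal response rates.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# CGI-I response rates as reported for the pooled sertraline trials.
z_large, p_large = two_proportion_z_test(0.63, 189, 0.53, 187)

# Hypothetical smaller trial with the same rates but 48 subjects per arm.
z_small, p_small = two_proportion_z_test(0.63, 48, 0.53, 48)

print(f"n = 189/187: z = {z_large:.2f}, p = {p_large:.3f}")
print(f"n = 48/48:   z = {z_small:.2f}, p = {p_small:.3f}")
```

Under these assumptions the pooled sample yields a p-value close to .05, whereas the same 10-point difference in the hypothetical smaller trial is far from significant (p of roughly .3).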

Of interest is that greater differences were noted in adolescents than in children. For example, the CDRS-R mean change noted between drug and placebo in adolescents was −21.55 (sertraline) versus −18.20 (placebo; p = .01), while in children the difference was not significant (−24.05 versus −22.20, respectively, p = .19). However, because the study was not powered to detect differences between age groups, these findings only serve to highlight areas of interest for future research.

One area of debate surrounding the sertraline data is that the 2 studies were pooled for the analyses. Individually, both studies were negative on the primary outcome; that is, there was no significant difference between drug and placebo. The primary efficacy variable (change score in CDRS-R from baseline) was similar between drug and placebo for both individual studies (Study 1: −25.9 vs. −22.1, p = .084; Study 2: −28.8 vs. −25.6, p = .17). The response rates (based on 40% decrease in CDRS-R) in Study 1 were 62.4% vs. 56.8% for sertraline and placebo (p = .46); response rates for Study 2 were 75% vs. 60.4%, respectively (p = .033), which is significantly different (data on the individual studies were obtained from the MHRA report). The fact that the primary outcome variables on the individual trials were not positive could be due in part to the high placebo response rate and to the studies individually being underpowered to detect a difference. The authors suggest several possible reasons for the high placebo rate, including large numbers of participating sites, few subjects within each site, and multinational sites. Although no site-by-site differences were found, these factors could easily have affected the study. Nonetheless, the studies were combined a priori, and the fact remains that statistically significant improvements with sertraline over placebo were found on the majority of the outcome measures.

Citalopram.  Two studies have been conducted comparing citalopram and placebo. However, only one has been published (Wagner et al., 2004). This study was an 8-week multi-site study of 178 child and adolescent outpatients (ages 7–17) with Major Depressive Disorder, or MDD. Following a one-week placebo run-in, subjects were randomized to citalopram (20–40 mg/day) or placebo. Four citalopram subjects were lost to follow-up and did not receive study medication. Thus, the analysis included 89 subjects on citalopram and 85 subjects on placebo. The primary outcome was change in CDRS-R score from baseline to week 8 or early termination. Citalopram showed significant improvement over placebo as early as week 1 (F = 6.58, p < .05), and the difference persisted throughout the study. Although not reported in the paper, the change scores were −21.7 vs. −16.5 (p = .038; MHRA report). Additionally, more citalopram treated subjects than placebo treated subjects met the defined response criterion of CDRS-R ≤28 (36% versus 24%, respectively, p < .05). Although these response rates appear quite low, the defined response criterion used was full remission (e.g., ‘well’), and the reported rates of remission are similar to those found in other studies (Emslie et al., 1997, 2002a). No differences were found between citalopram and placebo on CGI Improvement of 1 or 2 (much or very much improved; 47% versus 45%, respectively) or mean CGI Severity (4.4 versus 4.3, respectively). Other outcomes were not reported. It would have been of interest to know other response rates (i.e., percent change on CDRS), because the CGI-Improvement response rates were negative and the pre-defined response criterion of CDRS-R ≤28 is more often considered a remission rate. Of note, it appears unusual that the CGI-Improvement rates were not significantly different, while the continuous measure (CDRS-R) did show a difference both on change scores and on cut-offs. Thus, it is unclear why these outcomes are so different within a single trial.

The second citalopram study was a 12-week trial of citalopram (10–40 mg/day) or placebo (MHRA website). Two hundred and thirty-three adolescents (age 13–18), both inpatients and outpatients, who received at least 1 dose of medicine were included in the analysis. Significant differences were not found on the primary outcome measure (change from baseline on K-SADS-P total score) or on any other outcome measure. Because this study has not been published and full data are not available for review, it is difficult to interpret the results. Nonetheless, two clear limitations of the study stand out. First, both inpatients and outpatients were included, which suggests increased illness severity in subjects; second, only 74 (60%) of subjects in the citalopram group and 79 (66%) in the placebo group completed the study. These issues alone cause some hesitation in evaluating the results.

Venlafaxine.  Two double-blind, controlled trials of venlafaxine have been conducted in children and adolescents (age 7–17) with MDD. The studies showed no differences between drug and placebo on any outcomes. In Study 1 (#382), the mean decrease on the CDRS-R was −18.1 for venlafaxine and −16.1 for placebo (p = .338); in Study 2 (#394), the mean decrease on the CDRS-R was −24.3 for venlafaxine and −22.6 for placebo (p = .386) (MHRA website; Emslie et al., 2004). Based on CGI-Improvement of 1 or 2, response rates for Study 1 were 50% for the venlafaxine group and 41% for the placebo group (p = .314). In Study 2, 68% of the venlafaxine subjects and 61% of the placebo subjects were considered responders based on CGI-Improvement (p = .295).

The data from these two studies were pooled and presented at the American Psychiatric Association in 2004 (Emslie et al., 2004). Three hundred and thirty-four subjects were randomized to venlafaxine 37.5–225 mg/day (n = 169, ITT) or placebo (n = 165, ITT) for 8 weeks in 52 centers across the US. No difference was found between venlafaxine and placebo on the primary outcome (CDRS-R endpoint). Analyses were also conducted based on age groups (7–11 and 12–17). In the child group, no differences were found; however, in the adolescent group, significant differences were found between drug and placebo on the CDRS-R change score (−24.4 vs. −19.9; p = .02), suggesting that adolescents may show more response to venlafaxine treatment than younger children. The individual studies were underpowered to evaluate age group differences, however.

As mentioned previously, one other underpowered study (n = 40) has compared venlafaxine to placebo, with no differences found between active treatment and placebo (Mandoki et al., 1997).

Nefazodone.  One double-blind, placebo-controlled trial comparing nefazodone to placebo has been reported. This trial involved 195 adolescents (aged 12–17) with MDD at 15 different sites in the US (Emslie et al., 2002a). Following a 2–4-week baseline phase, subjects were randomized to receive 8 weeks of either nefazodone (n = 99) or placebo (n = 96). Those randomized to nefazodone began at 100 mg/day, with a dosage increase of 100 mg/week to reach the desired amount of 300–400 mg/day. The primary outcome measure was a comparison of mean CDRS-R scores from baseline to week 8 between those taking nefazodone and those taking placebo. CDRS-R scores over the entire 8-week trial showed a statistically significant change in favor of nefazodone (p = .03). At week 7, nefazodone demonstrated a significant difference in CDRS-R scores over placebo (−26.7 vs. −21.3, respectively, p = .006). By week 8 there was a 4.0-point difference in favor of nefazodone, which just failed to reach significance (−26.5 vs. −22.5, respectively, p = .055).

Likewise, secondary outcomes suggested possible effectiveness of nefazodone over placebo. At week 8, nefazodone showed a higher CGI response rate (65% vs. 46%, respectively, p = .005), as well as better CGI Improvement (2.3 vs. 2.8, respectively, p = .012) and a greater change in CGI Severity (−1.7 vs. −1.3, respectively, p = .022). The change in HAM-D scores was also significantly greater in the nefazodone group (−10.0 vs. −8.2, respectively, p = .023), as were scores on the CGAS (17.2 vs. 13.0, respectively, p = .020). Although this study seemed to demonstrate that nefazodone is safe and effective in the acute treatment of adolescents with MDD, an unpublished second depression trial in pediatric patients (mentioned in Emslie et al., 2002b) found no significant differences between nefazodone and placebo.

Mirtazapine.  Two multi-center trials of children and adolescents (age 7–17) with MDD were conducted to compare mirtazapine and placebo. Participants were randomized to 8 weeks of mirtazapine 15–45 mg/day or placebo. In the first study, 126 youth (82 mirtazapine, 44 placebo) were randomized, and included in the efficacy (ITT group) and safety (AST group) analysis. No significant differences were found on any of the outcome variables. The primary outcome variable, CDRS-R total score at endpoint, was similar for the 2 groups: 35.1 ± 1.6 for mirtazapine and 37.2 ± 2.2 for placebo (p = .421). Other depression outcome measures were similar, including mean change in CGI-Severity (−1.71 ± .14 vs. −1.48 ± .19; p = .322), and change in HAMD-21 (−12.22 ± .84 vs. −11.07 ± 1.15; p = .419). Rates of response (defined as CGI-Improvement of 1 or 2) were also similar between active treatment and placebo (59.8% vs. 56.8%; p = .75).

In the second study, 133 youth (88 mirtazapine, 44 placebo) were randomized; however, only 132 were included in the safety analysis (AST group), and 124 were included in the efficacy analysis (ITT group). Similar to the initial study, none of the outcome measures were significantly different between the two groups. The CDRS-R total scores (primary outcome measure) were 35.4 ± 1.5 for mirtazapine, compared to 38.8 ± 2.1 for placebo (p = .19). Other depression outcomes were as follows: mean change in CGI Severity (−1.51 ± .14 vs. −1.15 ± .19; p = .127), and change in HAMD-21 (−11.82 ± .73 vs. −9.76 ± 1.04; p = .107). Of interest is that in this second study, 53.7% of those on mirtazapine and only 41.5% of those on placebo were considered responders based on CGI-Improvement (1 or 2). The numerical difference between the two groups was larger (over 12%) than the difference found in the sertraline trial (only 10%). However, due to the smaller sample size, the difference was not significant (p = .2).
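The contrast between a significant 10-point difference in the sertraline trial and a non-significant 12-point difference here can be illustrated with a simple pooled two-proportion z-test. The sketch below is illustrative only and is not the analysis used in either trial. The group sizes are assumptions: 189 and 187 for sertraline are taken from the safety data, and 82 and 41 for mirtazapine are values that reproduce the reported percentages.

```python
import math


def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Sertraline pooled analysis: 69% vs. 59% responders (40% CDRS-R decrease).
# ASSUMED group sizes: 189 (sertraline) and 187 (placebo).
sertraline_p = two_proportion_p_value(0.69, 189, 0.59, 187)

# Second mirtazapine study: 53.7% vs. 41.5% responders (CGI-Improvement).
# ASSUMED group sizes: 82 (mirtazapine) and 41 (placebo).
mirtazapine_p = two_proportion_p_value(0.537, 82, 0.415, 41)

print(f"sertraline  p = {sertraline_p:.3f}")
print(f"mirtazapine p = {mirtazapine_p:.3f}")
```

Under these assumptions, the smaller numerical difference in the larger trial falls below the .05 threshold, while the larger difference in the smaller trial does not, which is consistent with the reported p-values.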

Methodological issues regarding efficacy outcomes

In reviewing the efficacy data for antidepressants in adults and children, it does not appear the response in adolescents to SSRIs is different from adults. Regulatory agencies generally require 2 positive trials to declare a medication effective. With up to 50% of adult trials failing, it frequently takes up to 4–5 studies to achieve the required 2 positive trials. It is unlikely that more than 2 studies will ever be completed in the pediatric population with a single medication, as pharmaceutical sponsors are only required to conduct 2 pediatric trials. In fact, in disorders such as OCD only 1 pediatric trial was required for efficacy.
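The arithmetic behind the 4–5 study estimate can be sketched as follows. This is a simplified illustration that assumes every trial is independent and has the same 50% chance of being positive, which real development programs will not satisfy exactly; the function name is hypothetical.

```python
from math import comb


def prob_at_least_two_positive(n_trials, p_success=0.5):
    """Probability that at least 2 of n independent trials are positive."""
    p_zero = (1 - p_success) ** n_trials
    p_one = comb(n_trials, 1) * p_success * (1 - p_success) ** (n_trials - 1)
    return 1 - p_zero - p_one


# ASSUMPTION: each trial succeeds independently with probability 0.5.
for n in (2, 3, 4, 5):
    print(n, "trials:", round(prob_at_least_two_positive(n), 4))

# Expected number of trials needed to reach 2 positives
# (mean of a negative binomial distribution: successes / p).
expected_trials = 2 / 0.5
print("expected trials:", expected_trials)
```

With only 2 trials, both must be positive, which happens a quarter of the time under this assumption; this is the situation most pediatric programs are in.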

A second consideration is whether individual SSRIs differ from one another. Such a conclusion is difficult to reach, as no agents have been compared directly. Likewise, however, there is no evidence in adults that the medications are particularly different from each other. Relatedly, placebo response rates in the different clinical trials vary from 33% to 60%, yet we do not assume that the placebo pills have different levels of efficacy. Thus, it is more likely that the differences seen in outcomes across the pediatric studies are a result of methodological differences. Table 3 provides some of the methodological issues that likely contribute to the varying rates of response across trials. In terms of efficacy, these issues involve site selection, study population, study design and outcome measures. More specific details about the methodology of these studies can be found in a review of the study designs used by the FDA Advisory Committee to evaluate the studies in September 2004.

Table 3.  Factors influencing study outcome
Site selection
 Number of sites
 Number of subjects per site
 Experience of sites and investigators
Study population
 Inclusion/exclusion criteria: age, diagnosis, prior and current comorbid conditions, severity of illness, etc.
 Recruitment strategies
Study design
 Duration of assessment
 Use of placebo run-in period
 Duration of acute treatment
Outcome measures
 Adult scales vs. child-specific scales
 Categorical vs. continuous
 Cut-off scores vs. percent change

Site selection.  The first difference in trials is simply the number of sites selected for the study. Obviously, the more sites used, the more variability in the conduct of the study, which impacts outcome. Table 4 shows the difference between active medication and placebo in the 6 published trials of SSRIs (Emslie et al., 1997, 2002a; Keller et al., 2001; March et al., 2004; Wagner et al., 2003, 2004). The studies with the highest number of subjects per site had the largest drug–placebo differences, compared with the smaller differences in studies with more sites and fewer numbers of subjects per site. The single site fluoxetine trial had 96 subjects recruited within that site, while the average number of subjects per site in the sertraline trial was approximately 8 subjects (and in many cases, sites only recruited 1–2 subjects). Clearly there may be other factors that contributed, which will be reviewed next, but number of subjects per site may play an important role in outcome.

Table 4.  Percent differences between active medication and placebo
Medication   Reference             # of sites   Avg. # subjects per site   % Difference (active vs. placebo)*
Fluoxetine   Emslie et al., 1997   1            96                         23%
Fluoxetine   March et al., 2004    13           ≅34                        26%
Paroxetine   Keller et al., 2001   10           ≅28                        17.3%
Fluoxetine   Emslie et al., 2002   15           ≅15                        16.5%
Sertraline   Wagner et al., 2003   53           ≅7                         10%
Citalopram   Wagner et al., 2004   21           ≅8                         2%

*Based on CGI-Improvement of 1 or 2.
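The pattern in Table 4 can be summarized with a rank correlation between subjects per site and the drug–placebo difference. The sketch below uses the six rows of Table 4 and a hand-written Spearman coefficient; it is descriptive only, since six trials that also differ in drug, design and population cannot establish that site size drives outcome.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Values transcribed from Table 4, in row order.
subjects_per_site = [96, 34, 28, 15, 7, 8]
pct_difference = [23, 26, 17.3, 16.5, 10, 2]

rho = spearman_rho(subjects_per_site, pct_difference)
print(f"Spearman rho = {rho:.2f}")
```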

Another site difference is the experience of the site and investigators in 1) assessing and treating pediatric depression, and 2) utilizing rating scales for outcome measures. The authors of the sertraline study, for example, suggest that the high placebo response rate may have been due in part to the wide variability in sites, including experience of investigators and low numbers of subjects recruited at each site.

Study population.  The 2nd area of trial variability is the subject population, including inclusion/exclusion criteria and recruitment strategies. In each of the studies listed here, all subjects were required to meet DSM criteria for MDD with at least moderate severity (DSM-IV; American Psychiatric Press, 1994). However, other factors, such as age, severity of illness, co-morbid disorders, treatment history, and ongoing psychotropic and psychotherapeutic interventions varied across studies. For example, most of these trials included only outpatients, but the unpublished citalopram study allowed inpatients as well (MHRA website). It is likely that the severity of depression was substantially higher in this study. Another example is that some trials excluded subjects with certain comorbid disorders, while others did not; the published paroxetine study, for instance, excluded patients with eating disorders (Keller et al., 2001). Likewise, some trials allowed ongoing supportive psychotherapy, which was strictly prohibited in others.

Another variable factor across trials is age. In some trials, only adolescents were included, while in others children were also included. Some of the negative studies have shown a difference between active medication and placebo in the adolescents, but not in the children (Emslie et al., 2004a; Wagner et al., 2003). However, it should be noted that some of the trials that have included children have been positive (Emslie et al., 1997, 2002a; Wagner et al., 2003, 2004).

Source of recruitment may also be a factor explaining differences in trial outcomes. Because no studies have explicitly examined this, the question of variability of severity across different types of sites remains unanswered. Thus, are subjects seen in psychiatric offices different (i.e., more severe) than those seen in general practitioner offices? If so, are less severe cases more likely to be placebo responders? In a study of CBT with depressed teens, Brent and colleagues reported higher response rates in subjects coming in from advertisements than clinical referrals (Brent et al., 1998), raising the question of whether clinical referrals may be more severe cases than those coming in through advertisements.

Study design.  The 3rd area affecting outcomes of trials is the study design. Duration of assessment, use of placebo run-in, duration of acute treatment, and dosing all impact results. Rintelmann et al. reported that an extended evaluation period (i.e., 2 weeks) led to improvement in some subjects during the course of the evaluation (Rintelmann et al., 1996). Similarly, placebo run-ins are also useful in eliminating some of the placebo responders. The 2 fluoxetine studies (Emslie et al., 1997, 2002a), 1 paroxetine study (#377), 1 citalopram study (Wagner et al., 2004), and both venlafaxine studies used placebo run-in periods, while the others did not.

Duration of the acute trial must also be adequate. In most of the trials described here, the duration of acute treatment was 8 weeks (a few trials were longer). Of interest, however, is that most subjects do not reach clinical remission by 8 weeks of treatment (Emslie et al., 1997). Thus, if remission (e.g., CDRS-R ≤28) is the primary outcome, most subjects will not have received adequate treatment to achieve this level of remission.

Finally, dosing of medication in all trials has been based on extrapolation from adults, without the benefit of dose-finding studies. This appears to be less of an issue for adolescents, but in children it is possible that lower doses could have been effective with fewer side effects. An example of this is the sertraline trial. The discontinuation rate due to adverse events was substantially higher in children than in adolescents. The mean dose of sertraline in the study was 131 mg/d. The report does not specify the mean dose for children versus adolescents, but the authors comment that there was a question of whether the dosing was too high in the children, considering the high dropout rate in this subgroup (Wagner et al., 2003). Unfortunately, no dose-finding studies were conducted prior to these trials. It is possible that dosing may have impacted treatment outcomes, whether through inadequate dosing or through overdosing leading to premature discontinuation from side effects.

Outcome measures.  Research in pediatric depression has advanced significantly over the past 2 decades. However, rating scales for depression severity specific to this age group were limited initially. As such, many trials utilized rating scales designed for adults. In some cases, these scales may not have yielded adequate clinical information. For example, in children and adolescents, irritability may be the primary mood symptom, even in the absence of depressed mood itself (DSM-IV; American Psychiatric Press, 1994). Thus, scales that do not contain a rating for this symptom are missing an important aspect of pediatric depression. Currently, the Children's Depression Rating Scale-Revised (CDRS-R) (Poznanski et al., 1984) is considered the most appropriate and widely used severity measure for this population. However, even when an appropriate continuous measure is used, the way that measure is analyzed affects the outcome of the trial. Some trials have used cut-off scores (e.g., CDRS-R ≤28), while others use percent change from baseline. The clearest picture of the impact of outcome measure selection can be seen in the multi-site fluoxetine trial (Emslie et al., 2002a). As Table 5 (Emslie et al., 2002a) depicts, depending on the percent change selected, remarkably different response rates for placebo can be seen (21–69%). Even the 30% decrease in the CDRS-R was significantly different between drug and placebo, but the reported rate in the published manuscript was calculated incorrectly. The CDRS-R has a minimum score of 17, and the response rate calculation that corrects for the non-zero minimum should be used. Unfortunately, in this study, 30% change in CDRS-R score was the primary outcome variable and was reported as negative in the manuscript based on the incorrect calculation. As such, some scientists may consider this trial negative, despite the fact that the corrected calculation and multiple other outcome measures were positive.

Table 5.  Calculations of response based on CDRS-R varying percent change
Calculation method: [inline image not available]
Response criteria   Fluoxetine (N = 109)   Placebo (N = 101)   p-value
≥20%                95 (87.2)              69 (68.3)           .001
≥30%                86 (78.9)              62 (61.4)           .006
≥40%                77 (70.6)              56 (55.4)           .031
≥50%                63 (57.8)              41 (40.6)           .014
≥60%                56 (51.4)              29 (28.7)           .001
≥70%                40 (36.7)              21 (20.8)           .015

Source: Table derived from Emslie et al., 2002.
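The effect of correcting for the non-zero minimum of the CDRS-R can be shown with a small worked example. The sketch below assumes that the correction subtracts the scale minimum of 17 from the baseline score in the denominator; this is one plausible reading of the corrected calculation, not a formula taken from the trial report. The baseline and endpoint scores are hypothetical.

```python
CDRS_R_MINIMUM = 17  # lowest possible CDRS-R total score


def percent_decrease(baseline, endpoint, floor=0):
    """Percent decrease from baseline, relative to the achievable range.

    With floor=0 this is the uncorrected calculation; with
    floor=CDRS_R_MINIMUM the denominator is the largest decrease
    that is actually possible on the scale (ASSUMED correction).
    """
    return 100 * (baseline - endpoint) / (baseline - floor)


# HYPOTHETICAL subject: baseline score 57, endpoint score 37.
uncorrected = percent_decrease(57, 37)
corrected = percent_decrease(57, 37, floor=CDRS_R_MINIMUM)

print(f"uncorrected: {uncorrected:.1f}%")
print(f"corrected:   {corrected:.1f}%")
```

For this hypothetical subject, the same 20-point drop meets a 40% response criterion only under the corrected calculation, which illustrates how the choice of calculation can move subjects across a response threshold.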

Clearly, special attention must be paid to the selection of the primary outcome variable, as this is the focus of regulatory agencies. However, in reviewing the studies, a decision about whether a study is positive or negative cannot always be made strictly based on the primary outcome measure. All outcome information should be synthesized so that clinicians can make informed decisions about the integrity of the study, and thus the efficacy of the medication. Future studies should also consider having standard outcomes for each trial which would allow for easier comparison and interpretation. Of course, comparison would also be made easier if there is rapid disclosure and publication of trials data such as with a trials’ registry.


As with other pharmacological treatments, the indication for the use of antidepressants in children and adolescents is determined by weighing the risks and benefits of introducing an active treatment. Therefore, with the growing questions about the efficacy of antidepressants in children and adolescents for the treatment of depression, the safety profile of this class of medications has also become a topic of great interest for consumers, clinicians and researchers. Further fueling this interest is the possible link between the use of antidepressants and the increased risk for suicide-related events in children and adolescents with depression.

In this section, we review safety data from the same clinical trials as those reviewed in the efficacy section. However, given the extensive review of the safety data by the FDA, we considered the FDA data as the most reliable in this section. As with the efficacy data, there were only limited safety data from the Simeon et al. and Mandoki et al. studies. In the venlafaxine study conducted by Mandoki et al., nausea and increased appetite were more likely to occur in subjects treated with venlafaxine compared to placebo (Mandoki et al., 1997). There were no other reports of adverse events. Similarly, the Simeon study did not report differences in rates of adverse events between subjects on fluoxetine or placebo (Simeon et al., 1990). Therefore, these 2 studies are omitted from the majority of the sections below.

Serious adverse events (SAEs) and adverse events that lead to discontinuation of treatment make up only a small percentage of the total number of adverse events in both placebo and active treatment groups in the clinical trials. It is important to keep in mind that reports of adverse events leading to discontinuation and SAEs may or may not overlap. In other words, a subject who has an SAE may or may not discontinue from the study. The converse is also true: an adverse event that leads a subject to discontinue may not be an SAE. Similarly, suicidal events may or may not be classified as SAEs or reasons for discontinuation. For example, a subject who has suicidal ideation may be maintained in the study, so that event would not count as a discontinuation, and if it is not life threatening and does not lead to hospitalization, it is not an SAE. On the other hand, a serious attempt that requires hospitalization and removal from the study would count as both an SAE and a reason for discontinuation. In all of these cases, the determination of how to categorize an event is made at the site level by the investigators. Thus, the following sections provide details of the studies related to those categories (adverse events, adverse events leading to discontinuation, serious adverse events, and suicidality), and subjects may be listed in one or more areas. Finally, with the recent FDA warning, details regarding behavioral activation, hostility and switch to mania are addressed in a separate section.

Adverse events.  Adverse events are common in antidepressant trials with children and adolescents. Methods for eliciting adverse events, however, vary widely across trials, as there are currently no systematic methods to ascertain adverse events. In most studies, adverse events are collected from spontaneous reports from patients and families. Only the multi-site fluoxetine trial used an additional method to obtain adverse events (standardized Side Effects Checklist; Emslie et al., 2002a). Furthermore, the spontaneous events collected may or may not be related to the medication, and this determination is generally made by the investigator.

The most common adverse effects are generally physical rather than the more controversial and serious adverse events such as hostility, mania or suicide-related events. In addition, although adverse events are common in pediatric trials, adverse events are seen frequently in subjects on both active medication and placebo. Few side effects occur statistically more frequently in the active treatment groups. The most common physical adverse events seen with SSRIs and SNRIs in children and adolescents are similar to those reported in adults in the Physicians’ Desk Reference (2004), but may occur slightly less frequently than in adults.

In addition to physical adverse events, some patients may experience behavioral symptoms, such as aggression, hostility, mania, behavioral activation, and motor restlessness. Although not as common as physical symptoms, these adverse events are serious problems when they occur. Patients receiving antidepressant treatment should be monitored for such symptoms, and health care providers should adjust treatment interventions as needed. Most of the antidepressant trials reviewed did not list spontaneously reported adverse events (physical or behavioral) unless they met a certain threshold (e.g., ≥5% and at greater incidence than placebo). However, if the events led to discontinuation or were considered a serious adverse event, they were specifically listed in most cases. These will be reviewed in the Discontinuation and SAE sections of this review.

Adverse events leading to discontinuation.  One measure of general safety is the rate of discontinuation due to adverse events. In all of the placebo-controlled trials of SSRIs and in some of the SNRIs, rates of discontinuation due to adverse events were reported. Table 6 shows the rates of discontinuation due to adverse events in each of the placebo-controlled trials for both SSRIs and SNRIs. The rates of discontinuation are consistent across all studies ranging from 2 to 12%. In general, the rates of discontinuation were higher in the active medication groups versus the placebo groups with all medications, although the differences, where reported, were not statistically significant.

Table 6.  Rates of discontinuation due to adverse events and rates of SAEs
                          Discontinuation due to AE               SAEs
Study                     Drug (n)      Placebo (n)   p-value     Drug (n)      Placebo (n)   Source
Fluoxetine 1997           10.4% (5)     4.2% (2)      NR          2% (1)        2% (1)        Emslie et al., 1997
Fluoxetine 2002           4.6% (5)      8.2% (9)      .408        1% (1)        4% (4)        Emslie et al., 2002a
Fluoxetine 2004           UNK           UNK           UNK         UNK           UNK           March et al., 2004
Paroxetine 2001           9.7% (9)      6.9% (6)      NR          12% (11)      2% (2)        Keller et al., 2001
Sertraline 2003           9% (17)       2% (4)        NR          4% (7)        3% (6)        Wagner et al., 2003
Citalopram 2004           5.9% (5)      5.6% (5)      NS          0% (0)        NR            Wagner et al., 2004
Paroxetine (Study #701)   8.9% (9)      2% (2)        NR          5.9% (6)      1% (1)        Emslie et al., 2004b
Paroxetine (Study #377)   11.8% (22)    7% (7)        NR          11.8% (22)    6.5% (6)      GSK website
Nefazodone                3% (3)        3% (3)        NR          None          NR            Emslie et al., 2002b
Venlafaxine (combined)    10% (18)      3% (5)        NR          8% (14)       3% (5)        Emslie et al., 2004a
Citalopram                10.7% (13)    8% (9)        UNK         UNK           UNK           MHRA Report (Dec. 2003)
Nefazodone                UNK           UNK           UNK         UNK           UNK           Emslie et al., 2002b
Mirtazapine               5.3% (9?)     3.4% (3?)     UNK         1.2% (2)      1.1% (1)      Organon data on file

NS = not significant; NR = not reported; UNK = unknown.
Please note: Events may overlap (i.e., an AE may count both as an SAE and a discontinuation).


In the single site fluoxetine trial (Emslie et al., 1997), 2 subjects (4%) on placebo and 5 on fluoxetine (10%) discontinued the study due to adverse events. One subject on placebo developed mania and one had a suicide-related event. Three of the fluoxetine subjects developed manic symptoms, one developed a rash, and one had a suicide-related event. In the multi-site fluoxetine trial (Emslie et al., 2002a), rates of discontinuation were not statistically different between the active treatment group (4.6%) and the placebo group (8.2%), but were slightly higher in the placebo group. In the fluoxetine group, 1 each discontinued for rash, agitation, constipation, hyperkinesias and manic reaction (n = 5). In the placebo group, 1 each discontinued for rash, abdominal pain, alopecia, anxiety, dizziness, headache, kidney infection, aggressive behavior, and self-mutilatory behavior (n = 9). Three of the placebo subjects who discontinued treatment were also considered SAEs because the event required hospitalization (kidney infection, aggressive behavior, and self-mutilatory behavior). Data are unavailable for the rates of discontinuation due to adverse events for the TADS study.


In the published paroxetine study, discontinuation due to adverse events occurred in 9.7% (n = 9) for paroxetine and 6.9% (6) for placebo (Keller et al., 2001).

The GSK website also provides details about adverse events for the 2 unpublished trials. In the international adolescent-only trial (#377), discontinuation due to adverse events occurred in 11.8% (22/187) of subjects in the paroxetine group and 7% (7/99) of subjects on placebo. The difference between these rates is not statistically significant.

In the child and adolescent trial (#701), 9 subjects (8.9%) in the paroxetine group and 2 subjects (2%) in the placebo group discontinued due to adverse events. Four paroxetine subjects had worsened depression and 1 had a suicide attempt (‘emotional lability’). In addition, 1 discontinued due to agitation and irritability (‘nervousness’), 1 for hostility, and 2 for medical problems. In contrast, 2 subjects on placebo were discontinued: 1 for suicidality (‘emotional lability’) and 1 for mood swings (‘emotional lability’), insomnia, and restlessness (‘nervousness’). No statistically significant differences in discontinuation due to adverse events were found between the two groups.


Nine percent (17/189) of subjects in the sertraline group discontinued from the study due to adverse effects; 2% (4/187) of those on placebo discontinued due to adverse events (MHRA website). The majority (13/17) of subjects who discontinued from the sertraline arm due to adverse effects were children (Wagner et al., 2003). The commonest reason for discontinuation was psychiatric adverse events. These were suicidal thoughts (3 sertraline), suicide attempt (2 sertraline, 2 placebo), aggressive behavior (1 sertraline), agitation (3 sertraline), and hyperkinesias (2 sertraline) (Wagner et al., 2003; MHRA website).


In the published trial of citalopram, the rate of discontinuation due to adverse events was similar for citalopram and placebo (5.6% vs. 5.9%). Discontinuation due to agitation was reported in 2 subjects and discontinuation due to worsening depression was reported in 2 subjects. All 4 of these subjects were on citalopram (Wagner et al., 2004). In the unpublished trial reported in the MHRA report, there were 13/121 (10.7%) discontinuations in the treatment group and 9/112 (8.0%) in the placebo group. Five of the discontinuations in the treatment group were due to suicide-related events compared to 2 in the placebo group. As mentioned previously, this study included both inpatients and outpatients, so it is possible that the baseline severity of this population was greater. Until more data are available on the study design and other outcomes, it is difficult to interpret these results.


In the combined trials of venlafaxine, 18 (10%) on venlafaxine and 5 (3%) on placebo discontinued due to adverse events. Reasons for discontinuation from venlafaxine were physical symptoms (n = 8), psychiatric symptoms (n = 5: 2 hostility/aggression, 2 mania, 1 hallucination), and suicidal behavior (n = 5: 4 suicidal ideation, 1 attempt). Reasons for discontinuation from placebo were medical (n = 2) and psychiatric symptoms (n = 3: 2 nervousness/hostility, 1 mania). In addition, 5 subjects discontinued during the placebo run-in period for adverse events (1 with depression, 1 hostility, 1 suicidal ideation, 1 accidental injury, and 1 unexpected pregnancy) (Emslie, personal communication).


The discontinuation rate due to adverse events in the nefazodone study was 3% (n = 3) for both active medication and placebo. Specific details about the reasons for discontinuation are not reported in the abstract (Emslie et al., 2002b).


In the two placebo-controlled trials of mirtazapine, adverse events were reported as one combined sample size. The rate of discontinuation due to adverse events was similar for mirtazapine and placebo (5.3% vs. 3.4%, respectively).

Serious adverse events.  The reporting of SAEs has been consistent in clinical trials with SSRIs and SNRIs. Adverse events are considered to be SAEs if any of the following occurs:

  1. Death.
  2. Life threatening: The subject was at substantial risk of dying at the time of the adverse event.
  3. Hospitalization (initial or prolonged): Admission to the hospital or prolongation of a hospital stay results because of the adverse event.
  4. Disability: The adverse event resulted in a significant, persistent, or permanent change, impairment, damage or disruption in the patient's body function/structure, physical activities or quality of life.
  5. Congenital anomaly/birth defect: There are suspicions that exposure to a medical product prior to conception or during pregnancy resulted in an adverse outcome in the child.
  6. Event requires intervention to prevent permanent impairment or damage: A condition that requires medical or surgical intervention to preclude permanent impairment or damage to a subject.
  7. Other important medical events: Events that may not result in death, be life threatening, or require hospitalization may be considered a serious adverse drug experience if they may jeopardize the subject or may require medical or surgical intervention (e.g., failed suicide attempts).

The determination of whether an event is an SAE is made at the site level by the investigator, who also determines the relationship of the event to the medication. Psychiatric adverse events, such as worsening of mood or suicidal behavior, are often difficult to categorize, as it is unclear if the event is related to the medication or to the illness. Suicide-related events are reported as SAEs if they meet any of the 7 SAE criteria above; however, some suicidal behaviors would not qualify as an SAE. For example, some cases of suicidal gestures (e.g., cutting) or suicidal ideation may not be documented as an SAE. As mentioned, classification is further complicated because many of the psychiatric adverse events or SAEs may be a result of the illness rather than the medication. Placebo-controlled trials are needed to evaluate whether the incidence of such events is higher with active medication than with placebo. Table 6 shows the rates of SAEs in the included trials.


In the single-site fluoxetine trial, no SAEs were reported in the manuscript. However, 1 subject on fluoxetine and 1 subject on placebo were hospitalized due to suicide-related events, which would qualify as SAEs (Emslie, personal communication). In the multi-site trial, 5 SAEs were reported: 1 in a fluoxetine subject and 4 in placebo subjects. The fluoxetine subject was hospitalized for swollen tonsils; the 4 placebo subjects were hospitalized for kidney infection, aggressive behavior, abdominal pain/appendicitis, and self-mutilation, respectively. Data on the SAEs occurring in the TADS study are not yet available.


In the published report of paroxetine, 11 (12%) patients on paroxetine and 2 (2%) patients on placebo were reported to have SAEs (Keller et al., 2001). Adverse events in the study were defined as serious if they resulted in hospitalization, were associated with suicidal gestures, or were determined by the treating physician to be serious. Of the 11 subjects with SAEs on paroxetine, 1 had a severe headache and the other 10 had various psychiatric events, including worsening of depression (2), ‘emotional lability’ (e.g., suicidal ideation/gestures; 5), hostility or conduct problems (2), and euphoria (1). Seven of these patients were hospitalized. Only the subject with headache was considered to have a drug-related SAE. In the placebo group, 1 subject developed ‘emotional lability’ and 1 had a worsening of depression.

In the international adolescent trial (#377), 11.8% (22) of paroxetine and 6.8% (6) of placebo subjects had SAEs. Three of the paroxetine subjects had agitation and 6 had ‘emotional lability’. In contrast, none of the placebo subjects had agitation, 3 had ‘emotional lability’, and 1 had nervousness.

In the child and adolescent trial (#701), SAEs occurred in 6/101 (5.9%) subjects in the paroxetine group and 1/102 (1.0%) subject in the placebo group. In the paroxetine group, 3 subjects had worsening of depression, 1 subject had suicidal ideation (termed ‘emotional lability’), and 2 had a suicide attempt (termed ‘emotional lability’) with subsequent medical conditions (1 hypertension, 1 arm laceration). Four of these subjects were discontinued following the SAEs. In the placebo group, 1 subject had suicidality (termed ‘emotional lability’) and was discontinued from the study.


Seven subjects (4%) in the sertraline group had SAEs: 2 with a suicide attempt, 3 with suicidal ideation, 1 with aggressive behavior, and 1 requiring hospital admission for medical problems. In comparison, 6 (3%) subjects in the placebo group had SAEs: 2 with a suicide attempt and 4 with hospitalization due to medical conditions.


In the published trial of citalopram, no SAEs occurred in patients treated with citalopram (Wagner et al., 2004). The report does not mention if there were any SAEs in the placebo group. In the unpublished trial of citalopram, specific information about SAEs was not provided in the MHRA report (MHRA website). However, 15 (12.4%) subjects in the treatment group versus 9 (8%) in the placebo group were hospitalized during the study for psychiatric disorders.


In the combined trials (Emslie et al., 2004), 14 (8%) on venlafaxine and 5 (3%) on placebo had SAEs. Medical SAEs occurred in 4 subjects on venlafaxine. Other SAEs in the venlafaxine group were due to psychiatric symptoms (1 agitation/hostility, 1 mania, 2 worsening of depression, 1 hallucinations) and suicidal behaviors (4 suicidal ideation and 1 suicide attempt). Types of SAEs in the placebo group were similar: 1 medical, 3 psychiatric (2 hostility, 1 mania), and 1 suicidal behavior (self-injurious behavior).


No SAEs occurred in the nefazodone study for patients receiving nefazodone. The report did not include information about SAEs in the placebo group.


Only 3 SAEs were reported following randomization (2 on mirtazapine and 1 on placebo). Only 1 of these involved worsening of depression and suicidal ideation (mirtazapine), and in the opinion of the investigator, this event was unlikely due to the study drug. The other 2 SAEs involved medical events: 1 overdose on Depakote after being ‘dared’ (mirtazapine), which required emergency room treatment and was considered by the investigator to be unlikely related to study drug, and 1 increased temperature with possible viral infection (placebo), which required hospitalization and was considered to be unrelated to study drug in the opinion of the investigator.


From published and unpublished trials of SSRIs and SNRIs in children and adolescents, there are reports of increased suicide-related events in patients treated with medication versus placebo (Table 7). One difficulty in interpreting suicidal behaviors in depression studies is that suicidal behavior and suicidal ideation are symptoms of depression. Suicide is the 3rd leading cause of death in adolescents. Approximately 19% of teenagers (age 15–19) in the general population think about suicide and nearly 9% of teenagers make an actual suicide attempt (MMWR, 2002). These rates are even higher in patients receiving some type of care for depression: studies find that 35–50% of these youth have made, or will make, a suicide attempt. Thus, when an individual patient makes a suicide attempt during the course of treatment, it is simply not possible to know if the event was related to the medication or if it was a part of the illness itself. In many cases, patients have suicidal ideation prior to treatment, as part of the illness. Continuation or worsening of such symptoms could be related to medication, but could also simply reflect a lack of improvement. Only when there are sufficient numbers of patients in placebo-controlled trials will we be able to determine if medication treatment is associated with increased suicidal behavior.

Table 7.  Overall relative risk for suicide-related events in clinical trials of youth by drug

Drug          Relative risk (95% CI)
              MDD trials            All trials
Celexa        1.37 (.53, 3.50)      1.37 (.53, 3.50)
Luvox         No MDD trials         5.52 (.27, 112.55)
Paxil         2.15 (.71, 6.52)      2.65 (1.00, 7.02)
Prozac        1.53 (.74, 3.16)      1.52 (.75, 3.09)
Zoloft        2.16 (.48, 9.62)      1.48 (.42, 5.24)
Effexor XR    8.84 (1.12, 69.51)    4.97 (1.09, 22.72)
Remeron       1.58 (.06, 38.37)     1.58 (.06, 38.37)
Serzone       No events             No events
Wellbutrin    No MDD trials         No events
Total         1.66 (1.02, 2.68)     1.95 (1.28, 2.98)

  1. MDD = Major Depressive Disorder.
  2. 95% CI = 95% Confidence Interval.
  3. Source: Hammad, 2004; FDA website.

The other major difficulty in interpreting data on suicidal behavior in these studies is terminology. Suicidal behavior is not the same as suicidal ideation. Different studies use varying terminology. For example, in the paroxetine trial, ‘emotional lability’ included suicidal ideation and suicidal gestures. Furthermore, investigators within a single study may use different terminology. An event such as cutting may be labeled several different things: suicidal gesture, self-mutilatory behavior, suicide attempt, etc. Clearly, hospitalization due to an actual attempt is not equal to hospitalization due to increased ideation. Thus, analyzing the events is somewhat difficult. Furthermore, within events that are actual suicidal behaviors (i.e. actually doing something that could lead to death), there are wide degrees of difference. Self-harm with no ideation or intent is not the same as a suicidal gesture, which is not the same as an attempt. Therefore, the FDA recently completed a project to re-evaluate each of these types of events across the pediatric antidepressant RCTs, and re-categorize them consistently across all studies.

Table 7 presents the results of the FDA re-analyses of suicide-related events in youth from antidepressant trials for MDD and for all indications. The overall relative risk of suicide-related events (including suicidal ideation, self-harm, or attempts) was 1.95 (95% CI = 1.28–2.98) across all trials (all disorders). As seen in Table 7, with the exception of venlafaxine, subjects on individual antidepressants were not statistically more likely to experience suicide-related adverse events compared to placebo, possibly because of the low rate of such occurrences. However, the overall relative risk for suicide-related events was statistically significant.
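The reading of Table 7 given here rests on a conventional rule: a relative risk is called statistically significant at the 5% level when its 95% confidence interval excludes 1. The sketch below applies that rule to the MDD-trial column of Table 7. The values are copied from the table; the names `TABLE_7_MDD` and `excludes_one` are hypothetical, and drugs with no MDD trials or no events are omitted.

```python
# Relative risk and 95% CI bounds for MDD trials, copied from Table 7.
TABLE_7_MDD = {
    "Celexa": (1.37, 0.53, 3.50),
    "Paxil": (2.15, 0.71, 6.52),
    "Prozac": (1.53, 0.74, 3.16),
    "Zoloft": (2.16, 0.48, 9.62),
    "Effexor XR": (8.84, 1.12, 69.51),
    "Remeron": (1.58, 0.06, 38.37),
    "Total": (1.66, 1.02, 2.68),
}


def excludes_one(lower, upper):
    """True when the confidence interval lies entirely above or below 1."""
    return lower > 1.0 or upper < 1.0


# Rows whose interval excludes 1, in table order.
significant = [
    drug for drug, (_, lower, upper) in TABLE_7_MDD.items()
    if excludes_one(lower, upper)
]
print(significant)
```

Under this rule only the venlafaxine (Effexor XR) row and the pooled total qualify. The width of the venlafaxine interval also shows how imprecise an estimate based on few events can be.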

One interesting trend emerged from the FDA re-analyses. For subjects on venlafaxine, there was a large numeric difference between the increased risk for suicidal ideation and that for actual suicide attempts. Most of the risk ratio for venlafaxine was driven by suicidal ideation events, rather than by actual suicidal behavior events (all suicide-related events 8.84, 95% CI = 1.12–69.51; suicidal ideation 7.89, 95% CI = .99–62.59; suicidal behaviors 2.77, 95% CI = .11–67.10). This was also true for sertraline. However, in subjects treated with paroxetine, citalopram or fluoxetine, the risk was higher for suicidal behaviors compared to suicidal ideation, although none of these differences were significant between medication and placebo, either by event or for the total.

The overall risk difference between the drug and placebo groups was 2–3%. However, even after extensive sub-analyses, no predictive factors were identified which distinguished between subjects with treatment emergent suicide-related adverse events and those with suicide-related events related to their depressive disorder.
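A risk difference can be restated as the number of patients who would need to be treated for one additional event to be expected, which is the reciprocal of the risk difference (often called the number needed to harm). The sketch below applies that arithmetic to the 2–3% figure. The function name is hypothetical, and the calculation is an illustration of the quoted range, not a result reported by the FDA.

```python
def number_needed_to_harm(risk_difference):
    """Reciprocal of an absolute risk difference expressed as a proportion."""
    if risk_difference <= 0:
        raise ValueError("risk difference must be positive")
    return 1.0 / risk_difference


# The two ends of the 2-3% risk difference quoted in the text.
for rd in (0.02, 0.03):
    nnh = number_needed_to_harm(rd)
    print(f"risk difference {rd:.0%}: about {nnh:.0f} treated per additional event")
```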

Behavioral activation, hostility or switch to mania

Although behavioral activation adverse events and manic episodes are generally captured as adverse events leading to discontinuation or as SAEs, the recent FDA emphasis on these possible adverse events warrants reviewing the reports of them specifically in this section. Table 8 presents the results from the FDA report regarding the risk of developing these adverse events on medication compared to placebo. As for spontaneous reports of behavioral adverse events (excluding those that led to discontinuation or were classified as SAEs), only a few reports provide this information. For example, there were no spontaneous reports of behavioral adverse events mentioned for fluoxetine (Emslie et al., 1997; Emslie et al., 2002a), citalopram, or nefazodone. In contrast, in the TADS study (March et al., 2004), 6 (5%) subjects on fluoxetine (either alone or in combination with CBT) experienced agitation/hostility/irritability versus 4 (4%) subjects on placebo. In the published report of paroxetine, 7 (7.5%) paroxetine subjects (1 was reported as an SAE) and no placebo subjects were listed as having hostility (the report included any adverse events occurring at ≥5% on any of the treatments). For one of the unpublished study reports for paroxetine (#377), all adverse events were listed. In this study, 5 subjects on paroxetine had agitation or hostility. Three of the subjects with agitation were discontinued from the study. In study #701, only adverse events that occurred in ≥5% of subjects were reported. One subject had hostility and was withdrawn from the study. The sertraline report listed all spontaneous adverse events that occurred in ≥5% of the sertraline group and at least two times the incidence of placebo. Among the children, agitation was more frequently seen on sertraline than placebo (8.1% vs. 2.3%); no behavioral events were mentioned for adolescents. Finally, the presented data for venlafaxine reported 5 (3%) subjects on venlafaxine, compared with 2 (1%) placebo subjects, who had hostility adverse events.

Table 8.  Overall relative risk for treatment emergent agitation or hostility in MDD trials of youth by drug

Drug          Relative risk (95% CI)
Celexa        1.87 (.34, 10.13)
Paxil         7.69 (1.80, 32.99)
Prozac*       1.01 (.40, 2.55)
Zoloft        2.92 (.31, 27.83)
Effexor XR    2.86 (.78, 10.44)
Remeron       .52 (.03, 8.27)
Serzone       1.09 (.53, 2.25)
All drugs     1.79 (1.16, 2.76)

  1. MDD = Major Depressive Disorder; 95% CI = 95% Confidence Interval.
  2. *Note that TADS data are NOT added to Prozac.
  3. Source: Hammad, 2004; FDA website.

With respect to mania specifically, none of the subjects in the citalopram and nefazodone trials developed mania (Wagner et al., 2004; MHRA website). In the sertraline trial, there was no mention of the risk of switching to hypomania/mania (Wagner et al., 2003). In the fluoxetine trials, 1 subject (1%) on fluoxetine in the 2002 study, and 3 subjects (6%) on fluoxetine and 1 subject (2%) on placebo in the 1997 study, developed manic symptoms (Emslie et al., 1997; Emslie et al., 2002a; Emslie, personal communication). In the TADS study, mania occurred in 1% of both the fluoxetine alone and placebo groups. However, 3% of subjects on fluoxetine (either alone or in combination with CBT) developed hypomanic symptoms versus 1% of subjects on placebo. In the published study of paroxetine, 2 subjects on paroxetine developed euphoria (GSK website). In the unpublished paroxetine studies, there are only limited data on switching. One subject on placebo discontinued due to mood swings in study #701, and there were no reports of switching in study #377 (GSK website). Finally, 2 subjects (1%) developed manic symptoms during the venlafaxine trial. No information is available on mania for the mirtazapine trials.

Methodology issues regarding safety outcomes

With the increased focus on adverse events associated with antidepressants, researchers have also focused on how these events are evaluated and reported in clinical trials. There are several issues that have been raised, including the method of evaluation of adverse events and the method of reporting adverse events in these trials.

As mentioned above, the process of evaluating adverse events in the placebo-controlled trials has been variable and inconsistent. The different methods used to evaluate adverse events in these trials may account for some of the differences in the rates of adverse events. The different methods used included 1) standardized side effect scales, 2) required serious adverse events reporting, and 3) general inquiry or the dependence on spontaneous reporting by patients. Although the majority of the studies simply depended on the spontaneous reporting of side effects by patients on general inquiry, one of the studies included the use of a standardized checklist to elicit adverse events (Emslie et al., 2002a). Clearly, the latter method would increase the number of adverse effects reported in these trials and may account for the lack of difference between the treatment and placebo groups in that trial (Rabkin & Markowitz, 1986). In fact, the lack of standardized and valid tools for assessing the safety of medications in youth is an issue that remains controversial (Greenhill et al., 2003).

Aside from the method of elicitation of adverse events, there are also issues with the classification and description of adverse events once they are elicited. This issue is of special concern in the reporting of suicide-related events. In clinical trials, it is unclear how suicide-related events are classified or defined. For example, one concern that has been previously mentioned is the use of other terms such as ‘emotional lability’ to describe adverse events that would be classified as suicide-related events in other studies. Furthermore, there is a need for better descriptions of the severity of adverse events, particularly suicide-related events. A frequent example is the intent or lethality of a suicide-related event. Better descriptions of the severity of adverse events will allow users of the literature to better understand the potential risk of adverse events with a particular treatment. Therefore, clearer and more consistent guidelines are needed to help researchers better classify and define adverse events.

There are several methodological issues that arise from the reporting of adverse events in these trials. First, adverse events have been reported in isolation from the subjects’ previous behaviors or history. Since prior behaviors are sometimes the best predictor of future behaviors, it is crucial for studies to report both the adverse event as well as the known history of similar behaviors in these subjects. This is particularly important in the reporting of suicide-related events which are generally highly associated with previous suicidal behaviors (Shaffer & Waslick, 2002). From our review, only 1 trial had collected and reported this important information (Emslie et al., 2002a).

Second, as discussed above, the selection criteria for the publication of adverse events are also quite variable. Some studies only reported adverse events in the treatment groups, while some studies reported adverse events only if there were significant differences between the active treatment and the placebo groups. Finally, some studies only reported adverse events if they occurred in more than 5% of the subjects and with an incidence in the treatment group of at least twice that compared to the placebo group (Wagner et al., 2003; Wagner et al., 2004). The inconsistency in reporting has made it difficult to compare adverse events across studies and for users of the literature to have a full appreciation of the range of adverse events that may occur in patients treated with either active medication or placebo in different studies.
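To make the effect of such a selection rule concrete, the sketch below encodes the two-part threshold (incidence of at least 5% on drug and at least twice the placebo incidence) as a filter. The function name and default parameters are hypothetical. The first example uses the agitation rates in children from the sertraline report; the other two events are invented purely for illustration.

```python
def meets_reporting_rule(drug_pct, placebo_pct, min_pct=5.0, min_ratio=2.0):
    """True if an adverse event would be listed under the two-part rule.

    Assumption: the event is listed only when its incidence on drug is at
    least min_pct percent AND at least min_ratio times the placebo incidence.
    """
    return drug_pct >= min_pct and drug_pct >= min_ratio * placebo_pct


# Agitation in children in the sertraline report: 8.1% vs. 2.3%.
print(meets_reporting_rule(8.1, 2.3))

# Hypothetical events (not from any trial), both more frequent on drug.
print(meets_reporting_rule(6.0, 4.0))  # common, but less than twice placebo
print(meets_reporting_rule(3.0, 0.0))  # absent on placebo, but below 5%
```

Both hypothetical events would go unreported under this rule even though each is more frequent on drug than on placebo, which is why rates filtered this way cannot be compared with rates from trials that list all events.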


Thus, with the variability in trial methodology and the variation in the drug/placebo response rates within a single trial, clinicians need to be judicious in their interpretation of research data on pediatric antidepressant trials. Clearly, there is no definitive answer. Even for fluoxetine, which both the US and UK regulatory agencies have declared safe and effective for pediatric depression, the 2nd multi-site trial had a negative result on the primary outcome measure. Does this mean that the study was negative or that the medication is not effective? Not likely. In fact, most (including these agencies) consider this trial positive because of the resoundingly positive outcomes on so many of the other variables. Yet other trials in similar situations have been deemed negative.

There is no evidence from adult data that any of the SSRIs are different from one another. Thus, it is unlikely that these medications are any different in youth. The rates of placebo response have been remarkably different across trials, from 33% to 59%. Does that mean that there is different efficacy among placebos? Rather, the differences are likely more a result of methodological issues.

Finally, the controversies surrounding the efficacy of SSRIs in pediatric depressed patients derive in part from the fact that regulatory agencies require 2 positive trials for a medication to receive an indication. In recent years, up to 50% of adult trials have failed, and frequently 4 to 5 studies are conducted to achieve the required 2 positive trials. There is no likelihood that more than 2 studies will ever be completed in the pediatric population with a single medication. Therefore, it is unlikely that these medications will even be given a chance to be adequately evaluated. Thus, clinicians are obligated to make their own interpretations of the limited data available.

In terms of the safety of antidepressants, it is clear from the review that adverse events occur with all antidepressants. However, where data are available, it would appear that only a handful of these events occur significantly more frequently in the treatment subjects versus control subjects. Furthermore, if the established criteria for ‘treatment emergent’ adverse events are used, the number of reported adverse events would decrease, making it even more difficult to determine significant differences between the treatment and control groups.

Perhaps the most controversial issue concerns the rates of suicide-related events. There are several issues to consider when evaluating the currently available data, including reporting issues with the trials and the reasonableness of this possible association. First, suicide-related events are a common occurrence in the general population and even more common in youth with depression. Surveys of youth suggest that up to 19% had suicidal ideation and 9% actually attempted suicide in the previous 12 months (MMWR, 2002). This rate is even higher among youth with depression, with up to 35–50% attempting suicide. Second, experts have raised concerns about the standard of reporting for suicide-related events in clinical trials. For example, the lethality or intent of attempts was not considered in the reporting. Based on the re-classification of the events by the FDA, there is a 2–3% increased rate of suicide-related events (suicidal ideation or behavior) in youth taking an antidepressant. Finally, in contrast to these reports, epidemiological data show an association between increased use of antidepressants and decreased rates of completed suicides both in youth and adults (Olfson, Shaffer, Marcus, & Greenberg, 2003). Furthermore, little attention has been paid within the individual trials to improvement of suicidal behaviors with treatment. Therefore, it is difficult to draw conclusions at this time about the association between suicide-related events and SSRIs in children and adolescents from the clinical trial data and epidemiological data.


There are several limitations to this review. First, although there is some evidence that differences may exist between children and adolescents in terms of their responses to antidepressants (affecting both treatment outcomes and rates of adverse events), there were insufficient data to examine this issue in this review. It appears in some trials that the adolescents may have shown a positive response, but the child data were less robust (i.e., 1 paroxetine trial, venlafaxine, sertraline). Furthermore, some of these trials had an overall negative outcome, possibly due to the lack of improvement in the child subgroup. There are possible differences between pre- and post-pubertal depression, which may have implications for treatment. For example, prepubertal depression may be associated with more psychosocial stressors, neurocognitive deficits, and disruptive disorders than adolescent depression (Harrington, Fudge, Rutter, Pickles, & Hill, 1990; Jaffee et al., 2002). There is also the question of whether prepubertal children exposed to SSRIs may be at greater risk of developing mania (Martin et al., 2004).

Second, adherence was not measured in the trials. Non-compliance may play a significant role in the development of adverse events since most newer antidepressants have a short half-life in children, and this may lead to increased rates of adverse events in this age group. The reports provided no data on compliance in these trials.

Finally, due to the lack of availability of clinical trial data and the small number of certain adverse events (e.g., switch to mania), our analyses were limited. For example, there are insufficient data for us to calculate effect sizes for each of the included trials or to conduct statistical testing for certain rare adverse events. Furthermore, it is noteworthy that in the recent TADS study, which allowed increasing fluoxetine doses above 20 mg, there were more SAEs and suicidal events than in the 2 previous studies, which restricted doses to 20 mg. It is likely that dose had a significant impact not only on the adverse events occurring in the studies, but also on efficacy outcomes (e.g., children had high dropout rates due to AEs, which could have affected efficacy results).

Future recommendations

Implications for research

There are several recommendations for future research. First, suicide-related events need to be better characterized in research settings so that they can be compared across studies. This includes the intent and lethality of an event. This should also be generalized to other adverse events where the severity of the event needs to be evaluated. Second, previous history or behaviors are commonly highly correlated with current or future behaviors. Therefore, studies should report not only the adverse events but also the history of similar behaviors in subjects. Third, adverse events should be reported in a standard manner in all clinical trials so that rates can be compared across studies with some confidence. Finally, there is emerging evidence both in clinical trials and published case reports of 3 possible adverse effects from antidepressants: suicide-related events, hostility or behavioral side effects and precipitation of mania. Future studies should pay special attention to the possible emergence of these adverse effects.

Clinical implications

Several points must be considered by clinicians using SSRIs in youth with depression:

  1. Recent analyses of antidepressant trials have suggested a link between antidepressants and suicidal thinking and behavior in children and adolescents with depression or other psychiatric disorders. Suicidal behavior is a symptom of depression, so determining the cause of ongoing or increased suicidal behavior in youth who are depressed is somewhat difficult. However, antidepressants may be associated with worsening of depression or increased suicidality in some patients. Patients who are started on antidepressants should be closely monitored for clinical worsening, suicidality, or unusual changes in behavior.
  2. In addition to monitoring for the worsening of depression and suicidality, patients should also be monitored for agitation, irritability, and other behavioral changes (anxiety, hostility, impulsivity, akathisia, etc.), particularly during the initial few months of treatment and at times of dose changes (increases or decreases).
  3. Families and patients need to be fully informed about the possible risks associated with the use of antidepressants. However, they should also be informed about the risks of not being treated for a major depressive disorder (functional impairment, suicide risk) and, in the case of suicidality, the epidemiological findings of lowered suicide rates with increased rates of antidepressant use. Families should be advised to closely observe the patient for any changes in depression, suicidality, and other behavioral changes, and should communicate any concerns to the clinician immediately.
  4. Patients should be evaluated for risk and/or presence of bipolar disorder. It is possible that treating such patients with an antidepressant alone may increase the likelihood of precipitating a mixed or manic episode. In the event that clinicians suspect bipolar disorder (either initially or during a trial of medication), documentation of how treatment was subsequently adjusted should be made.
  5. The FDA has recommended that treatment monitoring visits be frequent during the initial weeks of antidepressant treatment (weekly for 4 weeks), slowly reducing the frequency of visits over time (every other week for the next month and then again after 3 months of treatment). In addition, patients and families should be informed that making medication adjustments without consulting the clinician can cause complications, and all medication dose adjustments should be made by the doctor.
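The visit frequency described in the last point can be laid out as a simple schedule. The sketch below is one possible reading of that recommendation, under stated assumptions: "the next month" is taken as weeks 5 to 8 (visits at weeks 6 and 8) and "after 3 months" as week 12. The function name is hypothetical and the start date is arbitrary.

```python
from datetime import date, timedelta


def monitoring_visits(start):
    """Visit dates for one reading of the recommended monitoring frequency.

    Assumptions: weekly visits in weeks 1-4, every other week in the next
    month (weeks 6 and 8), and one further visit at 3 months (week 12).
    """
    weeks = [1, 2, 3, 4] + [6, 8] + [12]
    return [start + timedelta(weeks=w) for w in weeks]


# Arbitrary example start date for the first dose.
start = date(2005, 1, 3)
visits = monitoring_visits(start)
for visit in visits:
    print(visit.isoformat())
```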