A naturalistic effectiveness study of maintenance therapies for the bipolar disorders

Abstract

Background: Treatment decision-making for individuals with bipolar disorder can be difficult. Recommendations from clinical practice guidelines can be affected by multiple methodological limitations, while pharmaco-epidemiological data suggest great variety in prescription practices across regions. Given these inconsistencies, this study aimed to provide an alternative perspective on the effectiveness of common bipolar disorder maintenance treatments through considering naturalistic data.

Methods: A total of 246 individuals with bipolar disorder (84 bipolar I [BP-I], 162 bipolar II [BP-II]) were recruited through clinics and/or websites. All were euthymic and had trialled at least one mood stabiliser. They completed an online survey containing questions on demographics, clinical variables, symptomatology, and the effectiveness/side effect profiles of any mood stabilisers (MSTs) or atypical antipsychotics (AAPs) that they had taken.

Results: Lithium and lamotrigine were the most commonly prescribed MSTs and the most effective at mood stabilisation. Lithium and lamotrigine appeared marginally more effective for BP-I and BP-II respectively; however, only the latter difference was statistically significant. Furthermore, lamotrigine had the more favourable side effect profile. Amongst the AAPs, quetiapine and olanzapine were the most commonly prescribed, but they were negligibly superior to other AAPs.

Conclusion: This study clearly established a preference for lamotrigine in the maintenance treatment of BP-II. While the literature consistently emphasises the primacy of lithium in bipolar disorder treatment, its side effect profile as observed in this study remains a concern. Future research considering moderators of treatment response and concomitant medications could help to identify further nuances to consider for treatment decision-making.

The management of the bipolar disorders is of great clinical importance given the significant health burden these conditions pose. 1 For maintenance treatments, pharmacotherapy with adjunctive psychotherapy is often considered to be optimal. 2 Pharmacotherapy is generally viewed as the central therapeutic component, with mood stabilisers (MSTs) such as lithium and anticonvulsant medications (i.e., lamotrigine, valproate, and carbamazepine) prioritised, and atypical antipsychotics (AAPs) such as quetiapine, olanzapine, and aripiprazole commonly used as alternate or additional mood stabilisers or for their antimanic properties. 3,4 Antidepressants can also be used as a primary treatment, though this is generally not recommended, largely because they can cause mood switching. 5
To identify optimal treatments for the bipolar disorders, clinical practice guidelines (CPGs) authored by professional organisations are a common reference point for clinicians, as they generally argue that an evidence-based approach underpins their development. Lithium is the most commonly recommended first-line treatment, 6 but recommendations can also differ according to diagnostic subtypes, as identified previously in a review of 11 CPGs. 3 Specifically, for bipolar I disorder (BP-I), all CPGs recommended an MST as a first-line treatment (most commonly lithium), with AAPs being a common adjunctive medication. For bipolar II disorder (BP-II), by contrast, the most commonly nominated medication was lamotrigine, followed by lithium and quetiapine.
However, while their evidence-based approaches should theoretically generate consensual recommendations, the recommendations of these 11 CPGs were more distinguished by their differences than by their similarities. 3 Such differences could reflect emphases on different nuances of bipolar disorder (e.g., severity, dominant polarity, recency of acute mania, diagnostic subtypes) or methodological nuances integral to developing the CPGs. One quality assessment study of CPGs also noted the existence of general issues relating to developmental rigour, practical applicability, conflicts of interest and involvement of necessary stakeholders, concluding that less than half of them could be recommended for use. 7 Finally, another key issue is that CPGs generally rely on efficacy data from randomised controlled trials (RCTs) and meta-analyses (which often rely on RCT evidence themselves 8). While often considered the 'gold standard', such studies often suffer limitations in terms of external validity, 9 such as the tendency for common inclusion/exclusion criteria to exclude those with comorbid conditions and concomitant medications. 10 As a result, individuals recruited for these studies often differ considerably from patients seen in clinical practice.
Thus, if we accept that clinicians make treatment decisions based on their experience of what works best, there is value in also considering naturalistic (i.e., 'real-world') effectiveness studies when evaluating any treatment alone or in comparison to other treatments. Pharmaco-epidemiological data on prescription rates for medications could potentially provide an indirect measure of naturalistic efficacy, but in practice these can vary quite widely across locations. For example, in one study directly comparing treatments for those with a bipolar disorder at two university medical centres in the USA and Italy, patients at the latter site were more commonly prescribed antipsychotics (for both subtypes), valproate or antidepressants if they had a BP-II diagnosis, and were less commonly prescribed lamotrigine if they had a BP-I diagnosis. 11 Another large international study identified further geographic idiosyncrasies, including lower rates of lithium usage in North America and higher rates of typical antipsychotics in Europe. 12
In short, CPGs can be inconsistent with each other, estimates of prescription rates can be too, and CPG recommendations are often not reflected in prescription patterns. 12 Such factors hinder our ability to draw firm conclusions on first-choice maintenance medications for managing the bipolar disorders. Naturalistic effectiveness data, provided either by clinicians or patients themselves, can potentially be highly informative. Aside from one recent study that defined effectiveness on the basis of hospital re-admission, 13 such data are rarely considered. Our current study therefore aimed to evaluate the comparative effectiveness and side effect profiles of candidate maintenance medications for managing the bipolar disorders as assessed by patients' self-reports. In so proceeding, we also sought to consider data in relation to the differing BP-I and BP-II sub-types as formalised by the DSM-5. 14 In addition to these broad objectives, we pursued a specific hypothesis detailed previously, namely, that lithium is superior to lamotrigine for managing BP-I, and lamotrigine is superior to lithium for managing BP-II. 15

Significant Outcomes
• Lithium and lamotrigine were judged by patients to be the two most effective mood stabiliser medications for maintenance treatments of bipolar disorder.
• Lamotrigine appears to be preferable for those with a bipolar II disorder, largely because of its more favourable side effect profile.

Limitations
• The study did not account for potential moderators or mediators of treatment response.
• All data were self-reported and were not externally validated.
• Recruitment avenues may not have been broad enough to account for geographic effects on prescription practices.

| MATERIALS AND METHODS
All procedures outlined below were approved by the University of New South Wales Human Research Ethics Committee (project ID HC200192).

| Participants
Participants were recruited through three primary methods. The first involved six clinicians (which included authors J. L., A. B., A. J., and G. P.) inviting eligible patients from their own private practices in Sydney, Australia to participate in the study. The second involved online advertisements posted to recruitment webpages, newsletters, and social media accounts affiliated with the Black Dog Institute and the former School of Psychiatry at the University of New South Wales. The third method involved a direct invitation posted in a large Facebook support group consisting of (mostly American) individuals with a bipolar disorder.
Participants were required to be aged 18 or older, be fluent in English, have a diagnosis of a bipolar I or II disorder, be currently euthymic, and be currently taking or have previously taken at least one of four common mood stabilisers (i.e., carbamazepine, lamotrigine, lithium, or valproate). This last inclusion criterion reflected the centrality of MSTs as maintenance treatments for the bipolar disorders. 3 No pre-specified target for the sample size was set due to a lack of comparable studies that might allow a power estimate. We ceased recruitment after data had been obtained for 246 participants (84 BP-I, 162 BP-II; see Supplementary Table 1 for a breakdown according to recruitment avenue and diagnoses), judging that if differential responses existed such a sample size would be sufficient to show trends or significant differences.

| Study components
Participants were provided a link to an online self-report questionnaire hosted by Qualtrics XM (Qualtrics, Provo, Utah, USA), which they could access and complete at their own convenience. The questionnaire consisted of three sections, the first two of which sought demographic and clinical information respectively. For the former, data were collected on the participant's age, sex, marital status, educational background, employment status, and ethnicity. For the latter, data were collected on their clinician's diagnosis (i.e., bipolar I or II), the ages at which subjects experienced their first depressive and first manic/hypomanic episode, their age when first diagnosed with a bipolar condition, the length of their longest and average depressive and manic/hypomanic episodes, the percentage of time before treatment spent depressed, manic/hypomanic, and euthymic, whether they had ever been hospitalised for depression and/or mania, whether they had any first- or second-degree relatives with a bipolar or unipolar mood disorder, and whether they had experienced psychotic features (i.e., delusions or hallucinations) when depressed or manic.
Participants also completed a 54-item checklist of representative manic symptoms (each scored as either present or absent). The items came from a larger item set of potential symptoms of mania/hypomania that were previously derived by an international taskforce of bipolar disorder experts. 16 The 54 items were then identified in an exploratory factor analysis as reflecting seven major constructs underlying manic/hypomanic symptomatology: sleep, disinhibition, distractibility, grandiosity, anger/frustration, speech and spending. 17
The third section asked questions about medication history. For all questions, participants were asked to reflect on the effectiveness and effects of each medication independently, irrespective of whether it was part of a mono- or poly-therapeutic treatment program. Participants were asked first how many medications they were taking or had taken for their bipolar disorder overall; however, we were particularly interested in the four MSTs noted previously along with nine AAPs (aripiprazole, asenapine, brexpiprazole, lurasidone, olanzapine, paliperidone, quetiapine, risperidone, and ziprasidone). Participants were therefore requested to affirm which of these 13 medications they were currently taking or had previously taken, and for each, they were asked how effective they judged the medication to be in improving depression, mania/hypomania, and their bipolar disorder overall (each was measured on a continuous scale indicating the judged percentage effectiveness, ranging from 0% to 100%). Participants were also asked whether they experienced side effects, and if so, how severe they were (on a 5-point Likert scale from "very low" to "very high"), and whether these side effects led to them ceasing the medication. Finally, participants were asked to list their current medications, and how long they had been taking each.

| Analyses
All analyses were conducted in R (version 4.2.1), and most examined differences between patients with BP-I and BP-II conditions. To this end, two sets of diagnostic strategies were employed: the bipolar subtype diagnosis provided to participants by their clinician (C1 and C2 for BP-I and BP-II respectively), and diagnostic classes determined only by the presence or absence of psychotic features during manic/hypomanic episodes (P1 and P2 respectively). According to DSM-5/DSM-5-TR criteria and other proposed modifications, 18 the presence of psychotic features indicates mania, and thus a BP-I diagnosis. This can allow for a potentially parsimonious approach to differentiating BP-I from BP-II. However, in our sample, 50.0% of clinically diagnosed BP-II participants (n = 81) reported experiencing psychotic features during their hypomanic episodes. The P1 vs P2 assignments were therefore considered to be a more flexible diagnostic allocation in highlighting the primacy of the presence/absence of psychotic features and (given the potential for clinician misdiagnosis) offered a rough proxy measure for the upper bound of the size of the BP-I class.
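The two allocation schemes described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code; the `Participant` record, its field names, and the `allocate` function are hypothetical and are not taken from the study's survey or dataset.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # Hypothetical record; field names are assumptions for illustration only.
    clinician_diagnosis: str        # "BP-I" or "BP-II", as reported by the participant
    psychotic_when_elevated: bool   # psychotic features during manic/hypomanic episodes


def allocate(p: Participant) -> tuple[str, str]:
    """Return (clinician class, psychosis class) for one participant."""
    if p.clinician_diagnosis not in ("BP-I", "BP-II"):
        raise ValueError(f"unexpected diagnosis: {p.clinician_diagnosis!r}")
    # C1/C2 follow the clinician's subtype diagnosis.
    c_class = "C1" if p.clinician_diagnosis == "BP-I" else "C2"
    # P1/P2 depend only on psychotic features during mood elevation.
    p_class = "P1" if p.psychotic_when_elevated else "P2"
    return c_class, p_class


# A participant diagnosed with BP-II who reports psychotic features
# falls into C2 under one scheme and P1 under the other.
example = allocate(Participant("BP-II", True))
```

Because the two classes are assigned independently, a participant can sit in C2 and P1 at once, which is the discordant group the text describes.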
Diagnostic differences in demographic and clinical features were assessed using Mann-Whitney U tests (for continuous variables) and either Pearson's χ² tests (for most categorical variables) or Fisher's exact tests (for categorical variables with at least one expected cell count less than 5). The principal analyses sought to examine differences in medication prescription rates, treatment efficacy, and side effect profiles (i.e., prevalence, severity, and cessation rates) between the MSTs and AAPs. These were reported for the overall cohort, the C1 and C2 classes, and the P1 and P2 classes where necessary (given the common use of AAPs to treat psychosis, P1 and P2 differences on AAP variables were likely to be of trivial importance). All relevant analyses were conducted using either Mann-Whitney U or Pearson's χ² tests, with Benjamini-Hochberg adjustments for multiple comparisons. 19 The AAPs brexpiprazole, paliperidone, and ziprasidone were excluded from all analyses due to low numbers (i.e., n < 10), apart from tests that considered all AAPs combined.
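For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following is a minimal pure-Python sketch of the standard step-up procedure. It is an illustration only, not the authors' code (the study used R), and the function name and the example p-values are hypothetical, not study results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone and <= 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for a family of three tests.
raw = [0.01, 0.04, 0.03]
adj = benjamini_hochberg(raw)
```

An adjusted value is compared against the usual significance threshold in place of the raw p-value, so a result that is significant before adjustment may no longer be so afterwards.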
For medication exposure, we hypothesised that C1/P1 participants were likely to have received more medications overall, were more likely to have received lithium and AAPs, and were less likely to have received lamotrigine than C2/P2 participants. Further, in line with previously discussed research, 15,20 we hypothesised that lithium would have a superior cost-benefit profile (i.e., considering both effectiveness and side effects) to lamotrigine for C1/P1 participants, while the reverse pattern would hold for C2/P2 participants.

| Demographic and clinical features
Table 1 presents the demographic features of the sample. In brief, the median age of the sample was 40.5 years, nearly three-quarters were women, and the proportion of those who were university-educated, employed, had a spouse/de facto partner, or were non-Indigenous Australians were all roughly 60%. There were no appreciable differences between the diagnostic subtypes, except for C1 patients being slightly older than C2 patients, and P1 participants being less likely than P2 participants to be employed. Differences between the diagnostic subtypes on clinical variables, as displayed in Supplementary Table 2, are largely consistent with known BP-I and BP-II features.

| Prescription rates
Table 2 reports the lifetime and current prescription rates of MSTs and AAPs. Of the four MSTs, lithium and lamotrigine were the most commonly prescribed. Just over 70% of all participants had received an AAP, of which quetiapine was the most frequent, with nearly half of the sample having received this medication. Lifetime exposure to quetiapine was significantly higher than for the second most received AAP, olanzapine (χ² = 14.9, p < 0.001), which in turn was higher than for the third most received, aripiprazole (χ² = 5.9, p = 0.015). Aside from lamotrigine, rates of attrition appeared quite high. Moreover, attrition rates for both lithium (C1 33.3%, P1 36.6% vs. C2 50.0%, P2 54.5%; χ² = 4.12/4.03, p = 0.042/0.045) and the AAPs (C1 32.7% vs. C2 52.9%; χ² = 6.79, p = 0.009) were significantly higher for BP-II patients.

| Effectiveness
For the MSTs, lithium and lamotrigine were consistently perceived to be superior to both valproate and carbamazepine, as reported in Supplementary Table 3.
In relation to our specific hypothesis that lithium and lamotrigine are more effective for BP-I and BP-II respectively, the mean efficacy ratings displayed in Table 4 were in the expected directions but none of the analyses were significant after adjusting for multiple comparisons (p range 0.12-0.16). Unadjusted p-values did show some evidence for the superiority of lamotrigine in those without psychotic features (p = 0.042), while a similar relationship for C2 patients narrowly missed the threshold for statistical significance (p = 0.058). Some clearer patterns emerged when consideration was limited to the efficacy of maintenance treatments in preventing specific mood phases (see Table 5). For depressive periods, there was clear evidence for the superiority of lamotrigine for both C2 and P2 assigned patients compared to their C1/P1 counterparts (both p = 0.002). However, a similar relationship for lithium and C1/P1 patients was not observed (both p = 0.74). For manic/hypomanic phases, lithium showed significantly greater efficacy for both C1 (p = 0.04) and P1 (p = 0.004) assigned patients. Lamotrigine also performed better for those without psychotic features (p = 0.04), though not for those with clinically diagnosed BP-II (p = 0.42).
In considering the AAPs, lurasidone, quetiapine, and olanzapine were consistently rated as the most effective ones. However, none of the differences between any of the AAPs reached statistical significance for any of the diagnostic subtypes (see Supplementary Table 4). Additionally, when all AAPs were considered together, there were no differences between C1 and C2 assigned patients in their efficacy ratings overall, or for depressive or manic/hypomanic episodes.

| Side effects
Supplementary Tables 5 and 6 report descriptive statistics for the side effect prevalence, severity, and side effect induced cessation rates for all medications. As displayed in Table 6, side effects were more severe and more common for lithium compared to lamotrigine. However, only C2/P2 patients were more likely to have ceased lithium than lamotrigine due to side effects. Further results from the pairwise comparisons of the other MSTs and AAPs are provided in Supplementary Tables 7 and 8.

| DISCUSSION
This study sought to collect naturalistic data on the prescription and treatment preferences of common medication treatments for the bipolar disorders, and to determine how such data might inform overall and subtype-specific treatment recommendations. Such data have the potential to be highly informative for clinical management given the inconsistent interpretations and recommendations offered by CPGs. As noted, such CPG data are generally derived from RCTs, which have methodological (and especially sampling) limitations. Our current approach therefore focused on obtaining naturalistic effectiveness data, while also considering the impact of side effects. While we do not argue that naturalistic data are necessarily superior to RCTs, we do suggest that clinical understanding can be advanced by considering data from both sources.
A strength of this study was that reasonable sample sizes of individuals who had taken lithium and/or lamotrigine were obtained, which allowed some hypotheses to be pursued and generated some important findings. Nevertheless, we note that there are several study limitations. First, we focused on overall characteristics of medication impact and did not consider the role that symptom profiles or other clinical variables above and beyond diagnosis had on treatment response. Second, all data and diagnoses were self-reported and were not able to be externally validated. Third, recruitment occurred primarily in Australia and the United States, and given geographical differences in access to medications and their likely impact on prescription rates, it is unclear to what extent study findings can be generalised. Finally, there were several participants who identified as having experienced psychotic features (P1) but had not received a BP-I diagnosis from a clinician (C1). As psychosis during elevated mood episodes necessarily confers BP-I status, this was not expected. We could not establish whether this lack of overlap reflected clinician error, patient error, or the patients not providing information about self-reported psychosis when clinicians were formulating their diagnosis. While we place more emphasis on the results of the C1/C2 diagnostic allocations, we note that naturalistic data necessarily prioritises lived experience, and that these inconsistencies may not reflect issues with data accuracy, but rather differences in how clinicians and patients perceive psychosis. This would certainly be an interesting avenue for future research, but it is beyond the scope of this study.
Prescription rate data indicated that lithium and lamotrigine were the most commonly received MSTs, while quetiapine, olanzapine, and aripiprazole were the most commonly received AAPs. Such differential rates could reflect recommendations from CPGs, clinician observation, and/or marketing by pharmaceutical companies. 21-23 Meanwhile, when considering diagnostic subtypes, BP-I and psychotic participants were more likely than BP-II and nonpsychotic participants to have been prescribed lithium and any AAP, less likely to have been prescribed lamotrigine, and had also trialled more medications. The last finding may reflect the greater severity of a BP-I or psychotic condition, its need for complementary MST and AAP medications, or its greater resistance to first-line and subsequent medications.
Our results clearly indicated that, amongst the MSTs, lithium and lamotrigine were rated as more effective than both carbamazepine and valproate (admittedly only 26 participants had trialled carbamazepine; this may in itself reflect its perceived lack of efficacy by clinicians relative to other MSTs). Extending from this, one of our key hypotheses was that lithium would be the superior treatment for BP-I and lamotrigine for BP-II. When considering overall effectiveness alone, the differences were not statistically significant for any of the diagnostic subtypes considered in this study, although the associations were in the hypothesised direction. However, we did find evidence for the superiority of (i) lamotrigine in treating BP-II depression and non-psychotic mania/hypomania, and (ii) lithium for treating mania/hypomania in BP-I patients. 21-23 Nevertheless, there are several other possibilities as to why diagnostic differences appeared for specific mood episodes but not for efficacy overall.
One of the most obvious, as previously mentioned, is this study's lack of accounting for the effect of moderators or mediators (aside from diagnosis) on treatment response. Controlling for these could produce clearer diagnostic differences, or alternative markers of treatment response independent of diagnosis. Data collected from this study (see Methods) as well as previous meta-analyses of predictors 24 could offer some possible explanations. One is that the choice of lithium or lamotrigine might be better made on the basis of predominant polarity rather than diagnostic subtype. Of course, these constructs are not mutually exclusive; for example, it is more common for the dominant polarity of BP-II patients to be depression. 25,26 However, respecting this criterion would likely have little practical utility given the preponderance of individuals without any dominant polarity. 25,26 Finally, it could also reflect a limitation of the questions used to assess effectiveness in this study. While our questions were kept deliberately simple, there is the possibility of differences in interpretation, specifically as to whether "overall" effectiveness necessarily emphasises whichever pole is more dominant/severe, or whether depression and mania/hypomania are given equal weight.
Participants reported that side effects were quite common, particularly for lithium and the AAPs. Lamotrigine was consistently one of the better tolerated medications, having a lower side effect prevalence rate and being the least likely medication to be ceased due to side effects. When compared to lithium specifically, lamotrigine had the superior profile for all participants when considering the presence and severity of side effects alone.
Considering these findings together, we found clear evidence for the superiority of lamotrigine for BP-II participants. While appearing marginally more effective than lithium, it had a distinctly less severe side effect profile, thus clearly establishing it as the preferred medication for treating BP-II. For our BP-I participants, lithium appeared marginally superior to lamotrigine in terms of effectiveness but again had a higher rate of side effects. Given the primacy of lithium in treatment recommendations for the bipolar disorders (and especially BP-I), 6,21-23 our efficacy finding was consistent with that literature. But when considered with the side effect data, identifying a preferential treatment for BP-I is not as clear as it appears for BP-II. However, given the limited scope of this study, a greater emphasis on further nuances of treatment response, such as the role of moderators, polypharmacy effects, and the order in which medications are trialled, may reveal further practical realities that could argue more strongly for the primacy of lithium for BP-I. This should be a key focus for future research.
To conclude, this study provides real-world data on the effectiveness and side effect profiles of medications prescribed for managing the bipolar disorders. Our naturalistic data indicated a clear preference for lamotrigine in individuals with a BP-II condition, further extending findings previously observed from clinical experience 15 and trials, 20 while also providing the first clear empirical evidence for this differential effect. For those with a BP-I condition, lithium appeared only slightly superior to lamotrigine in terms of efficacy. Ultimately, these findings could potentially be informative for clinical decision making when considered in conjunction with CPG recommendations.

TABLE 1 Demographic profiles for the whole sample and each of the diagnostic subtypes (i.e., according to clinician diagnosis and the presence/absence of psychotic features).
Note: C1 = clinician-diagnosed BP-I; C2 = clinician-diagnosed BP-II; P1 = psychotic features present during mood elevation; P2 = psychotic features not present during mood elevation.
a Statistical tests: age = Mann-Whitney U, gender/education/employment = Pearson's χ², marital status/ethnicity = Fisher's exact.

TABLE 2 Lifetime and current prescription rates of each mood stabiliser and atypical antipsychotic for the whole sample and for each diagnostic subtype.
Note: p-values for each group of three tests (i.e., C1 vs. C2 and P1 vs. P2) were adjusted for multiple comparisons using the Benjamini-Hochberg procedure. For all ORs, the reference category is the diagnosis with the lesser prevalence. C1 = clinician-diagnosed BP-I; C2 = clinician-diagnosed BP-II; P1 = psychotic features present during mood elevation; P2 = psychotic features not present during mood elevation.

Overall efficacy for each medication for the whole sample and each diagnostic subtype.

TABLE 5 Efficacy for each medication in the long-term treatment of depression and mania/hypomania for the whole sample and each diagnostic subtype.
Note: C1 = clinician-diagnosed BP-I; C2 = clinician-diagnosed BP-II; P1 = psychotic features present during mood elevation; P2 = psychotic features not present during mood elevation.

TABLE 6 Results of tests for whole sample and diagnostic differences in side effect profiles between lamotrigine and lithium.
Note: p-values for the prevalence, severity, and cessation analyses were adjusted for multiple comparisons using the Benjamini-Hochberg procedure. Prevalence and cessation analyses used Pearson's χ² test, while the severity analyses used Mann-Whitney U tests. C1 = clinician-diagnosed BP-I; C2 = clinician-diagnosed BP-II; P1 = psychotic features present during mood elevation; P2 = psychotic features not present during mood elevation.