Study design and initial assessment of the data
Outpatients with bipolar depression were recruited by advertisement and from an outpatient clinic between 2006 and 2010, for the duration specified by the funding agency. Subjects participated in the study after providing informed consent. The study was approved by the Indiana University Investigational Review Board. Patients first underwent a screening process consisting of a psychiatric evaluation, including a structured interview to confirm the diagnosis of bipolar disorder, as well as a physical examination, ECG and laboratory tests. Inclusion criteria were: (i) age 18–65 years, (ii) meeting DSM-IV-TR criteria for bipolar disorder and major depressive episode, (iii) inadequate response to LMTG, defined as a 17-item HDRS score > 15 despite at least 4 weeks of treatment with 100 mg LMTG daily, (iv) informed consent as approved by the local Investigational Review Board and (v) if taking other antidepressants or mood stabilizers, a stable dose for the past 4 weeks. Exclusion criteria were: (i) comorbid psychotic disorder such as schizophrenia or schizoaffective disorder, (ii) significant suicidal or homicidal risk, (iii) clinically significant medical illness, (iv) allergy or intolerance to LMTG or MMTN, (v) pregnancy, planning to become pregnant or not using adequate contraception, (vi) meeting criteria for substance dependence within 6 months before the start of the study and (vii) use of any medication with a significant adverse interaction with either LMTG or MMTN.
Patients on a stable dose of LMTG of 100 mg or greater were randomized in a 1:1 ratio to receive either MMTN (starting dose 5 mg daily, increased up to 20 mg daily over 4 weeks, then 20 mg daily from weeks 4 to 8, depending on response and tolerability) or matching placebo for 8 weeks in a double-blind, parallel-group design. Twenty-nine patients were evaluated using the 17-item HDRS for the full 8-week experimental period.
For an initial assessment, the raw HDRS data were plotted per treatment group. To investigate the general behaviour of the data, locally weighted polynomial regression (LOESS) was applied.
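The smoothing step can be illustrated with a minimal sketch of a locally weighted fit. This is not the routine used in the study: it assumes a local linear (degree 1) fit with tricube weights, and the function name `loess_point` and the `frac` span parameter are hypothetical choices made for this example.

```python
def loess_point(xs, ys, x0, frac=0.75):
    """Locally weighted linear fit evaluated at x0 (illustrative sketch).

    Uses tricube weights; the bandwidth is the distance from x0 to the
    k-th nearest observation, where k is about frac * n.
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12  # bandwidth; guard against a zero distance

    # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)

    # Weighted least squares for a straight line, centred on weighted means.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)
```

Evaluating `loess_point` over a grid of weeks, separately for each treatment group, gives a smooth trend line that can be overlaid on the raw scores.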
The Bayesian parameter and error distributions were computed with a Markov chain Monte Carlo (MCMC) algorithm using the R2WinBUGS package in R 2.12.0 (The R Foundation for Statistical Computing, Vienna, Austria) and WinBUGS 1.4.3 (Imperial College and MRC, UK). Graphical representations were also produced in R 2.12.0.
Using a typical three-stage hierarchical model approach, several non-linear model structures were investigated, including residual variability (RV) and between-subject variability (mixed effects, population dispersion model), represented by Equations 1 to 4.
(Equation 1) (Equation 2) (Equation 3) (Equation 4) (Equation 5)
Equation 1 represents the structural model, where y is the HDRS observation as a function of the vector for parameters with random effects (θ), the vector for parameters with mixed effects (θi) and residual error (ε). The vector for parameters with random effects (θ) is normally distributed (Equation 2) around the vector of mean parameter values, with some precision (τ), defined in terms of variance (σ, Equation 3). The vector for parameters with mixed effects (θi, Equation 4) incorporates between-subject variability, thereby defining a vector of mean parameter values for the ith individual. The residual error (ε, Equation 5) is normally distributed around zero, with some residual precision (τre), defined in terms of residual variance (σre). Initially, the priors for θ and θi were chosen as uninformative and uniformly distributed, describing the range of uncertainty, with the exception of the HDRS value at time zero, for which a U(10,35) prior was used to approximate the range of observations at this time point. The uninformative priors for σ and σre were set at U(0,10000).
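For orientation, the hierarchy described in this paragraph can be written out as follows. This is a sketch assembled from the verbal definitions only, not a reproduction of the original equations: the symbols μ and μ_i for the population and individual mean vectors, the observation index j and the time argument t_ij are assumptions, and the precision is written as the reciprocal of σ because the text refers to σ as a variance.

```latex
\begin{align}
y_{ij} &= f(\theta, \theta_i, t_{ij}) + \varepsilon_{ij} \\
\theta &\sim N(\mu, \tau) \\
\tau &= 1/\sigma \\
\theta_i &\sim N(\mu_i, \tau) \\
\varepsilon_{ij} &\sim N(0, \tau_{re}), \qquad \tau_{re} = 1/\sigma_{re}
\end{align}
```

Here N(m, τ) denotes a normal distribution parameterized by mean and precision, which is the convention used by WinBUGS.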
In this manner, several base model structures (f in Equation 1) were explored to best describe the data. Initially, a linear base model was applied (Equation 6), where s0 is the HDRS at time (t) zero and a represents the slope in HDRS over time.
To allow a non-linear decline in HDRS over time, an exponential base model was used (Equation 7), where k represents the speed of decline in HDRS.
To investigate time to displacement from s0 and differences in amplitude of the maximal response, a maximal effect base model (base Emax model) was applied (Equation 8), where Emax is the maximal decrease in HDRS and E50 the time at which half the Emax is achieved.
A Gompertz function (base Gompertz model) was used to allow more freedom in the shape of the curve (Equation 9).
In the Gompertz function, α is the parameter for the amplitude of score improvement, β the parameter for the time to inflection from s0, and γ the parameter for the speed of decline in HDRS. Because of the intrinsic behaviour of the formula, the parameters α, β and γ must be positive, so a relatively non-informative uniform prior distribution restricted to positive values was used (U[0.0001,100]).
To allow for separate estimation of an initial decrease followed by an increase in HDRS (relapse), an inverse Bateman function was explored (base Bateman model, Equation 10).
In the base Bateman model, κ is a term referring to the maximum decrease in HDRS, dependent on kdecr and kincr, which represent the rate constants for the decrease and increase in HDRS, respectively. Based on the shape of the curves in the raw data, a uniform prior distribution, U[0,1], was applied to kdecr and kincr.
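To make the five curve shapes concrete, the sketch below codes one plausible functional form for each base model. The exact expressions of Equations 6 to 10 are not reproduced in this text, so every formula here is an assumption chosen to match the verbal parameter definitions (common textbook forms), and the function names are hypothetical. In particular, κ is treated as a free scale parameter, although the text describes it as dependent on the two rate constants.

```python
import math


def linear_model(t, s0, a):
    """Linear change from baseline s0 with slope a (negative a = improvement)."""
    return s0 + a * t


def exponential_model(t, s0, k):
    """Exponential decline from s0 with rate k."""
    return s0 * math.exp(-k * t)


def emax_model(t, s0, e_max, e50):
    """Hyperbolic decline: half of e_max is reached at t = e50."""
    return s0 - e_max * t / (e50 + t)


def gompertz_model(t, s0, alpha, beta, gamma):
    """Sigmoidal decline approaching s0 - alpha for large t.

    beta shifts the curve along the time axis and gamma sets its steepness.
    In this assumed form the value at t = 0 is s0 - alpha * exp(-beta),
    which is close to s0 only when beta is large.
    """
    return s0 - alpha * math.exp(-beta * math.exp(-gamma * t))


def inverse_bateman_model(t, s0, kappa, k_decr, k_incr):
    """Decrease followed by a return toward s0 (relapse).

    Starts exactly at s0; with k_decr > k_incr the score first falls
    and then rises again.
    """
    return s0 - kappa * (math.exp(-k_incr * t) - math.exp(-k_decr * t))
```

Plotting these functions over weeks 0 to 8 with parameter values drawn from the stated prior ranges is a quick way to check that each prior permits the curve shapes seen in the raw data.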
On these five base models, a ‘switch’ was applied to differentiate between specific model parameters for the MMTN augmentation and placebo treatment groups (Equation 11).
Because trti denotes the treatment for the ith individual (1 for MMTN treatment and 2 for placebo), the switch allows estimation of θtrt1 when MMTN is administered and of θtrt2 when placebo is administered.
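One way such a switch can be expressed is as an arithmetic selection on the treatment code. The form of Equation 11 is not reproduced in this text, so the expression below is an assumption, and `select_theta` is a hypothetical helper name.

```python
def select_theta(trt, theta_trt1, theta_trt2):
    """Return the treatment-specific parameter for one individual.

    trt is 1 for MMTN and 2 for placebo, so exactly one of the two
    terms is multiplied by 1 and the other by 0.
    """
    if trt not in (1, 2):
        raise ValueError("trt must be 1 (MMTN) or 2 (placebo)")
    return theta_trt1 * (2 - trt) + theta_trt2 * (trt - 1)
```

For example, `select_theta(1, 0.3, 0.1)` picks the MMTN value and `select_theta(2, 0.3, 0.1)` picks the placebo value. An arithmetic form is convenient in model languages that lack conditional statements; indexing a parameter vector by the treatment code is an equivalent alternative.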
For model optimization and comparison purposes, the deviance information criterion (DIC) was used. The DIC is applicable to Bayesian approaches because it balances model goodness of fit (D(θ), defined as −2 log-likelihood(data|θ)) against the effective number of parameters in the model (pD), while taking the prior parameter distributions into account (Equation 12).
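The calculation can be sketched from MCMC output as follows. Equation 12 is not reproduced in this text, so this sketch assumes the standard definition, in which pD is the posterior mean deviance minus the deviance at the posterior mean of the parameters, and DIC is the posterior mean deviance plus pD. The function names and the normal likelihood with precision `tau` are illustrative assumptions.

```python
import math


def normal_deviance(ys, preds, tau):
    """-2 * log-likelihood of ys under N(pred, precision=tau)."""
    n = len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return -n * math.log(tau) + n * math.log(2 * math.pi) + tau * sse


def dic_from_deviances(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, pD) from per-iteration deviances.

    deviance_samples: deviance evaluated at each retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior
    mean of the parameters.
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d
```

Under this definition a more flexible model lowers the mean deviance but raises pD, so the DIC falls only when the gain in fit outweighs the added complexity.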
When the DIC decreased, the model with one additional parameter was preferred over the parent model. History plots were inspected for lack of parameter correlation, and Gelman-Rubin-Brooks plots were created to investigate over-parameterization of the models. The shrink factor was considered acceptable when it was below 1.05 at the end of the iterations. Models were internally qualified based on the shape of the posterior parameter distributions as well as the posterior and posterior predictive goodness of fit of the HDRS data at the individual and population levels.
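The shrink factor criterion can be illustrated with a minimal calculation for a single parameter monitored in several chains. This sketch implements the basic Gelman-Rubin potential scale reduction factor without the degrees-of-freedom correction that some software applies, so its values may differ slightly from those shown in Gelman-Rubin-Brooks plots; the function name `shrink_factor` is hypothetical.

```python
import math


def shrink_factor(chains):
    """Basic Gelman-Rubin shrink factor for one parameter.

    chains: list of equal-length lists of MCMC draws, one per chain.
    Values near 1 indicate that between-chain and within-chain
    variability agree.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")

    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m

    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    w = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m

    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)
```

Applying the stated threshold then amounts to checking `shrink_factor(chains) < 1.05` for each monitored parameter at the end of the run.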