Bayesian model of Hamilton Depression Rating Score (HDRS) with memantine augmentation in bipolar depression



Dr Robert R. Bies PhD PharmD, Department of Medicine and Medical Genetics, Division of Clinical Pharmacology, School of Medicine, Indiana University, 1001 W 10th Street W7138, Indianapolis, IN 46202, USA.

Tel.: +3176307868

Fax: +3172873006




Presynaptic and post-synaptic glutamatergic modulation is associated with antidepressant activity that takes several weeks to reach its maximal effect. The limited mood-elevating effect after single drug administration may be the result of compensatory synaptic processes. Therefore, using augmentation treatment with agents having presynaptic and post-synaptic effects on the glutamatergic system, this study aims to evaluate the effect of augmentation therapy on the rate of change in mood elevation in patients with bipolar depression.


In a pilot study, 29 outpatients with bipolar depression on a stable lamotrigine dose regimen received placebo or memantine pills daily (titrated up by 5 mg week–1 to 20 mg) in a randomized, double-blind, parallel group, 8 week study. Patients were evaluated weekly using the 17-item Hamilton Depression Rating Score (HDRS) and all data were analyzed simultaneously. Linear, exponential, maximal effect, Gompertz and inverse Bateman functions were evaluated using a Bayesian approach population pharmacodynamic model framework. In these models, differences in parameters were examined across the memantine and placebo augmentation groups.


A Gompertz function with a treatment switch on the parameter describing the speed of HDRS decline (γ) best described the data (γmemantine = 1.8, 95% confidence interval [CI] 0.9, 3.6; γplacebo = 1.2, 95% CI 0.5, 3.5). Between subject variability was identified on baseline HDRS (2.9, 95% CI 1.5, 4.4) and amplitude of score improvement (4.3, 95% CI 2.7, 6.5).


This pharmacodynamic approach identified an increased speed of response after memantine augmentation, compared with placebo augmentation in bipolar depression patients.

What Is Already Known about This Subject

  • Presynaptic and post-synaptic glutamatergic modulation is associated with antidepressant activity.
  • Limited mood elevation after single drug administration may be the result of compensatory synaptic processes.
  • Using augmentation treatment with presynaptic and post-synaptic agents affecting the glutamatergic system, this study evaluates the effect of augmentation therapy on the rate of change in mood elevation in patients with bipolar depression.

What This Study Adds

  • Bayesian approach population pharmacodynamic modelling identified an increased speed of response after memantine augmentation, compared with placebo augmentation in bipolar depression patients.


The glutamatergic system is one of the main excitatory neurotransmission systems. Glutamate (GLUT) acts upon the post-synaptic neurons via a number of different post-synaptic ionotropic receptors including the N-methyl-D-aspartate (NMDA), kainate and AMPA receptors [1]. Several studies have shown that presynaptic and post-synaptic glutamatergic modulation is associated with antidepressant activity that takes weeks to months to reach its full effect [2]. Consequently, this has increased the interest in the role of GLUT in the pathophysiology of mood disorders and the development of novel antidepressant treatments which may decrease the time to effect [3-5]. However, there are several concerns. First, the clinical utility of NMDA receptor modulation has been limited by significant side effects of most NMDA receptor antagonists [5, 6]. Secondly, there is increasing evidence that the limited mood-elevating effects of presynaptic glutamate release inhibitors like lamotrigine (LMTG) and post-synaptic receptor antagonists like memantine (MMTN) may lie in compensatory synaptic processes. For instance, administration of the NMDA antagonist ketamine increases presynaptic GLUT release [7] and increased GLUT release results in down regulation of NMDA receptors [8]. Combination treatment with agents having presynaptic and post-synaptic effects on the glutamatergic system may be associated with better outcomes for depression [6, 9].

In a recent pilot clinical trial we investigated whether MMTN can augment the effect of LMTG in bipolar depressed patients who exhibited an inadequate response to monotherapy with LMTG [10]. To study the augmenting effects of MMTN administration, improvement on the 17-item Hamilton Depression Rating Score (HDRS [11]) was compared with placebo augmentation. Although this study could not indicate an effect of MMTN augmentation at the end of the study (week 8), antidepressant effects were seen during the first weeks of treatment. Longitudinal data such as these are amenable to non-linear mixed effects approaches to analyze aspects of the trajectory of disease and response to therapy, which would allow taking the speed of response into consideration.

In pharmacodynamic (PD) modelling and simulation techniques, the outcomes of rating scales are often treated as categorical variables. When the number of categories is sufficiently large, like in the case of HDRS, they can be considered as continuous variables. In general, longitudinal data on HDRS show a non-linear trend, exposing several patterns in HDRS response to placebo or drug treatment. First, there is a high prevalence of response in placebo-treated individuals, making it more difficult to discern a beneficial effect due to drug treatment [12]. Secondly, there can be three general (non-linear) behavioural responses observed in the time course of HDRS after administration of antidepressant therapy; (i) responding individuals who show an early rapid decline in HDRS that reaches a maximum improvement, (ii) relapsing individuals who show an early rapid decline followed by an increase in HDRS towards the initial baseline score and (iii) non-responding individuals who show no change in the HDRS over the period of treatment. Quantitatively distinguishing these typical individual HDRS patterns in a non-linear mixed-effects population approach is required in order to describe the time-course of HDRS adequately [13].

PD modelling and simulation techniques allow the use of different mathematical expressions in order to investigate and accurately describe non-linear PD systems. In addition, Bayesian analysis is an inferential method that allows interpretation of unknown parameters in terms of probability distributions. After formulating a model with unknown model parameters, a prior distribution is applied on the unknown parameters, based on (subjective) prior knowledge. Bayes' rule is then applied to make inferences about the unknown parameters given the observed data and the prior knowledge. Prior means and their precision must be defined in a Bayesian analysis, where priors with high precision are referred to as informative and priors with low precision are called uninformative. The first advantage of such an approach is that an updated belief on the model parameters is generated (posterior distributions). Secondly, this approach allows calculation of posterior predictive distributions, which allow inferences to be made about future observations that take into account uncertainty in the model.
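The role of prior precision can be illustrated with the simplest conjugate case: a normal mean with known observation precision. This is a minimal sketch and not the model fitted in this study; the function name, the HDRS values and the precisions are hypothetical and chosen only for illustration.

```python
def normal_posterior(prior_mean, prior_precision, observations, obs_precision):
    """Conjugate update for a normal mean with known observation precision."""
    n = len(observations)
    # Precisions add: prior information plus information from n observations.
    post_precision = prior_precision + n * obs_precision
    # The posterior mean is a precision-weighted average of prior mean and data.
    post_mean = (prior_precision * prior_mean
                 + obs_precision * sum(observations)) / post_precision
    return post_mean, post_precision

# Hypothetical HDRS values for illustration only (not study data).
scores = [22.0, 19.0, 21.0, 18.0]

# Low-precision (uninformative) prior: the posterior mean follows the data.
vague_mean, vague_precision = normal_posterior(0.0, 1e-6, scores, 0.1)

# High-precision (informative) prior: the posterior mean stays near the prior.
informative_mean, informative_precision = normal_posterior(10.0, 10.0, scores, 0.1)
```

With the low-precision prior the posterior mean is essentially the sample mean, whereas the high-precision prior dominates the four observations.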

In this paper, a non-linear population PD modelling approach is used to investigate the effect of MMTN augmentation of LMTG treatment in patients with bipolar depression, sampled from the ‘Memantine Augmentation of Lamotrigine Incomplete-Response in Bipolar Depression’ study [10] (ClinicalTrials.gov Identifier: NCT00305578). We applied linear, exponential, maximal effect, Gompertz and inverse Bateman mixed effects PD models to describe the patterns in HDRS response in a Bayesian framework (WinBUGS [14]) over the 8 week experimental period, specifically to quantitate the time course of treatment effect during the first 4 weeks of treatment.


Study design and initial assessment of the data

Bipolar depression outpatients were recruited by advertisement and from an outpatient clinic between 2006 and 2010, for the duration specified by the funding agency. Subjects participated in the study after providing informed consent. The study was approved by the Indiana University Investigational Review Board. Patients initially underwent a screening process in which they had a psychiatric evaluation including a structured interview to confirm the diagnosis of bipolar disorder. Patients also underwent a physical examination, ECG and laboratory tests. Inclusion criteria were: (i) age 18–65 years, (ii) satisfying DSM-IV-TR criteria for bipolar disorder and major depressive episode, (iii) inadequate response to LMTG, defined as 17-item HDRS > 15 for at least 4 weeks of treatment with 100 mg LMTG daily, (iv) informed consent as approved by the local Investigational Review Board and (v) if on other antidepressants or mood stabilizers, a stable dose for the past 4 weeks. Exclusion criteria were: (i) comorbid psychotic disorder such as schizophrenia or schizoaffective disorder, (ii) significant suicidal or homicidal risk, (iii) clinically significant medical illness, (iv) allergy or intolerance to LMTG or MMTN, (v) pregnancy, planning to be pregnant or not using adequate contraception, (vi) satisfying criteria for substance dependence within 6 months prior to start of the study and (vii) on any medication with significant adverse interaction with either LMTG or MMTN.

Patients on a stable dose of LMTG of 100 mg or greater were randomized in a 1:1 ratio to either MMTN (starting dose 5 mg daily increased up to 20 mg daily over 4 weeks, then 20 mg daily from 4–8 weeks, depending on response and tolerability) or matching placebo for 8 weeks in a double-blind, parallel group study. Twenty-nine patients were evaluated using the 17-item HDRS [11] for the full 8 week experimental period.

For initial assessment of the data, the raw HDRS data were plotted per treatment group. To investigate the general behaviour of the data, locally weighted polynomial regression (LOESS) was applied [15].
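The idea behind locally weighted regression can be sketched as a weighted straight-line fit around each evaluation time, using tricube weights over the nearest fraction of the data. This is a simplified illustration only: the function names, the span of 0.75, the degree-1 local fit and the omission of robustness iterations are assumptions, and the LOESS settings used for Figure 1 are not stated in the text.

```python
def tricube(u):
    """Tricube kernel: weight decreases smoothly to zero at |u| = 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_point(x0, xs, ys, span=0.75):
    """Locally weighted linear fit evaluated at x0 (no robustness iterations)."""
    n = len(xs)
    k = max(2, int(round(span * n)))          # number of neighbours in the window
    dists = sorted(abs(x - x0) for x in xs)
    # Window half-width: distance to the k-th nearest point, nudged to stay positive.
    h = dists[k - 1] * (1.0 + 1e-9) + 1e-12
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0     # fall back to a weighted mean
    return my + slope * (x0 - mx)

# Hypothetical weekly scores lying on a straight line (not study data).
weeks = [float(t) for t in range(9)]
scores = [20.0 - 1.5 * t for t in weeks]
smoothed_week4 = loess_point(4.0, weeks, scores)
```

On exactly linear data the local fit reproduces the line; on real HDRS trajectories it traces the group-level trend without imposing a functional form.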

Bayesian analysis

The Bayesian parameter and error distributions were computed using a Markov-Chain Monte Carlo (MCMC) algorithm using the R2WinBUGS package in R 2.12.0 (The R Foundation for Statistical Computing, Vienna, Austria; WinBUGS 1.4.3, Imperial College and MRC, UK). Graphical representations were also produced in R 2.12.0.

Using a typical three-stage hierarchical modelling approach, several non-linear model structures were investigated, including residual variability (RV) and between-subject variability (BSV; mixed effects, population dispersion model), represented by Equations 1 to 5.

display math(Equation 1)
display math(Equation 2)
display math(Equation 3)
display math(Equation 4)
display math(Equation 5)

Equation 1 represents the structural model, where y is the HDRS observation as a function of the vector for parameters with random effects (θ), the vector for parameters with mixed effects (θi) and residual error (ε). The vector θ is normally distributed (Equation 2) around the vector of mean parameter values, with some precision (τ), defined in terms of variance (σ, Equation 3). The vector θi (Equation 4) incorporates between-subject variability, thereby defining a vector of mean parameter values for the ith individual. The residual error (ε, Equation 5) is normally distributed around zero, with some residual precision (τre), defined in terms of residual variance (σre). Initially, the priors for θ and θi were chosen as uninformative and uniformly distributed, describing the range of uncertainty, with the exception of the prior for the HDRS at time zero (U(10,35), which approximates the range of observations at this time point). The uninformative priors for σ and σre were set at U(0,10000).
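The hierarchical structure described above can be illustrated by simulating one subject: an individual baseline is drawn around a population mean (between-subject variability), and each observation is the structural prediction plus residual error. This is a sketch under stated assumptions: a linear structural model is used purely as a placeholder, all names and numerical values are hypothetical, and the precision is taken to be the reciprocal of the variance, which is how WinBUGS parameterizes the normal distribution.

```python
import random

def precision_to_sd(tau):
    """Convert a precision (1 / variance) to a standard deviation."""
    return (1.0 / tau) ** 0.5

def simulate_subject(pop_mean_s0, bsv_sd_s0, slope, resid_sd, times, rng):
    """Simulate HDRS for one subject from a two-level normal hierarchy."""
    # Between-subject level: the individual baseline varies around the population mean.
    s0_i = rng.gauss(pop_mean_s0, bsv_sd_s0)
    # Observation level: structural prediction plus normally distributed residual error.
    return [s0_i + slope * t + rng.gauss(0.0, resid_sd) for t in times]

rng = random.Random(2013)          # fixed seed so the sketch is reproducible
weeks = list(range(9))             # weekly visits, weeks 0 to 8
# Hypothetical values: baseline 20, BSV SD 3, slope -1 per week, residual SD 2.
hdrs = simulate_subject(20.0, 3.0, -1.0, precision_to_sd(0.25), weeks, rng)
```

Setting both standard deviations to zero recovers the deterministic structural prediction, which is a convenient check when building such a simulation.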

In this manner, several base model structures (f in Equation 1) were explored to best describe the data. Initially, a linear base model was applied, where s0 is the HDRS at time (t) zero and a represents the slope in HDRS over time (Equation 6);

display math(Equation 6)

To allow a non-linear decline in HDRS over time, an exponential base model was used (Equation 7), where k represents the speed of decline in HDRS;

display math(Equation 7)

To investigate time to displacement from s0 and differences in amplitude of the maximal response, a maximal effect base model (base Emax model) was applied (Equation 8), where Emax is the maximal decrease in HDRS and E50 the time at which half the Emax is achieved;

display math(Equation 8)
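The three simpler base models can be written as short functions of time. The forms below are assumed from the parameter definitions in the text (a baseline s0, a slope a, a rate k, and a maximal decrease Emax reached halfway at time E50); they are plausible readings, not necessarily the exact expressions of Equations 6 to 8, and the function names and example values are hypothetical.

```python
import math

def linear_model(t, s0, a):
    """HDRS changes by a constant slope a per unit time from baseline s0."""
    return s0 + a * t

def exponential_model(t, s0, k):
    """HDRS declines exponentially from baseline s0 with rate k (assumed form)."""
    return s0 * math.exp(-k * t)

def emax_model(t, s0, emax, e50):
    """HDRS decreases by up to emax; half of emax is reached at t = e50."""
    return s0 - emax * t / (e50 + t)

# Hypothetical parameter values for illustration only.
week4_linear = linear_model(4.0, 20.0, -1.5)
week0_exponential = exponential_model(0.0, 20.0, 0.3)
halfway_emax = emax_model(2.0, 20.0, 8.0, 2.0)
```

In the Emax form the decline starts immediately at t = 0, which is why a function with a delayed onset was also considered.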

A Gompertz function (base Gompertz model) was used to allow more freedom in the shape of the curve (Equation 9);

display math(Equation 9)

In the Gompertz function, α is the parameter for the amplitude of score improvement, β the parameter for the time to inflection from s0, and γ the parameter for the speed of decline in HDRS. Because of the intrinsic behaviour of the formula, the parameters α, β and γ must be positive, and they were therefore assigned a relatively non-informative uniform prior distribution (U[0.0001,100]).
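One common Gompertz parameterization that matches these definitions is HDRS(t) = s0 − α·exp(−β·exp(−γ·t)). Whether this is exactly Equation 9 is an assumption; the sketch below uses it only to show how the three parameters shape the curve. The function name is hypothetical, and the example values are the posterior medians reported in Table 1.

```python
import math

def gompertz_hdrs(t, s0, alpha, beta, gamma):
    """Assumed Gompertz form: sigmoidal decline from s0 towards s0 - alpha."""
    # beta sets how long the curve stays near s0; gamma sets how fast it then falls.
    return s0 - alpha * math.exp(-beta * math.exp(-gamma * t))

# Posterior medians from Table 1 (memantine group for gamma).
baseline = gompertz_hdrs(0.0, 18.4, 7.7, 11.1, 1.8)
plateau = gompertz_hdrs(20.0, 18.4, 7.7, 11.1, 1.8)
```

Under this form the prediction starts very close to s0 when β is large and levels off at s0 − α, so a treatment difference in γ alone changes only how quickly the plateau is approached.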

To allow for separate estimation of an initial decrease followed by an increase in HDRS (relapse), an inverse Bateman function was explored (base Bateman model, Equation 10).

display math(Equation 10)

In the base Bateman model, κ is a term referring to the maximum decrease in HDRS, dependent on kdecr and kincr, which represent the rate constants for the decrease and increase in HDRS, respectively. Based on the shape of the curves in the raw data, a uniform prior distribution (U[0,1]) was applied to kdecr and kincr.
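A relapse pattern can be sketched with a difference of two exponentials subtracted from baseline. The form below, HDRS(t) = s0 − κ·(exp(−kincr·t) − exp(−kdecr·t)) with kdecr > kincr, is an assumption that reproduces the described behaviour (initial decline, later return towards s0); it is not necessarily the exact expression of Equation 10, and the function name and parameter values are hypothetical.

```python
import math

def inverse_bateman_hdrs(t, s0, kappa, k_decr, k_incr):
    """Assumed inverse Bateman form: early decline, later return towards s0."""
    # The fast term (k_decr) drives the initial drop; the slow term (k_incr)
    # governs the gradual return to baseline. Requires k_decr > k_incr.
    return s0 - kappa * (math.exp(-k_incr * t) - math.exp(-k_decr * t))

# Hypothetical values within the U[0, 1] prior range for the rate constants.
start = inverse_bateman_hdrs(0.0, 20.0, 10.0, 0.8, 0.1)
trough_region = inverse_bateman_hdrs(3.0, 20.0, 10.0, 0.8, 0.1)
late = inverse_bateman_hdrs(100.0, 20.0, 10.0, 0.8, 0.1)
```

Because two rate constants must be separated from data with few relapsing individuals, this kind of model is harder to identify than the Gompertz form.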

On these five base models, a ‘switch’ was applied to differentiate between specific model parameters for the MMTN augmentation and placebo treatment group (Equation 11).

display math(Equation 11)

As trti is the treatment for the ith individual (1 for MMTN-treatment and 2 for placebo), this allows estimation of the θtrt1 when MMTN is administered and estimation of θtrt2 when placebo is administered.
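In code, such a switch amounts to indexing a parameter by each individual's treatment code. A minimal sketch, assuming the coding given in the text (1 for MMTN, 2 for placebo) and using the posterior medians of γ from Table 1 as example values; the names and the list of treatment assignments are hypothetical.

```python
# Treatment codes as in the text: 1 = MMTN augmentation, 2 = placebo.
# Example values are the posterior medians of gamma reported in Table 1.
GAMMA_BY_TREATMENT = {1: 1.8, 2: 1.2}

def gamma_for_subject(trt):
    """Return the treatment-specific gamma for one individual."""
    return GAMMA_BY_TREATMENT[trt]

# Hypothetical treatment assignments for four individuals.
treatments = [1, 2, 2, 1]
gammas = [gamma_for_subject(trt) for trt in treatments]
```

All other parameters remain shared between groups, so only the switched parameter can express a treatment difference.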

For model optimization and comparison purposes, the deviance information criterion (DIC) was used. The DIC is applicable for Bayesian approaches as it corrects for the trade-off between model goodness of fit (D(θ), defined by −2 × log-likelihood(data|θ)), the effective number of parameters in the model (pD) and prior parameter distributions [16] (Equation 12).

display math(Equation 12)
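The standard definition of the DIC is the posterior mean deviance plus pD, where pD is the posterior mean deviance minus the deviance at the posterior mean of the parameters. The sketch below computes both from posterior samples for a deliberately simple case, a normal mean with known standard deviation; the data, the samples and the function names are hypothetical and unrelated to the study.

```python
import math

def deviance(data, mu, sd):
    """-2 x log-likelihood of normally distributed data with mean mu and SD sd."""
    return sum(math.log(2.0 * math.pi * sd ** 2) + ((y - mu) / sd) ** 2
               for y in data)

def dic_from_samples(data, mu_samples, sd):
    """Return (DIC, pD) from posterior samples of the mean."""
    # Posterior mean of the deviance (goodness of fit).
    d_bar = sum(deviance(data, mu, sd) for mu in mu_samples) / len(mu_samples)
    # Deviance evaluated at the posterior mean of the parameter.
    mu_bar = sum(mu_samples) / len(mu_samples)
    d_hat = deviance(data, mu_bar, sd)
    p_d = d_bar - d_hat            # effective number of parameters
    return d_bar + p_d, p_d

# Hypothetical data and two posterior draws, for illustration only.
dic_value, p_d = dic_from_samples([1.0, 2.0, 3.0], [1.5, 2.5], 1.0)
```

A lower DIC indicates a better trade-off between fit and complexity, which is how the candidate models were ranked.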

With decreases in DIC, the model with one additional parameter was preferred over the parent model. History plots were qualified for lack of parameter correlation, and Gelman-Rubin-Brooks plots were created to investigate over-parameterization of the models [17]. The shrink factor was considered acceptable when below 1.05 points at the end of the iterations. Models were internally qualified based on shape of the posterior parameter distributions as well as posterior and posterior predictive goodness of fit of the HDRS data on individual and population level.
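The shrink factor compares the variance between parallel chains with the variance within them; values near 1 indicate that the chains have mixed. The sketch below implements the classic variance-based Gelman-Rubin form, which may differ in detail from the diagnostic plotted by WinBUGS; the function name and chain values are hypothetical.

```python
def shrink_factor(chains):
    """Variance-based Gelman-Rubin shrink factor for equal-length chains."""
    m = len(chains)                # number of chains
    n = len(chains[0])             # iterations per chain
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Average within-chain sample variance.
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    # Between-chain variance, scaled by chain length.
    between = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

# Hypothetical chains: two that explore the same region, two that do not.
mixed = shrink_factor([[0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0]])
stuck = shrink_factor([[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]])
```

Under the acceptance rule stated above, the first pair would pass the 1.05 threshold and the second would not.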


Initial assessment of the data

A priori visual inspection of the data indicated 12 responding individuals in the MMTN and six in the placebo treated groups. In the MMTN treated group, there did not seem to be any relapsing individuals, vs. four in the placebo treated group. Both treatment groups seemed to have two non-responding individuals. Additionally, for one individual in each group it was difficult to assign the pattern in HDRS response a priori. Descriptive analysis on fractional changes has been performed elsewhere [10] and showed that at t = 0 and t = 8 weeks, no significant difference was found between the mean HDRS of the treatment groups. Figure 1 represents the individual data of this study. The LOESS fit clearly indicates a non-linear response with an increased speed of response in the MMTN treated group when compared with the placebo treated group. The reported side effects were similar in frequency between the two treatment arms and were mild in intensity. The reported side effects were (in % placebo/% treatment): central nervous system (73/71), hypomania (20/15), gastrointestinal (20/36), cardiovascular (20/7), sexual (0/7), urinary (6/0), respiratory (6/7), endocrine (0/7) and immunological (6/0) [10].

Figure 1.

Individual Hamilton Depression Rating Scores over time (solid lines), separated by memantine and placebo augmentation. The broken line represents the locally weighted polynomial regression per group.


During model development, some models showed convergence problems in the base models, which required a term for RV on HDRS and/or a more informative prior with BSV on the population parameter estimate for s0. To allow comparison between all models, these components were included in all the base models. In the base models, s0 was estimated as a population parameter estimate with an informative uniform prior distribution (U[12,35]), based on the individual plots and the third study inclusion criterion (HDRS > 15, at least for 4 weeks of LMTG treatment). Also, non-informative priors were applied for BSV on s0 (N[0,0.00001]) and RV on HDRS score [U(0,10000)].

Final model

The first stage model required BSV on s0 and indicated relatively narrow posterior parameter density distributions (DIC = 1352.5) for all parameters except for β (29.6, 95% CI 2.47, 95.8). Also, the shape of the posterior parameter distribution of γ was not normally distributed and the posterior parameter distribution of β showed tailing. Constructing models with switches on α, β and γ resulted in DICs of 1351.1, 1354.8 and 1352.8 respectively. In the α switch model, inclusion of a second switch on β or γ, or a switch on all parameters did not improve the DICs (1351.4, 1351.3 and 1352.1, respectively). When a switch was applied on β and γ, the DIC was 1351.5. Based on the best shape of the posterior parameter distributions, the model with a switch on γ was continued to second stage model development. Although decreased compared with the base model, the posterior parameter distribution of β still showed slight tailing. Allowing for BSV on α decreased the DIC to 1319.3, while introducing BSV on β and/or γ resulted in convergence problems. Based on visual assessment of the individual predictions (Figure 2), the model describes the data well and there is clearly an onset of action (β) estimated, as was observed in the initial assessment of the data. For the first weeks after treatment, the mean posterior population predictions indicate an increased speed of HDRS decline in the MMTN treated group, represented by its larger γ, although the CIs overlap (Table 1). The only parameter that could not be accurately estimated was the time to displacement (β), based on its CI. The mean posterior population predictive plots (Figure 3) now indicate that a maximal effect has been reached at the end of the experiments. As no difference between treatment groups was allowed on α and s0 in the final model, the posterior predictions start and end at the same HDRS value. Separating these parameters based on treatment group did not improve the description of the HDRS trajectories.
The key difference between the groups is that MMTN administration reduced the time to maximal effect from ∼6 to ∼4 weeks but did not appear to impact on other aspects of the HDRS trajectory.
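The shift from roughly 6 to roughly 4 weeks can be checked against the posterior medians in Table 1 if a Gompertz form HDRS(t) = s0 − α·exp(−β·exp(−γ·t)) is assumed (the exact form of the fitted equation is not reproduced here, so this is an illustration rather than a re-analysis). Under that form, the time at which a fraction p of the amplitude has been reached is ln(β / −ln p) / γ, so it scales inversely with γ. The function name and the 99% threshold are hypothetical choices.

```python
import math

def time_to_fraction(p, beta, gamma):
    """Time at which a fraction p of the amplitude is reached (assumed Gompertz form)."""
    # Solves exp(-beta * exp(-gamma * t)) = p for t; needs 0 < p < 1 and beta > -ln(p).
    return math.log(beta / -math.log(p)) / gamma

# Posterior medians from Table 1; 99% of the amplitude as a stand-in for "maximal effect".
t_mmtn = time_to_fraction(0.99, 11.1, 1.8)
t_placebo = time_to_fraction(0.99, 11.1, 1.2)
```

With these inputs the times come out at roughly 3.9 weeks for MMTN and 5.8 weeks for placebo, and their ratio equals the ratio of the two γ values (1.5) regardless of the threshold or of β.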

Figure 2.

The median predictions (solid line) per individual (ID) over time for the memantine (A) and placebo (B) augmented groups, with their 95% confidence intervals (broken lines). The open circles represent the observed Hamilton Depression Rating Scores

Figure 3.

The median population predictions (solid lines) of the final Gompertz model, with the 95% confidence intervals (broken lines) for the memantine (bold lines) and placebo (fine lines) augmented groups

Table 1. Posterior population parameter estimates of the final Gompertz model
Parameter | Definition (units) | Median | 95% confidence interval
s0 | Hamilton Depression Rating Score at t = 0 (HDRS) | 18.4 | 16.6, 20.6
α | Amplitude (HDRS) | 7.7 | 5.6, 10.8
β | Time to displacement (weeks) | 11.1 | 2.3, 90.3
γMMTN | Speed of decline in memantine group (HDRS week−1) | 1.8 | 0.9, 3.6
γplacebo | Speed of decline in placebo group (HDRS week−1) | 1.2 | 0.5, 3.5
BSV s0 | Between subject variability s0 (HDRS) | 2.9 | 1.5, 4.4
BSV α | Between subject variability α (HDRS) | 4.3 | 2.7, 6.5
RV | Residual variability (HDRS) | 3.7 | 3.3, 4.1


Combination treatment with agents having presynaptic and post-synaptic effects on the glutamatergic system may be associated with better outcomes for depression. To this end we investigated the effects of MMTN compared with placebo augmentation to patients who showed an inadequate response to LMTG treatment. Non-linear population approach PD-modelling was applied for this investigation into the patterns in HDRS response of this pilot study, specifically to differentiate the speed of HDRS response between treatments.

After comparing several model structures, the Gompertz structure best described the data. A priori assessment of the data did not indicate differences in the HDRS score at the start or end of the experiments. In line with this assessment, applying a treatment switch on baseline HDRS (s0) or amplitude of HDRS improvement (α) did not improve the description of the data. The parameter representing time-to-displacement (β) improved the descriptive properties compared with the Emax model, although a treatment effect could not be identified by applying a switch on this parameter. Inclusion of a switch on the parameter describing speed of response (γ) clearly indicated an increased speed of HDRS response in the MMTN augmented group and improved the descriptive properties of the model. The data of a priori identified responsive and non-responsive patients were well described by the final Gompertz model (Figure 3). Due to the intrinsic behaviour of a Gompertz function, relapsing patterns in response (IDs 11, 12, 13, 22 and 26 in Figure 2) could not be described in the final model, which also caused difficulty in the estimation of β (relatively large CI). As the relapsing patterns mainly occurred in the last weeks of treatment (Figure 1), the posterior population predictions (Figure 3) showed no differences between the two treatment arms from week 6 to 8.

The analysis of this proof-of-concept study indicates no improvement in time to onset of effect (β) or maximal effect that could be reached over the 8 week period (α). However, the increased speed in HDRS decline (γ) suggests that MMTN augmentation may have a beneficial effect, although the CIs overlap between the treatment arms. Based on the posterior predictions, the maximum effect would be reached at 4 weeks rather than 6 weeks. An important clinical implication of the finding is the potential usefulness of MMTN in inducing a faster antidepressant response along with other slower acting medications such as LMTG or other antidepressants. The results suggest an antidepressant effect associated with MMTN treatment only for the period during which its dose was being titrated up. Once the dose was stabilized after 4 weeks of treatment, the antidepressant effect seemed to plateau. This plateau effect may be attributed to compensatory mechanisms within the glutamate system [7, 8], but it could be speculated that if the MMTN dose was increased further, a greater antidepressant response might be seen. As there was no increase in side-effects observed between the two treatment arms [10], this PD analysis justifies further studies to quantitate the augmenting effects of MMTN on LMTG.

To describe relapsing patterns in HDRS response, a bi-exponential function is favourable, e.g. an inverse Bateman function [18]. As only four individuals within the relatively small group of patients showed relapsing behaviour, convergence problems occurred while fitting a Bateman model. As a consequence, parameter identifiability was impossible for this model, especially for the parameters describing decrease and increase in HDRS. Increasing the number of patients in subsequent studies might help to identify relapsing behaviour. Several other advantages in the subsequent trial design could be proposed, e.g. inclusion of information on drug (pre-) treatments as well as obtaining pharmacokinetic information over time and use of different dosing regimens to collect information on drug exposure. This could lead to a pharmacokinetic–pharmacodynamic model with better descriptive as well as predictive properties. Also, the duration of the study might be prolonged, as several individuals might not have reached a maximal effect (Figure 1).

To quantitate better beneficial drug effects in subsequent studies, a robust longitudinal placebo model is required, that can account for drop-outs and provide insight into factors that contribute to differences in treatment response in patients and between studies. Combining such a model with a drug–effect model could eventually lead to simulation-based clinical trial design. An advantage of PD modelling over non-parametric approaches is that components for disease progression and (placebo) responsiveness to treatment can be easily implemented. Also, as data become available in subsequent trials, models can be constantly updated. This allows for adaptations during the trial to improve the study design in terms of dosing regimen, sample times, number of participants and duration of the trial. Such adaptive trial designs have great potential to reduce trial failure, specifically in studies investigating drug effects in the central nervous system [19].

In conclusion, this study describes an increase in speed of HDRS response following MMTN augmentation to LMTG treatment, when compared with placebo augmentation. It is a first step in the systematic description of HDRS effects following combination treatment of pre- and post-synaptic glutamate receptor inhibitors.

Competing Interests

Dr Stevens has received travel grants from the PPDM section of the AAPS (2008), LUF (2009) and NSFW (2009) and has had travel and housing reimbursements from Eli Lilly and Company (2012). Dr Bies receives funding through the NICHD (NIH), the Indiana CTSI through a gift of Eli Lilly and Company and through Merck via a grant through the Regenstrief Institute. In addition, Dr Bies has had travel reimbursement in conjunction with meetings of the NCI Cancer Cooperative Alliance and the British Pharmacology Society in conjunction with Editorial Board meetings. Dr Shekhar receives funding through the NIMH, NCRR and NCATS (NIH), and manages grants to the Indiana CTSI from Eli Lilly and Company, Fairbanks Foundation, Fairbanks Institute, Indiana University Health, and Lilly Endowment. Dr Anand receives funding through the NIMH, Astra Zeneca and, for this study, the Stanley Research Institute.

The clinical trial was completed with grant funding to Professor Anand from the Stanley Research Foundation. This publication was made possible, in part, with support from the Indiana Clinical and Translational Sciences Institute (CTSI) funded, in part by Grant Number TR000006 from the National Institutes of Health, National Center for Advancing Translational Sciences, Clinical and Translational Sciences Award. Dr Stevens was supported by a career development award from the Indiana CTSI funded by Eli Lilly and Co.