Rapid model exploration for complex hierarchical data: application to pharmacokinetics of insulin aspart

We consider situations, which are common in medical statistics, where we have a number of sets of response data, from different individuals, say, potentially under different conditions. A parametric model is defined for each set of data, giving rise to a set of random effects. Our goal here is to efficiently explore a range of possible ‘population’ models for the random effects, to select the most appropriate model. The range of possible models is potentially vast, because the random effects may depend on observed covariates, and there may be multiple credible ways of partitioning their variability. Here, we consider pharmacokinetic (PK) data on insulin aspart, a fast acting insulin analogue used in the treatment of diabetes. PK models are typically nonlinear (in their parameters), often complex and sometimes only available as a set of differential equations, with no closed‐form solution. Fitting such a model for just a single individual can be a challenging task. Fitting a joint model for all individuals can be even harder, even without the complication of an overarching model selection objective. We describe a two‐stage approach that decouples the population model for the random effects from the PK model applied to the response data but nevertheless fits the full, joint, hierarchical model, accounting fully for uncertainty. This allows us to repeatedly reuse results from a single analysis of the response data to explore various population models for the random effects. This greatly expedites not only model exploration but also cross‐validation for the purposes of model criticism. © 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.


Introduction
Consider a hierarchical data set, where we have a number of sets of response data, from different patients perhaps. We wish to apply a parametric model to each individual's data set and then define a 'population' model relating all of the individual-level parameters (random effects) together. There may be a variety of credible models for the random effects, and it is important to fully explore a range of possibilities. For example, the parameters may depend on observed covariates, or there may be different ways of partitioning their variability across levels of the hierarchy. However, such exploration can be cumbersome and time-consuming, especially when individual-level models are complex. For example, pharmacokinetic models are typically nonlinear (in their parameters), often have complex functional forms and are sometimes only available as a set of differential equations, with no closed-form solution. Fitting such a model for just a single individual can be a challenging task. Fitting a joint model for all individuals can be even harder, even without the complication of an overarching model selection objective.
We aim in this paper to develop a methodology for efficiently fitting a range of population models (including covariate selection) to the parameters (random effects) from individual-level models. We describe a two-stage approach that decouples inference on the population model from inference on each individual-level model but nevertheless fits the full, joint, hierarchical model, accounting fully for uncertainty. In the first stage, we estimate independent posterior distributions for the parameters in each individual-level model using Markov chain Monte Carlo (MCMC). We then store the resulting samples for use in the second stage, where they form 'proposal distributions' for the individual-level parameters in the full hierarchical model. The parameters are then updated via Metropolis-Hastings steps with an acceptance probability that is independent of the individual-level likelihood. This means that stage 2 is very efficient, allowing a wide range of models to be easily explored. In particular, our approach facilitates rapid exploration of covariate models using reversible jump MCMC, as well as exploration of different models for the variance components. It also facilitates criticism of the various population models via cross-validation techniques. Our approach is suited to situations in which the number of observations for each individual exceeds the number of parameters in the individual-level model and so is most likely to be useful in early clinical studies, in which detailed individual-level data are available.
We use this methodology to study four parameters relating to insulin kinetics in pregnant women with type 1 diabetes, using data from two clinical studies. We consider covariate selection models that enable identification of covariates that may be related to the kinetic parameters, as well as different structures for the hierarchical model, up to four levels. We believe that the complexity of our analysis would render it impracticable without the proposed two-stage methodology. Our two-stage approach is implemented in extensions to the OpenBUGS software [1,2], which is freely available from www.openbugs.net.
Rapid-acting insulin analogues (such as insulin aspart and insulin lispro) can assist in safely optimising glucose control [3][4][5], but little is known about their pharmacokinetics and reproducibility in pregnancy. A significant gestational delay of approximately 30 min in time-to-peak plasma aspart concentration from early to late pregnancy has been previously described [6]. The aims of the study reported herein were to explore the relationship between aspart pharmacokinetics and clinical/demographic factors for subjects with type 1 diabetes undergoing continuous subcutaneous insulin infusion (CSII) during pregnancy and to assess reproducibility within and between subjects.
Basic clinical results from some of these analyses have been reported previously [7]. Here, we present and explore more thoroughly the statistical and methodological issues.

Data
Our data are from two 24-h trials of insulin aspart for pregnant women with type 1 diabetes conducted between March 2009 and April 2011 [8,9]. Study protocols were approved by the Research Ethics Committee, and all participants provided written informed consent. During each study, women arrived at the clinical research facility (Addenbrooke's Hospital, Cambridge, UK) at midday and were monitored for 5 h after dinner on day 1 and breakfast on day 2. In study 1, 10 women were studied on two occasions: in early (12-16 weeks) and in later (28-32 weeks) gestation [8], with standardised dinner and breakfast, and under sedentary conditions. In study 2, 12 women were studied on two occasions in mid gestation (12-33 weeks) with standardised meals, snacks and exercise (3×20-min walks at 14:00, 19:30 and 09:00 h; and 2×50-min treadmill sessions at 15:00 and 09:30 h) [9].
A CSII delivering aspart maintained stable fasting and pre-meal glycaemic conditions during each visit using either closed-loop or conventional CSII. Under closed-loop CSII, the basal rate was adjusted every 15 min using continuous glucose measurements, whereas with conventional CSII, the women set temporary basal rates and used correction boluses according to capillary glucose measurements. The basal infusion rate was recorded from 14:00 h on day 1 in the first study and from 00:00 h on day 1, before the study started, in study 2. The median (interquartile range) infusion rate was 0.6(0.1-1.2) international units (U) per hour. All prandial insulin boluses were calculated according to capillary glucose levels and were initiated at 18:00 h on day 1 and 11:00 h on day 2. The median (interquartile range) prandial bolus dose was 8.9(6.5-12) U. We assume boluses infused steadily over a 1-min period, apart from eight boluses, which were delivered over longer periods. Insulin concentration readings were recorded from 16:30 h on day 1 in the first study and from 14:00 h on day 1 in the second study. Plasma insulin concentration was measured every 10 min for 90 min post-meal, every 15 min for 1.5-5 h post-meal and at 15-to 30-min intervals at other times, providing an average of 59 measurements per woman. Plasma insulin concentration was measured by an immunochemiluminometric assay (Invitron, Monmouth, UK; intra-assay coefficient of variation (CV) 4.7% and inter-assay CV 7.2-8.1%). Further details of study procedures are reported elsewhere [8,9].
We consider each mealtime separately so that the basic unit of study is a time series of insulin concentrations (a 'profile') over a period around the evening (17:30-23:00 h on day 1) or morning (06:30-12:00 h on day 2) meal from a particular visit of a pregnant woman. A total of 22 women underwent two visits, each involving two meals; this gave rise to 22 × 2 × 2 = 88 profiles to model.
Thirteen clinical and demographic factors were examined. These factors, which were considered time invariant at the profile-level, are summarised in Table I.

Profile-level model
3.1.1. Mechanistic model. We use a two-compartment model to represent insulin kinetics, as shown in Figure 1. At time t, Q 1 (t) and Q 2 (t) represent the insulin masses (in international units, U) in two subcutaneous tissue compartments, representing the delay in absorption into the blood. A controlled infusion of insulin enters compartment 1 at a rate Inf(t) (mU/min), supplemented by boluses of insulin, which we model as entering at a rate Bol(t) (mU/min). Insulin leaves both compartments according to a first-order process, with rate constant (t max ) −1 , where t max is the time-to-peak insulin concentration in minutes. This gives rise to a pair of ordinary differential equations: with Q 1 (0) = Q 2 (0) = 0, where t = 0 corresponds to midnight on day 1 before the trial starts. The run-in time between t = 0 and the time t = t start , when the trial starts, allows the states Q 1 (t start ) and Q 2 (t start ) to become independent of Q 1 (0) and Q 2 (0). Equations (1)-(2) do not, in general, have a closedform solution. In this paper, we obtain a numerical solution, instead, via the Cash-Karp Runge-Kutta method [10]. We base our regression function on the observable quantity in these trials, which is plasma insulin concentration. This is assumed to equilibrate instantaneously with the efflux from compartment 2 and is given by Q 2 (t)∕(t max × wt × MCR), where wt denotes the patient's body weight (kilogramme) and MCR is the metabolic clearance rate per unit of body weight: MCR is the volume of plasma (l) cleared of Figure 1. Two-compartment model for insulin kinetics, in which Q 1 (t) and Q 2 (t) represent the insulin masses in two subcutaneous tissue compartments, representing the delay in absorption into the blood. Inf(t) and Bol(t) represent insulin input via continuous infusion and supplemental boluses, respectively.
insulin per minute, per kilogramme. To account for any long-acting insulin taken before the start of the study (which continues to have an effect for around 24 h), we add a linear, 'residual insulin' term to our regression function. We assume that residual insulin concentration changes at a rate a (pmol/l/min) and that the post-prandial concentration at time t end (5 h post-meal) is b (pmol/l). The regression function is then given by with unknown parameters = (t max , MCR, a, b) ′ and observed data z = (wt, t end ) ′ .

Observation model.
Denote by y ijkm the mth measured plasma insulin concentration, taken at time t ijkm , for individual i during and the following meal k of visit where ijk and z ijk are profile-specific vectors of unknown parameters and data, respectively. In addition, the residual variance is given by which combines additive and multiplicative variance models, allowing for increases in measurement error above some baseline level as the modelled insulin concentration increases.

Population model
We wish to make inferences about the unknown parameters t max , MCR, a and b. In particular, we are interested in establishing whether any relationships exist between the parameters and the observed covariates, and in quantifying their variabilities both within and between women. We begin by making separate distributional assumptions for each component (indexed by l = 1, … , 4) of the parameter vector ijk : where LN(. , .) denotes a log-normal distribution with first and second parameters corresponding to mean and variance, respectively, on the log-scale. Note that we specifically avoid making a multivariate assumption for the whole ijk vector, for reasons that we will discuss later (Section 5). By choosing different forms for ijkl , we can specify a variety of population models for the ijkl s. In particular, we consider one-level, two-level and three-level population models as follows. In each case, we consider models both with and without covariates included.

One-level models.
Here, we assume that the profile-specific parameters ijkl are all conditionally independent, given a set of global parameters (or fixed effects). Hence where W ijkl is a row-vector containing observed values (for individual i at meal k of visit j) of the covariates chosen as predictors for the lth kinetic parameter (l = 1, … , 4); we discuss this further in Section 3.2.4. In this model, l represents the global mean (or intercept if covariates are included in the model), and 2 l represents the total variability (after any controlling for covariates) of parameter (or log-parameter) l.

Two-level models.
Here, we acknowledge that profiles from the same woman may be correlated. It is tempting to achieve this by introducing patient-specific intercept and gradient parameters. However, with only four profiles from each woman, it is not very realistic to attempt to estimate patient-specific covariate effects, and so, we allow only the intercept to vary between women: Here, 2 l measures the between-patient variability (for parameter l), and 2 l now represents the within-patient variability.

Three-level models.
Additionally, we might acknowledge that profiles from the same visit for a given woman may be correlated. Alternatively, profiles corresponding to the same mealtime for a given woman may be correlated. We might, therefore, allow visit-specific or mealtime-specific intercepts via In both cases, il represents the patient-specific mean intercept for parameter l, and 2 l measures the between-patient variability. The terms 2 V l and 2 M l , respectively, measure the variability between visits and between mealtimes (for a given individual) of the random intercepts for parameter l. In the case of visit-specific intercepts, 2 l now measures the within-visit variability, whereas in the case of mealtimespecific intercepts, 2 l measures the within-mealtime variability. Figure 2 shows a graphical model representation of our statistical model in the case of three-level (visit) model with covariates.

Covariate selection.
Suppose we have c available covariates in total and we arrange their observed values in an n × c matrix X, where n is the total number of profiles. We expect the importance of each covariate may differ between parameters of the mechanistic model, and thus, ijkl for each l = 1, … , 4 may be a function of a different subset of the predictors. We represent the selected subset of covariates by a vector l = ( l1 , … , lq l ) ′ . This gives the column indices in X of the selected covariates. Let W l denote the corresponding n × q l design matrix and W ijkl denote the row of W l corresponding to meal k at visit j for individual i. Thus, the appropriate linear predictor term for inclusion in ijkl is W ijkl l , where l is a vector of q l regression coefficients. In this paper, we treat each q l , l and l as unknown parameters and estimate their values using reversible jump MCMC [11,12]. Note that we standardise (and centre) the continuous covariates in X so that they are assessed in covariate selection on the same scale.
Interactions. Preliminary exploratory analyses suggested that parameters may differ between study and/or mealtime. However, when these covariates were included in X, neither was selected with high probability. We therefore decided to explore possible interactions. We allow for this by replacing study and mealtime in X with indicator variables for the following four combinations: study 1 breakfast, study 1 dinner, study 2 breakfast and study 2 dinner. We then choose a suitably weighted prior for the number of these indicators allowed in the model simultaneously, as discussed in Section 3.3.2.

Variance components.
The parameters of the residual variance model are treated as nuisance parameters and are assigned the following vague priors: The upper bound of 1 for the ijk s is chosen as residuals larger in magnitude than the modelled concentration are implausible in this setting. The remaining variability parameters are all assigned vague uniform priors on the standard deviation scale: However, in reversible jump MCMC, the elements of each l will play different roles in the covariate model from one iteration to the next; that is, a given element may correspond to various different covariates during evolution of the simulated Markov chain. Hence, it seems appropriate to choose the same prior variance for each element. In light of this, we can standardise the covariates so that the model treats them all equally. But we may still wish to allow for more or less diffuse priors for different types (or groups) of covariate. We thus decompose the linear predictor into separate terms for the continuous, binary and 'study-mealtime-interaction' covariates: where the C, B and I superscripts denote subvectors of W ijkl and l , corresponding to continuous, binary and interaction covariates, respectively, and the same prior variance is specified for all elements of each subvector of l . The prior standard deviation for coefficients corresponding to each covariate group is given by Δ l ∕1.96Δx, where Δ l is the width of the range of plausible values for the lth kinetic parameter (or log-parameter) and Δx is the width of the range of values for that covariate-type: Δx = 1 for binary covariates (including study-mealtime interactions) and Δx ≈ 2×1.96 for standardised continuous covariates. This is based on the assumption that the minimum and maximum plausible gradients define a 95% prior interval. Note that, lacking other prior information, we set Δ l to the range of the corresponding stage 1 posterior medians, and so, there is an element of using the data twice. While undesirable, this is necessary because the chosen prior variance influences the probability of inclusion of covariates in the model; hence, an informative prior is essential here.
The parameters associated with each covariate group are updated in the overall MCMC scheme as a separate reversible jump block. In each case, we wish to assume that all covariate models are equally likely a priori. We begin by assuming that all models of the same dimension are equally likely: where c C , c B and c I are the total numbers of available continuous, binary and interaction covariates, respectively. We then choose q I l = 0 4 12 q I l = 1 6 12 q I l = 2 1 12 where the latter prior excludes the possibility that q I l = 4 as this would lead to an unidentifiable model and acknowledges that all four possible models with q I l = 3 are essentially the same. Note that this specification renders equally probable a priori all distinct and identifiable models both within each covariate group and for the composite model defined by l and q l .
We make inference for the full hierarchical model (4) through a two-stage approach [13], which is outlined as follows.

Stage 1 analysis.
In the first stage, we construct and estimate a posterior distribution for each profile independently, using the likelihood defined in Section 3.1.2, independent priors for the nuisance parameters given by (3) and independent, 'flat' priors for the kinetic parameters given by where the lower and upper bounds for t max and MCR represent the minimum and maximum physiologically plausible values, respectively. Using MCMC, we generate a sample of simulated values for the profile-specific parameters from each profile-specific posterior: where y ijk denotes the set of all measured concentrations for profile ijk and p 1 ( ijk ) is the 'stage 1 prior' for ijk , given by the product of terms in (5). We denote the samples by , H ijk , and store them for use in stage 2, where they will be used to form proposal distributions for the profile-specific parameters in the full hierarchical model.

Stage 2 analysis.
Stage 2 defines an MCMC scheme for updating all unknown parameters in the full hierarchical model. The parameters for each of the 12 covariate selection sub-models, G l , G l , q G l , G = C, B, I, l = 1, … , 4, are jointly updated using reversible jump MCMC as described elsewhere [12]. The remaining parameters in Ω are updated by standard means. Each of the 'intercept' parameters, l and, where appropriate, il , V ijl and M ikl , i = 1, … , N, j = 1, … , J i , k = 1, … , K ij , l = 1, … , 4, has a full conditional distribution that is available in closed form, and so a standard Gibbs step is appropriate. The variance components, l , l , V l , M l , l = 1, … , 4, can be updated by slice sampling [14], say, which is the default option in OpenBUGS for our model.
The profile-specific parameters ijk , ijk and ijk are updated, jointly, as follows. From (4), their joint full conditional distribution is given by We wish to make a Metropolis-Hastings update with this as the target distribution. A candidate update is drawn from the proposal distribution by choosing s uniformly from {1, … , H ijk } and setting , where the right-hand side is one of the samples stored in stage 1. From (6) and (7), the target-to-proposal density ratio is thus Note the cancellation of likelihood terms and nuisance priors. This makes for rapid computation in stage 2, facilitating exploration of a wide range of population models for the profile-specific parameters. That the ratio does not depend on ijk or ijk also means that these parameters need not actually be updated. The Metropolis-Hastings acceptance probability for the proposed update is min (1, ), where is given by the target-to-proposal ratio at the proposed state divided by that at the current state: ) .
If the stage 1 prior p 1 (.) is 'flat', as in our model, then the ratio on the right can be ignored as it is approximately equal to 1.

Model assessment via cross-validation
The two-stage method described earlier also expedites cross-validation to assess the various models considered. In leave-one-out cross-validation, a model is evaluated via predictions drawn from the model estimated with a single observation, or set of observations, excluded. Large disparities, measured by a discrepancy function, between these predictions and the excluded observations are indicative of model inadequacy. The procedure is repeated with each observation, or set of observations, omitted in turn. Such approaches have been widely discussed in the literature [15,16]. Note that here we wish to assess the population model for the random effects ijk , as opposed to the fit of the pharmacokinetic model to the observed response data. A discrepancy function defined in terms of the response data is not appropriate when the focus is on the random effects, because agreement or otherwise of random effect estimates is unnecessarily masked by the observation error. Hence we require a discrepancy function that measures differences between 'observed' and 'predicted' random effects, denoted obs ijk and pred ijk , respectively. Clearly, the random effects cannot be observed, but the profilespecific posteriors defined by (6) can be used in lieu of observations [15]. The predicted random effects are obtained from the 'predictive prior': where y ⧵ijk denotes all observations except those from individual i during and the following meal k of visit j. This must be estimated with observations from each profile excluded in turn, which can be prohibitively time-consuming if the model is complex, as here. However, with the two-stage methodology presented here, only the second stage (which is computationally quick) needs to be repeated each time. Hence, this is a potentially important alternative to importance sampling [17], which can be unstable, or approximation [15,18].
ijk − obs ijk be our discrepancy function. We define the Bayesian p-value P = Pr and estimate it, for each profile, by independent sampling from the predictive prior (8) and the stage 1 posterior (6), obtaining n p-values in total. Each p-value should be uniformly distributed under the 'true' sampling model [15], suggesting that model assessment guided by quantilequantile plots of p-values is appropriate (note, however, that the p-values are not independent).

Stage 1 analysis
To draw samples from the independent, profile-specific posterior distributions, we used the freely available BUGS software [2,19]. Specifically, we used the OpenBUGS implementation (www.openbugs. net), which allows regression functions to be specified in terms of differential equations as standard. Convergence of the generated Markov chains was assessed informally by visually examining chainhistory plots and formally by applying the Brooks-Gelman-Rubin diagnostic [20,21] to the output from two Markov chains starting from widely differing initial values. We found that a burn-in phase of 100 000 iterations was easily sufficient. We performed a further 1 000 000 iterations following burnin for each profile, retaining every 100th value of each kinetic parameter (giving 10 000 approximately independent samples).   MCR, a and b). The shading is proportional to the inclusion probability.
A total of 1302 plasma insulin concentrations were available for analysis. The model fits for a typical individual are shown in Figure 3. Our model was unable to fit six out of the 88 profiles available, and so these profiles were removed from our analysis. Figure 4(a) shows predicted (posterior median) versus observed insulin concentrations for all profiles fitted and indicates a good performance overall. To examine the model's performance in more detail, we also plot standardised residuals against time and against predicted concentration (Figure 4(b) and (c), respectively). There are no obvious trends, suggesting that the residual variance model is adequate and the residuals are generally within the expected range, although there is a hint of systematic bias about 10 min after each mealtime. Figure 6. Decomposition of the variability in each pharmacokinetic (PK) parameter under each population model. Each panel represents the variance decomposition for a particular PK parameter for models either with (top row) or without (bottom row) covariates. Each coloured bar represents the proportion of total variance in the corresponding component for one of the four population models considered. Numbers shown within bars and on the y-axes are variances multiplied by the following factors, for log t max , log MCR, a and b, respectively: ×10 3 , ×10 3 , ×10 4 and ×10 −2 . Figure 5 shows the posterior inclusion probabilities for each covariate being included in the linear predictor for each pharmacokinetic parameter, under each population model. We consider an inclusion probability greater than 0.5 to signify a notable association. The covariates identified as associated with each pharmacokinetic parameter were substantively consistent across the population models considered. All models strongly indicate that both t max and a differ after breakfast in study 2 compared with the other meals and studies. We also found evidence for association of t max with pregnancy gestation and diabetes duration and of a with peak bolus rate, expected total daily dose and delivery of multiple boluses. Evidence of association with MCR was less strong, although there was some evidence that MCR differs in study 2. No notable associations were observed for b. Cross-validation analyses demonstrated that all of the population models we consider perform well: quantile-quantile plots (not shown) of the distribution of p-values under each model showed no obvious departures from uniformity. It was not possible to discern whether any specific model gave 'more uniform' plots than others. Differences amongst the models are more evident in the variance decomposition. These are shown in Figure 6. Two conclusions in common are apparent for t max and MCR when comparing the various models. First, the models including covariates exhibit notably less residual variability, as might be expected given the evidence of parameter-covariate association described earlier. Second, the residual variability is reduced in the two-level and three-level models compared with the one-level model. The three-level, meal-specific model, however, offers little beyond the two-level model in terms of explaining variability of the random effects: the residual variability is almost identical. In contrast, the three-level, visit-specific model substantially reduces the residual variability beyond that in the two-level model. The two-level and three-level models are similarly an improvement for the post-prandial concentration b, but there is little difference between the models with and without covariates. Conversely, woman-specific effects are less apparent for the accumulation rate a, but the inclusion of covariates does make a notable difference to the residual variability.

Stage 2 analyses
Overall, the three-level, visit-specific model with covariates appears to offer the best explanation of the observed variation. Parameter estimates for this model are shown in Table II. The most notable effect is the faster absorption after breakfast in study 2 as implied by the substantially decreased time-to-peak t max for these profiles. We also estimated that time-to-peak t max increases by 1.7% per week of gestation in pregnancy but decreases by 1.1% per year of diabetes.

Discussion
We have presented a two-stage method for simplifying the analysis of complex hierarchical data. This can be thought of as a type of particle filtering (sequential Monte Carlo sampling; [22,23]), where the resampling is carried out via Metropolis-Hastings. The method can be used whenever there is sufficient information in unit-specific data that units can be analysed independently (in stage 1). When unit-specific models are complex, as here, independent analyses may be a prerequisite for a full hierarchical analysis anyway, because the unit-specific data sets may require individual attention, for example, because of convergence issues or because we wish to assess whether a given unit-specific model can perform adequately over a wide range of units, say. However, our method is most likely to be useful when there is a range of 'population' models to consider for the unit-specific parameters. This is because the likelihood is dealt with in stage 1 and need not be computed in stage 2. Hence, the second stage can be performed repeatedly at little computational cost. This has allowed us to explore parameter-covariate relationships between four unknown parameters and 13 covariates of three different types, under a range of hierarchical structures (with up to four levels), for a model defined in terms of differential equations. The two-stage approach has also greatly facilitated cross-validation analyses to assess model performance. We chose to use cross-validation to explore models because we wished to focus our assessment of the models specifically on the choice of random effects model. Alternative approaches, such as the deviance information criterion (DIC) [24], would have undesirably focused on the pharmacokinetic model instead. It is our belief that the analyses presented herein would have been practically infeasible (or at least extremely cumbersome and time-consuming) without a two-stage approach.
We are mainly interested in the parameters t max and MCR, because a and b relate to the 'residual' insulin model, which is somewhat speculative (although it may aid in compensating for model misspecification). Our analyses indicate that t max is related to diabetes duration and gestational age in pregnant women using CSII. Time-to-peak increases by 1.7% per week of gestation and decreases by 1.1% per year of diabetes. In addition, there was a faster time-to-peak after breakfast in study 2, suggesting that moderately vigorous physical activity may counteract the gestational delay. The factors contributing to slower t max in late pregnancy are unknown. The impact of diabetes duration may be related to loss of residual C-peptide activity [25]. While these clinical/demographic factors are not easily modified, the impact of exercise, most likely related to enhanced tissue perfusion and temperature, is potentially modifiable and may be a useful tool for speeding up insulin absorption as pregnancy advances. In contrast with Gagnon-Auger et al. [26], where substantially higher doses were used, our results do not indicate relationships between t max and prandial dose, total daily dose or maternal body mass index.
There were no highly probable relationships identified for MCR, although in some models, there was a suggestion of decreased clearance following study 2 dinner and increased clearance following study 2 breakfast. This latter effect could be due to increased blood flow following physical exercise. There were also no relationships identified for b, but several were apparent for the drainage rate (−a). The drainage rate increased as the expected total daily dose increased but decreased with peak bolus rate and when multiple boluses were used.
A limitation of our analysis is that the data apply only to aspart delivered by CSII and are not necessarily applicable to lispro (another rapid-acting insulin analogue) or multiple daily injection therapy. However, as noted by Homko et al. [27], aspart and lispro have comparable pharmacokinetics, and CSII is increasingly recommended when glycaemic control targets are not achieved on multiple daily injections [28].
Although the two-stage method does not require independence between the four kinetic parameters, we chose in Section 3.2 to make separate distributional assumptions for each parameter. This is to avoid being too informative about the sizes (and relative sizes) of the variance components in our model, which characterise both between-patient and within-patient variabilities. The obvious alternative would be to assume multivariate normality for the vector (log t max , log MCR, a, b) ′ , with a covariance matrix free to assume any symmetric-positive-semidefinite (SPSD) form, allowing for any correlations that may exist between the different kinetic parameters. The inverse-Wishart prior commonly chosen for such covariance matrices ensures the SPSD constraint automatically. However, it is more informative than might be expected and can have considerable influence on the posterior, particularly when, as here, there are multiple levels of variability and small sample sizes within levels (e.g. [29]). Other approaches proposed in the literature (e.g. [30,31]) may be less informative, but we have chosen the relatively simple option of not allowing for correlations between kinetic parameters and assuming uniform priors for their variances.
We feel that this approach also reduces the possibility of confounding between covariance parameters and the inclusion of covariate effects.