External evaluation of population pharmacokinetic models of vancomycin in neonates: the transferability of published models to different clinical settings



Professor Evelyne Jacqz-Aigrain MD PhD, Department of Pediatric Pharmacology and Pharmacogenetics, Clinical Investigation Center CIC9202 INSERM, Hôpital Robert Debré, 48 Boulevard Sérurier, 75935 Paris Cedex 19, France.

Tel.: +331 4003 3656

Fax: +331 4003 5779

E-mail: evelyne.jacqz-aigrain@rdb.aphp.fr



Vancomycin is one of the most evaluated antibiotics in neonates using modeling and simulation approaches. However, no clear consensus on optimal dosing has been achieved. The objective of the present study was to perform an external evaluation of published models, in order to test their predictive performances in an independent dataset and to identify the possible study-related factors influencing the transferability of pharmacokinetic models to different clinical settings.


Published neonatal vancomycin pharmacokinetic models were screened from the literature. The predictive performance of six models was evaluated using an independent dataset (112 concentrations from 78 neonates). The evaluation procedures used simulation-based diagnostics [visual predictive check (VPC) and normalized prediction distribution errors (NPDE)].


Differences in predictive performances of models for vancomycin pharmacokinetics in neonates were found. The mean NPDE values for the six evaluated models were 1.35, −0.22, −0.36, 0.24, 0.66 and 0.48, respectively. These differences were explained, at least partly, by taking into account the method used to measure serum creatinine concentrations. The adult conversion factor of 1.3 (enzymatic to Jaffé) was tested with an improvement in the VPC and NPDE, but it still needs to be evaluated and validated in neonates. Differences were also identified between analytical methods for vancomycin.


The importance of the analytical techniques used for serum creatinine and vancomycin concentrations as predictors of vancomycin concentrations in neonates has been confirmed. Dosage individualization of vancomycin in neonates should consider not only patients' characteristics and clinical conditions, but also the methods used to measure serum creatinine and vancomycin.

What Is Already Known about This Subject

  • Population pharmacokinetics of vancomycin have been widely studied in neonates.
  • Many covariates including bodyweight, gestational age and post-natal age, renal function, co-administered drugs, etc. have been evaluated and some of them are associated with inter-individual pharmacokinetic variability.

What This Study Adds

  • The analytical technique used for measuring serum creatinine concentrations has been confirmed as a study-related factor influencing the transferability of published models to different clinical settings.
  • Different predictive performances were demonstrated between analytical methods (FPIA and EMIT).
  • The neonatal conversion factor of serum creatinine concentrations between the Jaffé and enzymatic methods and the interferences/cross-reactivity of analytical methods need to be evaluated in neonates in future studies.


Vancomycin, a glycopeptide antibiotic, is widely prescribed in neonatal intensive care units because of the increased incidence of neonatal late onset sepsis caused by coagulase-negative staphylococci and methicillin-resistant Staphylococcus aureus [1]. Vancomycin is a large, hydrophilic molecule with poor oral absorption. Hence it is given intravenously to treat systemic infections. Vancomycin is 25–50% protein bound, mainly to albumin and IgA (protein binding changes non-linearly with vancomycin concentrations), and is almost exclusively eliminated by the renal route [2, 3]. A small amount of vancomycin is eliminated by concentration-dependent, non-renal routes [4]. The pharmacokinetic–pharmacodynamic relationship of vancomycin to therapeutic response can be optimized by achieving a ratio of the area under the concentration–time curve in 24 h : the minimum inhibitory concentration of at least 400 h in adults with Staphylococcus aureus pneumonia [5, 6].

Population pharmacokinetic modelling approaches are strongly recommended for analysis of PK data in neonates [7]. To date, vancomycin is one of the most studied antibiotics using population pharmacokinetics in neonates and numerous studies have been published to characterize its pharmacokinetic parameters, to identify individual factors influencing variability and/or to develop dosing regimens for neonates [8-21]. Although all these models have been internally validated, no clear consensus on the optimal dosing regimen has been achieved in clinical practice [8, 22] because results obtained differ from one study to another.

One hypothesis for this discrepancy might be centre-related differences in the data used for modelling. The centre-related factors (such as study population, including number of neonates, clinical practices, treatment protocols, analytical methods for vancomycin and serum creatinine concentration measurements) might have important influences on extrapolating the results to patients from another centre. This potential influence might not be identified with an internal evaluation process [23]. A recent review of all the population pharmacokinetic analyses of vancomycin also heightened the requirement for external evaluation of published models [24]. Therefore, the present study was conducted to perform an external evaluation of published vancomycin population pharmacokinetic models in neonates, in order to test their predictive performance using an independent dataset. Our aim was to identify the possible study-related factors influencing the transferability of pharmacokinetic models to different clinical settings.


Review of population pharmacokinetic models of vancomycin in neonates

We performed a systematic literature search in PubMed and EMBASE for all studies evaluating population pharmacokinetic parameters of vancomycin in neonates until 2010. We combined the following key words (MeSH and free text) in our search strategies: vancomycin, neonate, infant, newborn, paediatric, pharmacokinetic and population pharmacokinetics. Reference lists of identified articles were then manually screened for additional relevant studies by two authors (Wei Zhao and Evelyne Jacqz-Aigrain).

The following modelling information was extracted from the articles and from direct contacts with the authors: model structure, typical population pharmacokinetic parameters, inter- and intra-individual variability, residual variability, covariates, estimation method (first-order or first-order conditional, with or without the interaction option) and the methods of handling concentrations below the lower limit of quantification (e.g. half of the quantification value, or the M3 method, which maximizes the likelihood for all the data and treats the concentrations below the quantification limit as censored [25]). Models without confirmed information from original authors were excluded.

Patients-external evaluation database

Neonates with a post-natal age of <28 days, receiving vancomycin during their stay in the neonatal intensive care unit of Robert Debré University Hospital (Paris, France) between January 2010 and November 2010 were considered for inclusion in this prospective study if at least one vancomycin serum concentration was assayed for therapeutic drug monitoring. The following data were prospectively collected by a trained research assistant (Daolun Zhang): vancomycin dose, duration of administration, post-menstrual age (weeks), post-natal age (days), small for gestational age according to the foetal growth weight standard in European neonates [26], weight (kg), serum creatinine concentrations (μmol l−1), use of positive pressure ventilation, concurrent medications (such as non-selective cyclo-oxygenase inhibitors, inotropes, amoxicillin-clavulanic acid, spironolactone), time of the last dose before sampling and blood sampling times. Patients with incomplete information were excluded. The study was carried out in accordance with the Declaration of Helsinki. The Ethics Committee (CPP Comité de Protection des Personnes, Hôpital Saint Louis, île-de-France IV) declared that this research project could be exempted from obtaining informed consent because all data were extracted during routine therapeutic drug monitoring procedures.

Dosing regimen and sampling

Vancomycin (Sandoz, Levallois-Perret, France) was administered as an intravenous infusion over 60 min. The empirical initial dosing regimen is presented in Table 1, showing that the dose of 15 mg kg−1 was administered at a dosing interval of 6 to 36 h depending on post-menstrual age. Monitoring of vancomycin concentrations was performed in order to maintain a trough concentration at steady-state between 5 and 15 mg l−1. In cases of severe infection, trough concentrations of up to 20 mg l−1 were targeted, with close clinical follow-up.

Table 1. Dosage regimen of vancomycin in neonates used at Robert Debré hospital
Post-menstrual age (weeks) | Dose (mg kg−1) | Dosing interval (h)

Assay of serum vancomycin and creatinine concentrations

The serum vancomycin trough concentrations were determined either by an enzyme-multiplied immunoassay method (EMIT) using the Cobas Mira Plus System (Roche Diagnostics, Neuilly-sur-Seine, France) or by a fluorescence polarization immunoassay method (FPIA) using the Cobas Integra 400 Plus system (Roche Diagnostics, Meylan, France). According to the manufacturer's instructions, the lower limit of quantification and coefficients of variation were 5 mg l−1 and <5.7% for EMIT, and 0.74 mg l−1 and <3.3% for FPIA, respectively [27, 28]. Serum creatinine concentrations were measured by an enzymatic method using the Advia 1800 chemistry system (Siemens Medical Solutions Diagnostics, Puteaux, France). The lower limit of quantification for this assay was 13 μmol l−1.

Evaluation of the predictive performance of published pharmacokinetic models

The predictive performance of published pharmacokinetic models was evaluated individually by simulation-based diagnostic methods. The simulation studies were conducted using NONMEM VI (V2.0; Icon Development Solutions, USA).

Simulation-based diagnostics were performed by using normalized prediction distribution errors (NPDE) [29] and visual predictive checks (VPC). The dataset was simulated 1000 times using the published population model parameters (typical PK parameters, inter-individual variability and residual error models). For NPDE, a cumulative distribution was assembled for each observation with 1000 simulated concentrations. The NPDE is expected to follow an N(0, 1) distribution. The following graphs were plotted by using NPDE within the R package (v1.2) [30]: (i) a QQ-plot of the distribution of the NPDE vs. the theoretical N(0, 1) distribution and (ii) a histogram of the NPDE. For the VPC, the simulated concentrations (5th, 50th and 95th percentiles) and observed concentrations (5th, 50th and 95th percentiles) were plotted against time.
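To make the NPDE calculation concrete, the following Python sketch computes a prediction discrepancy for a single observation from its simulated replicates and maps it onto the normal scale. It is a simplified illustration and not the implementation in the NPDE R package: it assumes one observation per subject, so the within-subject decorrelation step of the full method is omitted, and the function names and example numbers are hypothetical.

```python
from statistics import NormalDist


def prediction_discrepancy(observed, simulated):
    """Fraction of simulated concentrations that fall below the observation."""
    k = len(simulated)
    pd = sum(1 for s in simulated if s < observed) / k
    # Keep pd strictly inside (0, 1) so the inverse normal CDF stays finite.
    return min(max(pd, 1 / (2 * k)), 1 - 1 / (2 * k))


def npde_single(observed, simulated):
    """Map the prediction discrepancy onto the N(0, 1) scale."""
    return NormalDist().inv_cdf(prediction_discrepancy(observed, simulated))


# Hypothetical example: 1000 simulated troughs spread evenly from 1 to 1000.
simulated = [float(i) for i in range(1, 1001)]
central = npde_single(500.5, simulated)  # observation at the simulated median
high = npde_single(2000.0, simulated)    # observation above every simulation
low = npde_single(0.5, simulated)        # observation below every simulation
```

In this sketch an observation lying above most of its simulated replicates yields a positive value, which is why a positive mean NPDE is read as under-prediction by the model.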


The external evaluation dataset consisted of 112 trough steady-state concentrations (32 measured with EMIT and 80 with FPIA) obtained from 78 neonates. Routine monitoring of vancomycin concentrations was carried out after the third dose and dosage intervals were assumed to be regular from the start of treatment (the ADDL data item was used to account for past dosage history). Overall, 58% of the first measured concentrations (n = 45) were in the target range of 5 to 15 mg l−1, 14% were below 5 mg l−1 and 28% were above 15 mg l−1. After dosage adjustment, 61% of the measured concentrations (n = 16) achieved the target, 4% were below 5 mg l−1 and 35% were still above 15 mg l−1. The characteristics of the patients are presented in Table 2.

Table 2. Baseline characteristics of 78 neonates (112 samples) in the evaluation dataset
Characteristic | Number | Mean | SD | Median | Range
Concentration (mg l−1) |  | 13.6 | 7.4 | 12.0 | <LLOQ–42.8
Weight (kg) |  | 1.41 | 0.88 | 1.14 | 0.57–4.9
Post-natal age (days) |  | 14 | 6 | 14 | 3–27
Post-menstrual age (weeks) |  | 32.2 | 4.3 | 31.0 | 26.3–43.7
Serum creatinine (μmol l−1)* |  | 52 | 26 | 46 | 21–174
Positive pressure ventilation | 44 |  |  |  |
Inotropic drugs | 10 |  |  |  |
Non-selective NSAIDs | 5 |  |  |  |
Amoxicillin-clavulanic acid | 0 |  |  |  |
SGA, small-for-gestational-age. *The serum creatinine concentration was measured on the same day as the vancomycin concentrations by the enzymatic method.

Seven neonatal population pharmacokinetic models of vancomycin were published between years 1999 and 2010 [8-13, 17]. Differences between studies were identified with regard to the neonatal population in terms of total number, number of preterm and term babies and covariates tested. The characteristics of these seven studies are summarized in Table 3. As the dataset from model A was included in model B, only model B was evaluated in the following tests.

Table 3. Summary of seven published population pharmacokinetic models of vancomycin in neonates
Model | A | B§ | C | D | E | F | G
Reference | Anderson et al., 2007 [8] | Allegaert et al., 2007 [9] | Kimura et al., 2004 [10] | Lo et al., 2010 [11] | Marqués-Miñana et al., 2010 [12] | Grimsley et al., 1999 [17] | Capparelli et al., 2001 [13]
Patients | Premature neonates (PMA range 24–34 weeks) | Premature neonates (median PMA of 28 weeks, range 24–30 weeks) | Neonates and infants (PMA range 25.1–48.4 weeks) | Premature neonates (median PMA of 30 weeks; range 23.6–34 weeks) | Neonates and infants (PMA range 25.1–48.1 weeks) | Neonates and infants (GA range 25–41 weeks; PNA range 2–76 days) | Neonates and infants (GA 33.5 ± 6 weeks; PNA 70 ± 100 days)
Dosing regimen | PNA 1–7 days/serum creatinine value >71 μmol l−1: 15 mg kg−1 every 12 h; PNA > 7 days: 15 mg kg−1 every 8 h | PNA 1–7 days/serum creatinine value >71 μmol l−1: 15 mg kg−1 every 12 h; PNA > 7 days: 15 mg kg−1 every 8 h | PNA 1–7 days: 15 mg kg−1 every 12 h; PNA > 7 days: 15 mg kg−1 every 8 h | See attached Table 3S1 | See attached Table 3S2 | 15 mg kg−1 every 24 h – 15 mg kg−1 every 8 h | 15 mg kg−1 every 24 h – 15 mg kg−1 every 12 h
Number of patients | 214 | 249 | 19 | 116 | 111 (70 in building group and 41 in validation group) | 59 | 374
Number of samples | 604 | 648 | 88 | 835 |  | 347 | 1103
Sampling times | Trough and peak | Combination of sparse and intensive PK sampling
Structural model | One-compartment model with first-order elimination | Two-compartment model with first-order elimination
Tested covariates | Weight, PMA, PNA, renal function, positive blood culture, positive ventilation, inotrope, NSAID, maternal betamethasone administration | Weight, PMA, SGA, renal function, NCOX, maternal betamethasone administration | Weight, PCA, PNA, GA, Apgar score, serum creatinine | Weight, PMA, PNA, SGA, serum creatinine, race, ventilation, NSAID | Weight, birth weight, PMA, PNA, GA, height, urine output†, BSA, gender, concomitant drugs | Weight, PCA, GA, serum creatinine, nutrition | Weight, PNA, GA, gender, serum creatinine, Apgar5 score, ECMO
Significant covariate on CL | Weight, PMA, renal function*, positive pressure ventilation | Weight, PMA, renal function*, non-selective NSAIDs, SGA | Weight, PCA, serum creatinine | Weight, PMA, SGA | Weight, PMA, amoxicillin-clavulanic acid | Weight, serum creatinine | Weight, PNA, GA, serum creatinine
CL formula | 3.83 × (weight/70)^0.75 × (PMA^3.68/(PMA^3.68 + 33.3^3.68)) × ((516 × EXP(0.00823 × ((PMA − 40)/52 − 40))/serum creatinine)/6) × F_ventilation | 1.58 × (weight/70)^0.75 × EXP(0.0456 × (PMA − 30)) × ((516 × EXP(0.00766 × ((PMA − 30)/52 − 40))/serum creatinine)/6) × F_SGA × F_NCOX | 0.0323 × weight/serum creatinine (PCA ≥ 34 weeks); 0.025 × weight/serum creatinine (PCA < 34 weeks) | 1.0 × (weight/70)^0.75 × (PMA/30)^3.16 × F_SGA | 0.00192 × PMA × (1 + 0.65 × amoxicillin-clavulanic acid) × weight | 3.56 × weight/serum creatinine | Weight × (0.028/serum creatinine + 0.000127 × PNA + F_GA) + 0.006
Significant covariate on V | Weight, inotropic drugs | Weight | Weight | Weight | Weight, spironolactone | Weight | Weight
V formula | 39.4 × (weight/70) × F_inotrope | 39.3 × (weight/70) | 0.66 × weight | 36.6 × (weight/70) | 0.572 × (1 − 0.344 × spironolactone) × weight | 0.669 × weight | Vss = 0.793 × weight + 0.01; Vc = 0.0334 × weight
Validation method | Basic goodness-of-fit plots, bootstrapping | Basic goodness-of-fit plots | Basic goodness-of-fit plots, bootstrapping | Basic goodness-of-fit plots, bootstrapping, box-plots | Basic goodness-of-fit plots, bootstrapping, visual predictive check | Basic goodness-of-fit plots | Basic goodness-of-fit plots, external evaluation
Vancomycin analytical method | PETINIA on Dimension | PETINIA on Dimension | FPIA on TDx | FPIA on Cobas Integra | FPIA on TDx | FPIA on TDx | EMIT and FPIA
Serum creatinine method | Jaffé | Jaffé | Enzymatic | Jaffé |  | Jaffé | Jaffé
Table 3S1
Dose (mg kg−1) | Dosing interval (h) | Dose (mg kg−1) | Dosing interval (h)
Table 3S2
Weight (kg) | PNA (days) | Dose (mg kg−1) | Dosing interval (h)
  1. AGA, appropriate for gestational age; BSA, body surface area; ECMO, extra corporeal membrane oxygenation; EMIT, enzyme-multiplied immunoassay method; FPIA, fluorescence polarization immunoassay method; GA, gestational age; NSAID, nonsteroidal anti-inflammatory drugs; PCA, postconceptional age; PETINIA, particle enhanced turbidimetric inhibition immunoassay; PMA, post-menstrual age; PNA, post-natal age; SGA, small for gestational-age. *Renal function = (516 × EXP (Kage × ((PMA − 40)/52 − 40))/serum creatinine)/6. †Urine output was tested as indicator of renal function; serum creatinine concentration was not tested. ‡A loading dose of 15 mg kg−1 was given for maintenance doses of less than 15 mg per dose. §Model B was developed using the dataset of model A plus 35 extra neonates.

One of the differences between the published models was the method used for measuring serum creatinine concentrations. The Jaffé method was used in models A, B, F and G, the enzymatic method in model C and serum creatinine concentration was not included in model D. Model E tested urine output as a potential variable for renal function, but it was not significant. The Jaffé method is known to be non-specific and can overestimate serum creatinine concentrations, especially when bilirubin is elevated. In adults, the serum creatinine concentration measured using the Jaffé method overestimates that of the enzymatic method by about 30% [31, 32], and we therefore tested the adult conversion factor of 1.3 in this study.

Normalized prediction distribution errors

Before accounting for differences in creatinine assay methods, the mean NPDE (Table 4 and Figure 1) was found to be significantly positive for models B, E, F and G (0.26% of the simulated patients had negative CL values with model E). A positive mean NPDE indicates an under-prediction of the concentrations in the external evaluation dataset. The negative mean NPDE for models C and D indicates an over-prediction. The significant difference in NPDE variance indicates over-prediction of variability for models C and E.

Figure 1.

Normalized prediction distribution errors (NPDE) and visual predictive check. NPDE: QQ-plot of the distribution of the NPDE vs. the theoretical N(0, 1) distribution (left). Histogram of the distribution of the NPDE, with the density of the standard Gaussian distribution overlaid (right). Visual predictive check: observed data are plotted using a circle (○). The dashed lines represent the 10th, 50th and 90th percentiles of simulated data (n = 1000). The solid lines represent the 10th, 50th and 90th percentiles of observed data. Models B–G: the six published models. Models B1, F1 and G1: Jaffé equivalent concentrations with conversion factor of 1.3

Table 4. The respective mean and variance of normalized prediction distribution errors of external evaluation dataset using parameters derived from six published models
  1. Models B1, F1 and G1: Jaffé equivalent concentrations with conversion factor of 1.3. *Wilcoxon signed rank test. †Fisher variance test.

When the serum creatinine concentrations in the external evaluation dataset were converted to the Jaffé equivalent using the adult correction factor of 1.3, the NPDE of models B, F and G showed a major improvement in performance. The other models were not re-assessed because model C also used an enzymatic method for measuring serum creatinine concentration and models D and E did not include serum creatinine concentration as a covariate.
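The direction of this effect can be illustrated with a short Python sketch. It applies the factor of 1.3 to an enzymatic creatinine value, evaluates the model F formulas from Table 3 (CL = 3.56 × weight/serum creatinine, V = 0.669 × weight) and computes a steady-state trough from the standard one-compartment intermittent-infusion equation. The units (CL in l h−1, V in l, creatinine in μmol l−1), the 12 h dosing interval and the patient values are assumptions made for illustration only and are not taken from the original analysis.

```python
import math

JAFFE_FACTOR = 1.3  # adult enzymatic-to-Jaffé factor tested in the study


def cl_model_f(weight_kg, scr_umol_l):
    """Clearance from the model F formula in Table 3 (units assumed: l/h)."""
    return 3.56 * weight_kg / scr_umol_l


def v_model_f(weight_kg):
    """Volume of distribution from the model F formula (units assumed: l)."""
    return 0.669 * weight_kg


def steady_state_trough(dose_mg, cl, v, tau_h, t_inf_h=1.0):
    """Steady-state trough, one-compartment model, intermittent infusion."""
    k = cl / v                # first-order elimination rate constant
    rate = dose_mg / t_inf_h  # zero-order infusion rate
    peak = (rate / cl) * (1 - math.exp(-k * t_inf_h)) / (1 - math.exp(-k * tau_h))
    return peak * math.exp(-k * (tau_h - t_inf_h))


weight, scr_enzymatic = 1.14, 46.0  # hypothetical neonate
dose = 15.0 * weight                # 15 mg/kg infused over 60 min
tau = 12.0                          # assumed dosing interval (h)

cl_enz = cl_model_f(weight, scr_enzymatic)
cl_jaffe = cl_model_f(weight, scr_enzymatic * JAFFE_FACTOR)
trough_enz = steady_state_trough(dose, cl_enz, v_model_f(weight), tau)
trough_jaffe = steady_state_trough(dose, cl_jaffe, v_model_f(weight), tau)
```

Under these assumptions, converting the creatinine value lowers the predicted clearance by a factor of 1.3 and raises the predicted trough, which is the direction needed to reduce the under-prediction seen with the Jaffé-based models.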

Figure 2 also illustrates the relationship between post-menstrual age, weight or serum creatinine concentration and NPDE. No appreciable differences in model performance were found across all post-menstrual age, weight and serum creatinine concentration ranges for each individual model.

Figure 2.

Normalized prediction distribution errors (NPDE) vs. time, population predicted concentrations, post-menstrual age (PMA), weight (WT) and serum creatinine concentration. NPDE vs. time since first dose. NPDE vs. population predicted concentrations (PRED). NPDE vs. PMA. NPDE vs. WT. NPDE vs. serum creatinine concentration

As two analytical techniques (EMIT and FPIA) were used to measure serum vancomycin concentrations, the NPDE value of each patient was extracted from the NPDE R package (v1.2) and compared between the two methods. Post-menstrual age, serum creatinine concentration and weight were not significantly different between the two groups. In Table 5, the different predictive performances of the two analytical techniques in the external dataset are shown, indicating their impact on the transferability of the pharmacokinetic models.

Table 5. The respective mean and SD of normalized prediction distribution errors of external evaluation dataset according to analytical methods of vancomycin (EMIT or FPIA)
Model | EMIT, mean ± SD | FPIA, mean ± SD | P value*
B | 1.01 ± 0.90 | 1.48 ± 1.14 | 0.04
C | −0.59 ± 1.01 | −0.07 ± 1.19 | 0.03
D | −0.75 ± 0.83 | −0.21 ± 1.18 | 0.02
E | 0.74 ± 1.36 | 0.04 ± 1.08 | 0.005
F | 0.28 ± 0.65 | 0.82 ± 0.98 | 0.005
G | 0.09 ± 0.74 | 0.64 ± 1.07 | 0.01
B1 | 0.26 ± 0.88 | 0.77 ± 1.12 | 0.02
F1 | −0.32 ± 0.67 | 0.26 ± 0.99 | 0.003
G1 | −0.36 ± 0.79 | 0.15 ± 1.04 | 0.01
Models B1, F1 and G1: Jaffé equivalent concentrations with conversion factor of 1.3. *t-test (to describe the significance of the NPDE between the samples measured by the EMIT and by the FPIA for each model).
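As a consistency check on Table 5, the overall mean NPDE of each model should equal the sample-size-weighted average of its two subgroup means. The Python sketch below performs this check under two assumptions that are not stated explicitly in the text: the first column of means corresponds to EMIT (32 samples) and the second to FPIA (80 samples), and the six overall means reported in the abstract are listed in the order of models B to G.

```python
N_EMIT, N_FPIA = 32, 80  # sample counts in the external evaluation dataset

# (EMIT mean, FPIA mean) per model, read from Table 5.
subgroup_means = {
    "B": (1.01, 1.48), "C": (-0.59, -0.07), "D": (-0.75, -0.21),
    "E": (0.74, 0.04), "F": (0.28, 0.82), "G": (0.09, 0.64),
}

# Overall means from the abstract, assumed to be in the order B to G.
reported_overall = {
    "B": 1.35, "C": -0.22, "D": -0.36, "E": 0.24, "F": 0.66, "G": 0.48,
}


def pooled_mean(mean_emit, mean_fpia):
    """Sample-size-weighted mean of the two assay subgroups."""
    return (N_EMIT * mean_emit + N_FPIA * mean_fpia) / (N_EMIT + N_FPIA)


# Difference between the recomputed and the reported overall mean per model.
differences = {
    model: pooled_mean(*means) - reported_overall[model]
    for model, means in subgroup_means.items()
}
```

With these assumptions, every recomputed mean agrees with the reported value to within rounding of the tabulated subgroup means, which supports reading the first column as EMIT and the second as FPIA.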

Visual predictive checks

The VPCs showed an initial under-prediction of both the median and 90% percentile interval of observations for models B, F and G (Figure 1) but predictions of median vancomycin concentrations were acceptable with models C, D and E. Using the adult correction factor of 1.3, a slight under-prediction of median concentrations for model B1 was identified but predictions were acceptable for models F1 and G1 (Figure 2).
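The percentile comparison behind a VPC can be sketched in a few lines of Python. The example summarizes simulated concentrations by their 5th, 50th and 95th percentiles and counts how many observations fall outside the simulated band. It pools all concentrations into a single group rather than plotting them against time, which is a simplification, and the function names and data are hypothetical.

```python
from statistics import quantiles


def percentile_summary(values):
    """Return the 5th, 50th and 95th percentiles of a list of concentrations."""
    # n=20 gives 19 cut points located at 5%, 10%, ..., 95%.
    cuts = quantiles(values, n=20, method="inclusive")
    return cuts[0], cuts[9], cuts[18]


def fraction_outside_band(observed, simulated):
    """Share of observations outside the simulated 5th to 95th percentile band."""
    low, _, high = percentile_summary(simulated)
    outside = sum(1 for c in observed if c < low or c > high)
    return outside / len(observed)


# Hypothetical data: simulated concentrations 0..100 and four observations.
simulated = [float(i) for i in range(101)]
observed = [2.0, 30.0, 60.0, 99.0]
sim_p5, sim_p50, sim_p95 = percentile_summary(simulated)
outside = fraction_outside_band(observed, simulated)
```

For a model that predicts the new dataset well, roughly 10% of observations would be expected outside the band; a clear excess above the upper limit points to under-prediction.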


Modelling and simulation have shown major advantages in supporting dosing regimen selection, streamlining the costs and duration of drug development [33], particularly in paediatrics. According to regulatory guidelines [34-36], vancomycin is a good example of a drug for which the modelling and simulation approach can be used to establish optimal dosage recommendations in neonates. This antibiotic is active against well defined bacteria and its pharmacokinetic/pharmacodynamic relationship can be assumed to be similar across all age ranges including neonates, as the target is the bacterium, although treatment efficacy requires the assumption that host defence systems are similar in all age groups.

As non-linear, iterative search algorithms may produce spurious estimates of the model parameters, evaluation of the accuracy, robustness and predictive performance of these models is mandatory. An extensive model evaluation procedure should include an internal evaluation, followed by an external evaluation with an independent dataset and a prospective clinical study in a patient cohort with similar characteristics [37, 38]. However, full evaluation procedures are lacking in many published studies [2] and in a recent review [23], advanced internal evaluations were performed in only 16% of the models developed for children.

Internal evaluation aims at testing the ability of the proposed model to describe the data used to create the model. All neonatal population pharmacokinetic vancomycin models studied here have reported internal evaluation using different methods, which include basic goodness-of-fit plots, bootstrapping or VPCs. The information obtained on the mean and variability of the pharmacokinetic parameters and the impact of covariates is important, as it will be used for pharmacokinetic study designs or dosing regimen optimization. However, such information should be carefully interpreted, particularly in neonates, because the number of study subjects recruited is frequently small and may not be representative of all neonatal groups (preterm and term babies). This is important because rapid physiological and developmental changes occur in this age range. All these factors could have an important, yet not always predictable, influence on interpreting and extrapolating the results.

External evaluation is an important additional procedure. Indeed, useful models are expected to describe precisely the original building dataset, but are also required to predict the expected concentrations and/or effects and their variability in patients with similar clinical, biological and disease treatment characteristics. External evaluation not only examines the modelling procedure, but also all other study-related factors. As demonstrated with our external evaluation dataset, different predictive performances of the published models were observed. This discrepancy could be partly explained by the following factors:

  1. Serum creatinine assay: It is known that the Jaffé method overestimates serum creatinine concentrations when compared with the enzymatic method, due to interferences with proteins, ketoacids, bilirubin, cephalosporins etc. This may lead to inaccuracies in calculating creatinine clearances when models based on Jaffé creatinine concentrations are fitted to datasets using the enzymatic method. Indeed, the enzymatic method is more specific and is considered to be more suitable, especially for premature neonates who commonly have high bilirubin concentrations [39, 40]. The adult conversion factor of 1.3 was evaluated with an improvement in the VPC and NPDE. There is no such validated conversion factor available for use in neonates. However, recent studies have demonstrated that continuous changes in neonatal serum composition (albumin, IgG, bilirubin) and renal maturation influenced the conversion of serum creatinine values between the two analytical methods [41, 42]. Therefore, it remains crucial to validate a creatinine conversion coefficient, adapted to neonates, which will probably change in the first weeks of life. Using this factor may improve transferability from one centre to another if different methods are used to determine serum creatinine concentrations.
  2. Analytical methods used for vancomycin monitoring: The FPIA assay has been shown to have interferences from vancomycin crystalline degradation products [43]. Vancomycin is converted to its crystalline degradation products when exposed to heat, including normal body temperature. This cross-reactivity was particularly important in patients with end-stage renal disease and could lead to falsely elevated serum vancomycin concentrations in excess of 50–70% [44]. In addition, this overestimation may also have been influenced by total bilirubin concentrations and post-natal age [45]. The particle enhanced turbidimetric inhibition immunoassay used in model B does not cross-react with vancomycin crystalline degradation products. However, the measured total concentrations from this method were higher than those of high-performance liquid chromatography, which is considered to be the ‘gold standard’ reference method [46]. In our present external evaluation dataset, the assays used to measure serum vancomycin concentrations changed throughout the study period. This allowed us to collect different serum vancomycin concentrations measured by both EMIT and FPIA, and the results showed different predictive performances of the two techniques. However, as the formation and accumulation of vancomycin crystalline degradation products in neonates are unknown and cross-reactions vary between assays from different manufacturers, a conversion factor for vancomycin concentrations between different analytical techniques could not be investigated in the present study.
  3. Ethnicity: The external dataset was from Caucasian and African neonates, although the percentage of each ethnic group was not recorded, as special consent is required to collect this information. Our dataset was slightly over-predicted by model D, which was developed with Malaysian neonates. It should also be noted that, in contrast to the other models, model D did not include renal function as a covariate.

If a model is used to establish dosing regimen recommendations, then simulation-based diagnostics should be used and the NPDE and VPC are considered as the reference methods [29]. NPDE yields information on the accuracy of the predictive performance of a model by calculating the mean value and variance of the prediction errors. VPC shows the direct visual relationship between predicted and observed concentrations. The combination of these two methods facilitates interpretation of the results. As demonstrated by the review of published models, differences in age and weight exist among the studies. Models B and D were developed based on PK data from preterm neonates and the other models were based on neonates and infants. Due to rapid physiological changes in neonates, it is important to perform model evaluation procedures across age and weight ranges [47], which are considered to be representative of developmental variables. No appreciable differences in model predictive performance across post-menstrual age and weight range were found for all six published models. This indicates that even though patients' age and weight vary, there was no systematic bias in population prediction using these models. In addition, renal function should be an important covariate of vancomycin clearance. We evaluated the predictive performance across the serum creatinine concentration range and no appreciable differences were found. However, the integration of renal function in vancomycin neonatal dosing predictions is still controversial in published studies. Different approaches (model-estimated method, serum creatinine measurement, urine output) were used. These approaches should be compared in a further study based on a large dataset.

As our study intended to illustrate the importance of external evaluation to identify the possible study-related factors that might limit transferability in different clinical settings, we did not develop a new population pharmacokinetic model with our external evaluation dataset or recommend a new dosing regimen. In addition, we cannot recommend which published model is ‘better’, as this is beyond the scope of this analysis. As important differences were highlighted when transferring published vancomycin models, models that have only been internally validated should, in the absence of external validation, only guide individual dosing regimens of vancomycin in their own clinical setting.

There are some limitations to our study. As this study was based on routine therapeutic drug monitoring data, only vancomycin trough concentrations were available for external evaluation. Peak concentrations are not routinely measured because a large variability in peak concentrations is often observed and clinical benefit has not been investigated [48]. No study has compared the usefulness of monitoring vancomycin peak and/or trough concentrations in neonates. A full dosing history was not available for all samples. In such cases, regular dosing was assumed from the start of vancomycin treatment and after every change in dosage.

In conclusion, in the current study, the predictive performance of six published neonatal pharmacokinetic models of vancomycin was evaluated with an independent external dataset. The published models gave important information on vancomycin population pharmacokinetic parameters and covariate relationship in neonates. However, the serum creatinine assay method, either Jaffé or enzymatic, has an important impact on model prediction when tested with independent patients. Given the continuous and important changes of blood composition during the neonatal period, the adapted conversion factor between different analytical techniques still needs to be investigated in neonates. A different predictive performance was also revealed between different analytical methods for serum vancomycin concentrations. The transferability of published results to different clinical settings has to consider study-related factors.

Competing Interests

All authors have completed the Unified Competing Interest form at http://www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: Dr Wei Zhao and Professor Evelyne Jacqz-Aigrain had support from ‘la Foundation PremUp’ (Professor Danièle Evain Brion, France) and Global Research in Paediatrics Network of Excellence (GRIP, EU-funded FP7 project, Grant Agreement number 261060) for the submitted work. The clinical research of Dr Karel Allegaert is supported by the Fund for Scientific Research, Flanders (Belgium) (F.W.O. Vlaanderen) by a Fundamental Clinical Investigatorship (1800209 N). The clinical research of Dr Yoke-Lin Lo is supported by IRPA grant 06-02-03-0246-EA246 from the Ministry of Science, Technology and Innovation of Malaysia. There were no financial relationships with any organizations that might have an interest in the submitted work in the previous 3 years and no other relationships or activities that could appear to have influenced the submitted work.

We thank the technicians in the clinical pharmacology department: Christel Grondin, Michel Popon, Samira Benakouche and Yves Médard for technical support. We would also like to acknowledge Professor Brian Anderson for helpful advice and discussion.