Targeted therapy revolutionizes the way that physicians treat cancer and other diseases, enabling them to select individualized treatment adaptively according to the patient's biomarker profile. The implementation of targeted therapy requires that the biomarkers be accurately measured, which may not always be feasible in practice. We propose two optimal marker-adaptive trial designs in which the biomarkers are subject to measurement errors. The first design focuses on a patient's individual benefit and minimizes the treatment assignment error so that each patient has the highest probability of being assigned to the treatment that matches his or her true biomarker status. The second design focuses on the group benefit, which maximizes the overall response rate for all the patients enrolled in the trial. We develop a Wald test to evaluate the treatment effects for marker subgroups at the end of the trial and derive the corresponding asymptotic power function. Simulation studies and an application to a lymphoma trial show that the proposed optimal designs achieve our design goals and have desirable operating characteristics.
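As a minimal sketch of the first design's logic (not the authors' derivation — the sensitivity, specificity and prevalence values are illustrative assumptions), the Bayes rule that minimizes the treatment assignment error assigns the marker-targeted treatment whenever the posterior probability of true marker positivity, given the error-prone measurement, exceeds one-half:

```python
def posterior_marker_positive(observed_positive, sens, spec, prev):
    """P(true marker + | observed assay result), by Bayes' rule.
    sens/spec describe the error-prone biomarker assay; prev is the
    marker-positive prevalence (all three are assumed known here)."""
    if observed_positive:
        num = sens * prev
        den = sens * prev + (1.0 - spec) * (1.0 - prev)
    else:
        num = (1.0 - sens) * prev
        den = (1.0 - sens) * prev + spec * (1.0 - prev)
    return num / den

def assign_targeted(observed_positive, sens, spec, prev):
    """Assign the marker-targeted treatment iff the posterior probability
    of true marker positivity exceeds 1/2 -- the rule that minimizes the
    probability of a mismatched assignment."""
    return posterior_marker_positive(observed_positive, sens, spec, prev) > 0.5
```

For example, with a 90%-sensitive, 80%-specific assay and 50% prevalence, an observed positive carries posterior probability 0.45/0.55 ≈ 0.82 of true positivity, so the targeted treatment is assigned.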

Identifying loci that modify the risk of cancer for mutation carriers is an important topic in oncogenetics. Within this research area, we are concerned with the analysis of the association between a genetic variant (single-nucleotide polymorphism rs13281615) and breast cancer among women with a pathogenic mutation in the BRCA2 gene. As this mutation is rare, data were collected retrospectively according to a case-study design through genetic screening programmes. This involves a selection bias and an intrafamilial correlation, which complicate the statistical analysis. We derive a Cramér–von Mises-type statistic to test the equality of genotype-specific survival functions when the proportional hazards model does not hold. A Clayton copula is specified to model the residual phenotype familial dependence and an innovative semiparametric bootstrap procedure is proposed to approximate the distribution of the test statistic under the null hypothesis. The test proposed is applied to data from European and North American mutation carriers and its performance is evaluated by simulations.
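As a generic illustration of the pattern this abstract describes — a Cramér–von Mises-type distance between two distribution estimates, with a resampling scheme approximating its null distribution — one might sketch the following. Note the paper's actual statistic handles censored, family-correlated survival data and its bootstrap respects the copula dependence; this sketch uses uncensored data and a simple permutation null as a stand-in.

```python
import numpy as np

def cvm_type_stat(x, y):
    """Integrated squared difference between the two empirical CDFs,
    evaluated over the pooled sample (a Cramer-von Mises-type distance)."""
    pooled = np.concatenate([x, y])
    Fx = np.searchsorted(np.sort(x), pooled, side="right") / len(x)
    Fy = np.searchsorted(np.sort(y), pooled, side="right") / len(y)
    return np.mean((Fx - Fy) ** 2)

def resampled_pvalue(x, y, n_resamples=999, rng=None):
    """Approximate the null distribution of the statistic by relabelling
    groups (a stand-in for the paper's semiparametric bootstrap, which
    additionally models familial dependence)."""
    rng = np.random.default_rng(rng)
    obs = cvm_type_stat(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_resamples):
        perm = rng.permutation(pooled)
        if cvm_type_stat(perm[:len(x)], perm[len(x):]) >= obs:
            count += 1
    return (1 + count) / (1 + n_resamples)
```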

In phase I trials, effectively treating patients and minimizing the chance of exposing them to subtherapeutic and overly toxic doses are clinicians' top priorities. Motivated by this practical consideration, we propose Bayesian optimal interval (BOIN) designs to find the maximum tolerated dose and to minimize the probability of inappropriate dose assignments for patients. We show, both theoretically and numerically, that the BOIN design not only has superior finite and large sample properties but also can be implemented in a simple way similar to the traditional ‘3+3’ design. Compared with the well-known continual reassessment method, the BOIN design yields comparable average performance in selecting the maximum tolerated dose but has a substantially lower risk of assigning patients to subtherapeutic and overly toxic doses. We apply the BOIN design to two cancer clinical trials.
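The simplicity claim can be made concrete: the BOIN decision at each dose reduces to comparing the observed toxicity rate with two precomputed boundaries. The boundary formulas below follow the published BOIN design (Liu and Yuan), with φ1 = 0.6φ and φ2 = 1.4φ as commonly used defaults; treat this as an illustrative sketch rather than a full implementation (which would also include dose-elimination rules).

```python
import math

def boin_boundaries(target, phi1=None, phi2=None):
    """Escalation/de-escalation boundaries lambda_e < lambda_d of the
    BOIN design; phi1 and phi2 default to 0.6x and 1.4x the target
    toxicity rate, a common choice in the BOIN literature."""
    phi1 = 0.6 * target if phi1 is None else phi1
    phi2 = 1.4 * target if phi2 is None else phi2
    lam_e = math.log((1 - phi1) / (1 - target)) / math.log(
        target * (1 - phi1) / (phi1 * (1 - target)))
    lam_d = math.log((1 - target) / (1 - phi2)) / math.log(
        phi2 * (1 - target) / (target * (1 - phi2)))
    return lam_e, lam_d

def boin_decision(n_toxicities, n_patients, target):
    """'3+3'-like rule at the current dose: escalate if the observed
    toxicity rate is at or below lambda_e, de-escalate if at or above
    lambda_d, otherwise stay."""
    lam_e, lam_d = boin_boundaries(target)
    p_hat = n_toxicities / n_patients
    if p_hat <= lam_e:
        return "escalate"
    if p_hat >= lam_d:
        return "de-escalate"
    return "stay"
```

For a target toxicity rate of 0.3, the boundaries are roughly 0.236 and 0.358, so with a cohort of 3 the rule is: escalate on 0 toxicities, stay on 1, de-escalate on 2 or more.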

Systematic reviews of diagnostic tests often involve a mixture of case–control and cohort studies. The standard methods for evaluating diagnostic accuracy focus only on sensitivity and specificity and ignore the information on disease prevalence that is contained in cohort studies. Consequently, such methods cannot provide estimates of measures related to disease prevalence, such as population-averaged or overall positive and negative predictive values, which reflect the clinical utility of a diagnostic test. We propose a hybrid approach that jointly models the disease prevalence along with diagnostic test sensitivity and specificity in cohort studies, and sensitivity and specificity in case–control studies. To overcome the potential computational difficulties in the standard full likelihood inference of the hybrid model proposed, we propose an alternative inference procedure based on composite likelihood. Such composite-likelihood-based inference does not suffer from computational problems and maintains high relative efficiency. In addition, it is more robust to model misspecifications compared with standard full likelihood inference. We apply our approach to a review of the performance of contemporary diagnostic imaging modalities for detecting metastases in patients with melanoma.

In the manufacturing industry, it may be important to study the relationship between machine component failures under stress. Examples include the failures of integrated circuits and memory chips in electronic merchandise given various levels of electronic shock. Such studies are important for the development of new products and for the improvement of existing products. We assume two-component systems for simplicity and we assume that the joint probability of failures increases with stress as a cumulative bivariate Weibull function. Optimal designs have been developed for two correlated binary responses by using the Gumbel model, the bivariate binary Cox model and the bivariate probit model. In all these models, the amount of damage ranges from −∞ to ∞. In the Weibull model, the amount of damage is positive, which is natural for experimental factors such as voltage, tension or pressure. We describe locally optimal designs under bivariate Weibull assumptions. Since locally optimal designs with non-linear models depend on predetermined parameter values, misspecified parameter values may lead to inefficient designs. However, we find that optimal designs under the Weibull model are surprisingly efficient over a wide range of misspecified parameter values. To improve the efficiency, we recommend a multistage procedure. We show that a two-stage procedure can provide a substantial improvement over a design that was optimal for misspecified parameters.

Applications of finite state Markov transition models are numerous and the problem of estimating transition rates of such processes has been considered in many fields of science. Because these processes cannot always be followed in continuous time, the investigators often confront the question of when to measure the state of the process. The estimation of transition rates then needs to be based on a sequence of discrete time data, and the variance and estimability of the estimators greatly depend on the time spacings between consecutive observations. We study optimal time spacings for a sequence of discrete time observations to estimate the transition rates of a time homogeneous multistate Markov process. For comparative studies, optimal time spacings to estimate rate ratios are considered. Optimality criteria are formulated through the minimization of the variances of the parameter estimators of interest and are investigated assuming a stationary initial distribution. For practical purposes, we propose a simple approximation for the optimal time spacing and study the limits for its applicability. The work is motivated by studies of colonization with *Streptococcus pneumoniae*.

Incorporating spatial covariance into clustering has previously been considered for functional data to identify groups of functions which are similar across space. However, in the majority of situations that have been considered until now the most appropriate metric has been Euclidean distance. Directed networks present additional challenges in terms of estimating spatial covariance due to their complex structure. Although suitable river network covariance models have been proposed for use with stream distance, where distance is computed along the stream network, these models have not been extended for contexts where the data are functional, as is often the case with environmental data. The paper develops a method of calculating spatial covariance between functions from sites along a river network and applies the measure as a weight within functional hierarchical clustering. Levels of nitrate pollution on the River Tweed in Scotland are considered with the aim of identifying groups of monitoring stations which display similar spatiotemporal characteristics.

In many epidemiological and clinical studies, misclassification may arise in one or several variables, resulting in potentially invalid analytic results (e.g. estimates of odds ratios of interest) when no correction is made. Here we consider the situation in which correlated binary response variables are subject to misclassification. Building on prior work, we provide an approach to adjust for potentially complex differential misclassification via internal validation sampling applied at multiple study time points. We seek to estimate the parameters of a primary generalized linear mixed model that accounts for baseline and/or time-dependent covariates. The misclassification process is modelled via a second generalized linear model that captures variations in sensitivity and specificity parameters according to time and a set of subject-specific covariates that may or may not overlap with those in the primary model. Simulation studies demonstrate the precision and validity of the method proposed. An application is presented based on longitudinal assessments of bacterial vaginosis conducted in the ‘HIV epidemiology research’ study.

The goal of the paper is to predict the additional amount of antiretroviral treatment that would be required to implement a policy of treating all human immunodeficiency virus (HIV) infected people at the time of detection of infection rather than at the time that their CD4 T-lymphocyte counts are observed to be below a threshold—the current standard of care. We describe a sampling-based inverse prediction method for predicting time from HIV infection to attainment of the CD4 cell threshold and apply it to a set of treatment-naive HIV-infected subjects in a village in Botswana who participated in a household survey that collected cross-sectional CD4 cell counts. The inferential target of interest is the population level mean time to reaching the CD4 cell-based treatment threshold in this group of subjects. To address the challenges arising from the fact that these subjects’ dates of HIV infection are unknown, we make use of data from an auxiliary cohort study of subjects enrolled shortly after HIV infection in which CD4 cell counts were measured over time. We use a multiple-imputation framework to combine across the different sources of data, and we discuss how the methods compensate for the length-biased sampling that is inherent in cross-sectional screening procedures, such as household surveys. We comment on how the results bear on analyses of costs of implementation of treatment-for-prevention use of antiretroviral drugs in HIV prevention interventions.

Characteristic scale is a notion that pervades the geophysical sciences, but it has no widely accepted precise definition. Motivated by the facts that the wavelet transform decomposes a time series into coefficients that are associated with different scales and that the variance of these coefficients (the so-called wavelet variance) decomposes the variance of the time series across scales, the paper proposes a definition for characteristic scale in terms of peaks in plots of the wavelet variance *versus* scale. After presenting basic theory for wavelet-based characteristic scales, a natural estimator for these scales is considered. Large sample theory for this estimator permits the construction of confidence intervals for a true unknown characteristic scale. Computer experiments are presented that demonstrate the efficacy of the large sample theory for finite sample sizes. Characteristic scale estimates are calculated for medium multiyear Arctic sea ice, global temperature records, coherent structures in river flows and the Madden–Julian oscillation in an atmospheric time series.
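The definition admits a very short computational sketch. The code below (a simplified Haar-type illustration, not the paper's estimator — the paper uses a proper wavelet variance with large sample confidence intervals) computes, at each scale s, the variance of scaled contrasts between adjacent block means of length s, and takes the characteristic scale to be the peak of this wavelet variance versus scale curve:

```python
import numpy as np

def haar_wavelet_variance(x, scales):
    """Haar-type wavelet variance: at scale s, the variance of the
    (scaled) difference between adjacent block means of length s.
    With this normalization, white noise gives variance sigma^2 / s."""
    x = np.asarray(x, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(x)])
    out = []
    for s in scales:
        means = (csum[s:] - csum[:-s]) / s            # running means of length s
        d = (means[s:] - means[:-s]) / np.sqrt(2.0)   # adjacent-block contrasts
        out.append(np.mean(d ** 2))
    return np.array(out)

def characteristic_scale(x, scales):
    """A peak in the plot of wavelet variance versus scale, in the
    spirit of the definition proposed in the paper."""
    wv = haar_wavelet_variance(x, scales)
    return scales[int(np.argmax(wv))]
```

For white noise the curve decays monotonically (no characteristic scale stands out), whereas a series with an oscillation of period P peaks at scales near P/4 to P/2 — the kind of feature the paper's confidence intervals are designed to assess.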

We consider the joint modelling, analysis and prediction of a longitudinal binary process and a discrete time-to-event outcome. We consider data from a prospective pregnancy study, which provides day level information regarding the behaviour of couples attempting to conceive. Reproductive epidemiologists are particularly interested in developing a model for individualized predictions of time to pregnancy (TTP). A couple's intercourse behaviour should be an integral part of such a model and is one of the main focuses of the paper. In our motivating data, the intercourse observations are a long series of binary data with a periodic probability of success and the amount of available intercourse data is a function of both the menstrual cycle length and TTP. Moreover, these variables are dependent and observed on different, and nested, timescales (TTP is measured in menstrual cycles whereas intercourse is measured on days within a menstrual cycle), further complicating the analysis. Here, we propose a semiparametric shared parameter model for the joint modelling of the binary longitudinal data (intercourse behaviour) and the discrete survival outcome (TTP). Further, we develop couple-based dynamic predictions for the intercourse profiles, which in turn are used to assess the risk for subfertility (i.e. TTP longer than six menstrual cycles).

Heavy long-lasting rainfall can trigger earthquake swarms. We are interested in the specific shape of lagged rain influence on the occurrence of earthquakes at different depths at Mount Hochstaufen, Bavaria. We present a novel penalty structure for interpretable and flexible estimates of lag coefficients based on spline representations. We provide an easy-to-use implementation of our flexible distributed lag approach that can be used directly in the established R package mgcv for estimation of generalized additive models. This allows our approach to be immediately included in complex additive models for generalized responses even in hierarchical or longitudinal data settings, making use of established stable and well-tested inference algorithms. The benefit of flexible distributed lag modelling is shown in a detailed simulation study.
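The paper implements its flexible distributed lags inside R's mgcv; as a language-neutral illustration of the core idea — lag coefficients constrained to a smooth curve through a basis expansion, fitted by penalized least squares — here is a numpy sketch. The polynomial basis and difference penalty are illustrative stand-ins for mgcv's spline basis and smoothness penalty, and the data-generating setup is hypothetical.

```python
import numpy as np

def lag_matrix(x, L):
    """Columns are x lagged 0..L; the first L rows are dropped so every
    row has a complete lag history."""
    n = len(x)
    return np.column_stack([x[L - l : n - l] for l in range(L + 1)])

def fit_distributed_lag(x, y, L, degree=4, penalty=1.0):
    """Lag curve beta(l) represented by a low-order polynomial basis
    (a stand-in for a spline basis), fitted by penalized least squares
    with a second-difference penalty on the implied lag coefficients."""
    X = lag_matrix(x, L)
    lags = np.arange(L + 1) / L
    B = np.vander(lags, degree + 1, increasing=True)   # basis over the lags
    Z = X @ B
    D = np.diff(np.eye(L + 1), n=2, axis=0) @ B        # second-difference penalty
    lhs = Z.T @ Z + penalty * D.T @ D
    theta = np.linalg.solve(lhs, Z.T @ y[L:])
    return B @ theta                                    # estimated beta(0..L)
```

Because the lag curve is parameterized by a handful of basis coefficients rather than L + 1 free parameters, the estimate stays smooth and interpretable even with many lags — the benefit the simulation study in the paper quantifies.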

Much research in recent years for evidence evaluation in forensic science has focused on methods for determining the likelihood ratio in various scenarios. When the issue in question is whether evidence is associated with a person who is or is not associated with criminal activity then the problem is one of discrimination. A procedure for the determination of the likelihood ratio is developed when the evidential data are believed to be driven by an underlying latent Markov chain. Three other models that assume auto-correlated data without the underlying Markov chain are also described. The performances of these four models and a model assuming independence are compared by using data concerning traces of cocaine on banknotes.

Drug-induced liver injury (DILI) is a major public health issue and of serious concern for the pharmaceutical industry. Early detection of signs of a drug's potential for DILI is vital for pharmaceutical companies' evaluation of new drugs. A combination of extreme values of liver-specific variables indicates potential DILI (Hy's law). We estimate the probability of joint extreme elevations of laboratory variables by using the conditional approach to multivariate extremes which concerns the distribution of a random vector given an extreme component. We extend the current model to include the assumption of stochastically ordered survival curves and construct a hypothesis test for ordered tail dependence between doses, which is a pattern that is potentially triggered by DILI. The model proposed is applied to safety data from a phase 3 clinical trial of a drug that has been linked to liver toxicity.

Post-market medical product surveillance is important for detecting rare adverse events that are not identified during preapproval. The goal of surveillance is to detect, over time, elevated rates of adverse events for new medical products. These studies utilize administrative databases from multiple large health plans. We propose a group sequential method using a permutation approach with generalized estimating equations to account for confounding. A simulation study is conducted to evaluate the performance of the group sequential generalized estimating equation method compared with two other approaches. The methods are then applied to a vaccine safety application from the Vaccine Safety Datalink.

In the USA, the Centers for Medicare and Medicaid Services use 30-day readmission, following hospitalization, as a proxy outcome to monitor quality of care. These efforts generally focus on treatable health conditions, such as pneumonia and heart failure. Expanding quality-of-care systems to monitor conditions for which treatment options are limited or non-existent, such as pancreatic cancer, is challenging because of the non-trivial force of mortality; 30-day mortality for pancreatic cancer is approximately 30%. In the statistical literature, data that arise when the observation of the time to some non-terminal event is subject to some terminal event are referred to as ‘semicompeting risks data’. Given such data, scientific interest may lie in at least one of three areas: estimation or inference for regression parameters, characterization of dependence between the two events and prediction given a covariate profile. Existing statistical methods focus almost exclusively on the first of these; methods are sparse or non-existent, however, when interest lies with understanding dependence and performing prediction. We propose a Bayesian semiparametric regression framework for analysing semicompeting risks data that permits the simultaneous investigation of all three of the aforementioned scientific goals. Characterization of the induced posterior and posterior predictive distributions is achieved via an efficient Metropolis–Hastings–Green algorithm, which has been implemented in an R package. The framework proposed is applied to data on 16,051 individuals who were diagnosed with pancreatic cancer between 2005 and 2008, obtained from Medicare Part A. We found that increased risk for readmission is associated with a high comorbidity index, a long hospital stay at initial hospitalization, non-white race, being male and discharge to home care.

To account for measurement error (ME) in explanatory variables, Bayesian approaches provide a flexible framework, as expert knowledge can be incorporated in the prior distributions. Recently, integrated nested Laplace approximations have been proven to be a computationally convenient alternative to sampling approaches for Bayesian inference in latent Gaussian models. We show how the most common approaches to adjust for ME, the classical and the Berkson ME, fit into this framework. This is achieved through a reformulation with augmented pseudo-observations and a suitable extension of the latent Gaussian field. Two specific classes are described, which allow for a particularly simple implementation using integrated nested Laplace approximations. We present three applications within the framework of generalized linear (mixed) models with ME. To illustrate the practical feasibility, R code is provided in on-line supplementary material.

DNA is now routinely used in criminal investigations and court cases, although DNA samples taken at crime scenes are of varying quality and therefore present challenging problems for their interpretation. We present a statistical model for the quantitative peak information obtained from an electropherogram of a forensic DNA sample and illustrate its potential use for the analysis of criminal cases. In contrast with most previously used methods, we directly model the peak height information and incorporate important artefacts that are associated with the production of the electropherogram. Our model has a number of unknown parameters, and we show that these can be estimated by the method of maximum likelihood in the presence of multiple unknown individuals contributing to the sample, and their approximate standard errors calculated; the computations exploit a Bayesian network representation of the model. A case example from a UK trial, as reported in the literature, is used to illustrate the efficacy and use of the model, both in finding likelihood ratios to quantify the strength of evidence, and in the deconvolution of mixtures for finding likely profiles of the individuals contributing to the sample. Our model is readily extended to simultaneous analysis of more than one mixture as illustrated in a case example. We show that the combination of evidence from several samples may give an evidential strength which is close to that of a single-source trace and thus modelling of peak height information provides a potentially very efficient mixture analysis.

The paper develops hypothesis testing procedures for the stratified mark-specific proportional hazards model in the presence of missing marks. The motivating application is preventive human immunodeficiency virus (HIV) vaccine efficacy trials, where the mark is the genetic distance of an infecting HIV sequence to an HIV sequence represented inside the vaccine. The test statistics are constructed on the basis of two-stage efficient estimators, which utilize auxiliary predictors of the missing marks. The asymptotic properties and finite sample performances of the testing procedures are investigated, demonstrating double robustness and effectiveness of the predictive auxiliaries to recover efficiency. The methods are applied to the RV144 vaccine trial.

Weather predictions are uncertain by nature. This uncertainty is dynamically assessed by a finite set of trajectories, called ensemble members. Unfortunately, ensemble prediction systems underestimate the uncertainty and thus are unreliable. Statistical approaches are proposed to post-process ensemble forecasts, including Bayesian model averaging and the Bayesian processor of output. We develop a methodology, called the Bayesian processor of ensemble members, from a hierarchical model and combining the two aforementioned frameworks to calibrate ensemble forecasts. The Bayesian processor of ensemble members is compared with Bayesian model averaging and the Bayesian processor of output by calibrating surface temperature forecasts at eight stations in the province of Quebec (Canada). Results show that ensemble forecast skill is improved by the method developed.

The Lyon–Fedder–Mobarry global magnetosphere–ionosphere coupled model LFM-MIX is used to study Sun–Earth interactions by simulating geomagnetic storms. This work focuses on relating the multifidelity output from LFM-MIX to field observations of ionospheric conductance. Given a set of input values and solar wind data, LFM-MIX numerically solves the magnetohydrodynamic equations and outputs a bivariate spatiotemporal field of ionospheric energy and flux. Of particular interest here are LFM-MIX input settings required to match corresponding output with field observations. To estimate these input settings, a multivariate spatiotemporal statistical LFM-MIX emulator is constructed. The statistical emulator leverages the multiple fidelities such that the less computationally demanding yet lower fidelity LFM-MIX is used to provide estimates of the higher fidelity output. The higher fidelity LFM-MIX output is then used for calibration by using additive and non-linear discrepancy functions.

We propose and fit a Bayesian model to infer palaeoclimate over several thousand years. The data that we use arise as ancient pollen counts taken from sediment cores together with radiocarbon dates which provide (uncertain) ages. When combined with a modern pollen–climate data set, we can calibrate ancient pollen into ancient climate. We use a normal–inverse Gaussian process prior to model the stochastic volatility of palaeoclimate over time, and we present a novel modularized Markov chain Monte Carlo algorithm to enable fast computation. We illustrate our approach with a case-study from Sluggan Moss, Northern Ireland, and provide an R package, Bclim, for use at other sites.

We consider an application in electricity grid load prediction, where generalized additive models are appropriate, but where the data set's size can make their use practically intractable with existing methods. We therefore develop practical generalized additive model fitting methods for large data sets in the case in which the smooth terms in the model are represented by using penalized regression splines. The methods use iterative update schemes to obtain factors of the model matrix while requiring only subblocks of the model matrix to be computed at any one time. We show that efficient smoothing parameter estimation can be carried out in a well-justified manner. The grid load prediction problem requires updates of the model fit, as new data become available, and some means for dealing with residual auto-correlation in grid load. Methods are provided for these problems and parallel implementation is covered. The methods allow estimation of generalized additive models for large data sets by using modest computer hardware, and the grid load prediction problem illustrates the utility of reduced rank spline smoothing methods for dealing with complex modelling problems.
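The key computational idea — obtaining the factors of the model matrix by iterative updates so that only one subblock is in memory at a time — can be sketched for an ordinary (unpenalized) least squares fit. This is an illustrative reduction of the scheme described above; the paper's method additionally handles penalized spline bases, smoothing parameter estimation, model updates and parallelism.

```python
import numpy as np

def blockwise_lstsq(blocks):
    """Least squares via blockwise QR updates. `blocks` yields
    (X_block, y_block) pairs; only the current subblock of the model
    matrix plus the running p x p triangular factor R are ever held in
    memory, so the full n x p model matrix is never formed."""
    R = None
    rhs = None
    for Xb, yb in blocks:
        if R is None:
            M, v = Xb, yb
        else:
            M = np.vstack([R, Xb])                 # stack old factor on new block
            v = np.concatenate([rhs, yb])
        Q, R = np.linalg.qr(M)                     # reduced QR of the stacked matrix
        rhs = Q.T @ v
    return np.linalg.solve(R, rhs)                 # back-substitute for coefficients
```

The final R and Q'y are exactly those of a QR decomposition of the full stacked problem, so the answer matches an all-at-once fit while the peak memory footprint scales with the block size, not n.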

In observational studies, interest mainly lies in estimation of the population level relationship between the explanatory variables and dependent variables, and the estimation is often undertaken by using a sample of longitudinal data. In some situations, the longitudinal sample suffers from bias and loss of estimation efficiency because of non-random dropout. However, inclusion of population level information can increase estimation efficiency. We propose an empirical-likelihood-based method to incorporate population level information in a longitudinal study with dropout. The population level information is incorporated via constraints on functions of the parameters, and non-random dropout bias is corrected by using a weighted generalized estimating equations method. We provide a three-step estimation procedure that makes computation easier. Some commonly used methods are compared in simulation studies, which demonstrate that our proposed method can correct the non-random dropout bias and increase the estimation efficiency, especially for small sample sizes or when the missing proportion is high. In some situations, the improvement in efficiency is substantial. Finally, we apply the method to an Alzheimer's disease study.

We consider a problem of reducing the expected number of treatment failures in trials where the probability of response to treatment is close to 1 and treatments are compared on the basis of the log-odds ratio. We propose a new class of urn designs for randomization of patients in a clinical trial. The new urn designs target a number of allocation proportions including the allocation proportion that yields the same power as equal allocation but significantly fewer expected treatment failures. The new design is compared with the doubly adaptive biased coin design, the efficient randomized adaptive design and equal allocation. The properties of the new class of designs are studied by embedding them in a family of continuous time stochastic processes.

We discuss two-sample global permutation tests for sets of multivariate ordinal data in possibly high dimensional set-ups, motivated by the analysis of data collected by means of the World Health Organization's ‘International classification of functioning, disability and health’. The tests do not require any modelling of the multivariate dependence structure. Specifically, we consider testing for marginal inhomogeneity and direction-independent marginal order. As opposed to max-*T*-tests, which are known to have good power against alternatives with few strong individual effects, the tests proposed have good power against alternatives with many weak individual effects. Permutation tests are valid only if the two multivariate distributions are identical under the null hypothesis. By means of simulations, we examine the practical effect of violations of this exchangeability condition. Our simulations suggest that theoretically invalid permutation tests can still be ‘practically valid’. In particular, they suggest that the degree of the permutation procedure's failure may be considered as a function of the difference in group-specific covariance matrices, the proportion between group sizes, the number of variables in the set, the test statistic used and the number of levels per variable.
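The contrast between the two kinds of test statistic can be sketched concretely. The code below (an illustrative simplification — it uses marginal t statistics on continuous data rather than the paper's statistics for multivariate ordinal data) runs a global two-sample permutation test with both a sum statistic, which pools many weak marginal effects, and a max-T statistic, which reacts to a few strong ones:

```python
import numpy as np

def two_sample_pvalues(X, Y, n_perm=999, rng=None):
    """Global two-sample permutation tests on multivariate data; each
    column of X and Y is one variable. Returns p-values for a sum
    statistic (powerful against many weak marginal effects) and a
    max-T statistic (powerful against a few strong ones)."""
    rng = np.random.default_rng(rng)
    def marginal_t(a, b):
        num = a.mean(axis=0) - b.mean(axis=0)
        den = np.sqrt(a.var(axis=0, ddof=1) / len(a)
                      + b.var(axis=0, ddof=1) / len(b))
        return np.abs(num / den)
    obs = marginal_t(X, Y)
    pooled = np.vstack([X, Y])
    n = len(X)
    hits_sum = hits_max = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))          # relabel the groups
        t = marginal_t(pooled[idx[:n]], pooled[idx[n:]])
        hits_sum += t.sum() >= obs.sum()
        hits_max += t.max() >= obs.max()
    return ((1 + hits_sum) / (1 + n_perm),
            (1 + hits_max) / (1 + n_perm))
```

Note that, as the abstract emphasizes, permuting group labels is strictly valid only when the two multivariate distributions are exchangeable under the null; the simulations in the paper probe what happens when that assumption fails.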

Novel molecularly targeted agents (MTAs) have emerged as valuable alternatives or complements to traditional cytotoxic agents in the treatment of cancer. Clinicians are combining cytotoxic agents with MTAs in a single trial to achieve treatment synergism and better outcomes for patients. An important feature of such combination trials is that, unlike the efficacy of the cytotoxic agent, that of the MTA may initially increase at low dose levels and then approximately plateau at higher dose levels as MTA saturation levels are reached. Therefore, the goal of the trial is to find the optimal dose combination that yields the highest efficacy with the lowest toxicity while satisfying a certain safety requirement. We propose a Bayesian phase I–II design to find the optimal dose combination. We model toxicity by using a logistic regression and propose a novel proportional hazard model for efficacy, which accounts for the plateau in the MTA dose–efficacy curve. We evaluate the operating characteristics of the proposed design through simulation studies under various practical scenarios. The results show that the design proposed performs well and selects the optimal dose combination with high probability.