Regulatory authorities require that the sample size of a confirmatory trial is calculated prior to the start of the trial. However, the sample size quite often depends on parameters that might not be known in advance of the study. Misspecification of these parameters can lead to under- or overestimation of the sample size. Both situations are unfavourable, as the former decreases the power and the latter leads to a waste of resources. Hence, designs have been suggested that allow a re-assessment of the sample size in an ongoing trial. These methods usually focus on estimating the variance. However, for some methods the performance depends not only on the variance but also on the correlation between measurements. We develop and compare different methods for blinded estimation of the correlation coefficient that are less likely to introduce operational bias when the blinding is maintained. Their performance with respect to bias and standard error is compared to the unblinded estimator. We simulated two different settings: one assuming that all group means are the same and one assuming that different groups have different means. Simulation results show that the naïve (one-sample) estimator is only slightly biased and has a standard error comparable to that of the unblinded estimator. However, if the group means differ, other estimators have better performance, depending on the sample size per group and the number of groups.
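To make the contrast between the two estimators concrete, here is a minimal Python sketch (not the paper's exact estimators; the group means, sample sizes, and correlation are arbitrary illustrative values): the naïve one-sample estimator pools all observations while ignoring group labels, whereas the unblinded estimator centers each group at its own mean first. When the group means differ, the naïve estimator is biased upward.

```python
# Illustrative sketch: naive one-sample ("blinded") correlation estimate vs.
# the unblinded estimator that centers each group at its own mean.
import numpy as np

rng = np.random.default_rng(1)

def simulate_groups(means, n_per_group, rho, sigma=1.0):
    """Bivariate normal pairs with common correlation rho; one mean per group."""
    cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
    return [rng.multivariate_normal([m, m], cov, size=n_per_group) for m in means]

def blinded_corr(groups):
    # Naive one-sample estimator: ignore group membership entirely.
    pooled = np.vstack(groups)
    return np.corrcoef(pooled.T)[0, 1]

def unblinded_corr(groups):
    # Center each group at its own mean, then pool the residuals.
    resid = np.vstack([g - g.mean(axis=0) for g in groups])
    return np.corrcoef(resid.T)[0, 1]

groups = simulate_groups(means=[0.0, 2.0], n_per_group=200, rho=0.5)
print(blinded_corr(groups), unblinded_corr(groups))
```

With a mean difference of 2, the between-group variation inflates the pooled correlation well above the true value of 0.5, while the group-centered estimate stays close to it.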

In follow-up studies, the disease event time can be subject to left truncation and right censoring. Furthermore, medical advancements have made it possible for patients to be cured of certain types of diseases. In this article, we consider a semiparametric mixture cure model for the regression analysis of left-truncated and right-censored data. The model combines a logistic regression for the probability of event occurrence with the class of transformation models for the time of occurrence. We investigate two techniques for estimating model parameters. The first approach is based on martingale estimating equations (EEs). The second approach is based on the conditional likelihood function given the truncation variables. The asymptotic properties of both proposed estimators are established. Simulation studies indicate that the conditional maximum-likelihood estimator (cMLE) performs well, while the estimator based on EEs is very unstable even though it is shown to be consistent. This is a special and intriguing phenomenon for the EE approach under the cure model. We provide insights into this issue and find that the EE approach can be improved significantly by assigning appropriate weights to the censored observations in the EEs. This finding is useful in overcoming the instability of the EE approach in more complicated situations where the likelihood approach is not feasible. We illustrate the proposed estimation procedures by analyzing the age at onset of the occiput-wall distance event for patients with ankylosing spondylitis.

Longitudinal studies are often applied in biomedical research and clinical trials to evaluate the treatment effect. The association pattern within the subject must be considered in both sample size calculation and the analysis. One of the most important approaches to analyzing such a study is the generalized estimating equation (GEE) method proposed by Liang and Zeger, in which a “working correlation structure” is introduced and the association pattern within the subject depends on a vector of association parameters denoted by ρ. Explicit sample size formulas for two-group comparisons in linear and logistic regression models have been obtained based on the GEE method by Liu and Liang. For cluster randomized trials (CRTs), researchers have proposed optimal sample sizes at both the cluster and individual level as a function of sampling costs and the intracluster correlation coefficient (ICC). In these approaches, the optimal sample sizes depend strongly on the ICC. However, the ICC is usually unknown for CRTs and multicenter trials. To overcome this shortcoming, Van Breukelen et al. consider a range of possible ICC values identified from literature reviews and present maximin designs (MMDs) based on relative efficiency (RE) and efficiency under budget and cost constraints. In this paper, the optimal sample size and number of repeated measurements using GEE models with an exchangeable working correlation matrix are proposed under the constraint of a fixed budget, where “optimal” refers to maximum power for a given sampling budget. The equations for the sample size and number of repeated measurements for a known parameter value ρ are derived, and a straightforward algorithm for unknown ρ is developed. Applications in practice are discussed. We also discuss the existence of the optimal design when an AR(1) working correlation matrix is assumed. Our proposed method can be extended to scenarios where the true and working correlation matrices differ.
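The design problem can be sketched numerically under simplifying assumptions (this is not the paper's derivation; the costs, budget, and ρ are illustrative): with an exchangeable working correlation, the variance of a group-mean contrast is proportional to (1 + (r − 1)ρ)/(nr) for n subjects with r repeated measurements each, and a budget B with per-subject cost c1 and per-measurement cost c2 limits the affordable n at each r.

```python
# Hedged sketch of budget-constrained design with exchangeable correlation:
# search over the number of repeated measurements r, take the largest
# affordable n, and keep the (n, r) pair minimizing the contrast variance
# (i.e., maximizing power).
def optimal_design(budget, c1, c2, rho, r_max=20):
    best = None
    for r in range(1, r_max + 1):
        n = int(budget // (c1 + c2 * r))      # subjects affordable at this r
        if n < 2:
            continue
        var = (1 + (r - 1) * rho) / (n * r)   # variance of the mean contrast
        if best is None or var < best[2]:
            best = (n, r, var)
    return best

n, r, var = optimal_design(budget=1000.0, c1=10.0, c2=2.0, rho=0.3)
print(n, r)
```

The search reproduces the familiar trade-off: larger ρ pushes the optimum toward fewer repeated measurements and more subjects.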

Data with a large *p* (number of covariates) and/or a large *n* (sample size) are now commonly encountered. For many problems, regularization, especially penalization, is adopted for estimation and variable selection. The straightforward application of penalization to large datasets demands a “big computer” with high computational power. To improve computational feasibility, we develop bootstrap penalization, which dissects a big penalized estimation into a set of small ones that can be executed in a highly parallel manner, each demanding only a “small computer”. The proposed approach takes different strategies for data with different characteristics. For data with a large *p* but a small to moderate *n*, covariates are first clustered into relatively homogeneous blocks. The proposed approach then consists of two sequential steps: in each step and for each bootstrap sample, we select blocks of covariates and run penalization. The results from multiple bootstrap samples are pooled to generate the final estimate. For data with a large *n* but a small to moderate *p*, we bootstrap a small number of subjects, apply penalized estimation, and then conduct a weighted average over multiple bootstrap samples. For data with both a large *p* and a large *n*, the natural marriage of the previous two methods is applied. Numerical studies, including simulations and data analysis, show that the proposed approach has computational and numerical advantages over the straightforward application of penalization. An R package has been developed to implement the proposed methods.
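The large-*n* strategy can be sketched as follows; ridge regression stands in here for the generic penalization, and the subsample size, number of bootstrap replicates, and penalty are arbitrary illustrative choices (the authors' own implementation is an R package):

```python
# Minimal sketch of the large-n bootstrap-penalization idea: many cheap fits
# on small bootstrap subsamples replace one expensive fit on all n subjects,
# and the coefficient vectors are averaged.
import numpy as np

rng = np.random.default_rng(7)

def ridge_fit(X, y, lam):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def bootstrap_penalized(X, y, subsample=200, n_boot=50, lam=1.0):
    n = X.shape[0]
    coefs = []
    for _ in range(n_boot):
        idx = rng.choice(n, size=subsample, replace=True)  # small bootstrap sample
        coefs.append(ridge_fit(X[idx], y[idx], lam))
    return np.mean(coefs, axis=0)  # pool the small fits

n, p = 100_000, 5
beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)
print(np.round(bootstrap_penalized(X, y), 2))
```

Each small fit touches only 200 of the 100,000 subjects, and the fits are embarrassingly parallel, which is the computational point of the approach.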

In clinical and epidemiological studies, information on the primary outcome of interest, that is, the disease status, is usually collected at a limited number of follow-up visits. The disease status can often only be retrieved retrospectively in individuals who are alive at follow-up, but will be missing for those who died before. Right-censoring the death cases at the last visit (ad hoc analysis) yields biased hazard ratio estimates of a potential risk factor, and the bias can be substantial and occur in either direction. In this work, we investigate three different approaches that use the same likelihood contributions, derived from an illness-death multistate model, in order to more adequately estimate the hazard ratio by including the death cases in the analysis: a parametric approach, a penalized likelihood approach, and an imputation-based approach. We investigate to what extent these approaches allow for an unbiased regression analysis by evaluating their performance in simulation studies and on a real data example. In doing so, we use the full cohort with complete illness-death data as reference and artificially induce missing information due to death by setting discrete follow-up visits. Compared to an ad hoc analysis, all considered approaches provide less biased or even unbiased results, depending on the situation studied. In the real data example, the parametric approach is seen to be too restrictive, whereas the imputation-based approach could almost reconstruct the original event history information.

Stratified medicine seeks to identify biomarkers or parsimonious gene signatures distinguishing patients who will benefit most from a targeted treatment. We evaluated 12 approaches in high-dimensional Cox models in randomized clinical trials: penalization of the biomarker main effects and biomarker-by-treatment interactions (full-lasso, three kinds of adaptive lasso, ridge+lasso, and group-lasso); dimensionality reduction of the main effect matrix via linear combinations (PCA+lasso, where PCA is principal components analysis, or PLS+lasso, where PLS is partial least squares); penalization of modified covariates or of the arm-specific biomarker effects (two-I model); gradient boosting; and a univariate approach with control of multiple testing. We compared these methods via simulations, evaluating their selection abilities in null and alternative scenarios. We varied the number of biomarkers, of nonnull main effects, and of true biomarker-by-treatment interactions. We also proposed a novel measure evaluating the interaction strength of the developed gene signatures. The group-lasso, two-I model, and gradient boosting performed poorly in null scenarios with nonnull main effects, but performed well in alternative scenarios that also had high interaction strength. The adaptive lasso with grouped weights was too conservative. The modified covariates, PCA+lasso, PLS+lasso, and ridge+lasso performed moderately. The full-lasso and adaptive lassos performed well, with the exception of the full-lasso in the presence of only nonnull main effects. The univariate approach performed poorly in alternative scenarios. We also illustrate the methods using gene expression data from 614 breast cancer patients treated with adjuvant chemotherapy.

Omics experiments endowed with a time-course design may enable us to uncover the dynamic interplay among genes of cellular processes. Multivariate techniques (like VAR(1) models describing the temporal and contemporaneous relations among variates) that may facilitate this goal are hampered by the high dimensionality of the resulting data. This is resolved by the presented ridge regularized maximum likelihood estimation procedure for the VAR(1) model. Information on the absence of temporal and contemporaneous relations may be incorporated in this procedure. Its computationally efficient implementation is discussed. The estimation procedure is accompanied by a leave-one-out cross-validation (LOOCV) scheme to determine the associated penalty parameters. Downstream exploitation of the estimated VAR(1) model is outlined: an empirical Bayes procedure to identify the interesting temporal and contemporaneous relationships, impulse response analysis, mutual information analysis, and covariance decomposition into the (graphical) relations among variates. In a simulation study the presented ridge estimation procedure outperformed a sparse competitor in terms of Frobenius loss of the estimates, while their selection properties are on par. The proposed machinery is illustrated in the reconstruction of the p53 signaling pathway during HPV-induced cellular transformation. The methodology is implemented in the ragt2ridges R-package available from CRAN.

Multiple imputation has become a widely accepted technique to deal with the problem of incomplete data. Typically, imputation of missing values and the statistical analysis are performed separately. Therefore, the imputation model has to be consistent with the analysis model. If the data are analyzed with a mixture model, the parameter estimates are usually obtained iteratively. Thus, if the data are missing not at random, parameter estimation and treatment of missingness should be combined. We solve both problems by simultaneously imputing values using the data augmentation method and estimating parameters using the EM algorithm. This iterative procedure ensures that the missing values are properly imputed given the current parameter estimates. Properties of the parameter estimates were investigated in a simulation study. The results are illustrated using data from the National Health and Nutrition Examination Survey.

A new approach for statistical association signal identification is developed in this paper. We consider a strategy for nonprecise signal identification by extending the well-known signal detection and signal identification methods applicable to the multiple testing problem. The collection of statistical instruments under the presented approach is much broader than under traditional signal identification methods, allowing more efficient signal discovery. Assessments based on maximal-value and average statistics in signal discovery are also improved. Our method does not attempt to detect individual predictors; instead, it detects sets of predictors that are jointly associated with the outcome. Therefore, an important application is the genome-wide association study (GWAS), where it can be used to detect genes which influence the phenotype but do not contain any individually significant single nucleotide polymorphism (SNP). We compare the power of the signal identification method based on extremes of single *p*-values with the signal localization method based on average statistics for logarithms of *p*-values. A simulation analysis informs the application of signal localization using the average statistics for wide-signal discovery in a Gaussian white noise process. We apply the average statistics and the localization method to a GWAS of a Chinese cohort to discover regulatory loci influencing the risk of nasopharyngeal carcinoma (NPC).

The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations and for the estimation of empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than that of the LRT statistic to the limiting χ^{2} distribution. The tests are implemented on a real dataset from medical studies.

In this paper, we consider different methods to test the interaction between treatment and a potentially large number (*p*) of covariates in randomized clinical trials. The simplest approach is to fit univariate (marginal) models and to combine the univariate statistics or *p*-values (e.g., the minimum *p*-value). Another possibility is to reduce the dimension of the covariates using principal components (PCs) and to test the interaction between treatment and the PCs. Finally, we consider the Goeman global test applied to the high-dimensional interaction matrix, adjusted for the main (treatment and covariate) effects. These tests can be used in personalized medicine to assess whether a large set of biomarkers can be useful for identifying a subset of patients who may be more responsive to treatment. We evaluate the performance of these methods on simulated data and apply them to data from two early-phase oncology clinical trials.
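The simplest of the approaches above, marginal interaction models combined through the minimum *p*-value, can be sketched in a few lines of Python (illustrative only; a large-sample normal approximation replaces the exact t reference distribution, and a Bonferroni correction is used for the minimum):

```python
# Sketch of the marginal (univariate) interaction screen: for each covariate j,
# fit y ~ treat + x_j + treat:x_j by least squares, test the interaction
# coefficient, and Bonferroni-adjust the minimum p-value.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)

def min_p_interaction(y, treat, X):
    n, p = X.shape
    pvals = []
    for j in range(p):
        D = np.column_stack([np.ones(n), treat, X[:, j], treat * X[:, j]])
        coef = np.linalg.lstsq(D, y, rcond=None)[0]
        resid = y - D @ coef
        sigma2 = resid @ resid / (n - 4)
        cov = sigma2 * np.linalg.inv(D.T @ D)
        z = coef[3] / np.sqrt(cov[3, 3])          # interaction coefficient
        pvals.append(2 * (1 - NormalDist().cdf(abs(z))))
    return min(1.0, p * min(pvals))               # Bonferroni adjustment

n, p = 500, 20
treat = rng.integers(0, 2, n).astype(float)
X = rng.standard_normal((n, p))
y = treat + 0.8 * treat * X[:, 0] + rng.standard_normal(n)  # covariate 0 interacts
print(min_p_interaction(y, treat, X))
```

With a genuine treatment-by-covariate interaction on the first biomarker, the adjusted minimum *p*-value is far below conventional significance levels.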

We discuss group-sequential designs in superiority clinical trials with multiple co-primary endpoints, that is, when trials are designed to evaluate if the test intervention is superior to the control on all primary endpoints. We consider several decision-making frameworks for evaluating efficacy or futility, based on boundaries using group-sequential methodology. We incorporate the correlations among the endpoints into the calculations for futility boundaries and sample sizes as a function of other design parameters, including mean differences, the number of analyses, and efficacy boundaries. We investigate the operating characteristics of the proposed decision-making frameworks in terms of efficacy/futility boundaries, power, the Type I error rate, and sample sizes, while varying the number of analyses, the correlations among the endpoints, and the mean differences. We provide an example to illustrate the methods and discuss practical considerations when designing efficient group-sequential designs in clinical trials with co-primary endpoints.

Random-effects meta-analyses are used to combine evidence of treatment effects from multiple studies. Since treatment effects may vary across trials due to differences in study characteristics, heterogeneity in treatment effects between studies must be accounted for to achieve valid inference. The standard model for random-effects meta-analysis assumes approximately normal effect estimates and a normal random-effects model. However, standard methods based on this model ignore the uncertainty in estimating the between-trial heterogeneity. In the special setting of only two studies and in the presence of heterogeneity, we investigate here alternatives such as the Hartung-Knapp-Sidik-Jonkman method (HKSJ), the modified Knapp-Hartung method (mKH, a variation of the HKSJ method) and Bayesian random-effects meta-analyses with priors covering plausible heterogeneity values; code to reproduce the examples is presented in an appendix. The properties of these methods are assessed by applying them to five examples from various rare diseases and by a simulation study. Whereas the standard method based on normal quantiles has poor coverage, the HKSJ and mKH generally lead to very long, and therefore inconclusive, confidence intervals. The Bayesian intervals on the whole show satisfying properties and offer a reasonable compromise between these two extremes.
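A hedged numerical sketch of the two-study case illustrates the contrast between the standard normal-quantile interval and the HKSJ interval (the effect estimates and standard errors below are arbitrary illustrative values, not from the paper's examples): with k = 2 studies the HKSJ interval uses a t quantile with one degree of freedom, which is the Cauchy quantile, hence no external dependency is needed.

```python
# DerSimonian-Laird between-trial variance, then two intervals for the pooled
# effect: the standard normal-quantile CI and the HKSJ CI (variance rescaled,
# t quantile with k - 1 df; for k = 2 that quantile is tan(pi * 0.475) ~ 12.7).
import math

def random_effects_ci(est, se, alpha=0.05):
    k = len(est)
    w = [1 / s**2 for s in se]
    mu_fe = sum(wi * yi for wi, yi in zip(w, est)) / sum(w)
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, est))
    tau2 = max(0.0, (q - (k - 1)) / (sum(w) - sum(wi**2 for wi in w) / sum(w)))
    wstar = [1 / (s**2 + tau2) for s in se]
    mu = sum(wi * yi for wi, yi in zip(wstar, est)) / sum(wstar)
    var = 1 / sum(wstar)
    q_hksj = sum(wi * (yi - mu) ** 2 for wi, yi in zip(wstar, est)) / (k - 1)
    t_quant = math.tan(math.pi * (1 - alpha / 2 - 0.5))  # t with 1 df (Cauchy)
    z = 1.959963984540054
    ci_normal = (mu - z * math.sqrt(var), mu + z * math.sqrt(var))
    half = t_quant * math.sqrt(q_hksj * var)
    return ci_normal, (mu - half, mu + half)

ci_n, ci_h = random_effects_ci(est=[0.5, 1.5], se=[0.3, 0.3])
print(ci_n, ci_h)
```

The output reproduces the qualitative finding of the abstract: the HKSJ interval is dramatically longer than the normal-quantile interval when only two heterogeneous studies are available.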

Science can be seen as a sequential process where each new study adds evidence to the existing knowledge. To have the best prospects of making an impact in this process, a new study should be designed optimally, taking into account the previous studies and other prior information. We propose a formal approach for covariate prioritization, that is, the decision about which covariates are to be measured in a new study. The decision criteria can be based on conditional power, change of the *p*-value, change in the lower confidence limit, Kullback–Leibler divergence, Bayes factors, the Bayesian false discovery rate, or the difference between prior and posterior expectation. The criteria can also be used for decisions on the sample size. As an illustration, we consider covariate prioritization based on genome-wide association studies for C-reactive protein levels and make suggestions on the genes to be studied further.

Measurement error in exposure variables is a serious impediment in epidemiological studies that relate exposures to health outcomes. In nutritional studies, interest could be in the association between long-term dietary intake and disease occurrence. Long-term intake is usually assessed with a food frequency questionnaire (FFQ), which is prone to recall bias. Measurement error in FFQ-reported intakes leads to bias in the parameter estimate that quantifies the association. To adjust for this bias, a calibration study is required to obtain unbiased intake measurements using a short-term instrument such as the 24-hour recall (24HR). The 24HR intakes are used as the response in regression calibration to adjust for bias in the association. For foods not consumed daily, 24HR-reported intakes are usually characterized by excess zeroes, right skewness, and heteroscedasticity, posing a serious challenge to regression calibration modeling. We propose a zero-augmented calibration model to adjust for measurement error in reported intake, while handling excess zeroes, skewness, and heteroscedasticity simultaneously without transforming the 24HR intake values. We compared the proposed calibration method with the standard method and with methods that ignore measurement error by estimating long-term intake with 24HR- and FFQ-reported intakes. The comparison was done in real and simulated datasets. With the 24HR intake, the mean increase in mercury level per ounce of fish intake was about 0.4; with the FFQ intake, the increase was about 1.2. With both calibration methods, the mean increase was about 2.0. A similar trend was observed in the simulation study. In conclusion, the proposed calibration method performs at least as well as the standard method.
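The standard (non-zero-augmented) regression calibration that the proposed method is compared against can be sketched as follows (a simulated toy example with made-up error structure, not the paper's data or model): regress the 24HR reference intake on the FFQ intake in the calibration substudy, then fit the outcome model on the predicted intake.

```python
# Standard regression calibration sketch: a biased, noisy FFQ measurement is
# calibrated against an unbiased 24HR reference, and the calibrated exposure
# corrects the slope in the outcome model.
import numpy as np

rng = np.random.default_rng(5)

n = 2000
true_intake = rng.gamma(shape=2.0, scale=1.0, size=n)
ffq = 0.3 * true_intake + rng.normal(0, 0.6, n)       # biased, noisy FFQ
recall24 = true_intake + rng.normal(0, 0.4, n)        # unbiased 24HR reference
outcome = 2.0 * true_intake + rng.normal(0, 1.0, n)   # true slope = 2.0

def slope(x, y):
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

naive = slope(ffq, outcome)                 # distorted by error in the FFQ
a = slope(ffq, recall24)                    # calibration regression
pred = recall24.mean() + a * (ffq - ffq.mean())
calibrated = slope(pred, outcome)           # outcome model on predicted intake
print(naive, calibrated)
```

The naïve FFQ slope is distorted by the measurement error, while the calibrated slope recovers the true coefficient of 2.0, mirroring the pattern of estimates reported in the abstract.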

Randomized clinical trials comparing several treatments to a common control are often reported in the medical literature. For example, multiple experimental treatments may be compared with placebo, or in combination therapy trials, a combination therapy may be compared with each of its constituent monotherapies. Such trials are typically designed using a balanced approach in which equal numbers of individuals are randomized to each arm; however, this can result in an inefficient use of resources. We provide a unified framework and new theoretical results for optimal design of such single-control multiple-comparator studies. We consider variance optimal designs based on *D*-, *A*-, and *E*-optimality criteria, using a general model that allows for heteroscedasticity and a range of effect measures that include both continuous and binary outcomes. We demonstrate the sensitivity of these designs to the type of optimality criterion by showing that the optimal allocation ratios are systematically ordered according to the optimality criterion. Given this sensitivity to the optimality criterion, we argue that power optimality is a more suitable approach when designing clinical trials where testing is the objective. Weighted variance optimal designs are also discussed, which, like power optimal designs, allow the treatment difference to play a major role in determining allocation ratios. We illustrate our methods using two real clinical trial examples taken from the medical literature. Some recommendations on the use of optimal designs in single-control multiple-comparator trials are also provided.

This paper presents a novel semiparametric joint model for multivariate longitudinal and survival data (SJMLS) by relaxing the normality assumption of the longitudinal outcomes, leaving the baseline hazard functions unspecified and allowing the history of the longitudinal response having an effect on the risk of dropout. Using Bayesian penalized splines to approximate the unspecified baseline hazard function and combining the Gibbs sampler and the Metropolis–Hastings algorithm, we propose a Bayesian Lasso (BLasso) method to simultaneously estimate unknown parameters and select important covariates in SJMLS. Simulation studies are conducted to investigate the finite sample performance of the proposed techniques. An example from the International Breast Cancer Study Group (IBCSG) is used to illustrate the proposed methodologies.

We present the one-inflated zero-truncated negative binomial (OIZTNB) model, and propose its use as the truncated count distribution in Horvitz–Thompson estimation of an unknown population size. In the presence of unobserved heterogeneity, the zero-truncated negative binomial (ZTNB) model is a natural choice over the positive Poisson (PP) model; however, when one-inflation is present the ZTNB model either suffers from a boundary problem or provides extremely biased population size estimates. Monte Carlo evidence suggests that in the presence of one-inflation, the Horvitz–Thompson estimator under the ZTNB model can converge in probability to infinity. The OIZTNB model gives markedly different population size estimates compared to some existing truncated count distributions when applied to several capture–recapture datasets that exhibit both one-inflation and unobserved heterogeneity.
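The Horvitz–Thompson idea itself can be sketched with a zero-truncated Poisson standing in for the OIZTNB model (the full model adds dispersion and one-inflation parameters; the capture frequencies below are made up for illustration): estimate the rate from the observed positive counts, then divide the number of observed units by the estimated detection probability.

```python
# Horvitz-Thompson population size under a zero-truncated Poisson sketch.
import math

def fit_ztp(counts, iters=200):
    """MLE of lambda for a zero-truncated Poisson via fixed-point iteration:
    the truncated mean satisfies xbar = lam / (1 - exp(-lam))."""
    xbar = sum(counts) / len(counts)
    lam = xbar
    for _ in range(iters):
        lam = xbar * (1 - math.exp(-lam))
    return lam

def horvitz_thompson(counts):
    lam = fit_ztp(counts)
    p_seen = 1 - math.exp(-lam)       # probability of at least one capture
    return len(counts) / p_seen       # estimated population size

# Toy capture frequencies: 55 units seen once, 30 twice, 10 thrice, 5 four times.
counts = [1] * 55 + [2] * 30 + [3] * 10 + [4] * 5
print(horvitz_thompson(counts))
```

The 100 observed units inflate to roughly 150 once the unseen fraction is accounted for; excess ones in the frequency data would distort the fitted rate, which is the motivation for the one-inflated model.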

Spatiotemporal disease mapping focuses on estimating the spatial pattern in disease risk across a set of nonoverlapping areal units over a fixed period of time. The key aim of such research is to identify areas that have a high average level of disease risk or where disease risk is increasing over time, thus allowing public health interventions to be focused on these areas. Such aims are well suited to the statistical approach of clustering, and while much research has been done in this area in a purely spatial setting, only a handful of approaches have focused on spatiotemporal clustering of disease risk. Therefore, this paper outlines a new modeling approach for clustering spatiotemporal disease risk data, by clustering areas based on both their mean risk levels and the behavior of their temporal trends. The efficacy of the methodology is established by a simulation study, and is illustrated by a study of respiratory disease risk in Glasgow, Scotland.

For clinical studies in which two coprimary endpoints are necessary for assuring the efficacy of the treatment of interest, it is important to determine the minimal sample size needed to attain a certain conjunctive power (i.e., the power to reject the false null hypotheses for both endpoints). The traditional method of assigning the square root of the targeted overall power to each of the two hypothesis tests is optimal only when the standardized treatment effect sizes of the two endpoints are equal. In spite of this limitation, the square-root method is applied routinely, resulting in frequent overestimation of the overall sample size. A new, iterative method is presented to find the two individual power values for the two endpoints so that the targeted overall power is attained with the smallest possible overall sample size. The principle is to assign more power to the endpoint for which a larger standardized effect size is likely to occur, based on prior information. The main assumption of the new method is the independence of the endpoints. However, this is not a serious limitation in the case of the type II error, so the method yields a good approximation even if the independence condition is not fulfilled. The advantages of the new method are (a) a considerable saving (up to 24% in our examples) in the overall sample size, (b) flexibility, as it can be applied to any combination of endpoint types (e.g., normally distributed + binomial, survival + binomial, etc.), and (c) ease of programming.
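The search principle can be sketched for two normal endpoints under the independence assumption (an illustrative grid search, not the paper's algorithm; effect sizes and the one-sided alpha are made-up values): the conjunctive power is the product of the marginal powers, the trial size is the larger of the two per-endpoint sample sizes, and we scan how the power budget is split.

```python
# Power-splitting sketch for two coprimary normal endpoints.  Per-group n for
# standardized effect d and marginal power is 2*((z_alpha + z_power)/d)^2;
# overall n is max(n1, n2) since both endpoints use the same subjects.
from statistics import NormalDist
import math

def n_for(d, power, alpha=0.025):
    z = NormalDist().inv_cdf
    return 2 * ((z(1 - alpha) + z(power)) / d) ** 2

def optimal_split(d1, d2, target=0.80, alpha=0.025, grid=1000):
    best = None
    for i in range(1, grid):
        p1 = target + (1 - target) * i / grid   # marginal power for endpoint 1
        p2 = target / p1                        # so that p1 * p2 = target
        if p2 >= 1:
            continue
        n = max(n_for(d1, p1, alpha), n_for(d2, p2, alpha))
        if best is None or n < best[0]:
            best = (n, p1, p2)
    return best

sq = math.sqrt(0.80)                            # traditional square-root split
n_sq = max(n_for(0.3, sq), n_for(0.5, sq))
n_opt, p1, p2 = optimal_split(0.3, 0.5)
print(n_sq, n_opt)
```

With unequal effect sizes (0.3 vs. 0.5), shifting power toward the larger-effect endpoint cuts the overall per-group sample size by roughly a fifth relative to the square-root split, consistent with the savings quoted in the abstract.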

In scientific research, many hypotheses relate to the comparison of two independent groups. Usually, it is of interest to use a design (i.e., the allocation of sample sizes *m* and *n* for a fixed total sample size) that maximizes the power of the applied statistical test. It is known that the two-sample *t*-tests for homogeneous and heterogeneous variances may lose substantial power when variances are unequal but equally large samples are used. We demonstrate that this is not the case for the nonparametric Wilcoxon–Mann–Whitney test, whose application in biometrical research fields is motivated by two examples from cancer research. We prove the optimality of the balanced design in the case of symmetric and identically shaped distributions using normal approximations, and show that this design generally offers power only negligibly lower than the optimal design for a wide range of distributions.
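The claim can be probed by simulation (an illustrative sketch, not the paper's study: a normal approximation to the Wilcoxon–Mann–Whitney null distribution, made-up shift and variance values, and a modest number of replications):

```python
# Monte Carlo power of the Wilcoxon-Mann-Whitney test (normal approximation,
# two-sided alpha = 0.05) for a balanced vs. an unbalanced split of 60 subjects
# under unequal variances.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(11)

def wmw_power(m, n, delta, sd1, sd2, reps=2000, alpha=0.05):
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(reps):
        x = rng.normal(0.0, sd1, m)
        y = rng.normal(delta, sd2, n)
        u = (x[:, None] < y[None, :]).sum()            # Mann-Whitney U
        z = (u - m * n / 2) / np.sqrt(m * n * (m + n + 1) / 12)
        hits += abs(z) > crit
    return hits / reps

balanced = wmw_power(30, 30, delta=0.8, sd1=1.0, sd2=2.0)
unbalanced = wmw_power(20, 40, delta=0.8, sd1=1.0, sd2=2.0)
print(balanced, unbalanced)
```

In line with the abstract, the balanced split loses essentially nothing even though the variances differ by a factor of four.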

In high-dimensional omics studies, where multiple molecular profiles are obtained for each set of patients, there is often interest in identifying complex multivariate associations, for example, copy number regulated expression levels in a certain pathway or in a genomic region. To detect such associations, we present a novel approach to test for association between two sets of variables. Our approach generalizes the global test, which tests for association between a group of covariates and a single univariate response, to allow for a high-dimensional multivariate response. We apply the method to several simulated datasets as well as two publicly available datasets, where we compare the performance of the multivariate global test (G2) with that of the univariate global test. The method is implemented in R and will be available as part of the globaltest package in R.

We define an adaptive procedure for control of the false discovery rate that is uniformly more powerful than the procedure of Benjamini and Hochberg. The power gain is tiny, however, and only appreciable for small numbers of hypotheses. We illustrate the new method with the case of two hypotheses, for which so far no procedure was known that controls the false discovery rate without also controlling the familywise error rate under positive dependence.
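For reference, the classical Benjamini–Hochberg step-up procedure that the proposed adaptive method uniformly improves can be stated in a few lines (the improvement itself is not reproduced here): reject the hypotheses with the k smallest *p*-values, where k is the largest index with p_(i) ≤ (i/m)·q.

```python
# Classical Benjamini-Hochberg step-up procedure at FDR level q.
def benjamini_hochberg(pvals, q=0.05):
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k = rank                  # step-up: keep the largest passing rank
    return sorted(order[:k])          # indices of rejected hypotheses

print(benjamini_hochberg([0.001, 0.049, 0.04, 0.9], q=0.05))
```

Note the step-up character: a hypothesis can be rejected even when its own *p*-value exceeds its threshold, provided a later-ranked one passes.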

Few articles have been written on analyzing three-way interactions between drugs. It may seem quite straightforward to extend a statistical method from two drugs to three drugs. However, a three-drug study may exhibit a more complex nonlinear response surface of the interaction index, with more complex local synergy and/or local antagonism interspersed in different regions of drug combinations, than a two-drug study. In addition, it is not possible to obtain a four-dimensional (4D) response surface plot for a three-drug study. We propose an analysis procedure to construct the dose combination regions of interest (say, the synergistic areas indicated by the estimated interaction index). First, we use the model robust regression method (MRR), a semiparametric method, to fit the entire response surface of the interaction index, which can accommodate a complex response surface with local synergy/antagonism. Second, we run a modified genetic algorithm (MGA), a stochastic optimization method, many times with different random seeds, in order to collect as many feasible points as possible that satisfy the estimated values of the interaction index. Last, all these feasible points are used to construct the approximate dose regions of interest in three dimensions. A case study with three anti-cancer drugs in an in vitro experiment is employed to illustrate how to find the dose regions of interest.

The problem of choosing a sample size for a clinical trial is a very common one. In some settings, such as rare diseases or other small populations, the large sample sizes usually associated with the standard frequentist approach may be infeasible, suggesting that the sample size chosen should reflect the size of the population under consideration. Incorporation of the population size is possible in a decision-theoretic approach, either explicitly by assuming that the population size is fixed and known, or implicitly through geometric discounting of the gain from future patients reflecting the expected population size. This paper develops such approaches. Building on previous work, an asymptotic expression is derived for the sample size for single-arm and two-arm clinical trials in the general case of a clinical trial with a primary endpoint whose distribution is of one-parameter exponential family form, optimizing a utility function that quantifies the cost and gain per patient as a continuous function of this parameter. It is shown that as the size of the population, *N*, or the expected size in the case of geometric discounting, becomes large, the optimal trial size grows in proportion to the square root of the (expected) population size. The sample size obtained from the asymptotic expression is also compared with the exact optimal sample size in examples with Bernoulli- and Poisson-distributed responses, showing that the asymptotic approximations can be reasonable even for relatively small sample sizes.

Many attempts have been made to formalize ethical requirements for research. Among the most prominent mechanisms are informed consent requirements and data protection regimes. These mechanisms, however, sometimes appear as obstacles to research. In this opinion paper, we critically discuss conventional approaches to research ethics that emphasize consent and data protection. Several recent debates have highlighted other important ethical issues and underlined the need for greater openness in order to uphold the integrity of health-related research. Some of these measures, such as the sharing of individual-level data, pose problems for standard understandings of consent and privacy. Here, we argue that these interpretations tend to be overdemanding: They do not really protect research subjects and they hinder the research process. Accordingly, we suggest another way of framing these requirements. Individual consent must be situated alongside the wider distribution of knowledge created when the actions, commitments, and procedures of researchers and their institutions are opened to scrutiny. And instead of simply emphasizing privacy or data protection, we should understand confidentiality as a principle that facilitates the sharing of information while upholding important safeguards. Consent and confidentiality belong to a broader set of safeguards and procedures to uphold the integrity of the research process.

We propose a method to plan the number of occasions of recapture experiments for population size estimation. We do so by fixing the smallest number of capture occasions so that the expected length of the profile confidence interval is less than or equal to a fixed threshold. In some cases, we solve the optimization problem in closed form. For more complex models we use numerical optimization. We detail models assuming homogeneous, time-varying, subject-specific capture probabilities, behavioral response to capture, and combining behavioral response with subject-specific effects. The principle we propose can be extended to plan any other model specification. We formally show the validity of the approach by proving distributional convergence. We illustrate with simulations and challenging examples in epidemiology and ecology. We report that in many cases adding as few as two sampling occasions may substantially reduce the length of confidence intervals.

To optimize resources, randomized clinical trials with multiple arms can be an attractive option to simultaneously test various treatment regimens in pharmaceutical drug development. The motivation for this work was the successful conduct and positive final outcome of a three-arm randomized clinical trial primarily assessing whether obinutuzumab plus chlorambucil is superior to chlorambucil alone, based on a time-to-event endpoint, in patients with chronic lymphocytic leukemia and coexisting conditions. The inference strategy of this trial was based on a closed testing procedure. We compare this strategy to three potential alternatives for running a three-arm clinical trial with a time-to-event endpoint. The primary goal is to quantify the differences between these strategies in terms of the time until the first analysis, and thus potential approval of a new drug, the number of required events, and power. Operational aspects of implementing the various strategies are discussed. In conclusion, using a closed testing procedure results in the shortest time to the first analysis with a minimal loss in power. Therefore, closed testing procedures should be part of the statistician's standard clinical trials toolbox when planning multiarm clinical trials.

In randomized trials with noncompliance, causal effects cannot be identified without strong assumptions. Therefore, several authors have considered bounds on the causal effects. Applying an idea of VanderWeele (), Chiba () gave bounds on the average causal effects in randomized trials with noncompliance, using the information on the randomized assignment, the treatment received, and the outcome, under monotonicity assumptions about covariates. However, he did not consider any observed covariates. When observed covariates such as age, gender, and race are available in a trial, we propose new bounds that use the observed covariate information, under monotonicity assumptions similar to those of VanderWeele and Chiba. We compare the three bounds in a real example.

We investigate rank-based studentized permutation methods for the nonparametric Behrens–Fisher problem, that is, inference methods for the area under the ROC curve. We prove that the studentized permutation distribution of the Brunner–Munzel rank statistic is asymptotically standard normal, even under the alternative, thereby providing the hitherto missing theoretical foundation for the Neubert and Brunner studentized permutation test. In particular, we not only show its consistency but also show that confidence intervals for the underlying treatment effects can be computed by inverting this permutation test. In addition, we derive permutation-based range-preserving confidence intervals. Extensive simulation studies show that the permutation-based confidence intervals maintain the preassigned coverage probability quite accurately, even for rather small sample sizes. For convenient application of the proposed methods, a freely available software package for the statistical software R has been developed. A real data example illustrates the application.
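As a minimal sketch (not the authors' R package), a Neubert–Brunner-type studentized permutation test can be obtained by permuting the pooled observations and recomputing the studentized Brunner–Munzel statistic each time; SciPy's `scipy.stats.brunnermunzel` supplies the statistic:

```python
import numpy as np
from scipy.stats import brunnermunzel

def bm_permutation_test(x, y, n_perm=2000, seed=0):
    """Studentized permutation test based on the Brunner-Munzel rank statistic:
    compare the observed statistic to its distribution under random relabeling."""
    rng = np.random.default_rng(seed)
    w_obs = brunnermunzel(x, y).statistic
    pooled = np.concatenate([x, y])
    n1 = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        w = brunnermunzel(perm[:n1], perm[n1:]).statistic
        if abs(w) >= abs(w_obs):
            count += 1
    return w_obs, (count + 1) / (n_perm + 1)  # permutation p-value
```

Inverting such a test over a grid of shifts would give the permutation-based confidence intervals described above.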

In diagnostic medicine, the volume under the receiver operating characteristic (ROC) surface (VUS) is a commonly used index to quantify the ability of a continuous diagnostic test to discriminate between three disease states. In practice, verification of the true disease status may be performed only for a subset of subjects under study, since the verification procedure is invasive, risky, or expensive. The selection for disease examination might depend on the results of the diagnostic test and other clinical characteristics of the patients, which in turn can cause bias in estimates of the VUS. This bias is referred to as verification bias. Existing verification bias correction in three-way ROC analysis focuses on ordinal tests. We propose verification bias-correction methods to construct the ROC surface and estimate the VUS for a continuous diagnostic test, based on inverse probability weighting. By applying U-statistics theory, we develop asymptotic properties for the estimator. A jackknife estimator of variance is also derived. Extensive simulation studies are performed to evaluate the performance of the new estimators in terms of bias correction and variance. The proposed methods are used to assess the ability of a biomarker to accurately identify stages of Alzheimer's disease.
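A hedged sketch of the inverse-probability-weighting idea: assuming the verification probabilities `pi` are known (in practice they would be modeled from the test results and covariates), a bias-corrected VUS estimate reweights each verified triple by the product of inverse verification probabilities. The function name `ipw_vus` and the setup are illustrative, not the paper's exact estimator:

```python
import numpy as np

def ipw_vus(x, d, verified, pi):
    """IPW-corrected VUS estimate of P(X_1 < X_2 < X_3) across the three
    disease classes, using only verified subjects (true class d observed
    only when verified), each reweighted by 1/pi."""
    w = verified / pi
    groups = [np.where(verified & (d == g))[0] for g in (1, 2, 3)]
    num, den = 0.0, 0.0
    for i in groups[0]:
        for j in groups[1]:
            for k in groups[2]:
                wijk = w[i] * w[j] * w[k]
                den += wijk
                num += wijk * (x[i] < x[j] < x[k])
    return num / den
```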

Characterization of a subpopulation by the difference in marginal means of the outcome under the intervention and control may not be sufficient to provide informative guidance for individual decision and public policy making. Specifically, we are often interested in the treatment benefit rate (TBR), that is, the probability of benefiting from an intervention in a meaningful way. For binary outcomes, the TBR is the proportion that has an “unfavorable” outcome under the control and a “favorable” outcome under the intervention. Identification of subpopulations with distinct TBR by baseline characteristics will have significant implications in clinical settings where a medical intervention with potential negative health impact is under consideration for a given patient. In addition, these subpopulations with unique TBR set the basis for guidance in implementing the intervention toward a more personalized scheme of treatment. In this article, we propose a Bayesian tree-based latent variable model to seek subpopulations with distinct TBR. Our method offers a nonparametric Bayesian framework that accounts for the uncertainty in estimating potential outcomes and allows a more exhaustive search of the partitions of the baseline covariate space. The method is evaluated through a simulation study and applied to a randomized clinical trial of implantable cardioverter defibrillators to reduce mortality.

Spontaneous adverse event reports have a high potential for detecting adverse drug reactions. However, due to their size, the analysis of such databases requires statistical methods. In this context, disproportionality measures can be used. Their main idea is to project the data onto contingency tables in order to measure the strength of associations between drugs and adverse events. However, due to the data projection, these methods are sensitive to the problem of coprescriptions and masking effects. Recently, logistic regressions have been used with a Lasso-type penalty to detect associations between drugs and adverse events. On several examples, this approach limits the drawbacks of the disproportionality methods, but the choice of the penalty value is open to criticism, as it strongly influences the results. In this paper, we propose to use a logistic regression whose sparsity is viewed as a model selection challenge. Since the model space is huge, a Metropolis–Hastings algorithm carries out the model selection by maximizing the BIC criterion. We thus avoid calibrating a penalty or threshold. In our application to the French pharmacovigilance database, the proposed method is compared to well-established approaches on a reference dataset, and obtains better rates of positive and negative controls. However, many signals (i.e., specific drug–event associations) are not detected by the proposed method. We therefore conclude that this method should be used in parallel to existing measures in pharmacovigilance.
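A minimal sketch of the idea, assuming a Bernoulli outcome and a handful of candidate drug indicators: a Metropolis–Hastings walk over variable subsets targeting the BIC-based posterior approximation exp(-BIC/2), where BIC is computed in the usual -2 log-likelihood + k log n convention. All names are illustrative, and the logistic fit is a plain Newton–Raphson, not the pharmacovigilance-scale implementation:

```python
import numpy as np

def logistic_bic(X, y):
    """BIC of a logistic regression (intercept included) fitted by Newton-Raphson."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X]) if X.shape[1] else np.ones((n, 1))
    beta = np.zeros(Z.shape[1])
    for _ in range(50):
        p = 1 / (1 + np.exp(-Z @ beta))
        W = p * (1 - p) + 1e-9                      # IRLS weights, ridged for stability
        step = np.linalg.solve(Z.T @ (Z * W[:, None]), Z.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    p = 1 / (1 + np.exp(-Z @ beta))
    loglik = np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return -2 * loglik + Z.shape[1] * np.log(n)

def mh_model_search(X, y, n_iter=500, seed=1):
    """Metropolis-Hastings walk over variable subsets, flipping one
    inclusion indicator per step and accepting with prob exp((BIC - BIC')/2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    gamma = np.zeros(d, bool)                       # start from the empty model
    bic = logistic_bic(X[:, gamma], y)
    best_gamma, best_bic = gamma.copy(), bic
    for _ in range(n_iter):
        prop = gamma.copy()
        j = rng.integers(d)
        prop[j] = ~prop[j]
        bic_prop = logistic_bic(X[:, prop], y)
        if np.log(rng.uniform()) < (bic - bic_prop) / 2:
            gamma, bic = prop, bic_prop
        if bic < best_bic:                          # track best model visited
            best_gamma, best_bic = gamma.copy(), bic
    return best_gamma, best_bic
```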

Code implementing the proposed method is available at the following URL: https://github.com/masedki/MHTrajectoryR.

We consider models for hierarchical count data, subject to overdispersion and/or excess zeros. Molenberghs et al. () and Molenberghs et al. () extend the Poisson-normal generalized linear mixed model by including gamma random effects to accommodate overdispersion. Excess zeros are handled using either a zero-inflation or a hurdle component. These models were studied by Kassahun et al. (). While flexible, they are quite elaborate in parametric specification and therefore model assessment is imperative. We derive local influence measures to detect and examine influential subjects, that is, subjects who have undue influence on either the fit of the model as a whole or on specific important sub-vectors of the parameter vector. The latter include the fixed effects for the Poisson and excess-zeros components, the variance components for the normal random effects, and the parameters describing the gamma random effects included to accommodate overdispersion. Interpretable influence components are derived. The method is applied to data from a longitudinal clinical trial involving patients with epileptic seizures. Even though the data were extensively analyzed in earlier work, the insight gained from the proposed diagnostics, statistically and clinically, is considerable. Possibly, a small but important subgroup of patients has been identified.

The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only that a species occurred in the quadrat. If only occurrence data are available, then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable. This can be overcome by data augmentation or through modeling the dependence between quadrat occupancies. Here, we propose to record the (first) detection time while collecting occurrence data in a quadrat. We show that under what we call proportionate sampling, where the time to survey a region is proportional to its area, both negative binomial parameters are estimable. When the mean parameter is larger than two, our proposed approach is more efficient than the data augmentation method developed by Solow and Smith (, *Am. Nat*. **176**, 96–98), and in general is cheaper to conduct. We also investigate the effect of misidentification when collecting negative binomially distributed data, and conclude that, in general, the effect can be simply adjusted for, provided that the mean and variance of the misidentification probabilities are known. The results are demonstrated in a simulation study and illustrated in several real examples.

For the calculation of relative measures such as the risk ratio (RR) and odds ratio (OR) in a single study, additional approaches are required in the case of zero events. In the case of zero events in one treatment arm, the Peto odds ratio (POR) can be calculated without continuity correction, and it is currently the relative effect estimation method of choice for binary data with rare events. The aim of this simulation study is a comprehensive comparison of the estimated OR and estimated POR with the true OR in a single study with two parallel groups without confounders, in data situations where the POR is currently recommended. This comparison was performed by means of several performance measures, that is, the coverage, confidence interval (CI) width, mean squared error (MSE), and mean percentage error (MPE). We demonstrate that the estimator for the POR does not outperform the estimator for the OR on all the performance measures investigated. In the case of rare events, small treatment effects, and similar group sizes, the estimator for the POR performed better than the estimator for the OR only regarding the coverage and MPE, but not the CI width and MSE. For larger effects and unbalanced group size ratios, the coverage and MPE of the estimator for the POR were inappropriate. As the true effect is unknown in practice, the POR method should be applied only with the utmost caution.
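For reference, the Peto one-step estimator for a single 2×2 table uses the observed-minus-expected statistic and its hypergeometric variance, and it remains defined when one arm has zero events. A small sketch (illustrative code; the O, E, V formulas are the standard ones):

```python
import math

def peto_odds_ratio(a, n1, c, n2, alpha=0.05):
    """Peto one-step odds ratio for a 2x2 table with a/n1 events in the
    treatment arm and c/n2 in the control arm; usable without continuity
    correction when one arm has zero events."""
    N = n1 + n2
    m = a + c                                      # total events
    O = a                                          # observed events, treatment arm
    E = n1 * m / N                                 # expected events under H0
    V = m * (N - m) * n1 * n2 / (N**2 * (N - 1))   # hypergeometric variance
    log_por = (O - E) / V
    z = 1.959963984540054                          # normal quantile for 95% CI
    half = z / math.sqrt(V)
    return math.exp(log_por), (math.exp(log_por - half), math.exp(log_por + half))
```

With 0/100 events versus 5/100, this returns a POR of about 0.13 with a finite confidence interval, whereas the sample OR is undefined without a continuity correction.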

Standard optimization algorithms for maximizing likelihood may not be applicable to the estimation of those flexible multivariable models that are nonlinear in their parameters. For applications where the model's structure permits separating estimation of mutually exclusive subsets of parameters into distinct steps, we propose the alternating conditional estimation (ACE) algorithm. We validate the algorithm, in simulations, for estimation of two flexible extensions of Cox's proportional hazards model where the standard maximum partial likelihood estimation does not apply, with simultaneous modeling of (1) nonlinear and time-dependent effects of continuous covariates on the hazard, and (2) nonlinear interaction and main effects of the same variable. We also apply the algorithm in real-life analyses to estimate nonlinear and time-dependent effects of prognostic factors for mortality in colon cancer. Analyses of both simulated and real-life data illustrate good statistical properties of the ACE algorithm and its ability to yield new potentially useful insights about the data structure.
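The alternating principle itself can be illustrated on a toy model that is nonlinear in its parameters, y ≈ a·exp(b·x): conditional on b, the optimum for a is closed form, and conditional on a, b is a one-dimensional search. This is only a sketch of the alternating scheme under these toy assumptions, not the authors' Cox-model implementation:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ace_fit(x, y, n_cycles=30):
    """Alternating conditional least-squares estimation for y ≈ a*exp(b*x):
    block 1 (a) has a closed-form conditional optimum, block 2 (b) is
    optimized one-dimensionally with a frozen."""
    b = 0.0
    for _ in range(n_cycles):
        e = np.exp(b * x)
        a = (e @ y) / (e @ e)                  # conditional LS optimum for a given b
        b = minimize_scalar(lambda t: np.sum((y - a * np.exp(t * x)) ** 2),
                            bounds=(-5.0, 5.0), method="bounded").x
    return a, b
```

Each cycle cannot increase the residual sum of squares, which is the property that makes such alternating schemes converge in practice.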

Biomedical researchers are often interested in estimating the effect of an environmental exposure on a chronic disease endpoint. However, the exposure variable of interest may be measured with error. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies an additive measurement error model, but it may not have repeated measurements. The subset in which the surrogate variables are available is called a *calibration sample*. In addition to the surrogate variables available among the subjects in the calibration sample, we consider the situation in which an instrumental variable is available for all study subjects. An instrumental variable is correlated with the unobserved true exposure variable and hence can be useful in the estimation of the regression coefficients. In this paper, we propose a nonparametric method for Cox regression using the observed data from the whole cohort. The nonparametric estimator is the best linear combination of a nonparametric correction estimator from the calibration sample and the difference of the naive estimators from the calibration sample and the whole cohort. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via extensive simulation studies. The methods are applied to the Nutritional Biomarkers Study of the Women's Health Initiative.

Control charts used for monitoring schemes in a medical context should incorporate adjustments for the specific risk of each individual. Some authors propose a risk-adjusted survival time cumulative sum (RAST CUSUM) control chart to monitor a time-to-event outcome, possibly right censored, using conventional survival models, which do not allow for the possibility that a patient is cured. We propose to extend this approach with a risk-adjusted CUSUM chart based on a cure rate model. We consider a regression model in which the covariates affect the cure fraction. The CUSUM scores are obtained for Weibull and log-logistic promotion time models to monitor a scale parameter for nonimmune individuals. A simulation study was conducted to evaluate and compare the performance of the proposed chart (RACUF CUSUM) with the RAST CUSUM, based on optimal control limits and the average run length, in different situations. We find that the RAST CUSUM chart is inappropriate when applied to data with a cure rate, while the proposed RACUF CUSUM chart seems to have similar performance when applied to data without a cure rate. The proposed chart is illustrated with simulated data and with a real data set of patients with heart failure treated at the Heart Institute (InCor) of the University of São Paulo, Brazil.
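A stripped-down sketch of the chart's two ingredients, for uncensored Weibull failure times and without risk adjustment or a cure fraction (both of which the proposed RACUF CUSUM adds): a per-observation log-likelihood-ratio score for a scale shift, and the usual upper CUSUM recursion. Names and the monitored shift are illustrative:

```python
import numpy as np

def weibull_llr(t, shape, scale0, scale1):
    """Log-likelihood-ratio score of an observed failure time t for a shift
    of the Weibull scale from scale0 (in control) to scale1 (out of control)."""
    def logpdf(t, k, lam):
        return np.log(k / lam) + (k - 1) * np.log(t / lam) - (t / lam) ** k
    return logpdf(t, shape, scale1) - logpdf(t, shape, scale0)

def cusum(scores, h):
    """Upper CUSUM path C_t = max(0, C_{t-1} + W_t); returns the path and the
    first index at which it crosses the control limit h (None if never)."""
    c, path, signal = 0.0, [], None
    for i, w in enumerate(scores):
        c = max(0.0, c + w)
        path.append(c)
        if signal is None and c > h:
            signal = i
    return np.array(path), signal
```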

A mixture of multivariate contaminated normal distributions is developed for model-based clustering. In addition to the parameters of the classical normal mixture, our contaminated mixture has, for each cluster, a parameter controlling the proportion of mild outliers and another specifying the degree of contamination. Crucially, these parameters do not have to be specified *a priori*, which adds flexibility to our approach. Parsimony is introduced via eigen-decomposition of the component covariance matrices, and sufficient conditions for the identifiability of all the members of the resulting family are provided. An expectation-conditional maximization algorithm is outlined for parameter estimation, and various implementation issues are discussed. Using a large-scale simulation study, the behavior of the proposed approach is investigated and a comparison with well-established finite mixture models is provided. The performance of this novel family of models is also illustrated on artificial and real data.
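To make the contamination mechanism concrete, a sketch for a single cluster: the density mixes a "good" normal with an inflated-covariance normal, and the posterior weight of the inflated component gives an automatic outlier flag. Function names are illustrative, with `alpha` the outlier proportion and `eta > 1` the degree of contamination:

```python
import numpy as np
from scipy.stats import multivariate_normal

def contaminated_density(x, mu, Sigma, alpha, eta):
    """Density of a contaminated normal: with prob. 1-alpha a 'good' point
    from N(mu, Sigma), with prob. alpha a mild outlier from N(mu, eta*Sigma)."""
    good = multivariate_normal.pdf(x, mu, Sigma)
    bad = multivariate_normal.pdf(x, mu, eta * np.asarray(Sigma))
    return (1 - alpha) * good + alpha * bad

def outlier_probability(x, mu, Sigma, alpha, eta):
    """Posterior probability that x was generated by the inflated component."""
    bad = alpha * multivariate_normal.pdf(x, mu, eta * np.asarray(Sigma))
    return bad / contaminated_density(x, mu, Sigma, alpha, eta)
```

In the full model these posterior weights are the quantities updated in the E-step of the expectation-conditional maximization algorithm.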

The food frequency questionnaire (FFQ) is known to be prone to measurement error. Researchers have suggested excluding implausible energy reporters (IERs) of FFQ total energy when examining the relationship between a health outcome and FFQ-reported intake, to obtain less biased estimates of the effect of the error-prone exposure measure; however, the statistical properties of stratifying by IER status have not been studied. Under certain assumptions, including nondifferential error, we show that when stratifying by IER status, the attenuation of the estimated relative risk in the stratified models will be either greater or less in both strata (implausible and plausible reporters) than in the nonstratified model, contrary to the common belief that the attenuation will be less among plausible reporters and greater among IERs. Whether there is more or less attenuation depends on the pairwise correlations between the true exposure, the observed exposure, and the stratification variable. Thus, exclusion of IERs is inadvisable, but stratification by IER status can sometimes help. We also address the case of differential error. Examples from the Observing Protein and Energy Nutrition Study and simulations illustrate these results.
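The basic attenuation phenomenon under nondifferential (classical) measurement error can be checked in a few lines; here with a linear outcome model as a simple stand-in for the relative risk setting, where the naive slope shrinks by the factor σ²_X/(σ²_X + σ²_U):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(0.0, 1.0, n)            # true exposure
w = x + rng.normal(0.0, 1.0, n)        # error-prone FFQ-type measurement
y = 0.5 * x + rng.normal(0.0, 1.0, n)  # outcome depends on true exposure only

beta_w = np.cov(w, y)[0, 1] / np.var(w)    # naive slope regressing y on w
lam = np.var(x) / (np.var(x) + 1.0)        # attenuation factor, here about 0.5
# beta_w ≈ lam * 0.5 = 0.25: the naive estimate is attenuated toward zero
```

Stratification changes the three pairwise correlations involved, which is why, as the abstract notes, it can push the attenuation in either direction in both strata.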

We use data from an ongoing cohort study of chronic kidney patients at Salford Royal NHS Foundation Trust, Greater Manchester, United Kingdom, to investigate the influence of acute kidney injury (AKI) on the subsequent rate of change of kidney function amongst patients already diagnosed with chronic kidney disease (CKD). We use a linear mixed effects modelling framework to enable estimation of both acute and chronic effects of AKI events on kidney function. We model the fixed effects by a piece-wise linear function with three change-points to capture the acute changes in kidney function that characterise an AKI event, and the random effects by the sum of three components: a random intercept, a stationary stochastic process with Matérn correlation structure, and measurement error. We consider both multivariate Normal and multivariate *t* versions of the random effects. For either specification, we estimate model parameters by maximum likelihood and evaluate the plug-in predictive distributions of the random effects given the data. We find that following an AKI event the average long-term rate of decline in kidney function is almost doubled, regardless of the severity of the event. We also identify and present examples of individual patients whose kidney function trajectories diverge substantially from the population-average.
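The fixed-effect specification can be sketched as a design-matrix builder: a continuous piece-wise linear trend is an intercept, a slope, and one hinge term per change-point. The helper name and knot placement below are illustrative, not the study's fitted values:

```python
import numpy as np

def piecewise_basis(t, knots):
    """Design matrix for a continuous piece-wise linear fixed-effect trend:
    intercept, time, and one hinge term (t - k)_+ per change-point k."""
    t = np.asarray(t, float)
    cols = [np.ones_like(t), t] + [np.maximum(t - k, 0.0) for k in knots]
    return np.column_stack(cols)
```

With three knots this gives five columns, and the slope after the last change-point is the sum of the time coefficient and all three hinge coefficients, which is how an acute drop followed by a changed long-term decline can be read off the fit.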