Approximate likelihood and pseudo‐likelihood inference in meta‐analysis of diagnostic accuracy studies accounting for disease prevalence and study design

Bivariate random-effects models represent a recommended approach for meta-analysis of diagnostic test accuracy, jointly modeling study-specific sensitivity and specificity. As the severity of the disease status can vary across studies, a proper analysis should account for the dependence of the accuracy measures on the disease prevalence. To this aim, trivariate generalized linear mixed-effects models have been proposed in the literature, although computational difficulties strongly limit their applicability. In addition, attention has mainly been paid to cohort studies, from which the study-specific disease prevalence can be estimated, while information from case-control studies is often neglected. To overcome such limitations, this article introduces a trivariate approximate normal model, which accounts for the disease prevalence along with the accuracy measures in cohort studies and for sensitivity and specificity in case-control studies. The model represents an extension of the bivariate normal mixed-effects model originally developed for meta-analyses not accounting for disease prevalence, under an approximate normal within-study distribution for the logit of estimated sensitivity and specificity. The components of the approximate within-study covariance matrix are derived and the likelihood function is obtained in closed form. The approximate likelihood approach is compared to the approach based on the exact within-study distribution and to its modifications following a pseudo-likelihood strategy aimed at reducing the computational effort. The comparison is based on simulation studies in a variety of scenarios, and illustrated in a meta-analysis about the accuracy of a test to diagnose fungal infection and a meta-analysis of a noninvasive test to detect colorectal cancer.


INTRODUCTION
Meta-analysis of diagnostic tests is a well recognized instrument for the combination of evidence from different studies about the accuracy measures of a test. The performance of a test is commonly evaluated by the pair of indices given by sensitivity, that is, the conditional probability of testing positive in diseased subjects, and specificity, that is, the conditional probability of testing negative in nondiseased subjects. A recent interest in the literature about the evaluation of diagnostic tests using meta-analysis instruments has led to the development of sophisticated solutions accounting for the joint distribution of the accuracy measures. 1,2 They include hierarchical models, 3-7 bivariate copula models, 8,9 and nonparametric and simulation-based solutions. 10,11 The best-known solution, represented by the bivariate mixed-effects model due to Reitsma et al 3 and Arends et al, 4 is based on a normal approximation of the joint distribution of the (logit of) sensitivity and specificity at the between-study level, resulting in a marginal normally distributed model, whose associated likelihood is in closed form and can be easily evaluated with standard software. A preferable solution which better reflects the nature of the data 12 is the use of a bivariate generalized linear mixed-effects model, as proposed in Reference 13, which defines the within-study model through a binomial specification for the number of true positives and true negatives for each study included in the meta-analysis. The price to pay is the presence of computational obstacles, as, for example, the need for numerical integration of the corresponding likelihood function. Convergence problems can frequently occur, typically related to unreliable estimates of the variance/covariance components lying on the boundary of the parameter space, especially in the presence of small sample sizes. See the results in References 11, 14, and 7.
One of the relevant issues in meta-analysis of diagnostic accuracy studies 2 is the risk of spectrum effect, 15,16 a term describing the risk of different test performance across patient subgroups having characteristics associated with the disease. In case of spectrum effect, a study's results are not generalizable and may have limited applicability. 17,18 The presence of a spectrum of clinical and demographic patients' characteristics, such as the severity of the disease, can give rise to different values of sensitivity and specificity of the diagnostic test according to the level of the prevalence of the disease in the population to which the test is applied; see Brenner and Gefeller 19 and Leeflang et al. 20,21 The heterogeneity induced by the spectrum, in terms of dependence between test performance measures and disease prevalence, needs to be accounted for in order to provide generalizable measures of test accuracy. To this aim, Chu et al 22 extend the bivariate generalized linear mixed-effects model in Reference 13 to jointly model the information from disease prevalence, sensitivity, and specificity of the diagnostic test. The model starts from the exact binomial within-study sampling distribution used to describe the number of diseased, true positives, and true negatives, respectively, provided by each study included in the meta-analysis. At the between-study level, a trivariate normal distribution models the relationship between the (logit of) disease prevalence, sensitivity, and specificity. Such a model formulation is expected to retain the computational issues of its bivariate counterpart, as conjectured in Reference 22. Deeper investigations performed in Reference 23 and in Chen et al 24 highlight the computational issues of the proposal in Reference 22, especially in case of small sample sizes or when the model conditions are violated. To overcome the limitations, Chen et al 23 suggest a pseudo-likelihood approach inspired by the composite likelihood methodology, 25 avoiding an explicit modeling of the dependencies between the random effects and substituting the multinomial distribution with the product of marginal binomial distributions. The resulting likelihood still needs numerical integration, but with a substantial reduction of computational problems and a high relative efficiency of the results with respect to the full likelihood case. A similar strategy based on the pseudo-likelihood is adopted in Reference 24, where disease prevalence, sensitivity, and specificity are jointly modeled on their original scale, in place of their logit transformations, via the Sarmanov distribution. 26 The use of trivariate copulas to account for the correlation between the accuracy measures and the disease prevalence has been recently proposed in References 27 and 28, the latter including the hybrid generalized linear mixed-effects model in Reference 23 as a special case.
Most of the above-mentioned approaches developed in the literature are not constructed to properly account for study design, that is, to combine information from cohort studies and case-control studies about the accuracy of the same diagnostic test. Although only cohort studies provide information about the disease prevalence, the exclusion of case-control studies would lead to a loss of efficiency. 29 This article proposes a trivariate normal mixed-effects model as an extension of the bivariate normal mixed-effects model originally developed for diagnostic accuracy studies in References 3 and 4 to include information about the disease prevalence, taking into account the study design. The within-study model is approximated by a trivariate normal distribution, in case of cohort studies, to jointly model the distribution of the observed logit transformed accuracy measures and disease prevalence, and by a bivariate normal distribution when information about the logit transformed accuracy measures comes from case-control studies. The within-study variance/covariance matrix is derived by applying a first-order Taylor approximation to functions of the counts in the misclassification table comparing the test under evaluation to a reference standard (eg, Bagos 30 ).

TABLE 1 Two-by-two table reported by study i, comparing the diagnostic test under evaluation to the reference standard

Diagnostic test   Diseased   Non-diseased
Positive          n_11i      n_10i
Negative          n_01i      n_00i
Total             n_1i       n_0i

Under the usual normality assumption for the between-study joint distribution of the logit transformed accuracy measures and prevalence, the resulting likelihood function is in closed form, and the risk of convergence failure of the optimization procedure is substantially reduced. In this way, the resulting model is computationally more convenient than the approaches based on the exact distribution in Reference 22 and in References 31 and 23. The approximate trivariate solution is compared to the exact model specification and its pseudo-likelihood counterpart. 23,31 A modification of the pseudo-likelihood solution is proposed, with a less stringent working independence assumption, that is, under independence between the accuracy measures and the disease prevalence, while the correlation between the pair of (logit) accuracy indices is maintained. This solution can be interpreted as a compromise between the computational advantages of a pseudo-likelihood approach and the search for a satisfactorily accurate solution.
The performance of the approximate likelihood approach, the exact likelihood approach, and its pseudo-likelihood versions is investigated in a series of simulation studies, with data generated according to a two-step procedure with the aim of reflecting the real mechanism generating true positives and true negatives, instead of the usual simulation strategy which samples data directly from the model considered in the meta-analysis. Simulations cover different scenarios, including small-to-moderate sample sizes, normally and non-normally distributed random effects, and the presence or absence of correlation between disease prevalence and sensitivity or specificity of the diagnostic test at the population level. Accuracy of the inferential procedures and applicability, with particular interest in convergence issues, are examined.
The competing methods are applied to two real meta-analyses about the accuracy of a test to diagnose invasive fungal infection and the accuracy of a noninvasive test developed to detect colorectal cancer.

MODELS WHEN THE DISEASE PREVALENCE IS NOT ACCOUNTED FOR
Consider a meta-analysis of n studies, each of them providing information about the accuracy of a diagnostic test. Information is typically in terms of a two-by-two table comparing the number of positives and negatives from the test under evaluation and the number of diseased and non-diseased from a gold standard, as reported in Table 1. Values in the table refer to the number of true positives, true negatives, false positives, and false negatives reported by study i, denoted by n_{11i}, n_{00i}, n_{10i}, and n_{01i}, respectively. The number of diseased and non-diseased is represented by n_{1i} and n_{0i}, respectively. The estimates of sensitivity (SE_i) and specificity (SP_i) provided by study i are obtained as n_{11i}/n_{1i} and n_{00i}/n_{0i}, respectively. In case disease prevalence is not considered, the evaluation of diagnostic accuracy studies proceeds according to an exact model or an approximate model, depending on whether the exact information provided by all the studies is considered or a transformation useful to reach normality is adopted. Following Reitsma et al 3 and Arends et al, 4 we consider the logit transformations of estimated sensitivity and 1-specificity, given by η̂_i = logit(n_{11i}/n_{1i}) and ξ̂_i = logit(1 − n_{00i}/n_{0i}), respectively, having an approximate bivariate normal within-study distribution

$$ \begin{pmatrix} \hat{\eta}_i \\ \hat{\xi}_i \end{pmatrix} \sim N\left( \begin{pmatrix} \eta_i \\ \xi_i \end{pmatrix}, \begin{pmatrix} s^2_{\hat{\eta}_i} & 0 \\ 0 & s^2_{\hat{\xi}_i} \end{pmatrix} \right), \qquad (1) $$

with within-study variances estimated as s²_{η̂i} = 1/n_{11i} + 1/(n_{1i} − n_{11i}) and s²_{ξ̂i} = 1/n_{10i} + 1/(n_{0i} − n_{10i}). The within-study correlation between η̂_i and ξ̂_i is zero, as they are typically computed on different subjects. The above within-study structure is coupled with the between-study component, which considers a normal distribution for the random effects,

$$ \begin{pmatrix} \eta_i \\ \xi_i \end{pmatrix} \sim N\left( \begin{pmatrix} \eta \\ \xi \end{pmatrix}, \Sigma_2 \right), \qquad \Sigma_2 = \begin{pmatrix} \sigma^2_{\eta} & \rho_{\eta\xi}\,\sigma_{\eta}\sigma_{\xi} \\ \rho_{\eta\xi}\,\sigma_{\eta}\sigma_{\xi} & \sigma^2_{\xi} \end{pmatrix}. \qquad (2) $$

In the above specification, η and ξ are the means over the n studies, σ²_η and σ²_ξ are the between-study variances, and ρ_ηξ is the correlation coefficient between η_i and ξ_i, which tends to be positive.
Marginally, the approximate normal model for (η̂_i, ξ̂_i)^⊤ is

$$ \begin{pmatrix} \hat{\eta}_i \\ \hat{\xi}_i \end{pmatrix} \sim N\left( \begin{pmatrix} \eta \\ \xi \end{pmatrix}, \Sigma_2 + \begin{pmatrix} s^2_{\hat{\eta}_i} & 0 \\ 0 & s^2_{\hat{\xi}_i} \end{pmatrix} \right), $$

which gives rise to a likelihood function in closed form for the whole parameter vector θ = (η, ξ, σ²_η, σ²_ξ, ρ_ηξ)^⊤. The straightforward implementation with standard software makes the approximate model very attractive, together with a low risk of computational issues. An ad hoc continuity correction is necessary in case of empty cells of the misclassification matrix in Table 1.
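To fix ideas, the closed-form marginal likelihood above can be coded in a few lines. The following R sketch is our illustration under the stated assumptions, not the authors' released code; the helper names bvn.loglik and fit.blmm are hypothetical, and the within-study variances are the standard delta-method expressions.

library(mvtnorm)

## Closed-form log-likelihood of the approximate bivariate model (1)-(2):
## between-study covariance Sigma_2 plus study-specific within-study variances.
bvn.loglik <- function(par, eta.hat, xi.hat, s2.eta, s2.xi) {
  mu  <- par[1:2]
  sds <- exp(par[3:4])                       # between-study SDs, log scale
  rho <- tanh(par[5])                        # unconstrained correlation
  Sigma2 <- matrix(c(sds[1]^2, rho * prod(sds),
                     rho * prod(sds), sds[2]^2), 2, 2)
  ll <- 0
  for (i in seq_along(eta.hat)) {
    Vi <- Sigma2 + diag(c(s2.eta[i], s2.xi[i]))
    ll <- ll + dmvnorm(c(eta.hat[i], xi.hat[i]), mu, Vi, log = TRUE)
  }
  ll
}

fit.blmm <- function(n11, n1, n10, n0) {
  corr <- (n11 == 0 | n11 == n1 | n10 == 0 | n10 == n0)
  n11 <- n11 + 0.5 * corr; n1 <- n1 + corr   # 0.5 added to each cell of the
  n10 <- n10 + 0.5 * corr; n0 <- n0 + corr   # two-by-two table, if needed
  eta.hat <- qlogis(n11 / n1)                # logit sensitivity
  xi.hat  <- qlogis(n10 / n0)                # logit (1 - specificity)
  s2.eta  <- 1 / n11 + 1 / (n1 - n11)        # within-study variances
  s2.xi   <- 1 / n10 + 1 / (n0 - n10)
  optim(c(mean(eta.hat), mean(xi.hat), 0, 0, 0), bvn.loglik,
        eta.hat = eta.hat, xi.hat = xi.hat, s2.eta = s2.eta, s2.xi = s2.xi,
        method = "Nelder-Mead", control = list(fnscale = -1), hessian = TRUE)
}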
The exact model proposed in Reference 13 is a bivariate generalized linear mixed-effects model that avoids the approximate normal formulation for the within-study component and considers the number of true positives and true negatives from study i as realizations of binomial variables, in this way meeting the suggestions in Reference 12,

$$ n_{11i} \sim \text{Binomial}(n_{1i}, SE_i) \quad \text{and} \quad n_{00i} \sim \text{Binomial}(n_{0i}, SP_i), \qquad (3) $$

with η_i = logit(SE_i) and ξ_i = logit(1 − SP_i). Given the above specification, the model does not require an ad hoc continuity correction when the number of true positives, true negatives, false positives, or false negatives in a study is zero. The between-study model follows the normal distribution in (2). The resulting generalized linear mixed-effects model has an associated likelihood function not in closed form, which is proportional to

$$ \prod_{i=1}^{n} \int_{\mathbb{R}^2} SE_i^{n_{11i}} (1 - SE_i)^{n_{1i} - n_{11i}}\, SP_i^{n_{00i}} (1 - SP_i)^{n_{0i} - n_{00i}}\, \phi_2\{(\eta_i, \xi_i)^\top; (\eta, \xi)^\top, \Sigma_2\}\, d\eta_i\, d\xi_i, $$

where φ_2(⋅; ⋅) denotes the density function of the bivariate normal distribution in (2) with 2 × 2 variance/covariance matrix Σ_2. Numerical integration of the likelihood function is necessary. Computational issues related to the approach include convergence failure, mainly in case of small sample sizes, with variance/covariance components close to the boundary of the parameter space; see Chen et al, 7 Guolo, 11 and Takwoingi et al 14 for investigations.
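For comparison, the exact bivariate model can be fitted with standard GLMM software. The sketch below uses one common parameterization from the GLMM meta-analysis literature rather than the authors' own code; the data frame dta and its column names are hypothetical toy inputs, and lme4 evaluates this bivariate-random-effect likelihood with the Laplace approximation.

library(lme4)

## Hypothetical long-format data: one row per study and disease group, with
## successes = true positives (diseased rows) or true negatives (non-diseased).
dta <- data.frame(
  study     = rep(paste0("s", 1:10), each = 2),
  group     = rep(c("diseased", "nondiseased"), times = 10),
  successes = c(45, 80, 30, 55, 60, 90, 25, 70, 48, 85,
                33, 77, 51, 88, 40, 66, 29, 94, 58, 81),
  total     = c(50, 90, 33, 60, 66, 99, 28, 80, 52, 92,
                36, 85, 55, 95, 44, 72, 31, 99, 62, 90)
)

fit <- glmer(cbind(successes, total - successes) ~ 0 + group +
               (0 + group | study),
             family = binomial(link = "logit"), data = dta)
## Fixed effects: logit sensitivity (diseased rows) and logit specificity
## (non-diseased rows, ie, the sign-reversed xi of the text); the estimated
## random-effects variance/covariance matrix plays the role of Sigma_2.
summary(fit)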

Trivariate generalized linear mixed-effects model (TGLMM)
The extension of the bivariate exact model proposed in Reference 13 has been suggested in Reference 22, which jointly models the diagnostic accuracy measures SE_i and SP_i with the disease prevalence in a trivariate generalized linear mixed-effects model. Let π_i be the study-specific disease prevalence, defined as the probability of having the disease. Then, the exact distribution of the number of diseased subjects in study i follows a binomial distribution,

$$ n_{1i} \sim \text{Binomial}(n_i, \pi_i), \qquad (4) $$

where n_i = n_{1i} + n_{0i} denotes the sample size of study i. The within-study level of the model specification is thus given by (3) plus the disease prevalence model (4). The between-study model is a trivariate extension of the normal model in (2), accounting for the possible correlation between the logit transformations of sensitivity and specificity and the logit transformation of the disease prevalence, γ_i = logit π_i, namely,

$$ \begin{pmatrix} \eta_i \\ \xi_i \\ \gamma_i \end{pmatrix} \sim N\left( \begin{pmatrix} \eta \\ \xi \\ \gamma \end{pmatrix}, \Sigma_3 \right), \qquad \Sigma_3 = \begin{pmatrix} \sigma^2_{\eta} & \rho_{\eta\xi}\sigma_{\eta}\sigma_{\xi} & \rho_{\eta\gamma}\sigma_{\eta}\sigma_{\gamma} \\ \rho_{\eta\xi}\sigma_{\eta}\sigma_{\xi} & \sigma^2_{\xi} & \rho_{\xi\gamma}\sigma_{\xi}\sigma_{\gamma} \\ \rho_{\eta\gamma}\sigma_{\eta}\sigma_{\gamma} & \rho_{\xi\gamma}\sigma_{\xi}\sigma_{\gamma} & \sigma^2_{\gamma} \end{pmatrix}, \qquad (5) $$

where σ²_γ denotes the variance of γ_i, and ρ_ηγ and ρ_ξγ denote the correlation between η_i and γ_i and between ξ_i and γ_i, respectively. According to the above specifications, the likelihood function for the whole parameter vector is proportional to

$$ \prod_{i=1}^{n} \int_{\mathbb{R}^3} \pi_i^{n_{1i}} (1 - \pi_i)^{n_i - n_{1i}}\, SE_i^{n_{11i}} (1 - SE_i)^{n_{1i} - n_{11i}}\, SP_i^{n_{00i}} (1 - SP_i)^{n_{0i} - n_{00i}}\, \phi_3\{(\eta_i, \xi_i, \gamma_i)^\top; (\eta, \xi, \gamma)^\top, \Sigma_3\}\, d\eta_i\, d\xi_i\, d\gamma_i, \qquad (6) $$

where φ_3(⋅; ⋅) denotes the density function of the trivariate normal distribution in (5) with 3 × 3 variance/covariance matrix Σ_3. The likelihood function (6) is not in closed form, and it needs to be evaluated through numerical integration. To better approximate normality and reduce computational failure, Chu et al 22 suggest the use of Fisher's transformation of the correlation coefficients, z_j = 0.5 log{(1 + ρ_j)/(1 − ρ_j)}, with j = 1, 2, 3 corresponding to ρ_ηξ, ρ_ηγ, and ρ_ξγ. As Chu et al 22 confirm, there can be numerical problems related to the estimation of the variance/covariance parameters, especially in case of a small number of studies included in the meta-analysis and large values of the correlation components. The issue is in line with the results and discussion in Reference 7 and Takwoingi et al 14 referred to the bivariate case with no disease prevalence.
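To make the integration burden concrete, the likelihood contribution (6) of one study can be approximated by tensor-product Gauss-Hermite quadrature, the device also used in the simulations below (21 nodes per dimension). The R sketch that follows illustrates the technique and is not the authors' implementation; lik.contribution is a hypothetical helper.

library(statmod)

gh <- gauss.quad(21, kind = "hermite")          # nodes/weights for exp(-z^2)
Z  <- as.matrix(expand.grid(gh$nodes, gh$nodes, gh$nodes))
W  <- apply(expand.grid(gh$weights, gh$weights, gh$weights), 1, prod)

lik.contribution <- function(mu, Sigma3, n11, n1, n00, n0, ni) {
  R <- chol(Sigma3)                             # Sigma3 = t(R) %*% R
  U <- sqrt(2) * Z %*% R + matrix(mu, nrow(Z), 3, byrow = TRUE)
  se   <- plogis(U[, 1])                        # eta_i   -> sensitivity
  sp   <- 1 - plogis(U[, 2])                    # xi_i    -> specificity
  prev <- plogis(U[, 3])                        # gamma_i -> prevalence
  f <- dbinom(n1, ni, prev) * dbinom(n11, n1, se) * dbinom(n00, n0, sp)
  sum(W * f) / pi^(3 / 2)                       # Gauss-Hermite identity
}

The 21^3 = 9261 quadrature points per study explain why the trivariate exact likelihood is so much heavier than the univariate and bivariate integrals of the pseudo-likelihood approaches discussed below.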
More recently, Chen et al 24 define a trivariate model for diagnostic accuracy studies accounting for disease prevalence that specifies the between-study components through Sarmanov beta distributions 26,32 in order to avoid normality assumptions. Estimation is carried out using a composite likelihood approach to avoid numerical instabilities. The proposal in Reference 24 includes data from cohort studies, which provide information about the disease prevalence, and data from case-control studies, where estimation of the disease prevalence is not representative, but information about sensitivity and specificity is available and needs to be accounted for. Results are encouraging and overcome the computational limits of the approach in Reference 22, although simulations are restricted to moderate to large sample sizes, with meta-analyses including 30 or 50 studies.

Proposed approximate solution: Trivariate linear mixed-effects model (TLMM)
This paper intends to provide a simplified framework to work with diagnostic accuracy measures, while accounting for disease prevalence, by taking advantage of the approximate bivariate normal structure developed in References 3 and 4 in case of no disease prevalence. The resulting approximate solution is based on a TLMM. Let (5) represent the trivariate normal between-study model. The within-study level is now specified by considering an approximate joint distribution of (η̂_i, ξ̂_i, γ̂_i)^⊤ as a trivariate normal, obtained as an extension of model (1), after a proper evaluation of the within-study correlations of logit sensitivity and logit specificity with the logit of the estimated disease prevalence, γ̂_i = logit(n_{1i}/n_i). Namely,

$$ \begin{pmatrix} \hat{\eta}_i \\ \hat{\xi}_i \\ \hat{\gamma}_i \end{pmatrix} \sim N\left( \begin{pmatrix} \eta_i \\ \xi_i \\ \gamma_i \end{pmatrix}, S_i \right), \qquad (7) $$

where S_i denotes the within-study variance/covariance matrix. Note that, at the within-study level, the correlations of the observed versions of logit sensitivity and logit specificity with the logit of the estimated disease prevalence are zero. The derivation of the elements of the variance/covariance matrix is reported in Section A of the supplementary material. The steps useful to derive the elements are inspired by the work in Reference 33 about multivariate meta-analysis to compare diagnostic tests and are a special case of a general methodology developed in Reference 30 about covariances of correlated outcomes. Given the above specifications, the marginal approximate distribution of (η̂_i, ξ̂_i, γ̂_i)^⊤ is

$$ \begin{pmatrix} \hat{\eta}_i \\ \hat{\xi}_i \\ \hat{\gamma}_i \end{pmatrix} \sim N\left( \begin{pmatrix} \eta \\ \xi \\ \gamma \end{pmatrix}, \Sigma_3 + S_i \right). \qquad (8) $$

The approximate model reduces to the model in References 3 and 4 in case of no correlation between the diagnostic accuracy measures and the disease prevalence in the between-study model. The model can be easily extended to account for study design, that is, a mixture of information from cohort studies and case-control studies. Suppose that the meta-analysis includes n_1 cohort studies and n_2 case-control studies, so that n = n_1 + n_2. Only cohort studies allow an appropriate estimation of the disease prevalence. The likelihood function for the whole parameter vector θ = (η, ξ, γ, σ²_η, σ²_ξ, σ²_γ, ρ_ηξ, ρ_ηγ, ρ_ξγ)^⊤ is

$$ L(\theta) = \prod_{i=1}^{n_1} \phi_3\{(\hat{\eta}_i, \hat{\xi}_i, \hat{\gamma}_i)^\top; (\eta, \xi, \gamma)^\top, \Sigma_3 + S_i\} \prod_{i=n_1+1}^{n} \phi_2\{(\hat{\eta}_i, \hat{\xi}_i)^\top; (\eta, \xi)^\top, \Sigma_2 + S_i^{(2)}\}, $$

where S_i^{(2)} denotes the 2 × 2 submatrix of S_i corresponding to (η̂_i, ξ̂_i)^⊤. Similarly to the bivariate model, the trivariate version accounting for disease prevalence retains computational attractiveness, as the associated likelihood function is in closed form, even when different study designs are included. The maximum likelihood estimator θ̂ of the whole parameter vector follows from numerical maximization of the closed-form likelihood. A restricted maximum likelihood estimation can be adopted as an alternative, as it will be done in the simulation study; see Section 4. The proposal retains two characteristics of the model in Reference 3. First, both methods are two-step approaches, where sensitivity and specificity are estimated at the first step and then used within a meta-analysis approach following a random-effects formulation. We suggest the use of a sandwich standard error (e.g., Kauermann and Carroll 34 ) in order to properly account for the uncertainty associated with the entire procedure. Secondly, a continuity correction is needed in case of sensitivity or specificity equal to zero or one. In keeping with much of the literature, an arbitrary value equal to 0.5 can be added to the counts in the four-fold Table 1.
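The closed-form likelihood combining the two study designs is equally direct to code. The R sketch below is our illustration, not the authors' released implementation; tlmm.loglik is a hypothetical helper, and the within-study matrices S_i enter as data, following the derivation in Section A of the supplementary material.

library(mvtnorm)

## Closed-form TLMM log-likelihood: trivariate normal contributions for
## cohort studies, bivariate ones for case-control studies.
tlmm.loglik <- function(par, cohort, casecontrol) {
  mu  <- par[1:3]                            # eta, xi, gamma
  sds <- exp(par[4:6])                       # between-study SDs, log scale
  rho <- tanh(par[7:9])                      # rho_eta.xi, rho_eta.gamma, rho_xi.gamma
  Rho <- diag(3)
  Rho[1, 2] <- Rho[2, 1] <- rho[1]
  Rho[1, 3] <- Rho[3, 1] <- rho[2]
  Rho[2, 3] <- Rho[3, 2] <- rho[3]
  Sigma3 <- diag(sds) %*% Rho %*% diag(sds)
  ll <- 0
  for (s in cohort)                          # s$y = (eta.hat, xi.hat, gamma.hat), s$S 3x3
    ll <- ll + dmvnorm(s$y, mu, Sigma3 + s$S, log = TRUE)
  for (s in casecontrol)                     # s$y = (eta.hat, xi.hat), s$S 2x2
    ll <- ll + dmvnorm(s$y, mu[1:2], Sigma3[1:2, 1:2] + s$S, log = TRUE)
  ll
}
## Maximize with optim(); a sandwich standard error then accounts for the
## two-step estimation, as suggested in the text.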

Pseudo-likelihood solutions
A different way to perform meta-analysis of diagnostic accuracy studies exploiting (part of) the exact distribution of the data at the within-study level is based on the use of a pseudo-likelihood approach, inspired by the composite likelihood theory, 25 with the aim of reducing computational issues. See also Chen et al 7 for a similar approach in case of bivariate models for meta-analysis of accuracy studies and Chen et al 31 in the more general framework of multivariate meta-analysis. Let n^c_{1i} and n^{cc}_{1i} denote the number of diseased subjects in cohort study i and in case-control study i, respectively. A similar notation using suffixes is adopted for n_i, n_{0i}, n_{11i}, and n_{10i}. The pseudo-likelihood function proposed in Reference 23 is defined as the exact likelihood in Chu et al 22 expressed in (6), but under a working independence assumption, meaning that the correlation parameters at the between-study level are fixed at zero. Accordingly, the between-study model can be interpreted as a product of univariate normal densities for η_i, ξ_i, and γ_i, and consequently the likelihood function can be separated into the product of marginal univariate integrals,

$$ L_{P1}(\theta) = \prod_{i=1}^{n} \int_{\mathbb{R}} f(n_{11i} \mid \eta_i)\, \phi(\eta_i; \eta, \sigma^2_{\eta})\, d\eta_i \int_{\mathbb{R}} f(n_{00i} \mid \xi_i)\, \phi(\xi_i; \xi, \sigma^2_{\xi})\, d\xi_i \prod_{i=1}^{n_1} \int_{\mathbb{R}} f(n^{c}_{1i} \mid \gamma_i)\, \phi(\gamma_i; \gamma, \sigma^2_{\gamma})\, d\gamma_i, \qquad (9) $$

where f(⋅ ∣ ⋅) denotes the binomial probability mass functions in (3) and (4) and φ(⋅; μ, σ²) is the univariate normal density with mean μ and variance σ². Correspondingly, the whole parameter vector becomes θ = (η, ξ, γ, σ²_η, σ²_ξ, σ²_γ)^⊤. The computational burden required by the evaluation of (9) is substantially lower than that required by (6), as numerical integration is applied to each single integral separately, and not on the three-dimensional space. We refer to this approach as TGLMM-PL1.
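Each factor in (9) is a one-dimensional binomial-normal integral, cheap to evaluate by Gauss-Hermite quadrature. A minimal R sketch follows, assuming the 21-node rule used in the simulations; marg.binom is a hypothetical helper.

library(statmod)

gh <- gauss.quad(21, kind = "hermite")

## Marginal probability of x successes out of size trials when the logit of
## the success probability is N(mean, sigma2): one factor of (9).
marg.binom <- function(x, size, mean, sigma2) {
  u <- mean + sqrt(2 * sigma2) * gh$nodes        # random-effect values
  sum(gh$weights * dbinom(x, size, plogis(u))) / sqrt(pi)
}

## Example: a study reporting 45 true positives out of 50 diseased subjects,
## with eta = logit(0.95) and sigma2_eta = 0.5 (illustrative values)
marg.binom(45, 50, qlogis(0.95), 0.5)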
A modification of TGLMM-PL1 is proposed, which is less stringent with respect to the working independence assumption. As the inferential interest is on the parameters associated with sensitivity and specificity, the working independence assumption used in (9) can be relaxed by allowing the likelihood to depend on the correlation between η_i and ξ_i only, while fixing the other correlation parameters ρ_ηγ and ρ_ξγ to zero. According to this specification, the likelihood function is the product of the joint density function for (η̂_i, ξ̂_i)^⊤ times the density function for the prevalence of the disease,

$$ L_{P2}(\theta) = \prod_{i=1}^{n} \int_{\mathbb{R}^2} f(n_{11i} \mid \eta_i)\, f(n_{00i} \mid \xi_i)\, \phi_2\{(\eta_i, \xi_i)^\top; (\eta, \xi)^\top, \Sigma_2\}\, d\eta_i\, d\xi_i \prod_{i=1}^{n_1} \int_{\mathbb{R}} f(n^{c}_{1i} \mid \gamma_i)\, \phi(\gamma_i; \gamma, \sigma^2_{\gamma})\, d\gamma_i. \qquad (10) $$

The whole parameter vector is θ = (η, ξ, γ, σ²_η, σ²_ξ, σ²_γ, ρ_ηξ)^⊤. We refer to this approach as TGLMM-PL2. The standard error of the maximum pseudo-likelihood estimator θ̂ from TGLMM-PL1 and TGLMM-PL2 can be obtained through the sandwich method, 34 in order to account for possible misspecification of the model. Let ℓ_i(θ) be the log-likelihood contribution of study i, obtained as the logarithm of each term in (9) or in (10). The sandwich estimator of the variance/covariance matrix of the maximum pseudo-likelihood estimator θ̂ is

$$ V(\hat{\theta}) = I(\theta)^{-1} J(\theta) I(\theta)^{-1}, $$

where

$$ I(\theta) = E\left\{ - \sum_{i=1}^{n} \frac{\partial^2 \ell_i(\theta)}{\partial \theta\, \partial \theta^\top} \right\}, \qquad J(\theta) = E\left\{ \sum_{i=1}^{n} \frac{\partial \ell_i(\theta)}{\partial \theta} \frac{\partial \ell_i(\theta)}{\partial \theta^\top} \right\}, $$

∂f(x)/∂x is the derivative of f(x) with respect to x, and E{X} is the expectation of X. Matrices I(θ) and J(θ) can be consistently estimated by their empirical counterparts.
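An empirical version of the sandwich estimator is straightforward to obtain by numerical differentiation of the per-study contributions. The following R sketch is our illustration; sandwich.vcov and the user-supplied loglik.i are hypothetical, with loglik.i(theta, data.i) returning the contribution ℓ_i(θ).

library(numDeriv)

sandwich.vcov <- function(theta.hat, data.list, loglik.i) {
  p <- length(theta.hat)
  I.hat <- matrix(0, p, p)                   # minus the summed Hessians
  J.hat <- matrix(0, p, p)                   # summed score outer products
  for (d in data.list) {
    g <- grad(loglik.i, theta.hat, data.i = d)
    H <- hessian(loglik.i, theta.hat, data.i = d)
    I.hat <- I.hat - H
    J.hat <- J.hat + tcrossprod(g)
  }
  I.inv <- solve(I.hat)
  I.inv %*% J.hat %*% I.inv                  # sandwich variance/covariance
}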

SIMULATION STUDY
The performance of the different likelihood-based approaches is evaluated through simulation. We distinguish the approximate approach for the trivariate linear mixed-effects model under maximum likelihood (TLMM) or restricted maximum likelihood estimation (TLMM-REML), the exact likelihood approach in Reference 22 for the trivariate generalized linear mixed-effects model (TGLMM), and the pseudo-likelihood approaches based on the working independence assumption on all the correlation parameters (TGLMM-PL1) or on the correlation parameters involving the disease prevalence only (TGLMM-PL2). The TGLMM in Reference 22 has been extended to account for both the cohort design and the case-control design, similarly to what is suggested for TLMM, TGLMM-PL1, and TGLMM-PL2, that is, by subdividing the entire log-likelihood function into the sum of the cohort component, which includes the disease prevalence information, and the case-control component, which does not. Simulations are implemented in the R programming language. 35 The code is available at https://github.com/annamariaguolo/DTMAprevalence.

Set-up
Data simulation follows a two-step procedure. In the first step, values of (η_i, ξ_i, γ_i)^⊤ are generated according to relationship (5), or substituting the normal distribution with a skew-normal distribution or a Student t distribution with four degrees of freedom. In this way, the competing methods are evaluated in case of deviations from normality of the random-effects components. The skewness parameter for (η_i, ξ_i, γ_i)^⊤ has been chosen equal to (−1, 0.5, 0)^⊤, reflecting a situation where skewness affects the distribution of η_i and ξ_i. Values of the fixed-effects parameter vector (η, ξ, γ)^⊤ are chosen in order to consider a sensitivity equal to 0.95 and a specificity equal to 0.90 for the diagnostic test, and a prevalence of the disease equal to 0.25; a more extreme scenario with larger sensitivity and specificity is considered as well (Section B.3 of the supplementary material). The between-study variances (σ²_η, σ²_ξ, σ²_γ)^⊤ are chosen equal to (1.2, 0.5, 0.25)^⊤. Increasing correlation between η_i and ξ_i is considered, namely, ρ_ηξ ∈ {0.2, 0.6, 0.8}. A null or a large correlation between the accuracy measures and the disease prevalence is considered, namely, ρ_ηγ = ρ_ξγ ∈ {0.0, 0.7}, in order to account for scenarios without or with spectrum bias. The simulation study considers small to moderate sample sizes, with the number n of studies included in the meta-analysis equal to 15 or 25. Given the values of (η_i, ξ_i, γ_i)^⊤, the second step generates the number of true positives and true negatives as follows. For each study i, the sample size n_i is obtained from a Uniform variable on [40, 200]. For cohort studies, given n_i, the number of diseased and non-diseased within each study i is a realization of a binomial distribution as in (4). From here, the numbers of true positives and true negatives are obtained as realizations of binomial variables as in (3). When needed, as for the approximate TLMM, the observed values η̂_i, ξ̂_i, and γ̂_i are calculated within each study i as logit transformations of the empirical versions of sensitivity, specificity, and prevalence, respectively.
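To convey the flavor of this set-up, a condensed R sketch of the cohort-study branch of the two-step generator follows; it reflects our reading of the text, simulate.meta is a hypothetical helper, and case-control studies (not shown) would fix the group sizes instead of drawing n_{1i}.

library(MASS)                                  # mvrnorm

simulate.meta <- function(n = 15, rho.eta.xi = 0.6, rho.prev = 0.7) {
  mu  <- c(qlogis(0.95), qlogis(1 - 0.90), qlogis(0.25))  # eta, xi, gamma
  sds <- sqrt(c(1.2, 0.5, 0.25))               # between-study SDs
  Rho <- matrix(c(1, rho.eta.xi, rho.prev,
                  rho.eta.xi, 1, rho.prev,
                  rho.prev, rho.prev, 1), 3, 3)
  re <- mvrnorm(n, mu, diag(sds) %*% Rho %*% diag(sds))   # step 1: model (5)
  ni  <- round(runif(n, 40, 200))              # study sample sizes
  n1  <- rbinom(n, ni, plogis(re[, 3]))        # diseased, as in (4)
  n0  <- ni - n1
  n11 <- rbinom(n, n1, plogis(re[, 1]))        # true positives, as in (3)
  n00 <- rbinom(n, n0, 1 - plogis(re[, 2]))    # true negatives, as in (3)
  data.frame(n11, n1, n00, n0)
}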
For each scenario of the simulation study, we assume an equal number of cohort studies and case-control studies. The number of replicates for each scenario is fixed at 1000. Likelihood maximization has been primarily performed using the Nelder and Mead algorithm. 36 In case of nonconvergence of the optimization algorithm, other solutions have been examined, such as the quasi-Newton BFGS algorithm, as well as a change in the starting values of the parameters. Numerical integration for the likelihood function in the exact approach and for the pseudo-likelihoods has been carried out using Gauss-Hermite quadrature, with 21 nodes. Starting values for the optimization procedure are given by the empirical estimates of the parameters. The standard error of the estimators is obtained using the sandwich-type estimator. In order to reduce the rate of nonconvergence of TGLMM, the Fisher-type transformation of the correlation parameters is used.

Results
Methods are compared in terms of bias, standard deviation, and average standard error of the estimators of the parameters. Empirical coverages of Wald-type confidence intervals at nominal level 0.95 for parameters η, ξ, and γ are also reported. Results under nonconvergence were excluded when evaluating the simulation results. The failure rate of the approaches is another criterion used for comparison. The lollipop plots (eg, Morris et al 37 ) in Figure 1 report the bias (dot point) for the estimators of the fixed-effects components η, ξ, and γ obtained from the competing approaches under increasing values of the between-study correlation ρ_ηξ, large correlation between the accuracy measures and the prevalence of the disease (ρ_ηγ = ρ_ξγ = 0.7), small sample size n = 15, and normally distributed random components, when sensitivity and specificity are equal to 0.95 and 0.90, respectively. The associated Monte Carlo Wald-type 95% confidence interval is represented via parentheses. A similar graph for the estimators of the variance components is reported in Figure 2. For the correlation components, the corresponding graph is based on the results from TLMM, TLMM-REML, and TGLMM, while the pseudo-likelihood approach is helpful only for the estimation of ρ_ηξ. Results are reported in Figure 3. Analogous results under deviations from normality for the random effects are reported in Section B.1 of the supplementary material. The same results under independence between the accuracy measures and the disease prevalence (ρ_ηγ = ρ_ξγ = 0.0) are reported in Section B.2 of the supplementary material. The results in case of a more extreme scenario are reported in Section B.3 of the supplementary material.
The use of the approximate model, under either maximum likelihood or restricted maximum likelihood estimation (TLMM and TLMM-REML), gives rise to slightly more biased estimates of η if compared to the exact solution TGLMM or to the pseudo-likelihood approaches TGLMM-PL1 and TGLMM-PL2, although with a smaller variability, see Figure 1. The increase of ρ_ηξ does not substantially affect the results. Comparable bias is obtained with respect to the estimate of ξ, while less biased results are obtained in case the interest is on γ. Larger bias is experienced by all the methods in case of deviations from normality for the random effects, see Figures S1 and S4 in the supplementary material. Modifications of the correlation between the accuracy measures and the disease prevalence can have an impact on the bias of the estimator of γ, while the estimators of η and ξ are less affected. When the correlation is zero, instead, the estimator of γ is almost unbiased under all the examined methods, see Figures S17, S20, and S23 in the supplementary material. Similar conclusions hold for the estimation of the variance components, with larger variability associated to the exact solution TGLMM and to the pseudo-likelihood versions TGLMM-PL1 and TGLMM-PL2 when the interest is on σ²_η and σ²_ξ, and to the approximate solution TLMM when the interest is on σ²_γ. See the results under normal random-effects distribution given in Figure 2 for ρ_ηγ = ρ_ξγ = 0.7 and in Figure S18 in the supplementary material for ρ_ηγ = ρ_ξγ = 0.0. Results are confirmed under deviations from normality for the random effects, see Figures S2 and S21 under a skew-normal distribution with nonzero and zero correlation between the accuracy measures and the disease prevalence, respectively, and Figures S5 and S24 under a Student t distribution with nonzero and zero correlation between the accuracy measures and the disease prevalence, respectively.
When the interest is on the correlation components, the approximate likelihood approach is satisfactory, if compared to the exact method TGLMM, which provides largely variable results, whatever the correlation between the accuracy measures and the disease prevalence. See, for example, the results in Figure 3 and Figures S3, S6, S19, S22, and S25 in the supplementary material. The only pseudo-likelihood approach that can be applied is TGLMM-PL2, which accounts only for ρ_ηξ, and whose results suffer from substantial variability, see, for example, Figure 3.
The supplementary material includes, among others, the results when the sample size is n = 25. Increasing the sample size gives rise to a better performance of all the methods in terms of bias of the estimators of the fixed-effects parameters, the variance components, and the correlations, and in terms of the associated variability, as expected.
Figure 4 reports the empirical coverage of Wald-type confidence intervals at nominal level 95% for the competing methods under different random-effects distributions, when n = 15 and the correlation between the accuracy measures and the prevalence of the disease is 0.7. Empirical coverages obtained from TGLMM are not reported, as they suffer from the large variability of the reduced number of results given by the high rate of nonconvergence. The behavior of the pseudo-likelihood solutions TGLMM-PL1 and TGLMM-PL2 and of the approximate solutions TLMM and TLMM-REML under a normal random-effects distribution is comparable when the interest is on η and ξ, with empirical coverages close to the target level, irrespective of the correlation ρ_ηξ. When the interest is on γ, the pseudo-likelihood approach is slightly preferable. The violation of normality for the random effects does not substantially affect the conclusions. The only relevant modification is a substantial reduction of empirical coverages for all the examined methods when focusing on η, under a skew-normal random-effects specification. Similar results hold when the correlation between the accuracy measures and the prevalence of the disease is zero, see Figure S26 in the supplementary material, and when the sample size is increased to n = 25, see Figures S16 and S35 in the supplementary material, under nonzero and zero correlation between the accuracy measures and the disease prevalence, respectively.
Results of the simulation study under the extreme scenario with large sensitivity and specificity, namely, sensitivity equal to 0.99 and specificity equal to 0.95, are reported in Figures S37-S76 in Section B.3 of the supplementary material, for increasing sample size n, different correlations, and different random-effects distributions. Globally, the estimators of the parameters tend to be more biased than experienced in case of sensitivity equal to 0.95 and specificity equal to 0.90, especially when TLMM and TLMM-REML are used and with reference to the estimator of the fixed effect η and the associated variance σ²_η. In addition, bias is more pronounced for random-effects distributions far from the normal, see, for example, Figures S40-S63. As a consequence, coverages of confidence intervals can be very unsatisfactory, see, for example, the results close to zero for TLMM and TLMM-REML when the interest is on η, for small sample sizes (Figures S46-S66). Increasing the sample size does not help, see Figures S56-S76.
The compared methods behave substantially differently from a computational point of view, under both the examined accuracy scenarios. The use of the approximate likelihood requires a negligible computational effort, if compared to the numerical integration needed by the pseudo-likelihood approaches and by the exact solution. Relevant differences among the methods emerge according to the failure rate. Failure is typically a consequence of estimates of the correlation parameters on the boundary of the parameter space and/or of a not positive-definite variance/covariance matrix. Convergence problems are very seldom experienced when adopting an approximate solution, either TLMM or TLMM-REML, with the worst rate less than 10% when n = 15, ρ_ηξ = 0.2, and ρ_ηγ = ρ_ξγ = 0.7. Conversely, convergence issues deeply affect the exact likelihood solution TGLMM, with a failure rate up to 65% for small sample sizes. The failure rate does not improve using different starting points or optimization algorithms in the estimation process. The pseudo-likelihood solutions TGLMM-PL1 and TGLMM-PL2 result in a smaller failure rate if compared to the exact solution, in this way making them more attractive strategies. The failure rate of the competing methods tends to increase as the correlation ρ_ηξ moves toward the boundary of the parameter space. This result is in line with previous findings in the literature on meta-analysis of diagnostic accuracy studies, see, for example, Guolo, 11 Takwoingi et al, 14 and Chen et al. 7,23,24 As expected, the nonconvergence risk reduces as the sample size increases.

Additional study
Upon suggestion of a Referee, a comparison of the exact and approximate likelihood-based approaches to the hybrid copula mixed model in Nikoloulopoulos 28 has been carried out through an additional simulation study. The hybrid copula model 28 considers an exact binomial within-study distribution and a stochastic representation of the between-study distribution using copulas (e.g., Nikoloulopoulos 9 ), to be chosen among a variety of parametric proposals. Maximum likelihood estimation is adopted, using a quasi-Newton method. The model includes the pseudo-likelihood approach for the trivariate generalized linear mixed-effects model in Chen et al 23 as a special case. The simulation study refers to the high accuracy case, with sample size equal to 15. The corresponding results are reported in Section B.4 of the supplementary material. The implementation of the approach exploits the functionalities of the R package CopulaREMADA 38 and considers the vine copula. 39 When the focus is on the fixed-effects parameters, the behavior of the hybrid copula mixed model is similar to that of the pseudo-likelihood solutions TGLMM-PL1 and TGLMM-PL2, with a slightly reduced variance, see Figures S77-S83. When the focus is on the variance and the correlation components, instead, the variability associated to the estimation is much smaller. The reduced variability of the estimators obtained by the hybrid copula mixed model turns into a reduced empirical coverage of confidence intervals if compared to the exact likelihood-based solutions, see Figure S86. From a computational point of view, the failure rate of the hybrid copula mixed model approach is reduced with respect to the TGLMM approach, but increased with respect to the approximate solutions TLMM and TLMM-REML and to the pseudo-likelihood solutions TGLMM-PL1 and TGLMM-PL2, especially in case of correlation ρ_ηξ close to the boundary of the parameter space.

DATA ANALYSIS

β-D-glucan data
The clinical relevance of invasive fungal infections has increased, given their diffusion especially in fragile patients and the associated high risk of mortality. Early detection of diagnostic markers of a fungal infection is necessary to promptly administer antifungal therapy, considering that a definitive diagnosis may be slow and require invasive procedures. Karageorgopoulos et al 40 carry out a meta-analysis about the diagnostic accuracy of serum or plasma β-D-glucan to diagnose invasive fungal infections. The study design includes cohort and case-control studies, for a sample size of n = 14. Data are reported in Table 2. Data have been previously examined in Reference 27 via a trivariate copula approach accounting for study design. The use of the approximate likelihood, exact likelihood, and pseudo-likelihood approaches for inference gives rise to the estimates of sensitivity, specificity, and prevalence of the disease reported in Table 3. The associated 95% confidence intervals, based on sandwich standard errors, are also reported. The Fisher-type transformation of the correlation parameters has been used when adopting TGLMM. Results from the approximate restricted maximum likelihood are close to those based on the maximum likelihood, so they are not reported.
Results are nearly the same in terms of specificity, around 85%, and slightly differ with respect to the estimated sensitivity. The approximate solution TLMM leads to the largest estimated prevalence of the disease, equal to 14.7%. According to the examined approaches, there is no evidence of correlation between the accuracy measures and the prevalence of the disease. This justifies the equality of the results from the pseudo-likelihood solutions.

Colorectal cancer screening data
Colorectal cancer is one of the major causes of cancer-related deaths worldwide. Recently, Launois et al 41 carried out a meta-analysis focused on the performance of noninvasive screening instruments devoted to colorectal cancer detection. The accuracy of the instruments is evaluated with respect to the invasive gold standard represented by colonoscopy. We focus on the data used in Reference 41 to evaluate the accuracy of an immunochemical-based fecal blood test, resulting from 10 cohort studies, as reported in Table 4. The use of the approximate likelihood, exact likelihood, and pseudo-likelihood approaches for inference gives rise to the estimates of sensitivity, specificity, and prevalence of the disease reported in Table 5. The associated 95% confidence intervals, based on sandwich standard errors, are also reported. The Fisher-type transformation of the correlation parameters has been used when adopting TGLMM. Results from the approximate restricted maximum likelihood are close to those based on the maximum likelihood, so they are not reported. All the methods result in a similar value of specificity of the immunochemical-based fecal blood test, around 93%, while they differ with respect to the estimated sensitivity. The approximate solution and the pseudo-likelihood solutions provide a value of sensitivity smaller than the estimate from the exact likelihood solution, equal to 91.3%. All the methods give the same value of prevalence of the disease.

CONCLUDING REMARKS
This article developed an approximate likelihood approach for inference in meta-analysis of diagnostic accuracy studies when the disease prevalence is taken into account as a potential source of erroneous classification of the disease status. The approximate likelihood approach is proposed as a feasible alternative to the trivariate generalized linear random-effects model based on the exact distribution for the accuracy measures of the test and the disease prevalence (e.g., Chu et al, 22 Chen et al 24 ), whose application is more involved and raises substantial computational issues. The approximate solution is based on a trivariate normal random-effects model, which extends the classical bivariate normal random-effects model for the (logit of) sensitivity and specificity, after an appropriate derivation of the within-study variance/covariance components. The proposed approach shares with the bivariate model its convenient application, which does not pose computational issues, as the likelihood function is in closed form, so that standard software is sufficient for parameter estimation. Meanwhile, a satisfactory behavior of the method is maintained if compared to the exact solution. The disadvantage is the need for a continuity correction in order to avoid undefined measures of accuracy and of their variances, a correction which is arbitrary and questionable. The pseudo-likelihood solution in Reference 23 and its proposed modification, with a less stringent working independence assumption, are examined as well. With such a choice, the ad hoc continuity correction is avoided, as the methods work within the exact framework, although at the price of an additional computational effort if compared to the approximate solution, as the likelihood function is not in closed form. Nevertheless, computational issues are reduced with respect to the exact approach, since the dimension of the integrals that need to be numerically evaluated is smaller. As a Referee suggested, computational issues might be further reduced using different estimation techniques. Nikoloulopoulos 28,39 focuses on copula mixed models which include the pseudo-likelihood approach for the trivariate generalized linear mixed-effects model in Reference 23 as a special case. Within this framework, Nikoloulopoulos 28,39 proposes a maximum likelihood estimation based on Gauss-Legendre quadrature for integral evaluation, which might improve the convergence rate of the optimization algorithm. Simulation results show the advantages of the proposed solutions, both approximate and pseudo-likelihood based; the advantages are evident in case of small sample sizes and when the correlation components are close to the boundary of the parameter space. The result is in line with previous findings in the literature, when meta-analysis of diagnostic tests does not account for disease prevalence (e.g., Chen et al, 7 Guolo, 11 and Takwoingi et al 14 ).
A relevant feature of the approaches developed in the article is the possibility to merge information from cohort studies, which allow the estimation of the study-specific disease prevalence, and information from case-control studies, by separating the likelihood contributions according to the study design.
The study gives rise to several interesting directions for future research. The accuracy of the test in the studies included in the meta-analysis is based on the comparison to a gold standard. A relevant issue refers to the unavailability of a gold standard, for example as a consequence of measurement errors, see Chu et al. 42 An additional interesting extension of the work would account for partial verification bias, occurring when only a subset of the samples is verified by the reference test and the selection mechanism depends on the results of the test under evaluation (e.g., de Groot et al 43 ). See Ma et al 29 for a treatment of the problem from a Bayesian point of view.
Although we did not account for the presence of additional study-specific covariates in the model, such an extension is possible, in the simplest way by modifying the specification of the mean in the between-study model to include covariate information. The extension is expected to pose no substantial computational obstacles, especially when dealing with the approximate likelihood or the pseudo-likelihood solutions.
The development of a Bayesian approach to meta-analysis of diagnostic accuracy studies accounting for disease prevalence can be considered as a line of research.In light of this, a Bayesian version of the model proposed in Reference 22 might be inspired by the results in Reference 44 in the classical scenario not accounting for disease prevalence.How the Bayesian solution behaves with respect to the frequentist counterpart under either the exact or the approximate within-study distribution would be of interest also from a computational point of view.

F I G U R E 1
Lollipop plot for the estimators of the fixed-effects components η, ξ, and γ, under increasing ρ_ηξ, ρ_ηγ = ρ_ξγ = 0.7, and n = 15. Dot point: bias of the estimator; parentheses: 95% Monte Carlo Wald-type confidence interval based on the standard deviation. Random effects distributed as a normal.

F I G U R E 2
Lollipop plot for the estimators of the variance components σ²_η, σ²_ξ, and σ²_γ, under increasing ρ_ηξ, ρ_ηγ = ρ_ξγ = 0.7, and n = 15. Dot point: bias of the estimator; parentheses: 95% Monte Carlo Wald-type confidence interval based on the standard deviation. Random effects distributed as a normal.

F I G U R E 3
Lollipop plot for the estimators of the parameters ρ_ηξ, ρ_ηγ, and ρ_ξγ, under increasing ρ_ηξ, ρ_ηγ = ρ_ξγ = 0.7, and n = 15. Dot point: bias of the estimator; parentheses: 95% Monte Carlo Wald-type confidence interval based on the standard deviation. Random effects distributed as a normal.

F I G U R E 4
Empirical coverage probability of Wald-type confidence intervals for the estimators of η, ξ, and γ, under increasing ρ_ηξ, when ρ_ηγ = ρ_ξγ = 0.7, and n = 15. Dashed line: nominal 95% level.
TABLE 2 Number of true positives (TP), positives (P), true negatives (TN), negatives (N), and type of study design (design) for the β-D-glucan data. 40

TABLE 3 Estimate of sensitivity, specificity, and disease prevalence with associated 95% confidence interval (in parentheses) for the β-D-glucan data. 40

TABLE 4 Number of true positives (TP), positives (P), true negatives (TN), and negatives (N) for the colorectal cancer screening data. 41

TABLE 5 Estimate of sensitivity, specificity, and disease prevalence with associated 95% confidence interval (in parentheses) for the colorectal cancer screening data. 41