Statistical Corrections of Invalid Correlation Matrices



Suppose estimates are available for the correlations between pairs of variables, but the matrix of correlation estimates is not nonnegative definite and is therefore not a valid correlation matrix. In many applications a valid correlation matrix is required for follow-up analyses, for example when sampling from a distribution that has this correlation structure. We present new methods for adjusting the initial estimates to form a proper, that is, nonnegative definite, correlation matrix. The methods are based on pseudo-likelihood functions, formed by multiplying together exact or approximate likelihood contributions associated with the individual correlations. Such a pseudo-likelihood may then be maximized over the set of proper correlation matrices. It may also be combined with prior information on the separate correlations to form a pseudo-posterior distribution for the unknown correlation matrix. We illustrate our methods on two examples, one from financial time series and one from genomic pathway analysis.
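The idea of maximizing a pseudo-likelihood over proper correlation matrices can be illustrated with a minimal sketch for three variables. Everything specific here is an assumption made for illustration and is not taken from the paper: the approximate likelihood contribution for each pair is the Fisher z approximation, atanh(r) ~ N(atanh(rho), 1/(n - 3)); the estimates and sample sizes are hypothetical; the search runs over an angle parametrization that always yields a nonnegative definite matrix; and the optimizer is a simple coordinate pattern search, which may stop at a local maximum.

```python
import math

def is_valid_corr3(r12, r13, r23, tol=1e-9):
    """A 3x3 unit-diagonal symmetric matrix is nonnegative definite
    iff every |r| <= 1 and its determinant is >= 0."""
    det = 1 + 2 * r12 * r13 * r23 - r12**2 - r13**2 - r23**2
    return max(abs(r12), abs(r13), abs(r23)) <= 1 + tol and det >= -tol

def corr_from_angles(t1, t2, t3):
    """Correlations of three unit vectors built from angles, so the
    resulting matrix is a Gram matrix and hence nonnegative definite."""
    r12 = math.cos(t1)
    r13 = math.cos(t2)
    r23 = (math.cos(t1) * math.cos(t2)
           + math.sin(t1) * math.sin(t2) * math.cos(t3))
    return (r12, r13, r23)

def _z(r):
    """Fisher z-transform, clamped to keep atanh inside its domain."""
    r = max(-1 + 1e-12, min(1 - 1e-12, r))
    return math.atanh(r)

def pseudo_loglik(rho, r_hat, n):
    """Sum of per-pair approximate log-likelihoods (assumed Fisher z form)."""
    return sum(-0.5 * (nk - 3) * (_z(rk) - _z(pk)) ** 2
               for pk, rk, nk in zip(rho, r_hat, n))

def fit_valid_corr3(r_hat, n, step=0.5, min_step=1e-7, max_iter=100000):
    """Coordinate pattern search over the angles, starting from the
    identity matrix (all angles pi/2); halves the step when stuck."""
    angles = [math.pi / 2] * 3
    best = pseudo_loglik(corr_from_angles(*angles), r_hat, n)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in range(3):
            for delta in (step, -step):
                trial = list(angles)
                trial[i] += delta
                val = pseudo_loglik(corr_from_angles(*trial), r_hat, n)
                if val > best:
                    angles, best, improved = trial, val, True
        if not improved:
            step /= 2
    return corr_from_angles(*angles)

# Hypothetical pairwise estimates (r12, r13, r23) and sample sizes.
r_hat = (0.8, 0.8, -0.3)
n = (50, 50, 50)
fitted = fit_valid_corr3(r_hat, n)
print("initial valid:", is_valid_corr3(*r_hat))
print("fitted:", fitted, "valid:", is_valid_corr3(*fitted))
```

Because the search moves only through the angle parametrization, every candidate is a proper correlation matrix by construction, so no explicit constraint handling is needed. For more than three variables the same idea would call for a general Cholesky-type parametrization and a numerical optimizer, which this sketch does not attempt.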