Matrix-based Concordance Correlation Coefficient for Repeated Measures

Authors

  • Sasiprapa Hiriote

    Corresponding author
    1. Department of Statistics, Faculty of Science, Silpakorn University, Nakorn Pathom, 73000, Thailand
  • Vernon M. Chinchilli

    Corresponding author
    1. Department of Public Health Sciences, Penn State Hershey College of Medicine, Hershey, Pennsylvania 17033-0855, U.S.A.

email:ssiprapa@su.ac.th

email:vchinchi@psu.edu

Abstract

Summary: In many clinical studies, Lin's concordance correlation coefficient (CCC) is a common tool to assess the agreement of a continuous response measured by two raters or methods. However, the need for measures of agreement may arise for more complex situations, such as when the responses are measured on more than one occasion by each rater or method. In this work, we propose a new CCC in the presence of repeated measurements, called the matrix-based concordance correlation coefficient (MCCC), based on a matrix norm that possesses the properties needed to characterize the level of agreement between two p × 1 vectors of random variables. It can be shown that the MCCC reduces to Lin's CCC when p = 1. For inference, we propose an estimator for the MCCC based on U-statistics. Furthermore, we derive the asymptotic distribution of the estimator of the MCCC, which is proven to be normal. The simulation studies confirm that, in terms of accuracy, precision, and coverage probability, the estimator of the MCCC works very well in general, especially when n is greater than 40. Finally, we use real data from an Asthma Clinical Research Network (ACRN) study and the Penn State Young Women's Health Study for demonstration.

1. Introduction

In many scientific studies, one of the research objectives is to assess agreement of observations made by two raters or methods. For example, in the area of medical diagnostic testing, the main research interest is to compare the results of a new technique with the gold standard practice. When the observations are measured on a continuous scale, the concordance correlation coefficient (CCC), introduced by Lin (1989), is one of the most popular measures of agreement. The CCC evaluates the agreement between two readings from the same sample by measuring how far each paired data point deviates from the 45° line through the origin, called the concordance line. The value of the CCC ranges between −1 and 1, equaling 1 for perfect positive agreement, 0 for no agreement, and −1 for perfect negative agreement. Unlike traditional approaches such as the Pearson correlation coefficient, the paired t-test, and the least squares test, which sometimes fail to detect departure from the concordance line or falsely reject strong agreement, the CCC can fully assess the desired reproducibility characteristics.
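For reference, Lin's CCC for a single pair of measurements can be written as 2σxy / (σx² + σy² + (μx − μy)²). The following Python sketch computes a sample version using 1/n moments. The function name lin_ccc and the toy data are illustrative and are not part of the original study.

```python
import numpy as np

def lin_ccc(x, y):
    """Sample version of Lin's CCC for two paired 1-D arrays (1/n moments)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    sxx = np.mean((x - mx) ** 2)          # variance of x
    syy = np.mean((y - my) ** 2)          # variance of y
    sxy = np.mean((x - mx) * (y - my))    # covariance of x and y
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)

# Toy data (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ccc_identical = lin_ccc(x, x)        # readings fall on the concordance line
ccc_shifted = lin_ccc(x, x + 1.0)    # Pearson r is still 1, but the CCC drops
print(ccc_identical, ccc_shifted)
```

The shifted case shows why the Pearson correlation alone can miss a departure from the concordance line: a constant offset leaves the correlation unchanged but lowers the CCC.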

However, in many fields of science, especially the medical sciences, the need for a measure of agreement between two raters or methods often arises when the data are obtained on several occasions. For example, in a longitudinal asthma clinical trial, one of the research goals was to study the amount of agreement between plasma cortisol AUC (area under the curve) measured every hour and every two hours at several visits. In this situation, we need a repeated measure CCC that can quantify the overall agreement between two random vectors of repeated measurements.

For paired or unpaired repeated measurements study designs, Chinchilli et al. (1996) developed a weighted CCC based on a random coefficient model that allows the within-subject variances to change across subjects. For each subject, the CCC was constructed as an average of q CCCs of the least squares random vectors, whose variance–covariance matrices were of dimension q × q. Then, the global CCC was defined as a weighted average of the coefficients using a weight function based on the amount of variation within each subject.

King, Chinchilli, and Carrasco (2007) proposed another version of the CCC in the presence of repeated measurements. They characterized the amount of agreement between two p× 1 random vectors, X and Y, by inline image, where D is a p×p nonnegative definite matrix of weights among the different repeated measurements. Then, the repeated measure CCC was defined as

image

Carrasco, King, and Chinchilli (2009) developed a CCC for longitudinal repeated measurements through the appropriate specification of the intraclass correlation coefficient from a variance components linear mixed model. The authors showed that this CCC is equivalent to the repeated measure CCC proposed by King et al. (2007) when the weight matrix D is the identity matrix.

In this work, we introduce a new repeated measure CCC that not only can be proven to possess the properties needed to measure the amount of overall agreement between two p× 1 vectors of random variables but also has more intuitive appeal than the former methods. In Section 2, we first construct a matrix that characterizes the overall agreement between the two vectors. Then, to ease the problem of interpretation, we transform this matrix to a scalar based on a matrix norm and scale its value to range between −1 and 1. We call our repeated measure CCC the matrix-based concordance correlation coefficient (MCCC). To estimate the MCCC, we consider an estimator based on U-statistics. For inference, we derive the asymptotic distribution of the proposed estimator. To obtain a confidence interval or a test statistic, we consider a consistent estimator of the asymptotic variance and the Z-transformation to improve the normal approximation and bound the confidence limits. In Section 3, a Monte Carlo simulation is performed to assess the properties of the estimator of the MCCC based on finite samples. Finally, in Section 4, some real examples are used to demonstrate the application of the MCCC.

2. Matrix-Based Concordance Correlation Coefficient

2.1 Definition and Properties

Let (X, Y) be a 2p× 1 random vector from a 2p-variate distribution with a finite 2p× 1 mean vector inline image and a positive definite 2p× 2p covariance matrix

image

where X and Y are the measurements of each method with

image

To characterize the level of agreement between the two p× 1 vectors X and Y, let us consider the following p×p matrices,

image(1)

and

image(2)

Then, we construct a matrix that characterizes the overall agreement between the two vectors denoted by inline image, as follows:

image(3)

where Ip×p denotes the p×p identity matrix, VD is nonnegative definite, VI is positive definite, and VI^(−1/2) denotes the symmetric square-root decomposition of the inverse of VI.

For ease of notation, we write A > 0p×p if a p×p symmetric matrix A is positive definite and A ≥ 0p×p if A is nonnegative definite.

inline image has the following properties (see the proof of each property in Web Appendix A).

  • 1. inline image.
  • 2. inline image if and only if inline image.
  • 3. inline image if and only if X = Y with probability one.
  • 4. inline image if and only if X = −Y with probability one and inline image.
  • 5. If p = 1, then inline image reduces to Lin's CCC.
  • 6. If inline image and inline image are diagonal matrices and inline image, then each of the diagonal elements of inline image corresponds to Lin's CCC.

Based on these properties, one can use inline image to measure the amount of agreement between two vectors of random variables. The closer inline image is to the identity matrix, the higher the level of positive agreement between the two vectors. Conversely, the closer inline image is to the negative identity matrix, the higher the level of negative agreement between the two vectors. If inline image is equal to the zero matrix, then it means that the two vectors are independent, in other words, there is no agreement between the two vectors.

In most circumstances, it will not be straightforward to gauge the “closeness” of inline image to the identity matrix, and most researchers will want a numerical value to represent “closeness.” For example, suppose that inline image follows a joint distribution with inline image

image

and

image

The resultant matrix is

image

which is somewhat close to the identity matrix, indicating a good level of agreement between X and Y.

Suppose that in the above example, the expectation vector for Y actually is inline image Then

image

and it is not clear how to describe the level of agreement between X and Y.

Therefore, we recommend the construction of a matrix norm to assess the distance between inline image and the identity matrix. For p×p symmetric matrices A and B, a function g is a matrix norm if it satisfies the following conditions (Stewart, 1973):

  • 1. If A ≠ 0, then g(A) > 0.
  • 2. If c is a constant, then g(cA) = |c| g(A).
  • 3. g(A + B) ≤ g(A) + g(B).

An appealing matrix norm for our situation is the Frobenius norm, defined as

image

where λ1, λ2, … , λp represent the eigenvalues of A. A generalization of the Frobenius norm is to let D be a positive definite matrix of weights and construct inline image. Setting D equal to the identity matrix yields the Frobenius norm, which we use throughout the remainder of the manuscript.
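As an illustrative check (not part of the original analysis), the sketch below computes the Frobenius norm of a symmetric matrix from its eigenvalues, compares it with NumPy's built-in Frobenius norm, and verifies the three norm conditions on two arbitrary example matrices A and B. The helper name frobenius_from_eigs is hypothetical.

```python
import numpy as np

def frobenius_from_eigs(a):
    """Frobenius norm of a symmetric matrix: sqrt of the sum of squared eigenvalues."""
    eigvals = np.linalg.eigvalsh(a)  # eigvalsh is intended for symmetric matrices
    return float(np.sqrt(np.sum(eigvals ** 2)))

# Arbitrary symmetric example matrices (illustrative values only).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0, -0.5], [-0.5, 0.5]])

g = frobenius_from_eigs
same_as_numpy = bool(np.isclose(g(A), np.linalg.norm(A, "fro")))
positive = g(A) > 0                                        # condition 1
homogeneous = bool(np.isclose(g(-2.0 * A), 2.0 * g(A)))    # condition 2
triangle = g(A + B) <= g(A) + g(B) + 1e-12                 # condition 3
print(same_as_numpy, positive, homogeneous, triangle)
```

The eigenvalue form and the elementwise form agree only because the matrices are symmetric, which is the setting considered here.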

In addition to determining the distance between inline image and the identity matrix, we want to scale and center the matrix norm so that it ranges between −1 and +1, where the latter corresponds to perfect agreement and the former corresponds to perfect negative agreement (in a manner comparable to other agreement coefficients). Therefore, we propose the matrix-based MCCC, ρg, as follows:

image(4)

Note that ρg ranges between −1 and 1. By the definition of ρg in (4) and the properties of g, it can easily be seen that ρg = 0 if inline image, which is the case of no agreement between the two vectors; ρg = 1 if inline image, which is the case of perfect positive agreement between the two vectors; and ρg = −1 if inline image, which is the case of perfect negative agreement between the two vectors. In addition, the closer inline image is to Ip×p, the closer ρg is to 1, and the closer inline image is to −Ip×p, the closer ρg is to −1.
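Because the displayed equations (1) to (4) are not reproduced in this text, the following Python sketch rests on assumed forms: VD = E[(X − Y)(X − Y)'], VI equal to the same expectation when X and Y are independent, and ρg = 1 − g(VI^(−1/2) VD VI^(−1/2)) / g(Ip×p) with g the Frobenius norm. These forms are consistent with the limiting cases described above and with the reduction to Lin's CCC when p = 1, but they should be checked against the published equations. All function names and input values are illustrative.

```python
import numpy as np

def inv_sqrt_sym(v):
    """Symmetric inverse square root of a positive definite matrix."""
    w, q = np.linalg.eigh(v)
    return (q / np.sqrt(w)) @ q.T   # sum_k q_k q_k' / sqrt(w_k)

def agreement_matrices(mu_x, mu_y, s_xx, s_yy, s_xy):
    """Assumed population forms of V_D and V_I (not taken from the source)."""
    delta = np.asarray(mu_x, dtype=float) - np.asarray(mu_y, dtype=float)
    v_i = s_xx + s_yy + np.outer(delta, delta)   # X and Y treated as independent
    v_d = v_i - s_xy - s_xy.T                    # actual dependence restored
    return v_d, v_i

def mccc(v_d, v_i):
    """Assumed scalar index: 1 - ||V_I^(-1/2) V_D V_I^(-1/2)||_F / ||I||_F."""
    p = v_i.shape[0]
    r = inv_sqrt_sym(v_i)
    return 1.0 - np.linalg.norm(r @ v_d @ r, "fro") / np.sqrt(p)

# Hypothetical example: p = 2, unit variances, equal means, cross-correlation 0.9.
eye2 = np.eye(2)
v_d, v_i = agreement_matrices([0.0, 0.0], [0.0, 0.0], eye2, eye2, 0.9 * eye2)
rho_g = mccc(v_d, v_i)
print(rho_g)
```

Under these assumptions, passing a zero matrix for v_d gives 1, passing v_d equal to v_i gives 0, and passing v_d equal to twice v_i gives −1, matching the three limiting cases in the paragraph above.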

2.2 Estimation

To estimate inline image and ρg, we first shall estimate VD and VI with unbiased estimators based on U-statistics.

Assume that inline image are independent and identically distributed random vectors from a 2p-variate distribution with finite fourth moments. Define

image(5)

and

image(6)

Now, we construct the estimator of inline image and ρg as

image(7)

and

image(8)

Let

image

and

image

where the vec operator vectorizes a matrix by stacking its columns.

Note that

image(9)

and

image(10)

That is, inline image and inline image are U-statistics with kernels inline image and inline image, respectively. Since inline image and inline image and inline image are unbiased estimators of inline image and inline image, respectively.
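The kernels in (5) and (6) are not reproduced in this text. As a hedged sketch, the Python code below uses the usual moment forms: the average of (Xi − Yi)(Xi − Yi)' over subjects for VD, and the average of (Xi − Yj)(Xi − Yj)' over all ordered pairs i ≠ j for VI. The second average is unbiased for the independence-case expectation because Xi and Yj come from different subjects. The function name and the simulated data are assumptions made for illustration.

```python
import numpy as np

def u_stat_vd_vi(x, y):
    """Estimate V_D and V_I from n paired rows of x and y (each n x p)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    d = x - y
    within = d.T @ d                                  # sum over i of (X_i - Y_i)(X_i - Y_i)'
    v_d_hat = within / n
    cross = x[:, None, :] - y[None, :, :]             # all X_i - Y_j, shape (n, n, p)
    total = np.einsum("ijk,ijl->kl", cross, cross)    # sum over all pairs (i, j)
    v_i_hat = (total - within) / (n * (n - 1))        # drop the i == j terms
    return v_d_hat, v_i_hat

# Simulated data for illustration only.
rng = np.random.default_rng(1)
x = rng.normal(size=(6, 2))
y = x + 0.3 * rng.normal(size=(6, 2))
v_d_hat, v_i_hat = u_stat_vd_vi(x, y)
print(v_d_hat)
print(v_i_hat)
```

The pairwise array needs memory of order n² p, which is unproblematic for the sample sizes considered in the simulations (n up to 160).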

2.3 Inference

To make inference about ρg, we derive the asymptotic distribution of inline image. By (8), inline image is a function of inline image. Thus, we first derive the limiting distribution of inline image. The proof of this theorem appears in Web Appendix B.

Theorem 1. Assume that (X1, Y1), …, (Xn, Yn) are independent and identically distributed random vectors from a 2p-variate distribution with finite fourth moments. Let inline image and inline image be defined as in (9) and (10), respectively. Then

image(11)

where

image

where inline image is an arbitrary fixed vector and the expected value is taken with respect to the random vector inline image.

Finally, we apply the theory on functions of asymptotically normal vectors (Serfling, 1980) to the result from the above theorem to obtain the asymptotic distribution of inline image as follows.

Theorem 2. Assume that inline image are independent and identically distributed random vectors from a 2p-variate distribution with finite fourth moments. Let inline image be defined as in (8) and let g represent the Frobenius norm. Then

image(12)

where inline image, and inline image is defined as in Theorem 1.

Proof.  By applying the theory on functions of asymptotically normal vectors (Serfling, 1980) to (11), we have

image(13)

By the properties of matrix derivatives,

image

To obtain confidence intervals or test statistics for hypothesis testing about ρg, we need to calculate the estimates of the parameters of the asymptotic variances in (12). In addition to the estimates of inline image and inline image, defined in (9) and (10), we need the estimate of the variance–covariance matrix inline image. According to Sen (1960), inline image can be consistently estimated by

image

where inline image, and inline image

As shown by Lin (1989), the normal approximation of Lin's CCC can be improved by using the inverse hyperbolic tangent transformation, or Z-transformation. A Monte Carlo study in Lin's paper confirmed that the Z-transformation accelerates the convergence to normality of the sample CCC not only when the samples are from the normal distribution but also when the samples are from (a) short-tailed symmetric distributions like the uniform and (b) long-tailed skewed-to-the-right distributions like the Poisson. The Z-transformation was also shown by King et al. (2007) to effectively improve the normal approximation of the sample repeated measure CCC for both normal and contaminated normal data.

To improve the asymptotic normality of our sample MCCC inline image, we also will invoke the Z-transformation for inference about the MCCC, ρg. Let inline image be

image

Then it follows from the theory on functions of asymptotically normal statistics (Serfling, 1980, Theorem 3.1) that inline image is asymptotically normal with mean

image

and variance

image

By replacing the parameters in the variance of inline image with their estimates, we can obtain the confidence interval for Z denoted by inline image and then by transformation we can obtain the confidence interval for ρg based on the Z-transformation as follows:

image

As noted by Lin (1989), the confidence interval of the MCCC based on the Z-transformation will be bounded in the open interval (−1,1) and more realistically asymmetric.
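The interval construction can be sketched in Python as follows, assuming the standard inverse hyperbolic tangent form Z = atanh(ρ) = ½ ln((1 + ρ)/(1 − ρ)) and a delta-method standard error se(Z) ≈ se(ρ)/(1 − ρ²). The displayed formulas are not reproduced in this text, so the function name z_interval, its arguments, and the input values are illustrative assumptions rather than the authors' code.

```python
import math

def z_interval(rho_hat, se_rho, z_crit=1.959964):
    """Confidence interval for rho built on the Z scale and mapped back with tanh."""
    z_hat = math.atanh(rho_hat)
    se_z = se_rho / (1.0 - rho_hat ** 2)   # delta method: d atanh(r)/dr = 1/(1 - r^2)
    lower_z = z_hat - z_crit * se_z
    upper_z = z_hat + z_crit * se_z
    return math.tanh(lower_z), math.tanh(upper_z)

# Hypothetical estimate and standard error (not values from the paper).
lower, upper = z_interval(0.9, 0.05)
print(lower, upper)
```

Because tanh maps the real line into (−1, 1), the back-transformed limits cannot leave that interval, and for an estimate near 1 the lower limit sits farther from the estimate than the upper limit does.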

3. Simulation Studies

To assess the finite-sample properties of the sample MCCC, inline image, and the corresponding Z-transformation, inline image, when g is the Frobenius norm as described in Section 2.1, we performed a Monte Carlo simulation by generating the data from three distributions: the multivariate normal distribution, the multivariate Student's t-distribution, and the multivariate lognormal distribution, with three combinations of location and scale shifts and levels of correlation between X and Y. In each case, we consider three and five repeated measurements per unit for three levels of within-unit correlation (ρ = 0, 0.4, 0.8) with sample sizes of n = 20, 40, 80, and 160. For each of these situations, 1000 runs were performed using SAS/IML software. The scenarios considered for this simulation study are similar to those considered by King et al. (2007).

3.1 Multivariate Normal Distribution

In this section, five repeated measures paired samples were generated from each of the following cases of the multivariate normal distribution.

  • Case 1 :  Means inline image and inline image and covariance matrix Λ1⊗Λ2 where
    image
    and Λ2 is a 5 × 5 compound symmetric within-unit correlation structure with ρ= 0.4, assuming repeated measures have equal variance for both X and Y. This case represents a slight difference in location and scale parameters, and strong positive correlation between X and Y.
  • Case 2 :  Means inline image and inline image and covariance matrix Λ1⊗Λ2 where
    image
    and Λ2 is a 5 × 5 compound symmetric within-unit correlation structure with ρ= 0.4, assuming repeated measures have equal variance for both X and Y. This case represents a moderate difference in location and scale parameters, and moderate positive correlation between X and Y.
  • Case 3 :  Means inline image and inline image and covariance matrix Λ1⊗Λ2 where
    image
    and Λ2 is a 5 × 5 compound symmetric within-unit correlation structure with ρ= 0.4, assuming repeated measures have equal variance for both X and Y. This case represents a large difference in location and scale parameters, and weaker positive correlation between X and Y.

Then, the three repeated measures paired samples were generated from the same three cases of the multivariate normal distribution using the first three observations in inline image and inline image and a 3 × 3 compound symmetric within-unit correlation structure. In addition, all six situations were repeated with ρ= 0 and 0.8 instead of 0.4.
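The Kronecker-product covariance structure can be illustrated with a short Python sketch. The values of Λ1 and the mean vectors are not reproduced in this text, so the 2 × 2 matrix lambda1 and the zero mean below are hypothetical placeholders; only the compound symmetric form of Λ2 and the Λ1⊗Λ2 construction follow the description above. The original simulations were run in SAS/IML, not Python.

```python
import numpy as np

def compound_symmetric(p, rho):
    """p x p correlation matrix with ones on the diagonal and rho elsewhere."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))

p = 3
lambda1 = np.array([[1.0, 0.9],
                    [0.9, 1.0]])           # hypothetical between-method covariance
lambda2 = compound_symmetric(p, 0.4)      # within-unit correlation, rho = 0.4
cov = np.kron(lambda1, lambda2)           # 2p x 2p covariance of (X', Y')'

rng = np.random.default_rng(0)
mean = np.zeros(2 * p)                    # hypothetical mean vector
sample = rng.multivariate_normal(mean, cov, size=40)
x, y = sample[:, :p], sample[:, p:]       # first p columns are X, last p are Y
print(cov.shape, x.shape, y.shape)
```

With this ordering, the (i, j) block of the covariance matrix equals lambda1[i, j] times lambda2, so the same within-unit correlation pattern applies to X, to Y, and to their cross-covariance.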

In each run, we calculated inline image, their estimated asymptotic variances, and the 95% confidence intervals as described in Sections 2.2 and 2.3. Based on 1000 runs, for each scenario, we evaluated the normality, accuracy, precision, and coverage probability of the confidence intervals. To assess the normality, we examined Q–Q plots of inline image and inline image. To evaluate accuracy, we calculated the means of the estimates and their relative biases (relative bias = inline image). To assess precision, we calculated the average estimated asymptotic variances and the empirical variances of inline image.
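The relative bias formula is not reproduced in this text. The values reported in Table 1 are consistent with the convention (true value − mean estimate) / true value, which is positive when the estimator underestimates on average. The Python sketch below applies that assumed convention to the first Table 1 entry (true ρg of 0.881 and mean estimate of 0.851 at n = 20); the function name is illustrative.

```python
def relative_bias(true_value, mean_estimate):
    """Assumed convention: positive when the estimator underestimates on average."""
    return (true_value - mean_estimate) / true_value

# Inputs are the rounded values printed in Table 1 (Case 1, within-unit correlation 0, n = 20).
rb = relative_bias(0.881, 0.851)
print(round(rb, 3))
```

The result differs from the tabulated 0.033 only in the third decimal place, which is what one would expect when starting from rounded inputs.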

The Q–Q plots of inline image and inline image for sample size 20 based on a simulation of 1000 runs from the multivariate normal distribution for the scenario with three repeated measures and high within-unit correlation (ρ = 0.8) are shown in Web Figure 1. These Q–Q plots confirm that the distribution of inline image is much improved by the Z-transformation for all cases. As expected, the distributions of both inline image and inline image get closer to the normal distribution for all cases as the sample size increases. In addition, these results are similar for the scenarios with zero and moderate within-unit correlation and for the case of five repeated measures.

Figure 1.

Scatter plots of blood draw data for each visit with Lin's CCCs. This figure appears in color in the electronic version of this article.

The means of estimates inline image and their relative biases, the average estimated asymptotic variances (inline image), the empirical variances of inline image (Var(inline image)), and the empirical coverage probabilities for the 95% confidence intervals of ρg based on a simulation of 1000 runs from the multivariate normal distribution for three repeated measurements (p = 3) are shown in Table 1. For all cases, the average relative biases of inline image are positive and become closer to zero as the sample size increases. With strong between-unit correlation (Case 1), the average relative biases of inline image are less than 0.035 for all sample sizes and all levels of within-unit correlation. With moderate between-unit correlation (Case 2), the average relative biases of inline image are less than 0.04 when n is greater than or equal to 40 and are about 0.08 when n is equal to 20 for all levels of within-unit correlation. For weaker between-unit correlation (Case 3), the average relative biases of inline image are less than 0.07 when n is greater than or equal to 40 and are about 0.14 when n is equal to 20. To evaluate precision, we compared the mean of the estimated asymptotic variances of inline image with the empirical variance of inline image in each situation. For all scenarios, we found that the average asymptotic variance estimates of inline image were very close to the empirical variances. Lastly, the coverage probabilities for the 95% confidence intervals of ρg based on the Z-transformation approach 0.95 more quickly for Cases 1 and 3 and somewhat more slowly for Case 2. The results for the scenarios with five repeated measurements are similar to those for the scenarios with three repeated measures, but the rate of convergence is somewhat slower (results are shown in Web Table 1).

In general, based on the accuracy, precision, and coverage probabilities, when the data are from the multivariate normal distribution, the estimator of the MCCC based on the Frobenius norm performs notably well, especially when n is greater than or equal to 40.

Table 1.
Mean of estimates and relative biases of inline image, the asymptotic and empirical variance of inline image, and the coverage probability for the 95% confidence intervals of ρg based on a simulation of 1000 runs from the multivariate normal distribution with three repeated measurements (p = 3)

Within-unit                       Sample size
correlation  Case  Measure        20     40     80     160    ρg
0            1     Mean           0.851  0.867  0.874  0.877  0.881
                   Rel. bias      0.033  0.016  0.008  0.004
                   Asym. var.     0.022  0.011  0.005  0.003
                   Emp. var.      0.023  0.010  0.005  0.003
                   Prob           0.867  0.922  0.933  0.937
             2     Mean           0.607  0.633  0.644  0.651  0.657
                   Rel. bias      0.075  0.037  0.019  0.009
                   Asym. var.     0.011  0.006  0.003  0.001
                   Emp. var.      0.011  0.005  0.003  0.001
                   Prob           0.832  0.886  0.911  0.935
             3     Mean           0.330  0.357  0.367  0.373  0.379
                   Rel. bias      0.129  0.058  0.030  0.014
                   Asym. var.     0.010  0.005  0.002  0.001
                   Emp. var.      0.009  0.004  0.002  0.001
                   Prob           0.918  0.943  0.939  0.949
0.4          1     Mean           0.881  0.896  0.903  0.906  0.909
                   Rel. bias      0.031  0.014  0.007  0.003
                   Asym. var.     0.025  0.012  0.006  0.003
                   Emp. var.      0.025  0.010  0.005  0.002
                   Prob           0.865  0.919  0.935  0.943
             2     Mean           0.648  0.677  0.690  0.697  0.704
                   Rel. bias      0.080  0.038  0.020  0.009
                   Asym. var.     0.015  0.007  0.003  0.002
                   Emp. var.      0.015  0.007  0.003  0.002
                   Prob           0.838  0.902  0.921  0.937
             3     Mean           0.350  0.381  0.393  0.401  0.407
                   Rel. bias      0.139  0.064  0.033  0.015
                   Asym. var.     0.012  0.006  0.003  0.001
                   Emp. var.      0.011  0.005  0.003  0.001
                   Prob           0.898  0.935  0.938  0.946
0.8          1     Mean           0.893  0.909  0.915  0.918  0.921
                   Rel. bias      0.029  0.013  0.007  0.003
                   Asym. var.     0.030  0.013  0.006  0.003
                   Emp. var.      0.025  0.010  0.005  0.002
                   Prob           0.874  0.932  0.933  0.951
             2     Mean           0.667  0.698  0.712  0.719  0.726
                   Rel. bias      0.082  0.038  0.020  0.009
                   Asym. var.     0.019  0.009  0.004  0.002
                   Emp. var.      0.016  0.007  0.004  0.002
                   Prob           0.851  0.912  0.920  0.946
             3     Mean           0.361  0.394  0.408  0.416  0.423
                   Rel. bias      0.145  0.067  0.035  0.016
                   Asym. var.     0.014  0.006  0.003  0.001
                   Emp. var.      0.012  0.006  0.003  0.001
                   Prob           0.907  0.937  0.941  0.948

3.2 Multivariate Student's t-distribution

In this section, three and five repeated measures paired samples were generated from the multivariate Student's t-distribution with 10 degrees of freedom using the same scenarios as in the case of the multivariate normal distribution. As in Section 3.1, for each scenario, we evaluated the normality, accuracy, precision, and coverage probability of the confidence intervals based on a simulation of 1000 runs.

As in the normal case, for small sample size, the distribution of inline image is much closer to normality than that of inline image for all levels of correlation (some results are shown in Web Figure 2). Overall, based on the accuracy, precision, and coverage probabilities, when the data are from the multivariate Student's t-distribution, the estimator of the MCCC based on the Frobenius norm performs as well as in the normal case for all scenarios (results are shown in Web Tables 2 and 3).

3.3 Multivariate Lognormal Distribution

In this section, three and five repeated measures paired samples were generated from each of the three cases of the multivariate normal distribution and then transformed to multivariate lognormal distribution. As in Sections 3.1 and 3.2, for each scenario, we assessed the normality, accuracy, precision, and coverage probability of the confidence intervals based on a simulation of 1000 runs.

In general, based on the accuracy, precision, and coverage probabilities, when the data are from the multivariate lognormal distribution, the estimator of the MCCC based on the Frobenius norm performs very well for all levels of correlation especially when sample size is greater than or equal to 40 (results are shown in Web Tables 4 and 5).

4. Examples

In this section, we demonstrate the use of the MCCC for measuring an overall agreement between two vectors of repeated measures presented in Section 2 using some real examples.

4.1 Blood Draws Data

The data for this example are taken from an Asthma Clinical Research Network (ACRN) study reported by Martin et al. (2002). The main objective of this trial was to develop a reliable method to compare six different available inhaled corticosteroid (ICS) preparations in terms of systemic bioavailability as measured by effect on cortisol suppression. Three different outcomes were considered to evaluate this systemic effect, namely hourly plasma cortisol concentrations, 12- and 24-hour urine cortisol concentrations, and a morning blood osteocalcin concentration. After a one-week placebo run-in period, corticosteroid-naive asthma subjects enrolled at six ACRN centers were randomized to one of the six ICS and matched placebo groups. Following randomization, another placebo week was continued and then the subjects were admitted for overnight testing at each of the next five weekly visits. During an overnight stay, an out-of-laboratory 12-hour urine collection was conducted between 8 A.M. and 8 P.M., and then in-laboratory urine cortisol collection and hourly blood sampling for cortisol were performed between 8 P.M. and 8 A.M.; blood for osteocalcin concentration was taken at 7 A.M. The area under the concentration-time curve (AUC) for hourly plasma cortisol measurements is considered the most reliable method to assess systemic effect. An additional goal of interest was to assess the agreement between the plasma cortisol AUC calculated from measurements taken every hour and from measurements taken every two hours. This is very useful for future studies because the every two-hour analysis requires less sleep interruption and a lower budget.

Table 2 shows the summary statistics of the hourly data (CortAuc1) and every two-hour data (CortAuc2) for each visit including the Pearson correlation coefficients between the two measurements. The scatter plots of the blood draw data for each visit are shown in Figure 1. Lin's sample CCC and the corresponding 95% confidence interval for each visit are also included in the graph. The scatter plots and Lin's CCCs indicate strong agreement between the plasma AUCs based on the hourly and every other hour data for all five visits. To measure the overall agreement between hourly and every two-hour data based on all five visits without any specific assumption about the pattern of agreement, the MCCC is a proper coefficient. The inline image matrix is

image

which is somewhat close to the identity matrix, indicating a good level of agreement between hourly and every two-hour measurements. For these data, the point estimate and the 95% confidence interval of the MCCC based on the Frobenius norm are calculated as inline image. Using the same data, the estimate of the repeated measures CCC proposed by King et al. (2007) is 0.958 with 95% confidence interval = inline image. This result is based on the weight matrix consisting of equal on-diagonal elements and zero off-diagonal elements. According to King et al. (2007), the point estimate of the weighted CCC created by Chinchilli et al. (1996) for these data is 0.971 and the corresponding 95% confidence interval is inline image. All of these results suggest a high level of overall agreement between the two sets of plasma AUCs.

Table 2.
Summary statistics of hourly data (CortAuc1) and every two-hour data (CortAuc2) for each visit with Pearson correlation coefficients

Summary statistics                Visit number
                                  3      4      5      6      7
Sample size                       121    121    121    121    121
Mean of CortAuc1                  5.961  5.835  5.727  5.549  5.281
Std dev of CortAuc1               0.532  0.558  0.552  0.682  0.766
Mean of CortAuc2                  5.978  5.858  5.767  5.584  5.316
Std dev of CortAuc2               0.561  0.591  0.565  0.652  0.796
Pearson correlation coefficients  0.952  0.949  0.948  0.961  0.977

4.2 Body Fat Data

For this example, we use the data from the Penn State Young Women's Health Study conducted by Lloyd et al. (1998). In these data, percentages of body fat are obtained from 82 white female subjects at age 12.5, 13, and 13.5 years based on whole-body composition measurements made by dual-energy x-ray absorptiometer (DEXA) and skinfold caliper. The summary statistics of these data are shown in Table 3. Figure 2 shows the scatter plots of the percentages of body fat along with the estimates of Lin's CCC and 95% confidence intervals, indicating moderate agreement for all three visits. The inline image matrix for these data is

image

which is clearly not close to the identity matrix, although it is not clear how far from the identity matrix it is. For ease of interpretation, the point estimate of the MCCC based on the Frobenius norm along with the corresponding 95% confidence interval is inline image. This result indicates moderate agreement between the percentage of body fat measured by the DEXA and skinfold caliper. Based on this data set, the estimate of the repeated measures CCC using the approach suggested by King et al. (2007), with uneven weighting of the diagonal elements and zero off-diagonal elements, is 0.568 and the 95% confidence interval is inline image. As reported by King et al. (2007), the weighted CCC estimate proposed by Chinchilli et al. (1996) for the same data is 0.597 with the corresponding 95% confidence interval = inline image. These results also suggest moderate agreement between the two data sets.

Table 3.
Summary statistics of percentages of body fat measured by DEXA and skinfold caliper for each visit with Pearson correlation coefficients

Summary statistics                Age
                                  12.5    13      13.5
Sample size                       82      82      82
Mean of DEXA                      21.539  21.495  21.569
Std dev of DEXA                    4.031   3.671   3.645
Mean of CALIP                     23.656  25.247  25.118
Std dev of CALIP                   3.392   3.266   3.027
Pearson correlation coefficients   0.787   0.770   0.775

Figure 2.

Scatter plots of body fat data for each visit with Lin's CCCs. This figure appears in color in the electronic version of this article.

5. Discussion

We have introduced an index of overall agreement between two responses in the presence of repeated measurements, which is an extension of Lin's CCC. First, we developed a matrix that possesses the properties needed for assessing the amount of agreement between two vectors of random variables. For ease of interpretation we used a matrix norm called the Frobenius norm to transform this matrix to a scalar and scale its value to range between −1 and 1. We called this new repeated measures CCC “the MCCC.” This MCCC has desirable characteristics and can easily be used without any specific assumption about the model. For inference, we constructed an asymptotically unbiased estimator based on U-statistics and derived its asymptotic distribution. A consistent estimator of its asymptotic variance has also been proposed for obtaining confidence intervals or testing hypotheses. Moreover, we used the Z-transformation to bound the confidence limits and improve the rate of convergence. The simulation results confirmed that overall in terms of accuracy, precision, and the coverage probabilities, the estimator of the MCCC based on the Frobenius norm works very well in general cases especially when n is greater than or equal to 40.

At first glance, the MCCC proposed here may appear similar to the repeated measure CCC (RMCCC) suggested by King et al. (2007) and Carrasco et al. (2009). However, it can be shown that our MCCC differs fundamentally from these two existing methods. First, the King et al. (2007) article proposes the RMCCC as

image

where D is a p×p nonnegative definite matrix of weights. Although ρg and ρc,rm have a similar structure when D is the identity matrix, they define very different parameters because the latter is based on the trace function and the former is based on a matrix norm. The trace function is not a matrix norm because property (2) is violated, for example, inline image. Given the rigorous construction and more intuitive appeal of the statistical approach in our manuscript, we would prefer the MCCC to the RMCCC by King et al. (2007) for assessing agreement between X and Y in a repeated measurement setting. The article by Carrasco et al. (2009) builds on the King et al. (2007) article by invoking random effects assumptions and estimating the variance components accordingly, but it uses the same construction as ρc,rm.
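One way to see the violation of property (2) numerically is to multiply a symmetric matrix by a negative constant. The Python sketch below uses an arbitrary 2 × 2 matrix and c = −1 (both chosen for illustration, not taken from the paper) and compares the trace with the Frobenius norm.

```python
import numpy as np

# Arbitrary symmetric matrix and negative constant (illustrative values only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
c = -1.0

trace_scaled = np.trace(c * A)              # tr(cA) = c * tr(A), here negative
trace_expected = abs(c) * np.trace(A)       # what property (2) would require
fro_scaled = np.linalg.norm(c * A, "fro")   # Frobenius norm of cA
fro_expected = abs(c) * np.linalg.norm(A, "fro")

print(trace_scaled, trace_expected)   # the two values differ in sign
print(fro_scaled, fro_expected)       # the two values agree
```

The trace changes sign with c, whereas the Frobenius norm depends only on |c|, which is the homogeneity property required of a matrix norm.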

Here, we used the U-statistics approach instead of applying the sample counterparts of means, variances, and covariances because U-statistics possess many desirable properties such as unbiasedness and asymptotic normality under mild conditions (Lenth, 1983). To estimate the MCCC ρg, we need the summation of dependent random variables for estimating inline image, which is more complex than the usual summation of independent random variables. However, U-statistics can cope with this complex summation and have been proven to have desirable properties under minimal assumptions (Hoeffding, 1948).

In the future, the MCCC based on other forms of distance function may be considered and compared to the one based on the Frobenius norm. Furthermore, when the data are obtained by stratified random sampling, where each sample comes from a different subpopulation, one may need a weighted average of the MCCCs to evaluate overall agreement across strata. In addition, the MCCC may be generalized to evaluate agreement among more than two vectors of variables. These topics for extensions will be explored in future work.

6. Supplementary Materials

Web Appendices, Tables, and Figures referenced in Sections 2 and 3 are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.

Acknowledgements

We would like to thank the two anonymous reviewers and the associate editor for valuable comments and suggestions that greatly helped improve this current manuscript. We would also like to thank Professor Tonya S. King for her kindness and very useful recommendations about programming.
