We present a method for assessing rater reliability based on a design of overlapping rater teams. The products to be rated are split randomly into m disjoint subsamples, m being the number of raters. Each rater rates at least two subsamples according to a prespecified design. The covariances or correlations of the ratings can be analyzed with LISREL models, yielding estimates of the rater reliabilities. Models in which the ratings are congeneric, tau-equivalent, or parallel can be tested. We address problems concerning the identification and the degrees of freedom of the models and present two examples based on essay ratings.
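The sketch below illustrates only the overlapping-team design, not the LISREL estimation itself: products are split randomly into m disjoint subsamples, each rater is assigned two subsamples, and the pairwise covariances of the resulting ratings are computed. The cyclic assignment (rater i rates subsamples i and i+1 mod m), the sample sizes, and the simulated scores are assumptions introduced for illustration; the paper leaves the exact prespecified design open.

```python
# Minimal sketch of the overlapping rater-team design (illustrative only).
# Assumptions: a cyclic assignment of raters to subsamples and simulated
# true scores plus rater error; the paper's LISREL reliability models are
# not reproduced here, only the covariances they would take as input.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

n_products, m = 120, 4                       # m raters and m disjoint subsamples
subsample = rng.permutation(n_products) % m  # random split into m equal subsamples

# Prespecified overlapping design (assumed cyclic): each rater rates two subsamples.
design = {rater: (rater, (rater + 1) % m) for rater in range(m)}

# Simulated "true" product quality plus rater-specific error, purely for illustration.
true_score = rng.normal(size=n_products)
ratings = np.full((n_products, m), np.nan)
for rater, subs in design.items():
    mask = np.isin(subsample, subs)
    ratings[mask, rater] = true_score[mask] + rng.normal(scale=0.5, size=mask.sum())

# Pairwise covariances of the ratings; only raters sharing a subsample overlap,
# so pairs with no common products show up as NaN. These (co)variances are the
# input to the congeneric, tau-equivalent, or parallel models described above.
cov = pd.DataFrame(ratings, columns=[f"rater_{i}" for i in range(m)]).cov()
print(cov.round(2))
```

Because each rater sees only part of the sample, the design trades a complete rating matrix for a much smaller rating workload per rater, while the shared subsamples still identify the covariance structure needed for the reliability models.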