We consider inference methods for interobserver agreement studies with two raters and several outcome categories that can be naturally combined to address a series of questions of a priori interest. We propose a new method based on a series of nested, statistically independent inferences, each corresponding to a binary outcome variable obtained by combining a substantively relevant subset of the original categories. We conduct the inferences using a goodness-of-fit procedure that extends the approach of Donner and Eliasziw. The proposed methodology is an alternative to approaches that place all outcome categories on an equal footing when estimating interobserver agreement for multinomial data. We illustrate the method with two examples. © 1997 by John Wiley & Sons, Ltd. Stat. Med., Vol. 16, 1097–1106 (1997).