Behavioral items (N = 78) critical to the job success of logging supervisors were developed from 1,204 critical incidents. The frequency with which each supervisor (N = 300) engaged in each behavior was rated on a 5-point Likert-type scale by two sets of observers. Factor analysis reduced the items to 38 and 33, respectively, for the two sets of observers; these items in turn constituted 10 and 11 factors, or criteria, for performance evaluation purposes. Multiple regression equations based on composite scores were used to predict cost-related measures of logging crew effectiveness. The shrinkage in Rs after double cross-validation was moderately small. Moreover, the behavioral observation scales (BOS) developed by factor analyzing the observation ratings had moderately high reliability and accounted for more variance in the cost-related measures than did the BOS developed by traditional judgmental clustering techniques. The similarities and differences between BOS and BES procedures are discussed.
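
The double cross-validation step mentioned above can be illustrated with a minimal sketch. It assumes the usual form of the procedure: split the sample into two halves, fit a regression equation in each half, apply each equation to the other half, and compare the multiple R in the fitting half with the R in the holdout half. The data here are synthetic, and the names (`scores`, `cost`, `fit_ols`, `double_cross_validate`) and the choice of three predictors are hypothetical; none of this reproduces the study's data or its exact computations.

```python
import random


def fit_ols(X, y):
    """Return least-squares weights [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(w, X):
    """Apply weights w (intercept first) to each row of X."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) for x in X]


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


def double_cross_validate(X, y, seed=0):
    """Fit in each random half, validate in the other; return (R_fit, R_cv, shrinkage) pairs."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    a, b = idx[:half], idx[half:]
    results = []
    for fit_idx, hold_idx in ((a, b), (b, a)):
        X_fit, y_fit = [X[i] for i in fit_idx], [y[i] for i in fit_idx]
        X_hold, y_hold = [X[i] for i in hold_idx], [y[i] for i in hold_idx]
        w = fit_ols(X_fit, y_fit)
        r_fit = pearson(predict(w, X_fit), y_fit)    # multiple R in the fitting half
        r_cv = pearson(predict(w, X_hold), y_hold)   # cross-validated R in the other half
        results.append((r_fit, r_cv, r_fit - r_cv))
    return results


# Synthetic illustration only: 300 cases, three hypothetical composite scores.
rng = random.Random(42)
scores = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(300)]
cost = [2.0 * s[0] + 1.0 * s[1] + 0.5 * s[2] + rng.gauss(0, 1) for s in scores]

results = double_cross_validate(scores, cost)
for r_fit, r_cv, shrinkage in results:
    print(f"R (fitting half) = {r_fit:.3f}, R (holdout half) = {r_cv:.3f}, shrinkage = {shrinkage:.3f}")
```

A small difference between the fitting-half R and the holdout-half R in both directions is what "moderately small" shrinkage would look like in this kind of check.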