Abstract


This commentary describes practical implications of Schmidt's (International Journal of Selection and Assessment, 20, 1–13 (2012)) rationale supporting content validity evidence for cognitive tests. These implications include descriptions of the meaning of six key inferences about local, specific cognitive tests, four of which are supported by the traditional methods of content evidence, and two of which are not. These help clarify the important incremental inference from Schmidt's proposed methodology that cognitive tests supported by content evidence will also be predictive of job performance in the local setting. A caution is raised that content evidence does not support a general inference that local, specific cognitive tests will take on all empirical properties of general cognitive measures. An additional job analysis step is recommended to strengthen the linkage between the specific cognitive job skills/behaviors and the more general theory of general cognitive ability.

1. Introduction


Schmidt (2012) makes an important contribution to the discussion about content validity evidence by describing how content validation procedures can be applied to tests of cognitive skills and aptitudes. But Schmidt's main points go well beyond the nature of content-oriented evidence for cognitive tests; they address a broader view that seeks to integrate content and criterion evidence into conclusions about cognitive tests as predictors of job performance. Indeed, Schmidt describes the main purpose of the paper as demonstrating that ‘in the domains of cognitive skills, aptitudes and abilities, test development procedures that yield content validity also yield criterion-related validity.’

The purpose of this commentary is to describe practical implications of Schmidt's view of content validity for cognitive tests with a focus on the inferences that may be drawn from content-oriented evidence and the unique inference Schmidt argues for where the test content represents specific cognitive skills and aptitudes.

To set the stage, a few key points in Schmidt's analysis should be singled out. First, Schmidt describes the procedures for content validation of cognitive tests as the same as the traditional, well-understood procedures for content validation (Principles for the Validation and Use of Personnel Selection Procedures, Society for Industrial and Organizational Psychology, Inc., 2003; Standards for Educational and Psychological Testing, American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Stelly & Goldstein, 2007). The procedures for producing content evidence for cognitive tests are not different from procedures for other types of tests. These include the specification of the target domain of work content, job analysis describing the manner in which the target domain manifests itself in work behavior, a strategy for sampling from the target domain, and procedures for developing measures of the targeted elements of the work domain. Second, Schmidt distinguishes between cognitive tests along a continuum of generality ranging from highly specific tests of cognitive skills to general tests of cognitive abilities. Third, Schmidt asserts that content evidence is not relevant in cases where job analysts draw a theory-based conclusion from the job analysis that a general cognitive ability (GCA) such as learning ability underlies successful performance. Content evidence is applicable only where job analysis specifies the work behaviors that operationalize a target skill/aptitude. This is necessary so that this operationalization in terms of work behavior can be linked by job experts to the same skill/aptitude's operationalization in the test. Stelly and Goldstein (2007) made a very similar point about domains and tests that are too broadly defined to admit content evidence.

To address issues raised by Schmidt, Figure 1 provides a schematic model for describing inferences from content evidence. This figure applies Binning and Barrett's (1989) general model of validity inferences to the special case of content evidence relating to tests of cognitive skills/aptitudes. The language of constructs, predictors, and criteria used in some versions of the general model is replaced by work content domain, test content, and job performance, respectively. In contrast to the general Binning and Barrett model, this content evidence model shows a single work content domain underlying both the test and job performance, indicating that they are different measures or operationalizations of the same work content domain (the careful reader might wonder whether there is any distinction between the work content domain and job performance. This model adopts the Principles' view that the work content domain is composed of the behaviors, activities, knowledge/skills/abilities/other characteristics (KSAOs) necessary for performance on the job). Figure 1 also shows the distinction between the whole work content domain and the sampled version used as the basis for test development. The numbered linkages (solid lines) represent the processes and inferences associated with content evidence.

Figure 1. A model of content-oriented evidence of validity in the cognitive domain.

To represent Schmidt's central point that content valid cognitive skill/aptitude tests also have criterion-related validity, Figure 1 depicts the theory/explanatory context in which cognitive content evidence can be placed. The lettered linkages (dashed lines) represent the linkages between the theory of cognitive ability and job performance and the meaning of the cognitive work content domain (A) and the prediction of job performance (B). These lettered linkages are outside the scope of content evidence itself. Also, this theory context and the lettered linkages are specific to cognitive work domains and may have little practical value in other domains. While virtually all work behaviors have a cognitive component, the focus in some domains, for example, typing, is on the linkage between developed skills and performance. In such cases, job performance may be equivalent to the skilled behavior operationalized in the test. The theoretical framework of cognitive ability may have little practical, incremental value over content evidence in cases such as typing.

2. What is the meaning of content validity evidence for cognitive tests?


The meaning of content validity evidence can be understood by identifying the inferences that may be drawn from content evidence. The four numbered linkages in Figure 1 represent the four key linkages supported by content validity evidence. Each linkage represents a set of processes and the inferences enabled by those processes. In Figure 1, these linkages are numbered in their logical sequence.

2.1. Linkage 1: work content domain – job performance

Linkage 1 is about the relationship between the work content domain of interest, in this case, cognitive behaviors, activities, KSAOs (i.e., manifestations of cognitive skills/aptitudes), and job performance. In the content validation process for cognitive skills/aptitudes, there are three components of this linkage: (a) the description of the behaviors, activities, and/or KSAOs that constitute the content domain; (b) the judgment that these are relevant to successful performance; and (c) the identification of these as cognitive in nature. Job analysis can provide the content evidence supporting (a) and (b). Using the appropriate experts, job analysis methods can also be used to attach cognitive labels to work behaviors. But job analysis does not develop the general labels and definitions of cognitive skills/aptitudes/abilities. These have been derived from separate research that has informed the theory of cognitive ability and job performance. As a result, the job analysis process supporting linkage 1 enables inferences about (a) the content and scope of the cognitive work domain, (b) the linkage between the behaviors, activities, and/or KSAOs representing cognitive skills/aptitudes and job performance, and (c) the cognitive labeling applied to the work content domain. But job analysis does not support any new inferences about the definitions of cognitive abilities, skills, or aptitudes.

2.2. Linkage 2: work content domain – sample of work content domain

Linkage 2 is about the adequacy with which the work content domain is sampled for the purpose of developing or choosing a test used to make personnel decisions. While there is no one set of adequacy standards for domain sampling in support of personnel decisions, one almost universal condition for sample adequacy is that the sample includes work content important for job performance. The processes that ensure sample adequacy typically involve some assessment of importance with respect to performance. This means that sampling adequacy is usually not about the sample's representativeness of the whole domain as much as it is about importance for successful performance. There does not appear to be any reason grounded in work content to expect that different adequacy standards would be applied to cognitive skills/aptitudes than to other domains. The inference that the work content domain has been adequately sampled would be supported by the same types of subject matter expert (SME) judgments for cognitive domains as for other domains.

(Note, Schmidt's point that the predictive validity of any three or so specific skills/aptitudes would capture all the criterion-oriented validity of GCA might suggest a different adequacy standard for sampling from the cognitive domain than from other domains. But this consideration flows from the empirically supported theory of cognitive ability and job performance, not from a consideration of work content. For that reason, adequacy standards grounded in job analytic descriptions of work content are unlikely to differ for the cognitive domain.)

2.3. Linkage 3: sampled domain – test content

One of the most critical features of content evidence is that test content and work content be expressed in comparable ways so job experts can confirm that the test content matches the sampled work content. This matching by experts of test content to work content is the lynchpin of content evidence. Schmidt makes the point that because of this requirement, only the operationalization of cognitive skills/aptitudes specific to the work context lends itself to content evidence. The process of content validation for tests of cognitive skills/aptitudes can support linkage 3 when the test content satisfies this level of specificity requirement. The inference that can be made from such content-to-content matching evidence is that the test content is representative of the work content represented by the sampled work content domain. When the matching of test content to work content includes item content as well as all the features of the testing process that impact score meaning such as response formats, instructions, and scoring rules, this inference can be extended to a further inference that the test scores measure the skills/aptitudes in the sampled work domain.
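The expert matching judgments that support linkage 3 are often quantified. One widely used index from the content validation literature is Lawshe's content validity ratio (CVR), which summarizes how many subject matter experts judge a test item essential to the sampled work content. A minimal sketch (the function name and example numbers are illustrative, not from the commentary):

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to +1.

    n_essential: number of SMEs rating the item 'essential'
    n_experts:   total number of SMEs providing ratings
    """
    half = n_experts / 2
    return (n_essential - half) / half

# Example: 9 of 10 job experts judge a test item essential
# to the sampled work content.
print(round(content_validity_ratio(9, 10), 2))  # 0.8
```

A CVR of 0 means exactly half the panel judged the item essential; values near +1 document strong test-to-work content matching for that item.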

2.4. Linkage 4: test content – job performance

Three processes underlying content evidence – job analysis, domain sampling based on importance for performance, and test–work content matching – support linkages 1–3 and have the net effect of enabling the inference represented by linkage 4: that the test content represents cognitive skills/aptitudes that are important for successful job performance.

In sum, there are four major parts to the meaning of content evidence for cognitive skills/aptitudes.

  1. Content evidence provides a credible description of the cognitive work content domain in the target work.
  2. Content evidence supports the inference that certain cognitive components of the work content domain are important for successful performance.
  3. Content evidence supports the inference that the operationalization of cognitive skills/aptitudes in the test represents sampled work content.
  4. Content evidence supports the inference that the tested skills/aptitudes represent work content important for performance.
    • This inference may extend to the meaning of test scores, but not their psychometric properties, depending on the comprehensiveness with which test and item content was matched to work content.

3. What is unique about the prediction inference supported by Schmidt's analysis?


In contrast to these inferences supported by content evidence, two other potentially useful inferences are not supported by content evidence.

  1. Content evidence does not support any inference about an underlying theoretical framework, if any can be specified. Content evidence does not evaluate hypotheses or propositions about any theoretical framework. Indeed, the processes of content evidence make no assumption about any underlying theoretical framework. The fact that in the case of the cognitive work domain, there is a well-developed theoretical framework does not mean that content evidence has evaluated or tested it in any sense. Rather, the theoretical framework has informed the content evidence by providing the labels and definitions of the cognitive skills/aptitudes/abilities manifest in the work behavior and operationalized in the test.
  2. Content evidence alone does not support an inference about prediction because content evidence alone does not demonstrate prediction. This is a nuanced and debatable point, which is at the heart of Schmidt's main purpose to demonstrate that cognitive skills/aptitudes supported by content evidence will also demonstrate criterion-oriented validity, which is prediction. The Principles (p. 21) and others (e.g., Goldstein, Zedeck, & Schneider, 1993) assert that content evidence supports the inference that a content valid test predicts job performance. In effect, the Principles asserts that the rationale for linkage 4 may be interpreted as a rationale that test scores will predict job performance. The Principles uses this language of ‘rationale’ as the basis for its claim about prediction, ‘This (that the selection procedure samples important work behaviors) provides the rationale for the generalization of results from the (content-oriented) validation study to prediction of work behavior’ (p. 21; content in italics added). But it is clear that the process of gathering content evidence does not include prediction evidence, much less psychometric evidence about the properties of test scores. The Principles' prediction claim could be described as a claim that it is plausible that test scores will predict performance because test scores satisfy a necessary condition of prediction that the test content producing the scores represents work behavior important for performance. But in the absence of (a) empirical criterion evidence of prediction or (b) any linkage to a theoretical framework supported by criterion-oriented evidence, there is not a sufficient basis for inferring from content evidence that test scores predict job performance.

Schmidt's main point is that in the case of cognitive skill/aptitude tests, content evidence may be linked to and supplemented with theory-based evidence supporting an additional inference that work-specific cognitive skill/aptitude tests predict job performance. In short, this chain of two types of evidence begins with the content evidence that for a particular job, cognitive skills/aptitudes are important for successful performance. The procedures supporting linkages 1 and 2 provide this evidence. As described later, it is important that the content evidence establish that the KSAOs in question are cognitive in nature by some method of expert judgment. The theory of cognitive ability and job performance, which accounts for extensive evidence about relationships among cognitive measures, explains that such specific skills/aptitudes are task-specific manifestations, ultimately, of GCA. This theory also accounts for a large volume of criterion-oriented prediction evidence by explaining that GCA predicts job performance by enabling the learning of job knowledge, which directly affects performance. Once content evidence has established that job performance is a function of work-specific cognitive skills/aptitudes, it follows from the empirically supported theory of cognitive ability and job performance that tests of work-specific cognitive skills/aptitudes will predict job performance. This chain of evidence is a composite of content evidence and a theory explaining extensive criterion-oriented evidence. Because the criterion-oriented evidence demonstrates empirical prediction, an appropriate inference about content valid tests of cognitive skills/aptitudes is that they will predict job performance. This inference generalizes linkage B to linkage 4 in Figure 1. It is important to recognize that this additional inference is about the particular skill/aptitude tests with content evidence. It is not an inference about a broad category of tests.

This prediction inference is not the same as the prediction rationale cited in the Principles. It generalizes empirical prediction evidence explained by theory to the local setting via the content-based demonstration that cognitive skills/aptitudes are important for performance in the local setting. It is not merely a rationale for expecting prediction, it is a generalization of prediction evidence to the local setting. This generalization to the local setting is a function of the theoretical explanation of GCA's impact on performance and its relationships to specific cognitive skills/aptitudes and the content evidence that specific cognitive skills/aptitudes drive performance in the local setting. This is not an inference from content evidence alone. Rather, it is an inference from the combination of content evidence and a theory about GCA that is based on empirical criterion evidence.

(It should be noted that even this inference, with its grounding in empirical evidence, does not overcome the lack of psychometric evidence that may be available for ad hoc, local skill/aptitude tests. The inference of prediction implicitly assumes the psychometric quality of the test scores is adequate.)

4. A recommendation


As noted, this chain of evidence includes evidence that the local cognitive skills/aptitudes reflected in the work domain are cognitive in nature. While this is likely to be self-evident in most cases, self-evidence is usually not satisfactory. It is recommended that the content evidence process include an additional step that is usually not part of the job analysis process. In this step, experts in the application of cognitive ability in work, such as industrial–organizational psychologists, would rate the extent to which each operationalized skill/aptitude included in the test is identified with one or more GCA factors such as reading comprehension, deductive reasoning, and spatial ability. This expert-based linkage would improve the rigor, clarity, and documentation of the linkage between the work content domain as represented in the test and the cognitive theory.
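The recommended linkage step could be documented with a simple rating matrix: each expert rates how strongly each tested skill/aptitude reflects a cognitive factor, and mean ratings at or above a documented standard support the linkage. A hypothetical sketch (the rating scale, cutoff, and skill/factor pairings are invented for illustration):

```python
from statistics import mean

# Hypothetical expert ratings (1 = no linkage ... 5 = strong linkage)
# of each operationalized skill/aptitude to a cognitive factor.
ratings = {
    ("reading work orders", "reading comprehension"): [5, 5, 4],
    ("troubleshooting faults", "deductive reasoning"): [4, 5, 4],
    ("reading blueprints", "spatial ability"): [5, 4, 5],
}

CUTOFF = 3.5  # illustrative standard for documenting a linkage

for (skill, factor), rs in ratings.items():
    m = mean(rs)
    status = "linked" if m >= CUTOFF else "not linked"
    print(f"{skill} -> {factor}: mean = {m:.2f} ({status})")
```

The point of the sketch is only that the expert judgments become documented, auditable evidence rather than an unstated assumption that the tested content is cognitive.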

5. Do all empirical results for GCA tests generalize to content valid skill/aptitude tests? A caution


Schmidt shows that specific cognitive skill/aptitude tests supported by content evidence will also demonstrate criterion-oriented validity (prediction). This is explained by the theoretical conclusion that job-specific skill/aptitude tests predict job performance because they are manifestations of GCA, which is known to predict job performance. This argument assumes that there are no systematic differences between skill/aptitude tests and GCA tests that interfere with this inference. Yet it is known that work-specific measures of cognitive KSAOs do differ from GCA measures in certain ways. In particular, these two types of cognitive measures are known to differ in the magnitude of mean race/ethnic group differences. For example, McKay and McDaniel (2006) reported standardized mean black–white differences on work samples, job knowledge tests, on-the-job training, and academy training of 0.42, 0.53, 0.05, and 0.46, respectively. They also reported mean black–white differences of 0.35 on job performance. In contrast, Schmitt, Clause, and Pulakos (1996) reported a standardized mean black–white difference of 0.83 on GCA tests (all differences represent higher scores for whites). With the exception of on-the-job training tests, these results are representative of the general result that black–white differences on GCA tests are approximately twice as large as differences on tests measuring cognitive ability in job-specific contexts, including measures of job performance itself.
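The standardized mean differences cited above are Cohen's d values: the raw difference between group means divided by the pooled within-group standard deviation. A minimal sketch of the computation (the group statistics below are invented solely to illustrate the arithmetic):

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled within-group SD."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Illustrative (invented) group statistics: a d of 0.83 means the
# group means differ by 0.83 pooled standard deviations.
print(round(cohens_d(105.0, 15.0, 200, 92.55, 15.0, 200), 2))  # 0.83
```

Expressing every comparison on this common standard-deviation metric is what allows differences on GCA tests, work samples, and job performance measures to be compared directly across studies.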

Schmidt's argument does not address possible systematic differences between job-specific skill/aptitude tests and GCA tests that might limit this generalization from GCA prediction to skill/aptitude prediction. One can speculate that one source of systematic difference is that job-specific skill/aptitude tests are influenced by level of learning/achievement to a greater extent than are GCA tests. Where learning/achievement influences skill/aptitude test scores, this influence is not likely a function of specificity itself but more likely is a function of the work context of the items, which may introduce work-related skill/knowledge achievement as a component of score variance. Certainly, job performance measures are expected to be strongly affected by learning/achievement since the theory of cognitive ability and job performance describes learning as the mechanism by which cognitive ability influences performance.

Although learning/achievement may be a more significant factor in job-specific skill/aptitude tests than in GCA tests, it does not appear to benefit the predictive validity of skills/aptitude tests. Schmidt and Hunter (1998) reported very similar average criterion validities for general mental ability tests (0.51), work sample tests (0.54), and job knowledge tests (0.48). These empirical results may reduce the concern that systematic differences between skill/aptitude tests and GCA tests can limit the inference about skill/aptitude test prediction from GCA prediction. However, the empirical differences between these two types of tests with regard to race/ethnic differences do suggest that not all relationships with GCA tests will generalize to skill/aptitude tests when content evidence supports the skill/aptitude tests.

References

  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
  • Binning, J. F., & Barrett, G. V. (1989). Validity of personnel decisions: A conceptual analysis of inferential and evidentiary bases. Journal of Applied Psychology, 74, 478–494.
  • Goldstein, I. L., Zedeck, S., & Schneider, B. (1993). An exploration of the job analysis-content validity process. In N. Schmitt & W. Borman (Eds.), Personnel selection in organizations (pp. 3–34). San Francisco, CA: Jossey-Bass.
  • McKay, P. F., & McDaniel, M. A. (2006). A reexamination of black–white mean differences in work performance: More data, more moderators. Journal of Applied Psychology, 91, 538–554.
  • Schmidt, F. L. (2012). Cognitive tests used in selection can have content validity as well as criterion validity: A broader research review and implications for practice. International Journal of Selection and Assessment, 20, 1–13.
  • Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.
  • Schmitt, N., Clause, C. S., & Pulakos, E. D. (1996). Subgroup differences associated with different measures of some common job-relevant constructs. In C. L. Cooper & I. T. Robertson (Eds.), International review of industrial and organizational psychology (Vol. 11, pp. 115–139). Chichester: Wiley.
  • Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: Author.
  • Stelly, D. J., & Goldstein, H. W. (2007). Application of content validation methods to broader constructs. In S. M. McPhail (Ed.), Alternative validation strategies: Developing new and leveraging existing validity evidence (pp. 252–316). San Francisco, CA: Jossey-Bass.