Validity: on the meaningful interpretation of assessment data
Article first published online: 27 AUG 2003
Volume 37, Issue 9, pages 830–837, September 2003
How to Cite
Downing, S. M. (2003), Validity: on the meaningful interpretation of assessment data. Medical Education, 37: 830–837. doi: 10.1046/j.1365-2923.2003.01594.x
- Issue published online: 27 AUG 2003
- Received 29 May 2003; accepted for publication 3 June 2003
- Educational measurement
- Reproducibility of results
Context All assessments in medical education require evidence of validity to be interpreted meaningfully. In contemporary usage, all validity is construct validity, which requires multiple sources of evidence; construct validity is the whole of validity, but has multiple facets. Five sources – content, response process, internal structure, relationship to other variables and consequences – are noted by the Standards for Educational and Psychological Testing as fruitful areas to seek validity evidence.
Purpose The purpose of this article is to discuss construct validity in the context of medical education and to summarize, through example, some typical sources of validity evidence for a written and a performance examination.
Summary Assessments are not valid or invalid; rather, the scores or outcomes of assessments have more or less evidence to support (or refute) a specific interpretation (such as passing or failing a course). Validity is approached as a hypothesis: theory, logic and the scientific method are used to collect and assemble data that support or fail to support the proposed score interpretations at a given point in time. Data and logic are assembled into arguments, pro and con, for some specific interpretation of assessment data. Examples of types of validity evidence, data and information from each source are discussed in the context of a high-stakes written and performance examination in medical education.
Conclusion All assessments require evidence of the reasonableness of the proposed interpretation, as test data in education have little or no intrinsic meaning. The constructs purported to be measured by our assessments are important to students, faculty, administrators, patients and society and require solid scientific evidence of their meaning.