Comparison of UMAT scores and GPA in prediction of performance in medical school: a national study
Phillippa J Poole, Department of Medicine, Faculty of Medical and Health Sciences, University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand. Tel: 00 64 9 373 7599; Fax: 00 64 9 373 7555 (ext 86440/86747); E-mail: email@example.com
Medical Education 2012; 46: 163–171
Context Medical schools continue to seek robust ways to select students with the greatest aptitude for medical education, training and practice. Tests of general cognition are used in combination with markers of prior academic achievement and other tools, although their predictive validity is unknown. This study compared the predictive validity of the Undergraduate Medicine and Health Sciences Admission Test (UMAT), the admission grade point average (GPA), and a combination of both, on outcomes in all years of two medical programmes.
Methods Subjects were students (n = 1346) selected since 2003 using UMAT scores and attending either of New Zealand’s two medical schools. Regression models incorporated demographic data, UMAT scores, admission GPA and performance on routine assessments.
Results Despite the different weightings of UMAT used in selection at the two institutions and minor variations in student demographics and programmes, results across institutions were similar. The net predictive power of admission GPA was highest for outcomes in Years 2 and 5 of the 6-year programme, accounting for 17–35% of the variance; UMAT score accounted for < 10%. The highest predictive power of the UMAT score was 9.9% for a Year 5 written examination. Combining UMAT score with admission GPA improved predictive power slightly across all outcomes. Neither UMAT score nor admission GPA predicted outcomes in the final trainee intern year well, although grading bands for this year were broad and student numbers were smaller.
Conclusions The ability of the general cognitive test UMAT to predict outcomes in major assessments within medical programmes is relatively minor in comparison with that of the admission GPA, but the UMAT score adds a small amount of predictive power when it is used in combination with the GPA. However, UMAT scores may predict outcomes not studied here, which underscores the need for further validation studies in a range of settings.
The selection of medical students is controversial and the choice of tools with which to select candidates for admission from the much larger applicant pool represents a major challenge.1 Selection processes are required to determine those with the aptitude to complete the programme and go on to become the best doctors, but must also rank students for entry to the limited places available. Most selection tools have face validity; however, determining the extent to which a tool predicts an outcome measure, or its predictive validity, is more difficult. One issue concerns deciding which outcome to validate the tool against; another is that usually only candidates with higher scores gain entry and thus it is uncertain how those with lower scores would have fared.2 A final issue refers to how tools might be combined to enhance their predictive validity.3,4
Prior academic achievement predicts most strongly who will remain in medical school, performance during medical school (especially early on), junior doctor performance and time taken to become a specialist.3–5 Prior academic achievement is measured by results on standardised tests within the school system or a grade point average (GPA) calculated from university papers. Professional attributes may be assessed using personal statements, testimonials, personality and emotional intelligence tests,5,6 or interviews, including multiple mini-interviews.7,8 Although some of these methods show promise, none has proved a better predictor of subsequent performance than prior academic achievement.3,4
Tools have been developed that combine assessments of basic medical science knowledge with measures of more general cognitive skills, such as problem-solving, reasoning and writing skills. Examples include the Medical College Admission Test (MCAT; Association of American Medical Colleges, Washington, DC, USA), used since 1928,9 and the Graduate Australian Medical School Admissions Test (GAMSAT; Australian Council for Educational Research [ACER], Melbourne, Vic, Australia); both are used to select for graduate medical programmes. Test scores on the MCAT are most predictive of performance early in a medical programme,9–11 whereas GAMSAT scores correlate weakly with routine in-course assessments12 and clinical reasoning abilities.13
Tests of general cognition are used increasingly in selection into undergraduate medical programmes.7 By contrast with tools used for graduate-entry programmes, these do not test candidates’ knowledge of basic medical sciences. The UK Clinical Aptitude Test (UKCAT; UKCAT Consortium) scores student ability in four distinct domains: quantitative reasoning; verbal reasoning; abstract reasoning, and decision analysis.14 In Australia and New Zealand, the Undergraduate Medicine and Health Sciences Admission Test (UMAT; ACER) is used. This multiple-choice test consists of three sections which focus, respectively, on: logical reasoning and problem solving; understanding people, and non-verbal reasoning.15
Calls have been made to better establish the predictive validity of the UKCAT and the UMAT across a range of schools and curricula.6,16–19 Furthermore, although these tests are relatively convenient for medical schools,1 applicants are required to travel to specified locations on specified days prior to application to medical school. Such tests may prove a barrier to application for students already disadvantaged socio-economically, educationally or by distance.2 Ironically, these may be the very students schools wish to recruit to meet their social mission.20,21 As these tests do not pre-suppose a curriculum, a counter-argument proposed by test administrators in their support is that they minimise prior educational disadvantage.14
We undertook a cross-institution study with the aim of comparing the predictive validity of UMAT scores, the admission GPA, and combined GPA and UMAT scores for student performance on all routine assessments in two undergraduate medical programmes.
The study setting offers three advantages. First, for the majority of students, the institutions do not use minimum thresholds in UMAT scores, which allows for a wider range of scores on which to determine predictive validity. Second, prior academic achievement is determined on a common set of university courses. Finally, this was a national study.
The study was carried out in New Zealand (NZ), in which two medical programmes serve a total population of 4.4 million people. Each is a 6-year hybrid programme in which students are selected for entry into the second year (Year 2) of medicine after completing 1 year at university or a prior degree.22 Major summative written and clinical assessments are administered at the end of Year 5, prior to the student undertaking a final apprentice-type year as a trainee intern.23 Students must pass each year before they can progress to the next and performance is assessed against predefined standards.
To be eligible for admission, students must meet predefined academic thresholds during their first-year university courses or prior degrees. Those meeting the threshold are then ranked on their achievements. The University of Auckland calculates the mean of course grades, whereas the University of Otago uses the mean of percentage scores. For simplicity, these are both referred to as the ‘admission GPA’.
Otago introduced the UMAT as a selection tool in 2003, and Auckland in 2004; their first cohorts completed their medical degrees in November 2008 and 2009, respectively. There are minor differences in selection policies between the two programmes. For example, Auckland uses an interview as a third selection tool,24,25 whereas Otago does not. Furthermore, the weightings given to UMAT scores vary between the two programmes: at Otago the UMAT score has a weighting of 34% and admission GPA provides the other 66%; at Auckland, the UMAT score is weighted at 15%, the structured interview score at 25% and admission GPA at 60%.26
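The weightings described above can be illustrated with a minimal sketch. Only the weights come from the text; the function name `composite_score`, the example applicant and, in particular, the assumption that every component has already been rescaled to a common 0–100 range are hypothetical and are not a description of how either programme actually computes its ranking.

```python
# Selection weightings as stated in the text (proportions of the ranking score).
WEIGHTS = {
    "Otago": {"umat": 0.34, "gpa": 0.66},
    "Auckland": {"umat": 0.15, "interview": 0.25, "gpa": 0.60},
}


def composite_score(programme, scores):
    """Weighted sum of selection components.

    ASSUMPTION: every score in `scores` is already on a common 0-100 scale.
    The source does not say how the components are rescaled before weighting.
    """
    weights = WEIGHTS[programme]
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing components for {programme}: {sorted(missing)}")
    # Components that a programme does not use (e.g. interview at Otago) are ignored.
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical applicant, for illustration only.
applicant = {"umat": 60.0, "gpa": 90.0, "interview": 70.0}

for programme in WEIGHTS:
    print(programme, round(composite_score(programme, applicant), 1))
```

Under these assumptions, the same applicant is ranked on a score dominated by admission GPA in both programmes, with the UMAT contributing roughly twice as much weight at Otago as at Auckland.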
In addition to the standard entry pathway, each programme has two pathways of affirmative entry, of which one applies to rural students and the other applies to students of Māori (indigenous) and Pacific backgrounds.26
A bi-university research group was established in 2006 to study the predictive validity of selection tools in the New Zealand setting. Ethics approvals were granted in 2009 by the University of Auckland Human Participants Ethics Committee and the University of Otago Ethics Committee for Study on Human Subjects.
Subjects and outcomes
Data were obtained for all NZ medical students for whom UMAT formed part of their selection process, up to and including the 2009 entry cohort. Those admitted through the Māori and Pacific affirmative pathway were not included in this study because the selection tools are used differently in this group. We recorded the following for each study subject: demographic variables of gender, age and ethnicity; whether the student had been admitted as a graduate or through the rural pathway; admission GPA; scores on the three UMAT sections, and average UMAT score.
Outcomes were results on routine summative in-programme assessments, such as the global year grade. Table 1 shows a description of the components of each outcome and the measurement scales used. Outcomes that are similar between the two programmes appear in the same row. The University of Otago changed some of its assessments during the study period, such as the Year 2 short-answer question (SAQ) examination. As a result, there was some variation in numbers of subjects in each analysis. Where students repeated a year, only the initial grade for that year was included in order to ensure predictive validity was established among students undergoing the same assessments for the first time.
Table 1. Outcomes in the two programmes in which the predictive validity of the admission grade point average (GPA) and Undergraduate Medicine and Health Sciences Admission Test (UMAT) score was determined
|University of Auckland outcome||Components||Measurement scale||University of Otago outcome||Components||Measurement scale|
|Year 2||Average of grades across all Year 2 courses||Continuous scale: 1–9||Year 2||Average of marks from MCQ and SAQ tests; in 2008–2009, it included an OSCE||Continuous scale: 1–100|
| || || ||Year 2||Aggregate performance||3-point scale: distinction, pass, fail|
| || || ||Year 2||Mark from end-of-year SAQ test||Continuous scale: 1–5|
|Year 3||Average of grades across all Year 3 courses||Continuous scale: 1–9||Year 3||Average of grades on MCQ and SAQ tests; in 2009, it included an OSCE||Continuous scale: 1–5|
| || || ||Year 3||Aggregate performance||3-point scale: distinction, pass, fail|
|Year 4||Rules-based grade derived from seven clinical attachments, projects, written examinations and communication skills||3-point scale: distinction, pass, fail||Year 4||Rules-based determination derived from clinical attachments and projects||3-point scale: distinction, pass, fail|
|Year 4||Performance on history taking and communication task with SP||3-point scale: distinction, pass, fail|| || || |
|Year 5||Rules-based grade derived from written examination and clinical assessments||3-point scale: distinction, pass, fail||Year 5||Rules-based determination derived from clinical attachments, projects and end-of-year examinations||3-point scale: distinction, pass, fail|
|Year 5||Combined MCQ and SAQ test||Continuous scale: 1–100||Year 5||Combined MCQ and SAQ test||Continuous scale: 1–100|
|Year 5||Rules-based grade derived from clinical short cases examination, seven clinical attachments, and projects||3-point scale: distinction, pass, fail||Year 5||7–12-station OSCE||Continuous scale: 1–100|
|Year 6||Rules-based grade derived from nine clinical attachments||3-point scale: distinction, pass, fail||Year 6||Rules-based grade derived from six clinical attachments||3-point scale: distinction, pass, fail|
Separate analyses were conducted for each programme. Each analysis used regression models to measure the predictive association between UMAT score, admission GPA or combined GPA and UMAT score, and the outcomes listed in Table 1, after controlling for background factors (age, ethnicity, graduate entry and rural pathway entry). For students at the University of Auckland, the interview score was included among the background factors.
Multiple linear regression, summarised by R², was used when the dependent variable consisted of continuous scores; ordinal regression, summarised by the Nagelkerke pseudo R², was used when the dependent variable was categorical (distinction, pass, fail). Seven regression models were established for each outcome in each programme; an example is given in Table 2. One of these models included the background variables alone. The remaining six models included the background variables together with one of the following: admission GPA; UMAT score; admission GPA and UMAT score combined; or one of the three UMAT sections. Thus, for each outcome in each year and university, it was possible to quantify the net predictive effect of the UMAT score (overall score or any section score), admission GPA, or both, on outcomes over and above other information available at selection. This was calculated by subtracting the proportion of variance explained by the background factors alone from the total variance explained by the model.
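The net predictive power calculation can be sketched as a simple subtraction. The figures below are taken from the Year 2 GPA row for Auckland in Table 4, on the assumption that its four R2 columns are ordered as background only, background plus GPA, background plus UMAT, and background plus both; the function and variable names are illustrative only.

```python
# R2 values from the Auckland Year 2 GPA row of Table 4.
# ASSUMPTION: column order is background, +GPA, +UMAT, +GPA and UMAT.
R2_BACKGROUND = 0.049
R2_BY_MODEL = {
    "GPA": 0.387,
    "UMAT": 0.097,
    "GPA + UMAT": 0.397,
}


def net_predictive_power(r2_model, r2_background):
    """Percentage of variance explained over and above the background model."""
    return round((r2_model - r2_background) * 100, 1)


net = {
    name: net_predictive_power(r2, R2_BACKGROUND)
    for name, r2 in R2_BY_MODEL.items()
}

for name, value in net.items():
    print(f"{name}: {value}% of variance beyond background factors")
```

The three results (33.8, 4.8 and 34.8) match the net values printed in the last three columns of that row, which supports the assumed column order.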
Table 2. Regression model examining the predictive power of admission grade point average (GPA) and Undergraduate Medicine and Health Sciences Admission Test (UMAT) score on Year 2 GPA; ‘Ref’ represents the reference group
|Dependent variable||Outcome measure||Year 2 GPA|
|Independent variables||Demographic variable||Gender (Ref = female); Ethnicity (Ref = European*); Age at Year 2; Entrance path (Ref = general admission path); Previous degree (Ref = none)|
| ||Admission score variables||Admission GPA; UMAT Section 1 score; UMAT Section 2 score; UMAT Section 3 score|
Interactions among the independent variables were not measured because the outcome of interest was the predictive power of these variables. Multi-collinearity among the admission GPA and scores on the UMAT sections was measured by the variance inflation factor (VIF). This was < 1.4 for each regression, well below the value of 10 that is regarded as unacceptable. There are no published data on the reliability of the UMAT. Although reliability data are calculated for many of the assessment outcomes that contribute to an overall year result, the reliability of the overall result cannot be determined.
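To put the reported VIF values in context, the standard definition is VIF = 1 / (1 − R2), where R2 here is obtained by regressing one predictor on all the other predictors. The sketch below simply inverts that formula; it uses no study data, and the function names are illustrative.

```python
def vif_from_r2(r2_j):
    """VIF for a predictor whose regression on the other predictors gives r2_j."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    if vif < 1.0:
        raise ValueError("VIF cannot be below 1")
    return 1.0 - 1.0 / vif


# Ceiling reported in the text (VIF < 1.4) and the unacceptable level (VIF = 10).
shared_at_reported_ceiling = r2_from_vif(1.4)
shared_at_unacceptable_level = r2_from_vif(10.0)

print(f"VIF 1.4 -> R2 = {shared_at_reported_ceiling:.3f}")
print(f"VIF 10  -> R2 = {shared_at_unacceptable_level:.3f}")
```

On this reading, a VIF below 1.4 means that less than about 29% of the variance in any one admission measure was explained by the others, compared with 90% at a VIF of 10.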
Data were available for 1346 students. At least one set of final-year outcome data was available for each programme, along with data for between two and five cohorts for the other outcomes. Demographic data, admission GPA and UMAT scores are shown in Table 3. Cohorts did not vary significantly over time (data not shown).
Table 3. Subject characteristics, by programme
|Characteristic||University of Auckland||University of Otago|
|Age at start of Year 2, mean (SD)||20.5 years (2.8)||19.6 years (1.3)|
|Gender||Female 52.7%||Female 54.7%|
|Ethnicity, n (%)*||European†, 286 (51.0%); Māori, 7 (1.2%); Pacific, 8 (1.4%); Asian, 214 (38.0%) (incl. Indian); Other, 43 (7.6%)||European†, 503 (64.2%); Māori, 14 (1.8%); Pacific, 7 (0.9%); Asian, 223 (28.5%); Indian, 40 (5.1%); Other, 39 (5.0%)|
|Graduate entrants, n (%)||93 (16.5%)||50 (6.4%)|
|Rural pathway students, n (%)||96 (17.1%)||135 (17.2%)|
|Admission GPA, mean (SD)||8.2 (0.76) calculated from grades, maximum 9||87.2 (4.6) calculated from percentages, maximum 100|
|UMAT score, mean (SD)||57.8 (6.4)||58.0 (5.9)|
| UMAT Section 1||59.1 (9.6)||59.7 (8.7)|
| UMAT Section 2||56.2 (8.3)||56.8 (9.9)|
| UMAT Section 3||58.1 (9.7)||57.4 (8.2)|
Predictive validity of UMAT score and GPA
The results of the regression model analyses for each programme are shown in Table 4. The net predictive power, or percentage of variance in each of the outcomes explained by the tool(s) above background variance, appears in the last three columns of the table.
Table 4. Predictive ability in the regression models by dependent variable and models for the University of Auckland and University of Otago medical programmes
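|Dependent variable||Regression (L = linear, O = ordinal)||n||R2: background variables||R2: background + GPA||R2: background + UMAT||R2: background + GPA + UMAT||Net predictive power (%): GPA + UMAT||Net predictive power (%): GPA||Net predictive power (%): UMAT|
|University of Auckland medical programme|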
| Year 2 GPA||L||563||0.049||0.387||0.097||0.397||34.8||33.8||4.8|
| Year 3 GPA||L||438||0.060||0.327||0.081||0.333||27.3||26.7||2.1|
| Year 4 Overall grade||O||315||0.071||0.161||0.081||0.164||9.3||9.0||1.0|
| Year 4 Communication skills||O||315||0.096||0.106||0.109||0.115||1.9||1.0||1.3|
| Year 5 Overall grade||O||169||0.177||0.256||0.259||0.322||14.5||7.9||8.2|
| Year 5 Clinical grade||O||165||0.199||0.261||0.231||0.281||8.2||6.2||3.2|
| Year 5 Written examination||L||190||0.109||0.165||0.208||0.244||13.5||5.6||9.9|
| Year 6 Overall grade||O|| 87||0.129||0.129||0.189||0.190||6.1||0.0||6.0|
|University of Otago medical programme|
| Year 2 GPA||L||556||0.024||0.258||0.028||0.263||23.9||23.4||0.4|
| Year 2 Outcomes||O||783||0.035||0.233||0.044||0.240||20.5||19.8||0.9|
| Year 2 SAQ||L||225||0.070||0.261||0.126||0.321||25.1||19.1||5.6|
| Year 3 Outcomes||O||672||0.054||0.174||0.057||0.178||12.4||12.0||0.3|
| Year 3 GPA||L||547||0.058||0.291||0.066||0.303||24.5||23.3||0.8|
| Year 4 Outcomes||O||529||0.113||0.200||0.121||0.208||9.5||8.7||0.8|
| Year 5 Outcomes||O||363||0.125||0.298||0.178||0.354||22.9||17.3||5.3|
| Year 5 Written examination||L||361||0.073||0.272||0.101||0.289||21.6||19.9||2.8|
| Year 5 OSCE||L||362||0.235||0.350||0.247||0.369||13.4||11.5||1.2|
| Year 6 Outcomes||O||220||0.081||0.114||0.123||0.164||8.3||3.3||4.2|
Combining the admission GPA and UMAT scores improved the predictive power of the regression models across all outcome measurements in both programmes. In general, GPA models were more predictive than UMAT score models for all outcome measures, except the Year 6 overall grade in both programmes and some Year 5 outcomes at Auckland. The combined models were at most only 1.5 times more predictive than the GPA models, but up to four times more predictive than the UMAT score models. In both programmes, the combined model was best at predicting outcomes in Years 2 and 5.
The maximum variance explained by any regression model was 35% for the Year 2 GPA at Auckland, by the admission GPA and UMAT score combined; the least variance explained was 0%, for Year 6 outcomes at Auckland, by the GPA alone. At best, the total UMAT score explained 10% of variance in the Year 5 written examination at Auckland. Although the UMAT score had greater predictive power than the GPA for Year 6 outcomes, it explained only 6% of variance at Auckland and 4% at Otago (R² not statistically significant). When results were analysed by scores on the three individual UMAT sections, there was no substantive change to the findings (data not shown). The highest predictive power of an individual UMAT section was seen for Section 1 in Year 5 at Auckland, where it explained 8% of variance in the written examination mark and 7% in the Year 5 overall grade. For all other outcomes, the net predictive power of any UMAT section was ≤ 5%.
This study’s main finding is that when the score on a general cognitive test, the UMAT, is combined with admission GPA for selection purposes, the ability to predict outcomes on all major summative assessments in an undergraduate medical programme is enhanced, but not by much. On outcomes in which general cognitive skills might be expected to come into their own, such as the Year 4 communication skills test or Year 5 clinical examination, the UMAT score does not show a predictive advantage over GPA. As the predictive power of the UMAT score is so low, we cannot draw firm conclusions about the relative predictive power of individual UMAT sections.
The final year (Year 6) is the most closely related to junior doctor practice.23 Neither selection tool predicts performance in this year well, although the UMAT was at least as predictive as admission GPA in both programmes. One explanation for the poor predictive ability of selection tools is that performance in Year 6 is measured using relatively blunt instruments, such as supervisor reports, by contrast with Year 5 assessments. Another is that performance at the end of the programme is likely to reflect the numerous confounding factors that influence students’ abilities as they progress, particularly the curriculum itself.
Another finding is that student background has a predictive association with outcomes of a magnitude similar to that of the UMAT score. Although schools are unlikely to use components of student background, such as age, gender or ethnicity, as selection criteria, this finding underscores the need to consider background factors when evaluating the predictive validity of selection tools.
Like others, we found that prior academic achievement is a moderate predictor of outcomes, but this predictive power drops throughout the programme.3,5,9,27,28 Furthermore, a recent study found UMAT scores to correlate poorly with GPAs over 4 years of university study (2 years in a prior degree and 2 years in medicine),19 but did not evaluate UMAT scores against performance in the final 2 years of the programme, nor in individual assessments. Our data suggest the UMAT has some predictive power for academic outcomes later in a medical programme and that this is of an order of magnitude similar to that of the GPA.
A strength of this study is that it was conducted nationally across two distinct medical programmes. There was consistency in the findings, despite minor differences in students, selection and curricula. Furthermore, the mean UMAT scores (and spread) were similar in the two programmes. This lends support to the claim that these are robust and generalisable findings.
In Australia, most medical schools set a minimum threshold for UMAT scores. With the exception of graduate entrants at Otago, who must score above the 25th centile, no UMAT threshold is imposed in NZ, which yields a potentially wider range of scores upon which to test its predictive ability. It could be argued that as all successful applicants had high GPAs, they would have higher UMAT scores. This is not the case, as the minimum average UMAT score was 31. A future area of study will concern the performance of students with lower UMAT scores and high admission GPAs. Because the GPA contributes the majority of weight to ranking decisions for admission, testing the converse is not possible in this population.
Weaknesses include the fact that outcomes for fewer subjects were evaluable in later years of the programmes, raising the possibility of a type II error. The UMAT may be more predictive at different weightings or when used as a threshold score. Given the low predictive power seen in this study, and the similarity of findings across both universities, we believe it is unlikely that any change in weighting would improve the UMAT’s predictive power significantly for the outcomes we have reported.
Although both medical schools invest considerably in making student assessments authentic and ensuring they are aligned with the domains of medical practice,23,29–31 it is possible that the UMAT predicts outcomes that were not studied. Already UMAT scores have been found not to correlate with emotional intelligence in final-year students.6 If UMAT scores were to predict attrition from a medical programme, the negative impacts on stakeholders of admitting a student who is unlikely to complete medical studies could be reduced. Additionally, the UMAT may be more useful in certain student groups; if so, it may represent a helpful refinement to selection policies. If selection tools are to help predict which candidates will become effective doctors, longer-term follow-up is required, including on eventual specialty and location of practice. As the UMAT has been used in NZ only since 2003, it is not yet possible to measure how it predicts workforce outcomes.
Our study supports the inclusion of a measure of academic achievement in medical student selection.32 As used in the NZ setting, the UMAT score has less predictive ability than the admission GPA, but it does add to the GPA’s predictive power by a small amount, especially later in a medical programme. Given the shortage of reliable tools with which to select medical students, we would favour a strategy in which the validation of the UMAT is continued in various settings and the outcomes on which predictive ability is tested are broadened.
Contributors: PP contributed to the study concept and design, application for ethical approval, and the acquisition, collation and interpretation of data, and led the first and all subsequent drafts and revisions of the manuscript. BS contributed to the analysis and interpretation of data, and the drafting and revision of the manuscript. JR contributed to the study concept and design, the collation and interpretation of data, and the drafting and revision of the manuscript. TW contributed to the study concept and design, application for ethical approval, the acquisition, collation and interpretation of data, and the first and subsequent drafts and revisions of the manuscript. All authors approved the final manuscript for publication.
Acknowledgements: the authors acknowledge the support of Dr John Monigatti, Director of Medical Admissions, Michelle Chung, Student Services Centre, and Ian Wood and Belinda May of the Medical Programme Directorate, University of Auckland, and Melany Rohan, Manager of Health Sciences Admissions, University of Otago.
Conflicts of interest: none.
Ethical approval: this study was approved in 2009 by the University of Auckland Human Participants Ethics Committee and the University of Otago Ethics Committee for Study on Human Subjects.