Agreement between ethnicity recorded in two New Zealand health databases: effects of discordance on cardiovascular outcome measures (PREDICT CVD3)


Correspondence to: Dr Roger J. Marshall, School of Population Health, University of Auckland, Private Bag 92019, Auckland, New Zealand. Fax: +64 9 373 7503; e-mail:


Objectives: To assess agreement between ethnicity as recorded in two independent New Zealand databases, PREDICT and the National Health Index (NHI), and to assess how sensitive ethnic-specific measures of health outcomes are to the choice of ethnicity record.

Method: Patients assessed using PREDICT formed the study cohort. Ethnicity was recorded in PREDICT, and the associated NHI ethnicity code was identified by merge-match linking on an encrypted NHI number. Agreement between the two ethnicity measures was assessed using kappa scores and scaled rectangle diagrams.
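
To make the agreement statistic concrete, the following is a minimal sketch of an unweighted Cohen's kappa computed from two paired lists of ethnicity labels. It is an illustration only, not the study's analysis code: the function name `cohen_kappa`, the variable names, and the six example records are assumptions made up for this sketch, and the abstract does not state which kappa variant was used.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two paired lists of category labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label lists must not be empty")

    # Observed agreement: proportion of records given the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over categories of the product of the two
    # marginal proportions.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical paired records (invented for illustration, not study data).
predict_ethnicity = ["European", "European", "Māori", "Pacific", "Māori", "European"]
nhi_ethnicity = ["European", "Māori", "Māori", "Pacific", "Māori", "European"]

kappa = cohen_kappa(predict_ethnicity, nhi_ethnicity)
```

In this toy example five of six records agree (observed agreement 5/6) and chance agreement is 13/36, so kappa equals 17/23, or about 0.74. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.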

Results: A cohort of 18,239 individuals was linked in both the PREDICT and NHI databases. The agreement between ethnicity classifications was reasonably good, with an overall kappa coefficient of 0.82. Agreement was better for women than for men, and it improved with age and with the length of time the PREDICT system had been operational. Ethnic-specific cardiovascular disease (CVD) hospital admission rates were sensitive to whether ethnicity was coded by NHI or PREDICT; rate ratios for ethnic groups, relative to European, based on PREDICT were attenuated towards the null relative to those based on the NHI classification.

Conclusions: Agreement between the ethnicity records was moderately good. The discordances that exist do not substantially affect prevalence-based measures of effect; however, they do affect measures of CVD hospital admission.

Implications: Different categorisations of ethnicity data from routine (and other) databases can lead to different ethnic-specific estimates of epidemiological effects. There is an imperative to record ethnicity in a rational, systematic and consistent way.