• Open Access

What is the impact of missing Indigenous status on mortality estimates? An assessment using record linkage in Western Australia


  • Glenn K. Draper,

    1. Epidemiology Branch, Department of Health Western Australia
  • Peter J. Somerford,

    1. Epidemiology Branch, Department of Health Western Australia
  • Albert (Sonny) A. G. Pilkington,

    1. Office of Aboriginal Health, Department of Health Western Australia and Centre for International Health, Curtin University, Western Australia and National Centre for Epidemiology and Population Health, Australian National University, Australian Capital Territory
  • Sandra C. Thompson

    1. Office of Aboriginal Health, Department of Health Western Australia, Centre for International Health, Curtin University, Western Australia

Correspondence to:
Peter Somerford, Epidemiology Branch, Department of Health, Western Australia, 189 Royal Street, East Perth, Western Australia, 6004. Fax: (08) 9222 2396; e-mail: peter.somerford@health.wa.gov.au


Background: The analysis aimed to assess the Indigenous status of an increasing number of deaths not coded with a useable Indigenous status from 1997 to 2002 and its impact on reported recent gains in Indigenous mortality.

Methods: The Indigenous status of WA death records with a missing Indigenous status was determined based upon data linkage to three other data sources (Hospital Morbidity Database System, Mental Health Information System and Midwives Notification System).

Results: Overall, the majority of un-coded cases were assigned an Indigenous status, with 5.9% identified as Indigenous from the M1 series and 7.5% from the M2 series. The significant increase in Indigenous male LE of 5.4 years from 1997 to 2002 decreased to 4.0 and 3.6 years using the M1 and M2 series, respectively, but remained significant. For Indigenous females, the non-significant increase in LE of 1.8 years from 1997 to 2002 decreased to 1.0 and 0.6 years. Furthermore, annual all-cause mortality rates were higher than in the original data for both genders, but the significant decline for males remained.

Conclusion: Through data linkage, the increasing proportion of deaths not coded with a useable Indigenous status was shown to impact on Indigenous mortality statistics in Western Australia leading to an overestimate of improvements in life expectancy. Greater attention needs to be given to better identification and recording of Indigenous identifiers if real improvements in health status are to be demonstrated. A system that captures an individual's Indigenous status once and is reflected in all health and administrative data systems needs consideration within Australia.

It is now well established that Indigenous Australians experience considerably poorer health outcomes than their non-Indigenous counterparts, with Indigenous persons consistently experiencing higher mortality rates and more ill-health.1 While Indigenous Australians continue to fare poorly across the entire health spectrum, significant improvements have been reported over the past decade.

In the 10 years to 2002, Indigenous mortality rates in Western Australia have fallen, infant mortality has declined and there have been significant gains in life expectancy (LE).2 While this is encouraging, the reported decline in Indigenous mortality coincided with a decline in the proportion of deaths appropriately coded with a useable Indigenous identification status. Between 1997 and 2002, the proportion of deaths coded with an appropriate Indigenous status on WA mortality data decreased significantly. If Indigenous persons were over-represented among those deaths not appropriately coded, then the recent improvements seen in Indigenous mortality could be a consequence of this coding anomaly. Misclassification of Indigenous status in mortality data has the potential to distort mortality estimates for the minority Indigenous population although it would have minimal impact on estimates for the remainder of the population. This reflects the relative size of the two populations given that Indigenous people represented only 3.8% of the West Australian population and 2.5% of the Australian population in 2006.3

Health-related epidemiological and statistical information provides evidence on which to base health policy and programs, and access to accurate and reliable data is critical to improve the health of populations, particularly minorities and vulnerable populations. Without accurate data there is little capacity to monitor changes in health status, evaluate access to services and the response of services to address needs or to quantify the resources expended on health services and programs.

This study assesses the effects of the incomplete identification of Indigenous status in the mortality records. The unique data linkage record system maintained in WA was utilised to link records on individuals across a variety of datasets to ascertain the Indigenous status of those cases where status was unknown.


Data sources

All data analysed in this study are maintained by the Western Australian Department of Health (DOH). The primary data source was data on mortality in WA for the years 1997-2002 provided to the Epidemiology Branch of DOH from the Office of the Registrar General for Births, Deaths and Marriages. The WA Registrar General (RG) receives information pertaining to all deaths that occur within the State. The dataset was analysed for Indigenous status based upon the Indigenous identification field in the dataset for deaths of WA residents whose deaths were registered in WA (Table 1).

Table 1.  Number of deaths in the WA RG Mortality data by Year and Indigenous status, 1997-2002.
Indigenous status                            1997   1998   1999   2000   2001   2002   1997-2002   %
% Indigenous status unknown of all deaths    0.6%   1.3%   1.9%   3.2%   2.8%   6.6%   2.8%

Cases where Indigenous status was unknown were identified and matched to records of other administrative datasets through the WA Data Linkage System (WADLS). The WADLS consists of links within and between the State's core population health datasets augmented through links to an extensive collection of external research and clinical datasets.4 Using a unique record linkage number, all the available records for the individuals were extracted from three datasets: the Hospital Morbidity Dataset System (HMDS) for the years 1970 to 2002; the Mental Health Information System (MHIS) from 1966 to 2002 and the Midwives Notification System (MNS) from 1970 to 2002.

Algorithms for defining Indigenous status

Two algorithms were used to determine Indigenous status from the linked external data sources.

The first method (M1) involved the use of multiple records. Initially, an Indigenous status was derived for each individual within each dataset. As each dataset may contain multiple entries for an individual, it was possible for an individual to be identified with a different Indigenous status in different records. Therefore, the number of times each Indigenous status value was recorded was counted for each individual. The records for each individual were then aggregated, with Indigenous status based upon the count of how many times the individual was coded as Indigenous, non-Indigenous or unknown (Table 2). If the counts for both Indigenous and non-Indigenous were unknown, the Indigenous status was coded as unknown. If the count for non-Indigenous was greater than or equal to the count for Indigenous, the Indigenous status was coded as non-Indigenous. Indigenous status was coded as Indigenous if the count for non-Indigenous was less than the count for Indigenous.

Table 2.  Example of algorithm used in method M1 for a single individual with multiple separations in either the HMDS or MHIS datasets.
  1. 0 = Missing; 1 = Yes (Identified);

  2. In the above example, the individual would be considered Indigenous

Total Separations31
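The within-dataset step of method M1 can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the status labels ("I", "NI", "U"), the function name and the example records are hypothetical, and "both counts unknown" is interpreted here as neither an Indigenous nor a non-Indigenous value appearing in any record.

```python
from collections import Counter

# Hypothetical labels for the three possible values on a single record.
INDIGENOUS, NON_INDIGENOUS, UNKNOWN = "I", "NI", "U"


def status_within_dataset(record_statuses):
    """Summarise one individual's records from a single dataset (M1, step 1)."""
    counts = Counter(record_statuses)
    n_ind = counts[INDIGENOUS]
    n_non = counts[NON_INDIGENOUS]
    if n_ind == 0 and n_non == 0:
        # Assumed reading of "both counts unknown": no usable value recorded.
        return UNKNOWN
    if n_non >= n_ind:
        # Ties go to non-Indigenous, as stated in the text.
        return NON_INDIGENOUS
    return INDIGENOUS


# Hypothetical individual with four separations, three coded as Indigenous.
print(status_within_dataset(["I", "I", "NI", "I"]))
```

Resolving ties in favour of non-Indigenous is what makes this step conservative with respect to the number of Indigenous deaths.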

For babies born in WA, only the Indigenous status of the mother is captured in the Midwives Notification System, and this is attributed to the baby. Thus, in this study, a baby born to an Indigenous mother was coded as Indigenous, while a baby born to a non-Indigenous mother was coded as unknown.

Indigenous status was then aggregated across the three datasets by counting the occurrence of the different Indigenous status values, resulting in a final Indigenous variable and a final non-Indigenous variable, each with a possible score of 0 to 3. For example, if a particular case was coded as Indigenous on the HMDS, the MHIS and the MNS, the score for the new Indigenous variable would be three, while the score for the non-Indigenous variable would be zero. The following algorithm was then applied to derive the final Indigenous status (Table 3).

Table 3.  Algorithm used for allocation of Indigenous status for some possible combinations in method M1.
  1. Notes:   NI = Non-Indigenous     Potential Count = 0-3

  2.   I = Indigenous  U = Unknown

  • If non-Indigenous is greater than Indigenous, status = non-Indigenous

  • If non-Indigenous is less than Indigenous, status = Indigenous

  • If non-Indigenous = Indigenous, status = non-Indigenous

  • If non-Indigenous is missing or unknown and Indigenous is missing or unknown, status = unknown
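The allocation rules listed above can be expressed as a short function. The sketch below is illustrative only: the function name, the dictionary layout and the "I"/"NI"/"U" labels are hypothetical, and the per-dataset statuses are assumed to have already been derived as described in the text.

```python
def final_status_m1(per_dataset_status):
    """Combine per-dataset statuses (e.g. HMDS, MHIS, MNS) into one M1 status.

    per_dataset_status maps a dataset name to "I", "NI" or "U"; a dataset
    with no linked record can be omitted or given "U".
    """
    values = list(per_dataset_status.values())
    indigenous_score = values.count("I")       # possible score 0 to 3
    non_indigenous_score = values.count("NI")  # possible score 0 to 3
    if indigenous_score == 0 and non_indigenous_score == 0:
        return "U"
    if non_indigenous_score >= indigenous_score:
        # Covers both "greater than" and the tie rule.
        return "NI"
    return "I"


# Hypothetical case: Indigenous on two datasets, non-Indigenous on one.
print(final_status_m1({"HMDS": "I", "MHIS": "NI", "MNS": "I"}))
```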

The approach (M1) described is conservative as it accounts for any incorrect coding of Indigenous status on a single record and summarises the Indigenous status from multiple records. This methodology is most likely to underestimate the number of Indigenous deaths.

A second method (M2) identified an individual as Indigenous if any record for that individual in the external data sources was coded as Indigenous. While this method is simpler than the first, it may overestimate the number of Indigenous deaths based on a single Indigenous misclassification.
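A minimal sketch of the M2 rule follows, for contrast with the count-based approach. The function name and labels are hypothetical. The paper states only that any Indigenous record yields an Indigenous status; coding the remaining cases as non-Indigenous when at least one non-Indigenous record exists is an assumption made here for illustration.

```python
def status_m2(record_statuses):
    """Return "I" if any linked record is coded Indigenous (method M2)."""
    statuses = list(record_statuses)
    if "I" in statuses:
        return "I"
    if "NI" in statuses:
        # Assumption: with no Indigenous record, any non-Indigenous
        # record is enough to code the case as non-Indigenous.
        return "NI"
    return "U"


# Hypothetical individual: one Indigenous record among several others.
print(status_m2(["NI", "NI", "NI", "I"]))
```

A single Indigenous record is decisive under this rule, which is why one misclassified record can shift a case to Indigenous.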

Matching the new dataset to the original mortality file

Each of the two new Indigenous variables was added to the original RG mortality data file and coded as either non-Indigenous, Indigenous or unknown. This enabled a comparison between three data series: the Indigenous status in the original data and the Indigenous status derived from each of the two methods using data linkage.

Life expectancy calculations

Abridged life tables based on five-year age groups were constructed to calculate life expectancies using three-year rolling death and population data. Indigenous population estimates were calculated by the Epidemiology Branch, Department of Health WA, and were based on 1996 and 2001 census data provided by the Australian Bureau of Statistics. The estimation of the 2002 population was derived by using the standard cohort progression method under the assumption of no net migration. Non-Indigenous population estimates were derived by subtracting the Indigenous population estimates from the total population estimates. The 95% confidence intervals (CIs) for the LE estimates were derived using @RISK software5 to produce a distribution of life expectancies from Monte Carlo simulation by applying a Poisson distribution to the number of deaths for each five-year age group.
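The general approach can be illustrated with a short Python sketch. It is not the @RISK model used in the study. The life table below assumes that deaths fall, on average, halfway through each closed age interval and that the final age group is open-ended; the function names, the percentile method and the rule that forces at least one simulated death in the open-ended group are all assumptions made for illustration.

```python
import random


def life_expectancy_at_birth(deaths, population, width=5.0, a=0.5):
    """Life expectancy at birth from an abridged life table (radix 1)."""
    if len(deaths) != len(population):
        raise ValueError("deaths and population must have the same length")
    survivors = 1.0      # proportion alive at the start of the interval
    person_years = 0.0   # running total of person-years lived
    last = len(deaths) - 1
    for i, (d, p) in enumerate(zip(deaths, population)):
        m = d / p  # age-specific death rate
        if i == last:
            if m <= 0:
                raise ValueError("open-ended age group needs a death")
            person_years += survivors / m
        else:
            q = min(1.0, width * m / (1.0 + width * (1.0 - a) * m))
            dying = survivors * q
            person_years += width * (survivors - dying) + a * width * dying
            survivors -= dying
    return person_years


def poisson_draw(mean, rng):
    """Poisson draw: count unit-rate exponential arrivals before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_ci(deaths, population, n_sims=2000, seed=1):
    """Approximate 95% interval for e0 by resampling deaths as Poisson."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        draw = [poisson_draw(d, rng) for d in deaths]
        draw[-1] = max(draw[-1], 1)  # assumption: keep last interval defined
        sims.append(life_expectancy_at_birth(draw, population))
    sims.sort()
    return sims[int(0.025 * n_sims)], sims[int(0.975 * n_sims) - 1]


# Hypothetical three-group example (two closed groups, one open-ended).
toy_deaths, toy_population = [5, 2, 30], [1000, 1000, 500]
print(life_expectancy_at_birth(toy_deaths, toy_population))
print(simulate_ci(toy_deaths, toy_population, n_sims=500))
```

With few deaths in an age group, the Poisson draws vary widely from one simulation to the next, which is consistent with the wide intervals reported for the Indigenous population.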

Mortality rates

Mortality rates were directly age-standardised to the total Australian population as at June 2001 using five-year age groups. All mortality rates were based on three-year rolling averages. The trends in mortality rates were assessed using Poisson regression. Standardised rate ratios were calculated to compare the Indigenous death rate to the Non-Indigenous death rate on a three-year rolling average. The standardised rate ratios were the ratio of the observed Indigenous deaths and the expected Indigenous deaths derived from applying the age and sex specific death rates of the Non-Indigenous population to the Indigenous population.
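The two summary measures described above can be sketched as follows. The function names and the toy figures are hypothetical. The sketch handles a single sex; the sex-specific expected deaths described in the text would be obtained by applying it to each sex and summing.

```python
def direct_asr(deaths, population, standard_population, per=1000.0):
    """Directly age-standardised rate per `per` persons.

    Each argument is a list with one entry per age group, in the same order.
    """
    weighted = sum(
        (d / p) * s
        for d, p, s in zip(deaths, population, standard_population)
    )
    return per * weighted / sum(standard_population)


def standardised_rate_ratio(obs_deaths, obs_population,
                            ref_deaths, ref_population):
    """Observed deaths divided by the deaths expected at reference rates."""
    expected = sum(
        (rd / rp) * op
        for rd, rp, op in zip(ref_deaths, ref_population, obs_population)
    )
    return sum(obs_deaths) / expected


# Toy data for two age groups (illustrative values only).
print(direct_asr([1, 2], [100, 100], [1, 3]))
print(standardised_rate_ratio([2, 6], [100, 100], [10, 20], [1000, 1000]))
```

In the study's terms, the observed group is the Indigenous population and the reference rates are those of the non-Indigenous population.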


Between 1997 and 2002, the proportion of deaths recorded by the WA RG with Indigenous status classified as unknown increased steadily from 0.6% in 1997 to 6.6% in 2002 by which time the number of records with Indigenous status classified as unknown was twice that of deaths classified as Indigenous. There were a total of 1,784 cases coded as unknown in the WA RG dataset for the period 1997-2002 (Table 1).

After adding the Indigenous identification data from the HMDS, MNS and the MHIS, 95% (n=1,696) of previously unidentified cases were identified as either Indigenous or non-Indigenous using multiple records (M1), while identification was marginally increased to 96% (n = 1,707) by using a single identification (M2). The remaining cases could not be matched in any of the datasets (Table 4). Based on a single identification to determine Indigenous status (M2), 7.5% (n=134) of previously unidentified cases were identified as Indigenous compared to 5.9% (n=106) when using multiple records (M1).
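The reported percentages can be checked against the 1,784 deaths with unknown status. The snippet below is only an arithmetic check; the function and key names are illustrative.

```python
TOTAL_UNKNOWN = 1784  # deaths with unknown Indigenous status, 1997-2002


def share_of_unknown(count, total=TOTAL_UNKNOWN):
    """Percentage of previously unknown cases, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


reported_counts = {
    "status assigned, M1": 1696,
    "status assigned, M2": 1707,
    "identified as Indigenous, M1": 106,
    "identified as Indigenous, M2": 134,
}
for label, count in reported_counts.items():
    print(label, share_of_unknown(count))
```

To one decimal place these are 95.1%, 95.7%, 5.9% and 7.5%, in line with the rounded figures in the text.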

Table 4.  Number of deaths with Indigenous status identified from linked data 1997-2002.
 Method of Indigenous identification
 Single (M2)    Multiple (M1)
No match    77 (4.3%)    77 (4.3%)

The HMDS provided a match to 95.1% of the unknown cases, compared to 6.0% from the MNS and 27.0% from the MHIS. The HMDS provided multiple hospitalisation records for 94.5% of deaths matched to this data source and 4.4% of those deaths with multiple matches to the HMDS records provided discrepant Indigenous status values for the same individual.

The impact of improved Indigenous identification on life expectancy

There was a significant increase in life expectancy for males between 1997 and 2002, regardless of the method used to determine Indigenous identification. Using the original Indigenous status that was supplied with the WA RG mortality dataset, Indigenous male LE in WA in 2002 was 67.2 (95% CI: 65.3 to 70.5) years, an increase of 5.4 years from 61.8 (95% CI: 60.2 to 63.6) years in 1997. Applying the Indigenous status derived from multiple records (M1) resulted in a decline in LE estimates of Indigenous males, with the LE increase from 1997 to 2002 reduced to 4.0 years and LE in 2002 reduced to 65.5 (95% CI: 63.8 to 68.1) years. The reduction in the LE estimates from the original was greater for Indigenous males when Indigenous identification was based upon the identification in any record (M2). The LE in 2002 was estimated at 65.0 (95% CI: 63.3 to 67.3) years, with the increase from 1997 to 2002 reduced to 3.6 years (Table 5).

Table 5.  Male life expectancy at birth in WA by Indigenous status and method of Indigenous identification.
  1. Notes:

  2. a) Life expectancy at birth    b) 95% Confidence interval of life expectancy

 95% CI(b)   (77.0-77.5)  (60.2-63.6)  (13.4-17.3)  (76.9-77.4)  (60.0-63.3)  (13.6-17.4)  (76.9-77.4)  (59.9-63.2)  (13.7-17.5)
 95% CI      (77.3-77.6)  (61.3-64.3)  (13.0-16.3)  (77.1-77.5)  (61.0-64.0)  (13.1-16.5)  (77.1-77.5)  (61.0-63.9)  (13.2-16.5)
 95% CI      (77.9-78.3)  (61.8-64.8)  (13.1-16.5)  (77.6-78.0)  (61.2-64.1)  (13.5-16.8)  (77.6-78.0)  (61.1-63.9)  (13.7-16.9)
 95% CI      (78.4-78.8)  (63.0-66.5)  (11.9-15.8)  (78.1-78.4)  (62.4-65.9)  (12.2-16.0)  (78.1-78.4)  (62.3-65.8)  (12.3-16.1)
 95% CI      (79.1-79.5)  (64.4-68.0)  (11.1-15.1)  (78.6-79.0)  (63.1-66.3)  (12.3-15.9)  (78.6-79.0)  (62.8-65.8)  (12.8-16.2)
 95% CI      (79.2-79.7)  (65.3-70.5)  (8.7-14.4)   (78.7-79.1)  (63.8-68.1)  (10.6-15.3)  (78.7-79.1)  (63.3-67.3)  (11.4-16.2)
Difference 1997-2002   2.2  5.4        1.8  4.0        1.8  3.6

For Indigenous females, there was no significant increase in life expectancy across the six years. In the original WA RG dataset, the LE had increased over this time from 69.2 (95% CI: 67.5 to 71.4) years to 71.0 (95% CI: 69.2 to 73.5) years, an increase of 1.8 years. The LE estimates for Indigenous females decreased when applying the Indigenous status derived from multiple records (M1). In 1997 the LE in M1 was lower than in the original data at 69.0 (95% CI: 67.3 to 71.2) years and increased by 1.0 year to 70.0 (95% CI: 68.3 to 72.5) years in 2002. For females in particular, a large proportion of the apparent LE increase that occurred across the years 1997-2002 reflected the increasing proportion of deaths for which Indigenous status was unknown, as LE increased by only 0.6 years when using Indigenous status based on a single identification (Table 6).

Table 6.  Female life expectancy at birth in WA by Indigenous status and method of Indigenous identification.
  1. Notes:

  2. a) Life expectancy at birth    b) 95% Confidence interval of life expectancy

 95% CI(b)   (82.8-83.2)  (67.5-71.4)  (11.4-15.7)  (82.7-83.1)  (67.3-71.2)  (11.5-15.8)  (82.7-83.1)  (67.2-71.0)  (11.7-15.9)
 95% CI      (83.0-83.4)  (68.4-71.6)  (11.4-15.0)  (82.9-83.3)  (68.2-71.3)  (11.6-15.1)  (82.9-83.3)  (68.1-71.1)  (11.8-15.2)
 95% CI      (83.7-84.1)  (68.7-71.7)  (12.0-15.4)  (83.4-83.8)  (68.4-71.5)  (11.9-15.4)  (83.4-83.8)  (68.2-71.2)  (12.2-15.6)
 95% CI      (84.0-84.4)  (69.4-73.0)  (11.0-15.0)  (83.7-84.1)  (68.9-72.3)  (11.4-15.2)  (83.7-84.1)  (68.7-72.0)  (11.7-15.4)
 95% CI      (84.3-84.7)  (69.0-72.4)  (11.9-15.7)  (83.9-84.3)  (68.2-71.6)  (12.3-16.1)  (83.9-84.3)  (67.8-71.0)  (12.9-16.5)
 95% CI      (84.2-84.7)  (69.2-73.5)  (10.7-15.5)  (83.8-84.2)  (68.3-72.5)  (11.3-15.9)  (83.8-84.2)  (67.8-71.7)  (12.1-16.5)
Difference 1997-2002   1.5  1.8        1.1  1.0        1.1  0.6

Based on the 95% CIs, the increase in LE for males was significant for all series, but there was no statistically significant difference between the series for each annual LE estimate. For Indigenous females, there was no statistically significant difference from 1997 to 2002 or between the series for each annual LE estimate. In 2002, LE did not differ significantly between Indigenous males and females when the original identifier was used. However, using the M1 and M2 methods, the LE of Indigenous females was significantly higher than that of males (2.7 and 2.6 years respectively), compared with a 5.1 year difference between the sexes for non-Indigenous people.

For non-Indigenous people, LE increased significantly among both males and females from 1997 to 2002, regardless of the Indigenous identifier data series used to calculate LE. For both sexes, there was no difference between the non-Indigenous LE estimates derived from each method of identifying Indigenous status from linked data (Tables 5 and 6).

Improved Indigenous identification also affected the magnitude of the difference between Indigenous and non-Indigenous LE. Although from 1997 to 2002 the LE gap among males reduced, increased Indigenous identification resulted in the gap being 1.1 years more for M1 data series and 1.6 years higher for M2 data series compared to that calculated with the original identifier. For females the difference between Indigenous and non-Indigenous LE remained constant in the original data from 1997 to 2002, but was increased by 0.5 years by M1 data series and 1.1 years by M2 data series in 2002 (Tables 5 and 6).

The impact of Indigenous identification on all-cause mortality

Based on the original Indigenous status, the Indigenous male all-cause mortality rate decreased by an average of 5.8% per year between 1997 and 2002, and in 2002 was 14.2 (95% CI: 11.6 to 16.7) per 1,000 persons. When the Indigenous status derived from the M1 data series was applied, the mortality rate in 2002 was higher at 15.3 (95% CI: 12.6 to 18.0) deaths per 1,000 persons and the rate of decrease was reduced to 4.4% per year. Applying the Indigenous status based on the M2 data series further increased the mortality rate to 15.8 (95% CI: 13.1 to 18.6) per 1,000 and reduced the rate of decrease to 4.0% annually (Table 7). The decrease in mortality rates for males remained significant regardless of the data series applied.

Table 7.  Male all-cause mortality rate(a) in WA by Indigenous status and method of Indigenous identification, 1997-2002.
 Males   Non-Indigenous  Indigenous  Rate ratio(d)  Non-Indigenous  Indigenous  Rate ratio  Non-Indigenous  Indigenous  Rate ratio
  1. Notes:  a) Rate per 1,000 persons    b) Age standardised rate

  2.    c) 95% Confidence Interval of ASR    d) Standardised rate ratio

 95% CI(c)   (8.5-9.0)  (15.1-20.9)  (2.5-3.3)  (8.6-9.0)  (15.3-21.2)  (2.6-3.4)  (8.6-9.0)  (15.4-21.3)  (2.6-3.4)
 95% CI      (8.4-8.8)  (14.3-20.1)  (2.8-3.6)  (8.5-8.9)  (14.5-20.3)  (2.8-3.7)  (8.5-8.9)  (14.6-20.4)  (2.9-3.7)
 95% CI      (8.0-8.4)  (14.3-20.2)  (2.3-3.0)  (8.1-8.5)  (14.8-20.7)  (2.4-3.1)  (8.1-8.5)  (14.9-20.9)  (2.4-3.1)
 95% CI      (7.6-8.1)  (13.6-19.3)  (2.8-3.6)  (7.8-8.2)  (14.0-19.7)  (2.8-3.7)  (7.8-8.2)  (14.1-19.8)  (2.9-3.7)
 95% CI      (7.2-7.6)  (12.5-18.0)  (2.5-3.3)  (7.5-7.9)  (13.6-19.2)  (2.5-3.3)  (7.5-7.9)  (13.9-19.6)  (2.6-3.3)
 95% CI      (7.1-7.5)  (11.6-16.7)  (2.1-2.8)  (7.5-7.9)  (12.6-18.0)  (2.3-3.1)  (7.4-7.8)  (13.1-18.6)  (2.4-3.2)
Trend 1997-2002   P<0.0001  P=0.0006        P<0.0001  P=0.0046        P<0.0001  P=0.0089

There was no significant trend for the female Indigenous all-cause mortality from 1997 to 2002 for any series. Based on the original Indigenous status, the female Indigenous rate in 2002 was 12.0 (95% CI: 9.6 to 14.4) per 1,000. The mortality rate in 2002 using the Indigenous status derived from multiple records (M1) was 12.6 (95% CI: 10.5 to 15.0) per 1,000, while the mortality rate in 2002 based on a single identification (M2) was 13.2 (95% CI: 10.7 to 15.7) per 1,000 persons (Table 8).

Table 8.  Female all-cause mortality rate(a) in WA by Indigenous status and method of Indigenous identification, 1997-2002.
   Non-Indigenous  Indigenous  Rate ratio(d)  Non-Indigenous  Indigenous  Rate ratio  Non-Indigenous  Indigenous  Rate ratio
  1. Notes:  a) Rate per 1,000 persons    b) Age standardised rate

  2.    c) 95% Confidence Interval of ASR    d) Standardised rate ratio

 95% CI(c)   (5.3-5.6)  (10.5-15.7)  (2.9-4.0)  (5.4-5.7)  (10.6-15.9)  (2.9-3.9)  (5.4-5.7)  (10.8-16.1)  (2.9-3.9)
 95% CI      (5.3-5.6)  (10.1-15.2)  (2.7-3.8)  (5.3-5.6)  (10.3-15.4)  (2.7-3.7)  (5.3-5.6)  (10.4-15.6)  (2.7-3.8)
 95% CI      (5.0-5.2)  (10.0-15.0)  (2.5-3.5)  (5.1-5.3)  (10.1-15.2)  (2.5-3.5)  (5.1-5.3)  (10.3-15.4)  (2.5-3.5)
 95% CI      (4.8-5.1)  (9.4-14.2)   (3.1-4.3)  (4.9-5.2)  (9.7-14.5)   (3.1-4.3)  (4.9-5.2)  (9.9-14.8)   (3.2-4.3)
 95% CI      (4.7-4.9)  (9.8-14.7)   (2.5-3.5)  (4.9-5.1)  (10.3-15.1)  (2.6-3.6)  (4.9-5.1)  (10.7-15.7)  (2.7-3.7)
 95% CI      (4.7-5.0)  (9.6-14.4)   (3.0-4.1)  (4.9-5.2)  (10.2-15.0)  (3.1-4.1)  (4.9-5.2)  (10.7-15.7)  (3.2-4.2)
Trend 1997-2002   P<0.0001  P=0.302        P<0.0001  P=0.707        P<0.0001  P=0.913

The significant decreasing trend in mortality rate among non-Indigenous males and females remained regardless of the data series applied. The standardised death rate ratio of Indigenous to non-Indigenous remained above 2.0 for both males and females, providing no evidence of a narrowing of the relative gap in mortality between Indigenous and non-Indigenous people (Tables 7 and 8).


Health-related epidemiological and statistical information provides a basis for evidence and health policy. Accurate and reliable data are critical to improving the health of populations, particularly the health of high-risk groups such as the Indigenous population. This study identified that in WA by 2002 more death records were missing an Indigenous status than were recorded as Indigenous. To address this issue, the paper demonstrated that in WA it is possible to estimate the Indigenous status of previously unknown cases recorded in the WA RG mortality dataset through the use of data linkage to other administrative datasets. Algorithms were developed to derive a single Indigenous identifier, which enabled Indigenous status to be determined for the majority of previously un-coded cases in the WA RG dataset.

After establishing the Indigenous status of these cases, all-cause mortality rates were higher, while LE was correspondingly lower. Given the small size of the Indigenous population in WA and the small numbers of deaths per year, confidence intervals for the estimates are wide. Nevertheless, the point estimates indicate that the increase in LE from 1997 to 2002 was smaller under the conservative series derived from multiple health records (M1) than in the original data, by 1.4 years for males and 0.8 years for females, a difference that is clearly of clinical and social significance. Over the same period the LE among Australian males increased by 1.8 years, while the LE among Australian females increased by 1.3 years.6 For the less conservative method, where a person had ever been recorded as Indigenous (M2), the difference in estimates was greater by 0.4 years for both sexes. This clearly demonstrates the impact that un-coded Indigenous status can have on LE calculations for Indigenous people in WA, although the errors in misclassification should not obscure the smaller but real improvements that seem to have occurred for Indigenous males between 1997 and 2002. However, the data show that the mortality of Indigenous women has not improved.
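The differences quoted in this paragraph follow directly from the LE gains reported in Tables 5 and 6. The short check below reproduces the arithmetic; the dictionary layout and function name are illustrative only.

```python
# Reported gains in Indigenous LE, 1997 to 2002, in years (Tables 5 and 6).
LE_GAINS = {
    "males": {"original": 5.4, "M1": 4.0, "M2": 3.6},
    "females": {"original": 1.8, "M1": 1.0, "M2": 0.6},
}


def overstatement(gains):
    """Years by which the original series exceeds each linked series."""
    by_m1 = round(gains["original"] - gains["M1"], 1)
    by_m2 = round(gains["original"] - gains["M2"], 1)
    return {
        "vs M1": by_m1,
        "vs M2": by_m2,
        "extra under M2": round(by_m2 - by_m1, 1),
    }


for sex, gains in LE_GAINS.items():
    print(sex, overstatement(gains))
```

For males this gives 1.4 years (M1) and 1.8 years (M2), and for females 0.8 and 1.2 years, so the M2 difference is 0.4 years larger for both sexes.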

There are limitations to this study. There could be no ‘validation’ of the final Indigenous status, but an attempt has been made to present two series of estimates based on a conservative method (M1) of improving Indigenous identification and a less conservative method (M2). The LE estimates derived from the two series provide a range for the Indigenous LE. Another difficulty is the small size of the Aboriginal population, which results in wide confidence intervals around estimates, so that small improvements in mortality statistics over a short time frame may not be statistically significant.

For most health administrative data collections, Indigenous status is based upon self-identification, as it is impractical to examine the issues of Indigenous descent and acceptance as an Indigenous person within the Indigenous community in the circumstances in which data collection occurs. There are much more rigorous approaches to ascertainment of Indigenous status in other settings, for example, related to native title claims or land rights and access to mining royalties. It would be feasible to examine the accuracy of the final status if data were subjected to processes used in other settings. The allocation of Indigenous status based upon algorithms using information from a number of administrative datasets assumes that Indigenous status is likely to be correctly identified in those datasets.

A previous study that examined the accuracy of Indigenous status in hospital admissions found the level of accuracy in recording of Indigenous status for non-Indigenous people (state overall 99.5%, range from 98.9% to 99.7% across regions of the state) was consistently higher than for Indigenous people.7 The study showed the number of Indigenous people recorded in hospital inpatient data was an understatement of the number of Indigenous people admitted as patients. The level of completeness varied between health regions from 78.3% in the metropolitan health regions to 93.9% in Pilbara/Kimberley Health Region. It can therefore be assumed that even our less conservative data series (M2) is more likely to underestimate than overestimate the number of Indigenous people. For the midwives data, the Indigenous status of the mother only is recorded and is considered to be reasonably accurate by data custodians because of the contact and visits by extended family that often occur post-confinement (V. Gee, 2006, personal communication).

The accuracy of data regarding Indigenous people depends in part on both their willingness to identify themselves as Indigenous and the opportunities provided to do so, and there are known variations in this important data collection process across the health and geographical regions of WA. Australia is a multicultural society of which Indigenous Australians are a part, and there are many interracial and intercultural relationships and marriages. The issues of descent and Indigenous identification are, for some Indigenous people, blurred; the Inquiry into the removal of Indigenous children known as the Stolen Generation, reported in Bringing them Home, revealed the extent to which many Indigenous people had been denied knowledge of their Indigenous heritage.8 This has been seen in the increasing number of people who have identified as Aboriginal and Torres Strait Islander over the past few censuses.9 Individuals who are not clearly identifiable as Indigenous in appearance may not be asked whether they self-identify as an Indigenous person, and some Indigenous people, given the racism to which Indigenous people have been subjected,10 may prefer not to self-identify as Indigenous in some or many circumstances. Moreover, the issue of Indigenous identity is very complex.11

This paper has focused on missing Indigenous status in death records. The problem of incorrectly coded Indigenous status has not been directly addressed in this paper, but data linkage also offers a means for assessing the magnitude of this issue. The uncertainty of Indigenous population estimates derived from Census data is another potential source of error in measures of mortality, and individuals may report Indigenous status differently in the Census than elsewhere, resulting in a numerator-denominator mismatch. A recent report from the ABS linking death data to Census data shows this mismatch is minor in WA.12

The increase in deaths between 1997 and 2002 that did not have an Indigenous identifier was found to be a result of a change in the computer system within the WA RG office and less intensive efforts to follow up missing data (Canard, C 2006. personal communication). Such limitations in the quality and availability of data compromise the ability to assess changes in Indigenous mortality over time, both in absolute terms, and relative to the rest of the WA population.

While data linkage can be used to improve Indigenous identification in death records, a more systematic approach to improvements in data collection methods is needed for data concerning Indigenous people. Assessments of Indigenous identifiers in datasets need to be conducted regularly to inform the development of processes to ensure data quality. Arguably, for the purposes of health information, a more fundamental reform is needed, where a person's Indigenous identity is ‘chosen’ and consistently fixed across health and administrative data collections.

It is important to note the effect that poorer identification of Indigenous status has on calculation of life expectancy and mortality rates since both were negatively impacted by better Indigenous classification. Yet many analyses report on Indigenous health indices, without adequate attention to whether the Indigenous status field is accurate. If adjusted for under-ascertainment of Indigenous status, the apparent improvements in Indigenous LE may be less impressive than has been reported. Consideration of Indigenous identifiers is vital given the 2008 Australian Government commitment to Close the Gap in Aboriginal and Torres Strait Islander life expectancy that needs to reflect real improvements in health status and not be an artifact of imprecise data collection processes.


We gratefully acknowledge advice in relation to the project design and data access provided by Scott Cameron, Caron Molster and Kathy Crouchley. S Pilkington contributed to this study while a student on the Master of Applied Epidemiology program at ANU, a program funded by the Department of Health and Ageing and whose field placement was supported by the Office of Aboriginal Health in the Western Australian Department of Health and Curtin University.