Hematologic malignancies in South Africa 2000–2006: analysis of data reported to the National Cancer Registry

Abstract
Little is known about the incidence patterns of hematologic malignancies in Sub‐Saharan Africa, including South Africa. We estimated incidence rates of pathology‐confirmed adult cases of leukemia, myeloma and related diseases (myeloma), Hodgkin lymphoma (HL), and non‐Hodgkin lymphoma (NHL) reported to the National Cancer Registry of South Africa (NCR) between 2000 and 2006, by age, gender, and population group (Black, White, Coloured, Asian/Indian). Gender‐specific age‐standardized rates were calculated overall and by population group, and incidence rate ratios (IRRs) were estimated using Poisson regression models. Between 2000 and 2006, there were 14662 cases of leukemia, myeloma, HL, and NHL reported to the registry. Incidence rates of reported hematologic malignancies were generally 20–50% higher among males than females. Our analyses suggested marked differences in the rates of reported hematologic malignancies by population group, which were most pronounced when comparing the White versus Black population groups (IRRs ranging from 1.6 for myeloma to 3.8 for HL for males and females combined). Challenges related to diagnosis and reporting of cancers may play a role in the patterns observed by population group, while the set‐up of the NCR (pathology‐based) could lead to some degree of under‐ascertainment in all groups. This is the first country‐wide report of the incidence of hematologic malignancies in South Africa. Despite challenges, it is important to analyze and report available national cancer incidence data to raise awareness of the cancer burden and to characterize patterns by demographic characteristics so as ultimately to improve the provision of cancer‐related health care.


Introduction
Worldwide, leukemia, multiple myeloma (MM), Hodgkin lymphoma (HL), and non-Hodgkin lymphoma (NHL) collectively accounted for an estimated 6.5% of new cancer cases in 2012, with the majority of these cases coming from NHL followed by leukemia [1]. While global estimates suggest twofold and threefold higher incidence rates of NHL and leukemia, respectively, in high income countries compared to Sub-Saharan Africa [1], little is known about the incidence patterns of hematologic malignancies in this region, including South Africa. Similar to worldwide figures, hematologic malignancies were estimated to contribute about 6% of new cancer cases in South Africa in 2012 [1].
To date, the literature on hematologic malignancies in South Africa is drawn largely from hospital-based studies which report on patient and disease characteristics of leukemias and lymphomas [2][3][4][5][6], with a particular focus on the prevalence of HIV and the differences in cancer characteristics between HIV-positive and -negative patients. There is a well-established association between HIV and several types of hematologic malignancies, including but not restricted to the AIDS-defining subtypes of NHL [7][8][9]. While hospital-based studies benefit from detailed patient information, there is also a need to estimate incidence and mortality rates, particularly at the national level. Such data provide important information about the overall burden of disease, which in turn inform cancer control strategies, and provide a basis for investigating underlying determinants of disease.
Studies from the United Kingdom and United States show considerable variability in the incidence of hematologic malignancies by gender and age [10][11][12] as well as by population group [10,11,13]. It remains unknown whether the incidence of these malignancies in South Africa follows similar patterns to those reported in higher income areas. Recently, it was reported that the incidence of pediatric hematologic malignancies was approximately three times higher among White compared with Black children within South Africa [14]. As population group is highly correlated with socioeconomic position and access to private health care services in South Africa [15], the authors hypothesized that differences in access and utilization of health care services likely explain at least some of the observed incidence differences [14]. To our knowledge, these patterns have not been investigated among adults in South Africa.
In this report, we analyze for the first time the incidence of adult cases of leukemia, multiple myeloma and related diseases, HL, and NHL reported to the National Cancer Registry of South Africa (NCR) between 2000 and 2006, by age, gender, and population group.

National cancer registry
A detailed description of the NCR has been published elsewhere [16]. Briefly, the NCR (www.ncr.ac.za) is a pathology-based registry, reporting on malignancies confirmed in public and private laboratories throughout the country. The registry includes only incident, primary invasive cancers based on confirmation by histology, cytology, or hematology. Trained coders at the NCR code the diagnoses from pathology reports based on primary site and morphological type according to the International Classification of Diseases for Oncology, third edition (ICD-O-3) [17]. For 2000-2004, the NCR determined cancer diagnosis from pathologist reporting of SNOMED codes, rather than full pathology reports. For 2005-2006, full pathology reports were received from public laboratories but not private. Until 2011, reporting to the registry was done on a voluntary basis although all of the National Health Laboratory Services (NHLS) laboratories regularly reported to the registry as the NCR is a division of the NHLS. Reporting has been less complete from the private sector, particularly from 2005 onwards [16]. In addition to basic demographic information about the patient (name, age and/or date of birth, gender) and tumor diagnosis information (topography, morphology, date of diagnosis), the registry extracts information on population group (Black, White, Coloured (i.e., mixed ancestry), and Asian/Indian), where available from the pathology reports. If not found on the pathology report, a hot-deck imputation method is used to estimate population group using a database of approximately 1.4 million surnames with known population group [16]. In the dataset used for this analysis, 66.8% of the case reports had missing population group and thus a substantial proportion was imputed. If population group cannot be estimated (i.e., surname with no match in the database), it is left as missing. 
A comparison of the distribution of population group based on actual versus imputed data for a subset of 277130 cancer cases (contributing to the database) reported to the registry between 1990 and 1995 showed very good agreement between actual and imputed values. The distribution of original (imputed) data was as follows: 53% (50%) White, 40% (41%) Black, 2% (2%) Asian/Indian, and 5% (7%) Coloured (chi-squared test P-value = 0.94 for distribution differences).
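The surname-based imputation described above can be illustrated with a minimal sketch of a generic hot-deck procedure. This is not the NCR's implementation, which is described in [16]. The function names, the placeholder surnames, and the choice to draw one donor at random from the matching reference records are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_donor_pool(reference_records):
    """Group known population-group labels by normalized surname.

    reference_records: iterable of (surname, population_group) pairs with
    known population group (hypothetical stand-in for the surname database).
    """
    pool = defaultdict(list)
    for surname, group in reference_records:
        pool[surname.strip().lower()].append(group)
    return pool


def impute_group(surname, reported_group, pool, rng):
    """Return the reported group if present, else a hot-deck draw.

    If the surname has no match in the donor pool, the value stays missing
    (None), mirroring the rule that unmatched surnames are left as missing.
    """
    if reported_group is not None:
        return reported_group
    donors = pool.get(surname.strip().lower())
    if not donors:
        return None
    # Assumption: a donor record is drawn at random among matching surnames.
    return rng.choice(donors)


# Placeholder reference data; real surnames are deliberately not used.
reference = [
    ("SurnameA", "Black"),
    ("SurnameA", "Black"),
    ("SurnameB", "White"),
]
pool = build_donor_pool(reference)
rng = random.Random(0)

kept = impute_group("SurnameB", "Coloured", pool, rng)   # reported value kept
imputed = impute_group("surnamea ", None, pool, rng)     # drawn from donors
unmatched = impute_group("SurnameZ", None, pool, rng)    # stays missing
```

Drawing from donors in proportion to their frequency preserves the group distribution within each surname; assigning the most common group instead would be a different (deterministic) design choice.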

Hematologic malignancy cases
For the present report, we included all pathology-confirmed incident cases of leukemia, myeloma and related diseases (subsequently referred to as myeloma), HL, and NHL reported to the NCR that were diagnosed at ages ≥15 years between 2000 and 2006. We followed the Surveillance, Epidemiology and End Results (SEER) Program site recode (http://seer.cancer.gov/siterecode/icdo3_dwhoheme/index.html), which is based on the ICD-O-3 [17] and the 2008 WHO Classification of Tumours of Haematopoietic and Lymphoid Tissues [18], for classification of these four groups (see Tables 1 and 2 for listing of ICD-O-3 codes). The lack of full pathology reports prevented implementation of the more detailed classification of lymphoid neoplasms [18]. Our analysis did not include hematologic malignancies other than leukemia, myeloma, HL, or NHL.

Population data
Consistent with the approach used in the annual reports of the NCR for the years included in this study, we used the alternative South African mid-year population estimates [19] from the Centre for Actuarial Research, University of Cape Town, stratified by age, gender, and population group for calculation of incidence rates. These mid-year population estimates are similar in magnitude to the official mid-year estimates, but maintain an age distribution that is consistent with that of the most recent census in 2011. As with the NCR for this time period, we considered them the more appropriate population estimates for the purposes of estimating age-specific and age-standardized rates.

Statistical methods
Gender-specific crude incidence rates overall and stratified by population group were estimated for leukemia, myeloma, HL, and NHL. Reflecting limited case numbers in individual age groups, age-specific rates were not estimated separately for males and females. Gender-specific age-standardized rates, overall and stratified by population group, were calculated using the Segi world standard [20] truncated for ages ≥15 (ASR15+). The ASR15+ is a weighted average of age-specific rates, with the weight for each age group taken from the truncated standard population.
Incidence rate ratios (IRRs) and 95% confidence intervals (CIs) were estimated using Poisson regression models with the number of cases for a given category of hematologic malignancy as the outcome, the population size as the log offset and a log link function, overall and stratified by population group, comparing rates among females to males. Similar models, overall and stratified by gender, were used to compare incidence rates by population group, using the Black population group as the reference group. All models were adjusted for age group (5-year categories) and calendar year (single year treated as a categorical variable). Models including all population groups and/or both males and females were further adjusted for population group and gender. Cases with unknown population group and/or gender (4.7%) were excluded from Poisson models as there were no corresponding population estimates for such groups. As under-ascertainment of cancers may be more prominent at older ages (in many settings) [21], sensitivity analyses were repeated by restricting the dataset to ages <75 years.
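The truncated age-standardized rate described above can be sketched as a direct standardization. The weights below are the Segi world standard population per 100,000 as commonly tabulated for 5-year age groups from age 15. The exact age grouping used in the paper (in particular the open-ended oldest group) is not given in this text, so the grouping, the function name, and the example counts are assumptions.

```python
# Segi world standard population (per 100,000), restricted to ages >= 15.
# Assumption: 5-year groups with an open-ended 85+ group.
SEGI_WORLD_15_PLUS = {
    "15-19": 9000, "20-24": 8000, "25-29": 8000, "30-34": 6000,
    "35-39": 6000, "40-44": 6000, "45-49": 6000, "50-54": 5000,
    "55-59": 4000, "60-64": 4000, "65-69": 3000, "70-74": 2000,
    "75-79": 1000, "80-84": 500, "85+": 500,
}


def age_standardized_rate(cases, person_years, weights=SEGI_WORLD_15_PLUS):
    """Directly standardized rate per 100,000 person-years.

    cases, person_years: dicts keyed by the same age-group labels as weights.
    The rate is the weight-averaged age-specific rate, with weights
    renormalized to sum to 1 over the truncated age range.
    """
    total_weight = sum(weights.values())
    weighted_sum = 0.0
    for age_group, weight in weights.items():
        age_specific_rate = cases[age_group] / person_years[age_group]
        weighted_sum += weight * age_specific_rate
    return 1e5 * weighted_sum / total_weight


# Hypothetical check: a constant age-specific rate of 10 per 100,000
# must give an ASR of 10 whatever the weights are.
example_py = {age: 200_000 for age in SEGI_WORLD_15_PLUS}
example_cases = {age: 20 for age in SEGI_WORLD_15_PLUS}
asr = age_standardized_rate(example_cases, example_py)
```

Renormalizing the weights over ages 15 and older is what "truncated" means here: the standard population below age 15 is dropped, and the remaining weights are rescaled so that they still form a weighted average.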

Results
Between 2000 and 2006, there were a total of 14662 cases of leukemia, myeloma, HL, and NHL reported to the registry. There were 46 cases (0.3%) with unknown gender. Table 1 presents the distribution by population group for the 14616 cases with known gender, by calendar year of diagnosis, reporting source (private vs. public), and age at diagnosis, separately for males and females. In all calendar years, approximately half of the cases were reported among the Black, 33% among the White, 10% among the Coloured, and 5% or less among the Asian/Indian population groups. The distribution of population group differed substantially between public and private laboratories, with the White population group accounting for approximately half of all cases reported by private laboratories. With increasing age at diagnosis, the proportion of cases coming from the Black population group declined while that from the White population group increased steadily. Similar patterns were observed for males and females with respect to calendar year, reporting source and age.
The breakdown of hematologic malignancy types is presented for males and females separately, overall and by population group, in Figure 1. Regardless of gender or population group, NHL was the most commonly reported hematologic malignancy, accounting for approximately 50% or more of cases in most population groups (Fig. 1B). In all groups, this was followed by leukemia, contributing 15-25% of cases in the various subgroups.
Crude and age-standardized incidence rates are presented for leukemia, myeloma, HL, and NHL by population group and gender in Table 2. Incidence rates varied markedly by population group; in general, the lowest rates were observed among the Black population group and the highest among the White population group. An exception was myeloma, for which rates were lowest among the Asian/Indian population group for both males and females. For all population groups combined, the reported incidence rate of hematologic malignancies was 1.2 to 1.5-fold higher among males than females (Fig. 2A). Similar patterns were observed across the four population groups (Fig. 2B).
The IRRs by population group are presented in Figure 3. For males and females combined, reported incidence rates of hematologic malignancies tended to be higher among the White, Coloured, and Asian/Indian population groups than among the Black population group (Fig. 3A). The exception was for myeloma, for which no statistically significant difference was observed between the Asian/Indian and Black population groups, in either males or females. The largest rate ratios were observed comparing the White and Black population groups, ranging from 1.56 (95% CI 1.38-1.76) for myeloma to 3.77 (95% CI 3.38-4.21) for HL. Gender-specific patterns were similar to those observed for males and females combined (Fig. 3B and C).
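The rate ratios above come from Poisson models with a log person-time offset, adjusted for age group and calendar year. As a minimal sketch of that kind of model, the code below fits a Poisson regression by Newton-Raphson in plain Python and reports an adjusted IRR with a Wald 95% CI. The counts are hypothetical, only one binary age indicator is used instead of 5-year age groups and calendar year, and the function names are illustrative. It is not the authors' code or data.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def score_and_information(beta, X, y, pt):
    """Gradient and Fisher information of the Poisson log-likelihood."""
    k = len(beta)
    mu = [p * math.exp(sum(b * x for b, x in zip(beta, xi))) for p, xi in zip(pt, X)]
    score = [sum(xi[j] * (yi - mi) for xi, yi, mi in zip(X, y, mu)) for j in range(k)]
    info = [[sum(xi[j] * xi[l] * mi for xi, mi in zip(X, mu)) for l in range(k)]
            for j in range(k)]
    return score, info


def fit_poisson(rows, max_iter=50, tol=1e-10):
    """Fit log(E[cases]) = log(person_years) + intercept + covariates @ beta.

    rows: iterable of (cases, person_years, covariates).
    Returns (coefficients, standard_errors), intercept first.
    """
    rows = list(rows)
    X = [[1.0] + [float(v) for v in cov] for _, _, cov in rows]
    y = [float(c) for c, _, _ in rows]
    pt = [float(p) for _, p, _ in rows]
    k = len(X[0])
    # Start from the overall crude rate so the first Newton step is well behaved.
    beta = [math.log(sum(y) / sum(pt))] + [0.0] * (k - 1)
    for _ in range(max_iter):
        score, info = score_and_information(beta, X, y, pt)
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    _, info = score_and_information(beta, X, y, pt)
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))  # sqrt of j-th diagonal of inverse
    return beta, se


# Hypothetical counts; covariates = (comparison-group indicator, older-age indicator).
rows = [
    (10, 100_000, (0, 0)),
    (20, 100_000, (1, 0)),
    (30, 100_000, (0, 1)),
    (120, 200_000, (1, 1)),
]
beta, se = fit_poisson(rows)
irr = math.exp(beta[1])
ci = (math.exp(beta[1] - 1.96 * se[1]), math.exp(beta[1] + 1.96 * se[1]))
crude = (140 / 300_000) / (40 / 200_000)  # ignores the age difference between groups
```

In this constructed example the comparison group has more person-time at older ages, so the crude ratio overstates the age-adjusted IRR. That is the reason for including age group in the model.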
Age-specific rates of leukemia, myeloma, HL, and NHL are presented in Figure 4A-D by population group. With the exception of HL, incidence rates tended to increase with age until approximately age 75, followed by a decline at the oldest ages. For HL (Fig. 4C), the patterns appeared quite different between the population groups, most notably comparing the White and Black population groups. Among the White population group, there was an early peak in HL incidence rates at ages 20-29 and a later peak around age 70-75, with rates somewhat lower and generally stable in between these age groups. Among the Black population group, HL incidence increased with age until approximately age 30, at which point the rate plateaued, followed by a subsequent decline beginning around age 60. For all four major types of hematologic malignancies investigated, incidence rates were consistently higher among the White population group than the Black population group, irrespective of age, but these differences tended to increase with age (Fig. 4A-D).
In sensitivity analyses restricted to ages <75, there were no marked changes in the results presented in Tables 1-2 or Figures 1-3. Incidence rate ratios (IRRs) by population group were slightly attenuated at ages <75 compared with the full adult population, but the reduction was very minor and the interpretation unchanged. This observation is consistent with the patterns observed in age-specific rates whereby differences between the White and Black population groups were most apparent at older ages.

Summary of key results
The incidence of adult hematologic malignancies (diagnosed at ages 15 years or older) was estimated for laboratory-confirmed cases reported to the NCR between 2000 and 2006, describing overall rates as well as those by age, gender, and population group. NHL was the most common hematologic malignancy reported to the NCR during this time period, irrespective of gender and population group. Incidence rates of reported hematologic malignancies were generally 20-50% higher among males than females. Our analyses suggested lower rates of reported hematologic malignancies among the Black population group compared with other population groups, with differences most pronounced when comparing the White and Black population groups. These differences tended to become more marked with increasing age. With respect to age-specific rates, incidence rates increased with age for hematologic malignancies other than HL. For HL, among the White population group, a bimodal pattern was observed, with peaks at ages 20-29 and 70-75. A different pattern was observed among the Black population group; reported HL rates increased with age until approximately age 30, at which point the rate plateaued, followed by a subsequent decline beginning around age 60.

Interpretation of key results
The observation that NHL, followed by leukemia, was the most common of these four broad categories of hematologic malignancies is consistent with worldwide patterns [1]. The higher incidence rates among males than females are also consistent with gender patterns reported elsewhere [12,22].
With respect to population group, age-adjusted incidence rates from the Surveillance, Epidemiology and End Results (SEER) 18 registries in the United States for the period of 2000-2011 show a predominance among the White versus Black populations, with annual White to Black ratios (estimated using the SEER Fast Stats tool [23]) of 1.3-1.5 for NHL, 1.2-1.4 for leukemia and 1.1-1.3 for HL. These ratios are somewhat lower than those estimated in the NCR data. Of note, previous analyses of the SEER data have shown that the magnitude and direction of these population group incidence rate ratios vary by subtype of leukemia and lymphoma [10,11]. As we did not have full pathology reports for all cases (as discussed in the Methods), we were not able to consider subtypes in this analysis. The apparently distinct age-specific patterns observed for HL between the White and Black population groups in the NCR data are also reported in the SEER data, where a clear bimodal pattern, classically associated with HL, is much more pronounced in the White than Black populations [24]. Globally, the classic bimodal age pattern appears to be more characteristic of economically developed areas [25]. In contrast to what is observed for NHL, HL, and leukemia rates, the incidence of myeloma in the SEER data is approximately twofold greater among the Black than White population groups [13]. In the NCR data, however, the reported incidence rate of myeloma was approximately 50% greater among the White than Black population groups.
For any cancer site, differences in the underlying distribution of genetic and environmental risk factors as well as factors related to completeness of reporting and diagnosis drive demographic variations in the incidence patterns. The etiology of hematologic malignancies is largely unexplained, with few known determinants [25][26][27]. Established environmental risk factors for leukemia include ionizing radiation and certain chemical exposures such as benzene [27]. For NHL, there is clear evidence for an association with infectious agents (HIV, Epstein-Barr virus (EBV), Hepatitis C Virus (HCV), and Human T-Lymphotropic Virus type 1 (HTLV-1)) [26] and increasing data to support a role for lifestyle, occupational, and environmental factors [28]. HL also has an infectious etiology: EBV is one of the few known risk factors [25]. As is common for registry-based studies, we did not have individual-level information about risk factors. While we cannot exclude the possibility that differences in the distribution of or susceptibility to etiologic factors could explain the marked differences by population group observed in the NCR data, the known infectious and environmental risk factors would not seem likely explanations. In order for these factors to drive truly higher rates of disease within South African White versus Black population groups, they would need to be more prevalent in the White population group.
Disparities in the completeness of diagnosis and reporting between population groups may have contributed to the observed incidence rate patterns. The NCR is a pathology-based registry and thus only hematologic malignancies with a histologic, cytologic, or hematologic (bone marrow aspirate or trephine biopsy) confirmation are captured. Consequently, there is an inherent risk of under-estimating rates based on the registry data, as cases diagnosed by other means (e.g., peripheral blood smear) are not reported. Problems of under-reporting may be compounded, however, by other factors that disproportionately affect the Black population group compared with the White population group and contribute not only to under-reporting but also to under-diagnosis of these cancers. First, a smaller proportion of the Black population group have access to a private medical aid fund (7.2% of the Black population versus 63.1% of the White population according to 2006 data) [15]. Public medical services are chronically under-resourced and understaffed [29,30] and as such, patients in the public sector may be less likely than those in the private sector to receive a comprehensive diagnostic work-up. Furthermore, the system operates under a tiered structure by which patients are referred from primary health clinics to tertiary hospitals via other tiers [31], and patients may be lost from the system before presenting at referral centers. Factors such as distance to the nearest tertiary center [32] may lead to delayed or no diagnosis. Second, the burden of HIV varies markedly by population group [33]. While HIV is associated with increased risk of lymphomas, particularly subtypes of NHL [9], atypical presentation and histology of HIV-associated lymphomas may lead to misdiagnosis or delayed diagnosis [34]. In population groups with high HIV rates, competing mortality from other causes [9,35] may reduce the opportunity for lymphomas to develop, while late-stage presentation of disease [3,36] may lower the chances of cancers ever being diagnosed by pathology.

Strengths/limitations
This is the first country-wide study on hematologic malignancies in South Africa and one of very few studies from Sub-Saharan Africa. The study benefits from the large number of cases, permitting detailed examination of rates and patterns by age, gender, and population group. The diverse population of South Africa enabled us to investigate differences by population group, which is associated with socioeconomic circumstances and access to private health care in South Africa [15]. Nonetheless, analyses of the Asian/Indian and Coloured population groups were less robust owing to smaller case numbers than in the Black and White population groups, particularly when further examining age-specific patterns. Limitations include that, by definition, the NCR data used here are restricted to pathology-confirmed cancer cases and thus do not fully capture incident hematologic malignancies in the country. Furthermore, there was a decline in reporting to the NCR by some private sector laboratories (beginning in 2005) [16]. This would be expected to have the greatest impact on cases reported from the White population group, which could attenuate the observed rate ratios by population group. As discussed above, we were unable to implement more detailed subtype classification as full pathology reports were not available for the period of 2000-2004. Previous studies of lymphoma and leukemia in the United States suggest that racial differences may vary considerably by subtype [10,11]. More generally, there are known etiologic differences in the subtypes of leukemia and lymphomas [26][27][28]. Once available, it will be important to repeat these analyses using data from more recent years during which full pathology reports were received by the NCR. Another limitation is that population group had to be imputed for a substantial proportion of the dataset. The imputation method, however, has been previously validated in the NCR and the limitation appears to be of minor importance, although some misclassification cannot be ruled out. While it is important to consider our results in the context of these limitations, the NCR provides the most comprehensive overview of these cancers in the country at this time.

Conclusions
The hematologic malignancies investigated here collectively account for an estimated 6% of new cancer cases and 8% of cancer deaths in South Africa [1]. The consistency of patterns by age and gender with those reported in other populations [1,10,12,13,22,24] suggests that underlying risk factors for these cancers are unlikely to modify the age distribution or gender ratio. Differences between population groups, however, would appear to be more pronounced than those observed in some other settings. We hypothesize that challenges related to diagnosis and reporting of cancers play a role in the patterns by population group, while the set-up of the NCR (pathology-based) could lead to some degree of under-ascertainment, irrespective of population group, gender, or age. Despite challenges, it is important to analyze and report available national cancer incidence data to raise awareness of the cancer burden and to characterize patterns by demographic characteristics so as ultimately to improve the provision of cancer diagnosis and care.