Spatial clustering of endemic Burkitt's lymphoma in high-risk regions of Kenya



Endemic Burkitt's lymphoma (eBL), the most common childhood cancer in sub-Saharan Africa, occurs at a high incidence in western Kenya, a region that also experiences holoendemic malaria. Holoendemic malaria has been identified as a co-factor in the etiology of this cancer. We hypothesized that eBL may cluster spatially within this region. Medical records for all eBL cases diagnosed from 1999 through 2004 at Nyanza Provincial General Hospital were reviewed for case residential information to examine this hypothesis. Two cluster detection methods, Anselin's Local Moran test for spatial autocorrelation and a spatial scan test statistic, were applied to this residential data to determine whether statistically significant high- and low-risk areas were present in the Province. During the 6-year study period, 272 children were diagnosed with eBL, with an average annual incidence of 2.15 cases per 100,000 children. Using Empirical Bayes smoothed rates, the Local Moran test identified 1 large multi-centered area of low eBL risk (p-values < 0.01) and 2 significant multi-centered clusters of high eBL risk (p-values < 0.001). The spatial scan detected 3 small independent low-risk areas (p-values < 0.02) and 2 high-risk clusters (p-values = 0.001), both similar in location to those identified from the Local Moran analysis. Significant spatial clustering of elevated eBL risk in high-malaria transmission regions and of reduced incidence where malaria is infrequent suggests that malaria plays a role in the complex eBL etiology, but that additional factors are also likely involved. © 2006 Wiley-Liss, Inc.

Endemic Burkitt's lymphoma (eBL) is the most common childhood cancer in sub-Saharan Africa. The distribution of this cancer has geographic and climate-associated features. Cases are most frequently observed in regions roughly 10° north and south of the equator with minimum annual rainfall and mean temperatures of 50 cm and 15.6°C, respectively.1, 2 These specific features have been linked to chronic and intense malaria transmission (i.e., holoendemic malaria).3, 4, 5 Within these holoendemic malaria transmission regions, however, eBL occurrence is not uniform,5 suggesting that other environmental factors may contribute to the pattern of eBL.4, 6, 7

Many studies have been conducted in an effort to explain this apparent nonrandom eBL distribution. Pike et al. first detected spatial-temporal eBL clustering in Uganda,8 where cases within the West Nile District during 1961–1965 occurred more closely in space and time than could be expected by chance alone. Other investigations identified spatial-temporal clustering in Bwamba County of Uganda during 1966–1968,9 and more recently, among older children living in communities bordering Lake Malawi.10 Studies in Ghana, Northern Tanzania and other districts in Uganda, however, failed to detect eBL case clustering.11, 12, 13 It is unclear whether these conflicting findings result from eBL risk-factors that are present in certain places only at certain times, bias in eBL case detection or the impact of demographic changes on the distribution of at-risk populations.13 Despite its potential link to underlying causal mechanisms, the question of whether eBL clusters within high-risk regions remains unanswered.

Although Kenya lies within the well-defined “lymphoma belt”, the country was not included in these early statistical investigations of the spatial-temporal distribution of eBL. More recently, however, information from 2 studies suggests that eBL hot-spots do exist in Nyanza Province, a high-risk region in western Kenya. Makata et al.14 noted elevated rates of pediatric cancers, including eBL, in communities directly bordering Lake Victoria, and data published by Mwanda et al.15 described greater eBL incidence among children living in Siaya District, in the north of the Province. These findings, if confirmed, could provide important leads in identifying potential eBL co-factors. However, spatial statistics were not used to determine whether the distribution of eBL observed in these studies reflected significant clustering or chance variability in disease occurrence alone.

Our investigation was conducted to test whether these previous findings (i.e., spatial clustering in Lakeside communities and/or Siaya District) are statistically and epidemiologically valid, and to determine whether other areas in the region also experience elevated eBL risk. Accordingly, we analyzed medical records of eBL cases diagnosed at Nyanza Provincial General Hospital from 1999 through 2004 and used 2 cluster detection methods to statistically evaluate the spatial distribution of eBL in Nyanza Province. A new look at the spatial distribution of this old disease in a high-risk Province in Kenya, especially given significant demographic changes in the region, may not only confirm recognized high-risk areas but also elucidate previously unidentified spatial patterns. A better understanding of eBL disease patterns will ultimately generate new information regarding the multi-factorial etiology of this pediatric cancer.

Material and methods

Study population

Nyanza Province is located in western Kenya, bordered to the west by Lake Victoria, east by the Rift Valley highlands, north by Western Province and south by Tanzania (Fig. 1). Most of the Province lies just south of the equator and has a semi-tropical to tropical climate. Malaria transmission varies from holoendemic in lowland areas around Lake Victoria to seasonal (April to July) in the highlands (elevation ∼1,600–2,000 m). Although rainfall occurs throughout the year, precipitation is typically elevated during the “long” (March to June) and “short” (October to December) rainy seasons.

Figure 1.

Map of Kenya and District-level map of Nyanza Province.

The population of Nyanza Province in 1999 was ∼4.4 million, of which ∼50% was <15 years old.16 In 1999, the Province was divided into 12 Districts, 65 Divisions, 324 Locations and 933 Sublocations. This analysis focuses on clustering using 4th administrative level data, that is, Location-specific eBL incidence rates (IR) (1999 median childhood population estimate = 5,569, range 421–31,776; median surface area = 34.4 km², range 1.4–154.9 km²). Under-five childhood mortality was 116 per 100,000, with malaria and respiratory infections contributing to the majority of these deaths.17 In 2000, more than a quarter of the urban adult population in Nyanza was seropositive for HIV, and up to 40% in certain high-risk groups.18 Most people in the Province are of the Luo tribe, with the Kisii, Suba and Kuria tribal groups representing smaller proportions of residents.

Nyanza Provincial General Hospital is located in Kisumu city. As the largest hospital in the Province, it provides out-patient and in-patient services, and maintains its own pathology and radiology laboratories. The hospital is the referral center for childhood cancer cases and is the only medical facility that provides chemotherapeutic treatment for eBL in the region.

EBL case data and human subjects approval

We reviewed pediatric medical records of all patients ≤15 years old who had been admitted to Nyanza Provincial General Hospital from 1999 through 2004 with a physician diagnosis of Burkitt's lymphoma. A standardized abstraction form was used to collect demographic, clinical and place of residence information. All eBL cases hospitalized at Nyanza Provincial Hospital during 2003 and 2004 were included in an immunologic study on eBL and were histologically confirmed. Although all cases diagnosed and treated for eBL between 1999 and 2002 were also histologically confirmed, documentation was not always available for patients hospitalized during this time period. When documentation was not available, only cases treated for eBL and responding to treatment with no potential competing diagnosis (e.g. non-eBL cancer) were included in this study.

Our study protocol was approved by Institutional Review Boards at the University of Michigan, Case Western Reserve University and University Hospitals of Cleveland, as well as the Ethical Review Committee at the Kenya Medical Research Institute. The Chief Medical Director at Nyanza Provincial General Hospital granted access to eBL medical records.

Population and geographic data

We calculated childhood (0–15 years of age) population estimates for each Location in Nyanza Province from 1999 through 2004. These estimates were based on 1999 census data obtained from the Kenya Medical Research Institute/Wellcome Trust Collaborative Programme (KEMRI/Wellcome Trust) in Nairobi and 2000–2004 age-specific population projections generated by the Kenyan Central Bureau of Statistics.16 Population projections were based on algorithms incorporating the impact of HIV/AIDS, other causes of mortality, births and migration. Digitized maps of 1999 Location administrative boundaries in Nyanza Province were also obtained from the KEMRI/Wellcome Trust Collaborative Programme.

Data management

Data were entered into an Excel database and converted to SAS (version 8.2, SAS Institute, Cary, NC) for preliminary analysis. We excluded cases residing outside Nyanza Province as well as duplicate records. A case was considered a duplicate if we detected an exact match on all demographic variables, with age differences reflecting only changes in admission dates. The entry with the later hospital admission date was excluded. To minimize error in case mapping, we cross-referenced place of residence with the administrative place names from the 1999 census. If inconsistencies were noted between 2 administrative levels (e.g., Location and Sublocation), the smaller unit (Sublocation) was assumed to be correct, as parents and guardians were most likely to be familiar with more local governmental administrative location names. ArcGIS (version 9.1, ESRI, Redlands, CA) was used for mapping and visualization.
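The duplicate rule described above can be illustrated with a minimal sketch. The field names (`name`, `sex`, `tribe`, `location`, `age`, `admitted`) and the one-year tolerance on the implied birth year are hypothetical choices made for illustration; the abstraction form and the SAS code actually used are not described in this report.

```python
from datetime import date

# Hypothetical demographic fields; the real abstraction form is not specified.
DEMOGRAPHIC_FIELDS = ("name", "sex", "tribe", "location")

def is_same_child(a, b):
    """True if two records match on all demographic fields and their ages
    differ only by the time elapsed between the two admissions."""
    if any(a[f] != b[f] for f in DEMOGRAPHIC_FIELDS):
        return False
    # Implied birth year = admission year minus recorded age (in whole years).
    birth_a = a["admitted"].year - a["age"]
    birth_b = b["admitted"].year - b["age"]
    return abs(birth_a - birth_b) <= 1  # tolerance of 1 year is an assumption

def drop_duplicates(records):
    """Keep the earliest admission per child; later matching entries are dropped."""
    kept = []
    for rec in sorted(records, key=lambda r: r["admitted"]):
        if not any(is_same_child(rec, k) for k in kept):
            kept.append(rec)
    return kept
```

Sorting by admission date before filtering is what guarantees that the entry with the later admission date is the one excluded.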

Statistical analysis

We used 2 cluster detection programs, GeoDa (version 0.9.5-i, Urbana-Champaign, IL) and SaTScan (version 6, Boston, MA), to fully examine our hypothesis regarding the presence of eBL spatial clustering in the Province. Both these programs test for spatial clustering using area (case and population at-risk) data, and output the location, approximate size and significance level of identified clusters. The programs are publicly available from the internet. Different spatial analytical methods may identify different underlying spatial patterns.19 Consistent findings using more than 1 method, therefore, would suggest robust results.

GeoDa–Anselin's Local Moran test

We used the GeoDa cluster detection software program for generating Anselin's Local Moran (LISA) test statistics. The Local Moran statistic assesses spatial autocorrelation and identifies administrative units (i.e., Locations) with disease rates statistically similar to and dissimilar from their neighbors. The null hypothesis is that there is no association in eBL IRs among neighboring administrative units. The alternative hypothesis is that spatial clustering exists (neighboring Locations have similar eBL rates).
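For readers unfamiliar with the statistic, the following is a minimal sketch of a Local Moran calculation with row-standardized neighbor weights and a conditional-permutation pseudo p-value. It is an illustration under stated assumptions, not GeoDa's implementation: the function names, the neighbor dictionary and the two-sided folding of the pseudo p-value are choices made here, and every unit is assumed to have at least one neighbor.

```python
import random

def local_moran(values, neighbors):
    """Local Moran I_i = z_i * (mean of the neighbours' z) / m2.

    values    -- one rate per areal unit
    neighbors -- neighbors[i] lists the indices of units contiguous to unit i
    Assumes the rates are not all identical (m2 > 0).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    stats = []
    for i in range(n):
        nbrs = neighbors[i]
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        stats.append(z[i] * lag / m2)
    return stats

def conditional_pseudo_p(values, neighbors, i, permutations=9999, seed=0):
    """Pseudo p-value for unit i: its own value is held fixed while the
    remaining values are randomly reassigned to its neighbours."""
    rng = random.Random(seed)
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    k = len(neighbors[i])  # assumed to be at least 1
    observed = z[i] * (sum(z[j] for j in neighbors[i]) / k) / m2
    others = z[:i] + z[i + 1:]
    larger = 0
    for _ in range(permutations):
        lag = sum(rng.sample(others, k)) / k
        if z[i] * lag / m2 >= observed:
            larger += 1
    # Fold the count so that either tail can be flagged (an assumed convention).
    extreme = min(larger, permutations - larger)
    return (extreme + 1) / (permutations + 1)
```

A positive statistic marks a unit whose rate resembles those of its neighbors (high-high or low-low), and a negative statistic marks a spatial outlier.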

IRs based on relatively uncommon health events such as eBL can be unstable (large variances). Therefore, we applied Empirical Bayes (EB) smoothing methods to our eBL disease data. This smoothing method adjusted disease rates, especially in administrative units with small population estimates, toward the overall mean of the study area. We defined neighbors using queen contiguity relationships (single point) and obtained Monte Carlo p-values based on 9,999 conditional randomizations. Because of multiple comparisons and chances for overlapping clusters, we used an alpha level of 0.01 to assess statistical significance.
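The shrinkage behaviour described above can be sketched with a global method-of-moments Empirical Bayes estimator. This is one common formulation and is offered only as an illustration; whether GeoDa 0.9.5-i computes the smoothed rates in exactly this way is an assumption, and the function name is hypothetical.

```python
def eb_smooth(cases, pops):
    """Shrink each unit's raw rate toward the overall rate.

    cases -- observed case counts per unit
    pops  -- at-risk population per unit (all > 0)
    Returns smoothed rates in the same units as cases / pops.
    """
    n = len(cases)
    total_pop = sum(pops)
    overall = sum(cases) / total_pop          # prior mean
    mean_pop = total_pop / n
    raw = [c / p for c, p in zip(cases, pops)]
    # Method-of-moments estimate of the between-unit variance (prior variance).
    between = sum(p * (r - overall) ** 2 for r, p in zip(raw, pops)) / total_pop
    prior_var = max(between - overall / mean_pop, 0.0)
    if prior_var == 0.0:
        return [overall] * n                  # no detectable heterogeneity
    smoothed = []
    for r, p in zip(raw, pops):
        weight = prior_var / (prior_var + overall / p)  # larger p -> weight nearer 1
        smoothed.append(weight * r + (1.0 - weight) * overall)
    return smoothed
```

Because the weight grows with population size, a Location with few children is pulled strongly toward the Province-wide rate, whereas a populous Location keeps a value close to its raw rate.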

SaTScan–Spatial scan

We repeated our cluster detection analysis using the spatial scan test in SaTScan. This test uses a circular window in which the center of the circle moves across the study area. At each center position, the radius of the window varies between 0 and a user-defined upper limit, thereby encompassing different sets of neighboring administrative units (along with disease and population data for that unit). An administrative unit is included in the circle if the centroid for that unit lies in the circle. An infinite number of distinct circles varying in size and location are created. For each circular window, the spatial scan calculates the likelihood of observing the reported and expected number of eBL cases inside the circle given the number of reported and expected cases outside the circle. The circular window with the maximum likelihood is defined as the most likely cluster, indicating that it is least likely to have occurred by chance alone. The scan statistic evaluates the null hypothesis that the eBL disease rate is the same throughout the study area. The alternative hypothesis is that there is at least 1 circular window in which the disease rate inside the circle varies significantly from the remainder of the Province.
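The window search described above can be sketched as follows, using the Poisson log-likelihood ratio commonly given for the spatial scan statistic. This is a simplified illustration and not SaTScan's code: it assumes planar centroid coordinates, ignores distance ties, and the function names and the `(x, y, cases, population)` tuple layout are hypothetical.

```python
import math

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed and e expected cases,
    given `total` cases in the whole study area (0 * log 0 is treated as 0)."""
    def term(obs, exp):
        return obs * math.log(obs / exp) if obs > 0 else 0.0
    if e <= 0 or e >= total:
        return 0.0
    return term(c, e) + term(total - c, total - e)

def relative_risk(c, e, total):
    """Risk inside the window divided by risk outside it."""
    return (c / e) / ((total - c) / (total - e))

def most_likely_window(units, max_pop_fraction=0.5):
    """units: list of (x, y, cases, population), one tuple per areal unit.
    Returns (best log-likelihood ratio, sorted member indices)."""
    total_cases = sum(u[2] for u in units)
    total_pop = sum(u[3] for u in units)
    best = (0.0, None)
    for cx, cy, _, _ in units:
        # Grow the circle by adding centroids in order of distance from the center.
        order = sorted(range(len(units)),
                       key=lambda j: math.hypot(units[j][0] - cx, units[j][1] - cy))
        c, pop, members = 0, 0, []
        for j in order:
            pop += units[j][3]
            if pop > max_pop_fraction * total_pop:
                break
            c += units[j][2]
            members.append(j)
            e = total_cases * pop / total_pop   # expected cases under the null
            llr = poisson_llr(c, e, total_cases)
            if llr > best[0]:
                best = (llr, sorted(members))
    return best
```

The same ratio is large for windows with either an excess or a deficit of cases, which is why high-risk clusters and low-risk areas can be assessed with a single scan.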

We used raw eBL data in SaTScan to avoid potential bias in estimating the likelihood ratio test statistic. A data file containing raw eBL case and population data for the centroid (longitude and latitude coordinate) of each 4th level administrative unit in the Province was generated in GeoDa. This data file was imported into SaTScan assuming a Poisson probability disease model (case and at-risk population data). Although we set the maximum cluster size at 50% of the study population for purposes of statistical inference (avoiding preselection bias), we limited the size of reported clusters to those representing 20% or less of the population to focus on localized clustering. Presence of high-risk clusters and low-risk areas was assessed simultaneously. p-Values for maximum likelihood ratios were based on 9,999 Monte Carlo randomizations. An alpha level of 0.05 was used to assess statistical significance. Likelihood-ratio based test statistics and reported p-values account for multiple testing. Results from the spatial scan and tests for spatial autocorrelation were compared with each other as well as with the smoothed eBL rate map.
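The Monte Carlo inference step can be illustrated with a rank-based p-value, a standard construction for randomization tests. The sketch below is an assumption about the general approach rather than SaTScan's internals, and both function names are hypothetical. Null replicates are generated by scattering the observed case total across units in proportion to population.

```python
import random

def simulate_null_cases(populations, total_cases, rng):
    """One null replicate: each case falls in a unit with probability
    proportional to that unit's at-risk population."""
    counts = [0] * len(populations)
    for idx in rng.choices(range(len(populations)), weights=populations, k=total_cases):
        counts[idx] += 1
    return counts

def monte_carlo_p_value(observed, simulated):
    """Rank of the observed maximum statistic among the replicate maxima,
    counting the observed data set as one of the replicates."""
    at_least = sum(1 for s in simulated if s >= observed)
    return (at_least + 1) / (len(simulated) + 1)
```

With 9,999 replicates the smallest attainable p-value is 1/10,000 = 0.0001. Because the observed maximum is compared with the maximum over all windows in each replicate, the resulting p-value already reflects the many windows examined.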


Results

Study population

From 1999 through 2004, 372 patients were diagnosed with eBL at Nyanza Provincial General Hospital. Cases residing outside of Nyanza Province (n = 76), duplicated case reports (n = 12) and cases lacking a laboratory or clinical diagnosis (n = 29) were excluded (categories not mutually exclusive), resulting in a study population of 272 cases. The 6-year average annual IR was 2.15 cases per 100,000 children (Table I). Place of residence information to the 4th administrative level (i.e., Location) was available for 235 (86.0%) of these cases. The median onset age of these cases was 7.0 years (interquartile range = 5.0–9.0), and 58.7% were male (95% CI = 52.3–65.0). Cases were reported from 5 different tribes. By far, the majority of these were Luo (93.0%), followed by Kisii (3.3%) and Kuria (2.2%). Individual cases also occurred in Suba, Samia and Kalenjin children.

Table I. Annual and 6-year Average Annual eBL Incidence Rates per 100,000 Children, Nyanza Province 1999–2004 (n = 272)

Year             No.    At-risk population    IR
6-year average   45.3   2,108,125             2.15
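As a check on the arithmetic, the 6-year average annual IR can be reproduced from the figures reported in Table I, assuming that the at-risk population shown is the average annual childhood population over the study period:

```latex
\[
\mathrm{IR}
  = \frac{272 / 6}{2{,}108{,}125} \times 100{,}000
  \approx \frac{45.3}{2{,}108{,}125} \times 100{,}000
  \approx 2.15 \ \text{cases per 100,000 children per year}
\]
```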

Rate maps

Raw and EB smoothed IRs are presented in Figures 2 and 3. Raw rates are presented using a quintile classification scheme. Breakpoints from this classification scheme were applied to the distribution of EB rates to better visualize the impact of smoothing. According to the EB smoothed map, Locations on the north shore of Lake Victoria appeared to have elevated risk when compared to the surrounding areas, as did several administrative units in Nyando District as well as in the northernmost region of Siaya District. The apparent area of high incidence on the southwest border with Tanzania was less pronounced following smoothing. Low-risk units were concentrated in the highland Districts of Nyamira, Central Kisii and Gucha.

Figure 2.

Raw IRs (eBL cases per 100,000 children) by Location (n = 324), Nyanza Province, 1999–2004. Breakpoints generated using a quintile classification scheme.

Figure 3.

EB smoothed rates (eBL cases per 100,000 children) by Location (n = 324), Nyanza Province, 1999–2004. Breakpoints based on quintile classification scheme generated from the distribution of raw IRs (same as used in raw rate map).

Spatial clustering of cases using GeoDa–Anselin's Local Moran test (LISA)

Formal cluster analysis identified significant eBL clustering in the Province. Two multi-centered clusters of high eBL risk were identified using the Local Moran test for spatial autocorrelation (Table II and Fig. 4). As suggested in the EB smoothed rate map, the first large cluster, located in the western region of Kisumu District, was centered on 5 Locations: Central Kisumu, East Seme, South Central Kisumu, South West Kisumu and West Kisumu. EB rates for these Locations ranged from 2.1 to 8.0 cases per 100,000 children. The second high-risk cluster was detected in Nyando District, comprising 2 Location centers: Kikolo/East Kano and North Nyakach. EB smoothed IRs of these Locations were 3.1 and 4.4 cases per 100,000 children, respectively. A single large and significant area of Locations with low eBL risk was detected. This area of low risk was centered on 45 of the 65 Locations in the highland Districts of Nyamira, Kisii and Gucha (data not presented). During the 6-year period, only 1 case was reported from the 45 cluster centers.

Figure 4.

Statistically significant (p < 0.01) high-risk clusters and low-risk area of eBL identified by Anselin's Local Moran test for spatial autocorrelation in GeoDa, Nyanza Province, 1999–2004. Bolded boundaries reflect high-risk cluster and low-risk area centers.

Table II. Statistically Significant High-Risk Clusters Identified with Anselin's Local Moran Test for Spatial Autocorrelation in GeoDa, Nyanza Province, 1999–2004

Cluster          Center            Raw IR¹   EB smoothed IR¹   Local Moran   p-value
Western Kisumu   Central Kisumu    5.
                 East Seme         13.0      7.0               8.88          0.001
                 S.C. Kisumu       13.2      8.0               11.89         <0.001
                 S.W. Kisumu       16.
                 West Kisumu       2.
Nyando           N. Nyakach        6.7       3.1               0.86          0.001
                 Kikolo/E. Kano    10.2      4.4               0.86          <0.001

¹ Rates reflect cases per 100,000 children.

Spatial clustering of cases using SaTScan–spatial scan test statistic

Table III and Figure 5 present the results from the spatial scan test for spatial clustering. The most likely high-risk eBL cluster included 3 Locations in the western region of Kisumu District: East Kisumu, South West Kisumu and West Kisumu (High-Risk Cluster No. 1, p = 0.0001). The second most likely high-risk cluster included a total of 20 administrative units: 5 located in the eastern region of Kisumu District and 15 located in Nyando District (High-Risk Cluster No. 2, p = 0.0001). Comparing these clusters with the EB smoothed rate map showed that the rates for the Western Kisumu cluster were all above 4.90 cases per 100,000 children, whereas the EB smoothed rates for the Nyando/Eastern Kisumu cluster were highly variable, ranging from 1.1 to 5.5 cases per 100,000 children. The 15 cluster Locations in Nyando District generally had higher smoothed rates compared to the 5 Locations in Eastern Kisumu District. Both high-risk clusters detected in SaTScan were either included in or overlapped those identified using GeoDa. Three areas of significantly low eBL risk were identified with the spatial scan (Low Risk Area No. 1, p = 0.0001; Low Risk Area No. 2, p = 0.0004; Low Risk Area No. 3, p = 0.0170). The most likely low-risk area was located in the east-central highland region of Nyamira, Kisii and Gucha Districts. All 3 areas were consistent in location with the low-risk area identified in GeoDa, with the exception of Low Risk Area No. 3, which also included 3 administrative units from Migori District. SaTScan also excluded several highland administrative units bordering Homa Bay District to the west.

Table III. Statistically Significant High-Risk Clusters and Low-Risk Areas of eBL Identified with the SaTScan Spatial Scan Test Statistic, Nyanza Province, 1999–2004

Type                No.   Reported cases¹   Expected cases   Relative risk   No. of locations   p-value
High-Risk Cluster   1     18                2.64             7.30            3                  0.0001
Low-Risk Area       1     0                 23.45            0.00            20                 0.0001

¹ Cumulative case totals over 6-year study period.

Figure 5.

Significant high-risk clusters and low-risk areas of eBL detected with the SaTScan spatial scan test statistic, Nyanza Province, 1999–2004.

Neither cluster detection method identified significant high-risk areas within Siaya District. Only the Western Kisumu District cluster included administrative units bordering Lake Victoria in both GeoDa and SaTScan. Although several administrative units on the southwest border with Tanzania appeared to have elevated risk, these were not identified as a statistically significant cluster in either cluster detection method.


Discussion

Study population

Our study detected significant spatial clustering of eBL in Nyanza Province, a high-risk eBL region in Western Kenya. These findings were based on place of residence information obtained from children who reached Nyanza Provincial General Hospital and were diagnosed with eBL. The dramatic nature of eBL presentation suggests that most cases will be referred to the Provincial hospital when treatment is available. Moreover, the fact that our study included 4 cases from Kuria District in the southeastern-most region of Nyanza Province, as well as 9 cases from Muhuru Division in Migori District, which has some of the most isolated communities in southwest Nyanza, indicates that we have captured the majority of cases occurring in the Province during the study period. Similar case demographics between our study and others support this assertion; the median disease onset age (7.0 years), approximately equal gender ratio and ethnic distribution (93% Luo) in our study were consistent with those from other studies in Kenya and a follow-up study in Northern Tanzania.15, 20 The exceptions may be the year 2002, when treatment was not available, and the few cases for which residence information at the 4th administrative level was not available. Our conclusions, therefore, assume a similar geographic distribution for undiagnosed eBL cases that may have occurred during 2002 and for cases without high-resolution residence information.

Spatial clustering

Two cluster detection methods were used to test the hypothesis that eBL spatially clusters in Nyanza Province. Both methods detected similar and significant high-risk clustering and low-risk areas, suggesting that our results are robust. One important eBL co-factor, Epstein-Barr virus infection, is evenly distributed in the population and infection occurs early in childhood.21, 22 Our consistent findings, therefore, may imply clustering of other environmental or socio-cultural eBL co-factors, although potential reporting bias must also be assessed. Neither high-risk cluster included administrative units in Siaya District, as initially hypothesized. Whether eBL clusters near Lake Victoria was more difficult to assess. Only 1 high-risk Lakeside cluster was identified by both cluster detection methods.

Morrow proposed that geographic variability in eBL incidence reflects differences in malaria transmission intensity.5 Because of the lack of local-level transmission data for our analysis, we cannot rule this out. Detecting highly significant areas of low eBL risk in the highlands does support this hypothesis. Elevation for Locations in the highlands ranges from 1,475 to 2,205 m. Although malaria can occur throughout the region, transmission is generally more seasonal (March to July) in the highlands when compared to the lowland areas near Lake Victoria. Transmission above 1,600 m is rare.23, 24 Locations included in the most likely low-risk area identified by SaTScan had minimum elevations of roughly 1,580 m or more.

The most significant high-risk cluster identified with both methods involved administrative units situated in the western region of Kisumu District, bordering but not including Locations in urban downtown Kisumu. This cluster includes the major transportation route from Siaya and Bondo Districts to Kisumu city and Nyanza Provincial General Hospital. Proximity to this transportation route may have increased case recognition from these Locations. Elevated eBL incidence, however, was not observed in administrative units directly east of Kisumu city, despite similar road conditions. Although the Kenya Medical Research Institute is also located within this cluster, it is unlikely that non-eBL related research conducted in neighboring communities influenced eBL recognition and health service utilization. These factors (i.e., major transportation routes and the Kenya Medical Research Institute) are not present in the majority of high-risk cluster administrative units located in Nyando District.

Statistically significant clustering in the Province, including in areas where malaria is holoendemic, provides evidence that other co-factors may be involved. These findings may be either linked to unique geographic features of the area, related to shared socio-cultural practices or both. Nyando District, for example, is a rural and economically poor region prone to seasonal flooding. While flooding may result in increased mosquito breeding habitat and malaria transmission, it may also be associated with temporary displacement of people and crop destruction. Both frequent relocation25 and poor nutrition (including protein deficiency)26 have been associated with increased risk in previous studies.

Additionally, Euphorbia tirucalli, a plant with tumor promoting properties,27 grows abundantly throughout the Province, but its use in household, medicinal and play time activities may vary.28, 29 This plant is frequently used for a variety of purposes, including firewood, in and near the Kisumu and Nyando District high-risk clusters. The fact that both these cluster areas have also been greatly impacted by the HIV/AIDS epidemic cannot be ignored. All eBL cases included in our study were HIV negative. Nevertheless, parental death due to HIV/AIDS or other cause may have a significant impact on family structure and health status. No sibling cases were observed during the 6-year study period, suggesting that genetic factors are unlikely to explain eBL clustering in the Province.

Local Moran and spatial scan test statistics

This analysis used 2 cluster detection methods. The spatial scan test statistic assesses disease distribution using centroids and a circular scan. Certain geographic features, such as Lake Victoria, may not be considered, and cluster boundaries, due to the number of overlapping circular windows, may not be well defined.30 Anselin's Local Moran test for spatial autocorrelation, on the other hand, is less sensitive to the unique geographic features of Nyanza Province, but examines clustering according to user-defined neighbor relationships. Despite these different attributes, both cluster detection methods identified significant high-risk clusters and low-risk areas in similar geographic locations. Consistent results using these 2 methods, in addition to 6 years of eBL case data and a rate smoothing technique, suggest that these results are robust. High-resolution malaria transmission data will soon be available. Determining whether these spatial clusters occur independently from malaria transmission intensity, as well as over different time periods, is an important next step. Improving our understanding of the geographic distribution of eBL will ultimately provide crucial information on the role of environmental and/or socio-cultural factors in the complex causal mechanisms underlying eBL etiology.

Nyanza Province continues to represent a high-risk region for eBL with an average annual IR of more than 2 cases per 100,000 children. Our study was the first to rigorously investigate the fine-scale spatial distribution of eBL in this region. Two different cluster detection methods identified statistically significant high-risk clusters and a large low-risk region. Most noteworthy are the low-risk areas observed in the highland region and the 2 high-risk clusters situated in Kisumu and Nyando Districts. The etiology of eBL is complex and multi-factorial. Significant spatial clustering of elevated eBL risk in high-malaria transmission regions and of reduced incidence where malaria is infrequent suggests that malaria plays a role in eBL etiology, but that additional co-factors may also be involved. Future studies are needed to evaluate these eBL clusters, and to determine if they are associated with specific environmental and/or socio-cultural characteristics.


Acknowledgements

We thank the Kenya Medical Research Institute in Kisumu, Kenya, for the opportunity to conduct this study. The success of this project also relied on the dedicated work of Dr. Juliana Otieno, chief pediatrician, and Dr. Margaret Odour, Department of Pathology, both of Nyanza Provincial General Hospital. Dr. Otieno diagnosed and treated the Burkitt's lymphoma cases included in this study, and also provided essential support with patient records and case histories. Dr. Odour worked diligently to assure quality examination of all laboratory specimens. We also thank Dr. Robert Snow at the Kenya Medical Research Institute/Wellcome Trust Collaborative Programme in Nairobi for access to population and digitized map data for Nyanza Province, and Mr. Scott Swan at the University of Michigan Center for Statistical Consultation and Research for assistance with importing data into a geographic information system. This research was approved by the Director of the Kenya Medical Research Institute.