Authors Dr Jim Wright (corresponding author) and Dr Stephen Gundry, Water and Environmental Management Research Centre, University of Bristol, 83 Woodland Road, Bristol BS8 1US, UK. Tel.: +44 117 954 5289; Fax: +44 117 954 5389; E-mail: email@example.com, firstname.lastname@example.org Mr Ronan Conroy, Department of Epidemiology and Public Health Medicine, Royal College of Surgeons in Ireland, Mercer Building, Mercer Street Lower, Dublin 2, Ireland. E-mail: email@example.com
Objective To assess the extent and causes of microbiological contamination of household drinking water between source and point-of-use in developing countries.
Methods A systematic meta-analysis of 57 studies measuring bacteria counts for source water and stored water in the home to assess how contamination varied between settings.
Results The bacteriological quality of drinking water significantly declines after collection in many settings. The extent of contamination after water collection varies considerably between settings, but is proportionately greater where faecal and total coliform counts in source water are low.
Conclusions Policies that aim to improve water quality through source improvements may be compromised by post-collection contamination. Safer household water storage and treatment is recommended to prevent this, together with point-of-use water quality monitoring.
Water-related diseases continue to be one of the major health problems globally. An estimated 4 billion cases of diarrhoea annually represented 5.7% of the global disease burden in the year 2000 (WHO 2002). One of the major strategies for tackling this problem is the installation of protected sources such as boreholes, standpipes or wells to provide water of better quality. However, such communal facilities are located some distance from the home, requiring collection and transport from the source and subsequent storage of water within the household. It has frequently been observed that the microbiological quality of water in vessels in the home is lower than that at the source, suggesting that contamination is widespread during collection, transport, storage and drawing of water (Van Zijl 1966; Lindskog & Lindskog 1988). This contamination may lessen the health benefits of water source improvements.
This paper draws together the evidence from all studies of household water contamination between source and point-of-use and identifies how water contamination varies between the different study settings. Reported changes in water quality between source and point-of-use are summarized through a literature review, and these changes are investigated in relation to study design, quality and setting characteristics.
Criteria for inclusion
The review considers only field-based studies in developing countries where water is transported from a source outside the home and then stored within the household; laboratory-based findings are not covered. Findings about water quality in non-domestic or emergency settings, such as hospitals, schools or refugee camps, are excluded. The review is further restricted to microbiological measures of contamination and excludes chemical aspects of water quality. Studies using less common microbiological testing measures (e.g. somatic coliphages) are not included because there are too few of them to support meaningful generalizations. The review is therefore limited to coliform bacteria (total coliforms, faecal coliforms and Escherichia coli).
The included studies reported water quality results at source and at point-of-use for the sampled households in developing country communities. The results of these studies are usually presented as an aggregate measure (e.g. mean E. coli per 100 ml, or the proportion of point-of-use samples with faecal coliforms). Sample sizes are available for most studies.
Search strategy for identification of studies
The search for relevant literature was primarily through on-line bibliographic databases (PubMed, Web of Science and the African Index Medicus). Key words included ‘water + coliform + household’, ‘water + coliform + storage’, ‘water + coliform + vessel’, ‘water + coliform + stored’ and the same combinations of search words using ‘coli’ instead of ‘coliform’. Abstracts highlighted by key word searches were scanned for relevance and papers photocopied as necessary. The so-called ‘ancestry approach’, in which references from key papers are systematically traced, was also applied to two previous review papers (Vanderslice & Briscoe 1993; Mintz et al. 1995). In addition, the Journal of Diarrhoeal Diseases Research was hand-searched for relevant articles. Non-peer reviewed publications were also hand-searched, including the ‘WEDC’ conference series from 1994 onwards, ‘Waterlines’ and ‘Dialogue on Diarrhoea’. The search was restricted to papers published in English, French or Spanish before 2003. In addition to findings from the literature, unpublished results from our own fieldwork in Venda district, South Africa are also included in the analysis of contamination between source and point-of-use.
Abstracts and other details of relevant studies were collated in a bibliographic database. A spreadsheet was also created for each of the three study types described above. Each spreadsheet comprised a series of headings against which the characteristics of each study could be recorded. Study characteristics were in all cases recorded by the lead author in these spreadsheets. Once the spreadsheet had been compiled, it was exported to a statistical package (Stata) for analysis.
In eight studies, results were published in too brief a form to be usable for the review. In such cases, where work had been published since 1980, an attempt was made to contact the lead author of the publication to obtain more detailed study results.
We used a set of statistical techniques known as meta-analysis to investigate the results of the observational studies identified through the literature review. However, the statistical combination of data from observational studies can produce spurious results (Egger et al. 1998) and therefore no attempt was made to assess the overall, typical contamination pattern across all these studies. Rather, the meta-analysis used here investigated how bacteriological contamination varied between the studies and to what extent this variation was explained by study characteristics. We used a five-stage approach to the analysis.
(i) Identify study characteristics
The studies could be differentiated by their different settings, their basic design and their perceived quality. Each of the different characteristics within these three headings has been suggested as an explanatory variable for observed differences in contamination.
Source contamination: The percentage of samples testing positive for indicator bacteria may decrease after collection from highly contaminated sources because of die-off as bacteria compete for limited oxygen and nutrients in the water (Momba & Notshe 2003). Conversely, the percentage of positive samples may increase after water is collected and stored from safe sources because of contamination through hands, unwashed containers and dippers. The geometric mean indicator bacteria count and percentage of samples positive for such bacteria were therefore used to measure source contamination.
Use of protected water source: At least one study (Lindskog & Lindskog 1988) has shown that households may assume that water from protected sources is safe and boil it less often.
Sanitation: Where basic sanitation is lacking, there is more likelihood of indicator bacteria from faeces being introduced into stored water (Ologe 1989).
Use of uncovered vessels: Using uncovered water containers is likely to increase water contamination between source and point-of-use as hands are dipped into vessels to scoop a cupful of water (Chidavaenzi et al. 1998).
Storage vessel material: In one study, earthenware vessels showed significantly higher levels of contamination (Vanderslice & Briscoe 1993).
Urban vs. rural: The study location (urban or rural) was also assessed, as the urban studies generally took place in high density areas where poor environmental health might increase post-collection contamination of water.
Our characterization of the different studies was limited by the data collected in the original fieldwork. Variables such as the time since water was collected (Roberts et al. 2001), the method of extracting water (Vanderslice & Briscoe 1993), and the use of separate vessels for storage and transport (Lindskog & Lindskog 1988) have also been shown to affect water recontamination. However, such characteristics were recorded only in a minority of the studies we reviewed and so could not be adequately analysed in this review.
Disease-oriented study: Studies were identified that concentrated sampling effort on households with cases of disease such as diarrhoea (typically through a case control design). By concentrating on households where water-borne disease was present, such studies seemed more likely to identify higher levels of water contamination between source and point-of-use.
Intervention-related study: In some instances, we used baseline data on water contamination from intervention studies involving new water sources, sanitation or hygiene education. Where no baseline data were collected, we used control group data. This characteristic was also recorded, so that we could test whether contamination patterns were different in such study populations (e.g. because villages with particularly unhygienic environments were more likely to be selected for intervention studies).
Whether or not the study was published in a refereed journal was used as a measure of the quality of each of the studies. These characteristics were subsequently used in a meta-regression analysis as described in (v) below.
(ii) Indicator bacteria used
All of the studies analysed used one or more of the following three indicator bacteria: (1) total coliforms, which are Gram-negative bacteria that ferment lactose at 35–37 °C within 24–48 h; (2) faecal (thermo-tolerant) coliforms, which are a subset of total coliform bacteria that ferment lactose at 44–45 °C; and (3) E. coli, which are exclusively faecal in origin and are a sub-group of the faecal coliforms that produce the enzyme β-galactosidase and not urease. WHO guidelines state that none of these bacteria should be detectable in a 100-ml water sample (WHO 1997). Of these bacteria, E. coli are regarded as the most reliable indicator of faecal contamination and total coliforms as the least reliable indicator.
(iii) A common basis for measuring contamination
Although it was possible to categorize studies by characteristic and indicator bacteria, methods of reporting indicator bacteria counts varied depending on the quality of water under investigation. ‘High bacteria count’ studies were typically those in which over 90% of source samples tested positive for the bacteria in question, in which case counts were generally reported as geometric means. It was thus decided to split the studies in each characteristic-indicator pair according to whether bacteria counts were high or low and to use a different technique to bring each group to a common basis of measurement. In settings where bacterial counts were low, log odds ratios were calculated from the numbers of positive and negative samples at source and point-of-use using the DerSimonian and Laird random effects method. Where most or all of the water samples in a study were contaminated, standardized mean differences (Egger et al. 1997b) were calculated from the means and standard deviations of the log-transformed bacteria counts using the Cohen method. The standardized mean difference is the difference between the mean bacteria counts at source and at point-of-use, divided by the pooled standard deviation for both sets of water samples. For most studies, log-transformed counts were used because the distribution of bacteria counts in water is generally positively skewed (exceptions are noted in Table 1).
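The two effect measures described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was run in Stata); the function names, the 0.5 continuity correction and the sign convention (point-of-use minus source, so that positive values mean poorer quality at point-of-use) are assumptions made here for illustration.

```python
import math


def log_odds_ratio(pos_pou, neg_pou, pos_src, neg_src, correction=0.5):
    """Log odds ratio of a positive sample at point-of-use versus source.

    Assumption of this sketch: if any cell is zero, `correction` is added
    to every cell so that the ratio and its standard error stay finite.
    """
    cells = [pos_pou, neg_pou, pos_src, neg_src]
    if any(c == 0 for c in cells):
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    log_or = math.log((a * d) / (b * c))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, std_error


def standardized_mean_difference(mean_pou, sd_pou, n_pou,
                                 mean_src, sd_src, n_src):
    """Cohen's d on log-transformed counts: mean difference / pooled SD."""
    pooled_var = (((n_pou - 1) * sd_pou ** 2 + (n_src - 1) * sd_src ** 2)
                  / (n_pou + n_src - 2))
    return (mean_pou - mean_src) / math.sqrt(pooled_var)


# Hypothetical numbers, not taken from any reviewed study:
# 30 of 50 point-of-use samples positive, 10 of 50 source samples positive.
log_or, se = log_odds_ratio(30, 20, 10, 40)
smd = standardized_mean_difference(2.0, 1.0, 20, 1.0, 1.0, 20)
```

In this hypothetical case the odds ratio is 6 (log odds ratio about 1.79), and the standardized mean difference is 1, i.e. the mean log count at point-of-use is one pooled standard deviation above that at source.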
Table 1. Characteristics of studies included in the review, grouped by type of indicator bacteria
Column headings: % with adequate sanitation; % with covered vessels; % with earthenware vessels; % with protected sources; % source samples contaminated; sampling focuses on disease cases; published in refereed journal.
A, lower limit of detection greater than one bacterium per 100 ml; B, sample size not specified at source, point-of-use or both (lowest possible sample size used that could account for the percentage of contaminated samples); C, arithmetic mean used to describe bacteria counts, so normal distribution assumed instead of log-normal distribution; D, no sample sizes or measures of dispersion provided for bacteria counts; E, excludes households boiling water; F, bacteria tested per 50 ml, not per 100 ml; G, number of source and/or point-of-use samples not specified; H, no measure of dispersion provided for bacteria counts; I, no measures of central tendency, means or sample sizes (passing reference to water contamination only); N, no; Y, yes.
(iv) Testing for significant variation between studies
From stage (ii) to (iii), we grouped the available studies into six categories according to the three types of indicator bacteria and two levels of contamination. As there were no ‘high bacteria count’ studies of E. coli, one category was effectively empty leaving a total of five categories of studies. For each of these groups, we tested to see if there were significant differences in contamination patterns between studies using a heterogeneity test.
A separate analysis was performed for each of the three indicator organisms because their abundance in water and their origins may be different. For example, total coliforms are known to originate from decaying vegetation whereas E. coli are not and consequently patterns of contamination may vary between the three types of organism. Furthermore, total coliforms are more numerous than faecal coliforms, which are more numerous than E. coli and so bacteria counts vary for each indicator. ‘High’ and ‘low’ bacteria count studies were analysed separately because the method of assessing the change in water quality was different for these two types of study.
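The text does not name the heterogeneity test used. A common choice is Cochran's Q, and the sketch below assumes that choice; it also returns the DerSimonian and Laird moment estimate of the between-studies variance and the I² statistic. The function name and inputs are hypothetical. Under the null hypothesis of no heterogeneity, Q is compared with a chi-square distribution on k − 1 degrees of freedom, where k is the number of studies.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, its degrees of freedom, DerSimonian-Laird tau^2 and I^2 (%).

    `effects` are per-study log odds ratios or standardized mean differences;
    `variances` are their squared standard errors.
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    # Inverse-variance (fixed effect) pooled estimate, used as the reference.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # Scaling constant in the DerSimonian-Laird moment estimator.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, tau2, i2


# Two hypothetical studies with effects 0 and 2, each with variance 1.
q, df, tau2, i2 = heterogeneity([0.0, 2.0], [1.0, 1.0])
```

A Q value well above its degrees of freedom suggests that the studies differ by more than sampling error alone, which is the situation that motivates the meta-regression in stage (v).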
(v) Explaining the variation in contamination between studies
Where there were significant differences in contamination between studies, we used a technique known as meta-regression to see whether the study characteristics could account for such differences. For each of the five groups of studies, we examined whether the study design, setting and quality characteristics described in (i) above could account for differences in contamination patterns. Meta-regression based on the method of restricted maximum likelihood was used to account for possible differences in the log odds ratios and standardized mean differences between the included studies (Thompson & Sharp 1999). Because significant heterogeneity was identified in (iv) above, a random effects regression model was used. This model assumes that the studies are not drawn from the same population.
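The idea behind a random effects meta-regression can be sketched as a weighted least squares fit in which each study is weighted by the inverse of its within-study variance plus the between-studies variance τ². This is a deliberate simplification: the review estimated τ² by restricted maximum likelihood, whereas the sketch below takes `tau2` as a supplied value and handles a single covariate. All names are hypothetical.

```python
import math


def weighted_meta_regression(effects, variances, covariate, tau2=0.0):
    """Weighted least squares of study effects on one study characteristic.

    Weights are 1 / (within-study variance + tau2). With tau2 treated as
    known, the variance of the slope is 1 / (weighted sum of squares of x).
    """
    weights = [1 / (v + tau2) for v in variances]
    sum_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / sum_w
    y_bar = sum(w * y for w, y in zip(weights, effects)) / sum_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1 / sxx)
    return intercept, slope, se_slope


# Hypothetical data: effect sizes rising with the covariate value.
intercept, slope, se_slope = weighted_meta_regression(
    effects=[1.0, 2.0, 3.0], variances=[1.0, 1.0, 1.0], covariate=[0.0, 1.0, 2.0])
```

A slope that is large relative to its standard error indicates that the study characteristic accounts for some of the between-study variation in contamination.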
We also tested for publication bias, which occurs where smaller studies are more likely to be published if their results are significant. In this context, studies based on limited numbers of water samples might be more likely to be published when they showed significant deterioration in water quality from source to point-of-use. The resulting data sets were therefore tested for publication bias using Egger's test (Egger et al. 1997a).
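Egger's test, as usually described, regresses each study's standard normal deviate (effect divided by its standard error) on its precision (one over the standard error) and examines whether the intercept differs from zero. The following is a minimal sketch of that regression using ordinary least squares; the function name is hypothetical, and the resulting t statistic would be referred to a t distribution on k − 2 degrees of freedom.

```python
import math


def egger_test(effects, std_errors):
    """Intercept, its standard error and t statistic for Egger's regression."""
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    z = [y / s for y, s in zip(effects, std_errors)]   # standard normal deviates
    x = [1 / s for s in std_errors]                    # precisions
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    residual_var = sum((zi - intercept - slope * xi) ** 2
                       for xi, zi in zip(x, z)) / (n - 2)
    se_intercept = math.sqrt(residual_var * (1 / n + x_bar ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept


# Hypothetical effects and standard errors for three studies.
intercept, se_intercept, t_stat = egger_test([1.0, 0.5, 0.5], [1.0, 0.5, 0.25])
```

An intercept far from zero suggests funnel plot asymmetry, which in this setting would mean that small studies tend to report larger deterioration in water quality than large ones.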
Figure 2a shows the odds ratios derived from the proportion of samples contaminated with E. coli at source and point-of-use [the ‘low bacteria count’ studies in (iii) above]. Figure 2b,c show the odds ratios for faecal and total coliforms. In Figure 2, odds ratios above 1 indicate that water quality is worse at point-of-use, whilst values below 1 indicate that it is worse at source. Figure 3 shows the standardized mean difference in faecal and total coliforms between source and point-of-use [the ‘high bacteria count’ studies in (iii) above]. Standardized mean differences above 0 indicate that water quality is worse at point-of-use, whilst values below 0 indicate that it is worse at source. The change in water quality recorded in a study is significant if the error bars do not pass through the null value, which is 1 for the odds ratios in Figure 2 and 0 for the standardized mean differences in Figure 3.
Figures 2 and 3 suggest that the proportion of contaminated samples and geometric mean bacteria count were significantly greater at point-of-use in approximately half of the studies reviewed. Regardless of the indicator bacteria, there were no studies where the geometric mean bacteria count or proportion of contaminated samples was significantly lower at point-of-use.
Small-scale studies showing significant levels of contamination between source and point-of-use were no more likely to be published than those showing no significant contamination. This was evident from Egger's test, which was not significant for any bacteriological indicator.
The heterogeneity test suggested that there was significant variability between different studies for all three types of indicator bacteria and both statistical measures of contamination between source and point-of-use.
Table 2 shows the meta-regression results for the three types of indicator bacteria for ‘high’ and ‘low bacteria count’ settings. In each case, the regression model that explained the greatest amount of variation between studies is presented. In a bivariate meta-regression analysis, study quality, as measured by publication in a refereed journal, was not significantly related to the change in water quality for any of the indicator bacteria. Consequently, studies published in refereed journals were pooled with 11 studies published elsewhere in the meta-regression analysis shown in Table 2.
Table 2. Meta-regression results for studies of microbiological water quality at source and point-of-use
Column headings: general level of water quality; study characteristic(s) in best model; number of studies; % of heterogeneity explained; residual heterogeneity (τ²).
Rows (general level of water quality: study characteristic(s) in best model):
Low bacteria counts: no study characteristic listed
Low bacteria counts: proportion using covered water vessels
Low bacteria counts: percentage of contaminated source samples
High bacteria counts: geometric mean bacteria count at source
High bacteria counts: geometric mean bacteria count at source
* Indicates a coefficient significant at the 99% level.
† Indicates a coefficient significant at the 95% level.
τ² is an estimate of the between-studies variance, having adjusted for the study characteristics shown in the table. This estimate assumes that the residual errors are normally distributed. The percentage of heterogeneity explained indicates the reduction in between-studies variance after adjusting for the study characteristics.
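The ‘% of heterogeneity explained’ in Table 2 is described as the reduction in between-studies variance after adjusting for the study characteristics. One conventional way of writing this, using τ₀² (notation introduced here, not in the original) for the between-studies variance from a model with no study characteristics and τ² for the residual variance from the adjusted model, is:

```latex
\[
  \%\ \text{of heterogeneity explained}
  \;=\; \frac{\tau_0^{2} - \tau^{2}}{\tau_0^{2}} \times 100
\]
```

On this reading, a value of 100% corresponds to τ² = 0, meaning that the study characteristics in the model account for all of the between-studies variance.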
For E. coli, the variation between study results was not explained by any of the study setting, design or quality characteristics. For ‘low bacteria count’ studies of faecal coliforms, all of the variation between study results was explained by the proportion of households using covered water storage vessels.
For the ‘high bacteria count’ studies of both total and faecal coliforms, contamination after collection was significantly higher when the geometric mean bacteria count for source water was low. Similarly, for ‘low bacteria count’ studies of total coliforms, contamination after collection was greater where the percentage of contaminated source samples was low. This can be seen from Figures 2c and 3, in which study results have been ordered according to the level of source contamination. Studies with safe water sources appear at the top of these figures and poorer quality sources at the bottom. Water quality deterioration from source to point-of-use, as measured by the standardized mean difference or log odds ratio, is greater for the studies of uncontaminated water sources. The total coliform studies with good quality source water were generally in urban settings and most were designed to sample households with disease cases more intensively. Consequently, both these variables were also significantly related to higher levels of contamination when entered into separate regression models for total coliforms.
Water contamination between source and point-of-use
Most observational studies of change in the microbiological quality of water at source and point-of-use indicate a decline after collection, although there is significant variation between settings. The results in Figures 2 and 3 suggest that approximately half of the included studies identified significant contamination after collection. There were no instances where microbiological water quality improved significantly after collection. The decline in water quality between source and point-of-use measured in terms of faecal and total coliforms is proportionately greater where source water is largely uncontaminated. These are often ‘improved’ water sources, such as wells and communal standpipes. For such sources, safer household water storage (Chidavaenzi et al. 1998) may be an appropriate additional intervention to prevent contamination of domestic water. If water testing is performed only at sources in such settings, then results of monitoring may not reflect the quality of water actually consumed in the home.
The percentage of point-of-use samples contaminated with faecal coliforms was also lower where households generally covered their water containers. This reduction in contamination by covering vessels implicates hands and cups being dipped in water as a probable source of contamination. This observation is supported by intervention studies, which have found that covered vessels reduce faecal and total coliform counts in stored water by 50% (Chidavaenzi et al. 1998; Mazengia et al. 2002).
Our review assessed the aggregate, community-level changes in water quality across multiple households and not water quality changes at the individual household level. It should be borne in mind that even where the typical household experiences poorer water quality at point-of-use, there are likely to be a minority of households that do not conform to this general trend within a population. For example, Vanderslice and Briscoe (1993) found that water quality actually improved between source and point-of-use for 16% of households in their study, although the majority experienced the opposite trend.
Nearly all studies used total coliforms, faecal coliforms or E. coli as an indicator of faecal contamination, reflecting available water testing technology in most developing countries. Of these indicator bacteria, E. coli are regarded as the most reliable measure of public health risks in drinking water (Edberg et al. 2000). Total coliforms can originate from decaying vegetation in tropical areas and so do not necessarily indicate the presence of pathogens in water. Similarly, faecal coliforms are now often referred to as ‘thermo-tolerant’ coliforms because many may be non-faecal in origin. In our analysis, the most useful indicator of faecal contamination in point-of-use water, E. coli, was thus the least predictable.
It is possible that the studies reviewed here may be subject to two types of systematic bias. One possibility is that subjects, on realising that water samples are being taken from neighbouring homes, modify their behaviour accordingly and become more careful in their handling and storage of water. Such a change in behaviour would mean that the studies here systematically underestimate the extent of contamination after collection. Secondly, the studies reviewed here are generally based on two sets of samples, those taken in the home and those taken at source. The source for each household is often identified by asking a subject about the water sources used that day. Our own recent fieldwork in Zimbabwe suggests that study subjects may be reluctant to admit to using an unprotected source, as they realise that such sources are regarded as hazardous to health. In this event, where subjects claim to use protected sources but are actually drawing water from unprotected sources, the extent of post-collection contamination is likely to be overestimated. Furthermore, the recommended sterilization of taps and boreholes by ‘burning off’ can lead to an underestimate of source contamination that may occur as water is pumped (Mertens et al. 1990).
The number of study characteristics examined in this review was limited by the findings reported in the literature. There may be other unreported, confounding factors in communities using safer sources (e.g. higher population densities and greater overcrowding) that could account for greater contamination of stored water there. In some studies, it was also unclear whether the number of samples taken from different source types reflected the number of households using each source type. This would also affect estimates of water quality change between source and point-of-use. Furthermore, sample sizes and measures of dispersion (e.g. geometric standard deviations) for coliform distributions need to be documented when reporting results. Without such key information, omitted in several papers reviewed here, comparison of findings between studies is difficult.
It is clear that the microbiological processes occurring within the transport and storage vessels are complex, given the interaction of the biota in the collected water with biofilms in the containers and/or recontamination through dipping hands and cups into containers. Future research is required to understand these processes in more detail and also to assess how the storage period affects point-of-use water quality. We also recommend that future studies record turbidity as this may indicate the presence of organic matter, a major influence on regrowth or die-off of micro-organisms. Given that total and faecal coliforms are no longer regarded as reliable indicators of faecal contamination, it is also recommended that E. coli be assayed in future studies of recontamination of domestic drinking water.
Microbiological contamination of water between source and point-of-use is widespread and often significant. Increased faecal and total coliform counts in stored domestic water are especially found in urban areas with uncontaminated supplies. The results imply that samples taken from storage vessels may provide a better reflection of the quality of water consumed than source samples, particularly in urban areas with safe water sources. Given these findings, future research is required into ways to combat such contamination at point-of-use if the health benefits of improved sources are not to be compromised.
This research was funded by the European Union INCO-DEV programme (contract no. ICA4-CT-2000-30039). The authors wish to acknowledge the assistance of Dr Prabhat Vaze of the Office for National Statistics for technical guidance and Kevin Murray for comments on chemical analysis of water samples. We also acknowledge Misheck Kirimi of the Network for Water and Sanitation International (Kenya), Kirsty Forbes and Hannah Bird for their assistance in compiling literature. The authors also wish to acknowledge the assistance of numerous authors contacted, who provided further information for the review here and comments by two anonymous reviewers.