Community-based validation of assessment of newborn illnesses by trained community health workers in Sylhet district of Bangladesh

Corresponding Author Abdullah H. Baqui, Department of International Health, Johns Hopkins Bloomberg School of Public Health, Suite E-8138, 615 N. Wolfe St., Baltimore, MD 21205, USA. E-mail: abaqui@jhsph.edu

Summary

Objectives  To validate trained community health workers’ recognition of signs and symptoms of newborn illnesses and classification of illnesses using a clinical algorithm during routine home visits in rural Bangladesh.

Methods  Between August 2005 and May 2006, 288 newborns were assessed independently by a community health worker and a study physician. Based on a 20-sign algorithm, sick neonates were classified as having very severe disease, possible very severe disease or no disease. The physician’s assessment was considered as the gold standard.

Results  Community health workers correctly classified very severe disease in newborns with a sensitivity of 91%, specificity of 95% and kappa value of 0.85 (P < 0.001). Community health workers’ recognition showed a sensitivity of more than 60% and a specificity of 97–100% for almost all signs and symptoms.

Conclusion  Community health workers with minimal training can use a diagnostic algorithm to identify severely ill newborns with high validity.

Abstract

Community-based validation of the assessment of newborn illnesses by trained community health workers in Sylhet district, Bangladesh

Objectives:  To validate trained community health workers’ recognition of the signs and symptoms of newborn illnesses and their classification of illnesses, using a clinical algorithm during routine home visits in rural Bangladesh.

Methods:  Between August 2005 and May 2006, 288 newborns were assessed independently by a community health worker and a study physician. Based on a 20-sign algorithm, sick newborns were classified as having very severe disease, possible very severe disease or no disease. The physician’s assessment was taken as the reference standard.

Results:  Community health workers correctly classified very severe disease in newborns with a sensitivity of 91%, a specificity of 95% and a kappa value of 0.85 (P < 0.001). Recognition by community health workers showed a sensitivity of more than 60% and a specificity of 97 to 100% for almost all signs and symptoms.

Conclusion:  Community health workers with minimal training can use a diagnostic algorithm to identify severely ill newborns with high validity.

Abstract

Community-based validation of the assessment of newborn illnesses carried out by community health workers in Sylhet district, Bangladesh

Objectives:  To validate the recognition of signs and symptoms of neonatal illnesses by trained community health workers, and their subsequent classification of illness using a clinical algorithm, during routine visits to rural households in Bangladesh.

Methods:  Between August 2005 and May 2006, 288 newborns were assessed independently by a community health worker and a study physician. Based on a 20-sign algorithm, sick neonates were classified as having very severe disease, possible very severe disease or no disease. The physician’s assessment was taken as the reference standard.

Results:  Community health workers correctly classified very severe disease in newborns with a sensitivity of 91%, a specificity of 95% and a kappa value of 0.85 (P < 0.001). Recognition by community health workers showed a sensitivity of more than 60% and a specificity of 97–100% for most signs and symptoms.

Conclusion:  Community health workers with minimal training can use a diagnostic algorithm to identify severely ill neonates.

Introduction

Every year, an estimated 4 million newborns die globally within the first month of life; 99% of those deaths occur in the developing world (Lawn et al. 2005; World Health Organization 2006). Serious infections, such as sepsis and pneumonia, account for up to 50% of neonatal deaths in high mortality settings (Black et al. 2003; Bryce et al. 2005; Lawn et al. 2005). Many of these deaths could be prevented by improving early recognition of newborn illnesses and access to appropriate and timely treatment (Bhutta et al. 2005; Darmstadt et al. 2005; Lawn et al. 2005).

In response to the need for improved illness recognition in the post-natal period, WHO expanded its Integrated Management of Childhood Illness (IMCI) strategy to include diagnosis and management of severe illnesses in infants less than 2 months of age (World Health Organization & UNICEF 2005). Both the original IMCI guidelines and newer young infant guidelines were designed and tested at first-level health facilities by professional health workers (Kalter et al. 1997a,b; Kolstad et al. 1997; Perkins et al. 1997; Simoes et al. 1997; Weber et al. 1997a,b; Young Infants Clinical Signs Study Group 2008; Gupta et al. 2000; Narang et al. 2007). In limited resource settings where most births occur in the home attended by untrained providers and care-seeking is low, minimally trained community health workers (CHWs) can play a vital role in reducing neonatal mortality by improving illness recognition and access to medical treatment (Bang et al. 1999; Bhutta et al. 2005; Haines et al. 2007; Baqui et al. 2008b; Kumar et al. 2008).

Limited evidence from developing countries suggests CHWs working outside formal health facilities can implement IMCI-type algorithms to identify serious illnesses in children (Simoes & McGrath 1992; Zeitz et al. 1993; Hadi 2001; Kelly et al. 2001; Kahigwa et al. 2002; Winch et al. 2005; Kallander et al. 2006). However, few studies have validated the ability of CHWs to recognize clinical signs and classify diseases in newborns in the first month of life in the home or at the community level (Bang et al. 2005; Mullany et al. 2006; Darmstadt et al. 2009).

We report the findings of a study conducted to compare the performance of CHWs with 6 weeks of training in assessing neonates using an IMCI-type algorithm, with that of physicians using the same clinical algorithm.

Methods

Study participants and design

The CHW validation study was nested within a community-based cluster randomized controlled trial in rural Bangladesh with three arms: Home Care, Community Care and comparison (Baqui et al. 2008b). The trial was implemented in a population of about 480 000 in three sub-districts of Sylhet district. Sylhet has the highest neonatal mortality rate of Bangladesh’s six divisions, and at baseline in 2002, 91% of births occurred at home and only 22% of newborns received a check-up within the first 30 days of life (National Institute of Population Research and Training (NIPORT) et al. 2001; Baqui et al. 2008a). In the Home Care arm, a female CHW provided home-based preventive and curative maternal-neonatal health care to a catchment population of about 4000. In the Community Care arm, community mobilizers provided the same information on maternal and newborn care during group education sessions rather than home visits. We previously reported a 34% neonatal mortality reduction in the Home Care arm, from 46.9 deaths per 1000 at baseline to 29.2 during the last 6 months of the 30-month intervention (Baqui et al. 2008b).

CHWs in the Home Care arm made two antenatal home visits to promote birth and newborn care preparedness and three post-natal home visits to check the health of the newborn on day 1 (day of birth), day 3 and day 7 of life using a standardized newborn assessment form. CHWs used a 20-sign clinical algorithm adapted from Bangladesh’s IMCI algorithm to classify sick neonates with very severe disease (VSD) or possible very severe disease (PVSD) (Figure 1). Neonates were classified with VSD if one or more of the following eight signs or symptoms were present: (i) observed convulsions, (ii) unconsciousness, (iii) fast breathing (respiratory rate of ≥60 breaths per minute), (iv) severe chest in-drawing, (v) fever (temperature ≥38.3 °C), (vi) hypothermia (temperature ≤35.3 °C), (vii) many or severe skin pustules or blisters on single large area, or pus or redness with swelling or (viii) umbilical redness extending to skin. Neonates were classified with PVSD if one or more of the following twelve signs or symptoms were present: (i) history of convulsion, (ii) bulging fontanelle, (iii) vomiting after every feed, (iv) mild fever (temperature 37.8–38.4 °C), (v) mild hypothermia (temperature 35.3–36.4 °C), (vi) weak, abnormal or absent cry, (vii) lethargic, less than normal movement, (viii) not able to feed, suck or attach to breast, (ix) umbilicus discharging pus, (x) umbilical redness not extending to skin, (xi) some skin pustules or (xii) jaundiced palm and sole after 1 day of life. Newborns with VSD or PVSD were referred for free treatment to government sub-district hospitals. CHWs offered home antibiotic treatment to VSD cases and PVSD cases with more than one sign if the family was unable to take the sick newborn to the hospital and made a follow-up visit within the next 24 h to monitor the infant and reinforce referral. We previously reported that CHW home treatment of severe illness in neonates was as effective as treatment by medically qualified providers (Baqui et al. 2009). Adjusted hazard ratio for death during the neonatal period was 0.22 with 95% confidence interval (CI) 0.07–0.71 for CHW treatment and 0.61 (95% CI 0.37–0.99) for treatment by qualified providers, compared to newborns who received no treatment or treatment by untrained providers.
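The classification rule described above can be sketched in a few lines of Python. This is an illustration only, not the trial's form or software: the sign identifiers are hypothetical labels for the 20 signs listed in the text, and giving VSD precedence when signs of both groups are present is an assumption.

```python
# Hypothetical identifiers for the eight VSD signs listed in the text.
VSD_SIGNS = frozenset({
    "convulsions_observed", "unconscious", "fast_breathing",
    "severe_chest_indrawing", "fever", "hypothermia",
    "many_or_severe_skin_pustules", "umbilical_redness_extending_to_skin",
})

# Hypothetical identifiers for the twelve PVSD signs listed in the text.
PVSD_SIGNS = frozenset({
    "history_of_convulsion", "bulging_fontanelle", "vomiting_after_every_feed",
    "mild_fever", "mild_hypothermia", "weak_abnormal_or_absent_cry",
    "lethargic", "not_able_to_feed", "umbilicus_discharging_pus",
    "umbilical_redness_not_extending_to_skin", "some_skin_pustules",
    "jaundiced_palm_and_sole",
})


def classify(signs):
    """Return 'VSD', 'PVSD' or 'no disease' for a collection of observed signs."""
    observed = set(signs)
    unknown = observed - VSD_SIGNS - PVSD_SIGNS
    if unknown:
        raise ValueError(f"unknown signs: {sorted(unknown)}")
    if observed & VSD_SIGNS:   # assumption: any VSD sign takes precedence
        return "VSD"
    if observed & PVSD_SIGNS:
        return "PVSD"
    return "no disease"


def offer_home_treatment(signs, can_reach_hospital):
    """Home antibiotics: VSD, or PVSD with more than one sign, when referral is not possible."""
    if can_reach_hospital:
        return False
    label = classify(signs)
    if label == "VSD":
        return True
    return label == "PVSD" and len(set(signs) & PVSD_SIGNS) > 1
```

For example, `classify(["mild_fever"])` returns `"PVSD"`, and a newborn with that single sign would not be offered home treatment under this sketch.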

Figure 1. Clinical algorithm for assessment and classification of sick neonates by CHWs in Sylhet.

Community health workers

Women with at least a 10th grade education were recruited as CHWs via advertisements in local newspapers, community gathering places and educational institutions. A preliminary group of 41 CHWs was selected to undergo training after a formal interview, and final placements were made based on successful completion of testing at the end of training. Mean age of CHWs at the time of recruitment was 23 years (standard deviation ± 4, range 18–46), and more than 60% of CHWs were single, divorced or separated. CHWs received 6 weeks of training, including skills development for behaviour change communication, clinical assessment of neonates, treatment of newborns with injectable antibiotics and record-keeping necessary for the trial. The training also included hands-on clinical training under supervision in a tertiary care hospital and in households. A 3-day refresher training was conducted midway through the 30-month intervention period. CHWs received ongoing training and support from five female field services supervisors (one for every eight CHWs). The supervisors accompanied CHWs during home visits at least 2 days a month to support their work and evaluate performance using a structured checklist. Supervisors provided feedback to CHWs at fortnightly meetings and discussed job responsibilities, difficulties encountered in the field and solutions to those problems. About half of the CHWs originally recruited for the trial worked for the entire 30 months of the intervention; replacement CHWs were recruited and trained as needed, following a similar process. CHW attrition and replacement were primarily due to CHWs leaving to get married or to take another job; additional analysis and discussion of attrition will be published separately.

Sample size and study procedures

The CHW validation study was conducted in the Home Care arm of the trial from August 2005 to June 2006. The sample size was calculated to provide estimates of sensitivity and specificity of CHWs’ assessment assuming project physicians’ assessment as the gold standard. Physicians had an MBBS degree and received training to standardize their use of the algorithm for newborn assessment. To estimate assumed sensitivity and specificity of 80% with precision of ±15%, 95% confidence level and 80% power, 150 well and 150 sick babies (i.e. those having VSD or PVSD) were required. Since the prevalence of VSD and PVSD was about 10% in the community, we needed to oversample sick newborns. To recruit neonates randomly and yet to recruit about equal numbers of well and sick newborns, we used the following sampling method. Every day we randomly selected one of the five CHW supervisors and asked her to prepare, by midday, a list of well and sick newborns in the eight CHW areas under her supervision. If one or more sick newborns were identified, the first sick newborn identified was selected. If there was no sick newborn in her area, the first well newborn from her area was selected. The CHWs were not informed of the selection. After the CHW completed her scheduled visit, the supervisor notified a study physician by cell phone, who reassessed the selected newborn within 3–6 h using the same clinical algorithm, either in the home or, if the newborn had been successfully referred, at the hospital. The CHW who assessed the selected newborn was not present during the physician’s assessment. Supervisors gave CHWs feedback on the results of the two assessments at their regularly scheduled fortnightly meetings.
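The daily selection rule described above can be sketched as follows. The function names, the dictionary layout and the newborn identifiers are hypothetical, and the sketch assumes each supervisor's list is already ordered by time of identification.

```python
import random


def pick_newborn(sick, well):
    """Return the first sick newborn on the list, else the first well newborn, else None."""
    if sick:
        return sick[0]
    if well:
        return well[0]
    return None


def daily_selection(lists_by_supervisor, rng=random):
    """Randomly choose one supervisor, then apply the sick-first rule to her lists.

    lists_by_supervisor maps a supervisor id to a (sick, well) pair of lists.
    """
    supervisor = rng.choice(sorted(lists_by_supervisor))
    sick, well = lists_by_supervisor[supervisor]
    return supervisor, pick_newborn(sick, well)
```

Preferring sick newborns whenever one is listed is what produces the oversampling of ill babies relative to their community prevalence of about 10%.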

Data

A total of 288 neonates participated in the validation study and mean age at assessment was 9 days (standard deviation ± 6.9, range 1–28); 53% of assessments were performed during the first week of life. Forty-one CHWs active during the validation study participated, 31 with at least 1 year of work experience and the others with less. Each CHW completed an average of seven assessments (standard deviation ± 4.4, range 1–17). One of the two physicians conducted 60% of the re-assessments.

Statistical analysis

All 288 neonates with consecutive assessments by CHW and physician were analysed. Agreement between CHWs’ and physicians’ assessments was evaluated by kappa statistic, with values of <0, 0.00–0.20, 0.21–0.40, 0.41–0.60, 0.61–0.80 and 0.81–1.00 considered as poor, slight, fair, moderate, substantial and almost perfect agreement, respectively (Landis & Koch 1977). All validity measures were calculated using physicians’ assessment as the gold standard. Sensitivity, specificity and 95% confidence intervals (CI) were calculated for CHWs’ identification of signs and symptoms and classification of VSD and PVSD. To estimate the predictive value of CHWs’ assessments in the entire Home Care arm population, we calculated positive predictive value (PPV) and negative predictive value (NPV) using sensitivity and specificity from the validation study, and prevalence of signs and symptoms in the Home Care arm, since the predictive value of a test varies based on prevalence of the condition in the population (Altman & Bland 1994; Hussaini et al. 1999; Verbeek et al. 2000; Gordis 2004). We estimated prevalence data from post-natal visit records of 8474 newborns assessed by CHWs from January 2004 to December 2005. Newborns were counted as having VSD or PVSD if CHWs classified the newborn at least once with VSD or PVSD during any visit. Data were analysed with Stata version 9.2 (StataCorp, College Station, TX, USA).
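The Landis & Koch bands quoted above can be written as a small lookup. This is a sketch for illustration; the function name is hypothetical, and assigning values that fall between two published bands (for example 0.205) to the lower band is an assumption.

```python
def landis_koch(kappa):
    """Map a kappa value to the Landis & Koch (1977) agreement label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:   # assumption: gaps between bands go to the lower band
            return label
```

With the values reported in this paper, `landis_koch(0.85)` gives "almost perfect" and `landis_koch(0.39)` gives "fair".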

Data collection and data quality assurance

Study supervisors and investigators checked the accuracy of data collection by CHWs in the field and routinely reviewed data forms for accuracy, consistency and completeness. All data quality problems were addressed promptly. Data were entered in custom-designed databases with necessary range and consistency checks and periodically checked by reviewing frequency distributions and cross-tabulations.

Ethical approval

The study received ethical approval from the Johns Hopkins Bloomberg School of Public Health Committee on Human Research and the Ethical Review Committee of the International Centre for Diarrhoeal Disease Research, Bangladesh. Informed verbal consent was obtained from all individual study participants by study staff.

Results

Sensitivity of CHWs’ assessment as compared to physicians’ assessment was above 60% for all signs and symptoms in the algorithm except mild fever (temperature 37.8–38.4 °C), which was 28.6%, and history of convulsion (50.0%) (Table 1). Specificity was 97–100% for all signs and symptoms. Sensitivity for the most prevalent sign of illness in the validation sample, fast breathing, was 87.5% and specificity was 98.0%. The kappa values were highly significant (P < 0.001) and indicated substantial (0.61–0.80) or almost perfect agreement (0.81–1.00) between the two sets of assessments on 16 of the 20 signs and symptoms of the algorithm. CHWs’ identification of mild fever (temperature 37.8–38.4 °C) achieved only fair agreement (kappa 0.39, P < 0.001). A negative value (kappa –0.01) for the sign ‘umbilical redness not extending to the skin’ indicates less than chance agreement, although the result was not significant (P = 0.54). Kappa values were not calculated for the remaining two signs, severe chest in-drawing and unconsciousness, because they were not observed in any newborns.

Table 1. Sensitivity, specificity and agreement of CHWs’ assessment compared to physicians’ assessment (n = 288)

| Sign, symptom or classification | Sensitivity n/N | Sensitivity % | Sensitivity (95% CI)‡ | Specificity n/N | Specificity % | Specificity (95% CI)‡ | Agreement κ | Agreement P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Signs and symptoms of very severe disease | | | | | | | | |
| Fast breathing | 35/40 | 87.5 | (73.2, 95.8) | 243/248 | 98.0 | (95.4, 99.3) | 0.855 | <0.001 |
| Hypothermia (≤35.3 °C) | 15/18 | 83.3 | (58.6, 96.4) | 266/270 | 98.5 | (96.3, 99.6) | 0.798 | <0.001 |
| Many or severe skin pustules or blisters | 10/11 | 90.9 | (58.7, 99.8) | 274/277 | 98.9 | (96.9, 99.8) | 0.826 | <0.001 |
| Convulsion observed | 6/6 | 100.0 | (54.1, 100.0) | 282/282 | 100.0 | (98.7, 100.0) | 1.000 | <0.001 |
| Fever (temperature ≥38.3 °C) | 4/6 | 66.7 | (22.3, 95.7) | 282/282 | 100.0 | (98.7, 100.0) | 0.797 | <0.001 |
| Umbilical redness extending to the skin | 2/2 | 100.0 | (15.8, 100.0) | 285/286 | 99.7 | (98.1, 100.0) | 0.798 | <0.001 |
| Severe chest in-drawing | 0/1 | 0.0 | (0.0, 97.5) | 287/287 | 100.0 | (98.7, 100.0) | — | — |
| Unconscious | 0/0 | — | — | 288/288 | 100.0 | (98.7, 100.0) | — | — |
| Signs and symptoms of possible very severe disease | | | | | | | | |
| Lethargic or less than normal movement | 14/20 | 70.0 | (45.7, 88.1) | 264/268 | 98.5 | (96.2, 99.6) | 0.718 | <0.001 |
| Mild hypothermia (35.3–36.4 °C) | 15/22 | 68.2 | (45.1, 86.1) | 259/266 | 97.4 | (94.7, 98.9) | 0.656 | <0.001 |
| Not able to feed or not suck at all | 17/23 | 73.9 | (51.6, 89.8) | 258/265 | 97.4 | (94.6, 98.9) | 0.699 | <0.001 |
| Jaundiced palm and sole after 1 day of birth | 10/16 | 62.5 | (35.4, 84.8) | 270/272 | 99.3 | (97.4, 99.9) | 0.700 | <0.001 |
| Weak, abnormal and absent cry | 8/13 | 61.5 | (31.6, 86.1) | 272/275 | 98.9 | (96.8, 99.8) | 0.652 | <0.001 |
| Some skin pustules | 10/12 | 83.3 | (51.6, 97.9) | 271/276 | 98.2 | (95.8, 99.4) | 0.728 | <0.001 |
| Mild fever (37.8–38.4 °C) | 2/7 | 28.6 | (3.7, 71.0) | 280/281 | 99.6 | (98.0, 100.0) | 0.391 | <0.001 |
| Umbilicus discharging pus | 4/5 | 80.0 | (28.4, 99.5) | 281/283 | 99.3 | (97.5, 99.9) | 0.722 | <0.001 |
| History of convulsion | 1/2 | 50.0 | (1.3, 98.7) | 286/286 | 100.0 | (98.7, 100.0) | 0.665 | <0.001 |
| Bulging fontanelle | 1/1 | 100.0 | (2.5, 100.0) | 287/287 | 100.0 | (98.7, 100.0) | 1.000 | <0.001 |
| Vomiting everything | 1/1 | 100.0 | (2.5, 100.0) | 287/287 | 100.0 | (98.7, 100.0) | 1.000 | <0.001 |
| Umbilical redness not extending to the skin | 0/1 | 0.0 | (0.0, 97.5) | 284/287 | 99.0 | (97.0, 99.8) | –0.005 | 0.541 |
| Identification of illnesses | | | | | | | | |
| Very severe disease | 67/74 | 90.5 | (81.5, 96.1) | 204/214 | 95.3 | (91.6, 97.7) | 0.847 | <0.001 |
| Possible very severe disease | 48/59 | 81.4 | (69.1, 90.3) | 219/229 | 95.6 | (92.1, 97.9) | 0.783 | <0.001 |

n = cases identified by CHWs, N = cases identified by physicians.
‡Exact binomial confidence intervals.
— Some validity measures could not be calculated for signs with no cases identified by CHWs.
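The exact binomial confidence intervals in Table 1 are usually computed with the Clopper-Pearson method. The sketch below is one way to obtain such an interval using only the Python standard library; it is not the Stata routine used in the study, and the function names and the bisection approach are illustrative assumptions.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(f, iters=100):
    """Bisection on [0, 1] for a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper
```

For the 'convulsion observed' row (6 of 6), `exact_ci(6, 6)` has a closed-form lower bound of 0.025^(1/6), about 0.541, which agrees with the 54.1% lower limit in Table 1.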

CHWs’ classification of VSD had a 90.5% sensitivity, 95.3% specificity and kappa value of 0.85 (P < 0.001) which corresponds to almost perfect agreement between CHWs’ and physicians’ assessments. CHWs’ classification of PVSD had a kappa value of 0.78 (P < 0.001) indicating substantial agreement as well as high sensitivity and specificity (81.4% and 95.6%, respectively).
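The VSD figures can be reproduced from the 2 × 2 counts implied by Table 1 (67 of 74 physician-identified cases detected, 204 of 214 non-cases correctly classified). The sketch below uses the standard definitions of sensitivity, specificity and Cohen's kappa; the function name is hypothetical.

```python
def agreement_stats(tp, fn, fp, tn):
    """Sensitivity, specificity and Cohen's kappa from a 2 x 2 table.

    tp/fn: physician-positive newborns classified positive/negative by the CHW.
    fp/tn: physician-negative newborns classified positive/negative by the CHW.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of both raters.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# VSD counts taken from Table 1: TP = 67, FN = 7, FP = 10, TN = 204.
sens, spec, kappa = agreement_stats(tp=67, fn=7, fp=10, tn=204)
print(f"{sens:.1%} {spec:.1%} {kappa:.3f}")  # prints "90.5% 95.3% 0.847"
```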

For individual signs and symptoms in the algorithm, PPV ranged from a low of 29.6% for mild hypothermia (temperature 35.3–36.4 °C) to a high of 100.0% for five other signs (history or observed convulsions, fever ≥ 38.3 °C, bulging fontanelle and vomiting) (Table 2). We calculated PPV of 51.0% for VSD and 66.4% for PVSD. NPV was high (99–100%) for all signs and symptoms, and classifications.

Table 2. Positive predictive value (PPV) and negative predictive value (NPV) of CHWs’ assessment compared to physicians’ assessment

| Sign, symptom or classification | Projahnmo prevalence (%) (n = 8474) | Validation prevalence (%) (n = 288) | PPV % | PPV (95% CI)† | NPV % | NPV (95% CI)† |
| --- | --- | --- | --- | --- | --- | --- |
| Signs and symptoms of very severe disease | | | | | | |
| Fast breathing | 3.6 | 13.9 | 61.8 | (40.3, 79.5) | 99.5 | (98.9, 99.8) |
| Hypothermia (≤35.3 °C) | 1.2 | 6.3 | 40.6 | (20.2, 64.9) | 99.8 | (99.4, 99.9) |
| Many or severe skin pustules or blisters | 0.6 | 3.8 | 33.6 | (13.9, 61.3) | 99.9 | (99.6, 100.0) |
| Convulsion observed | 0.3 | 2.1 | 100.0 | (9.0, 96.2) | 100.0 | (99.7, 100.0) |
| Fever (temperature ≥38.3 °C) | 0.7 | 2.1 | 100.0 | (13.2, 97.7) | 99.8 | (99.3, 99.9) |
| Umbilical redness extending to the skin | 0.2 | 0.7 | 36.4 | (5.7, 63.0) | 100.0 | (99.6, 100.0) |
| Severe chest in-drawing | 0.3 | 0.3 | — | — | — | — |
| Unconscious | 0.1 | 0.0 | — | — | — | — |
| Signs and symptoms of possible very severe disease | | | | | | |
| Lethargic or less than normal movement | 1.5 | 8.0 | 41.7 | (20.6, 66.3) | 99.5 | (99.1, 99.8) |
| Mild hypothermia (35.3–36.4 °C) | 1.6 | 7.6 | 29.6 | (16.1, 48.0) | 99.5 | (99.0, 99.7) |
| Not able to feed or not suck at all | 1.5 | 6.9 | 29.9 | (16.5, 47.9) | 99.6 | (99.2, 99.8) |
| Jaundiced palm and sole after 1 day of birth | 0.9 | 5.6 | 43.6 | (15.6, 76.4) | 99.7 | (99.4, 99.8) |
| Weak, abnormal and absent cry | 1.0 | 4.5 | 36.3 | (14.6, 65.5) | 99.6 | (99.2, 99.8) |
| Some skin pustules | 4.6 | 4.2 | 68.9 | (47.3, 84.6) | 99.2 | (97.2, 99.8) |
| Mild fever (37.8–38.4 °C) | 0.7 | 2.4 | 36.1 | (5.5, 84.7) | 99.5 | (99.2, 99.7) |
| Umbilicus discharging pus | 5.7 | 1.7 | 87.2 | (61.6, 96.7) | 98.8 | (93.4, 99.8) |
| History of convulsion | 0.1 | 0.7 | 100.0 | (1.4, 85.1) | 99.9 | (99.8, 100.0) |
| Bulging fontanelle | 0.1 | 0.3 | 100.0 | (2.4, 88.5) | 100.0 | (99.7, 100.0) |
| Vomiting everything | 0.2 | 0.3 | 100.0 | (4.6, 93.9) | 100.0 | (99.4, 100.0) |
| Umbilical redness not extending to the skin | 1.0 | 0.3 | 0.0 | (1.5, 74.0) | 99.0 | (98.3, 99.7) |
| Identification of illnesses | | | | | | |
| Very severe disease | 5.6 | 25.7 | 51.0 | (36.1, 65.7) | 99.5 | (98.9, 99.7) |
| Possible very severe disease | 11.2 | 20.5 | 66.4 | (51.6, 78.6) | 98.0 | (96.6, 98.8) |

†Confidence intervals were estimated using upper and lower bounds of the likelihood ratio of a positive or negative test. For signs with 100% specificity, likelihood ratios were estimated using the substitution formula (0.5 added to all cell frequencies).
— Some validity measures could not be calculated for signs with no cases identified by CHWs.
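The prevalence adjustment behind Table 2 follows from Bayes' theorem. The sketch below applies the standard formulas to the fast breathing row, using the sensitivity and specificity from Table 1 and the Projahnmo prevalence of 3.6%; the function name is hypothetical. Because the published prevalences are rounded, and the exact inputs used by the authors are not given, other rows may not reproduce to the last decimal.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV of a test applied in a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Fast breathing: sensitivity 35/40, specificity 243/248, prevalence 3.6%.
ppv, npv = predictive_values(35 / 40, 243 / 248, 0.036)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # prints "PPV 61.8%, NPV 99.5%"
```

The same sensitivity and specificity applied at the 13.9% prevalence of the validation sample would give a much higher PPV, which is why the oversampled study data cannot be used directly to predict routine performance.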

Discussion

We evaluated the ability of trained CHWs in a rural area of Bangladesh to assess newborns in their homes and correctly identify sick newborns, in comparison with assessments of physicians. We demonstrated that CHWs with no prior background in health were able to recognize signs of illness and classify illness in neonates using a simple clinical algorithm after 6 weeks of training. CHWs correctly identified newborns with VSD requiring immediate referral with a sensitivity of 91%, specificity of 95%, and kappa value of 0.85. Based on their assessment of clinical signs, CHWs missed 7 of 74 cases of VSD and incorrectly classified 10 of 214 neonates as having VSD.

Compared to previous validation studies of CHWs’ assessments of neonates in their homes, CHWs in this study detected illnesses with equal or higher sensitivity and similar specificity. In a 7-year study in rural India, the ability of village health workers (VHWs) to diagnose neonatal sepsis using a shorter algorithm of seven signs was compared with the results generated by a computer algorithm from visit records (Bang et al. 2005). VHWs diagnosed sepsis cases with 89% sensitivity and 99.5% specificity in comparison with the computer algorithm. Our group recently conducted another validation study of CHWs with similar training in Mirzapur, Bangladesh, which also compared their assessment of neonates with that of a study physician (Darmstadt et al. 2009). This study used an 11-sign algorithm to diagnose VSD and observed 73% sensitivity, 98% specificity and a kappa value of 0.63, indicating substantial agreement between CHWs and physicians. Lower sensitivity in the Mirzapur validation study may be due to inclusion of fewer ill newborns (3% prevalence of VSD compared to 26% in our sample) and a longer interval between the CHW’s and physician’s assessments.

Our results also compare favourably to validation studies in older children; our sensitivity for severe illness was higher and specificity was equivalent. In a hospital-based study in Kenya, CHWs with 3 weeks of initial training and a 1-week refresher course were evaluated on their use of a 21-sign algorithm to identify illnesses in children <5 years old. They classified VSD with sensitivity of 57–65% compared to physician’s assessment (VSD was indicated by presence of one of four danger signs or fever in children <2 months) (Kelly et al. 2001). In a study in northern and central Bangladesh, female community health volunteers (CHVs) received 3 days of training on identification and treatment of acute respiratory infections (ARIs) in children 3–59 months of age and monthly refresher trainings (Hadi 2003). They diagnosed ARIs with 68% sensitivity and 95% specificity compared to the study paediatrician’s assessment.

Ability to recognize fast breathing is a necessary skill for diagnosis of severe illnesses in most algorithms for assessing ill infants and children, including current WHO IMCI guidelines (World Health Organization & UNICEF 2005). Our analysis indicates CHWs in Sylhet identified fast breathing accurately with a sensitivity of 88% and specificity of 98%. Since fast breathing was the most commonly observed symptom in our study (14%), the high sensitivity and specificity with which CHWs identified fast breathing was an important factor in overall accuracy of their diagnoses. Previous CHW evaluations of sensitivity and specificity in detection of fast breathing reported lower sensitivity and specificity. In the Mirzapur study, two levels of fast breathing were included in the algorithm and sensitivity of both signs was very low (25% for ≥70 breaths per minute and 7% for 60–69 breaths per minute) (Darmstadt et al. 2009). In the Kenya study mentioned above, CHWs recognized tachypnoea in children <5 years of age with moderate sensitivity (41–66%) (Kelly et al. 2001). Another study of children <5 years in Uganda reported that community drug distributors with 2 days of training on detection of ARIs correctly counted and classified fast breathing with 75% sensitivity and 83% specificity (Kallander et al. 2006).

The strong initial training and continuing supervision and support received by CHWs in our study are likely reasons why we observed such strong agreement between assessments. Seventy-eight percent of CHWs working at the time of the validation study received 6 weeks of training and had worked for the project for a year or more. CHWs in most validation studies discussed here had shorter training periods, sometimes only a few days, and little ongoing supervision or feedback on performance. The Bangladesh study mentioned previously compared assessments of CHVs who received routine supervision by para-professionals to those of CHVs who did not (Hadi 2003). Regular supervision in that study included monthly meetings to discuss performance and re-examination of selected patients by supervisors to confirm diagnosis. Regularly supervised CHVs were significantly more likely to make a correct ARI diagnosis in children 3–59 months of age than those who were supervised irregularly (93% vs. 78%, P < 0.01). Logistic regression analysis to control for factors such as child’s age and gender and CHVs’ experience and training indicated odds of correct ARI diagnosis were more than four times higher for regularly supervised CHVs (OR 4.21, P < 0.001).

Sensitivities and specificities for CHWs’ assessments reported here are based on oversampling of sick newborns. There were 133 newborns (46%) with VSD or PVSD in the validation sample while only 17% of neonates in the parent trial population were classified with illnesses. To predict the probability of a correct assessment and diagnosis by CHWs, we applied prevalence data from all neonates assessed in the parent trial (Hussaini et al. 1999; Verbeek et al. 2000). Therefore, our reported PPV of 51% for VSD indicates an estimated 51% of newborns classified with VSD by CHWs during the entire trial were correctly diagnosed (Altman & Bland 1994). Based on these calculations, we estimate that between one-third and one-half of newborns classified as ill by CHWs in routine use of this algorithm in Sylhet would be inaccurately classified. This could lead to considerable over-referral and subsequent burden on first-level health facilities, as well as unnecessary treatment. However, actual compliance with referral was low in the parent study. Rates of successful referral to qualified medical providers were 34% for VSD, 25% for PVSD with multiple signs and 10% for PVSD with one sign (Baqui et al. 2008b). High NPV for VSD and PVSD indicates that an estimated 2% of newborns classified by CHWs in the parent trial as healthy would be false negatives. Thus, risk of failure to treat ill newborns would be low. Considering the high neonatal mortality rate in Bangladesh, we prefer a low risk of missed diagnosis and higher potential referral rather than the reverse situation. Further CHW training targeted to improve assessment skills for signs with low sensitivity and refinement and simplification of the current algorithm would likely reduce the number of newborns unnecessarily referred. In addition, while we would hope to see higher successful referral in the future, low compliance during the trial suggests there is little risk that expansion of CHW surveillance for newborn illnesses would generate an unreasonable burden on the local health system.

Strengths of this study include the design which ensured a sample with high percentage of ill babies and the ability to use prevalence data from the parent trial to predict accuracy of CHWs’ assessments under routine conditions. We present validation results from neonates assessed in their homes, a population infrequently studied in this context. A potential limitation of all validation studies is the lag time between consecutive assessments. Although efforts were made to time the physicians’ visits to follow as soon as possible after CHWs’, there may have been changes in the condition of the newborn between assessments.

In conclusion, this study corroborates limited evidence that CHWs can recognize signs of illness and classify illnesses in newborns during the first month of life using a clinical algorithm. Our finding that CHWs can assess newborns with very high sensitivity and specificity provides strong support for expansion of community-based neonatal health care in settings where the burden of newborn illness is high and care seeking is low.

Acknowledgements

The validation study presented here was a sub-study within a larger study known as Projahnmo (the Project for Advancing the Health of Newborns and Mothers). Projahnmo was a partnership of the ICDDR,B; the Bangladesh government’s Ministry of Health and Family Welfare; Bangladeshi non-governmental organizations, including Shimantik, Save the Children-US, Dhaka Shishu Hospital and the Institute of Child and Mother Health and the Johns Hopkins Bloomberg School of Public Health. We thank the members of the Projahnmo Technical Review Committee, the members of the Shimantik Executive Committee, and colleagues at the Bangladesh Ministry of Health and Family Welfare at the sub-district, district and national levels for their valuable help and advice. We thank Renata Pharmaceuticals Ltd, Bangladesh, for preparing the penicillin for study purposes. We thank Sylhet MAG Osmani Medical College for serving as our training venue as well as providing support for management of sick babies referred from the project area. We thank the many individuals in Sylhet district who gave their time generously as well as Projahnmo field and data management staff who worked tirelessly. The critical innovative inputs of Projahnmo study group members are acknowledged.

This research was funded by the United States Agency for International Development (USAID), through cooperative agreements with the Johns Hopkins Bloomberg School of Public Health and the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B), and by the Saving Newborn Lives program of Save the Children-US through a grant from the Bill & Melinda Gates Foundation. Study sponsors had no role in study design; the collection, analysis and interpretation of data; writing of the report; or in the decision to submit this article for publication.
