A higher detection rate for colorectal cancer and advanced adenomatous polyp for screening with immunochemical fecal occult blood test than guaiac fecal occult blood test, despite lower compliance rate. A prospective, controlled, feasibility study
Gastroenterology Department, Rabin Medical Center, Tel Aviv University, Tel Aviv, Israel
Immunochemical fecal occult blood test (FIT) is a new colorectal cancer (CRC) screening method already recommended by the American screening guidelines. We aimed to test the feasibility of FIT as compared to guaiac fecal occult blood test (G-FOBT) in a large urban population of Tel Aviv. Average-risk persons, aged 50–75 years, were offered FIT or G-FOBT after randomization according to the socioeconomic status of their clinics. Participants with positive tests underwent colonoscopy. Participants were followed through the Cancer Registry 2 years after the study. Hemoccult SENSA™ and OC-MICRO™ (three samples, 70 ng/ml threshold) were used. FIT was offered to 4,657 persons (Group A) and G-FOBT to 7,880 persons (Group B). Participation rate was 25.9% and 28.8% in Group A and B, respectively (p < 0.001). Positivity rate in Group A and B was 12.7% and 3.9%, respectively (p < 0.001). Cancer was found in six (0.49%) and eight (0.35%) patients of Group A and B, respectively (NS). Cancer registry follow-up found missed cancers in five (0.22%) cases of Group B and none in Group A (NS). The sensitivity, specificity, negative predictive value and positive predictive value for cancer were 100%, 85.9%, 100% and 3.9% in Group A and 61.5%, 96.4%, 99.8% and 9.1% in Group B, respectively. There was increased detection of advanced adenomatous polyp (AAP) by FIT, irrespective of age, gender, and socioeconomic status (Per Protocol: odds ratio 2.69, 95% confidence interval 1.6–4.5; Intention to Screen: odds ratio 3.16, 95% confidence interval 1.8–5.4). FIT is feasible in an urban, average-risk population and significantly improved performance for detection of AAP and CRC, despite reduced participation.
Annual or biennial guaiac fecal occult blood test (G-FOBT) screening reduces colorectal cancer (CRC) mortality by 16–33%1–3 or even more.4 The advantages of G-FOBT include privacy, noninvasiveness and cost-effectiveness. However, the standard G-FOBT is faulted for its low sensitivity for CRC and advanced adenomatous polyp (AAP), low specificity due to nonspecificity for human hemoglobin, the need for periodic testing, low patient adherence and the possibility of inaccurate development and evaluation.5–7
Recently, the FIT was developed to improve specificity and eliminate the need for dietary restriction. Laboratory-based, automated, immunochemical measurement of fecal human hemoglobin allows clinicians to choose a fecal hemoglobin threshold level to perform colonoscopy and can adjust this threshold to take account of the patient's risk for advanced neoplasia. FIT was first approved by the U.S. Food and Drug Administration as a qualitative test and then was recommended by the American Cancer Society.8 Recently, it was added to the recommendation of the American Task Force.9 The American College of Gastroenterology recommends colonoscopy for screening CRC of the average-risk population and FIT (instead of G-FOBT) as a second option.10
In a series of recent articles, we found FIT equal to or better than G-FOBT in detecting CRC and AAP in symptomatic or asymptomatic high-risk patients.11–17 At the manufacturer-recommended threshold of 100 ng/ml, the three-sample test sensitivity and specificity for detecting cancer were 88% and 90%, respectively, and for detecting significant neoplasia (CRC and AAP) were 62% and 93%, respectively.15 van Rossum et al. conducted a population-based study on a random sample of 20,623 individuals aged 50–75 years randomized to either G-FOBT (Hemoccult II) or FIT (OC-Sensor).18 The number-to-scope to find one cancer was comparable between the tests. However, participation and detection rates for advanced adenomas and cancer were significantly higher for FIT.
The aim of this study was to compare FIT with G-FOBT for screening average-risk people in a prospective, controlled study. Because FIT is a more demanding procedure and is associated with complex logistics, we aimed to assess the feasibility of this approach in the urban population in Israel, with particular attention to different socioeconomic classes.
The study was designed as a population-based study in which average-risk persons aged 50–75 years were offered either FIT (OC-MICRO™, Eiken Chemical Co., Tokyo, Japan) or G-FOBT (Hemoccult SENSA™, Beckman Coulter, Fullerton, CA) according to a randomization program based on the socioeconomic status (SES) of the primary care clinic. Patients from nine primary care clinics of Clalit Health Services (CHS) in Tel Aviv were included. Only screenees with positive tests were referred to colonoscopy. All the participants who performed the test were followed through the Israel National Cancer Registry from the end of the study (Fig. 1).
Asymptomatic people aged 50–75 years who belonged to the nine selected clinics were included in the study.
The following patients were excluded: (i) Patients who underwent colonoscopy or sigmoidoscopy in the last 5 years. (ii) Patients who participated in the G-FOBT general screening program in the last 2 years. (iii) Patients who had an established CRC or inflammatory bowel disease (IBD). The information about previous colonoscopies, the G-FOBT routine screening program and IBD was extracted from the computerized CHS database. Patients with visible rectal bleeding, hematuria, menstruation or with symptoms related to the gastrointestinal tract were instructed not to perform FOBT.
Randomization by the socioeconomic status of the clinic
Every patient of CHS belongs to a primary care clinic, and clinics are categorized according to the number of insured patients and the SES of the area. In Israel, every citizen is entitled to comprehensive medical insurance by law. People with low income are exempt from paying National Security tax. The authorities regularly report which CHS patients are exempt from National Security tax. The percentage of exempt patients serves as a marker for the SES level of the clinic. According to CHS strategy, a clinic with up to 15% exempt patients is classified as high SES, one with 16–30% as medium SES, and one with more than 30% as low SES.
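The classification rule above can be sketched as a small function. This is only an illustration: the function name `classify_clinic_ses` is hypothetical, and the handling of fractional values between 15% and 16% (treated here as medium SES) is an assumption, since the text gives only whole-number bands.

```python
def classify_clinic_ses(pct_tax_exempt: float) -> str:
    """Map a clinic's percentage of tax-exempt patients to an SES level.

    Bands follow the text: up to 15% -> high, 16-30% -> medium, above 30% -> low.
    Values between 15 and 16 are treated as medium (an assumption).
    """
    if not 0 <= pct_tax_exempt <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_tax_exempt <= 15:
        return "high"
    if pct_tax_exempt <= 30:
        return "medium"
    return "low"


if __name__ == "__main__":
    for pct in (10, 22, 45):
        print(pct, classify_clinic_ses(pct))
```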
Primary care clinics participating in the study
Nine medium-sized primary care clinics (1,000–2,000 patients) were included, three clinics from each SES level. Because FIT is more expensive and requires more complex logistics (cooling bags and specific transportation), we assigned one third of the clinics to FIT (one clinic from each SES level) and two thirds to G-FOBT (two clinics from each SES level).
All included people received an invitation letter to participate in the study. Asymptomatic people willing to participate were instructed to go to the primary care clinic and ask for the FOBT kits. Those willing to participate were instructed how to prepare the FOBTs and were asked to bring them back to the clinic. The kits were then transported to a central laboratory. The patients with positive tests were referred to a consultant gastroenterologist with a recommendation to perform colonoscopy (Fig. 1).
Hemoccult SENSA™ (HOS)
Cards were provided at the primary care clinic. Patients willing to participate in the study received an oral explanation and written instructions about test preparation. Patients were requested to follow the manufacturer's instructions on diet and use of medications before and during the preparation for G-FOBT. They applied stool to six windows of three cards and brought them back to the clinic where they had been provided. The cards were then collected and checked at the central laboratory of the CHS. The test result was sent to the patient and appeared online in the patient's electronic file. The test was considered positive if any one of the six windows was positive.
This FIT has been described by us and by others.14 Patients willing to participate in the study received an oral explanation and written instructions about the test preparation. They were given the kit for fecal sampling and requested to prepare three consecutive daily samples without any limitation of diet or medication. The patients were instructed to keep the samples in the refrigerator and bring the samples back to the clinic using a cooling bag provided with the kits. Samples were refrigerated at 4°C until developed within 2 weeks of preparation. The samples were analyzed by the OC-MICRO™ instrument, which measured the hemoglobin amount and reported the results automatically as ng/ml of buffer. A test was defined as positive when the highest of the three tubes was ≥70 ng/ml.
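The positivity rule for FIT (highest of three tubes at a 70 ng/ml threshold) can be expressed in a few lines. The sketch below is illustrative only; the names `FIT_THRESHOLD_NG_ML` and `fit_is_positive` are hypothetical and not part of the instrument's software.

```python
FIT_THRESHOLD_NG_ML = 70  # study threshold, ng hemoglobin per ml of buffer


def fit_is_positive(tube_values_ng_ml, threshold=FIT_THRESHOLD_NG_ML):
    """Return True if the highest tube reading reaches the threshold."""
    if not tube_values_ng_ml:
        raise ValueError("at least one tube reading is required")
    return max(tube_values_ng_ml) >= threshold


if __name__ == "__main__":
    print(fit_is_positive([12, 69, 5]))   # all three tubes below threshold
    print(fit_is_positive([0, 70, 12]))   # one tube at the threshold
```

Passing only the first reading and a different `threshold` mimics the alternative strategies discussed in the results (for example, first tube only at 200 ng/ml).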
Colonoscopy was performed to the cecum or to an obstructing carcinoma if present. Otherwise, an incomplete examination was repeated or the patient was excluded from the analysis. All lesions were noted, biopsied or removed and the number of polyps was noted, as were their size and sites. Polyps were classified as nonadenomas or adenomas. Adenomas were grouped by size: ≤5, 6–9 and ≥10 mm; by their histology: tubular, tubulo-villous and villous; and by dysplasia: low-grade (LGD) or high-grade (HGD). The term “significant” neoplasms included CRC or AAP; this latter category included adenomas ≥10 mm in diameter or having ≥20% villous histology or having any amount of HGD independent of size. All AAPs <10 mm were reexamined to confirm their size and histologic diagnosis.
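The definition of AAP and of "significant" neoplasia given above can be written as a short classifier. This is a minimal sketch; the `Adenoma` record and both function names are hypothetical, introduced only to illustrate the three criteria (size ≥10 mm, ≥20% villous histology, any HGD).

```python
from dataclasses import dataclass


@dataclass
class Adenoma:
    size_mm: float              # largest diameter in millimetres
    villous_fraction: float     # 0.0-1.0, share of villous histology
    high_grade_dysplasia: bool  # any amount of HGD


def is_advanced(adenoma: Adenoma) -> bool:
    """An adenoma is advanced if it meets any one of the three criteria."""
    return (
        adenoma.size_mm >= 10
        or adenoma.villous_fraction >= 0.20
        or adenoma.high_grade_dysplasia
    )


def has_significant_neoplasia(has_crc: bool, adenomas: list) -> bool:
    """Significant neoplasia: CRC or at least one advanced adenoma."""
    return has_crc or any(is_advanced(a) for a in adenomas)


if __name__ == "__main__":
    small_with_hgd = Adenoma(size_mm=6, villous_fraction=0.0, high_grade_dysplasia=True)
    print(is_advanced(small_with_hgd))  # advanced because of HGD, despite small size
```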
Cancer registry follow-up
Using record linkage analysis techniques and the personal identification number given to all Israeli citizens at birth or immigration, as well as other demographic data, we identified cancer in the cohort of CHS members. The Israel National Cancer Registry (INCR) is a population-based national cancer registry in operation since 1960. Reporting has been mandatory since 1982, and the registry meets all internationally accepted requirements for coding system and completeness of data. All the participants completed at least 2 years of follow-up before the transaction day.
Analysis and statistical methods
The study was designed as a population-based study in which only patients with positive FOBT were sent to colonoscopy; therefore, the sensitivity for AAP could not be calculated. Detection rate was defined as the percentage of patients with lesions among the compliant population (performed FOBT, per protocol) or the invited population (intention to screen). Positivity rate was defined as the percentage of positive tests among the compliant population. Positive predictive value (PPV) was defined as the percentage of patients with lesions among the patients who underwent colonoscopy. Number needed to scope (NNS) was defined as the number of colonoscopies that must be performed to detect one AAP/CRC (the value is the reciprocal of PPV). Because fecal hemoglobin was not normally distributed, we used the median rather than the mean to describe hemoglobin level and used nonparametric tests. We used the χ2 test or Fisher exact test for the comparison of different rates. Binary logistic regression analysis (forward, conditional) for the detection of cancer or cancer and advanced adenoma was performed including all covariates tested by univariate analysis [age (<60 years, ≥60 years), gender (male, female), SES (low, medium, high) and National Security Tax Free (yes, no)] in both per protocol and intention to screen analysis. All p values appear in the tables; p < 0.05 was considered significant. Statistical analysis was performed with SPSS for Windows, version 15.
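The four screening metrics defined above can be computed from simple counts. The sketch below uses hypothetical counts chosen only to make the arithmetic easy to follow; they are not study data, and the function names are illustrative.

```python
def positivity_rate(n_positive: int, n_performed: int) -> float:
    """Positive tests among those who performed the test."""
    return n_positive / n_performed


def detection_rate(n_with_lesion: int, n_denominator: int) -> float:
    """Patients with lesions among the compliant (per protocol) or
    invited (intention to screen) population, depending on the denominator."""
    return n_with_lesion / n_denominator


def ppv(n_with_lesion: int, n_colonoscopies: int) -> float:
    """Patients with lesions among those who underwent colonoscopy."""
    return n_with_lesion / n_colonoscopies


def number_needed_to_scope(n_with_lesion: int, n_colonoscopies: int) -> float:
    """Colonoscopies needed to find one lesion: the reciprocal of PPV."""
    return n_colonoscopies / n_with_lesion


if __name__ == "__main__":
    # Hypothetical example: 1,000 invited, 300 tested, 30 positive,
    # 24 colonoscopies, 6 patients with a lesion.
    invited, performed, positive, scoped, lesions = 1000, 300, 30, 24, 6
    print("positivity:", positivity_rate(positive, performed))
    print("detection (per protocol):", detection_rate(lesions, performed))
    print("detection (intention to screen):", detection_rate(lesions, invited))
    print("PPV:", ppv(lesions, scoped))
    print("NNS:", number_needed_to_scope(lesions, scoped))
```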
Rabin Medical Center Institutional Review Board authorized the study in 2008. The patients were informed and asked whether they wanted to participate in the study but did not have to sign an informed consent. The analysis performed and the decision to publish the results were the responsibility of the authors.
A total of 3,822 of 16,359 patients were excluded (23.3%). Of the excluded patients, 2,367 underwent colonoscopy or sigmoidoscopy in the last 5 years, 1,694 participated in the on-going national G-FOBT screening program in the last 2 years, 180 patients had a diagnosis of CRC and 49 patients had a diagnosis of IBD.
A total of 12,537 patients were included in the study: 4,657 patients of Group A were offered FIT and 7,880 patients of Group B were offered G-FOBT. In Group A, the patients were younger (the difference in the proportion of patients younger than 60 years was 4.2%, p < 0.001) and the proportion of women was lower (the difference in the proportion was −2.8%, p = 0.002). The proportion of patients who were exempt from paying tax did not differ between the two groups (difference of −0.6%, p = 0.44) and the percentage of patients who belonged to a low SES primary care clinic was also similar (difference of 0.2%, p = 0.87) (Table 1).
Table 1. Characteristics of invited persons to participate in the study according to test type
The compliance to take the FOBT kits from the clinic (kit dispensing) was lower in Group A than in Group B (33.0% vs. 45.9%, p < 0.001). However, once kits were dispensed, the proportion of patients who performed the test was higher in Group A than in Group B (78.4% vs. 62.7%, p < 0.001). The overall compliance (test performed per invited population) was 25.9% and 28.8% in Group A and B, respectively (p < 0.001). A regression model showed that older age [odds ratio (OR) 1.025, p < 0.001] and female gender (OR 1.29, p < 0.001) were associated with test compliance. This held in both groups.
Table 2. Test performance of G-FOBT versus FIT (≥70 ng/ml, highest of three tubes)
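The overall compliance is the product of the two steps reported above: the share of invitees who took a kit, and the share of kit holders who performed the test. A minimal check of that arithmetic, using the rounded percentages from the text (the function name is illustrative):

```python
def overall_compliance(kit_dispensed_rate: float, performed_given_kit: float) -> float:
    """Tests performed per invited person = P(kit taken) * P(test done | kit taken)."""
    return kit_dispensed_rate * performed_given_kit


# Rounded rates as reported in the text.
fit_overall = overall_compliance(0.330, 0.784)    # Group A (FIT)
gfobt_overall = overall_compliance(0.459, 0.627)  # Group B (G-FOBT)

if __name__ == "__main__":
    print(f"FIT overall compliance:    {fit_overall:.1%}")
    print(f"G-FOBT overall compliance: {gfobt_overall:.1%}")
```

The products reproduce the reported overall rates of 25.9% and 28.8%, which shows that the lower FIT participation arises entirely at the kit-dispensing step.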
Test results and positivity rate (Table 2, Fig. 2)
The test was positive in 153 (12.7%) and 88 (3.9%) patients of Group A and B, respectively (p < 0.001). The range of FIT results was 0–3007 ng/ml. Because the distribution of FIT results is skewed, we used the median to describe the results and nonparametric tests for differences between the groups. Median FIT levels for normal or nonadvanced adenoma (n = 73), advanced adenoma (n = 29) and cancer (n = 6) were 170, 506 and 1828 ng/ml, respectively (p = 0.003 by Kruskal-Wallis test). Technical problems in test performance occurred in 13 patients of Group A. Seven patients prepared only two tubes and two patients prepared one tube. These patients were included in the intention to screen (ITS) but not in the per protocol (PP) analysis.
Colonoscopy followed positive FOBT in 108 (70.6%) and 63 (71.6%) patients of Group A and B, respectively (p = 0.86). The colonoscopy was complete in all cases (up to the cecum or an obstructing tumor). In Group A, colonoscopy detected six cancers and 29 AAPs. In Group B, colonoscopy detected eight cancers and 14 AAPs. Polyp and cancer characteristics are presented in Table 3.
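As an illustration of how the two colonoscopy follow-up rates can be compared, the sketch below applies a Pearson χ2 test to the 2×2 table built from the counts above (108 of 153 vs. 63 of 88). It uses no continuity correction and a hand-written formula, so it is an approximation of the kind of test described in the methods and need not match the authors' SPSS output exactly; the function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p). With 1 degree of freedom the upper-tail probability
    is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: Group A, Group B. Columns: colonoscopy done, not done.
chi2, p = chi_square_2x2(108, 153 - 108, 63, 88 - 63)

if __name__ == "__main__":
    print(f"chi2 = {chi2:.4f}, p = {p:.2f}")
```

The resulting p value is close to the reported p = 0.86, consistent with no difference in colonoscopy uptake between the arms.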
Table 3. Colonoscopy results with subdivisions according to histology and size of polyps (per patient analysis, according to most severe pathology)
Cancer registry follow-up
The transaction was performed 2 years after the last FOBT was performed. The registry detected five cancers missed by G-FOBT (had a negative G-FOBT). Two cancers were located in the sigmoid colon, one in the descending colon, one in the ascending colon and one in the cecum. All missed cancers occurred in persons older than 60 years and three were men. The registry did not detect missed cancer in patients with a negative FIT. One patient of each group, with a positive FOBT, had CRC that was not detected by referral colonoscopy (detected later by the registry). These cases were “colonoscopy noncompliant” patients but, for the purpose of FOBT performance analysis, they were regarded as “true positive.” The registry did not identify any case of gastric or esophageal cancer with positive FOBT.
Cancer detection rates and performance characteristics (Table 2, Fig. 3)
Detection rate was defined as the percentage of patients with a lesion detected among the population that performed FOBT (per protocol) or that was invited to participate in the study (intention to screen). Cancer occurred in six patients in Group A and in 13 patients in Group B; the calculated rates per 10,000 persons performing the test were 49 and 57 cases, respectively (p = 0.74). However, FOBT detected all cases of cancer in Group A but only 8/13 cancers in Group B (the false-negative cases were detected later by the INCR). Therefore, the cancer detection rate was 49/10,000 in Group A and 35/10,000 in Group B (p = 0.50). Sensitivity of FIT was better than that of G-FOBT (the same for three tubes or first tube, threshold 70 ng/ml) (p < 0.001). Raising the threshold to 200 ng/ml and testing only the first tube would have resulted in the same sensitivity for both methods. Univariate analysis for the detection of cancer revealed an increased detection rate among patients older than 60 years (74 cases/10,000 and 51 cases/10,000 in Group A and B, respectively, p < 0.001) and patients from low SES clinics (75 cases/10,000 and 13 cases/10,000 in Group A and B, respectively, p < 0.001). However, logistic regression analysis, adjusting for age, gender, and SES of the clinic, revealed that test type was not a significant determinant.
AAP was detected in 29 patients in Group A and 14 patients in Group B. The calculated detection rate among patients performing the tests was 241 cases/10,000 and 62 cases/10,000, respectively (p < 0.001). The detection rate with FIT remained higher when the threshold was raised to 200 ng/ml and only the first tube was used (p = 0.003). By univariate analysis, the AAP detection rate was significantly higher with FIT irrespective of age group, gender, clinic SES class and tax paying status. Adjusting for age, gender, SES and tax paying status revealed that FIT detected significant neoplasia (cancer and AAP) better than G-FOBT by both ITS and PP analysis [ITS: OR 2.69, 95% confidence interval (CI) 1.59–4.57, p = 0.001; PP: OR 3.16, 95% CI 1.8–5.4, p < 0.001].
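The relationship between cancer rate, sensitivity and detection rate in the two groups can be checked with the counts given in the text. A minimal sketch (the function name is illustrative; the per-10,000 cancer rates are the rounded values reported above):

```python
def sensitivity(detected_by_test: int, total_cancers: int) -> float:
    """Cancers with a positive test divided by all cancers found
    (by colonoscopy or later by the registry)."""
    return detected_by_test / total_cancers


fit_sensitivity = sensitivity(6, 6)     # Group A: no missed cancers
gfobt_sensitivity = sensitivity(8, 13)  # Group B: five cancers missed

# Detection rate per 10,000 tested = cancer rate per 10,000 * sensitivity.
fit_detection = 49 * fit_sensitivity
gfobt_detection = 57 * gfobt_sensitivity

if __name__ == "__main__":
    print(f"FIT:    sensitivity {fit_sensitivity:.1%}, detection {fit_detection:.0f}/10,000")
    print(f"G-FOBT: sensitivity {gfobt_sensitivity:.1%}, detection {gfobt_detection:.0f}/10,000")
```

This reproduces the reported sensitivities (100% and 61.5%) and detection rates (49 and 35 per 10,000).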
Following the recent recommendations to use FIT instead of G-FOBT,10 we looked at the feasibility of performing FIT in a large-scale urban population where G-FOBT had served as the usual CRC screening modality for the last 10 years. We found that the procedure was feasible and convenient, although the study results should be interpreted with caution: the logistic efforts made within a study cannot always be translated with the same success into daily practice.
Several studies compared the performance of FIT to G-FOBT in the average-risk population using different kits, randomization methods and design (see tables). Other studies compared FIT to G-FOBT in symptomatic patients referred to colonoscopy.11–17 In this population-based study, we compared the performance of FIT and G-FOBT in a large urban population. The strengths of our study are the use of the most sensitive representative of each method, Hemoccult SENSA and OC-MICRO at a low threshold with three samples; the two-arm design, with one test per arm (performing both tests in the same person does not allow a true comparison and biases toward the more compliant participants); and the inclusion of SES in the randomization process, which allowed us to investigate its possible effect on compliance. In addition, we had a follow-up of 2 years after the end of the study through the INCR.
The sensitivity of FIT for the detection of cancer was significantly higher. FIT detected 49 cases of cancer/10,000 persons who performed the test, 14 cases/10,000 more than were detected by G-FOBT (p < 0.001). The cancer registry follow-up detected five missed cancers in the G-FOBT group (equivalent to 22 additional cases/10,000) and no missed case in the FIT group.
The WHO defined the objective of mass screening for CRC as detecting 50 prevalent cases among 10,000 subjects older than 50 years.19 It seems that the FIT group has reached this goal. The sensitivity of FIT remains significantly higher than that of G-FOBT even when using 100 ng/ml, three tubes, as a threshold (p = 0.008). The increased sensitivity is inversely related to the number needed to scope to detect one cancer. Interestingly, using only one tube with a threshold of 70 ng/ml results in improved sensitivity but the same NNS as G-FOBT.
Data are accumulating about the improved cancer detection rate by FIT. van Rossum et al.18 performed a similar comparative study of FIT and G-FOBT. They did not follow the patients with “negative test” and used the less sensitive Hemoccult II test. They also reported an increased cancer detection rate using FIT with a cutoff of 100 ng/ml (three tubes). In their study, the positivity rate and the PPV were comparable between the two methods.
Guittet et al.20 investigated variation in sensitivity of FIT and G-FOBT in a sample of 20,322 subjects. The gain in sensitivity by using FIT increased from invasive cancers (ratio of sensitivities = 1.48) to high-risk adenomas (3.32) and was inversely related to the amount of bleeding. In a large retrospective Japanese study of nearly 22,000 asymptomatic, average-risk participants, the sensitivity and specificity of one day FIT for detecting invasive cancer was 65.8% and 94.6%, respectively, similar to our results, using only the first fecal sample and a 100 ng/ml hemoglobin threshold.21
We could not demonstrate a higher participation rate for FIT than for G-FOBT by intention to screen analysis, because more kits were dispensed in Group B, the G-FOBT arm. We believe that the reasons for this difference are the familiarity of the population with G-FOBT and the logistic efforts needed for keeping the FIT samples in the refrigerator and bringing them to the clinic in a cooling bag. The overall participation rate with FIT was roughly 3 percentage points lower than that with G-FOBT (25.9% vs. 28.8%). However, once the kit was dispensed, the compliance was nearly 16 percentage points higher in the FIT arm (78.4% vs. 62.7%).
The study was designed with a FIT cutoff of 70 ng/ml applied to the highest of the three samples. Patients with results ≥70 ng/ml were considered positive and referred to colonoscopy. Testing the first tube only at the threshold of 100 ng/ml, as sometimes practiced, would have resulted in poorer performance. Similar findings were reported by van Rossum et al.,22 as described in their second article, and Grazzini et al.23, 24
The main limitation of our study is the randomization by clinic site, with only nine clinics studied, and not by patients. We demonstrated a small advantage for G-FOBT in kit dispensing; however, once dispensed, the proportion of patients who performed the test was higher with FIT. Our data support using FIT at the population level and provide important information about the number of tests and the cutoff level for determining a positive result, but we could not demonstrate a higher compliance rate for FIT than for G-FOBT, as demonstrated by others.18, 20, 25–34
In summary, we demonstrated a successful screening project using FIT. This strategy is feasible in a large-scale urban population. Using FIT instead of G-FOBT resulted in an increased CRC and AAP detection rate, irrespective of age, gender and income. Further studies are needed to optimize FIT threshold, using three tubes, for prevention of cancer by the early detection of AAPs.
Eiken Japan provided the OC-MICRO™ instrument, reagents and partial financial support for administration. Author roles: Zohar Levi: Acquisition of data, analysis and interpretation of data, drafting of the manuscript, statistical analysis. Shlomo Birkenfeld: Acquisition of data, analysis and interpretation of data, drafting of the manuscript. Alex Vilkin: Acquisition of data, analysis and interpretation of data. Micha Bar-Chana: Cancer registry follow-up. Irena Lifshitz: Cancer registry follow-up. Miri Chared: Acquisition of data, coordination of the trial. Eran Maoz: Acquisition of data. Yaron Niv: Study concept and design, acquisition of data, analysis and interpretation of data, drafting of the manuscript, study supervision