A recent report in this journal1 meticulously demonstrated that 6 different qualitative fecal immunochemical tests (FITs), sometimes called immunochemical fecal occult blood tests (FOBTs or iFOBTs), had very different positivity rates, which likely reflected, to a large extent, their different analytical detection limits for fecal hemoglobin. Positivity rates varied considerably, from 6.4 to 46.8%, and overall agreement between the different FITs was only poor to moderate, with κ ranging from 0.14 to 0.61. Different positivity rates meant that the tests had different clinical sensitivities and specificities. This work built on an earlier report from these authors,2 which showed major differences in the analytical performance of various qualitative FITs. Similar findings have been reported by others.3
As FITs have many advantages over traditional guaiac-based FOBTs,4 it is hardly surprising that these tests are being used more and more in screening for colorectal cancer. Both quantitative FITs, which measure the actual fecal hemoglobin concentration, and qualitative FITs have been developed. The latter are immunochromatographic methods involving simple visual interpretation of results on test cassettes as positive or negative. A major advantage of qualitative FITs is that they are easy to apply, without the need for sophisticated and expensive laboratory equipment or highly qualified staff.
We have investigated effective approaches to population screening for colorectal cancer involving qualitative FITs using what we term the 2-tier reflex FOBT/FIT strategy.5, 6 Interestingly, in our previous studies on 2 qualitative FITs, the first using a traditional tube collection device5 and the second a novel card collection device,6 and in contrast to the recently published work,1–3 we found no significant difference in analytical detection limits between the 2 FITs. Identical results were obtained when 200 consecutive fecal specimens from participants, collected on the card of the hema-screen SPECIFIC (Immunostics, Ocean, NJ), were transferred to the specimen preparation tubes and analyzed with the Instant-View FIT (Alfa Scientific Designs, Poway, CA) test cassettes. We attributed the differences in detection of significant neoplasia not to differences between the qualitative FITs, but simply to a smaller amount of feces being collected on the card than with the tube collection devices.
As a result of these studies,5, 6 the Scottish Bowel Screening Programme (SBoSP), which began in 2007,7 after 3 very successful pilot screening rounds,8 uses a qualitative FIT as a second-line investigation. The FIT is offered to participants who have a “weak positive” guaiac FOBT on initial testing (that is, 1–4 windows positive out of the 6 possible) to cut down the number of false-positive FOBT results. It is mandatory that the SBoSP has a positivity rate that allows further investigation with colonoscopy, a rather scarce resource in this country. The qualitative FIT therefore has to be as constant as possible over time and geography, so that the eligible population resident in Scotland aged 50–74 years all participate in a program that has identical ongoing performance characteristics, with a positivity rate of about 2.0% overall. In part because different FITs might have varying analytical and clinical performance characteristics, we have purchased FITs from a single manufacturer since the start of the SBoSP (hema-screen SPECIFIC, manufactured by Immunostics, supplied by Alpha Labs, Eastleigh, Hants, UK), and this strategy should minimize a now well-documented source of variability.1–3
However, and most importantly, over the course of the SBoSP, we have ensured that the various lots of FIT immunochromatographic test cassettes from this single manufacturer give positivity rates and clinical outcomes that are constant from manufacturing lot to lot: this aspect was not investigated in the research work by Brenner et al.1 or by others,2, 3 but it is absolutely vital to an operational national screening program.
We have instituted a novel approach, which we term “acceptance quality checks” (AQCs). AQCs are defined as structured objective examinations of product, before delivery and use, to ensure that preset quality specifications are met. Structured assessments were developed, incorporating objective examination of critical kit components. These were agreed with the supplier and the manufacturer. Representative samples (at least 50) from candidate lots were supplied for the AQC. Immunochemical test cassettes from each candidate lot were compared with the lot in routine use in the SBoSP by simultaneous analysis of samples sent by participants in the SBoSP. Two very experienced staff performed the AQC with senior staff supervision and with a representative of the supplier present. The percentage of results in disagreement and κ were calculated. In 2009, AQCs were performed on 5 candidate lots of FIT cassettes. The percentages of results in disagreement were 4.0, 15.1, 3.8, 6.0 and 0.0%, and the respective κ were 0.92, 0.69, 0.91, 0.88 and 1.00. The second candidate lot clearly had a lower analytical detection limit, as judged by visual assessment of the colors for positive samples on the test cassettes, as well as a higher percentage disagreement and a lower κ. This lot was agreed by all to be unacceptable for use in the SBoSP. The other lots were deemed acceptable and subsequently used in the SBoSP.
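The two agreement statistics used in the AQC can be computed from a 2×2 table of paired results. The following is a minimal Python sketch, assuming that κ is Cohen's κ for two raters (here, the candidate lot and the lot in routine use). The function name, argument names, and counts are hypothetical, chosen only for illustration, and are not data from the SBoSP.

```python
import math


def agreement_stats(both_pos, candidate_only_pos, reference_only_pos, both_neg):
    """Return (percentage in disagreement, Cohen's kappa) for paired
    positive/negative results from a candidate lot and a reference lot.

    Hypothetical helper for illustration; not the SBoSP's own software.
    """
    n = both_pos + candidate_only_pos + reference_only_pos + both_neg
    if n == 0:
        raise ValueError("at least one paired result is required")

    # Discordant pairs are those where the two lots give different results.
    pct_disagree = 100.0 * (candidate_only_pos + reference_only_pos) / n

    # Observed agreement.
    p_o = (both_pos + both_neg) / n

    # Agreement expected by chance, from each lot's own positivity rate.
    cand_pos = (both_pos + candidate_only_pos) / n
    ref_pos = (both_pos + reference_only_pos) / n
    p_e = cand_pos * ref_pos + (1 - cand_pos) * (1 - ref_pos)

    # Kappa is undefined when chance agreement is already complete
    # (for example, every sample negative on both lots).
    kappa = math.nan if p_e == 1.0 else (p_o - p_e) / (1 - p_e)
    return pct_disagree, kappa


# Invented counts for 60 paired samples (illustration only).
pct, kappa = agreement_stats(
    both_pos=25, candidate_only_pos=3, reference_only_pos=1, both_neg=31
)
print(f"{pct:.1f}% in disagreement, kappa = {kappa:.2f}")
```

Keeping the two discordant counts separate also shows the direction of any disagreement: an excess of candidate-only positives would be consistent with a candidate lot that has a lower analytical detection limit.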
In summary, AQCs have helped to ensure that the analytical detection limit of FIT test cassettes is consistent over time, so that the positivity rate remains the same. AQCs are useful in that clearly different lots can be detected early in the procurement process and rejected before delivery. This has maintained productive relationships with both supplier and manufacturer. We commend their adoption by organizers of other screening programs that use qualitative FITs.
We plan further work on this topic. Objective numerical criteria for what level of agreement/disagreement is acceptable must be developed and agreed, noting that there appear to be no simple models in the existing literature which would assist in this process. For such criteria, approaches based upon either percentage in disagreement or κ are possible, each having advantages and disadvantages, although κ is favored from a statistical point of view, as chance is taken into account.
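The difference between the two candidate criteria can be illustrated numerically. In the Python sketch below (invented counts only, assuming Cohen's κ), two hypothetical comparisons have the same 4% of results in disagreement but different proportions of positive samples. κ differs between them because the agreement expected by chance differs.

```python
def cohen_kappa(both_pos, a_only_pos, b_only_pos, both_neg):
    """Cohen's kappa for a 2x2 table of paired results (illustrative helper)."""
    n = both_pos + a_only_pos + b_only_pos + both_neg
    p_o = (both_pos + both_neg) / n          # observed agreement
    a_pos = (both_pos + a_only_pos) / n      # positivity rate, lot A
    b_pos = (both_pos + b_only_pos) / n      # positivity rate, lot B
    p_e = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # chance agreement
    return (p_o - p_e) / (1 - p_e)


# Both tables hold 100 paired samples with 4 discordant results (4%).
# Balanced: about half of the samples are positive.
kappa_balanced = cohen_kappa(both_pos=48, a_only_pos=2, b_only_pos=2, both_neg=48)
# Sparse: few samples are positive, so chance agreement is high.
kappa_sparse = cohen_kappa(both_pos=4, a_only_pos=2, b_only_pos=2, both_neg=92)

print(f"balanced: kappa = {kappa_balanced:.2f}")
print(f"sparse:   kappa = {kappa_sparse:.2f}")
```

Under these assumptions, a fixed cut-off on percentage in disagreement would treat both comparisons alike, whereas a cut-off on κ would not. Any numerical criterion would therefore need to take account of the proportion of positive samples in the specimens used for the AQC.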