Artificial intelligence for cancer detection of the upper gastrointestinal tract

In recent years, artificial intelligence (AI) has proven useful to physicians in the field of image recognition thanks to three elements: deep learning (in particular, convolutional neural networks, CNNs), high-performance computers, and large amounts of digitized data. In the field of gastrointestinal endoscopy, Japanese endoscopists reported the world's first CNN-based AI systems for detecting gastric and esophageal cancers. This study reviews papers on CNN-based AI for gastrointestinal cancers and discusses the future of this technology in clinical practice. Employing AI-based endoscopes would enable early cancer detection. The superior diagnostic ability of AI may be particularly beneficial for early gastrointestinal cancers, for which the diagnostic ability and accuracy of endoscopists vary widely. AI coupled with the expertise of endoscopists would increase the accuracy of endoscopic diagnosis.


INTRODUCTION
SINCE THE 1970S, various methods have been used to analyze medical images. Among them, image analysis using deep learning has been widely adopted in many fields because it enables us to build systems that classify and recognize lesion images without writing complicated image-processing algorithms.1 With the development of deep learning algorithms, many evaluations have been performed for colonic polyps, covering detection, localization, and segmentation.2 Multiple reports have also appeared in the field of capsule endoscopy.3 In this review, we analyze papers showing the potential of AI, using breakthrough CNNs, to help detect gastrointestinal cancers. Given that published papers suggest AI can surpass the diagnostic ability of specialists even in gastrointestinal endoscopic diagnosis, we also consider the future introduction of these systems into clinical practice.
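As background, the core operation that the CNN-based systems reviewed here stack and learn is the two-dimensional convolution: a small kernel is slid over the image and a weighted sum is taken at each position. The following is a minimal illustrative sketch in plain Python, not code from any of the systems reviewed; the toy image and edge-detecting kernel are invented for illustration, and real systems learn the kernel weights and use optimized libraries.

```python
def conv2d(image, kernel):
    """Valid-mode cross-correlation of a 2D image with a 2D kernel,
    the basic building block of a convolutional layer."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            s = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    s += image[y + dy][x + dx] * kernel[dy][dx]
            row.append(s)
        out.append(row)
    return out

# A vertical-edge kernel applied to a toy "image" with a bright right half;
# the feature map responds strongly where the intensity changes.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = conv2d(image, edge_kernel)
print(feature_map)
```

A trained CNN composes many such learned kernels with nonlinearities and pooling, so that deeper layers respond to lesion-like patterns rather than simple edges.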

DETECTION OF GASTRIC CANCER
APPLICATION OF CNN-BASED AI to detect gastric cancer was first reported by Hirasawa et al.4 in January 2018 (Fig. 1). Early gastric cancer often arises on a background of gastric mucosal inflammation and is difficult for endoscopists to recognize. The false-negative rate of gastric cancer detection in upper gastrointestinal endoscopy has been estimated at 4.6-25.8%,5-10 and AI is a potential tool to address this issue. Hirasawa et al. prepared 13,584 high-resolution images, obtained with white light imaging (WLI), narrow band imaging (NBI), and chromoendoscopy with indigo carmine, of pathologically diagnosed gastric cancers validated by specialists as teaching images, and from these they developed an AI for gastric cancer detection. In a validation set of 2296 images containing 77 gastric cancer lesions, the system showed a sensitivity of 92.2%; however, the positive predictive value was only 30.6%, because gastritis was misdiagnosed as cancer and the flexion of the gastric angle was misinterpreted as gastric cancer. In the next step, Ishioka et al.11 applied the system to videos, and the results were similar to those with still images (sensitivity 94.1%).
Another paper examining the detection of gastric cancer by AI was published by Wu et al.12 in 2019. In a validation with 200 endoscopic images, the accuracy, sensitivity, and specificity of their AI system for gastric cancer were 92.5%, 94%, and 91%, respectively. Luo et al.13 developed a gastrointestinal AI diagnosis system (GRAIDS) for detecting upper and lower gastrointestinal cancers, including gastric and esophageal cancers. The diagnostic accuracy for gastrointestinal cancers ranged from 91.5% to 97.7% across the seven validation sets used in their multicenter study; this high accuracy at multiple facilities demonstrated the robustness of GRAIDS. The system achieved a diagnostic sensitivity similar to that of expert endoscopists (94.2% vs 94.5%) and superior to that of competent (85.8%) and trainee (72.2%) endoscopists. The paper also reports results for endoscopists assisted by AI: cancer detection sensitivity increased significantly, to 98.4% for experts, 97.8% for competent endoscopists, and 96.4% for trainees, suggesting the effectiveness of combining AI and endoscopists.
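The performance figures reported in these detection studies all derive from a confusion matrix. The sketch below uses hypothetical counts, chosen only to illustrate how a screening detector can pair a high sensitivity with a low positive predictive value when false positives are common, as in the validation sets discussed above; it is not data from any of the cited papers.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, PPV, accuracy) as fractions
    from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)            # true-positive rate
    specificity = tn / (tn + fp)            # true-negative rate
    ppv = tp / (tp + fp)                    # positive predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, ppv, accuracy

# Hypothetical screening set: 100 cancer images among 2000 total.
# Only 8 cancers are missed (high sensitivity), but 208 benign images
# are flagged, so most alarms are false (low PPV).
sens, spec, ppv, acc = detection_metrics(tp=92, fp=208, fn=8, tn=1692)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} "
      f"ppv={ppv:.1%} accuracy={acc:.1%}")
```

This asymmetry is acceptable in screening, where a missed cancer is far more costly than an extra image for the endoscopist to review.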

GASTRIC CANCER DIAGNOSIS
AFTER THE DETECTION of a suspicious lesion, the diagnosis of early gastric cancer by magnifying endoscopy with NBI is more accurate than by WLI.14 The usefulness of AI in differentiating cancerous from noncancerous areas on magnified NBI images has been reported by several investigators,15-17 with diagnostic accuracies for gastric cancer of 84-96%, suggesting the potential usefulness of AI. Horiuchi et al.18,19 showed that the accuracies for T1, T2, T3, and T4 staging were 77%, 49%, 51%, and 55%, respectively. However, one of the most important criteria for curative endoscopic resection is the depth of tumor invasion. Curative resection by endoscopic mucosal resection can frequently be achieved for intramucosal cancers (M) and cancers with submucosal invasion <500 μm (SM1), whereas surgery is required for gastric cancers with deeper invasion. Therefore, for AI to support treatment decisions in gastric cancer, a classification more detailed than T1/T2 staging is required. Zhu et al.20 reported that their system could differentiate M/SM1 from SM2 or deeper among all gastric cancers, including advanced gastric cancers. The sensitivity, specificity, and accuracy of their system were 76.5%, 95.6%, and 89.1%, respectively, significantly higher than those of skilled endoscopists. Nagao et al.21 constructed an AI that identifies the invasion depth of gastric cancer using 8271 WLI, 2701 NBI, and 2656 indigo carmine-stained images, including images taken from various angles extracted from videos. The AUC of the WLI AI system was 0.9590, and its lesion-based sensitivity, specificity, and accuracy were 84.4%, 99.4%, and 94.5%, respectively. The lesion-based accuracies of the NBI and indigo carmine AI systems were 94.3% and 95.5%, respectively. AI could thus be used to support decisions on endoscopic treatment.
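The AUC values quoted in these depth-diagnosis studies (e.g. 0.9590 for the WLI system) summarize how well a model's scores rank positive images above negative ones across all decision thresholds. A minimal, library-free sketch using the rank-based (Mann-Whitney) formulation follows; the scores are invented toy values, not data from the studies.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via the Mann-Whitney formulation:
    the probability that a randomly chosen positive outscores a
    randomly chosen negative, counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy model outputs: higher score = more likely deeply invasive cancer.
pos = [0.9, 0.8, 0.7, 0.6]   # scores on truly positive images
neg = [0.5, 0.4, 0.8, 0.2]   # scores on truly negative images
print(auc(pos, neg))
```

An AUC of 1.0 would mean the scores separate the two classes perfectly at some threshold; 0.5 is chance-level ranking.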

DIFFERENTIAL DIAGNOSIS OF GASTRIC CANCERS AND ULCERS
ALTHOUGH AI WAS effective in detecting cancerous lesions, its diagnostic accuracy was insufficient because it could not precisely differentiate malignant from benign lesions. To address this issue, Namikawa et al.22 developed an AI to diagnose gastric ulcer, a common non-cancerous gastric lesion that needs to be distinguished from gastric cancer. The AI was trained with 13,584 gastric cancer images and 4826 gastric ulcer images; its sensitivity, specificity, and positive predictive value for distinguishing gastric cancer from gastric ulcer were 99%, 93.3%, and 92.5%, respectively.

POSSIBILITY FOR DOUBLE CHECK OF STORED ENDOSCOPIC IMAGES
THE AI SYSTEM also has the advantage of detection time. Ikenoyama et al.23 reported that the times needed to detect 209 images of early gastric cancer among 2940 images were 45.5 s for an AI system and 173.0 min for endoscopists. Furthermore, the sensitivity of the AI was 26.5% higher than that of 67 endoscopists (58.4% vs 31.9%). Although the specificity of the AI was lower than that of the endoscopists, its high sensitivity and speed mean that AI could be used to double-check images after endoscopy, in order to prevent cancers from being overlooked.

DIAGNOSIS OF H. PYLORI INFECTION
AI has also been applied to the endoscopic diagnosis of Helicobacter pylori infection, with reported results superior to those of skilled endoscopists.26 Moreover, Shichijo et al. developed an AI system that discriminates among negative, current, and post-eradication H. pylori status. A total of 98,564 images from 742 H. pylori-positive, 3649 H. pylori-negative, and 845 post-eradication cases were used as teaching images, and the system was verified on 847 independent cases with 23,699 images. Its accuracy was 80% for negative diagnosis, 84% for post-eradication diagnosis, and 48% for positive diagnosis.27 Since the presence or absence of H. pylori is clinically important for evaluating gastric cancer risk, such an AI would provide valuable additional information to the endoscopist in clinical use.
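The post-examination double-check described in this section could be organized in software roughly as follows: score every stored image, then queue images above a deliberately low threshold for an endoscopist's second look. This is only a sketch; the scoring dictionary, filenames, scores, and threshold are all hypothetical placeholders, not part of any published system.

```python
REVIEW_THRESHOLD = 0.5  # low threshold: favor sensitivity over specificity

def flag_for_review(image_scores, threshold=REVIEW_THRESHOLD):
    """Return (filename, score) pairs whose cancer score meets the
    threshold, highest-risk first, for an endoscopist's second look."""
    flagged = [(name, s) for name, s in image_scores.items() if s >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Hypothetical per-image scores for one examination's stored images,
# as a trained detector might produce them.
scores = {
    "img_0001.jpg": 0.03,
    "img_0002.jpg": 0.91,   # likely lesion: must not be missed
    "img_0003.jpg": 0.55,   # borderline: cheap to re-check
    "img_0004.jpg": 0.12,
}
for name, s in flag_for_review(scores):
    print(f"{name}: score {s:.2f} -> endoscopist review")
```

Because the human reviews only the flagged subset, the AI's low specificity costs a few extra image views rather than false treatment decisions.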

CLASSIFICATION OF ANATOMIC LOCATION OF THE STOMACH
THE AI CAN also recognize the anatomical parts of the stomach. Takiyama et al.28 developed an AI system that classifies the anatomical location shown in endoscopic images. This line of work led to a subsequent study30 in which the blind spot rate of esophagogastroduodenoscopy with and without AI was compared; the additional benefit was greatest when the AI system was combined with conventional esophagogastroduodenoscopy.

PERSPECTIVES OF AI IN STOMACH DISEASE
AS DESCRIBED ABOVE, AI-based systems for the stomach can detect gastric cancer, diagnose H. pylori infection, and classify anatomical location, and they can be applied to image-enhanced endoscopy as well as WLI. The goals of AI systems can be geared toward standardization of diagnosis and safety. AI can assist endoscopists at any level in detecting early cancer without missed diagnoses. AI may also enable optical biopsy, which could reduce unnecessary biopsies or endoscopic resections. It is likewise expected to reduce physicians' burden in gastric cancer screening: an AI-assisted checkup would spare physicians from checking tens of thousands of endoscopic images, bringing immeasurable benefit. However, more evidence is needed before AI systems can be applied clinically. Most studies were from Japanese or Chinese institutes and were retrospective, using still images and, in some cases, video images; only a few were conducted as prospective clinical trials.30,31 The use of AI systems in the stomach requires more clinical trials with video images to confirm that these systems can be used in clinical practice.

DETECTION OF PHARYNGEAL CANCER
WHEN DETECTED AT an advanced stage, pharyngeal cancers require surgical resection and chemoradiotherapy, which decrease patient quality of life. In contrast, superficial pharyngeal cancers (SPC) detected early can be treated by per-oral local resection such as endoscopic submucosal dissection (ESD) or endoscopic mucosal resection (EMR).32-34 However, expertise in detecting pharyngeal cancer is not yet widespread worldwide, and an efficient detection system for SPC is needed. Tamashiro et al.35 reported an AI system to detect SPC, trained with 5403 images of pharyngeal cancers. The system detected all SPCs in validation datasets that largely consisted of SPCs, including small lesions. This AI system would help endoscopists avoid missing SPCs.

DETECTION OF ESOPHAGEAL SQUAMOUS CELL CARCINOMA
ALTHOUGH THE OVERALL prognosis of patients with advanced esophageal squamous cell carcinoma (ESCC) is poor, it can be favorable if the cancer is detected early.36-41 However, early detection of ESCC is difficult using conventional endoscopy, particularly with WLI.42,43 Image-enhanced endoscopy, such as NBI and blue laser imaging, is more useful for detecting superficial ESCC.44-50 This nonetheless poses a problem for less experienced endoscopists, who detect ESCC with a low sensitivity of 53%.51 Horie et al.52 first reported a CNN-based AI system for detecting esophageal cancers, including ESCC and esophageal adenocarcinoma (EAC) (Fig. 2). The system accurately detected the cancers in 98% of cases and analyzed 1118 images in 27 s. Notably, it detected all lesions smaller than 10 mm, and it distinguished superficial from advanced cancers with 98% accuracy. Ohmori et al.53 reported a CNN-based AI system to detect and differentiate ESCC using images from both non-magnified endoscopy (non-ME) and magnified endoscopy (ME). Their system showed high sensitivity for detecting ESCC by non-ME and achieved high specificity using ME, thereby decreasing false positives; there was no significant difference in diagnostic performance between the AI and experienced endoscopists. Cai et al.54 also reported an AI system for detecting ESCC, which was superior to both inexperienced and experienced endoscopists on endoscopic images. Interestingly, the sensitivity of endoscopists in detecting ESCC improved when the images displayed a rectangular frame drawn by the AI to indicate ESCC. Guo et al.55 used still images for training and video images for the validation set; the sensitivity and specificity of detecting ESCC in the video datasets were 100% for both non-ME and ME NBI. Tokai et al.56 reported a CNN-based AI that detects ESCC and subsequently diagnoses its invasion depth, distinguishing EP-SM1 from SM2 or deeper in superficial ESCC using non-magnifying WLI and NBI images (Fig. 3). The AI detected 95.5% of the ESCCs, and in diagnosing invasion depth its performance exceeded the best AUC value obtained among 13 board-certified endoscopists. This study showed the possibility of clinical application not only for detecting ESCC but also for diagnosing its invasion depth.

DETECTION OF ESOPHAGEAL ADENOCARCINOMA
BARRETT'S ESOPHAGUS (BE) is a known risk factor for the development of esophageal adenocarcinoma (EAC). The prognosis of EAC is strongly related to its stage at the time of diagnosis,57 and endoscopic surveillance of BE is recommended.58-60 However, detecting high-grade dysplasia and early EAC remains difficult for the non-expert or general endoscopists who perform surveillance.61 Ebigbo et al.62 developed an AI system to detect early EAC, trained with 148 WLI and 100 NBI images, which achieved a high sensitivity of 92% and specificity of 100%, outperforming 11 of the 13 endoscopists. Based on this work, they developed a real-time AI system designed to classify neoplasia in magnified images.63 This system differentiated normal Barrett's esophagus from EAC with a sensitivity of 83.7% and a specificity of 100%. Hashimoto et al.64 reported a system that achieved not only accurate detection of early esophageal neoplasia in BE images but also accurate localization, detecting early neoplasia with a mean average precision of 0.7533.
Groof et al.65 worked on developing an AI system for detecting Barrett's neoplasia; the overlap between the neoplastic areas delineated by experts and by the AI system was also evaluated. The system achieved a per-image sensitivity of 95% and a specificity of 85%, and it successfully localized the lesions. They subsequently developed an AI system using five well-defined, independent endoscopy datasets for training and validation.66 This system achieved higher accuracy than 53 non-expert endoscopists and identified the optimal biopsy site for detected neoplasia in over 90% of cases. They then performed in vivo studies of Barrett's neoplasia detection:67 WLIs were obtained at every 2 cm of the Barrett's segment and immediately analyzed by the AI system, providing instant feedback to the endoscopist; the system detected neoplasia with high accuracy.

PERSPECTIVES OF AI IN PHARYNGEAL AND ESOPHAGEAL CANCER
ALTHOUGH THE AFOREMENTIONED reports have demonstrated highly accurate AI systems using not only still images but also videos, AI systems require further improvement to detect early ESCC in the clinical setting, particularly in screening endoscopies, in which the endoscope is passed quickly through the esophagus. Anatomically, it is highly challenging, particularly for non-experts, to observe the esophageal mucosa near the esophagogastric junction and in the cervical esophagus. In addition, many reports exclude from their validation videos poor-quality images caused by halation, blurring, defocus, mucus, and poor air insufflation. We therefore think that AI systems should be validated using videos recorded at the high speeds of actual clinical practice rather than at slower speeds. It would also be useful if AI could predict cancer risk from the normal background esophagus on WLI or NBI, given that multiple unstained areas after iodine staining indicate a risk of ESCC; such a prediction would help detect cancers in daily endoscopic practice. With these achievements, AI systems would effectively assist endoscopists in detecting early esophageal and pharyngeal cancers, thereby improving survival rates.

PERSPECTIVES OF AI FOR DETECTION OF CANCERS OF THE UPPER GI TRACT
CONTEMPORARY AI IS specialized only in analyzing image information; at present, no AI can perform comprehensive diagnosis by combining image information with medical history and laboratory data. Such an AI may become possible a decade or more from now; for the time being, however, AI does not make a diagnosis alone but serves as an excellent assistant for doctors.
As this review has shown, AI is reportedly highly sensitive for early cancer, and the combination of AI and endoscopy may increase the chances of early cancer detection. However, more evidence from prospective studies of AI in endoscopy is needed to strengthen these findings.
For the time being, AI is trained only on high-quality still images, and even when the same accuracy is obtained in video-based verification, the specificity is often lower than that of human doctors. AI that produces truly effective results in real-time video in clinical settings remains to be developed; until then, endoscopists will have to compensate for the low specificity of AI.
Furthermore, AI for endoscopic diagnosis support is treated as a medical device in many countries and cannot be used clinically without regulatory approval in each country.
However, these problems are likely to be resolved in the coming years. The introduction of AI into endoscopic clinical practice will also shorten the training time of endoscopists, reduce the burden on training instructors, and compensate for age-related vision problems in experienced doctors. By understanding the characteristics of endoscopic AI and conducting examinations together with it, endoscopists will provide more accurate diagnoses and treatments.
The launch of multiple AI products validated on different materials makes it difficult to determine which AI is superior; if a standard verification method were established for each disease, AIs could be compared easily. In the near future, endoscopic AI will inevitably become commonplace if sufficient evidence supports its recommendation in guidelines or its coverage by insurance. For colorectal polyps, endoscopic AI has been suggested to improve the adenoma detection rate, and the European Society of Gastrointestinal Endoscopy (ESGE) has published a weak recommendation for incorporating AI into colonoscopy.68 The cost-effectiveness of AI use in colonoscopy has also been suggested.69 Since endoscopic AI performs real-time video diagnosis, its accuracy may be improved by algorithms that comprehensively judge the preceding and following frames rather than diagnosing each image in isolation. In addition, endoscope image quality will improve to 4K/8K in the future, and 3D mapping of organs by the endoscope may become possible together with other sensing technologies. The potential for development in this area is endless.
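The idea of comprehensively judging the preceding and following frames, rather than each image in isolation, can be sketched as simple temporal smoothing of per-frame scores, so that a single spurious frame neither triggers nor suppresses an alert. This is an illustrative sketch only; the window size and scores below are hypothetical, and published systems may use quite different temporal logic.

```python
def smooth_scores(frame_scores, radius=1):
    """Centered moving average over +/- `radius` neighboring frames,
    shrinking the window at the clip boundaries."""
    smoothed = []
    for i in range(len(frame_scores)):
        lo = max(0, i - radius)
        hi = min(len(frame_scores), i + radius + 1)
        window = frame_scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

# One isolated spike (frame 2) vs. a sustained detection (frames 5-7):
# smoothing suppresses the lone spike but keeps the sustained run high.
raw = [0.1, 0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.8, 0.1]
print([round(s, 2) for s in smooth_scores(raw)])
```

A per-frame threshold applied after smoothing would then alert only on the sustained run, which is the behavior wanted during real-time video diagnosis.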

CONCLUSION
THIS REVIEW OUTLINES the current status of research and development, and the prospects, of AI for gastrointestinal cancer detection. Employing AI-based endoscopes would enable early cancer detection and consequently improve prognosis. The use of AI technology would be beneficial in terms of diagnostic capability, which varies widely among endoscopists for early gastric cancer; AI diagnosis of early esophageal cancer is expected to have the same effect.
Although no AI tools for cancer detection or characterization have yet been validated in a prospective trial, AI has the potential to contribute to the evolution of endoscopic medicine and to support doctors as an excellent assistant. However, AI does not make a definitive diagnosis and cannot perform endoscopy itself, so the need for doctors remains unchanged. From now on, endoscopists will need the skills to understand and utilize AI.
ACKNOWLEDGMENTS
WE WOULD LIKE to thank Editage (www.editage.com) for English language editing.

CONFLICT OF INTEREST
TADA T IS a shareholder of AI Medical Service Inc.
Other authors have no COI to disclose.

FUNDING INFORMATION
THIS PAPER WAS not funded.