The CNN model aided the study of the clinical value hidden in the implant images

Abstract Purpose This article aims to construct a new method to evaluate radiographic image identification results based on artificial intelligence, which can complement the limited vision of researchers when studying the effect of various factors on clinical implantation outcomes. Methods We constructed a convolutional neural network (CNN) model using clinical implant radiographic images. Moreover, we used gradient-weighted class activation mapping (Grad-CAM) to obtain thermal maps to present identification differences before performing statistical analyses. Subsequently, to verify whether the differences presented by the Grad-CAM algorithm would be of value to clinical practice, we measured the bone thickness around the identified sites. Finally, we analyzed the influence of the implant type on the implantation according to the measurement results. Results (1) The thermal maps showed that the sites with significant differences between Straumann BL and Bicon implants as identified by the CNN model were mainly the thread and neck area. (2) The heights of the mesial, distal, buccal, and lingual bone of the Bicon implant post-op were greater than those of the Straumann BL implant (P < 0.05). (3) Between the first and second stages of surgery, the amount of bone thickness variation at the buccal and lingual sides of the Bicon implant platform was greater than that of the Straumann BL implant (P < 0.05). Conclusion According to the results of this study, we found that the identified neck area of the Bicon implant was placed deeper than that of the Straumann BL implant, and there was more bone resorption on the buccal and lingual sides at the Bicon implant platform between the first and second stages of surgery. In summary, this study proves that using the CNN classification model can identify differences that complement our limited vision.

concern in implantology.3 Those factors are established and scrutinized by health providers based on clinical information in various forms and on their medical knowledge and expertise. However, the medical knowledge and expertise currently established in the field may limit the vision of health providers during the procedure. Hence, non-human-assisted identification of clinical factors, unbiased by this limited vision, would be an excellent supplement to the wisdom of clinical practitioners.
Artificial intelligence (AI) in various forms is pushing the frontiers of medicine. The idea of deep learning is also being applied to various fields of medicine.4,5 Specifically, convolutional neural networks (CNNs) are being widely applied. As one of the core models of artificial neural networks and deep learning, the CNN is a special deep feedforward network consisting mainly of input, convolutional, pooling, fully connected, and output layers, which provides computer vision capabilities, including medical image classification.6,9-11 However, compared with the well-established classification task, little attention has been given to the training process of CNN models. As a non-human aid to identification, the training process of a neural network model comprises the accumulation of differences among source images.12 The training algorithm dictates that the training process naturally identifies and accumulates various factors related to the differences among the images. In particular, neural network models can extract image differences that humans may never notice.13 These differences can be used as entry points for researchers to study the impact of various factors on clinical outcomes. Notably, the training process of unsupervised networks proceeds as a black box and is not disturbed by subjective human factors.14 In other words, the CNN training process is based only on feature-extraction learning from a large amount of medical data, without the need for people to label and segment images or classify features before training, which is an excellent way to avoid model classification errors due to subjective factors or limitations in the expertise of researchers or clinicians.8,15,16 Therefore, the CNN has the potential to provide an additional complement for studying the impact of various clinical factors on implantation outcomes without the bias of limited vision, broadening the horizons of researchers' knowledge in the field of implantology.
Based on the broad application of the classification CNN model, we propose a new and essential application concept of neural networks for evaluating radiographic image recognition results. In other words, we used the CNN model and gradient-weighted class activation mapping (Grad-CAM) to present the differences in implant image identification, which can supplement the limited field of vision of experts and clinicians when studying the influence of various factors on implant images and clinical implantation results. Further reasonable inferences, assumptions, and verifications are made for the identification differences, to help us extract meaningful guidance for clinical implantation work. It is well known that a large amount of imaging data is generated in the clinical work of oral implantation. At present, image data are more commonly used in diagnosing and treating a single patient's disease, and we think that they have not been fully exploited. Therefore, we propose a new method using the CNN model that can simultaneously analyze a large number of clinical implant images and present the identification differences between them, enabling further application of implant images. After deep learning based on large amounts of data, such differences presented at the overall level can provide reliable direction and inspiration for clinicians, so that they can further study the various factors affecting the clinical implant restoration effect. The proposed method would reveal novel perspectives on how AI can help clinicians and other medical experts.

2.1.1 Data collection

Image processing
Usually, when using a CNN model for learning, the radiographic images used as the dataset should focus on each implant. Therefore, we clipped the periapical radiographic images before training to ensure that there was only one target implant in each image. The radiographic images were all taken at the proximal and distal midplanes of the implant. Subsequently, all manually cropped images were processed with a computer picture-cropping program so that the size of each image was 250 × 250 pixels,15 which ensures that the input images allow the CNN to learn and yield accurate results.
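As an illustration, the standardization step above could be sketched with the Pillow library; the grayscale conversion and the choice of resampling filter are our assumptions, since the paper does not name the cropping program it used:

```python
from PIL import Image

def standardize(img: Image.Image, size: int = 250) -> Image.Image:
    """Convert a cropped periapical radiograph to grayscale and
    rescale it to a fixed size x size input for the CNN."""
    return img.convert("L").resize((size, size), Image.LANCZOS)
```

A manually cropped radiograph of any size then yields a uniform 250 × 250 grayscale input, e.g. `standardize(Image.open("implant_01.png"))`.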

Constructing a classification CNN model based on VGG-16 and transfer learning for deep learning
We constructed our CNN model based on VGG-16 for deep learning. VGG-16 is a CNN model developed by the Visual Geometry Group of Oxford University and Google DeepMind18 and is used mainly for image identification and classification.19 We adopted the convolutional layer structure of VGG-16 as the convolutional layers of our CNN model. Furthermore, our model added a dropout layer after the fully connected layer to reduce overfitting20 and a flatten layer to realize the transition from the convolutional layers to the fully connected layers.7,21 The structure of the CNN model is illustrated in Figure 1a, and the specific parameters are shown in Table 1. The deep-learning process of the CNN in this study also adopted the RMSprop optimization algorithm. The RMSprop optimizer accelerates gradient descent in a way that suits CNNs, which can further improve the accuracy and recall of the model for implant radiographic image classification.22 The training sets required by CNNs are enormous; more than one million images are required to complete training on ImageNet. However, collecting radiographic images of oral implants in such numbers is difficult. Therefore, to ensure the effectiveness and reliability of deep learning, we adopted transfer learning23 and used only the implant datasets to train the fully connected layers. In contrast, the convolutional and pooling layers before the fully connected layers were constructed by directly loading the pre-trained VGG-16 model to form a complete CNN model for image classification.9,24 The transfer learning process used to construct a CNN model to recognize implant radiographic images is shown in Figure 1b.

TABLE 2 Tooth positions of the included implants.
Implant type    Premolar    Molar
Bicon           8           51
Straumann BL    7           35
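A minimal sketch of this transfer-learning setup in Keras might look as follows; the head sizes, dropout rate, and learning rate shown here are illustrative assumptions, not the parameters listed in Table 1:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(pretrained: bool = True) -> tf.keras.Model:
    # Pre-trained VGG-16 convolutional base, frozen so that only the
    # newly added fully connected head is trained (transfer learning).
    base = tf.keras.applications.VGG16(
        weights="imagenet" if pretrained else None,
        include_top=False,
        input_shape=(250, 250, 3),
    )
    base.trainable = False
    model = models.Sequential([
        base,
        layers.Flatten(),                       # conv feature maps -> dense layers
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),                    # reduce overfitting
        layers.Dense(2, activation="softmax"),  # Bicon vs. Straumann BL
    ])
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-5),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model()` downloads the ImageNet weights on first use; subsequent training then updates only the fully connected head, as described above.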

Presenting thermal maps through Grad-CAM
Grad-CAM is a CNN visualization method used to show differences between image categories by generating thermal maps.23 In this study, Grad-CAM was used to visually present the identified sites of the implant, which helped to improve the understanding of the CNN training process.26,27 Figure 1b shows the process of visualizing the recognition results of implant radiographic images using Grad-CAM.
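For reference, the core of Grad-CAM can be sketched as follows for a functional Keras model; this is an assumption about the implementation rather than the exact code used in the study, and the layer name passed in is whatever the trained model's last convolutional layer is called:

```python
import numpy as np
import tensorflow as tf

def grad_cam(model: tf.keras.Model, image: np.ndarray,
             last_conv_layer: str, class_index: int) -> np.ndarray:
    # Map the input to the last conv-layer activations and the predictions.
    grad_model = tf.keras.models.Model(
        model.inputs,
        [model.get_layer(last_conv_layer).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)   # d(score)/d(feature map)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # global-average-pooled gradients
    cam = tf.reduce_sum(weights[:, None, None, :] * conv_out, axis=-1)
    cam = tf.nn.relu(cam)[0]                       # keep positive influence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalize to [0, 1]
```

The returned map is upsampled to the input size and overlaid on the radiograph to produce a thermal map highlighting the regions the classifier relied on.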
The Grad-CAM thermal maps were obtained by feeding the 207 periapical radiographic images from the test set into the CNN model, including 55 Bicon implants and 152 Straumann BL implants. We subsequently divided the implants into four sites: apex, body (thread), platform transfer department, and repair parts (including the abutment and screws),28 and counted the number of identified sites for the two types of implants from the Grad-CAM thermal maps. The results were averaged over three counts by a researcher. The percentage of identified sites was obtained by dividing the number at each site by the total number of implants. Statistical analysis was performed using SPSS version 25.0 (IBM Corporation, Armonk, NY, USA), and the chi-squared test was applied to the counts of the identified sites of the implants.
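The chi-squared test on the site counts can be reproduced with SciPy; the counts below are hypothetical placeholders, not the study's data:

```python
from scipy.stats import chi2_contingency

# Hypothetical counts of Grad-CAM-identified sites
# (rows: Bicon, Straumann BL; columns: apex, thread, platform, repair parts).
observed = [
    [5, 10, 41, 36],
    [60, 95, 20, 17],
]
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3g}")
```

For a 2 × 4 table of this kind, `chi2_contingency` compares the observed counts with the counts expected under independence of implant type and identified site; P < 0.05 rejects that independence.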

2.4 Measuring bone thickness around the implant

Measurement object
We randomly selected the CBCT images of 77 patients taken after the first and second stages of surgery from the collected datasets as measurement objects: 77 patients with 59 Bicon implants and 42 Straumann BL implants. We included implants located in the premolar and molar positions.29 The tooth positions of the included implants are shown in Table 2, and there was no significant difference between the implants (P > 0.05).

Measurement method
We used the distance-measurement tool in the Onevolume Viewer software to measure each position.

Measurement position
The measurement positions were as follows: bone height on the buccal (H1), lingual (H2), mesial (H3), and distal (H4) sides above the implant platform; bone thickness on the buccal and lingual aspects at the level of the implant platform (0 mm); bone thickness on the buccal side at 1, 2, 3, and 4 mm below the implant platform; and bone thickness on the lingual side at 1 and 2 mm below the implant platform (Figure 3a).30

Statistical analysis
SPSS version 25.0 was used for the statistical analyses.
Results are expressed as mean ± standard deviation (SD). After tests for normality and homogeneity of variance, an independent-samples (completely randomized design) t-test was used if the conditions were met; otherwise, the rank sum test was used. P < 0.05 was considered statistically significant.
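The test-selection logic described above can be sketched with SciPy; here the rank sum test is implemented as the Mann-Whitney U test, and the Shapiro-Wilk and Levene tests stand in for the normality and homogeneity checks, whose exact variants the paper does not specify:

```python
from scipy import stats

def compare_groups(a, b, alpha: float = 0.05):
    """Independent-samples t-test when both groups look normal with
    homogeneous variances; otherwise the rank sum (Mann-Whitney U) test."""
    normal = (stats.shapiro(a).pvalue > alpha
              and stats.shapiro(b).pvalue > alpha)
    equal_var = stats.levene(a, b).pvalue > alpha
    if normal and equal_var:
        return "t-test", stats.ttest_ind(a, b).pvalue
    return "rank-sum", stats.mannwhitneyu(a, b).pvalue
```

Comparing, say, the mesial bone heights of the two implant groups would then be a single call such as `compare_groups(bicon_heights, straumann_heights)` (variable names hypothetical).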

Differences between the radiographic images of two types of implant identified by the classification CNN model
The classification performance of the CNN model after transfer learning on the two types of implants is shown in Figure 2a,b. According to the output data, the loss of this model on the 207 test images was only 0.37%, and the classification accuracy reached 100.00% on the test dataset. The classification accuracy of the CNN for implant radiographic images was thus fully confirmed. These results ensure that both the identified differences presented by Grad-CAM and the analyses based on these differences are reliable.
Part of the deep-learning results of the two implants activated by Grad-CAM is shown in Figure 2c. After classifying the results obtained from the 207 thermal maps, we found significant differences in how the CNN model recognized and accumulated differences in the implant radiographic images, and the chi-square test showed that the difference was statistically significant (P < 0.05). As shown in Figure 2d, the sites identified by the CNN for the Bicon implants were mainly in the neck area, that is, the platform transfer department and the repair parts, including the abutment and screws. The identified sites of the Straumann BL implants were mainly the thread and apex.

Analysis of the reasons behind the differences between radiographic images identified by CNN
The statistical results showed that the CNN had different identification rates for the two implants. Our study showed that the CNN's identification rate for the platform transfer department of the Bicon implant was 74.5%, whereas that for the repair parts was 65.5%. Moreover, the identification rates of those two sites in the Straumann BL system were only 36.4% and 10.9%, respectively. This difference indicated that, after training, the CNN model identified a significant difference in the neck area between the two implants and placed a high weight on that site during the training process.18 We propose two hypotheses to explain the differences in the identification of the implants' neck area. First, the implants themselves are designed differently. Compared with the Straumann BL system, the Bicon has a sloping neck design at the platform transfer, resulting in significant narrowing,31 which may make this area an image feature extracted by deep learning. Second, we noticed that the areas covered by the thermal map of the implant contained not only the implant itself but also a portion of the surrounding bone. Therefore, bone attachment around the implant may also affect the extraction of implant radiographic image features by the CNN.
However, the above two hypotheses were only proposed based on thermal maps and two-dimensional measurement results, and there are still many limitations and uncertainties.The root cause of the identification differences may need further study in light of the principles and procedures of the CNN model.
We used the Onevolume Viewer to measure 12 positions on 101 implants in 77 randomly selected patients. As shown in Figure 3b,c, the mesial and distal bone heights of the Bicon implants were 2.18 ± 1.17 mm and 1.80 ± 1.09 mm, whereas those of the Straumann BL implants were 1.09 ± 0.75 mm and 0.71 ± 0.77 mm. The mesial and distal bone heights of the two implants were significantly different (P < 0.05). Since the two measurement sites for the mesial and distal bone heights are located in the implant neck area, there are differences in bone attachment between the necks of the two implants, which may affect the extraction of image features by neural networks, thus leading to differences in the identified sites.
FIGURE 3 Statistical analysis results of peri-implant bone measurement. Part a is the measurement diagram; parts b-e used the t-test; and parts f and g used the rank sum test.

Differences identified by the CNN assist in studying the effect of implant type on clinical implantation outcomes
The observation of the differences identified by the CNN led us to focus on the differences in bone attachment around the neck. Subsequently, we measured and analyzed them according to clinical experience.32,33 The measurements showed that the mean height of the mesial, distal, buccal, and lingual bone of the Bicon implant was 1.76 ± 1.21 mm, whereas the mean bone height of the Straumann BL implant at these four positions was 0.70 ± 0.73 mm. The statistical analysis of the measurements is shown in Figure 3b-e. The heights of the mesial, distal, buccal, and lingual bone of the Bicon implant post-op were greater than those of the Straumann BL implant (P < 0.05), indicating that the neck of the Bicon implant was placed approximately 1 mm deeper than that of the Straumann BL implant. The amounts of bone thickness variation at the buccal and lingual sides of the implant platform were 0.57 ± 0.89 mm and 0.74 ± 1.03 mm for the Bicon implants and 0.28 ± 0.61 mm and 0.35 ± 0.72 mm for the Straumann BL implants, respectively. The statistical analysis of the measurements is shown in Figure 3f,g. The amount of bone thickness variation at the buccal and lingual sides of the Bicon implant platform was greater than that of the Straumann BL implant (P < 0.05), indicating more bone resorption in the neck of the Bicon implants between the first and second stages of surgery.
The findings mentioned above are derived from the CNN training process, which presents differences in implant neck identification, rather than from professionals' direct conjecture about how implant type affects implant depth and bone resorption. Clinical studies have shown that implant placement depth may affect both hard and soft tissues around the implant,34,35 resulting in different peri-implant bone remodeling.36 Bone resorption has been used as an indicator to identify biological complications such as peri-implantitis and, subsequently, to determine the success rate of the implant.37,38 Therefore, these results allowed evaluation of the likelihood of complications and the long-term survival rates of Bicon and Straumann BL implants. In summary, we used the differences identified during the training process of the CNN classification model to extract the influence of implant type on implantation outcomes and the operation process.
It is feasible and convenient to use the CNN to investigate the influence of various factors on clinical outcomes. Previous studies tended to use AI to classify implant images,9,39,40 with much attention being paid to the specific results presented by the AI. However, our study focused on the training process, that is, on the differences identified and accumulated by the CNN during training and the reasons behind their formation. In this study, the CNN and Grad-CAM were used to help identify and present differences in the implant radiographic images caused by pre-treatment, surgical, and restorative factors, especially those that cannot be detected directly by researchers. This is a further application of the large number of implant images generated in clinical work, showing features that are not easy to find directly and helping to uncover the potential value in implant images. It is also a significant complement to the potentially limited vision of researchers.41 In addition to our clinical knowledge, we can evaluate radiographic image identification differences and use them to guide clinical diagnosis and treatment in oral implantology. For example, the same method can be used to present image differences due to factors such as implant placement angle and occlusal design, to assist in investigating their impact on clinical implantation outcomes.

CONCLUSION
According to the differences between the Bicon and Straumann BL implant radiographic images identified and presented by the training process of the CNN classification model, we found that the identified neck area of the Bicon implant was placed deeper than that of the Straumann BL implant, and there was more bone resorption on the buccal and lingual sides at the Bicon implant platform between the first and second stages of surgery. This study proved that the CNN classification model can identify image differences caused by implant type, which assisted us in studying the influence of implant type on implantation outcomes. In summary, we constructed a new method to evaluate radiographic image identification results based on artificial intelligence. The central idea of this method is to identify image differences, which can complement the limited vision of researchers. According to these identification differences, researchers can further explore the clinical value contained in implant images, so that large amounts of clinical implant image data can be further applied.
The logic behind our design can be extended to various fields with a high demand for medical expertise.
In the future, studies that include other factors, and especially larger amounts of data, are needed to implement the extracted clinical guidance and to validate the new method for researching the influence of more clinical factors.

FIGURE 1 Overview of the general scheme for presenting implant image differences based on CNN models and Grad-CAM. Part a describes the detailed structure of the CNN model based on VGG-16, and part b represents the overall process of transfer learning and Grad-CAM.

FIGURE 2 Results of the implant image dataset recognized by the classification CNN model. Part a shows the training loss function, and part b describes the classification accuracy. Part c shows part of the thermal maps presented by Grad-CAM, and part d indicates the statistical results of the implant identification sites.

TABLE 1 Structural parameters of the CNN model.
Layers (categories)    Output shape    Number of parameters