Identification of Location and Geometry of Invisible Internal Defects in Structures using Deep Learning and Surface Deformation Field

On‐site inspection of invisible subsurface defects in multiscale structural materials by conventional nondestructive testing (NDT) methods, such as X‐ray and ultrasound, requires complex sample preparation and data acquisition processes. Moreover, the inspected area is very small. Herein, a simple, inexpensive, and ultrasensitive NDT method for identifying and classifying the geometries of subsurface defects using commercial cameras, digital image correlation software, and object detection (OD) algorithms is developed. Three OD algorithms—Faster region‐based convolutional neural network (Faster R‐CNN), Mask R‐CNN, and you‐only‐look‐once (YOLO)v3—are evaluated for their ability to locate defects and identify defect geometries. Specifically, bounding boxes of two sizes (large and small) are applied to the regions of defect‐induced perturbations in strain tensors, which serve as virtual representatives of invisible subsurface defects. The performance of the proposed approach is validated on test datasets of known and unknown defect types. The experimental results confirm that the proposed approach can effectively utilize the surface deformation field information to accurately and reliably locate and identify subsurface defects. The method is nondestructive and low cost, enables real‐time detection, is robust against noise‐dominated deformation fields, and can be applied to various structural deformations. The method is therefore suitable for multiscale structural health monitoring and characterization of internal defects in materials.


Introduction
Object detection (OD) is a powerful computer vision technique used to identify objects within images by creating rectangular bounding boxes and classifying object categories. [1] However, the application of conventional OD methods is limited by their high computational complexity, high window redundancy, and manual feature design. Girshick et al. first proposed an OD approach that utilized the features of a region-based convolutional neural network (R-CNN) that also incorporated deep convolutional neural networks (DCNNs). [2][8] These models employ different network architectures, training strategies, and optimization functions.[20] As a contribution to the field of defect detection, this study utilizes OD to detect and classify invisible internal defects in materials based on their geometric information. Moreover, OD can enable nondestructive material testing and failure analysis. Internal defect detection is crucial for determining the safety, mechanical integrity, and reliability of various structures (civil, mechanical, nuclear, and aeronautical structures), transportation vehicles, industrial equipment, and microelectronic devices.
During manufacturing or while in service, materials can develop internal abnormalities such as voids and cracks due to severe mechanical conditions.[23][24] Thus, failure to inspect and locate these defects before they become critical can result in catastrophic failure, injury, and even loss of human life. [25][28] However, such methods require complex and expensive sample preparation techniques and data acquisition systems. Moreover, the inspected area is small. These characteristics make them unsuitable for inspecting the integrity of multiscale structures on-site. Therefore, new approaches that are aligned with industrial practices (e.g., visual inspection) must be developed.
[31][32] This method tracks pixel movements through correlated speckle pattern subsets from each image in a sequence to accurately describe the full-field deformation over the specimen surface. This can aid in detecting invisible internal defects by correlating the defect-induced perturbations in the deformation field.[35] However, this approach requires a finely tuned finite-element model (FEM) of the structure, which can be computationally expensive.[35] Additionally, plastic deformation around the subsurface defect can affect the accuracy of measurements.
To eliminate the need for computationally demanding FEM models, mesh-free methods that incorporate physics-informed neural networks have been used to solve the geometric identification problem in continuum solid mechanics. [36] However, these approaches require training datasets of the deformation field from all surfaces as well as the interior of the structure. Since DIC only measures surface deformation, the lack of internal deformation information makes it challenging for physics-informed neural networks to solve the inverse problems.
To address the aforementioned challenges, this study proposes an indirect method of identifying and classifying the geometry of invisible subsurface defects using the OD algorithm and surface deformation field acquired using DIC. The workflow of the proposed method is illustrated in Figure 1. The proposed method uses elastic and plastic deformation to construct 3D strain tensors with three strain components ($\varepsilon_{yy}$, $\varepsilon_{xx}$, $\varepsilon_{xy}$) to build the training and testing datasets. The training dataset comprises six classes of distinct defect types such as lines and circles with varying depths, whereas the test dataset contains unknown defect types such as triangles with varying depths and intact structures (free of defects), in addition to the classes involved in the training (known defect type). As indicated in Figure 1, the perturbations in the strain map, induced by the subsurface defect, propagate to a larger area compared with the apparent defect size from the measurement surface. Herein, we considered two sizes of bounding boxes corresponding to large and small areas of the defect-induced perturbations in the strain tensors. These serve as virtual representations of the invisible subsurface defects. The large bounding box contains the majority of the perturbations, whereas the small bounding box includes perturbations equivalent to the projected shape of the defect as seen from the measurement surface. In particular, three popular OD algorithms (Faster R-CNN, Mask R-CNN, and YOLOv3) were trained individually with the large bounding box to compare and validate their predictions. In addition, only Mask R-CNN was trained with the small bounding box, leveraging the additional task of generating masks of any polygonal shape. The overall architecture of the OD algorithms remains the same as in previous reports, [4,5,8] with modifications in the input layer to accommodate the input strain tensor in NumPy format. The results confirm that the proposed approach exhibits remarkable detection ability even for unknown defect types and even when significant perturbations do not appear in the strain tensors, as illustrated in Figure 1. However, some misclassification at shallow depths under small deformations suggests that the proposed method has room for improvement. The proposed method establishes a new direction in structural health monitoring, as it is nondestructive and cost-effective, permits real-time inspection, can be implemented even with noise-dominated strain fields, and is applicable to a wide range of structural deformations.
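As a rough illustration of the input format described above, the following sketch stacks three surface strain maps into a single height x width x 3 NumPy array and defines a large and a small square box around an assumed defect centre. The map size, box sizes, channel order, and the function names `build_strain_tensor` and `centered_box` are hypothetical choices made for this example and are not taken from the study.

```python
import numpy as np

def build_strain_tensor(e_yy, e_xx, e_xy):
    """Stack three 2D strain maps into one H x W x 3 array.

    The channel order (yy, xx, xy) is an assumption for this sketch.
    """
    maps = [np.asarray(m, dtype=np.float32) for m in (e_yy, e_xx, e_xy)]
    if len({m.shape for m in maps}) != 1:
        raise ValueError("all strain maps must share the same shape")
    return np.stack(maps, axis=-1)

def centered_box(cx, cy, size, width, height):
    """Return (x1, y1, x2, y2) for a square box of side `size`, clipped to the map."""
    half = size / 2.0
    x1, y1 = max(0.0, cx - half), max(0.0, cy - half)
    x2, y2 = min(float(width), cx + half), min(float(height), cy + half)
    return (x1, y1, x2, y2)

# Synthetic placeholder maps; in practice these would come from a DIC measurement.
rng = np.random.default_rng(0)
h, w = 128, 128
tensor = build_strain_tensor(
    rng.normal(0.0, 1e-4, (h, w)),
    rng.normal(0.0, 1e-4, (h, w)),
    rng.normal(0.0, 1e-4, (h, w)),
)

# Hypothetical large and small boxes around a defect assumed at the map centre.
large_box = centered_box(64, 64, 80, w, h)
small_box = centered_box(64, 64, 20, w, h)
```

In this sketch the large box is meant to enclose most of the perturbed region, while the small box approximates the projected defect footprint; how the boxes were actually sized in the study is not specified here.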

Performance Evaluation of OD Algorithms with Large Bounding Box and Known Defect Types
The strain tensors obtained from the DIC measurement as the test dataset are visualized in Figure 2 in terms of sequential $\varepsilon_{yy}$ strain maps, from zero external load to the load that induces plastic deformation, for all six classes included in the training set. The corresponding external load of each strain map is displayed on the top. In each class, prominent defect-induced perturbations with distinctive shapes can be observed from the strain maps (last row of Figure 2). However, the perturbation quality gradually degrades, as can be seen from the last row to the first, and at a certain point, the perturbation completely disappears. As bounded by the dotted lines, this disappearance occurs earlier in classes with shallow depth (circle-0.5 and line-0.5) compared with those with high depth (circle-2.5 and line-2.5). In other words, the load required to clearly observe the defect-induced perturbation increases with decreasing depth. For example, the loads at which the defect-induced perturbation becomes noticeable in circle-2.5, circle-1.5, and circle-0.5 are 5.9, 16.66, and 18.81 kN, respectively. A similar trend is observed for line-2.5 (10.87 kN), line-1.5 (14.86 kN), and line-0.5 (16.65 kN). Note that the invisible subsurface defects also produce distinctive perturbations in the $\varepsilon_{xx}$ and $\varepsilon_{xy}$ strain maps of the same defect types, as illustrated in Figure S1 (Supporting Information).
[35] This noise can be observed in the strain maps presented in the first row of Figure 2, which depict exclusively the noise distribution when the corresponding external load is zero. As several factors (as mentioned earlier) cause noise in DIC measurements, noise elimination remains challenging. [35] However, we are more interested in deriving the strain tensors than capturing the strain tensor images because acquiring a minimum amount of true strain data that responds to the underlying subsurface defects can significantly assist the OD algorithms in identifying and characterizing invisible subsurface defects of various shapes and depths at varying load limits.
As observed in Figure 2, the majority of the training dataset comprised strain tensors with imperceptible perturbations.
Nevertheless, the well-converged curve of the total training loss of Faster R-CNN, Mask R-CNN, and YOLOv3 plotted in Figure S2 (Supporting Information) suggests that the OD algorithms can effectively learn to scan through the noise-contaminated strain tensors and identify the location and classes of the defects. Note that we incorporated various augmentations such as random cropping, rotation, horizontal flipping, and vertical flipping during the training process to prevent overfitting. [37] The predicted bounding boxes and corresponding classes generated by Faster R-CNN, including their ground-truth bounding boxes, are displayed in Figure 2. The results obtained from Mask R-CNN and YOLOv3 are presented in Figure S3 and S4 (Supporting Information), respectively. Although the OD algorithms can predict one or more bounding boxes, this research only considered the bounding box with the highest confidence score, as the other bounding boxes have significantly lower confidence scores. As observed, the positions of the ground-truth boxes shifted randomly owing to random cropping.
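The rule of keeping only the highest-confidence bounding box can be sketched as follows. This is a minimal illustration, not the study's code: the function name `keep_top_detection`, the `(x1, y1, x2, y2)` box layout, and the example detections are all assumptions.

```python
import numpy as np

def keep_top_detection(boxes, scores, labels):
    """Return (box, score, label) for the highest-scoring detection.

    Returns None when the detector produced no boxes for the frame.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.size == 0:
        return None
    best = int(np.argmax(scores))
    return tuple(boxes[best]), float(scores[best]), labels[best]

# Hypothetical detector output for one strain tensor (values invented for illustration).
boxes = [(10, 12, 90, 95), (40, 40, 60, 60), (0, 0, 30, 30)]
scores = [0.97, 0.21, 0.08]
labels = ["circle-2.5", "line-0.5", "circle-0.5"]

top = keep_top_detection(boxes, scores, labels)
```

Returning `None` for an empty prediction keeps the "no detection" case explicit, which matters when frames without any detection are counted separately.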
As the deep learning-based OD algorithms were trained with randomly cropped strain tensors, the tested strain tensors were cropped uniformly at random to assess the ability of the model to identify the random positions of defects. The majority of the predicted bounding boxes in Figure 2, S3, and S4 (Supporting Information) [...] and defect localization tasks effectively, despite certain misclassifications in the classes with shallow defects under small deformation, such as line-0.5 and circle-0.5. This classification task is critical, as it aids in interpreting the geometric information of the invisible subsurface defects. The misclassification with shallow defects, particularly at small deformation, is due to insufficient perturbations on the strain field. The magnitude of concentrated strain in the vicinity of a shallow defect is low compared to that in the vicinity of a deep defect; therefore, it is not sufficient to induce perturbations on the strain field. [38] Moreover, noise further degrades data quality, which is more severe when deformation is low. This is because the magnitudes of noise and strain are in the same range. Thus, data quality in line-0.5 and circle-0.5 is good enough to localize the defects but not good enough to correctly classify the defect in the case of small deformations.
Figure 2. Sequential $\varepsilon_{yy}$ strain maps representing test dataset of known defect types and the results predicted using Faster region-based convolutional neural network (Faster R-CNN) trained with a large bounding box. The dotted lines distinguish notable defect-induced perturbations from unnoticeable perturbations; the ground-truth bounding box is colored pink, whereas the predicted bounding box is black. Additionally, the predicted class with the corresponding confidence score and intersection over union (IoU) between the ground-truth and predicted bounding boxes is provided. The color bar represents the strain distribution.
To evaluate the performance of the deep learning-based OD algorithms for each class, the IoUs of numerous sequential frames in each class were evaluated according to the predicted bounding boxes and ground truth. The corresponding bar plots and load curves are presented in Figure 3. The confidence score of each frame is presented in Figure S6 (Supporting Information). To highlight the classification task, the IoUs are plotted with six distinct colors, each representing a predicted class. The first, second, and third columns summarize the results predicted by Faster R-CNN, Mask R-CNN, and YOLOv3, respectively, wherein each row represents a given class.
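For reference, the IoU between a predicted and a ground-truth box can be computed as in the sketch below. It uses the standard definition for axis-aligned boxes in `(x1, y1, x2, y2)` form; the function name `box_iou` and the example boxes are illustrative assumptions rather than values from the study.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical boxes: the prediction is shifted by half a box width.
ground_truth = (0.0, 0.0, 10.0, 10.0)
predicted = (5.0, 0.0, 15.0, 10.0)
iou = box_iou(ground_truth, predicted)  # overlap 50, union 150
```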
Based on the load curve, the tested strain tensors can be categorized into three distinct categories: no external load, linearly increasing load (elastic deformation), and nonlinearly increasing load (plastic deformation), as depicted in the first column of Figure 3. In the load-free region where the strain tensors represent noise distribution, all models occasionally predicted the defect location. The detection in the elastic region can be classified into three primary categories: inconsistent detection with misclassification, consistent detection with a mixture of misclassification and accurate classification, and consistent detection with accurate classification. The load spans of these three regions depended on the defect depth. The load spans of the first and second regions decreased as the depth increased, whereas that of the third region increased with depth. For instance, the results obtained by Faster R-CNN show that the load span of the first region in line-0.5, i.e., 0-9.5 kN, diminished to 0-2.5 kN in line-1.5 and further to 0-0.5 kN in line-2.5. Similar tendencies were observed with Mask R-CNN and YOLOv3, with marginal variations in the load span. Note that the first category of the elastic region exhibited certain similarities with the load-free region in terms of the detection rate. However, the consistency in IoU in the first category of the elastic region suggests that the detection was not random, unlike that observed in the load-free region.

Performance Evaluation of OD Algorithms with a Large Bounding Box and Unknown Defect Types
The performances of the well-trained Faster R-CNN, Mask R-CNN, and YOLOv3 models with large bounding boxes were further evaluated for classes with varying geometric shapes that were excluded from the training process. Specifically, three samples with triangular defects of varying depths were tested to demonstrate the models' ability to locate unknown defect types.
The predicted bounding boxes and classes from the evaluation are illustrated in Figure 4a (Faster R-CNN), Figure S7a (Mask R-CNN), and Figure S8a (Supporting Information) (YOLOv3) in the corresponding sequential $\varepsilon_{yy}$ strain maps. The sequential strain maps were obtained for a range of loads, from zero load up to the load that induced plastic deformation. Moreover, the strain tensors were cropped uniformly at random to randomize the locations of the subsurface defects. Most of the predicted bounding boxes in the figures closely correspond to their ground-truth bounding boxes, and the displayed IoU values indicate adequate overlap even in the absence of significant perturbations. Thus, unknown defect types can be detected using the OD algorithms trained on various defect types.
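Random cropping changes where the defect sits in the frame, so the ground-truth box has to be shifted into the crop's coordinate system. The sketch below shows one way to do this, assuming an H x W x C NumPy array and an `(x1, y1, x2, y2)` box in pixel coordinates; the function name `random_crop_with_box`, the crop size, and the example box are hypothetical and not taken from the study.

```python
import numpy as np

def random_crop_with_box(tensor, box, crop_h, crop_w, rng):
    """Crop an H x W x C array at a uniformly random offset and shift the box.

    Returns the cropped array, the box in crop coordinates (clipped to the
    crop), and the (top, left) offset that was drawn.
    """
    h, w = tensor.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size must not exceed the tensor size")

    # Upper bound is exclusive, so +1 allows the crop to touch the far edge.
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    cropped = tensor[top:top + crop_h, left:left + crop_w]

    x1, y1, x2, y2 = box
    new_box = (
        min(max(x1 - left, 0), crop_w),
        min(max(y1 - top, 0), crop_h),
        min(max(x2 - left, 0), crop_w),
        min(max(y2 - top, 0), crop_h),
    )
    return cropped, new_box, (top, left)

# Synthetic 128 x 128 x 3 strain tensor and an invented ground-truth box.
rng = np.random.default_rng(42)
tensor = rng.normal(0.0, 1e-4, (128, 128, 3))
cropped, new_box, (top, left) = random_crop_with_box(
    tensor, (40, 40, 90, 90), 96, 96, rng
)
```

With these sizes the offset is at most 32 pixels in each direction, so the example box always stays fully inside the crop; a box near the edge would instead be clipped.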
Regarding classification, the three models predominantly selected line-1.5 to classify tri-0.5 and tri-1.5, whereas line-2.5 was selected to classify tri-2.5. However, the perturbation features triggering the models to display an inclination toward certain defect types included in the training could not be precisely determined. Nonetheless, a closer inspection of the perturbations presented in Figure S1 (Supporting Information) suggests that certain features of the perturbations in one of the strain components of the predicted class may overlap with those of the perturbations in one of the strain components of the unknown defect types. For instance, the two elongated perturbations observed in the $\varepsilon_{yy}$ strain maps of tri-0.5 and tri-1.5 resemble those of line-1.5 in the $\varepsilon_{yy}$ strain maps, which may explain why the models classified tri-0.5 and tri-1.5 as line-1.5.
The IoU color bar plots for tri-0.5, tri-1.5, and tri-2.5 illustrated in Figure 4b-d S9 (Supporting Information). In the load-free region, all three models exhibited an occasional tendency to predict the defect location, but the locations were random as observed previously with known defect types. As discussed earlier, the detection characteristics were comparable in the elastic region, with the initially inconsistent detection followed by consistent detection. With increasing depth, the load span for inconsistent detection contracted while that for consistent detection expanded. For instance, the load span of the first region in tri-0.5 for Faster R-CNN, i.e., 0-13.5 kN, diminished to 0-3.5 kN in tri-1.5 and further to 0-0.5 kN in tri-2.5. Compared with Faster R-CNN and YOLOv3, Mask R-CNN detected more frames in the inconsistent detection region of elastic deformation. Note that Mask R-CNN exhibited superior detection ability for the known defect types (see Figure 3), indicating that it is a more efficient OD algorithm for detecting invisible subsurface defects with strain tensors without any significant perturbation. Mask R-CNN has an additional branch for predicting segmentation masks on each region of interest (ROI) in a pixel-to-pixel manner, while Faster R-CNN and YOLOv3 are not designed for pixel-to-pixel alignment between network inputs and outputs. [4,5,8] Consequently, Mask R-CNN with the additional network, where each element of the strain tensor dictates the total loss function of the model, performed better.
The models were further tested with strain tensors from the defect-free sample. The strain maps exhibit the locally inhomogeneous distribution of the strain along with occasionally predicted results (Figure 4a, S7a, and S8a, Supporting Information). To generalize the performance, numerous sequential frames were tested with all three models, and the bar plots of the confidence score as well as the load curves obtained from Faster R-CNN, Mask R-CNN, and YOLOv3 are plotted in [...]. By comparing the confidence score plots of the defect-introduced and defect-free specimens, two distinct features were observed: first, a tendency toward no detection in both the elastic and plastic regions of the defect-free specimen, and second, randomness in the classification throughout the loading of the defect-free specimen. The first characteristic was more prominent in Faster R-CNN, whereas the second was more prominent in both Faster R-CNN and YOLOv3. Notably, similar characteristics were observed in the load-free regions of every sample. Since the models were trained to detect defect-induced perturbations in strain tensors, the absence of the perturbations in the strain tensors in the load-free region as well as in the defect-free specimens showed similar outcomes, which were different from the outcomes of the loaded defect-introduced specimens.

Performance Evaluation of Mask R-CNN with a Small Bounding Box and Known and Unknown Defect Types
The findings of this study suggest that the three OD algorithms using large fixed-size bounding boxes can potentially detect and classify invisible subsurface defects in both known and unknown defect types. However, as the bounding boxes were larger than the actual defect size, the precise locations of the defects could not be determined. Additionally, the fixed-size bounding boxes did not account for the size and shape variations of defects occurring in real-world scenarios. Among the three algorithms tested in this study, the Mask R-CNN algorithm could arbitrarily segment the defect-induced perturbation. Thus, we investigated this feature using the same training and testing datasets as earlier but with different mask sizes and shapes, such as circular and stadium shapes (for line classes), as depicted in Figure 5a. The total training loss curve shown in Figure S2 (Supporting Information) implies that the model converges well.
The predicted masks illustrated in Figure 5a demonstrate that the model can detect the defect location with the same set of masks applied in the training. Despite accurately identifying the defect location in tri-1.5, the predicted mask corresponded to the predicted class, highlighting a limitation of supervised learning. Nonetheless, the potential for detecting both known and unknown defect types by considering a small portion of the defect-induced perturbation appeared promising and is further confirmed in Figure S10 and S11 (Supporting Information). Segmenting unknown object types is a current research topic addressed by class-agnostic semi-supervised learning frameworks in OD algorithms. [39] A similar framework will be applied in future work to address the limitations of the present work.
To evaluate the detection and classification performance of the model with known and unknown defect types, the bar plots of the IoU and the load curves are plotted in Figure 5b-g (known defect types) and Figure 5h-j (unknown defect types). The corresponding confidence score bar plot is depicted in Figure S12 (Supporting Information). For known defect types, the elastic region can be divided into three components as previously observed with known defect types under the framework of the large bounding box: inconsistent detection with misclassification, consistent detection with a mixture of misclassification and accurate classification, and consistent detection with accurate classification. Notably, fewer frames were detected in the first portion of the elastic region and its maximum load slightly increased as compared with the performance of Mask R-CNN with a large bounding box in each class. The results for the third part of the elastic region as well as the plastic region were similar in both bounding box sizes. Therefore, reducing the bounding box size decreased the amount of strain information originating from the subsurface defect, thereby affecting the ability of the model to locate the defect from the limited defect-induced strain information, particularly when the deformation is small. For the unknown defect types, two regions (inconsistent detection and consistent detection) were observed in the elastic region, as previously observed for unknown defect types under the framework of the large bounding box. The classification task exhibited a similar tendency, with most frames classified as line-1.5 and line-2.5.
In particular, an abrupt reduction in IoU was observed, as illustrated in Figure 5c,e,g-j, which was caused by the unequal size of the predicted and ground-truth bounding boxes. Additionally, the confidence score box plot for the defect-free specimen illustrated in Figure 5k exhibits two distinctive characteristics, which were observed earlier under the framework of the large bounding box: failure to detect and randomness in the classification throughout the loading.
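To see how a size mismatch alone can lower the IoU, consider an idealized case that is assumed here purely for illustration and is not taken from the study: a square predicted box $B_p$ of side $s_p$ lying entirely inside a square ground-truth box $B_g$ of side $s_g \ge s_p$. The intersection is then the predicted box and the union is the ground-truth box, so

```latex
\mathrm{IoU}
  = \frac{|B_p \cap B_g|}{|B_p \cup B_g|}
  = \frac{s_p^{2}}{s_g^{2}},
\qquad
\text{e.g. } \frac{s_p}{s_g} = 0.8 \;\Rightarrow\; \mathrm{IoU} = 0.64 .
```

Under this assumption, a prediction whose side is 20% too short loses 36% of its IoU even when it is perfectly placed, which is consistent with an abrupt drop appearing whenever the predicted box size changes.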
The overall performance of Mask R-CNN was similar across the frameworks for small and large bounding boxes, except for occasional misdetections in the first part of the elastic deformation in the case of using a small bounding box. Although a higher detection rate was achieved with a large bounding box, the evaluation metrics presented in Table S1 and S2 (Supporting Information) indicate that the overall classification accuracy was comparable in both cases. However, certain differences can be observed in the IoU values. The overall mean IoU was lower with the small bounding box (0.68) compared with that with a large bounding box (0.89). Nonetheless, the IoU values reported in Table S2 (Supporting Information) are within the acceptable range.
Note that the misclassification observed in the case of unknown defect geometries, such as triangular defects and defect-free instances, stems from inherent limitations of the generic OD algorithms used in this study. In particular, models such as Mask R-CNN, Faster R-CNN, and YOLOv3 operate on the principles of supervised learning, where they classify objects into predefined classes based on training data. This approach limits their performance when confronted with unseen defect types. The prevalence of certain classes (e.g., line-1.5) being chosen by the models for classifying triangular defects (tri-0.5 and tri-1.5) strongly indicates that the models can discern resemblances between new defect types and those encountered during training. As mentioned earlier, this similarity perception could be attributed to the shared perturbation features or strain patterns that the models learned from their training dataset. Importantly, if the training dataset had encompassed strain tensors from triangular and defect-free instances, complete with their corresponding labels, misclassification might have been averted. However, the current study involved testing three samples with triangular defects of differing depths and one sample without any defects. The objective was to assess the extent to which the models could accurately pinpoint previously unseen defect types while observing the nuances of their classification tendencies.

Discussion
The findings of this study reveal that defect localization and classification performance improved as the load and depth increased because of the higher quality of data obtained at greater depths and loads. Consequently, deep defects can provide sufficient data to identify defects with accurate geometrical information, even with external loads that are smaller than their existing load-bearing capacity. Moreover, deep defects are more likely to cause catastrophic structural failure than shallow defects, thereby rendering accurate localization and characterization of their geometry crucial for structural health monitoring. Despite misclassification in the early stage of elastic deformation for shallow defects, the defect detection performance of the present method was excellent, demonstrating its ultrasensitive ability and indicating that, even with misclassified geometric information, defect detection can be used to monitor structural health when the defects are shallow. Moreover, the detection and classification characteristics of structures with and without defects could provide valuable information for monitoring structural health. The results revealed that using a significant portion of defect-induced perturbations improves the accuracy of defect detection, even with shallow defects under small deformations, compared to using a small portion of perturbations. In addition, using a small portion of perturbations could aid in identifying the exact locations of invisible subsurface defects. Therefore, using both types of bounding boxes is better than using a single box. Among the three OD algorithms used herein, Mask R-CNN performed slightly better at low deformations and could produce masks of any size, which can be significant for outlining the shapes of invisible subsurface defects using the surface deformation field.
Although several researchers have explored crack detection in structures such as roads, [40] aeroengine blades, [41] pipelines, [42] bridges, [43] tunnels, [44] and human limbs, [45] these studies have been limited to surface cracks. The proposed method is an advancement over existing methods in that it can identify invisible subsurface defects and reveal the geometric information at a low cost for real-time monitoring of structures in their natural states. The optical camera used for image recording can measure deformations in multiscale structures using multiple magnification lenses. Overall, the defect identification time was extremely short. DIC measurement for a single frame required ≈13.5 s with an Intel Xeon Gold 5220 2.2 GHz CPU. The OD time was comparatively short, with Faster R-CNN, Mask R-CNN, and YOLOv3 requiring ≈0.72, 0.15, and 0.17 s, respectively. Note that the DIC program used herein is open source, and a variety of powerful commercial DIC software such as GOM Correlate Pro and Vic-Software can perform real-time strain measurements. [46] Therefore, real-time defect identification can be enabled by upgrading the DIC software. The proposed method is cost-effective because no sophisticated instruments are required. Moreover, it is user-friendly and can be applied to a wide range of materials and deformations, without requiring additional information on the material properties or complex mathematical equations.
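The reported timings make it easy to see that DIC, not OD, dominates the per-frame cost. The short calculation below adds the approximate DIC time to each OD time and reports the share spent on OD. It assumes that the two steps run one after the other for every frame, which is a simplification made for this sketch; the function name `per_frame_budget` is hypothetical.

```python
# Approximate per-frame timings quoted in the text (seconds).
DIC_SECONDS = 13.5
OD_SECONDS = {"Faster R-CNN": 0.72, "Mask R-CNN": 0.15, "YOLOv3": 0.17}

def per_frame_budget(dic_s, od_s):
    """Return (total seconds per frame, fraction of that time spent on OD).

    Assumes DIC and OD run sequentially on each frame.
    """
    total = dic_s + od_s
    return total, od_s / total

for name, od_s in OD_SECONDS.items():
    total, od_share = per_frame_budget(DIC_SECONDS, od_s)
    print(f"{name}: {total:.2f} s per frame, OD share {od_share:.1%}")
```

Under this assumption OD accounts for only a few percent of the total in every case, which is why a faster DIC step is the lever for real-time operation.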
Adding to the advantages mentioned earlier, DIC is capable of delivering high spatial resolution by tracking individual pixels or sub-pixel features.This ability allows for intricate analysis of strain patterns.Additionally, DIC can comprehensively capture the entire surface deformation, offering a better understanding of how strain varies across different materials.Moreover, it is versatile and can be used for a wide array of materials, including electrically insulating and conducting materials.An emerging technology for nondestructive structural testing involves the use of coplanar capacitive sensors. [47]While these sensors can be used with a wide range of materials, their primary capability lies in measuring deformation within localized areas where electrodes are positioned.By contrast, DIC can provide data across the entire field of view and can capture more detailed strain patterns.A significant advantage of electronic devices used for visualizing strain distribution over DIC is that they do not require an unobstructed line of sight to the surface being analyzed.This can be especially useful in situations where the surface is complex or inaccessible, a limitation that DIC might encounter.Another emerging technique is microwave NDT, which involves using microwave or radar signals to evaluate internal material properties, including the detection of defects and variations in composition. [48]However, interpreting microwave data can be complex and requires expertise in electromagnetic-wave behavior and material interactions, making it less user-friendly than the DIC method.Machine learning can be incorporated into microwave NDT to predict the geometry of defects. [48]Although microwave NDT is effective in detecting defects and shapes, DIC offers a more comprehensive range of information, including stress and strain distribution.Over the past few years, mechanoluminescence has risen as a promising avenue for transforming NDT. 
[25,29,49] This strategy utilizes the emission of light by a material when it is subjected to mechanical stress, allowing for noninvasive assessment with remarkable spatial and temporal precision. However, the impact of afterglow on mechanoluminescence is an issue that needs to be resolved. [50] While continuous recharging from ultraviolet (UV) light can mitigate afterglow, it exposes the operator to health hazards from UV radiation. Lately, there has been considerable research into eddy current pulsed thermography (ECPT) techniques for defect assessment, offering the advantage of rapid testing and clear visualization of subtle thermal variations. [51] In scanning mode, ECPT efficiently surveys extensive areas to detect irregularities swiftly. The primary strength of ECPT lies in its proficiency at detecting defects, yet it does not provide the same level of comprehensive deformation data as DIC. Additionally, because it relies on eddy currents, ECPT can only be applied to conductive materials.
The current study has certain limitations that should be addressed in future research. The accuracy of measuring artificial defect size, derived from OD deep learning algorithms in this study, exhibited inconsistencies before reaching a specific threshold load. Additionally, this threshold load displayed dependence on the depth of the defect. Below this threshold load, inaccuracies in classification might have occurred due to low-quality data caused by noise dominance during periods of minimal deformation. Utilizing more robust commercial DIC software, such as GOM Correlate Pro and Vic-Software, known for their high signal-to-noise ratios, may mitigate this noise issue. Furthermore, the generic OD algorithms used here were originally optimized for object recognition in RGB images.
[54] Additionally, the current dataset only included six classes of defect geometries, and a larger dataset of strain tensors for various defect geometries in actual structural bodies under various loading conditions is required for successful implementation in the field. FEM simulation can be used to acquire the necessary training dataset.

Conclusion
A cost-effective and highly sensitive NDT method for identifying and categorizing subsurface defects using commonly available cameras, DIC software, and OD algorithms was successfully demonstrated. This study explored the ability of popular OD algorithms, specifically Faster R-CNN, Mask R-CNN, and YOLOv3, to pinpoint defect locations and classify their geometrical characteristics. Evaluations were performed using two bounding box sizes, large and small, to capture regions where subsurface defects caused strain tensor perturbations. The results indicated that utilizing a significant portion of defect-induced perturbations using a large bounding box yielded more precise defect detection, even for shallow defects and minor deformations, compared with a smaller bounding box. However, the latter aided in precisely pinpointing the locations of invisible subsurface defects. The results also revealed that defect localization and classification improved with increasing load and depth. This improvement was attributed to the higher-quality data obtained at greater depths and under heavier loads. Among the three OD algorithms assessed, Mask R-CNN performed best, particularly with low deformations. Overall, the method exhibited swift defect identification times, with Faster R-CNN, Mask R-CNN, and YOLOv3 requiring approximately 0.72, 0.15, and 0.17 s, respectively. The innovative approach developed in this study opens up new horizons in the domain of multiscale structural health monitoring and characterization of concealed internal defects within materials through surface deformation fields.

Experimental Section
Materials: The specimens were prepared using aluminum alloy 6061-T6 (AA6061-T6) purchased from a local market in South Korea. To generate the random speckle pattern on the specimen surface, white and black spray paints were obtained from TOOLSMRO Company.
Specimen Fabrication: As depicted in Figure 6a, 28 specimens having dimensions of 170 × 55 × 3 mm (length × breadth × thickness) were fabricated from AA6061-T6 using a water-jet cutting machine. The two ends of each specimen were drilled with five through holes of 6 mm diameter to secure the specimen onto the loading stage. A defect was artificially introduced in 27 of the 28 specimens using a water-jet cutting machine. Circular openings of 6 mm diameter were drilled in 12 of these specimens at depths of 0.5 mm (4 specimens), 1.5 mm (4 specimens), and 2.5 mm (4 specimens). Similarly, stadium-shaped openings of 6 mm length and 1.5 mm width were drilled in 12 of these specimens at depths of 0.5 mm (4 specimens), 1.5 mm (4 specimens), and 2.5 mm (4 specimens). Triangular openings (equilateral triangle of side 6 mm) of depths 0.5, 1.5, and 2.5 mm were drilled in three specimens. One specimen was left defect-free. The depths of 0.5, 1.5, and 2.5 mm were 16.67%, 50%, and 83.33%, respectively, of the specimen thickness. To facilitate DIC measurement, random speckle patterns were generated on the defect-free surfaces of the specimens by spraying white and black paint.
Experiment: The experimental configuration is depicted in Figure 6b. An Instron 5567 universal testing machine was utilized to perform the loading tests at a crosshead speed of 0.01 mm s⁻¹ until plastic deformation occurred. During the loading process, the deformation surface was captured using a Sony a7 IV camera. To ensure adequate light reflection from the specimen surface for producing high-contrast speckle pattern images required for DIC measurement, two white LED tube lamps were vertically mounted near the specimen.
Strain Map Generation using DIC Method: The experiment yielded 28 videos (one video for each specimen), from which images were extracted at a frame rate of 3 frames per second (fps). Thereafter, the strain maps were generated using Ncorr, which is an open-source 2D DIC MATLAB program. [30] The generated strain maps are illustrated in Figure 6c, including longitudinal (ε_yy), transverse (ε_xx), and shear strain components (ε_xy). The same DIC parameters such as the subset radius (15 pixels), subset spacing (1 pixel), and strain radius (15 pixels) were applied to all measurements. Ncorr can handle multiple photographs, thereby enabling the generation of strain maps from all photographs of the corresponding video.
Data Preprocessing: The data were preprocessed in a Python environment. First, the three strain maps for each frame were stacked to create 3D strain tensors with a shape of width × height × channel, similar to the images captured with RGB channels. The channels were ordered as (ε_yy, ε_xx, ε_xy). Because the camera position was fixed for the entire duration of the experiment, the defect position had to be artificially modified to avoid overfitting while training the deep learning-based OD algorithms. To this end, the strain tensors were randomly cropped, which caused a random shift in the defect position in both the horizontal and vertical directions. Before random cropping, a reference point (R) was established on the strain tensor, which coincided with the central point of the defect projection on the measurement surface, as illustrated in Figure 6c. With the reference point as the origin, a cropping window of size 400 × 600 pixels was set to traverse randomly in all directions, which resulted in a new strain tensor of the same size as the cropping window. The cropping window was restricted to move within ±30 pixels in the horizontal direction and ±130 pixels in the vertical direction, ensuring that the defect was adequately confined within the cropped tensor with sufficient space for adjusting the bounding box. To exploit the benefit of random cropping, each strain tensor was randomly cropped four times to yield four times as many training samples, while each tensor was randomly cropped once to generate the test datasets. After cropping, the strain tensors were rescaled to the range of −1 to 1 in every channel and saved in NumPy format, as depicted in Figure 6c.
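The preprocessing steps described above (stacking, random cropping about the reference point R, and per-channel rescaling) can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, arrays are assumed to be indexed as (row, column, channel), the 400 × 600 window is assumed to be 400 pixels wide and 600 pixels high, the window shift is assumed to be uniformly distributed within 30 pixels horizontally and 130 pixels vertically, and the rescaling is assumed to be a min-max mapping per channel. The synthetic input at the end merely stands in for real DIC strain maps.

```python
import numpy as np

CROP_W, CROP_H = 400, 600   # cropping window size in pixels (from the text)
MAX_DX, MAX_DY = 30, 130    # maximum window shift in pixels (from the text)


def stack_strain_maps(e_yy, e_xx, e_xy):
    """Stack three 2D strain maps into one (rows, cols, 3) tensor."""
    return np.stack([e_yy, e_xx, e_xy], axis=-1)


def random_crop(tensor, ref_xy, rng):
    """Crop a window whose center is randomly shifted away from R.

    Returns the cropped tensor and R expressed in crop coordinates.
    """
    ref_x, ref_y = ref_xy
    dx = int(rng.integers(-MAX_DX, MAX_DX + 1))
    dy = int(rng.integers(-MAX_DY, MAX_DY + 1))
    left = ref_x + dx - CROP_W // 2
    top = ref_y + dy - CROP_H // 2
    rows, cols = tensor.shape[:2]
    if left < 0 or top < 0 or left + CROP_W > cols or top + CROP_H > rows:
        raise ValueError("cropping window falls outside the strain map")
    crop = tensor[top:top + CROP_H, left:left + CROP_W, :]
    return crop, (ref_x - left, ref_y - top)


def rescale_channels(tensor):
    """Min-max rescale every channel independently to [-1, 1]."""
    lo = tensor.min(axis=(0, 1), keepdims=True)
    hi = tensor.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant channels
    return 2.0 * (tensor - lo) / span - 1.0


# Synthetic stand-ins for the three strain maps of one frame.
rng = np.random.default_rng(0)
maps = [rng.normal(size=(1000, 600)) for _ in range(3)]
tensor = stack_strain_maps(*maps)
crop, ref_in_crop = random_crop(tensor, ref_xy=(300, 500), rng=rng)
scaled = rescale_channels(crop)
```

Returning R in crop coordinates keeps the annotation step simple, since the bounding box and mask can then be placed relative to the shifted reference point. The rescaled tensor could afterwards be written to disk with `np.save`.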
Bounding Box/Mask Measurement: For a large bounding box, we considered a fixed size of 300 × 300 pixels. To comply with the input format of Faster R-CNN [4] and YOLOv3, [7] the minimum and maximum values of the x- and y-coordinates of the bounding box were recorded for the former, whereas the x- and y-coordinates of the center of the bounding box including its width and height were recorded for the latter. Similarly, for Mask R-CNN, [5] a mask 300 × 300 pixels in size with ≈200 coordinates was recorded along with the bounding box information, similar to the process for Faster R-CNN. The reference point established during random cropping was used to extract the box and mask coordinates.
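The two box encodings mentioned above can be illustrated with a short Python sketch. It assumes, for illustration only, that the large box is centered on the reference point; the function names are hypothetical, and any further normalization that a particular YOLOv3 implementation may expect (for example, dividing by the image size) is not shown.

```python
def large_box_corners(ref_x, ref_y, size=300):
    """Corner format (x_min, y_min, x_max, y_max), as recorded for Faster R-CNN.

    Assumption: the fixed-size box is centered on the reference point.
    """
    half = size / 2
    return (ref_x - half, ref_y - half, ref_x + half, ref_y + half)


def corners_to_center(box):
    """Convert corner format to (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2,
            x_max - x_min, y_max - y_min)


def center_to_corners(box):
    """Convert (x_center, y_center, width, height) back to corner format."""
    xc, yc, w, h = box
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# Example: reference point at (200, 300) in crop coordinates.
corner_box = large_box_corners(200, 300)
center_box = corners_to_center(corner_box)
```

Both encodings carry the same information, so a single annotation derived from the reference point can serve all three detectors.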
For the small bounding box, the coordinates of the mask corresponding to the shape of the defect were extracted using the reference point. In addition, two types of masks (circular and stadium) were used to accommodate the mask variations in the training dataset. The mask was defined by recording ≈200 coordinates, and the minimum and maximum x- and y-coordinates of the bounding box enclosing the mask were recorded.
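One way to generate such masks is sketched below in Python. The sketch builds a polygon of 200 vertices for a circular or a stadium-shaped defect around the reference point and derives the enclosing box from it. The function names, the pixel dimensions in the example, and the horizontal orientation of the stadium are assumptions made for illustration; the text does not specify how the ≈200 coordinates were distributed along the outline.

```python
import math


def circle_mask(cx, cy, radius, n=200):
    """Return n polygon vertices evenly spaced on a circle."""
    return [(cx + radius * math.cos(2 * math.pi * k / n),
             cy + radius * math.sin(2 * math.pi * k / n))
            for k in range(n)]


def stadium_mask(cx, cy, length, width, n=200):
    """Return n polygon vertices on a horizontal stadium outline.

    The outline is two semicircular caps of radius width / 2; the straight
    edges are implied by joining the last vertex of one cap to the first
    vertex of the other.
    """
    r = width / 2
    half_straight = (length - width) / 2
    per_arc = n // 2
    points = []
    for k in range(per_arc):  # right cap, from -90 to +90 degrees
        a = -math.pi / 2 + math.pi * k / (per_arc - 1)
        points.append((cx + half_straight + r * math.cos(a),
                       cy + r * math.sin(a)))
    for k in range(per_arc):  # left cap, from +90 to +270 degrees
        a = math.pi / 2 + math.pi * k / (per_arc - 1)
        points.append((cx - half_straight + r * math.cos(a),
                       cy + r * math.sin(a)))
    return points


def enclosing_box(points):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around a mask."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical pixel sizes; the real values depend on the image scale.
circle = circle_mask(200, 300, radius=40)
stadium = stadium_mask(200, 300, length=80, width=20)
circle_box = enclosing_box(circle)
stadium_box = enclosing_box(stadium)
```

Because the box is computed from the mask vertices, the small bounding box automatically follows the defect shape.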
Training and Test Datasets: The training datasets contained six classes denoted as "line-0.5", "line-1.5", "line-2.5", "circle-0.5", "circle-1.5", and "circle-2.5". The "line" class represents stadium-shaped defects, whereas the "circle" class represents circular defects. The numerical value following "line" or "circle" indicates the defect depth. For each class, we conducted four experiments, with the strain tensors of the three specimens allocated for training and that of the remaining specimen for testing. After performing random cropping four times, the total number of strain tensors available for training was 4684 (line-0.1.
Training Configuration of Faster R-CNN, Mask R-CNN, and YOLOv3: Faster R-CNN, Mask R-CNN, and YOLOv3 are prominent OD frameworks, each having distinct advantages. Faster R-CNN introduced a region proposal network that efficiently generated region proposals, eliminating the need for external proposals and improving object localization accuracy. [4] It achieved a balance between precision and speed through its two-stage architecture, making it suitable for tasks demanding accurate detection. Mask R-CNN built upon Faster R-CNN by incorporating an additional mask prediction branch, thereby enabling pixel-level instance segmentation. [5] This precise delineation of object boundaries is invaluable for applications such as image manipulation and medical image analysis. [8] Its single-stage architecture directly predicts bounding boxes and class probabilities on a grid, ensuring remarkable speed, making it a prime candidate for applications requiring swift responses, such as autonomous vehicles and video

Table 1. Types of classes with defect dimensions, specimen quantity, and number of strain tensors for training and testing object detection algorithms. The units are expressed in millimeters (mm). L, W, D, and R denote length, width, depth, and radius, respectively.

Figure 1. Workflow of the present study. A video of a defect-free surface with an artificially created speckle pattern was recorded during loading. Individual frames extracted from the video were used to generate the three strain components used to create 3D tensors, which were rescaled to a range of −1 to 1 in each channel to represent the final form of data for training and testing the object detection (OD) algorithms. Two distinct bounding boxes were investigated, considering the large and small portions of the defect-induced perturbation on strain components as virtual representatives of an invisible subsurface defect. Additionally, the presence and absence of defect-induced perturbations along with the underlying defect were highlighted in the strain maps representing elastic and plastic deformation. All possible outcomes from two different bounding boxes are presented in the figure.

Figure 2. Sequential ε_yy strain maps representing test dataset of known defect types and the results predicted using Faster region-based convolutional neural network (Faster R-CNN) trained with a large bounding box. The dotted lines distinguish notable defect-induced perturbations from unnoticeable perturbations; the ground-truth bounding box is colored pink, whereas the predicted bounding box is black. Additionally, the predicted class with the corresponding confidence score and intersection over union (IoU) between the ground-truth and predicted bounding boxes is provided. The color bar represents the strain distribution.
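For reference, the IoU between a ground-truth and a predicted bounding box can be computed as in the following Python sketch. This is the standard definition (intersection area divided by union area) for axis-aligned boxes in corner format; the function name and the example boxes are illustrative and are not taken from the reported experiments.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Illustrative boxes: a 300 x 300 ground truth and a prediction shifted 30 px.
ground_truth = (50, 150, 350, 450)
predicted = (80, 150, 380, 450)
score = iou(ground_truth, predicted)
```

An IoU of 1 indicates a perfect overlap, whereas an IoU of 0 indicates that the predicted box misses the ground-truth box entirely.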
(Faster R-CNN), Figure S7b-d (Mask R-CNN), and Figure S8b-d (Supporting Information) (YOLOv3) along with their corresponding load curves reveal the possibility of detecting unknown defect types with greater detail.The bar plots of the corresponding confidence score are charted in Figure

Figure 3. IoU color bar plots illustrating the correlations of detection and classification tasks of OD algorithms with the load and depth in a test dataset of known defect types. The first, second, and third columns represent the results of Faster R-CNN, Mask R-CNN, and you-only-look-once (YOLO)v3, respectively, whereas each row represents a specific defect class. Furthermore, predicted classes are presented using distinct colors, as expressed in the legend at the bottom. The load curves are segmented into three regions: load-free region, linearly increasing load (elastic deformation), and nonlinearly increasing load (plastic deformation).

Figure 4e, S7e, and S8e (Supporting Information), respectively. By comparing the confidence score plots of the defect-introduced and defect-free specimens, two distinct features were observed: first, the tendency of not detecting both the elastic and plastic regions in the defect-free specimen, and second, the randomness in the classification throughout the loading in the defect-free specimen. The first characteristic was more prominent in Faster R-CNN, whereas the second was more prominent in both Faster R-CNN and YOLOv3. Notably, similar characteristics were observed in the load-free regions of every sample. Since the models were trained to detect defect-induced perturbations in strain tensors, the absence of the perturbations in the strain tensors in

Figure 4. Faster R-CNN performance on test dataset of unknown defect types. a) Sequential ε_yy strain maps from tri-0.5, tri-1.5, tri-2.5, and defect-free specimens illustrating the evolution of defect-induced perturbations and the results predicted using Faster R-CNN. b-d) IoU color bar plots illustrating the correlation of the detection and classification tasks of Faster R-CNN with the load and depth in the test dataset of unknown defect types. e) Color bar plots of the confidence score including the corresponding load curve of the defect-free specimen.

Figure 5. Mask R-CNN performance with a small bounding box on the test dataset consisting of known and unknown defect types. a) Predicted mask and bounding box (black) with ground-truth mask and bounding box (pink) are illustrated on ε_yy strain maps of line-1.5, circle-1.5, and tri-1.5 classes. b-g) IoU color bar plots correlating the detection and classification tasks of Mask R-CNN with the load and depth in the test dataset of known defect types. h-j) IoU color bar plots correlating the detection and classification tasks of Mask R-CNN with the load and depth in the test dataset of unknown defect types. k) Bar plots of confidence scores along with the corresponding load curve of the defect-free specimen.

Figure 6. Experimental procedures and data generation. a) Schematic of four different specimen types, where three specimens contain artificial defects of varying geometries. Viewed from the surface, the defect geometries were a stadium, a triangle, and a circle. b) Experimental setup for loading tests along with the video recording system. c) Sequential procedures used to generate a 3D strain tensor as well as the bounding box and mask required for training and testing deep learning-based object detection algorithms.