Ultrasound imaging in thyroid nodule diagnosis, therapy, and follow‐up: Current status and future trends

Ultrasound, the primary imaging modality in thyroid nodule management, suffers from drawbacks including: high inter‐ and intra‐observer variability, limited field‐of‐view and limited functional imaging. Developments in ultrasound technologies are taking place to overcome these limitations, including three‐dimensional‐Doppler, ‐elastography, ‐nodule characteristics‐extraction, and novel machine‐learning algorithms. For thyroid ablative treatments and biopsies, perioperative use of three‐dimensional ultrasound opens a new field of research. This review provides an overview of the current and future applications of ultrasound, and discusses the potential of new developments and trends that may improve the diagnosis, therapy, and follow‐up of thyroid nodules.

Formal consent is not required; however, one or more of our cited studies used human subjects, and these were, to the best of our knowledge, conducted in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

| Current status-overview
Thyroid nodules are very common. Based on ultrasound examinations and autopsy, up to two-thirds of adults have one or more thyroid nodules, of which roughly 5% are symptomatic and 5%-10% are malignant. [1][2][3][4][5][6][7] Given this prevalence, it is important to identify the malignancies accurately so that each patient can be offered appropriate management. Thyroid ultrasound examination is focused on the thyroid and surrounding lymph nodes. 8 Single ultrasound characteristics are not fully indicative of either benign or malignant nodules. However, as current guidelines show, when combined into a risk stratification system (RSS) they can offer a more accurate differentiation. 8 Based on the guidelines of the various thyroid associations, [8][9][10][11][12][13] we have identified current trends for ultrasound use in the diagnosis of thyroid nodules: the use of an RSS and its ultrasound characteristics, and biopsy.

| Current status-risk stratification system and ultrasound characteristics
While ultrasound is the primary diagnostic and stratification tool, FNA is the gold standard. Risk stratification systems, which combine ultrasound characteristics to predict the chance of a nodule being malignant, have been developed with varying success. 14,15 The reasons for this varying success are: the high inter- and intra-observer variability of ultrasound (up to 30%), 16,17 the degree of agreement between observers using the RSSs, which is substantial but leaves room for improvement, 18 and the limited accuracy of single ultrasound characteristics. 7,[19][20][21][22] Moreover, the evaluation studies are not precise: they suffer from selection bias by excluding indeterminate results, they compare significantly different case series, and FNA is not 100% accurate itself. 14 The latest RSSs allow weighing of ultrasound characteristics combined with the nodule size in classifying thyroid nodules, and then suggest further policy. For the thyroid, the most commonly used RSSs are the Thyroid Imaging Reporting And Data System (TIRADS) 8,9,12,23 and the American Association of Clinical Endocrinologists/American College of Endocrinology/Associazione Medici Endocrinologi (AACE/ACE/AME) protocols. 13 A side-by-side comparison of these protocols shows similarities (all but one have a similar scoring system, using risk factor groups) as well as differences (additional risk factors, FNA size indication). 19,20 There is global consensus on most characteristics to include, but not on how to weigh them, mainly due to the lack of sufficient evidence to conform to any single RSS. Most studies were retrospective, lacked the inclusion of indeterminate FNA results, or were performed mainly on patients with papillary thyroid cancer. 14,20 The prospective study by Grani et al. shows that the current RSSs are able to reduce the number of unnecessary biopsies by half, and should therefore be used in the clinic. 19 However, more multi-center, randomized, unselected, histology-focused trials should be performed to obtain high-quality evidence to support the use of these RSSs in the clinic. 14

FIGURE 1 Thyroid nodule management flow chart. The current trends of the US modality and its acquisition methods, highlighted in white, will be discussed in this review. 2D, two dimensional; CNB, core needle biopsy; CT, computed tomography; FNAB, fine needle aspiration biopsy; MRI, magnetic resonance imaging; P.E., physical examination; RSS, risk stratification system; US, ultrasound

There are three items of note when comparing these systems. First, the ultrasound characteristics are all prone to observer variability. Echogenicity appears to be the parameter with the lowest interobserver agreement according to Lam et al. 24 Choi et al. show a high interobserver agreement. 25 However, both papers omitted nodules with irregular margins, which makes extrapolating these data to general practice less reliable. Furthermore, the variation in positive predictive values, the diagnostic odds ratio and the malignancy risks in the various categories shows that a more extensive RSS has to be developed in order to cope with the full breadth of thyroid nodular disease, as also advised by Ha et al. 15 Second, the American Thyroid Association (ATA)-TIRADS and AACE guidelines add extra risk factors to their primary stratification: extrathyroidal extension, stiffness, and vascularity. Extrathyroidal extension is a risk factor that all protocols take into account; however, only the ATA and AACE guidelines use it in their primary stratification. 12,13 Nodular vascularity assessment during ultrasound imaging is currently performed using color and power Doppler. 26 Various studies have evaluated the impact of adding these parameters in differentiating between benign and malignant nodules; some studies found higher sensitivity and accuracy for vascularity assessment and characteristics therein. 
[27][28][29] However, these studies are small and all suggested further research to be performed. [27][28][29] Others found little to no benefit using vascularity assessment. 30,31 The reason for these different findings is likely due to selection which nodules are included. Often, selections are made for primarily papillary thyroid cancer or solid nodules instead of complex consistency. This can aid in determining the effect of risk factors, however it makes applicability in clinical practice less relevant. 30 Another likely explanation is the variety in devices and settings used as well as the experience and skills of the clinician. 32 Due to these varying results, none of the guidelines incorporate vascularity in their primary stratification.
The other potential risk factor mentioned is the stiffness of the nodule. This can be measured by strain elastography, acoustic radiation force imaging or shear wave elastography. [33][34][35][36] Elastography images the elasticity/strain variation of tissue when applying an external compressive force. Conclusive evidence for the success of elastography as a pre-biopsy tool is still missing, as comparing the studies that have been performed is difficult due to the selection of predominantly TIRADS 3 or higher nodules. This makes the data of limited use in clinical practice. 33 In their meta-analysis, Razavi et al. found a positive effect for an elastographic score and strain ratio in determining malignancy, when compared to individual ultrasound risk factors, using normal B-mode imaging. 36 However, when assessing the positive predictive value of the three elastography approaches in combination with RSS's for determining malignancy, the benefit turned out to be limited or not present at all. [33][34][35] The negative predictive value (NPV) for the three elastography approaches is high according to two meta-analyses 96.7%-97%. 33, 34 Razavi et al. determined a negative likelihood ratio of 0.16-0.27, in which elastography using a strain ratio scored better in this respect than those with an elasticity score. 36 However, the scoring systems used different scales or measurement methods which makes comparing more difficult. 36 Ünlütürk et al. found similar results, that is, no added value for elastography using a score system in distinguishing malignant nodules, and their suggestion is to study more the quantitative approaches (i.e., shear wave elastography). 37 Sung et al. found no interobserver agreement for elastography parameters in differentiating thyroid nodules. 
38 In the absence of a tested standardized quantitative elastography method, elastography is not being incorporated at present in the primary stratification of most guidelines, except for the AACE/ACE/AME guideline.
However, elastography is mentioned as complementary in the rest of the guidelines due to its high NPV.
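The measures quoted above (positive and negative predictive value, negative likelihood ratio) all follow from a 2x2 table of the index test against the reference standard. The sketch below illustrates that arithmetic only. The class name, function name and the counts are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of an index test (e.g., elastography) against the reference standard."""
    tp: int  # test positive, nodule malignant
    fp: int  # test positive, nodule benign
    fn: int  # test negative, nodule malignant
    tn: int  # test negative, nodule benign


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard accuracy measures derived from a 2x2 table."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),  # positive predictive value
        "npv": c.tn / (c.tn + c.fn),  # negative predictive value
        # Negative likelihood ratio: how much a negative test lowers the odds of malignancy.
        "lr_negative": (1.0 - sensitivity) / specificity,
    }


# Hypothetical counts, for illustration only.
example = ConfusionCounts(tp=45, fp=60, fn=5, tn=190)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that NPV and PPV depend on the prevalence of malignancy in the studied cohort, whereas the likelihood ratios do not. This is one reason why values from cohorts selected for higher TIRADS classes are hard to transfer to general practice.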
Lastly, the cut-off value for nodule size varies between RSSs.
A recent study showed that the cut-off value for the TIRADS 4 and 5 classes should be updated to 12 mm. 39 Another study supports the claim that the current size thresholds may underestimate the number of malignant lesions. 40 For instance, the ACR-TIRADS protocol is the least likely of all the protocols to indicate an FNA, due to its higher size thresholds. The resulting lower sensitivity is compensated for by more follow-up. 41,42 Thus, RSSs can still function well while being less specific or sensitive due to different size thresholds. 15 As the TIRADS are not fully accurate in their differentiations, FNA will still need to be performed as it is the current gold standard.
Improving the differentiation power of the TIRADS may aid in reducing the number of required FNAs.

| Current status-biopsy/FNA
FNA is used for cytological assessment and, despite being the gold standard, suffers from sampling uncertainties in 10%-20% of cases and results of undetermined significance in 6%-50% of cases. 47,48 The results are categorized according to the Bethesda classification, which places the biopsy result in one of six categories with a sensitivity of 97.2%. [49][50][51][52] Most often, thyroid biopsies are adequately guided by 2D ultrasound; however, the biopsy can yield an indeterminate result (categories B1 and B3) in 22.5% of cases, requiring a second FNA, 49 of which 16% are still inconclusive. 53 Of the B1 classifications, 10%-30% are caused by insufficient biopsied material during the FNA. 13,54 In addition, a negative correlation with benign nodules was found, indicating that these are more difficult to aspirate from, resulting in a higher rate of indeterminate results. 55 Stewart et al. have shown that obtaining an accurate preoperative diagnosis is important to reduce over- and undertreatment. 56 Therefore, improving the diagnostic yield of these biopsies is necessary.

| Future trends-overview
With the advent of new three-dimensional (3D) matrix transducer technology and artificial intelligence, new approaches in computer aided intervention (CAI) and computer aided diagnosis (CAD) can be studied. Figure 2 presents an overview of these major ultrasound-related future trends for imaging thyroid nodules, which are discussed in the following sections.

| Future trends-RSS adaptations
Heterogeneity in certain parameters of the various RSSs leads to a variable diagnostic yield. 19 Research is focused on addressing this heterogeneity by identifying the most accurate current and new parameters, ultimately resulting in an international RSS, the I-TIRADS. 41 An interesting finding by Jinih et al. is that the current method of nodule size measurement does not appear to be indicative for malignancy. 46 Since the cut-off values of every RSS, whether to perform an FNA, are based on nodule size this may require a change in the way these RSSs will approach FNA indication for the various categories.
One such change could be the use of 3D ultrasound, which obtains objective volume estimations instead of the current manual digital caliper measurements (length × width). Caliper measurements have large interobserver variability of up to 46%, resulting in a 17% overestimation of thyroid volume compared to 3D-US. 44 Multiple analyses can be added to an RSS to improve its accuracy for detecting benign and malignant tumors. Bojunga states that the RSSs currently available are based solely on B-mode US scans and that the addition of strain elastography can improve the NPV of the RSS. 32 Shreyamsa et al. compared two types of RSSs: the "originals" that focus on US characteristics, and the "multimodal" that add color-Doppler, strain elastography and cervical node involvement, as well as point reduction for benign features. 57 They showed that the area under the curve (AUC) for the Thyroid Multimodal Imaging Comprehensive-RSS was 0.924 and for the ACR-TIRADS was 0.801, a significant increase in performance. 57 In the same study they found that the French TIRADS, which includes elastography in addition to the standard echography characteristics, achieved an AUC of 0.874, thus outperforming the ACR-TIRADS as well. 57 Similar results were found by Xue et al. when combining TIRADS with strain elastography. 58 These studies were all performed using manual interpretation.
Jin et al. studied, retrospectively, the use of a deep learning method combined with a TIRADS protocol and found an improved AUC (0.902 vs. 0.845) and comparable diagnosis sensitivity and specificity for thyroid cancer as compared with a group of experienced radiologists. 59 Multimodal imaging can also be used for a specific TIRADS class, such as the TIRADS 4 class which is difficult to diagnose, as is suggested by Han et al. 60 In conclusion, making use of advanced US analyses, such as 3D US and elastography, can improve the RSS's specificity and sensitivity, and subsequently may reduce the number of benign biopsies. Furthermore, the use of CAD methods in combination with TIRADS protocols seems promising and warrants further investigation.
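The AUC values used above to compare RSSs can be read as the probability that a randomly chosen malignant nodule receives a higher risk score than a randomly chosen benign one. The sketch below computes the AUC in that pairwise (Mann-Whitney) form. The function name and the example categories are hypothetical and only illustrate the calculation; they are not data from the cited studies.

```python
def auc_from_scores(malignant_scores, benign_scores):
    """AUC as the fraction of malignant/benign pairs ranked correctly; ties count as half."""
    if not malignant_scores or not benign_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical RSS categories (1-5) for histology-proven nodules, for illustration only.
malignant = [5, 4, 4, 3]
benign = [2, 3, 1, 4, 2]
auc = auc_from_scores(malignant, benign)
print(f"AUC = {auc:.3f}")
```

Because an RSS yields only a few ordinal categories, ties between malignant and benign nodules are frequent, which is why they are counted as half a correct ranking here.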

| Future trends-3D ultrasound
FIGURE 2 Future trends overview. The white boxes show the future trends in ultrasound imaging and where in the thyroid nodule management process they are to be applied. 3D, three dimensional; CAD, computer aided diagnosis; CAI, computer aided intervention; CNB, core needle biopsy; FNAB, fine needle aspiration biopsy; RSS, risk stratification system; US, ultrasound

With the availability of advanced imaging and processing hardware, the capabilities of ultrasound systems are dramatically expanding.
While mechanically swept and tomographic 3D ultrasound have been available, these approaches lack the framerate and real-time imaging capability suitable for application during intervention; the radiologist still relies on two-dimensional (2D) ultrasound to guide the intervention. With the advent of matrix transducers this problem may be overcome. For instance, matrix transducers are now available with up to 56,000 elements, allowing for real-time imaging of volumes (see it was more accurate in determining the volumes than its 2D counterpart. 43 Furthermore, they stated that the volumes could be useful in future analyses or second opinions. 43 Rago et al. found similar results: 2D ultrasound overestimated the thyroid lobe volume by 10% as compared to tomographic 3D ultrasound. 45 Andermann et al. studied the interobserver variability, finding that it was reduced when using 3D ultrasound as compared to 2D ultrasound. 44 Freesmeyer et al. used a new DICOM standard to view 3D ultrasound data and found an interobserver variability of 5.6%. 61 Kim et al. elaborated on the use of mechanically swept 3D ultrasound as compared to 2D by studying its use in an "off-site" setting, where radiologists were able to view the volume as one would with a CT or MRI volume. 62 65 Due to the decrease in transducer size, research was started into the possibility of "stitching" ultrasound volumes together in order to manage large thyroid volumes, and it was found to be feasible for a 2D tomographic and a 3D mechanically swept transducer, although further research is required. 
66,67 Single positron emission computed tomography (PET-CT)/US fusion imaging is also possible when using 3D US volumes. 68 A recent study compared a 2D, a 3D mechanically swept and a 3D matrix transducer, and found the 3D matrix transducer to outperform the others when estimating the volume of a phantom nodule. 69 Not all studies had positive findings: Yi et al. found that using 3D ultrasound to diagnose extrathyroidal extension in papillary thyroid carcinomas had no significant benefit over 2D ultrasound. 70 Almost all these studies show potential benefits of 3D ultrasound over the current 2D transducers in thyroid diagnosis. However, little research is currently performed on the application of 3D thyroid ultrasound in the clinic, possibly due to the limited availability of 3D transducers, which is most likely due to the high costs associated with piezoelectric arrays. An alternative, cheaper way of producing such arrays has been developed in the past decades: the capacitive micromachined ultrasound transducer (CMUT). 71 With CMUTs, matrix transducers are easier to produce and may become cheaper, so that widespread adoption of 3D ultrasound may finally be possible.

FIGURE 3 Three-dimensional ultrasound acquisition of a human neck. A 3-axis recording (from left to right: transversal, sagittal, and coronal) and volume rendering (bottom right) of a human neck with thyroid. The recording is acquired using the XL14-3 matrix transducer (Philips, Amsterdam, Netherlands)

| Future trends-angiography
Doppler angiography for thyroid nodules is still controversial.
However, research into the diagnostic angiographic field has expanded to also include 3D-US, contrast enhanced ultrasound (CEUS) and micro-vessel imaging. were found, with a pooled AUC of 0.9263. 75 In addition, they showed reduced heterogeneity in small nodules (<1 cm, I² = 0.0%). 75 However, for variable sizes, heterogeneity was still an issue. 75 Meta-regression analysis showed that visual features and (semi)quantitative measurements cause this heterogeneity. 75 Zhao et al. have shown for sub-centimeter nodules that CEUS can aid the differentiation based on the enhancement pattern. 76 Zhang et al. suggest further investigating these parameters and standardizing as well as validating them in the future. 75 In addition to the diagnosis, CEUS can aid in thyroid FNA. Li and Luo showed that CEUS helped identify 25% more papillary thyroid carcinomas as compared to conventional US guidance. 77 Larger vessel characteristics have often been studied; however, the microvasculature may offer additional insight into the differentiation of thyroid nodules, as it has for various tumor cases. Zhan and Ding showed that by using CEUS they can visualize the micro-vessels dynamically. 73 Wang et al. showed this correlation may hold true for thyroid nodules as well. 78 Nayak et al. have looked in their pilot study for a way to improve visualization of the microvasculature by suppressing motion artifacts caused by the carotid artery. 79 These motion artifacts reduce the image resolution and the accuracy of Doppler signal integration for the microvasculature. 79 Further, the visualization of the injected contrast bubbles is user-dependent, as many acquisitions are made with handheld ultrasound; 17,25,80 moreover, the timing of acquisition and, in direct correlation, the concentration of the microbubbles are factors that make acquisition challenging. 81 Therefore, more research must be performed. 
It is thought that "one of the key advances" in visualizing microvasculature is the field of super-resolution ultrasound imaging. 82 In this field, the use of contrast agents allows for better resolution of the vasculature, 82 the use of a convolutional neural network speeds up the reconstruction of the super-resolution images, 83 and using 3D super-resolution US to visualize a volume rather than just one scanning plane allows for organ-wide scanning. 81 Microvasculature can also be visualized, without the use of contrast-agents, with ultrafast ultrasound Doppler imaging, which resulted in improved malignancy detection. 84 Doppler is not a necessity for visualizing flow in vasculature, an example being B-Flow imaging (GE Healthcare, Chicago, USA). This acquisition method allows for perfusion imaging without using Doppler by tracking the moving speckles in the image and through subtraction identifying changes in those flowing parts which can be visualized. 85 This technique has been used successfully for imaging the perfusion of human placentas 86 as well as in thyroids showing microcalcifications, in addition to flow, which were not visible on normal B-mode. 87 Variations of non-contrast enhanced imaging are being developed, solving the need for injecting contrast agents. 79,84 Important to note, the superficial position of the thyroid is the main contributor that non-contrast enhanced imaging is possible.
In conclusion, ultrasound angiography will contribute to the multi-modal imaging of thyroid nodules thereby offering additional data, which may be added to the TIRADS protocols to improve the differentiation accuracy. Furthermore, it may strengthen the performance of deep learning algorithms and the subsequent CAD systems, which is discussed in "future trends-computer aided diagnosis and deep learning algorithms." To improve the quality of the angiography data and visualize the nodule micro-vasculature, super-resolution ultrasound imaging using microbubble contrast agents seems the most promising approach.

| Future trends-tissue characteristics
As stated before, the use of score based elastography had limited success in improving nodule differentiation. 37,38 However, there still may be use for score based and more quantitative elastography, such as shear-wave elastography or strain ratios, to confirm benignancy due to their high NPV, thereby preventing unnecessary biopsies or hemithyroidectomies. Shuzhen et al. studied the use of score based elastography compared to standard B-mode imaging and found that the elastography performed superior in specificity, accuracy and NPV. 88 However, they stated that B-mode US remains the basis and elastography therefore is an additional tool to be used by clinicians. 88 Shweel and Mansour found that the combined use of elastography scoring and high-resolution ultrasound was outperforming the use of only one. 89 Elsayed and Elkhatib found similar results for the combination of elastography scoring and B-mode imaging in the detection of malignant thyroid nodules, as did Trimobli et al. 90,91 Yang et al. studied both score based and quantitative elastography, both analyses proved to be beneficial in the differential diagnosis of thyroid nodules. 92 More specifically per imaging technique, Pandey et al. have studied the effectiveness of acoustic radiation force impulse in differentiating thyroid nodules, and found that it has a benefit as an additional tool (AUC = 0.922). 93 The above-mentioned studies investigated elastography before FNA. Additionally elastography can also be used in combination with FNA to improve the final diagnostic accuracy, as studied by Zhu et al. 94 This might be problematic for the indeterminate nodules, however Qiu et al. found in their systematic review that shear-wave elastography and strain ratio elastography are more efficient for differentiation of indeterminate thyroid nodules. 95 In addition, they concluded that the combination of elastography and other ultrasound techniques improves evaluation of indeterminate thyroid nodules. 
95 Moraes et al. found similar results. 96 Nell et al. showed that for asteria class 1, which is a scale of 1-4 to color code the elastography image, a fraction of up to 15% can be classified as benign without the further need of biopsies or thyroid lobectomy. 33 A potential reduction in biopsies was also found by Tan et al., in addition to a significant benefit for differentiation of malignant nodules of up to 10 mm. 97 A similar potential reduction in biopsies was found by Zhao et al. as well. 98 They used a 3D shear-wave elastography method to further reduce the operator dependency and acquire a full organ view, see also  96 Hyperextension of the neck may be a solution to the neck muscles and breathing drawbacks. 96 Most of the aforementioned studies did not combine their results with that of an RSS, though this may result in more accurate differentiation. Pang et al. used a logistic regression model to aid in thyroid nodule differentiation, one of the significant factors contributing to the differentiation was the elastography score. 99 Bojunga found in his review that elastography was predominantly useful as a negative predictor for malignancy (a high NPV 93%-99%). 32 Baig et al. also describes elastography as an adjunct to standard-and Doppler imaging and emphasizes the need for larger studies to study its effects. 100 Du et al. found that elastography combined with a RSS was superior to using only the RSS for differentiating small thyroid nodules. 101 Aghaghazvini et al. found that using both qualitative and quantitative shear-wave elastography is promising in performing pre-operative malignancy risk stratification. 102  Despite these positive results Tumino et al. wrote in their minireview that further studies into elastography and its use are required, due to the evidence for the benefit of elastography not being at a satisfactory level for acceptance, 20 as was also pointed out by the revised Korean guidelines on thyroid nodule management. 
10 Looking at the data presented in literature so far, a lack of multi-center prospective studies, that include the entire range of Bethesda classifications, can be observed.
Thus, looking at all these results, future research identifying benign nodules should focus on using shear-wave elastography, due to its lower user-variability and high NPV. For differentiation of malignant nodules, a combination of ultrasound analyses (elastography, Doppler, normal greyscale, etc.) should be considered to improve the sensitivity of the differentiation, preferably in combination with a TIRADS.

| Future trends-computer aided diagnosis (CAD) and deep learning algorithms
Most of the studies mentioned above were performed with the help of clinicians. The experience of these clinicians varies from 1 year to more than 10 years. This results in varying accuracy of the interpretations of US images, as is demonstrated in the RSS studies. A tool to aid clinicians in more precise and more accurate decision making is a CAD system. CAD systems use machine and deep learning approaches, of which the latter is the dominant contributor to the thyroid CAD systems. 104 To emphasize, CAD is a collective term for these methods when they are used for diagnostic purposes. Therefore, to compare methods, a distinction must be made as to what the specific goal of each method is and what exact method it uses to achieve that goal. Sharifi et al. have conducted a systematic review on the use of deep learning for ultrasound images in the diagnosis of thyroid nodules. 104 They describe the following four application methods: classification, detection, segmentation and feature selection, in order of most used in the reviewed literature. 104 These application methods can use various networks such as "VGGNet," "GoogleNET" and "ResNet," amongst others. Training and testing of these methods is performed with a variety of different, often small, datasets, most of which are not open-access, which makes comparing methods difficult. 104 In addition, the annotation of these datasets makes or breaks the algorithm's performance; it is mostly done manually, which makes it costly in time and money and prone to manual inaccuracies. 104 Nevertheless, comparisons can be made, albeit with reservations and a call for more open-access and larger datasets. Despite these challenges, the current networks mostly offer a comparable or improved sensitivity, specificity and accuracy as compared to senior radiologists. 
59,60,[105][106][107][108] One study showed the results of a ReS-NET with a large dataset, 13 984 nodules, and found a comparable risk stratification when compared to senior clinicians. 109 Machine-and deep learning methods can be used to determine a few other parameters, for example, volume estimation. This is a standard measurement in thyroid diagnosis, often performed manually using digital calipers. These measurements are prone to intra-and interobserver variability. Deep learning algorithms can be useful to perform accurate and precise measurements. A review by Chen et al. showed that machine-and deep learning-based methods offered comparable results to the older segmentation methods. 110 However, when more annotated datasets become available these algorithms will outperform classical segmentation, which makes them future proof for classifying neck structures. 110 Even more so, when additional imaging data is added to the data set, for example, multi-modality data, or the richer raw radiofrequency (RF) data extracted directly from the machine. Liu et al. studied the use of the ultrasound RF-signals in combination with its corresponding ultrasound image in a convolutional neural network, and found improved accuracy, sensitivity and area under the curve. 111 Additional US-analyses, such as shear-wave elastography, can be used as well. One study showed the use of a convolutional neural network with information fusion, 112 another of a two-branched ResNet-50, 113 both using B-mode and shear-wave elastography US images. Moreover all these methods use 2D-US data, whereas utilizing 3D-US data may offer larger data stacks for the algorithms to work with and give a more complete overview of the nodule. 104 In general, the "black box" effect of these algorithms hinders implementation of CAD-systems in the clinic. Nauta et al. showed a way to identify why the algorithm has used certain features to come to its conclusion on the outcome of comatose patients. 
114 When these algorithms are less of a black box it can help the clinicians to gain trust in the validity of the algorithms, and subsequently let the diagnostic process genuinely be computer aided.
Thus, in line with many of the authors mentioned above, we conclude that these methods are very promising in improving the accuracy of thyroid nodule diagnosis. Especially the combination of various US images and the available RSSs seems a promising area of research. We also re-emphasize the need for larger, annotated, datasets that make use of multi-modal images, advanced acquisition methods and 3D-US to offer a complete picture of the nodule.

| Future trends-computer aided intervention (CAI): needle tracking/navigation tools
The use of a needle tracking/navigation tool, which is a form of CAI, can improve the biopsy by hitting the target area with greater precision, which may reduce the number of inconclusive biopsies. No specific literature was found for thyroid nodule biopsies; however, we think these technologies could be implemented using the experiences from other fields.
Such a needle tracking technology has been developed for RF-ablation and shown to be applicable for ablation interventions. 115 However, this system is based on electromagnetic markers, which may result in a deviation between the actual needle tip and its virtual location due to bending of the needle when encountering stiff tissue or when too much force is exerted with the transducer. 115

| Future trends
We believe volume estimation during watchful waiting should be performed with 3D-US. As described before, 3D-US is superior to the current 2D approach. [43][44][45][61][62][63]69 Thus, research should be performed into updating the cut-off values currently used in the RSSs. This may result in more accurate diagnosis at follow-up, due to improved accuracy of the measurements.
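The difference between a caliper-based and a 3D-based volume estimate can be made concrete with a small sketch. It assumes that the caliper estimate uses the conventional ellipsoid formula (π/6 × length × width × depth) and that the 3D estimate is obtained by counting the voxels of a segmented nodule. Both are illustrative assumptions, since the text does not specify how the cited studies computed volumes, and all function names and numbers are hypothetical.

```python
import math


def ellipsoid_volume_ml(length_mm, width_mm, depth_mm):
    """Caliper-style estimate: ellipsoid volume from three orthogonal diameters (mm -> mL)."""
    return math.pi / 6.0 * length_mm * width_mm * depth_mm / 1000.0


def voxel_volume_ml(mask, spacing_mm):
    """3D estimate: count segmented voxels and multiply by the voxel volume (mm^3 -> mL).

    mask is a nested list indexed [slice][row][column] holding 0 or 1;
    spacing_mm is the voxel size (dz, dy, dx) in millimetres.
    """
    dz, dy, dx = spacing_mm
    n_voxels = sum(value for plane in mask for row in plane for value in row)
    return n_voxels * dz * dy * dx / 1000.0


def relative_change_percent(baseline_ml, followup_ml):
    """Volume change at follow-up relative to baseline, in percent."""
    return (followup_ml - baseline_ml) / baseline_ml * 100.0


# Hypothetical caliper measurements of one nodule, for illustration only.
caliper_ml = ellipsoid_volume_ml(20.0, 15.0, 10.0)

# Hypothetical 2 x 2 x 2 voxel segmentation with 0.5 mm isotropic voxels.
tiny_mask = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
segmented_ml = voxel_volume_ml(tiny_mask, (0.5, 0.5, 0.5))

print(f"ellipsoid estimate: {caliper_ml:.3f} mL")
print(f"voxel-count estimate of the toy mask: {segmented_ml:.4f} mL")
```

The ellipsoid formula assumes a regular shape, whereas voxel counting follows the actual segmented contour. For irregular nodules the two estimates can therefore diverge, which matters when a fixed size cut-off is applied at follow-up.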
It is imperative to evaluate the ablation efficacy. Sim et al. have shown with their "initial ablation ratio," the ratio of the ablated nodule volume to the total nodule volume (vital and ablated nodule tissue), that a higher ratio indicates a better treatment outcome, that is, improved volume reduction. 119 Furthermore, Sim and Baek have shown that remaining vital nodular tissue may cause regrowth, and that in follow-up the growth of the vital tissue area should be identified rather than the size of the entire nodule. 120 Thus, a challenge remains in how to improve the ablation ratio. Nodule morphology may be a factor in tackling this issue, as Gambelunghe et al. found that an agglomerate nodule morphology negatively impacts the volume reduction ratio for laser ablation of thyroid nodules. 121 Image fusion during the intervention can aid in needle localization. This has been proven effective in ablating liver tumors. 122,123 Translating such a method to the thyroid is possible, as demonstrated by Turtulici et al. (see also Figure 5). 115 As the number of minimally invasive procedures grows, the need for these virtual needle guidance systems will grow too. To be able to fully ablate complex cases, such as those near critical structures, we believe a guidance system is essential. This is also suggested by the Korean guideline and a subsequent review. 116,124 However, a more recent review on the use of RF-ablation and laser ablation for benign thyroid nodules did not offer any suggestion as to the use of guidance systems to improve treatment outcome. 125 This may be due to tracking issues encountered by these systems. As mentioned before, the system used by Turtulici et al. is based on electromagnetic tracking of markers on the needle, which can suffer from needle bending. 115,116 As the current system is not suitable for application in the thyroid, we think that further development of these CAI systems should focus on using the US images (preferably 3D-US images) for tracking of the RF-ablation needle. This should enable clinicians to always visualize the needle during ablation.
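The two efficacy measures mentioned above can be written out as a short calculation. The sketch below follows the description of the initial ablation ratio given in the text (ablated volume over the sum of ablated and vital volume) and assumes the commonly used definition of the volume reduction ratio (relative decrease from the pre-treatment volume). Expressing both as percentages, the function names and the example volumes are assumptions made for illustration.

```python
def initial_ablation_ratio(ablated_ml, vital_ml):
    """Ablated volume as a percentage of the total (ablated + vital) nodule volume."""
    total_ml = ablated_ml + vital_ml
    if total_ml <= 0:
        raise ValueError("total nodule volume must be positive")
    return 100.0 * ablated_ml / total_ml


def volume_reduction_ratio(initial_ml, followup_ml):
    """Percentage decrease in nodule volume relative to the pre-treatment volume."""
    if initial_ml <= 0:
        raise ValueError("initial nodule volume must be positive")
    return 100.0 * (initial_ml - followup_ml) / initial_ml


if __name__ == "__main__":
    # Hypothetical volumes in mL, not taken from any cited study.
    print(initial_ablation_ratio(ablated_ml=8.0, vital_ml=2.0))
    print(volume_reduction_ratio(initial_ml=10.0, followup_ml=4.0))
```

Both ratios depend directly on how the ablated and vital volumes are delineated, so any over- or underestimation of those volumes carries over into the reported treatment success.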
To assess the progress of the ablation and where to position the needle next, the increase in hyper-echogenicity area and scattering due to gas formation is used. 126  shown that the elastographic characteristics change for ablated nodules. 131 As mentioned before, 3D-US may offer a solution for the operator dependency. When 3D-US is applied in combination with the aforementioned acquisition methods, a full organ view can be acquired, and evaluation can become more complete.  110 To aid in the development of such CAI systems, realistic phantoms are useful. These function as controllable ground truths with which the performance of the CAI system can be assessed. 69 The phantom should mimic the human neck closely; however, some simplifications can be made without compromising the use of the phantom. In addition, such sophisticated phantoms can also serve as ablation training objects for clinicians who want to start with ablation.

| Current status
Watchful waiting has been discussed under treatment; here only the follow-up after treatment is considered. Current practice does not extend further than regular B-mode imaging with its digital caliper measurements and Doppler imaging. Therefore, there is room for improvement, and the future trends hold promise to improve the follow-up after ablation.

| Future trends
The future trends for follow-up overlap with those for diagnosis; however, the goal is different. For follow-up, addressing the regrowth issue is the most important. If the nodules that recur can be identified earlier, this could result in a lower frequency of follow-up for many patients. Sim and Baek have shown that remaining vital nodular tissue may cause regrowth, and this process can be identified earlier. 120 Thus, new precursors for regrowth should be studied.
Elastography (shear-wave) and vascularization (via Doppler, B-Flow, CEUS, for example) may be able to distinguish between ablated and non-ablated tissues and thus function as a precursor if changes in the ratio of vital to ablated tissue occur. A recent study by Yan et al. found CEUS to be more accurate in determining the ablated volume than standard B-mode US imaging, and that B-mode imaging overestimated the ablated volume, which directly impacts the volume reduction ratio that indicates treatment success. 132 Jiao et al. found that performing a CEUS at 12 months is sufficient to identify 95% of all regrowth cases. 133 Additionally, CEUS aids observers in determining the volume of the ablated area without clinically relevant disagreement. 134 While precursors can be found in the functional tissue characteristics, the shape of the nodule can also be an indicator. 135 A new feature introduced recently is the surface-area-to-volume ratio, which aims to include changes to the shape of the nodule in addition to solely volumetric changes. However, it performs with a similar AUC at predicting non-recurrent nodules. 135 To fully make use of this ratio, 3D-US may be a useful tool. For hemithyroidectomies, Frank et al. found use for 3D-US in differentiating neck lymph nodes after surgery, improving specificity from roughly 0.74 to 0.9 and increasing the confidence of the readers in their diagnosis. 136 In conclusion, we believe that the use of advanced analysis such as CEUS, shear-wave elastography and 3D-US may aid the cli- such as (3D) super-resolution ultrasound imaging, (3D) elastography and 3D nodule characteristics seem increasingly promising in improving the success rate of RSSs for differentiating thyroid nodules during diagnosis and follow-up. Combining those results into a CAD system for the clinician to be used during diagnosis may improve diagnostic yield even further.
Since ablation techniques (such as RF-ablation) are relatively new for thyroid nodules, applications of CAI systems have not been studied thoroughly in this field. Nevertheless, the use of 3D-US perioperatively opens a new field of research utilizing CAI systems in thyroid ablative treatments as well as during biopsies. Ultrasound is a versatile modality and will remain the gold standard for thyroid nodules. In the future, new CAD and CAI applications will improve the clinical workflow for clinicians and improve the clinical outcome for the patient.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.