AI‐Enhanced Diagnosis of Challenging Lesions in Breast MRI: A Methodology and Application Primer

Computer‐aided diagnosis (CAD) systems have become an important tool in the assessment of breast tumors with magnetic resonance imaging (MRI). CAD systems can be used for the detection and diagnosis of breast tumors as a “second opinion” review complementing the radiologist's review. CAD systems have many common parts, such as image preprocessing, tumor feature extraction, and data classification that are mostly based on machine‐learning (ML) techniques. In this review article, we describe applications of ML‐based CAD systems in MRI covering the detection of diagnostically challenging lesions of the breast such as nonmass enhancing (NME) lesions, and furthermore discuss how multiparametric MRI and radiomics can be applied to the study of NME, including prediction of response to neoadjuvant chemotherapy (NAC). Since ML has been widely used in the medical imaging community, we provide an overview about the state‐of‐the‐art and novel techniques applied as classifiers to CAD systems. The differences in the CAD systems in MRI of the breast for several standard and novel applications for NME are explained in detail to provide important examples, illustrating: 1) CAD for detection and diagnosis, 2) CAD in multiparametric imaging, 3) CAD in NAC, and 4) breast cancer radiomics. We aim to provide a comparison between these CAD applications and to illustrate a global view on intelligent CAD systems based on machine and deep learning in MRI of the breast. Level of Evidence 2 Technical Efficacy Stage 2

BREAST CANCER is the most common cancer among women but has an encouraging cure rate if diagnosed at an early stage. Thus, early detection of breast cancer continues to be key for effective treatment. Magnetic resonance imaging (MRI) is an established essential tool in breast imaging for high-risk screening, assessment, diagnosis, staging, and follow-up of breast cancer. 1,2 It has a proven value in important areas such as evaluating local extent of disease, multicentricity, response to neoadjuvant chemotherapy, and in the assessment of the integrity of implants. 1,3 Currently, dynamic contrast-enhanced MRI (DCE-MRI) is the most sensitive imaging technique for breast cancer diagnosis with a high specificity, is independent of breast density, and detects noninvasive breast cancer. The limitations in specificity can be overcome by employing additional functional MRI techniques such as diffusion-weighted imaging (DWI) and proton MR spectroscopy. These techniques have demonstrated an improved diagnostic accuracy as well as response assessment 4; their combined application is called multiparametric MRI and can be utilized for the detection and characterization of breast tumors. 5,6 However, the acquisition of multilayered multidimensional data poses new challenges to radiologists, and thus new tools for reliable, reproducible, and quantitative assessments are warranted for improved diagnosis, tumor characterization, and treatment monitoring.
Inspired by computer-aided diagnosis (CAD) systems to support diagnostic and screening activities in conventional X-ray mammography, research initiatives nowadays focus on similar techniques to aid or even automate the diagnosis in MRI of the breast. These efforts started as early as 2002 with the application of the multilayer perceptron (MLP) as a classifier of tumor-extracted features describing dynamic, morphological, or combined characteristics [7][8][9] for breast MR segmentation and lesion detection, 10 while achieving results comparable to those of an expert radiologist. Other machine-learning (ML) techniques used in the early days were the fuzzy c-means clustering-based technique for automatically identifying characteristic kinetics from breast lesions 11 and the mean shift clustering for determining accurate regions of interest (ROIs) in breast MRI lesions. 12 All these techniques for CAD systems of mass lesions outperformed an experienced radiologist and demonstrated that ML techniques can support the radiologist in the diagnosis of breast lesions.
However, the delineation, detection, and diagnosis of nonmass enhancing (NME) lesions is clinically very challenging since the standard morphological and kinetic features that are relevant for masses fail to achieve equally good results for NME lesions. 13 Thus, CAD systems that rely strongly on morphological and kinetic parameters have proven insufficient to obtain a satisfactory performance for NME lesions. Designing robust and reliable CAD systems for these NME lesions represents a challenge for the medical imaging specialist. Few studies in the literature report CAD systems based on ML techniques for this type of lesion.
To provide useful insights for ML techniques in connection with important CAD systems of NME lesions in MRI of the breast, this review article consists of seven sections. Section Application of ML and Dynamic Contrast Enhancement (DCE) Kinetics of Breast Tumors describes the basic kinetics aspects in breast MRI, while Section Application of ML and Morphology of Breast Tumors explains the equally important morphological criteria used in detection and diagnosis. Section Machine Learning Techniques provides an overview about the most important standard and novel ML techniques and their processing steps. The next sections present applications of ML to relevant topics in MRI of the breast and include all necessary preprocessing steps to achieve a diagnostic solution. Section CAD in MRI of the breast presents intelligent diagnostic solutions based on tumor-extracted features and enhancement curves. Section Future Trends: CAD Systems for Novel Applications covers future trends including novel applications of ML in MRI of the breast.

Application of ML and Dynamic Contrast Enhancement (DCE) Kinetics of Breast Tumors
As mentioned in the previous section, morphologic, kinetic, or combined features represent important lesion characteristics for a computer-assisted interpretation. For example, time-signal series, as measured during a DCE-MRI examination for each image voxel, represents an important component in designing CAD systems for MRI of the breast. Early studies have demonstrated that the shape of the time-signal intensity curve provides an important biomarker for discriminating between benign and malignant enhancing lesions in DCE-MRI and it is a key step of reporting MRI of the breast. 14 It has been shown that the enhancement kinetics, as represented by the time-signal intensity curves, visualized in Fig. 1, differ significantly for benign and malignant enhancing tumors and thus are representative of differential diagnosis: plateau or washout-time courses (type II or III) are mostly found in cancerous tissue. Steadily progressive signal intensity time courses (type I) are typical of benign enhancing lesions. Typical features representative of kinetics are maximum enhancement, time to peak, uptake rate, washout rate, enhancement at first postcontrast timepoint and signal enhancement ratio.
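The kinetic features listed above can be illustrated with a minimal Python sketch for a single time-signal intensity curve. The function name `kinetic_features`, the exact feature definitions, and the 10% cutoff used to separate curve types I, II, and III are assumptions made for illustration; they are not taken from the reviewed studies.

```python
def kinetic_features(times, signal):
    """Simple kinetic descriptors of one time-signal intensity curve.

    times  : acquisition times in seconds, times[0] being the precontrast frame
    signal : signal intensities at those times (at least three frames)
    """
    s0 = signal[0]
    # Relative enhancement (percent) of every postcontrast frame over baseline.
    enh = [100.0 * (s - s0) / s0 for s in signal[1:]]
    post_times = times[1:]

    max_enh = max(enh)
    peak = enh.index(max_enh)
    time_to_peak = post_times[peak] - times[0]
    uptake_rate = max_enh / time_to_peak

    # Washout rate: enhancement lost per second between the peak and the last frame.
    washout_time = post_times[-1] - post_times[peak]
    washout_rate = (max_enh - enh[-1]) / washout_time if washout_time > 0 else 0.0

    # Signal enhancement ratio: early enhancement divided by late enhancement.
    ser = enh[0] / enh[-1] if enh[-1] != 0 else float("inf")

    # Curve type from the relative signal change between the first and the last
    # postcontrast frame; the 10% cutoff is an assumed convention.
    change = (signal[-1] - signal[1]) / signal[1]
    if change > 0.10:
        curve_type = "I"      # steadily progressive
    elif change < -0.10:
        curve_type = "III"    # washout
    else:
        curve_type = "II"     # plateau

    return {
        "max_enhancement": max_enh,
        "time_to_peak": time_to_peak,
        "uptake_rate": uptake_rate,
        "washout_rate": washout_rate,
        "first_postcontrast_enhancement": enh[0],
        "ser": ser,
        "curve_type": curve_type,
    }
```

In a CAD pipeline such descriptors would typically be computed per voxel or per ROI and passed to a classifier together with morphological features.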
Recently, new k-space acquisition strategies have been introduced for dynamic breast MRI such as time-resolved angiography with stochastic trajectories (TWIST) and differential subsampling with Cartesian ordering (DISCO). 15,16 These ultrafast sequences can be used to capture the inflow of contrast in breast lesions, heavily undersampling the outer part of the k-space in order to increase the spatial resolution for an improved diagnostic quality. Thus, they can be employed in clinical settings to acquire breast DCE-MRI data with both high spatial resolution for accurate tumor morphology assessment and high temporal resolution for accurate representation of the contrast agent kinetics. 2,17 The potential of these new breast MRI techniques for screening and automated characterization of breast lesions has not yet been explored. A single study 2 has shown that the ultrafast protocol yielded a high diagnostic accuracy compared with the standard protocol when the maximum slope of the relative enhancement vs. time curve (MS) was used as the kinetic information instead of the Breast Imaging-Reporting and Data System (BI-RADS) curve types. A more precise evaluation can be achieved based on advanced computer tools that could additionally incorporate the morphologic information and assist the radiologist in image interpretation and patient workup.
FIGURE 1: Schematic drawing of the time-signal intensity (SI) curve types. KMK+ 99 Type I corresponds to a straight (Ia) or curved (Ib) line; enhancement continues over the entire dynamic study. Type II is a plateau curve with a sharp bend after the initial upstroke. Type III is a washout time course. In breast cancer, plateau or washout-time courses (type II or III) prevail. Steadily progressive signal intensity time courses (type I) are exhibited by benign enhancing lesions.
There is clinical evidence that novel enhancement curve parameters combined with morphological features improve the diagnostic accuracy for the ultrafast protocol. 18 Although signal characteristics represent an important biomarker for a radiologist to distinguish between different tissue states, their assessment is quite a time-consuming task. This becomes challenging when the heterogeneity of lesion tissue is considered, which causes a spatial variation of signal characteristics. In addition, this variation reflects specific tissue properties that should be considered when assessing the state of lesions. Kinetic parameters extracted either from qualitative BI-RADS or quantitative empirical mathematical model measures of kinetics have proven to be insufficient when it comes to the differential diagnosis of NME lesions. 19

Application of ML and Morphology of Breast Tumors
Morphological parameters describing either the shape or structure of the ROI are obtained from manual or semiautomatic detection. They are either qualitatively or quantitatively extracted from lesions and represent valuable diagnostic biomarkers. 20 The most important morphological features are area, compactness, perimeter, smoothness, radial length, roughness, sphericity, volume, spiculation, curvature, and edge. While qualitative morphological features have a high interobserver variability, 21 quantitative ones provide a more standardized and objective diagnosis. Other nonkinetic features besides the morphological ones are histogram, spiculation, textural, geometric, and binary object features. Since NME lesions exhibit ambiguous characteristics when limited to dynamic or morphological parameters alone, a fusion of different dynamic and morphologic characteristics has proved beneficial in terms of diagnostic sensitivities and specificities. 22,23
Based on morphology and type of enhancement, lesions are assigned, according to the BI-RADS lexicon, a risk assessment and quality assurance tool, to the categories mass enhancement, nonmass enhancement, and focus. 13,24,25 Masses are 3D tumors that have either a round, oval, lobular, or irregular shape; nonmasses have poorly defined boundaries and considerable overlap in kinetic characteristics between malignant and benign lesions; and foci represent small spots of enhancement that cannot be characterized as a mass. The diagnosis of mass enhancement lesions is straightforward and employs typical characteristic parameters such as spiculation (morphology), rim enhancement (texture), and washout kinetics. However, the diagnosis of foci and nonmass-like enhancing lesions poses a challenge to both clinical reading and CAD systems. Therefore, standard parameters cannot be applied, and novel image and signal processing techniques need to be developed and integrated into the CAD system.
While for mass-enhancing lesions several BI-RADS descriptors are used for the differential diagnosis, the existing BI-RADS descriptors for NME lesions have proven to be insufficient for the automated differential diagnosis. Nonmass-enhancing lesions represent a diagnostic challenge in MRI, as exemplified in Fig. 2.
The correct detection of these lesions, in combination with many clinical applications ranging from diagnosis to therapeutic solutions, demands sophisticated image processing paradigms in connection with feature extraction. The response to these challenging processing tasks has guided the development of novel ML techniques, which will be described further below.

Machine-Learning Techniques
Several ML techniques are incorporated as a classifier in CAD pipelines for breast cancer detection, prediction of neoadjuvant chemotherapy outcome, and diagnosis. A brief description of the most important techniques is given in this section. We start with "classical" ML approaches such as support vector machine and random forest and conclude with a brief discussion of deep learning.

Artificial Neural Networks Classifiers
Artificial neural network (ANN) classifiers are an attempt to emulate the processing capabilities of biological neural systems. The architecture of the MLP is completely defined by an input layer, one or more hidden layers, and an output layer. Each layer consists of at least one neuron. The input vector is processed by the MLP in a forward direction, passing through each single layer. A neuron in a hidden layer is connected to every neuron in the layers above and below it. MLPs have been applied successfully to sophisticated classification problems. The training of the network is accomplished based on a supervised learning technique that requires given input-output data pairs. The training technique, known as the error backpropagation algorithm, is bidirectional, consisting of a forward and a backward direction. During the forward direction a training vector is presented to the network and propagated layer by layer to produce an output. During the backward direction the error between this output and the desired output is propagated back through the network and used to update the connection weights.
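The forward and backward directions of error backpropagation can be sketched for an MLP with a single hidden layer. This is a minimal illustration, assuming sigmoid activations, a squared-error loss, one output neuron, and plain gradient descent; the class name `TinyMLP` and the learning rate are hypothetical choices, not the configuration of any reviewed CAD system.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, one sigmoid output neuron (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Forward direction: input -> hidden activations -> output.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Backward direction: error terms for the loss 0.5 * (y - target)^2,
        # computed with the weights as they were during the forward pass.
        delta_out = (y - target) * y * (1.0 - y)
        delta_hidden = [delta_out * w * hi * (1.0 - hi)
                        for w, hi in zip(self.w2, h)]
        # Gradient-descent weight updates.
        self.w2 = [w - lr * delta_out * hi for w, hi in zip(self.w2, h)]
        self.b2 -= lr * delta_out
        for j, dj in enumerate(delta_hidden):
            self.w1[j] = [w - lr * dj * xi for w, xi in zip(self.w1[j], x)]
            self.b1[j] -= lr * dj
        # Loss measured before the update.
        return 0.5 * (y - target) ** 2
```

Repeated calls to `train_step` on input-output pairs move the network output toward the desired targets; in practice the input vector would hold the tumor-extracted features.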

Random Forests
The random forest represents a powerful statistical learning technique and is an ensemble method 26 composed of many smaller models. Classification and prediction are achieved by combining the outputs of these smaller models, which are usually classification and regression trees (CART). CART operates by repeatedly partitioning the input data in order to estimate the conditional distribution of a response (output class) for a given set of feature variables. The algorithm implements a binary decision tree in which every single feature of the input is considered a candidate for the split. Binary decision trees are nonlinear multistage classifiers. This classification system operates by sequentially rejecting classes until the correct class is found. In other words, the correct class corresponding to a feature vector is determined by searching a tree-based decision system. The feature space is divided into regions corresponding to the different classes. The goal of CART is that the subgroups emerging after each split are homogeneous. The trees are combined into a forest by bagging: the variance of the predictions is decreased by fitting several models and then averaging their predictions to obtain a regularized prediction. To avoid overfitting, each model is fitted only to a sample of the same size as the original input data but selected with replacement. This sample is known as the bootstrap sample.
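The bootstrap-and-combine idea can be sketched in a few lines. As a simplification, the sketch below bags depth-one trees (decision stumps) and combines them by majority vote; a full random forest would grow deeper CART trees and usually also subsample the candidate features at each split. All function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Best single-feature threshold split by misclassification count."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    if best is None:
        # No valid split (all rows identical): fall back to the majority class.
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, thr, l_lab, r_lab = stump
    return l_lab if row[f] <= thr else r_lab


def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: same size as the data, drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    # Combine the individual models by majority vote.
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]
```

Because every stump sees a different bootstrap sample, individual split thresholds vary, and the vote over all stumps is more stable than any single one.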

Support Vector Machines
The support vector machine (SVM) represents a feedforward single layer classifier that can be employed either for linear or nonlinear separable datasets. [27][28][29] For many classification problems in biomedical imaging it became the first-choice classifier.
The basic idea of the SVM algorithm is to design a hyperplane characterized by its direction vector $w \in \mathbb{R}^n$ and its exact position in space, or bias, $w_0$. The hyperplane separates the labeled input or training data into two classes by leaving the maximum margin from both classes. A given set of $N$ labeled training examples $\{(x_i, y_i)\}$, $i = 1, \dots, N$, with $x_i \in \mathbb{R}^n$ assigned to two different classes $y_i \in \{-1, 1\}$, is separated by a maximum-margin hyperplane such that the distance between the hyperplane and the closest examples (the margin $\gamma$) is maximized. This hyperplane is fully specified by a subset of those training examples that lie closest to the decision surface and pose a challenge for a correct classification. The training samples that lie closest to the hyperplane represent the support vectors. To employ the SVM for nonlinearly classifiable data, we need to employ the so-called "kernel trick." This symmetric and nonlinear kernel function evaluates the inner product between two examples after their transformation by a nonlinear function while maintaining the original architecture of the linear SVM.
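For reference, the maximum-margin idea and the kernel trick described above are commonly written as follows. This is the standard hard-margin textbook formulation, given here as a sketch; the multipliers $\alpha_i$, the feature map $\varphi$, and the support vector index set $SV$ are notation introduced for this illustration and do not appear in the text above.

```latex
\[
\min_{w,\,w_0}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{T}x_i + w_0) \ge 1,\quad i = 1,\dots,N,
\qquad \gamma = \frac{1}{\lVert w\rVert}
\]
\[
K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j),
\qquad
f(x) = \operatorname{sign}\Big(\sum_{i \in SV} \alpha_i\, y_i\, K(x_i, x) + w_0\Big)
\]
```

Minimizing $\lVert w\rVert$ under these constraints maximizes the margin $\gamma$, and only the support vectors contribute to the decision function $f(x)$.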

KNN Classifier
K-nearest neighbors (KNN) is a supervised classifier. This algorithm stores all available patterns and classifies new patterns based on a similarity measure (eg, distance functions). This procedure can be very easily elucidated based on a two-class classification task: an unknown pattern x should be assigned to one of the two classes C1 or C2. The decision is made by determining its Euclidean distance d from all the training vectors belonging to the various classes. We define two hyperspheres with radii r1 and r2, respectively, centered at x. Let V1 and V2 be the two hypersphere volumes corresponding to the two classes C1 and C2.
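The distance-based procedure can be sketched in its common majority-vote form, which is a simplification of the volume-based formulation: the unknown pattern receives the label held by most of its k closest training vectors. The function name `knn_classify` and the default k = 3 are assumptions for illustration.

```python
import math
from collections import Counter


def knn_classify(train_X, train_y, x, k=3):
    """Assign x to the majority class among its k nearest training vectors."""
    # Euclidean distance from x to every stored training pattern.
    dists = sorted((math.dist(p, x), label) for p, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

No training step is needed beyond storing the patterns, which is why the cost of KNN is paid at classification time.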
The k-nearest neighbor classification rule for the two classes C1 and C2 is then formulated in terms of these two volumes.

Bayesian Classifier
Bayes decision theory represents a fundamental statistical approach in pattern classification assuming mutually exclusive and exhaustive classifications with known prior probabilities. Put simply, the probability that a pattern belongs to a given class is determined. A simple example is the two-class case with C1, C2. The a priori probabilities $P(C_1)$ and $P(C_2)$ are assumed to be known, since they can be easily determined from the available training dataset. Also given are the class-conditional probability density functions (pdfs) $p(x \mid C_i)$, $i = 1, 2$. These pdfs $p(x \mid C_i)$ are known as likelihood functions of $C_i$ with respect to $x$.
The Bayes classification rule can be easily stated for the two-class case C1, C2 as: if $P(C_1 \mid x) > P(C_2 \mid x)$, x is assigned to C1; if $P(C_1 \mid x) < P(C_2 \mid x)$, x is assigned to C2. Based on the above classification rule, a feature vector is assigned either to one class or to the other. This is equivalent to determining the maximum of the conditional pdfs evaluated at x.
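The link between the posterior probabilities in this rule and the likelihood functions follows from Bayes' theorem. The block below is a short sketch using only the quantities already defined, namely the priors $P(C_i)$ and the likelihoods $p(x \mid C_i)$:

```latex
\[
P(C_i \mid x) = \frac{p(x \mid C_i)\,P(C_i)}{p(x)},
\qquad
p(x) = \sum_{i=1}^{2} p(x \mid C_i)\,P(C_i)
\]
\[
x \to C_1 \ \text{if}\ \ p(x \mid C_1)\,P(C_1) > p(x \mid C_2)\,P(C_2),
\qquad
x \to C_2 \ \text{otherwise}
\]
```

Since $p(x)$ is the same for both classes, it cancels in the comparison; when the priors are equal, the rule reduces to choosing the class with the larger likelihood at $x$.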

Bayes Classification Based on LDA and QDA
As stated before, the Bayes classification 30 is based on determining the prior probabilities πi for each class Ci. This value describes the prior estimate of how probable a class is.
This classification method assigns each new sample to the class with the highest posterior probability.
Thus, the classification rule is expressed through a class-specific value Cj that is computed from the class mean μj and the corresponding covariance matrix Σj. The assignment of an input pattern to a certain class j is made based on the smallest computed value of Cj.
The covariance matrices can be either different for each class or identical. In the first case, we have a quadratic discriminant analysis (QDA) classifier, while in the latter case we have a linear discriminant analysis (LDA) classifier.
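A plausible form of the score $C_j$ can be written down under the usual assumption of Gaussian class-conditional densities. The expression below is the standard textbook discriminant obtained from $-2\ln\big(\pi_j\,p(x \mid C_j)\big)$ after dropping class-independent constants; it is offered as a sketch and may differ in constants or scaling from the formula intended in the original article.

```latex
\[
C_j(x) = (x - \mu_j)^{T}\,\Sigma_j^{-1}\,(x - \mu_j) + \ln\lvert\Sigma_j\rvert - 2\ln\pi_j
\qquad \text{(QDA, class-specific } \Sigma_j\text{)}
\]
\[
C_j(x) = -2\,\mu_j^{T}\Sigma^{-1}x + \mu_j^{T}\Sigma^{-1}\mu_j - 2\ln\pi_j
\qquad \text{(LDA, shared } \Sigma\text{)}
\]
```

With a shared covariance matrix the quadratic term $x^{T}\Sigma^{-1}x$ and $\ln\lvert\Sigma\rvert$ are identical for all classes and drop out, which is why the LDA decision boundaries are linear in $x$.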

Fisher's Linear Discriminant Analysis
Fisher's linear discriminant analysis (FLDA) is both a projection and a classification method. Similar to the SVM, we are looking for a linear function $f(x) = w^{T}x + b$ that is used to discriminate multiclass data labels. The method employs adequate dimension reduction from the initial data space to discriminate between q classes. The optimization problem becomes finding w defined by q − 1 basis vectors.
This technique identifies the first discriminating component by finding the vector w that maximizes the discrimination index, given as $J(w) = \frac{w^{T}Bw}{w^{T}Ww}$, with B denoting the interclass sum-of-squares matrix and W the intraclass sum-of-squares matrix.

Decision Trees
Decision trees represent a nonlinear multistage classifier in which classes are rejected over a sequence of decisions until a finally accepted class is reached. This means that the feature space is split sequentially in specific regions that correspond to the classes. Each feature vector traverses an existing tree based on a sequence of decisions and follows a path of nodes until it reaches the region where it belongs. In other words, the correct class corresponding to a feature vector is determined by searching a tree-based decision system. This classification scheme is extremely beneficial when a large number of classes is given.
The most popular decision trees are binary decision trees. Binary decision trees separate the search space into hyperrectangles with sides parallel to the axes. The tree is searched in a sequential manner, and at each node a decision of the form xi ≤ α is made for an individual feature, with xi being the feature and α a threshold value.
This processing scheme is an essential part of many tree-based vector quantization algorithms.
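The sequential search through a binary decision tree can be sketched as follows. The tree, its two features (index 0 standing for a washout rate, index 1 for a margin irregularity score), and the threshold values are purely hypothetical and have no clinical validity; they only illustrate decisions of the form xi ≤ α.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    feature: Optional[int] = None      # index i of the tested feature x_i
    threshold: Optional[float] = None  # threshold alpha in the test x_i <= alpha
    left: Optional["Node"] = None      # followed when x_i <= alpha
    right: Optional["Node"] = None     # followed when x_i > alpha
    label: Optional[str] = None        # set only on leaf nodes


def classify(node, x):
    """Follow the path of decisions from the root until a leaf is reached."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical two-level tree.
tree = Node(
    feature=0, threshold=0.1,
    left=Node(label="benign"),
    right=Node(
        feature=1, threshold=0.5,
        left=Node(label="benign"),
        right=Node(label="malignant"),
    ),
)
```

Each internal node tests a single feature, so the resulting decision regions are the axis-parallel hyperrectangles mentioned above.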

Fuzzy Classifiers
In crisp classification the membership of each sample from the dataset to a given class is either zero or one. In fuzzy clustering, a sample can belong to more than one class and its membership takes any value between zero and one. A fuzzy classifier is described by the classical fuzzy IF-THEN rules. 31,32 A popular algorithm is the fuzzy c-means algorithm, being an unsupervised classification algorithm applied in medical imaging mostly to pixel segmentation. In the beginning, a set of classes has to be determined and the centroid of each class is computed. Each pixel is then classified by its membership values of the classes according to its attributes. Membership value for a certain class indicates the probability of the pixel belonging to that class. The objective of the fuzzy c-means algorithm is to compute membership values to minimize the within-cluster distances and maximize the between-cluster distances. The cluster centers are updated iteratively.
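The iterative update of memberships and cluster centers can be sketched for one-dimensional data such as voxel intensities. The sketch assumes the standard fuzzy c-means update equations with fuzzifier m = 2 and a fixed number of iterations; the function names and the small constant `eps`, which avoids division by zero, are illustrative choices.

```python
def memberships_for(values, centers, m=2.0, eps=1e-9):
    """Membership of every value in every cluster; each row sums to one."""
    rows = []
    for v in values:
        d = [abs(v - c) + eps for c in centers]
        rows.append([
            1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d)
            for di in d
        ])
    return rows


def fuzzy_c_means(values, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(n_iter):
        u = memberships_for(values, centers, m)
        # Each center becomes the membership-weighted mean of all values.
        centers = [
            sum(row[i] ** m * v for row, v in zip(u, values))
            / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    # Return the memberships that belong to the final centers.
    return centers, memberships_for(values, centers, m)
```

For pixel segmentation, `values` would hold the pixel attributes and each pixel would be assigned to the cluster in which it has the largest membership.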

Deep Learning
Classical ML techniques cannot be applied to images directly, and hence it is required to define suitable features (mathematical descriptors) to encode discriminative properties of the lesions of interest. The emergence of deep-learning (DL) architectures allows working directly with images, and not with extracted or "engineered" features from these images, by learning the feature representation along with the classifiers. 33 Inspired by the processing in the visual cortex of the brain, an architecture achieving several layers of abstraction based on a hierarchy of transformations appears as a well-suited answer to the above problem. The most common architecture of this type is the convolutional neural network (CNN). Like the human brain, the first layer of this hierarchical network detects edges, further layers detect primitive shapes and subsequently more complex visual shapes, until a semantic concept is built. The number of layers determines the depth of the network, and those networks with up to two hidden layers are considered shallow, while those with more than three are deep. Each layer can be viewed as creating a feature vector, while the DL network can be viewed as a modality for learning a hierarchy of features. Thus, the higher layers implement a higher abstraction of the representation or mapping as reflected by the respective feature vector of that layer. With this novel "coding scheme," the network generates the features by itself without the need for human intervention. A typical deep CNN is shown in Fig. 3.
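The edge-detecting behavior of a first convolutional layer can be illustrated with a single hand-fixed filter. The sketch assumes a 3 x 3 Sobel kernel and a ReLU nonlinearity; in a trained CNN the kernel weights are learned from data rather than fixed, and, as in most CNN implementations, the operation is technically a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Sobel kernel responding to dark-to-bright transitions from left to right.
SOBEL_X = [[-1.0, 0.0, 1.0],
           [-2.0, 0.0, 2.0],
           [-1.0, 0.0, 1.0]]
```

Applied to an image whose left half is dark and whose right half is bright, the resulting feature map is nonzero only around the vertical boundary; stacking such layers yields the hierarchy of features described above.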
Due to the complex nature of DCE-MRI, with both spatial (volumetric) and temporal variations, feature extraction either through conventional or DL-based techniques is crucial to achieve good performance. Hundreds of features have been proposed in the literature to encode both morphological (spatial) and kinetic (temporal) properties of the tumor and its enhancement. CNNs were initially proposed to deal with 2D, low-resolution, RGB images, and therefore need to be adapted in order to effectively process multiparametric inputs and encode both volumetric (spatial) and temporal changes. 34 At the same time, CNNs are complex networks with millions of parameters that require large datasets for effective training. The combination of these two factors makes DL particularly challenging for breast MRI, and especially for lesions such as NMEs, for which collecting large datasets is especially challenging. Indeed, most of the reviewed techniques are still firmly based on traditional feature extraction.

Evaluation Criteria
ML techniques have to be evaluated regarding their performance in the testing phase based on new, previously unseen data samples.
The ability of a classifier to discriminate malignant from benign cases is assessed by receiver operating characteristic (ROC) analysis. The most important evaluation metrics are accuracy, sensitivity, specificity, and area under the curve (AUC). Accuracy is the ratio of the correctly classified samples to the total number of samples. The sensitivity (Sn) is the probability that a test result will be positive when the disease is present (true positive rate, expressed as a percentage). The specificity (Sp) is the probability that a test result will be negative when the disease is not present (true negative rate, expressed as a percentage). The AUC represents the degree of separability between the two classes and is a common figure of merit to compare the performance of different classifiers.
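These metrics can be computed directly from the true labels and the classifier outputs. The sketch below assumes binary labels (1 = malignant, 0 = benign) and returns fractions rather than percentages; the AUC is obtained through its rank interpretation, that is, the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case, with ties counted as one half. The function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, TN, FP) for binary labels, 1 = malignant, 0 = benign."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def sensitivity(y_true, y_pred):
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    _, _, tn, fp = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def auc(y_true, scores):
    """Fraction of (malignant, benign) pairs ranked correctly by the scores."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Accuracy, sensitivity, and specificity depend on the decision threshold applied to the scores, whereas the AUC summarizes performance over all thresholds.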

CAD in MRI of the Breast
CAD in the Detection of Diagnostically Challenging Lesions Based on Tumor-Extracted Quantitative Features
NME lesions show a heterogeneous appearance in MRI, with high variations in kinetics and morphological characteristics 13,24,35 and have a lower reported specificity and sensitivity than mass-enhancing lesions. Most research initiatives in the past have been centered on automated analysis of mass lesions, since they were more straightforward, [9][10][11][36][37][38][39][40][41][42][43] while very few studies have investigated the characterization of the morphology and/or enhancement kinetic features of NME lesions. 19,[44][45][46] These studies showed a much lower sensitivity and specificity for NME lesions compared with masses, suggesting the need for more advanced algorithms for the diagnosis of nonmass-like enhancement. The diagnosis of NME lesions is more challenging, as both benign conditions and tumors such as fibrocystic or proliferative changes, and malignant lesions such as ductal carcinoma in situ (DCIS) and invasive lobular cancer (ILC) often present as such. 46 A systematic classification of NME lesions would be highly beneficial and cost-effective for clinical management, and would contribute towards a reduction of the number of biopsies and follow-up exams.
A search through the most important databases was performed to identify various studies related to the employed ML techniques. The primary aim was to categorize the studies according to the following research questions: 1) What are the ML techniques? 2) What are the evaluation criteria used for their assessment? and 3) What are the datasets used? Several databases were searched including Springer Link, Web of Science, IEEE Xplore, and PubMed. The following search keywords were used: "breast cancer," "MR imaging," "nonmass lesion," and "machine learning."
It is important to provide an improved differential diagnosis for these diagnostically challenging lesions based on a CAD system that ideally incorporates the spatiotemporal properties of these lesions and provides the radiologist with a fast and accurate computational diagnosis support. Available features describe the breast signal in the 4D space and may capture the temporal dynamics, the morphological characteristics, and also the spatial variations within the tumor. In this subsection, techniques that are rooted in quantitative feature extraction are reviewed. Table 1 shows a summary of the articles describing CAD systems for NME lesions based solely on tumor-extracted morphological features, whereas Table 2 shows a summary of articles describing CAD systems for NME lesions based on both dynamics- and tumor-extracted enhancement curves or spatiotemporal features. The highest predictive value in NME lesions is achieved by combining morphological and kinetic parameters. 13,35 A variety of ML techniques have been used for NME analysis, as shown in Tables 1 and 2, with random forest and SVM being the most common choices.
Based on the reviewed literature, it appears that ML techniques are a promising solution towards NME detection and characterization. There are, however, several challenges to be tackled. First, new techniques are needed for the simultaneous movement correction and segmentation considering spatial and temporal profiles: automatic motion correction represents an important prerequisite for a correct automated lesion evaluation. 57,58 Therefore, spatial registration has to be performed before enhancement curve analysis. At the same time, accurate segmentation of the lesion is critical, since the spatiotemporal features have to be extracted from the tumor region. Current segmentation algorithms include only spatial properties and are suitable for mass-enhancing lesions 11,59-61 and will require modifications for NME lesions. Novel elastic combined image registration and segmentation methods based on a variational model and level set approach are needed. These should incorporate spatial as well as temporal contrast-enhanced images.
Another important point is the development of novel feature extraction for spatiotemporal modeling algorithms that can capture the subtle local variations in NME lesions. BI-RADS-based features proved to be insufficient to differentiate between malignant and benign for NME lesions, and therefore additional descriptors are needed to reduce the high proportion of false-positive diagnosis and unnecessary biopsies. 46 Automated extracted features that have been applied to lesion characterization capture either variations in their temporal enhancement or in spatial (morphological) structures, or are computed as global features that are unable to capture and describe local variations in the morphological and temporal characteristics of NME lesions. This latter shortcoming can be addressed by implementing novel mathematical spatiotemporal feature descriptors that are able to capture the properties of segmental, focal, dendritic, and clustered ring enhancement.
At the moment, the best-known spatiotemporal feature descriptors are: Zernike velocity moments, 62 the scaling index method, 63 and voxel-based adaptive spatiotemporal modeling. 64, 65 Ngo et al showed that spatiotemporal features such as Zernike velocity moments achieved the highest sensitivity (87.5%) compared with morphologic (62.5%) or kinetic features (70.8%) alone. 54 The scaling index method is a technique that can capture both morphology and kinetics. Originating from the theory of complex systems, the scaling index extracts the local structure around a given point in an arbitrary dataset. This technique requires converting the image into a point distribution, where each voxel corresponds to a point and its state is given by its coordinates and its gray scale intensity value. In the context of MRI of the breast, each point (or voxel) is thus described by its sagittal, coronal, and transverse positions along with the observed intensity value.
In clinical practice, dynamic medical images (ie, images acquired over time) are often assessed qualitatively. However, there is a need to quantify the results from these images in order to provide an objective and effective method for diagnosis and for the evaluation of treatment efficacy. Voxel-based adaptive spatiotemporal modeling can accomplish this. Typically, images suffer from a low signal-to-noise ratio, which makes quantitative voxelwise evaluation hard. One way to overcome this problem is to aggregate imaging data in an ROI. When using an ROI, however, one loses the spatial information of the image. 65,66 To this end, a Bayesian approach can be used to gain robust estimates of the voxelwise dynamics. 65,67 This approach uses the spatial information inherent in the image to strengthen the local modeling in each voxel. These approaches are usually based on Markov Random Fields (MRF). 68

CAD in the Detection of Diagnostically Challenging Lesions Based on Tumor-Extracted Enhancement Curves
Kinetic parameters extracted either from qualitative BI-RADS or quantitative empirical mathematical model measures of kinetics have proven not to be useful when it comes to the differential diagnosis of NME lesions. 19 Table 3 shows a summary of articles describing CAD systems for NME lesions based on tumor-extracted enhancement curves. The same search criteria were applied as for Table 1.
A simultaneous registration and segmentation can be achieved with independent component analysis (ICA). This technique facilitates the challenging segmentation of NME lesions and requires neither the predefined ROI that is mandatory for manual analysis nor an accurate threshold as in semiautomated analysis. It incorporates spatial as well as temporal properties and, compared with conventional level set methods or clustering-based techniques, provides accurate motion correction, segmentation, and spatial smoothness even for noisy images. This method detects the interior contours automatically, is robust with respect to noise, and includes and evaluates tumor-specific enhancement curves. ICA has proven to be an excellent method to separate motion artifacts in biomedical image processing 71-76 and recover underlying signals. The task of classifying pixels with similar time-courses corresponds to finding clusters based on ICA techniques such as topographical ICA 77 or tree-dependent component analysis. 78 Figure 4 shows two cluster assignment maps and the associated time curves for one benign and two malignant lesions. We can see that each lesion has a unique enhancement pattern. Thus, it can be hypothesized that the spatiotemporal behavior of these lesions is determined not by a single ROI-derived kinetic curve but by two specific interacting signal intensity time curves.

Future Trends: CAD Systems for Novel Applications
CAD for Multiparametric MRI
To overcome limitations in specificity, additional functional MRI techniques such as DWI and proton MR spectroscopy have been explored and have demonstrated improved diagnostic accuracy as well as response assessment. 79 Their combined application is defined as multiparametric MRI for detection and characterization of breast tumors. 13,24,35,80 Few CAD systems 81-83 for breast masses have been proposed for multiparametric MRI. It has been shown 83 that there is the potential for development of multiparametric CAD that incorporates information from both DWI and DCE-MRI in breast lesion classification. Multiparametric imaging via MRI / positron emission tomography (PET) and the combination of extracted parameters was shown to improve diagnostic accuracy for breast and prostate lesions. 81,82 All studies have shown that the amount and complexity of the acquired multiparametric data require the development of advanced analysis tools.
The remaining bottleneck for providing an improved differential diagnosis, and thus for advancing CAD systems beyond the current level, is determining the descriptors that incorporate the diagnostic information from multiparametric MR images for NME lesions. Important steps include:
1. Development of a novel image normalization framework for these multiparametric images. The normalization step is crucial for the subsequent feature extraction and classification, since the images stem from heterogeneous sources. Usually, a standard preprocessing step is followed by a joint segmentation and registration algorithm. A better solution is a novel joint segmentation and registration algorithm based on a variational model and level set approach that incorporates spatial as well as temporal contrast-enhanced images. The multiparametric images are registered such that all segmented images will be in the same reference frame.
The multiparametric MR images arise from heterogeneous sources and need to be regularized before relevant features for the CAD system can be extracted. This step includes a preprocessing stage and a joint tumor segmentation and registration stage such that all images are in the same reference frame. The preprocessing step includes noise filtering performed based on wavelet shrinkage, 84 bias correction, 85 and SI normalization/standardization based on z-score computation to remove the variability between patients and to enforce the repeatability of the MRI examinations.
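The signal intensity standardization step can be sketched as follows. This is a minimal example under stated assumptions: the function name and the optional `mask` argument (for instance a breast-tissue mask) are hypothetical, and the wavelet-shrinkage denoising and bias correction mentioned above are not included.

```python
import numpy as np

def zscore_normalize(volume, mask=None):
    """Standardize signal intensities to zero mean and unit variance.

    The statistics are computed inside `mask` when one is given (a
    boolean array of the same shape), otherwise over the whole volume,
    and are then applied to every voxel.
    """
    vol = np.asarray(volume, dtype=float)
    ref = vol[mask] if mask is not None else vol
    mu, sigma = ref.mean(), ref.std()
    if sigma == 0:
        raise ValueError("constant intensities cannot be z-scored")
    return (vol - mu) / sigma
```

Because the z-score is invariant to a global offset and positive gain, two examinations that differ only by scanner scaling map to the same normalized volume, which is the sense in which this step reduces inter-patient and inter-examination variability.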
Due to the elasticity and heterogeneity of breast tissue, only nonrigid image registration methods are suitable. At the same time, accurate segmentation of the lesion is critical since the spatiotemporal features have to be extracted from the tumor region. Different solutions have been proposed to solve this problem: these range from purely image-based statistical and geometrical models for regularization 86 to more accurate physics-based models for mechanical deformation 87 and nonrigid diffeomorphic registration algorithms for volumetric 3D images. 88,89 The segmentation algorithm is applied to 3D images and uses the information from all available images when determining obscured boundaries, as in the case of NME lesions. Compared with conventional level set methods or clustering-based techniques, this new algorithm can detect the interior contours automatically and provide accurate motion correction, segmentation, and spatial smoothness even for noisy images.
2. Identifying novel descriptors such as structure tensors and texture from T2-MRI and advanced DWI methods such as intravoxel incoherent motion (IVIM) maps, restriction spectrum imaging, or multidimensional DWI. The apparent diffusion coefficient (ADC) is the most prevalent method for quantifying diffusion in clinical practice and is based on fitting a monoexponential model, usually to two images: one acquired without diffusion-weighting and one with relatively high diffusion-weighting. However, lesion heterogeneity is insufficiently described by a single ADC threshold and thus more detailed structural and functional image features have to be extracted from T2-MRI and DWI. Novel descriptors should include additional information from multiparametric MRI and capture the structure of the breast tissue in a unique manner.
Experimentally, the monoexponential fit provided by the ADC was found to be applicable only to simple cysts, whereas malignant and benign lesions required a more complex biexponential model fitted from six or more images with varying diffusion-weighting parameters. 90 Techniques such as IVIM provide separate quantitative parameters for tissue diffusivity, perfusion fraction, and pseudodiffusion and have been shown to be helpful for the differentiation between benign and malignant breast lesions. 90 This provides motivation for further research regarding the suitability of the IVIM features in DWI for nonmass-enhancing lesions.
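The difference between the two models can be made concrete with a short sketch. It assumes the usual monoexponential form S(b) = S0 exp(-b ADC) for the two-point ADC and the commonly used biexponential IVIM form S(b) = S0 [f exp(-b D*) + (1 - f) exp(-b D)], with perfusion fraction f, pseudodiffusion coefficient D*, and tissue diffusivity D. The function names and the parameter values in the usage note are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def adc_two_point(s0, sb, b):
    """Two-point monoexponential ADC from S(b) = S0 * exp(-b * ADC).

    s0: signal without diffusion-weighting; sb: signal at b-value b.
    With b in s/mm^2 the result is in mm^2/s.
    """
    return np.log(np.asarray(s0, dtype=float) / sb) / b

def ivim_signal(b, s0, f, d_star, d):
    """Biexponential IVIM signal model.

    f: perfusion fraction, d_star: pseudodiffusion coefficient,
    d: tissue diffusivity (same units as the ADC).
    """
    b = np.asarray(b, dtype=float)
    return s0 * (f * np.exp(-b * d_star) + (1.0 - f) * np.exp(-b * d))
```

For example, with the assumed values f = 0.1, D* = 0.01 mm^2/s and D = 0.001 mm^2/s, applying `adc_two_point` to `ivim_signal` evaluated at b = 800 s/mm^2 returns a value above D, because the perfusion component is folded into the single exponent. Separating the three parameters requires fitting the biexponential model to several b-values, which is why more images are needed.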
A few studies have exploited first-order texture statistics reflecting the lesion ADC heterogeneity, 91 an approach that has already demonstrated increased potential in MRI for prostate cancer. Nineteen different texture features describing the image were extracted from the gray-level co-occurrence matrix (GLCM): contrast, correlation, cluster prominence, cluster shade, dissimilarity, energy, entropy, homogeneity, maximum probability, sum of squares, sum average, sum variance, sum entropy, difference variance, difference entropy, information measure of correlation, inverse difference, inverse difference normalized, and inverse difference moment normalized. These features have the potential to characterize homogeneity, gray-level transitions, and the presence of organized structures.
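To make the GLCM features concrete, the following sketch builds a symmetric, normalized co-occurrence matrix for a single pixel offset and computes four of the listed features. It assumes an image that has already been quantized to integer gray levels 0..levels-1; the function names and the default offset are illustrative, and the exact feature definitions (for example whether energy is the sum of squared probabilities or its square root) vary between implementations.

```python
import numpy as np

def glcm(img, levels, dr=0, dc=1):
    """Symmetric, normalized gray-level co-occurrence matrix.

    img: 2D integer array with values in 0..levels-1.
    (dr, dc): row/column offset between the two pixels of each pair.
    """
    img = np.asarray(img)
    counts = np.zeros((levels, levels), dtype=float)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[img[r, c], img[r2, c2]] += 1
    counts = counts + counts.T          # count each pair in both directions
    return counts / counts.sum()

def glcm_features(p):
    """Contrast, energy, homogeneity and entropy of a normalized GLCM."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(((i - j) ** 2 * p).sum()),
        "energy": float((p ** 2).sum()),           # angular second moment
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
        "entropy": float(-(nonzero * np.log(nonzero)).sum()),
    }
```

A perfectly uniform patch gives zero contrast and entropy with energy and homogeneity equal to 1, whereas a patch whose horizontal neighbors always differ by one gray level gives a contrast of 1; heterogeneous lesions fall between such extremes.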
3. Identifying novel spatiotemporal descriptors from DCE-MRI images as the most powerful discriminators. Some lesions exhibit a high variance in morphological and kinetic characteristics and the consequence is a high proportion of false-positive diagnoses. 46 Automatically extracted features that have been applied to lesion characterization capture variations either in temporal enhancement or in spatial (morphological) structure, or are global features that are unable to describe local information. To address this latter shortcoming, novel mathematical spatiotemporal feature descriptors are needed, such as local velocity moments, the scaling index, and dynamic texture derived from geometrical multiscale decomposition, that are able to capture the distribution of enhancement (segmental, focal, linear, regional, and diffuse), the internal enhancement patterns (homogeneous, heterogeneous, clumped, clustered ring enhancement, dendritic), and lesion heterogeneity.
Dynamic texture features can be extracted based on the 2D+T curvelet transform. 92 It yields a spatiotemporal decomposition that extends the 2D curvelet transform to the temporal domain.
This novel technique is relevant for extracting nonlocal phenomena propagating temporally and operates based on a geometrical multiscale decomposition. As in the 2D case, a separable 3D convolution can be factored into 1D convolutions along rows, columns, and image indexes of the MRI scans. As a result of this transform, a spatiotemporal segmentation algorithm is produced. The coefficients of the 2D+T curvelet transform contain discriminative information that can be employed for recognizing different dynamic textures. As in the case of 2D texture, the wavelet decomposition is employed to build feature vectors from detail subbands. The feature vector is composed of the average, standard deviation, energy, and entropy of the detail subbands. By adding a discrete cosine transformation to the 2D+T curvelet transform, a morphological transformation can be implemented that also considers the local phenomena. Thus, the geometry of the dynamic texture can be additionally captured. This novel descriptor could sufficiently represent the dynamical properties of the temporal texture characterizing the heterogeneous behavior of diagnostically challenging lesions. A possible CAD system for multiparametric breast MR images is shown in Fig. 5. Key components should be spatiotemporal descriptors and tensor fields for the evaluation of diagnostically challenging lesions from multiparametric 3T images, thus increasing specificity without compromising the sensitivity of DCE-MRI.
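The construction of the feature vector from detail subbands can be sketched with a much simpler transform. The example below uses a one-level separable 3D Haar wavelet decomposition as a stand-in, which is not the 2D+T curvelet transform; it only illustrates how a separable filter is applied as 1D operations along rows, columns, and time, and how the average, standard deviation, energy, and entropy are collected per detail subband. All names are hypothetical, and energy and entropy follow one of several possible definitions.

```python
import numpy as np

def haar_split(a, axis):
    """One-level orthonormal Haar split of `a` along one axis."""
    a = np.moveaxis(a, axis, 0)
    even, odd = a[0::2], a[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def subband_features(series):
    """Feature vector from the 7 detail subbands of a (rows, cols, time) series.

    Returns mean, standard deviation, energy (mean squared coefficient)
    and entropy (of the normalized squared coefficients) per subband,
    with subbands ordered alphabetically: HHH, HHL, ..., LLH.
    """
    series = np.asarray(series, dtype=float)
    if any(n % 2 for n in series.shape):
        raise ValueError("all three dimensions must be even")
    bands = {"": series}
    for axis in range(3):               # rows, then columns, then time
        split = {}
        for name, band in bands.items():
            low, high = haar_split(band, axis)
            split[name + "L"] = low
            split[name + "H"] = high
        bands = split
    feats = []
    for name in sorted(bands):
        if name == "LLL":               # approximation band, not a detail band
            continue
        c = bands[name]
        sq = c.ravel() ** 2
        total = sq.sum()
        if total > 0:
            prob = sq[sq > 0] / total
            entropy = float(-(prob * np.log(prob)).sum())
        else:
            entropy = 0.0
        feats += [float(c.mean()), float(c.std()), float(sq.mean()), entropy]
    return np.array(feats)
```

In this sketch a series that changes only over time places all of its detail energy in the subband that is low-pass in both spatial directions and high-pass in time, which is the kind of separation between spatial and temporal variation that a dynamic texture descriptor exploits.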

CAD in Neoadjuvant Chemotherapy
Neoadjuvant chemotherapy (NAC) is the standard of care and is widely used in patients with locally advanced breast cancer, offering several advantages, such as reduction of tumor size, enabling breast-conservation surgery instead of mastectomy, and response-guided NAC approaches. In patients undergoing NAC for breast cancer, the achievement of a pathological complete response (pCR) is associated with significantly improved disease-free and overall survival. However, a pCR is achieved in only 30% of patients after the completion of NAC, and clinical studies have shown that the therapeutic outcome can be improved by treatment modifications during NAC. Predicting the pathological response after NAC in breast cancer patients is crucial, and quantitative computerized methods represent an important step towards an accurate and effective breast cancer treatment. The first study assessing the role of an automatic CAD system in DCE-MRI for predicting the pathological response to NAC has been described. 93 Tumor response is monitored in the latest clinical studies with PET/MRI. These techniques vary considerably in their performance for NAC response monitoring of different breast cancer types, and the combined use of PET and MRI has been shown to have a complementary value 94,95; however, there is still room for improvement. The success of therapeutics in breast cancer could be improved by developing novel distinctive and consistent imaging parameters extracted from a combined use of PET and MRI that are tailored for enhancing the pCR after NAC and validating them in a CAD scheme. Such a possible CAD scheme is shown in Fig. 6.
Several ML techniques have been applied for NAC, mostly for breast masses. A CAD scheme based on a Bayesian classifier 93 used DCE-MRI data, extracted texture features from an automatically segmented 3D mask of the tumor, and predicted the pathological response to NAC. A similar method based on radiomics was employed. 96 A Gaussian SVM processing quantitative kinetic and texture-based image features from MR images for NAC has been proposed. 48 An SVM was also applied for NAC. 97,98 An ANN processing a new clinical marker based on quantitative kinetic image feature analysis was presented, and its feasibility for NAC was assessed. 49 Fuzzy c-means clustering was employed for NAC in connection with level set segmentation. 60 DL methods have been applied to automatically score HER2, a biomarker that determines which patients are eligible for anti-HER2 targeted therapies. 99 That study shows that DL is able to identify cases that are most likely misdiagnosed in traditional clinical decision-making. An important application of DL to NAC, analyzing different contrast timepoints, has been shown 100: the authors applied CNNs to extract features from DCE-MRI and determined that the image acquired before contrast injection was the most effective at predicting response to therapy, with performance moderately increasing when images acquired after contrast injection were also included.

Breast Cancer Radiomics
In the past 3 years, a novel computational approach, radiomics, has emerged to represent oncological tissues based on quantitative descriptors. 47 Currently, in computational radiology there are two concurrent research lines: radiomics and artificial intelligence (AI). Radiomics is the ML-based approach of extracting handcrafted features descriptive of a tumor, while AI employs DL techniques and works directly with the medical images.
Radiomics represents a novel approach to achieve a detailed quantification of the tumor phenotypes by analyzing a large number of image descriptors. It has been hypothesized that a large number of radiomic features tremendously increases the diagnostic, prognostic, and predictive power. With the increasing importance of "personalized medicine," new treatment strategies are being sought to respond to the specific characteristics of each patient and cancer phenotype. So far, personalized medicine has centered on molecular characteristics, with genomics and proteomics data analysis. In Hoffman et al, 101 a quantitative radiomics approach based on shape, texture, and kinetic tumor features was applied and evaluated in comparison with a reduced-order feature approach in a CAD system applied to diagnostically challenging lesions.
The potential of radiomics as a training-independent diagnostic decision tool has been shown. 102 The radiomics classifiers performed well in the differentiation of malignant and benign lesions; however, their performance was lower than that of an experienced radiologist. Prasanna et al introduced a new radiomics descriptor, the Co-occurrence of Local Anisotropic Gradient Orientations (CoLlAGe). 103 It is able to distinguish benign and pathologic phenotypes when they appear similar to each other on anatomic imaging. This new descriptor can capture their local entropy patterns and thus reflect hidden local differences in the tissue microarchitecture.
A comparison between DL and radiomics was performed, 104,105 as was a fusion between DL and CNN-extracted features. 106 Including multiple automatically extracted radiomic features in a lesion signature significantly improved the ability to distinguish between benign lesions and luminal A breast cancers, compared with using maximum linear size alone. 107 The diagnostic accuracy of ROI-based, radiomics, and DL methods was evaluated by taking peritumor tissue into consideration. 108 A few studies employ the radiomics approach in connection with multiparametric breast images for NAC prediction. 90,109 The only large study including NME lesions was presented in Ref. 110. The specifics of this study are described in Table 4.

Standardization and Repeatability in Breast DCE-MRI
Advanced breast imaging techniques such as DCE-MRI and DWI are complex and highly adjustable procedures. The difference in hardware and software implemented by different vendors can produce noticeable differences in image quality and appearance. In addition, acquisition protocols vary across and within studies, vendors, and acquisition centers, and may include different spatiotemporal resolutions, contrast agents, or imaging parameters (TR, TE, fat suppression, etc.).
Postprocessing, including delineation and segmentation of the tumoral area, may further complicate this picture. ML models rely on quantitative features, either hand-engineered or learned by CNNs, which may be heavily affected by such changes. Since collecting data for all possible acquisition protocols is unfeasible, these aspects need to be carefully considered in the design, training, and validation of ML models. The problem of how to design robust ML models that can generalize to multiple settings is still, in many ways, an open research question.
There are two main approaches to building ML models that are robust to acquisition parameters: image standardization/harmonization and more robust feature extraction/selection. Besides working alongside vendors to standardize image acquisition, a laudable but notoriously difficult quest, image or feature harmonization may be more feasible. Feature harmonization was demonstrated to significantly improve benign vs. malignant lesion classification in a DCE-MRI dataset acquired from multiple international institutions. 111 The harmonization was applied separately within feature categories, that is, morphology, texture, and kinetics, by aligning the distribution of features from multiple centers, after adjusting for covariates. Still, a large dataset including more than 1000 cancer cases per institution was available, which may be unfeasible to collect for lesion subtypes such as NME.
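The idea of aligning feature distributions across centers can be illustrated with a minimal location-scale sketch. It simply shifts and rescales each feature within each site to the pooled mean and standard deviation; it does not adjust for covariates and is not the method of the cited study, which is considerably more elaborate. The function and argument names are hypothetical.

```python
import numpy as np

def harmonize_location_scale(features, site):
    """Align per-site feature distributions to the pooled mean and spread.

    features: (n_samples, n_features) array of radiomic features.
    site: length n_samples array of site labels.
    Each feature is standardized within its site and then mapped back
    onto the pooled mean and standard deviation.
    """
    x = np.asarray(features, dtype=float)
    site = np.asarray(site)
    pooled_mu, pooled_sd = x.mean(axis=0), x.std(axis=0)
    out = np.empty_like(x)
    for s in np.unique(site):
        rows = site == s
        mu = x[rows].mean(axis=0)
        sd = x[rows].std(axis=0)
        sd = np.where(sd > 0, sd, 1.0)   # leave constant features unscaled
        out[rows] = (x[rows] - mu) / sd * pooled_sd + pooled_mu
    return out
```

A caveat of this simple scheme is that it also removes genuine between-site differences, for example when case mix differs between centers, which is one reason why covariate adjustment and sufficiently large per-site samples matter.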
Another line of research analyzes repeatability and reproducibility of individual features in order to select those features that guarantee a higher reproducibility. 112 In a systematic literature review published in 2018, this aspect was extensively investigated for imaging modalities such as CT and PET, whereas only one study was available for MRI. 112 Indeed, MRI involves larger variability in imaging parameters and requires extending the analysis to temporal as well as spatial features. A recent study analyzed the effect of acquisition parameters (specifically, scanner model, magnetic field strength, and slice thickness) on features related to lesion and fibroglandular tissue morphology, texture, and enhancement. 113 The authors found that these acquisition parameters have a significant effect on the extracted radiomics/radiogenomic features; however, features extracted from fibroglandular tissue are more susceptible to imaging parameters than those extracted from the tumor area, which is encouraging, as the latter are of higher clinical interest. However, more studies are needed to cover a wider range of imaging parameters and features. Another important issue to be settled is whether CNN-based features are more robust than hand-engineered features to such variations.
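Feature selection by repeatability can be sketched with a test-retest agreement measure. The example below uses Lin's concordance correlation coefficient (CCC), which is one common choice and an assumption here, since the studies above are not tied to a specific metric in this text. The function names and the 0.9 threshold are illustrative only.

```python
import numpy as np

def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two measurements.

    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2),
    using population (1/n) moments. Assumes the measurements are not
    all identical constants, otherwise the denominator is zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = ((x - x.mean()) * (y - y.mean())).mean()
    return 2.0 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

def select_repeatable(test, retest, threshold=0.9):
    """Indices of features whose test-retest CCC reaches the threshold.

    test, retest: (n_cases, n_features) arrays from repeated examinations.
    """
    ccc = np.array([concordance_cc(test[:, k], retest[:, k])
                    for k in range(test.shape[1])])
    return np.flatnonzero(ccc >= threshold), ccc
```

Unlike the Pearson correlation, the CCC is reduced by a systematic offset or gain between the two examinations, so a feature that is merely correlated across scanners but shifted in value is penalized.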

Discussion
This systematic review aimed to give an overview of the currently available methodology and applications of ML-based CAD systems for diagnostically challenging lesions in MRI of the breast. ML techniques have been successfully applied in medical image processing. Over the past decades, we have witnessed the transition of ML techniques from feature extraction from medical images to working directly with the raw images, as enabled by newer models such as CNNs.
To date, applications of state-of-the-art CAD systems are based on established feature engineering and enhancement curve extraction from DCE-MRI; these techniques have proven to be valuable tools for detection and diagnosis in clinical practice. Radiologists can benefit from such ML-based CAD systems, resulting in reduced interobserver variability and improved interpretation of breast imaging for the presence or absence of breast cancer.
Future directions for research and development aim to develop ML-based CAD systems not only for diagnostic but also for predictive and prognostic purposes, by including other MRI methods such as T2-weighted or DW sequences or hybrid (PET/MRI) techniques, as well as extracted quantitative radiomics features. Such advanced multiparametric ML-based CAD systems are expected not only to further improve diagnostic accuracy for challenging lesions but also to provide predictive and prognostic indicators for breast cancer. It has to be noted that, despite encouraging results, we are still at the dawn of a widespread implementation of ML-based CAD systems in breast MRI. To date, studies have been mainly retrospective and single-institution, have used different equipment, scan protocols, sequence parameters, and postprocessing steps, and have included relatively small numbers of patients, which limits the statistical power of the studies and may compromise the generalizability of the results. Rigorous standardization of MRI hardware and software, quantitative MRI techniques, and multicenter large-scale studies are needed to build and validate robust machine-learning models that are applicable across patients and institutions to provide clinical value.