Anniversary Paper: History and status of CAD and quantitative image analysis: The role of Medical Physics and AAPM

The roles of physicists in medical imaging have expanded over the years, from the study of imaging systems (sources and detectors) and dose to the assessment of image quality and perception, the development of image processing techniques, and the development of image analysis methods to assist in detection and diagnosis. The latter is a natural extension of medical physicists’ goals in developing imaging techniques to help physicians acquire diagnostic information and improve clinical decisions. Studies indicate that radiologists do not detect all abnormalities on images that are visible on retrospective review, and they do not always correctly characterize abnormalities that are found. Since the 1950s, the potential use of computers had been considered for analysis of radiographic abnormalities. In the mid-1980s, however, medical physicists and radiologists began major research efforts for computer-aided detection or computer-aided diagnosis (CAD), that is, using the computer output as an aid to radiologists (as opposed to a completely automatic computer interpretation), focusing initially on methods for the detection of lesions on chest radiographs and mammograms. Since then, extensive investigations of computerized image analysis for detection or diagnosis of abnormalities in a variety of 2D and 3D medical images have been conducted. The growth of CAD over the past 20 years has been tremendous, from the early days of time-consuming film digitization and CPU-intensive computations on a limited number of cases to its current status in which developed CAD approaches are evaluated rigorously on large clinically relevant databases.
CAD research by medical physicists includes many aspects: collecting relevant normal and pathological cases; developing computer algorithms appropriate for the medical interpretation task including those for segmentation, feature extraction, and classifier design; developing methodology for assessing CAD performance; validating the algorithms using appropriate cases to measure performance and robustness; conducting observer studies with which to evaluate radiologists in the diagnostic task without and with the use of the computer aid; and ultimately assessing performance with a clinical trial. Medical physicists also have an important role in quantitative imaging, by validating the quantitative integrity of scanners and developing imaging techniques and image analysis tools that extract quantitative data in a more accurate and automated fashion. As imaging systems become more complex and the need for better quantitative information from images grows, the future includes the combined research efforts from physicists working in CAD with those working on quantitative imaging systems to readily yield information on morphology, function, molecular structure, and more, from animal imaging research to clinical patient care. A historical review of CAD and a discussion of challenges for the future are presented here, along with the extension to quantitative image analysis. © 2008 American Association of Physicists in Medicine. [DOI: 10.1118/1.3013555]


I. INTRODUCTION
Research and development of methodology and instrumentation for diagnostic or therapeutic applications are among the major responsibilities of medical physicists. In medical diagnosis, physicists have been contributing to the development of imaging techniques since the discovery of x-rays by W. C. Roentgen. The roles of physicists in medical imaging have expanded in all directions over the years, from the study of imaging systems (sources and detectors) to the assessment of image quality and perception, the development of image processing techniques, and the development of image analysis methods to assist in detection and diagnosis, to name a few. The latter is a natural extension of medical physicists' goals in developing imaging or other techniques to help physicians acquire diagnostic information and improve clinical decisions.
The benefit of a medical imaging exam is dependent both on the physical quality of the medical images and on the ability of the radiologist interpreting them. Studies indicate that radiologists do not detect all abnormalities on images that are visible on retrospective review, and they do not always correctly characterize abnormalities that are found. In the clinical interpretation of medical images, limitations in the human eye-brain visual system, reader fatigue, distraction, the presence of overlapping structures that camouflage disease in images, and the vast number of normal cases seen in screening programs provide cause for detection and interpretation errors. [1][2][3][4][5][6] Lusted discussed the use of computers in the analysis of radiographic abnormalities in the mid-1950s. 7 In the 1960s and 1970s, researchers including physicists and clinicians started to investigate computerized image analysis aimed at automated detection or classification of abnormalities, [8][9][10][11][12][13][14][15][16] including analyses on breast images 11 and chest radiographs. 12,13 However, limited computer power and quality of the image digitization equipment at that time may have limited the chance of success for these early attempts. The goal of stand-alone, automated computerized detection or diagnosis also made it difficult to achieve the accuracy and the acceptance required for clinical use. In the 1970s and 1980s, with the advent of digital subtraction angiography and the application of other digital images, various investigators started developing computer-based quantitative analysis of angiographic vasculature. 17,18
In the mid-1980s, a team of medical physicists and radiologists in the Kurt Rossmann Laboratories in the Department of Radiology at the University of Chicago started their research efforts for computer-aided detection or computer-aided diagnosis (CAD), that is, using the computer output as an aid to radiologists (as opposed to a completely automatic computer interpretation), focusing initially on methods for the detection of lesions on chest radiographs and mammograms. [19][20][21][22] In this usage, CAD can be defined as a diagnosis made by a radiologist who uses the output from a computer analysis of the image data in their decision-making process. The final medical decision is made by the radiologist, not the computer. Note that with CAD, the role of the computer analysis is not to replace the radiologist but rather to aid the radiologist in his/her image interpretation and/or decision making. For more than 20 years, investigations of computerized image analysis for detection or diagnosis of abnormalities in a variety of 2D and 3D medical images have been conducted through collaborations between medical physicists and radiologists. Radiologists were expected to ultimately use the output from computerized analysis of medical images as a "second opinion," like a spellchecker, in detecting and characterizing lesions as well as in making diagnostic decisions, as schematically shown in Fig. 1. Many reviews and chapters have already been written on the development and implementation of CAD methods. [23][24][25][26][27][28][29][30][31][32][33][34][35][36][37] It is important to note that success in CAD required knowledge of imaging physics (i.e., image acquisition method) as well as knowledge of various computer vision and artificial intelligence techniques.
Because of the numerous works that have been conducted, this brief review is by no means exhaustive, but only serves as a historical perspective of the importance of CAD research in diagnostic imaging and medical physics, and reports on the various roles played by medical physicists in the evaluation and understanding of CAD and its limitations.
The growth of CAD over the past 20 years has been tremendous, from the early days of time-consuming film digitization and CPU-intensive computations on a limited number of cases to its current status in which developed CAD approaches are evaluated rigorously on large clinically relevant databases. Figure 2 illustrates the growth of CAD research in terms of number of publications in Medical Physics. CAD research by medical physicists includes many aspects: collecting relevant normal and pathological cases; developing computer algorithms appropriate for the medical interpretation task including those for segmentation, feature extraction, and classifier design (Fig. 3); developing methodology for assessing CAD performance; validating the algorithms using appropriate cases to measure performance and robustness; conducting observer studies with which to evaluate radiologists in the diagnostic task without and with the use of the computer aid; and ultimately assessing performance with a clinical trial. Currently, CAD has been extended to include image analysis of various disease types (breast cancer, lung cancer, interstitial disease, colon cancer, osteoporosis, osteolysis, vascular plaque, aneurysms, and others) on various modalities, including analog and digital radiography, ultrasound, CT, PET, MRI, and others.
CAD techniques and systems can broadly be categorized into two types: computer-aided detection (CADe) and computer-aided diagnosis (CADx). CADe implies that radiologists use computer outputs of the locations of suspect regions, leaving the characterization, diagnosis, and patient management to the radiologist. CADe is basically a detection task, i.e., a localization task. CADx extends the computer analyses to yield output on the characterization of a region or lesion, initially located by either a human or a computerized detection system. The computer might output mathematical descriptors to characterize the lesion and/or estimate the probability of malignancy (or other abnormality), leaving the final diagnosis and patient management to the physician. CADx is a classification task for differential diagnosis. Ultimately, the goal of CAD is to reduce search errors, reduce interpretation errors, and reduce variation between and within observers.
There is strong synergy between CAD and quantitative image analysis. With continued growth in CAD techniques and the associated increase in accuracies, quantitative image analysis is a natural extension of the new algorithmic methods to help extract quantitative features and absolute measures of morphology and function to improve medical diagnosis. Conversely, quantitative imaging accentuates the need for highly robust and efficient computer-assisted image analysis tools and stimulates the development of CADe and CADx for the new imaging applications.
Medical diagnostic imaging lends much of its scientific development to the adaptation of signal detection theory 38,39 to guide its technological evolution and performance evaluation. Medical physicists play an important role in this process. [40][41][42][43][44][45][46] One fundamental concept is the relationship between image quality measures such as the signal-to-noise ratio (SNR) and the detectability of signals in an image. [47][48][49][50][51][52][53] The development of various medical imaging modalities centered around the goal of improving image quality and SNR of the lesion of interest, which is important for both human observers and machine vision. To achieve this goal for CADe systems, CAD researchers proposed the difference image technique [19][20][21][22] in which the input image was processed to generate a SNR-enhanced image and a SNR-suppressed image, as demonstrated on a chest radiograph in Fig. 4(a), which shows the processing of the nodule prior to additional computer vision techniques for nodule detection. Subsequently, the difference of the two processed images is obtained to yield an image in which the conspicuity of the lesion is greatly increased [Fig. 4(b)]. This method of enhancing the SNR was an extension of the prior research of these medical physicists. Although the implementation differs and depends on the lesion of interest, many CADe systems to date follow a similar approach of enhancing the SNR as a first step.

II.A. CADe in mammography
Breast cancer detection is one of the principal research areas that has been studied since the early days of CAD research. Mammographic interpretation is a difficult task because mammographic signs of breast cancer such as microcalcifications and soft tissue masses can be very subtle and often obscured by dense fibroglandular breast tissue. The recommended annual screening mammography for women over 40 years of age results in a large volume of mammograms to be read by radiologists. Studies indicate that the false-negative rate of mammography ranges from 10% to 30%. [54][55][56][57][58][59][60] In a study that reviewed retrospectively prior mammograms of breast cancer patients, it was found that 67% of the cancers were visible on the prior mammograms. 61 CADe, therefore, potentially can be very useful for mammography.
Computerized analysis systems for mammography usually are focused on the detection of either clustered microcalcifications or mass lesions, with more recent methods on architectural distortions. These methods have been reviewed extensively. [23][24][25][26][27][28][29]31,33,62,63 A number of investigators have reported computerized methods for detection of microcalcifications. 16,19, Although the specific techniques used in different systems varied, they generally contained several major steps. The breast region is first extracted by boundary detection. The mammogram may then be processed by image enhancement methods to increase the SNR of the microcalcifications. The signal candidates are identified and segmented based on their SNR difference or gray level contrast from the surrounding background tissue. Features that characterize the shape, size, contrast of the individual microcalcifications, and of the cluster are extracted and used as input to classifiers for differentiation of true and false signals. Additional false-positive (FP) reduction techniques, such as artificial neural networks, may be trained to further distinguish between true signal patterns (i.e., the lesion) and normal anatomic background. The clustering property of significant microcalcifications is used to further reduce FPs, and the remaining clusters are flagged as suspicious lesion locations.
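The final clustering step can be illustrated with a short sketch. It assumes a simple rule that is not taken from any cited system: a candidate is kept only if at least a minimum number of candidates (itself included) lie within a fixed radius. The function name, the radius, and the minimum count are hypothetical choices made for illustration.

```python
import math


def keep_clustered(candidates, radius_mm=5.0, min_members=3):
    """Return the candidates that have at least `min_members` candidates
    (counting themselves) within `radius_mm`; isolated candidates are dropped."""
    kept = []
    for (x, y) in candidates:
        members = sum(
            1 for (u, v) in candidates
            if math.hypot(x - u, y - v) <= radius_mm
        )
        if members >= min_members:
            kept.append((x, y))
    return kept


# Hypothetical candidate positions in mm: three close together, two isolated.
candidates = [(10.0, 10.0), (12.0, 11.0), (11.0, 13.0), (60.0, 40.0), (85.0, 15.0)]
clustered = keep_clustered(candidates)
```

With these values the three neighboring candidates survive and the two isolated candidates are discarded as likely FPs.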
Soft tissue masses are imaged as focal densities on mammograms. Masses with well-circumscribed margins are more likely to be fibroadenoma or a benign cyst whereas masses with ill-defined or spiculated borders have a high likelihood of being malignant. However, there is large overlap between the border characteristics of malignant and benign masses. Initially, a few investigators developed automatic algorithms for detection of masses on mammograms 11,64,85,86 comparing regions between the left and right breast images. The development of mass detection systems evolved more rapidly since the late 1980s. The overall scheme of these systems generally contains several major steps similar to those in a microcalcification detection system. The breast region is first segmented from the mammogram. The mammogram may be preprocessed with a spatial filter or nonlinear technique to enhance the suspicious regions. The mass candidates are segmented from the breast image based on their gray level contrast, gradient orientation, or spicule information. Feature descriptors are extracted from the segmented objects. Rule-based classifiers or other linear, nonlinear, or neural network classifiers are then trained to classify the mass candidates as true mass or FPs.
FIG. 4. Difference-image approach to detecting nodule candidates on chest radiographs. The approach aimed to enhance the nodule with one processing filter and to suppress the anatomical background with another processing filter, with the difference resulting in an image for further analysis. Reprinted with permission from Giger et al. 1988 (Ref. 21).
While many analyses of mammograms include the specific stages of lesion segmentation and feature extraction, some investigators have focused on extracting information directly from the image data. Zhang et al. trained a shift-invariant neural network to detect individual microcalcifications in a background-corrected region. 74 Tourassi et al. used information theory in developing a content-based retrieval and detection system that took as input regions throughout the mammogram in the detection of masses. 113 In 1990, Chan et al. 114 reported on the first observer study to compare radiologists' detection of microcalcifications with and without the aid of a computer-aided detection (CADe) system using receiver operating characteristic (ROC) methodology and demonstrated that the radiologists' performance was improved significantly with CADe (Fig. 5). This study established the potential usefulness of CADe as a second opinion. It also revealed the important concept that it is not necessary for the CADe system performance to be as high as or higher than that of the radiologists in order to provide a useful second opinion, as long as it can provide information complementary to what radiologists may have. Additional studies followed for both clustered calcifications and mass lesions. 92 Figures 6(a) and 6(b) show the first prototype CAD system (circa 1994), along with an example output on thermal paper, which was developed and applied to screening mammography at the University of Chicago. The system received as input a screen/film mammogram, which was subsequently digitized and automatically analyzed by the computer. The output annotation from the system would indicate suspect locations (clustered microcalcifications or mass lesions) on a thermal paper printout or monitor.
FIG. 6. (a) First prototype CADe system developed for screening mammography at the University of Chicago (circa 1994); (b) system annotated output on thermal paper.
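The ROC comparison of readings without and with CADe mentioned above can be illustrated with a nonparametric estimate of the area under the ROC curve (AUC). This sketch uses the Wilcoxon (Mann-Whitney) statistic, which is an assumption made for brevity and is not necessarily the curve-fitting method used in Ref. 114; the confidence ratings are hypothetical and are not data from any study.

```python
def auc(ratings_abnormal, ratings_normal):
    """Nonparametric AUC: the probability that a randomly chosen abnormal case
    receives a higher rating than a randomly chosen normal case (ties count 1/2)."""
    wins = 0.0
    for a in ratings_abnormal:
        for n in ratings_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(ratings_abnormal) * len(ratings_normal))


# Hypothetical 5-point confidence ratings from one reader on the same cases.
unaided_abnormal = [2, 3, 3, 4, 5]
unaided_normal = [1, 2, 2, 3, 4]
aided_abnormal = [3, 4, 4, 5, 5]
aided_normal = [1, 1, 2, 3, 3]

auc_unaided = auc(unaided_abnormal, unaided_normal)
auc_aided = auc(aided_abnormal, aided_normal)
```

An observer study would repeat this for several readers and test whether the difference in AUC is statistically significant; that analysis is outside the scope of this sketch.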
The first commercial CADe system for screening mammography was approved by the Food and Drug Administration (FDA) in 1998. Other systems for mammography have obtained FDA approval since then, and approval of CADe systems for digital mammography also followed. A large number of CADe systems are being used clinically in screen-film and digital screening mammography both in the United States and overseas. Several reports have been published on the performance of some of the commercial systems in clinical practice, as summarized in Tables I and II. [387][388][389][390][391][392][393][394][395][396][397] The results indicated that the cancer detection rate in general increased with an accompanying increase in the recall rate, as can be expected. The design of these clinical studies can be separated into two major groups: (i) a sequential reading design in which interpretations by the same radiologist without CADe are immediately followed by interpretation with CADe and (ii) a longitudinal in time (historical) design in which a statistical comparison is made of a group of radiologists over two periods of time before and after CADe is implemented in the practice. The former design, therefore, collected without and with CADe data from the same patient cohorts and the same radiologists, whereas the latter design collected data from different patient cohorts and the radiologists may not be the same. The rationale and biases of these designs have been discussed by CAD researchers. 36,115,116 The latter design may introduce additional variabilities from factors such as differences in the patient characteristics and the radiologists' experiences in the two periods of time. The larger variances may make it more difficult to observe the incremental gain in sensitivity with CADe compared to without CADe. The different biases and variances may account for part of the differences in the observed effects of CADe on the sensitivity and specificity in these prospective studies.
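The quantities compared in a sequential reading design (cancer detection rate and recall rate, without and then with CADe, on the same cases) can be computed as in the sketch below. The function name and all counts are hypothetical and are not taken from Tables I and II.

```python
def sequential_reading_summary(n_screened, cancers_unaided, cancers_added,
                               recalls_unaided, recalls_added):
    """Summarize a sequential (without, then with CADe) reading of the same cases.

    `cancers_added` and `recalls_added` are the extra cancers detected and the
    extra recalls made only after the CADe marks were reviewed.
    """
    detection_unaided = 1000.0 * cancers_unaided / n_screened
    detection_aided = 1000.0 * (cancers_unaided + cancers_added) / n_screened
    recall_unaided = 100.0 * recalls_unaided / n_screened
    recall_aided = 100.0 * (recalls_unaided + recalls_added) / n_screened
    return {
        "detection_per_1000": (detection_unaided, detection_aided),
        "recall_percent": (recall_unaided, recall_aided),
        "relative_detection_gain": cancers_added / cancers_unaided,
        "relative_recall_gain": recalls_added / recalls_unaided,
    }


# Hypothetical practice: 10,000 screens, 40 cancers found unaided plus 4 more with CADe,
# 800 recalls unaided plus 80 more with CADe.
summary = sequential_reading_summary(10000, 40, 4, 800, 80)
```

Because the unaided and aided readings come from the same cases and the same radiologist, the added detections and added recalls can be attributed to the CADe marks directly, which is the property that distinguishes this design from the historical design.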
The effects of CADe can be expected to depend on many other factors, including the level of expertise and vigilance of the radiologist and how the radiologist utilizes the CADe marks. Current CADe systems for screening mammography are designed to be used as a second reader, not as a concurrent reader. The radiologist should first interpret the case thoroughly as if there is no CADe, and should not reduce their level of suspicion at locations where there are no CADe marks. It is well known that CADe systems can miss lesions that radiologists detect routinely and mark many FPs. The benefits of CADe often rely on its detection of some lesions that radiologists may overlook and the willingness of radiologists to work up some of the CADe marks. If CADe is used as it is designed, the sensitivity will never decrease and the recall rate is expected to increase. For radiologists with low false negative rates, the incremental gain by CADe will likely be small. Furthermore, the incremental gain in sensitivity will not be realized if radiologists become too dependent on the CADe system and reduce their vigilance in interpreting the mammograms themselves, or if they ignore the CADe marks because of too many FPs. It is important that the user understands the capability and the limitations of the CADe system and uses it properly in order to take advantage of CADe.
The relatively large number of FPs in current CADe systems is a major drawback of using CADe for some radiologists. Continued efforts are needed to improve the sensitivity and the specificity of the systems. Most current CADe systems concentrate on detection in a single mammogram. One promising approach to improving the performances of CAD systems is to incorporate multiple image information, including correlation of two mammographic views (CC and MLO views) of the same breast, comparison of current and prior mammograms, or comparison of bilateral mammograms. These strategies emulate those routinely performed by radiologists in mammographic interpretation to detect new lesions and reduce FPs. 54,117,118 Studies have been conducted to incorporate information from multiple mammographic views of the same breast, such as the CC and MLO views, for lesion detection and the reduction of FPs. 84,[119][120][121][122] Radiologists compare the left and right mammograms to detect asymmetry in the density patterns of the breasts. Thus, researchers used digital bilateral comparison techniques, including methods for image registration, to incorporate information from both breasts and identify asymmetries. 64,89,91,[123][124][125] These studies indicate that multiview information fusion has a strong potential for improving the performance of CADe systems.
TABLE I. Prospective clinical trial of commercial CADe systems for screening mammography. These studies used a sequential reading design in which the interpretations by the same radiologist without CADe immediately followed by with CADe were recorded for individual cases. The results were, therefore, collected from the same patient cohorts and the same radiologists. Recovered column headings: Investigators; Number of cases.
Radiologists routinely compare the current and prior mammograms, if available, for detection of newly developed mammographic abnormalities. Automated analysis of interval changes in serial mammograms requires identification of corresponding locations on two mammograms of the same view. The deformability of the breast and lack of invariant "landmarks" make it difficult to correctly register two breast images using conventional registration techniques. Various investigators have developed methods for use in temporal subtraction using automatically delineated skin line and nipple positions, 126 as well as regional registration techniques to localize corresponding lesion locations on mammograms of the same view to within a small search region of the true location. 124,127,128 Multimodality imaging is a promising approach to improving breast cancer detection. There is strong interest in developing a combined full breast 3D ultrasound and digital mammography system in which the ultrasound scanning will be performed automatically in the same compression as the digital mammogram so that the corresponding lesions between the two can be correlated geometrically. 129 To facilitate the implementation of such a system in screening mammography, ideally one will have a CADe system that can automatically detect suspicious masses on the digital mammogram and initiate the ultrasound scanning, if needed, while the breast is still under compression. After image acquisition, the CADe system will automatically detect the lesions in the 3D ultrasound volume and correlate the lesions with those detected on the digital mammograms. The combined information from the two modalities can be used to improve cancer detection and reduce recalls.
With the advent of direct digital mammography systems, a number of new breast imaging techniques are under development, including digital breast tomosynthesis, [130][131][132][133][134] single-energy or dual-energy contrast-enhanced digital subtraction mammography, 135,136 and breast computed tomography (CT). [137][138][139] Tomosynthesis mammography and breast CT hold the promise of improving breast cancer detection and diagnosis, especially in dense breasts. Combined tomosynthesis mammography and 3D ultrasound scanning is also being developed. 129 These new modalities or multimodality images drastically increase the number of images that radiologists have to interpret for each case. If CADe systems are available to assist radiologists in the analysis of the new modalities efficiently and in integrating the information from different modalities effectively, it may facilitate the introduction of the new techniques to clinical practice. Development of CADe systems for tomosynthesis mammography is underway. 140,141 It can be expected that CAD development for the other modalities will also be initiated when image databases become available for design of the CAD systems.

II.B. CADe in thoracic imaging
CADe systems for various lung diseases have been reported in the literature. Chest radiography is the most commonly performed procedure in medical imaging; however, interpretation of chest radiographs is a difficult task because of the overlapping ribs and its low contrast sensitivity for subtle abnormalities. CAD of lung disease was attempted in the 1970s. 12,14 Dedicated efforts in the 1980s revived interest in the development of CADe systems for chest radiographs. 20,21,142 Over the last two decades, a large number of studies have been conducted to develop computerized methods for analysis of various abnormalities in chest radiographs, including detection of lung nodules, 20,21,143-152 detection and classification of interstitial diseases, 142,153 detection of pneumothorax, 154 and temporal subtraction of chest radiographs to detect interval changes. [155][156][157] The effects of CADe for lung nodule detection on radiologists were evaluated by a number of observer performance studies. 144,[158][159][160][161] Similar to CADe for breast cancer detection in mammography, these studies indicated that the detection accuracy for lung nodules in chest radiographs could be significantly improved with the use of CADe. A commercial lung nodule CADe system for chest radiography was approved by the FDA in 2001, but no large-scale prospective clinical trials have been reported to date.
The Early Lung Cancer Action Project (ELCAP) study showed that thoracic CT has higher sensitivity for detection of early stage lung cancer than chest x-rays. 162 However, it is not known whether early detection can actually reduce the mortality rate or increase the chance of survival for lung cancer patients. An NCI-sponsored randomized, controlled study, the National Lung Screening Trial (NLST), was conducted to compare the mortality rate of lung cancer patients using helical CT or chest x-rays, but the results are not yet available. Thoracic CT, especially helical CT, produces a large number of slices for each case. There will be a dramatic increase in radiologists' workload if CT is recommended for lung cancer screening in the future. The potential usefulness of CT for lung cancer screening has stimulated interest in the development of CADe systems for lung nodule detection on thoracic CT scans. A number of research groups have reported CADe methods in this area. [163][164][165][166][167][168][169][170][171][172][173][174] The performances of these systems vary, and the performances were evaluated on data sets using different CT scan protocols and having cases of different nodule characteristics. The NCI recognized the need for CAD techniques for lung CT interpretation and supported the Lung Imaging Database Consortium (LIDC) to collect a standard database of lung CT images for this purpose. 175 The first commercial CADe system for thoracic CT was approved by the FDA in 2004. Although no prospective clinical trial of lung CADe has been reported to date, retrospective observer performance studies indicated that observers' accuracy in detection of lung nodules on chest CT scans can be significantly improved with the use of CADe, [176][177][178][179][180][181] indicating the potential for CADe to assist in radiologists' reading in clinical practice.

II.C. CADe in colon imaging
CT colonography is another important area of application for CADe. Colon cancer is the third leading cause of cancer deaths for men and women in the United States. Colon cancer screening involves detection of polyps, which can be the precursor of colon cancer, and cancerous growths on the walls of the large intestine. Currently the most reliable procedure for colon cancer screening is a colonoscopy. CT colonography (CTC) is being studied as an alternative procedure. Interpretation of CTC is time consuming and difficult even with the help of the virtual colonoscopic view that helps the radiologist fly through the entire colon to search for abnormalities. The radiologist's sensitivity of polyp detection in CTC varies over a wide range as reported in the literature, which was attributed to many factors such as the variability in CT scanning techniques, colon preparation methods, size of the polyps in the studied patient cohort, and the radiologists' experience with CTC.
CADe may be a useful adjunct to CTC to reduce false negatives and reader variability. A number of research groups have reported CADe methods for analysis of CTC in the past few years. [182][183][184][185][186][187][188][189][190][191][192][193] The current CTC CADe systems have sensitivity ranging from 80% to 100% at an FP rate of 2 to 15 per scan. Most of the studies used a small data set for evaluation so that the variances of the results may be high. In addition, the performance of CADe depends strongly on the data set characteristics, including the polyp size in the data set and the CTC scanning protocol, as well as the method used for scoring of the true positives and FPs of the CADe algorithm. It is still unknown how these performances would generalize to unknown cases in prospective studies. Several retrospective observer studies have been conducted to evaluate the effects of CADe on radiologists' interpretation of CTC. 185,[194][195][196] These studies indicate that radiologists reading with CADe outperformed radiologists alone. The usefulness of CADe for CTC has yet to be evaluated in prospective clinical trials.
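Because the reported sensitivity and FP rate depend on how CADe marks are scored, it can help to make one scoring rule explicit. The sketch below assumes a distance criterion: a mark counts as a true positive if it lies within a fixed distance of a reference polyp center, and as an FP otherwise. The criterion, the 10 mm threshold, the function name, and the coordinates are illustrative assumptions, not the scoring method of any cited study.

```python
import math


def score_cade(marks_per_scan, polyps_per_scan, max_dist_mm=10.0):
    """Return (sensitivity, FPs per scan) over a set of scans.

    Each scan contributes a list of CADe mark positions and a list of reference
    polyp centers, both as (x, y, z) tuples in mm.
    """
    detected = 0
    total_polyps = 0
    false_positives = 0
    for marks, polyps in zip(marks_per_scan, polyps_per_scan):
        total_polyps += len(polyps)
        hit = [False] * len(polyps)
        for mark in marks:
            matched = False
            for i, polyp in enumerate(polyps):
                if math.dist(mark, polyp) <= max_dist_mm:
                    hit[i] = True
                    matched = True
            if not matched:
                false_positives += 1   # mark is not near any reference polyp
        detected += sum(hit)           # each polyp is counted at most once
    return detected / total_polyps, false_positives / len(marks_per_scan)


# Hypothetical data for three scans (the third scan has no polyps).
polyps_per_scan = [
    [(10.0, 10.0, 10.0), (50.0, 50.0, 50.0)],
    [(20.0, 20.0, 20.0)],
    [],
]
marks_per_scan = [
    [(12.0, 10.0, 10.0), (80.0, 80.0, 80.0), (30.0, 30.0, 30.0)],
    [(20.0, 25.0, 20.0)],
    [(5.0, 5.0, 5.0)],
]

sensitivity, fp_per_scan = score_cade(marks_per_scan, polyps_per_scan)
```

Changing `max_dist_mm`, or restricting the reference polyps to a minimum size, changes both numbers for the same CADe output, which is one reason results from different studies are hard to compare directly.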

III. COMPUTER-AIDED DIAGNOSIS-FOR DIFFERENTIAL DIAGNOSIS
Once a lesion is detected, for example, in a screening program, further imaging of the abnormality may be necessary in order to justify subsequent patient management such as invasive evaluations (e.g., a biopsy) and/or therapeutic interventions. Thus, the role of a CADx system is to aid in the characterization of an already-found lesion or other abnormality in terms of its morphological or functional attributes, and in the estimation of its probability of malignancy or other disease state. Such a computer system is expected to aid a radiologist in his/her differential diagnosis and improve the positive predictive value (PPV) of the interpretation. The input to a CADx algorithm could be either a radiologist-detected or a computer-detected lesion or region. This input could be in the form of an indication of the approximate center of the lesion or an actual delineation of the lesion outline. As clinical CADe systems begin to give more information beyond just localization, CADx is slowly being introduced.
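As a small worked example of the PPV mentioned above, the sketch below uses the standard definition, PPV = TP / (TP + FP), applied to biopsy recommendations. The biopsy counts are hypothetical and serve only to show how avoiding benign biopsies raises the PPV when no cancers are missed.

```python
def positive_predictive_value(true_positives, false_positives):
    """Fraction of positive calls (e.g., biopsy recommendations) that are truly malignant."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts: 100 malignant and 300 benign lesions sent to biopsy.
ppv_before = positive_predictive_value(100, 300)

# Same 100 malignant lesions, but 100 of the benign biopsies are avoided.
ppv_after = positive_predictive_value(100, 200)
```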
Just as radiologists use multiple modalities in the work up of a patient's case, so can a computer system. Medical physicists, armed with their knowledge of the physics of the various imaging modalities, such as x-ray radiography, special radiographic views, sonography, and MRI, are able to develop CADx for the various modalities and use the information individually or in combinations. Radiologists' use of the output of a CADx system is expected to improve the sensitivity for cancer diagnosis, reduce the number of benign biopsies, and reduce variability between and within radiologists. Extensions of such systems will potentially be developed for assessing prognosis, tumor growth rate, and response to treatment.

III.A. CADx in breast imaging
Medical physicists have played key roles in developing CADx methods in breast imaging across the modalities. Computerized classification systems can be designed to take as input either human-perceived lesion features or computer-extracted features. Note that a diagnostic task involves both the extraction of lesion characteristics and the subsequent merging of these characteristics into a diagnosis. In 1988, Getty et al. demonstrated that radiologists' performances improved when using a CADx system that merged the lesion characteristics that the radiologists had indicated via a checklist. 197 Although such human-perceived lesion features, e.g., BI-RADS rating, can be subjective and may vary between radiologists, the usefulness of merging such features by computer systems has been demonstrated. [197][198][199][200][201][202][203][204][205][206] Computer-extracted features, i.e., mathematical descriptors, can characterize the lesion using either features that radiologists can perceive, such as mass spiculation or distribution of microcalcifications, or features that are not so visually intuitive, such as those obtained with co-occurrence matrices. 62, These computer-extracted features can be obtained from standard mammographic views (CC and MLO), special-view mammograms, prior mammographic exams, or tomosynthesis mammograms, as well as from sonograms and/or breast MR images. Note that computer-extracted lesion features can be obtained from either radiologist-delineated lesion margins or computer-segmented lesion margins. Various methods have been proposed for this important segmentation stage in CADx systems. 217,230,242,[249][250][251][252][253] A poor segmentation of the lesion margin would subsequently yield erroneous mathematical descriptors (features) of the lesion.
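To make the notion of a co-occurrence-matrix feature concrete, the following is a minimal sketch in Python (NumPy only). It is an illustration under stated assumptions, not the method of any cited CADx system: the region of interest is assumed to be already quantized to integer gray levels in the range 0 to `levels - 1`, a single pixel offset is used, and the function names `cooccurrence_matrix` and `texture_features` are hypothetical.

```python
import numpy as np


def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset.

    Assumes `image` is a 2D array of integers in [0, levels).
    """
    img = np.asarray(image)
    rows, cols = img.shape
    glcm = np.zeros((levels, levels), dtype=float)
    # Pair every pixel with its neighbor at offset (dy, dx) inside the image.
    src = img[max(0, -dy):rows - max(0, dy), max(0, -dx):cols - max(0, dx)]
    dst = img[max(0, dy):rows + min(0, dy), max(0, dx):cols + min(0, dx)]
    np.add.at(glcm, (src.ravel(), dst.ravel()), 1.0)
    glcm += glcm.T  # count each pair in both directions
    return glcm / glcm.sum()


def texture_features(glcm):
    """Three common descriptors computed from a normalized co-occurrence matrix."""
    i, j = np.indices(glcm.shape)
    return {
        "contrast": float(np.sum(glcm * (i - j) ** 2)),
        "energy": float(np.sum(glcm ** 2)),
        "homogeneity": float(np.sum(glcm / (1.0 + np.abs(i - j)))),
    }


# Illustrative 4x4 region with four gray levels (made-up values).
roi = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
features = texture_features(cooccurrence_matrix(roi, levels=4))
```

In practice such descriptors would be computed within the segmented lesion margin and for several offsets and directions, which is one reason the segmentation quality discussed above matters.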
As with radiologists, computer performance in diagnosing lesions improves for special-view mammograms, as opposed to standard views, 228 and also when prior mammograms are included in the overall analysis. 226,247 With the advent of FFDM systems, investigations have been conducted to understand the necessary conversions of a CADx system when going from digitized screen/film images to FFDM images. 254,255 For example, investigators have shown that a mammographic CADx system developed for characterizing clustered microcalcifications on screen/film mammograms as malignant or benign can also be used for FFDM images; the system appeared to maintain consistently high performance without requiring substantial modification from its initial development on screen-film mammography. 246,256 While the underlying concepts regarding malignant features remain, the importance of the different features to a correct output may depend on the physical image quality of the acquisition system, and, thus, retraining (calibration) may be necessary.
The understanding of the imaging physics of breast sonography allows for the development of additional lesion features, such as posterior acoustic shadowing, and, thus, their corresponding mathematical descriptors. CADx systems for ultrasound include mathematical descriptors of texture, margin, and posterior acoustic shadowing criteria. 208,220,232,236,[257][258][259][260] Medical physicists have led the field in robustness evaluation studies across institutions and across manufacturers on sonographic CADx, 236,240 and in extending the analysis to 3D images. 239
The use of dynamic contrast-enhanced MRI continues to increase in diagnostic work up and preoperative staging for breast cancer. Contrast medium uptake and washout are related to tumor blood flow, and, thus, the associated kinetics are related to angiogenesis and likelihood of malignancy. 261 There is a need for standardization of MR breast imaging as protocols can vary greatly between and within institutions, and, thus, a standardized lexicon is being developed. 262 Success in MRI analysis depends on knowledge of the underlying biology, the physics of the acquisition, and computer vision techniques. Some commercial systems focus on just the kinetic aspects of breast MRI and plot the kinetic curve (uptake and washout of the contrast agent) of regions of interest on the display workstation. CADx systems being developed for MRI yield morphological features, kinetic features, or combinations. 215,223,231,235,243,399 MRI CADx systems have the potential to improve both the accuracy and efficiency of interpretation.
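As a rough illustration of what kinetic features of an uptake and washout curve might look like, the sketch below computes a few simple descriptors from the mean signal of a region of interest at successive time points. The feature definitions (relative enhancement, time to peak, uptake rate, washout rate) and the function name `kinetic_features` are assumptions chosen for illustration; they are not taken from any specific system cited above.

```python
import numpy as np


def kinetic_features(times, signal):
    """Simple descriptors of a contrast uptake/washout curve.

    Assumes signal[0] is the precontrast value and times are in seconds.
    """
    t = np.asarray(times, dtype=float)
    s = np.asarray(signal, dtype=float)
    enhancement = (s - s[0]) / s[0]        # relative to precontrast signal
    peak_idx = int(np.argmax(enhancement))
    peak = float(enhancement[peak_idx])
    time_to_peak = float(t[peak_idx] - t[0])
    uptake_rate = peak / time_to_peak if time_to_peak > 0 else 0.0
    if peak_idx < len(t) - 1:
        # Average decline from the peak to the last time point.
        washout_rate = float((peak - enhancement[-1]) / (t[-1] - t[peak_idx]))
    else:
        washout_rate = 0.0                 # curve still rising at the last point
    return {
        "peak_enhancement": peak,
        "time_to_peak": time_to_peak,
        "uptake_rate": uptake_rate,
        "washout_rate": washout_rate,
    }


# Made-up curve: rapid uptake followed by washout.
curve = kinetic_features([0, 60, 120, 180, 240], [100, 200, 250, 230, 210])
```

A CADx system would typically merge descriptors of this kind with morphological features in a classifier rather than use any one of them alone.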
Medical physicists have led various observer studies of CADx as an aid to radiologists in the task of distinguishing between malignant and benign lesions, 222,237,244,263,398 showing that radiologists' performance in classification of malignant and benign microcalcifications or masses could be improved significantly by use of CADx. Others showed that improvement in performance with the use of CADx can be obtained by both expert mammographers and community-based radiologists, with the performance of the aided nonexperts reaching the levels of the unaided experts. 263 Use of computer output has also been shown to reduce the variability among radiologists' interpretations. 264 Medical physicists have also conducted observer studies for serial mammographic exams, 265 sonographic CADx including those for both 2D 237 and 3D ultrasound systems, 239 and for multimodality breast CADx in which the CADx system outputs analyses of both mammography and ultrasound. 244
Effective and efficient communication of the computer output to the radiologist is a necessary step in the overall CADx protocol. During residency, radiologists learn through the review of cases from conferences, teaching files, and atlases, and, thus, access to online cases of known pathology during a radiologist's daily practice may be helpful for continuous learning. Searching an online image atlas can be based on individual features, on likelihood of malignancy, or on psychophysical measures of similarity. One of the first display systems for computer analysis output, by Sklansky et al., 266 used a graphical method to show a chosen number of similar malignant lesions and the same number of similar benign lesions. Swett et al. utilized an expert system to control the display of similar cases. 267,268 Giger et al. developed a CADx workstation interface that includes mathematical descriptors of lesion characteristics as well as an estimate of the probability that the suspect lesion is abnormal or not, with the output given in terms of a numerical estimate of the probability of malignancy, a retrieval of similar lesions from an online database, and/or a graphical representation of the case in question relative to the distributions of normals and abnormals in a given population. 269,270,244 This interface, shown in Fig. 7, displays similar images and uses color coding to indicate whether the similar images are malignant or benign (red outlines = malignant, green outlines = benign lesions). It searches either via the computer-estimated probability of malignancy or by way of specific lesion characteristics, and shows a specific number of the closest similar lesions, whether they are all malignant, all benign, or a mixture. 269,270,244 Investigators using the psychophysical aspects of similarity have combined the computer-extracted lesion characteristics with subjective similarity measures obtained from observers reviewing pairs of images [271][272][273] or from observers giving subjective perceived ratings of lesion features. 274 Multimodality CADx output can be given separately for each modality or as a combined output that includes features from each modality, both of which have been shown to improve performance. 275,241 In addition, medical physicists are investigating the appropriate output in terms of an estimate of the probability of malignancy, knowing that the specific cancer prevalence in the training database may affect the actual output value. 276
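The dependence of an estimated probability of malignancy on the cancer prevalence in the training database can be illustrated with Bayes' rule. The sketch below is a generic textbook rescaling, not the specific approach of Ref. 276: it assumes the classifier output is a well-calibrated posterior probability at the training prevalence and that the likelihood ratio of the lesion features does not change between populations. The function name `rescale_probability` is hypothetical.

```python
def rescale_probability(p_train, prevalence_train, prevalence_target):
    """Re-express a probability of malignancy for a different cancer prevalence.

    Converts the posterior odds to a likelihood ratio using the training
    prevalence, then applies the prior odds of the target population.
    """
    for value in (p_train, prevalence_train, prevalence_target):
        if not 0.0 < value < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = p_train / (1.0 - p_train)
    prior_odds_train = prevalence_train / (1.0 - prevalence_train)
    likelihood_ratio = posterior_odds / prior_odds_train
    prior_odds_target = prevalence_target / (1.0 - prevalence_target)
    target_odds = likelihood_ratio * prior_odds_target
    return target_odds / (1.0 + target_odds)


# A lesion scored 0.5 by a system trained on a 50% malignant database
# carries no evidence either way, so at 10% prevalence it maps to 0.1.
p_clinical = rescale_probability(0.5, prevalence_train=0.5, prevalence_target=0.1)
```

The example shows why the same numerical output can mean different things to a radiologist depending on the case mix used for training.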

III.B. CADx in thoracic imaging
Due to the range of potential diseases present in the thorax, various types of computer-aided diagnosis methods are being developed for both chest radiography and CT, including algorithms for pulmonary nodules and interstitial lung diseases.
Use of computers for the differential diagnosis of lung nodules in chest radiographs and thoracic CTs has advanced in recent years. Candidate nodules detected on thoracic CT may be categorized as malignant or benign, or as actionable or not. Research parallels that for breast lesions in that characteristic features of the nodules are extracted from chest radiographs and merged using classifiers to yield a likelihood of malignancy. [277][278][279] Others have developed classification methods for nodules detected on CT, both conventional and thin-section. [280][281][282][283][284][285] This characterization of lung nodules on CT has been enhanced with the advent of PET/CT systems, allowing for characteristics from both modalities to be used in the computer classification.

IV.A. Quantitative metrics in anatomical imaging
While CAD is a quantitative tool that appears to its radiologist users as a qualitative tool, radiologists also make use of physically relevant quantitative measures under limited circumstances. These quantitative values have physical meaning in the radiographic interpretation. The use of distance and angular measurements using rulers or protractors is the most ubiquitous example, and clinical applications include the ultrasound-determined crown-rump length for fetal aging, 286 angular measurements for scoliosis, 287 and measuring tumor width. 288 Medical physicists were not needed for radiologists to capitalize on simple length measurements for diagnosis, but radiologists do use the fact that the Hounsfield unit (HU) in CT is linearly related to the linear attenuation coefficient, and medical physicist Godfrey Hounsfield (also a Nobel Prize recipient) developed the normalization procedure that made this possible. Lung nodules that exceed a certain HU value are considered benign due to their calcification, while nodules under this value have a higher probability of malignancy. 289 Medical physicists have played a role in understanding the limitation of quantitative CT. 290
Dual energy x-ray absorptiometry (DEXA) is capable of accurately determining the projected bone mineral density (mg/cm²), and has been used to assess fracture risk for over two decades. 291 CT can measure the bone mineral density 292 in three dimensions (mg/cm³), and has also been used to quantify fracture risk. 293 The 3D capabilities of QCT provide the additional benefit of discriminating between trabecular and cortical bone density, 294 and here again the calibration methods necessary for accurate bone mineral quantitation were developed by medical physicists. 294
The use of digital subtraction angiography (DSA) allows interventional radiologists to assess the anatomical constriction of a vessel, and from the DSA procedure, a quantitative measure of stenosis can be easily derived. DSA was developed by Mistretta and colleagues, [295][296][297] and is the worldwide standard for peripheral angiography today. Stenosis can be repeatedly measured during a revascularization procedure such as stent placement or angioplasty to monitor the success of the intervention and provide guidance to the interventionalist as to whether or not they have successfully dilated the vessel lumen.
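Two of the quantitative metrics in this section can be written down in a few lines. The sketch below uses the standard definition of the Hounsfield unit, in which water maps to 0 HU and air to -1000 HU, and the commonly used percent diameter stenosis. The numerical attenuation value for water in the usage lines is an illustrative assumption, the function names are hypothetical, and no HU threshold for calcification is implied.

```python
def attenuation_to_hu(mu, mu_water, mu_air=0.0):
    """Map a linear attenuation coefficient to Hounsfield units.

    By construction, mu_water maps to 0 HU and mu_air to -1000 HU.
    All coefficients must share the same units (e.g., 1/cm).
    """
    return 1000.0 * (mu - mu_water) / (mu_water - mu_air)


def percent_diameter_stenosis(min_diameter, reference_diameter):
    """Percent reduction of the lumen diameter relative to a normal segment."""
    if reference_diameter <= 0:
        raise ValueError("reference diameter must be positive")
    return 100.0 * (1.0 - min_diameter / reference_diameter)


# Illustrative values only: 0.19 1/cm is an assumed attenuation for water.
hu_water = attenuation_to_hu(0.19, mu_water=0.19)
hu_air = attenuation_to_hu(0.0, mu_water=0.19)
stenosis = percent_diameter_stenosis(min_diameter=1.5, reference_diameter=3.0)
```

Because the HU scale is a linear rescaling of attenuation, its accuracy in practice depends on scanner calibration, which is part of the quantitative-integrity role described in the text.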

IV.B. Quantitative metrics in functional imaging
Imaging modalities used in the traditional nuclear medicine department, including planar imaging 298 and SPECT, 299 compensate for their comparatively low spatial resolution by providing unique functional information concerning metabolism, pharmacokinetic uptake, and other biodistribution information. Nuclear medicine procedures became faster with the development of the Anger camera, 300 developed by medical physicist Hal Anger, and faster imaging in turn gave rise to quantitative kinetic studies. Physicists played an important role as nuclear medicine hardware started to incorporate computers, 301 a process that enabled quantitative imaging in nuclear medicine. These images, both in 2D (planar) and 3D (SPECT), can be quantitative [302][303][304] when calibrated appropriately and used for kinetic measurements of the heart including the ejection fraction, myocardial perfusion, and ventricular volume. 305 The vascular dynamics of SPECT imaging can also be applied in neuroradiology for brain perfusion in acute stroke. Medical physicists, working in collaboration with radiologists (e.g., Ref. 306), helped develop specific nuclear imaging procedures. Commercially available SPECT/CT systems are inspired by the early work of Hasegawa and colleagues 307,308 in building dual modality scanners.
FIG. 7. Computer/human interface for a multimodality workstation with computer outputs in numerical, pictorial, and graphical modes for both (a) mammography CADx output and (b) sonography CADx output.
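As one example of a kinetic measurement of the heart, the ejection fraction can be estimated from background-corrected counts in a ventricular region of interest at end diastole and end systole. The sketch below shows this count-based form; it assumes counts are proportional to ventricular blood volume and that a single background estimate applies to both frames. The function name and the numbers are illustrative.

```python
def count_based_ejection_fraction(ed_counts, es_counts, background_counts=0.0):
    """Ejection fraction from end-diastolic and end-systolic ROI counts.

    Counts are assumed proportional to ventricular blood volume after
    subtracting the same background estimate from both frames.
    """
    ed = ed_counts - background_counts
    es = es_counts - background_counts
    if ed <= 0:
        raise ValueError("background-corrected end-diastolic counts must be positive")
    return (ed - es) / ed


# Made-up counts: (1000 - 200) at end diastole, (600 - 200) at end systole.
ef = count_based_ejection_fraction(1000.0, 600.0, background_counts=200.0)
```

The strong effect of the background term on the result is one reason appropriate calibration and correction are needed before such values can be treated as quantitative.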
Positron emission tomography (PET) was invented by medical physicists 309,310 and used in the research setting for many years, but enjoyed widespread clinical assimilation in the United States only when reimbursement mechanisms became established. In the present form of PET/CT, developed by Townsend and colleagues, 311,312 this hybrid modality has revolutionized oncologic imaging and allows metabolic information (PET) to be evaluated along with high resolution anatomic information (CT). Furthermore, the inclusion of CT in the PET examination allows the PET image to be corrected for photon attenuation, 313 transforming PET imaging into a more quantitative modality.
Magnetic resonance imaging (MRI) was developed initially by spectroscopist Paul Lauterbur, 314 but most of the hardware development and refinement of pulse sequences was performed by medical physicists. [315][316][317][318] The biological effects of MRI were also studied early on by medical physicists. 319 Functional MRI (fMRI) is a tool widely used by psychiatrists and neurophysiologists to study the spatial and temporal aspects of cognition and emotion. Blood oxygenation level dependent 320 (BOLD) techniques are used to monitor blood flow in the brain while simultaneously providing an auditory, visual, or other sensory stimulus to the patient. These techniques were developed by medical physicists working with other scientists. 321,322 The techniques are quantitative because the activity maps that are generated from these studies use correlation and other more sophisticated statistical measures to map spatiotemporal patterns of brain activity specific to the sensory stimulus. While such techniques are used more for fundamental research than clinical evaluation, clinical applications such as mapping epileptic foci and surgical planning are becoming more common.
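The simplest form of the correlation-based activity map mentioned above can be sketched as a Pearson correlation between each voxel's time series and the stimulus time course. This is a deliberately minimal illustration: it ignores the hemodynamic response delay, drift, motion, and multiple-comparison corrections that real fMRI analyses handle, and the array layout and function name are assumptions.

```python
import numpy as np


def correlation_map(data, stimulus):
    """Pearson correlation of every voxel time series with a stimulus regressor.

    data: array of shape (n_timepoints, n_voxels)
    stimulus: array of shape (n_timepoints,), e.g., a 0/1 on-off block design
    Voxels with constant signal are assigned a correlation of 0.
    """
    d = np.asarray(data, dtype=float)
    s = np.asarray(stimulus, dtype=float)
    d_centered = d - d.mean(axis=0)
    s_centered = s - s.mean()
    denom = np.sqrt((d_centered ** 2).sum(axis=0) * (s_centered ** 2).sum())
    r = np.zeros(d.shape[1])
    valid = denom > 0
    r[valid] = (s_centered @ d_centered)[valid] / denom[valid]
    return r


# Toy example: one voxel follows the stimulus, one opposes it, one is flat.
stim = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
voxels = np.column_stack([2.0 * stim + 5.0, -stim, np.full(8, 3.0)])
r_map = correlation_map(voxels, stim)
```

Thresholding such a map, after appropriate statistical correction, is what yields the spatial patterns of stimulus-specific activity.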

V. EVALUATION OF CAD AND QUANTITATIVE IMAGE ANALYSIS SYSTEMS
Medical physicists have played important roles in developing methodology for evaluating image analysis systems, assessing variations between such systems, and conducting observer studies. In CAD evaluations, performance levels can be determined for the computer alone or for radiologists when they are using the system output as an aid in their interpretations. 323 CADe methods typically employ FROC curves to understand sensitivity versus average false-positive detections per image in assessing performance of the computer analysis, whereas CADx methods are evaluated using ROC analysis to assess computer performance in the task of distinguishing between malignant and benign lesions. 115,324-327 As decision making extends beyond two-class diagnoses, n-class classifiers will require appropriate measures of performance, and these efforts are also being led by medical physicists. [328][329][330][331] Furthermore, during Jiang et al.'s research on CADx of clustered microcalcifications, the investigators realized the need for a more relevant measure of performance, beyond the area under the ROC curve (AUC), in situations such as diagnostic workup in which a high level of sensitivity is crucial, and, thus, the partial area index was developed as demonstrated in Fig. 8. 332
The database characteristics, for example, in terms of size, lesion distribution, difficulty, as well as the integrity of the truth, can greatly influence the training and testing of a CAD algorithm. Various investigators have demonstrated the effect of different databases on mass or microcalcification detection performance using FROC analysis, 333,334 the effects of differences in scoring methods on the sensitivity and specificity of a CAD system, 62,335 the potential biases resulting from insufficient sample size and improper feature selection methods, 336-343 the effect of dominant features in the training of artificial neural networks, 221 and the effect of training and testing with similar images. 344 Studies of robustness in which CAD systems are evaluated across institutions and across manufacturers are necessary in the translation of the research to the clinical arena. 225,236,240,345 Additional investigations have focused on the computer performance on lesions not initially detected in screening programs. 61,346
Medical physicists have led the efforts of the LIDC (Lung Imaging Database Consortium) initiated by NCI. The LIDC has demonstrated and provided methods for the careful and necessary collection of images and relevant diagnostic information to enable CAD research. It has also considered various issues including the integrity of expert-defined "truth," radiologist variability in the identification of lung nodules on CT scans, and a comparison of different size metrics for pulmonary nodule measurements. 175,[347][348][349][350][351] Databases are only as good as the associated truth about the abnormality, whether it be the location of the lesion, biopsy results on malignancy/benignity, or consensus opinions. In the development of CADe systems for lung nodule detection, for example, different investigators have used different "truths," and have trained and evaluated systems with images of cancerous lung nodules, with all types of lung nodules (both malignant and benign), and/or with any "actionable regions," i.e., regions that are suspicious enough to cause further examinations or diagnostic actions.
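The ROC-based measures discussed in this section can be sketched empirically as follows. The code builds an empirical ROC curve from classifier scores, computes the AUC by the trapezoidal rule, and computes a partial area index as the area between the curve and the line TPF = TPF0, integrated over the TPF axis and normalized by (1 - TPF0), so that a perfect classifier scores 1. This is a nonparametric illustration under those assumptions, not a reproduction of the fitted-curve analysis in Ref. 332; the function names are hypothetical.

```python
import numpy as np


def roc_points(scores, labels):
    """Empirical ROC operating points from (0, 0) to (1, 1).

    labels: 1/True for malignant, 0/False for benign; higher score = more suspicious.
    """
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=bool)
    n_pos, n_neg = y.sum(), (~y).sum()
    fpf, tpf = [0.0], [0.0]
    for threshold in np.unique(s)[::-1]:     # sweep from strict to lax
        called = s >= threshold
        tpf.append((called & y).sum() / n_pos)
        fpf.append((called & ~y).sum() / n_neg)
    return np.array(fpf), np.array(tpf)


def auc(fpf, tpf):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))


def partial_area_index(fpf, tpf, tpf0):
    """Normalized area of (1 - FPF) over the sensitivity range [tpf0, 1]."""
    area = 0.0
    for f1, t1, f2, t2 in zip(fpf[:-1], tpf[:-1], fpf[1:], tpf[1:]):
        if t2 <= tpf0 or t2 == t1:
            continue                          # segment below tpf0 or horizontal
        if t1 < tpf0:                         # clip the segment at TPF = tpf0
            f1 = f1 + (f2 - f1) * (tpf0 - t1) / (t2 - t1)
            t1 = tpf0
        area += (1.0 - 0.5 * (f1 + f2)) * (t2 - t1)
    return area / (1.0 - tpf0)


# Made-up scores for two malignant and two benign lesions.
fpf, tpf = roc_points([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])
area_full = auc(fpf, tpf)
area_partial = partial_area_index(fpf, tpf, tpf0=0.9)
```

Restricting the index to the high-sensitivity range reflects the clinical situation described above, in which operating points with low sensitivity are not acceptable regardless of their specificity.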
Ultimate evaluation of CAD involves evaluating the performance of radiologists using the computer output as an aid (i.e., in observer studies or clinical trials). Various observer studies have been cited throughout this article. With observer studies, researchers aim to mimic the interpretation task on a database that might be enriched with a higher prevalence of cancer cases. Radiologists' performance with CAD systems has been compared to double reading by humans. [352][353][354] While results on performances are obtained from observer studies, ultimately clinical trials need to demonstrate efficacy of CAD systems.
It is important to realize that with CADe systems, which are focused on screening programs in which most cases will be normal, a large number of cases will be necessary for there to be sufficient power to demonstrate an actual improvement. Jiang et al. have reported that to detect an increase of one additional cancer per reader per 1000 screening mammograms with 80% power, a trial with a new modality (such as CADe) would need at least 25 radiologists, who would each read at least 8,000 screening mammograms. 355 In addition, the measure of performance selected may also affect the overall conclusion, as noted by Horsch et al. in the analysis of performances in terms of radiologists' reported probability of malignancy, in terms of BI-RADS ratings, and in terms of the patient management decision to biopsy or not. 356
Evaluation studies on quantitative image analysis include additional needs since the absolute metrics such as tumor volume or blood flow must be correlated with actual physical conditions. Studies, such as clinical trials for therapeutic response or drug discovery, require careful standardization of the imaging protocol. In an effort to develop consistent and quality-controlled imaging protocols, uniform protocols for imaging in clinical trials (UPICT) (http://upict.acr.org/) was formed. Validation of quantitative imaging metrics has requirements similar to those for CAD methods in that verification of "truth" and adequate statistics are necessary.

VI. CHALLENGES, LESSONS LEARNED, AND THE FUTURE
The first FDA-approved CAD system, in 1998, was for computer-aided detection in screening mammography. While the detection of many types of cancers lends itself to CADe due to the potential of oversight "errors" in a screening population of many normal cases, mammography was a good imaging exam for commercialization of CADe, since screening mammography is basically a dedicated imaging protocol, i.e., there is no other major "disease" or "incidental findings" for which the exam is ordered. The multitude of potential diseases/conditions presenting on a chest radiograph, combined with the inconvenience of film digitization for just one CAD task, slowed the research on CAD for lung cancer. However, CADe on thoracic CT (and on digital radiography) appears to be thriving, as image data are now primarily digital and the display of the computer output can be activated by a software button.
The potential for CADe is being explored for many other modalities and diseases. Examples include detection of pulmonary embolism 357,358 and hepatocellular carcinoma on CT scans, 359 coronary artery diseases on cardiac CT angiograms, 360 urinary tract cancer in CT images, 361 masses on breast ultrasound images, 236 vertebral fracture on lateral chest radiographs, 362 brain tumor or intracranial aneurysm on MR angiograms, 363,364 retinography, 365 and detection of tumor change on whole-body nuclear medicine scans. 366 Although CADe developments in these and other areas are still at an early stage, researchers can be expected to continue to expand their interests and efforts in CADe to various areas of application.
Although the clinical community is accepting CADe into practice, challenges exist for the current CADe systems and the development of new CADe systems. CADe systems may suffer from high FP rates. Most FPs might be dismissed by radiologists easily but some might require unnecessary work up. Some radiologists are concerned with the medicolegal consequences if CAD marks that are not worked up turn out to be malignant. Improving the CADe algorithms to reduce FPs is a constant goal for CAD developers. To develop CADe for a new area, the most difficult step is often the collection of a large database, representative of the patient population, for training and testing the CADe algorithms. Furthermore, whether a CADe system is useful as an aid to radiologists should be evaluated in prospective clinical trials. As discussed above, the outcomes of a clinical trial may be influenced by the study design and other human factors such as radiologists' experience and vigilance, and their response to the CAD marks, in addition to the performance of the CADe system. Understanding these issues will be important for the study of the impact of CADe on medical diagnosis and for motivating radiologists to take best advantage of CADe in clinical practice.
The incorporation of CAD into new imaging modalities will become commonplace. Just as computers continue to be an integral part of our lives, so will they grow in medical imaging. CAD is now an integral component in most major medical imaging meetings and numerous CAD papers are published in journals such as Medical Physics each year. CAD will play an important role in the process of medical image interpretation and become an indispensable component in diagnostic imaging in the not-too-distant future. Various medical physicists are continuing to expand the role of CAD beyond computerized detection and diagnosis, such as in assessing cancer risk, [367][368][369][370] risk of osteoporosis, [371][372][373][374][375] and potential occurrence of osteolysis. 376 Extensions of techniques developed for CAD are expected to also play a role in measurements of response to therapy, such as in assessing changes in tumors following chemotherapy 377,378 and in the quantitative analysis on thoracic CT in the assessment of mesothelioma. 379 Furthermore, as new imaging modalities make available more and more data for interpretation, inclusion of CAD will be a necessity. Examples include assessment of multiple disease states in thoracic/abdominal CT [380][381][382][383] and improved analysis of cardiac CT. 384,385
As emphasized elsewhere, 386 the use of quantitative information that is accessible through imaging is predicted to dramatically increase over the next decade. Several factors lead to this observation. (1) We are in the postdigital image era, and virtually all image data are interpreted in digital format by radiologists using an imaging workstation (a computer). Thus, the image data are readily available for computerized assessment by interpreting physicians. (2) Radiography and other planar imaging modalities are slowly giving way to tomographic imaging, in CT, PET, SPECT, ultrasound, and MRI. Tomographic images provide a much richer data set in which spatial and other geometric parameters can be quantified. (3) The scan times for most modalities are steadily decreasing, giving rise to temporal imaging protocols (which provide image data at two or more time points) for the assessment of time-dependent physiological parameters such as blood flow, perfusion, permeability, and other velocity-based metrics. (4) Medicine is in a state of transition from a practice-based specialty to one in which diagnoses and treatment decisions are evidence-based. This change will place more emphasis on quantitative parameters as diagnostic endpoints.
Medical physicists will have a very important role to play in this future landscape of quantitative imaging, by validating the quantitative integrity of scanners and developing imaging techniques, and image processing tools, which provide quantitative data in a more automated and accurate fashion. While the medical physicist played an essential and undeniable role in the development of imaging systems over the past 50 years, as imaging systems become more complex and the need for better and more accurate quantitative information from images grows, the role of the physicist will be even more important in the next 50 years.
The future includes the combined research efforts from physicists working in CAD with those working on quantitative imaging systems to readily yield information on morphology, function, molecular structure, and more, from animal imaging research to clinical patient care.

ACKNOWLEDGMENTS
Maryellen Giger is grateful for the many fruitful discussions with the faculty and research staff in the Department of Radiology and Committee on Medical Physics at the University of Chicago. Certain parts of the chapter are the result of research supported in parts by USPHS grants from NCI, NIBIB, and NIAMS, as well as from the U.S. Army Breast Cancer Research Program, the American Cancer Society, the Whitaker Foundation, and The University of Chicago Cancer Research Center. M. Giger is a stockholder in R2 Technology, a Hologic Company (Sunnyvale, CA). It is the University of Chicago conflict-of-interest policy that investigators disclose publicly actual or potential significant financial interests that may appear to be affected by the research activities. Heang-Ping Chan is grateful for the efforts by the faculty and researchers in the Department of Radiology and the CAD Research Laboratory at the University of Michigan. Certain parts of the chapter are the result of research supported in parts by USPHS grants from NCI and NIBIB, as well as from the U.S. Army Breast Cancer Research Program. John Boone was funded in part by a grant from the NIH (R01 EB002138).