In human medical imaging, the performance of the monitor used for image reporting has a substantial impact on the diagnostic performance of the entire digital system. Our purpose was to compare the display quality of different monitors used in veterinary practice. Two medical-grade gray scale monitors (one cathode-ray tube [CRT], one liquid crystal display [LCD]) and two standard consumer-grade color monitors (one CRT, one LCD) were compared in their ability to display anatomic structures in cats. Radiographs of the stifle joint and the thorax of 30 normal domestic shorthair cats were acquired by use of a storage phosphor system. Two anatomic features of the stifle joint and five anatomic structures of the thorax were evaluated. The two medical-grade monitors had superior display quality compared with the standard PC monitors. No differences were seen between the two monochrome monitors. In comparison with the color CRT, the ratings of the color LCD were significantly worse. The ranking order was uniform across both regions and all criteria investigated. Differences in monitor luminance, bit depth, and screen size were presumed to be the reasons for the observed differences in performance. These differences emphasize the need for guidelines defining minimum requirements for the acceptance of monitors and for quality control in veterinary radiography.
In digital radiography the imaging chain comprises four separate technical steps: signal acquisition, signal processing, image archiving, and image presentation. The performance of a digital radiography system depends on the interplay of those interdependent parts.1,2 There has been a transition of image presentation from reviewing images on film, so-called hard copy viewing, to reviewing images on computer monitors, or soft copy viewing. Soft copy viewing offers advantages over hard copy reading because the image can be adjusted on-line. The option to use the entire spectrum of attenuation differences recorded by the detector means that more information is available.3–5 Furthermore, with soft copy viewing, zooming and measurement tools are available and the costs of film, film processing, and hard-copy image storage and retrieval are eliminated.3,4,6 In the human medical profession, the transition from hard copy to soft copy viewing was not instantaneous and was based on substantial prior work in the field of medical monitor displays and workstation technology. Historically, soft copy viewing in the human medical profession was affected by limited monitor performance. New generations of monitors offered better display properties. Gray scale cathode-ray tube (CRT) monitors and, later on, gray scale liquid crystal displays (LCD) became the display media of choice for medical images. Differences in monitor performance can influence the display quality and consequently the overall final diagnosis.7,8
To ensure a high and consistent level of image display quality in human medical practice, guidelines exist that define the minimum technical prerequisites for monitors and methods of quality assurance.1,9 To our knowledge, comparable regulations for veterinary radiology do not exist. Because the price of monitors specifically designed for medical imaging can exceed the price of a standard computer monitor by a factor of ten, consumer-grade monitors often are used in veterinary practice. From a radiation safety standpoint, poor monitor selection or inadequate calibration violates the ALARA (as low as reasonably achievable) principle. At worst, veterinary personnel receive occupational exposure to create an image that cannot be evaluated adequately because of an unacceptable monitor.
This study was motivated by the uncertainty of whether specific information dealing with monitor evaluation in the human medical profession is applicable to veterinary radiology. Our aim was to compare the display quality of selected monitors on the basis of subjective assessment of the appearance of anatomic structures in feline radiographs. We hypothesized that monitors recommended for primary image interpretation in human radiology offer superior display properties in feline radiographs as well. Furthermore, due to the variability in object contrast and size, differences in the ratings between the selected criteria could be expected.
Material and Methods
Four types of monitors were evaluated (Table 1). The two monochrome displays represented medical-grade devices consistent with national standards.10–12 The color displays were standard consumer-grade monitors. At the beginning of each reading session the settings of the gray scale monitors were rechecked. Brightness and contrast of the color monitors were adjusted to the achievable optimum with the help of a SMPTE RP-133 test pattern.13 Each monitor was driven by the graphics card of its own computer.
Table 1. Technical Specification of the Monitors
|Type||Gray Scale CRT||Gray Scale LCD||Color CRT||Color LCD|
|Manufacturer labeling||Philips 21 CY9*||Totoku ME 181L†||ADI Microscan PD959‡||Fujitsu Siemens Amilo A§ (laptop)|
|Physical size (in.)||21||18.1||19||15.1|
|Matrix||1280 × 1024 (1.3 MP)||1280 × 1024 (1.3 MP)||1600 × 1200 (1.9 MP)||1024 × 768 (0.8 MP)|
|Dot pitch (mm)||0.35||0.28||0.24||0.30|
|Maximum luminance (cd/m²)||650||700||120||200|
|Operating luminance (cd/m²)||250||360||100||200|
|Graphics card (type)||SUN 81-76||Matrox Millennium P650 PCIe 128M||NVIDIA GeForce 7300LE||ATI IGP 320M|
|Calibration DICOM GSDF||Yes||Yes||No||No|
Under identical exposure conditions 30 lateral stifle joint radiographs and 30 right lateral thoracic radiographs of 30 anesthetized normal domestic shorthair cats older than 1 year were acquired. General anesthesia was required for reasons other than for radiography, e.g. castration, removal of orthopedic implant, and treatment of a dental disease.
The radiographs were made using a storage-phosphor system* on a Bucky table† (Table 2). Processing was uniform within each region: a dynamic range reconstruction algorithm was applied to the images of the stifle joint and unsharp mask filtering to the images of the thorax. In pre-studies, the parameters of these processing algorithms were optimized with regard to detail visibility (Table 3).
Table 2. Exposure Technique
|Type||Philips Bucky Diagnost|
|Filtration||2.5 mm Al|
|Focus size||0.6 × 0.6 mm²|
|Storage phosphor system|
|Reader||Philips AC 500|
|Spatial frequency||5 lp/mm|
|Detective quantum efficiency (70 kVp, 1 lp/mm)||21%|
|Exposure conditions||Stifle joint||Thorax|
|Focus-to-detector distance||110 cm||110 cm|
|Field size||10 × 8 cm²||15 × 21 cm²|
|Tube potential||44 kVp||52 kVp|
|Tube current–time product||8 mAs||6.3 mAs|
|Exposure time||36.0 ms||21.6 ms|
|Dose–area product||0.9 cGy × cm²||3.5 cGy × cm²|
Table 3. Image Processing Parameters
| Processing algorithm: dynamic range reconstruction|
| Contrast equation||0.8 (Kernel size: 135)|
| Contour sharpness||1.00 (Kernel size: 3, Curve type: F)|
| Processing algorithm: unsharp mask filter|
| Gradient amount (GA)||1.17|
| Gradient type (GT)||E|
| Gradient center (GC)||1.80|
| Gradient shift (GS)||± 0.28|
| Frequency rank (RN)||9|
| Frequency type (RT)||U|
| Frequency enhancement (RE)||1.4|
| Kernel size||5|
The investigation was designed as an observer performance study. In an absolute visual grading analysis (VGA) study, each observer states an opinion on the visibility of a certain feature on the basis of an absolute scale without reference pictures.14 The images were evaluated independently on the various monitors by four observers with a minimum of 3 years of experience with digital radiography (one board-certified radiologist, three residents of a national specialization program). Two features of stifle radiographs and the appearance of five anatomic structures of the thorax were scored on the basis of a four-point scale (4, excellent; 3, average; 2, borderline sufficient; 1, insufficient) (Fig. 1). The observers were trained for their task using a separate set of images. Consistent with the practical routine of image reading, the observers were encouraged to apply the entire workstation functionality‡ to extract as much information as possible. Evaluation time per image was unlimited. To ensure uniform ambient conditions all workstations were placed in the same reading room. The ambient light and other conditions of the viewing environment fulfilled the requirements for medical image interpretation.11,15 Lighting was indirect, and illuminance at the monitor surface was <100 lx. Observers were unaware of the animal identification.
Figure 1. Definition of criteria for the depiction of diagnostically relevant anatomic structures in the evaluation of image quality
|Radiological structures||Anatomical criteria|
|(A) Stifle joint|
| Bone||Identification of the subchondral borders (black arrows), discrimination between trabecular and compact bone (black circles), delineation of the patella, fabella(e), and popliteal sesamoid (black asterisks)|
| Soft tissue||Demarcation of the infrapatellar fat pad (open white triangles), and extraarticular soft tissue structures (closed white triangles: muscle contours; white arrows: patellar ligament)|
|(B) Thorax|
| Trachea||Discrimination of trachea and principal bronchi from the adjacent mediastinum|
| Cranial lung field||Visibility of small vessels (white arrows) in the cranial lung field|
| Sternum||Visibility of the border and the architecture of the sternebrae|
| Cardiac silhouette||Identification of the caudal border of the cardiac silhouette|
| Caudodorsal thoracic field||Rendition of the aorta (open triangles), the caudal vena cava (closed triangles), pulmonary vessels (arrows), and contour of the diaphragm|
Because the image quality and visibility of the anatomic structures in this study were rated subjectively, an assessment of consistency was desirable to get an impression of the objectivity and reliability of the image evaluation. Kappa statistics were not applicable because of the multivariate character of the data, caused by the number of observers and rating categories. Instead, Spearman's rank correlation was performed for all criteria of all observer combinations. The level of significance was calculated for each correlation. A significant positive correlation indicates a high level of consistency between observer ratings.
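The pairwise consistency check described above can be illustrated with a minimal sketch. It assumes that two observers rated the same images on the four-point scale, and it computes Spearman's rho as the Pearson correlation of tie-averaged ranks. The function names and the example ratings are hypothetical and are not data from the study; the significance test is not shown.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks.

    Undefined (division by zero) if either observer gave a constant rating.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings of eight images by two observers (not study data).
observer_a = [4, 3, 3, 2, 4, 1, 2, 3]
observer_b = [4, 3, 2, 2, 3, 1, 2, 4]
rho = spearman_rho(observer_a, observer_b)
```

Because a four-point scale produces many ties, the tie-averaged ranks matter here; the shortcut formula based on squared rank differences is only exact when there are no ties.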
Median and average absolute deviation of the ratings were calculated for each anatomic structure. Visual grading characteristic (VGC) analysis was applied to analyze the data of the VGA study. In principle, VGC analysis treats the scale steps as ordinal and makes no assumptions about the distribution of the data. It handles visual grading data in a fashion similar to receiver operating characteristic (ROC) data. The area under the curve (AUCVGC) is a measure of the difference in image quality between two modalities. A curve equal or close to the diagonal, equivalent to an AUCVGC of about 0.5, indicates equality between the monitors compared (Fig. 2).16
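To make the idea of the VGC curve concrete, the following sketch builds empirical operating points from two sets of ordinal ratings and integrates them with the trapezoidal rule. This is a simplification chosen for illustration; the study does not state how its curves were fitted, and the function names and example ratings are hypothetical.

```python
def vgc_points(ratings_a, ratings_b, scale=(1, 2, 3, 4)):
    """Empirical VGC operating points.

    For each threshold on the rating scale, plot the fraction of ratings at
    or above the threshold for monitor B (x) against monitor A (y).
    """
    points = [(0.0, 0.0)]
    for threshold in sorted(scale, reverse=True):
        frac_b = sum(r >= threshold for r in ratings_b) / len(ratings_b)
        frac_a = sum(r >= threshold for r in ratings_a) / len(ratings_a)
        points.append((frac_b, frac_a))
    return points  # the last point is always (1.0, 1.0)


def auc_vgc(ratings_a, ratings_b, scale=(1, 2, 3, 4)):
    """Trapezoidal area under the empirical VGC curve.

    0.5 means no difference; values above 0.5 favor monitor A.
    """
    pts = vgc_points(ratings_a, ratings_b, scale)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical ratings for one criterion on two monitors (not study data).
monitor_a = [4, 4, 3, 3, 3, 2, 4, 3]
monitor_b = [3, 2, 2, 3, 1, 2, 3, 2]
area = auc_vgc(monitor_a, monitor_b)
```

With identical rating distributions the points fall on the diagonal and the area is 0.5, which matches the interpretation given in the text.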
Figure 2. The visual grading characteristic (VGC) curve (right) from the data of the ratings for the criterion “cranial lung field” for the color cathode-ray tube (CRT) and the color liquid crystal display (LCD) (left). The boxes represent the operating points corresponding to the observers' interpretation. The area under the curve (AUCVGC) differs significantly from 0.5, indicating a superior display quality of the color CRT (A) in comparison with the color LCD (B).
The analysis was based on a total of 3360 individual observer decisions. For the individual readers (E.L., A.W., C.B., K.G.) the mean values of the correlation coefficients averaged over the two criteria of the stifle joint images were 0.73, 0.74, 0.66, and 0.64, respectively. The mean values over the five structures of the thoracic images were 0.75, 0.67, 0.71, and 0.65, respectively. Because of the sustained high level of significance of the underlying correlations (P≤0.001) the subsequent VGC analysis was based upon the pooled data of the four observers (Table 4).
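The total of 3360 decisions follows from the study design if every observer rated every feature of every cat on every monitor. The short sketch below reproduces that arithmetic under this fully crossed assumption; the variable names are illustrative only.

```python
# Design figures taken from the Material and Methods section.
cats = 30
features = 2 + 5   # two stifle joint features plus five thoracic structures
monitors = 4
observers = 4


def total_decisions():
    """Number of individual ratings in a fully crossed design."""
    return cats * features * monitors * observers


# Ratings pooled per monitor and feature once the observers are combined.
ratings_per_monitor_and_feature = cats * observers

print(total_decisions())                  # 3360
print(ratings_per_monitor_and_feature)    # 120
```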
Table 4. Spearman's Rank Correlation Coefficients of the Individual Observers
|Structure||E.L.||A.W.||C.B.||K.G.|
| Soft tissue||0.70||0.73||0.60||0.57|
| All structures||0.73||0.74||0.66||0.64|
| Cranial lung field||0.83||0.78||0.77||0.57|
| Cardiac silhouette||0.71||0.67||0.66||0.55|
| Caudodorsal thoracic field||0.76||0.67||0.74||0.73|
| All structures||0.75||0.67||0.71||0.65|
The results of the evaluation of the image quality are summarized in Table 5 and in Fig. 3. The data revealed a completely uniform ranking order of the four monitors over all features evaluated. The two medical-grade monitors offered clearly superior display quality. There were no significant differences in ratings between those two monitors. In comparison, the display quality of the color CRT was inferior. This was characterized by both lower median values and significant differences in the corresponding AUCVGC values based on the direct comparisons. The median rating for the color LCD was lower than that for the color CRT in four of the seven features. With the color LCD, insufficient quality (grade 1) was recorded for five criteria. With the color CRT, this was seen for only one of the seven features. With the gray scale monitors there was no grade 1 rating, and grade 2 ratings were documented for only a single feature.
Table 5. Tabulated Results of the Ratings
| ||Stifle Joint||Thorax|
| ||Bone||Soft Tissue||Trachea||Cranial Lung Field||Sternum||Cardiac Silhouette||Caudodorsal Thoracic Field|
|Gray scale CRT|
| Number of ratings|
| Average absolute deviation||0.22||0.18||0.17||0.19||0.07||0.10||0.20|
|Gray scale LCD|
| Number of ratings|
| Average absolute deviation||0.27||0.20||0.30||0.32||0.14||0.18||0.31|
|Color CRT|
| Number of ratings|
| Average absolute deviation||0.37||0.30||0.32||0.43||0.47||0.47||0.32|
|Color LCD|
| Number of ratings|
| Average absolute deviation||0.55||0.38||0.57||0.58||0.25||0.50||0.45|
The quality of a digital radiographic image is limited by the weakest part of the imaging chain. We found obvious differences between various monitors with regard to the rendering of anatomic detail in feline radiographs. Image quality was better when using a monitor that met defined criteria regarding primary image evaluation in human medical practice.
A comparison of our findings with results from other studies performed with either clinical images from humans or phantoms was hampered for two major reasons. One was that the monitors investigated diverged substantially in their technical properties. The second related to the vastly different target structures. Thus, it is not surprising that a number of human studies described differences in monitor performance8,9,17–19 while others reported equal display quality.20–23
The quality of a monitor is determined by the interplay of several factors, such as screen size, pixel size, luminance, contrast ratio, and bit depth. Further characteristics such as phosphor type, gray scale versus color display, glare and reflection characteristics, and display calibration are important as well.1,9,24,25 The importance of brightness and spatial resolution has been emphasized.9,22,23,25,26 It is likely that the superior performance of medical-grade monitors in our study was related primarily to the ability of the monitors to display more shades of gray. Luminance was three to five times higher and therefore different look-up tables were applied. The main advantage of high luminance is that it is easier to see the entire gray scale from white to black in an image. Brighter monitors always yield better perceived contrast.1,25 Furthermore, the gray scale monitors were able to display 1024 shades of gray (10 bit) compared with 256 shades (8 bit) for the color monitors, which also improved gray scale rendition. An additional benefit of the medical-grade gray scale monitors was that they were calibrated to the DICOM part 14 Grayscale Standard Display Function (GSDF).27 The aim of the calibration was to obtain consistent presentation on all displays by distributing the total contrast of the display across the entire gray scale. As a result, objects were presented with the same contrast regardless of whether they were located in dark or bright parts of the image.28
In general, the pixel size of a monitor is an important quality factor. However, it is not very likely that the differences in results found in this study were related to the pixel size. Basically, to ensure adequate resolution, the matrix of the monitor should be as close as possible to the matrix of the preprocessed image data. Alternatively, high resolution is attainable with the magnification function.1,22,29 The spatial frequency of the applied storage phosphor system was 5 line-pairs/mm. According to the Nyquist theorem, which requires two pixels per line pair, this corresponded to a detector pixel size of 1/(2 × 5 mm⁻¹) = 0.1 mm (100 μm). The pixel size of the monitors included in the study ranged from 0.24 to 0.35 mm. Consequently, none of the monitors was able to display the exposed field of the thoracic radiographs of 15 × 21 cm² in the original resolution without the use of the magnification function. Because the observers used the zooming function, it is unlikely that the differences in monitor pixel size had a significant influence on the results. However, such an influence cannot be ruled out completely because the larger monitors allowed for a higher magnification. Beyond that, other factors, such as monitor technology (CRT vs. LCD; gray scale vs. color) or graphics card could also have contributed to the differences.18,24,30
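The resolution argument above can be checked with a few lines of arithmetic. The sketch below derives the Nyquist-limited pixel size from the 5 lp/mm figure, converts the thoracic field size into detector pixels, and compares the result with the monitor matrices from Table 1. It assumes the whole screen is available for the image and ignores workstation interface elements; the function and variable names are illustrative.

```python
def nyquist_pixel_mm(lp_per_mm):
    """Largest pixel size that still samples the given spatial frequency
    (two pixels per line pair)."""
    return 1 / (2 * lp_per_mm)


def field_in_pixels(field_cm, lp_per_mm):
    """Field size in detector pixels; integer arithmetic avoids rounding."""
    return tuple(int(side_cm * 10 * 2 * lp_per_mm) for side_cm in field_cm)


def fits(image_px, matrix_px):
    """True if the image fits the display matrix in either orientation."""
    w, h = image_px
    mw, mh = matrix_px
    return (w <= mw and h <= mh) or (h <= mw and w <= mh)


# Monitor matrices as listed in Table 1.
MONITOR_MATRIX = {
    "Gray scale CRT": (1280, 1024),
    "Gray scale LCD": (1280, 1024),
    "Color CRT": (1600, 1200),
    "Color LCD": (1024, 768),
}

pixel_mm = nyquist_pixel_mm(5)             # 0.1 mm
thorax_px = field_in_pixels((15, 21), 5)   # (1500, 2100)
full_resolution = {name: fits(thorax_px, matrix)
                   for name, matrix in MONITOR_MATRIX.items()}
```

Under these assumptions every entry in `full_resolution` is `False`, consistent with the statement that the thoracic field could not be shown at original resolution on any of the four monitors without magnification.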
For the study, monitors that are commonly used in veterinary practice were chosen. They were selected on the basis of different technologies (CRT vs. LCD; gray scale vs. color) and physical properties (e.g. luminance, contrast ratio, spatial resolution, bit depth). Because some vendors advertise the use of notebook computers for primary interpretation, such as in mobile practice, a standard laptop display was included. There are significant price differences between medical-grade and consumer-grade monitors, making the cheaper standard PC monitors appear attractive. For example, the current price of a 21 in. medical-grade gray scale LCD monitor is approximately €4000, whereas a consumer-grade color LCD of the same size ranges between €120 and €400. Because of those price differences, monitor recommendations for human medical practice are based on the diagnostic purpose for which the workstation is to be used. Basically, expensive high-quality monitors are recommended for image interpretation, whereas less expensive monitors with lower performance can be used for image viewing without the need for an immediate final diagnosis. Therefore two categories of monitors are currently distinguished: monitors for interpretation of medical images for rendering a clinical diagnosis, termed primary or diagnostic monitors, and monitors for viewing medical images without the requirement for clinical interpretation, e.g. for viewing images by medical staff or specialists other than radiologists after an interpretive report has been rendered, termed secondary or nondiagnostic monitors.11,12,31–34 Within both of these categories, the minimum specification differs depending on the application, e.g. thorax, skeleton, and mammography.
Beyond that, it was proposed to expand this limited classification range to match the full range of applications more precisely.34 When a single workstation will be used for multiple applications, the monitor specification has to match the highest level needed.11 Accordingly, in our study, the monitor would have to fulfill the requirements for reporting thoracic images. The two gray scale monitors met the national requirements for primary image interpretation for both thoracic and skeletal image interpretation in the human medical field11 while the two consumer-grade monitors did not. The quality of the color CRT was acceptable for secondary viewing. The color LCD was inadequate even for secondary image review.
Thoracic and joint structures of the cat were selected because they are small objects with a wide spectrum of attenuation differences. In the thorax, motion caused by respiration has to be considered to avoid loss of evaluability. In the human medical profession, comparably challenging conditions are restricted to neonatal radiography.35 As in pediatric radiology, it was assumed that the chosen regions placed high demands on all parts of the imaging chain to display structures of interest with high diagnostic quality.36 The rating differences seen were unrelated to either the region or the target structure evaluated. This was somewhat unexpected, because the display quality of low-contrast structures such as the cranial lung field and caudodorsal thoracic field in the thoracic radiographs, and soft tissue in the stifle images, was theoretically more dependent on monitor properties related to the display of luminance and contrast than that of target structures with higher attenuation differences.37 However, other results from phantom20,23 and clinical human studies19,23 agree with ours.
The major drawback of this study was the small number of monitors investigated. We recognize that the entire spectrum of monitor quality could not be addressed. Furthermore, more sophisticated monitors are continually becoming available. There has been a trend away from CRT monitors to LCD flat panel displays.18 We did not evaluate large screen color LCD monitors. Large screen color LCD monitors with high brightness and resolution (display diameter: ≥20 in., maximum luminance: ≥200 cd/m², matrix: ≥2 MP) can perform similarly to gray scale monitors in human medical practice.21–23 The ability of large screen LCD monitors to display subtle structures in small animals has not been characterized. Also, monitors cannot be evaluated without considering the associated graphics card. Our study design was inadequate to assess the effect of the graphics card independently.
Another limitation might be that anatomic structures were evaluated instead of pathologic lesions. It was assumed that the ability to detect pathologic changes is related to accurate anatomic presentation.14,38 In contrast to pathologic structures, anatomic landmarks have a more uniform appearance. Consequently it is easier to evaluate the quality of their radiographic presentation for a meaningful interpretation. In human medical practice, the quality of reproduction of anatomic structures is the basis of established standards of quality assurance.39,40 In observer performance studies dealing with comparative evaluation of the quality of clinical images, these criteria are reliable measurement instruments.41–43 Despite the lack of comparable standards in veterinary radiology, quality criteria can be deduced from generally accepted paradigms of image interpretation.44–47 Once the requisite level of radiographic rendition of a diagnostically relevant criterion has been verbalized, the description can be applied to observer performance studies in general and to VGA studies in particular.36 However, some have argued that it is more difficult to identify existing pathologic changes.19,48 Accordingly, the results of our study could be considered too optimistic. Such an interpretation underlines the need for high-quality monitors even more strongly. Another limitation was that it was not possible to hide the monitor type from the observers. Consequently, preferences of individual observers could not be excluded. However, the consistent positive correlation of ratings between observers weakens this argument.
In summary, we have shown that the performance of the monitor used for soft-copy interpretation influences image interpretation significantly. Monitor quality is a critical element within the imaging chain in small animal radiology. Deviation from high-quality monitors is accompanied by a loss of information. From a radiation safety standpoint, such a loss should not be tolerated because it represents a violation of fundamental radiation safety principles. Consequently, guidelines are needed that define minimum requirements for devices used for soft-copy interpretation in veterinary radiology. Because of similarities of many target structures and the needed quality for their radiographic presentation, guidelines for acceptance and quality testing of display devices in human medical imaging could be followed in veterinary medicine.