Iron overload is associated with disorders such as β-thalassemia major (TM), thalassemia intermedia (TI), sickle cell disease (SCD), and myelodysplastic syndrome. The precise and accurate assessment of the body iron load is essential to ensure that tissue iron concentrations are maintained within clinically acceptable limits. The measurement of liver iron concentration (LIC) has been shown to provide an accurate estimate of total body iron stores in TM patients (1).
There are several methods available for measuring LICs. These include chemical assay of liver biopsy specimens, biomagnetic liver susceptometry using superconducting quantum interference device technology, computed tomography of the liver, and magnetic resonance imaging (MRI) of the liver. MRI is emerging as the method of choice because it is noninvasive, safer, and generally less expensive to perform than liver biopsy, suffers less from sampling variability, can be repeated more frequently than liver biopsy, and, in expert hands, has been shown to have good reproducibility. MRI is also more widely available than biomagnetic liver susceptometry.
Several MRI-based methods have been developed to measure LIC (2–7). The most widely used are relaxometry methods based on measuring R2 (4) or R2* (2). For measuring liver R2, the spin-density-projection-assisted (SDPA) method of St Pierre et al. (4) (also known as FerriScan®) has been widely adopted and has been approved by regulatory authorities in the USA, Canada, Europe, and Australia.
SDPA R2-MRI has been clinically proven to have high sensitivity and specificity for the measurement of LIC over the entire range encountered in clinical practice (from 0.3 to >43 mg Fe (g dry tissue)−1) and its results are not impacted by the presence of fibrosis (4). However, one current limitation of the SDPA R2-MRI method is the length of time it takes to acquire the MRI images. The current method for obtaining data for liver R2-MRI uses single spin echoes with a pulse repetition time (TR) of 2500 ms and spin echo times (TE) of 6, 9, 12, 15, and 18 ms (with no breath holding). Depending on the size of the patient, it normally takes between 20 and 30 min for the required MR images to be obtained. A shorter scan time would be very attractive to patients, clinicians, radiologists, MRI centers, and health authorities. Furthermore, patients will more easily tolerate the shorter scan time, which could potentially lead to better quality MRI images being obtained.
Demand for access to MRI scanners in most countries around the world is very high, and the cost imposed on patients or health care systems is generally related to the amount of time patients are required to spend in the MRI scanner. Therefore, faster MRI data acquisition is better for both practical and economic reasons, assuming that there is no significant impact on the accuracy and precision of the resulting measurements.
For any measurement technique, the measurement result is dependent on both the system being measured and the technique used. A change in the measurement technique may alter both the measured value for a given system (a systematic change) and the magnitude of the random error. As such, measurements of liver R2, like any other tissue property measured with MRI, will be protocol specific. For example, in the case of the measurement of tissue R2 by single spin echoes, the value of TR used in the measurement protocol has the potential to systematically shift the R2 value as measured by the rate of decay of signal with TE (the degree of shift depending on the value of T1 for the tissue). Hence, for the purposes of LIC measurement, it is important to ensure that the calibration curve used to relate R2 to LIC is specific to the protocol used. In this report, we examine both the systematic differences and the random errors in the protocol-specific liver R2 values measured using two different TR values.
This study was designed to formally evaluate the effects of reducing the TR from 2500 ms to 1000 ms in the SDPA R2-MRI protocol on measurement of liver R2. The aims of the study were to:
(a) reassess the repeatability of the TR 2500 ms protocol,
(b) measure the repeatability of the TR 1000 ms protocol,
(c) assess the 95% limits of agreement between individual measurements of liver R2 using the TR 2500 ms protocol and the TR 1000 ms protocol, and
(d) compare the time taken to acquire the MR images using the TR 2500 ms and TR 1000 ms protocols.
Protocols with TR less than 1000 ms were not attempted, to avoid the variability of liver T1 (8) confounding R2 measurements. The magnitude of this confounding factor can be estimated for a TR of 1000 ms by taking a worst case scenario of two non-iron-loaded livers with R2 of 20 s−1, one with a T1 of 1000 ms and the other with a T1 of 500 ms. The difference in measured R2 would be only 1.5% according to the fundamental equation describing the decay in signal intensity in spin echo measurements.
With even moderate iron loading, which decreases both T2 and T1, this difference rapidly diminishes.
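The size of this T1 effect can be checked numerically. The sketch below is an illustration only: it assumes the standard textbook single spin-echo signal expression, S = S0 [1 − 2 exp(−(TR − TE/2)/T1) + exp(−TR/T1)] exp(−TE·R2), and a simple log-linear (monoexponential) fit over the five echo times, which is a simplification of the biexponential analysis used in the study. The function names are hypothetical.

```python
import math

def spin_echo_signal(te_ms, tr_ms, t1_ms, r2_per_s, s0=1.0):
    """Single spin-echo signal under the assumed textbook form:
    S = S0 * [1 - 2*exp(-(TR - TE/2)/T1) + exp(-TR/T1)] * exp(-TE*R2)."""
    recovery = (1.0
                - 2.0 * math.exp(-(tr_ms - te_ms / 2.0) / t1_ms)
                + math.exp(-tr_ms / t1_ms))
    return s0 * recovery * math.exp(-(te_ms / 1000.0) * r2_per_s)

def apparent_r2(tr_ms, t1_ms, r2_per_s, tes_ms=(6, 9, 12, 15, 18)):
    """Apparent R2 (s^-1): negated least-squares slope of ln(S) versus TE in seconds."""
    xs = [te / 1000.0 for te in tes_ms]
    ys = [math.log(spin_echo_signal(te, tr_ms, t1_ms, r2_per_s)) for te in tes_ms]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Worst-case scenario from the text: true R2 = 20 s^-1, TR = 1000 ms.
r2_long_t1 = apparent_r2(tr_ms=1000, t1_ms=1000, r2_per_s=20.0)
r2_short_t1 = apparent_r2(tr_ms=1000, t1_ms=500, r2_per_s=20.0)
percent_difference = 100.0 * abs(r2_long_t1 - r2_short_t1) / 20.0
print(f"T1=1000 ms: {r2_long_t1:.2f} s^-1, T1=500 ms: {r2_short_t1:.2f} s^-1, "
      f"difference: {percent_difference:.1f}%")
```

Under these assumptions the difference between the two apparent R2 values comes out between 1% and 2% of the true R2, in line with the figure of about 1.5% quoted above.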
MATERIALS AND METHODS
Ethical approval was obtained from the Faculty of Medicine, Cairo University Ethics Review Committee and the Faculty Research Ethics Committee of Ege University before the study commenced. In addition, written informed consent was obtained from all study participants before any assessments or procedures were performed.
Participant Recruitment and Eligibility
Participants were recruited from the Hematology Clinic of the Pediatric Hospital of Cairo University, Egypt (by A.E.) and from the Department of Pediatrics and Thalassaemia Center, Ege University Hospital, Izmir, Turkey (by Y.A.). Participants were eligible for the study if they were aged ≥12 years, had a documented diagnosis of TM, SCD, myelodysplastic syndrome, or TI, and had no contraindications for MRI. After applying the inclusion criteria, 50 eligible participants were identified, comprising 44 with TM, 4 with SCD, 1 with myelodysplastic syndrome, and 1 with TI (referred to hereafter as the “patient group”). Of these, 15 patients were recruited from Egypt and 35 from Turkey. There were 23 males and 27 females in the patient group. Age at entry into the study ranged from 12 to 67 years (median 21 years). The median serum ferritin (SF) measurement for all participants in the patient group was 1394 ng mL−1 (2361 ng mL−1 for participants with TM and 3293 ng mL−1 for participants with SCD).
An additional 10 participants were recruited in Turkey who had no documented diagnosis of TM, TI, SCD, or myelodysplastic syndrome, and who had no evidence or suspicion of any active liver disease (referred to hereafter as the “control group”). The control group was included to ensure that data from subjects with LICs in the reference range were included in the analysis. There were 4 males and 6 females in the control group. Age at entry into the study ranged from 20 to 43 years (median 30.5 years). The median SF measurement for the control group was 29 ng mL−1.
Measurements were conducted on a 1.5-T Siemens Symphony Vision (Munich, Germany) for subjects in Turkey and a 1.5-T Philips Intera (Best, Netherlands) for subjects in Egypt. LIC measurements were made using SDPA R2-MRI (FerriScan®). Detailed methodology is described elsewhere (4, 9). In brief, axial images were acquired with a multislice single spin-echo (SSE) pulse sequence, with a TR of 2500 ms, TEs of 6, 9, 12, 15, and 18 ms, and slice thickness of 5 mm. A matrix size of 256 was used with typical fields of view being between 350 and 400 mm (exact dimensions depending on subject size). The phase-encoding direction was set to anterior–posterior, with technicians being given the freedom to adjust the phase field of view to accommodate the shape of the subject. Data were acquired in partial Fourier mode with one acquisition to reduce scan time. The Siemens Symphony Vision used zero filling (partial scan factor 0.54) whereas the Philips Intera used homodyne processing (partial scan factor 0.625). Bandwidths were minimized while still enabling a minimum TE of 6 ms to be achieved. A bandwidth of 295 Hz was used for the Siemens Symphony Vision. For the Philips Intera, a bandwidth of 548 Hz was used at TE 6 ms and 365 Hz for all other echoes. No fat suppression was used. A 1000-mL bag of normal saline solution was imaged with each subject to provide an external long T2 reference for the correction of instrumental gain drift and signal intensity variations due to any bandwidth changes. Each subject was positioned so that the liver was located central to the phased array torso coil. Eleven slices were collected for each subject, with a gap of 5 mm between slices and the first slice positioned near the top of the liver so that the slices were spread across the majority of the liver.
R2 measurements were made on the axial slice containing the largest cross section of the liver. The region of interest (ROI) was defined by drawing a line along the inside edge of the liver to define the entire shape of the liver. Any artefacts, lesions, or large vascular structures such as the major branching of the portal vein and hepatic arteries were then removed from the ROI by drawing lines around them. Analysts treated each dataset as an independent measurement; hence, no attempt was made to register the chosen slice or the ROI between the first and second measurements. R2 values were calculated throughout the ROI by curve fitting the equation for the biexponential decay in transverse magnetization to the voxel intensity data as a function of TE (10). A mean R2 value was calculated for each voxel by summation of the fast and slow components of the proton transverse relaxation rate weighted by their relative population densities as described elsewhere (10). To reduce image noise, the voxel intensities were smoothed by neighborhood averaging over a 7 × 7 window kernel before curve fitting. No voxels outside the defined ROI were used in the smoothing process. Therefore, no special precautions were required to avoid areas near the periphery of the organ or surrounding blood vessels. Respiratory ghosting in the spin echo images was reduced before the generation of the R2 images using methods described elsewhere (11). The generation of the liver R2 images is described in greater detail elsewhere (11). Image data analysis was carried out by authors W.P. and C.S., who are both employed by Resonance Health Analysis Services Pty Ltd. The analyses were carried out under the company's certified (ISO 13485 and ISO 9001) quality management system.
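Two of the steps described above, the population-weighted mean R2 and the ROI-restricted neighborhood averaging, can be sketched as follows. This is a minimal illustration and not the analysis software used in the study: the function names, the plain nested-list image representation, and the toy data are assumptions made for the example.

```python
def mean_r2(pop_fast, r2_fast, pop_slow, r2_slow):
    """Mean R2 as the population-weighted sum of the fast and slow components."""
    total = pop_fast + pop_slow
    return (pop_fast * r2_fast + pop_slow * r2_slow) / total

def smooth_in_roi(image, roi, window=7):
    """Neighborhood averaging over a window x window kernel.

    Only voxels flagged True in `roi` contribute to each average, so values
    outside the ROI never leak into the smoothed result. Voxels outside the
    ROI are returned as None.
    """
    half = window // 2
    rows, cols = len(image), len(image[0])
    smoothed = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not roi[r][c]:
                continue
            neighbours = [image[rr][cc]
                          for rr in range(max(0, r - half), min(rows, r + half + 1))
                          for cc in range(max(0, c - half), min(cols, c + half + 1))
                          if roi[rr][cc]]
            smoothed[r][c] = sum(neighbours) / len(neighbours)
    return smoothed

# Toy example (hypothetical values): a 2 x 2 ROI next to bright non-ROI voxels.
toy_image = [[1.0, 2.0, 100.0],
             [3.0, 4.0, 100.0],
             [100.0, 100.0, 100.0]]
toy_roi = [[True, True, False],
           [True, True, False],
           [False, False, False]]
toy_smoothed = smooth_in_roi(toy_image, toy_roi, window=3)
toy_mean_r2 = mean_r2(pop_fast=0.25, r2_fast=100.0, pop_slow=0.75, r2_slow=20.0)
```

Restricting the kernel to ROI voxels is what allows the ROI to be drawn right up to the liver edge: the bright non-ROI voxels in the toy image have no influence on the smoothed values.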
All study participants attended two separate MRI scanning visits, a minimum of 1 h and a maximum of 7 days apart (median time between sessions was 1 h 53 min). During each visit, MR images were acquired first using a TR of 2500 ms and then using a TR of 1000 ms. No other parameters were changed. Hence, four sets of relaxometry data were obtained in total for each subject. All MRI measurements were made between 14 May 2009 and 19 July 2009.
The methods of Bland and Altman (12) were used (a) to assess the repeatability of liver R2 measurements by each protocol and (b) to assess the 95% limits of agreement between individual measurements of liver R2 by each protocol.
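For readers unfamiliar with these quantities, the sketch below shows one common formulation of each: the repeatability coefficient as 1.96·√2 times the within-subject standard deviation estimated from paired repeat measurements, and the 95% limits of agreement as the mean difference ± 1.96 standard deviations of the differences. Whether the study used exactly these variants is an assumption, and the data in the example are invented for illustration.

```python
import math
import statistics

def repeatability_coefficient(first, second):
    """1.96 * sqrt(2) * within-subject SD, estimated from paired repeats."""
    diffs = [a - b for a, b in zip(first, second)]
    # For two measurements per subject, within-subject variance = sum(d^2) / (2n).
    within_subject_var = sum(d * d for d in diffs) / (2 * len(diffs))
    return 1.96 * math.sqrt(2.0 * within_subject_var)

def limits_of_agreement(method_a, method_b):
    """Lower and upper 95% limits: mean difference -/+ 1.96 SD of the differences."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical liver R2 values (s^-1) at two visits, for illustration only.
visit_1 = [10.0, 20.0, 30.0, 40.0]
visit_2 = [12.0, 18.0, 33.0, 37.0]
rc = repeatability_coefficient(visit_1, visit_2)
lower, upper = limits_of_agreement(visit_1, visit_2)
print(f"repeatability coefficient: {rc:.2f} s^-1")
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} s^-1")
```

Both calculations assume that the size of the differences does not depend on the magnitude of the measurement, which is why the correlation between absolute differences and the mean is checked before either is reported.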
LICs calculated from the liver R2 data from the TR 2500 ms measurements together with the calibration curve reported by St Pierre et al. (4) ranged from 0.7 to 1.3 mg Fe (g dry tissue)−1 for the control group and from 0.8 to 48.6 mg Fe (g dry tissue)−1 for the patient group. The LICs were non-normally distributed. A Mann–Whitney test indicated no significant difference in the median LIC between the group of subjects measured on the Philips Intera and the group of subjects measured on the Siemens Symphony Vision (P = 0.29).
Using the TR = 2500 ms Protocol
Figure 1(a) shows the liver R2 measurements made at the second visit plotted against those made at the first visit using the TR = 2500 ms protocol. Values of R2 covered the range 22.2 to 301.9 s−1 [corresponding to LIC values from 0.5 to 48.6 mg Fe (g dry tissue)−1 using the calibration equation published by St Pierre et al. (4)]. The differences between measurements of liver R2 at the two visits are plotted against the mean of the two liver R2 measurements in Figure 1(b). No significant correlation between the absolute differences and mean liver R2 was found (Spearman's rank order correlation coefficient −0.21, P = 0.10). The repeatability coefficient was found to be 13.7 (± 0.5) s−1.
Using the TR = 1000 ms Protocol
Figure 2(a) shows the liver R2 measurements made at the second visit plotted against those made at the first visit using the TR = 1000 ms protocol. Values of R2 covered the range 25.0 to 300.7 s−1. The differences between the measurements of liver R2 at the two visits are plotted against the mean of the two liver R2 measurements in Figure 2(b). No significant correlation between the absolute differences and mean liver R2 was found (Spearman's rank order correlation coefficient 0.08, P = 0.52). The repeatability coefficient was found to be 12.2 (± 0.4) s−1.
Limits of Agreement Between Liver R2 Measurements
Figure 3(a) shows the mean of the two liver R2 measurements made using the TR = 2500 ms protocol plotted against the mean of the two liver R2 measurements made using the TR = 1000 ms protocol. At lower R2 values, the data points tend more often to fall below the line of equivalence, suggesting a systematic bias. The absolute deviation of each between-protocol difference from the mean between-protocol difference was found to correlate significantly with the mean of all four measurements (P = 0.0025). As such, the differences between the mean measurements made by each protocol were not amenable to analysis by the method of Bland and Altman (12) without transformation. The natural logarithms of the mean liver R2 measurements were therefore calculated for both protocols. The differences between the natural logarithms of the mean liver R2 values for each protocol are plotted against the mean of the two natural logarithms of R2 in Figure 3(b). There is a significant correlation between the differences and the mean of the liver R2 logarithms (Spearman's rank order correlation coefficient ρ = 0.30, P = 0.022). It is of note that a significant correlation remains (ρ = 0.26, P = 0.045) even if the outlier is omitted. As such, linear regression was used to determine a relationship between the differences and the mean. The line of best fit is shown in Figure 3(b) and has the following form:
where y is the difference between the mean R2 logarithms for each protocol (log mean R2 for TR 2500 ms protocol − log mean R2 for TR 1000 ms protocol) and x is the mean of the mean R2 logarithms for each protocol. There is no significant correlation between the absolute residuals of the linear regression and the mean of the R2 logarithms. As such, the residuals are suitable for analysis to obtain 95% limits of agreement between the two protocols (12). The upper and lower 95% limits of agreement between the two protocols were found to be 0.195 (95% CI 0.160 to 0.229) and −0.195 (95% CI −0.229 to −0.160), respectively. These limits of agreement correspond to 95% of pairs of liver R2 measurements being expected to have ratios (R2 measured with TR = 2500 ms to R2 measured with TR = 1000 ms) between 0.82 (95% CI 0.80–0.85) and 1.21 (95% CI 1.17–1.26) after adjustment using Eq. [2].
The observed systematic bias in measured R2 necessitates a slight modification to the calibration curve (4) relating R2 to LIC previously reported for the TR of 2500 ms protocol. The new calibration equation relating liver R2 (in units of s−1) measured with a TR of 1000 ms to LIC (in units of mg Fe [g dry tissue]−1) is:
To determine the impact of replacing the TR 2500 ms measurements with TR 1000 ms, it was necessary to compare the 95% limits of agreement between the two protocols with the repeatability of each protocol. Although the repeatability coefficients were determined for each protocol as shown above, further analysis was required since the limits of agreement between the two protocols were expressed in ratios while the repeatability coefficients were found to be best expressed as relaxation rate differences in units of s−1. As such, the repeatability coefficients for each protocol were recalculated by examining the differences between the logarithms of the R2 values obtained at the first and second visits (see Figure 4) and were found to be 0.157 (95% CI 0.147–0.167) for the TR 2500 ms protocol and 0.180 (95% CI 0.168–0.192) for the TR 1000 ms protocol. These repeatability coefficients correspond to 95% of pairs of measurements having ratios between 0.85 and 1.17 for the TR 2500 ms protocol and between 0.84 and 1.20 for the TR 1000 ms protocol.
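Because the limits of agreement and the recalculated repeatability coefficients are differences of natural logarithms, each converts to a pair of ratios by exponentiation. The short sketch below illustrates the conversion using the three log-scale values quoted in the text; the function name is hypothetical.

```python
import math

def log_limit_to_ratios(log_limit):
    """Convert a symmetric limit on ln(a) - ln(b) into lower and upper limits on a/b."""
    return math.exp(-log_limit), math.exp(log_limit)

log_scale_values = {
    "95% limits of agreement between protocols": 0.195,
    "repeatability, TR 2500 ms protocol": 0.157,
    "repeatability, TR 1000 ms protocol": 0.180,
}

ratio_limits = {label: log_limit_to_ratios(value)
                for label, value in log_scale_values.items()}

for label, (lower, upper) in ratio_limits.items():
    print(f"{label}: ratio between {lower:.3f} and {upper:.3f}")
```

The resulting ratios agree with those quoted in the text to within about 0.01; small discrepancies in the last digit are expected because the log-scale values are themselves rounded.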
Figure 4 shows that the absolute differences of the R2 logarithms correlate with the mean of the R2 logarithms for both protocols (Spearman's rank order correlation coefficients of −0.34, P = 0.007 and −0.55, P < 0.0001 for the TR 2500 ms and TR 1000 ms protocols, respectively) and, hence, the repeatability coefficients quoted in terms of relaxation rate (in units of s−1) are more appropriate for assessing repeatability of liver R2 measurements alone. However, for assessing the impact of replacing the TR 2500 ms protocol with the TR 1000 ms protocol, the repeatability coefficients quoted in terms of ratios of liver R2 measurements are required.
The average time taken to acquire the images using the Siemens Symphony Vision scanner and a TR of 2500 or 1000 ms was 22.0 and 9.0 min, respectively. The average time taken to acquire the images using the Philips Intera scanner and a TR of 2500 or 1000 ms was 34.8 and 14.6 min, respectively. The longer scan times on the Philips Intera were due to the technicians choosing not to minimize the phase field of view according to the subject shape, together with the higher partial Fourier fraction. Overall, the total MRI scanning time using a TR of 1000 ms was reduced to 42% of the time taken using a TR of 2500 ms. As expected, the total scan time is dominated by the number of phase-encoding steps acquired and the value of TR.
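The reported times can be compared directly with the ratio of the two TR values. The sketch below is simple arithmetic on the figures quoted above; it assumes that, with all other parameters unchanged, acquisition time scales in proportion to TR, and the variable names are illustrative.

```python
# Average acquisition times in minutes: (TR 2500 ms, TR 1000 ms), as reported above.
scan_minutes = {
    "Siemens Symphony Vision": (22.0, 9.0),
    "Philips Intera": (34.8, 14.6),
}

# Fraction expected if scan time were exactly proportional to TR.
expected_fraction = 1000.0 / 2500.0

observed_fractions = {scanner: short / long
                      for scanner, (long, short) in scan_minutes.items()}

for scanner, fraction in observed_fractions.items():
    print(f"{scanner}: TR 1000 ms scan takes {100 * fraction:.0f}% of the TR 2500 ms time "
          f"(TR ratio alone predicts {100 * expected_fraction:.0f}%)")
```

On both scanners the observed fraction sits within a few percentage points of the 40% predicted by the TR ratio alone, consistent with TR and the number of phase-encoding steps dominating the scan time.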
DISCUSSION AND CONCLUSION
The repeatability coefficient for the TR 1000 ms protocol was slightly better than that for the TR 2500 ms protocol (difference between the two coefficients being 1.53 [±0.61] s−1). The slight improvement with the TR 1000 ms protocol may be related to a lower degree of breathing artefact in the MR images owing to the shorter data acquisition time. When comparing the two protocols, a systematic difference in the measurement of R2 was found. Linear regression of the log-transformed differences between the two protocols enabled the systematic difference to be quantified and, hence, enabled an adjustment factor to be calculated for measurements using the TR 1000 ms protocol. The 95% limits of agreement between the two protocols were not significantly different in magnitude from the repeatability coefficient of the TR 1000 ms protocol (difference 0.014 ± 0.019) and were marginally larger than the repeatability coefficient for the TR 2500 ms protocol (difference 0.037 ± 0.018). However, this latter difference was not statistically significant when the outlier in Figure 3(b) was removed (difference 0.021 ± 0.017). These data therefore indicate that the TR 2500 ms protocol can be replaced by the adjusted TR 1000 ms protocol with no significant change in accuracy or precision of measurement of R2. Consequently, measurement of liver R2 using the TR 1000 ms protocol together with Eq. [3] can be used to measure LIC with an accuracy and precision not significantly different from the TR 2500 ms protocol. The significance of this observation is the demonstration that LIC can be measured with SDPA R2-MRI with significantly shorter study times, benefiting both patients and radiology centers. The scan time of 9 min for the TR 1000 ms protocol when used with minimized phase field of view is considerably longer than some single breath-hold R2* methods of LIC assessment.
However, unlike R2* methods, R2 methods are insensitive to the size and shape of the imaging voxels and are not affected by external magnetic inhomogeneities caused by air interfaces or metallic clips. Some R2* methods have been shown to have a limited dynamic range meaning that not all patients can be successfully measured (e.g., upper limit of 25 mg Fe (g dry weight)−1 for method reported by Hankins et al. (7)). At the other end of the scale, the SDPA R2-MRI method has demonstrated high sensitivities and specificities for predicting biopsy LIC values in the very low-LIC range (4), whereas insufficient data have been presented for R2* methods.
The authors would like to thank Ms. Shameela Dermott for assistance with project and data management.