Magnetic resonance techniques provide a noninvasive means of estimating fat content in vivo. It is widely accepted that single-voxel proton magnetic resonance spectroscopy (MRS) allows for MR fat quantification in the liver with sensitivity and dynamic range superior to those of MRI. However, single-voxel MRS is susceptible to sampling error arising from liver inhomogeneity. This can be compensated for by acquiring data from multiple voxels, but at the cost of an increased scan time.
To fully profit from the larger spatial coverage of MRI, a variety of fat quantification methods have been proposed. Among the MRI methods to date, chemical shift-based multipoint water–fat separation methods (hereafter referred to as multipoint water–fat separation) have been the most widely used. Representative examples are the two-point Dixon method (1), the three- and four-point Dixon methods with phase correction (2, 3), and the iterative decomposition of water and fat with echo asymmetry and least-squares estimation (IDEAL) method (4, 5). Among these multipoint water–fat separation methods, the original two-point Dixon (2PD) method in conjunction with a spoiled gradient echo sequence (SPGR) and the magnitude-based post-data processing (6) is often used in clinical applications due to its simplicity (7–14). Despite its well-recognized limited dynamic range when fat content is larger than water content in a pixel (15, 16), high correlations between the 2PD and MRS or histology have been reported in many studies (8, 13, 14, 16).
The outcome of MRS (e.g., signal-to-noise ratio (SNR)) in fat quantification largely depends on sequence parameters such as the number of signal averages and voxel size for a given pulse sequence. In addition to these sequence parameters, however, the sensitivity and dynamic range of multipoint water–fat separation MRI can also be strongly influenced by other factors such as sampling strategies and post-data processing algorithms (2, 3, 5, 15–17). As a consequence, the performance of multipoint water–fat separation MRI is more subject to variability. Combined with the significant variation in hepatic fat content even in healthy livers (18–24), such variable performance of the MRI methods may pose a problem, particularly when the presence and severity of a fatty liver needs to be identified.
In this report we address such variability in the performance of multipoint water–fat separation MRI and its influence on the diagnosis of fatty liver. Hepatic fat fractions (HFFs) are estimated in humans using MRS as a reference (22), 2PD (6), and a three-point IDEAL (3PI) (4, 5), whose performance is not limited to pixels with an HFF of less than 50%. To investigate the potential influence of liver inhomogeneity on the outcome, the 2PD data are collected from a single slice with multiple signal averages in a breath-hold, whereas the 3PI data are acquired with a single signal average but from multiple slices in order to obtain maximum coverage of liver volume in a breath-hold using a balanced steady-state free precession sequence (bSSFP). No direct comparison can therefore be made between these two MRI methods. Various potential sources of variability in fat quantification using multipoint water–fat separation MRI are discussed.
MATERIALS AND METHODS
All subjects gave informed consent according to a protocol approved by the Yale University Human Investigation Committee. A total of 28 subjects were included in this study (12M/16F; age = 10–30 years (mean ± standard deviation [SD] = 15.9 ± 5.3 years); body mass index [BMI] = 20.0–46.0 (31.9 ± 6.8) kg/m2). Within this group, five were lean (2M/3F; age = 15–30 years (22.1 ± 6.5); BMI = 20.0–23.1 (21.6 ± 1.4)) and 22 were obese (9M/13F; age = 10–18 years (13.9 ± 2.3); BMI = 25.8–46.0 (34.6 ± 4.9)). One male subject was lean with an abnormal liver and was therefore included only in the total subject group. Due to their obesity (BMI >95th percentile), the obese subjects are at an increased risk of having or developing hepatic steatosis, the metabolic syndrome, impaired glucose tolerance, type 2 diabetes, and all other metabolic complications associated with obesity. The lean adolescents who participated in the study had a family history of overweight/obesity, type 2 diabetes, and/or impaired fasting glucose.
The MRS, 2PD, and 3PI data were collected from all (n = 28) subjects.
1H MR spectroscopy was performed on a whole body 4.0T Medspec (Bruker Instruments, Billerica, MA) system using in-house designed and built MRS probes.
The fat content in the liver was measured with a coil assembly composed of a 12 cm circular carbon-13 (13C) coil and twin 13 × 9 cm elliptical proton RF coils arranged in quadrature for imaging, shimming, proton decoupling, and excitation/observation. The probe was secured onto the side of the subject's chest with a Velcro strap and a nonmagnetic pneumatic expansion bellows connected to the spectrometer and used for gating the MRS acquisition to the respiratory movements. Once the subject was positioned at the isocenter of the magnet, the probe was tuned and matched, and scout images of the chest were obtained to ensure correct positioning of the subject. After imaging the liver, localized shimming was performed over a sphere of 50 mm placed in the liver using the FASTERMAP method (25) with respiration gating. Hepatic fat content was measured by 1H respiratory-gated STEAM spectroscopy (26) in a 15 × 15 × 15 mm3 voxel with the following parameters: 2 ms SLR90, slice-selective excitation pulses, echo time (TE) = 20 ms, mixing time (TM) = 15 ms, repetition time (TR) = 3000 ms (lipid) or 5000 ms (water), 16 averages, 2048 points over 2500 Hz, three modules of CHESS water suppression (27), and a typical scan time of 30 min. Acquisition of spectra was synchronized to the respiratory cycle and triggered at the end of expiration, when chest movement is minimal and there is a sufficient delay to complete a full pass of the pulse sequence. To prevent voxel misregistration due to chemical shift effects, hepatic fat content was estimated from the comparison of two spectra: a water-suppressed lipid spectrum (TR = 3000 ms) and a lipid-suppressed water spectrum (TR = 5000 ms), with the appropriate peak for each spectrum on-resonance. A minimum of two lipid spectra and two water spectra were time-averaged to minimize variations due to chest movements, and this sequence was carried out in different locations of the liver to account for liver inhomogeneity.
A minimum of eight spectra was acquired for each subject and the total lipid content was averaged. Hepatic fat content was calculated as previously described (28) and was expressed as HFF (=lipid peak area/(water peak area+lipid peak area) × 100).
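The HFF definition above, and its relation to the fat-to-water ratio quoted elsewhere in the text, can be sketched as follows. This is a minimal illustration, not the authors' processing code; the function names and the example peak areas are hypothetical.

```python
def hff_from_peak_areas(lipid_area, water_area):
    """Hepatic fat fraction in percent: lipid / (water + lipid) x 100."""
    total = water_area + lipid_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * lipid_area / total


def fat_to_water_ratio(hff_percent):
    """Convert an HFF (percent) into a fat-to-water ratio (percent)."""
    if not 0.0 <= hff_percent < 100.0:
        raise ValueError("HFF must lie in [0, 100)")
    return 100.0 * hff_percent / (100.0 - hff_percent)


# Hypothetical peak areas in arbitrary units (not data from this study).
example_hff = hff_from_peak_areas(lipid_area=2.9, water_area=97.1)
example_ratio = fat_to_water_ratio(example_hff)
print(f"HFF = {example_hff:.1f}%, fat-to-water ratio = {example_ratio:.1f}%")
```

With these example areas, an HFF of 2.9% corresponds to a fat-to-water ratio of about 3.0%, which matches the equivalence stated for the MRS cutoff.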
All MRI studies were conducted on a 1.5T Siemens Sonata scanner with a single channel body coil and a phased-array torso coil (USA Instruments, Aurora, CO) for the 2PD and 3PI data collection, respectively. The typical field of view (FOV) for MRI studies was 400 × 325 mm.
Two-Point Dixon (2PD)
The two-point measurement of HFF was performed using an SPGR sequence as part of the Dixon method as modified by Fishbein et al. (6). The imaging parameters were: matrix size = 128 × 256, flip angle (α) = 30°, TR = 18 ms, TEs = 2.38/4.76 ms (out-of-phase (OP) and in-phase (IP), respectively), bandwidth = 420 Hz/pixel, six averages, slice thickness = 10 mm, one slice, 2.3 sec/slice (for two-points), scan time = 14 sec on a single breath-hold.
The HFF was calculated as previously described (7). Briefly, for each image five regions of interest (ROIs) were placed in the liver parenchyma in areas free of contamination from blood vessels, such that the five ROIs together contained at least 1000 pixels. From the mean pixel signal intensities, the 2PD HFF was calculated as [(Sin − Sout)/(2 × Sin)] × 100 (6), where Sin and Sout are the signal intensities of the IP and OP images, respectively.
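The 2PD formula can be illustrated with a short sketch. It assumes an idealized, relaxation-free signal model in which the in-phase magnitude is W + F and the out-of-phase magnitude is |W − F|; the function names and all numerical values are hypothetical and are not data from this study.

```python
def two_point_dixon_hff(s_in, s_out):
    """Magnitude-based 2PD fat fraction in percent: (Sin - Sout) / (2 x Sin) x 100."""
    if s_in <= 0:
        raise ValueError("in-phase signal must be positive")
    return 100.0 * (s_in - s_out) / (2.0 * s_in)


def ideal_magnitudes(water, fat):
    """Noise-free magnitude signals: in-phase = W + F, out-of-phase = |W - F|."""
    return water + fat, abs(water - fat)


# Hypothetical water/fat signals in arbitrary units.
hff_20 = two_point_dixon_hff(*ideal_magnitudes(water=80.0, fat=20.0))  # true fat fraction 20%
hff_70 = two_point_dixon_hff(*ideal_magnitudes(water=30.0, fat=70.0))  # true fat fraction 70%
# Noise can push the out-of-phase mean above the in-phase mean in a lean liver.
hff_noisy = two_point_dixon_hff(s_in=100.0, s_out=104.0)
print(hff_20, hff_70, hff_noisy)
```

Under this model a true fat fraction of 70% is reported as 30%, because the magnitude operation cannot tell which species dominates. A slightly larger out-of-phase signal yields a negative estimate.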
Three-Point IDEAL (3PI)
All 3PI data were collected using trueFISP (Siemens Medical Solutions, Erlangen, Germany) which was modified for adjustable TEs. The imaging parameters were: matrix size = 144 × 256, α = 55°, TR = 5.38 ms, TEs = 1.49/2.69/3.89 ms (sampling interval Δθ = 95° or Δt = 1.2 ms), bandwidth = 1028 Hz/pixel, 1 average, slice thickness = 10 mm, number of slices = 8–14 (mean number of slices ≈11), 2.3 sec/slice (for three-points), scan time = 18–32 sec on a single breath-hold.
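As a rough consistency check on the timing figures above, the per-slice and breath-hold durations can be estimated from TR and the matrix size. The sketch assumes that 144 is the number of phase-encoding lines, that each of the three echo times is acquired in a separate pass, and that preparation or dummy cycles are negligible; none of these details is stated in the text, and the function name is hypothetical.

```python
def acquisition_seconds(tr_ms, phase_encodes, passes=1, averages=1, slices=1):
    """Scan time = TR x phase-encoding lines x passes x averages x slices."""
    return tr_ms * 1e-3 * phase_encodes * passes * averages * slices


# 3PI: TR = 5.38 ms, 144 lines (assumed phase-encoding direction), three echo passes.
per_slice_3pi = acquisition_seconds(tr_ms=5.38, phase_encodes=144, passes=3)
breath_hold_3pi = [
    acquisition_seconds(tr_ms=5.38, phase_encodes=144, passes=3, slices=n)
    for n in (8, 14)
]
print(f"per slice: {per_slice_3pi:.2f} s")
print(f"8 to 14 slices: {breath_hold_3pi[0]:.1f} to {breath_hold_3pi[1]:.1f} s")
```

Under these assumptions the estimate is about 2.3 s per slice and roughly 19 to 33 s for 8 to 14 slices, close to the quoted 2.3 sec/slice and 18–32 sec.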
All 3PI images were reconstructed from complex data according to the original IDEAL algorithm for multicoil (four channels) data acquisition (4), implemented in Matlab (MathWorks, Natick, MA). A single, continuous ROI was defined in each of the source images to include a maximum amount of liver parenchyma while avoiding major blood vessels. From the calculated water-only and fat-only images, HFF images were obtained (HFF = fat/(water + fat) × 100) and the mean HFF was calculated over the predefined ROIs. As the majority of the subjects were obese, some of the images were degraded by banding artifacts, and those slices were excluded from the data analysis (the mean number of slices included in the data analysis was ≈7). The SD of HFF across the slices was calculated for each subject and its correlation with the mean HFF was examined.
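The least-squares step at the core of three-point water–fat separation can be sketched for a single pixel. This is a deliberately simplified stand-in, not the IDEAL reconstruction used in the study: it assumes zero field inhomogeneity (so no iterative field-map estimation), a single receive channel, a single fat peak, and a water–fat shift of 220 Hz at 1.5T, which is an assumed value not given in the text. The function name and the synthetic signals are hypothetical.

```python
import cmath
import math

# Assumed water-fat chemical shift at 1.5T (Hz); the text does not state it.
FAT_SHIFT_HZ = 220.0


def separate_water_fat(signals, echo_times_s, shift_hz=FAT_SHIFT_HZ):
    """Least-squares fit of s_n = W + F * exp(i*2*pi*shift*TE_n) for complex W and F.

    Zero field inhomogeneity is assumed, so this is only the linear step,
    not the iterative field-map estimation of IDEAL.
    """
    phasors = [cmath.exp(2j * math.pi * shift_hz * te) for te in echo_times_s]
    n = len(signals)
    # Normal equations (A^H A) x = A^H s, where row n of A is [1, c_n].
    a12 = sum(phasors)
    a21 = a12.conjugate()
    a22 = sum(abs(c) ** 2 for c in phasors)
    b1 = sum(signals)
    b2 = sum(c.conjugate() * s for c, s in zip(phasors, signals))
    det = n * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("echo times give a singular system")
    water = (b1 * a22 - a12 * b2) / det
    fat = (n * b2 - a21 * b1) / det
    return water, fat


# Synthetic single-pixel example (hypothetical values, not study data).
echo_times = [1.49e-3, 2.69e-3, 3.89e-3]  # TEs quoted in the text, in seconds
true_water, true_fat = 0.8, 0.2
common_phase = cmath.exp(0.3j)  # arbitrary receiver phase
signals = [
    common_phase * (true_water + true_fat * cmath.exp(2j * math.pi * FAT_SHIFT_HZ * te))
    for te in echo_times
]
water, fat = separate_water_fat(signals, echo_times)
hff = 100.0 * abs(fat) / (abs(water) + abs(fat))
print(f"estimated HFF = {hff:.1f}%")
```

Because the fit works on complex data, the water and fat terms are identified by their phase behavior across echoes rather than by magnitude alone, which is why such methods are not restricted to fat fractions below 50%.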
Combined Data Analysis
The 2PD and 3PI data were compared to the MRS data and the correlations between these MR HFF measures were examined.
The SNR of the images was measured in four subjects whose HFF measured by MRS was below 1%. For 2PD it was measured from the OP images of the livers. For 3PI it was measured from the images collected at TE = TR/2, for which the water and fat magnetization vectors in bSSFP are approximately antiparallel for the given TR, just as in the OP image acquisition with 2PD.
The performance of the 2PD and 3PI as a means of diagnosing fatty liver was evaluated by differentiating between normal and fatty livers. First, an HFF of 2.9% (equivalent to a fat to water ratio of 3.0%) was used for the MRS data as a cutoff for normal livers according to our previous finding using MRS (22). Second, the upper limit for normal liver was defined for each of the MRI methods as m × SD above the mean HFF of lean subjects where SD is the standard deviation of the HFFs of lean subjects and m varies from 0 to 3 with a step size of 0.5 (i.e., upper limit = (m × SD) + mean HFF). Third, for the varying upper limits of HFF for normal liver, true-positive rates and false-positive rates were calculated for each of the MRI methods. The true-positive rate is defined as the ratio of the number of correctly diagnosed fatty livers to the total number of fatty livers, which is therefore equivalent to sensitivity of a diagnostic test, and the false-positive rate is defined as the ratio of the number of normal livers diagnosed as fatty liver to the total number of normal livers, which is then equivalent to (1 – specificity). Finally, a receiver operating characteristic (ROC) plot was obtained from these results (true-positive rate against false-positive rate).
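The cutoff sweep and the rate definitions above can be sketched as follows. The helper names are hypothetical, the choice of a strict "greater than cutoff" rule for calling a liver fatty is an assumption, and the six-subject data set is invented purely for illustration. Only the lean-group mean and SD passed to the first call (−2.0% and 3.7% for 2PD) are taken from the reported results.

```python
def cutoffs_from_lean(mean_hff, sd_hff, m_values=None):
    """Upper limits for normal liver: mean + m x SD, for m = 0, 0.5, ..., 3."""
    if m_values is None:
        m_values = [0.5 * k for k in range(7)]
    return {m: mean_hff + m * sd_hff for m in m_values}


def roc_point(mri_hff, is_fatty_by_mrs, cutoff):
    """Return (true-positive rate, false-positive rate) for one cutoff.

    A liver is called fatty when its MRI HFF exceeds the cutoff (assumed rule).
    """
    pairs = list(zip(mri_hff, is_fatty_by_mrs))
    n_fatty = sum(1 for _, fatty in pairs if fatty)
    n_normal = len(pairs) - n_fatty
    true_pos = sum(1 for h, fatty in pairs if fatty and h > cutoff)
    false_pos = sum(1 for h, fatty in pairs if not fatty and h > cutoff)
    return true_pos / n_fatty, false_pos / n_normal


# Cutoffs from the reported lean-group 2PD statistics (mean -2.0%, SD 3.7%).
cutoffs_2pd = cutoffs_from_lean(mean_hff=-2.0, sd_hff=3.7)

# Hypothetical MRI HFFs (%) and MRS-based labels, for illustration only.
demo_hff = [-1.0, 2.0, 5.0, 12.0, 30.0, 3.0]
demo_fatty = [False, False, False, True, True, True]
tpr, fpr = roc_point(demo_hff, demo_fatty, cutoffs_2pd[1.5])
print(f"cutoff = {cutoffs_2pd[1.5]:.2f}%, TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Plotting the (FPR, TPR) pairs obtained for each m gives the ROC plot described in the text.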
All results are expressed as mean ± SD. For pairwise group comparisons, a two-tailed Student's t-test assuming unequal variances was used. A P-value of less than 0.05 was considered to indicate a statistically significant difference between groups. For linear regression, Pearson's correlation coefficient was calculated.
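For reference, the test statistic of an unequal-variance (Welch) t-test can be computed from group summary statistics as sketched below. The function name and the example numbers are hypothetical. The sketch returns only the statistic and the Welch–Satterthwaite degrees of freedom; obtaining the two-tailed P-value additionally requires the t-distribution, which a statistics package would supply.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, dof


# Hypothetical group summaries (mean, SD, n), not data from this study.
t_stat, dof = welch_t_from_summary(10.0, 2.0, 5, 7.0, 4.0, 8)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```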
RESULTS
Figure 1 shows representative MRS spectra from two subjects: one (Fig. 1a) with a high HFF of 28.6% (Subject A) and the other (Fig. 1b) with a low HFF of 4.0% (Subject B). For the multiple spectroscopic measurements from which the total lipid content was averaged for each subject, the SD of HFFs varied from 0.02% to 9.2% with a mean SD of 1.99%. For the lean, obese, and total subject groups, the range of HFF as measured by MRS was 0.3–3.5% (1.1 ± 1.4%), 0.3–41.5% (11.7 ± 12.1%), and 0.3–41.5% (10.1 ± 11.6%), respectively.
When an HFF of 2.9% (equivalent to a fat-to-water ratio of 3.0%) was used for the MRS data as a cutoff for normal (22), there were 8 normal and 20 fatty livers in the MRS data. In the lean group (n = 5) there were four normal livers and one fatty liver. In the obese group (n = 22) there were 4 normal and 18 fatty livers.
Two-Point Dixon (2PD)
Figure 2 shows representative IP, OP, and calculated HFF images from the same two subjects (Subjects A and B in Fig. 1). For the lean, obese, and total subject groups, the range of HFF as measured by the 2PD was −6.3% to 2.2% (−2.0 ± 3.7%), −2.4% to 42.9% (12.9 ± 13.8%), and −6.3% to 42.9% (10.5 ± 13.7%), respectively. Negative HFFs were obtained for several subjects with a normal liver.
Three-Point IDEAL (3PI)
Figure 3 shows representative source, water-only, fat-only, and calculated HFF images from the same two subjects (Subjects A and B in Figs. 1 and 2). For the lean, obese, and total subject groups, the range of HFF as measured by 3PI was 7.9–12.8% (10.1 ± 2.0%), 11.1–49.3% (22.0 ± 12.2%), and 7.9–49.3% (20.0 ± 11.8%), respectively.
The SD of HFF across the slices measured by 3PI was 0.6–6.1% and was only moderately correlated with the mean HFF (r = 0.503, P = 0.006).
Combined Data Analysis
The HFF measured by 2PD is strongly correlated with that measured by MRS (r = 0.954, P < 0.001; Fig. 4a). The HFF measured by 3PI is also strongly correlated with that measured by MRS (r = 0.973, P < 0.001; Fig. 4b). The y-intercept in Fig. 4b occurs at an HFF of ≈10%. The HFFs measured by both MRI methods are also highly correlated with each other (r = 0.978, P < 0.001; Fig. 4c). As depicted by the x-intercept in Fig. 4c, the HFFs measured by 3PI are relatively higher than those by 2PD.
The SNR measured from the OP images of 2PD in four subjects whose HFF measured by MRS was less than 1% was ≈14% higher than that measured from the images of 3PI at TE = TR/2 (37.8 ± 4.0 vs. 33.3 ± 9.2), but the difference was not statistically significant (P = 0.405).
With the diagnostic findings from the MRS data as a reference, Fig. 5 illustrates the true-positive rates (5a) and the false-positive rates (5b) for the 2PD and 3PI when the HFF cutoff for normal was set to m × SD above the mean HFF of lean subjects for each MRI method and m was varied from 0 to 3 (x-axis). Over this range of m, the HFF cutoff for normal liver ranged from −2.0% to 9.1% for 2PD and from 10.1% to 16.1% for 3PI. Figure 5c shows the ROC plot obtained from these results. According to the figure, the best diagnostic result with the 2PD (A) may be achieved when the cutoff for normal is set to 3.6% (m = 1.5), for which a true-positive rate (=sensitivity) of 0.80 and a false-positive rate (=1 − specificity) of 0.13 were obtained. On the other hand, the best diagnostic result with the 3PI (B) may occur when the cutoff for normal is set to 14.1% (m = 2), for which the true- and false-positive rates were 0.85 and 0, respectively.
DISCUSSION
We compared HFFs measured by MRS, by the magnitude-based 2PD, which is most widely used in clinical MRI studies, and by 3PI, which allows for phase-based water–fat separation with a dynamic range superior to that of 2PD. Because the 3PI data showed no remarkable heterogeneity in HFF, the MRS data were used as a reference and differentiation between normal and fatty liver was performed using the MRI methods.
Our results reaffirm that MRS provides superior sensitivity and dynamic range compared with multipoint water–fat separation MRI. However, as fat infiltration or sparing may potentially be focal (29), a multivoxel examination such as the approach used in this study may be desirable in order to take full advantage of MRS. In this regard, the use of magnetic resonance spectroscopic imaging (MRSI) techniques may be a good alternative. A high-speed MRSI technique for fat quantification has been reported previously (30).
The negative HFFs obtained from some of the subjects with normal liver can be attributed to the limited sensitivity of the 2PD. That is, when a pixel is composed predominantly of water, similar signals are obtained in the IP and OP acquisitions, and because of noise, negative HFFs can occur in this magnitude-based HFF estimation, thereby lowering the mean HFF of the lean group measured by 2PD. Similarly, our observation that the 3PI used in this study tends to overestimate HFF, particularly for those subjects whose HFF as measured by MRS is below ≈10%, is most likely a consequence of the lower sensitivity of multipoint water–fat separation MRI in general at the cost of larger spatial coverage (particularly in this single breath-hold data collection). As a result, the mean HFFs of the lean group obtained using these two MRI methods differ significantly from each other. Such variable performance of multipoint water–fat separation methods in combination with interindividual variations in fat content even in healthy subjects (18–24) can hinder an unequivocal diagnosis of early fatty liver using MRI.
In addition to the limited sensitivity, several sequence-specific factors may also have contributed to some extent to the tendency of 3PI to overestimate HFF. First, the overestimation may arise from the T2/T1-weighted nature of bSSFP, which is known to give rise to a “brighter fat signal” in comparison to T1-weighted SPGR. Second, it may be due to the J-(de)coupling effect of fat spins as seen in fast spin echo (FSE) sequences, which also results in a “brighter fat signal” (31–35). To our knowledge, this issue has not been addressed in bSSFP imaging. Nonetheless, given the recycling of transverse magnetization and the preparation period, α/2–TR/2, which is typically implemented in contemporary bSSFP for a minimal transient period (36), the possibility of fat spins undergoing decoupling in bSSFP as in FSE cannot be completely ruled out. These potential sequence-dependent sources of variability in apparent fat content measured by multipoint water–fat separation MRI would not influence study outcome significantly (e.g., relative fat content between study cohorts; variations in HFF in a longitudinal study; diagnosis based on fat-suppressed images) as long as the associated imaging parameters are maintained across multiple subjects (e.g., echo-spacing and echo train length in FSE (31–35)). However, together with the variable performance of multipoint water–fat separation MRI for a given sequence depending on sampling strategies and post-data processing algorithms (2, 3, 5, 15–17), they can make it more difficult both to differentiate between normal and early fatty liver and to compare results between laboratories.
To this end, it may be necessary to establish an HFF cutoff for normal liver specific to each imaging protocol's sequences and sequence parameters in order to minimize errors in the diagnosis of fatty liver using multipoint water–fat separation MRI. For instance, as demonstrated in our study, if HFF cutoffs for normal liver of 3.6% (m = 1.5) and 14.1% (m = 2) are chosen for the 2PD and 3PI, respectively, the resulting diagnostic precision can be comparable to that of MRS (2PD: sensitivity = 0.80, specificity = 0.87 and 3PI: sensitivity = 0.85, specificity = 1.00), although the HFF cutoffs for the two MRI methods are quite dissimilar.
Despite several potential limiting factors, the results from the 2PD and 3PI used in this study are highly correlated with those from MRS. Due to the substantially different imaging protocols, no direct comparison is possible between the two MRI methods. However, given the lower SNR of the images with 3PI than 2PD, at the expense of much larger spatial coverage, the higher correlation obtained with 3PI illustrates its higher performance. The fact that no substantial heterogeneity of HFF was found in this study further supports such evaluation in favor of the 3PI.
The strong correlation between the 2PD and 3PI data despite the limited dynamic range with 2PD may be due to the fact that there was no remarkable HFF heterogeneity and that none of the subjects had an HFF higher than 50% as measured by MRS. In support of the simple, magnitude-based 2PD, many studies have reported HFFs whose extent does not exceed 50% measured by either biopsy or MRS under a variety of clinical conditions (e.g., phosphorylase-b kinase deficiency with HFF ranging from 0–10% (37), familial hypobetalipoproteinemia (2–37%) (23), healthy (1–39%) (18), healthy+hepatic steatosis (1–43%) (38), general population (0–48%) (24) by MRS; nonalcoholic fatty liver disease (11–29%) (16), liver cirrhosis (0–25%) (14) by biopsy). Nonetheless, to avoid potential errors with the 2PD because of its limited dynamic range, phase-based multipoint water–fat separation methods such as IDEAL are preferable.
Due to the symmetric three-point data acquisition, the performance of the 3PI implemented herein is also limited for pixels containing water and fat in equal amounts (5, 17). While such a limitation can be substantially lessened by asymmetric sampling (5, 17), the resulting increase in TR would render the 3PI images collected using bSSFP more subject to banding artifacts and spatially dependent measurement precision due to the pass-band mismatch effect (39). This would have been particularly problematic in our study, as the majority of the subjects were obese, requiring a larger FOV to be shimmed. Such a tradeoff between the higher SNR and the nonoptimal sampling strategy with bSSFP is unavoidable in multipoint water–fat separation MRI (39).
The high correlation of the MRI data with the MRS data supports the use of multipoint water–fat separation MRI for better spatial coverage in a relatively shorter scan time. However, the apparent fat content measured by the MRI methods can be significantly variable depending on the sampling strategy and post-data processing for a given sequence as well as on the choice of sequences. Such variability may limit the clinical application of the MRI methods, particularly when a diagnosis of early fatty liver needs to be performed. Therefore, protocol-specific establishment of cutoffs for liver fat content may be necessary. Furthermore, to fully benefit from the high spatial coverage with MRI, phase-based water–fat separation methods such as IDEAL may be preferable in consideration of the limited dynamic range of the magnitude-based methods.