Sensitivity encoding reconstruction with nonlocal total variation regularization

Authors

  • Dong Liang, Department of Electrical Engineering and Computer Science, University of Wisconsin—Milwaukee, Milwaukee, Wisconsin 53211, USA
  • Haifeng Wang, Department of Electrical Engineering and Computer Science, University of Wisconsin—Milwaukee, Milwaukee, Wisconsin 53211, USA
  • Yuchou Chang, Department of Electrical Engineering and Computer Science, University of Wisconsin—Milwaukee, Milwaukee, Wisconsin 53211, USA
  • Leslie Ying (corresponding author), Department of Electrical Engineering and Computer Science, University of Wisconsin—Milwaukee, 3200 N. Cramer Street, Milwaukee, WI 53211, USA

Abstract

In sensitivity encoding reconstruction, the inverse problem becomes seriously ill conditioned, and thus the signal-to-noise ratio becomes poor, when a large acceleration factor is employed. Total variation (TV) regularization has been used to address this issue and has been shown to preserve sharp edges better than Tikhonov regularization, but it may cause a blocky effect. In this article, we study nonlocal TV regularization for noise suppression in sensitivity encoding reconstruction. The nonlocal TV regularization method extends the conventional TV norm to a nonlocal version by introducing a weighted nonlocal gradient function, calculated from the weighted differences between the target pixel and its generalized neighbors, where the weights incorporate prior information about the image structure. The method not only inherits the edge-preserving advantage of TV regularization but also overcomes the blocky effect. Experimental results from in vivo data show that nonlocal TV regularization is superior to the existing competing methods in preserving fine details and reducing noise and artifacts. Magn Reson Med, 2011. © 2010 Wiley-Liss, Inc.

In parallel magnetic resonance imaging (pMRI), k-space data are acquired from multiple channels simultaneously so that they can be sampled at a rate lower than the Nyquist sampling rate. Standard pMRI reconstruction methods include sensitivity encoding (SENSE) (1, 2), simultaneous acquisition of spatial harmonics (SMASH) (3), and generalized autocalibrating partially parallel acquisitions (GRAPPA) (4). Among them, SENSE is known, in theory, to give an exact reconstruction of the imaged object in the absence of noise. In practice, however, a well-known issue with SENSE is noise amplification due to the ill-conditioned nature of the inverse problem. The issue is especially serious when a large reduction factor is employed.

Regularization (5–15) has been employed as one of the simplest techniques to address the ill-conditioning issue in SENSE, by solving an unconstrained minimization problem:

    \hat{f} = \arg\min_{f} \left\{ \| d - E f \|_2^2 + \lambda\, G(f) \right\}    [1]

where d is the vector formed from all k-space data acquired in all channels, f is the unknown n-dimensional vector defining the desired full field of view (FOV) image to be computed, E is the sensitivity encoding matrix comprising Fourier encoding and sensitivity weighting, G(f) is the regularization term describing the prior information of f, and λ > 0 is the regularization parameter chosen to balance the data consistency (first term) against the deviation (second term) from the prior information of the image. The parameter λ can be set to a global value heuristically or automatically using the L-curve method, variance partitioning, or maximum likelihood estimation (7–9). It can also be assigned multiple values based on different noise levels in the wavelet domain (10, 11) or on the g-factor map (16).
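As a minimal illustration of Eq. [1] (a Python/NumPy sketch, not the authors' MATLAB implementation), the objective can be evaluated as follows; the forward model assumes Cartesian sampling, and the regularizer G is a placeholder:

    import numpy as np

    def sense_objective(f, d, sens, mask, lam, G):
        """Evaluate ||d - E f||_2^2 + lambda * G(f) for an image estimate f.

        f    : (ny, nx) complex image estimate
        d    : (nc, ny, nx) acquired k-space, zero-filled where not sampled
        sens : (nc, ny, nx) coil sensitivity maps
        mask : (ny, nx) boolean Cartesian sampling mask
        G    : callable regularizer, e.g. a TV or NLTV norm
        """
        # E f: sensitivity weighting, then Fourier encoding, then undersampling
        Ef = mask * np.fft.fft2(sens * f, axes=(-2, -1))
        return np.sum(np.abs(d - Ef) ** 2) + lam * G(f)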

The most commonly used regularization techniques in SENSE include Tikhonov regularization (7, 12), total variation (TV) regularization (13, 14), and regularization with a Markov random field (MRF) model (9, 15). A common issue with Tikhonov regularization is its smoothing effect on edges, due to the assumption that intensities vary smoothly over the entire image. To overcome this issue, edge-preserving regularization techniques have attracted much attention. TV regularization preserves edges by imposing the constraint that the image is piecewise smooth, where the regularization term is the TV norm of an image, defined as a function of the image gradient (17):

    \| f \|_{TV} = \sum_{p} \sqrt{ \left| \nabla_x f(p) \right|^2 + \left| \nabla_y f(p) \right|^2 }    [2]

where ∇_x and ∇_y denote the gradients along the horizontal and vertical directions, respectively, and |·| denotes the complex modulus. Other edge-preserving regularization techniques use an MRF model with edge-preserving priors (9, 15). For example, in EPIGRAM (15), a truncated Gibbs prior is used, with the regularization function in Eq. [1] being

    G(f) = \sum_{p} \min\!\left( \left| \nabla f(p) \right|,\; T \right)    [3]

where ∇ denotes the local gradient between the nearest neighboring pixels and T is a constant. The method penalizes the intensity differences between neighbors only when they are below a predefined threshold T such that edges with the difference above the threshold are preserved. A drawback of TV and MRF-based regularization methods is that they both use local information only, which may cause blocky effects with a loss of fine structures while preserving edges in reconstruction.
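For concreteness, a minimal sketch of the TV norm of Eq. [2] and of a truncated penalty in the spirit of Eq. [3] is given below; the forward-difference discretization is one common choice, and the exact forms used in Refs. (15, 17) may differ:

    import numpy as np

    def tv_norm(f):
        """TV norm of Eq. [2]: sum over pixels of the gradient modulus,
        using forward differences with the last row/column replicated."""
        gx = np.diff(f, axis=1, append=f[:, -1:])  # horizontal gradient
        gy = np.diff(f, axis=0, append=f[-1:, :])  # vertical gradient
        return np.sum(np.sqrt(np.abs(gx) ** 2 + np.abs(gy) ** 2))

    def truncated_gibbs_penalty(f, T):
        """Edge-preserving prior in the spirit of Eq. [3]: local differences
        are penalized only up to the threshold T, so strong edges survive."""
        gx = np.abs(np.diff(f, axis=1))
        gy = np.abs(np.diff(f, axis=0))
        return np.sum(np.minimum(gx, T)) + np.sum(np.minimum(gy, T))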

Nonlocal regularization methods (18–24) such as nonlocal TV (NLTV) and nonlocal H1 have recently been studied in image denoising to address the blocky effect by employing nonlocal pixels when calculating the gradients in the regularization term. Nonlocal H1 is regarded as a variational equivalent of the nonlocal means filter and has been used in SENSE reconstruction to improve the signal-to-noise ratio (25). In this study, we investigate NLTV regularization for SENSE to address the blocky effect of TV-regularized SENSE. NLTV has demonstrated performance superior to both TV (21–24) and nonlocal H1 (18, 20) in other regularization applications. In NLTV, the gradient for the regularization term is calculated from pixels belonging to the whole image, instead of only the nearest neighboring pixels as in TV regularization. In addition, a weighted graph between the current pixel and all image pixels is used in calculating the gradient. These differences allow NLTV regularization to effectively remove noise without destroying the salient features of the original image. Our in vivo results demonstrate that the proposed NLTV regularization preserves more details and fine structures than the existing regularization methods while suppressing noise.

THEORY

With NLTV as the regularization term, the regularized SENSE reconstruction is given by

    \hat{f} = \arg\min_{f} \left\{ \| d - E f \|_2^2 + \lambda \| f \|_{NLTV} \right\}    [4]

where ‖f‖_NLTV is the NLTV norm of the image vector f, calculated as ‖f‖_NLTV = ‖∇_w f‖_{2,1}, where ‖·‖_{2,1} denotes the ℓ_{2,1} norm of a matrix, defined as the ℓ_2 norm of each column followed by the ℓ_1 norm of the resulting row vector. The n × n matrix ∇_w f collects the weighted nonlocal image gradients of all pixels and is defined as

    \nabla_w f = \left[\, \nabla_w f(1) \;\; \nabla_w f(2) \;\; \cdots \;\; \nabla_w f(n) \,\right]    [5]

where ∇_w f(q,p), the (q,p) element of ∇_w f, is the weighted image gradient between pixel p and its nonlocal neighbor q (20, 21):

    \nabla_w f(q, p) = \sqrt{ w(p,q) } \left( f(q) - f(p) \right)    [6]

with w(p,q) being a graph weight function and f(p) and f(q) being the image values at pixels p and q. The pth element of the adjoint operator of ∇_w f is the divergence of the nonlocal gradient field, defined as

    \mathrm{div}_w (\nabla_w f)(p) = \sum_{q} \left( \sqrt{ w(p,q) }\, \nabla_w f(q,p) - \sqrt{ w(q,p) }\, \nabla_w f(p,q) \right)    [7]

where ∇_w f(p) is the pth column vector and ∇_w f(q,p) is the (q, p) element of the nonlocal gradient ∇_w f in Eq. [5]. The graph weight function w(p,q) of the image determines how much the difference between pixels p and q is penalized. The more similar the neighborhoods of p and q are, the more the difference should be penalized, and thus the larger the weight function should be. Given an image f, the graph weight function is calculated by

    w(p,q) = \frac{1}{Z(p)} \exp\!\left( - \frac{ \| f(p^+) - f(q^+) \|_2^2 }{ 2 \sigma^2 } \right)    [8]

where Z(p) = Σ_q exp(−‖f(p^+) − f(q^+)‖₂² / (2σ²)) is the normalization constant, and the weight is restricted to pixels q within a δ × δ search window centered at p (w(p,q) = 0 outside the window). The parameter δ > 0 controls the nonlocality of the method and also allows the computation to be sped up. The parameter σ corresponds to the noise level in general and is usually set to the standard deviation of the noise (20, 23, 24). f(p^+) and f(q^+) are vectors representing the neighborhoods of pixels p and q in the image f, respectively. The neighborhood of a pixel is defined as a two-dimensional patch of size m × m (m an odd integer) centered at the pixel. Thus, w(p,q) is constructed by calculating the ℓ₂ norm of the difference between the neighborhoods of pixels p and q in an image. The definition in Eq. [8] implies that the graph weight function is significant only if the neighborhoods of q and p are similar in the ℓ₂ sense. NLTV therefore promotes similarity between regions that are known to be similar by penalizing the difference between these regions with a large weight function. In practice, when no exact knowledge about the original image is available, a reference image with features similar to the original image is usually used to estimate the graph weight function.
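A minimal sketch of the weight computation of Eq. [8] follows; the Gaussian scaling by 2σ² and the reflect padding are assumptions of this illustration, and the window restriction corresponds to the parameter δ:

    import numpy as np

    def graph_weights(ref, p, delta=11, m=5, sigma=0.05):
        """Graph weight function w(p, q) of Eq. [8] for one target pixel p,
        over all pixels q inside a delta x delta search window around p.

        ref : real-valued reference image (e.g. magnitude of a Tikhonov
              reconstruction); p : (row, col) of the target pixel.
        Returns a dict mapping q -> w(p, q), normalized by Z(p).
        """
        r, hw = m // 2, delta // 2
        pad = np.pad(ref, r, mode='reflect')  # every pixel gets a full patch

        def patch(q):  # m x m neighborhood f(q+) centered at pixel q
            return pad[q[0]:q[0] + m, q[1]:q[1] + m]

        fp = patch(p)
        w = {}
        for i in range(max(p[0] - hw, 0), min(p[0] + hw + 1, ref.shape[0])):
            for j in range(max(p[1] - hw, 0), min(p[1] + hw + 1, ref.shape[1])):
                d2 = np.sum(np.abs(fp - patch((i, j))) ** 2)  # ||f(p+)-f(q+)||^2
                w[(i, j)] = np.exp(-d2 / (2.0 * sigma ** 2))
        Z = sum(w.values())  # normalization constant Z(p)
        return {q: v / Z for q, v in w.items()}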

Figure 1 illustrates how the image gradient is calculated for both the NLTV norm and the conventional TV norm. In summary, NLTV differs from the conventional TV norm in two respects. First, the gradient in the TV norm is calculated using the two nearest (horizontal and vertical) neighbors of the target pixel, whereas all pixels are used as neighbors in NLTV. Second, equal weights are applied in TV to the image gradients at all pixels, whereas in NLTV the weights are spatially varying, depending on the similarity between the neighborhoods of the target and neighboring pixels.
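Given such weights, a sketch of the NLTV norm ‖∇_w f‖_{2,1} of Eqs. [5] and [6] might look as follows (dictionary-based and unoptimized, for clarity only):

    import numpy as np

    def nltv_norm(f, weights):
        """NLTV norm ||grad_w f||_{2,1}: l2 norm of the weighted nonlocal
        gradient at each pixel (Eq. [6]), summed over pixels (the l1 part).

        weights : dict mapping each pixel p -> {q: w(p, q)}, e.g. built by
                  calling graph_weights for every pixel of a reference image.
        """
        total = 0.0
        for p, wp in weights.items():
            # column p of grad_w f: sqrt(w(p,q)) * (f(q) - f(p)) over q
            col = np.array([np.sqrt(w) * (f[q] - f[p]) for q, w in wp.items()])
            total += np.sqrt(np.sum(np.abs(col) ** 2))  # l2 norm of column p
        return total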

Figure 1.

Calculation of the image gradient in conventional TV regularization (left) and NLTV regularization (right). The gradient of TV is usually calculated as the difference between the target pixel in red and its two nearest (vertical and horizontal) neighbors in blue. In NLTV, all pixels are used as neighbors in calculating the weighted gradient. The weights depend on the similarity of the patch structures between the target pixel in red and the neighboring pixels in blue. For example, the difference between pixels p and k is weighted by w(p,k), which is larger than weights w(p,i) and w(p,j) because the patch around pixel k has a “T” structure, which is similar to that in the patch around pixel p.

MATERIALS AND METHODS

The proposed NLTV regularization method was evaluated on two in vivo datasets of the human brain. A sagittal dataset was collected on a 1.5-T SIEMENS Avanto system with a four-channel head coil using a two-dimensional T1-weighted spin echo protocol (echo time/pulse repetition time = Min Full/500 msec, 24 cm FOV, 256 × 256 matrix). A set of axial data was acquired from a 1.5-T commercial scanner (GE Healthcare, Waukesha, WI) with an eight-channel head coil (Invivo, Gainesville, FL) using a two-dimensional T1-weighted spin echo protocol (echo time/pulse repetition time = 14/500 msec, 24 cm FOV, three slices, 256 × 256 matrix). Two different slices were used for reconstruction (called axial 1 and axial 2). Informed consent was obtained from the volunteer in accordance with the institutional review board policy. The full k-space data were acquired, and the square root of sum-of-squares (SoS) reconstruction was used as the reference for comparison. A set of low-resolution images generated from the central 32 k-space lines was used to estimate the channel sensitivity profiles. To simulate the reduced SENSE data, phase-encoding lines were manually removed to obtain the desired reduction factors. Net reduction factors of 1.97 (three central lines with 2× outer reduction) and 2.67 (14 central lines with 3× outer reduction) were used for the sagittal and axial datasets, respectively.

In the proposed NLTV regularization method, nonlinear conjugate gradient (26) was used to reconstruct the image iteratively, as done in Ref. (20). The regularization parameter λ was selected using a modified L-curve method (22). Specifically, the NLTV norm versus data consistency curve is plotted for different values of λ, and the final λ is selected as half of the value at the "corner," because the conventional L-curve method is observed to over-smooth the image (22). To estimate the graph weight function w(p,q), a reference image with features similar to the desired image is needed. Here we used the SENSE reconstruction with Tikhonov regularization as the reference image to obtain an initial weight function. As the Tikhonov-regularized reconstruction usually exhibits residual aliasing, the weight function was then updated iteratively during the iterative reconstruction. This updating process makes the function in Eq. [4] nonconvex and thus may affect convergence. For this reason, we chose to update the weight function after each outer iteration (consisting of eight inner iterations) for better stability; a skeleton of the procedure is sketched below.
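The following sketch shows only the loop structure; tikhonov_sense, build_all_weights, and ncg_step are hypothetical placeholders standing in for the actual solver routines:

    def nltv_sense_recon(d, sens, mask, lam, n_outer=3, n_inner=8):
        """Skeleton of the iterative reconstruction: the graph weights are
        held fixed during each block of inner nonlinear conjugate gradient
        iterations and refreshed after every outer iteration."""
        f = tikhonov_sense(d, sens, mask)     # initial reference image
        for _ in range(n_outer):
            weights = build_all_weights(f)    # w(p,q) from current estimate
            for _ in range(n_inner):
                # one nonlinear CG step on Eq. [4] with the weights frozen
                f = ncg_step(f, d, sens, mask, lam, weights)
        return f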

In all reconstructions, the window size for the nonlocal neighbors in Eq. [8] was chosen to be δ = 11, meaning that only neighbors within an 11 × 11 window around the target pixel were considered when calculating the nonlocal image gradient. The patch size was chosen to be m = 5, meaning that a 5 × 5 neighborhood was used to calculate the graph weight function between pixels. The selection of these two parameters is discussed in more detail in the Results section. In addition, we used the wavelet-based noise-estimation method of Ref. (27) to estimate the parameter σ from the reference image. More precisely, the noise level is given by

    \sigma = \mathrm{median}\!\left( | c | \right) / 0.6745    [9]

where the vector c contains the wavelet coefficients of the finest subband of the reference image after a three-level wavelet decomposition using the Daubechies-4 wavelet (28).
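Assuming the finest diagonal detail subband is used, as in the standard median-absolute-deviation estimator (the pooling in Ref. (27) may differ), Eq. [9] can be sketched with PyWavelets:

    import numpy as np
    import pywt  # PyWavelets

    def estimate_sigma(ref):
        """Noise estimate of Eq. [9]: sigma = median(|c|) / 0.6745, with c
        the finest-subband coefficients of a 3-level db4 decomposition."""
        coeffs = pywt.wavedec2(np.abs(ref), 'db4', level=3)
        _, _, cD = coeffs[-1]  # finest-scale diagonal detail subband
        return np.median(np.abs(cD)) / 0.6745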

In reconstruction, Tikhonov regularization, TV regularization, and NLTV regularization were used to generate the final image from the reduced data. In Tikhonov regularization, a nearest-neighbor Laplacian matrix was used and the reference image was set to be zero as in Ref. (29). The regularization parameter λ in both Tikhonov and TV regularizations was automatically chosen using the same modified L-curve method in (22). All reconstruction methods were implemented in MATLAB (MathWorks, Natick, MA) on a workstation with 2.33-GHz CPU and 2-GB RAM. The reconstructed images from the same dataset are shown in the same figure for visual comparison of image noise, artifacts, and resolution. In addition, normalized mean-squared-error (NMSE) (9) was used to evaluate the noise-suppression capability of each method quantitatively.
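For reference, the NMSE used in the comparisons can be computed as follows; normalization by the reference energy is one common convention and may differ from the exact definition in Ref. (9):

    import numpy as np

    def nmse(recon, reference):
        """Normalized mean-squared error against the SoS reference image."""
        return np.sum(np.abs(recon - reference) ** 2) / np.sum(np.abs(reference) ** 2)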

We also study the use of the NLTV norm in the reconstruction formula for compressed sensing (30, 31). Whether used in conventional MRI (26) or pMRI (29), the compressed sensing reconstruction formula usually includes a TV norm as a piecewise-smoothness constraint, in conjunction with other sparsifying transforms such as the wavelet transform. For example, in pMRI, the image is reconstructed from multichannel randomly undersampled data using

    \hat{f} = \arg\min_{f} \left\{ \| d - E f \|_2^2 + \lambda_1 \| \Psi f \|_1 + \lambda_2 \| f \|_{TV} \right\}    [10]

where Ψ is a sparsifying transform and λ₁, λ₂ > 0 are regularization parameters. The NLTV norm can be used to replace the TV norm in Eq. [10]. The reconstruction formula becomes

    \hat{f} = \arg\min_{f} \left\{ \| d - E f \|_2^2 + \lambda_1 \| \Psi f \|_1 + \lambda_2 \| f \|_{NLTV} \right\}    [11]

To compare the performance of compressed sensing with NLTV and TV terms, we used Eqs. [10] and [11] to reconstruct images from the multichannel, randomly undersampled axial data with a net reduction factor of 4. The Daubechies-4 wavelet transform (28) was used as the sparsifying transform Ψ. A sketch of the resulting objective is given below.
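Combining the earlier pieces, a hedged sketch of the objective of Eq. [11] (reusing the illustrative nltv_norm from the Theory section) is:

    import numpy as np
    import pywt

    def cs_nltv_objective(f, d, sens, mask, lam1, lam2, weights):
        """Objective of Eq. [11]: data consistency + l1 norm of the db4
        wavelet coefficients + NLTV norm (nltv_norm as sketched earlier)."""
        Ef = mask * np.fft.fft2(sens * f, axes=(-2, -1))
        data_term = np.sum(np.abs(d - Ef) ** 2)
        arr, _ = pywt.coeffs_to_array(pywt.wavedec2(f, 'db4', level=3))
        return data_term + lam1 * np.sum(np.abs(arr)) + lam2 * nltv_norm(f, weights)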

RESULTS

Figure 2 shows the reconstructions of the sagittal dataset. As indicated by arrowheads, TV regularization loses some fine structures, and Tikhonov regularization shows severe aliasing artifacts. In contrast, NLTV regularization preserves many more details visually without compromising the signal-to-noise ratio or introducing artifacts. The reconstructions of two slices from the axial dataset are shown in Fig. 3 with a net reduction factor of 2.67. The SoS, NLTV regularization, and TV regularization results are shown from top to bottom. It is seen, especially in the regions pointed to by arrowheads, that the conventional TV regularization exhibits considerable blocky effects. Some high-intensity details are lost due to the piecewise-smoothness constraint in TV regularization, which makes the image appear blocky. This is because TV is a local feature, calculating the gradient from only the two nearest neighbors. In contrast, NLTV regularization preserves more detailed structures because these structures are expressed implicitly in the weight function of the NLTV norm but not in the conventional TV norm. The improvement of NLTV over TV can be better visualized in Fig. 4, where portions of the images in Figs. 2 and 3 are zoomed. It is seen that NLTV regularization loses only a little resolution, while more is lost in TV regularization. Table 1 compares the NMSE values for all reconstructions. NLTV regularization has the smallest NMSE among all methods. This quantitative measure agrees with the visual observations of the reconstructed images.

Figure 2.

Reconstructions from four-channel sagittal data using different regularization methods. The net reduction factor is 1.97. The Tikhonov regularized reconstruction shows severe aliasing artifacts, and TV regularization loses some fine structures. NLTV regularization preserves many more details visually without compromising the signal-to-noise ratio or introducing artifacts.

Figure 3.

Reconstructions of two slices from the eight-channel axial data. The net reduction factor is 2.67. TV regularizations exhibit considerable blocky effects. Some high-intensity details are lost due to the piecewise smooth constraint in TV regularization. In contrast, NLTV regularized reconstructions preserve more fine structures while suppressing noise.

Figure 4.

Zoomed-in portions of the sagittal (top), axial 1 (middle) and axial 2 (bottom) reconstructions. It is seen that nonlocal TV regularization only slightly loses some resolution, while more details are lost in TV regularization.

Table 1. NMSE (×10⁻²) of Reconstructions Using Different Regularization Methods

  Method     Sagittal (R = 1.97)   Axial 1 (R = 2.7)   Axial 2 (R = 2.7)
  NLTV       0.20                  0.21                0.24
  TV         0.28                  0.29                0.34
  Tikhonov   0.40                  0.35                0.41

The NLTV regularization preserves more resolution than the existing methods at the cost of increased computational complexity. Each nonlinear conjugate gradient step in NLTV regularization requires 22.1 sec when the divergence calculation defined in Eq. [7] is written in C as a dynamic link library. Updating the weight function requires an additional 16.4 sec for the calculation of Eq. [8] in MATLAB. The NLTV regularized reconstruction needs three outer iterations, which comprise 24 nonlinear iterations in total and two weight updates. Table 2 shows the total running time for all methods on all datasets. The computation time for NLTV regularized reconstruction is much longer than that for Tikhonov and TV regularization; it is expected to be reduced by optimizing the NLTV algorithm and implementing it on a GPU.

Table 2. Computation Time (sec) for All Datasets Using Different Methods

  Dataset    NLTV    TV     Tikhonov
  Sagittal   327.5   41.9   36.4
  Axial 1    568.2   70.9   60.4
  Axial 2    568.5   71.1   60.7

To demonstrate the convergence property of NLTV, Fig. 5 plots the NMSEs of the axial reconstructions at each iteration, as well as the NMSEs of the estimated weight functions at each iteration. In calculating the NMSE of the weight function, the true weight function was estimated from the SoS reconstruction. It is seen that the updated weight function converges after only a few iterations and the updating process does not compromise the stability of the method. The intermediate reconstructions at different iterations are shown in Fig. 6. Aliasing artifacts and noise are visible in the first iteration but are alleviated with more iterations. After the third iteration, further iterations barely improve the image quality. Three outer iterations were therefore used in all our experiments.

Figure 5.

NMSE plots of the intermediate NLTV regularized reconstructions and the corresponding estimated weight functions versus the number of iterations. All curves show the method converges after only a few iterations.

Figure 6.

The intermediate NLTV regularized reconstructions at different iterations. Aliasing artifacts and noise are visible in the first iteration but are alleviated with more iterations.

Figure 7a shows the L-curves of the NLTV regularization for the two slices of the axial brain dataset with a net reduction factor of 2.67. In both L-curves, the regularization parameter λ varies in the range [0.002, 0.5]. The corners of the two curves are located at 0.15 and 0.2, respectively, so the final λs for these two slices were selected as 0.075 and 0.1 according to the modified L-curve method. To study the effects of the search window size δ and patch size m on the reconstruction quality, the NMSEs are plotted as functions of the search window size and patch size in Fig. 7b and c, respectively. It is observed that there are optimal values of both δ and m that give the smallest NMSEs. In addition, both curves stay rather flat for a range of sizes near the optimal values, suggesting that the proposed method is rather insensitive to the choice of δ and m within a range of values. We empirically selected δ = 11 and m = 5 for all datasets, considering that smaller δ and m require less computation time. This choice is consistent with those used in the nonlocal TV literature (18–24) for similar image sizes. A sketch of the modified L-curve rule is given below.
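The following sketch uses a simple finite-difference curvature estimate to locate the corner; the exact corner-detection procedure in Ref. (22) may differ:

    import numpy as np

    def modified_l_curve_lambda(lambdas, data_terms, reg_terms):
        """Modified L-curve rule: find the corner (maximum curvature) of
        the log-log data-consistency vs. NLTV-norm curve, then return half
        of the lambda at the corner, as the conventional corner is observed
        to over-smooth the image (22)."""
        x = np.log(np.asarray(data_terms))
        y = np.log(np.asarray(reg_terms))
        dx, dy = np.gradient(x), np.gradient(y)
        ddx, ddy = np.gradient(dx), np.gradient(dy)
        curvature = np.abs(dx * ddy - dy * ddx) / ((dx ** 2 + dy ** 2) ** 1.5 + 1e-12)
        return np.asarray(lambdas)[np.argmax(curvature)] / 2.0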

Figure 7.

a: L-curves for the NLTV regularization with the regularization parameter λ in the range [0.002, 0.5]. b,c: NMSE plots of the NLTV regularized reconstructions versus the search window size and the patch size. All curves stay rather flat for a range of sizes near the optimal values, suggesting that the proposed method is rather insensitive to the choice of δ and m within a range of values.

Figure 8 shows the compressed sensing reconstruction results using NLTV and TV regularizations with a net reduction factor of 4. As indicated by arrowheads, the compressed sensing reconstructions using TV regularization show a severe over-smoothing effect with a loss of details. In contrast, NLTV regularization preserves most details. The results demonstrate that compressed sensing reconstruction can also benefit from the use of NLTV in place of TV.

Figure 8.

Compressed sensing reconstructions for pMRI with a net reduction factor of 4 when NLTV (middle) and TV (bottom) are used in conjunction with wavelet transform as the sparsifying transform. The results demonstrate that NLTV is superior to TV in preserving resolution while suppressing the noise when used for compressed sensing reconstruction.

All the above results demonstrate that the NLTV regularization method is superior to the existing regularization methods, especially in preserving resolution while suppressing noise. This superior performance is primarily due to the introduction of a weighted nonlocal gradient function, which incorporates prior information about the image structures through the graph weight function w(p,q). Therefore, the accuracy of the weight function is critical to the performance of NLTV regularization. To demonstrate the effect of the accuracy of the weight function, we compare in Fig. 9a,b the reconstructions obtained when the weight function is fixed for all iterations and when it is updated iteratively. The Tikhonov regularized reconstruction was used to calculate the fixed weight function as well as the initial one for the updated case. It is seen that the reconstruction with a fixed weight function exhibits artifacts in the center of the FOV, while the one with updated weight functions shows no such artifacts. This is because the weight function, when updated, becomes more and more accurate in representing the true image structures, and thus the reconstruction is closer to the true image. In addition, to demonstrate the sensitivity of the final reconstruction to the initial reference used for calculating the weight function, we also show in Fig. 9c the reconstruction when the zero-filled Fourier reconstruction is used as the initial reference instead of the Tikhonov regularized reconstruction. The reconstructions with the two different initial reference images show little difference (although the one with a poor initial reference may need more iterations to converge). This suggests that the updating procedure makes the algorithm rather insensitive to inconsistency between the initial reference image and the original image.

Figure 9.

Comparison of reconstructions using a fixed weight function (a) and updated weight functions with Tikhonov regularization reconstructions (b) and zero-filled Fourier reconstructions (c) as the initial reference. The net reduction factor is R = 2.67. It is seen that the reconstruction with a fixed weight function exhibits artifacts in the center of FOV, while the one with updated weight functions shows no such artifacts. It also suggests that the updating procedure makes the algorithm rather insensitive to the inconsistency between the initial reference image and the original image.

DISCUSSION AND CONCLUSION

NLTV regularization addresses the ill-conditioning problem and improves the reconstruction signal-to-noise ratio in parallel imaging. The NLTV regularization replaces the conventional gradient function used in TV with a weighted nonlocal gradient function to reduce the blocky effect of TV regularization. The experimental results using in vivo data demonstrate that the proposed NLTV regularization method can preserve more details than the existing methods while suppressing noise.

The proposed NLTV regularization method can also be extended using the different priors employed in MRF-based regularization. Most MRF-based edge-preserving regularization methods (9, 15) calculate the difference only between the nearest neighbors, but with priors other than the Laplacian implicit in TV. For images whose contents are better characterized by non-Laplacian priors [e.g., the truncated Gibbs prior in (15) or the Huber prior in (9)], a nonlocal extension of the MRF-based regularization methods may be of interest, obtained by replacing the TV norm with the "norm" associated with the corresponding prior.

Although the results presented here are based on reconstructions using data from Cartesian trajectories, the proposed NLTV regularization method should be easily extended to non-Cartesian trajectories.
