Systematic evaluation of iterative deep neural networks for fast parallel MRI reconstruction with sensitivity‐weighted coil combination

To systematically investigate, for sensitivity-encoded accelerated parallel MR image reconstruction, the influence of various data consistency layers and regularization networks with respect to variations in the training and test data domain.


| INTRODUCTION
Parallel imaging (PI) 1-3 forms the foundation of accelerated data acquisition in magnetic resonance imaging (MRI), which is otherwise tremendously time-consuming. In the last decade, PI combined with compressed sensing (CS) techniques has resulted in substantial improvements in acquisition speed and image quality. [4][5][6][7][8][9] Although PI-CS can achieve state-of-the-art performance, designing effective regularization schemes and tuning hyper-parameters is not trivial. Starting in 2016, deep learning algorithms have become extremely popular and effective tools for data-driven learning of inverse problems and have enabled progress beyond the limitations of CS.
Deep learning for image reconstruction is an enormously fast-growing field, which makes it challenging to keep an overview of the different approaches. For details on the developments of deep learning for MRI reconstruction, we refer the interested reader to survey papers. [10][11][12][13] In this work, we focus on reviewing relevant approaches for 2D MRI reconstruction. Table 1 gives a compact overview of already published approaches. The different approaches can be distinguished based on (1) the acquisition type, that is, single-coil or multi-coil reconstruction, (2) the type of coil combination used in multi-coil approaches, (3) the type of application, (4) the realization of consistency to measured k-space data, (5) the network architecture, and (6) the use of adversarial training strategies in addition to commonly used similarity measures, for example, mean squared error (MSE). Several things can be noted: A majority of approaches work on single-coil reconstruction; however, the standard approach for MR acquisition is PI. Within the multi-coil approaches, different types of coil combinations, that is, root-sum-of-squares (RSS) and sensitivity-weighted combination, occur, similar to a preference towards sensitivity encoding (SENSE) 2 or generalized autocalibrating partially parallel acquisitions (GRAPPA). 3 Ref. 20 showed the influence of training specific and joint networks for different anatomies of the musculoskeletal (MSK) system, including shoulder, hip, ankle, and knee images. A first instability analysis of neural networks for image reconstruction was studied in Antun et al. 40 However, a different instability analysis was conducted for each of the selected approaches, which were proposed for a single-coil or multi-coil setting, and with or without a data consistency (DC) layer. Furthermore, the approaches were tested on datasets that differed in levels of SNR. This makes it challenging to draw general conclusions. These settings also stand in contrast to clinical practice, where we are confronted with limited datasets and various imaged anatomies.

The aim of this work is to address the aforementioned challenges that we have observed in deep learning for parallel MRI reconstruction. We study the influence of regularization networks, DC layers, and variations in the data, in a controlled experimental setup. To the best of our knowledge, this is the first work that studies the effect of different training data configurations, including variations in anatomies and sample size, for neural network reconstructions at a large scale, using the publicly available fastMRI datasets with approximately 5400 training cases. 21 We perform an extensive evaluation of different networks with varying DC layers and regularization networks. We propose a down-up network (DUNET) as the regularization network, and we show the superior performance of the proposed DUNET compared to other state-of-the-art approaches for varying training data scenarios. These scenarios include variations in anatomy, using knee and neuro data, and variations in the number of training samples. Hence, the systematic evaluation allows us to experimentally investigate the robustness and limits of DC layers and regularization networks with respect to different acceleration factors. All experiments are performed on the fastMRI multi-coil knee and neuro datasets, 21 where a fully sampled sensitivity-combined reconstruction, with an extended set of coil sensitivity maps, 41 is used as ground truth. To allow others to reproduce our findings, we provide all our source code along with the data processing scripts for the fastMRI datasets online.

| THEORY
Accelerated MRI reconstruction aims at recovering a reconstruction x ∈ ℂ^{N_x} from a set of undersampled k-space measurements y ∈ ℂ^{N_y}, which are corrupted by additive Gaussian noise n ∈ ℂ^{N_y}, following

y = Ax + n.    (1)

This inverse problem involves a linear forward operator A: ℂ^{N_x} → ℂ^{N_y} modeling the MR physics. Here, N_x and N_y define the dimensions of the reconstruction x and the k-space data y according to the underlying multi-coil or single-coil problem. We investigate a linear multi-coil operator A using an extended set of M coil sensitivity maps to overcome field of view (FoV) issues. 41
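As a concrete illustration, the Cartesian multi-coil forward operator and its adjoint can be sketched in a few lines of NumPy. This is our own minimal sketch, not the paper's code release (function names such as `sense_forward` are ours), assuming a single set of coil sensitivity maps and an orthonormal FFT:

```python
import numpy as np

def sense_forward(x, smaps, mask):
    """Apply the multi-coil SENSE forward operator A = M F C.

    x:     complex image, shape (Nx, Ny)
    smaps: coil sensitivity maps, shape (Q, Nx, Ny)
    mask:  binary Cartesian sampling mask, shape (Nx, Ny)
    Returns undersampled multi-coil k-space, shape (Q, Nx, Ny).
    """
    coil_images = smaps * x                          # C_q x for each coil q
    kspace = np.fft.fft2(coil_images, norm="ortho")  # F, orthonormal 2D FFT
    return mask * kspace                             # M zeroes un-acquired lines

def sense_adjoint(y, smaps, mask):
    """Adjoint A* = C* F^{-1} M^T: sensitivity-combine the zero-filled images."""
    imgs = np.fft.ifft2(mask * y, norm="ortho")
    return np.sum(np.conj(smaps) * imgs, axis=0)
```

A useful sanity check is the adjoint identity ⟨Ax, y⟩ = ⟨x, A*y⟩, which holds here because the orthonormal FFT is unitary and the mask is its own transpose.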

| Learning unrolled optimization
An approximate solution x ∈ ℂ^{N_x} to the inverse problem in Equation (1) is obtained by minimizing a variational problem consisting of a DC term D[Ax, y] and a regularizer ℛ[x]. While ℛ[x] is fixed in classical CS approaches, we learn ℛ[x] from data. A solution is obtained by alternating optimization of the regularization and DC terms for a fixed number of iterations T. 11,14,16,29 We define the fixed unrolled algorithm for MRI reconstruction as

x^{t+1/2} = x^t − f_θ(x^t),
x^{t+1} = g(x^{t+1/2}, y),

for 0 ≤ t < T (see Figure 1). First, we take a step along the direction of the negative gradient −∇_x ℛ, which is replaced by a regularization network −f_θ with trainable parameters θ. Hence, the regularization network naturally learns the residual.
The regularization network f: ℂ^{N_x} → ℂ^{N_x} has complex-valued input and output channels, represented as a two-channel real-valued image, and the same network is applied separately to x_m, m = 1, …, M. The DC layer is denoted by g. In the following, we describe the regularization networks and DC layers that we use in our work in more detail.
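The unrolled alternation between the regularization network f_θ and the DC layer g can be summarized in a short sketch. This is a hedged illustration with placeholder callables for f_θ and g, not the authors' implementation:

```python
import numpy as np

def unrolled_reconstruction(y, smaps, mask, f_theta, dc_layer, T=10):
    """Unrolled alternation of regularization and data consistency.

    y:        undersampled multi-coil k-space, shape (Q, Nx, Ny)
    f_theta:  learned regularization network, image -> residual estimate
    dc_layer: data-consistency update g(x, y, smaps, mask)
    T:        number of unrolled iterations (cascades)
    """
    # zero-filled adjoint reconstruction as initialization x^0
    x = np.sum(np.conj(smaps) * np.fft.ifft2(mask * y, norm="ortho"), axis=0)
    for _ in range(T):
        x = x - f_theta(x)               # residual regularization step
        x = dc_layer(x, y, smaps, mask)  # enforce consistency with k-space
    return x
```

With a zero regularizer and an identity DC layer, the loop simply returns the zero-filled initialization, which is a convenient check that the wiring is correct.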

| Regularization networks
The regularization network f_θ can be realized by any type of CNN, or it can be motivated by variational methods. 18 Commonly used regularization networks are a 5-layer CNN, 14,29 the UNET, 21,37 or the fields-of-experts model. 18,38 In this work, we introduce DUNETs that serve as an efficient alternative to the expressive UNETs. 42 The DUNET, as shown in Figure 1, first downsamples the image by convolutions with stride 2 and then performs analysis on this coarser scale. Shifting the computation to a coarser scale is not only more memory efficient, but also does not lower the reconstruction quality at the original scale. 42,43 The core of the DUNET consists of multiple down-up blocks (DUBs) applied in an iterative way. This structure allows for an efficient propagation of information at different scales. 44 The outputs of the DUBs are concatenated and further analyzed by a residual convolution/activation block, followed by sub-pixel convolutions, which are superior in terms of expressiveness and computational efficiency to upsampling convolutions. 45

| Data consistency
The DC term allows us to consider the physics of MR acquisition in the image reconstruction problem, and measures the similarity to the acquired k-space data. The DC term can be incorporated in the learning-based reconstruction procedure in several ways.
One possibility is to perform a gradient step 18 related to the DC term D[Ax, y] = (λ/2)‖Ax − y‖², that is, g(x, y) = x − λA*(Ax − y), where A* denotes the adjoint operator of A. Instead of gradient descent (GD), DC can be modeled by the proximal mapping (PM). 14,29 This is especially feasible if the PM is easy to compute and a closed-form solution exists. If no closed-form solution exists, or a solution is intractable to compute, as is typically the case for parallel MRI involving coil sensitivity maps, the PM can be solved numerically using a conjugate gradient optimizer, as presented in Ref. 14.
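The two DC variants can be sketched as follows. The gradient step is a single explicit update, while the PM, which lacks a closed form in the multi-coil setting, is approximated by conjugate gradient (CG) iterations on the normal equations (I + λA*A)z = x + λA*y, in the spirit of Ref. 14. This is an illustrative NumPy sketch under our own naming, not the paper's code:

```python
import numpy as np

def A(x, smaps, mask):
    """Multi-coil Cartesian forward operator A = M F C."""
    return mask * np.fft.fft2(smaps * x, norm="ortho")

def AH(y, smaps, mask):
    """Adjoint operator A* = C* F^{-1} M^T."""
    return np.sum(np.conj(smaps) * np.fft.ifft2(mask * y, norm="ortho"), axis=0)

def dc_gradient_step(x, y, smaps, mask, lam=1.0):
    """GD-type DC: one step along -grad_x (lam/2)||Ax - y||^2."""
    return x - lam * AH(A(x, smaps, mask) - y, smaps, mask)

def dc_proximal_cg(x, y, smaps, mask, lam=10.0, n_iter=10, tol=1e-12):
    """PM-type DC: solve (I + lam A*A) z = x + lam A*y with CG,
    since no closed form exists once coil sensitivity maps are involved."""
    def normal_op(z):
        return z + lam * AH(A(z, smaps, mask), smaps, mask)
    b = x + lam * AH(y, smaps, mask)
    z = np.zeros_like(x)
    r = b - normal_op(z)     # residual of the normal equations
    p = r.copy()
    rr = np.real(np.vdot(r, r))
    for _ in range(n_iter):
        Ap = normal_op(p)
        alpha = rr / np.real(np.vdot(p, Ap))
        z = z + alpha * p
        r = r - alpha * Ap
        rr_new = np.real(np.vdot(r, r))
        if np.sqrt(rr_new) < tol * np.linalg.norm(b):
            break                       # converged early
        p = r + (rr_new / rr) * p
        rr = rr_new
    return z
```

Because the operator I + λA*A is Hermitian positive definite, standard CG applies directly to the complex-valued images.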
To avoid the extensive computations of the PM, Duan et al 16 proposed a variable splitting (VS) scheme. To review VS, we first introduce the sensitivity-weighted multi-coil operator for the qth coil as A_q = MℱC_q. The operator C_q: ℂ^{N_x} → ℂ^{N_x} applies the qth pre-computed coil sensitivity map to x, for q = 1, …, Q. This is followed by a Fourier transform (FT) ℱ. The operator M realizes the Cartesian sampling pattern and masks out k-space lines that were not acquired. VS divides the problem defined in Equation (6) into two sub-problems by using a coil-wise splitting variable z_q ∈ ℂ^{N_x}, where α > 0 and β > 0 balance the influence of the soft constraints. Solving these sub-problems yields closed-form solutions. Here, I denotes the identity matrix and * the adjoint operation.
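Because the Cartesian mask is diagonal in k-space, the VS sub-problems admit elementwise closed-form solutions. The following sketch shows one plausible form of the two updates; the exact placement of α and β follows our reading of the VS scheme 16 and may differ in detail from the authors' implementation:

```python
import numpy as np

def vs_z_update(x, y, smaps, mask, alpha=0.1):
    """Closed-form z_q update: solves, per coil q and elementwise in
    k-space,  min_z 1/2 ||M F z - y_q||^2 + alpha/2 ||z - C_q x||^2."""
    k_cx = np.fft.fft2(smaps * x, norm="ortho")   # F C_q x for each coil
    k_z = (y + alpha * k_cx) / (mask + alpha)     # elementwise solution
    return np.fft.ifft2(k_z, norm="ortho")        # back to image space

def vs_x_update(x_reg, z, smaps, alpha=0.1, beta=0.1):
    """Closed-form x update: solves
    min_x beta/2 ||x - x_reg||^2 + alpha/2 sum_q ||z_q - C_q x||^2.
    Since I and sum_q |C_q|^2 are diagonal, this is also elementwise."""
    num = beta * x_reg + alpha * np.sum(np.conj(smaps) * z, axis=0)
    den = beta + alpha * np.sum(np.abs(smaps) ** 2, axis=0)
    return num / den
```

Both updates can be verified by checking that the gradient of their respective sub-problem vanishes at the returned point, which avoids any matrix inversion.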
All presented DC layers, that is, GD, PM, and VS, ensure soft DC to the measurement data y; the resulting architectures represent image reconstruction networks. By setting λ = 0 in Equation (5), DC is omitted and we obtain a pure residual network performing a post-processing task.

| METHODS
This section provides an overview of the used datasets and data processing as well as network setup and training. Specific details on the networks and data processing are given in the source code repository.

| fastMRI datasets
All our experiments were performed on the fastMRI knee and neuro datasets. 21 Training was performed on the multi-coil training data; testing was performed on the multi-coil validation data. The numbers of training and testing samples are denoted by N_train and N_test, respectively. The knee dataset consists of two different sequences: coronal PDw with and without fat saturation. The neuro dataset consists of four different sequences: axial T1w, T1w post contrast, T2w, and FLAIR. For details on the sequence parameters, we refer to the original publication. 21

| Data processing
We defined the target as the sensitivity-weighted coil-combined image of the fully sampled data. We estimated two sets (M = 2) of sensitivity maps according to soft SENSE 41 to account for any field-of-view issues or other obstacles in the data. The number of auto-calibration lines (ACLs) needed for sensitivity map estimation varied according to the acceleration factor and was set to 30 ACLs for R = 4 and 15 ACLs for R = 8 for the training and validation sets. These numbers were motivated by examining the number of given low frequencies in the test and challenge datasets. The data were normalized by a factor obtained from the low-frequency scans by taking the median value of the 20% largest magnitude values, to be robust against outliers.
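The normalization step described above can be sketched as follows (a minimal NumPy version of our own; the extraction of the low-frequency region itself is omitted):

```python
import numpy as np

def normalization_factor(kspace_lowfreq):
    """Normalization constant from the low-frequency (ACS) k-space data:
    the median of the 20% largest magnitude values, which is robust to
    isolated outliers, unlike a plain maximum."""
    mags = np.sort(np.abs(kspace_lowfreq).ravel())
    top20 = mags[int(0.8 * mags.size):]   # largest 20% of magnitudes
    return np.median(top20)
```

The normalized data are then obtained by dividing both the k-space measurements and the target by this factor.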
We also make use of foreground masks to stabilize training. Foreground masks were extracted semi-automatically for the knee dataset, 46 and by thresholding the RSS combination of the sensitivity maps for the neuro dataset.

| Training setup
All networks were trained using a combined ℓ1 and structural similarity index (SSIM) 47 loss, where ⊙ is the pixel-wise product and |·| denotes the RSS reconstruction used to combine the individual output channels [x_1, x_2]. This loss formulation also involves a binary foreground mask m to focus the network training on the image content and not on the background. The weighting of the ℓ1 term, 10^{−5}, is chosen empirically to match the scale of the two losses and is motivated by the fastMRI challenge requirements. Although we aim at maximizing the SSIM scores in testing, a combined loss is beneficial to stabilize training. 24,47 We used the ADAM optimizer 49 with learning rate 0.0001, default momentum (0.9, 0.999), and learning rate scheduling every 15 epochs by γ_lr = 0.5. We use a progressive training scheme, starting with 2 cascades in the first 2 epochs and increasing the number of cascades with every epoch, up to a total number of T = 10. We trained all network architectures for 60 epochs. To overcome the huge graphics processing unit (GPU) memory consumption during training, we randomly extracted patches of size 96 in FE direction. 29 Training was performed using an NVIDIA Quadro RTX 6000 (24 GB) and took approximately 12 days for a single network. Testing was performed using an NVIDIA Titan Xp (12 GB). We report average reconstruction times of the network architectures along with the number of trainable network parameters in Supporting Information Table S1.
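The combined SSIM and ℓ1 training loss can be sketched as follows. Note that for brevity we use a simplified global (unwindowed) SSIM here, whereas practical implementations use a windowed SSIM; the weighting 10^{-5} of the ℓ1 term follows the text, and the function names are our own:

```python
import numpy as np

def global_ssim(a, b, data_range):
    """Simplified global (unwindowed) SSIM between two magnitude images."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2))

def combined_loss(pred, target, fg_mask, gamma=1e-5):
    """Combined (1 - SSIM) + gamma * L1 loss on foreground-masked
    magnitude images; fg_mask restricts training to image content."""
    p = np.abs(pred) * fg_mask
    t = np.abs(target) * fg_mask
    dr = t.max() if t.max() > 0 else 1.0   # data range for SSIM constants
    l1 = np.abs(p - t).mean()
    return (1.0 - global_ssim(p, t, dr)) + gamma * l1
```

The loss is zero for a perfect reconstruction and strictly positive otherwise, since the global SSIM is bounded above by 1.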
We trained the networks on all contrasts and the acceleration factors R = 4 and R = 8 simultaneously as we want to examine how the different architectures respond to a generalized training setup. This allows us to have a general network that can be applied to any anatomy and acceleration factor. Hence, we did not aim for the best scores on a benchmark, although we would expect improvements with fine-tuning the networks for specific conditions.

| Experimental setup
We systematically investigate how state-of-the-art architectures and the proposed DUNETs with varying DC layers perform under variations in training data. We study the domain shift problem for image reconstruction experimentally, that is, we study how the different architectures cope when training and test data do not come from the same data cohort.

| Network architectures
The DUNETs have N_f = 64 base features, resulting in a total number of 3 372 985 network parameters. We implement three different DC layers, that is, GD, PM, and VS. We compared the DUNETs to three state-of-the-art architectures. First, we omitted DC and implemented a residual UNET based on Ref. 21 with N_f = 64 base features and kernel size 3, corresponding to 3 357 827 parameters. Second, we investigated MoDL, 14 which has a 5-layer CNN with N_f = 64 base features and 3 × 3 filter kernels as regularization network, and PM for DC. The total number of network parameters was 113 155. We omitted batch normalization as it resulted in unstable training. Third, we investigated VNs, which can be interpreted as an unrolled GD scheme with a regularization network that derives from the fields-of-experts model. Following Ref. 18, we learned N_f = 48 filter kernels of size 11 × 11 and trainable linear activation functions with 31 nodes. The filter kernels are projected onto the zero-mean and ℓ2 norm-ball constraints after each parameter update. The total number of parameters is 131 051. The parameters are not shared over the cascades, following the original publication.
The regularization parameter λ was not trained as we experienced instabilities during training the PM-DUNET and MoDL. 14 For these two architectures, we initialized λ = 10, otherwise λ = 1. For the VS networks, we experimented with different settings for the parameters α and β, and we set α = β = 0.1 empirically for our experiments.

| Training data
Training and evaluation were performed on the fastMRI multi-coil knee and neuro training and validation set, respectively. We performed the following base experiments:

• knee 100: Training on 100% knee data (N_train = 973)
• neuro: Training on 100% neuro data (N_train = 4412)

To study the influence of the number of samples and of joint training on knee and neuro data, we performed an ablation study as follows:

• knee 50: Training with 50% knee data (N_train = 487)
• knee 25: Training with 25% knee data (N_train = 244)
• joint 100: Joint knee and neuro training with 18% of all data, samples equal 100% of knee data (N_train = 968)
• joint 50: Joint knee and neuro training with 9% of all data, samples equal 50% of knee data (N_train = 486)
• joint 25: Joint knee and neuro training with 4.5% of all data, samples equal 25% of knee data (N_train = 240)
• joint uni 100: Joint knee and neuro training with uniform distribution of contrasts, samples equal 100% of knee data (N_train = 974)
• joint uni 50: Joint knee and neuro training with uniform distribution of contrasts, samples equal 50% of knee data (N_train = 484)
• joint uni 25: Joint knee and neuro training with uniform distribution of contrasts, samples equal 25% of knee data (N_train = 243)

We would like to note that we performed two different sets of joint knee and neuro training, one with uniform and one with non-uniform distribution of the contrasts. As pointed out in Section 3.1, the numbers of samples differ for the knee and neuro set, and also for the different contrasts in the neuro dataset. While the uniform datasets contain the same number of samples from each available contrast, the non-uniform datasets contain only a fraction of samples such that the distribution of contrasts in the reduced dataset corresponds to the distribution of contrasts in the full dataset, which is a common scenario in clinical practice.
For quantitative evaluation, we report the SSIM. All experiments are visualized as ranked lists. 50

| RESULTS
We plotted ranked lists for networks evaluated on knee and neuro data. The performance of networks trained only on knee and neuro data is depicted in Figure 2 (knee, R = 4), Figure 3 (knee, R = 8), Figure 4 (neuro, R = 4), and Figure 5 (neuro, R = 8). Results for trainings on different fractions of knee and joint training data are illustrated in Supporting Information Figure S1 (knee, R = 4), Supporting Information Figure S2 (knee, R = 8), Supporting Information Figure S3 (neuro, R = 4), and Supporting Information Figure S4 (neuro, R = 8). For an acceleration factor of R = 4, we observe that all post-processing UNETs perform worse than the worst performing reconstruction network, independent of the number and type of training samples. This effect is more prominent on the neuro data compared to the knee data. For the neuro data and R = 4, the best post-processing UNET, trained on neuro data, achieves an SSIM of 0.9291, and the worst reconstruction method, VS-DUNET trained on knee 25 data, achieves an SSIM of 0.9460. For the knee data and R = 4, the best post-processing UNET, trained on knee 100 data, achieves an SSIM of 0.9142, and the worst reconstruction method, MoDL trained on neuro data, achieves an SSIM of 0.9153. For an acceleration factor of R = 8, the post-processing UNET trained and evaluated on the same data outperforms some image reconstruction networks that were trained and evaluated on different data.
The ranked lists also show a substantial performance gain at accelerations (4/8) in terms of SSIM of the best performing reconstruction DUNET compared to state-of-the-art reconstruction (0.0109/0.0284) and compared to post-processing methods (0.0376/0.0644) for neuro data, and a performance gain over state-of-the-art reconstruction (0.0079/0.0266) and post-processing methods (0.0233/0.0415) for knee data. We observe a larger performance gain for neuro data compared to knee data, especially for R = 8. Figure 6 shows results for an example coronal PDw scan with fat saturation and R = 4. The top row shows the best performing networks, corresponding to knee 100 data. We already observe an anatomy change of the UNET in the intercondylar notch, which is correctly depicted by all reconstruction networks. The DUNETs have the fewest artifacts and appear most homogeneous compared to VN and MoDL, supported by the difference image in Supporting Information Figure S5. The bottom row shows the worst performing networks. For MoDL, VN, and UNET, networks trained only with neuro data perform worst. This is different for the DUNETs, where the joint 25 dataset led to the worst results.
Results for a selected coronal PDw scan and R = 8 are illustrated in Figure 7, along with difference images in Supporting Information Figure S6. The UNET result appears blurry; however, it has fewer artifacts than MoDL and VN, which have difficulties reconstructing images at this high acceleration factor. The DUNETs are able to reconstruct the images with high quality when trained with knee data. The VN results between the best and worst performing networks do not differ greatly. The UNET reconstructions appear artificial when trained on neuro data. All DUNETs show artifacts when trained on neuro data; however, the anatomy itself does not change. Figure 8 shows example results for an axial T1w scan and R = 4. The VN shows the most artifacts of the reconstruction networks. The drop in image quality between different datasets is lowest for MoDL and PM-DUNET, supported by the difference images in Supporting Information Figure S7. GD-DUNET and VS-DUNET show severe artifacts in the reconstructions when trained on knee 25 data. The post-processing UNET stays close to the zero-filling solution when trained with the wrong data.

FIGURE 2 Ranked list for the fastMRI knee dataset at R = 4 trained with knee and neuro datasets. All reconstruction networks perform better than the post-processing networks. GD-DUNET trained on the knee dataset performs best. PM-DUNET trained only on the neuro dataset performs better than the state-of-the-art methods.
Results for a selected axial T1w post contrast scan and R = 8 are illustrated in Figure 9, along with the difference images in Supporting Information Figure S8. Reconstruction DUNETs trained with neuro data show the best image quality at this high acceleration factor. UNET cannot reconstruct this scan.

FIGURE 3 Ranked list for the fastMRI knee dataset at R = 8 trained with knee and neuro datasets. PM-DUNET trained on the knee dataset performs best. PM-DUNET trained only on the neuro dataset performs better than all state-of-the-art methods. Post-processing UNET trained on knee data performs better than VN and MoDL trained only on the neuro dataset.

FIGURE 4 Ranked list for the fastMRI neuro dataset at R = 4 trained with knee and neuro datasets. All reconstruction networks perform better than the post-processing networks. PM-DUNET trained on the neuro dataset performs best.

FIGURE 5 Ranked list for the fastMRI neuro dataset at R = 8 trained with knee and neuro datasets. PM-DUNET trained only on the knee dataset cannot compete with other state-of-the-art approaches trained on neuro data. Post-processing UNET trained on only neuro data performs better than many networks trained on knee data only. All networks trained on knee data fail for this dataset and acceleration factor.

| DISCUSSION
In this work, we investigate the performance and limits of deep neural networks with respect to different design parameters, including regularization networks, DC layers, and data variations, in a controlled, experimental setup. Specifically, we compare three state-of-the-art architectures, namely UNET 21 (no DC), MoDL 14 (5-layer CNN, PM as DC), and VN 18 (fields-of-experts model, GD as DC), to our proposed DUNETs with GD, PM, and VS as DC. We deploy a challenging setup, where we train on all contrasts and acceleration factors simultaneously to study the robustness of all networks. This stands in contrast to the tremendous amount of research that is conducted to improve the accuracy of deep neural networks on benchmarks, 24,31,46,51 and to overcome limitations of CS approaches, for a specific anatomy or acceleration factor. Our evaluation simulates a very common scenario in medical imaging where the source and type of images and the acceleration factors might be unknown, or where only limited ground truth data, but diverse test data, might be available.
Up to now, the robustness of neural networks to the training data has not been studied in the literature, although it is crucial for a successful clinical translation of MRI reconstruction. Recent work focused on the robustness of sampling trajectories, 52 the robustness to noise levels and image contrast, 39 the effect of diverse MSK anatomies, 20 or performed instability analysis of neural networks with respect to image perturbations. 40 However, a drawback of these studies is that the size of the datasets is limited, and the datasets are homogeneous. Hence, the robustness of these approaches on large, inhomogeneous datasets is unknown.

Variations in DC
We first discuss the impact of different DC layers, with an expressive DUNET as regularization network. The results show that the differences between DC layers for acceleration factor R = 4 are minor. We observe that the PM-DUNET performs most stably over different training datasets, independent of the number and type of training samples, especially at R = 8. For knee data and R = 4, PM-DUNET trained on neuro data even outperforms the best reported state-of-the-art method. The implicit DC step in the PM allows the network to use a larger regularization parameter λ, resulting in stronger DC. A GD layer would require a smaller step size, hence more iterations, to impose the same λ. However, comparing DC layers with respect to unrolled iterations is out of the scope of this paper and was studied previously in Ref. 14. Indeed, our results show that GD-DUNET and VS-DUNET are more sensitive to the type and amount of training data. VS-DUNET in general performs worse than GD-DUNET, which stands in contrast to the results reported in Ref. 16. This can be explained by the inhomogeneous dataset, which makes it more challenging to tune the parameters α and β.

Variations in regularization networks
Interestingly, the behavior of DC layers cannot be directly transferred to networks with a less expressive regularization network, as deployed in VN and MoDL. While MoDL performs better than VN on the neuro dataset, MoDL performs worse than VN on the knee dataset, especially for the fat-saturated knee data. In our study, MoDL is more sensitive to the content of the training dataset. We believe that this is due to the small CNN regularization and a strong PM DC, as this setup cannot capture the inhomogeneity between acceleration factors, SNR levels, and anatomies. This indicates that expressive regularization networks such as the DUNETs are able to compensate for inhomogeneities in the data.

FIGURE 9 Axial T1w post contrast, R = 8 (file_brain_AXT1POST_200_6002237.h5, slice 1): The first column shows the target (top) and zero filling reconstruction (bottom). Columns 2-7 show the reconstruction results for the best performing training dataset (top) and worst performing training dataset (bottom). The DUNETs with varying DC show the best results when trained with neuro data. However, if they are trained on knee 25 data, even the ventricles disappear and instead an artificial structure resembling a knee appears.

Importance of DC
The ranked list visualizations show that all networks with DC outperform the UNET without DC for R = 4, independent of the type and amount of training data. It is impressive that VN and MoDL have only 4% of the parameters of the UNET without DC, yet they outperform the UNET substantially for R = 4. For R = 8, the UNET achieves superior quantitative results compared to VN and MoDL when trained on mismatched anatomy. However, we observe changes in anatomy for the UNET, as shown in Figure 7. These findings indicate that modeling the acquisition physics in the DC layer is more important than a large amount of training data for successful learning in MRI reconstruction.

Robustness to variations in data
Our results indicate that domain shift is less of an issue at R = 4, and more general training data settings can be used to achieve decent reconstruction quality. For R = 8, an increase in dataset size tends to help, but blindly increasing the dataset is not enough to account for the large domain shift. This becomes clear when examining the results for the DUNET trained with knee 25 data in Figure 9. We observe a structure in the brain that resembles a knee structure more than brain ventricles, and we suspect overfitting to the small knee dataset. Hence, including aligned data for training and testing is more helpful at R = 8, even if the dataset is small. Additionally, the influence of DC is limited for high acceleration factors, as less information is available in k-space. This raises the concern that structures are invented by the networks when the acceleration factor is pushed too far. Hence, both theoretical studies and radiologists' evaluations are required to estimate the limits of acceleration.
Radiologists' evaluations are also required to assess image quality. For training the networks, we often use global, quantitative measures, that is, MSE and SSIM, which represent the human perceptual system poorly. These measures cannot account for important local features such as subtle pathologies, and often result in blurry images. 47,51 However, training with different loss functions, including adversarial losses and unsupervised training, is out of the scope of this paper. Furthermore, we did not investigate transfer learning 39,53 or fine-tuning on individual contrasts and acceleration factors in this study, which could further improve image quality.
Our work provides first insights into the robustness of neural networks for MRI reconstruction from an experimental perspective, where we cover a large-scale evaluation with respect to variations in training data, regularization networks, and DC layers. There are also other sources of variation that are not covered in our experiments, including variations in unrolled iterations, 14 number of features, loss functions, 24 and the influence of optimizers in training. 24 Furthermore, it still remains an open question how robust networks are to variations in SNR, field strength, scanner types, and hardware from different vendors, which have to be investigated in future work.
In the present work, we focused on static 2D imaging, but the basics of DC and regularization networks can be directly translated to higher-dimensional image reconstruction, for example, dynamic imaging. However, the presented DC and regularization networks might not be sufficient to exploit complex temporal dynamics. Recent works focused, for example, on improving the regularization by exploiting spatiotemporal redundancies in X-f domain, 54 or combining MoDL with a SmooThness regularization on manifolds (SToRM) prior for dynamic imaging. 55 We believe that additionally integrating advanced DC schemes, for example, motion-corrected DC, 56 or combining the classic DC term with temporal models 57 will be key ingredients to further improve dynamic reconstruction with deep learning.

| CONCLUSION
Large-scale studies are indispensable to assess the application potential of neural networks in the clinical workflow, where we have to deal with both limited and inhomogeneous datasets. In our work, we experimentally validate hypotheses about acceleration limits, properties, and the robustness of neural networks to sources of variation in DC, regularization networks, and training data. Our findings underpin the importance of DC layers, and suggest that the PM 14 together with an expressive regularization network, that is, the proposed DUNET, leads to the most stable results over a wide range of training scenarios. For low acceleration factors, general and robust networks can be learned that do not depend substantially on the type and amount of training data. For high acceleration factors, the results are impressive only if the training and test domains are aligned. Although we get the impression that neural networks add more details to the reconstruction, we should be aware that they cannot recover high-frequency information that was not captured in the acquisition process.