Near offset reconstruction for marine seismic data using a convolutional neural network

Marine seismic data is often missing near offset information due to separation between the source and receiver cables. To solve this problem, a convolutional neural network is trained on synthetic seismic data to reconstruct the near offset gap. The synthetic data is created using a two‐dimensional finite difference method within a heterogeneous velocity model. These synthetics are generated with a source‐over‐receiver acquisition geometry so that they contain complete near offset data. The convolutional neural network is then trained on input‐target synthetic pairs where the inputs are common midpoint gathers with the near offset section removed, and the targets are the same gathers with the near offset section retained. Following training, the robustness of the method is investigated with regards to common midpoint data sorting, normal moveout correction and changes in the velocity model. It is found that training on common midpoint‐sorted data results in 2.8 times lower error than training on shot gathers, that normal moveout correction of the training data makes no significant difference in error levels, and that the model can reconstruct realistic near offsets on synthetic data generated 10 km away within the heterogeneous velocity model. In field data testing, first a dataset with source‐over‐cable acquisition geometry from the Barents Sea is used to compare the reconstructed wavefields to ground truth values. Although the reconstructed amplitudes require minor scaling to match the true values, predictions on this dataset yield 2.5 times lower near offset reconstruction error compared to a simple Radon transform interpolation method. Furthermore, amplitude versus offset gradient and intercept sections from the Barents Sea dataset are estimated with half the error when including the convolutional neural network‐predicted near offset data, compared to only using the conventionally‐acquirable portion of the data (beyond 112.5 m of offset). 
In a secondary field data test, a conventional northern North Sea dataset is used to demonstrate how the method may be applied in practice. Here, the convolutional neural network generates more realistic predictions than the Radon method, and the gradient and intercept sections calculated using the convolutional neural network‐predicted traces have higher signal‐to‐noise ratios than the sections calculated using only the original data. The combination of high‐quality synthetic training data and interpolation in the common midpoint domain enables near offset reconstruction at significant depth (1 s of two‐way traveltime or more), which is demonstrated in both synthetic and field examples.


INTRODUCTION
In marine seismic surveys, traces are often missing and undersampled due to limitations of the acquisition geometry. Perhaps the most fundamental limitation in a typical survey is the gap between the source and the receiver cables, which leads to a missing section of near offset traces. This gap is commonly on the order of 100-200 m, which translates into 8-16 missing traces, assuming a receiver spacing of 12.5 m. These near offset traces contain high-frequency information and are important for several seismic processing and imaging methods. Therefore, there is a need for techniques that can reconstruct them.
Some processing methods that are particularly harmed by a lack of near offset data include surface-related multiple elimination (Verschuur et al., 1992) and applications of spectral decomposition (Smith-Boughner & Constable, 2012; White et al., 2015). Furthermore, zero-offset migration algorithms often use stacked datasets as approximations for zero-offset data (Hill & Ruger, 2019), which can be improved by the inclusion of richer near offset information. Amplitude variation with offset (AVO) analyses can also be improved by including missing near offsets, due to lower error in estimating the zero-offset AVO intercept (Swan, 2001). As a practical example of these benefits, Apeland et al. (2018) achieved better near surface resolution and AVO quantity estimation after reconstruction of near offset data.
There are also environments in which the problem of missing near offsets is exacerbated. For example, in the Barents Sea, which has a rugged high-velocity seafloor in often shallow water, a large amount of energy is refracted at a relatively low critical angle, and many multiples and diffractions are generated from the seafloor (Houtz, 1980; Rønholt et al., 2015). This leads to limited offset information and difficulties in imaging, especially in the near surface. To solve this particular problem, a novel seismic acquisition configuration called TopSeis was developed (Dhelie et al., 2018; Vinje et al., 2017), which eliminates the source-receiver gap and provides richer near offset data and increased fold. In TopSeis acquisition, the source is towed by a separate ship above the middle of the receiver cables, resulting in split-spread data. Figure 1 illustrates the difference between conventional marine seismic acquisition and TopSeis acquisition, and shows a conventional shot gather (1c) versus a split-spread TopSeis shot gather (1d).
There are also several processing methods to address the problem of missing near offsets. The simplest method is to copy a normal moveout (NMO)-corrected trace from the nearest available offset and use it to replace all the missing traces. More sophisticated interpolation techniques include wave-equation trace interpolation (Ronen, 1987), trend-spline interpolation (Verschuur, 1991), the parabolic Radon transform (Nurul Kabir & Verschuur, 1995), frequency-space (f-x) domain gap-filling algorithms (Sacchi & Ulrych, 1997), differential offset and shot continuation (Fomel, 2003), inverse shot record dip moveout (Baumstein, 2004), and predictive painting (Khoshnavaz, 2022). Additionally, there is a significant amount of work on interferometric interpolation methods using pseudoprimaries (cross-correlations of multiples with primary reflections) together with prediction-error or least-squares matching filters to perform near offset reconstruction (Curry & Shan, 2010; Guo et al., 2011; Wang et al., 2009; Xu et al., 2018).
Recently, there has been an increase in the usage of deep learning methods for seismic interpolation and near offset reconstruction, where neural networks are trained to interpolate or generate missing traces. Examples of this have included a generative adversarial network for trace interpolation (Kaur et al., 2021), a convolutional neural network (CNN) for near offset extrapolation at shallow subsurface depths (<0.1 s of traveltime) (Qu et al., 2021), a CNN for interpolation of successively sampled seismic data (Li et al., 2023) and a multidirectional CNN for self-supervised reconstruction of gaps in seismic data (Abedi & Pardo, 2022). One of the main challenges of a deep learning approach is in creating or finding training data with rich near offset information, as source-over-cable field data is a rarity, and synthetic training data often lacks the heterogeneity and complexity of real seismic data. However, provided that adequate training data is used, deep learning methods may require less parameter tuning and be more generalizable to new data.
Typically, the task of interpolating or reconstructing seismic data in the shot domain is more difficult than in other domains. This is because trends in transformed domains are often easier to follow than the hyperbolic moveouts of seismic reflections, because there can be more heterogeneities in shot gathers compared to common midpoint (CMP) or common offset gathers, and because the near offset gap can consist of fewer traces to interpolate (for instance in CMP gathers vs. shots). Thus, many studies on seismic interpolation utilize transformations of the data, for example into the wavelet domain (Greiner et al., 2020), the CMP domain (Khoshnavaz, 2022; Nurul Kabir & Verschuur, 1995), the Radon domain (Nurul Kabir & Verschuur, 1995; Xu et al., 2018), the common offset domain (Baumstein, 2004), Fourier domains such as f-x and f-k (Naghizadeh & Innanen, 2011; Schonewille et al., 2013; Spitz, 1991; Xu et al., 2005), or transformation through de-migration into regularized source-receiver configurations (Hlebnikov et al., 2022).
Although there is a difference between the task of extrapolating missing near offset traces and the task of interpolating between existing seismic traces, in practice they can be made into similar problems. This is because the concept of seismic reciprocity (Knopoff & Gangi, 1959) can be applied to a one-sided shot gather with a missing gap of near offset traces, yielding a split-spread shot gather. In the resulting split-spread gather, the near offset region is bordered by data on both sides, and therefore interpolation can be applied. This approach was followed by Qu et al. (2021) when training a CNN to interpolate the missing near offset traces in a shallow section close to the seismic source.
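As a rough illustration of the reciprocity step, the sketch below mirrors a one-sided gather about zero offset and leaves a zeroed gap in the middle. The function name `to_split_spread`, the array layout (time by offset, nearest offset first) and the nine missing traces per side are assumptions made for this example, not details taken from Qu et al. (2021). Mirroring a single gather in this way is exact for a CMP gather, where swapping source and receiver leaves the midpoint unchanged; for a shot gather, the reciprocal traces strictly belong to other shots.

```python
import numpy as np

def to_split_spread(one_sided, n_gap_per_side):
    """Mirror a one-sided gather (time x offset) about zero offset.

    one_sided      : 2-D array, columns ordered from nearest to farthest offset.
    n_gap_per_side : number of missing near offset traces on each side
                     (hypothetical parameter, assumed known from the geometry).
    """
    n_time = one_sided.shape[0]
    # Zero-filled placeholder for the traces that must be reconstructed.
    gap = np.zeros((n_time, 2 * n_gap_per_side), dtype=one_sided.dtype)
    # Negative offsets: farthest trace first, nearest trace last.
    mirrored = one_sided[:, ::-1]
    return np.concatenate([mirrored, gap, one_sided], axis=1)

# Synthetic stand-in for a recorded gather: 500 time samples, 60 traces.
gather = np.random.default_rng(0).normal(size=(500, 60))
split = to_split_spread(gather, n_gap_per_side=9)
```

With nine missing traces per side, the resulting gap is 18 traces wide and is bordered by data on both sides, so the extrapolation problem becomes an interpolation problem.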
In this study, we present a workflow to reconstruct missing near offset traces in the CMP domain using a CNN. We first generate two-dimensional synthetic shot gathers in a realistic velocity model using a finite difference method and then sort them into CMP gathers. From these CMP gathers, we form training data of input and target pairs, with the inputs having the near offset traces zeroed out and the targets retaining the correct near offset traces. The CNN is trained to transform the inputs into the targets, reconstructing the missing near offset traces. The network is then tested on unseen synthetic data to demonstrate the robustness of the method with regard to factors such as CMP data sorting, NMO correction, and changes in the velocity model. Next, the synthetically trained CNN is tested on TopSeis (split-spread) data acquired in the Barents Sea and conventional data acquired in the northern North Sea. We compare the performance of the CNN and a simple Radon transform interpolation method in reconstructing the near offsets. AVO intercept and gradient sections are also calculated with and without the CNN-predicted near offsets in order to look for improvements. In the TopSeis case, the results are compared to their ground truth counterparts, whereas in the conventional case, the results are necessarily evaluated qualitatively.

Synthetic data generation
Two-dimensional (2-D) synthetic seismograms are generated with an acoustic finite difference (FD) method to serve as training data for the neural network. First, a velocity model is built through impedance inversion of post-stack seismic data from the northern North Sea. The inversion method utilizes three-dimensional sparse spike deconvolution to estimate impedances, yielding a realistic velocity model (Kolbjørnsen & Evensen, 2019). More details on the generation of this velocity model are available in Evensen et al. (2023). A subset of the complete model is used in this study, which has dimensions of 37.5 by 17.5 km across by 1 km in depth, and is initially sampled at 5 m in depth and 12.5 m in map view. However, the model is interpolated to a grid size of 1 m (in both dimensions) to yield a higher-resolution model for the FD simulations, using nearest-neighbour interpolation (as opposed to linear interpolation) in order to maintain sharper contrasts between the layer boundaries. 2, respectively. The complexity and heterogeneity of the velocity profiles are evident. For instance, note the significant velocity inversion between about 875 and 1000 m of depth, which is a common observation in Cenozoic sediment layers in the northern North Sea (Thyberg et al., 2000).
Acoustic modelling is conducted in Devito, an open-source Python package designed for FD numerical modelling (Louboutin et al., 2019). The details of the modelling parameters and survey geometry are summarized in Table 1. An FD grid size of 1 m is used in the horizontal and depth directions, along with a time sampling rate of 0.25 ms in order to yield stable and accurate results. However, the output seismograms are later downsampled in time to 2 ms to replicate the sampling of our TopSeis field data. An Ormsby wavelet is used for the source, with trapezoidal edge frequencies of 0, 4, 100 and 200 Hz, respectively, with the goal of emulating frequencies seen in typical marine seismic data. In terms of the acquisition geometry, a line of pressure-sensing receivers (to simulate hydrophones) is placed 15 m below the water surface, with a separation of 12.5 m between each receiver (a common receiver separation in marine seismic acquisition). Then, in the modelling, a series of shots is conducted at a depth of 15 m below the surface, moving across the sail line with an increment of 37.5 m. For each shot, a smaller subregion is used to model and save as a simulation. These subregions are 3 km across in offset and 1 km in depth. Absorbing boundary conditions are applied to the edges of these subregions. Wave propagation is recorded for 1 s in total from both sides of the central shot points, thus generating split-spread gathers that are apt for interpolation.
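For readers unfamiliar with the source wavelet, the following is a minimal NumPy sketch of a zero-phase Ormsby wavelet built from the standard closed-form expression, using the corner frequencies (0, 4, 100 and 200 Hz) and the 0.25 ms modelling time step quoted above. It is independent of Devito; the function name `ormsby`, the wavelet length and the peak normalization are assumptions chosen for illustration and may differ from the wavelet actually used in the modelling.

```python
import numpy as np

def ormsby(t, f1, f2, f3, f4):
    """Zero-phase Ormsby wavelet with trapezoidal corner frequencies f1 < f2 < f3 < f4 (Hz)."""
    def lobe(f):
        # np.sinc(x) is sin(pi x) / (pi x), so np.sinc(f * t) matches the textbook form.
        return np.pi * f**2 * np.sinc(f * t) ** 2

    high_cut = (lobe(f4) - lobe(f3)) / (f4 - f3)
    low_cut = (lobe(f2) - lobe(f1)) / (f2 - f1)
    w = high_cut - low_cut
    return w / np.max(np.abs(w))  # normalize the peak amplitude to 1 (assumed convention)

dt = 0.00025                           # 0.25 ms modelling time step
n_half = 400                           # hypothetical half-length: 0.1 s on each side of t = 0
t = np.arange(-n_half, n_half + 1) * dt
wavelet = ormsby(t, 0.0, 4.0, 100.0, 200.0)
```

Because the upper corner frequency of 200 Hz lies below the 250 Hz Nyquist frequency of 2 ms sampling, the wavelet band itself is preserved when the seismograms are later downsampled from 0.25 ms to 2 ms.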
From modelling across the 5 lines extracted from the T&V Swath, 3625 shot gathers are generated (725 from each line). A parallelized computing approach is employed, with each gather taking approximately 300 s to generate, though this varies depending on the number of CPUs used. These T&V Swath shot gathers are sorted into 20,580 common midpoint (CMP) gathers, with 80% (16,464) being used as training data and 20% (4,116) being used as validation data for the neural network. The validation data is selected randomly from the data of all five lines, with the resulting offsets of the validation gathers being approximately evenly spaced across the entire 37.5 km of offset. Another 3625 shot gathers are generated across 5 lines extracted from the Testing Swath and are sorted into 20,580 CMP gathers. These are later used for testing the network's performance on varying data.
An example shot gather generated by the FD modelling is shown in Figure 3a. This gather is sampled at 12.5 m in offset and 2 ms in time. Note the heterogeneity and differences between the left and right side of the gather, with the most prominent differences present at about 700 ms. This heterogeneity is a consequence of the realistic velocity model and accurate FD solutions to the acoustic wave equation. The yellow dashed lines represent an 18-trace-wide near offset gap (of 225 m), which is the region to be interpolated (note that in conventionally-acquired data this would be a 9-trace or 112.5-m gap). The shot gather is shown after NMO correction with the root mean square (RMS) velocities of the model, in Figure 3b. Although the primary reflections are flattened at near offsets, there are still significant differences in character on the left and right sides of the gather (again, especially around 700 ms). This suggests that interpolation after solely applying NMO correction to a shot gather may still be difficult. Figure 3c shows a CMP gather, which includes the shot in Figure 3a (the shot point is approximately in the middle of the shot positions from which this CMP gather was aggregated). This CMP gather is sampled at 75 m in offset and 2 ms in time. Its near offset gap consists of three traces, thus also spanning 225 m. It is clear that the CMP gather is highly symmetrical, and intuitively, it is easier to interpolate across its near offset gap compared to the near offset gap of the shot gather. Greater ease of interpolation for such a CMP gather is also likely due to the lower number of near offset traces to reconstruct (3) compared to the shot gather (18). The CMP gather following NMO correction is presented in Figure 3d, showing flattened primary reflections.
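The 75 m offset sampling of the CMP gathers follows directly from the acquisition geometry: with a 37.5 m shot increment, traces sharing a midpoint differ in offset by twice the shot increment. The sketch below checks this with integer bookkeeping on a 6.25 m midpoint grid. The number of shots (50) and the chosen midpoint index are hypothetical values picked for the example; only the 37.5 m shot increment, the 12.5 m receiver spacing and the 3 km spread come from the text.

```python
from collections import defaultdict

SHOT_DX = 37.5   # shot increment (m)
REC_DX = 12.5    # receiver spacing (m)
# Midpoints fall on a 6.25 m grid (half the receiver spacing), so midpoints are
# counted in integer units of 6.25 m to avoid floating-point comparisons.
UNITS_PER_SHOT = int(SHOT_DX / (REC_DX / 2))   # 6

gathers = defaultdict(list)            # midpoint index -> list of offsets (m)
for i_shot in range(50):               # hypothetical number of shots
    for j_rec in range(-120, 121):     # split-spread: +/- 1.5 km around the shot
        midpoint_index = UNITS_PER_SHOT * i_shot + j_rec
        offset_m = j_rec * REC_DX      # signed source-receiver offset
        gathers[midpoint_index].append(offset_m)

# Inspect one full-fold CMP gather in the middle of the line.
cmp_offsets = sorted(gathers[150])
spacings = {b - a for a, b in zip(cmp_offsets, cmp_offsets[1:])}
```

Here `spacings` contains the single value 75.0, so a 225 m near offset gap corresponds to three CMP traces rather than 18 shot-gather traces.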

Training data and network architecture
The task of near offset interpolation is formulated as an image-to-image translation problem, where inputs to the neural network are seismic images with the near offset region removed, and targets are the same images with the near offset region retained. The goal is to train the network to reproduce the near offset section with as low an error as possible in comparison to the target, as measured by a cost function.
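A minimal sketch of how such an input-target pair could be formed from one gather is given below. The function name `make_pair`, the array layout (time by offset) and the position of the three-trace gap within a 40-trace gather are assumptions for illustration only.

```python
import numpy as np

def make_pair(gather, gap_start, gap_width):
    """Return (network_input, target) for one gather stored as a time x offset array.

    gap_start and gap_width (hypothetical parameters) give the first column and
    the number of columns of the near offset region.
    """
    target = gather.copy()                                   # near offsets retained
    network_input = gather.copy()
    network_input[:, gap_start:gap_start + gap_width] = 0.0  # near offsets removed
    return network_input, target

# Example: a 448 x 40 CMP gather with an assumed gap at columns 18, 19 and 20.
gather = np.random.default_rng(1).normal(size=(448, 40))
x, y = make_pair(gather, gap_start=18, gap_width=3)
```

Outside the gap the input and target are identical, so the network only has to learn to fill the zeroed columns.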
The overall workflow is illustrated in Figure 4. The left side of Figure 4 shows example input data to the network (CMP gathers with the near offset region removed), and the right side shows example output (the CMP gathers with the generated near offset region). The network architecture used is the U-Net (Ronneberger et al., 2015), which is a convolutional neural network (CNN) originally developed for biomedical image segmentation. This architecture has demonstrated effectiveness on various seismic imaging tasks, such as fault segmentation (Wu et al., 2019), multiple reflection removal (Bugge et al., 2020) and, most relevant to this work, seismic interpolation (Fang et al., 2021; Li et al., 2023) and near offset extrapolation (Qu et al., 2021). Furthermore, it is a simple model to parameterize and train. The U-Net consists of an encoder downsampling block on the left, followed by a decoder upsampling block on the right, yielding an output with the same dimensions as the input. For this reason, the U-Net is apt for image-to-image translation applications, as described earlier. The input is fed through various convolutional filters to extract relevant features, and skip connections are employed to concatenate these features with the results of the upsampling and convolutional transpose operations occurring on the right half of the network. The result is a stable prediction that retains the low- and high-frequency information present in the original image.
Our specific implementation of the U-Net is presented in the middle of Figure 4, with the layer types and filter sizes outlined. A kernel size of 3 × 3 and rectified linear unit (ReLU) activation functions are used for the convolutional layers, which have filter numbers progressing from 16 to 32 to 64 to 128 (and in reverse on the right half of the network). The output layer, however, uses a linear activation function, because ReLU activation only yields non-negative values, and it is necessary for us to obtain both the positive and negative amplitudes present in seismic data. An L2-regularization term of 1e-8 is also added to each convolutional layer, and max pooling layers of dimension 2 × 2 are inserted after each convolutional encoding block, both with the motivation of reducing overfitting to noise.
When prepared for input to the CMP-trained network, the CMP gathers are reshaped to 448 time samples by 40 offset samples, and for the shot-trained network, the shots are reshaped to 448 time samples by 72 offset samples. Note that the gathers are not downsampled in time, but instead the first 448 samples out of 500 are taken. In addition, in the case of the shot gathers, the data is not downsampled in offset, but rather a cropped section spanning 72 offset traces is taken as the input. This is done in order for the CMP and shot gather images to be of comparable size, while having dimensions that are divisible by 8 (as the numbers of convolutional filters are multiples of 8). Additionally, the seismic images are normalized by dividing by the maximum amplitude across all the images.
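The preparation steps above (cropping to 448 time samples, checking divisibility by 8 and normalizing by a global maximum) can be sketched as follows. The function name `prepare_gathers` is hypothetical, and two choices are assumptions rather than details from the text: the normalization uses the maximum absolute amplitude, and three 2 × 2 pooling stages are assumed, which halve each image dimension three times and therefore also require dimensions divisible by 8.

```python
import numpy as np

N_TIME = 448   # first 448 of the 500 time samples are kept
N_POOL = 3     # assumed number of 2 x 2 pooling stages in the encoder

def prepare_gathers(gathers):
    """Crop and normalize a stack of gathers with shape (n_gathers, n_time, n_offset)."""
    cropped = gathers[:, :N_TIME, :]
    factor = 2 ** N_POOL
    if cropped.shape[1] % factor or cropped.shape[2] % factor:
        raise ValueError(f"image dimensions must be divisible by {factor}")
    # One global scale factor for the whole dataset (assumed: maximum absolute amplitude).
    return cropped / np.max(np.abs(cropped))

cmp_stack = np.random.default_rng(2).normal(size=(4, 500, 40))   # stand-in data
prepared = prepare_gathers(cmp_stack)
# Spatial size after each assumed pooling stage: (448, 40) -> (224, 20) -> (112, 10) -> (56, 5)
pooled_sizes = [(N_TIME // 2**k, 40 // 2**k) for k in range(N_POOL + 1)]
```

In practice the scale factor would be computed once on the training set and reused for validation and test data, so that all images share the same amplitude scale.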
The mean absolute error (MAE) or L1-norm is used as the cost function during training. This is chosen instead of the commonly-used mean squared error, as MAE has often been found to yield less blurry images in reconstruction tasks (Isola et al., 2017; Qu et al., 2021). The model training is implemented with a minibatch size of 16 data samples, and the loss converges to a minimum of approximately 1e-4 after 60 epochs. Each epoch takes about 8 min to complete, running on an NVIDIA A40-12Q GPU. The Adam (adaptive moment estimation) algorithm is used for optimization during training (Kingma & Ba, 2015). Training and validation loss curves for the case of the CMP gather-trained network are shown in the middle of Figure 4.

Synthetic testing
In this section, several tests are performed on unseen synthetic data in order to demonstrate the robustness of the CMP gather-trained model. These tests include (1) the difference between a network trained on (NMO-corrected) CMP gathers and a network trained on shot gathers, (2) networks trained on approximately NMO-corrected versus non-NMO-corrected CMP gathers and (3) testing of the CMP gather-trained network on data from the Testing Swath (10 km away from the T&V Swath within the heterogeneous velocity model).

Common midpoint gather-versus shot gather-trained network
To test the hypothesis posed when analysing Figure 3 (that CMP gathers may be easier to interpolate than shot gathers), two networks are trained on CMP and shot gathers, respectively, and the results are compared. The shot gathers for the training and testing are generated using the T&V Swath, and the CMP gathers are created via the sorting of these shot gathers. The CMP gathers also have NMO correction applied, using the correct RMS velocities from the velocity model. The results of the CMP gather-trained network applied to validation data from the T&V Swath are presented in Figure 5. It shows an example of a synthetic split-spread CMP gather with its near offset section removed (the input to the network), the target CMP gather (which includes the ground truth near offset section), the CNN-predicted CMP gather, and the difference between the target and prediction (or the 'reconstruction error'). It is evident that the reconstruction error is low in comparison to the amplitudes of the target and prediction. As a quantitative measure of the reconstruction quality, the normalized mean absolute error (NMAE) is used (Shcherbakov et al., 2013), which is calculated as

$$\mathrm{NMAE} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| \hat{y}_i - y_i \right|}{y_{\max} - y_{\min}},$$

where $\hat{y}_i$ are the predicted amplitudes, $y_i$ are the target amplitudes, $N$ is the number of points (or pixels) in the seismic image, and $y_{\min}$ and $y_{\max}$ are the minimum and maximum amplitudes in the target image, respectively. Thus, the NMAE is the MAE normalized by the amplitude range of the target seismogram. For the case of individual shot and CMP gathers, we only calculate the NMAE across the near offset region bounded by the yellow lines. In Figure 5d, this value is highlighted in the yellow box as 0.0014.
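The NMAE described above can be sketched in a few lines of NumPy. The function name `nmae` and the optional `columns` argument (used to restrict the error to the near offset region) are hypothetical; the sketch also assumes that the amplitude range is taken from the whole target image even when the error is averaged over the near offset region only, which the text does not state explicitly.

```python
import numpy as np

def nmae(prediction, target, columns=None):
    """Mean absolute error normalized by the amplitude range of the target image.

    columns : optional slice selecting the near offset traces over which the
              absolute error is averaged (hypothetical argument).
    """
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    amplitude_range = target.max() - target.min()   # range of the full target image (assumed)
    if columns is not None:
        prediction, target = prediction[:, columns], target[:, columns]
    return np.mean(np.abs(prediction - target)) / amplitude_range

# Tiny worked example: every sample is off by 0.5 and the target range is 4.
target = np.array([[0.0, 1.0], [2.0, 4.0]])
prediction = target + 0.5
error_all = nmae(prediction, target)                        # 0.5 / 4 = 0.125
error_near = nmae(prediction, target, columns=slice(0, 1))  # same value on one column
```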
Figure 6 shows a comparison between a network trained on shot gathers, and the original network trained on CMP gathers, tested on the same example shot (the CMP gather results are sorted back into the shot domain). Note that the shot gathers show refractions because they were not NMO-corrected like the data input to the CMP-trained network. Figure 6b shows the shot-trained network prediction, whereas Figure 6e shows the CMP-trained network prediction. The difference plots and average reconstruction error for these two predictions are shown in parts (c) and (f) of Figure 6, respectively. The shot-trained network difference plot shows higher error, with the NMAE being about 2.8 times lower for the CMP-trained network (0.0013 vs. 0.0036). This strengthens the hypothesis that interpolation in the CMP domain is significantly easier than in the shot domain. It should be noted, however, that the results of interpolation for the shot gather-trained network are still of high quality in and of themselves, generating a realistic prediction.
In order to evaluate the near offset reconstruction performance across all the validation data, 'zero-offset sections' (common offset gathers at zero offset) are made using the predicted and target zero-offset traces. These zero-offset sections look similar to the velocity profiles in Figure 2, because the tested shots were randomly selected from all of the shots generated across the line. Figure 7b shows the predicted zero-offset section for the shot-trained network, and Figure 7e shows the predicted zero-offset section for the CMP-trained network. The targets and difference plots are laid out in similar fashion to Figure 6. Note that a different amplitude scale is used to better show the seismic structure. The reconstruction error is estimated by calculating the NMAE between the target and predicted sections. Similar to the results on the example shot gather, the NMAE is approximately 2.8 times lower for the CMP gather-trained network (0.0021 vs. 0.0058).
Training on common midpoint gathers with approximate and no normal moveout correction applied

RMS velocities used for NMO correction, often estimated via stacking velocity analysis, will always be associated with some degree of uncertainty and cause misalignments and distortions in seismic images. Therefore, it is important that our method be robust to an imperfect RMS velocity model. This is tested by training a network with CMP gathers that are NMO-corrected with a simplified gradient velocity model. The gradient model is constructed by extracting 10 points in depth from the correct RMS velocity model and applying linear interpolation. This was done in order to yield a model that could plausibly be generated by a rough attempt at velocity analysis. Figure 8a shows an example CMP gather with the correct NMO applied (the exact RMS velocity model is shown in the upper left), whereas Figure 8b shows the same gather with approximate NMO applied (the approximate RMS velocity model is shown in the upper left). Note how, at larger offsets, the reflections bend upwards significantly for the gather corrected by the approximate velocity model, compared to the gather corrected by the exact velocity model. However, in the near offset regions (bounded by the yellow lines), for both cases, the reflections are approximately flat. Zero-offset sections are next made to compare the network trained on exact NMO-corrected gathers, and the network trained on the approximate NMO-corrected gathers. Figure 8d shows the zero-offset section difference between prediction and target for exact NMO with an NMAE of 0.0021 (note that this is the same result as in Figure 7f), and Figure 8e shows the result for approximate NMO with an NMAE of 0.0020. Because these error levels are approximately the same, we can likely conclude that the method is robust to imperfect NMO correction.
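To illustrate why an approximate velocity model matters little at near offsets, the sketch below builds a simplified model by picking 10 points from an RMS velocity function and interpolating linearly, and then evaluates the standard hyperbolic moveout $t(x) = \sqrt{t_0^2 + x^2 / v_{\mathrm{rms}}^2}$ for two velocities. The function names, the evenly spaced picks and the numerical values (a 0.5 s reflector, 2000 m/s versus 1900 m/s) are assumptions for illustration, not values from the study.

```python
import numpy as np

def approximate_vrms(t, v_rms, n_points=10):
    """Piecewise-linear velocity model from n_points evenly spaced picks (assumed picking scheme)."""
    picks = np.linspace(0, len(t) - 1, n_points).round().astype(int)
    return np.interp(t, t[picks], v_rms[picks])

def moveout_time(t0, offset, v_rms):
    """Hyperbolic reflection traveltime at a given offset."""
    return np.sqrt(t0**2 + (offset / v_rms) ** 2)

# Hypothetical RMS velocity function sampled every 2 ms over 1 s.
t = np.arange(500) * 0.002
v_exact = 1500.0 + 600.0 * t + 40.0 * np.sin(40.0 * t)
v_approx = approximate_vrms(t, v_exact)

# Residual moveout caused by a 5 % velocity error for a reflector at t0 = 0.5 s.
t0, v_true, v_wrong = 0.5, 2000.0, 1900.0
residual_near = abs(moveout_time(t0, 112.5, v_wrong) - moveout_time(t0, 112.5, v_true))
residual_far = abs(moveout_time(t0, 1500.0, v_wrong) - moveout_time(t0, 1500.0, v_true))
```

In this example the residual moveout at 112.5 m of offset is well below one 2 ms time sample, whereas at 1500 m it amounts to tens of milliseconds, consistent with reflections that stay approximately flat in the near offset region but bend at larger offsets.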
To further test the impact of NMO correction on the near offset reconstruction, a network is now trained on CMP gathers with no NMO correction applied. This test is performed using the same data as earlier, from the T&V Swath. The same CMP gather as in the previous examples, without NMO correction, is shown in Figure 8c. The zero-offset section difference for non-NMO-corrected gathers is shown in Figure 8f, yielding an NMAE of 0.0024. Because this error level is only slightly higher than for the exact and approximate NMO cases, it is likely that the NMO correction makes little difference in the interpolation.

Near offset reconstruction of data from the Testing Swath
In order to investigate how generalizable the network is to new data, near offset reconstruction is now conducted on data from the Testing Swath (which is 10 km away from the Training & Validation [T&V] Swath within the heterogeneous velocity model; Figure 2a). The tested network is still the same network trained on (accurately NMO-corrected) CMP gathers from the T&V Swath. The results are presented in Figure 9, with a shot gather example in the top row and the zero-offset section results in the bottom row. Note how this gather is different from the one tested in Figure 6 because it was generated in the Testing Swath. The average reconstruction error as measured by the NMAE for this shot is 0.0022, about 1.7 times higher than the error of 0.0013 seen during testing on the T&V Swath shot. The zero-offset section difference shows an NMAE of 0.0042 (compared to 0.0021 for the T&V zero-offset section). The highest concentration of error is located in the water bottom and shallow region. Regardless, the error levels are still relatively low and realistic predictions are generated.

Datasets and testing workflow
The network trained on (accurately NMO-corrected) common midpoint (CMP) gathers from the Training & Validation Swath is now tested on two field data examples. One example is TopSeis (split-spread) data acquired in the Barents Sea, and the other is conventional data acquired in the northern North Sea. These datasets have had standard early processing steps applied, including denoising, direct arrival attenuation, source debubbling, and deghosting. Importantly, the datasets are prior to demultiple processing, as one of the major rationales for near offset reconstruction is preparation for multiple removal algorithms such as surface-related multiple attenuation.
The workflow for the field data testing is outlined in Figure 10. The TopSeis (split-spread) data is first used for validation of the network's effectiveness, as it includes ground truth values with which to compare the results. First, near offsets are reconstructed for example gathers with the convolutional neural network (CNN), and compared to a simple Radon transform interpolation method. Next, zero-offset sections are made to compare the CNN and Radon methods across a larger amount of data. Finally, for the TopSeis data, amplitude variation with offset (AVO) intercept and gradient sections are calculated with and without the CNN-predicted near offsets; the 'without' case is produced by cutting out the near offset region of the data and treating it as if it were conventional data. The actual near offset values are used to calculate a ground-truth AVO intercept and gradient, in order to measure the improvement in the estimation of these quantities. A simple signal-to-noise ratio (SNR) estimation method is also used to measure the increase in resolution.
In testing the conventional data, the gathers are first converted into split-spread gathers via a reciprocity transform, because our network was trained on split-spread synthetic gathers. Next, the CNN and Radon methods are compared in reconstructing the near offsets of example gathers; this comparison is necessarily qualitative as there is no ground truth. Zero-offset sections are also made for the CNN and Radon predictions, and compared to the nearest-available offset section in the conventional data to check for reasonableness. Finally, for this conventional data, AVO intercept and gradient sections are calculated with and without the CNN-predicted near offsets, in order to look for improvements in noise level and resolution. SNR estimation is again used to measure these improvements.
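As an indication of how AVO intercept and gradient values can be obtained from amplitudes picked along a flattened reflector, the sketch below performs a least-squares fit of the common two-term approximation $R(\theta) \approx A + B \sin^2\theta$. The text does not state which AVO parameterization or fitting procedure is used, so the two-term form, the function name `avo_intercept_gradient` and the synthetic amplitudes are all assumptions for illustration.

```python
import numpy as np

def avo_intercept_gradient(amplitudes, angles_deg):
    """Least-squares fit of R(theta) = A + B * sin^2(theta); returns (A, B)."""
    sin2 = np.sin(np.radians(angles_deg)) ** 2
    design = np.column_stack([np.ones_like(sin2), sin2])   # columns: intercept, gradient
    (intercept, gradient), *_ = np.linalg.lstsq(design, amplitudes, rcond=None)
    return intercept, gradient

# Synthetic, noise-free amplitudes with a known intercept and gradient (hypothetical values).
angles = np.linspace(0.0, 30.0, 16)                         # degrees, up to a 30 degree mute
amplitudes = 0.2 - 0.5 * np.sin(np.radians(angles)) ** 2
a_fit, b_fit = avo_intercept_gradient(amplitudes, angles)
```

With noisy field amplitudes, the intercept A is an extrapolation to zero angle, so including samples close to zero offset constrains it more tightly than fitting only the conventionally acquirable angles.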

Source-over-cable acquisition data (Barents Sea)
The synthetically-trained network is first tested on TopSeis (source-over-cable acquisition; split-spread gathers) data acquired in the Barents Sea. This dataset consists of 1584 shot gathers collected across a section of approximately 61 km, with a shot spacing of 37.5 m. Similar to the synthetic data, the offset spacing is 12.5 m, and the time sampling rate is 2 ms. These shot gathers are sorted into 9264 CMP gathers, which are used as testing data for the CNN. Figure 11a shows an example input shot gather with the near offset section removed, and Figure 11b shows the gather with its ground truth near offsets retained (note again that the predictions are made on CMPs, but the results are sorted back into the shot domain). The prediction from the simple Radon method is presented in Figure 11c, and the prediction from the CNN is shown in Figure 11d. The Radon method is implemented in Python and involves a linear Radon transform using 700 equally-spaced samples spanning the angular range of −100° to 100°, followed by cubic spline interpolation across the near offset section. The Radon interpolation is similarly applied in the CMP domain, before the results are sorted back into the shot domain. Although this implementation of the Radon method is not as effective as those provided by modern industry standard software, it nonetheless provides an adequate baseline with which to compare the CNN results. Observe how the CNN prediction does a significantly better job of reconstructing the major reflections, including the water bottom reflection, and the reflections at approximately 440 and 840 ms. However, there is also steeply-dipping noise at approximately 620 ms and below that neither the CNN nor the Radon method is able to reconstruct, which is likely due to the source signature. Looking at the normalized mean absolute error (NMAE) for each method, the CNN achieves a value of 0.01796, whereas the Radon method returns a value of 0.0270, so the CNN error is about 1.5 times lower.
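The shot-to-CMP sorting used throughout this section can be illustrated with a minimal sketch. It assumes a regular 2-D split-spread geometry with the spacings quoted above (37.5 m between shots, 12.5 m between receivers); the function name `sort_shots_to_cmp` and the toy array sizes are hypothetical and are not taken from the published code.

```python
import numpy as np

def sort_shots_to_cmp(shots, shot_x, offsets):
    """Group traces of a (n_shots, n_receivers, n_samples) array by midpoint.

    Returns a dict mapping midpoint coordinate (m) to a list of
    (offset, trace) pairs sorted by offset.
    """
    gathers = {}
    for i, sx in enumerate(shot_x):
        for j, h in enumerate(offsets):
            # The receiver sits at sx + h, so the midpoint is sx + h / 2.
            midpoint = round(float(sx + h / 2.0), 3)
            gathers.setdefault(midpoint, []).append((float(h), shots[i, j]))
    for midpoint in gathers:
        gathers[midpoint].sort(key=lambda item: item[0])
    return gathers

# Toy geometry (assumed sizes): 8 shots, split-spread offsets of -150 to 150 m.
shot_x = np.arange(8) * 37.5
offsets = np.arange(-12, 13) * 12.5
shots = np.zeros((len(shot_x), len(offsets), 4))

cmps = sort_shots_to_cmp(shots, shot_x, offsets)
cmp_offsets = np.array([h for h, _ in cmps[131.25]])
print(cmp_offsets)           # offsets present in the CMP gather at 131.25 m
print(np.diff(cmp_offsets))  # offset interval inside the CMP gather
```

With these spacings, neighbouring traces in a CMP gather are 75 m apart in offset (twice the shot spacing), which is why the same near offset gap spans far fewer traces in a CMP gather than in a shot gather.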
In order to observe the overall near offset reconstruction quality, zero-offset sections are made with the ground truth data and the Radon and CNN predictions. Figure 12a shows the ground truth zero-offset section, Figure 12b shows the Radon-predicted zero-offset section, and Figure 12c shows the CNN-predicted zero-offset section. Parts (d) and (e) of Figure 12 show the difference between the ground truth and prediction for the Radon method and CNN, respectively. The amplitude spectra of the ground truth, Radon-predicted, and CNN-predicted data are shown in Figure 12f. Note that we scale the output of the CNN prediction by 1.2 and the Radon prediction by 1.9, to best match the ground truth amplitude spectra. Both the CNN and Radon predictions essentially reproduce the same seismic structure as the ground truth. However, the CNN prediction does so with finer detail and an NMAE of 0.0055 compared to 0.0135 for the Radon prediction, which is approximately 2.5 times lower in error. Additionally, the NMAE of 0.0055 is about the same as the errors from the shot gather-training example in Figure 7c (0.0058), and the synthetic Testing Swath example in Figure 9f (0.0042). This is promising given the significant differences between the synthetic and field data, such as northern North Sea velocities and a noise-free source being used for the modelling. Similar to previous examples, more reconstruction error is present around the water bottom and shallow subsurface, though there is also significant error localized in the water bottom multiple at about 800 ms. This may partly be due to higher shallow subsurface velocities in the Barents Sea (Houtz, 1980), compared to the northern North Sea model in which the synthetics were generated. Furthermore, a large amount of the error is low-frequency noise in horizontal stripes that extends upward in a repeating pattern about every 100 ms, which is again likely due to the source signature.
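As a rough illustration of the error metric compared here, the following sketch computes a normalized mean absolute error. The normalization is an assumption: this section does not restate the NMAE definition, so the sketch divides the mean absolute difference by the peak absolute amplitude of the ground truth. The function name `nmae` and the toy arrays are hypothetical.

```python
import numpy as np

def nmae(prediction, target):
    """Mean absolute error divided by the peak absolute target amplitude.

    The choice of normalization is an assumption made for illustration.
    """
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    peak = np.max(np.abs(target))
    if peak == 0:
        raise ValueError("target has no non-zero amplitude")
    return float(np.mean(np.abs(prediction - target)) / peak)

# Toy example: absolute errors of 0, 1, 1, 0 and a peak target amplitude of 4.
target = np.array([[0.0, 2.0], [-4.0, 1.0]])
prediction = np.array([[0.0, 1.0], [-3.0, 1.0]])
toy_error = nmae(prediction, target)
print(toy_error)

# Ratio of the zero-offset section errors quoted in the text (Radon / CNN).
radon_over_cnn = 0.0135 / 0.0055
print(round(radon_over_cnn, 2))
```

The ratio of the two quoted zero-offset errors evaluates to roughly 2.45, which is the "approximately 2.5 times lower" figure stated above.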
One of the main motivations for near offset reconstruction is to improve the estimation of AVO quantities via richer near offset information. Therefore, in order to test whether AVO quantities are more accurately estimated, we compare the AVO intercept and gradient calculated with the CNN-predicted near offset data versus the conventionally-acquirable part of the data. Because TopSeis data has zero and ultra-near offset traces, such a comparison can be made in relation to the ground truth AVO intercept and gradient. The workflow for the AVO comparison is illustrated in Figure 13. First an example CMP gather is extracted from an area of the data with more important geological features, namely the higher amplitude reflectors at approximately 9.5 km of offset and 480 ms of traveltime, which is outlined in Figure 13a. Then the CMP gather is NMO-corrected and muted beyond 30° of incidence angle, in order to yield flat reflectors from which AVO quantities can be accurately calculated. This gather is shown in Figure 13b, with its zero/ultra-near offset data points (less than 112.5 m of offset) plotted in blue and 'conventionally-acquirable' data points plotted in red along a major reflector at 480 ms. These conventionally-acquirable data represent what a normal seismic acquisition would be able to record (only greater than 112.5 m of offset), and allow us to demonstrate the usefulness of the CNN-predicted near offsets in comparison to conventional data. The traces along the major reflector (480 ms) are extracted and plotted as wiggles in Figure 13c; observe how the amplitudes increase with offset (note that these are the ground truth values).
FIGURE 12 (a) Ground truth zero-offset section for the TopSeis field data; (b) Radon-predicted zero-offset section (note that the amplitude is scaled by 1.9); (c) convolutional neural network (CNN)-predicted zero-offset section (note that the amplitude is scaled by 1.2); (d) Radon prediction difference from the ground truth; (e) CNN prediction difference from the ground truth; (f) amplitude spectra of the ground truth zero-offset section (blue), the Radon-predicted zero-offset section (dashed red) and the CNN-predicted zero-offset section (green). The CNN prediction is scaled by 1.2 and the Radon prediction is scaled by 1.9 in order to best match the ground truth amplitude spectrum seen in this plot.
Then, in Figure 13d, linear regression is applied to these amplitudes (with angle of incidence as the x-variable) in order to estimate the AVO intercept and gradient. The intercept and gradient are estimated as the intercept and slope terms of the two-term Shuey approximation to the Zoeppritz equations (Shuey, 1985). The ground truth regression is shown with the blue line, which yields a gradient of 0.0021 and an intercept of 0.076. Another regression is performed using only the conventionally-acquirable data (or data beyond 112.5 m of offset), which is represented by the red line and yields a gradient of 0.0016 and an intercept of 0.090. Finally, regression is conducted using the CNN-predicted near offset data (previously scaled by 1.2 as explained in Figure 12), which is represented by the dashed green line and yields a gradient and intercept of 0.0022 and 0.074, respectively. Although the predicted near offset data (the green stars) is not exactly the same as the ground truth data (the blue dots), it provides a closer estimation of the AVO gradient and intercept along this reflector compared to only using the conventionally-acquirable data.
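The regression step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-term Shuey form is linear in sin²θ, so the sketch regresses amplitude against sin²θ, whereas the text only states that angle of incidence is the x-variable. The function name `avo_intercept_gradient` and all numerical values below are hypothetical.

```python
import numpy as np

def avo_intercept_gradient(angles_deg, amplitudes):
    """Least-squares fit of R(theta) = A + B * sin^2(theta).

    Returns (intercept A, gradient B). Regressing against sin^2(theta)
    is an assumption of this sketch.
    """
    x = np.sin(np.radians(angles_deg)) ** 2
    gradient, intercept = np.polyfit(x, amplitudes, 1)  # highest power first
    return float(intercept), float(gradient)

# Hypothetical noise-free amplitudes picked along a flattened reflector.
angles = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0])
true_intercept, true_gradient = 0.076, 0.2
amps = true_intercept + true_gradient * np.sin(np.radians(angles)) ** 2

est_intercept, est_gradient = avo_intercept_gradient(angles, amps)
print(est_intercept, est_gradient)
```

Because the intercept is the fitted amplitude at zero angle, a fit that lacks near offset picks has to extrapolate to reach it, which is where including reconstructed near offsets is expected to help when the data are noisy.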
This workflow is next repeated for a larger region of the data, in order to check for aggregate improvement when estimating the AVO quantities. The selected region contains the extracted CMP gather in Figure 13 and is above the first water-bottom multiple; it is outlined in the lower right corner of Figure 14a. First, the ground truth AVO intercept and gradient sections are calculated using the same linear regression method in Figure 13d, but across all the reflectors in the extracted gathers; these are shown in Figure 14a,f, respectively. Next, the intercept and gradient sections are calculated using only the conventionally-acquirable data (beyond 112.5 m of offset) in order to emulate the results from a conventional seismic survey. These results are shown in Figure 14b,g, respectively. The difference between the conventionally-acquirable intercept section and the ground truth is presented in Figure 14c, and for the gradient section in Figure 14h. Following this, the intercept and gradient sections are recalculated, but this time including the CNN-predicted near offset data. The intercept and gradient sections for this case are shown in Figure 14d,i, and the difference plots in Figure 14e,j, respectively. The NMAE is calculated for the intercept difference plots, and it is found that including the CNN-predicted data results in an error of 0.01060, compared to 0.02070 when only using the conventionally-acquirable data. In the case of the gradients, the NMAE is 0.0166 when including the CNN-predicted data, compared to 0.0306 when solely using conventionally-acquirable data. Therefore, including the CNN-predicted traces results in about half as much error when estimating the AVO gradient and intercept, compared to using data from conventional larger offsets.
FIGURE 13 (a) Entire TopSeis zero-offset section with the example common midpoint (CMP) gather location highlighted by the dashed red line; (b) CMP gather for amplitude variation with offset (AVO) analysis, NMO-corrected and muted beyond 30° of angular incidence. Zero/ultra-near offset data (<112.5 m) is plotted with blue dots and conventionally-acquirable data (>112.5 m) with red dots, along a major reflector at 480 ms; (c) wiggle trace plot of the data from the CMP gather, again with blue and red colours to represent near and conventionally-acquirable data, respectively; (d) linear regression of the amplitudes along the 480 ms reflector, in order to estimate the AVO intercept and gradient. The ground truth data and regression is plotted in blue, the conventionally-acquirable in red and the convolutional neural network (CNN)-predicted in dashed green.
Note also that the intercept and gradient calculated with the CNN-predicted data appear to have lower noise levels and higher resolution than those calculated with just the conventionally-acquirable data. As an estimation of the noise level, the SNR is calculated by the ratio of the mean amplitude $\mu$ to the standard deviation of the amplitudes $\sigma$ in the image, expressed in decibels (Young et al., 1998):
$$\mathrm{SNR} = 20 \log_{10}\left(\frac{\mu}{\sigma}\right) \qquad (2)$$
The SNR values are calculated and listed in yellow boxes, and a moderate improvement can be seen in both the intercept image (−15.17 dB, compared to −18.07 dB previously) and the gradient image (−26.96 dB, compared to −29.04 dB previously).
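A minimal sketch of this SNR estimate is given below. It assumes the amplitude-ratio decibel convention, 20 log10(mean / standard deviation), and takes the absolute value of the mean so that the logarithm stays defined; both choices, along with the function name `snr_db`, are assumptions made for illustration.

```python
import numpy as np

def snr_db(image):
    """SNR in decibels as 20 * log10(|mean| / standard deviation).

    Taking the absolute value of the mean is an assumption of this sketch.
    """
    image = np.asarray(image, dtype=float)
    mean_amp = np.abs(np.mean(image))
    std_amp = np.std(image)  # population standard deviation (ddof=0)
    return float(20.0 * np.log10(mean_amp / std_amp))

# Toy images: the first has mean 1 and standard deviation 10,
# the second has mean 2 and standard deviation 1.
low_snr = snr_db([11.0, -9.0])
high_snr = snr_db([1.0, 3.0])
print(low_snr, high_snr)
```

Negative decibel values, such as those quoted for the intercept and gradient images, simply indicate that the standard deviation exceeds the mean amplitude; a less negative value corresponds to a higher SNR.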

Conventional data (northern North Sea)
A conventional two-dimensional dataset from the northern North Sea is now used to test the synthetically-trained network and demonstrate the way in which our method may be used in practice. This data is sampled at a 12.5-m interval in offset and 4 ms in time. Note that the temporal sampling rate of this data is lower than the TopSeis and synthetic data's sampling rates of 2 ms. The dataset consists of 2,936 shot gathers collected across a section of approximately 110 km, corresponding to a spacing of 37.5 m. These shot gathers are sorted into 17,382 CMP gathers to be used as testing data. They are then reshaped to 448 units in time and 40 units in offset and normalized in the same way as the synthetic data: dividing each seismogram by the maximum amplitude observed in the data. Each shot gather has a (one-sided) near offset gap of 112.5 m (or nine traces).
FIGURE 14 (a) Ground truth amplitude variation with offset (AVO) intercept section (area is highlighted in the red box of the survey in the lower right) calculated using linear regression on the true amplitude values; (b) intercept section calculated without the convolutional neural network (CNN)-predicted data (i.e. only using data at larger conventionally-acquirable offsets past 112.5 m). The signal-to-noise ratio (SNR) in decibels of this image is calculated using Equation (2) and listed in the yellow box; (c) difference from the ground truth without using CNN predictions; (d) intercept section calculated using the CNN-predicted data (in addition to conventionally-acquirable data), with SNR listed in the yellow box; (e) difference when using CNN predictions; (f) ground truth AVO gradient section; (g) gradient section calculated without using the CNN-predicted data, with SNR listed in the yellow box; (h) difference without using CNN predictions; (i) gradient section calculated with the CNN-predicted data, with SNR listed in the yellow box; (j) difference when using CNN-predicted data.
This conventionally-acquired seismic data originally consists of one-sided shot gathers due to the gap between the source and receivers. To prepare the gathers for use in the network (which was trained on split-spread gathers), the principle of reciprocity is applied. This is where the source and receiver points are exchanged, and the wavefield is backpropagated in order to yield the response that is expected in the opposite direction. An example of a gather before and after the reciprocity transform is shown in the bottom left of the field-testing workflow diagram in Figure 10. Note that the reciprocity-generated left half of the gathers tends to be choppier, appearing undersampled compared to the right half. This is because the conditions for reciprocity are not perfectly met in the real world (Fenati & Rocca, 1984), with the likely culprit being how source and cable positions fluctuate with ocean waves. The undersampling could potentially be improved through regularization techniques, though this is outside the scope of this study.
Following the reciprocity transformation, shot gathers are sorted into CMP gathers, NMO corrected, and then fed into the synthetically-trained CNN (or alternatively, the Radon method) in order to output CMP gathers with reconstructed near offsets. These CMP gathers are then sorted back into shot gathers. A Radon-predicted shot gather with reconstructed near offsets is shown in Figure 15a, and the CNN prediction on the same gather is shown in Figure 15b. The CNN-predicted near offsets are not scaled by 1.2, as in the TopSeis example, because a ground truth amplitude spectra comparison for such a scaling is not possible. In order to focus on the reconstruction results that represent the actual data acquisition, we discard the left halves of the gathers and zoom in on the results in Figure 15c,d. Observe how the CNN prediction in Figure 15d follows realistic trends across the reflections, whereas the Radon prediction in Figure 15c is of lower quality and degrades when approaching the zero offset. Although the overall character of the Radon reconstruction is adequate, there are particular areas where it is less realistic, such as the water bottom reflection at about 450 ms and the primary reflection at about 1000 ms. Therefore, in the absence of ground truth for this dataset, the neural network method appears to produce the more realistic results.
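The NMO correction step in this workflow can be sketched with the standard hyperbolic moveout equation, t(x) = sqrt(t0² + x²/v_rms²). This is a minimal illustration assuming a one-dimensional RMS velocity function and linear interpolation, without stretch muting; the function name `nmo_correct` and the toy gather are hypothetical and do not reproduce the authors' processing.

```python
import numpy as np

def nmo_correct(gather, offsets, v_rms, dt):
    """Flatten hyperbolic events in a CMP gather.

    gather : (n_samples, n_traces) array
    offsets: offset of each trace in metres
    v_rms  : (n_samples,) RMS velocity in m/s at each zero-offset time
    dt     : sampling interval in seconds
    """
    n_samples, n_traces = gather.shape
    t0 = np.arange(n_samples) * dt
    corrected = np.zeros((n_samples, n_traces), dtype=float)
    for j, x in enumerate(offsets):
        # Traveltime at offset x for a reflector with zero-offset time t0.
        t_x = np.sqrt(t0 ** 2 + (x / v_rms) ** 2)
        corrected[:, j] = np.interp(t_x, t0, gather[:, j], left=0.0, right=0.0)
    return corrected

# Toy gather: one reflection with t0 = 0.4 s and a constant 2000 m/s velocity.
dt, n_samples = 0.002, 500
offsets = np.array([0.0, 600.0, 1500.0])
v_rms = np.full(n_samples, 2000.0)
gather = np.zeros((n_samples, len(offsets)))
for j, x in enumerate(offsets):
    t_event = np.sqrt(0.4 ** 2 + (x / 2000.0) ** 2)  # 0.4, 0.5 and 0.85 s
    gather[int(round(t_event / dt)), j] = 1.0

flat = nmo_correct(gather, offsets, v_rms, dt)
print(flat.argmax(axis=0))  # sample index of the event on each trace
```

After correction the event sits at the same sample on every trace, which is the flattening that makes amplitude picking along a reflector straightforward.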
To observe the overall near offset reconstruction quality, zero-offset sections are made from the Radon and CNN predictions. However, because there are no ground-truth zero-offset traces in this conventional dataset, the nearest-available offset section is used for comparison instead. Figure 16a shows the nearest-available offset section, extending across the whole survey line of approximately 110 km. The location of the trace used to make the section is highlighted in blue on the example gather in the upper right corner. Figure 16b shows the Radon-predicted zero-offset section, and Figure 16c shows the CNN-predicted zero-offset section. The idea is that the zero-offset and nearest-available offset sections should be relatively similar in seismic structure and amplitude, as they are only separated by 112.5 m in offset. It is clear from Figure 16c that the seismic structure shown by the CNN prediction is more realistic than the Radon-predicted data (especially at the water bottom reflection). To determine the amplitude similarity, the Fourier transform of each offset section is calculated, and their spectra are plotted in Figure 16d. The amplitudes of the CNN's zero-offset predictions are slightly lower than for the nearest-available offset, but the spectra follow a similar pattern. This difference is likely due to the tendency for the CNN to return slightly lower predicted amplitudes compared to the ground truth (note the 1.2 scaling that was necessary to best match the spectra in Figure 12 for the TopSeis data). It is also important to note that the difference is more significant at higher frequencies, which is a common limitation with CNNs as they attempt to reconstruct high frequency details in images (Ayyoubzadeh & Wu, 2021). Because these higher frequencies are of lower magnitude and contribute less to the loss when training the CNN, this is an intuitive result. Regardless, the Radon prediction's frequency spectrum is less realistic than the CNN's, especially differing in the dominant frequency range of the seismic data (approximately 10-60 Hz).
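The spectral comparison can be sketched as below. The sketch averages the amplitude spectrum over all traces of a section and, where a reference section exists, estimates a single scale factor by least squares. How the 1.2 and 1.9 factors were actually chosen is not specified in the text, so the least-squares criterion is an assumption, as are the function names `mean_amplitude_spectrum` and `best_scale` and the toy data.

```python
import numpy as np

def mean_amplitude_spectrum(section, dt):
    """Amplitude spectrum averaged over traces.

    section: (n_samples, n_traces) array; dt: sampling interval in seconds.
    Returns (frequencies in Hz, mean amplitude spectrum).
    """
    spectrum = np.abs(np.fft.rfft(section, axis=0)).mean(axis=1)
    freqs = np.fft.rfftfreq(section.shape[0], d=dt)
    return freqs, spectrum

def best_scale(pred_spectrum, ref_spectrum):
    """Scalar s minimizing ||s * pred_spectrum - ref_spectrum|| (assumed criterion)."""
    return float(np.dot(pred_spectrum, ref_spectrum)
                 / np.dot(pred_spectrum, pred_spectrum))

# Toy sections: 30 Hz sinusoids sampled at 4 ms, with the "prediction"
# uniformly weaker than the reference by a factor of 1.2.
dt = 0.004
t = np.arange(250) * dt
reference = np.column_stack([a * np.sin(2 * np.pi * 30.0 * t) for a in (1.0, 0.5, 2.0)])
predicted = reference / 1.2

freqs, ref_spec = mean_amplitude_spectrum(reference, dt)
_, pred_spec = mean_amplitude_spectrum(predicted, dt)
peak_freq = freqs[np.argmax(ref_spec)]
scale = best_scale(pred_spec, ref_spec)
print(peak_freq, scale)
```

For the conventional dataset no ground truth zero-offset spectrum exists, so a scale factor of this kind could only be estimated against the nearest-available offset section, which is not the same quantity.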
Building upon the AVO tests applied to the TopSeis data, we calculate the AVO intercept and gradient of the conventional data, with and without the CNN-predicted traces. A region with more apparent geologic features (such as dipping beds) above the first water bottom multiple is selected from the data. In the full section, this region is from about 55 to 95 km in offset and 300 to 1000 ms in traveltime, and is outlined in the upper right corner of Figure 17a. The AVO intercept and gradient are calculated in the same manner as for the TopSeis data: linear regression on the amplitudes from NMO-corrected and muted CMP gathers. The intercept and gradient calculated without the CNN-predicted near offsets (i.e. the original data) are presented in Figure 17a,c, respectively. Then the intercept and gradient are recalculated when including the CNN-predicted near offset traces (in addition to the conventional data), with the results shown in Figure 17b,d, respectively. Although there is no ground truth intercept and gradient with which to compare, we can observe that the noise levels have decreased in the sections after including the CNN predictions. This is corroborated by the SNR increasing from −40.94 to −33.38 dB in the case of the intercept image, and from −33.04 to −26.54 dB in the case of the gradient image. This result may suggest a more well-defined and accurate estimation of the AVO quantities.

DISCUSSION
The various synthetic tests demonstrated the robustness of the model and several important concepts. First, the comparison between a network trained on common midpoint (CMP) gathers versus shot gathers showed significantly lower reconstruction error (by a factor of 2.8) for the CMP gather-trained network. This is an intuitive result because of the lower number of traces to reconstruct (3 vs. 18), the greater number of CMP gathers as training examples resulting from the sorting process, and the smoother waveform trends across CMP gathers compared to shot gathers, as observed in the Figure 3 examples. It is also in keeping with previous studies that utilized transformation into the CMP domain to yield better interpolation results (Nurul Kabir & Verschuur, 1995; Wang et al., 2009).
Second, the test of using non-NMO-corrected CMP gathers in training the network resulted in essentially the same error compared to using NMO-corrected gathers. This finding suggests that the most crucial aspect of the presented method is the choice to train on CMP data, as opposed to training on shot gathers or applying NMO correction. In addition, the accuracy of the NMO correction was found to be relatively unimportant, as a simplified gradient velocity model used for NMO correction yielded similar reconstruction error. This is an intuitive result, as the negative impact of using incorrect root mean square velocities for NMO correction (provided they are not massively inaccurate) only manifests itself at farther offsets. However, it is still likely worth applying NMO correction to the data, as the impact of not applying NMO correction may be more significant for differing datasets (and especially for field data). For instance, the testing on the Testing Swath data and TopSeis data resulted in higher error, particularly for primary reflections at and near the water bottom. This may have been a consequence of differences in water depth and heterogeneities in the near surface, along with source signature effects in the TopSeis data. Furthermore, the velocity model used to generate the synthetics was inverted from northern North Sea data, which can differ significantly in velocities from Barents Sea data (especially in the shallow subsurface) (Houtz, 1980). Therefore, to be safe and ensure that the error was as low as possible, NMO correction was applied to the Testing Swath data and to the field data examples. Although it was shown that the shot gather-trained network yielded higher error than the CMP-trained network, it should be noted that the shot gather approach still produced high-quality interpolation results. Observing the shot-trained prediction in Figure 6b, major reflections and finer details are reproduced, with the differences from the target being largely indistinguishable to the naked eye. This provides evidence that the high-quality synthetic modelling, using a realistic velocity model and finite difference solver, was a major driver behind generating accurate near offset waveforms in the interpolation.
In the field data tests, the convolutional neural network (CNN) was able to generate more realistic near offsets than the simple Radon method. This was evident for the TopSeis example, where the normalized mean absolute error (NMAE) was a factor of 2.5 lower for the CNN prediction, and for the conventional example, where the CNN prediction was qualitatively more realistic. Another advantage of the CNN is that the prediction operation is much faster, taking approximately 0.1 s per gather, compared to about 20 s for the linear Radon transform and interpolation. When operating on a full dataset of thousands of gathers, the time savings provided by the CNN method can be valuable. Furthermore, the predictions on the conventional data were still of high quality despite the data being at a different sampling rate (4 ms) compared to the synthetic training data (2 ms), suggesting the robustness of the network. Tests on the TopSeis field data also provided evidence that including CNN-predicted near offset data improves the estimation of the amplitude variation with offset (AVO) intercept and gradient. In addition, in the conventional data, the inclusion of these predicted traces led to noise reduction in the AVO intercept and gradient sections. As an early seismic processing step, our method could therefore not only prepare data for multiple removal and other processing algorithms but also potentially improve the results of subsequent quantitative seismic interpretation work.
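To put the per-gather timings in context, the short calculation below scales them to the 17,382 CMP gathers of the conventional dataset. It assumes strictly serial processing with the approximate per-gather times quoted above, so the totals are indicative only; the variable names are hypothetical.

```python
# Approximate per-gather prediction times quoted in the text (seconds).
cnn_seconds_per_gather = 0.1
radon_seconds_per_gather = 20.0

# Number of CMP gathers in the conventional northern North Sea dataset.
n_gathers = 17_382

# Totals under the assumption of serial, one-gather-at-a-time processing.
cnn_total_minutes = n_gathers * cnn_seconds_per_gather / 60.0
radon_total_hours = n_gathers * radon_seconds_per_gather / 3600.0
speedup = radon_seconds_per_gather / cnn_seconds_per_gather

print(round(cnn_total_minutes, 1), "minutes for the CNN")
print(round(radon_total_hours, 1), "hours for the Radon method")
print(round(speedup), "times faster per gather")
```

Under these assumptions the CNN would process the full line in about half an hour, against roughly four days for the simple Radon implementation.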
Using a normalized error metric (the NMAE) enables a further comparison of the field data and synthetic data reconstruction results. For example, the TopSeis near offset reconstruction (for the zero-offset section; Figure 12e) yielded an NMAE of 0.0055, whereas the CMP network tested on the synthetic Testing Swath data (Figure 9f) yielded an NMAE of 0.0042. Additionally, the error levels were slightly lower than the shot gather-trained network's predicted zero-offset section (0.0058; Figure 7c), which was still a realistic prediction. The Testing Swath data is relatively close in character to the synthetic Training & Validation Swath data, with similar-looking shot gathers and a similar profile across the section. The TopSeis data, however, is significantly different in character from the synthetic data. This is especially true with regards to the noise patterns and source signature in the TopSeis data, which were like nothing included in the synthetic data, and with regards to the velocity differences between the Barents Sea and northern North Sea. Given these differences, the fact that these error levels are relatively close demonstrates the strong generalizability of the model.
The model yielded robust near offset reconstruction at significant depth (up to 1 s of two-way traveltime). In fact, most reconstruction error was localized around the water bottom and/or near surface (again perhaps due to variations in water depth and shallow subsurface velocities), and error tended to be lower at larger depths. This was observed in both synthetic and field examples. Because the important geologic features (especially with regards to hydrocarbon exploration) are located at greater depths, this is a promising result of the method. Extending the workflow for deeper seismic data would likely be as simple as regenerating the synthetic data at greater depths.
The demonstrated method resulted in accurate near offset reconstruction using a relatively simple machine learning model (the U-Net CNN). In recent years, more sophisticated methods have been developed for image reconstruction tasks, such as improvements on the U-Net architecture like the U-Net3+ (Huang et al., 2020), multidirectional CNNs (Abedi & Pardo, 2022), and vision transformers (Ali et al., 2023). So although the main focus of this study was on the geophysical workflow of utilizing high-quality synthetic data and interpolation in the CMP domain, future improvements could potentially be made by utilizing the same workflow with more advanced machine learning models.

CONCLUSIONS
A convolutional neural network (CNN) was trained to reconstruct missing near offsets in marine seismic data. A heterogeneous velocity model, created through impedance inversion of northern North Sea seismic data, enabled the generation of realistic synthetic training data through a finite difference method. Several tests were performed on synthetic testing data, in order to investigate the impacts of common midpoint (CMP) sorting, NMO correction, and variation of testing data on near offset reconstruction error. It was found that a network trained on CMP gathers yielded 2.8 times lower error than a network trained on shot gathers, that NMO correction made a negligible difference in error, and that testing on synthetics generated 10 km away within the heterogeneous velocity model still yielded realistic results. The CMP-trained network was tested on two field data examples, one being TopSeis (split-spread gathers with ground truth for near offsets) data acquired in the Barents Sea, and the other being conventional data acquired in the northern North Sea. The CNN's near offset predictions on the TopSeis data yielded 2.5 times lower error in comparison to a simple Radon transform interpolation method. Furthermore, estimations of amplitude variation with offset (AVO) intercept and gradient sections in the TopSeis data resulted in about half the error when including CNN-predicted near offsets, compared to only using the conventionally-acquirable offsets. In conventional data, the CNN-predicted near offset region was qualitatively more realistic than the predictions from the Radon method. In addition, the AVO intercept and gradient sections calculated using the CNN-predicted traces had lower noise and higher resolution compared to those calculated using the original data. Overall, the combination of high-quality synthetic training data for the neural network and interpolation applied in the CMP domain allows for realistic near offset reconstruction in both synthetic and field data examples.

ACKNOWLEDGEMENTS
We thank Aker BP for permission to publish this study and the use of the TopSeis Barents Sea field dataset. We are grateful to CGG Earth Data for their permission to use and show the conventional northern North Sea dataset. We thank Aker BP, ConocoPhillips, the University of Oslo, and the Research Council of Norway for funding. We also acknowledge Vetle Vinje, Dennis Adelved, Thomas De Jonge, Saskia Tschache, Peter Bormann, Shelia Pinero, and Youfang Liu for their helpful feedback and discussions.

DATA AVAILABILITY STATEMENT
Code and examples are available on GitHub at: https://github.com/orhuff/near_offset_reconstruction_CNN. The field datasets are confidential and cannot be released.

FIGURE 1 (a) Conventional marine acquisition diagram showing a source-receiver gap of 100-200 m; (b) TopSeis marine acquisition diagram illustrating the source-towing ship above the receiver cables; (c) example shot gather resulting from conventional marine acquisition, with a near offset gap between the red and dashed yellow lines; (d) example split-spread shot gather resulting from TopSeis acquisition.
The model is set to have a constant value for density, and the water bottom ranges from 100 to 130 m deep. As the modelling is conducted in 2-D, 'sail lines' are extracted from the velocity model to represent the path a seismic acquisition ship would follow. Lines are taken from two different regions, which are referred to hereafter as the 'Training & Validation Swath' (T&V Swath) and the 'Testing Swath'. These swaths are illustrated in Figure 2a, superimposed on a depth slice of the overall velocity model. Their areas are each 37.5 km long by 0.25 km across. Five lines, each separated by 50 m in the crossline direction, are extracted from the T&V Swath to generate the training data. Data from the Testing Swath is later used to test how well the model generalizes to different datasets, as this area is 10 km away from the T&V Swath within the heterogeneous velocity model. Example lines from the Testing and T&V Swaths with their interval velocities are shown in parts (b) and (c) of Figure

FIGURE 2 (a) Depth slice of the entire velocity model with the Training & Validation and Testing Swaths highlighted in blue. Each swath's area is 37.5 km in offset by 0.25 km in the crossline direction, with five evenly spaced lines extracted from each swath used for modelling; (b) velocity model (interval velocities) of a line extracted from the Testing Swath; (c) velocity model of a line extracted from the Training & Validation Swath. Note the heterogeneity of the velocity models and the differences between them.
TABLE 1 Summary of the parameters used in the finite difference modelling. Depth = 15 m • Depth = 15 m • Width = 3000 m • dt = 0.25 ms • Hydrophone (pressure) receivers • Interval = 37.5 m • Interval = 12.5 m • Depth = 1000 m (downsample to 2 ms)

FIGURE 3 (a) Example modelled shot gather, with the conventionally-missing near offset region (18 traces, 225 m) outlined in yellow. A field TopSeis shot gather is presented in the upper left corner for comparison; (b) the shot gather in (a) with NMO correction applied; (c) a common midpoint (CMP) gather which contains the shot in (a), with the missing near offset region (3 traces, 225 m) outlined in yellow; (d) the CMP gather in (c) with NMO correction applied. Note how the CMP-transformed data in (c) and (d) have smoother patterns across the near offset region that are easier to interpolate, compared to the shot gather data in (a) and (b). The shot gathers are sampled at 12.5 m in offset, whereas the CMP gathers are sampled at 75 m.
network, 3,625 data examples are utilized, with 80% (or 2,900) as training examples and 20% (725) as testing examples.

FIGURE 4 Workflow diagram showing (left) example training input images, (middle) the U-Net convolutional neural network (CNN) architecture and model training/validation loss curves, and (right) example output images. (Note that this diagram is for a network trained on common midpoint [CMP] gathers. For the shot gather case, the CMP and shot sorting steps are omitted and the inputs and outputs are shot gathers.)

FIGURE 5 (a) Example input to the network: a synthetic split-spread common midpoint (CMP) gather with the near offset region removed; (b) target CMP gather with the ground truth near offsets; (c) the predicted CMP gather; (d) the difference between the target and prediction (reconstruction error), with the normalized mean absolute error (average reconstruction error) delineated in the yellow box. Note that this network was trained on NMO-corrected CMP gathers.
FIGURE 6 (a) Target gather for the network trained on shot gathers; (b) prediction for the network trained on shot gathers; (c) difference between the target and prediction for the shot-trained network, with normalized mean absolute error (NMAE) listed; (d) target for the network trained on (NMO-corrected) common midpoint (CMP) gathers (note that the original predictions are CMPs but are sorted back into shot gathers); (e) prediction for the network trained on (NMO-corrected) CMP gathers; (f) difference between the target and prediction for the CMP-trained network, with NMAE listed. Note that the shot gathers in (a) and (b) show refractions because they were not NMO corrected.

FIGURE 7 (a) Target zero-offset section for the shot-trained network; (b) predicted zero-offset section for the shot-trained network; (c) difference from target for the shot gather-trained network's predicted zero-offset section; (d) target zero-offset section for the common midpoint (CMP)-trained network; (e) predicted zero-offset section for the CMP-trained network; (f) difference from target for the CMP-trained network's predicted zero-offset section. The normalized mean absolute error (NMAE) values are delineated in the yellow boxes. Note that the amplitude scale is smaller than in the individual shot gather example in order to better show the seismic structure.

FIGURE 8 (a) Example common midpoint (CMP) gather with exact NMO correction applied, along with the exact root mean square (RMS) velocity model for the entire section in the upper left corner; (b) the same CMP gather with approximate NMO correction applied, along with the gradient velocity model in the upper left corner; (c) the same CMP gather with no NMO correction applied; (d) the zero-offset section difference (between prediction and target) for the network trained on CMP gathers with exact NMO correction applied; (e) the zero-offset section difference for the network trained on CMP gathers with approximate NMO correction applied; (f) the zero-offset section difference for the network trained on CMP gathers without NMO correction applied.

FIGURE 9 (a) Target for the shot gather from the Testing Swath (10 km away from the Training & Validation Swath); (b) prediction from the network trained on (NMO-corrected) common midpoint (CMP) gathers on testing data from the Testing Swath (note that the original predictions are CMPs, but they are sorted back into shot gathers; also note that this is a different shot gather from the one shown in Figure 6); (c) difference between the target in (a) and the prediction in (b); (d) the target zero-offset section across all tested shots in the Testing Swath; (e) predicted zero-offset section; (f) difference between the target in (d) and the prediction in (e).

FIGURE 10 Diagram outlining the field data testing workflow on TopSeis (split-spread) data from the Barents Sea (top row), and conventional data from the northern North Sea (bottom row).

FIGURE 11 (a) Input TopSeis field shot gather with the near offset section removed (note that the network predicts on common midpoint (CMP) gathers, but we sort the data back into shot gathers); (b) ground truth TopSeis field shot gather; (c) predicted gather from a simple Radon transform interpolation method; (d) predicted gather from the convolutional neural network (CNN); (e) difference plot for the Radon prediction; (f) difference plot for the CNN prediction.

FIGURE 15 (a) Prediction from the Radon method on an example shot gather (note that the actual input to the network is common midpoint [CMP] gathers, but we sort to and from shot gathers for visualization). The left half of the gather is generated via reciprocity prior to its interpolation; (b) prediction from the synthetically-trained convolutional neural network (CNN) on the same gather; (c) zoomed-in Radon prediction to show greater detail and discard the reciprocity-generated left half; (d) zoomed-in CNN prediction.

FIGURE 16 (a) Nearest-available offset section in the conventional data. The offset location is highlighted in blue in the example gather in the upper right corner; (b) the Radon-predicted zero-offset section, with its offset location highlighted in red in the upper right; (c) the convolutional neural network (CNN)-predicted zero-offset section, with its offset location highlighted in green in the upper right; (d) amplitude spectra of the nearest-available offset section (blue, a), the Radon-predicted zero-offset section (dashed red, b) and the CNN-predicted zero-offset section (green, c).

FIGURE 17 (a) Amplitude variation with offset (AVO) intercept section calculated without the convolutional neural network (CNN)-predicted near offsets (i.e. the original data). The location of the section is highlighted in the upper right corner; (b) AVO intercept section calculated with the CNN-predicted near offsets (in addition to the conventional traces); (c) AVO gradient section calculated without the CNN-predicted near offsets; (d) AVO gradient section calculated with the CNN-predicted near offsets. The signal-to-noise ratio (SNR) of each image calculated with Equation (2) is listed in the yellow boxes. Note the reduction in noise and improvement in resolution from including the CNN-predicted traces.
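As background for the intercept and gradient sections, the sketch below fits the widely used two-term AVO approximation, R(θ) ≈ A + B sin²θ, to the amplitudes of one event by ordinary least squares. This is an assumption about the general form of such a fit, not a description of the estimation used in the paper. The function name `avo_intercept_gradient` is hypothetical, and the incidence angles are taken as known (converting offsets to angles would require a velocity model).

```python
import math


def avo_intercept_gradient(amplitudes, angles_deg):
    """Least-squares fit of amplitude = A + B * sin^2(theta).

    Returns (A, B), the AVO intercept and gradient. Assumes the two-term
    approximation and known incidence angles in degrees.
    """
    xs = [math.sin(math.radians(a)) ** 2 for a in angles_deg]
    n = len(xs)
    if n < 2 or n != len(amplitudes):
        raise ValueError("need at least two amplitude-angle pairs")
    mean_x = sum(xs) / n
    mean_y = sum(amplitudes) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("angles must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, amplitudes))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return intercept, gradient


# Noise-free synthetic event with assumed A = 0.1 and B = -0.2.
true_a, true_b = 0.1, -0.2
angles = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
amps = [true_a + true_b * math.sin(math.radians(a)) ** 2 for a in angles]
intercept, gradient = avo_intercept_gradient(amps, angles)
print(round(intercept, 6), round(gradient, 6))
```

The intercept is the fitted amplitude at zero angle, so when the nearest traces are missing it has to be extrapolated from farther angles. This is one plausible reason why adding near offset traces could stabilize the estimate on noisy data.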