Review Paper: Methods of measurement for 4D seismic post‐stack time shifts

The estimation of time-lapse time shifts between two or several repeated seismic surveys has become increasingly popular over the past eighteen years. These time shifts are a reliable and informative seismic attribute that can be related to reservoir production. Correction for these time shifts, or for the underlying velocity perturbations and/or subsurface displacement in an imaging sense, also permits accurate evaluation of time-lapse amplitudes by attempting to decouple the kinematic component. To date, there are approximately thirty methods for time-shift estimation described in the literature. We group these methods into three main families of mathematical development, together with several miscellaneous techniques. Here we detail the underlying bases for these methods and highlight the acknowledged benefits and weaknesses of each class of method. We illustrate this review with a number of time-lapse seismic examples from producing fields. No method is necessarily superior to the others, as the selection depends on ease of implementation, the noise characteristics of the field data, and whether the inherent assumptions suit the case in question. However, cross-correlation stands out as the algorithm of choice on the basis of the Pareto principle, and waveform inversion as the algorithm delivering the best resolution. This is a companion study to our previous review of time-shift magnitudes and a discussion of their rock physics basis.


I N T RO D U C T I O N
Historically, the business case for 4D seismic data was built using post-stack amplitude differences. As interpretation matured, more value was added by extending the analysis to prestack amplitude differences. Although post-stack time-shift analysis is less mature than amplitude analysis, it has become increasingly popular over the past eighteen years. Indeed, time-shift analysis has evolved from purely cross-equalization or residual migration techniques that 'clean up' amplitude differences, to a focus on their interpretative value, particularly as a measure of subsurface strain changes but also as a measure of velocity change. Importantly, the measurement of very small (sub-sample) time shifts is achievable from 4D seismic data and interpretable as changes in the reservoir and overburden, as opposed to being, as previously thought, artefacts of acquisition or processing non-repeatability. Recent cross-equalization workflows have the capability to preserve small production-related changes by taking into account the variations that may corrupt the reservoir signal during seismic acquisition and processing, provided there is sufficient input on the changing near-surface conditions (we will describe these details in a future paper). In well-processed data, where adequate control of timings has taken place during processing, we have observed interpretable 4D time shifts for a variety of processing workflows, overburden complexity and levels of non-repeatable noise. Despite the varied nature of the noise and distortion, time shifts can be successfully measured for repeat time periods of a few months to tens of years (MacBeth et al., 2019). We conclude that in some projects, time shifts may be easier to interpret than amplitudes and represent the more robust dynamic reservoir attribute (Byerley et al., 2006; Fehmers et al., 2007; Brain 2017).
The list of potential benefits of time shifts to reservoir management is growing, with many excellent examples in clastic, chalk, hard rock carbonate, deep high-pressure high-temperature (HPHT) gas condensates, heavy oil reservoirs and even fields with thick stacked sequences of thin flow units (Tatanova 2018). Although to date the most popular motivation to use time shifts is geomechanical assessment, particularly of the overburden, there are a multitude of other benefits cited such as monitoring re-pressurization by gas or water injection; assessment of compartmentalization; delineation of flood movement or gas exsolution; and use as a seismic production logging tool to evaluate well perforation performance. In our companion paper on the topic of post-stack time-lapse time shifts, we reviewed and discussed published literature available to the authors to assess the magnitude of those time shifts originating from geomechanical or saturation-driven effects. Rock physics calculations helped us understand the link between the mechanisms of reservoir fluid flow and these small shifts in 4D seismic data. In this review article, we describe the many varied algorithms for measuring time shifts in 4D seismic data, the current limitations of these techniques and the future of time-shift estimation. Despite the move towards prestack analysis, we will focus this review mainly on post-stack data.
The analysis of dynamic time shifts is shared with many outside disciplines such as speech processing, satellite imaging, robotic vision, sonar, radar, brain scanning and the film and gaming industry, and there has been some degree of cross-fertilization between these disciplines and our own. From a historical perspective, time-shift measurement tends to be based on manual or automated picking of common events on the baseline and monitor, and interpretation of these misalignments. For example, Larsen and Lie (1990) published their 1989 observations of an underground blowout in the Frigg field, North Sea, which had very large time shifts (easily visible when the event picks are shown side by side). Guilbot and Smith (2002) used differences in event times to conclude that the overburden has stretched due to the compacting Ekofisk field. Travel-time differencing can be fairly accurate for large time shifts of a few milliseconds or tens of milliseconds but clearly requires human interaction, thus making measurement of these time shifts time-consuming. The time-shift analysis terminology developed in other domains frequently refers to 'warping'. In this review, we will therefore assume that 'time-shift analysis' and 'time warping' are synonymous. An automated approach to '3D warping' (time shift and lateral shift) was first proposed by Hall et al. (2002), who applied a cross-correlation method to the compacting chalk reservoir of the Valhall field. These authors align discrete horizons dynamically, with the time shifts interpolated between key events to fill the volume, constrained by a deformable mesh. In this technique, the focus is still on isolated events rather than on the entire seismic cube. Thus, we cannot adequately capture high-frequency information on velocity changes using this technique. Subsequently, more automated methods have evolved to measure the entire trace.
Despite shifts between the baseline and monitor volumes being strictly three dimensional in nature, the majority of the new techniques consider only vertical time shifts. All of the methods reliably estimate time shifts to sub-sample accuracy if noise permits, but each claims to offer certain advantages in terms of efficiency, stability and accuracy. The time-shift volumes, whilst most certainly containing considerable information content relative to the early development of 4D acquisition and analysis, still exhibit a certain degree of error and uncertainty. The intention of this review is to create a guide to the mechanisms of the most popular and technically promising algorithms. Application of the methods will be illustrated by field examples from a variety of different geographical settings available to the authors. Figures 2 to 12 will show these examples as variously applied to the data sets detailed in Appendix A. The performance largely depends on implementation, parameter settings and the user. We will not compare and contrast the individual methods in depth or discuss errors and uncertainty (the topic of a future follow-up paper). We begin with a description of the context in which post-stack time shifts are considered.

The objective of dynamic time-shift analysis or warping
In time-lapse seismic analysis, the baseline survey is ideally recorded before production or injection into the reservoir. The monitor survey then images the subsurface when the velocity (and density) has been moderately perturbed by the action of production or injection into the reservoir. During processing of the (pre-production) baseline and monitor data to final stacked images, it is common to migrate both data sets using identical velocity models. This is a natural choice determined by time restrictions, ease of processing and lack of information on the actual perturbations. The monitor survey is therefore migrated with the incorrect velocity model (supposing, of course, that the correct velocity model has indeed been found for the baseline data). There have been studies on the sensitivity of stacked images to incorrect choices in velocity model and the reliability of the information content of the final stacked images (for example, Grubb et al., 2001; Poliannikov and Malcolm 2016). These indicate that if the velocity corrections in the model are smooth, the continuity in the stacked amplitude of structural events remains more stable than their positions. Thus, for stacked volumes interpreted in the time or depth domain, the main effect on the monitor image is a shift both vertically and laterally with respect to the baseline. As time-lapse data are frequently interpreted in the time domain, the vertical shifts are referred to as time shifts. These shifts represent a cumulative response as a result of integration of the velocity perturbations along the seismic wavepath, and are therefore time variant and anticipated to be smooth in nature. For time-lapse studies with a slowly varying velocity change in the overburden, a relationship between lateral and vertical shifts can be determined (Hale et al., 2008). These show that time shifts from a compacting reservoir point downwards, whilst lateral shifts point outwards.
The imaging conditions above are largely satisfied in practice as velocity changes due to geomechanical deformations in the overburden from a compacting (due to pore pressure depletion) or expanding (due to pore pressure inflation) reservoir are typically long wavelength in nature and small (up to 0.5%). By contrast, reservoir fluid-flow or pore pressure changes due to production or injection are generally confined to flow units that are very thin relative to the overburden, and the associated velocity changes are considered to have a higher spatial frequency. These changes are typically up to 5% in magnitude and are particularly large around wells. Shifts are most commonly observed at base reservoir in a thick sequence (for example Valhall) or in the underburden. At the top of the reservoir interval itself, there are strong changes in reflection amplitude and also the possibility of phase changes as waveforms constructively or destructively interfere due to 4D tuning between the baseline and the monitor. Thus, a clean separation of amplitude from time-shift effects becomes harder to achieve at the reservoir level. Discrete velocity perturbations may also be present in the overburden due to the presence of leaked gas, injected gas or cuttings, and fault re-activations or stress re-distributions causing potential drilling hazards (Røste and Ke 2017; Borges et al., 2020). These perturbations may generate offset variations in both time shift and amplitude due to scattering, and thus impact the post-stack response.
It is usual in most 4D seismic interpretations to ignore the lateral shifts and to focus solely on the time shifts (denoted Δt in this paper). These time shifts, Δt, are commonly estimated on the post-stack images in the time domain, and this will be the predominant focus of the current review. The usual practice is to first measure the time shifts and then correct the monitor data for these shifts, before differencing the volumes to derive corrected amplitude changes (Fig. 1 gives an example for a single trace extracted from an observed seismic volume before and after warping) and an independent volume of time shifts to carry forward through the interpretation workflow. We will discuss towards the end of this paper the limitations of this post-stack flow. This imaging-based discussion of 'time' shifts and migration is slightly misleading, since even if the correct velocity model for the monitor is employed, time shifts would still arise. However, if the stacked images in the depth domain are considered instead, events would now align provided no significant depth strain is present. Depth shifts would thus appear to be a more natural choice for comparison purposes. However, the reference body of time-shift measurements is extensive, and it is too late to reverse this trend.
To describe the techniques for time-shift estimation, we first consider the most popular mathematical model for post-stack, time-domain data. Note that this is not the only model possible, but the one adopted by the majority of studies. In this, the monitor survey data m(t) are defined as a time-shifted version of the original baseline traces b(t) (defined as the pre-production signal g(t) plus noise n1(t)), plus an underlying time-shifted, reservoir-related signal s(t). Thus,

b(t) = g(t) + n1(t), (1a)

m(t) = g(t + Δt) + s(t + Δt) + n2(t), (1b)

where Δt = f(t) is the time-variant time-shift function, s(t) is the reservoir signal that may consist of the amplitude change, and n1(t) and n2(t) are the noise on the baseline and monitor traces respectively. Both Δt and s(t) are the desired outputs from the methods. Time shifts range from a fraction of a millisecond to tens of milliseconds, largely dependent on the reservoir mechanisms and field properties. For a reservoir that is highly compressible, deformations will persist into the overburden and time shifts are predicted to evolve in a roughly linear fashion from the near surface to reservoir. For fluid changes confined solely to the reservoir, the time shifts will start at the reservoir level. An identical effect is observed for pore pressure increases surrounding injectors.

Figure 1 Baseline (B) and monitor (M) trace with the corresponding time shift Δt that warps one onto the other. Traces and warp are from data of an observed reservoir saturation change. The aim of the variously published methods is to provide an exact mapping between the two traces, and to recover a representative time shift that produces the mapping. MC represents the monitor trace corrected for the dynamic time shift.

Detailed and accurate measurement of time shifts is hampered by the limited bandwidth of seismic data, noise and the oscillatory nature of the seismic traces. The extraction of a reliable measure of Δt and s is accomplished by imposing different assumptions on (1), as discussed below. In the following sections, we describe how each category of method copes with the time-shift measurement, and their initial underlying assumptions. Unless otherwise specified, most methods apply the following assumptions:
• The changes in the reservoir properties are focussed mainly on the (P-wave) velocity
• The reflectors where these changes take place are sub-horizontal
• The seismic traces are observed at near-zero offset
• The velocity varies smoothly laterally
In the case where we choose to migrate the baseline and monitor seismic volumes with the same velocity field, the latter three assumptions imply that the relative time shifts are vertical. We may therefore perform our 'warping' calculations on a trace-by-trace basis (or groups of traces), vertically.
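To make the model concrete, the following sketch (our own illustrative construction, not taken from any cited implementation) builds a synthetic baseline and monitor trace in the spirit of equation (1): a sparse reflectivity convolved with a Ricker wavelet as g(t), a smooth time-variant shift f(t) applied by interpolation, and independent noise on each trace. The wavelet frequency, noise level and shift profile are arbitrary choices.

```python
import numpy as np

def ricker(f, dt, n):
    """Zero-phase Ricker wavelet of peak frequency f (Hz)."""
    t = (np.arange(n) - n // 2) * dt
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

rng = np.random.default_rng(0)
dt = 0.004                                   # 4 ms sample interval
t = np.arange(500) * dt                      # 2 s two-way time axis

# Pre-production signal g(t): sparse reflectivity * 25 Hz wavelet
refl = rng.standard_normal(t.size) * (rng.random(t.size) < 0.05)
g = np.convolve(refl, ricker(25.0, dt, 101), mode="same")

# Smooth, time-variant shift function f(t): zero at the surface, ~4 ms at depth
dshift = 0.004 * (t / t[-1]) ** 2

b = g + 0.05 * rng.standard_normal(t.size)   # b(t) = g(t) + n1(t)
# Monitor: g evaluated at t + f(t) via interpolation, plus noise; s(t) omitted
m = np.interp(t + dshift, t, g) + 0.05 * rng.standard_normal(t.size)
```

Traces b and m of this form are used as test inputs for the method sketches later in this review.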

T I M E -S H I F T M E A S U R E M E N T A N D M E T H O D S
Methods in the literature split broadly into three distinct categories based on (1) the cross-correlation approach, (2) a first-order Taylor series expansion and (3) non-linear inversion that fits the entire waveform. A fourth group of miscellaneous methods exploits a range of innovative signal processing techniques that do not readily fall into these three main categories. Methods are defined as miscellaneous if they relate to a singular application and have not yet been broadly implemented, and are thus more difficult to evaluate for field data application. Most notable of these is dynamic time warping (DTW; Hale 2007, 2009, 2013), for which some field data examples are recently emerging. This is significant as the majority of the other miscellaneous methods are window based (with the exception of the waveform inversion methods), being applied vertically on traces in the time domain. Another category of methods attempts to incorporate image consistency in the methodology, taking into account the ray paths of non-normal incidence waves and complex structural dip. Finally, there are methods that measure the full 3D warping, that is, lateral and vertical shifts, either simultaneously or in sequence. Although infrequently applied in practice, these methods are important as they could close the loop with residual migration methods and merit further attention. All available methods published in the open literature to date are listed and referenced in Table 1, and the main categories are described in more detail in the sections below. Our intention is to summarize the methods using easy-to-follow mathematics and thus provide a guide to the reader writing their own code and applying these to their own data set.

Table 1 Tabulated summary of published time-shift measurement methods in the recent geophysical literature with some historical additions to add perspective

No. | Description | Reference

Discrete time shifts
1 | Pick common events, baseline and monitor | Various

Cross-correlation
2 | Cross-correlation, running window | Knapp and Carter (1976) and various
3 | Cross-correlation, running window | Nickel and Sonneland (1999)
4 | Fast local cross-correlations | Rickett et al. (2006)
5 | Weighted fast local cross-correlations | Hale (2009)
6 | Phase correlation | Tomar et al. (2016)

Taylor series approximation
7 | Non-rigid matching | Nickel and Sonneland (1999)
8 | Correlated leakage | Whitcombe et al. (2010)
9 | Least-squares solution | Hatchell et al. (2003), Aarre (2006)
10 | Multi-vintage | Naeini et al. (2009)
11 | Taylor series plus amplitude term | Lie (2011)
12 | Multi-scale and iterative refinement of optical flow (MSIROF) | Zhang and Du (2016)

Waveform inversion
13 | Non-linear optimization | Rickett et al. (2007)
14 | Non-linear optimization with Gaussian reconstruction | Nguyen (2018)
15 | Inversion of time shifts and coupled amplitudes, inversion to relative velocity change | Williamson et al. (2007), Keho et al. (2014)
16 | Williamson including the density term |

Miscellaneous
19 | Image-consistent time-strain analysis | Hoeber et al. (2016)
20 | Plane-wave deconstruction with spectral decomposition | Phillips and Fomel (2017)
21 | Local similarity | Fomel and Jin (2009)
22 | Plane-wave destruction filters | Phillips and Fomel (2016)
23 | TT transform | Naeini and Hoeber (2013)
24 | Short-window variance method | Zhen et al. (2016)
25 | Third-order statistics | Nikias and Pan (1988)
26 | Mutual information and signal envelope | Donno et al. (2017)
27 | NRMS metric | Lecerf et al. (2015)

Space and time warping
28 | Warping in X-Y space and two-way time, nodes at discrete reflectors, 3D cross-correlation, interpolation of warp vectors | Hall et al. (2002)
29 | Speedup in 3D cross-correlation | Hale (2007)
30 | Taylor series approximation to determine 3D warping | Aarre (2006)
31 | Extension of dynamic time warping, termed dynamic image warping | Hale (2010)
In addition, for reference, several methods are chosen and applied to four test data sets A, B, C and D with different signal-to-noise characteristics and production effects (see Appendix A for summary details), and the results are shown in Figs. 2, 3, 4 and 5 for each data set. Each method is applied with optimal parameters, and has been fairly treated.

Methods based on local cross-correlation
The most widely implemented time-shift measurement methods are based on a local cross-correlation. Cross-correlation, or XCR for short, is the basis of many operations used to filter, process or align seismic traces. Examples of early work on the use of correlation for time-shift analysis of waveforms include that of Knapp and Carter (1976) for sonar applications. In seismology, cross-correlations between waveforms on different sensors have long been employed to estimate delay times (for example, VanDecar and Crosson 1990). The more recent time-lapse seismic applications typically assume that the monitor data can be expressed solely as a time-shifted version of the baseline data: that is, for each trace, m(t) = αb(t + Δt) + n(t), where α is a scalar constant and n(t) the non-repeatable noise term. Thus, time-lapse changes in the reflected amplitude or waveform expressed via the reservoir term s(t) in (1) are assumed to not impact the cross-correlation significantly.
An additional, critical limitation for cross-correlation is that only a Δt smaller than half of the dominant seismic period T can be measured, as larger values cannot be resolved due to cycle skipping (for example, for 25 Hz data, T = 0.04 s and Δt < 20 ms). The time shift must also be slowly varying (∂(Δt)/∂t ≪ 1). The most common implementation of XCR involves windowing the baseline and monitor traces by a sliding window of half-width N sample intervals (an odd number of samples is best such that the window centre t = t_k is located on a sample). For each window, the cross-correlation coefficient C(τ_j) is computed for trial (integer) lags τ_j introduced into the monitor trace relative to the baseline, and the value τ_j = τ_max that gives maximum correlation is determined. This lag then becomes the desired time shift Δt = τ_max at that window location. Next, the window is moved to a new location and the process repeated. This process continues along the entire trace until the dynamic time shift is defined. For a window of half-width N samples, centred at the discrete two-way time t = t_k, the desired cross-correlation function is

C_k(τ_j) = Σ (i = −N … N) w(t_i) b(t_k + t_i) m(t_k + t_i + τ_j), (2a)

or

C_{j,k} = Σ (i = −N … N) w_i b_{k+i} m_{k+i+j}, (2b)

where w(t_i) is a window weighting function designed to prevent end effects, and the subscripts on the baseline trace b and monitor trace m in (2b) refer to the particular position of the trace value. The j index indicates the discrete lag, whilst the k index gives the location of the window centre. The output of this equation is a cross-correlation panel of lag versus two-way time for each trace. Typical weights include the triangular function (zero at either end of the window, with a linear ramp to unity at the mid-window position), the Hanning function (cosine tapered to zero at either end) or a Gaussian function. The Gaussian weighting function provides optimal resolution (Hale 2006). Typically, several cycles of the traces are included within the window.
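A minimal trace-by-trace sketch of this windowed search, assuming a Gaussian taper and quadratic sub-sample refinement of the correlation peak, might read as follows; the function name and parameter defaults are our own and do not correspond to any published implementation.

```python
import numpy as np

def local_xcorr_shift(b, m, dt, half_width=25, max_lag=10):
    """Estimate a time-variant shift (seconds) by windowed, normalized
    cross-correlation with a Gaussian taper, refining the peak lag to
    sub-sample precision with a quadratic fit.  Illustrative sketch."""
    n = b.size
    i = np.arange(-half_width, half_width + 1)
    w = np.exp(-0.5 * (i / (half_width / 2.0)) ** 2)   # Gaussian window weights
    shifts = np.zeros(n)
    for k in range(half_width + max_lag, n - half_width - max_lag):
        bw = w * b[k - half_width:k + half_width + 1]
        cc = np.zeros(2 * max_lag + 1)
        for j, lag in enumerate(range(-max_lag, max_lag + 1)):
            mw = w * m[k - half_width + lag:k + half_width + 1 + lag]
            denom = np.sqrt(np.dot(bw, bw) * np.dot(mw, mw))
            cc[j] = np.dot(bw, mw) / denom if denom > 0 else 0.0
        j = int(np.argmax(cc))
        lag = float(j - max_lag)
        # Quadratic fit through the peak and its neighbours for a sub-sample lag
        if 0 < j < cc.size - 1:
            den = cc[j - 1] - 2.0 * cc[j] + cc[j + 1]
            if den < 0:
                lag += 0.5 * (cc[j - 1] - cc[j + 1]) / den
        shifts[k] = lag * dt
    return shifts
```

The double loop mirrors the explicit evaluation of (2) at every window position; the fast recursive variants discussed below avoid exactly this cost.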
For long windows, local amplitude variations can bias the results when searching the cross-correlation panel. To avoid this, the normalized cross-correlation coefficient is used. Note that if the cross-correlation lag is evaluated 'on the fly' for each window position, normalization will have no effect. Such a computation, however, gives results that are noisy and lacking in robustness. Bias, artefacts and noise in the results can also arise if the window is too small or for an incorrectly chosen weighting function. The key to satisfactory use of this method is in the choice of window size and step length, which determines the balance between instability and accuracy for any particular data set, frequency content and set of time-shift values (Hodgson 2009). This is usually accomplished by trial and error. A long window provides smooth and robust solutions but delocalizes the measurements. Such a window suppresses noise, but lacks resolution, smears across zones and can under-predict values at top reservoir where sharp changes need to be resolved. The various choices made during the application of cross-correlation and their subsequent impact on the time-shift calculations are discussed by Hodgson (2009). To progressively enhance resolution, an iterative multi-scale approach may be followed. Here a long window length is first applied to access long-wavelength time shifts, and the monitor data are then corrected for the derived time-shift field. A second application of the method with a slightly shorter window then picks up finer, short-wavelength detail, and the monitor data are once again corrected. This process is repeated several times until the desired level of detail is obtained or noise appears in the results. Figure 6 gives an example of this iteration and what may be achieved at each step.
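The iterative multi-scale idea can be sketched as below, assuming a simple unweighted correlation search at each scale; the window lengths, the boxcar smoothing of the raw estimate and the interpolation-based correction of the monitor are all illustrative choices of ours, not a reproduction of any published workflow.

```python
import numpy as np

def window_shift(b, m, k, hw, max_lag):
    """Integer lag (samples) maximizing the normalized windowed correlation
    between b and m for a window of half-width hw centred at sample k."""
    bw = b[k - hw:k + hw + 1]
    best, best_c = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        mw = m[k - hw + lag:k + hw + 1 + lag]
        c = np.dot(bw, mw) / (np.linalg.norm(bw) * np.linalg.norm(mw) + 1e-12)
        if c > best_c:
            best, best_c = lag, c
    return best

def multiscale_shifts(b, m, dt, widths=(100, 50, 25), max_lag=10):
    """Iterative multi-scale estimation: long windows capture the
    long-wavelength shift, the monitor is corrected by warping, and
    shorter windows then refine the residual.  Illustrative sketch."""
    n = b.size
    t = np.arange(n) * dt
    total = np.zeros(n)
    mc = m.copy()
    for hw in widths:
        est = np.zeros(n)
        for k in range(hw + max_lag, n - hw - max_lag):
            est[k] = window_shift(b, mc, k, hw, max_lag) * dt
        # Smooth the raw integer-lag estimate before applying the correction
        est = np.convolve(est, np.ones(2 * hw + 1) / (2 * hw + 1), mode="same")
        total += est
        mc = np.interp(t + total, t, m)   # warp the original monitor by the cumulative shift
    return total, mc
```

Each pass both refines the shift volume and returns a progressively better-aligned monitor, mirroring the correct-and-re-estimate loop described above.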
The results for both time shifts and the corresponding corrected 4D data are often subtle, but are particularly relevant when chasing small-scale features of importance. An alternative to this approach, analogous to full-waveform inversion (FWI) implementation (FWI involves the inversion of the diving-wave components of the seismic waveform), is to start with data filtered to the lowest frequency passband (say 5 Hz), before progressively increasing bandwidth in a multi-scale fashion. Finally, in order to gain stability and spatial smoothness across the time-shift volume, in most implementations cross-correlation functions from several neighbouring traces (both crossline and inline) are summed before the maximum is calculated. In practice, a 5 × 5 spatial window around the trace of interest is sufficient to ensure adequate lateral smoothing. Local windows can make cross-correlations sensitive to dominant high-amplitude events and spurious cross-correlation peaks. To avoid this, the normalized cross-correlation function can be used instead (Hale 2009). Finally, for any window-based technique that constructs a metric from a limited number of samples, we must consider statistical significance (Huang et al., 2012). In the data volume, it is possible to keep track of the individual time shifts and corresponding peak cross-correlation values. A cross-plot of these two quantities allows the user to filter out those time shifts with lower confidence (hence lower correlation coefficient for a particular window size). This technique can be of interpretative value. Figure 7 shows how filtering with the maximum cross-correlation values may assist in adjusting our confidence in the time-shift results. Accepting only results with cross-correlations above a defined threshold creates holes and stripes in the results relating to low signal-to-noise values in the data. The final result may, however, be less acceptable for further interpretation.
One major drawback of the local cross-correlation approach is that it is computationally time-consuming to apply on large data volumes, and thus a compromise between stability, speed and resolution must always be sought. Several improvements have been suggested for this well-recognized problem. In particular, Rickett et al. (2006) introduced a fast local correlation using a recursive formula to calculate cross-correlations at successive window positions without the need to completely re-compute (2) at each new position. Thus, for uniform window weights, the cross-correlation coefficient C_{j,k+1} at position t = t_{k+1} is related to the cross-correlation C_{j,k} evaluated at window position t = t_k by the recursive relation

C_{j,k+1} = C_{j,k} + b_{k+1+N} m_{k+1+N+j} − b_{k−N} m_{k−N+j}. (3)

Equation (3) provides a fast algorithm for computing local cross-correlations centred at all N_k values of k across the entire 3D data volume. According to Rickett et al. (2006), the cost of calculating the local cross-correlation explicitly as in (2) is proportional to N_k N², but this reduces to N_k N using (3) (Hodgson 2009). Furthermore, the recursive formula can also be applied to subvolumes containing several neighbouring traces either side of the windowed trace of interest, such that summed correlations within subvolumes can be treated with relative efficiency. A similar cost reduction is achieved by Hale (2006) using Gaussian windows for the local correlation and re-interpretation of the problem as recursive Gaussian filtering. In this solution, the computational cost is now independent of window size. Another potential disadvantage of cross-correlation is that the maximum is often not accurately determined at the desired resolution. This may be improved by up-sampling the data or by interpolation between samples with a function to locate the inter-sample peak. This typically consists of a quadratic function fit in the least-squares sense to correlation values bracketing the maximum. To improve on the definition of the maximum in the cross-correlation, Tomar et al.
(2016) employed the phase correlation approach originally developed for satellite imaging. The phase correlation method constructs the inverse Fourier transform of the cross-power spectrum with the amplitude spectrum set to unity. This essentially whitens the amplitude spectrum of the cross-correlation function, retaining only the phase, and creates a sharp and distinct correlation peak. To extend the method to sub-sample accuracy, the result must be interpolated, and for this they use the analytic approach of Foroosh and Balci (2002). In all of the modifications above, interpolation does still incur additional computation time. Figures 2(a), 2(b), 3(a), 3(b), 4(a), 4(b), 5(a) and 5(b) show applications of fast cross-correlation and phase correlation to our four test data sets in Appendix A. All computations were iterative and based on decreasing window size from 800, 600, 400 to 200 ms. Both techniques give similar results, although phase correlation is more sensitive to low signal-to-noise data sets (such as D).
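The core phase-correlation construction can be sketched in a few lines (integer-lag only; the sub-sample refinement of Foroosh and Balci (2002) is not reproduced here). This sketch is our own; the sign convention returns a positive lag when the monitor is delayed relative to the baseline.

```python
import numpy as np

def phase_correlation_lag(b, m):
    """Return the integer lag of m relative to b from the peak of the
    phase-only (whitened) cross-power spectrum.  Illustrative sketch."""
    B = np.fft.rfft(b)
    M = np.fft.rfft(m)
    cross = np.conj(B) * M
    cross /= np.abs(cross) + 1e-12        # whiten: keep phase, discard amplitude
    r = np.fft.irfft(cross, n=b.size)     # sharp, distinct correlation peak
    k = int(np.argmax(r))
    return k if k <= b.size // 2 else k - b.size   # map FFT bin to signed lag
```

Because the amplitude spectrum is set to unity, the peak of r is far sharper than that of a conventional cross-correlation, which is precisely the property exploited for accurate lag picking.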

Methods based on a Taylor series expansion
Another set of methods relies either implicitly or explicitly on a first-order Taylor series approximation of the image signal. For this approach, it is assumed that the monitor survey is the time-shifted baseline data, that is, m(t) = b(t + Δt). Then, for small time shifts that are slowly varying functions of t, the time-shifted data b(t + Δt) can be expanded as a Taylor series about the original baseline data b(t), retaining only the first-order term:

m(t) = b(t + Δt) ≈ b(t) + Δt ∂b(t)/∂t, (4)

where terms of second order or higher in Δt are ignored. The s term in (1b) is also ignored. The paper of Horn and Schunck (1981) from the field of artificial intelligence is often cited as an early example of solving the problem of image registration using this approach. For the 4D seismic problem, Nickel and Sonneland (1999) use an algorithm called non-rigid matching that is similar to Horn and Schunck (1981), although their first paper does not explicitly reference this work. Nickel and Sonneland (1999) introduced this in the context of cross-equalization of data acquired from non-repeatable acquisition and processing. Exact details are not given, although the authors mention that the method can also be applied to estimate lateral shifts. In a later paper by Nickel et al. (2003), Horn and Schunck (1981) are cited in the application to vertical warping, and the method is applied to the compacting chalk reservoir of Ekofisk. In a following study, Aarre (2008) provides details of the algorithms used and extends the method to 3D (inline, crossline and vertical shifts), before further applying the technique to time-shift interpretation on the Norne field. More recently, Zhang and Du (2016) propose a multi-scale, iterative refinement of optical flow (MSIROF) in the general context of seismic image registration, RMS analysis and gather flattening.
The method is used to align data both in time and laterally, and in principle works in three spatial dimensions and with temporal snapshots, although only lateral and time-shift examples are shown in this publication.
The multi-scale aspect of the algorithm, in which we access successively coarser scales, surmounts the difficult problem of measuring large time shifts and overcomes cycle skipping. By repeated application of the method, the shift estimates can be refined, avoiding the problem of Taylor series methods only being applicable to small shifts. This particular multi-scale algorithm breaks down the data into tiered, resampled volumes, starting with the coarse scale before then moving to the next finest scale until completion (similar to the multi-scale window approach above for cross-correlation). Finally, a multi-vintage approach is also included, where many monitors may be involved in the computation. The use for 4D seismic data analysis specifically is demonstrated by comparison with several other algorithms for time-shift estimation (XCR and the correlated leakage method (CLM), see below). Kanu et al. (2016) compared MSIROF with four other techniques, including XCR, non-linear inversion (NLI) and DTW (again, see below for an explanation of these acronyms). These are applied to synthetics generated from a Gulf of Mexico subsalt model. Returning to the original 1D Taylor series expansion in (4), there are a number of other methods based on this expansion. First we re-write (4) as

m(t) ≈ b(t) + Δt ḃ(t), (5)

where ḃ is the time derivative of the baseline trace. To estimate the time-variant time shift Δt, one straightforward solution is to minimize the least-squares objective function, J, between the monitor and baseline data with respect to the lag Δt, using the approximation in (5) in place of the monitor data m(t) (Hatchell et al., 2003):

J_k = Σ (i = −N … N) [m(t_k + t_i) − b(t_k + t_i) − Δt ḃ(t_k + t_i)]², (6)

where J_k is calculated within a running time window of half-width N samples and centre at t = t_k. As before, the data can be weighted to avoid end effects. The least-squares solution that minimizes J_k is determined from (5) by multiplying both sides of the equation by ḃ and summing:

Δt_k = Σ ḃ (m − b) / Σ ḃ², (7)

where the summations are again over the time window of interest.
The original Taylor series estimation of Hatchell et al. (2003) employs a triangular filter within the time gate and also an areal average of several traces to stabilize the result (similar to the cross-correlation computation above). Note that Taylor expansion solutions such as in equation (7) will tend to underestimate the time shift because the noise term is not cancelled in the denominator. Aarre (2006) proposed a similar equation and the use of 1D Taylor series expansions to solve 3D warping (inline, crossline and vertical shifts) by iteratively computing each dimension in sequence. As an improvement, Naeini et al. (2009) develop the Taylor series expansion approach for multi-vintage analysis - that is, a baseline and several monitors. They exploit the requirement for forward (between monitor 1 and monitor 2) and backward (between monitor 2 and monitor 1) solutions to agree. This is achieved by forcing this consistency subject to smoothness constraints applied via Lagrange multipliers, from which they are able to find a global solution (which they call the Lagrange-Taylor solution) applicable to a large number of monitor surveys. Whilst the approach uses the power of multiple data sets and large combinations for time-shift calculation to combat noise, it still suffers from the small time-shift limitation of the Taylor series expansion. For completeness, the Taylor series expansion approach can be further extended to include the possibility of a slowly varying, time-variant amplitude scaling term ε, so that to first order in Δt

m(t) ≈ (1 + ε) b(t) + Δt ḃ(t),

where the second-order cross-term ε Δt ḃ(t) has been neglected. Minimizing the objective function of equation (6) again, now with respect to both Δt and the amplitude scalar ε, yields two normal equations that can be solved to determine Δt and ε:

Σ ḃ (m − b) = Δt Σ ḃ² + ε Σ b ḃ

and

Σ b (m − b) = Δt Σ b ḃ + ε Σ b²,

where the summation is, as before, over a weighted time window (the weighting term is not included for reasons of compactness).
The approach allows for the possibility of timelapse amplitude terms to be calculated simultaneously with the time shifts, without the need to link these terms explicitly by velocity change. The method assumes that the amplitude scalar is also smoothly varying, which is a poor assumption. In a later section, we will show a method in which time shift and amplitude can be simultaneously inverted to yield a common velocity change.
An alternative approach exploiting the Taylor series expansion is given by Whitcombe et al. (2010), where equation (4) is first rewritten in terms of the difference data:

m(t) − b(t) ≈ Δt ḃ(t).

Next, the average of the baseline and monitor data, λ(t) = [m(t) + b(t)]/2, is time-shifted by a known quantity ΔT and differenced from its original non-time-shifted form:

λ(t + ΔT) − λ(t) ≈ ΔT λ̇(t).

On the additional assumption that λ̇(t) ≈ ḃ(t), which is valid for small time shifts, cross-plotting m(t) − b(t) against λ(t + ΔT) − λ(t) yields the desired time-shift estimate relative to the known user-defined reference shift (i.e. Δt/ΔT) as a gradient. According to Whitcombe et al. (2010), calculating the gradient of the cross-plot to obtain the time shift avoids the problems of zeros or noisy derivatives usually encountered with other techniques. Numerical implementation can use a 1D, 2D or 3D window of trace segments added into the cross-plot and gradient evaluation to suppress noise, at the cost of increased computational time. The approach is given the name correlated leakage method (CLM). It is stated that CLM can be modified to also estimate lateral shifts, although no publications have been found that apply this extension. The appropriate working limit is for time shifts of magnitude less than 10% of the dominant seismic period (Δt < T/10); thus, the method should be primed by other methods such as cross-correlation if a wider working range is desired. Figures 2(c), 3(c), 4(c) and 5(c) show applications of the Taylor series method of Hatchell et al. (2003) (TSH) to our test data sets, whilst Figs. 2(d), 3(d), 4(d) and 5(d) show the results of the correlated leakage method (CLM) of Whitcombe et al. (2007). Again, all methods are applied in an iterative fashion (800 → 600 → 400 → 200 ms windows). TSH utilizes a window of eight traces whereas CLM uses seven traces, which results in the benefit of spatial smoothing.
There are clear differences in the results, and a problematic data hole in the TSH results for data set C. We will return below to further techniques that use the Taylor series expansion either implicitly or explicitly.
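The cross-plot gradient at the heart of CLM can be sketched as follows. This is our own minimal single-gradient illustration of the Whitcombe-style idea, not the published implementation; the reference shift ΔT is taken as one sample, and the function name and synthetic test pulse are our own.

```python
import numpy as np

def clm_time_shift(b, m, ref_shift_samples=1, dt=0.004):
    """Correlated-leakage-method style sketch: the difference trace m - b is
    cross-plotted against the average trace shifted by a known reference
    amount dT, and the least-squares gradient of that cross-plot gives the
    time shift relative to dT."""
    d = m - b                                    # difference data
    lam = 0.5 * (m + b)                          # average trace lambda(t)
    x = np.roll(lam, -ref_shift_samples) - lam   # lambda(t + dT) - lambda(t)
    x[-ref_shift_samples:] = 0.0                 # discard wrapped end samples
    g = np.dot(x, d) / np.dot(x, x)              # cross-plot gradient dt/dT
    return g * ref_shift_samples * dt            # time shift in seconds
```

A single gradient is returned here for the whole trace; a practical implementation would evaluate the gradient within running 1D, 2D or 3D windows of trace segments, as described in the text.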

Waveform-based estimation (non-linear inversion)
Waveform inversion-based techniques, beginning with tomography methods (for example, Pratt et al., 1998) and evolving to full-waveform inversion (see, for example, Warner et al., 2013), are widely used for seismic velocity analysis and inversion. In the context of 4D time-shift estimation, waveform inversion was first proposed by Rickett et al. (2007a,b) in a method called non-linear inversion (NLI). Here, use is made of standard non-linear optimization techniques to match the monitor volume m to the baseline volume b via an inverted and regularized time-shift volume. Unlike the previously described methods, this approach operates on the entire volume of traces rather than a specific window and therefore seeks a global solution (we will see below that DTW also seeks to provide a simultaneous windowless solution). The problem is framed as the estimation of a time-shift volume Δt that, once applied, will minimize the misfit between the baseline and monitor seismic waveforms, on the assumption that the monitor volume is a time-shifted version of the baseline. The misfit function F_R is computed as the square of the L2 norm of the difference between the time-shifted monitor volume and the original baseline data:

F_R(Δt) = ‖f(m, Δt) − b‖²,

where f is an operator that applies the appropriate time shifts Δt at each sample to the monitor data (see the later section below on time-shift correction; there are several possibilities). Rickett et al. (2007b) chose for optimization a descent-based Gauss-Newton method that searches for a local minimum by linearizing around the solution (primed by a linear preconditioned conjugate gradients solver), followed by further rounds of update and iteration. Gauss-Newton in many ways resembles the first-order Taylor expansion from the previous section, in that the time-shift operation is linearized, but it is clearly multi-dimensional in nature.
In this procedure, the time-shift volume is iteratively updated, such that the (n + 1)th iterate is given relative to the nth via

Δt_{n+1} = Δt_n − (JᵀJ)⁻¹ Jᵀ r,

where r is the residual f(m, Δt_n) − b and J is the matrix of derivatives of the residual with respect to each time shift. In an application to the Genesis field, Gulf of Mexico, Rickett et al. (2007b) found that five iterations of this approach are sufficient to recover the desired time shifts. To obtain a robust and optimal solution, three smoothness constraints are added to the objective function:

F(Δt) = ‖f(m, Δt) − b‖² + α‖∇x Δt‖² + β‖∇y Δt‖² + γ‖∇t² Δt‖²,

where α, β and γ are the regularization weights (or Lagrange multipliers) that determine the balance between smoothness and stability. The two constraints that control the lateral gradients of shift via the differentials ∇x and ∇y are imposed for smoothing along the inline and crossline directions to obtain consistent time shifts from trace to trace. The vertical constraint is controlled by the second derivative of the time shift via the operator ∇t². This helps to provide stable estimates of time strains, as the weighted second derivative of the time shift must be minimized to ensure that the first derivative is stable. The three weights together control the relative weighting of the smoothing constraints, and their values are usually set through trial and error until the best result is obtained. In practice, the regularization weights depend on the absolute data values. To avoid this, the data should be normalized with respect to the 99th percentile to ensure comparison across many data sets in a portfolio. Also, as the regularization can change the magnitudes of the resultant time shifts, it should be applied conservatively. As with most waveform-based methods, this approach can experience the usual issues with cycle skipping for Δt > T/2 (where T is the dominant period of the seismic data), and thus it only works well for small time shifts. For example, for 25 Hz data, this threshold translates to Δt > 20 ms.
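A single-trace sketch of this Gauss-Newton scheme, with only a vertical (second-derivative) Tikhonov term, is given below; the full method operates on volumes with lateral constraints as well. The function names, the regularization weight and the linear-interpolation warp are our own simplifications, assuming the convention m(t) ≈ b(t + Δt(t)).

```python
import numpy as np

def warp(trace, shifts, t):
    """Evaluate trace(t - shifts) by linear interpolation."""
    return np.interp(t - shifts, t, trace)

def nli_1d(b, m, t, n_iter=5, gamma=1.0):
    """Single-trace Gauss-Newton minimisation of ||m(t - shift) - b||^2
    plus a second-derivative (vertical smoothness) Tikhonov penalty."""
    n = len(b)
    D = np.zeros((n - 2, n))                   # second-difference operator
    for i in range(n - 2):
        D[i, i], D[i, i + 1], D[i, i + 2] = 1.0, -2.0, 1.0
    shifts = np.zeros(n)
    m_dot = np.gradient(m, t)
    for _ in range(n_iter):
        r = warp(m, shifts, t) - b             # residual f(m) - b
        jac = np.diag(-np.interp(t - shifts, t, m_dot))  # diagonal Jacobian
        A = jac.T @ jac + gamma * (D.T @ D)
        g = jac.T @ r + gamma * (D.T @ (D @ shifts))
        shifts = shifts - np.linalg.solve(A, g)
    return shifts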
Rickett et al. (2007a) observed that NLI was better at resolving discontinuities than cross-correlation. Indeed, in an application to the Genesis field, they found higher definition in the thin sands above the reservoir. Another approach to regularization of this method is to use a Gaussian reconstruction method for the time-shift field (Nguyen 2018). Figures 2(e), 3(e), 4(e) and 5(e) show examples of the application of NLI2D to our test data. We define NLI2D as the inversion with the full vertical and lateral constraints in place.
The constraints are first-order Tikhonov with appropriately chosen weights. Figure 8 shows the impact of the regularization terms and demonstrates the necessity to choose correctly.

Waveform-based inversion with amplitude coupling (non-linear inversion variants)
In many time-shift methods, an often under-stated error may arise from failing to take into account changes in the reflection amplitudes leaking signal into the time-shift estimation. Thus, the correct time shifts are not necessarily the ones that minimize the amplitude differences in equation (14), even when the changes in the subsurface are purely in velocity (i.e. no density change). This is because time-lapse changes of velocity affect both the kinematics of the reflection events and their individual magnitudes, and hence the wavelet interferences. This matters in particular when we are measuring small time shifts. MacBeth et al. (2016) show that the error depends on specific subsurface contrasts, and these determine the circumstances under which such phase distortions will masquerade as time shifts. Errors are unlikely to arise in the overburden or underburden, as velocity changes there are generally small (0.1% to 0.5%) unless we are monitoring clouds of gas saturation. However, time-shift errors do become an issue in the reservoir zone when the reservoir thickness is close to tuning thickness, and can be up to 1 or 2 ms. It is anticipated that as 4D seismic technology is applied to increasingly challenging fields, we will be compelled to measure smaller and smaller time shifts, and such errors will become increasingly visible. A first attempt to consider this amplitude-kinematics coupling was published by Williamson et al. (2007), who integrated impedance change inversion into time-shift estimation, so that time warping could simultaneously invert 4D amplitude differences and time shifts via velocity change. Both inversions are given equal weight. An added modification was to pose the problem in terms of velocity changes Δv (neglecting possible changes in density) rather than time shifts or time strain. Note that in our following description of their method we use the sign convention of equation (1), which differs from their original publication.
To take account of the reflectivity changes, they assume that the time-shifted monitor survey is related to an amplitude-adjusted baseline, that is, m(t − Δt) = b̃(t). This equates to absorbing the amplitude term s(t) in equation (4) into the baseline data. To calculate the amplitude adjustment, the baseline trace is written as a convolution of the wavelet ψ and the original reflectivity series R, leading to b̃(t) = ψ ∗ (R + ΔR). Approximating the reflectivity by the small-contrast approximation

R ≈ ½ ∂/∂t ln(v/v₀),

where v₀ is a constant reference velocity and t is (dimensionless) two-way time, allows the time-lapse change in reflectivity to be expressed in terms of the fractional velocity change:

ΔR ≈ ½ ∂/∂t (Δv/v).

This development leads to the final approximation for the time-shifted (or warped) monitor survey:

m(t − Δt) ≈ b(t) + ½ ψ ∗ ∂/∂t (Δv/v).

Writing the time shift as the summation of Δv/v over the zero-offset ray path, the problem therefore becomes the minimization of a new least-squares objective function F_W,

F_W(n) = Σ [m(t − Δt) − b(t) + ½ ψ ∗ ∂n/∂t]²,

with respect to the relative velocity change n = −Δv/v (Williamson et al., 2007). (Here the factor of ½, which was absent from the original expression, is included for completeness.) The time-shift correction of the monitor is implemented using a Taylor series expansion (see the previous section for details). In the original publication, the summation is taken over the entire trace of N samples, and a solution is sought on a trace-by-trace basis as in the NLI algorithm. Two additional terms appended to the right-hand side of this objective function are constraints that ensure smoothness of the final solutions. Williamson et al. (2007) sought a minimum using Gauss-Newton optimization. This is similar to the NLI algorithm above, since without the amplitude term the method becomes identical to that of Rickett et al. (2007b), apart from slightly different smoothing constraints and a focus on velocity changes. Williamson et al.
(2007) state that the technique converges quickly if the time shifts are less than one half of the dominant period of the seismic wavelet (the usual cycle-skipping criterion). By choosing to invert for relative velocity change, the Jacobian matrix in the computation is no longer diagonal but contains off-diagonal components. Thus, the results do contain fluctuations and need adequate regularization to ensure stability. To achieve an accurate inversion, it is essential that a well-founded wavelet is supplied to the technique. In addition, both the kinematics and amplitudes of the data need to be truly correlated and not biased by the processing workflow. We call the generic form of this approach waveform inversion (WFI). The application of this WFI approach to our test data sets is shown in Figs. 2(f), 3(f), 4(f) and 5(f), and the difference between the methodology with and without the amplitude term is shown in Fig. 9(a,b). Both have vertical velocity changes smoothed and regularized by the first-order Tikhonov method. We note from our tests that the solution is strongly dependent on the choice of seismic wavelet, and the scaling must be determined carefully to ensure that adequate improvement is possible. Both results appear very similar, and no significant uplift is observed from using the amplitude term. Figure 9(c,d) shows the resultant velocity changes. The point at which the amplitude term should be included is difficult to optimize.
There have been several refinements of the Williamson non-linear inversion, and also some adaptations to the workflow when applied to field data sets. Grandi et al. (2009) modified the original Williamson method and applied it to three case studies from high-pressure high-temperature (HPHT), turbidite and chalk fields. For their applications, they felt that inversion to time strain was more appropriate than fractional velocity change Δv/v. Grandi et al. (2010) applied the WFI technique again to multi-vintage warping on the Ekofisk chalk reservoirs, but did not use the amplitude term. Fiore et al. (2014) define a two-step procedure that first warps the overburden data without considering the amplitude term. The resultant low-frequency time shifts are then applied to the monitor data before the seismic at the reservoir interval is warped by WFI, this time with the amplitude effect included. As a refinement, both Fiore et al. (2014) and Griffiths et al. (2015) included a density term, on the understanding that density fluctuations are correlated to velocity fluctuations such that Δρ/ρ ≈ α Δv/v, so that the fractional impedance change becomes ΔI/I ≈ (1 + α) Δv/v. The factor α is dependent on lithology and fluid saturation, and therefore needs to be calibrated. Other methods, similar to Williamson et al. (2007), have been proposed to capture the amplitude term. For example, Lie (2011) posed the problem as a constrained Bayesian inverse problem in which the 4D amplitude term is treated as a Gaussian noise term. The approach is robust, and has been successfully applied to many data sets (for example, Røste et al., 2015a,b). Similarly, the work of Baek et al. (2014, 2015) modifies the implementation of WFI to solve an exact non-linear inverse problem by robust numerical methods involving multi-scale optimization.

Dynamic time warping
DTW is a specific subset of the more generalized problem of morphing (see, for example, Whited and Rossignac 2011). It is used for speech recognition (Sakoe and Chiba 1978) and also in signal processing and a wide range of other applications. The adjective 'dynamic' comes from the dynamic programming used to implement the method. Hale (2013) introduced an adaptation and application of the classical DTW algorithm to 4D seismic data. Other applications in the seismic literature include use for non-stretching normal moveout, P-S matching and statics corrections (Chen et al., 2017). In the context of 4D time-shift estimation, Venstad (2014) compares DTW with WFI and NLI (see above for details of these algorithms). The approach is claimed to be more accurate than methods based on the cross-correlation of windowed images, as the shifts from XCR vary significantly in quality with window size. DTW does not require a window but operates on the entire trace. XCR also has a very variable response to noise and a limit on the rate of change of shift (or strain): it is only accurate when the shifts are smoothly varying, so if there are significant changes within the window then the algorithm will break down. DTW is also claimed to be particularly effective when shifts are large and rapidly varying (note that the rapidity of variation can be set as a constraint in the algorithm). Hale (loc. cit.) also shows how DTW may be used to estimate vertical and lateral shifts. In dynamic time warping, we minimize the alignment (dissimilarity) error between the two traces,

e[i, l] = (f[i] − g[i + l])²,   (19)

where i is the sample index, l is the integer lag (time shift) between seismic traces, and f and g are the seismic amplitudes of the baseline and monitor traces, respectively. In this form, the problem minimizes the Euclidean distance between the two traces. DTW works with f[i] and g[i] defined by sample indices only and attempts to estimate shifts s[i] such that f[i] ≈ g[i + s[i]].
This task is solved by dynamic programming, in which the original problem is decomposed into a sequence of nested and smaller sub-problems. The rate of change of lag is limited by a constraint set to avoid shifts changing by more than 100% - which is reasonable and still sufficient to capture rapid variations in the seismic data. The constraint also helps the algorithm to converge efficiently. Identification of the best path that minimizes (19) is achieved using mathematical steps defined as accumulation and backtracking, and with these procedures the optimization is performed much more quickly than a simple test of all possible lags. Accumulation forms an accumulative distance function from the running sum of the alignment errors. Backtracking steps backwards along an optimum trajectory by searching only three points at a time, starting from a known lag estimated at the end of the trace by searching across all lags. DTW (and example code) is now widely used in the literature, although there are only a few examples of application to field data sets in the 4D seismic literature to date. We generate Figs. 2(g), 3(g), 4(g) and 5(g) to demonstrate the application of this method to our four field data sets. The DTW code used for this implementation is similar to that published by Hale, but has an upper and lower limit on the lag range to speed up computation - thus a smaller error-matrix search. As the method works on a single 1D trace only, we find the results must also be smoothed laterally to remove the jitter associated with this discrete solution. The vertical estimates also need to be smoothed prior to this procedure. In practice, the addition of a regularization term such as that introduced by Venstad (2014) is required to ensure robustness against noise. The power of the DTW algorithm comes from the dynamic programming, but this requires a discrete lag value. Quoting time shifts in numbers of samples may be a weakness, as the accuracy now depends on the sampling rate.
Resampling to a finer interval by interpolation is needed if sub-sample accuracy is required. In the examples we use a resampling factor of four, and a Gaussian window. There is thus a trade-off between the resolution of the time-shift estimates and computational efficiency (Venstad 2014). DTW is suited to parallelization, and thus its speed can be improved. Other benefits are that it does not get stuck in local minima and that the lags for the entire trace are computed simultaneously. More recently, Li et al. (2017) modified DTW to create a new method called phase-constrained warping. They add an additional term to (19) above, weighted appropriately, to match either the first derivatives of the traces (a proxy for phase) or the instantaneous phase from the Hilbert transform. Li et al. (2017) show, in an application to Gulf of Mexico data, that pure phase matching can eliminate spurious time shifts and amplitude distortions.
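The accumulation and backtracking steps can be sketched for a single trace pair as follows. This is a minimal integer-lag implementation in the spirit of the classical algorithm, not Hale's published code; the ±1-sample-per-step constraint encodes the 100% strain limit mentioned above, and the lag range is bounded to shrink the error-matrix search.

```python
import numpy as np

def dtw_shifts(f, g, max_lag):
    """Estimate integer-sample shifts s[i] minimising the summed alignment
    error e[i, l] = (f[i] - g[i + l])**2, with the lag allowed to change by
    at most one sample between neighbouring samples."""
    n = len(f)
    lags = np.arange(-max_lag, max_lag + 1)
    nl = len(lags)
    big = 1e9                                  # cost for out-of-range lags
    e = np.full((n, nl), big)
    for j, l in enumerate(lags):
        idx = np.arange(n) + l
        ok = (idx >= 0) & (idx < n)
        e[ok, j] = (f[ok] - g[idx[ok]]) ** 2
    d = e.copy()                               # accumulation step
    for i in range(1, n):
        for j in range(nl):
            lo, hi = max(0, j - 1), min(nl, j + 2)
            d[i, j] += d[i - 1, lo:hi].min()
    s = np.empty(n, dtype=int)                 # backtracking step
    j = int(np.argmin(d[-1]))
    s[-1] = lags[j]
    for i in range(n - 2, -1, -1):
        lo, hi = max(0, j - 1), min(nl, j + 2)
        j = lo + int(np.argmin(d[i, lo:hi]))
        s[i] = lags[j]
    return s
```

The returned shifts are whole samples; sub-sample precision then requires resampling the traces to a finer interval before warping, as discussed in the text.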

A catalogue of other significant miscellaneous methods
Apart from the time-warping methods described above, there are also a large number of other contributions that have been made to time-lapse time-shift estimation, which are described below.
Plane-wave destruction filters: Phillips and Fomel (2016) measure time shifts using amplitude-adjusted plane-wave destruction filters. Plane-wave destruction is effective when measuring small, rapidly varying time-shift functions. It is applied to a synthetic example and a CO 2 sequestration data set. In a later paper, Phillips and Fomel (2017) use a local time-frequency transform, followed by the amplitude-adjusted plane-wave destruction filter to simultaneously estimate time shifts and amplitude weights at each frequency. The method is specifically aimed at tackling problems associated with low frequencies or thin beds causing tuning effects. The method is again applied to data from the Cranfield CO 2 sequestration project.
Local similarity: Fomel and Jin (2009) propose an algorithm that selects a regularized time-warping function that maximizes the local similarity attribute. The technique is demonstrated on a synthetic example, but is later applied to field data sets (Zhang et al., 2014). In the synthetic tests, this algorithm effectively measures the low-frequency component of the shift function, but fails to detect higher-frequency variations. In a follow-up paper, Karimi et al. (2016) use this local similarity attribute to estimate time shifts after flattening the images using a stratigraphic coordinate transformation.
Mutual information: Donno et al. (2017) adapt to this problem a novel method used in medical imaging to measure the mutual dependence between variables. Mutual information (MI) measures the 'distance' between two distributions of image values (hence traces). It is window based, and MI is measured for every translation of the monitor image as with the previous algorithms; however, unlike cross-correlation, MI gives a sharper peak value. An added feature of this approach is its application to the envelope function of the traces.
Complex trace analysis: Hoeber et al. (2008) introduce a method to make use of the complex trace representation in the context of 4D matching and noise reduction. Instantaneous phase matching is ideally suited for removing small residual phase-jitter between misaligned data sets. When larger time shifts are present, a cascaded approach is advised, matching instantaneous phase after initial time alignment with another algorithm (such as cross-correlation). This technique is intended to clean up the data and improve repeatability if dominated by residual noise.
Using a repeatability measure: Lecerf et al. (2015) formulate an expression for the normalized root mean square (NRMS) repeatability metric as a function of signal-to-noise ratio, energy ratio of baseline and monitor data, dominant frequency and time shift. The expression retains first-order terms in a Taylor series expansion and is thus valid for small time shifts. In principle, this algorithm allows us to determine time shift using NRMS as a metric when all other factors are known. In this approach, the time shift is evaluated within a running window.
Higher-order statistics: this method has a long history in other fields such as sound processing. In an approach suited for general use, Nikias and Pan (1988) compute the time shift in the presence of Gaussian noise by comparing similarities between higher-order spectra rather than cross-correlations. This works because the third moment of a zero-mean Gaussian process (the noise itself) is zero. This higher-order statistics (HOS) method is chosen for comparison against the other, more popular methods in Figs. 2(h), 3(h), 4(h) and 5(h). Like most of the techniques above, it is applied iteratively with windows decreasing from 800, 600 and 400 to 200 ms. HOS does not respond well to noise, but gives results comparable with the other techniques when the signal-to-noise ratio is high.
Short-window variance: Zhen et al. (2016) demonstrate that evaluation of the minimum variance in a short window works better at correcting for acquisition- and production-related time shifts than cross-correlation. However, there is no emphasis on the estimation of these time shifts.

Large time shifts and resolution
In many practical situations, such as mechanical extension or gas injection in the overburden, time shifts are sizeable when compared with the dominant seismic period (T) and may be considered large. Under these conditions, most of the main families of methods described above (cross-correlation, Taylor series and some waveform inversion approaches employing approximations) do not work. Thus, for example, the linear approximation central to the Taylor series approach is valid only when Δt < T/10. For 25 Hz data, T = 40 ms and this equates to Δt < 4 ms, which is extremely restrictive for most production-related time-shift analysis. For thick reservoir sequences saturated with gas, gas injection into the overburden or compacting reservoirs, absolute time shifts and strong time-shift gradients can easily exceed this threshold (MacBeth et al., 2019). The condition for avoidance of cycle skipping in cross-correlation or waveform inversion is Δt < T/2, giving a maximum working range of 20 ms, which is more acceptable but may still be problematic. As discussed previously, to overcome these limitations it is usual for techniques to be applied iteratively, with successive loops of estimation and correction until the noise floor is reached. For very large and obvious time shifts, it is also possible to simply match distinct time picks on the data before applying the chosen algorithm. The time-shift gradient is linearly interpolated between these horizons to create a 'low-frequency model', and the entire time-shift profile is applied to the data in the correction step. Next, the methods for small time shifts can be applied to determine the much smaller residual time shift on the corrected monitor and baseline data. An application of this sequence is illustrated on North Sea data set C in Fig. 10. In this example, large time shifts have been artificially manufactured on data set C by reverse-warping the monitor survey.
Time shifts are now up to 40 ms in size, and this reference time shift is shown in Fig. 10(a). Next, we apply two techniques [NLI2D (Fig. 10b) and FCC (Fig. 10c)] to the data with the amplified time shifts, and observe that the methods fail to adequately capture the correct results owing to a breakdown of the small time-shift assumption. To extract the correct values, we obtain a horizon-based time shift (Fig. 10d) and apply this to the data prior to application of the methods. The residual time shift (Fig. 10e) is then added to the horizon-based result to give the final estimate (Fig. 10f), which now closely matches the desired result.
As an alternative, the data can be low-pass filtered such that T is large enough to capture the long-wavelength, smooth time-shift trends prior to the application of the methods for small time shifts with the full bandwidth of the data.
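The prime-then-refine strategy can be sketched on a single trace: a coarse integer-sample lag from full-trace cross-correlation is removed first, and the sub-sample residual is then measured with a first-order Taylor estimate on the corrected monitor. This is our own minimal illustration, assuming the convention m(t) ≈ b(t + Δt); in practice the coarse stage could equally be a horizon match or a windowed cross-correlation.

```python
import numpy as np

def coarse_then_fine(b, m, t, half_width=25):
    """Two-stage time-shift estimate: coarse cross-correlation lag removal
    followed by a running-window first-order Taylor residual estimate."""
    dt = t[1] - t[0]
    # stage 1: coarse integer-sample lag from full-trace cross-correlation
    j = int(np.argmax(np.correlate(m, b, mode="full")))
    shift0 = ((len(b) - 1) - j) * dt           # m(t) ~ b(t + shift0)
    cur = np.interp(t - shift0, t, m)          # remove the coarse shift
    # stage 2: sub-sample residual via the Taylor-series estimate
    b_dot = np.gradient(b, dt)
    num, den = b_dot * (cur - b), b_dot ** 2
    total = np.full(len(b), shift0)
    for k in range(len(b)):
        lo, hi = max(0, k - half_width), min(len(b), k + half_width + 1)
        d = den[lo:hi].sum()
        if d > 0.0:
            total[k] += num[lo:hi].sum() / d
    return total
```

Because the residual left after the coarse stage is well inside the Taylor working limit, the combination recovers shifts that neither stage could measure accurately on its own.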
Another issue of importance is the resolution of local techniques such as cross-correlation, Taylor series or waveform inversion. With window-based algorithms, a compromise is usually sought between window size and instability. The need to compromise can be avoided to some degree by a multi-scale approach (as mentioned, for example, by Baek et al., 2015), in which a long window is used initially, the smooth trend is isolated and corrected, and the computation is then run again on a shorter window to pick out the higher-frequency information. Several iterations with shorter and shorter time gates are applied until the desired resolution is obtained. This approach is also a possible solution for the previous methods. Window-sensitive results can be avoided by inverting for the entire volume, as in Rickett et al. (2007b) or Williamson et al. (2007). It should also be noted that the dynamic time-warping method of Hale (2009) has the advantage of measuring large and small time shifts equally well, and also does not require a window. In practice, many of the above methods can be used satisfactorily in most situations, provided they are iterated/cascaded or applied in a multi-scale fashion. An example of this is the Taylor series-based MSIROF of Li et al. (2017).

Correction for time-variant time shifts
Before differencing, the monitor survey m(t) is back-shifted by the estimated values Δt(t) to m(t − Δt(t)). Publications of case studies or applications of time-shift methods almost never describe the procedure of time-shift correction (or alignment) of the monitor after the process of estimation, yet an accurate correction is key to providing undistorted difference amplitudes for interpretation. Ideally, the correction method should be entirely compatible with the measurement method. For most methods based on the Taylor series approximation, the correction is already part of the output and can be easily implemented as

m(t − Δt(t)) ≈ m(t) − Δt(t) ṁ(t).

The exception is the correlated leakage method. Correction by Taylor series is valid for small time shifts only. DTW and waveform inversion can return the warped monitor trace as an output of their process. For methods such as cross-correlation, and many in the miscellaneous category, an independent algorithm must be used. Typically, correction is achieved on a trace-by-trace basis by first shifting each amplitude in the trace by the requisite time shift, such that the amplitudes are now non-uniformly sampled. Next, these amplitudes are resampled/reconstructed numerically back onto the original uniformly sampled grid. Reconstruction of a discrete uniformly sampled (or continuous) seismic trace from a trace sampled on a non-uniform grid is possible using Lagrange interpolation, Shannon (or sinc) interpolation (less accurate than its more popular application to uniform grids) or polynomial splines (Unser 2000). All these choices presuppose that the sampling exceeds the Nyquist rate, which for our applications is always the case. Many other sophisticated methods exist for accurate reconstruction, as non-uniform sampling is common in many other disciplines. MATLAB also offers a selection of routines for re-gridding.
In practice, whilst it is computationally efficient to resample using linear interpolation, the spline approach provides much more accuracy and is to be preferred. There are also changes to the spectral content following re-gridding; such amplitude distortion in time-domain warping was also noted by Griffiths et al. (2015). Ideally, a filter should be applied to the monitor data to rematch it to the baseline spectrum. For warping correction, a more physically satisfying approach might be to invert the monitor to a reflectivity series, shift the spikes and then reapply the wavelet to the shifted series. This approach would avoid wavelet distortion, but is challenging in its implementation and impractical.
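As a minimal illustration of the trace-by-trace correction step (our own sketch; linear interpolation is used for brevity, although a spline reconstruction is more accurate):

```python
import numpy as np

def apply_time_shift(m, shifts, t):
    """Back-shift the monitor trace by the estimated time-variant shifts,
    returning output(t) = m(t - shift(t)). Each output sample is
    reconstructed from the input trace by linear interpolation; samples
    shifted outside the trace are clamped to the end values."""
    return np.interp(t - shifts, t, m)
```

Replacing the linear interpolation with a cubic-spline evaluator reduces the amplitude distortion noted above; the residual spectral change can then be addressed with a matching filter applied to the corrected monitor.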

Lateral shifts
Up until this point we have focussed on methods that ignore the contribution of lateral shifts. However, well-known imaging considerations tell us that lateral shifts should accompany the vertical time shifts in our data, and these could also carry important information about the reservoir. In practice, this is observed in many 4D data sets (Rickett and Lumley 2001; Hall et al., 2002; Cox and Hatchell 2008). Reflector positions for the baseline and monitor cannot match in depth or time, as the two surveys sample different velocity fields yet have been migrated with the same velocity model. Different velocity fields are expected when the time-lapse changes due to stresses/strains, rock damage or fluid saturation effects generate velocity perturbations in the rocks overlying the region of interest. Lateral shifts are typically 5 to 10 m in size when measured at 2 to 4 km depth. The concept of 3D warping for seismic image registration, a process whereby lateral and time shifts are estimated together and then applied to the monitor data, was first proposed by Rickett and Lumley (2001) and Druzhinin and MacBeth (2001) as a refinement of the post-stack cross-equalization workflow. The first application of 3D warping to image matching of 4D seismic data was by Hall et al. (2002) on the Valhall field (matching 3D streamer data to OBC data). This was followed by an application to the Southern Gas Basin by Hall et al. (2003) (matching two towed-streamer surveys acquired in different directions). A further refinement was provided by Hale (2007), who developed a method to speed up and increase the resolution of the 3D image warping. When the image-based components dominate, 3D warping may also be viewed as a post-stack residual migration operation, and the temporal and spatial displacements mapped directly to the velocity perturbations. Warping is preferred over residual migration as events of interest are not damaged and do not change character.
Techniques for measuring lateral shifts do not carry the same potential for resolution as those for vertical shifts (Hall et al., 2002; Hale et al., 2008), due mainly to the lack of strong variation in event character horizontally. Despite this consideration, measurement of lateral shifts is nevertheless sufficiently accurate to be used during processing for repositioning nodes or correcting acquisition artefacts in streamer data. To date, techniques to measure lateral shifts are not yet in widespread visible use in the industry. Displacements due to active geomechanical variations in the overburden also create lateral and vertical changes that superimpose on these image-related changes. This overlap does not negate the use of 3D warping as a tool for image registration, but it does make the interpretation of the lateral and vertical shifts harder to decipher. This topic remains of major research interest to date. As more complex reservoirs with steeper dips are accessed with 4D seismic technology, we would expect 3D warping (or an equivalent image-based solution) to become more commonly accepted. Figure 11 shows an example of the results of this 3D warping approach applied to the North Sea field data set A, where the NLI algorithm has been used to estimate the shifts. The results are viewed as a 3 × 3 matrix of vertical and horizontal sections, each showing the three components of the warp field.
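For orientation only, the core of a local lateral-shift measurement can be sketched as a 2D cross-correlation between corresponding baseline and monitor image patches, with the correlation peak giving the integer-bin displacement. This is a deliberately simplified stand-in for the 3D warping algorithms cited above, and all names here are our own:

```python
import numpy as np
from scipy.signal import fftconvolve

def lateral_shift_2d(base, moni):
    """Estimate the (row, col) displacement of `moni` relative to `base`
    from the peak of their 2D cross-correlation."""
    b = base - base.mean()
    m = moni - moni.mean()
    # Cross-correlation computed as FFT convolution with a flipped template
    cc = fftconvolve(m, b[::-1, ::-1], mode='full')
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Convert the peak position in the 'full' output to a lag in bins
    return (int(peak[0] - (base.shape[0] - 1)),
            int(peak[1] - (base.shape[1] - 1)))
```

Sub-bin precision would follow by interpolating around the peak, and a full warping scheme would apply this in overlapping local windows in all three dimensions.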

Future possibilities
A major limitation for most of the techniques described in this review is the 1D assumption for calculating the relative velocity change. That is, the techniques are based on trace-by-trace measurements of the time shifts between monitor and baseline. One approach to improving the accuracy and positioning of the time strain/velocity change anomalies is to incorporate more image-based analysis to account for complex dipping structure in the overburden. Despite this, the time shifts will determine only the long-wavelength components of the velocity field. A recent promising direction for accessing multi-scale changes is offered by the more sophisticated but computationally intensive methods involving FWI (for example, Asnaashari et al., 2015). 4D FWI offers the potential to obtain high-resolution time-lapse images of production-induced velocity changes using diving waves and refractions propagating laterally in the subsurface. In this method, velocity model updates are obtained using pre-stack data with only light-touch processing to isolate the diving waves. Inversions are typically performed at frequencies of less than 10 Hz, and the data are acquired at long offsets. Hicks et al. (2016) provide the first example of a direct comparison between 4D FWI and the time shifts and time strains obtained from post-stack migrated reflection data. These tests show good consistency between the two methods when applied to permanent reservoir monitoring data from the Grane field. The agreement is remarkable considering the methods focus on different information content of the recorded wavefield, and demonstrates that velocity changes are preserved in the final processed PSDM migrated data, and also that 4D FWI is a complementary tool.

Figure 11 Results of 3D warping measured on data set A, displayed for two vertical (inline and crossline) sections and one inline-crossline horizontal section. For each section, we display three measured quantities: vertical time shift (in milliseconds) and lateral shifts in the inline and crossline directions (in metres).
One final possibility for the future of time-shift estimation is the use of machine learning, a tool that has become increasingly popular in recent years for solving non-linear multivariate regression problems in geophysics. Both the EAGE and SEG annual meetings in 2019 indicated a significant rise in the popularity of machine learning applied to the field of geoscience and engineering. Dramsch et al. (2019) offer one example of a possible network architecture for 3D shift estimation, and this may point towards the development of waveform-based machine learning applications.

D I S C U S S I O N

Which Is 'Best'?
As stated in the introduction, our purpose was not to establish, via exhaustive analysis, which is the 'best' method, but instead to create a guide to the diversity of available methods that could be applied under certain circumstances. Given the large array of methods (thirty or more) and the publications to date, it is important to make some evaluation of them. In the literature there are clear theoretical strengths and weaknesses, as defined in the description of each method above and as summarized in Table 1. These should translate into robustness to changes in signal-to-noise ratio, sensitivity to the 4D changes themselves, and resolution and accuracy. Thus, if you are located in company X with data set Y and budget Z, under specific time constraints, how should you proceed with a selection? How can one make a sensible choice that satisfies your needs?¹ One approach is clearly to test out several algorithms across a range of representative data sets in your portfolio, and then assess the response to noise and signal character using this matrix. By choosing to show both time shifts and time strains, the stability and resolution of the solution can be judged. An example of this type of approach is found in Ji et al. (2019). One may also perform some assessment using Figs. 2, 3, 4 and 5, where we apply eight methods to four different data sets. Some cross-comparisons are also available in the literature from several authors. For example, Naeini and Hoeber (2008) considered the successes and failures in evaluating time shifts in 4D processing and interpretation. They compared cross-correlation, the Taylor series method with amplitude correction and a method based on higher-order statistics. They tested all three methods on field data, with the specific problem of removing residual acquisition artefacts and false amplitude anomalies on the 4D difference section. The stabilized Taylor series method appeared to perform well for the short-window time adjustments required.
In another study, Kanu et al. (2016) compared five techniques: local cross-correlation, non-linear inversion, correlated leakage, dynamic time warping and their proposed new method of multi-scale iterative refinement of optical flow (MSIROF; see description in the previous section). They applied these tests to synthetic data of Gulf of Mexico subsalt and extra-salt reservoirs with geomechanically induced changes in the overburden and reservoir. The data are generated by finite-difference modelling followed by migration. They find that correlated leakage, non-linear inversion and MSIROF provide the most accurate time shifts for these particular data. Finally, Ji et al. (2019) compared local cross-correlation, correlated leakage and non-linear inversion applied to four different North Sea data sets. They conclude that although all algorithms showed a similarity in response, there were marked differences in terms of the final time shifts and the response to noise. Cross-correlations (XCR) have the advantage of being amplitude independent and easy to apply, but need sliding windows and are thus quite sensitive to window placement, size and shape. They also handle sharp changes in time shift less elegantly, since they essentially estimate bulk shifts. Time shifts and time strains appeared stable and smooth. The correlated leakage method (CLM) deals with time-varying time shifts in a much better way, but stability is a greater problem. The non-linear inversion (WFI) proved to be the best balance of signal representation and noise response, and came out as the best method when ranked according to noise in the final results, and particularly in the resolution of the signal and quality of the time strain (the derivative of the time-shift estimates).

¹ It should also be noted that for some of the methods shown, companies have either patented the technology or these are their chosen paradigm. In this situation, their choice is already in place.
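To make the windowing trade-offs discussed above concrete, a minimal sliding-window cross-correlation estimator with parabolic sub-sample peak refinement might look as follows. The window length, taper, search range and sign convention are our own illustrative choices, not those of the cited studies:

```python
import numpy as np

def xcorr_time_shifts(base, moni, dt, win=64, step=16, max_lag=10):
    """Windowed cross-correlation time-shift estimate.

    Convention (ours): a positive shift means the monitor event
    arrives later than the baseline event (a slowdown).
    Returns window-centre times and shifts, both in seconds.
    """
    taper = np.hanning(win)                 # window weighting
    lags = np.arange(-max_lag, max_lag + 1)
    centres, shifts = [], []
    for start in range(max_lag, len(base) - win - max_lag, step):
        b = base[start:start + win] * taper
        # Correlate against monitor windows slid over the lag range
        cc = np.array([np.dot(b, moni[start + l:start + l + win] * taper)
                       for l in lags])
        k = int(np.argmax(cc))
        lag = float(lags[k])
        # Parabolic interpolation around the peak for sub-sample precision
        if 0 < k < len(cc) - 1:
            denom = cc[k - 1] - 2.0 * cc[k] + cc[k + 1]
            if denom != 0.0:
                lag += 0.5 * (cc[k - 1] - cc[k + 1]) / denom
        centres.append((start + win // 2) * dt)
        shifts.append(lag * dt)
    return np.array(centres), np.array(shifts)
```

The dependence on `win`, `step` and `max_lag` illustrates why window placement and size matter: too short a window is noise sensitive, whilst too long a window smooths over genuine time-shift variation.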
In our experience, no algorithm performs universally well in all cases, as each responds differently to the nature of the signal and noise in the data sets. The more noise in the data, the wider the diversity of results between methods. Furthermore, no method is so poor in terms of compute time that it should be rejected outright. However, a comparison exercise does not necessarily provide a clear generic answer, as the result depends on many factors, including the skill of the user, cost, value, response to noise and robustness of the software. We note, in carrying out our comparison exercises, that the main factor for success tended to be optimal implementation rather than the method per se, as there may be a wide range of variations in the detailed implementation of these algorithms which we cannot exhaustively replicate in this review. We conclude that all methods implemented are, within the theoretical and computational limitations of the method used, equally useful on good-quality data. The choice of method is therefore largely subjective and based on the experience and prior knowledge of the user. We would also expect the results to be strongly case dependent.
To be more definitive on a chosen method based on our overall experience to date, Table 2 compares, contrasts and ranks the five popular families of time-shift method: the cross-correlation-based methods, Taylor series methods, waveform-based methods, dynamic time warping and finally the miscellaneous category of algorithms. The latter are difficult to judge and vary considerably in performance and methodology. Most are not suited to industry data or to the problem of extracting reservoir-related time strains, and are therefore ranked fifth in our table. At face value, dynamic time warping sounds promising, as it does not need a window and it is designed to treat large, rapidly varying time shifts. However, the lack of easy implementation, with much customization necessary in practice, and mediocre performance on real seismic data mean we rank it fourth. The Taylor series expansion methods are all window based and therefore carry the possibility of creating artefacts. They also suffer from a lack of stability and can only estimate time shifts of up to a fraction of the seismic period. However, their flexibility when treating 4D seismic data and their ease of use place them third in our table. Cross-correlation methods, and in particular the fast cross-correlation approach, are a popular choice that is robust to noise and quick to implement. The reliance on a window, however (even when careful tapering and normalization tricks are applied), combined with the artefacts created by cycle skipping, ranks them second. Our first choice is the waveform-based methods, which invert the entire trace and are fairly robust to noise. They also benefit from their ability to be regularized with some degree of effectiveness. Our experience is that the simpler form of WFI without the amplitude term delivers generally optimal results over other implementations for data sets characterized by a range of noise levels.
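For comparison, the core of a windowed Taylor-series estimator reduces to a one-parameter least-squares fit per window: writing m(t) ≈ b(t − Δt) and expanding to first order gives Δt ≈ Σ(b − m)b′ / Σ(b′)², valid only for shifts that are a small fraction of the dominant period. The sketch below uses our own illustrative window choices and sign convention:

```python
import numpy as np

def taylor_time_shifts(base, moni, dt, win=64, step=16):
    """First-order Taylor-series time-shift estimate per window.

    Convention (ours): a positive shift means the monitor event
    arrives later (a slowdown), i.e. moni(t) ~ base(t - shift).
    Valid only for shifts much smaller than the dominant period.
    """
    db = np.gradient(base, dt)            # time derivative of the baseline
    centres, shifts = [], []
    for s in range(0, len(base) - win, step):
        b = base[s:s + win]
        m = moni[s:s + win]
        g = db[s:s + win]
        denom = np.dot(g, g)
        # Least-squares fit of (b - m) = shift * b' within the window
        shifts.append(np.dot(b - m, g) / denom if denom > 0 else 0.0)
        centres.append((s + win // 2) * dt)
    return np.array(centres), np.array(shifts)
```

In practice such estimators are stabilized (e.g. by damping the denominator) and applied iteratively, as the first-order expansion breaks down for larger shifts.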
In conclusion, for those who wish to select a method for quantitative analysis for the first time, but do not wish to test different algorithms, there are some general rules for guidance:

1. Choose any one of the popular families of solution, as this offers the widest chance of literature support and overall success.
2. If applying a window-based method, weight your window using a Gaussian.
3. Implement a multi-scale and iterative approach to guarantee accuracy and robustness of the solution, with enhanced resolution.
4. For large time shifts, evaluate the long-wavelength trend first before moving to the shorter wavelengths.
Some additional considerations are:

5. Data sets should be processed appropriately to preserve the production-related time shifts. One such example is the requirement to flatten events in gathers. It is therefore recommended that residual timing variations are minimized by applying offset/angle seismic event alignment with two independent references for the baseline and monitor surveys before stacking.
6. If the reservoir zone is near tuning and expected to be a complex mixture of saturation and geomechanical signal, consider using a waveform inversion-based approach or, at the very least, a method incorporating both amplitudes and time shifts.
7. Time-shift analysis in the overburden does not require an amplitude term.
8. If your reservoir depth is moderate and the overburden benign, an imaging correction need not be applied.
9. For imaging heterogeneity, structural complexity or hazards in the overburden, consider pre-stack methods such as time-shift variation with offset or 4D tomography.
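Rule 3 above (a multi-scale, iterative approach) can be sketched as a coarse-to-fine loop: estimate a smooth shift field with a long window, align the monitor with the running estimate, then refine the residual with a shorter window. The bulk-shift estimator, window lengths and sign convention here are illustrative assumptions only:

```python
import numpy as np

def bulk_shift(b, m, max_lag):
    """Integer-sample shift of window m relative to window b, from the
    cross-correlation peak (positive = m delayed)."""
    core = b[max_lag:len(b) - max_lag]
    lags = np.arange(-max_lag, max_lag + 1)
    cc = [np.dot(core, m[max_lag + l:len(m) - max_lag + l]) for l in lags]
    return int(lags[int(np.argmax(cc))])

def multiscale_shifts(base, moni, dt, wins=(128, 32)):
    """Coarse-to-fine time-shift estimation. Positive shift = monitor
    delayed; alignment evaluates the monitor at t + shift (sign
    conventions vary between implementations)."""
    n = len(base)
    t = np.arange(n) * dt
    total = np.zeros(n)
    current = moni.copy()
    for win in wins:                          # long windows first
        max_lag = win // 4
        est = np.zeros(n)
        for s in range(0, n - win + 1, win):
            est[s:s + win] = bulk_shift(base[s:s + win],
                                        current[s:s + win], max_lag) * dt
        total += est
        # Re-align the original monitor with the accumulated estimate
        current = np.interp(np.clip(t + total, 0.0, t[-1]), t, moni)
    return total
```

Because each refinement pass measures only a residual, the small-shift limitation of the per-window estimator is relaxed, echoing rule 4's advice to capture the long-wavelength trend first.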

C O N C L U S I O N S
As data repeatability has improved over the years, 4D time-shift analysis has become recognized as a mainstay component of the interpretation workflow. Time-shift estimation and subsequent correction of the monitor data ensure that the amplitudes in the difference volume are of high fidelity, whilst also providing an additional attribute volume for dynamic reservoir and overburden description. Time shifts have been linked to compactional/extensional velocity changes in the subsurface and are particularly useful for assessing production in reservoirs where it is strongly controlled by geomechanical effects (for example, stress changes around faults or wells). Additional techniques are available for inverting time shifts to reservoir pore pressure (Hodgson 2009) and closing the loop with the geomechanical model. Time shifts may also be utilized further in the amplitude inversion workflow as a constraint or in a joint objective function. Whilst the interpretational benefits of time shifts are recognized, this review concludes that a definitive approach for the direct measurement of time shifts still appears unclear. Indeed, the open literature contains a wide array of possible methods, suggesting that our community is not yet wholly satisfied with the results we observe and has not yet matured to a common paradigm. The 30+ published methods selected for this paper may be broadly categorized as belonging to the families of cross-correlation, Taylor series expansion, waveform inversion or dynamic warping. Many miscellaneous and somewhat bespoke solutions also exist. Cross-correlation is a tried and tested favourite of the geophysics community, and its strengths and weaknesses are well understood. We may call this our Pareto-choice algorithm, in the sense that it takes 20% of the effort of the other algorithms to get running, whilst delivering roughly 80% of an acceptable solution.
The Taylor series expansion approach, which evolved from optical image processing, represents a short cut to the warping problem for small time shifts. Waveform inversion is popular in related areas such as full-waveform inversion (FWI), and offers an elegant and attractive alternative that avoids windowing the data and can produce some excellent results. Dynamic warping also avoids windowing the data, and is a newer approach exported from the speech-processing community. It is yet to be fully tested across a wide range of field applications. Cross-correlation, Taylor series expansion and waveform inversion must all be applied to small time shifts (typically less than half of the seismic period) and hence must be applied iteratively. Dynamic warping is claimed to be applicable to any magnitude of time shift. However, it is less straightforward to code and implement than the traditional geophysical methods. In general, the great majority of methods are difficult to assess, as published examples do not show applications to production-related data, and those that do generally do not follow standard 4D conventions² (such as those set by Stammeijer and Hatchell 2014). Methods that do look technically competitive unfortunately have not been fully road-tested in a professional industry setting, or lack efficiency when dealing with large volumes of data. Due to space limitations, it has unfortunately not been possible to apply all of the published methods to our complete database, and thus only a small representative selection has been tested. From these tests, clear favourites have emerged that deliver robustness, ease of use and well-resolved images: cross-correlation is found to be the easiest to implement effectively (and is easily the most popular in the literature), whilst the non-linear waveform inversion possessed the finest and most stable resolution, and is a possible front-runner.
The Taylor series approach is the second most popular in the literature. One caveat is that our results are based on our particular implementation and test data sets and do not confer an overall validation of these methods.
From a general perspective, the results in this review and our applications to other in-house data indicate that noise levels and distortions on measured time shifts remain moderately high. Despite this, leading-edge techniques such as FWI have been shown to produce similar results to time-shift methods, albeit at greater computational cost (see, for example, Hicks et al., 2016). In many cases, time shifts are largely semi-quantitative, and their overall interpretative benefits are significantly reduced relative to expectations. Some improvements can be made by employing regularization techniques or inverting directly for relative velocity change or relative time shift. However, noise remains a challenge that varies considerably with method and data set. Measurements still strongly reflect 4D data repeatability, the influence of acquisition, processing (the balancing of kinematic versus amplitude preservation), the imaging workflow and subsurface structure; this topic is discussed further in our review on error and uncertainty. Apart from noise, there are a number of recent advances that aim to extract additional benefit from time shifts.

² Our time-shift colour bar and conventions follow those published by MacBeth et al. (2019), in which a gradational red colour defines a positive time shift (velocity decrease or softening) and a gradational blue colour represents a negative time shift (velocity increase or hardening). Further, our preference is for a green-blue-cyan/yellow-red-pink colour bar in which hardening signals are represented by cold colours and softening signals by warm colours. Oversaturated red-white-blue colours are less favourable for quantitative analysis as the time-shift increments are not readily detectable. An axis on the colour bar should provide time-shift values that can be clearly quantified.
For example, one recent suggestion is that post-stack data are not adequate for detailed analysis of subsurface changes, and that time-shift variations with offset must be analysed (Røste et al., 2007; Kudarova et al., 2016) or time-shift tomography implemented (Dvorak and MacBeth 2018). This may be particularly necessary for resolving dynamic features in the overburden such as fault reactivation or gas leakage. Lateral shifts are also currently underexploited and, beyond a few papers, are not widely applied in practice to date. There is a growing need to resolve changes in the reservoir, and in small, thin beds in particular. Such limits may be avoided if we adopt more waveform-based joint inversions. Especially challenging is an environment of a thick reservoir interval of stacked thin sands (Tatanova 2018). Finally, the use of constraints and closing the loop from the geological, geomechanical (Tigrek and Hatchell 2006) or fluid-flow domains may hold some merit for future applications and remains a worthwhile goal.

AC K N OW L E D G E M E N T S
We thank the sponsors of the Edinburgh Time-Lapse Project, Phase VII (BP, CGG, Chevron, ConocoPhillips, Vår Energi AS, ExxonMobil, Halliburton, CNOOC, Norsar, Petrobras, Shell, Equinor, Taqa and Woodside) for supporting this research and for their donation of data to make this project possible. We also thank Jon Brain, Henning Hoeber and Adam Cherrett for helpful discussions and feedback on various aspects of this article. Thanks to Tony Hallam of ETLP for writing some of the code that implemented the methods described. The data examples were created with Schlumberger's Petrel software, Hampson-Russell and ETLP's own in-house software.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are available from Shell UK, ConocoPhillips, BP, and CNR International. Confidentiality restrictions apply to the availability of these data, which were used under a contractual agreement for this study. Data are available from these companies by a written request for permission from the relevant company.

Field C. Ivory Coast, Late Cretaceous turbidite sands
This oil and gas field lies at 2500 to 3500 m below the mudline in deep water. The seabed displays pronounced canyon features and the overburden is structurally complex. The reservoir interval is a sequence of thin turbidite sands, interbedded with silts and muds, several hundred metres thick. A single monitor survey was shot more than ten years after initial production. Well-developed 4D signatures are observed due to gas coming out of solution and also a pore pressure increase due to injection. The pore pressure increase of up to 2030 psi (14 MPa) gives rise to a 5 ms slowdown, whilst the gas-out-of-solution signature reaches 10 ms. Example vertical sections of 3D seismic and 4D seismic (pre-time-shift correction) are shown in Figs. A2(a) and A2(b).

Figure A2
Examples of data for field data set C. (a) Vertical section from 3D baseline volume; (b) 4D differences prior to warping (i.e. time-shift correction). Examples of data for field data set D. (c) Vertical section from 3D baseline volume; (d) 4D differences prior to warping (i.e. time-shift correction).

Field D. UKCS, Palaeocene deepwater sands
This turbidite field is composed of thin, multiply stacked channel and sheet-like sands compartmentalized by roughly east-west-oriented faults (Falahat et al., 2010). These clean, good-quality sands lie at 2000 m depth, range in thickness from 20 to 40 m, have porosities of around 27% and well-developed permeabilities of hundreds of darcys. The field is monitored extensively over several years from injection start-up using towed-streamer surveys acquired one or two years apart. Light hydrocarbon gas injected into water-bearing sands creates a strong gas saturation signal with a time-shift slowdown of 26 ms, starting at the top of the shallowest sand unit. This gas-related signature is observed to be bounded by the major faults and also to some degree limited by stratigraphic pinchouts. Example vertical sections of 3D seismic and 4D seismic (pre-time-shift correction) are shown in Figs. A2(c) and A2(d).

A P P E N D I X B : T I M E S T R A I N
Time shift is a cumulative measure of the time-lapse changes in velocity and length along the ray path. It therefore masks local variations and cannot be directly interpreted for local