Analysing the role of caprock morphology on history matching of Sleipner CO 2 plume using an optimisation method

Geological carbon storage is a promising solution to reduce the CO2 concentration in the atmosphere to ameliorate the effects of global warming from the greenhouse effect. Among feasible storage options, deep saline aquifers are believed to have the largest storage capacity for the gas collected from industrial processes. The first CO2 storage project at a commercial scale in a saline aquifer is in the Sleipner field of the Utsira storage formation in Norway. The long ongoing storage operation in the Sleipner field has been the subject of several past studies attempting to recreate the observed injected CO2 plume migration behaviour. History matching is a method to adjust the input parameters of the model in a way to minimise the mismatch between the simulated and the actual production data in reservoir engineering and applicable to carbon sequestration. Typical parameters adjusted in history matching are porosity, absolute and relative permeability data. In this study, we used an adjoint-based optimisation tool and showed the importance of caprock morphology in finding an accurate plume match. Using a set of synthetic models, we initially minimised the mismatch between the observed and simulated CO2 plume outline by modifying the caprock topographical details. After testing the optimisation tool on the synthetic models, we applied the methodology to the Sleipner benchmark 2019 model and improved the plume match by locally adjusting caprock elevation within seismic detection limits. We subsequently improved the match by calibrating porosity, permeability, CO2 density and injection rate together in an experiment in which we calibrated all the parameters, including the caprock morphology, to find a better match. The results showed an improvement of around 8% (compared with the original model) in the plume match resulting from an average absolute elevation change of 3.23 m in the model while keeping the other parameters constant. Calibrating the porosity, permeability, CO2 density and injection rate resulted in a 5% improvement in the match, and once caprock morphology was included in the optimisation process, the match improvement increased by 16%. We changed the caprock elevation within a range lower than the seismic detection limit, and Correspondence to: Masoud Ahmadinia, Centre for Fluid and Complex Systems, Mile Ln, Coventry CV1 2NL, UK. E-mail: Ahmadinm@uni.coventry.ac.uk Received January 14, 2020; revised August 4, 2020; accepted August 6, 2020 Published online at Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/ghg.2027 This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2020 The Authors. Greenhouse Gases: Science and Technology published by Society of Chemical Industry and John Wiley & Sons Ltd. Greenhouse Gas Sci Technol. 0:1–21 (2020); DOI: 10.1002/ghg.2027 1 M Ahmadinia, SM Shariatipour Original Research Article: Analysing the role of caprock morphology on history matching of Sleipner CO2 plume results showed that even a few metres variations in the elevation have significant impacts on the plume migration and trapping mechanism in the Sleipner model. The method presented in this work results in a better match than the original seismic data for the Sleipner model. © 2020 The Authors. Greenhouse Gases: Science and Technology published by Society of Chemical Industry and John Wiley & Sons, Ltd.


Introduction
G lobal warming is mainly caused by the sudden increase in CO 2 concentration in the atmosphere. 1 Carbon capture and storage (CCS) is one of the practical solutions to tackle this problem, 2 which theoretically has the potential to decrease greenhouse gas emissions by up to 32% by 2060. 3 The idea is to first capture emitted CO 2 from large point sources, such as power plants or industrial facilities, then transport it to the storage site, and inject it into deep geological formations (deeper than 800 m) where it can be securely stored permanently. 4 Based on the estimation from the Intergovernmental Panel on Climate Change (IPCC), CCS helps reducing the cost of limiting global temperature rise to 2°C by 138%. 5 Currently CO 2 is primarily stored either in saline aquifers or used as an enhanced oil recovery (EOR) agent in hydrocarbon reservoirs. 6 Over the last decade, CO 2 has been used in more than 70 EOR projects around the world. 7 Some of the largest CO 2 -EOR projects are the Cranfield oil and gas field, [8][9][10][11] Weyburn Field in Canada, 12,13 Changqing Oil Field in China, 14 Santos Basin in Brazil 15 and Alberta Carbon Trunk Line project in Canada. 16 The Sleipner CCS project is the first storage project demonstrated on a commercial scale 17,18 and is the longest ongoing CO 2 storage project in the world, storing around one Megaton (Mt) CO 2 per year since 1996. 19 This has provided researchers with exceptional insight into CO 2 storage process in the saline aquifers. The site, located off the western coast of Norway, has been operated by Equinor as a CO 2 storage site, to prevent CO 2 emissions associated with natural gas production from the same region. The CO 2 is captured from the nearby gas processing field and then injected into the Utsira formation located at a depth of 800-1000 m beneath the North Sea. [20][21][22][23] The Utsira formation is late Cenozoic, a 200-250 m thick sandstone that has stored 17.8 Mt of CO 2 by 1 January 2019, 24 and the caprock formation is the Nordland shale with a thickness of 200-300 m. 25 The Sleipner 2019 Benchmark Model 26 is the most recent reference dataset from the Sleipner CO 2 storage site. Although so far, nothing suggests the stored CO 2 may leak into the atmosphere in the Sleipner model, it is still important to monitor the storage process and manage the leakage risk throughout the project. 25 Moreover, monitoring is important in order to understand the CO 2 plume migration path and speed and trapping mechanism, which is essential for decision-making purposes. Numerous researchers [27][28][29][30][31][32][33] have tried to gain a better understanding of the plume migration in the Sleipner and to find a satisfactory history match of the CO 2 plume migration. The nature of the caprock has been suggested to be a source of mismatch between the observed and simulated plume outline in Sleipner. [33][34][35] While a portion of the CO 2 that is injected into storage site will be securely stored through various processes, including dissolution, residual and mineral trapping, a large part will remain as a free phase in short to medium term. 2 If there is no proper barrier, such as anticline, the free phase will move upslope along the caprock. For storage safety and conformance purposes, it is important to improve the understanding of this migration process and develop confidence in long-term carbon storage performance. 2 CO 2 is generally injected in a supercritical state to achieve a higher density and consequently occupy less volume within the storage formation, 36 and depending on local storage condition, will remain in gas, liquid or supercritical phase. 37 Regardless of the phase, the injected CO 2 has a lower density than the formation water in practically all relevant scenarios. Consequently, shortly after the end of the injection period, most of the free phase CO 2 will migrate upwards due to buoyancy to lie beneath the caprock if not trapped below the low permeability formations in middle layers. 38 Therefore, the caprock morphology plays an important role with respect to the plume Original Research Article: Analysing the role of caprock morphology on history matching of Sleipner CO 2 plume M Ahmadinia, SM Shariatipour migration and storage security. Most caprocks are sedimentary rocks, mainly deposited horizontally. 39 They may, however, become tilted over the long term, after the imposition of various tectonic forces. Caprock morphology refers to the structures, such as tilts, folds, and so on, that are observed in the caprock and are controlled by several processes including pressure and temperature changes during the deformation of sedimentary rocks. 40 Sinusoidal structures are formed throughout the deformation of strata and the deposition of sediments. 40 They are common in sedimentary rocks and are observed in various scales from microscopic to regional, of which the main types are folds and bedform. 40 Formation dip has proven to have a significant influence on the storage process including residual and dissolution trapping, 41 up-dip migration and plume stabilisation. 42 In some demonstration storage sites, the formation has preserved various dip angles from high ranges, such as Nagaoka in Japan 43 and Ketzin in Germany with 15°d ip, 44 Frio in Texas with 16°dip, 45 to low ranges including Vedder formation in California with 7°dip. 46 Several researchers investigated the impact of caprock morphology on the CO 2 plume migration through numerical and analytical methods using realistic 31,47 and synthetic 38,48-51 models. The impacts of the top surface morphology and geological heterogeneities on CO 2 storage capacity were investigated in a previous study. 47 The structural trapping capacity was calculated through a spill-point (the structurally lowest point of the trap that can retain CO 2 ) approach and geometric analysis. CO2lab module of the MATLAB Reservoir Simulation Toolbox (MRST) which is a set of open-source simulation and workflow tools was used to study the long-term, large-scale storage of CO 2 . 52,53 While a simple volume analysis provided a good estimation of structural trapping, the study showed that residual trapping must be calculated through detailed flow simulation. The impact of rugosity, that is, the topography variation below 10 m, 54 on plume migration, was discussed in another study. 55 The work showed that neglecting surface roughness in geological models generally leads to an overestimation of the migration speed and the roughness provides additional storage capacity for the CO 2 . In a later work, 56 the importance of key structural parameters (including amplitude and wavelength of small scale structural traps) in controlling the CO 2 plume migration and trapping were evaluated. While amplitude plays a significant role in the amount of structural trapping, the work showed that the dynamics of the plume are mainly affected by the trap spacing, spill-points and formation dip angle.
Aquifer permeability and formation dip were shown to be the main parameters affecting the storage efficiency. 57 Later in another study, 58 the effect of tilt, rugosity and permeability anisotropy on CO 2 storage migration and trapping in a saline aquifer was investigated. The effect of the domain between the aquifer and the caprock (transition zone) on the CO 2 dissolution and movement was considered in the study. The results showed an enhancement in the dissolution due to the increased contact between the CO 2 with the brine as a result of the shale dispersing the plume over a wider area. Moreover, CO 2 was trapped beneath the shales which increased its storage security. Later, a systematic analysis of the impact of the formation slope on the CO 2 storage process in the Liujiagou formation in China was performed in another study. 59 According to the results, the plume will be symmetrical around the injection well. Moreover, the impact of dip on the CO 2 plume migration was seen to be negligible during the injection phase. Another work 60 focused on the impact of rugosity on the CO 2 storage process. The results showed that residual and dissolution trapping reduce the plume thickness, while caprock rugosity retards the plume migration. The majority of the CO 2 near the tip of the plume was seen to be structurally trapped inside the small-scale structures.
The observed plume outline from seismic data was matched with one from the simulation in a study for the Sleipner model. 33 The results suggested that the plume outline shape is mainly controlled by the caprock morphology and that the CO 2 -brine contact was governed by the CO 2 density. Later in another work, 31 a one-factor-at-a-time (OFAT) approach was employed to study the sensitivity of estimating the storage to changes in porosity, permeability, caprock elevation and aquifer conditions (pressure and temperature) in the Utsira aquifer. The results showed that caprock elevation and permeability have the greatest impact on plume dynamics, which was also partially confirmed by a previous research. 33 A sensitivity analysis on the impact of caprock rugosity on CO 2 trapping mechanisms was performed 48 using sinusoidal geological models. The results, similar to the previous studies, 61,62 showed that higher dip angles result in a higher dissolution and residual trapping but lower structural trapping. The results also indicated that in models with the highest rugosity values, in which the plume is less mobile, residual trapping is minimum. In another work, 38 the results from the vertical equilibrium (VE) tool in MRST−CO2lab were compared against a number of simulators, including the ECLIPSE-black-oil, ECLIPSE-compositional and ECLIPSE-VE in a CO 2 storage study in an aquifer. The results showed a good agreement between the approaches in terms of plume shape, although the amount of dissolved CO 2 in brine was different. While previous studies 59,63 showed that by increasing the tilt angle, the plume migrates further, which consequently results in a higher dissolution, the work 38 showed that in tilted models, however, with limited vertical permeability, more CO 2 becomes trapped residually in the bottom layers which eventually results in a lower dissolution. Regarding the computational cost, MRST−CO2lab significantly outperformed the rest.
General seismic surveys are proficient in detecting large-scale features in the caprock, including domes, traps and spill points. Their detection level, however, does not cover rugosity. Light Detection and Ranging (LiDAR) is a technique for determining the distance to an object by transmitting a laser beam at it and measuring the time the light takes to return to the transmitter. 54 LiDAR scanning of outcrops with a typical resolution between 0.1 and 1 m, can provide evidence of geological features not evident in the seismic investigation. 64,65 It is necessary to quantify and find the sources of uncertainty in the data representing the geological model as if this is not undertaken, it introduces errors into the simulation process. A popular method to mitigate the model uncertainty in reservoir engineering problems is history matching where different parameters, such as pressure and production data, are calibrated to decrease the mismatch between simulated and observed data. 66 Typical uncertain parameters considered in history matching problems are porosity, absolute and relative permeability data. 67 The focus of the current work is to find the importance of caprock morphology in comparison to other typical parameters in decreasing the mismatch between the observed and simulated plume outline in Sleipner model. The work continues with the following sections. In the second section, we introduce the characteristics of the synthetic models, which will later be used to test the adjoint-based optimisation tool by recreating a given CO 2 plume shape in the model. Later in this section, the Sleipner 2019 benchmark model is introduced. We performed simulations with a vertical-equilibrium assumption using the CO2lab module in the MRST. 68 The numerical simulator together with the optimisation framework and the methodology used to compare the plume outlines, are described in the third section. The simulator was coupled to an adjoint-based optimisation tool 33 to minimise the mismatch between the observed and simulated plume shape. The results for both synthetic and the Sleipner models are discussed in detail in the fourth section followed by a summary and conclusion in the fifth section.

Model input Synthetic model description
Here we first tested the optimisation tool in a problem with a known answer. Five sets of plume outlines have been created using specific slope and caprock topography variations and are referred to as the 'observed' plume. Using our approach, we recreated the 'observed' plume outline through systematic optimisation of uncertain parameters, that is, aquifer slope and caprock topography. Since the answer is already known, this step helps to test the approach before applying it to a real field model which is more complex. The contact boundary between the caprock and aquifer of the synthetic models in the first part of the study is represented by Eqn (1). The z-axis is oriented downwards.
where ω = 2π λ : angular frequency of the sine wave function, x, y and z: directional coordinate of the caprock surface, A and B: amplitude of the sine wave function, S: model dip angle.
The first two terms (with 'B' as a multiplier) represent the main structural traps. The rugosities in the x and y directions are presented by the next two terms, with a higher angular frequency than the main traps (ω 2 > ω 1 ). The last two terms represent the model slope.
B, ω 1 and ω 2 are constant in Eqn (1) at values of 35 (m), 15 (rad m −1 ) and 100 (rad m −1 ), respectively. The model parameters are the rugosity amplitude and model slope in the x and y directions, presented by A x , A y , S x and S y , respectively, which are subjected to change in order to find the best match. A schematic of  the caprock with no rugosity and slope ( A x = A y = S x = S y = 0) aquifer is shown in Fig. 1. The amplitude of the dome (B) is 40 m, and the z-axis is vertically exaggerated by a factor of 25.
A single CO 2 injector is considered at the centre of the model (i.e., cell (50,50)), injecting CO 2 for 12 years (starting from 1999) with a constant flow rate of 0.5 Mt year −1 (assumed), equivalent to 20% of the CO 2 emission of a 500 MW coal-based power plant. 69 Further information about the model is presented in Table 1. The model is homogenous with a permeability of 500 mD in the lateral direction. Note that vertical heterogeneity in permeability is ignored in VE models. Therefore, the intra-layer flow in the simulation is not considered and by permeability in this study, we are referring to horizontal permeability only. The errors resulting from VE modelling can in many cases be smaller than the errors resulting from low lateral resolution used to make the 3D simulations computationally feasible. 70 More information about the VE models is provided in the section on numerical simulation.
The relative permeability curves (k r ) for CO 2 and brine are taken from a previous work 71 and are shown in Fig. 2.
The numerical simulation of CO 2 injection and migration was initially performed on a synthetic model with a known slope (S x and S y ) and rugosity (A x and A y ). The resulting plume outline was recorded and subsequently reinterpreted as 'observed' data. New models were synthesised with a different set of inputs for the investigated parameters (slope and rugosity in the x and y directions). Simulations were then performed on individual synthetic models using a non-linear optimisation framework. The investigated parameters were calibrated within predefined limits with a view to match the original 'observed' data.

Sleipner 2019 benchmark model
The Sleipner CO 2 storage site has been extensively monitored from its inception, and a series of time-lapse seismic datasets that documents the migration of the injected CO 2 has been established and (along with associated well logs and baseline seismic) has served as an input to the creation of the benchmark model. The 2019 benchmark model is the first complete 3D model of the Sleipner covering eight reservoir zones interbedded with eight continuous shale layers. The geological model used in this part of the study consists of the caprock, sand wedge (L9, highlighted in red in    (Table 2) 72 in which the authors measured the values of the rates based on the injected volumes and plume growth rates observed from the seismic data. 72 The injection rate is representative of the volume of the CO 2 entering Layer 9, therefore it is smaller than the real CO 2 rate in the field. Since the internal layers are not considered, once injected, the whole CO 2 plume will reach the top-surface immediately, whereas in reality, the plume encounters and passes through eight shale layers and part of it becomes trapped before reaching Layer 9.
Dissolution and mineralisation are important factors, which nevertheless had to be left out of the study due to computational considerations. Relative permeability data have been taken from a previous work 72 with a residual saturation of 0.11 and 0.21 allocated to the brine and CO 2 , respectively. The temperature impact on density (and viscosity) is modelled with sampled tables of density and viscosity as functions of pressure and temperature using the CoolProps open-source package. 73 Further information about the model can be found in Table 3.
Permeability values are generated using the lognormal distribution approach within the reported range. Porosity is then generated from permeability data using the Kozeny-Carman correlation. 79 Figure 4 shows the porosity and permeability map used for the Sleipner model.

Numerical simulation
The numerical simulations are performed using the CO2lab module in MRST, 80 which uses a simplified version of flow equations and is based on the assumption of VE. Brine and CO 2 are assumed to be in hydrostatic equilibrium throughout the simulation and vertical flow migration is considered negligible compared to lateral migration. 81 Due to the significant difference between CO 2 and brine densities, segregation is considered to occur instantly and the fluids form two separate layers. For typical operational conditions and formation thickness in geological CO 2 storage projects, the VE assumption is likely to be valid for horizontal permeabilities above 100 mD. 82 Note that Sleipner Layer 9's permeability is between 1100-5000 mD and is characterised by gravity segregation. 33 By using the VE method, it is possible to model most relevant physical aspects of long-term CO 2 storage. Its feasibility in the CO 2 storage problem has   been investigated in previous studies 38,70,[82][83][84][85] and was also employed to model the Sleipner benchmark. 33,34,68,72 Strong segregation occurs in the Sleipner, and the plume migration is strongly affected by the caprock morphology. 72 In the VE model, the problem dimension is reduced from 3D to 2D. This significant reduction in the number of unknowns also significantly reduces the computational cost and allows the modeller to consider a higher lateral grid resolution beyond what would be practical in full 3D simulations. It is then possible to reconstruct the full 3D model from the 2D one as a post-processing step. 70 The VE simulation results were compared with a full 3D black-oil simulator, and the results showed that the VE model could capture the main flow physics of Sleipner benchmark. 68 The CO 2 plume behaviour on Sleipner observed in the VE model was observed to be the same as the one from a full 3D simulation model. 34 Equation 2 represents the mass conservation in a 3D simulation on the fine-scale.
Here, capillary pressure that is a function of water saturation can be expressed as: P c = p g − p w . By assuming vertical hydrostatic equilibrium and fluid segregations, it is possible to derive VE model equations from Eqn 2. The up-scaled parameters in VE formulations are calculated by integrating the fine-scale ones across the thickness of the aquifer with respect to the z-axis. The up-scaled mass conversation is represented in Eqn (4), which is achieved by replacing each variable in Eqn (2) with its vertically averaged counterpart. Readers are referred to a previous work 33 for the full derivation. The equation is presented in fractional form by using the plume thickness as a variable.
where p i : pressure at the interface of gas and water, f g : fractional flow function for the gas phase, which is given by Eqn (6).
It is important to note that the up-scaled mobility ( t in Eqn (5)) is different from the fine scale (λ α in Eqn (3)) and is defined as 81 : where ξ B = elevation of the bottom of the formation (surface), ξ I = elevation at which two fluids are separated (surface), K = integrated permeability.
Note that fine-scale mobility (λ α ) depends on saturation (which is a 3D model), whereas up-scaled mobility ( α ) depends on vertical plume thickness, h, (which depends only on 'x' and 'y'). The sharp interface assumption, which is valid in models with negligible capillary pressure effect, is employed for the reconstruction of fine-scale saturation and mobility based on the up-scaled saturation. More details can be found in Andersen et al. 86

Optimisation framework
The optimisation tool employed in this study is implemented in the MRST. It can be used for solving optimal control problems with forward and adjoint solvers, and implements a quasi-Newton optimisation routine using Broyden-Fletcher-Goldfarb-Shanno (BFGS) updated Hessians. 53 In the first part of the study (synthetic model), model parameters are A x , A y , S x and S y , which are scalar values representing the rugosity amplitude and aquifer slope in the x and y directions, respectively (Eqn (1)). In the second part of the study, the model parameter is dz, which represents the absolute local elevation changes in caprock depth, and whose number of scalar values equals to the number of cells in the top row of Sleipner layer 9 (64 × 118 = 7552).
Note that the objective function in this study only depends on plume thickness (h). The upscaled (VE) saturation is the fraction of CO 2 in a total vertical column (disregarding the residual saturation); in other words, h divided by the aquifer thickness. 38 The optimisation framework aims to find a set of model parameters and minimise the misfit function in form of J = J m where J m is described by the following equation: where m = time-instance of the set of observed CO 2 plume thickness (h obs ). We have four sets of observed data (2001, 2004, 2006 and 2010) in both parts of this study, h = Simulated plume thickness given the model parameters, V = Aquifer volume found below each cell in the 2D top surface grid, α = Regularisation term (equals to 0.1).
While adjusting dz in the second part of the study, we have a very large degree of freedom in the optimisation problem (equivalent to the z-value of all caprock cells in Sleipner model). The regulation term (α) is introduced to reduce the ability to overfit the data. See Sections 2.3-2.6 in the study performed by Nilsen et al. 33 for more information about the implemented adjoint-based optimisation framework in the MRST and Jansen's 87 work for more details about formulating adjoint equations for simulating multiphase flow in porous media.

Sørensen-Dice coefficient
Several approaches are available to quantify the similarity of the plume migration resulting from two different geological models. One method used in previous studies [88][89][90] is to compare the plume centre of mass with a reference point, such as the injection point.
Here we employed the Sørensen-Dice coefficient (SDC), a statistic used to quantify the similarity of two discrete samples. 91,92 The approach showed promising results when applied to the Ketzin 93 and Sleipner 30,31 CO 2 storage sites to compare the similarity of the simulated and observed CO 2 footprint. SDC ranges between 0 and 1, where an SDC equal to 1 corresponds to identical samples. Note that the SDC coefficient is not implemented into the optimisation framework and as mentioned in the section on optimisation framework, the objective function in this study is solely a function of plume thickness and plume outline refers to the plume thickness profile. Once the model parameters are optimised, the SDC coefficient is calculated after running a forward simulation using the optimised parameters. The SDC coefficient is defined as: where X: the plume outline from the simulation at the desired time, Y: the observed footprint generated from the seismic data at the same time as X.

Synthetic model
We tested our approach on a synthetic model first intending to recreate a set of model parameters used to generate simulated 'observations' . The adjoint-based optimisation tool is employed to effectively minimise the mismatch between the observed and simulated    (1)) and aquifer slope (S x and S y in Eqn (1)). Five sets of observed plume outlines (Cases A-E) are considered which are achieved with the following set of inputs listed in Table 4. Figure 5 shows the observed CO 2 Plume thickness profile at the end of the simulation for the synthetic models which are achieved using different sets of input parameters listed in Table 4.
Six sets of initial guesses (Cases 1-6) are considered for all the cases and are listed in Table 5. Two sets of limits to optimise the variables are considered: scenarios 'a' and 'b' in which the optimised parameters could be any value within (0-10) and (0-20) (degrees or metres, based on the parameter), respectively. Table 5. Initial guesses and search range for each of the model parameters.

Model parameter limit
Scenario Assigning the initial guess values listed in Table 4 to the uncertain parameters results in different plume thickness outlines in 2001, 2004, 2006 and 2010, which are shown in Fig. 6.
The optimisation has been performed for 60 cases which is a combination of five sets of observed plume outline, six sets of initial guesses for the parameter with two sets of assigned limits. Figure 4 shows the number of iterations for each optimisation run (x-axis), where the letters A-E on y-axis show the name of the five cases with a different observed plume outline (as in Table 3 and Fig. 2) and legends 1-6 indicates the six sets of data used as an initial guess (as in Table 4). Figure 7(a) and (b) shows the results for scenario 'a' , that is, small parameter limit of (0-10) and the results for scenario 'b' that is, larger parameter limit of (0-20), respectively.  While it is difficult to find a relationship between the shape of observed plume outline (Cases A-E) and the number of iteration, the results clearly show that optimisation requires a higher number of iterations in scenarios 'b' than 'a' . Therefore, the narrower range can decrease the computational cost, although this could introduce the risk of missing the optimum points.
The resulting plume outline for all the cases resulted in an SDC ∼1 at all four time steps, that is, the observed and simulated plume outline had a complete match throughout the simulation. Here, we show the results of the plume outline match for Case A. The corresponding SDC for plume outlines using the initial model parameters (A x = 7, A y = 3, S x = 4 and S y = 2) are listed in Table 6.
Calibrated CO 2 plume thickness profile for Case A using smaller parameter limit (scenario 'a') is presented in Fig. 8. Similar plume outline was achieved for scenario 'b' and the results indicate that a satisfactory plume match is achieved regardless of the starting point (Cases 1-6) and ranges (scenarios 'a' or 'b') used in this study.
Here we also compare the results based on the calibrated values for the model parameters. Figure 9 shows the optimised values for all the optimisation runs in Case A. As mentioned earlier, the observed plume outline in Case A is achieved setting A x = 7 m, A y = 3 m, S x = 4 o and S y = 2 o in Eqn (1). For this study, we observe no significant dependence on the initial guess for the final solution obtained. A similar degree of accuracy was achieved for Cases B-E. Note that in more complex optimisation settings, the choice of the start point would be important and here we are dealing with a simple problem. We define averaged error to quantify the mismatch between the calibrated and observed model parameters as: where X and Y are described by Eqns (11) and (12), respectively, where 'cal' and 'sim' denote calibrated and simulated, respectively.  The averaged error helps to provide a better understanding of the performance of the optimisation tool. The results for all the optimisation cases of scenarios 'a' and 'b' are shown in Fig. 10(a) and (b), respectively. The results show an average error of less than 2.5% for all the cases. Although setting tighter bounds (scenarios 'a') for the parameter ranges resulted in a relatively lower computational cost (lower number of iterations as shown in Fig. 7), however, it does not guarantee a more accurate result, as the averaged errors for this scenario is fairly the same as for scenario 'b' . While there is a small 'wiggle room' in calibrated parameter values, results clearly show that the optimisation tool employed in this study can identify precisely a single set of model parameters to match the plume extension at different time steps. In the next section, we discuss the results from applying the tool to the recent Sleipner model to calibrate its caprock morphology to improve the match between the observed and simulated plume outline.

Sleipner model
The plume outline resulting from the original Sleipner model is compared with the one from seismic data in 2001,2004,2006 and 2010 (Fig. 11). The plume shows a better match in the early years (SDC ∼70%) than in 2010 (SDC ∼60%).
Various researchers addressed the different source of uncertainties, such as temperature, 30,32,94 CO 2 plume impurities, 30 CO 2 density, 35,78,94,95 injection rate, 33,35 porosity and permeability 33  First, the mismatch between the observed and simulated plume outline is minimised by only modifying the caprock morphology. To avoid an unrealistically large plume, we calibrated the injection rate before the main optimisation. The results showed that using a rate multiplier of 0.7, which is in agreement with the range reported in a previous study, 33 improved the plume match (9% improvement in plume match in 2010, for instance). The resulting plume therefore roughly represents the fraction of CO 2 that has reached the upper layer based on the time-lapse seismic data. The resulting plume outline from the calibrated model is compared with the seismic data in Fig. 12. The result shows an increase in the average SDC in the four studied time steps from 65% (original model) to 73% (calibrated model) which is significant. Note that we are not expecting to find a perfect match as other parameters affect the Sleipner plume match. Calibrating the caprock morphology alone seems to especially improve the match in the middle of the plume, where the CO 2 plume is thickest and forms a reverse 'L' shape. This improvement cannot be observed by SDC as the plume thickness is not considered in the similarity measurement which is a limitation of this approach. Moreover, SDC considers the difference in the uncommon elements which is another limitation. In other words, if circle X and Y have an area equivalent to 75% and 125% of circle Z, respectively, then SDC X & Z = SDC Y & Z = 0.75. Note, as mentioned in the Sørensen-Dice coefficient section, the SDC coefficient is not implemented into the objective function and is calculated for the model once the parameters are optimised, therefore, its limitation does not affect the optimisation process.
While the optimisation tool was allowed to modify the caprock morphology within ±10 m, the results presented in Fig. 13 are achieved by an average absolute elevation change of ±3.23 m. Figure 10 shows the elevation change applied to the original geological model. The algorithm seems to decrease the elevation in the region outside the observed plume outline. Therefore, the lower caprock outside the observed plume area encourages the plume to move inward.
For the second and third experiments, the values of porosity, permeability, injection rate and CO 2 density are modified using a multiplier for each of the parameters. The results are listed in Table 7. While adjusting the caprock morphology only in the first experiments increased the SDC to 73%, changing porosity, permeability, injection rate and CO 2 density (second experiment) resulted in an SDC of 70% which shows the significant of caprock morphology in controlling the plume migration. The results also show that calibrating all the parameters together (third experiment) can decrease the mismatch between observed and simulated plume outlines significantly and results in an SDC of 81%. It is, however, important to note that the calibrated permeability average in the second and third experiments is noticeably above the reported range for Layer 9 in the literature, which is 1.1-5 Darcy. Calibrated densities in both experiments, however, are close to the proposed value by Cavanagh and Haszeldine, that is, 355 kg m −3 .
The pressure distribution in the models at the end of the simulation is also shown in Fig. 14.
The resulting plume outlines for the second and third experiments are shown in Figs. 15 and 16, respectively. While the previous results showed that calibrating the caprock morphology alone improves the plume match in the middle regions of the plume, as we can see in Fig. 15 calibrating porosity, permeability, injection rate    and CO 2 density has a similar impact and helps to improve the match in the centre of the plume. However, adding the caprock morphology to the history matching parameters results in a better match in the upper and lower part of the plume. This can be seen in Fig. 17 as increasing the elevation in both sides of the model (red area in Fig. 17), helps the plume in becoming trapped within these regions.

Conclusion
This work is focused on the impact of caprock morphology on CO 2 plume migration and footprint in the Sleipner model. We employed an optimisation code to systematically change the model parameters to minimise the mismatch between the observed and simulated plume outlines. First, we used our method to predict the CO 2 plume shape in a synthetic model. We found a set of parameters representing the slope and rugosity in the x and y directions using a non-linear simulation-based optimisation tool implemented in MRST and successfully reproduced the observed plume footprint. The method was robust, and the results were satisfactory using different initial starting points and calibration limits for each parameter. We then employed the optimisation tool to improve the plume match in the Sleipner 2019 benchmark model by modifying the caprock morphology within a range smaller than the seismic detection limit. The results showed an average improvement of about 8% in the plume match in comparison to the original model, resulting from an average absolute elevation change of 3.23 m in the caprock. In order to compare the importance of caprock morphology with other parameters in improving the plume match in Sleipner model, we defined two additional experiments. First, we performed the optimisation by calibrating porosity, permeability, CO 2 density and injection rate and later we added the caprock morphology to the history matching parameters. When excluding the caprock morphology, modifying the porosity, permeability, CO 2 density and injection rate, the optimisation resulted in an improvement of 5%. Calibrating the caprock morphology together with other parameters, however, resulted in an improvement of 16% compared to the original model. The results clearly show the impact of caprock morphology in controlling the plume migration and trapping mechanism in the Sleipner model.
Calibrating the caprock morphology in the presence of other parameters improved the match in the bottom and upper part of the plume, while when calibrated alone, it was more effective in the middle where the CO 2 has the highest saturation. This work demonstrates the importance of accurately implementing topographic variations in the geological models for CO 2 storage studies. Moreover, it helps in developing a better insight on the plume migration in the saline aquifer and shows how a few metres elevation change in the top surface, which cannot be detected in seismic surveys, can control the CO 2 migration path. Similar optimisation tool can be implemented in any storage site to improve the geological model by modifying the caprock morphology within the range lower than the seismic detection limit.
The internal layers of the Sleipner model were disregarded in this study, and the focus was on Layer 9 only. A more detailed study would involve applying the VE model to each layer and use a stacked VE model instead where shale layers could be represented by modified transmissibility.