CAS-ESM 2 : Description and Climate Simulation Performance of the Chinese Academy of Sciences ( CAS ) Earth System Model ( ESM ) Version 2

©2020 American Geophysical Union. All rights reserved. Zhang He (Orcid ID: 0000-0001-7222-9072) Zhang Minghua (Orcid ID: 0000-0002-1927-5405) Fei Kece (Orcid ID: 0000-0001-8520-7278) Ji Duoying (Orcid ID: 0000-0002-1887-887X) Wu Chenglai (Orcid ID: 0000-0002-8397-2424) Zhu Jiawen (Orcid ID: 0000-0002-6174-7234) He Juanxiong (Orcid ID: 0000-0003-3816-9656) Chai Zhaoyang (Orcid ID: 0000-0002-8064-9066) Xie Jinbo (Orcid ID: 0000-0002-3003-9155) Dong Xiao (Orcid ID: 0000-0003-2634-2132) Chen Xueshun (Orcid ID: 0000-0002-4708-1149) Lin Pengfei (Orcid ID: 0000-0003-2361-0066) Lin Zhaohui (Orcid ID: 0000-0003-1376-3106) Liu Hailong (Orcid ID: 0000-0002-8780-0398) Liu Xiaohong (Orcid ID: 0000-0002-3994-5955) Wang Tianyi (Orcid ID: 0000-0002-8726-9197) Wang Zifa (Orcid ID: 0000-0002-7062-6012) Xie Zhenghui (Orcid ID: 0000-0002-3137-561X) Xu Yongfu (Orcid ID: 0000-0002-9925-6756) Yu Yongqiang, yong (Orcid ID: 0000-0001-8596-3583) Zeng Qingcun (Orcid ID: 0000-0002-8237-8802) Zhu Jiang (Orcid ID: 0000-0001-9846-8944)


Introduction
The Chinese Academy of Sciences (CAS) Earth System Model version 2 (CAS-ESM2) has been developed to simulate both the physical climate system and the global environmental processes of air pollution and carbon cycle. In developing CAS-ESM, we have set the following priorities. First, we introduce new model parameterizations to provide numerical simulations that are sufficiently independent from those of other models. Second, the model is built with capability to simulate environmental and ecological systems. The former includes air pollution while the latter vegetation dynamics. Third, wherever possible, physical parameterizations are designed with placeholders for additional improvements pending on follow-up diagnostics and evaluations. Additionally, the model is built with a two-way nesting of a regional high-resolution atmospheric model to facilitate the application for simulation and forecasting of regional climate and environmental extreme events. Additional components are the IAP Vegetation Dynamics Model and the IAP fire model embedded within the land model; IAP ocean biogeochemistry model embedded within the LICOM ocean model; interactive components through the coupler to CAS-ESM2 of an atmospheric aerosol and chemistry model and various emission models. The model is a newcomer in the community since this is the first time that it is contributing to CMIP simulations. An earlier version CAS-ESM1 had used IAP GCM4 as the atmospheric model ) that contained the IAP dynamics but largely the same physical parameterizations of the Community Atmospheric Model (CAM5). CAS-ESM2 contains significantly changed physical parameterizations in addition to the IAP dynamical core in the atmospheric model.
CAS-ESM has several sister versions of a climate model, the CAS-Flexible Global Ocean-Atmosphere-Land System model (FGOALS; Li et al. 2020;Guo et al., 2020). FGOALS has contributed to each phase of the past Coupled Model Intercomparison Project (CMIP). CAS-ESM2 shares the same ocean model as CAS-FGOALS and some features of the dynamical core of the atmospheric model in FGOALS-g. It also shares the coupling software infrastructure Coupler 7 of CESM.
The purpose of this paper is twofold. One is to describe CAS-ESM2 and its simulation features as a reference to a CMIP6 model. Emphasis will be given to remaining biases and possible causes. The other is to share our experience in the development of model. The paper is organized as follows. Section 2 gives a synopsis of the model components. Section 3 describes the calibration process of the pre-industrial control simulations. Section 4 presents simulation results of the present-day climate. Section 5 discusses results of climate sensitivity experiments.
The last section is a summary. CMIP6 simulation data have been submitted to the ESG grid for analysis by the community.

Atmospheric model
The atmospheric component of CAS-ESM2 is IAP AGCM5.0, the fifth generation AGCM developed by IAP. The development of the IAP AGCM started in the early 1980s. It is a global grid-point model using finite-difference scheme with a terrain-following σ coordinate. Its predecessor version IAP AGCM4.1  was released in September 2015 as part of the CAS-ESM1. Earlier versions of the model used physical parameterizations of the Community Atmospheric Model (CAM, e.g., Neale et al., 2010) with some modifications for tunable parameters. The current version is described in the following.
a. Dynamical core The dynamical core of the atmospheric model IAP AGCM5.0 uses the finite difference method of Zeng (1983) with two horizontal resolutions, 0.5 degree and 1.4 degree respectively on uniform latitude-longitude grids. The model has three configurations of vertical resolution with 35 levels (model top at 2.2 hPa), 60 levels (model top at 2.2 hPa), and 91 levels (model top at 0.01 hPa) respectively. For results presented in this paper, the 1.4-degrees and 35-level resolution configuration is used. Compared to its predecessor IAP AGCM4.1 with 30 levels, the increased vertical layers in IAP AGCM5.0 are mostly located in the lower troposphere below 700 hPa (Figure 1), which was designed to improve model performance related to boundary-layer processes.
Several novel features of the dynamical core in the IAP AGCM5.0 are inherited from the previous version 4.0. These include subtraction of the standard atmospheric stratification, in which the state variable of temperature is the departure from a prescribed vertical profile of climatology; the IAP transform, in which the prognostic variables are atmospheric state variables weighted by the square root of the surface pressure so that the numerical scheme conserves total available energy instead of total energy; a nonlinear iterative time integration method, and time splitting method. More specific formulations can be found in Zhang et al. (2013). In the IAP AGCM4.1, Fourier filtering is applied to damp the high frequency waves in high latitudes. However, the global data communication in the FFT algorithm posed serious limitation to the model parallel scalability. Thus, an adaptive leap-format difference is implemented in the IAP AGCM5.0 instead of FFT filter to achieve high parallel efficiency based on 3D decomposition (Cao et al., 2020).
b. Physical parameterizations IAP AGCM5.0 uses a two-plume atmospheric convection scheme. The mean vertical velocity in both plumes is calculated by using buoyancy and entrainment as described by the Simpson and Wiggert (1969) equation. The two plumes differ in the parameterizations of their entrainment/detrainment rates, mass flux closure, and trigger. The deep plume has both an organized entrainment/detainment and turbulent entrainment/detrainment, conceptually similar to the Tiedtke (1989) scheme, but different in physical assumptions and implementations. The organized component in the deep plume is intended to capture buoyance-forced cloud-scale circulations, while the turbulent component is intended to capture mixing at the lateral surface of the plume. The deep plume uses a prognostic closure with the Convective Available Potential Energy (CAPE) relaxed toward a quasi-equilibrium condition. It is triggered jointly by CAPE, dCAPE (Xie and Zhang, 2000), and column-integrated precipitable water at a threshold saturated value below the level of 750 hPa. The shallow plume uses buoyancy sorting to calculate entrainment and detainment with Convection Inhibition (CIN) as closure following Park and Bretherton (2009). It is triggered when the kinetic energy of rising air parcel in the boundary layer is larger to overcome the CIN. The implementation differs from Park and Bretherton (2009) in that the lifting forced by large-scale flow and surface heterogneity are added to the kinetic energy of the parcel. In Park and Bretherton (2009), shallow convection is triggered when the kinetic energy of air parcels at the tail of the subgrid scale distribution of vertical velocity is larger than the convective inhibition (CIN). We included the large-scale vertical velocity to the subgrid scale vertical velocity. We additionally used the subgrid distribution of terrain and the magnitude of spatial gradients of temperature and humidity at the resolved scale to form an empirical heterogeneity metric to add to the eddy kinetic energy. Description of the convection scheme and its performance will be presented in a separate paper in this special collection.
The turbulence scheme uses the 1 st order closure with diagnostic turbulent kinetic energy modified from Bretherton and Park (2009). The two notable changes include the following: (1) Entrainment of cloud-top radiative cooling is calculated by using radiative cooling rate at the top of the fractional clouds instead of at the whole grid; (2) the impact of mesoscale surface inhomogeneity is included in the calculation of the turbulent kinetic energy. Surface mesoscale inhomogeneity is introduced as a perturbation to the thermodynamic fields by using subgrid distribution of terrain and spatial gradients at the resolved scale.
The atmospheric cloud macrophysical scheme uses separate calculations of boundary layer clouds, convective and stratiform clouds above the boundary layer. Within the boundary layer, cloud amount is calculated based on the variances of subgrid variation of liquid water potential temperature and total water that are diagnosed from the subgrid scale vertical transport calculated from the convection and boundary layer scheme. An implicit probability distribution function of the subgrid scale variability is used to derive boundary cloud amount and cloud water, which were based on LES simulations over several climate regimes. Convective cloud amount is parameterized based on convective mass flux and convective vertical velocity as well as detrained amount of air mass in both the deep and shallow plumes. Stratiform cloud above the boundary layer uses implicit function of the subgrid scale distribution of temperature and water vapor that collapses to grid-scale relative humidity, for both water and ice saturation depending on temperature. Detailed description of the cloud scheme will be submitted in a separate paper.
The cloud microphysical scheme is a modification of the two-moment Morrison and Gettlemen (2008) scheme with several changes: (1) consideration of subgrid scale variability of cloud water that is dependent on atmospheric stability and model resolution based on measurements from the Atmospheric Radiation Measurement program (ARM) of the U.S. Department of Energy (Xie and Zhang, 2015); (2) a new particle size dispersion parameterization based on data from Liu et al. (2006); (3) inclusion of the impact of spectral dispersion of particle size distribution on the collision kernel and in the radiation calculation; (4) inclusion of water vapor deposition onto ice crystals. The cloud microphysical scheme is coupled with the aerosol model in the same way as in the CAM microphysics with the Modal Aerosol Model (MAM) (Neale et al., 2013).
The radiation scheme uses the RRTMG (Mlawer et al. 1997). The infrared radiation model RRTMG-LW uses 16 contiguous spectral bands and 140 g-intervals. The shortwave radiation model uses the RRTMG-SW with 14 contiguous spectral bands and 112 g-intervals.
The orographic gravity wave parameterization is from Xie et al. (2020). It is based on the orographic anisotropy scheme of Kim and Doyle (2005) and Choi and Hong (2017) by allowing the winds to blow from all directions relative to the terrain orientations.
The IAP AGCM5.0 is therefore sufficiently different from other models because of its unique dynamical core, the new convection and cloud schemes, and many modifications made to other parameterization schemes adopted from other models 2.2 Ocean and sea ice models The ocean component of CAS-ESM2 uses LICOM2.0 (Liu et al., 2012), with  -coordinates and a free surface. The model is formulated on standard latitude-longitude grids at horizontal resolution of approximately 1globally, increasing to 0.5meridionally between 10°S and 10°N. There are 30 levels in the vertical direction with 10 m per layer in the upper 150 m. The vertical turbulent mixing scheme is the second-order turbulent scheme of Canuto et al. (2001Canuto et al. ( , 2002. The mesoscale eddy parameterization uses the Gent and McWilliams (1990) scheme with a diffusion coefficient of 1000 m 2 /s for both the bolus and Redi parts of the isopycnal mixing. Convection is parameterized by convective adjustment (Pacanowski, 1995). Based on original version of LICOM2.0, some key modifications have been made: (1) a new sea surface salinity boundary condition is introduced based on the physical process of air-sea flux exchange at the actual sea-air interface (Jin et al., 2017); (2) a new formulation of turbulent air-sea fluxes is introduced based on Fairall et al. (2003). An evaluation of the ocean model performance can be found in Lin et al. (2013) and references therein.
The sea ice model is modified from the Los Alamos sea ice model Version 4.0 (Hunke and Lipscomb, 2008), using the same grid as the oceanic model. The model solves the dynamic and thermodynamic equations for 5 ice thickness categories, with one snow layer and four ice layers.
For the dynamic component, we used the elastic-viscous-plastic rheology (Hunke and Dukowicz 1997), the mechanical redistribution scheme (Lipscomb et al., 2007) and the incremental remapping advection scheme (Lipscomb and Hunke, 2004). For the thermodynamic component, we used a parameterization with more realistic sea ice salinity budget as described in Liu (2010).

Land model
The land component of CAS-ESM2 is the Common Land Model (CoLM; Dai et al. 2003), which was initially developed by incorporating the best features of three earlier land surface models: the Biosphere-Atmosphere Transfer Scheme (BATS; Dickinson et al., 1993), the 1994 version of the CAS IAP Land Surface Model (IAP94; Dai and Zeng, 1997) and the NCAR (National Center for Atmospheric Research) Land Surface Model (LSM; Bonan, 1996Bonan, , 1998. The initial version of CoLM was adopted as the Community Land Model (CLM) for use with the Community Climate System Model (CCSM; Collins et al., 2006), and later was adopted as the land component for Beijing Normal University Earth System Model (BNU-ESM, Ji et al., 2014). Currently, the CoLM is radically different from its initial version.
Changes include the followings: (1) The two-stream approximation model of radiation transfer of the canopy has been improved, with attention to singularities in its solution and with separate integrations of radiation absorption by sunlit and shaded fractions of canopy (Dai et al., 2004).
(2) A photosynthesis-stomatal conductance model is used for sunlit and shaded leaves separately, and for the simultaneous transfers of CO2 and water vapor into and out of the leaf (Dai et al., 2004). (3) The vertical resolution is increased to 15 vertical soil layers and up to 5 snow layers, with which the soil column resolves downwards to 42.1 meters and has the thermal insulation effects of soil organic matters. The deep soil column and the insulation effects of soil organic matter are important to realistically simulate northern high-latitude permafrost extent and its active layer thickness (Lawrence and Slate, 2008). (4) A new frozen soil scheme is introduced by considering the supercooled soil water (Niu and Yang, 2016), which allows liquid water to coexist with ice in the soil over a wide range of temperatures below 0°C by using the freezing-point depression equation. The computation of vertical water fluxes considers fractional permeable area within a model layer, with the total soil moisture used to calculate the soil matric potential and hydraulic conductivity. (5) Multi-layer soil organic carbon (OC) scheme is introduced to describe the cryoturbation and bioturbation effects on vertical soil carbon mixing, which allows the soil carbon generated near the soil surface to move downwards into colder regions of the soil (Koven et al., 2009). (6) A carbon-nitrogen interactions scheme is introduced based on LPJ-DyN (Xu and Prentice, 2008). (7) A new frozen soil parameterization is implemented that includes frost and thaw fronts (Gao et al., 2016(Gao et al., , 2019. (8) An anthropogenic water use scheme is introduced (Zou et al., 2014(Zou et al., , 2015Zeng et al., 2016Zeng et al., , 2017. (9) The model has included anthropogenic nitrogen discharge and transport into rivers (Liu et al., 2019).

Dynamic global vegetation model
A new dynamic global vegetation model, IAP-DGVM, is coupled with CoLM in the framework of CAS-ESM2 (Zhu et al., 2018). The IAP-DGVM classifies land natural vegetation into 12 plant functional types with physical, phylogenetic, phenological parameters and bioclimatic limitations (Zeng et al., 2014). Significant developments in IAP-DGVM included the following: (1) a shrub sub-model (Zeng et al., 2008;Zeng, 2010); (2) a new establishment parameterization scheme ) that enables the model to correctly reproduce the regimes of major terrestrial ecosystems, the dependence of the vegetation distribution on climate conditions; (3) a process-based fire parameterization of intermediate complexity (Li et al., 2012) to simulate the global fire burned area, fire seasonality and interannual variability. When coupled with CAS-ESM2, IAP-DGVM showed a good performance in simulating global vegetation distributions and carbon fluxes (Zhu et al., 2018).

Atmospheric aerosol and chemistry model
CAS-ESM2 uses two atmospheric aerosol and chemistry models. One is the IAP Aerosol and Atmospheric Chemistry Model (IAP-AACM; Chen et al., 2015;Wei et al., 2019) which is coupled with two-way interaction to the IAP AGCM5.0 through the coupler. The other model uses the three-mode version of modal aerosol mode (MAM3) described in Liu et al. (2012b) and the Model for Ozone and Related chemical Tracers (MOZART) (Emmons et al., 2010;Tilms et al. 2015). The IAP-AACM includes modules to calculate gaseous chemistry, aqueous chemistry, heterogeneous chemistry, dry and wet deposition. For different research purposes, two options can be used for gas phase chemistry: the Carbon Bond Mechanism Z (CBMZ; Zaveri and Peters,1999) and a simplified chemistry using the oxidants from CBMZ (Wei et al., 2019). The CBMZ scheme calculates 133 reactions for 53 species to simulate the major gas-phase reactions for photolysis, ozone production, oxidation of gas pollutants in the atmosphere. The simplified chemistry option uses the oxidants of CBMZ scheme and only considers the sulfur chemistry.
The IAP-AACM also has two versions of aerosol modules to represent the size distribution of aerosol particles. One is the two-static mode and the other is the size-bin scheme. The twostatic modes include a fine mode to simulate the primary fine particles and secondary particles with diameters less than 2.5 μm and a coarse mode to simulate primary coarse particles with diameters in 2.5-10 μm. In this scheme, only the bulk mass concentration of aerosol components is calculated. The dust and sea salt particles are represented by four size bins and they are calculated online using the schemes developed by Wang et al. (2000) and Athanasopoulou et al. (2008) respectively. The size-bin scheme uses advanced particle microphysics (APM; Yu and Luo, 2009) to simulate size distribution of aerosol particles with dry diameters in 0.0012 μm to 12 μm (Chen et al., 2014). The model coupled with this scheme has been used to simulate the spatial-temporal variation of particle number concentration, calculate the aging black carbon (BC) particles, and explore new particle formation events (Chen et al., 2017(Chen et al., , 2018. The IAP-ACCM with the complex chemistry option has been used in China for air quality and air pollution studies both as a research tool and as an operational model. The IAP-AACM is coupled to the cloud microphysics model in the same way as MAM in CAM (Neale et al., 2013). Liu et al. (2012b) treats both the internal and external mixings of various aerosol components including BC, OC, sulfate, ammonium, sea salt and mineral dust (Liu et al., 2012b). While anthropogenic aerosol and precursor gas emissions are prescribed, natural aerosol (dust and sea salt) emissions are interactively calculated in the model. Dust emission strongly depends on the near surface wind speed, soil properties and roughness elements. The dust emission scheme of Shao (2004) is implemented into CAS-ESM2 and dust emission is coupled to the land model CoLM. The scheme is developed based on wind-erosion physics assuming that the main mechanisms for dust emission are saltation bombardment and consequent aggregate disintegration (Shao, 2004). Dust emission occurs in the bare soil when the surface wind (in terms of friction velocity) is strong enough to overcome the resistance of surface for wind erosion (in terms of threshold friction velocity). Threshold friction velocity is calculated by adding correction factors from soil moisture (Fecan et al., 1999) and roughness elements (Raupach et al., 1993) to the idealized threshold friction velocity for dry soils with no vegetation cover surrounding. The idealized threshold friction velocity is calculated following the scheme of Shao and Lu (2000). In general, CAS-ESM can reasonably simulate the dust emissions over the deserts on the earth and the estimated global dust emission flux is around 2640 Tg for piControl experiment. A detailed evaluation of dust simulation in CAS-ESM will be given in a separate paper in preparation. For the simulations in this paper, the three-mode version of aerosol module (MAM3) and the MOZART chemistry model are used.

Coupling and configuration
The software framework of CAS-ESM2 is shown in Figure 2, which is based on NCAR's CPL7 (Craig et al., 2012). As with CESM, CAS-ESM2 provides many different model configurations, including both standalone model and various combinations of the individual model components. Two new features in CAS-ESM2 are noted: (1) The mesoscale Weather Research and Forecasts (WRF) model is coupled to atmospheric model IAP AGCM through CPL7 with two-way interactions to enable regionally refined climate simulation or one-way downscaling (He et al. 2013). (2) The atmospheric aerosol and chemistry model is coupled to the IAP AGCM through CPL7 rather than embedded within the atmospheric model to facilitate independent development and maintenance. These two features both involved extension of CPL7 for 3dimensional data mapping and transfer. For the coupling between WRF and the IAP AGCM, the global model generates the lateral condition for the WRF (version 3.6, at present). WRF feedbacks to the global model by providing relaxation fields of atmospheric state variables for the IAP AGCM. The coupling of IAP AGCM and WRF is intended for regional high-resolution short-term simulations and operational seasonal prediction.
The CAS-ESM2 component models' versions and resolutions for CMIP6 DECK and historical simulations are summarized in Table 1. Table 2 summarizes the CAS-ESM2 simulation campaign described in this paper for CMIP6. The five simulation types followed the CMIP6 DECK specifications (Eyring et al., 2016). In the historical and AMIP experiments, the external forcings are based on CMIP6 data at https://esgf-node.llnl.gov/search/input4mips/. These include: GHG concentrations for CO2, CH4, N2O, CFC11, and CFC12 from Meinshausen et al. (2017); emissions of short-lived species from Hoesly et al. (2017Hoesly et al. ( , 2018 and van Marle et al. (2017); solar forcing from Matthes et al. (2017); stratospheric aerosols and ozone concentrations from CMIP6. In addition, prescribed PCMDI sea surface temperatures (SSTs) and sea ice fractions (Durack and Taylor, 2017) are used in AMIP experiments.
The piControl simulation was initialized as follows. The initial states of the atmosphere and land are obtained from a 10-year AMIP-like simulation forced by observed climatology of SST and sea ice concentration, and the initial states of the oceans and ice are from the annual mean observational sea temperature, salinity and sea ice from the World Ocean Atlas 2005 . The model first experienced a cooling in SST in the first few years, after which it warmed up to find its own climate state. Initial conditions for the 1%year CO2 increase (1pctCO2) and abrupt CO2 quadrupling (abrupt-4xCO2) simulations were taken from 1 January of year 250 from piControl. Four members of historical simulations were branched from 1 January of year 80, 150, 200, and 250 from piControl simulations, respectively. Atmosphere initial conditions for four members of AMIP simulations were taken from year 1979 of the corresponding historical simulations.

Model calibration
In the initial development of IAP AGCM5.0, we used AMIP simulations with prescribed seasurface temperature to calibrate the new convection scheme, cloud scheme and various modifications in physical parameterizations against observations. Key targeted metrics were seasonal and spatial distribution of clouds, precipitation, and zonally averaged atmospheric temperature and specific humidity. We achieved improvement in many of these metrics relative to the earlier versions of the model. When applied to preindustrial simulations in the coupled model CAS-ESM2, however, we had to retune the model for it to reach quasi-equilibrium that is not far away from observational estimate.
The most significant tuning target was the energy balance at top-of-model (TOM) in the piControl simulation. After tuning, the simulated TOM net radiative flux was approximately -0.1 W/m 2 over the 500 years simulation and almost no long-term linear trend (Figure 3a). Figures 3b and 3c show the 500-year time series of Sea Surface Temperature (SST) and globally averaged ocean temperature (T). After the first 80 years of a rapid adjustment in the upper ocean, the model nearly reached its quasi-equilibrium state. The equilibrium state of SST was approximately 18.3C, which is slightly warmer than the observed value 18C for 1854-1859 from ERSSTv5 . Sea ice coverage in Arctic and Antarctic also reached quasi-equilibrium states (figure not shown).
The tuning is carried out by adjusting a parameter in parameterization of the amount of low clouds. In theory, as long as the model does not enter into a run-away greenhouse state, equilibrium will be reached eventually because of the negative feedback of thermal radiation. In practice, however, the equilibrium state could be far away from the initial condition and the observation. Increasing the low cloud amount acts to cool the planet, and vice versa. The low cloud amount was initially tuned to the observation in present-day climate, but it had to be re-tuned higher to obtain a reasonable value of SST in the coupled simulation. The tuning parameter is a threshold relative humidity that represents the width of the subgrid scale distribution of humidity. The radiation imbalance is 0.16 W/m2 in AMIP simulations. This small imbalance with present-day SST is due to the warm bias of SST in the piControl simulation.
The small increasing trend of less than 0.05 K/century in the global ocean volume-averaged temperature in Figure 3c reflects that the system is not in true equilibrium state after 500 years of simulations. In fact, the -0.1 W/m 2 loss of energy at TOA implies a cooling trend of about -0.04 K/century. The discrepancy indicates that the model does not exactly conserve energy. Such trend is also seen in other CMIP6 models (e.g., Golaz 2019;Held et al. 2019;Danagasoglu et al. 2020). Geoffroy et al. (2013) gave a solution of the time evolution of temperature under a constant forcing for a two-box model. They estimated the response time scale of the deep layer to be about 240 years. If the initial condition of the deep ocean is different from the equilibrium by ∆ = 1 , at the time of 500 years, the rate of temperature change in the deep layer is about 0.1 K/century. Clearly, a good initial guess (small ∆ ) is important to reach equilibrium faster. This guess field needs to include equilibrated temperature of both the mixed layer and the deep ocean. The trend in Figure 3c is therefore expected.
The Atlantic Meridional Overturning Circulation (AMOC) plays an important role in driving the global climate variation (Caesar et al., 2018). The simulated averaged AMOC in CAS-ESM2 over the piControl final 50 years is displayed in Figure 4. The North Atlantic Deep Water (NADW) between 500m and 4000m north of 35°S and the Antarctic Bottom Water (AABW) below 4000m are well captured by CAS-ESM2. The simulated maximum value of NADW is about 18.7Sv located at approximately 37°N, between 800 m and 1000 m. At 26°N, the maximum strength of NADW is approximately 17.4 Sv, which is slightly stronger than the 16.8 Sv inferred from observations for 2005-2014 (Cunningham et al., 2007;Smeed et al., 2018). Figure 5 shows the cross section of simulated and observed AMOC at 26N. The simulated peak value of AMOC is located at the right depth of 1000 m according to the present-day observations. The observed AMOC penetration depth as measured by the zero crossing in the stream function at 26N is ~4300 m. Although CAS-ESM2's AMOC penetration depth is ~300 m shallower than the observation, this shallow bias is much more reduced compared with some other models Zhang et al., 2011;Wu et al., 2019). The time series of AMOC strength at 26N in piControl simulation is shown in Figure 6. After 100 years spin-up period, the AMOC is statistically stable with multidecadal AMOC variability 0.89 Sv (standard deviation of 10 year low-pass filtered AMOC strength at 26N), which is weaker than but comparable to 1.4 Sv inferred from observation (Yan et al., 2018).

Mean climate and biases
a. The coupled system and the oceans We use the average of 1979-2014 to compare model results with observational estimates. Data periods will be otherwise noted if different from it. We first show the implied atmospheric and ocean transports of heat as a measure of the overall performance of the model dynamics in simulating the global scale circulations. This transport is inferred from the energy fluxes at topof-atmosphere (TOA) and at the surface from: where ℎ and ℎ are respectively the atmospheric moist static energy and heat density of the oceans; the bracket represents vertical integration over the atmosphere or oceans; ⃗ and ⃗ are atmospheric winds and ocean currents respectively, and ↓ are ↓ are net downward energy fluxes at the top of the atmosphere and at the surface, respectively. In the above equations, the atmospheric and oceanic kinetic energy is neglected because it is relatively small.
The implied heat transports in the atmosphere and the ocean, < ⃗ ℎ > and < ⃗ ℎ >, are calculated by assuming negligible change of the heat storage by setting the partial time derivatives in the above equations to be zero. Based on the trend of historical warming during 1979-2014, the time derivative term is around 2 W/m 2 at most latitudes, which is much less than the magnitude of the other terms.
The implied meridional heat transports simulated by the model are shown in Figure 7a. This is compared with observational estimate in Figure 7b. The overall structure of the transports in both the atmosphere (red solid line) and the ocean (blue dashed line) in the model are comparable with observational estimates. The simulated atmospheric northward heat transport in the northern hemisphere is smaller than observation, 4.5 PW versus 5.0 PW. The atmospheric southward heat transport in the southern hemisphere in the model is stronger than the observation, 6.0 PW versus 5.0 PW. The difference in the southern hemisphere atmospheric transport is compensated by weaker southward energy transport by the ocean.
The overall weaker implied ocean heat is more clearly seen in Figure 8 that shows the breakdown in different oceans. The model reasonably simulated the ocean heat transport in the Pacific and Atlantic. In the Atlantic, heat transport is northward across all latitudes, with magnitude comparable to the AMOC transport of 1.2 PW inferred from the observation in Johns et al. (2011). Deficiencies are larger in the Indian Ocean. Observation shows southward transport across all latitudes, but the modeled maximum value is about half of that and changes sign at 30S. This difference indicates too much energy coming from the atmosphere to the ocean in southern hemisphere high latitudes.
The simulated SST bias relative to observation are shown in Figure 9. We noticed similar spatial pattern of the SST biases to some other models despite drastically different atmospheric, ocean and land models (e.g., Golaz 2019;Held et al. 2019;Danagasoglu et al. 2020). Prominent warm biases are over the eastern Pacific, in the eastern hemisphere at southern latitudes around 60S, and south of Greenland. Prominent cold biases are in the western and mid-latitude Pacific and mid-latitude southern Atlantic, Arctic to the north of Europe, and in the ITCZ. The model has a double ITCZ in SST. Causes of some of these biases are generally understood, but they cannot be easily eliminated with the current resolution. For example, the warm biases along the California and Peru coasts are related with the model's inability to simulate the oceanic coastal upwelling with scale comparable to the model grid size, which further causes deficient marine boundary clouds, and thus warming. The warm bias south of Greenland could be partially related with the inability of the model in simulating the intensity of extreme storms that tend to mix heat downward.
The biases of the zonally averaged temperature in different basins simulated in the CAS-ESM2 historical experiments are shown in Figure 10. As a comparison, the biases from a stand-alone ocean/sea ice component forced by reanalysis atmospheric data are also shown. The initial states of these two simulations are the same. The coupled run and the reanalysis-forced run have similar bias patterns, suggesting that these biases in CAS-ESM2 have their primary origin in the ocean component, such as vertical turbulent mixing formula and so on. Subsurface cold biases in the Pacific Ocean (PO) and Indian Ocean (IO) are substantially reduced in the reanalysis-forced experiment, indicating biases in surface forcing in the coupled CAS-ESM simulation. A counter-intuitive feature is that the North Atlantic Deep Water (NADW) warm bias in the unforced simulation is larger than in the coupled experiment. The cause is not clear to us.
b. The atmosphere Figure 11 shows the distribution of net TOA radiation in the model and comparison with observation. The global mean bias is 1.54 W/m 2 , and the root mean square (RMS) error is 12.47 W/m 2 . The bias patterns of TOA net flux are somewhat consistent with the SST bias patterns over the ocean, but the cause-effect relationship is less clear. For example, simulated positive SST biases near the California coast can be the result of excessive downward radiative flux, but they may be the cause of deficient low clouds that lead to more downward radiation.
We note that in the net bias, there is compensation of biases in the longwave (LW) and shortwave (SW) radiation. The model simulated less outgoing LW radiation and less downward SW radiation (Figure 12). The deficient outgoing longwave radiation is in the clear-sky (not shown). Possible causes are negative bias in land-surface temperature and colder troposphere or positive bias in the upper troposphere water vapor in the model simulation as will be shown later. Sensitivity tests do not lead to a simple solution. We therefore tuned up the low cloud amount to reduce the incoming shortwave radiation. This is not ideal but used as a practical remedy to achieve radiation balance so that the model does not drift from reality in the coupled simulation. The longwave and shortwave cloud radiative forcing at TOA (LWCFR and SWCRF) are given in Figure 13. While the model captured the large-scale patterns in observations, the shortwave cloud forcing has a mean bias of about 10 W/m 2 , which is tuned to compensate for the deficient outgoing longwave radiation is in the clear-sky.
The simulated precipitation is in Figure 14. While the mean bias of 0.09 mm/day and the RMS error of 1.51 mm/day are better than the previous version of the model, the double ITCZ bias is still obvious. This bias feature is much muted in the AMIP simulation of the model (not shown). It is accompanied by the double ITCZ in SST. Air-sea interaction exacerbated the bias. Zhang et al. (2001) showed how an initial bias in precipitation in the western equatorial Pacific south of the equator can lead to spurious eastward ocean current above the thermocline, advecting warm water eastward to cause a positive feedback by inducing more atmospheric convection and precipitation.
Other precipitation biases include deficient amount of precipitation in South America and excessive precipitation along its western coast. Since precipitation over tropical Africa is reasonably simulated, we suspect that the lack of precipitation in South America is at least partly caused by the model's inability to capture the sharp gradient of the surface terrain of the Andies Mountain. The simulated Southern hemisphere storm track precipitation is displaced equatorward, with maximum at around 45S versus 55S in observation.
The zonally averaged temperature in the model and its bias are shown in Figure 15. Overall the model has a cold bias in middle and high latitudes. In the tropical troposphere, the temperature bias is less than 1K, but in the polar lower stratosphere, the cold bias is much larger. This polar bias is a common feature of many models. Possible causes include insufficient wave drag and meridional mixing since the model is not able to simulate the filaments of breaking resolvedscale waves, neither able to capture the inertial gravity waves excited by strong horizontal advection of temperature and momentum in the high-latitude upper troposphere and lower stratosphere. The CAS-ESM2 also simulated a cold bias in middle latitude throughout the troposphere between 30-40N/S, especially in the southern hemisphere. As expected from the temperature field, the simulated zonal winds in both hemispheres are displaced slightly equatorward and are stronger in the upper troposphere and lower stratosphere ( Figure 16) than in observations. This is consistent with the equatorward displacement of the storm track described earlier.
Except in the subtropical lower troposphere, the model simulated a moister atmosphere than in observation (Figure 17) from 20 o N-20 o S, suggesting biases in convection. Reduction of this bias could be achieved by increasing condensation. This may be carried out by reducing the evaporation rate of falling precipitation in the convection scheme, or tuning the large-scale fractional cloud scheme to decrease the grid-scale threshold relative humidity above which condensation occurs and clouds start form. However, such tuning may lead to cascading effects to precipitation and clouds, and so it is left to future improvement of the model.

c. The land
The simulated land-surface temperature is shown in Figure 18 for winter and summer separately. The overall biases are that the surface temperature is too cold in the winter, but too warm in the summer. In the boreal winter (DJF), the cold bias in northern Europe is as large as 5K that is accompanied by a warm bias in the Far East of Russia. These biases may be partially explained by biases in the simulated strength and orientation of the Iceland low and Aleutian low shown Figure 19. The simulated Iceland low is weaker than observation over the Northern Europe, leading anomalous circulation advecting cold air from the land to the region. The simulated Aleutian low is weaker than observation in the northwest Pacific, leading to anomalous southerly flow advecting warm air from the ocean to the Far East. Other likely causes of the prevalent cold bias are due to biases in model physics, especially the simulation stable boundary layers and low clouds. These warrant further study.
In the boreal summer, the overall warm temperature biases in the northern hemispheres are associated with smaller surface albedo. This is indicated by the overestimation in the clear-sky net downward shortwave radiation ( Figure 20). The large positive bias over Greenland suggests that snow albedo in the summer is overestimated, which needs to be further investigated. There is also strong indication that warm biases are associated with smaller latent heat flux, either because of insufficient rainfall or because of biases in soil moisture in the model (Figure 21). For North America, the warm bias has been shown as a common feature in many models that is attributed to insufficient heavy rainfall events (Lin et al. 2008).
d. Sea-ice Figure 22 shows the seasonal variation of the observed and simulated sea ice extent (SIE) in the northern and southern hemisphere. The SIE in CAS-ESM2 is too extensive in the winter and too confined in summer in the both hemispheres, indicating a higher model seasonal variation relative to observation. The RMS difference between the CAS-ESM2 and observed monthly seasonal cycle of Arctic SIE climatology is 1.56 × 10 6 km 2 . This value is very close to the CMIP5 median RMS error 1.45 × 10 6 km 2 (Shu et al., 2015). The corresponding RMS error of Antarctic SIE is 2.83 × 10 6 km 2 , which is slight lower than the CMIP5 median RMS bias 3.42 × 10 6 km 2 (Shu et al., 2015) .
The CAS-ESM2 Arctic winter SIE bias is mainly due to the overestimated sea ice concentration in the Greenland-Iceland-Norwegian and Bering seas, as indicated by the difference of the black and purple contours that represents the threshold of 15% coverage (Figure 23). This overestimation is consistent with the cold bias in March SST (Figure 24a) and air temperature over land. The negative summer SIE bias is in Canadian sea, Beaufort, East Siberian seas and Laptev sea (Figure 24a, b) which also can be explained by warmer bias in September SST at Canadian sea and East Siberian seas (Figure 24b) in the model than in observations. Surrounding the Antarctic, the summer SIC is underestimated in the whole southern hemisphere (Figure 23c) due to the warm bias of SST (Figure 24a). Winter SIE is underestimated in the eastern hemisphere, but overestimated in the western hemisphere ( Figure  23d), which can be all explained by the September SST bias (Figure 24b).

ENSO
The El Nino-Southern Oscillation (ENSO) is the strongest signal of climate variability on the time scale ranging from a few months to several years (Danabasoglu et al., 2006). Figure 25a and Figure 25b show the time series of the Niño-3.4 region (5°S-5°N, 170°-120°W) smoothed (5 points) and detrended monthly SST anomaly from one member of the CAS-ESM2 historical experiment and observation. Other members display similar variabilities. The simulated amplitude of Niño-3.4 is stronger relative to observation, which is common in many other models . The values of the corresponding standard deviation (SD) for CAS-ESM2 and observation are 1.67°C and 0.76°C, respectively. Additional experiments indicate that the amplitude is sensitive to the coupling frequency between the atmosphere and the ocean. Here a 3-hr coupling frequency is used. Recent diagnostics tools to measure the fidelity of ENSO simulations in models include the strength of the surface wind and energy flux responses to temperature SST perturbation, the strength of the ocean thermocline and mixed layer temperature to surface winds (Jin et al., 2006). These diagnostics will be conducted in future studies. Figure 26 shows the power spectra of the Niño-3.4 SST anomalies from the CAS-ESM2 historical experiment and observation. The CAS-ESM2 simulations and observation both show irregular variations. The observation shows double peaks between 3-4 yr and 4-7 yr which is significant at the 95% level. The power spectra amplitude of CAS-ESM2 is larger than observation, but with peak periods of about 3-4 yr and 4-7 yr, which are significant at the 95% level.
The spatial distributions of the standard deviations of tropical interannual SST anomalies (SSTAs), for the observation and CAS-ESM2 ensemble historical simulations are shown in Figure 27. The amplitude of SSTA variability in CAS-ESM2 is overestimated and the corresponding pattern is too extensive compared to observation, except for underestimating the variability along the South American coast. Besides, CAS-ESM2 also shows slightly stronger patterns of SSTA variability in the tropical Indian and Atlantic Oceans.

20 th Century Climate Change
The change of global temperature anomalies since 1850 from the historical ensemble and two observational products is displayed in Figure 28. The two observations from the National Oceanic and Atmospheric Administration (NOAA) National Climatic Data Center  and HadCRUT4.6 (Morice et al., 2012) are in good agreement. The CAS-ESM2 historical ensemble mean is given in red and the ensemble minimum and maximum in pink shading. The model did not simulate the warming from 1900 to 1940 as in observation. The simulated cooling trend from 1950 to 1980 is pronounced, more than in observation. As a result, despite the simulated warming after 1975, the overall warming is about 0.6K during since 1850, about 60% of what is observed.
We note that there are two decadal-scale periods with steep increase of sulfate aerosols, from 1900 to 1930, and from 1950 to 1980. These are the periods during which the model did not simulate the warming or simulated stronger cooling trend than in observation. This suggests that the likely bias of aerosol effect in the model, either from direct forcing or from aerosolcloud interaction or from emission, may be too strong. Possible connections between aerosol effects and the model's excessive cooling in these two periods are subject for a future study.
The time series of March and September SIE variation in the two hemispheres are shown in Figure 29. Though CAS-ESM2 simulated more SIE in March and less SIE in September, the declining trends of Arctic SIE in March and September are in quite reasonable agreement with the National Snow and Ice Data Center (NSIDC) observed trends. The ensemble mean trends of Arctic SIE in CAS-ESM2 are -0.35 and -0.90 × 10 6 km 2 per decade in March and September, respectively, which are close to the observed trends of -0.39 in March and -0.87 in September. The corresponding CMIP5 multi-model ensemble mean trends are -0.30 × 10 6 km 2 per decade in March and -0.57 × 10 6 km 2 per decade in September (Massonnet et al., 2012;Shu et al., 2015;Stroeve et al., 2012).
The trends of Antarctic SIE simulated by CAS-ESM2 fail to capture the slightly positive observed SIE trends (+0.21 × 10 6 km 2 per decade in March and +0.24 × 10 6 km 2 per decade in September), while CAS-ESM simulates negative trends in both Winter and September, which are similar to most CMIP5 models and GFDL CM4.0 (Held et al., 2019) . The simulated trends are -0.13 and -0.14 × 10 6 km 2 per decade in March and September, respectively, which are reasonable relative to the CMIP5 multi-model ensemble mean trends of -0.25 in March and -0.40 in September (Turner et al., 2013).

Climate sensitivity experiments
The climate sensitivity of the model as inferred from the 4xCO2 simulation using the Gregory et al. (2004) method is 3.42 K with a doubling of CO2 ( Figure 30). The slope is -0.91 W/m 2 /K. We have overlaid the clear-sky net flux as a function of temperature to the Gregory figure. A steeper slope of -1.09 W/m 2 /K is obtained. The difference between the two slopes suggests positive cloud feedback, since as temperature rises, the net radiation to the system is decreasing faster (primarily due to infrared cooling) in the clear-sky than in the total sky.
In Figure 30, the intercept of the linear fitting with the y-axis is approximately equal to the instantaneous radiative forcing of 4xCO2 after the fast atmospheric adjustments. The intercept of for the net clear-sky flux is different from that for the total-sky fluxes. The clear-sky flux has a higher intercept than the total sky flux, 8.3 W/m 2 versus 6.2 W/m 2 . The difference indicates the masking of CO2 greenhouse effect by clouds of about 1 W/m 2. when the CO2 concentration is quadrupled.
The cloud feedback can be also examined by using the change of cloud radiative forcing against temperature shown in Figure 31. The net cloud forcing (black dots) increases with temperature, indicating a positive cloud feedback. This positive cloud feedback is caused by the shortwave cloud feedback (red) that is partially offset by the negative cloud feedback from the longwave radiation (blue). The rates for the net, shortwave, longwave cloud feedbacks are 0.2, 0.3, -0.1W/m 2 /K respectively. Figure 32 shows the time evolution of the simulated temperature in the 1pctCO2 simulation as well as in the 4xCO2 simulation. In year 70 when the amount of CO2 doubles, the simulated warming, which is often referred to as the transient climate response (TCR, Meehl et al. 2020), is 2.4 K, in contrast to the equilibrium climate sensitivity (ECS) of 3.4 K. Danabasoglu et al. (2020) reported that the ECS and TCR in CESM2 are 5.3 K versus 2.0 K respectively, while Golaz et al. (2019) reported that the E3SM values are 5.3 K versus 2.9 K respectively. Relative to these models, the difference between the equilibrium climate sensitivity and TCR is smaller in CAS-ESM2. Zhang (2004) showed that the difference between ECS and TSR is proportional to the ECS itself, because in higher climate sensitivity models, more heat is penetrated to the ocean rather than going out to space in a transient climate change, and so the effect of ocean uptake is larger. This is consistent with the smaller equilibrium climate sensitivity in CAS-ESM2 relative to those in CESM2 and E3SM.

Summary and discussions
We have described the main components of CAS-ESM2. Notable features of the atmospheric model include the atmospheric model dynamical core, a two-plume convection scheme, the cloud macrophysical scheme, and modifications of the boundary turbulence and cloud microphysical schemes from those in the CAM5. The ocean model (LICOM) and the land model (CoLM) are developed at IAP. The CAS-ESM2 includes a biogeochemistry model, dynamic vegetation model, fire model, and an atmospheric aerosol and chemistry model that is coupled with other component models through the coupler. It has a two-way coupled function with the regional WRF model, also through the coupler. CAS-ESM2 is sufficiently different from other models to provide independent simulations of climate and its change.
The model is shown to simulate reasonable meridional transport of heat by the global atmosphere and the oceans. SST bias patterns share great similarities with those in the GFDL model despite using very different model components. The simulation errors include the warm bias along the western coasts of the continents, cold bias in the middle of the mid-latitude oceans, dipole bias in the North Atlantic, and warm biases in the eastern hemisphere of the southern high latitudes. The model has a double ITCZ bias and deficient precipitation over Amazon but much less bias in equatorial Africa. The simulated atmosphere is biased cold and wet. The northern hemisphere land is colder in the winter and warmer in the summer than in observations. The former is shown to be consistent with the bias in the simulated strength and orientation in the Iceland low, while the latter is associated with lower soil moisture and larger clear-sky downward shortwave radiation. Sea ice is overestimated in the spring and underestimated in the fall in both Arctic and Antarctic. In the northern hemisphere, the amplified seasonal cycle is consistent with the model cold bias in the winter and warm bias in the summer over land. In the southern hemisphere, the underestimation of sea ice in March is consistent with the warm SST bias in high latitude in the summer, while the overestimation of sea ice in the in September is caused by overestimation in the western hemisphere that is consistent with a cold bias in SST in the winter. The amplitude of the simulated ENSO is overestimated in the model simulation, with the simulated interannual variation of SST in the Niño-3.4 as 1.67 K relative to 0.76 K in observation. The irregularities of the 3-7 year period is captured by the model.
The model simulated a 20 th Century warming of about 0.6 K, approximately 60% of the observed warming, despite a similar slope of temperature trend after 1980. This is associated with the discrepancy of the simulated temperature in two periods, one from 1900 to 1930 during which the model did not simulate the observed warming trend, the other period from 1950 to 1980 during which the model simulated a cooling trend relative to observation. These two periods both correspond to large rate of increasing anthropogenic emission of aerosols. While we cannot rule out multidecadal interannual variability as a cause of these discrepancies, it is very likely that aerosol forcing, either direct or indirect, has cooled the model too much during these two periods. The equivalent equilibrium climate sensitivity of the model is 3.4 K from the 4xCO2 experiment while the transient climate sensitivity is 2.4 K from the 1pctCO2 experiment. Diagnosis of the 4xCO2 simulation indicates that the model has a positive cloud feedback.
In the development of CAS-ESM2, we were faced with the dilemma whether the model should be tuned to reproduce the magnitude of the 20 th Century warming. The tuning will need to either increase the climate sensitivity of the model or reduce the aerosol direct or indirect forcing, or to do both. Tuning to a higher climate sensitivity will make the warming trend after 1980 steeper in the model than in observation. Tuning the aerosol forcing is therefore more desirable. Danabasoglu et al. (2020) described the CESM2 experience of tuning the cloud microphysics to reduce the cooling from the aerosol-indirect effect on the 20 th Century warming as well as the impact of modified aerosol emission. We are not completely confident to rule out the possibility of overestimated change of historical emission of aerosols that might have led to the muted simulated warming. If emission is indeed the problem, then tuning the model processes to match the historical record would have an undesirable effect on the model projection of future climate change. Given the constraints in time and computational resources, we decided to leave the aerosol effects as they are in the model while continuing further tuning in our ongoing research.
Since CAS-ESM2 is a new model, it has not undergone extensive tuning against many available metrics as for some other models. As a result, by the standard of widely used metrics that model developers often tune the model to, the CAS-ESM2 is not likely among the best models. In the development process of the model, we have designed several parameterization modules with considerable room for future improvement. This is especially true for the atmospheric convection parameterization. The degree of tuning is a subjective judgment. Philosophically, the better a model matches existing observations, the more confidence one may have, with the exception of possible spurious compensation of errors. The desired level of consistency with observation is dependent on the types of applications for which the model is used. Results of CAS-ESM2 are presented here and submitted to the CMIP6 archive for critique by the community with the goal to offer an independent set of climate simulations and to continue improvement of the model for application studies.   Figure 22. Arctic (a) and Antarctic (b) sea ice extent (SIE) climatology (10 6 km 2 ) computed over 1979-2014 from satellite observations (black), the ensemble of the CAS-ESM2 historical simulations (red), and the CAS-ESM2 historical ensemble mean (thick red). The observed data is from the National Snow and Ice Data Center (NSIDC; Cavalieri et al., 1996). SIE is the areal sum of all grid points whose sea ice concentration (SIC) exceeds 15%.