Evaluation of onset, cessation and seasonal precipitation of the Southeast Asia rainy season in CMIP5 regional climate models and HighResMIP global climate models

Representing the rainy season of the maritime continent is a challenge for global and regional climate models. Here, we compare regional climate models (RCMs) based on the coupled model intercomparison project phase 5 (CMIP5) model generation with high-resolution global climate models with a comparable spatial resolution from the HighResMIP experiment. The onset and the total precipitation of the rainy season for both model experiments are compared against observational datasets for Southeast Asia. A realistic representation of the monsoon rainfall is essential for agriculture in Southeast Asia as a delayed onset jeopardizes the possibility of having three annual crops. In general, the coupled historical runs (Hist-1950) and the historically forced atmosphere runs (HighresSST) of the high-resolution model intercomparison project (HighResMIP) suite were consistently closer to the observations than the CMIP5-based RCMs used in this study. We find that for the whole of Southeast Asia, the HighResMIP models simulate the onset date and the total precipitation of the rainy season over the region closer to the observations than the other model sets used in this study. High-resolution models in the HighresSST experiment showed a similar performance to their low-resolution equivalents in simulating the monsoon characteristics. The HighresSST experiment simulated the anomaly of the onset date and the total precipitation for different El Niño-southern oscillation conditions best, although the magnitude of the onset date anomaly was underestimated.

KEYWORDS
CORDEX, GCM, HighResMIP, Indonesia, monsoon, RCM, Southeast Asia

| INTRODUCTION
Global climate models (GCMs) are the primary tools used to assess the impact of climate change. The reliability of GCMs in projecting climate change depends on the skill of the models in simulating the present-day climate (Räisänen, 2007). The representation of the hydrologic cycle in GCMs is still limited, although there have been improvements made on physical, biological and chemical processes (Ul Hasson et al., 2016). The spatial resolution of the models is one of the limitations that leads to a poor representation of the hydrologic cycle in GCMs, as many of the processes contributing to the hydrologic cycle, such as convection, need to be represented through parameterization schemes. With this limitation, the assessment of future climate change impacts over Southeast Asia (SEA) is challenging since this region has unique physiogeographical characteristics (Ul Hasson et al., 2016).
Earlier research used rainfall gauges and gridded sea surface temperature (SST) data for the reproduction of the three observed dominant rainfall patterns in the Indonesian region (Aldrian and Susanto, 2003). In a follow-up of that study, the GCM results were downscaled using a regional climate model (RCM) aiming to test the hypothesis that the poor results were related to inadequate representation of the rainfall characteristics and a topography that was too coarse (Aldrian et al., 2004). The use of the RCM embedded in the global model clearly improved rainfall due to the more realistic topography. Changing the resolution of the GCM from 1.125° to 0.5° resulted in a dramatic improvement in rainfall over most of the Indonesian archipelago as well. In addition, the authors concluded that a prerequisite to realistic simulations of precipitation patterns is to account for the SST, which is the major factor determining the quality of the simulations. The region with the most realistic SST values had the smallest bias. Motivated by these results, follow-up research was conducted using long-term high-resolution (HR) coupled climate model simulations (Aldrian et al., 2005). The coupling produced a more realistic representation of SST and a lower overestimation of rainfall over the sea. Compared to the previous research that used the uncoupled model, the HR coupled model simulated rainfall over the region more accurately.
The ability of the climate model to simulate El Niño-southern oscillation (ENSO) circulation improves as the spatial resolution increases (Shaffrey et al., 2009; Masson et al., 2012). In addition, the modelled Atlantic intertropical convergence zone (ITCZ) (Doi et al., 2012) behaves more realistically. Further improvements are expected in the latest generation of GCMs from the HR model intercomparison project (HighResMIP; Haarsma et al., 2016). These models are run at spatial resolutions similar to that of RCMs. For Europe, the HR GCM and the RCM show similar strengths and weaknesses in terms of daily precipitation distribution (Demory et al., 2020). For SEA, however, the added value of the HighResMIP approach in comparison to the downscaling approach of RCMs needs to be investigated, especially for the monsoon characteristics.
As global warming affects both the mean precipitation and the precipitation variability (Seager et al., 2012), a higher frequency of dry spells may be possible (Lintner et al., 2012). This condition may impact the monsoon rains, which are very important for agriculture in SEA (Marjuki et al., 2016). The onset of the monsoon season is essential for farmers in SEA. Observations have shown a negative correlation between paddy rice yield and a delayed onset for some Southeast Asian countries (Marjuki et al., 2016). A delayed onset, which is projected in some climate scenarios, can prevent a farmer from having three annual crops (Naylor et al., 2002).
The monsoon characteristics relate not only to local conditions but also to large-scale structures like the ITCZ (Nieuwolt et al., 1977; Aldrian and Susanto, 2003). This means that a global HR model might have an advantage over a low-resolution (LR) GCM with an embedded RCM as the former is expected to simulate large-scale structures more realistically. The HR models of HighResMIP may show similar levels of detail and realism in precipitation over complex topography as the RCMs, combined with the benefit of a more correct large-scale ITCZ.
This study aims to assess model performance in simulating the monsoon characteristics for SEA. We also investigate the effect of ENSO on the monsoon. A robust climate change impact assessment in the region must be based on climate models that have a good representation of the onset of the monsoon rainfall. Here, we will compare the HighResMIP against observations and the downscaled result of the Coupled Model Intercomparison Project phase 5 (CMIP5) from the coordinated downscaling experiment (CORDEX) SEA simulations (Giorgi et al., 2012; Yang, 2012; Juneng et al., 2016; Ngo-Duc et al., 2017; Supari et al., 2020; Tangang et al., 2020), focusing specifically on monsoon characteristics. We will also compare the HR configuration of the HighResMIP models against their LR equivalents.

| Description of the study area
The SEA domain that we analyse covers the area between 12.5° S-24.5° N and 92.5° E-142.5° E and includes the northern part of Myanmar, Laos, Vietnam, Thailand, Cambodia, Malaysia and Indonesia. The wet season in SEA is part of the Australia-Asia monsoon system. The system is situated between the centre of the Asian summer monsoon and Australian summer monsoon and spans the East Asia region, including the northern part of Australia (Wang, 2006).
The ITCZ crosses the equatorial region (between 10° and −5° latitude) twice a year. Most of the area in this region has two rainy seasons, the first during boreal spring and the second during boreal autumn with a higher peak in the second season (Aldrian and Susanto, 2003). The temporal development of the monsoon over the region has been described in earlier publications (Hamada et al., 2002; Aldrian and Susanto, 2003; Moron et al., 2009).

| Observation
In SEA, the density of gauges and the long-term availability of rainfall time series are limited, which makes the development of a dataset for daily precipitation amounts based on in-situ measurements challenging (Van den Besselaar et al., 2017; Singh and Xiaosheng, 2019). To account for the uncertainty in observed rainfall, a total of three gridded daily observational datasets are used in this study. The first dataset is the Southeast Asia Observation (SA-OBS) (Van den Besselaar et al., 2017). SA-OBS is a daily HR land-only observational gridded dataset for precipitation and minimum, mean and maximum temperatures covering the SEA region. This dataset is used on its 0.5° by 0.5° regular latitude-longitude grid for the period from 1981 to 2016. The observational data on which SA-OBS is based are collected by the Southeast Asian Climate Assessment & Dataset (SACA&D), a cooperation between Indonesia's meteorological service and other meteorological services in the region (Van den Besselaar et al., 2015). As a result of this cooperation, 1,394 precipitation stations, 365 stations with minimum and maximum temperature, and 274 stations with daily mean temperature have been selected.
The second dataset is the Asian Precipitation Highly Resolved Observational Data Integration towards Evaluation of Water Resources (APHRODITE; Yatagai et al., 2012). The APHRODITE (APHRO) daily precipitation data were created by collecting and analysing rain gauge observation data across Asia. The interpolation algorithm for the latest version of APHRO is similar to that presented by Yatagai et al. (2009) with improvements in weighting function to consider the effect of mountain ranges by giving high weight to gauges on slopes inclined to the target location and low weight to gauges on the leeward side behind a mountain ridge. The dataset is used to evaluate the model skill in simulating daily precipitation, including the precipitation extremes.
The third dataset is the Climate Hazards group Infrared Precipitation with Stations v2.0 (CHIRPS; Funk et al., 2015). The CHIRPS product provides daily precipitation data at a spatial resolution of 0.05° for the quasi-global coverage of 50° N-50° S from 1981 to the present. CHIRPS was created using data from rain gauge stations collected from the Food and Agriculture Organization and the Global Historical Climate Network, the Cold Cloud Duration information based on thermal infrared data archived from the Climate Prediction Center and the NOAA National Climate Data Center (NCDC), the Version 7 TRMM 3B42 data, the Version 2 atmospheric model rainfall field from the NOAA Climate Forecast System, and rain gauge station data from multiple sources.
There is good agreement between the observation datasets on the mean onset of the rainy season in SEA (Figure 1). Figure 1d shows that the general movement of the onset of the rainy season is in line with the movement of the ITCZ. The zonal mean plot shows that the onset date moves to later dates for latitudes closer to the equator, but local detail is lost in this aggregated view. The Maritime Continent of SEA has a complex seasonal cycle of rainfall, which means that some areas end up having a local characteristic that is different from the movement of the ITCZ (Aldrian and Susanto, 2003).
Among the three gridded rainfall datasets, the SA-OBS dataset was specifically generated for SEA using gauged rainfall stations. The dataset has been considered more accurate and reliable than other datasets (Van den Besselaar et al., 2017; Ge et al., 2019), but the restrictions on the search radius in the interpolation method mean that no interpolated values are calculated in areas that are too data-sparse. The other two datasets use other approaches or more permissive interpolations and cover the complete land area of SEA.

| Model data
In this study, two classes of model experiment results are compared to the observations. The first model output is the downscaled version of CMIP5. We use six downscaled CMIP5 model datasets. The data of CNRM, CSIRO, EC-Earth and MPI were downscaled using RegCM4 (Giorgi et al., 2012) by CORDEX SEA (Ngo-Duc et al., 2017; Supari et al., 2020; Tangang et al., 2020), whereas HadGEM was downscaled using regional Weather Research and Forecasting (WRF3.5) (Skamarock et al., 2008) by the Asia-Pacific Economic Cooperation Climate Centre (APCC) (Yang, 2012). The RCMs were run over the historical period 1971-2005. We will refer to the six model datasets as CORDEX.
The second model experiment output is from the HighResMIP experiment. The HighResMIP data are available from the H2020-funded Primavera project. In this project, GCMs were run at a spatial resolution comparable to that of the CORDEX models. We use two experiments of HighResMIP in this study. The first HighResMIP experiment is the coupled historical runs for the period 1950-2014 (Hist-1950). Fixed historical atmosphere and SST forcing from 1950 was applied for the spin-up period, after which a historically evolving forcing was imposed (Haarsma et al., 2016). Six HR models from the Hist-1950 experiment were used, including EC-Earth, MPI (Müller et al., 2018), HadGEM (Roberts et al., 2019), CMCC (Cherchi et al., 2019), CNRM (Voldoire et al., 2019), and ECMWF (Roberts et al., 2018). EC-Earth and ECMWF are available in two and four members, respectively. For other models, there is only one member available. Later in this study, the HR of the Hist-1950 model experiment will be called HR-Hist-1950.
The second HighResMIP experiment is the historically forced atmosphere run for the period 1950-2014 (HighresSST). This experiment used the daily 1/4° HadISST2-based dataset as the SST and sea-ice forcing. With the extended period and HR simulation, it is expected that this experiment will improve upon the realism of the ENSO teleconnection (Sterl et al., 2007). This study used HR and LR simulations from the HighresSST. The same six models from the Hist-1950 experiment were used for the HighresSST experiment. There is only one member available for the CMCC, CNRM and MPI, both for the HR and LR simulations. For EC-Earth, there are three members available both for the HR and LR simulations. For ECMWF, there are four and eight members available for the HR and LR simulations, respectively. For the HadGEM, three and five members are available for the HR and LR simulations, respectively. More detailed information on the HighResMIP experiment can be found in Haarsma et al. (2016). Four of the CORDEX models are also available in HighResMIP (CNRM, EC-Earth, HadGEM and MPI). The remaining two models are different (CSIRO and GFDL for CORDEX, and CMCC and ECMWF for HighResMIP), which is due to the limitation of the data that are available from the CORDEX-SEA and Primavera projects. In addition to the CORDEX and HighResMIP datasets, the EC-Earth GCM from CMIP5 with an ensemble of four members was used to support the conclusions, each member representing a perturbation in realization.
For the analysis, we interpolate the model and observation datasets to the same grid resolution; the reference resolution is 0.5°. We used bilinear interpolation for datasets with a higher resolution than the reference, and nearest-neighbour interpolation for datasets with a lower resolution than the reference. Information about the model configurations is shown in Table 1.
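The two regridding rules can be illustrated with a minimal sketch in plain Python. This is not the processing chain used for the analysis; the function names, the nested-list layout `field[lat][lon]` and the clamping at the grid edges are assumptions made for illustration only.

```python
from bisect import bisect_right

def nearest_index(coords, x):
    """Index of the value in `coords` that is closest to x."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - x))

def regrid_nearest(field, lats, lons, new_lats, new_lons):
    """Nearest-neighbour regridding of field[lat][lon] to a target grid."""
    return [[field[nearest_index(lats, la)][nearest_index(lons, lo)]
             for lo in new_lons] for la in new_lats]

def _bracket(coords, x):
    """Return (i, w) with x = (1 - w) * coords[i] + w * coords[i + 1].

    `coords` must be ascending; w is clamped to [0, 1] so that points
    outside the source grid are not extrapolated (an assumption).
    """
    i = min(max(bisect_right(coords, x) - 1, 0), len(coords) - 2)
    w = (x - coords[i]) / (coords[i + 1] - coords[i])
    return i, min(max(w, 0.0), 1.0)

def regrid_bilinear(field, lats, lons, new_lats, new_lons):
    """Bilinear regridding of field[lat][lon] to a target grid."""
    out = []
    for la in new_lats:
        i, wy = _bracket(lats, la)
        row = []
        for lo in new_lons:
            j, wx = _bracket(lons, lo)
            # interpolate along longitude on both bracketing latitude rows
            south = (1 - wx) * field[i][j] + wx * field[i][j + 1]
            north = (1 - wx) * field[i + 1][j] + wx * field[i + 1][j + 1]
            # then interpolate along latitude
            row.append((1 - wy) * south + wy * north)
        out.append(row)
    return out
```

In practice a dedicated regridding tool would also handle missing values and land-sea masks, which this sketch ignores.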

| Season onset definition
There are various ways to quantify the onset (and cessation) of the rainy season. In the model context, it is possible to calculate sophisticated indices (including, for example, wind direction at higher elevations and reference evapotranspiration) (Zeng and Lu, 2004; Diong et al., 2019; Wati et al., 2019), but the observational datasets lack parameters other than temperature and precipitation, which limits us to applying the most straightforward indices. Marjuki et al. (2016) compared a few of these simple indices. Although the indices differ quantitatively (while being strongly correlated), we used the onset definition developed by Liebmann et al. (2007), as this index is applicable to the variations of climate in SEA.
We used the Liebmann et al. (2007) definition to define the rainy season onset and retreat. This definition is based on time series of daily sums of precipitation, which makes it specific for each location. It is calculated by
$$A(\mathrm{day}) = \sum_{n=1}^{\mathrm{day}} \left[ R(n) - \bar{R} \right],$$
where $A$ is the 'anomalous accumulation', $R(n)$ is the daily precipitation and $\bar{R}$ is the yearly average daily precipitation. The onset (retreat) of the wet season is defined as the absolute minimum (maximum) of $A$, indicating that the daily precipitation total from that date onwards is larger (lower) than the average daily precipitation.
The approach followed by Liebmann et al. (2007) used average daily precipitation calculated over a climatological mean period, but in this study, we use precipitation amounts calculated annually. This approach allows the onset and the retreat dates to be calculated for every year, including excessively wet or dry years, which may not be the case when using the climatological values (Marjuki et al., 2016).
In this paper, the accumulation period starts on January 1st for every year. As a consequence, the onset date calculated for areas with two rainy seasons in the equatorial region represents the second rainy season. The index is calculated when at least 350 days of the year have non-missing data. This condition is only relevant for SA-OBS and APHRO as they have some missing data. Furthermore, we calculate the onset for all available members of the ensemble and show the mean of the ensemble.
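As an illustration of the definition above, the following minimal sketch computes the onset and retreat for one grid cell and one year. The function name, the use of `None` for missing days and the choice to let a missing day contribute no anomaly are assumptions of this sketch, not part of the published method.

```python
def onset_and_retreat(daily_precip, min_valid_days=350):
    """Onset and retreat of the rainy season for one location and one year.

    daily_precip : daily precipitation totals starting on January 1st,
                   with None marking a missing day.
    Returns (onset_day, retreat_day) as 1-based days of the year, or None
    when fewer than `min_valid_days` values are available.
    """
    valid = [r for r in daily_precip if r is not None]
    if len(valid) < min_valid_days:
        return None
    mean_r = sum(valid) / len(valid)      # yearly average daily precipitation
    accumulation, curve = 0.0, []
    for r in daily_precip:
        # assumption: a missing day adds nothing to the anomalous accumulation
        accumulation += (r - mean_r) if r is not None else 0.0
        curve.append(accumulation)
    onset = curve.index(min(curve)) + 1   # day of the absolute minimum of A
    retreat = curve.index(max(curve)) + 1 # day of the absolute maximum of A
    return onset, retreat
```

For a synthetic year with 100 dry days, 165 wet days and 100 dry days, the accumulation curve falls until day 100 and rises until day 265, so the sketch returns those two days as onset and retreat.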

| Validation methods
We use the Taylor diagram to compare the climatological mean values of the onset dates of the climate models. The x-axis and y-axis show the normalized standard deviation (SD). We use the observation SD for the normalization. Therefore, the closer the model normalized SD is to 1, the closer the model SD is to the observation SD. The curve axis shows the correlation between the spatial distribution of the model and the observation. The correlation calculation was made over the complete land area, so the number of grid boxes is the sample size for the correlation calculation.
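The two quantities plotted in the Taylor diagram can be sketched as follows. The function name and the flat-list input (one climatological value per land grid box, in the same order for model and observation) are assumptions for illustration.

```python
from math import sqrt

def taylor_statistics(model, obs):
    """Return (normalized SD, spatial correlation) for paired grid-box values."""
    n = len(obs)
    mean_m, mean_o = sum(model) / n, sum(obs) / n
    var_m = sum((m - mean_m) ** 2 for m in model) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    sd_m, sd_o = sqrt(var_m), sqrt(var_o)
    # the model SD is normalized by the observation SD; the correlation is
    # the Pearson correlation across grid boxes
    return sd_m / sd_o, cov / (sd_m * sd_o)
```

A model field that is exactly twice the observed field would give a normalized SD of 2 and a correlation of 1, that is, the correct pattern with too much spatial variability.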
We also calculate bias and normalized root mean square error (NRMSE) based on the climatological mean and median. In the calculation, the climatological mean and median were calculated for each grid cell. Furthermore, the indicators were calculated for each model, using the number of grid cells as the sample size. The climatological bias (mean difference) and median difference are used to quantify the similarity in the climatological condition between models and observations. The NRMSEs are used to measure the deviation between simulated and observed values (Randall et al., 2007). The NRMSE value is expressed as a percentage (%), and it uses the standard deviation of the observation data to normalize the RMSE.
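A minimal sketch of these two indicators, assuming one climatological value per grid cell (either the mean or the median) and the population standard deviation of the observations as the normalizer; the function name is hypothetical.

```python
from math import sqrt
from statistics import pstdev

def bias_and_nrmse(model_clim, obs_clim):
    """Return (bias, NRMSE in %) over paired grid-cell climatologies."""
    diffs = [m - o for m, o in zip(model_clim, obs_clim)]
    bias = sum(diffs) / len(diffs)                    # mean model-minus-obs difference
    rmse = sqrt(sum(d * d for d in diffs) / len(diffs))
    nrmse = 100.0 * rmse / pstdev(obs_clim)           # normalized by the observation SD
    return bias, nrmse
```

Calling the function once with the per-cell climatological means and once with the per-cell medians gives the two variants of each indicator described in the text.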
The three observation datasets (SA-OBS, APHRO and CHIRPS) were combined into one pooled observation series as the reference. Combining these three datasets in this way gives us a straightforward way of accounting for the uncertainty of the observational estimates. A model simulation is compared to the pooled observations so that the comparison metric, like correlation, is based on a comparison against all three observational datasets. Furthermore, for the models run in ensemble mode, the ensemble-mean of the statistic indicator values is used.
In addition, we use the two-sample Kolmogorov-Smirnov test (K-S index) as an evaluation metric (Wilks, 2011). The indicator evaluates the difference in cumulative distributions between the observations and the model simulations. A smaller Kolmogorov-Smirnov statistic (K-S index) value indicates a better representation of the data distribution by the model simulation. The K-S index was calculated for each grid box, using every year in the time range to build the cumulative distribution. For the cumulative distributions, the model members need not be averaged and are pooled into the sample.
TABLE 1 Description of the models, showing for each model in the first column, its horizontal resolution of the global model as used in CMIP5 and the resolution of the regional model from CORDEX for which it provided lateral boundaries
In general, we found a similar pattern of the mean rainy season onset in the models (Figure S1) as compared to the observations (Figure 1). The onset progresses from May for the northern part of the region to the end of the year for the southern part of the region. Some models also reproduce the contrasting onset patterns of some areas in Sulawesi, Maluku and Papua (−5°-2° latitude and 120°-135° longitude) that have a different annual rainfall pattern than other areas at the same latitude (Aldrian and Susanto, 2003). The Taylor diagram (Figure 2) shows that, in general, the SD values of the CORDEX models are closer to the observations than those of the HighResSST and Hist-1950 models. In addition, the spatial correlations of the climatological onset of the rainy season, computed against the reference based on the pooled observational datasets, are between 0.6 and 0.8 for most models. Hence, the simulations describe the movement of the monsoon well. The HighResMIP simulations show (considerably) higher spatial correlation with the observations than the CORDEX simulations.
Among the HighResMIP simulations, we find a similar spatial correlation between the atmospheric model with the prescribed SST experiment (HighResSST) and the coupled model experiment (Hist-1950). This is also the case between HR and LR of HighResSST. Overall, there is no significant improvement of CORDEX results compared to the single CMIP5 model. Also, there is no significant improvement of HR HighResSST compared to LR HighResSST. However, in some models, like the MPI model, the spatial correlation of HighResSST is higher than that of Hist-1950. For the CNRM, the spatial correlation of the LR HighResSST is higher than that of the HR HighResSST.
Figure 3 presents the model bias of the climatological onset of the rainy season. We find a similar pattern of bias distribution between the three model experiments. The bias is larger for the region around 5° S-10° N, where the models tend to have an early onset compared to the observations, except for the area around 115-140° E, where the models tend to have a late onset compared to the observations.
Figure 4 presents a boxplot of the climatological bias and median difference of the models in simulating the onset of the rainy season. This figure shows data aggregated over the land area of the domain. The simulated monsoon onset in the CORDEX experiment is slightly closer to the observation as compared to a single CMIP5 model. However, a better simulation is shown with the HighResMIP experiments as indicated by the bias and median difference. Most HighResMIP models have biases within ±25 days for the majority of the grid cells, which is much less than what is simulated in CORDEX. In terms of the HighResMIP experiments, the percentage of grid cells with biases within ±25 days amounts to 58-75% for LR HighResSST, 56-74% for HR HighResSST, and 57-73% for HR Hist-1950. For CORDEX, however, this value amounts to 50-58%, and for the single CMIP5 model to 55% (Figure S2).
The enhanced realism is confirmed by the NRMSE values: the average for HighResMIP models is 63% for LR HighresSST, 64% for HR HighresSST and 69% for HR Hist-1950. The average NRMSE for CORDEX models is 80%, which is slightly lower than the NRMSE of 83% for the single CMIP5 model (Figure S3).
The two-sample Kolmogorov-Smirnov test (K-S index) (Wilks, 2011) evaluates the difference in cumulative distributions between the observations and the model simulations. A small indicator value represents a close similarity between observations and model results. The K-S index for the onset of the rainy season confirms that the HighResMIP models represent the onset date distribution substantially better than CORDEX models. Meanwhile, we find a similar performance between CORDEX models and a single CMIP5 model (Figure 5).
Among the CORDEX models, the ones with lateral boundary conditions from the global EC-Earth simulation show high biases (Figure 4), NRMSE (Figure S3) and K-S index (Figure 5) of the onset of the rainy season. On average, CNRM, CSIRO and HadGEM tend to delay the onset, whereas the downscaled EC-Earth, GFDL and MPI tend to advance the onset. Compared to other CORDEX models, CNRM shows the smallest biases and MPI shows the lowest NRMSE (Figure S3) and K-S index (Figure 5).
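For reference, the K-S statistic between two samples (for example, the yearly onset dates at one grid box from the observations and from the pooled model members) can be sketched as the largest gap between their empirical cumulative distributions. The function name is hypothetical and this is an illustrative implementation, not the code used in the study.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # advance both pointers past every value <= x so ties are handled
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap
```

Identical samples give 0 and fully separated samples give 1, so a smaller value means the model reproduces the observed distribution more closely.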
Among the HighResMIP experiments, HighResSST shows slightly smaller bias, K-S index and NRMSE compared to the Hist-1950. Furthermore, comparisons between HR and LR of HighresSST based on the climatological mean and median biases, the NRMSE and the K-S index show similar skill in simulating the onset for both resolutions.
In the HighResMIP, ECMWF shows the smallest biases and also the lowest NRMSE and K-S index. For the LR HighResSST experiment, the CNRM, HadGEM and MPI tend to overestimate (late) the onset date, while EC-Earth, CMCC and ECMWF tend to underestimate (early) the onset date. For the HR HighResSST experiment, most of the models, except for the CNRM, tend to underestimate (early) the onset date. For the HR Hist-1950 experiment, the CNRM, HadGEM and MPI tend to overestimate (late) the onset date, while EC-Earth tends to underestimate (early) the onset date.

| Total precipitation
We now turn our attention to the simulation of the accumulated rainfall (total rainfall) over the monsoon season. The total rainfall of the rainy season is calculated based on the 6 months following the zonal mean onset of the wet season (Figure 1d) and is the cumulative value over these 6 months. A comparison of the three different climate model experiments in terms of the spatial correlation in total rainfall maps shows that the downscaling process in the CORDEX improves the spatial correlation of the total precipitation, although the similarity with observations is not as high as in the HighResMIP (Figure 6). The CORDEX models have spatial correlations with the observations below 0.2, except for CNRM. For the HighResMIP, most of the models have a spatial correlation with the observed precipitation pattern exceeding 0.2, except for CMCC and MPI. HadGEM (MPI) shows the highest (lowest) spatial correlation among the HighResMIP models. Overall, we find a low correlation between modelled and observed spatial distribution of the total precipitation of the rainy season in the region. In terms of the SD value, we find that the HighResMIP and the single CMIP5 model show SD values closer to the observations than the CORDEX models.
Figure 7 shows the bias and difference in the median between observed and modelled total precipitation, where the precipitation accumulates over the rainy season. This figure clearly shows that the CORDEX models deviate more from observations than the HighResMIP and the single CMIP5 model. Most of the HighResMIP models and the single CMIP5 model have an NRMSE smaller than 150%, which is considerably smaller than the NRMSE of the CORDEX models, which are above 300% (except for the CSIRO model, which is relatively low at 250%; Figure S4).
The K-S index values of the HighResMIP models are significantly lower than those of the CORDEX models. This indicates a closer resemblance of the HighResMIP model output to observations in terms of the cumulative distribution of total precipitation (Figure S5).
It is an unexpected result that the downscaled CMIP5 CORDEX models perform badly in terms of the mean and median biases, NRMSE and K-S index for the total amount of precipitation during the rainy season. One of the reasons was explained by Juneng et al. (2016), who found that the MIT-Emanuel convective scheme (Emanuel and Živković-Rothman, 1999) that was used in the RegCM4 RCM had simulated large positive precipitation biases. This condition explains the high simulated rainfall variability found by Nguyen-Thi et al. (2021). As a result, we found that the total precipitation for the rainy season was very high for the CORDEX models. Similar results were also found by Amsal et al. (2019), using the same configuration of RegCM4 RCM for CSIRO Mk3.6 over Indonesia. The total rainfall bias they found was ±500 mm/month.
Among the models in the HighResMIP experiment, we found a slight improvement of the HR HighResSST compared to the LR HighResSST in the spatial correlation. However, the same improvement is not found in the biases, NRMSE and K-S index. Overall, we observe similar performance between HighResSST and Hist-1950 experiments and between HR and LR HighResSST. All models excluding MPI and CNRM tend to overestimate the total precipitation. Compared to other HighResMIP models, EC-Earth and CNRM show the smallest and the largest biases, respectively.

| ENSO composite analysis for the onset simulation
FIGURE 2 Taylor diagram of the onset of the rainy season. The x-axis (y-axis) shows the normalized SD of the onset date. The curve axis shows the spatial correlation

3.2.1 | Understanding the effect of ENSO on the seasonal rainfall and the onset of the rainy season
In this ENSO composite analysis, we select the El-Niño years (1982, 1987, 1991, 1997, 2002 and 2004) and the La-Niña years (1988, 1995, 1998, 1999 and 2000) based on the Oceanic Niño Index (ONI) (NOAA, 2020). The El-Niño (La-Niña) years are selected using a threshold of eight consecutive months with positive (negative) ONI values. Figure 8 shows the observed cumulative rainfall anomaly in the seasonal periods of December-February (DJF), March-May (MAM), June-August (JJA) and September-November (SON) during the El-Niño years.
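A composite anomaly for one grid cell and one season can be sketched as below. The year lists are those given in the text; the function name and the choice of the mean over all available years as the reference climatology are assumptions of this sketch, since the baseline is not specified here.

```python
EL_NINO_YEARS = (1982, 1987, 1991, 1997, 2002, 2004)
LA_NINA_YEARS = (1988, 1995, 1998, 1999, 2000)

def composite_anomaly(values_by_year, event_years):
    """Mean over the event years minus the mean over all available years.

    values_by_year : dict mapping year -> seasonal rainfall total (or onset
                     date) at one grid cell.
    event_years    : years to composite, e.g. EL_NINO_YEARS.
    """
    # assumption: the climatology is the mean over every available year
    climatology = sum(values_by_year.values()) / len(values_by_year)
    selected = [values_by_year[y] for y in event_years if y in values_by_year]
    return sum(selected) / len(selected) - climatology
```

Applied to every grid cell and to each of the four seasons, this yields anomaly maps of the kind shown for the El-Niño and La-Niña composites.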
FIGURE 6 Taylor diagram of the total precipitation of the rainy season. The x-axis (y-axis) shows the normalized SD of the total precipitation of the rainy season. The curve axis shows the spatial correlation
During El-Niño years, the negative rainfall anomaly in DJF is located more over the Philippines and the northern part of Borneo and Sulawesi. A similar pattern is also found in MAM, but it is more spread out to the Malaysia Peninsula and the northwestern part of SEA. Meanwhile, during JJA and SON, the negative rainfall anomaly is located more over the southern part of SEA (below 5° N). In addition, for SON, the negative rainfall anomaly is spread more to the Philippines and Vietnam. We also find that the positive rainfall anomaly during La-Niña years has a similar pattern to the negative anomaly during El-Niño years (Figure S6). These results are also found in an earlier study that analysed the correlation between precipitation and the Niño3.4 index (Trouet and Van Oldenborgh, 2013). The dominant ocean surface currents during DJF and MAM flow from the Pacific Ocean to the north of Papua and continue northward to the Philippines before flowing southward to the Indonesian region through the South China Sea. Therefore, the effect of the ENSO condition in those two periods is stronger over the northern part of SEA (above 0° latitude). Conversely, in JJA and SON, the dominant surface current flows directly to the Indonesian region from the Pacific Ocean through the northern part of Papua. As a result, the effect of the ENSO condition is stronger for the southern part of SEA (below 5° N; Wyrtki, 1961; Aldrian et al., 2007).
The effect of ENSO varies with time and place in SEA, and its impact on the onset and the retreat dates of the rainy season also varies with location. The onset of the rainy season starts in the northern part of SEA and progresses to the southern part during the period from May to the end of the year. In this period, the impact of ENSO is stronger for the southern part of SEA (below 5°N). As a result, we find more anomalies in the onset dates for this region. We find late (early) onset dates in El-Niño (La-Niña) years for the southern part of SEA. In contrast, there is a tendency towards early (late) onset dates over the Philippines and the northern part of Borneo (Figure 9a).
In contrast to the effect on the onset dates, we find an early (late) retreat date in El-Niño (La-Niña) years for the northern part of SEA (above 0° latitude). The retreat of the rainy season generally starts in the northern part of SEA and moves to the southern part from September to May. The retreat date for the southern part of SEA (below 5°N) occurs in the period between January and May. In this period, the effect of ENSO is weak over this region. Therefore, the retreat date anomaly for the southern part of SEA contrasts with that for the northern part. Most of the southern part of SEA tends to have a late (early) retreat date in El-Niño (La-Niña) years (Figure 9b).
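A composite anomaly of the kind discussed above can be sketched for a single grid cell as follows. This is a minimal illustration under stated assumptions: the anomaly is taken relative to the mean over all available years (the reference period is not specified in this section), the function name is hypothetical, and the onset dates are synthetic day-of-year values rather than observations.

```python
from statistics import mean


def composite_anomaly(value_by_year, composite_years):
    """Mean over the composite years minus the mean over all years.

    Assumption: the climatological reference is the mean of every year
    present in value_by_year. A positive result means a later date.
    """
    climatology = mean(value_by_year.values())
    selected = [value_by_year[y] for y in composite_years if y in value_by_year]
    return mean(selected) - climatology


# Synthetic onset dates (day of year) for one hypothetical grid cell.
onset_by_year = {1996: 300, 1997: 330, 1998: 280, 1999: 285, 2000: 290, 2001: 303}

el_nino_anomaly = composite_anomaly(onset_by_year, [1997])
la_nina_anomaly = composite_anomaly(onset_by_year, [1998, 1999, 2000])
```

In this synthetic case the El-Niño composite is positive (late onset) and the La-Niña composite is negative (early onset). Repeating the calculation for every grid cell would give an anomaly map.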
3.2.2 | Model performance of the effect of ENSO on the seasonal rainfall and the onset of the rainy season
In this section, we investigate the performance of the LR and HR models of the HighresSST experiment in simulating the anomaly of the onset in El-Niño and La-Niña years. The HighresSST experiment was forced by observed SSTs, which means that the models can be compared directly to the observations of the ENSO composite analysis. Figure 10 shows the Taylor diagram of the rainfall anomaly for the seasonal periods during El-Niño and La-Niña years. Overall, based on the spatial correlation, the models simulate the rainfall anomaly better for El-Niño than for La-Niña. However, the SD of the models is closer to the observations when simulating the rainfall anomaly for La-Niña than for El-Niño.
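The two quantities read from a Taylor diagram, the normalized SD and the spatial correlation, can be sketched as below. This is an illustrative calculation only: it treats each field as a flat list of grid-cell values and applies no area weighting, which is an assumption and may differ from the processing behind Figure 10. The function name and the values are hypothetical.

```python
import math


def taylor_stats(model, obs):
    """Return (normalized SD, pattern correlation) of two equal-length fields.

    Assumption: unweighted statistics over grid cells (no latitude weights).
    """
    n = len(obs)
    model_mean = sum(model) / n
    obs_mean = sum(obs) / n
    model_dev = [m - model_mean for m in model]
    obs_dev = [o - obs_mean for o in obs]
    model_sd = math.sqrt(sum(d * d for d in model_dev) / n)
    obs_sd = math.sqrt(sum(d * d for d in obs_dev) / n)
    covariance = sum(a * b for a, b in zip(model_dev, obs_dev)) / n
    return model_sd / obs_sd, covariance / (model_sd * obs_sd)


# Synthetic anomaly fields: the model has the right pattern, half the amplitude.
observed = [-4.0, -2.0, 0.0, 2.0, 4.0]
modelled = [-2.0, -1.0, 0.0, 1.0, 2.0]

normalized_sd, correlation = taylor_stats(modelled, observed)
```

A model with the correct pattern but a damped amplitude, as in this synthetic case, plots at a high correlation with a normalized SD below one.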
Furthermore, based on the spatial correlation values, we find that both resolutions (HR and LR) of HighresSST capture most of the general pattern of the seasonal rainfall anomaly (Figures S7-S14). This is shown by the spatial correlation of the majority of the models, which ranges between 0.4 and 0.6 (Figure 10). The correlation in DJF and MAM is higher than in JJA and SON. We find a similar performance between HR and LR HighresSST. This is confirmed by the rainfall anomaly (Figure S15) and the bias of the rainfall anomaly (Figure S16) of the models. Based on the spatial correlation, the rainfall anomaly and the bias of the rainfall anomaly of the models, we find that EC-Earth, ECMWF and HadGEM perform better than the three other models.
FIGURE 10 Taylor diagram of the rainfall anomaly during El-Niño years for each seasonal period. The x-axis (y-axis) shows the normalized standard deviation. The curved axis shows the spatial correlation.
Figure 11 shows the anomalies of the onset and the retreat dates during El-Niño and La-Niña years. We find a late (early) onset during El-Niño (La-Niña) years in the observed anomalies for most of the grid cells over SEA. Except for CNRM, all models simulate these anomalies. However, the magnitude of the modelled anomaly is smaller than that of the observed anomaly. Overall, based on the anomalies of the onset and retreat dates, we find a similar performance between HR and LR HighresSST. This is confirmed by the spatial correlation and SD (Figure S21) as well as the bias of the anomalies (Figure S22) of the onset and the retreat dates. Across all these measures, CMCC, EC-Earth, ECMWF and HadGEM perform better than CNRM and MPI.

4 | DISCUSSION AND CONCLUSION
The performance of the CORDEX and HighResMIP climate models in simulating the rainy season over the SEA region is investigated using the onset of the rainy season as the principal metric. The onset of the rainy season starts in May in the northern part of the region and moves to the southern part by the end of the year, except for some areas in Indonesia, which have a nonmonsoonal rainfall pattern (Aldrian and Susanto, 2003). Wind, precipitation and OLR data have been widely used for monsoon onset studies (Zeng and Lu, 2004; Diong et al., 2019). In this study, we used a relatively simple monsoon onset definition developed by Liebmann et al. (2007), which uses only precipitation data. Although it does not use wind data, which describe the circulation change during monsoon development, the onset calculation from Liebmann et al. (2007) captures the movement of the monsoon onset in SEA.
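A precipitation-only onset definition of this type can be illustrated with a cumulative rainfall anomaly. The sketch below is a simplified reading, not the exact procedure of this study: it assumes that daily rainfall minus the long-term daily mean is accumulated over one series, that the onset is the first day after the minimum of this accumulation, and that the retreat is the day of the subsequent maximum. The starting day of the accumulation, any smoothing and the handling of multiple rainy seasons are not covered. The function name and the rainfall series are hypothetical.

```python
def onset_and_retreat(daily_rain):
    """Return zero-based (onset, retreat) day indices, or None if undefined.

    Assumption: onset = first day after the minimum of the cumulative
    anomaly; retreat = day of the maximum that follows the onset.
    """
    n = len(daily_rain)
    mean_rain = sum(daily_rain) / n
    cumulative = []
    running = 0.0
    for rain in daily_rain:
        running += rain - mean_rain  # anomaly relative to the mean daily rain
        cumulative.append(running)
    onset = cumulative.index(min(cumulative)) + 1
    if onset >= n:
        return None  # minimum on the last day: no rainy season follows
    retreat = max(range(onset, n), key=lambda day: cumulative[day])
    return onset, retreat


# Synthetic series: 10 dry days, 10 wet days (10 mm/day), 10 dry days.
synthetic_rain = [0.0] * 10 + [10.0] * 10 + [0.0] * 10
season = onset_and_retreat(synthetic_rain)
```

For the synthetic series the onset falls on the first wet day (index 10) and the retreat on the last wet day (index 19), because the accumulation decreases while rainfall is below the mean and increases while it is above.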
We also assess the effect of ENSO on the monsoon dates based on this definition. The impact of ENSO varies depending on time and place in SEA. We find late (early) onset dates for most of the Indonesian region (latitude <5°N) during the El-Niño (La-Niña) phases. This onset is then followed by a late (early) retreat date. In contrast, we find early (late) onset dates followed by early (late) cessation for most of the area around the Philippines during El-Niño (La-Niña) phases. Meanwhile, the effect of ENSO on the onset and the retreat dates is weak over most areas in the northwestern part of SEA (Myanmar, Laos, Thailand, Cambodia).
We use three gridded daily observational datasets in this study. SA-OBS was developed specifically for the SEA region using rain gauges from the national meteorological services in the region. SA-OBS has a higher data density than the two other gridded datasets (APHRO and CHIRPS), especially over Indonesia, where most of the actual observations over Java and Sumatra are not available for scientific research. However, not all Southeast Asian countries in the domain of SA-OBS contributed to the dataset (Van den Besselaar et al., 2017). This limits the coverage of some areas in the north of SEA. A restriction in the search radius of the interpolation method of SA-OBS means that the dataset has areas where no rainfall estimates can be given because the data density is too low.
The APHRO and CHIRPS datasets have greater spatial coverage than SA-OBS. Because it is developed from in-situ measurements with a less restrictive interpolation method, the APHRO dataset covers more area in SEA than SA-OBS. However, the dataset has a lower station density in many parts of SEA than SA-OBS (Van den Besselaar et al., 2017). The CHIRPS dataset uses a recently produced satellite rainfall algorithm that combines climatology data, satellite precipitation estimates and in-situ rain-gauge measurements to produce an HR precipitation product (Funk et al., 2015). The global coverage of the satellite data means that this dataset has better coverage than SA-OBS and APHRO. However, the number of rain gauges used to calibrate this dataset is lower than in the other two observational datasets (Funk et al., 2015). Van den Besselaar et al. (2017) found stronger similarities between the gauge-based datasets than between the gauge-based datasets and the satellite-based datasets.
This study aims to compare the performance of HighResMIP against CORDEX in simulating the monsoon characteristics over the SEA region. The models from the HighResMIP suite were consistently closer to the observations than the models from CORDEX. Using the bias, NRMSE and spatial correlation of the climatological mean and median values between model and observation as the key metrics for comparison, we find that the HighResMIP models simulate the onset date and the total precipitation of the rainy season over the region closer to the observations than the other model suites used in this study. Based on the Kolmogorov-Smirnov index, we also find that the HighResMIP models better represent the annual variation of the monsoon index. We found more general consistency in the HighResMIP model simulations than in the CORDEX model simulations.
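The summary metrics named above could be computed along the following lines. This is a hedged sketch: the normalization of the RMSE by the observed range is an assumption (normalization by the mean or the SD is also common and the choice is not stated in this section), the Kolmogorov-Smirnov quantity is the standard two-sample statistic (the maximum distance between two empirical distribution functions), and all function names and values are hypothetical.

```python
import math
from bisect import bisect_right


def bias(model, obs):
    """Mean difference, model minus observation."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def nrmse(model, obs):
    """RMSE divided by the observed range (assumed normalization)."""
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))
    return rmse / (max(obs) - min(obs))


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    distance = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


# Synthetic onset dates (day of year) at three grid cells.
obs_onset = [290.0, 300.0, 310.0]
model_onset = [295.0, 305.0, 315.0]

onset_bias = bias(model_onset, obs_onset)
onset_nrmse = nrmse(model_onset, obs_onset)
```

A K-S statistic near zero indicates that the modelled and observed year-to-year distributions are similar, whereas a value of one indicates no overlap between them.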
For the total precipitation from the CORDEX models, we find that the uncertainty within a model and between models is substantial. The downscaling process does not appear to reduce this uncertainty. However, for the other analyses, we find that the CORDEX experiment has improved the model simulation of the monsoon.
This study also investigates how well the models reproduce the effect of ENSO on the onset and the total precipitation of the rainy season. Here, we compare the HR against the LR HighresSST models. In general, we find a similar performance of HR HighresSST and LR HighresSST in simulating the monsoon characteristics. Based on the El-Niño and La-Niña composite analysis, both HighresSST model experiments (HR and LR) simulated the anomaly of the onset and the total precipitation during different ENSO conditions with comparable skill. This is similar to an earlier publication showing that RCMs effectively reproduce variability during ENSO years (Aldrian et al., 2004). However, the models fail to fully reproduce the observed spatial distribution of the onset and retreat date anomalies. In addition, the magnitude of the modelled onset anomaly is still lower than that of the observations.
Overall, we find no significant improvement of HR HighresSST over LR HighresSST in the El-Niño and La-Niña composite analysis. This finding differs from a previous study in which the LR model was unable to capture the growth of the coupled disturbance in the developing phase of ENSO, whereas the HR model did capture these disturbances (Gualdi et al., 2005). Given this contradiction, we argue that the high skill in the HighresSST experiments relates to the prescribed SSTs rather than to the more detailed atmospheric dynamics.
In conclusion, we find that the HighResMIP experiment simulates the rainy season better than the CORDEX experiment. Within the HighResMIP experiment, we find that the HighresSST experiment performs similarly to the Hist-1950 experiment. Likewise, the HR models performed similarly to the LR models.
There is a high demand for HR datasets for climate and climate change analysis. We conclude that the HighResMIP models, with their higher resolution and better representation of the monsoon characteristics in SEA, will support a better climate change impact assessment for the region.

ACKNOWLEDGEMENT
The authors acknowledge the SA-OBS dataset and the data providers in the SACA&D project (http://sacad.database.bmkg.go.id/). The first author thanks the Indonesia Endowment Fund for Education (LPDP) (S-353/LPDP.3/2019) for funding his PhD research. The second author acknowledges the support of the Royal Netherlands Embassy in Jakarta, Indonesia, through a Joint Cooperation Programme between Dutch and Indonesian research institutes. The HighResMIP simulations were made available through the PRIMAVERA project, which received funding from the European Union's Horizon 2020 Research and Innovation Programme under grant agreement no. 641727, which also supported authors Malcolm John Roberts, Marie-Pierre Moine, Alessio Bellucci, Retish Senan, Etienne Tourigny and Dian Putrasahan. The tenth author also received funding from an innovation programme under the Marie Skłodowska-Curie grant agreement No. 748750 (SPFireSD project).