Correlation properties of integral ground‐motion intensity measures from Italian strong‐motion records

This study investigates the correlation properties of integral ground‐motion intensity measures (IMs) from Italian strong‐motion records. The considered integral IMs include 5–95% significant duration, Housner intensity, cumulative absolute velocity, and Arias intensity. Both IM spatial correlation and the correlation between different integral and amplitude‐based IMs (i.e., cross‐IM correlation) are addressed in this study. To this aim, a new Italian ground‐motion model (GMM) with spatial correlation for integral IMs is first introduced. Based on the newly developed GMM, the empirical correlation coefficients from interevent and intraevent residuals are investigated and various analytical correlation models between integral IMs and amplitude‐based IMs are proposed. The effective range parameter representing spatial correlation properties and the trend in the cross‐IM correlations are compared with existing models in the literature. The variability of the effective range parameters with respect to event‐specific features is also discussed. Modeling ground‐motion spatial and cross‐IM correlations is an important step in seismic hazard and risk assessment of spatially distributed systems. Investigating region‐specific correlation properties based on Italian strong‐motion records is of special interest as several correlation models have been developed based on global datasets, often lacking earthquakes in extensional regions such as Italy.


| INTRODUCTION
Amplitude, frequency content, and strong-motion duration are key features characterizing earthquake-induced ground motions and affecting seismic demands on engineering systems. Integral ground-motion intensity measures (IMs) can help quantify the overall effects of those key parameters. Commonly used integral IMs include Arias intensity (I A ) defined in Equation 1; Housner intensity (I H ), also known as spectrum intensity, defined in Equation 2 1 ; cumulative absolute velocity (CAV) defined in Equation 3 2 ; and a definition of ground-motion duration (presented in the following paragraph):
[…] parameters of the spatial correlation models (due to the level of subjectivity present in fitting analytical models to the empirical data). This might result in GMMs with coefficients that are statistically inefficient. 28 In contrast, the scoring estimation approach developed by the authors and adopted in this study can overcome this issue and has been shown to be statistically rigorous, numerically stable, and capable of incorporating nonstationarity and anisotropy in spatial correlation properties. 28 These GMMs can also be used to develop correlation models between integral IMs and other nonintegral IMs (e.g., amplitude-based IMs, such as peak ground acceleration or PGA; elastic pseudospectral acceleration, or PSA); the resulting correlation models can be used to improve ground-motion selection and modification for structural analyses within the Performance-based Earthquake Engineering (PBEE) framework. For instance, Iervolino et al., 29 Bradley, 30 and Tarbali and Bradley 31 have demonstrated that D S5 − 95 and other integral IMs can be used as secondary IMs coupled with the primary ones (e.g., spectral ordinates) to select ensembles of ground motions appropriately representing the target hazard at a given site. Region-specific GMMs for integral IMs can also be utilized in the engineering validation of simulated ground motions. [32][33][34]
The existing GMMs for integral IMs are developed based on global databases, which might only partially represent the peculiar features observed in Italian strong-motion records. 35 The growing Italian dataset is of special interest (as also discussed in Scasserra et al. 35 and Lanzano et al. 36 ), because (1) it is principally from earthquakes in extensional regions that are poorly represented in global databases; and (2) past practice in Italy has used local GMMs based on limited datasets. 37 Consideration of newly available, larger datasets for Italy can assist in appropriately representing various source, path, and site effects. For instance, the most recent GMM based on the Italian dataset is that proposed by Lanzano et al. 36 for PGA, peak ground velocity (PGV), and 5% damped elastic PSA. This model uses a large, harmonized national dataset with an advanced functional form compared with previous GMMs; however, integral IMs and ground-motion spatial and cross-IM correlation features were not addressed in this model. A subset of the Lanzano et al. 36 database was previously utilized by the authors of the study presented here to develop a new Italian GMM that also considers ground-motion spatial correlation for PGA, PGV, and PSA (for 29 periods ranging from 0.01 to 4 s), as well as their cross-correlations. 38 The general paucity of records in various regional and global databases has motivated assembling data from different regions/countries. However, data from different regions/countries might be incomparable in size and inconsistent in terms of the signal-processing approach, thus resulting in an underrepresentation or misrepresentation of some regional characteristics. Kotha et al. 39 quantified the regional differences in the attenuation of high-frequency ground-motion properties with respect to source-to-site distance between three groups of strong-motion records: (1) Italy, (2) Turkey, and (3) the rest of Europe and the Middle East.
The authors point out that a country-based categorization can assist in addressing the unbalanced composition of various datasets around Europe, which is also relevant to this study. It is also worth mentioning that Boore et al. 38 emphasized that country names are often used in GMMs as a convenient shorthand to describe regions, realizing that results for a region may well be applicable beyond political boundaries of a country. Moreover, regional differences may occur within a given country as well.
Based on these remarks, this study first develops a new GMM with spatial correlation for integral IMs observed in Italian strong-motion records. Then, the corresponding spatial correlation properties are scrutinized, and the variability of the effective range parameter (characterizing the considered spatial-correlation model) with respect to the event-specific features is discussed. Empirical correlation models are also established between the integral IMs and spectral acceleration ordinates, and the results are compared with existing models. Finally, this study develops an analytical correlation model between integral IMs and amplitude-based IMs for Italy.
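As an illustration of the integral IMs named in this introduction, the following minimal sketch computes Arias intensity, CAV, and the 5-95% significant duration from a single uniformly sampled accelerogram. It relies on the standard textbook definitions (I A as π/(2g) times the time integral of the squared acceleration, CAV as the time integral of the absolute acceleration, and D S5 − 95 as the time between 5% and 95% of the normalized cumulative squared acceleration) and is not the processing chain used in this study. The function names, the SI units, and the trapezoidal integration are assumptions made for the example; Housner intensity is omitted because it additionally requires a response-spectrum calculation.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed SI units)


def _cumtrapz(y, dt):
    """Cumulative trapezoidal integral of a uniformly sampled series, starting at 0."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)))


def arias_intensity(acc, dt):
    """Arias intensity in m/s for an acceleration series given in m/s^2."""
    acc = np.asarray(acc, dtype=float)
    return np.pi / (2.0 * G) * _cumtrapz(acc ** 2, dt)[-1]


def cumulative_absolute_velocity(acc, dt):
    """CAV in m/s: time integral of |a(t)| (no amplitude threshold applied)."""
    acc = np.asarray(acc, dtype=float)
    return _cumtrapz(np.abs(acc), dt)[-1]


def significant_duration(acc, dt, lo=0.05, hi=0.95):
    """Time between the lo and hi fractions of the normalized cumulative a(t)^2."""
    acc = np.asarray(acc, dtype=float)
    husid = _cumtrapz(acc ** 2, dt)
    husid = husid / husid[-1]          # normalized build-up curve in [0, 1]
    t = np.arange(len(acc)) * dt
    return np.interp(hi, husid, t) - np.interp(lo, husid, t)
```

For a hypothetical constant 1 m/s² signal lasting 10 s, the build-up curve is linear, so the 5-95% duration is 9 s and CAV is 10 m/s; real records would be processed per horizontal component (or per rotation angle when a RotD50 value is needed).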

| GROUND-MOTION DATABASE
The selected dataset is extracted from the Pan-European Engineering Strong Motion (ESM) flat file 40 by taking into account the following constraints:
• Events with a moment magnitude M w ≥ 4 and records with Joyner-Boore distance (R JB ; i.e., the closest distance to the surface projection of the rupture plane) smaller than 220 km are considered. For each considered event, if a finite-fault model is available, R JB is computed based on the fault geometry provided by the ESM; if not, for M w > 5.5, R JB is estimated from epicentral distance (R epi ) using an empirical model 41 ; otherwise the earthquake source is assumed to be a point source; hence, R JB = R epi .
• Events that occurred in Italy with at least 10 records classified as free-field motions. The limit of 10 records is chosen based on analyses conducted to ensure stable estimates of spatial correlation properties, as discussed in the following.
• Records without any data on the event M w , fault type, and V S30 (i.e., the average shear-wave velocity in the upper 30 m) are excluded.
• Co-located records are identified, and the redundant ones are removed.
• A maximum interstation separation distance of 250 km is considered. For any pair of stations with a separation distance greater than 250 km, the station with the larger epicentral distance is excluded.
The final dataset includes 5703 records (each with accelerograms in two horizontal motions) from 138 earthquakes in the magnitude range of 4 ≤ M w ≤ 6.5 from 1997 to 2016 in Italy. The geographical distribution of the selected events is shown in Figure 1, together with the M w − R JB distribution and the site classifications (according to Eurocode 8 42 ) of the selected records; 71% of the considered ground-motion records are from normal, 19% from reverse, and 10% from strike-slip events. Figure 2 presents the site class of the considered records according to the Eurocode 8 site classification, 42 as well as the simpler site classification considered in the development of the GMM proposed in this study (presented in the next section). As shown, most of the records are from site class B (i.e., stiff soil), which has a median V S30 of about 627 m/s across the considered stations. It is noted that the ground-motion records utilized in this study have been processed by Istituto Nazionale di Geofisica e Vulcanologia based on the procedure of Paolucci et al. 43 and have been checked in terms of their quality, especially based on the signal-to-noise ratio. 40

| Model specification
Typical functional forms used in empirical GMMs consist of three main components related to the source, path, and site effects (e.g., rupture magnitude, focal mechanism, source-to-site distance, and soil properties). Utilizing a functional form for the considered integral IMs that is consistent with that for amplitude-based IMs leads to a harmonized set of GMMs that can be implemented in practical applications, verified, and updated (when new data emerge) in a more straightforward fashion. As discussed in Baker and Cornell 44 and Baker and Jayaram, 45 the choice of a particular functional form has an almost negligible effect on the correlation estimates. The functional form chosen in this study is given in Equation 4, where
• $\mathrm{IM}_{ij}$ is the IM observed at station $j$ during event $i$ and is obtained from the two as-recorded horizontal components to produce the RotD50 value, which is the median horizontal ground motion across all nonredundant azimuths 46 ;
• $M_i$ is the moment magnitude (M w ) of event $i$;
• $R_{JB,ij}$ is the Joyner-Boore distance in km at station $j$ during event $i$;
• $S_{S,j}$ and $S_{A,j}$ are dummy variables determining the soil type at station $j$;
• $F_{N,i}$ and $F_{R,i}$ are dummy variables indicating the style-of-faulting for event $i$;
• $N$ is the number of events and $n_i$ is the number of records from event $i$;
• $\Omega_i(\omega)$ is the correlation matrix corresponding to event $i$, with $\omega$ being a vector of unknown parameters.
To take the spatial correlation into account, the $jj'$th entry of $\Omega_i(\omega)$ is specified as
$$[\Omega_i(\omega)]_{jj'} = k(s_{ij}, s_{ij'})$$
where $k(s_{ij}, s_{ij'})$ gives the correlation $\rho(\varepsilon_{ij}, \varepsilon_{ij'})$ between $\varepsilon_{ij}$ and $\varepsilon_{ij'}$ at locations $s_{ij}$ and $s_{ij'}$ of sites $j$ and $j'$ during event $i$. If no spatial correlation is assumed between the intraevent errors at stations $j \neq j'$, then $\Omega_i(\omega)$ reduces to the identity matrix. It is commonly assumed that the spatial field of intraevent errors for PSA ordinates is stationary and isotropic (e.g., Jayaram and Baker 21 ); hence, the spatial correlation depends only on the interstation separation distance $d$.
This study retains the same assumptions for the integral IMs (as also implicitly assumed in Foulser-Piggott and Stafford 17 for I A ). Further statistical hypothesis testing and physics-based simulations can be conducted in order to validate these assumptions; this is a topic under further investigation by the authors. An exponential function is employed in this study to model the relationship between the correlation coefficient and the separation distance $d$:
$$\rho(d) = \exp\left(-\frac{d}{h}\right)$$
where $h$ is a positive range parameter in kilometers, that is, the separation distance at which the spatial correlation is around 0.37. The effective range parameter is defined as $\tilde{h} = 3h$ and corresponds to a correlation coefficient of 0.05 47 (similar to other studies; e.g., Jayaram and Baker 21 and Esposito and Iervolino 48,49 ). The methodology used for estimation and regression of the GMM with spatial correlation and the corresponding correlation models is the same as that adopted by Huang and Galasso 50 and Ming et al. 28 Therefore, the computational steps are not repeated here for brevity.
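The exponential correlation model described above can be sketched as follows. This is only an illustration of how an intraevent correlation matrix could be assembled for one event under the stationarity and isotropy assumptions; it is not the estimation algorithm of Ming et al. The function names and the use of planar station coordinates in km are assumptions made for the example.

```python
import numpy as np


def exponential_correlation(d, h):
    """Correlation at separation distance d (km) for range parameter h (km)."""
    return np.exp(-np.asarray(d, dtype=float) / h)


def correlation_matrix(coords_km, h):
    """Intraevent correlation matrix for stations at planar coordinates (km).

    coords_km: array of shape (n_stations, 2), hypothetical easting/northing in km.
    """
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise interstation distances
    return exponential_correlation(dist, h)
```

With this form, the correlation equals exp(-1), about 0.37, at d = h and exp(-3), about 0.05, at d = 3h, which is why three times the range parameter is used as the effective range.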

| GMM with spatial correlation
The estimated model parameters for the GMMs with and without spatial correlation are presented in Table 1 for the considered integral IMs. Figure 3 shows the median ground-motion curves and their 95% confidence intervals (CIs) for an M w = 5.5 normal rupture* at a stiff soil condition (assuming V S30 = 580 m/s) in comparison with existing GMMs in the literature. The CIs are shown to illustrate the uncertainty bounds around the established median curves, following the practice of Douglas 51 in which the covariance of the model coefficients is also included in calculating the CIs. Note that when using the BSA2009, CB2010, and AS2016 models for comparisons, no hanging-wall/foot-wall and basin effects are considered. Figure 3 shows that, although the existing GMMs lie within the 95% CIs of the derived model, the slopes of the median curves differ. The median D S5 − 95 curve from this study shows a faster increase than those from the other studies, particularly at large source-to-site distances. While the median I H curve has a gentler slope at moderate distances, the CAV curve shows a faster decrease with distance. For I A , the slope of the developed model lies between those of the existing models.
*M w = 5.5 is the median of the applicable magnitude range for this study.
TABLE 1 Estimated model parameters for the D S5 − 95 , I H , CAV, and I A GMM with and without spatial correlation (denoted by S and NS, respectively)
…, and SA2017 refer to Abrahamson and Silva, 9 Kempton and Stewart, 10 Bommer et al., 11 Afshari and Stewart 12 and Sandıkkaya and Akkar, 18 respectively; B, I H , where MM2008 refers to Massa et al. 15 ; C, CAV, where CB2010 and SA2017 refer to Campbell and Bozorgnia 16 and Sandıkkaya and Akkar, 18 respectively; D, I A , where MM2008, FS2012, and SA2017 refer to Massa et al., 15 Foulser-Piggott and Stafford, 17 and Sandıkkaya and Akkar, 18 respectively
The results in Table 1 highlight that considering the ground-motion spatial correlation leads to a reduction in the interevent variance and an increase in the intraevent variance of the considered integral IMs, which is consistent with the findings from other studies for nonintegral IMs (e.g., Jayaram and Baker 52 ). As expected, these differences can become much larger when the correlation in the underlying data becomes higher (i.e., higher range parameter h). In particular, the difference in terms of standard deviations between the model with and without spatial correlation is notable for D S5 − 95 (with 15% difference in the interevent and 5% in the total standard deviations), whereas the difference for other IMs is less than 5%. Figure 4 compares the total and intraevent standard deviations of the proposed GMMs with the considered existing models. As shown, the developed models for D S5 − 95 , I H , and CAV have comparable total and intraevent standard deviations with respect to the considered models; however, for I A , these quantities are larger than those from the existing studies. It is worth noting that the standard deviations from this study inherently include spatial correlation effects (as part of the algorithm utilized to establish the model, i.e., Ming et al. 28 ); hence, the standard deviation values obtained in this study tend to be generally larger than those from GMMs that do not consider the residual spatial correlation during the model parameter estimation. It is also noted that the GMM introduced here (with simple functional forms, yet unbiased medians as shown in Figure 5) is developed to facilitate investigating the spatial-correlation and cross-IM correlation properties of the considered integral IMs. Advanced GMMs can be established for the considered integral IMs based on more complex functional forms, for instance, by including features such as anelastic attenuation (which is especially important for modeling ground motions at large distances). However, this has not been the focus of this study. The functional form of the developed GMM is scrutinized by investigating the statistical significance of parameters b 2 and b 3 for the magnitude scaling, and b 9 and b 10 for the style of faulting. The null hypothesis that the mean values of these coefficients are equal to zero cannot be rejected at a 5% significance level, and their 95% CIs (shown in Table 1) include zero. This implies that the magnitude and style-of-faulting terms may not be statistically significant parameters in capturing the considered ground-motion properties. This finding is consistent with the observations in Lanzano et al. 36 for amplitude-based IMs and with Bommer et al. 11 in terms of the style of faulting for D S5 − 95 .
However, this result does not suggest that these physical parameters are unimportant in explaining integral IM properties; rather, it implies that the functional form involving these parameters may not be a good representation of the specific feature. Lanzano et al. 36 have suggested that the failure to reject the null hypothesis regarding the magnitude scaling may be because of the large variability in the magnitude scaling and uncertainty in the estimation of some predefined […]. Regarding the style of faulting, the failure to reject the null hypothesis may be due to the limited difference between amplitudes of motions from normal faulting earthquakes with respect to those from strike-slip events. 53 Nevertheless, it is decided here to keep the functional form as in Equation 4, although some parameters may have limited impacts on the model performance.
In order to investigate potential biases in the developed GMM, Figure 5 presents the interevent residuals with respect to magnitude and the intraevent residuals with respect to distance and V S30 . It is shown that there is no major bias in the residuals with respect to these explanatory parameters, which indicates an overall appropriate representation of the considered data by the developed models.
To further compare the performance of the GMMs with and without spatial correlation, the Bayesian information criterion (BIC), 54 which deals with the trade-off between the model goodness of fit and its complexity, is computed. A lower BIC value indicates that the corresponding model can represent the underlying data more appropriately; hence, it would be the preferred one. It is shown in Table 1 that the GMMs with spatial correlation have about 15% lower BIC than the corresponding GMMs without it. Similar results are also obtained using the Akaike information criterion 55 (not repeated here for brevity). To investigate the relevance of the spatial correlation effects for the considered integral IMs, it is examined whether zero is included within the 95% CI of the calculated range parameter (h) for the developed GMM. The 95% CIs of h presented in Table 1 indicate that the spatial correlation is a nonnegligible feature of the considered IMs. This is more pronounced for D S5 − 95 , I A , and I H than for CAV, as the h parameter for CAV is much smaller than those for the other IMs; however, its 95% CI does not include zero. Figure 6A compares the effective range parameter ($\tilde{h} = 3h$) obtained in this study with the values from the existing studies. As shown, the $\tilde{h}$ estimate for I A is comparable with those from other studies, whereas the estimates for CAV and I H are smaller. It is noted that h (and consequently $\tilde{h}$) from this study is estimated based on the one-stage algorithm of Ming et al. 28 as opposed to the other studies that use the multistage fitting of ad hoc models to the empirical semivariograms. This difference in the estimation approach, as well as the differences in the underlying data and the considered functional form, may result in the differences shown in Figure 6, as also discussed by Schiappapietra and Douglas. 27
The small $\tilde{h}$ values from this study compared with those observed in global databases can be attributed to the weak level of motion from small M w events and the faster attenuation of the high-to-moderate ground-motion frequencies in the region of interest, as documented by Scasserra et al. 35 In order to examine the variability in the spatial-correlation features with respect to the considered events, Figure 6B shows the histograms of the $\tilde{h}$ values from all events (and considered integral IMs) and their median values, alongside the $\tilde{h}$ estimates from the country-wide GMM developed in this study. As shown, there is a large variability in the estimated $\tilde{h}$ values for different events, which is also consistent with the findings of Schiappapietra and Douglas 27 for nonintegral IMs and Bullock 26 for CAV, I H , and D S5 − 75 (5-75% significant duration). Figure 6B also shows that the $\tilde{h}$ values estimated through the one-stage algorithm employed in this study and the median of all the event-specific values are fairly close, with the exception of CAV, for which the $\tilde{h}$ estimate is smaller. In fact, the one-stage estimation algorithm by Ming et al. 28 attempts to minimize the model misfit considering all the model parameters; hence, the obtained $\tilde{h}$ estimates for the country-wide GMM are not necessarily equal to the event-specific medians. As shown in Figure 6B, there are small $\tilde{h}$ values at the lower tail that do not necessarily come from the same events for all the considered IMs. In order to examine these small $\tilde{h}$ values, Figure 7 presents the empirical semivariograms and the fitted model for an illustrative event (i.e., IT_2009_0084). As shown, the semivariogram values are quite scattered and no apparent trend can be identified for them; because the model is honoring the data, the estimated $\tilde{h}$ values are small. Figure 8 presents the event-specific $\tilde{h}$ values for the considered integral IMs with respect to their rupture magnitude and the style of faulting.
As shown, no trend can be identified with respect to these explanatory variables. Similar results are obtained by investigating the variability of the effective range parameter $\tilde{h}$ with respect to the number of records from each event, and the smallest and largest separation distances within each event. Overall, scrutiny of the results indicates that the rupture magnitude and style of faulting are not significant parameters in characterizing the variability of event-specific $\tilde{h}$ values. This is consistent with the results presented in Table 1 for the pooled data and the statistical significance of the model coefficients for these explanatory parameters (in their current functional form).
Considering the potential sensitivity of the results with respect to event-specific characteristics (not only in terms of the earthquake source but also in terms of the spatial resolution of the recording stations and the site-response effects), the results of this study are presented for the pooled data representing the average trend in the spatial correlation properties in the Italian data. Further investigations are required to quantify the sensitivity of spatial correlation parameters (for both integral and nonintegral IMs) to the choice of the regression technique, the maximum separation distance to be considered in the analysis, the spatial resolution of the data in terms of the distance from the source, and the local site-response effects. Physics-based ground-motion simulations can provide a pathway to gather sufficient controlled data to scrutinize these factors (which is the focus of the subsequent study). 56
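For readers unfamiliar with the empirical semivariograms referred to in this section, a minimal sketch of the classical (method-of-moments) estimator for a single event is given below. It is offered only as an illustration of the quantity plotted for an individual event and is not the one-stage estimation algorithm used in this study. The function name, the planar station coordinates in km, and the choice of distance bins are assumptions made for the example.

```python
import numpy as np


def empirical_semivariogram(coords_km, resid, bin_edges_km):
    """Classical semivariogram estimate of intraevent residuals for one event.

    coords_km:    (n, 2) hypothetical planar station coordinates in km.
    resid:        (n,) intraevent residuals at those stations.
    bin_edges_km: increasing separation-distance bin edges in km.
    Returns (gamma, counts); gamma is NaN for bins without station pairs.
    """
    coords = np.asarray(coords_km, dtype=float)
    z = np.asarray(resid, dtype=float)
    i, j = np.triu_indices(len(z), k=1)            # every station pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (z[i] - z[j]) ** 2

    edges = np.asarray(bin_edges_km, dtype=float)
    which = np.digitize(dist, edges) - 1           # bin index of each pair
    nbins = len(edges) - 1
    gamma = np.full(nbins, np.nan)
    counts = np.zeros(nbins, dtype=int)
    for b in range(nbins):
        mask = which == b
        counts[b] = int(mask.sum())
        if counts[b] > 0:
            gamma[b] = 0.5 * sqdiff[mask].mean()   # half the mean squared difference
    return gamma, counts
```

For residuals normalized to unit variance, the semivariogram and the correlation are related by γ(d) = 1 − ρ(d), so scattered semivariogram values with no clear rise over distance translate into a small fitted range.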

| The cross-IM empirical correlations
The empirical correlation coefficients between integral IMs and between integral IMs and amplitude-based IMs are calculated by dissecting the total residual and correlation coefficients into the interevent and intraevent terms due to the presence of multiple records from a single event (following the approach of Bradley 58 ). Note that the residuals of the amplitude-based IMs are computed using the model of Huang and Galasso. 50 Comparisons are made between the total correlation coefficients obtained in this study and those from existing correlation models, namely, (1) Bradley 58 […]
Table 2 presents the empirical cross-IM correlation coefficients between the considered integral IMs and also between integral IMs and PGA and PGV. In comparison with existing models (e.g., previous studies [58][59][60][61][62][63] […]
Figure 9 presents the empirical correlation coefficients between the considered integral IMs and the PSA ordinates for periods T between 0.01 and 4.0 s. It is shown in Figure 9A that D S5 − 95 is negatively correlated with PSA ordinates at short-to-moderate periods and positively correlated with the long-period PSA ordinates, which is consistent with the findings in the literature (e.g., Sandıkkaya and Akkar, 18 Bradley, 58 and Baker and Bradley 59 ). The negative correlation between D S5 − 95 and the short-period PSA ordinates is due to the fact that ground motions with longer-than-predicted durations tend to have the seismic energy arriving over a longer period of time and are thus less likely to cause large peak responses in a damped oscillator. It is also shown in Figure 9 that the correlations between D S5 − 95 and short-period PSA ordinates observed in the Italian data are lower than those from the global and European models. However, the correlations between D S5 − 95 and moderate- and long-period PSA ordinates are similar to the global models while being slightly lower than those of the European model.
In terms of I H , CAV, and I A , the empirical correlation coefficients observed in the Italian data follow a similar trend as the existing empirical models but are generally larger.
Conducting null hypothesis testing 64 on the similarity between the correlation coefficients from this study and those from the considered existing studies yields p values close to zero (below the 5% significance level) for most, but not all, of the PSA ordinates, suggesting that there could be differences between the empirical correlation coefficients from the Italian data and those from other events. Nevertheless, it is noted that the observed absolute differences are close to the differences that may come from epistemic uncertainty in the functional forms utilized to calculate IM residuals and the spatial-correlation model. This issue is addressed, for example, in Bradley. 61
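The dissection of the total correlation into interevent and intraevent terms mentioned at the start of this section can be illustrated with the short sketch below. It assumes that the total residual of each IM is the sum of independent interevent and intraevent terms with standard deviations τ and φ, so that the total covariance is the sum of the two component covariances; the function and argument names are hypothetical and the numbers in the usage note are not taken from this study.

```python
import math


def total_correlation(rho_between, rho_within, tau1, phi1, tau2, phi2):
    """Total correlation of two IMs from interevent and intraevent components.

    rho_between: correlation of the interevent residuals of IM 1 and IM 2.
    rho_within:  correlation of the intraevent residuals of IM 1 and IM 2.
    tau*, phi*:  interevent and intraevent standard deviations of each IM.
    """
    sigma1 = math.hypot(tau1, phi1)   # total standard deviation of IM 1
    sigma2 = math.hypot(tau2, phi2)   # total standard deviation of IM 2
    covariance = rho_between * tau1 * tau2 + rho_within * phi1 * phi2
    return covariance / (sigma1 * sigma2)
```

As a check on the logic, if both component correlations equal the same value, the total correlation equals that value whenever the two IMs share the same τ/φ ratio; when the intraevent standard deviations dominate, the total correlation is driven mainly by the intraevent term.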

| Analytical models developed for cross-IM correlation
In this section, analytical correlation models between the considered integral IMs and PSA ordinates are developed. Following Bradley, 58 the D S5 − 95 -PSA correlation model is given by Equation 10, where a l is the model coefficient at the specific structural period t l as listed in Table 3. Following Bradley, 60-62 the analytical correlation models between PSA ordinates and I H , CAV, and I A are given by Equation 11, where a l , b l , c l , and d l are the model coefficients at the specific structural period t l as listed in Table 4.
The developed correlation models in Equations 10 and 11 are consistent with the corresponding empirical correlation values as indicated by the results presented in Figure 9. It is worth noting that these functional forms are chosen to provide an appropriate fit to the data and there is no specific physical interpretation associated with them. Hence, the developed models should not be extrapolated beyond the considered variable ranges in the database.

| Dependence of the cross-IM correlations on magnitude and distance
The dependence of the proposed cross-IM correlations on magnitude and source-to-site distance is evaluated by calculating empirical correlation coefficients for records in different magnitude and distance ranges and comparing them with the developed analytical models. Figure 10 (left-hand panels) presents the correlations between integral IMs and PSA ordinates computed from ground motions with R JB ≤ 100 km and binned magnitude (±0.3 unit around the target value). It is shown that there is no notable trend for the correlation between integral IMs and PSA ordinates against magnitude. Similarly, Figure 10 (right-hand panels) shows the correlations between integral IMs and PSA ordinates computed from ground motions with 4.5 ≤ M w ≤ 6 (the well-represented magnitude range in the considered dataset) and binned distance (±15 km around the target value). It is shown that there is no notable trend for the correlation between integral IMs and PSA ordinates against distance. These findings are consistent with those of Baker and Bradley 59 and Huang and Galasso 50 for amplitude-based IMs.
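The binning procedure described above can be sketched as follows: residual pairs are selected within a window around a target magnitude (or distance), and a Pearson correlation coefficient is computed from the selected pairs. This is a simplified illustration that works on total residuals and ignores the interevent/intraevent split; the function name, the argument names, and the minimum of three pairs per bin are assumptions made for the example.

```python
import numpy as np


def binned_correlation(x_resid, y_resid, covariate, target, half_width):
    """Pearson correlation of two residual sets within a covariate window.

    covariate:  per-record magnitude or distance used for binning.
    target:     bin centre (e.g., a target magnitude).
    half_width: half-width of the bin (e.g., 0.3 magnitude units or 15 km).
    Returns (rho, n); rho is NaN when fewer than three records fall in the bin.
    """
    x = np.asarray(x_resid, dtype=float)
    y = np.asarray(y_resid, dtype=float)
    c = np.asarray(covariate, dtype=float)
    mask = np.abs(c - target) <= half_width
    n = int(mask.sum())
    if n < 3:
        return float("nan"), n
    rho = np.corrcoef(x[mask], y[mask])[0, 1]
    return float(rho), n
```

Repeating the call over a grid of target values gives one curve per integral IM and period; an absence of systematic drift in these curves is what is meant by no notable trend against magnitude or distance.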

FIGURE 10 Cross-intensity measure (IM) correlation coefficients plotted against the rupture magnitude (left panel) and R JB distance (right panel) for the considered integral IMs. Solid and dashed lines are based on the empirical results and the analytical models, respectively

| CONCLUSIONS
This study investigated the correlation properties of 5-95% significant duration, Housner intensity, cumulative absolute velocity, and Arias intensity in Italian strong-motion records. Investigating region-specific correlation properties based on Italian ground-motion data was the focus here because of (1) the underrepresentation of events from extensional tectonic regimes in global databases and (2) the future use of the results obtained in this study for selecting ground-motion records for seismic response analysis and validating synthetic ground motions specifically simulated for this region. Findings from this study indicated that ground-motion spatial correlation is a nonnegligible feature of the considered IMs; however, this is more pronounced for significant duration, Housner intensity, and Arias intensity than for cumulative absolute velocity. Examining the variability of the effective range parameters characterizing the proposed spatial-correlation models with respect to event-specific features indicated that the rupture magnitude and style-of-faulting are not significant factors in explaining the variability of event-specific spatial correlation properties. Considering the sensitivity of the results with respect to event-specific characteristics, the results of this study are presented for the pooled data representing the average trend in the spatial correlation properties in the Italian data.
Further investigations are required to quantify the sensitivity of the spatial-correlation parameters (for both integral and nonintegral IMs) against the choice of the regression technique, the maximum separation distance to be considered in the analysis, the spatial resolution of the data in terms of the distance from the source, and the local site-response effects.
Comparison between the cross-IM correlations for the Italian data with other existing models (developed based on global and transnational databases) indicated the existence of differences for most IMs (which were confirmed based on statistical hypothesis testing). However, the observed absolute differences are close to the differences that may come from epistemic uncertainty in the functional forms utilized to calculate IM residuals.
Finally, this study proposed a set of analytical correlation models between integral IMs and amplitude-based IMs that capture the features of the Italian data well. In particular, the derived correlation models between integral IMs and PSA ordinates have no significant dependence on magnitude and distance. The results of this study can be used to improve hazard/risk assessment exercises in Italy.