Abstract
Keywords
Introduction
The NGA-Sub project developed a relational database and ground-motion models (GMMs) for subduction zone earthquakes worldwide (Bozorgnia et al., 2022). The database provides uniformly processed ground-motion data from earthquakes recorded in different tectonic settings and regions, including time series and intensity measure (IM) values (Ahdi et al., 2022; Contreras et al., 2022; Mazzoni et al., 2022). The well-populated magnitude and rupture distance ranges for interface and intraslab events in the database are
The NGA-Sub database development had a cutoff date of 2016, meaning that earthquakes after that date were not considered. As inevitably occurs in any such project, application of the cutoff caused important data to be omitted. This was particularly the case in Central America and Mexico (CAM), because of large-magnitude, well-recorded earthquakes that occurred in 2017 and 2018. The present work aimed to augment the NGA-Sub database with this newer information, which was developed in a consistent manner to NGA-Sub. Our second objective was to evaluate the performance of two CAM-regionalized global NGA-Sub GMMs (Parker et al., 2022: Pea22; Kuehn et al., 2023: Kea23). This article is adapted from a report (Contreras et al., 2023a) that introduced the database and analyses. An important topic not addressed here, but which is the subject of a companion article (Contreras et al., 2025), concerns site response in the Valley of Mexico. Portions of the work presented in this article have also been the subject of a conference presentation (Contreras et al., 2023b).
Previous Mexico GMMs
Networks and data sets
In Mexico, the installation of a strong motion network started in the early 1960s, following the 1957 Mw 7.6 Guerrero earthquake. In 1962, three accelerographs were deployed in Mexico City, two at the city center and one at a hill zone site (Ciudad Universitaria, CU). Since 1964, the CU station has recorded several events and has become a widely applied reference site for studies of earthquake ground motions in the Valley of Mexico (Arroyo et al., 2024; Jaimes et al., 2006; Reinoso and Ordaz, 1999).
Starting in the 1980s, the Universidad Nacional Autónoma de México (UNAM, National Autonomous University of Mexico) in cooperation with the University of California, San Diego, significantly increased the number of instruments with the deployment of the Guerrero Accelerograph Network (Anderson et al., 1994). This network is not in operation at present; however, some recording stations were absorbed into the networks installed by the Institute of Engineering at UNAM. Following the 1985
Table 1 summarizes ground-motion data sets (each of which has an accompanying GMM) for Mexico that were developed utilizing recordings from subduction earthquakes. Earthquake ground-motion recordings and modeling have often focused on the Valley of Mexico (Castro et al., 1988; Ordaz et al., 1994, 2024; Rosenblueth et al., 1989; Singh et al., 1987). Mexico City has unique geotechnical conditions, consisting of lakebed deposits, that produce strong site effects. Less attention has been given to the areas outside the Valley of Mexico.
Ground-motion data sets and GMMs for subduction earthquakes in Mexico. Models specific to Mexico City are excluded
Near-trench thrust interface earthquakes only.
GMMs
For forearc regions, GMMs for peak ground acceleration (PGA) and modified Mercalli intensity were developed in the late 1980s and 1990s (Anderson, 1997; Anderson and Lei, 1994; Anderson and Quaas, 1988; Ordaz et al., 1989), which are mainly applicable for the state of Guerrero. Perea and Sordo (1988) produced a local GMM for the urban area of Puebla City, and Reyes (1998) used data from the CU station. These GMMs did not distinguish between event types and are not listed in Table 1.
Current practice for ground-motion prediction in Mexico is different for Mexico City and the remainder of the country. The specific procedures used in Mexico City are summarized in Chapter 5 of Contreras et al. (2023a). The GMMs in Table 1 apply for the overall Mexico region or specific subregions, but are generally not used in Mexico City. Among the models in Table 1, the most applicable models to the entirety of Mexico are those of García-Soto and Jaimes (2017) for interface earthquakes and Jaimes and García-Soto (2020) for intraslab events. As shown in Table 1, the interface data set utilized 418 recordings from 40 interface events and exclusively National Earthquake Hazards Reduction Program (NEHRP) B sites (56 in total). This GMM predicts peak acceleration (PGA), peak velocity (PGV), and pseudo-spectral acceleration (PSA) for oscillator periods (
The present work differs from these previous studies in several respects. Model development considers all applicable data, without the restriction of NEHRP B sites. This was done because site response models integrated within the GMM can address first-order site effects, and by considering such models within the analyses, it is possible to better constrain source and path effects. A second distinction, as noted in the
CAM data
The NGA-Sub database contains more than 71,000 three-component time series from 1883 earthquakes worldwide. We have extended and improved the NGA-Sub database for the CAM region using events with dates extending through 2018. There are two reasons that events in the extended database were not used in the original database. The first is that the events occurred after a 2016 cutoff date, which excluded from consideration two large intraslab events in Mexico (
The additional data enhance ground-motion characterization in the following respects:
(a) Large-magnitude events provide data within a hazard-critical parametric range. Particularly, the 2017
(b) By adding more recordings, including those from large-magnitude events recorded over a wide distance range, it is possible to improve regional path characterization. Specifically, additional recorded ground motions at both forearc and backarc sites help to identify regional backarc effects.
(c) Adding new ground motions increases the number of recordings at selected sites, which enables the evaluation of non-ergodic (site-specific) site response effects at those sites.
In this section, we identify areas where the CAM component of the database was extended and improved. First, we describe the development of source and path metadata for
NGA-Sub event metadata
There are 181 earthquakes from the CAM region in the NGA-Sub database for which classifications of event type (interface, intraslab, shallow crustal, or outer rise) were not provided. As a result, the data from these events were not used during model development. We selected 59 of the 181 unclassified events, each recorded at five or more stations, to develop source and path metadata following NGA-Sub protocols (Contreras et al., 2022). Figure 1 shows the magnitude–distance distribution in CAM for the updated database differentiated by event type (both the

Magnitude–distance distribution of the NGA-Sub database in CAM following updates of events with
Post-NGA-Sub events
In this subsection, we describe source metadata for three large earthquakes that occurred after the 2016 NGA-Sub cutoff date: intraslab events on 8 September 2017 and 19 September 2017 (

Event locations (focus and finite fault) for three events in 2017–2018 considered in the expansion of the CAM portion of the NGA-Sub database. Locations of ground-motion stations that recorded at least one of the events are also shown.
Table 2 presents moment tensor and finite-fault attributes of the three events. The seismic moments are taken from the Global Centroid Moment Tensor (CMT) catalog (Ekström et al., 2012) following NGA-Sub procedures. As described by Contreras et al. (2023a), multiple finite-fault models are available in the literature for each event. The selected models indicated in Table 2 were generally chosen because they considered multiple data sources, including nearby strong motion data and crustal displacements, and were published in peer-reviewed journals. In the case of the Chiapas event, tsunami data were also considered, and in the case of the Oaxaca event, a model that considered local seismic data was not available. Contreras et al. (2023a) provide plots of the as-published and trimmed finite-fault models; the trimming followed NGA-Sub protocols by removing portions of rupture rectangles with <15% of the maximum slip (justification provided by Contreras et al., 2022). The Chiapas finite model is a single rectangle whereas the Puebla and Oaxaca events are modeled with two rectangles with equivalent strike and dip angles but different lengths and widths. Additional metadata for each event is included in the source table (Contreras et al., 2024).
Moment tensor and finite-fault (rupture rectangle length and width) attributes of three large earthquakes since 2016 added to the CAM portion of database
Seismic moment from CMT catalog; bmoment magnitude and magnitude uncertainty; chypocentral depth.
The standard deviations on moment magnitude provided in Table 2 are a combination (square root of sum of squares) of magnitude uncertainty from alternate seismic moments (from different finite-fault models) and seismic moment uncertainty associated with the inversion performed in a given model. Details of these calculations are omitted here for brevity but are provided by Contreras (2022).
Ground motions from post-NGA-Sub events
Earthquake ground motions in Mexico from the three large-magnitude events described in the previous subsection were obtained from the RAII-UNAM network (approximately 100 accelerometer stations across the southern part of Mexico), the RACM network (81 accelerometer stations in Mexico City), and a series of smaller networks for which data are disseminated by SSN (broadband seismic network IG, Pérez et al., 2018; SSN, 2020; Valley of Mexico seismic network VM, Quintanar et al., 2018; and Veracruz seismic network UV; Córdoba-Montiel et al., 2018; each instrumented site within these networks has both an accelerometer and broadband seismometer). Locations of sensors from these networks in the vicinity of the three added events are shown in Figure 2.
Data from the RAII-UNAM network were downloaded in acceleration units with instrument corrections already having been applied. Data from the RACM network, following network approval for download, were similarly in acceleration units and included instrument corrections. Data from the three networks disseminated by SSN were downloaded as raw data (counts) and instrument corrections were applied as part of the present work (details in Contreras et al., 2023a). Additional information on the websites used to access the data is provided in
Table 3 summarizes the numbers of ground-motion recordings that were collected from the different networks for each of the three considered events. In total, 585 three-component acceleration recordings from 219 ground-motion stations were compiled. In terms of number of recordings, 94% of the data (551 recordings) are from the RAII-UNAM, RACM, and IG networks, whereas the VM and UV networks contribute the remaining 6% (34 recordings). Approximately 40% of the stations are located in Mexico City, mainly from the RACM and VM networks, which produces clusters of the collected data within a narrow range of rupture distances. More than 190 recordings are available for each event.
Numbers of ground-motion recordings by earthquake and network
For all 585 three-component ground motions, data processing and computation of IMs was performed following the methodology applied in the NGA-Sub project (Kishida et al., 2020). These procedures, which are well described elsewhere, produce a high-pass corner frequency
Figure 3 shows median-component (RotD50; Boore, 2010) PGA values versus rupture distance for the offshore Chiapas event (similar plots for the other events are provided in Contreras et al., 2023a). The figure highlights data distributions from different networks. Modest flattening of the PGA trend with distance is evident for

Recorded PGA values for the 2017
An additional feature of the data that is unique to Mexico is the extraordinary impact of site response in Mexico City. This appears in the plots as tall “stripes” of data within a narrow distance range, centered on approximately 600 km for Chiapas. This occurs both because of the highly variable site response within the city and the large numbers of recordings that were made, sampling a wide array of geotechnical conditions.
Path parameters
Path parameters compiled for ground-motion analysis are rupture distance (
Rupture distance and additional source-site path parameters were computed using station locations and finite-fault models from literature for the three large 2017–2018 events. When finite-fault models were not available, a simulation procedure was used to estimate the finite-fault rupture surface (Chiou and Youngs, 2008; Contreras et al., 2022). The simulation procedure was used for the
As with other subduction zone plate boundaries, Mexico has a volcanic arc. In NGA-Sub, volcanic-arc locations were delineated by drawing (by hand) lines connecting volcanoes as provided by the Smithsonian Institute’s Global Volcanism Program (2013). These lines were used to categorize the forearc (trench-side) and backarc of each subduction zone region. Figure 4 (red line) shows this line in the CAM region. We reconsidered the location of the volcanic front, recognizing that regional volcanoes are located not on a lineament but across a zone, which is referred to as the trans-Mexican volcanic belt (TMVB) (Ferrari et al., 2012) as shown in Figure 4. The TMVB is about 1000 km long in the west-east direction, spanning from the Gulf of California to the Gulf of Mexico, and ranges in width from 90 to 230 km. The TMVB contains 24 stratovolcanoes and calderas, about half of which are near the TMVB’s southern margin. We relocated the forearc-backarc boundary to the southern edge of the TMVB, as shown in Figure 4, so that the backarc would include the full extent of the TMVB region. As a result, some stations that had been classified as forearc in NGA-Sub (e.g. City of Puebla) are now backarc. With both the current and previous boundaries, Mexico City is located in the backarc region.

Map showing volcanic boundary applied in NGA-Sub (red line), trans-Mexican volcanic belt (orange area), and relocated volcanic front applied for modeling purposes in this study (purple line).
For each source-to-site path, the percentage of the direct-line path in forearc and backarc was computed using the relocated volcanic front boundary in Figure 4. We find that 55.0% of paths are forearc-only, 42.2% originate in the forearc and end (at the station) in backarc, 2.1% originate in the forearc and end outside the volcanic-arc region, 0.4% originate in backarc and end in forearc, and 0.3% are backarc-only. Path data are provided for each record in the project path table (Contreras et al., 2024).
Site parameters
Site parameters for the locations of ground-motion recording stations have been developed for all stations that produced usable ground motions. These parameters update site information that was included in the NGA-Sub database. The procedures used to assign site parameters are conceptually the same as those used in NGA-Sub (Ahdi et al., 2022), but more information on site conditions has been collected in the present work.
Information on geotechnical conditions is much more available in Mexico City than in other parts of CAM. Accordingly, different approaches for assigning site parameters were applied. Site conditions in Mexico City are presented by Contreras et al. (2025). Here, we address site parameters assignments for the remainder of Mexico.
We searched for
Adaptation of global GMM for CAM region
The NGA-Sub global GMMs were developed with regional adjustment factors applicable to constant terms, anelastic attenuation coefficients, magnitude-scaling breakpoints (separating strong scaling below the break from weaker scaling above the break), and site response (
In the case of CAM, data quantity in NGA-Sub was limited due to the relatively small number of recordings per event (average of 9), although not in the number of events (82 events). Data quality was also problematic, with highly uncertain site parameters and most events lacking finite-fault models. For these reasons, regionalization of NGA-Sub GMMs for CAM used a combination of globally and locally constrained components (e.g. Pea22 applied global coefficients for site response and constant terms (slab events), whereas regional coefficients were derived for anelastic attenuation and constant terms (interface events)). The data developed in this study, particularly the well-recorded 2017–2018 events, expand the size and quality of the CAM data. Here, we use the updated data to examine the regional performance of the Pea22 GMM. Specific objectives are to evaluate whether regional coefficients require adjustment to fit the data, to evaluate whether differences between backarc and forearc anelastic attenuation are evident from the data, and to investigate regional site response characteristics. Contreras et al. (2023a) present similar analyses for the Kuehn et al. (2020) model, which is omitted here for brevity. The Abrahamson and Gülerce (2022) model was not considered in our residual analyses because it was not available at the time the analyses were performed.
Approach
Data selection criteria
The combined database for CAM has three components: (1) events in the NGA-Sub database, which are all
Records from the project flatfile were selected for use in ground-motion analyses in a manner consistent with that of Pea22, as follows:
Metadata necessary for model development is available, including
Earthquake classified with high confidence as being interface or intraslab;
Earthquake is a mainshock (Class 1;
Sensor depth ≤ 2 m;
Interface events with hypocentral depths (
PSA at
Earthquakes without multiple event flags; these are events for which the recordings do not indicate that more than one seismic source affected the ground motions;
Earthquakes with source review flags = 0, 1, 2, or 4, which indicate earthquakes that underwent quality control checks and meet metadata quality standards;
Records that capture the start of the P-wave (i.e. those without a late P-wave trigger flag);
After applying criteria 1–10, we only used records from events having at least three recordings.
A criterion from Pea22 that is not applied here is to require that the earthquake epicenter and stations are both located in the forearc region. This criterion was not applied to allow for analyses of potential differences in anelastic attenuation in forearc and backarc regions.
The numbers of events and recordings used for model development vary as a function of period due to Criterion 7, with a range of 329–758 records for combined data from both event types. Figure 5 shows 369 usable interface records in the screened database at rupture distances of 14–1270 km from events with

Magnitude–distance distribution of the screened NGA-Sub database in CAM.
Residual analysis procedures
Using the screened data, residual analyses were performed to examine various attributes of CAM ground motions, including:
Constant term: Allows for an assessment of the degree to which the data are systematically higher or lower than those for a global NGA-Sub model. The constant term is a regionalized attribute in NGA-Sub models.
Source parameter scaling: Check of whether the CAM data are consistent with magnitude- and source depth-scaling relations in an NGA-Sub model. These source scaling relations are not regionalized in NGA-Sub models.
Distance attenuation: GMMs have terms for geometric spreading and anelastic attenuation, the latter of which are regionalized in NGA-Sub models. We investigate CAM-specific regional attributes of anelastic attenuation, both in the forearc and backarc, and for interface and intraslab events.
Site response: The scaling of site response with
Consider an earthquake event
where
where
To quantify systematic event and site misfits (referred to as
where
where
If non-zero constant terms or adjusted path coefficients were needed to fit the data, these modifications were made and the GMMs updated. This required re-computation of residuals, event terms, and site terms, to confirm that misfits were removed. In this way, the residual analyses were iterative.
Once the iterations related to path coefficients were completed, site response was examined by computing site terms relative to the reference condition in the adjusted GMM:
Superscript
Regional path effects
Initial residual analyses using the CAM data with the Pea22 GMM revealed bias terms
The Pea22 path model uses the following equations:
The
We investigate whether the Pea22 path models are able to capture data trends with distance for interface and intraslab events. We also apply the models to backarc data to investigate potentially faster anelastic attenuation relative to the forearc. Preliminary analyses of Pea22 path model performance examine the trends of

Trend of Pea22 within-event residuals for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with

Trend of Pea22 within-event residuals for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with
We anticipate backarc regions to have different rates of anelastic attenuation based on findings elsewhere, mainly Japan (Ghofrani and Atkinson, 2011). Beginning with the null hypothesis that forearc attenuation rates apply in the backarc, we plot in Figures 8 and 9
where

Trend of Pea22 within-event residuals for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with

Trend of Pea22 within-event residuals for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with
To remove the faster backarc anelastic attenuation trend, we fit a linear trend line as follows:
where
Note that by using
Example fits of Equation 9 to the data are shown by the blue lines in Figures 8 and 9. Figure 10 plots

Backarc additional anelastic attenuation coefficient
Regional site response
Residuals were re-computed using the Pea22 GMM modified to include backarc path effects (Equation 10 with coefficients in Figure 10). We assume site response is independent of event type (justification provided by Parker and Stewart, 2022), which allows the combined use of
Using the combined data set exclusive of VM sites, Equation 4 is used to partition

Trend of site terms for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with
As described in the
Regional constants and checks of source model
The residuals analyses performed in this subsection use the Pea22 GMM with two adjustments: (1) backarc path model (Equation 10) and (2) application of a local VM site response model as presented by Contreras et al. (2025) for site response in that region. Residuals analyses performed with those modifications produce the constant terms (

Pea22 GMM model bias for peak acceleration, peak velocity, and PSA for a range of periods: (a) interface events and (b) intraslab events. Range indicates 95% confidence interval of the mean bias.
Figures 13 and 14 show the variation of modified Pea22 event terms (

Trends of modified Pea22 event terms for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with magnitude for interface events. Error bars are 95% confidence intervals. Gray area indicates ±1.0

Trends of modified Pea22 event terms for PGA, PGV, 0.3 s PSA, and 3.0 s PSA with magnitude for intraslab events. Error bars are 95% confidence intervals. Gray area indicates ±1.0
Epistemic uncertainty
Pea22 estimated the epistemic uncertainty in their GMM using the uncertainty in the constant terms. For regional models, these uncertainties scale with the inverse of the square root of the number of observations (recordings). Since the number of recordings has increased significantly from 103 in the Pea22 model development (after data selection criteria were applied) to 821 from the present work, a reduction in epistemic uncertainty is expected.
Figure 15 shows the epistemic uncertainty (

Epistemic uncertainty
Summary and discussion
This research examined ground motions from subduction earthquakes in CAM. We significantly extended the amount of earthquake ground-motion information for CAM relative to what was used in NGA-Sub model development. The added information was from a
The expanded database was then used to evaluate regional features of GMMs for CAM. These analyses produced several important findings, most of which are related to the degree to which the current data support or require modifications to an NGA-Sub GMM. We find that the constant terms in the Pea22 GMM do not require adjustment. We find that ground-motion scaling with source parameters (magnitude and source depth) is sufficient in the GMM. However, the 2017
We found forearc distance scaling in the Pea22 GMMs to be unbiased, but identify faster attenuation in the backarc for both intraslab and interface events. We provide a modification to the path model to account for these effects (Equation 10 and Figure 10). The faster backarc attenuation is significant at short periods (
We find that the average site responses across the CAM region are consistent with the global models of Parker and Stewart (2022). This is useful because it provides a basis for site response estimation across the region, but with “carve outs” for special areas like the VM where region-specific models should be used (Contreras et al., 2025). The soft sediments in Oaxaca and Puebla produce stronger site responses than those provided by the global model, particularly at short periods. Our ability to study these site response features is limited by a lack of site data. The overall database expansion and model refinements presented here were demonstrated to reduce epistemic uncertainties in the Pea22 GMM by about 30%–50%.
The present findings provide an opportunity to update ground-motion hazard analyses in CAM. Advantages of the present approach include the following:
The models are transparent, allowing for future research where individual components can more readily be interrogated and improved.
The CAM-specific models are calibrated for regional features (constant term, site response, anelastic attenuation, magnitude breakpoint) but are constrained by global data for other features that are generally not found to have appreciable regional variations (magnitude and source depth scaling, geometric spreading).
The GMMs include different rates of forearc and backarc anelastic attenuation, which is important for seismic hazard analyses for backarc locations such as Mexico City.
Supplemental Material
sj-csv-1-eqs-10.1177_87552930251322789 – Supplemental material for Regionalization of global subduction ground-motion model for application in Central America and Mexico
Supplemental material, sj-csv-1-eqs-10.1177_87552930251322789 for Regionalization of global subduction ground-motion model for application in Central America and Mexico by Victor Contreras, Jonathan P Stewart, Juan M Mayoral and Xyoli Pérez-Campos in Earthquake Spectra
Footnotes
Data resources
Declaration of conflicting interests
Funding
Supplemental material
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
