Supervised Machine Learning of High Rate GNSS Velocities for Earthquake Strong Motion Signals

High rate Global Navigation Satellite System (GNSS) processed time series capture a broad spectrum of earthquake strong motion signals, but experience regular sporadic noise that can be difficult to distinguish from true seismic signals. The range of possible seismic signal frequencies amidst a high, location‐varying noise floor makes filtering difficult to generalize. Existing methods for automatic detection rely on external inputs to mitigate false alerts, which limit their usefulness. For these reasons, geodetic seismic signal detection makes for a compelling candidate for data‐driven machine learning classification. In this study we generated high rate GNSS time differenced carrier phase (TDCP) velocity time series concurrent in space and time with expected signals from 77 earthquakes occurring over nearly 20 years. TDCP velocity processing has increased sensitivity relative to traditional geodetic displacement processing without requiring sophisticated corrections. We trained, validated and tested a random forest classifier to differentiate seismic events from noise. We find our supervised random forest classifier outperforms the existing detection methods in stand‐alone mode by combining frequency and time domain features into decision criteria. The classifier achieves a 90% true positive rate of seismic event detection within the data set of events ranging from MW4.8–8.2, with typical detection latencies seconds behind S‐wave arrivals. We conclude the performance of this model provides sufficient confidence to enable these valuable ground motion measurements to run in stand‐alone mode for development of edge processing, geodetic infrastructure monitoring and inclusion in operational ground motion observations and models.

Abstract High rate Global Navigation Satellite System (GNSS) processed time series capture a broad spectrum of earthquake strong motion signals, but experience regular sporadic noise that can be difficult to distinguish from true seismic signals. The range of possible seismic signal frequencies amidst a high, location-varying noise floor makes filtering difficult to generalize. Existing methods for automatic detection rely on external inputs to mitigate false alerts, which limit their usefulness. For these reasons, geodetic seismic signal detection makes for a compelling candidate for data-driven machine learning classification. In this study we generated high rate GNSS time differenced carrier phase (TDCP) velocity time series concurrent in space and time with expected signals from 77 earthquakes occurring over nearly 20 years. TDCP velocity processing has increased sensitivity relative to traditional geodetic displacement processing without requiring sophisticated corrections. We trained, validated and tested a random forest classifier to differentiate seismic events from noise. We find our supervised random forest classifier outperforms the existing detection methods in stand-alone mode by combining frequency and time domain features into decision criteria. The classifier achieves a 90% true positive rate of seismic event detection within the data set of events ranging from M W 4.8-8.2, with typical detection latencies seconds behind S-wave arrivals. We conclude the performance of this model provides sufficient confidence to enable these valuable ground motion measurements to run in stand-alone mode for development of edge processing, geodetic infrastructure monitoring and inclusion in operational ground motion observations and models.

Plain Language Summary Continuously operating, high sample rate Global Navigation
Satellite System (GNSS) sensors that experience ground shaking from an earthquake can provide valuable data regarding the nature of the ground motion. If this data is streamed in real-time, these observations can complement existing traditional seismic infrastructure measurements that are used for earthquake early warning or rapid ground motion assessments. However, the data from these sensors can be noisy and have non-earthquake artifacts that are difficult to tell apart from true seismic signals. In this work we used a nearly 20-years archive of high sample rate GNSS velocities occurring during known seismic events to train, validate and test a machine learning model for earthquake detection. This machine learning approach is taken from existing algorithms used for a wide variety of challenging classification problems where a label can be applied to a sample. We demonstrate that this data-driven method, without any external information, is more likely to detect these signals with less false alarms when compared to existing methods. The added confidence this algorithm provides will allow these valuable measurements to be included in operational seismic assessment and warning decision criteria.

DITTMANN ET AL.
Current approaches to detect motion use variations of time domain thresholds to flatten the decision to a function of signal amplitude. Several existing approaches make use of low-pass filters similar to traditional STA/LTA seismological phase picking (i.e., Allen & Ziv, 2011;Goldberg & Bock, 2017;Kawamoto et al., 2016;Ohta et al., 2012;Minson et al., 2014) that extract static offsets for finite fault inversion but filter valuable dynamics information. Recent interest in peak geodetic dynamic signals (Crowell, 2021;Fang et al., 2020;Melgar et al., 2015;Ruhl et al., 2019) prompted use of unfiltered timeseries to capture peak signals for magnitude scaling laws and ground motion intensity measurements (Figures 1b and 1c). These epoch-wise threshold detection methods (i.e., Crowell et al., 2009;Dittmann et al., 2022;Hohensinn & Geiger, 2018;Hodgkinson et al., 2020;Psimoulis et al., 2018) use instantaneous measurements to estimate motion onset and therefore are a step-forward for inclusion of GNSS-seismology waveforms, but have limited "real-world" testing and most importantly mitigate false alerts for operational systems by correlating detections with proximal stations within networks or windowing in time from seismic triggers. These processes reduce the usefulness of these measurements for rapid, stand-alone decision criteria. The evolution of these detection methods has been vital for the vanguard of GNSS-based seismology, but fall short for real-time operational hazard systems to ingest the full temporal and Figure 1. An example of the difficulties of differentiating a relatively weak seismic global navigation satellite system (GNSS) signal event and a GNSS noise disturbance using existing detection methods. The signal depicted is the east component of station P507 observing a M W 5.41 (USGS event ID: ci15200401) at approx. 23 km; velocities are presented in seconds relative to the event origin time (OT). The proximal noise disturbance depicted is a non-geophysical processing artifact or signal propagation effect that might result from sources such as cycle slips, ephemeris, multipath, or other signal path effects. Panel (a) is the 5 Hz timeseries, in addition to a low-pass filtered (corner frequency of 0.5 Hz) timeseries to emphasize the signal and noise for the reader. Gray shading represent areas within the noise estimate for each respective method. Vertical dashed lines are estimated (iasp91 model) P-and S-wave arrival times. Panel (b) illustrates a static threshold taken from (Hodgkinson et al., 2020). This approach is sensitive to the weak signal, but equally sensitive to noise. The threshold has limited memory and rapidly alerts to the onset of the noise disturbance, and also issues several additional false alerts around 105 s OT. Panel (c) is a variation on an STA/LTA approach implemented from (Psimoulis et al., 2018) called RT-Shake with a moving threshold of three times the moving standard deviation. This approach detects the signal event later in the waveform with little information regarding the event duration. The noise disturbance adds an initial false alert, after which the noise region expands to minimize additional false alerts. However this memory would result in missed detection should such a noise disturbance occur immediately prior or during a seismic event.
10.1029/2022JB024854 3 of 13 frequency range of these valuable measurements into models with minimal stand-alone false alerting.
In this work, we evaluate whether existing GNSS hardware can: more reliably detect motion signals that are (a) constellating near the ambient temporal noise floor (b) with minimal false alerting (c) in a low-latency, stand-alone mode and (d) with no specific fault or network geometry. We trained a machine learning classifier on a supervised data set of GNSS velocity time series concurrent in space and time with known seismic source signals. We assembled, processed and labeled a data set of 1,701 earthquake-station high rate (5 Hz) time series pairs. We also include a substantial seismic event-free noise data set to improve model generalization. We optimized the classifier on these combined datasets with applied domain knowledge to feature selection and feature engineering that is able to combine time and frequency domain information. We present the superior performance of this classifier relative to existing methods within this motivational context. We offer advantages and implications of deploying this processing and trained model at scale for network wide monitoring, with particular emphasis on the improved sensitivity and integrity of stand-alone GNSS seismic event detection without external inputs.

Signals of Interest
We define our detection domain as a binary event or no event state classification. A critical component of developing a robust classification model is a substantial data set from which to train, validate and test the model. For optimal results, this data set requires broad spectrum noise and signal samples such that the model can "learn" and generalize our classification and distinguish signal from noise. We assembled a catalog of 1,701 station-event pairs from 77 events by cross referencing available 5 Hz GNSS observational data in the UNAVCO geodetic archive with Advanced National Seismic System Comprehensive Earthquake Catalog (COMCAT) of earthquakes greater than M W 4.5. While 1 Hz data is more readily available, 1 Hz observable decimation undersamples certain event velocity spectra (Joyner, 1984). We observe this effect in reduced velocity amplitudes from 1 Hz data when compared to 5 Hz observables in several nearfield TDCP velocity timeseries, such as the M W 6.2 2021 Petrolia event. For larger magnitude events it's likely that sampling closer to 10 Hz is necessary to avoid aliasing (Shu et al., 2018), but we balance this design parameter with the need for sufficiently large available datasets for training. We assigned a conservative radius of detection for each event using ambient noise estimation from Dittmann et al. (2022). For each station-event pair within this spatial footprint, a time series window began 2 min prior to earthquake origin time (OT), and extends out in time as a function of radius ( Figure 2). We conservatively buffered the radius and time window to mitigate limiting this result from the existing model. We also processed the available 5 Hz observables for a 30 min window in the hour prior to event times of the event catalog from 2017 to 2021. This noise catalog consisted of 1,507 unfiltered station-noise timeseries from 904 unique stations across a range of receiver types, geographic locations, antenna environments and atmospheric conditions, among other potential TDCP noise variance sources. Inclusion of this extended, real-world noise data set in training and validating will improve the model's generalization, or performance on unseen data. Current use of GNSS-derived seismic ground motion for operational EEW (Murray et al., 2018) use precise point positioning (PPP) derived topocentric coordinates to capture dynamic waveforms or static offsets relative to a stations a priori position. Instead, we align synchronous carrier phase epoch-wise changes, predicted satellite orbital velocity and line-of-sight geometry to accumulate coherent energy with respect to the shared receiver clock drift rate and directional velocities in a local reference frame. Variations of this geodetic processing method, known as time differenced carrier phase (TDCP) ( van Graas & Soloviev, 2004) or variometric velocities, can record co-seismic velocity waveforms (Crowell, 2021;Grapenthin et al., 2018;Hohensinn & Geiger, 2018) as well as integrated over time into seismic displacement waveforms (Branzanti et al., 2013;Colosimo et al., 2011;Fratarcangeli et al., 2018). We processed these 5 hz measurements with the open-source SNIVEL package  (Crowell, 2021) using broadcast ephemeris and narrow lane phase combinations. We chose TDCP over PPP because it is more sensitive to motion Fang et al., 2020), and it is "lightweight" in that it does not require sophisticated corrections and is computationally inexpensive. From a machine learning perspective, this could be considered a first step in our feature engineering, or applying domain knowledge to extracting features that are correlated with motion in observed carrier phase measurements.

Feature Engineering Pipeline
Data-driven supervised machine learning models are widely used in computer vision and natural language processing due to their superior accuracy for challenging classification, regression and clustering problems. Earth scientists have adopted many of these models for geoscience research (Kong et al., 2019). Recent catalogs of historic seismic data training sets (e.g., Stanford Earthquake Data Set (Mousavi et al., 2019), INSTANCE (Michelini et al., 2021)) have contributed to benchmarking improvements of earthquake detection, phase picking, localization, and magnitude estimation (e.g.  2019)). These extensive labeled data sets enable sophisticated data-driven classifiers and deep learning models using inertial seismic data. Several geodetic applications of machine learning algorithms have demonstrated promising results with respect to seismic processes. Crocetti et al. (2021) used a random forest classifier for antenna offset detection, including due to earthquake offsets, from low-rate, 24-hr position solutions. Habboub et al. (2020) applied a neural network to coordinate time series anomaly detection applicable to specific regional datasets well above the noise floor. Dybing et al. (2021) used neural networks for earthquake detection and Lin et al. (2021) employed deep learning used for rapid event magnitude estimation; both of these studies used extensive synthetic displacement waveforms derived from real-world fault geometries and real-world PPP noise models.
In our study, we used a random forest algorithm for our classifier (Breiman, 2001) of GNSS velocities. Random forest is an ensemble of decision trees; a single decision tree is a classifier where input features are split along thresholds to separate source, or root, data from end node classifications, or leaves. An ensemble or forest of trees each vote on the feature decision criteria to select the optimal decisions toward minimizing correlated noise. Due to the infrequent nature of larger magnitude earthquakes, the event classes are naturally imbalanced but by pre-selecting specific time series of events, we have reduced this imbalance for training (Table 1) and testing. Random forest hyperparameters were selected using a grid search over the number of decision trees used, the maximum decision splits within a tree, and imbalance classification weighting strategies. SNIVEL TDCP processing generates 5 Hz time series of the three topocentric velocity components and the clock drift rates. From these event-station pair time series of velocities, we generated feature sets to label for our supervised classification ( Figure 3). Our feature samples consisted of three directional components of 30 s windows overlapping every 10 s; within these windows we included the four maximum component norm window values, window median, window median absolute deviation and window power spectral densities from the lowest frequencies bins containing periods 1-30 s as features. These features and windowing allowed our model to incorporate signal and noise amplitude in the time domain, akin to the traditional threshold approach, as well as power spectra in the frequency domain. In our binary classification, an event is seismic ground motion in an individual component. Labels were assigned through visual inspection as not event or 0, event or 1, and maybe for windows East (n = 135,671) North (n = 135,671) Up (n = 135,671) Note. For more information regarding the distribution of peak values, see Figure 5c. a Maybe's excluded from training/testing. We evaluated two feature extraction models. Feature set #1 was a combined array of all three directional components with a single label at each window. The horizontally concatenated components resulted in 3 × m features and n samples, where m is the number of features per component (m = 36 in our pipeline) and n is the number of window samples. If any component was labeled "1" for event, the feature set #1 sample label was "1" for event.
If a maybe label was present without yes events on the other concurrent components, the window was excluded from training/testing. Feature set two included a target vector for each component but excluded the noisier vertical signals. These vertically concatenated components resulted in m features and 2 × n samples. In this extraction case any maybe labels were excluded from training and testing.
We employed a nested cross validation approach for unbiased testing of our data set. We initialized 10 different folds of randomly splitting the 77 events and noise catalog samples into 90% training and 10% testing. By splitting on events we avoided "leakage" of information from our training into our testing, including correlation of seismic waveforms from any given event observed across a network. By cross validating over 10 folds we minimized biasing our result by the relatively small testing subsets of events, and can quantify the ability of our classification model to generalize for future events. Each event was observed by a different number of stations depending on network density and sensing radius, and each station-event pair had differing number of time samples; consequently the feature vectors of training and testing were not precisely 90/10 split in samples. In each fold, we held the test set aside as "unseen," and tuned our model using K-fold cross validation (Bishop & Nasrabadi, 2007) on the remaining training set. We implemented five inner folds in our K-fold cross validation to find the best hyperparameters. This cross validation approach allowed us to minimize overfitting the training data set and evaluate the performance of our model on unseen data as though it were running such a classifier on yet-to-occur events.
The traditional "accuracy" metric, or the ratio of the correctly classified labels relative to the total number of labels, of our classification will be less sensitive regardless of optimization choices due to the infrequent events of our imbalanced classification. Instead, we optimized on metrics that reflect accurately classifying the infrequent events. Precision, or positive predictive value, is equal to the number of true positives (TP) over the sum of TP and false positives (FP).
recall, or sensitivity, is the number of TP over the sum of TP and false negatives (FN). = + (2) F1 is the harmonic mean of precision and recall: Here, positive denotes motion and vice versa.
Precision and recall are approximately inversely related and each is a function of our random forest decision threshold (Figure 4). Quantifying missed detections and false alert rates is imperative for the effectiveness of any EEW system (Minson et al., 2019). We optimized hyperparameters on F1 scores, a balance of precision and recall, but this parameter is a knob available to tune depending on societal missed detection of false alerting tolerances of a future operational system.

Results and Discussion
We evaluate the two optimal feature selection strategies and a range of random forest hyperparameters using a grid search. Given the F1 scores of our 10 fold nested cross validation approach (Table 2), our optimal model used feature set #1, with all available spatial components with a single target label to accumulate as much signal as possible toward our binary classification. Each train/test fold selected different optimal hyperparameter combinations for optimizing F1 scores, but the majority selected 100-200 decision trees, 100 decision splits and no class weighting with a decision threshold of 0.4 (Figure 4). This decision threshold was selected inside the cross validation of each split and applied to testing sets along with the other hyperparameters selected. Our mean and one standard deviation nested cross validation F1 score of 0.70 ± 0.12 indicates our ability to successfully train a model using random forest. The variance in our results as a justifies our nested cross validation approach to quantify the variability in results as a function of the testing set; presumably some variability will resolve with expanded target catalogs.

Feature Importance
A benefit of random forest is that individual feature importance is readily extracted from the trained model. When evaluating feature set one, we find several aspects of the feature importances that align with our domain knowledge and therefore contribute to the explainability of our trained model. The horizontal velocity components dominate the contribution to the model (Figure 5a). GNSS ambient noise on the vertical component is much higher than that of the horizontal components and vertical seismic signal amplitudes are diminished relative to horizontal motion along horizontal strike-slip fault mechanics that are common in the spatial region of this study. These less frequent signals amidst a higher relative noise floor were harder to detect and thus contributed less to the empirical classification model. Within a horizontal component, the lower frequency spectral features had the most influence (Figure 5b). The most important frequency bins were between 6 and 15 s periods, aligned with the dominant frequencies of seismic surface waves. Our 5 Hz sampling, as compared to lower rates, boosted the detectability around the noise floor, and avoided corner frequency aliasing of certain magnitudes.
The time domain features contributed to the model, albeit much less than the lower frequency spectral content and with a more complex relationship. Figure 5c shows increasing F1 score with increasing peak velocity up until approximately a peak velocity of 25 cm/s in the east, followed by diminishing performance. We infer this to be the result of readily visibly identifiable signal events experiencing strong to very-strong shaking around 5-20 cm/s (Worden & Wald, 2020), well above the median noise floor. Infrequent, highest peak velocities (≥25 cm/s) might either be the result of the largest events or noise disturbances; the latter are likely degrading the performance within these peak velocity bins. Figure 5d presents a more straightforward feature relationship in the frequency features, where the greater the accumulated power in the frequency bands of greatest importance (b), the higher the performance metrics (F1, recall, precision). After an initial evaluation, we removed the highest frequency power spectral densities from our features; these are logically "noise" in our classification and not contributing. Altogether, these feature importances illustrate a key attribute of such a machine learning approach: combining features in an explainable way into an effective decision process.

Comparison With Existing Methods
A critical performance indicator is evaluating how our classification model performs over a range of test events relative to existing threshold approaches. Logic was applied to map existing continuous epoch-wise time domain threshold detection to our 30 s overlapping window target labels. For a threshold method comparison similar to the approach of Hodgkinson et al. (2020) and Dittmann et al. (2022), we estimated the noise threshold in the 2 min window prior to seismic origin time. Hodgkinson et al. (2020) characterized the stand-alone sensitivity of detection using ambient noise antecedent to an event as a Gaussian heuristic threshold. Dittmann et al. (2022)   imated the 2 min window of ground velocities as a non central chi-square (NCX2) distribution with three degrees of freedom, and then set the 0.995 confidence level value of this distribution as a noise floor approximation. Any three dimensional GV magnitude above this noise threshold after this window is considered an event, and evaluated on whether it falls within a window labeled motion or not. RT-Shake (Psimoulis et al., 2018) evolved the previous geodetic STA/LTA algorithms (Allen & Ziv, 2011;Ohta et al., 2012) by differencing instantaneous measurements from 80 epoch moving averages and then related these values to a moving window noise threshold estimate set to three times the standard error of the previous 80 epochs. This method was run on each component independently, with a single Boolean for the presence of motion on any component, and each sample window assigned a boolean based on the presence of any motion. The Dittmann et al. (2022) implementation of the threshold window in time was based upon S-Wave speeds (Crowell et al., 2013), and Psimoulis et al. (2018) modified STA/LTA correlated with surrounding stations to minimize false alerts; we did not add this logic so that we could simulate running as a stand-alone instrument.
The mean precision, F1 and accuracy from our 10 fold test of our random forest classifier outperforms the existing threshold approaches (Figure 6). In the threshold approach, recall is higher than the random forest classifier; given the large number of FP that this method triggers, we believe this value  is boosted by chance noise triggers occurring in windows of true motion triggering the motion boolean. This further demonstrates the value of optimizing on F1 as a balance of precision and recall to reduce biasing one decision criteria. Precision is low for both the threshold method and the STA/LTA, but for different reasons; while the precision values (Equation 1) are nearly identical, the threshold method suffers from a relatively high amount of FP, whereas the STA/LTA method low score is due to a lower amount of TP. This discrepancy is evident in the accuracy scores, where the STA/LTA outperforms the threshold approach. False positives would be decreased if using additional external information as their authors' suggest, such as stricter time window approaches and correlating in space within networks. Such an approach would also likely improve the random forest classifiers performance but limit the utility of a stand-alone detection node. Spatio-temporal information could be incorporated into future network decision criteria.

Edge Sensitivity Detection
Detecting the largest amplitude velocity waveforms relative to ambient noise does not present a significant challenge outside of mitigating false alerting from sporadic outliers (Figure 7a), with a 98% true positive rate of events greater than M W 6.0 and less than 100 km radius. The random forest classifier's balance of improved false alerting relative to thresholds and improved sensitivity relative to the STA/LTA is evident for these highest seismic risks. To further investigate the random forest model performance we evaluate detecting signals closer to the noise floor. For simplicity, we bin seismic motion edge case detection into two distinct classes in what is a continuous distribution: large magnitude event seismic motion detection in the far field, and smaller magnitude events detected in the nearfield.
In the relative nearfield, much of the seismic energy passes through a station in shorter duration, varied frequency signals. Earthquake focal depth and fault slip distribution in time and space can significantly vary these waveforms as observed. Critically, the waveform signatures can appear similar to those of non geophysical processing outliers which we wish to ignore for this classification. Most existing STA/LTA methods filter these noise signals but also these valuable higher frequency dynamics. In the previous threshold methods, detection of these edge cases was a function of the ambient noise level, with low precision resulting ( Figure 6) as a result of a high false positive rate. Our classifier has far less false alerts than the threshold approach in these signals, but nevertheless still presents the hardest detection domain for our classifier, evident in the missed detections of Figure 7b of events less than M W 6.0. The left Panels of Figure 8 is an example of a smaller magnitude event (M W 5.4) in the relative nearfield (21 km). The top four Panels ([a:d]-0) on the left of Figure 8)) demonstrate that accurately detecting such an event using the threshold or modified STA/LTA approach is difficult; not only does the true signal barely exceed the noise floor, but there are numerous false alerts using both methods. The random forest classifier captures the majority of labeled motion window in addition to "ignoring" the spurious disturbance around 100 s OT that triggers all other methods evaluated 8 (e−0).
The sensitivity of GNSS to longer period surface waves are apparent at relatively great radii in the 5 hz TDCP velocity time series (Figure 7). The model detects teleseismic surface waves in unfiltered GNSS velocities from Figure 7. Performance of Random forest model developed in the work here across the entire event catalog. We reduce detection of events to a single binary for the figure. In this, each event is evaluated in a "test" split during the nested validation pipeline. This approach ensures each result depicted was evaluated as "unseen" relative to the best fit model from the training subset, and therefore representative of our model's future performance.
a M W 8.2 (USGS event ID: ak0219neiszm) at 1,780 km epicentral radius in real-time with no external corrections; the right-hand Panels of Figure 8 provides an example of this detection. Future analysis could investigate the range of geodetic teleseism detection with respect to larger magnitude event directivity, attenuation and observational networks. In Figure 8d-1, the amplitude of the ground velocity magnitude of these long period signals is insufficient to cross the traditional noise threshold with consistency, and there are many antecedent false alerts. The modified STA/LTA RT-Shake approach does not identify the majority of the long period waves either (Figures 8a and 8c-1), while the random forest classifier in the bottom Panel only misses the first window (Figure 8e-1).

Decision Latency
Delay in alerting is critical to EEW. While our model is trained, tested, and validated on overlapping windows every 10 s, we evaluate running the model at once per second, the current US EEW (Murray et al., 2018) geodetic input rate (Figure 9)). On testing data not used in model training, we find a delay relative to the estimated P-wave, ∼3-5 s under 15 km exists in the current approach. Coarse P-and S-wave arrivals are estimated using the iasp91 model (Kennett & Engdahl, 1991); future work more accurately quantifying these phase arrivals such as the approach of Goldberg et al. (2018) would not only more accurately represent timing performance but also useful for training more sophisticated ground motion models. GNSS velocities using this current approach cannot  , with orange vertical lines indicating a potential alert where GV greater than the threshold. (e) Panels are a comparison of the labeled feature set 1 for these event-station pairs in purple, and the results of the model prediction in red. Shading is used to distinguish overlapping windows. This event-station pair prediction is extracted from the test or unseen event collection.
reliably be used for earliest phase picking, but can rapidly contribute to ground motion models or peak motion scaling laws (Fang et al., 2020). Given the feature importances of the classifier (Figure 5a), we interpret delays to be the result of the classifier trained on the relatively longer period signals visible within the noise. Depending on source magnitude and travel path, these could be P-, S-, surface-waves or some convolution of the energy traveling through the GNSS location. Variance in delays in the near field are likely due to inherent limitations of modeling rupture as a point source at proximal locations (Goldberg et al., 2021) and possibly related to errors of the iasp91 travel time model. Future work will address the possible limitations or delays introduced by our visual classification labeling. It is worth repeating that this assessment uses no external input or seismic triggering.

Ambient Noise Data Set
In addition to evaluating the performance within the bespoke event and noise data sets, we also evaluated the performance of the method during periods of quiescence to further quantify relative false alert rates. Our unseen testing set consisted of 1,321 30-min velocity timeseries from 2019 to 2021, not included with the original nested cross-validation data. We ran five-fold cross validation on the entire event and noise labeled data set from the nested cross-validation pipeline (Section 2.1) to select hyperparameters for training a complete model on all available labeled data for future "unseen" events. Such "unseen" events include this set-aside noise testing set. We confirmed there were no concurrent events greater than M W 4.0 in the USGS COMCAT catalog within the relevant spatial footprint and all other sources of noise or disturbances (signal multipath, oscillators, atmospheric anomalies, etc) remained in the test set. We assigned labels of non event to all target vectors associated with feature extraction. This allowed us to quantify ambient noise performance, or false alarm rate (Figure 10a) using the detection methods previously described in Section 3.2 from 860 unique stations from Alaska to the Caribbean across a range of potential TDCP noise or disturbance sources.
The random forest classifier was less susceptible to false alerts over the window tested than the threshold and STA/LTA approaches. The two threshold models have the highest rates to false alerting, an anticipated result based upon the precision metric reported in Figure 6. Station variations present in the random forest approach (Figure 10b) suggest the current random forest model has some station or time noise dependence not correlated with the variations of other detection methods. Future inclusion of more extensive noise training datasets into our detection classifier and possibly data augmentation techniques would likely be beneficial toward training on the widest variety of noise scenarios and optimizing feature engineering for these complex noise environments.

Conclusion
We applied an existing machine learning algorithm and sample splitting pipeline techniques to training, validating and testing a seismic motion detection classifier from 5 Hz TDCP GNSS velocities. We leveraged nearly 20 years of 5 Hz GNSS data archives for training a classification model that outperforms existing threshold approaches for detecting motion in stand-alone mode. The classifier combines time domain and frequency domain features to match the sensitivity of the threshold method without the false alerts, and matches the minimal false alerting of the STA/LTA with improved sensitivity. Given the agreement that GNSS velocities have with existing ground motion models (Crowell et al., 2022) and the increased confidence in separating signal from noise demonstrated here, these GNSS velocities can operationally contribute to ground motion measurements. The alert latency of this current model does not match the sensitivity of existing inertial infrastructure. A complementary approach using the information available at the time, including lowest latency p-wave characterization from inertial sensors and unsaturated velocity estimation from GNSS provides an optimal solution for existing dense multi-sensor networks. For less dense networks of either sensor type, it is more critical to establish a decision criteria for balancing timing, noise and accuracy of these independent observation systems. Further investigation of integrating the processing and classifying approach of this manuscript with the sensitivity of co-located MEMS sensors (Goldberg & Bock, 2017) would advantageously overlap seismic and geodetic traditional boundaries. Current 5 Hz GNSS observation data streams are too verbose for many bandwidth limited remote hardware; this presents an exciting opportunity for edge processing at potentially much higher rates (Shu et al., 2018), or experimental lean 5 Hz carrier phase data streams. Our method presented here does not use a sophisticated machine learning model, yet has improved detection relative to existing approaches; much improvement remains, especially with expanded datasets across global geodetic networks and/or synthetics or data augmentation for training, validation and testing of neural networks and deep learning models.
With an expanding availability and access to real-time GNSS streaming networks, the seismological community stands to benefit from this signal of opportunity for rapid ground motion detection for earthquake and tsunami source characterization. Furthermore, the vast industry of GNSS position, navigation and timing users catalyzing the expansion of these GNSS real-time networks will benefit from improved automated alerting of reference station motion onset. Future work will include integrating this classifier amongst existing and future automated GNSS carrier phase disturbance characterization methods, including space weather disturbances (Jiao et al., 2017), oscillator anomalies (Liu & Morton, 2022), radio frequency interference and signal multipath.