A Surrogate Model for Studying Solar Energetic Particle Transport and the Seed Population

The high-energy particles originating from the Sun, known as solar energetic particles (SEPs), contribute significantly to the space radiation environment, posing serious threats to astronauts and to scientific instruments on board spacecraft. The mechanism that accelerates SEPs to the observed energy ranges, their transport in the inner heliosphere, and the influence of the suprathermal seed particle spectrum are open questions in heliophysics. Accurate predictions of SEP events well in advance are necessary to mitigate their adverse effects, but prediction based on first-principles models remains a challenge. In this scenario, adopting a machine learning approach to SEP modeling and prediction is desirable; however, the lack of a balanced database of SEP events constrains this approach. We addressed this limitation by generating large data sets of synthetic SEP events sampled from the physics-based Energetic Particle Radiation Environment Module (EPREM). Using these data, we developed neural-network-based surrogate models to study the seed population parameter space. Our models, EPREM-S, run thousands to millions of times faster (depending on computer hardware), making simulation-based inference workflows practicable in SEP studies while providing predictive uncertainty estimates via a deep ensemble approach.

First observed in the 1940s as a ground level enhancement (GLE; Forbush, 1946), SEPs consist mainly of electrons and protons. SEPs can reach Earth (at a heliocentric distance of 150 million kilometers, or about 8 light minutes) in an hour or even less. Their energies range from several keV (kilo-electron volts) to a few GeV (giga-electron volts). Measurements of SEPs are available from space missions such as Parker Solar Probe (PSP), Solar Orbiter, the Geostationary Operational Environmental Satellite (GOES), the Advanced Composition Explorer (ACE), and the Solar Terrestrial Relations Observatory (STEREO) (Kaiser et al., 2008; McComas et al., 2016; Rodríguez-Pacheco et al., 2020; Stone et al., 1998); Figure 1 shows examples of such observations. The observed SEP events have been broadly classified into two categories, impulsive and gradual, but recent work suggests four distinct types based on particle abundances (Reames, 2013, 2021, 2022). The impulsive events are electron-rich with significant (∼1,000 times) enhancements in ³He/⁴He and are associated with solar flares and Type III radio bursts. Gradual events, occurring in connection with CMEs and/or CME-driven shocks and Type II radio bursts (Cane & Lario, 2006; Reames, 2013, 2022; Schwadron et al., 2020; Winter & Ledbetter, 2015), are proton-rich with no significant ³He/⁴He enhancements. The observed characteristics and spatio-temporal profiles of SEPs are influenced by various physical phenomena at the solar sources, including the suprathermal seed particle spectrum or seed population (the low-energy particles often produced in solar flares and accelerated by coronal mass ejection shocks), the heliospheric magnetic field configuration, and the particle acceleration mechanism (e.g., CME speed and energy, and microphysics such as turbulence in the local interplanetary medium). Further, the typical radial dependence in CME speeds leads to changes in the strength of the CME-driven shocks associated with gradual SEP events throughout the heliosphere (Brooks & Yardley, 2021; Reames, 2013, 2021, 2022; Schwadron et al., 2010, 2017, 2018, 2020; Winter & Ledbetter, 2015). Therefore, an in-depth analysis of SEP profiles, along with observations of other solar and solar wind variabilities, will provide information on their energy spectra, ionization states, elemental and isotope abundances and, most importantly, the plasma properties of these sources and the physical mechanisms of their release and acceleration.
SEPs form a significant component of the radiation exposure risk for space exploration. They cause a wide range of disruptions and damage to our technological infrastructure both on Earth and in the near-Earth environment, including aircraft navigation systems, radio and wireless communications, and the electronics on board spacecraft. In addition, they can damage the DNA of astronauts in space and pose a radiation exposure threat to the crew and passengers on board high-latitude commercial flights. Further, SEPs are possibly the single major hurdle for future lunar and interplanetary missions (e.g., Schwadron et al., 2010, 2017, 2018, and the references therein). Therefore, mitigating space weather hazards due to SEPs is at the forefront of heliophysics research, from both operational and research perspectives. Providing adequate spacecraft shielding is one way to mitigate space radiation hazards, but the secondary particles, such as neutrons and nuclear fragments, generated in the shielding material when the impinging radiation carries energies exceeding 100 MeV, worsen the radiation hazard (see Schwadron et al., 2010, 2017, 2018, and the references therein). A more tangible solution is to accurately predict, with sufficient lead time, when and where these space weather events will occur. However, SEP prediction based on first-principles models continues to be challenging, and the role of the seed population in the acceleration and transport of SEPs is still an open question in heliophysics. In this scenario, adopting a machine learning (ML) approach to SEP modeling and prediction is desirable, and we present in this article the results of an ML approach known as surrogate modeling or emulation, in which we developed a neural network (NN) model named EPREM-S, a fast surrogate of the physics-based particle transport and acceleration model known as the Energetic Particle Radiation Environment Module (EPREM), developed by Schwadron et al. (2007, 2010). This article is organized as follows: Section 2 describes the data, models, and methods used, and Section 3 presents the results. We conclude with a discussion of the results and methods, along with the scope of the present work and our plans for future work using EPREM-S, in Section 4.

Methods and Data
The main challenge in developing ML-based models for SEP prediction is the lack of a sufficiently large and balanced data set of observed SEP events for training and validation of modern NN architectures. When the approach is formulated as a classification task (e.g., detection of SEP events vs. background), the problem is exacerbated further by class imbalance, as SEP events are rare in existing data. To circumvent this difficulty, we generated the first large data set of physically accurate synthetic SEP events using EPREM, a sub-module within the Earth-Moon-Mars Radiation Environment Module (EMMREM) framework, developed by Schwadron et al. (2007, 2010) for predicting the radiation exposure caused by solar energetic particles and cosmic rays on Earth, the Moon, Mars, and anywhere in the heliosphere between Earth and Mars. The EPREM model, the synthetic data generation using EPREM, and the development of the NN surrogate model are described in detail in the following sections.

Energetic Particle Radiation Environment Module (EPREM)
EPREM is a three-dimensional kinetic transport code that solves for the propagation and acceleration of energetic particles in the evolving magnetic fields of the inner heliosphere (Kóta et al., 2005; K. Kozarev et al., 2010; K. A. Kozarev et al., 2010; Schwadron et al., 2007, 2010) and is publicly available through NASA's Community Coordinated Modeling Center (CCMC). EPREM solves the focused transport equation in the Lagrangian frame of reference (i.e., comoving and field-aligned), incorporating parallel transport of particles, cross-field diffusion, and particle drifts. EPREM is designed so that the energy range, step size, and boundary conditions in energy space can be easily modified, and it is parallelized. EPREM has been validated against well-known events such as the Halloween event and against spacecraft data such as GOES and Ulysses (Schwadron et al., 2010, 2015, 2017, 2018). In addition, EPREM has been used in various contexts, such as modeling radiation doses at 1 AU during an extreme SEP event (Schwadron et al., 2014), solving for SEP distributions based on the focused transport equation (Schwadron et al., 2010), studying the time-dependent effects of SEP acceleration in the low corona during a CME evolution (K. A. Kozarev et al., 2010), and solving for pickup ion distributions (Hill et al., 2009), to mention a few.
The EPREM grid is situated on nested cubes whose surfaces are regularly subdivided into square arrays of square cells with grid nodes at their centers. The shell of grids at the inner boundary corotates with the Sun at each time step, and the nested shells advance radially outward with the solar wind. The displacement of each node at each time step Δt is given by Δx = Δt × V_x, where V_x is the solar wind velocity, so the nodes propagate at the speed of the solar wind (Schwadron et al., 2010). Due to the frozen-in nature of the coronal plasma, the magnetic field is carried along by the outward-propagating solar wind, giving rise to the interplanetary magnetic field, with an average orientation of ∼45° to the radial direction at 1 AU (the Parker spiral; see Hundhausen, 1972, for a review) caused by solar rotation (the spirals in Figure 4 of Schwadron et al., 2010). Observer nodes are a special class of nodes at which the energetic particle distributions are projected for a given observer such as Earth, the Moon, or a spacecraft. The magnetic field lines connected to these nodes are known as observer-connected field lines (the thick red and black dashed lines in Figure 4 of Schwadron et al., 2010). Given a source of particles such as SEPs or pickup ions, EPREM computes the distribution function as a function of space (heliocentric distance), time, and energy anywhere in the heliosphere at individual nodes advecting with the solar wind, naturally tracing the Parker spirals.
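The node advection and spiral geometry described above can be illustrated with a short calculation: the ∼45° Parker spiral angle at 1 AU follows directly from the solar rotation rate and the solar wind speed. The sketch below uses the 450 km s⁻¹ wind speed and 20-min time step from the simulation configuration in Section 2.2; the 25.4-day sidereal rotation period is our assumption, not a value given in the text.

```python
import math

# Parker spiral: tan(angle) = Omega * r / V_sw in the equatorial plane
OMEGA_SUN = 2 * math.pi / (25.4 * 86400)  # assumed sidereal rotation rate, rad/s
AU_KM = 1.496e8                            # 1 AU in km

def spiral_angle_deg(r_au, v_sw_km_s=450.0):
    """Angle between the interplanetary magnetic field and the radial direction."""
    return math.degrees(math.atan(OMEGA_SUN * r_au * AU_KM / v_sw_km_s))

angle_1au = spiral_angle_deg(1.0)  # close to the ~45 deg quoted at 1 AU

# Node displacement per EPREM time step: dx = dt * V_x
dt_s = 20 * 60          # 20-minute time step (Section 2.2)
dx_km = dt_s * 450.0    # radial advection of a node per step, in km
```

For a slower wind the spiral winds more tightly (larger angle), which is why the quoted ∼45° is only an average orientation.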

Data Generation
The EPREM simulator consists of more than 10,000 lines of C++ code, and it uses a human-readable configuration file format describing the physical characteristics of SEP events including parameters defining the grid, simulation time, background solar wind, seed particle source function, and shock parameters.
For this study, we selected five core parameters of the seed function as the inputs that vary, that is, the parameters that EPREM-S takes as input: the boundary function amplitude (Amplitude), the energy spectrum power-law index γ (Gamma), the radial scaling index β (Beta), the boundary function cut-off energy, also called the roll-over energy or knee (Cut-off), and the mean free path (λ0 or LAMO). EPREM utilizes a seed particle spectrum that is a function of energy and heliocentric distance from the Sun. The cut-off energy or knee signifies the contribution of deeply penetrating particles and is of great significance in the prediction of radiation hazards in the context of manned and robotic space exploration beyond the protective envelope of the terrestrial atmosphere and magnetosphere. We defined prior probability distributions over these five parameters using continuous uniform distributions. The upper and lower limits of each parameter were based on the numerous event analyses available in the literature and on our current understanding of SEPs and their origin, acceleration, and transport in the heliosphere. Values in these ranges are expected to yield simulated events that are physically meaningful and close to realistic scenarios. However, these ranges are not conclusive or final but rather a first step toward a better understanding of the seed population parameter space. For the chosen ranges, we selected uniform distributions because we would like EPREM-S to be trained over the whole parameter space in an unbiased manner. Future analyses using observed SEP events are likely to update the current ranges and may also suggest more informative prior distributions over these parameters that reflect real SEP events.
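Sampling from these uniform priors amounts to drawing each of the five parameters independently from its interval. A minimal sketch follows; the numerical bounds below are illustrative placeholders only, since the actual ranges used for EPREM-S training are not listed in this section.

```python
import numpy as np

# Continuous uniform priors over the five seed-population parameters.
# Bounds are hypothetical stand-ins for the literature-derived ranges.
PRIOR_BOUNDS = {
    "amplitude": (1e-2, 1e2),   # boundary function amplitude (placeholder units)
    "gamma":     (-3.0, -1.0),  # energy spectrum power-law index
    "beta":      (1.0, 3.0),    # radial scaling index
    "cutoff":    (0.1, 10.0),   # roll-over ("knee") energy, MeV (placeholder)
    "lamo":      (0.1, 1.0),    # mean free path lambda_0, AU (placeholder)
}

def sample_priors(n, rng=None):
    """Draw n parameter vectors psi ~ p(psi) from the uniform priors."""
    rng = rng or np.random.default_rng()
    lo = np.array([b[0] for b in PRIOR_BOUNDS.values()])
    hi = np.array([b[1] for b in PRIOR_BOUNDS.values()])
    return rng.uniform(lo, hi, size=(n, len(PRIOR_BOUNDS)))

psi = sample_priors(32000, rng=np.random.default_rng(0))
```

Each row of `psi` is one candidate EPREM configuration, matching the 32,000-event data set size described below.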
The rest of the EPREM parameters were fixed as follows. Though EPREM can be coupled with MHD models such as MAS and Enlil that are capable of simulating the global solar wind, we used the uncoupled version, in which a uniform solar wind is assumed throughout the simulation domain into which the shock is injected. For the spatial grid, we chose 1 row and 4 columns for the nested cubes so that there are 24 streams, each stream being a linked sequence of nodes representing the trajectory of fluid particles advecting with the solar wind. Along each stream, there were 288 nodes. The inner boundary of the simulation domain was set at ∼22.0 R⊙, and we selected a time step of 20 min for a 4-day simulation duration. For the background solar wind, through which the nodes are advected forming the Parker spiral (see Section 2.1), a uniform velocity of 450 km s⁻¹ was used, a reasonable value since the shock is injected near the equatorial region, where the slow solar wind is mostly confined except during solar maxima. The density was chosen to fall off as the inverse square of radial distance from the Sun. The simulation used 20 energy steps from 0.01 to 200 MeV/nucleon. A shock of width ∼70°, placed in the equatorial region at ∼60° from the face center and moving at 1,600 km s⁻¹, was injected 0.25 days after the simulation started.
For data generation we implemented a Python wrapper that allows us to call EPREM as a simple function ϕ = EPREM(ψ), which takes the parameter vector ψ as a PyTorch tensor, writes the values in the corresponding fields of a config file generated on-the-fly, executes EPREM inside a Docker container, parses the computed values from EPREM output files, and returns the output fluxes ϕ in a PyTorch tensor.This allows us to use EPREM in a flexible way in our ML data generation, training, and inference setup in Python.
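A stripped-down sketch of such a wrapper is shown below. The configuration keys and the Docker image name are illustrative placeholders (the real EPREM config fields differ), and the output parsing is reduced to a stub returning the (streams, times, energies) shape described in Section 2.2; NumPy arrays stand in for the PyTorch tensors used in the actual pipeline.

```python
import pathlib
import subprocess
import tempfile

import numpy as np

# Hypothetical config template; real EPREM config field names differ.
CONFIG_TEMPLATE = """\
boundaryFunctAmplitude = {amplitude}
boundaryFunctGamma = {gamma}
boundaryFunctBeta = {beta}
boundaryFunctEcutoff = {cutoff}
lamo = {lamo}
"""

def write_config(psi):
    """Render a parameter vector psi into an on-the-fly config file body."""
    keys = ("amplitude", "gamma", "beta", "cutoff", "lamo")
    return CONFIG_TEMPLATE.format(**dict(zip(keys, psi)))

def eprem(psi, image="eprem:latest"):
    """phi = EPREM(psi): write config, run the simulator in Docker, parse fluxes."""
    cfg = pathlib.Path(tempfile.mkdtemp()) / "eprem.cfg"
    cfg.write_text(write_config(psi))
    subprocess.run(
        ["docker", "run", "--rm", "-v", f"{cfg.parent}:/run", image],
        check=True,
    )
    # Parsing of EPREM's output files into a (streams, times, energies)
    # array would go here; the shape follows the grid in Section 2.2.
    return np.zeros((24, 288, 20))  # placeholder
```

Wrapping the simulator behind a plain function call is what lets the same code path serve data generation, training, and inference.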
We generated 32,000 unique events using EPREM and the high-performance computing (HPC) resources at the San Diego Supercomputer Center (SDSC; part of the National Science Foundation's Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support, ACCESS). EPREM typically takes tens of minutes to hours, depending on the specific input configuration, to simulate a single event, which is computationally expensive in the context of generating data sufficient for NN training. Using this data set, we developed and trained EPREM-S, a surrogate model that is orders of magnitude faster than EPREM while producing the same output, as described in the next section.

Surrogate Modeling
Surrogate modeling, also known as emulation, refers to the creation of fast and simple models that approximate the behavior of complex analytical models that are computationally expensive to evaluate (Queipo et al., 2005; Sobester et al., 2008). The low computational cost and differentiability (if available) of surrogate models enable tasks that might be infeasible with the original expensive model, including sensitivity analysis (e.g., using differentiability), optimization of parameters (e.g., using gradient descent), and uncertainty quantification (e.g., via efficient sampling). Surrogate models have been used in settings as diverse as particle physics (Shirobokov et al., 2020), exoplanet studies (Himes et al., 2022), probabilistic programming (Munk et al., 2022), and computer vision (Behl et al., 2020).
Recent advances in ML, especially in the field of NNs, make defining and training surrogate models relatively straightforward using off-the-shelf model components such as linear, recurrent, convolutional, or transformer modules available in frameworks such as PyTorch (Paszke et al., 2019) and TensorFlow (Abadi et al., 2016). Crucially, modern NN techniques require a sufficiently large data set representing the behavior of the original expensive model, usually in the form of input-output pairs. The generation of this training data is usually the most demanding and computationally costly part of a surrogate modeling approach, involving the construction of a code base to run the original simulator (usually in a distributed computing setting) with input parameters supplied via a sampling scheme.

Surrogate Training and Uncertainty Estimation
Figure 2 provides an overview of the EPREM and EPREM-S models, which map SEP event parameters (inputs) ψ to event fluxes (outputs) ϕ. The code for the EPREM-S model is written in Python using the PyTorch framework (Paszke et al., 2019). After hyperparameter tuning over a range of feed-forward and convolutional architectures, we base the results presented in this paper on a feed-forward NN with four hidden layers of sizes 512, 1,024, 2,048, and 138,240, with ReLU nonlinearities after each layer except the last. This corresponds to a total of 285,881,344 trainable parameters θ. The output of the last layer is reshaped into a cube of shape 24 × 288 × 20, where the dimensions correspond to 24 streams, 288 time steps, and 20 energy levels. Given a data set of input-output pairs D = {(ψ_i, ϕ_i)}_{i=1}^{N}, we train the surrogate model with the mean squared error loss L(θ) = (1/N) Σ_{i=1}^{N} ||ϕ_i − φ_i||², where φ_i = S_θ(ψ_i) are the predictions of the surrogate S_θ parameterized by θ, which we learn by minimizing L(θ). For gradient-based optimization of L(θ) we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 2 × 10⁻⁴.
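A minimal PyTorch sketch of this architecture follows. The layer sizes, reshaping, loss, and optimizer settings are taken from the text; the class and variable names are our own, and the training loop itself is omitted.

```python
import torch
import torch.nn as nn

class EpremS(nn.Module):
    """Feed-forward surrogate: 5 seed parameters -> 24 x 288 x 20 flux cube.
    Hidden layer sizes follow the architecture described in the text."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, 24 * 288 * 20),  # no nonlinearity after last layer
        )

    def forward(self, psi):
        # Reshape flat output into (batch, streams, time steps, energy levels)
        return self.net(psi).view(-1, 24, 288, 20)

model = EpremS()
n_params = sum(p.numel() for p in model.parameters())  # 285,881,344

loss_fn = nn.MSELoss()                                   # mean squared error
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
```

Counting parameters layer by layer (weights plus biases) reproduces the 285,881,344 total quoted above, which confirms the stated layer sizes are internally consistent.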
We applied data scaling to both the inputs ψ and the prediction targets ϕ, which we observed to significantly improve our results. We standardized the inputs to ψ̃ = (ψ − μ_ψ)/σ_ψ, where μ_ψ and σ_ψ are the mean and standard deviation, respectively, of all ψ_i values in the training data. We applied thresholding and log-scaling to the prediction targets to obtain φ = log(min(max(ϕ, 10⁻¹⁰), 10¹⁰)).
At inference time this scaling is undone after executing EPREM-S, giving flux values with the correct units (Figure 3).
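The scaling and its inverse can be sketched in a few lines. We assume the natural logarithm here (the text writes only "log"; any base works as long as the inverse matches), and values outside the thresholds are clipped, so they are not exactly recoverable by design.

```python
import numpy as np

def scale_inputs(psi, mu, sigma):
    """Standardize inputs: psi_tilde = (psi - mu) / sigma."""
    return (psi - mu) / sigma

def scale_targets(phi):
    """Threshold fluxes to [1e-10, 1e10], then log-scale (natural log assumed)."""
    return np.log(np.clip(phi, 1e-10, 1e10))

def unscale_targets(phi_scaled):
    """Invert the target scaling at inference time to recover physical fluxes."""
    return np.exp(phi_scaled)

fluxes = np.array([0.0, 1e-12, 3.5, 1e12])          # includes out-of-range values
recovered = unscale_targets(scale_targets(fluxes))  # clipped to [1e-10, 1e-10, 3.5, 1e10]
```

The round trip is exact for in-range fluxes, while zeros and extreme values are mapped to the thresholds, which keeps the log-scaled training targets finite.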
To provide uncertainty quantification when running trained EPREM-S models, we used a deep ensemble approach (Lakshminarayanan et al., 2017), in which we worked with multiple independently trained EPREM-S instances. Following standard practice, we trained these model instances on the same training data but with different random number seeds, leading to different weight initializations and courses of stochastic optimization for each instance. Given the set of pretrained surrogate models S_i, i = 1, …, M, and a new event parameter ψ, the mean and standard deviation of the flux predictions were estimated as μ(ψ) = (1/M) Σ_{i=1}^{M} S_i(ψ) and σ²(ψ) = (1/M) Σ_{i=1}^{M} (S_i(ψ) − μ(ψ))². In our experiments we used an ensemble size of M = 5.
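In code, these ensemble statistics amount to a mean and standard deviation over stacked member predictions. The lambdas below are toy stand-ins for the M = 5 trained EPREM-S instances, which in reality are the neural networks described above.

```python
import numpy as np

def ensemble_predict(models, psi):
    """Deep-ensemble prediction: mean and (population) standard deviation
    over M independently trained surrogate instances S_1..S_M."""
    preds = np.stack([S(psi) for S in models])  # shape (M, *flux_shape)
    return preds.mean(axis=0), preds.std(axis=0)

# Toy stand-ins for M = 5 trained instances; each differs by an offset b,
# mimicking the spread induced by different random seeds.
models = [lambda psi, b=b: psi * 2.0 + b for b in (-0.2, -0.1, 0.0, 0.1, 0.2)]
mu, sigma = ensemble_predict(models, np.array([1.0, 2.0]))
```

The per-output standard deviation is what is drawn as the shaded uncertainty band in Figure 3.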

Results
We trained an ensemble of five independently trained EPREM-S models, each a feed-forward NN with four layers and approximately 286 million learnable parameters. During each training run we used a minibatch size of 64, a learning rate of 2 × 10⁻⁴, and a weight decay (L2 regularization) coefficient of 10⁻⁶. We used a random split of the data into a training set containing 90% of the data and a hold-out validation set containing the rest. We ran the training for 10 epochs; the lowest mean squared error (MSE) loss achieved was 0.07 for validation and 0.006 for training. We took the model with the lowest validation loss across the 10 epochs as the final model, to reduce the possibility of overfitting to the training data. Further details of the NN architecture and training are described in Section 2.
Using the EPREM-S ensemble, we carried out a proof-of-concept study by generating several thousand SEP events for a range of values of physical parameters relevant to the seed population in the physics-based model EPREM. Figure 3 shows a comparison of the outputs of EPREM-S and EPREM for a sample of four events. Here, out of the 24 streams (see Section 2.2 for details of the EPREM grid and other parameters), we identified stream 10 as the observer-connected stream for an observer at 0.48 AU, where the energetic particle distributions are projected. The curve showing the flux "before the event" (green) was obtained by integrating the flux over a few hours before the start of the event, which is indicated by the sudden increase in flux. Similarly, for the "after the event" curve (purple), the flux was integrated over a few hours after the event, marked by the time around which the fluxes dropped to the background. The curve depicting "during the event" (red) was obtained by integrating the flux over the entire duration of the event. In all the plots, solid lines depict the EPREM simulation output while dashed lines represent the mean EPREM-S predictions of the ensemble, with shaded regions showing the standard deviation representing predictive uncertainty. We note that the EPREM and EPREM-S outputs are in remarkable agreement, the MSE being 0.07, as found during validation of the surrogate model. Achieving this level of accuracy on previously unseen input data illustrates that the models generalized beyond the training set. We also note, for example in the top row, that the uncertainty in the predictions increases when the mean EPREM-S prediction deviates from the EPREM output.
In addition to simulating SEP events in the forward-in-time direction, the low run-time cost of EPREM-S allows the solution of inverse problems using simulation-based inference (Cranmer et al., 2020), where we express the problem as inference over the input parameter space conditioned on an observed event output. In Figures 4 and 5 we present analyses of six different events that we performed using Markov chain Monte Carlo (MCMC) sampling over the EPREM parameter space. We formulated event analysis as a Bayesian inference problem in which we infer the posterior distribution p(ψ|ϕ), describing how parameters ψ are distributed in order to match a given data observation ϕ, based on the prior distribution p(ψ) over parameters and the likelihood p(ϕ|ψ) measuring the closeness of the match. The prior p(ψ) is defined by the continuous uniform distributions used in EPREM-S training data generation (Section 2.2), and the likelihood is constructed using the EPREM-S ensemble output, representing the probability of flux data ϕ given parameters ψ. To confirm correctness, we start with a ground truth event (ψ*, ϕ*) with both the parameters and fluxes known. We then take only the fluxes ϕ* as input for inference and check whether the obtained posterior p(ψ|ϕ*) includes the known correct parameters ψ*. With an MCMC chain of 10,000 iterations, we obtain the posterior distributions shown in Figures 4 and 5, where the dashed vertical lines show the known ground truth values of the parameters for each event. We see that all parameters are inferred correctly, as their ground truth values (unknown to the inference algorithm) are contained within the resulting posterior distributions. We also see that the value of the roll-over energy or knee (discussed in Section 2) can be inferred with lower uncertainty (i.e., it is more constrained given the observed data), whereas the remaining parameters have higher uncertainty (less constrained given the observed data), depending on the event.
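The inference loop can be sketched with a minimal random-walk Metropolis sampler. Here a toy one-dimensional "surrogate" and a Gaussian likelihood stand in for the EPREM-S ensemble and the paper's actual likelihood construction (which is not fully specified in the text); the ground-truth recovery check mirrors the procedure described above.

```python
import numpy as np

def metropolis_hastings(log_post, psi0, n_iter=5000, step=0.2, seed=0):
    """Random-walk Metropolis sampler over the parameter space."""
    rng = np.random.default_rng(seed)
    psi, lp = psi0, log_post(psi0)
    chain = []
    for _ in range(n_iter):
        prop = psi + step * rng.normal()          # propose a nearby parameter
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            psi, lp = prop, lp_prop
        chain.append(psi)
    return np.array(chain)

# Toy setup: 1-D "surrogate" S(psi) = 2 psi, uniform prior on [0, 3],
# Gaussian likelihood around a flux phi* generated at ground truth psi* = 1.5.
surrogate = lambda psi: 2.0 * psi
phi_star = surrogate(1.5)

def log_post(psi):
    if not (0.0 <= psi <= 3.0):
        return -np.inf                            # outside the uniform prior
    return -0.5 * ((surrogate(psi) - phi_star) / 0.1) ** 2

chain = metropolis_hastings(log_post, psi0=0.5)
posterior_mean = chain[2000:].mean()              # discard burn-in
```

The posterior mass concentrates around the known ψ* = 1.5, the same sanity check applied to the real five-parameter posteriors in Figures 4 and 5.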
We measured the runtime costs of EPREM and EPREM-S and present a detailed comparison in Table 1. We take the single-threaded CPU execution of EPREM as the baseline duration with respect to which we compute speed-up factors of EPREM-S. A single execution of EPREM, computing one SEP event on the CPU (EPREM has no GPU support), takes 712,630 ms (11.87 min), whereas the computation of one SEP event by EPREM-S takes 9.82 ms on CPU and 0.32 ms on GPU, corresponding to speed-up factors of 7.25 × 10⁴ and 2.23 × 10⁶, respectively, per execution. Furthermore, EPREM-S allows multiple inputs to be processed in parallel in a single forward execution using "minibatching," the typical way of running many inputs through NNs simultaneously. For example, in the rightmost column of Table 1 we see that we are able to run 10,000 SEP events in parallel in a single execution of EPREM-S, taking 2,495.77 ms on CPU and 0.71 ms on GPU. Finally, we can consider the compound effect of a single execution of EPREM-S being faster than EPREM and of multiple SEP events being computed simultaneously in a single execution of EPREM-S. For the example of 10,000 events computed simultaneously with EPREM-S, this gives speed-up factors per event of 2.85 × 10⁶ on CPU and 1.01 × 10¹⁰ on GPU. All run-time figures presented are the mean of 10 independent repetitions of each experiment. In our experiments the CPU used was an Intel Core i5-12400 and the GPU an NVIDIA A100 80 GB Tensor Core.
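The quoted speed-up factors follow from the measured timings by simple division; for the batched case, the EPREM-S time is first divided by the number of events in the minibatch. A quick check:

```python
# Reproduce the per-event speed-up factors in Table 1 from the measured run times.
EPREM_MS = 712_630.0                         # single-threaded EPREM, one event (CPU)

single = {"cpu": 9.82, "gpu": 0.32}          # EPREM-S, one event (ms)
batched = {"cpu": 2_495.77, "gpu": 0.71}     # EPREM-S, 10,000 events in one call (ms)

speedup_single = {k: EPREM_MS / v for k, v in single.items()}
speedup_batched = {k: EPREM_MS / (v / 10_000) for k, v in batched.items()}
# speedup_single  -> ~7.3e4 (CPU) and ~2.2e6 (GPU)
# speedup_batched -> ~2.9e6 (CPU) and ~1.0e10 (GPU)
```

Small differences from the quoted figures arise only from rounding of the published timings.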
To generate our new data set of 32,000 SEP events, we used a simulation duration of four days with the full EPREM configuration described in Section 2.2. The configuration consists of parameters that are fixed during all runs (such as the total event duration and time step) and physical parameters that are sampled from the continuous uniform distributions we defined over the parameter space. We executed the EPREM simulator, together with our data sampling code, in a distributed setting on the Expanse supercomputer at SDSC. Given the simulation configuration we used, EPREM has a single-threaded run time of 712.6 s (11.9 min) per SEP event, so our overall data generation run corresponds to 6,334.5 hr (263.9 days) of CPU computation. We save the data in a gzip-compressed format holding PyTorch (Paszke et al., 2019) tensors, where each file, representing an individual SEP event, takes approximately 1.6 MB on disk. The total size of the generated SEP data is 50 GB. We expect this new data set to be a useful resource enabling further ML-based work in this domain, and we have made it publicly available to the research community (Poduval et al., 2023a, 2023b).

Discussion and Concluding Remarks
Implementation of artificial intelligence (AI) and machine learning (ML) techniques is increasingly recognized by the scientific community as a powerful tool in all areas of space science, particularly in space weather, to better extract information from the enormous volume of data and to improve the analytic and predictive performance of models (e.g., Azari et al., 2021; Camporeale, 2019; Poduval, McPherron, et al., 2023; Poduval et al., 2022).
Techniques such as NNs have been implemented in space weather prediction since the 1980s (e.g., Lundstedt, 1996, 2005, 2006; Lundstedt et al., 2002, and the references therein). These are essentially precursors of recent deep learning techniques, which represent a revival of NNs with very large data sets, better optimization algorithms, and modern training hardware such as GPUs. ML techniques including support vector machines (Bobra & Couvidat, 2015), logistic regression (Winter & Ledbetter, 2015), ensemble models of decision trees (FDL Team, 2017), principal component analysis (Winter & Ledbetter, 2015), long short-term memory networks (Chen et al., 2019; Tan et al., 2018), Gaussian processes (Gruet et al., 2018), and deep NNs (Nishizuka et al., 2018) have been used in the prediction of solar flares, SEPs, and geomagnetic indices such as Dst and Kp.
The lack of a sufficient and balanced data set of observed SEP events for training an NN makes it challenging to apply ML methods to the prediction of SEPs. To overcome this limitation to a large extent, we utilized the first-principles model EPREM to generate synthetic events. Our final data set contained 32,000 unique events represented as input-output pairs of latent physical parameters and observable SEP event data. Using this data set, we developed and trained the surrogate generative model EPREM-S, which approximates the behavior of the underlying EPREM simulator with high fidelity. Our work on the fast EPREM-S surrogate has two main objectives: (a) making SEP simulations computationally cheap (ideally running in a fraction of a second) on average computer hardware, and (b) thereby making it feasible to develop fast and systematic event analyses based on observed SEP data, using simulation-based inference techniques. Using a deep ensemble approach (Lakshminarayanan et al., 2017), our models also provide an estimate of the predictive uncertainty associated with each simulated SEP event.
Importantly, EPREM-S is thousands to millions of times faster (depending on hardware) than EPREM, as shown in Table 1, and is capable of simulating many events in parallel in a single run, enabling tractable simulation-based inference (Baydin et al., 2019; Cranmer et al., 2020) in this problem domain.

Table 1. Runtime Results of EPREM and EPREM-S, Based on the Mean of Ten Independent Runs in Each Case. Note. The GPU figures were obtained on the NVIDIA A100 80 GB unit.
To our knowledge, this is the first time such a fast model with applications in space weather has been developed. With EPREM-S, we are able to generate thousands of events for studying the influence of the seed particle spectrum on the acceleration and transport of energetic particles in the heliosphere. An example of such an analysis is presented in Figures 4 and 5, where the top panels depict ground truth events that were not used for EPREM-S training, with the corresponding parameter values given at the top and also represented by the dashed lines in the bottom panels.
Posterior distributions over these parameters, taking only the ground truth fluxes as input, are depicted in the bottom panels. Encouraged by the accuracy with which the parameters are recovered by inference, as evident from Figures 4 and 5, we plan to carry out similar Bayesian inference over input parameters using EPREM-S and event analyses of observed events in the near future. Having developed a fast, parallelizable surrogate, EPREM-S, is a first step toward simulation-based inference in SEP studies.
The simulated data we generated using EPREM for the selected ranges of the five parameters discussed in Section 2.2 will be of great value to the scientific community engaged in SEP studies, as they represent various types and levels of synthetic events from whose analysis one can infer useful characteristics of the source region and of the acceleration and propagation of SEPs. Moreover, the surrogate we developed is a generative model with which we can produce large volumes of synthetic data to build an even larger database of simulated events; this is another direction of future work we plan to pursue.
We expect EPREM-S to serve multiple purposes. In addition to being an ideal tool for forecasting SEPs, it can be used for studying the influence of the model parameters on SEP profiles, determining the parameter space for accurate SEP simulations using EPREM, and sampling more synthetic data from the fast generative model learned by EPREM-S. We have released our simulated EPREM data set and the EPREM-S model (Poduval et al., 2023a, 2023b) to support a new class of SEP studies by the wider heliophysics community.

Acknowledgments

This work is supported by NSF Grant 2026579 awarded to BP. AGB was supported by grants from NVIDIA and utilized an NVIDIA A100 80 GB Tensor Core GPU. This work used the National Science Foundation's Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) (https://access-ci.org) Expanse supercomputer at the San Diego Supercomputer Center (SDSC) through a start-up allocation. BP wishes to acknowledge Matthew Young and Kenneth Fairchild for their help with the EPREM installation and related aspects of the present work. Authors AGB and BP contributed equally to this work.
Figure 1. Graphical representation of the radiation environment (top left panel), GOES-13 observations of SEPs associated with the 2017 September 10 solar flare (top right), and the 2018 November proton event observed by the Integrated Science Investigation of the Sun (IS⊙IS) instrument on board PSP (bottom panel).

Figure 2. Overview of the EPREM and EPREM-S models. Both map input parameters ψ to output fluxes ϕ. EPREM is a physics-based model for generating SEP events, implemented in C++. EPREM-S is a neural network trained with a data set D = {(ψ_i, ϕ_i = EPREM(ψ_i))}_{i=1}^{N} obtained by running EPREM with inputs ψ_i ∼ p(ψ) sampled from a continuous uniform prior distribution.

Figure 3. Four SEP events from the test set (data unseen during training), showing the outputs of EPREM (solid lines) and EPREM-S (dashed lines), and the predictive uncertainty in the EPREM-S output (shaded regions representing one standard deviation). Out of the 24 streams (see Section 2.2 for details of the EPREM grid and other parameters), we identified stream 10 as the observer-connected stream for an observer at 0.48 AU, where the energetic particle distributions are projected and shown here. The panels on the left show the differential flux at different energy levels from 0.01 to 200 MeV. The panels on the right depict the flux during (event-integrated flux, red), before (green), and after (purple) the event for EPREM (solid lines) and EPREM-S (dashed lines) for the events shown in the left panels.

Figure 4. Three events, unseen during EPREM-S training, analyzed via MCMC sampling of EPREM-S outputs conditioned on observed flux data. For each event, the top row shows the observed fluxes; the bottom row shows posterior distributions over EPREM parameters inferred via MCMC, conditioned on the observed fluxes. Dashed vertical lines represent the ground truth parameter values for each event.

Figure 5. Same as Figure 4 but for another three events unseen during EPREM-S training.