A Multicomponent Magnetic Proxy for Solar Activity

We present a new, multicomponent magnetic proxy for solar activity derived from full-disk magnetograms that can be used in the specification and forecasting of the Sun’s radiative output. To compute this proxy we project Carrington maps, such as the synchronic Carrington maps computed with the Advective Flux Transport (AFT) surface flux transport model, to heliographic Cartesian coordinates and determine the total unsigned flux as a function of absolute magnetic flux density. Performing this calculation for each day produces an array of time series, one for each flux density interval. Since many of these time series are strongly correlated, we use principal component analysis to reduce them to a smaller number of uncorrelated time series. We show that the first few principal components accurately reproduce widely used proxies for solar activity, such as the 10.7 cm radio flux and the Mg ii core-to-wing ratio. This suggests that these magnetic time series can be used as a proxy for irradiance variability for emission formed over a wide range of temperatures.

There are only a few studies that investigate the use of the magnetic field in the specification and forecasting of the solar irradiance (e.g., Henney et al., 2012, 2015).
In this paper we develop a new, multicomponent magnetic proxy for use in the specification and forecasting of the Sun's radiative output. As we will see, the multicomponent nature of this proxy is important for accurately modeling emission formed at many different temperatures in the solar atmosphere, something that is difficult with proxies such as F10.7. This work is primarily based on synchronic Carrington maps computed with the Advective Flux Transport (AFT) model (Upton & Hathaway, 2014a, 2018), which assimilates observations from the Michelson Doppler Imager (MDI) and the Helioseismic and Magnetic Imager (HMI) instruments (Scherrer et al., 1995; Scherrer et al., 2012). Using the magnetic field determined from a surface flux transport model makes it easy to use the proxy for forecasting and also addresses some of the limitations of magnetic field measurements made near the solar limb.

Magnetic Flux Histograms and Time Series
The AFT model describes how the radial component of the magnetic field on the solar surface is advected by supergranular diffusion, differential rotation, and meridional flow. The unique aspect of the model is its use of a time-dependent velocity pattern (Hathaway et al., 2010) in place of an ad hoc diffusion term to account for the transport of magnetic flux by supergranular motions. For this work we use AFT runs that are updated periodically with an observed line-of-sight magnetogram. The assimilated data is weighted to emphasize observations near disk center, so the AFT maps are dominated by the actual measurements in the region facing the Earth.
The observed magnetograms are taken from the Michelson Doppler Imager (MDI, Scherrer et al., 1995) on the Solar and Heliospheric Observatory (SoHO), which operated from 1996 to 2011, and the Helioseismic and Magnetic Imager (HMI, Scherrer et al., 2012) on the Solar Dynamics Observatory (SDO), which began operations in 2010. For the HMI data, hourly synoptic data are assimilated. For MDI, full disk magnetograms are available every 96 minutes. The magnetic field measurements from the two instruments have some subtle differences (Liu et al., 2012), which are accounted for before the data are assimilated. Most significantly, the level 1.8.2 magnetic sensitivity map is removed from the MDI magnetogram and then the flux is scaled to match HMI using a flux-density-dependent scale factor of $1.10 + 0.33\,\tanh(|B|/1000)$ (Equation 1), with $|B|$ the absolute magnetic flux density. Our tests show that this improves the agreement between MDI and HMI in the AFT simulations. Because of the extended loss of contact with SoHO in 1998, we begin our analysis on April 1, 1999, transition to using AFT maps based on HMI on June 1, 2010, and end our analysis on December 31, 2020. We computed a total of 7,947 AFT magnetograms during this time, one for each day. The AFT calculation takes about one day of actual time for each year of the simulation.
An example Carrington map from AFT is shown in Figure 1. Here we also show a projection of the Carrington map to heliographic Cartesian coordinates and a corresponding observed HMI magnetogram. Because the model is updated regularly, the Earth-facing side of the Sun is heavily weighted by the data and the images are very similar. The primary advantage of using the AFT model instead of the observed magnetic flux is that the field at the limb is less noisy and does not suffer from "canopy" effects, where strong flux at the limb has the wrong sign because of line-of-sight projection effects. One limitation of the AFT model is that flux that has emerged on the far side and has just rotated over the limb is not fully assimilated immediately. Both of these effects, though subtle, are evident in this example.
DOI: 10.1029/2021SW002860

Anticipating comparisons with F10.7, which is measured between 17 and 23 UT, we compute the magnetogram images for 16 UT, the closest time for which AFT snapshots are available. Perhaps the simplest approach to constructing a proxy for solar activity from these images would be to compute the total unsigned flux for each image, that is, the sum of the absolute magnitude of the flux density in each pixel multiplied by the pixel area. The relationship between the radiance and the magnetic field, however, can be complex. The strong magnetic fluxes found in sunspots, for example, rarely produce bright EUV emission (e.g., Tiwari et al., 2017). Similarly, the weakest quiet sun fluxes are always present and unlikely to be strongly correlated with variations in the irradiance. One might imagine defining different ranges of fluxes to represent different components of variability (e.g., quiet Sun, active network, active region), but it is not clear how to define these boundaries optimally (see Harvey and White (1999) for a proposed scheme optimized for Ca ii K).
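The total unsigned flux described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `total_unsigned_flux`, the use of NaN for off-disk pixels, and the pixel area value are all assumptions made for the example.

```python
import numpy as np


def total_unsigned_flux(flux_density, pixel_area):
    """Return sum(|B| * area) over valid pixels.

    flux_density: 2-D array of magnetic flux density in G (NaN = off disk,
                  an assumed convention for this sketch).
    pixel_area:   scalar or 2-D array of pixel areas in cm^2.
    """
    b = np.asarray(flux_density, dtype=float)
    area = np.broadcast_to(np.asarray(pixel_area, dtype=float), b.shape)
    on_disk = np.isfinite(b)
    return float(np.sum(np.abs(b[on_disk]) * area[on_disk]))


# Toy 2x3 image in G; NaN marks an off-disk pixel. The area value is hypothetical.
image = np.array([[10.0, -20.0, np.nan],
                  [-5.0, 0.0, 15.0]])
unsigned = total_unsigned_flux(image, pixel_area=2.0)
net = float(np.nansum(image) * 2.0)  # signed (net) flux for comparison
print(unsigned, net)
```

In this toy image the opposite polarities cancel in the net flux but not in the unsigned flux, which is why the unsigned quantity is the natural starting point for an activity proxy.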
For this work we adopt a two-step procedure that circumvents these problems. First, we construct histograms of the unsigned magnetic flux as a function of flux density for each day. Second, we perform principal component analysis (PCA) on the resulting time series to reduce them to a more manageable size. Recall that PCA is a technique for reducing the dimensionality of a dataset by defining a new orthonormal basis that is ordered by information content. Typically, the first few components account for a large percentage of the variance in the data. A common illustration of PCA is points in three-dimensional space that are largely confined to a two-dimensional plane. PCA computes a new coordinate system aligned to the data where the third dimension can be dropped while minimizing the loss of information. The noise level in the magnetograms is estimated to be about 10 G (Yeo et al., 2014), and is likely to be higher near the limb.
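The first step, building a histogram of unsigned flux as a function of flux density for each day, might look like the following sketch. The bin edges, image sizes, and the random stand-in images are hypothetical; the binning actually used for the proxy is not reproduced here.

```python
import numpy as np


def unsigned_flux_histogram(flux_density, pixel_area, bin_edges):
    """Total unsigned flux in each |B| bin for a single image."""
    b = np.abs(np.asarray(flux_density, dtype=float).ravel())
    b = b[np.isfinite(b)]
    # Weighting each pixel by |B| * area turns pixel counts into flux per bin.
    hist, _ = np.histogram(b, bins=bin_edges, weights=b * pixel_area)
    return hist


# Hand-checkable case: two bins, [10, 100) G and [100, 1000] G.
check = unsigned_flux_histogram([5.0, -15.0, 150.0, -2000.0, 30.0],
                                pixel_area=1.0,
                                bin_edges=[10.0, 100.0, 1000.0])

# One histogram per day gives an array of time series, one column per bin.
rng = np.random.default_rng(0)
bin_edges = np.logspace(1.0, 3.0, 9)  # 8 hypothetical log-spaced bins, 10 G to 1000 G
daily_images = [rng.normal(0.0, 40.0 * (1 + day), size=(64, 64))
                for day in range(5)]
flux_series = np.array([unsigned_flux_histogram(img, 1.0, bin_edges)
                        for img in daily_images])
print(check, flux_series.shape)
```

Pixels below the lowest edge or above the highest edge fall outside the histogram, so the choice of the lowest edge acts as a simple noise floor in this sketch.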
Time series for selected bins are shown in Figure 3. These time series of total unsigned flux show modulation over both rotational and solar cycle time scales. The amplitude of this modulation increases with increasing magnetic flux density. As one would expect, the fluxes in adjacent bins are strongly correlated, indicating that some of the information in these time series is redundant.
For the second step in this process we use the PCA package from scikit-learn (Pedregosa et al., 2011) to compute the principal components of the magnetic time series. The first four components are shown in Figure 4. The first component accounts for 85% of the variance in the original time series of magnetic fluxes. The first two components combine to account for 95% of the variance. The four components shown here combine to account for 99% of the variance. Note that the magnetic time series are scaled to have zero mean and unit variance before the PCA decomposition is computed. Thus the components are dimensionless. One drawback to PCA is that it is not clear how to interpret each component. The first component is clearly correlated with the solar activity cycle, but the other components do not have an obvious interpretation.
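The scaling and decomposition can be illustrated with a short sketch. It uses a plain NumPy singular value decomposition in place of the scikit-learn PCA class, which should give the same components up to sign. The synthetic "cycle" and "rotation" signals and the bin weights are invented stand-ins for the binned flux time series.

```python
import numpy as np


def pca_standardized(series, n_components):
    """PCA of the columns of `series` after scaling to zero mean, unit variance."""
    z = (series - series.mean(axis=0)) / series.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # fraction of variance per component
    scores = u[:, :n_components] * s[:n_components]  # component time series
    return scores, explained[:n_components]


# Synthetic stand-in: a slow cycle plus a 27-day modulation, mixed into 8 bins.
t = np.arange(2000.0)
cycle = 0.5 * (1.0 - np.cos(2.0 * np.pi * t / 4000.0))
rotation = 0.3 * cycle * np.sin(2.0 * np.pi * t / 27.0)
rng = np.random.default_rng(1)
weights_cycle = np.linspace(0.5, 2.0, 8)
weights_rotation = np.linspace(0.0, 1.0, 8)
flux_series = (np.outer(cycle, weights_cycle)
               + np.outer(rotation, weights_rotation)
               + 0.02 * rng.normal(size=(t.size, 8)))

components, explained = pca_standardized(flux_series, n_components=4)
print(np.round(explained, 3))
```

Because the columns are standardized first, each flux density bin contributes equally to the total variance regardless of how much flux it carries, and the resulting component time series are mutually uncorrelated.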

Example Applications
The processed magnetic time series shown in Figure 4 can be used to model the temporal variability of solar irradiance time series. To illustrate this application we show that the magnetic time series can capture the evolution of F10.7, the Mg ii core-to-wing ratio, several frequencies observed at the Nobeyama Radio Observatory, and several irradiance time series from the EUV Variability Experiment (EVE, Woods et al., 2012) on SDO. For all of these time series we will perform a simple multiple linear regression of the form $I(t) = c_0 + \sum_i c_i M_i(t)$ (2), where $I(t)$ is the time series being modeled, the $c_i$ are the fit coefficients, and $M_i(t)$ is a PCA component of the magnetic time series. We use the Python statsmodels package (Seabold & Perktold, 2010) to perform these fits. The sources of these data are described in Section 5.
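A multiple linear regression of this kind can be sketched as follows. The example uses ordinary least squares from NumPy rather than statsmodels, assumes the model includes a constant term plus one coefficient per component, and fits synthetic data with made-up coefficients.

```python
import numpy as np


def fit_linear_model(target, components):
    """Least-squares fit of target(t) = c0 + sum_i c_i * M_i(t)."""
    design = np.column_stack([np.ones(len(target)), components])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs, design @ coeffs


# Synthetic components and target; the "true" coefficients are invented.
rng = np.random.default_rng(2)
n_days = 1000
components = rng.normal(size=(n_days, 4))
true_coeffs = np.array([100.0, 30.0, 5.0, -2.0, 1.0])
observed = (true_coeffs[0] + components @ true_coeffs[1:]
            + rng.normal(0.0, 1.0, n_days))

coeffs, modeled = fit_linear_model(observed, components)
pearson_r = np.corrcoef(observed, modeled)[0, 1]
print(np.round(coeffs, 2), round(float(pearson_r), 3))
```

The constant term matters here because the PCA components have zero mean, while an irradiance or proxy time series generally does not.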
We note that there are a number of sources of error in the fitting of a proxy for solar activity to an irradiance time series. There are measurement errors in both the proxy and the irradiance time series. These measurement errors can result from long-term drifts in instrument performance as well as from counting statistics. Additionally, the formation mechanisms for the proxy and the irradiance time series can be different. The irradiance, for example, could depend on both the magnitude and spatial distribution of the magnetic flux. As we will see, all of these sources of error are likely to be present.

F10.7
We fit all of the available magnetic data and the daily F10.7 measurements to Equation 2. In Figure 5 we show a time series of the observed F10.7 and the values inferred from the fit. Also shown are a scatter plot of the modeled and observed values, the Pearson correlation coefficient between the modeled and observed values, the residuals as a function of time, and a histogram of the residuals. Note that the uncertainty in the measured F10.7 flux is estimated to be 1% or one solar flux unit, whichever is larger (Tapping, 2013). To highlight the variation of the irradiance over a solar rotation, time series for smaller time ranges are also shown in Figure 5.
The model fits the observations of F10.7 very well. The correlation between the modeled and observed values is 0.98. The residuals are relatively small, with a standard deviation of 6.4%. The residuals are generally biased towards larger values, where the model systematically under-predicts the observations. Some of these differences may be due to very large sunspots that influence the F10.7 measurements, but are not well captured by the magnetograms (e.g., September 2017). Flares could also influence the F10.7 measurements more than the magnetograms. Close inspection of the residuals suggests an unexpected secular trend between 2004 and 2015, which seems to resume around 2016. As we will see, this pattern is evident in some of the other irradiance time series. We will discuss this in some detail in Section 4.
We fit F10.7 using progressively more principal components to test the impact of increasing model complexity on goodness of fit. As noted earlier, the first four components cumulatively account for 85, 95, 98, and 99% of the variance in the original time series. Using only the first component yields a correlation of 0.97 and a dispersion in the residuals of 7.2%. Adding the remaining components one at a time yields correlations of 0.97, 0.98, and 0.98. The corresponding dispersions in the residuals are 7.1%, 6.8%, and 6.4%. Thus, the first two components account for the vast majority of the variation in the observed irradiance time series.
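The test of model complexity described above amounts to refitting with the first k components for k = 1 to 4 and recording the correlation and residual dispersion each time. The sketch below does this on synthetic data. The definition of the percent residual (difference divided by the observed value) is an assumption, and the numbers it prints are unrelated to the values quoted in the text.

```python
import numpy as np


def fit_and_score(observed, components):
    """Fit a constant plus the given components; return (r, dispersion in %)."""
    design = np.column_stack([np.ones(len(observed)), components])
    coeffs, *_ = np.linalg.lstsq(design, observed, rcond=None)
    modeled = design @ coeffs
    r = np.corrcoef(observed, modeled)[0, 1]
    # Percent residual relative to the observation (an assumed definition).
    dispersion = np.std(100.0 * (observed - modeled) / observed)
    return float(r), float(dispersion)


# Synthetic target dominated by the first component, with made-up coefficients.
rng = np.random.default_rng(3)
n_days = 1500
components = rng.normal(size=(n_days, 4))
observed = (200.0 + 25.0 * components[:, 0] + 6.0 * components[:, 1]
            + 2.0 * components[:, 2] + 1.0 * components[:, 3]
            + rng.normal(0.0, 3.0, n_days))

# Nested fits using the first k components, k = 1..4.
results = [fit_and_score(observed, components[:, :k]) for k in range(1, 5)]
for k, (r, sigma) in enumerate(results, start=1):
    print(f"{k} component(s): r = {r:.3f}, residual dispersion = {sigma:.1f}%")
```

For nested least-squares models with a constant term the correlation cannot decrease as components are added, so the useful question is how quickly the improvement levels off.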

Mg II Core-to-Wing
The fit of the Mg ii core-to-wing ratio to the magnetic time series is shown in Figure 6. The format of the figure is identical to Figure 5. The magnetic model, however, fits the Mg index better than F10.7. The correlation is higher and the residuals are smaller. The residuals shown in Figure 6 show the secular trends that were alluded to in the discussion of the fits to F10.7. Again, we will discuss possible explanations for this in Section 4. Note that the estimated uncertainties in the measured ratio are generally less than 1% (Snow et al., 2014). For the time period of interest here they are generally around 0.3% (see http://www.iup.uni-bremen.de/UVSAT/Datasets/mgii).

Nobeyama Radio Observatory
The Sun's radio emission at several frequencies has been monitored with the Toyokawa and Nobeyama radio polarimeters since the 1950s (Tanaka et al., 1973), creating long time series that are useful for irradiance modeling. Observations at 30, 15, 8, and 3.2 cm are available. Dudok de Wit et al. (2014) show that the 30 cm flux is better for modeling the thermosphere-ionosphere system. The fit of the 30 cm (1.0 GHz) signal to the magnetic time series is shown in Figure 7. The correlation and residuals are similar to those seen in the fit to F10.7. Here the correlation between the model and the observation is 0.98 and the dispersion in the residuals is 7.3%.

EVE
As a final application we consider fits of irradiance time series observed with EVE to the magnetic proxy. We have chosen lines formed at three different temperatures in the solar atmosphere: He ii 304 Å, Fe xii 195 Å, and Fe xvi 360 Å. All three lines were observed with the short wavelength range of EVE ("MEGS-A"), which ceased operations in May of 2014. The total uncertainty for each measurement is provided by the instrument team.
The observed and modeled time series, scatter plots, and histograms are shown in Figures 8, 9, and 10. The fit to He ii 304 Å is very good, with small residuals over the entire time period, similar to the results from the Mg ii core-to-wing ratio. The fits to the other wavelengths are not as good. Fe xii 195 Å shows some relatively large discrepancies early in the EVE mission, where the model is systematically lower than the observations. The residuals are about a factor of 3 higher than those for He ii 304 Å and have a large tail to negative values. The residuals for Fe xvi 360 Å are larger still, reaching values of ±50%. This is somewhat misleading, however. Fe xvi is formed at a high temperature and the amount of high temperature emission in the corona at solar minimum is very small (Warren, 2005). Thus, the signal in this line becomes very weak during solar minimum. Figure 10 shows a clear trend towards lower residuals during periods of higher solar activity. Still, even if we restrict the calculation to the core of the distribution, the residuals are about 10 8 times higher than they are for He ii 304 Å. All of these irradiance time series show a linear relationship between the observed and modeled fluxes.
These fits of the magnetic proxy to the EVE irradiance time series can be compared with simple linear fits to F10.7 or the Mg ii core-to-wing ratio. This is shown in Figure 11, where scatter plots of modeled and observed irradiances are displayed. These regression results are summarized in Table 1. The residuals and correlations from these fits show that the magnetic flux proxy performs as well as the Mg ii core-to-wing ratio and better than F10.7.

Figure 6. Results from a multiple linear regression of the magnetic flux proxy to the Mg ii core-to-wing ratio. The format is identical to Figure 5. Note that the residual between the fit and the observed proxy is somewhat larger than the estimated uncertainties in the proxy.

Summary and Discussion
We have presented a new proxy for solar activity that can be used in the specification and forecasting of the solar irradiance. We have shown that this proxy can be used to accurately model other proxies for solar activity and irradiance time series formed at different layers in the solar atmosphere.
The application of this magnetic proxy to several example irradiance time series from EVE suggests that it performs better than F10.7 and comparably to the Mg ii core-to-wing ratio. Although it does not strictly outperform the Mg index, it does have several advantages over it. Magnetograms can be measured from the ground, which could make it easier to obtain consistent measurements over long periods of time. This remains to be demonstrated, as long-term financial support needs to be provided for a distributed global network of ground-based observatories to optimize the number of observations with good seeing. The Global Oscillation Network Group (GONG) is an example of such a network, and a next generation GONG would provide additional important capabilities (Hill et al., 2019). Finally, the evolution of the Sun's surface magnetic field is well described by models such as AFT. This makes it possible to estimate the future state of the magnetic field and provides a physics-based framework for forecasting solar activity. Currently, forecasting models for proxies generally use simple autoregressive techniques to extrapolate recent activity into the future (e.g., Warren et al., 2017).
The secular trends seen in the residuals between the data and the model fits are unexpected. These trends are most clearly seen in the analysis of the Mg ii core-to-wing ratio data (Figure 6), but are also evident in the F10.7 and NRO time series. They are largely absent in the fits to the EVE irradiances, which are much shorter in duration. Since these trends in the residuals appear in fits to several different proxies, it seems likely that they are related to the magnetic field. To investigate the possible origins of these trends we compared the flux time series derived from the AFT images with those taken directly from the HMI observations. For the largest flux densities, above about 80 G, these time series are very similar. At smaller flux densities, however, these time series show non-trivial differences. Unfortunately, the origin of these differences is unclear. It seems that the AFT simulation modifies these weak fluxes in a way that is inconsistent with the observations. Since the assimilation of the data into the AFT model is weighted towards disk center, it is possible for these inconsistencies to persist. Of course, weak fluxes near the limb are the most difficult to measure, making it difficult to identify the origin of these differences. Ultimately, the weakest fluxes have the smallest effect on the irradiance time series and the residuals are acceptably small.
This work suggests several future research directions. For example, spatially resolved magnetogram measurements from the Mount Wilson Observatory (MWO) extend back to the early 1970s (Pevtsov et al., 2021) and our histogram analysis could be applied to these data to create a much longer magnetic proxy time series. Additionally, the skill of flux transport models in forecasting solar activity needs to be compared with simpler, statistical methods (e.g., Warren et al., 2017). It seems likely that flux transport models will be limited by a lack of knowledge of both past far-side and future near-side flux emergence. The use of helioseismology or images from STEREO could provide a means for addressing this problem and improving flux transport simulations.

Data Availability Statement
We have made all of the projected magnetograms derived from the AFT flux transport simulation publicly available as standard FITS files on Zenodo (https://doi.org/10.5281/zenodo.5094741). The total volume of data is about 11 GB. The projected magnetograms are derived from a much larger set of AFT Carrington maps of the surface magnetic field, which are not included. All of the data products derived from the magnetograms are available in a GitHub repository (https://github.com/USNavalResearchLaboratory/MagneticProxy). The derived data products include the histograms of magnetic flux for each day, the PCA time series derived from the histograms, and the fit parameters for the PCA model of F10.7, the Mg ii core-to-wing ratio, the Nobeyama Radio Observatory time series, and the EVE irradiance time series. Routines for reading these derived data products are also available in the repository. The routines are written in Python and can be run with a recent distribution of Anaconda. The F10.7 radio flux data were downloaded from ftp://ftp.seismo.nrcan.gc.ca/spaceweather/solar_flux/daily_flux_values/fluxtable.txt. The Mg ii core-to-wing ratio data were downloaded from https://www.iup.uni-bremen.de/gome/solar/MgII_composite.dat. The Nobeyama Radio Polarimeter data were downloaded from https://solar.nro.nao.ac.jp/norp/data/daily/. The EVE/SDO data were downloaded from https://lasp.colorado.edu/lisird/data/sdo_eve_lines_l3/. All of these data were converted into csv files, which are included in the repository.

Figure 11. Scatter plots of modeled and observed EVE irradiances using the F10.7 (top panels) and Mg ii core-to-wing ratio (bottom panels) as proxies for solar activity.