Corresponding author: A. J. Ribeiro, 1901 Innovation Dr., Ste. 1033, Blacksburg, VA 24060, USA (firstname.lastname@example.org)
 The Super Dual Auroral Radar Network (SuperDARN) is a chain of HF radars for monitoring plasma flows in the high and middle latitude E and F regions of the ionosphere. The targets of SuperDARN radars are plasma irregularities which can flow up to several kilometers per second and can be detected out to ranges of several thousand kilometers. We have developed a simulator which is able to model SuperDARN data realistically. The simulation system comprises four separate parts: model scatterers, model collective properties, a model radar, and post-processing. Importantly, the simulator is designed using the collective scatter approach which accurately captures the expected statistical fluctuations of the radar echoes. The output of the program can represent either receiver voltages or autocorrelation functions (ACFs) in standard SuperDARN file formats. The simulator is useful for testing and implementation of SuperDARN data processing software and for investigation of how radar data and performance change when the nature of the irregularities or radar operation varies. The companion paper demonstrates the application of simulated data to evaluate the performance of different ACF fitting algorithms. The data simulator is applicable to other ionospheric radar systems.
1 Introduction
 The Super Dual Auroral Radar Network (SuperDARN) is an international network of HF (8–20 MHz) radars monitoring plasma dynamics at middle to high latitudes in both the Northern and Southern Hemispheres [Greenwald et al., 1985; Chisham et al., 2007]. The radars coherently detect, via Bragg scattering, decameter-scale irregularities in the plasma density distribution in the E and F regions of the ionosphere. A conventional SuperDARN radar has 16 look directions, or beams, separated by 3.24° in azimuth. Each beam sounding comprises 75–100 range gates. The spatial extent of the range gates is determined by the radar sample separation and is typically 45 km, although other values such as 15 or 30 km are common. The radars use a multipulse sequence in order to simultaneously satisfy requirements on the maximum range of values for target Doppler velocity and target range [Greenwald et al., 1985; Hanuise et al., 1993; Baker et al., 1995; Barthes et al., 1998; Ponomarenko and Waters, 2006]. Plasma irregularities are routinely detected at ranges of hundreds to several thousand kilometers and have speeds of hundreds of meters per second. An autocorrelation function (ACF), from which parameters such as Doppler velocity are determined, is calculated for each range gate using the instantaneous receiver samples (voltages). The dwell (integration) time on a particular beam, t_int, is typically 3–7 s. The overall transmit/receive time for a single pulse sequence is typically 100 ms, so that 30–70 pulse sequences are integrated in a single integration period. For each range gate, the arrival time of returns from each pulse in the sequence is calculated, and receiver samples from pulse pairs are multiplied in order to generate the complex ACF values at the time lag set by the delay between the pulses. These products are averaged over the integration time to produce an average ACF. Besides increasing the signal-to-noise ratio (SNR) by suppressing noise fluctuations, the averaging also lowers the interference from undesired ranges (cross-range interference, CRI; for more details, see Ponomarenko and Waters [2006]). Analytical functions are fitted to the variation in ACF power and phase with lag time to estimate Doppler velocity, spectral width, and backscatter power. The performance of different methods for performing this fitting is considered in the companion paper [Ribeiro et al., 2013].
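As a quick check on the numbers quoted above, the following Python sketch converts a sample separation into a range-gate length (using the two-way relation Δr = c·smsep/2) and estimates how many pulse sequences fit into one dwell. The function names are hypothetical, and the 100 ms sequence duration is simply the typical value quoted in the text; none of this is taken from the simulator source.

```python
C_KM_PER_S = 299_792.458  # speed of light in km/s


def range_gate_km(smsep_us):
    """Range extent of one gate for a sample separation in microseconds.

    The division by 2 accounts for the two-way path of the echo.
    """
    return C_KM_PER_S * (smsep_us * 1e-6) / 2.0


def sequences_per_dwell(t_int_s, t_seq_s=0.1):
    """Approximate number of pulse sequences in one integration period."""
    return int(round(t_int_s / t_seq_s))


# Sample separations of 100, 200 and 300 microseconds
for smsep_us in (100, 200, 300):
    print(smsep_us, "us ->", round(range_gate_km(smsep_us), 1), "km")

# Dwell times of 3 s and 7 s with a 100 ms pulse sequence
print(sequences_per_dwell(3.0), sequences_per_dwell(7.0))
```

Under these assumptions, a 300 μs sample separation corresponds to a gate length very close to the 45 km quoted in the text.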
 In order to assess fitting algorithms quantitatively, it is desirable to be able to perform the fitting on modeled radar data with tunable input parameters and realistic statistical characteristics. There have been several attempts to design such a simulator for SuperDARN applications [André et al., 1999; Ponomarenko et al., 2008]. The latest effort, by Ponomarenko et al. [2008], was based on the collective scatter approach and considered a single range gate with a combination of ionospheric scatter, ground scatter, and external noise components. This work represents a further development of the collective scatter approach. The improved model includes multiple range gates, accounts for CRI and pulse-overlap interference (which results from blanking the receiver during transmission), and generates output either as averaged ACFs or as instantaneous receiver voltages. It also contains a physical justification and a detailed description of the basic radar simulator, which were only briefly mentioned in Ponomarenko et al. [2008]. The simulator is coded in the C programming language and has been thoroughly tested. While it is designed to analyze SuperDARN-specific fitting algorithms, as described in the companion paper [Ribeiro et al., 2013], the software can also be adapted to simulate operation of other types of backscatter radars.
2 Physical Justification for the Backscatter Model
 In testing radar data processing software, it is crucial to be able to simulate the test data set realistically. With respect to SuperDARN applications, this amounts to simulating ionospheric backscatter signals (ACFs). It is easy to generate an ideal ACF with pre-determined magnitude (SNR or “power”), phase variation (Doppler shift), and decorrelation time (spectral width). However, this sort of modeling does not provide objective information on the measurement errors, which are mostly determined by (1) external noise and (2) statistical fluctuations of the signal itself. While modeling of the external noise/interference is relatively straightforward, a realistic description of the signal's statistics requires special attention. A simple way to “randomize” an ideal ACF is to add a “noise” ACF component that can be generated from “white” or “colored” noise, but that approach lacks clear physical justification. For a more adequate description of the statistical variability of the radar echoes, one has to adopt a realistic model of the scatterers, i.e., electron density irregularities. On average, the ionospheric irregularities are relatively weak, δNe/Ne≪1, where Ne is the electron density and δ represents a perturbation, so that most of the wave power penetrates through the plasma with only a small portion scattered back to the receiver. The average backscatter field at the reception point can then be adequately described by the single-scatter approximation [e.g., Rytov et al., 1988], where each point of the scattering volume represents a discrete source of an elementary field
$$A(\mathbf{r},t) = |A(\mathbf{r},t)|\,e^{i\varphi(\mathbf{r},t)} \qquad (1)$$
where the amplitude is proportional to the magnitude of the local electron density fluctuations, |A(r,t)|∝δNe(r,t), and the phase is defined by φ(r,t)=−(ωt+k·r), where ω and k are the angular frequency and wave vector associated with the radar signal and r is the total path followed by the ray. The field at the radar location then results from summation of the individual fields generated by the sources confined to the effective scattering volume (range gate). Statistical properties of the scattered field arise from the spatiotemporal variability of the individual fields, A(r,t), which are discussed in the following sections.
3 Implementation of the Simulator
 Operation of the simulator can be divided into four basic components: (1) individual scatterer, (2) collective properties, (3) radar operation, and (4) post-processing.
3.1 Individual Scatterer Model
 The fundamental elements of the simulator are the model scatterers. For this application, a scatterer is a point in space which reflects the radar signal. The behavior of scatterers is based on the model proposed by Moorcroft. Each scatterer, i, has a random time of appearance, ta, within a designated integration period that begins at time t0. For testing purposes, we also introduce an option to designate a finite scatterer lifetime, tl. This parameter is consistent with experimental observations from Ponomarenko et al. This results in “boxcar” scatterers with constant amplitude, i.e.,
$$|A_{max}(t)| = \begin{cases} 1, & t_a \le t \le t_a + t_l \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$
 The lifetime distribution of the scatterers can be set to either constant or exponential, i.e., tl=tc in the former case and tl=−tc ln(x) (where x is a uniformly distributed random variable between 0 and 1) in the latter case. In the future, it would be easy to introduce other models for the lifetime distribution of scatterers. The user inputs the constant tc and chooses the distribution. Table 1 lists all of the user inputs to the simulator. Note that if tc is set to a large value compared to the duration of an integration period, the scatterers will effectively have infinite lifetimes. The last step in initializing the model scatterers is to give each one a noise-like random velocity in the line-of-sight direction drawn from a Gaussian distribution, designated as δvi. Note that this velocity is distinct from the bulk drift velocity, which will be discussed in the following section. In reality, there is little evidence for these velocity fluctuations, but we have included them in the model for completeness [Villain et al., 1996]. The standard deviation of the distribution of the random velocity fluctuations, σv, is set by the user and can be assigned separately to each range gate. If the user sets this value to 0, random velocity fluctuations will not exist in the model.
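The scatterer initialization described above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's C code: the function and key names are hypothetical, the exponential lifetime is drawn with the standard inverse-transform rule −tc·ln(x) for uniform x, and the appearance times are assumed to be uniform over the simulated time span.

```python
import math
import random


def draw_lifetime(t_c, distribution, rng):
    """Draw one scatterer lifetime (seconds) for the chosen distribution."""
    if distribution == "constant":
        return t_c
    if distribution == "exponential":
        x = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        return -t_c * math.log(x)  # inverse-transform sample with mean t_c
    raise ValueError("unknown lifetime distribution: " + distribution)


def init_scatterers(n, t_span, t_c, distribution, v_sigma, seed=0):
    """Create n scatterers with random appearance time, lifetime and velocity.

    ASSUMPTION: appearance times are uniform over [0, t_span].
    """
    rng = random.Random(seed)
    scatterers = []
    for _ in range(n):
        scatterers.append({
            "t_start": rng.uniform(0.0, t_span),
            "lifetime": draw_lifetime(t_c, distribution, rng),
            # v_sigma = 0 switches the random velocity component off
            "v_rand": rng.gauss(0.0, v_sigma) if v_sigma > 0.0 else 0.0,
        })
    return scatterers


example = init_scatterers(2000, t_span=3.0, t_c=0.5,
                          distribution="exponential", v_sigma=20.0)
print(len(example), "scatterers initialized")
```

Choosing tc much larger than the integration period makes every drawn lifetime outlast the simulation, which reproduces the effectively infinite lifetimes mentioned in the text.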
Table 1. List of User Inputs to the Simulator
Description (symbol used in the text, where one is given)
flag to indicate if noise is included
white noise level
number of ACFs to integrate (Navg)
number of range gates (Nrang)
distance to first range gate in samples
flag indicating lifetime distribution of scatterers
number of pulses in pulse sequence (Npul)
smallest interpulse separation (mpinc)
flag indicating whether to include CRI
disappearance time constant (tc)
standard deviation of Gaussian velocity fluctuations
irregularity decay time (td)
irregularity growth time (tg)
amplitude factor of ACFs (amp0)
flag indicating if a range gate contains backscatter (qflg)
3.2 Collective Behavior Model
 The next step in the simulation is to integrate individual scatterers into a collective behavior model which determines statistical characteristics of the radar returns produced by the scatterers confined to a range gate. For this application, we consider a collection of a large number (n=2000) of elementary scatterers within a single range gate with linear dimension Δr. The number of scatterers was chosen as a trade-off between model validity and computing time. Each scatterer is assigned an initial position in two-dimensional space at a range ri from the radar, which is selected randomly within the parent range gate such that rg≤ri≤rg+Δr, where rg is the distance to the front edge of the range gate. We also assume that backscatter comes from the far zone, Δr/r≪1, so that we can neglect the difference in the geometrical decay factor for scatterers within a single range gate.
 We also use the collective behavior model to deal with characteristics which are common to all scatterers within a range gate. In general, the electron density fluctuations associated with ionospheric plasma irregularities are characterized by amplitude decay due to some kind of dissipation process, e.g., plasma diffusion. In our simulator, this is modeled through a combination of exponential growth and decay times, tg and td, respectively. It is worth noting that these two parameters are unrelated to the “boxcar” lifetime property discussed previously. This results in a reflected signal amplitude from the ith scatterer in a range gate at time t of
where the |Amax(t)| forces the amplitude to 0 outside of the scatterer lifetime, consistent with (2).
 Another characteristic which is shared by all scatterers within a common range gate is a collective Doppler velocity, vd. The total line-of-sight (LOS) velocity of a scatterer can then be expressed as vi=vd+δvi, where δvi is the random velocity component of the ith scatterer. Assuming that the Doppler shift of the echo from a single scatterer is fully determined by the LOS velocity, this can be expressed as ωD,i=2kvi, where k=2π/λ and λ is the radar transmit wavelength. In this case, the frequency of the elementary field reflected by the ith scatterer is ωi=ω0+ωD,i, where ω0 is the radar transmission frequency. The phase of the returned signal depends both on time and range to the target as well as the velocity of the target and can be expressed as
$$\varphi_i(t,r_i) = -\left(\omega_i t + 2 k r_i\right) \qquad (4)$$
where the factor of two represents the fact that the radar signal propagates from the radar to the target and back.
 As a result of (3) and (4), we can calculate the backscattered field at the radar location produced by a single scatterer at time t as
$$S_i(t) = |A_i(t)|\,e^{i\varphi_i(t,r_i)} \qquad (5)$$
where |Ai(t)| is described by a combination of (2) and (3), and φi(t,ri) is described by (4). Thus, assuming 2000 scatterers within a range gate, we can calculate the backscattered field from a single range gate r at the radar location at time t as
$$V_r(t) = \sum_{i=1}^{2000} S_i(t) \qquad (6)$$
 Note that Si and Vr are complex.
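The summation of individual scatterer fields into a range-gate voltage, referred to in the text as (6), can be illustrated with the Python sketch below. Several simplifications are assumptions made here and are not taken from the paper: the amplitude envelope is a pure exponential decay with time constant td (the growth term is omitted), the carrier is assumed to have been removed so that only the baseband phase remains, and the sign convention of the phase is chosen arbitrarily. All function and key names are hypothetical.

```python
import cmath
import math
import random

C_M_PER_S = 299_792_458.0  # speed of light in m/s


def gate_voltage(t, scatterers, v_d, freq_hz, t_d):
    """Complex voltage from one range gate at time t (seconds).

    Each scatterer is a dict with hypothetical keys:
      r         range from the radar in metres
      t_start   time of appearance in seconds
      lifetime  boxcar lifetime in seconds
      v_rand    random line-of-sight velocity in m/s
    """
    k = 2.0 * math.pi * freq_hz / C_M_PER_S  # radar wavenumber, 2*pi/lambda
    total = 0j
    for s in scatterers:
        age = t - s["t_start"]
        if age < 0.0 or age > s["lifetime"]:
            continue  # boxcar: no contribution outside the lifetime
        # ASSUMPTION: pure exponential decay, growth term omitted
        amplitude = math.exp(-age / t_d)
        v_los = v_d + s["v_rand"]
        # ASSUMPTION: baseband phase for the two-way path 2*k*(r + v*t)
        phase = -2.0 * k * (s["r"] + v_los * t)
        total += amplitude * cmath.exp(1j * phase)
    return total


# 2000 long-lived scatterers spread randomly over one 45 km gate
rng = random.Random(1)
gate = [{"r": 900e3 + rng.uniform(0.0, 45e3), "t_start": 0.0,
         "lifetime": math.inf, "v_rand": 0.0} for _ in range(2000)]
print(abs(gate_voltage(1.0, gate, v_d=350.0, freq_hz=12e6, t_d=0.05)))
```

Because the scatterer ranges are random on the scale of the wavelength, the individual phases are effectively random, and the magnitude of the sum fluctuates from one realization to the next. That is the statistical variability the collective scatter approach is meant to capture.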
4 Model Radar Operation
 Once the model ionosphere has been created, it is sampled by the model radar. As described previously, SuperDARN radars employ a multipulse sequence, and therefore the simulator does as well. The pulse sequence is defined by the user and passed to the simulator. An example of a SuperDARN pulse sequence, katscan, is shown in Figure 2 of the companion paper. In theory any sequence can be used; in the current implementation, three standard SuperDARN pulse sequences are automatically available: normalscan, katscan, and tauscan. The sampling is done by calculating the returns at discrete sample times using (6). This process begins at t0≈1 s, which allows for scatterer appearance and decay to reach a steady state condition. Sampling of returns from a particular pulse begins at t=t0+tpul+tfrang, where t is the time of the current sample, tpul is the time of the pulse, and tfrang is the time it takes for the signal to travel to the location of the first range gate (distance to first range gate, in samples, is set by the user). Subsequent samples from this pulse are calculated by incrementing by a single range gate (equivalent to incrementing t by the sample separation, smsep) Nrang times. The user is responsible for passing an array, qflg in Table 1, of size Nrang (number of range gates) to the simulator which contains flags to indicate which range gates contain backscatter. In reality, all of the simulated range gates contain scatterers, but only those with a qflg of 1 will be sampled, rendering the scatterers in range gates with a qflg of 0 invisible. The radar returns are sampled as voltages in the receiver. The radars operate with I (in-phase) and Q (quadrature) channels, meaning that the returns (as well as the voltage levels in (6)) consist of real and imaginary parts in quadrature.
 For each pulse, a single sample is collected from each range gate, resulting in a series of measurements Vr(k) where the index r is associated with the rth range gate and the index k indicates a sample associated with the kth pulse. The data sampling is performed continuously at the rate determined by the spatial resolution (typically 45 km). Therefore, from the kth pulse, the received voltage due to backscattered signal from a range gate r is sampled at a time of r∗smsep+tfrang after the pulse is emitted. In order to simplify sampling in the simulator, for a pulse sequence with Npul pulses, Npul separate voltage sequences of length Nrang are calculated, one for the returns from each pulse. The voltage sequences from different pulses are then superposed with the proper time offset in order to generate the final set of voltage samples, with only a single voltage for each sample time within the pulse sequence.
 Note that with this manner of sampling, cross-range interference (CRI) is present in the radar samples. That is, if a pulse p2 occurs before the last sample associated with a previous pulse p1, there is an ambiguity about whether subsequent returns are from p1 or p2. In actual radar operation, this effect is dealt with by averaging the returns from a number of pulse sequences so that the incoherent contribution from interfering range gates decreases at a rate of 1/√Navg, where Navg is the user-specified number of pulse sequences in the integration period. The user does, however, have the option to eliminate CRI from the simulated data. This is done by integrating range gates individually, which is equivalent to turning off the scatterers in all but one of the range gates.
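The superposition of per-pulse voltage sequences, and the way it produces CRI, can be seen in a toy Python example. This is only a sketch: the function name is hypothetical, the three-pulse table is an illustrative stand-in rather than a real SuperDARN sequence, and receiver blanking during transmission is not modeled.

```python
def superpose_pulses(per_pulse, pulse_table, mpinc_samples, first_gate_samples):
    """Combine per-pulse returns into one received sample stream.

    per_pulse[p][g] is the complex voltage returned from gate g by pulse p.
    pulse_table lists the pulse times in units of mpinc, and mpinc_samples
    is the number of sample separations (smsep) in one mpinc.
    """
    n_rang = len(per_pulse[0])
    n_samples = pulse_table[-1] * mpinc_samples + first_gate_samples + n_rang
    received = [0j] * n_samples
    for returns, pulse_pos in zip(per_pulse, pulse_table):
        # sample index at which the first gate of this pulse is received
        start = pulse_pos * mpinc_samples + first_gate_samples
        for gate, voltage in enumerate(returns):
            received[start + gate] += voltage  # overlapping returns add up
    return received


# Toy case: 3 pulses at 0, 2 and 3 mpinc, 8 gates, 5 samples per mpinc.
# Pulse p returns the constant voltage p + 1 from every gate.
pulse_table = [0, 2, 3]
per_pulse = [[complex(p + 1, 0.0)] * 8 for p in range(3)]
received = superpose_pulses(per_pulse, pulse_table,
                            mpinc_samples=5, first_gate_samples=0)
print(received)
```

In this toy case the third pulse is transmitted while returns from the second are still arriving, so some samples contain the sum of two different range gates. That overlap is the cross-range interference discussed above.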
4.3 ACF Calculation
 The next step is to calculate ACFs for each range gate from the samples recorded for the pulse sequence. An ACF consists of a series of complex samples at discrete integer lag times, each with a real part Re and an imaginary part Im. The lag times are multiples of the smallest spacing between two consecutive pulses in the multipulse sequence, mpinc, which is an integer multiple of smsep. The lag times are due to all possible differences, tj−ti, where i,j=1,2,...,Npul. The value of the ACF R from the pth pulse sequence at a particular integer lag l is calculated as
$$R_p(l) = V_r^{*}(t)\,V_r(t+\tau) \qquad (7)$$
where the asterisk indicates a complex conjugate, τ=l∗mpinc, and l is the integer lag number of the ACF sample. The process of calculating voltage returns and ACFs is performed Navg times. Once this process is complete, the ACFs from the individual pulse sequences are averaged (integrated) in order to produce a single ACF for each range gate. Specifically, the final ACF sample at lag l can be calculated as
$$R(l) = \frac{1}{N_{avg}} \sum_{p=1}^{N_{avg}} R_p(l) \qquad (8)$$
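The lag-product and averaging steps can be sketched in Python as follows. The names are hypothetical, the choice of which sample in each pair is conjugated is an assumption made for this illustration, and the toy pulse table is not a real SuperDARN sequence. Lags contaminated by pulse overlap or CRI are not flagged here.

```python
import cmath


def lag_pairs(pulse_table):
    """Map each available lag (in mpinc) to one pulse pair that produces it."""
    pairs = {}
    for i, first in enumerate(pulse_table):
        for second in pulse_table[i:]:
            pairs.setdefault(second - first, (first, second))
    return pairs


def acf_one_sequence(samples, pulse_table, mpinc_samples, gate_offset):
    """ACF of one range gate from the samples of a single pulse sequence.

    gate_offset is the sample delay of the gate after each pulse.
    ASSUMPTION: the earlier sample of each pair is the conjugated one.
    """
    acf = {}
    for lag, (p1, p2) in lag_pairs(pulse_table).items():
        v1 = samples[p1 * mpinc_samples + gate_offset]
        v2 = samples[p2 * mpinc_samples + gate_offset]
        acf[lag] = v1.conjugate() * v2
    return acf


def average_acfs(acfs):
    """Average a list of per-sequence ACFs lag by lag."""
    n = len(acfs)
    return {lag: sum(a[lag] for a in acfs) / n for lag in acfs[0]}


# Toy input: a pure tone advancing 0.1 rad per sample
samples = [cmath.exp(0.1j * n) for n in range(30)]
acf = acf_one_sequence(samples, [0, 2, 3], mpinc_samples=5, gate_offset=2)
for lag in sorted(acf):
    print(lag, round(cmath.phase(acf[lag]), 3))
```

For the pure tone the ACF phase grows linearly with lag, which is the behavior a constant Doppler velocity is expected to produce.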
5 Post-Processing
 In order to make the simulated data more usable, some post-processing is performed.
5.1 Amplitude Normalization
 Lag zero power, P(0), is the power level of an ACF at lag zero. It is calculated according to
$$P(0) = \sqrt{\mathrm{Re}[R(0)]^2 + \mathrm{Im}[R(0)]^2} \qquad (9)$$
 Because of the manner in which the simulator operates, the P(0) of all of the range gates will fluctuate around some arbitrary value, which has no particular meaning. Therefore, all of the ACFs are normalized to an average P(0) of 1 (arbitrary units) to allow for scaling to a user-defined value (amp0) and introduction of scaled noise into the signal. In order to do this, the average P(0) value of all of the range gates which contain backscatter, excluding any range gates which could contain CRI in lag zero, is calculated. The ACFs of all of the range gates which contain backscatter are then normalized by this value, resulting in all range gates containing scatter having an average P(0) of 1. The ACFs can then be scaled in order to produce ACFs with average amplitudes of any value desired. Currently, the simulator also includes an option to force the signal to decay as 1/r², where r is range from the radar. The reason for this is that in real-life situations, signal amplitude decays with range, and 1/r² is a plausible dependence. This is the only step in the simulation where real propagation conditions can be considered. This decay is implemented after normalization and scaling.
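A minimal Python sketch of the normalization and scaling step is given below. The function name and argument layout are hypothetical, the exclusion of gates with CRI in lag zero is left out for brevity, and the optional 1/r² decay is referenced to the first range gate, which is an assumption of this sketch.

```python
def normalize_acfs(acfs, qflg, amp0=1.0, ranges_km=None):
    """Scale ACFs so gates with backscatter have mean lag-zero power amp0.

    acfs[g] is the list of complex lag values for gate g; qflg[g] is 1 if
    gate g contains backscatter. If ranges_km is given, a 1/r^2 decay
    relative to the first gate is applied after scaling (ASSUMPTION).
    """
    lit = [g for g, flag in enumerate(qflg) if flag]
    mean_p0 = sum(abs(acfs[g][0]) for g in lit) / len(lit)
    scaled = []
    for g, acf in enumerate(acfs):
        if not qflg[g]:
            scaled.append(list(acf))  # gates without backscatter are untouched
            continue
        factor = amp0 / mean_p0
        if ranges_km is not None:
            factor *= (ranges_km[0] / ranges_km[g]) ** 2
        scaled.append([factor * value for value in acf])
    return scaled


acfs = [[2 + 0j, 1 + 1j], [4 + 0j, 2j], [9 + 0j, 0j]]
qflg = [1, 1, 0]
flat = normalize_acfs(acfs, qflg, amp0=10.0)
decayed = normalize_acfs(acfs, qflg, amp0=10.0, ranges_km=[180.0, 360.0, 540.0])
print([abs(a[0]) for a in flat])
print([abs(a[0]) for a in decayed])
```

Because every gate with backscatter is multiplied by the same factor, the relative gate-to-gate fluctuations in P(0) survive normalization; only their mean level is fixed.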
5.2 Introduction of Noise
 The user of the simulator is able to set an option to model external noise by adding white noise ACFs to the simulated signal. If this option is set, then a second set of ACFs is calculated in the same fashion as before, where the scatterers have zero velocity, zero growth time, infinite lifetime, and a decorrelation time much less than mpinc. This causes the returned signal to correlate only with itself, resulting in ACFs of δ-correlated white noise. These ACFs are scaled by a value provided by the user to produce the desired SNR. The noise level is set relative to the magnitude of the signal ACFs. These noise ACFs are then added to the post-processed signal ACFs, which is the final product returned to the user. Note that the noise level is relative to the signal level at range gate 0, so if the user selects to have power decay with range, SNR will subsequently also decay with range.
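The effect of adding a δ-correlated noise ACF can be illustrated in Python. Note that this sketch swaps in a simpler technique than the one described above: instead of rerunning the scatterer model with a very short decorrelation time, it draws independent complex Gaussian voltages directly. The function names are hypothetical, and SNR is interpreted here as a lag-zero power ratio in dB.

```python
import math
import random


def white_noise_acf(n_lags, n_avg, rng):
    """Averaged ACF of unit-power, delta-correlated complex Gaussian noise."""
    sigma = math.sqrt(0.5)  # per-component std so that <|v|^2> = 1
    acf = [0j] * n_lags
    for _ in range(n_avg):
        v = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(n_lags)]
        for lag in range(n_lags):
            acf[lag] += v[0].conjugate() * v[lag]
    return [value / n_avg for value in acf]


def add_noise(signal_acf, snr_db, n_avg, rng):
    """Add a white-noise ACF scaled to the requested SNR at lag zero."""
    noise_level = abs(signal_acf[0]) / 10.0 ** (snr_db / 10.0)
    noise = white_noise_acf(len(signal_acf), n_avg, rng)
    return [s + noise_level * n for s, n in zip(signal_acf, noise)]


rng = random.Random(0)
signal = [100 + 0j, 50 + 50j, 0j, 0j, 0j]
noisy = add_noise(signal, snr_db=10.0, n_avg=4000, rng=rng)
print([round(abs(value), 2) for value in noisy])
```

Only lag zero acquires a systematic offset, because each noise sample correlates with itself alone; the other lags receive a residual that shrinks as more sequences are averaged.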
 Alternatively, the user can choose to have the raw voltage samples returned instead of the calculated ACFs. This mimics actual radar operation in that data can be stored as (1) averaged ACFs in RAWACF files or (2) as voltages at the sampling times in IQDAT files, both of which are standard SuperDARN file formats. With option (2), the averaged ACFs can be obtained in post-processing but there is a considerable storage requirement (1 day of radar IQDAT data requires ≈1.5 GB of storage).
 The simulator has been designed to produce realistic data including statistical fluctuations. A real SuperDARN ACF from the Fort Hays East, Kansas radar recorded on 2 April 2012 at 05:30 UT using the katscan pulse sequence and Navg=21 is shown in Figure 1a. In this case, the sample separation is 300 μs, corresponding to a range separation of 45 km, and the basic lag time is 1500 μs. The fitted parameters for these data are as follows: td=55 ms, vLOS=365 m/s, and SNR=9 dB. An example of an ACF that has been generated with the simulator is shown in Figure 1b. This ACF was generated with the katscan pulse sequence, Navg=21, td=50 ms, vLOS=350 m/s, R(0)=10,000, and SNR=9 dB. It is apparent from this figure that realistic statistical fluctuations are present in the simulated ACFs, as the two ACFs display similar properties in terms of both phase progression and amplitude decay. Note that lags are missing in the ACF derived from the data (indicated by diamonds) owing to bad samples from pulse-overlap interference and CRI.
 In order to test whether the statistical fluctuations present in simulated ACFs are at correct levels, ACF power and phase fluctuations were examined. A total of 1000 ACFs were simulated with td=5 ms, vLOS=350 m/s, and Navg=50. The effects of Gaussian velocity spread, scatterer disappearance, irregularity growth, CRI, and white noise were set to be negligible in this simulation. These parameters can be ignored because they do not affect the statistical fluctuation level. Figure 2 shows a histogram of ACF lag power which shows that the simulator accurately reproduces the statistical power fluctuations. The x axis shows ACF lag time and the y axis shows normalized ACF power, calculated using (9). The color coding indicates the number of simulated ACFs with lags in a power bin, and the diamonds represent the mean values. The solid curve represents the ideal ACF power curve, the vertical dash-dotted line represents the decorrelation time td, and the horizontal dash-dotted line represents the e-folding power. The fact that the diamonds follow the ideal curve very closely for t<td indicates that the simulator is behaving as expected. The horizontal dashed line represents the statistical fluctuation level, σ=P(0)/√Navg. This fluctuation level is the magnitude of the expected value of the fluctuation of the ACF power level [Ponomarenko and Waters, 2006]. Thus, as ACF power approaches zero in later lags, the expectation is that statistical fluctuations become the dominant source of power. The fact that the diamonds are very close to σ for the later lags indicates that statistical fluctuations are being reproduced properly.
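The inverse-square-root dependence of the fluctuation floor on Navg can be motivated with a short calculation. This is a sketch under stated assumptions and is not reproduced from the cited reference: at lags well beyond td, the per-sequence lag products R_p are taken to be independent and zero-mean, with mean-square magnitude P(0)², as holds for independent complex Gaussian voltages.

```latex
\left\langle \left| \frac{1}{N_{avg}} \sum_{p=1}^{N_{avg}} R_p \right|^2 \right\rangle
  = \frac{1}{N_{avg}^2} \sum_{p=1}^{N_{avg}} \left\langle |R_p|^2 \right\rangle
  = \frac{P(0)^2}{N_{avg}}
\quad\Longrightarrow\quad
\sigma_{\mathrm{rms}} = \frac{P(0)}{\sqrt{N_{avg}}}
```

The cross terms vanish because the products from different pulse sequences are independent and zero-mean. For Navg=50 and a normalized P(0) of 1, this gives a floor of about 0.14.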
 Figure 3 shows a similar plot to Figure 2 for the phase variation. The data are taken from 1000 ACFs simulated with Navg=50, td=30 ms, and vd=350 m/s. The color coding is for number of ACF lags in a particular lag phase bin. The solid line shows an idealized lag phase progression for vLOS=350 m/s. It is evident that the simulated ACFs show a phase progression that is consistent with what one would expect for the simulated Doppler velocity. It is also apparent that as lag time increases, the variability in the phase also increases. This is expected and occurs because ACF amplitude decays with time while the statistical fluctuation level remains constant, meaning that statistical fluctuations become more prominent in the ACF.
 We have developed a robust, physically based SuperDARN data simulator which is able to model radar returns from ionospheric irregularities. Statistical fluctuations are well-modeled by the simulator. This simulator can be used to generate realistic data for the purpose of testing the processing of radar returns into higher-order products under controlled conditions. In the companion paper, the simulator is used to compare several methods of processing radar returns for Doppler velocity and spectral width. The simulator can be adapted to test processing algorithms for other types of pulsed ionospheric radars.
 The authors thank the National Science Foundation for support under grants AGS-0849031 and AGS-0946900.