TEC Map Completion Using DCGAN and Poisson Blending

Because of the limited coverage of global navigation satellite system (GNSS) receivers, total electron content (TEC) maps are incomplete. The processing to obtain complete TEC maps is time-consuming and requires the collaboration of five international GNSS service (IGS) centers to consolidate the final completed IGS TEC maps. The advance of deep learning offers powerful tools for certain tasks in data science, such as image completion (or inpainting). Among them, the deep convolutional generative adversarial network (DCGAN) is capable of learning the properties of objects and recovering missing data effectively. With years of IGS TEC maps for training, the combination of DCGAN and Poisson blending (DCGAN-PB) can effectively learn the completion process of IGS TEC maps. Both random and more realistic masks are used to test the performance of DCGAN-PB. The results with random masks (15-40% missing data) show that DCGAN-PB achieves better TEC map completion than DCGAN alone and that more training data significantly improve its generalization. For the cross-validation experiment using the realistic mask from Massachusetts Institute of Technology (MIT)-TEC data (~52% missing data), DCGAN-PB achieves an average root mean squared error of about three absolute TEC units (TECu) for high solar activity years and less than two TECu for low solar activity years, which is about a 50% reduction in the mean and more than a 50% reduction in the standard deviation compared to two conventional single-image inpainting methods. The DCGAN-PB model can lead to an efficient automatic completion tool for TEC maps that minimizes manual work.


Introduction
Total electron content (TEC) is an important parameter characterizing the ionospheric plasma number density (Mannucci et al., 1998). In particular, dynamic TEC variations can be used to identify traveling ionospheric disturbances, which can be caused by geomagnetic storm events. TEC also influences the communication between satellites and ground stations and has been included as a parameter in space weather forecasting (Afraimovich & Astafyeva, 2008; Azeem et al., 2015; Coster et al., 2017; Jakowski et al., 2002; Lyons et al., 2019; McGranaghan et al., 2017; McGranaghan et al., 2018; Zhang et al., 2017). The measurement of TEC values requires receivers on the Earth's surface. However, coverage is limited; for example, there are almost no receivers over the oceans, which cover about 70% of the Earth's surface, so the observational data of TEC are incomplete for a global map. When the observed TEC values are mapped onto a 2-D geographical map, large regions of missing data appear over the oceans. Massachusetts Institute of Technology (MIT)-TEC is a large public database recording a huge number of TEC observations since 2000, currently involving 6,000+ global navigation satellite system (GNSS) receivers. On the other hand, the international GNSS service (IGS) has been providing IGS Global TEC (IGS TEC) data since June 1998, through the cooperation of five members of the IGS Ionosphere Associate Analysis Centers (IAACs) (Hernández-Pajares et al., 2009). A complicated process, including analysis and validation algorithms, leads to the final IGS TEC maps. First, all five IAACs apply their distinct algorithms to fit the observations from a few hundred GNSS sites to form 2-D TEC maps. Then the results are compared based on different parameters, such as vertical TEC (VTEC) performance, slant TEC (STEC) performance, and differential code bias (DCB) estimations, to get the best-fitted map.
A large amount of manual work is thus needed to finally produce a single complete global TEC map. To alleviate this manual work, automatic processing methods are in demand to provide a timely and reliable solution.
Artificial intelligence (AI) has been a blooming research topic for the last decade. It shows great potential for replacing human beings in challenging and time-consuming tasks, such as language translation, gaming, and driverless automobiles (Stallkamp et al., 2012). AI techniques have been adopted in space science, mostly for prediction and pattern recognition. For example, a support vector machine (SVM) model was developed to predict high-latitude ionospheric phase scintillation (McGranaghan et al., 2018), and a deep learning model was used to extract key local auroral structures from a large number of auroral images to lessen the intense labor of human experts (Yang et al., 2019). The recent fast growth of AI has been spurred by deep learning algorithms. The deep neural network (DNN) is one of the most representative deep learning algorithms; it discovers the properties of a huge amount of input data by processing them in multiple neural network (NN) layers (Cireşan et al., 2012). Each layer is composed of neurons computing a weighted sum, usually followed by a nonlinear activation, of outputs from the previous layer. Finally, the output layer makes either a classification or an estimation based on the problem at hand. The deep convolutional neural network (DCNN), one important category of DNN, uses convolutional operations in hidden layers to extract hierarchical feature patterns in data. It can take 1-D or multidimensional arrays, such as 2-D images, as the input and adds a convolution operation in each layer of the DNN (Dumoulin & Visin, 2016; Krizhevsky et al., 2012). DCNN requires little hand-engineering of data (e.g., images) compared to conventional machine learning methods because it learns through training the best filters to extract the most useful features in data, which is particularly suitable for automatic completion of TEC maps.
In 2014, the generative adversarial network (GAN) was proposed (Goodfellow et al., 2014), which can generate artificial data closely resembling real data. GAN includes two DNNs: the generator and the discriminator. The generator produces fake data from a random initialization to mimic the real data, while the discriminator is used to distinguish the fake data from the real data. Back propagation is used to improve the performance of the generator and the discriminator in a competition between the two DNNs. Successful training of GAN leads to artificial data that are hardly distinguishable from the real data (Yeh et al., 2017) and, therefore, a model capable of representing the physical situation. The deep convolutional GAN (DCGAN) (Radford et al., 2015) has been utilized to successfully generate fake human face images (Kim, 2016). An improved version of DCGAN has been applied to recover artificially masked human face images and showed decent recovery results (Amos, 2016). One of the reasons that neural networks perform so well in these applications is the availability of large databases. Usually, the target is a class of objects, for instance, bikes or road signs, which share common properties such as characteristic pixel distributions. The more data available, the better neural networks become by repeatedly optimizing their weights. Moreover, neural networks are highly customizable: with a customized number of convolutional layers, they can dig out hidden features. Deep learning codes and records empirical knowledge in a way similar to human beings.
Recently, we proposed a regularized DCGAN (RDCGAN) for data completion of TEC maps. The RDCGAN work aimed to complete the MIT TEC maps using DCGAN. Since there were no complete MIT TEC maps for training, RDCGAN was developed with an additional reference discriminator (i.e., RDCGAN has two discriminators and one generator), where the generated TEC maps are regularized by the complete IGS TEC maps to improve the TEC completion performance. In this work, we aim to learn the IGS TEC completion process through DCGAN, which is considerably simpler than RDCGAN. This work is not a direct extension of the RDCGAN work but rather a parallel development of machine learning methods for automatic space physics data processing.
Although DCGAN can generate missing TEC values whose content aligns with the surrounding available observations, those generated values may have a baseline shift and thus fail to join the surrounding observations continuously. Directly overlaying the DCGAN-generated TEC values on the missing regions of the original TEC map therefore leads to a "mosaic" artifact. To address this issue, we apply Poisson blending (Pérez et al., 2003) after DCGAN, which blends the values according to the gradients around the gaps instead of directly using the recovered values from DCGAN. Poisson blending has wide applications, such as the robust reconstruction of finely detailed 3-D surfaces from point samples (Kazhdan et al., 2006). As shown by our experiments, this postprocessing step leads to significantly improved completion performance. We also investigate the influence of training data size and missing-data pattern on the proposed DCGAN model and compare DCGAN with conventional image inpainting methods.
The paper is outlined as follows: We introduce DCGAN and Poisson blending in section 2, followed by the data and numerical experiments in section 3, where we lay out strategies to evaluate the performance of our model. With the qualitative and quantitative results given in section 4 and the discussion in section 5, we conclude our work in section 6. A more detailed description of the DCGAN architecture is given in Appendix A.

DCGAN for TEC Map Completion
GANs (Goodfellow et al., 2014) are made of two neural networks, the generator (G) and the discriminator (D). The generator is used to produce artificial data that mimic the real data, while the discriminator is used to distinguish the artificial data from the real data. The two networks are trained in a competitive way until the discriminator is no longer able to distinguish the artificial data from the real data. To stabilize the training, we followed the architecture guidelines for stable DCGAN learning (Radford et al., 2015), including (1) replacing any pooling layers with convolution (discriminator) and de-convolution (generator); (2) using batch normalization in both the generator and the discriminator; (3) removing fully connected hidden layers; (4) using the rectified linear unit (ReLU) activation in the generator for all layers except the output, where the hyperbolic tangent activation is used; and (5) using the leaky ReLU activation in the discriminator, to fix the zero-activation problem of ReLU, for all layers except the output, where the sigmoid activation is used. The DCGAN architecture used in this work is similar to that of our previous work and is given in Appendix A.
Once DCGAN has generated the missing TEC values from a random initialization, these values need to be mapped to match the context of the surrounding observed TEC values. A joint context loss and prior loss (Yeh et al., 2017) are used to find z*, given the random initialization z, the incomplete TEC map y, and the DCGAN output G(z) (see Appendix A and Figure A2). The mapping from z to z* is obtained by minimizing the joint loss function

z^* = \arg\min_z \left[ L_C(z \mid y, M) + \lambda L_p(z) \right],    (1)

where L_C(z \mid y, M) = \| W \odot (G(z) - y) \|_1 is the context loss, L_p(z) = \log(1 - D(G(z))) is the prior loss, \odot stands for element-wise multiplication, and G and D denote the generator and the discriminator, respectively. The binary mask M comes from y, where the pixels (locations) of data gaps are marked as 0 and those with available TEC values as 1. The weighting parameters W are defined as

W_i = \frac{1}{N} \sum_{j \in \aleph_i} (1 - M_j)  if  M_i = 1,   and   W_i = 0  if  M_i = 0,

where i and j are pixel indices, \aleph_i is the neighborhood of pixel i, and N is the number of neighbor pixels. The weights are larger for observed pixels closer to the missing-data region and zero for pixels with missing data or surrounded entirely by observed data. Specifically, the weight of an observed pixel is nonzero only when some of its neighbors are missing and equals the fraction of missing pixels among its neighbors; for example, if 1/3 of the neighbor pixels of pixel i are missing, the weight is 1/3. The prior loss is simply the discriminator loss used in DCGAN training. The context loss guides the data completion locally, while the prior loss maintains the global appearance of the completed TEC maps. The factor λ adjusts the trade-off between the two loss functions.
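As a concrete illustration, the importance weights W can be computed from the binary mask alone. The following is a minimal NumPy sketch; the function name `importance_weights` and the 7 × 7 default neighborhood are our own illustrative choices, not fixed by the text (the sketch also counts the center pixel in the neighborhood average for simplicity):

```python
import numpy as np

def importance_weights(M, window=7):
    """Importance weights W for the context loss.

    M is a binary mask (1 = observed TEC value, 0 = missing).
    For each observed pixel, the weight is the fraction of missing
    pixels in its (window x window) neighborhood; weights are zero
    for missing pixels and for observed pixels with no missing neighbors.
    """
    M = np.asarray(M, dtype=float)
    H, Wd = M.shape
    r = window // 2
    # Pad with 1s (treat out-of-map area as "observed") so that
    # border weights are not artificially inflated.
    P = np.pad(M, r, mode="constant", constant_values=1.0)
    out = np.zeros_like(M)
    for i in range(H):
        for j in range(Wd):
            if M[i, j] == 1:
                nb = P[i:i + window, j:j + window]
                out[i, j] = 1.0 - nb.mean()  # fraction of missing neighbors
    return out
```

Pixels deep inside a fully observed region get weight 0 and thus do not contribute to the context loss; pixels ringing a gap get the largest weights, which concentrates the optimization of z on the gap boundary.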

Poisson Blending
After TEC map completion using DCGAN, Poisson blending is used as a postprocessing step to further match the filled TEC values in the gap with the surrounding available TEC values. Given z*, the solution of Equation 1, the general idea of Poisson blending is to fill the data gaps based on the gradients of G(z*) inside the gap, rather than the simple "copy and paste" of the G(z*) values directly (Pérez et al., 2003; Yeh et al., 2017). Let Ω be the gap region of missing data, that is, the pixels with M_i = 0. The final TEC values f inside Ω are obtained by the following minimization:

f = \arg\min_f \int_{\Omega} \left| \nabla f - \nabla G(z^*) \right|^2,   subject to  f_i = y_i  for  M_i = 1,

where ∇ is the gradient operator; the f values outside Ω keep the original TEC values (i.e., for M_i = 1).
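This minimization amounts to solving a discrete Poisson equation with the observed pixels as boundary conditions. Below is a minimal NumPy sketch using Jacobi iterations; the function name and iteration count are illustrative. Note that `np.roll` gives periodic boundaries, which happens to match the cyclic-longitude assumption discussed in section 5, although a production solver would treat the latitude edges differently:

```python
import numpy as np

def poisson_blend(y, g, M, iters=500):
    """Blend the generator output g into the target map y.

    M: binary mask, 1 = observed (keep y), 0 = gap (fill from g's gradients).
    Solves the discrete Poisson equation by Jacobi iteration: inside the
    gap, the Laplacian of f matches the Laplacian of g, while f equals y
    on the observed pixels.
    """
    y = np.asarray(y, float)
    g = np.asarray(g, float)
    f = np.where(M == 1, y, g)
    gap = (M == 0)
    # Discrete Laplacian of the source g (periodic via np.roll).
    lap_g = (np.roll(g, 1, 0) + np.roll(g, -1, 0) +
             np.roll(g, 1, 1) + np.roll(g, -1, 1) - 4.0 * g)
    for _ in range(iters):
        nb = (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
              np.roll(f, 1, 1) + np.roll(f, -1, 1))
        f_new = (nb - lap_g) / 4.0
        f = np.where(gap, f_new, f)  # observed pixels stay fixed
    return f
```

When the generator output has zero gradient inside the gap, the blended result is simply the harmonic interpolation of the surrounding observed values, which is why baseline shifts in G(z*) are absorbed.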
As shown in section 4, directly filling the gap areas with G(z*) leads to TEC value mismatches and thus mosaic-looking artifacts in the completed maps. After applying Poisson blending, the TEC values transition smoothly from the data-available area to the gap area, and the mosaic-looking artifacts are largely suppressed.

Data and Experiments
All IGS TEC data are downloaded from https://cdaweb.gsfc.nasa.gov/pub/data/gps/ (Hernández-Pajares, 2004; Web, 2019) for the time period from 1 June 1998 to the present. Each IGS TEC map is averaged over a 2-hr interval in Universal Time (UT) and resized from a 73 × 71 matrix to a 64 × 64 matrix using 2-D cubic interpolation (Shepard, 1968), corresponding to a spatial resolution of 5.6° in longitude and 2.8° in latitude. Abnormal data outside the range of usual TEC values (>999 absolute TEC units [TECu]) are discarded. The total number of available IGS TEC maps for training and testing is 91,579.
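The preprocessing described above can be sketched as follows. The bilinear resize here is a simple stand-in for the paper's 2-D cubic interpolation, and the function names are our own illustrative choices:

```python
import numpy as np

def resize_bilinear(a, shape):
    """Resize a 2-D array with bilinear interpolation (a stand-in for
    the 2-D cubic interpolation used in the paper)."""
    H, W = a.shape
    h, w = shape
    rows = np.linspace(0, H - 1, h)
    cols = np.linspace(0, W - 1, w)
    r0 = np.floor(rows).astype(int); r1 = np.minimum(r0 + 1, H - 1)
    c0 = np.floor(cols).astype(int); c1 = np.minimum(c0 + 1, W - 1)
    fr = (rows - r0)[:, None]   # fractional row offsets
    fc = (cols - c0)[None, :]   # fractional column offsets
    top = a[np.ix_(r0, c0)] * (1 - fc) + a[np.ix_(r0, c1)] * fc
    bot = a[np.ix_(r1, c0)] * (1 - fc) + a[np.ix_(r1, c1)] * fc
    return top * (1 - fr) + bot * fr

def preprocess(tec_map):
    """Resize a 73 x 71 IGS TEC map to 64 x 64 and reject abnormal maps
    (values outside the usual TEC range, > 999 TECu)."""
    tec_map = np.asarray(tec_map, float)
    if np.any(tec_map > 999):
        return None  # abnormal map, discarded
    return resize_bilinear(tec_map, (64, 64))
```

The resized 64 × 64 shape is chosen to match the power-of-two input sizes that the strided DCGAN convolutions expect.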

Random Mask Experiment
In the first experiment, we demonstrate the impact of the amount of training data and of Poisson blending on the completion performance of DCGAN using random masking. DCGAN was first trained using (1) 2 years of IGS TEC maps (January 2010 to December 2011), the same time period as in our previous work, and (2) about 12 years of IGS TEC maps (June 1998 to December 2010), covering one solar cycle. Three TEC maps, at 08:00 UT on 7 October 2012, 12:00 UT on 17 March 2015, and 16:00 UT on 18 May 2019, were selected as test data, representing medium, high, and low solar activities, as shown in Figure 1. Three types of random masking are tested, as shown in Figure 2: (1) 15% missing data with a 2 × 2 minimum gap size (the minimum gap size is the smallest size of any missing-data area; each pixel spans 5.6° in longitude [x] and 2.8° in latitude [y]); (2) 40% missing data with a 2 × 2 minimum gap size; and (3) 15% missing data with a 4 × 4 minimum gap size. In addition to the completed TEC maps for the three representative times shown in Figure 1, the root mean squared error (RMSE) over 360 randomly selected IGS TEC maps from the years 2012, 2015, and 2019 is used as a quantitative measure:

\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( f_i - \tilde{f}_i \right)^2 },

where f_i is the original TEC value, \tilde{f}_i is the recovered TEC value, and N is the total number of missing TEC values.
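A random mask with a minimum gap size and the RMSE over missing pixels can be sketched as follows. The block-placement strategy is one plausible way to realize "random masking with a minimum gap size"; the paper does not specify the exact procedure:

```python
import numpy as np

def random_mask(shape, frac_missing, gap=2, rng=None):
    """Random mask with a minimum gap size: place gap x gap blocks of
    missing pixels at random positions until roughly frac_missing of
    the map is masked (0 = missing, 1 = observed)."""
    rng = np.random.default_rng(rng)
    M = np.ones(shape, dtype=int)
    target = frac_missing * M.size
    while (M == 0).sum() < target:
        i = rng.integers(0, shape[0] - gap + 1)
        j = rng.integers(0, shape[1] - gap + 1)
        M[i:i + gap, j:j + gap] = 0
    return M

def rmse(f, f_rec, M):
    """RMSE over the missing pixels only, as in the text."""
    miss = (M == 0)
    return float(np.sqrt(np.mean((f[miss] - f_rec[miss]) ** 2)))
```

Because blocks may overlap, the final missing fraction slightly overshoots the target by at most one block's worth of pixels.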
In addition to RMSE, we applied the Bland-Altman (BA) plot (Altman & Bland, 1983) as another criterion to quantify TEC completion performance. The BA plot is used to evaluate the agreement between the original TEC values and the TEC values filled by the different methods. The x and y coordinates on the BA plot for the ith missing TEC value are calculated as

x_i = \frac{f_i + \tilde{f}_i}{2},    y_i = \tilde{f}_i - f_i.

Thus, the BA plot highlights the deviation of the filled TEC values from the true ones, with the bias and 95% confidence intervals labeled as horizontal lines.
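The BA coordinates and summary statistics (bias and 95% limits of agreement, bias ± 1.96 SD) can be computed directly. The sign convention y_i = \tilde{f}_i − f_i used here is our reading of the BA construction and is an assumption about the exact convention in the figures:

```python
import numpy as np

def bland_altman(f, f_rec):
    """Bland-Altman coordinates and summary statistics.

    x_i = (f_i + f_rec_i) / 2  (mean of true and recovered value)
    y_i = f_rec_i - f_i        (difference)
    Returns the per-point coordinates, the mean difference (bias),
    and the 95% limits of agreement (bias +/- 1.96 * SD).
    """
    f = np.asarray(f, float)
    f_rec = np.asarray(f_rec, float)
    x = (f + f_rec) / 2.0
    d = f_rec - f
    bias = d.mean()
    sd = d.std(ddof=1)  # sample standard deviation of the differences
    return x, d, bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

A method with zero bias and narrow limits of agreement recovers the missing values with neither a systematic offset nor large scatter, which is the behavior reported for DCGAN-PB in section 4.3.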

MIT-TEC Mask Experiment
In the second experiment, we extract a more realistic mask from the MIT-TEC data (obtained from the Madrigal database, http://www.openmadrigal.org), the so-called MIT-TEC mask, as shown in Figure 3. The missing data, mostly in the oceans and polar regions, are about 52% of the total data (white regions in Figure 3). This represents a more realistic and challenging case for TEC map completion, as the missing part takes up a large portion of the map and is contiguous. In this case, about 18 years of IGS TEC data (see section 3.3) were used for DCGAN training, followed by Poisson blending ("DCGAN-PB") to obtain the final completed maps.
For comparison, two traditional image inpainting techniques, TELEA (Telea, 2004) and Navier-Stokes (NS) (Bertalmio et al., 2001), are also applied to complete the TEC maps. Starting from the intensities on the periphery of a data gap, these two inpainting methods fill the gap progressively from its exterior to its interior by applying weights to the known values surrounding each pixel to be filled. Both methods have shown decent image inpainting performance (Telea, 2004; Bertalmio et al., 2001).
Since it is more meaningful for space weather applications to evaluate the performance of different methods under high solar activities than under low solar activities, three representative time points are selected: 14:00 UT on 14 July 2000, 10:00 UT on 1 November 2003, and 12:00 UT on 17 March 2015. As shown in section 4, the completed TEC maps and RMSE from our DCGAN-PB method, TELEA, and NS are compared. In order to avoid the appearance of test cases in the training set, ten-fold cross-validation of DCGAN-PB is applied: in each fold, 18 years of TEC maps are used for training and the remaining 2 years for testing. We select three cross-validation sets to inpaint the TEC maps at those three time points.

Cross-Validation Experiment
To evaluate the performance of the proposed DCGAN-PB method in a systematic way, we divide 20 years of IGS TEC data (1999-2018) into 10 high solar activity years (F10.7 > 100 sfu) and 10 low solar activity years (F10.7 ≤ 100 sfu), according to the yearly average F10.7. As shown in Table 1, each of the 10 sets includes one randomly selected high activity year and one randomly selected low activity year, and ten-fold cross-validation is applied. Specifically, DCGAN is trained using nine sets (18 years of data), while the remaining set (a pair of high and low solar activity years) not used in training serves as test data. The process is repeated 10 times so that every set serves as test data once.

Random Mask Results
Figure 4 shows the TEC maps completed by the 2-year trained DCGAN for the three test maps in Figure 1 and the different random masks shown in Figure 2. DCGAN fills the TEC data gaps well in some places, for example, the high TEC value region in the 2012 maps (top row). In addition, the larger the portion of missing data, the worse the completed map. However, the completed maps suffer from mosaic artifacts, for example, in the middle row for high solar activity in 2015, as the TEC values generated by DCGAN in the missing-data region have some baseline shift from the surrounding region. One possible reason for the worse performance in 2015 than in 2012 is that the training data (2010-2011) contain mostly low solar activity maps, leading to a trained model that generalizes poorly to high solar activities.
As we increase the training data from 2 years to about 12 years, the completed TEC maps are notably improved, as shown in Figure 5. Since one solar cycle is about 11 years, training data covering all the cases of TEC maps, such as high solar activity years (with more large geospace storms and F10.7 > 100 sfu) and low solar activity years (F10.7 ≤ 100 sfu), lead to much improved completion performance, as shown in the comparison of Figures 5 and 4. Similarly, the completed maps with less missing data and smaller gaps are better than those with more missing data and larger gaps. Nevertheless, the mosaic-looking artifacts still exist in the completed TEC maps. The mosaic artifacts can be seen in Figures 4 and 5, where unrealistic changes of TEC values are obvious in the missing-data region. These artifacts could generate abnormal disturbances if fed into global circulation models (GCMs), which may affect space weather forecasts depending on the specific method used for data assimilation. However, a detailed quantitative investigation is out of the scope of this work and will be conducted in our future work.

Table 1. High/low solar activity years and assignment of the 10 sets for ten-fold cross-validation.
With the postprocessing of PB, the mosaic-looking artifacts are effectively removed, as shown in Figure 6. The completed TEC maps resemble the original IGS TEC maps under all conditions, which demonstrates the superior image completion performance of DCGAN with large training data and PB.
To quantitatively evaluate the performance of the three completion procedures, the root mean squared error (RMSE) values are summarized in Figure 7. In general, more training data lead to smaller RMSE values (~50% decrease), that is, better recovered TEC values, and the additional PB step brings the RMSE down dramatically further (more than a 50% decrease). It is interesting to note the abnormality that the RMSE with 2-year training data in 2019 is slightly better than that with 12-year training data; in the other cases, the 12-year model reduces the errors by about half. This demonstrates the importance of training data for deep learning-based data completion.

MIT Mask Results: Realistic Comparison With Two Other Completion Methods
The original IGS TEC maps at 14:00 UT on 14 July 2000, 10:00 UT on 1 November 2003, and 12:00 UT on 17 March 2015 are shown in the first row of Figure 8. The MIT-TEC mask shown in Figure 3 is applied to them, and TELEA, NS, and DCGAN-PB are utilized to fill the data gaps and reconstruct the global maps. As shown in Figure 8, DCGAN-PB outperforms the other two methods and produces the completed TEC maps closest to the original IGS TEC maps, which is particularly notable in the high TEC value regions (hot colored areas). Quantitative RMSE results are shown in Figure 9. The average RMSE for TELEA and NS is about 7-9 TECu, while DCGAN-PB effectively suppresses the error to 3-4 TECu. The performance for the MIT mask (Figure 9) is worse than that for the random masks (Figure 7) because the MIT mask has much larger gaps and more missing data than the random masks, thus posing a harder completion task. To further examine the distributions of recovered TEC values from the different completion methods, the recovered TEC values (\tilde{f}_i) versus the original values (f_i) are plotted in Figure 10. The green line denotes perfect recovery, that is, \tilde{f}_i = f_i. In general, all three methods produce data points around this line, reflecting a decent recovery of missing TEC values. The points for DCGAN-PB are less spread than those for the other two methods. After linear regression, the lines fitted to these points are drawn in red. The deviations from the green line (perfect recovery) are noticeable for TELEA and NS, while DCGAN-PB leads to almost overlapping red and green lines. One possible reason PB works well is that the IGS TEC maps are based on spherical harmonic functions and thus vary smoothly. Our proposed DCGAN-PB outperforms TELEA and NS in terms of both the bias and the variance of recovered TEC values.

Ten-fold Cross-Validation Results With the MIT-TEC Mask
The ten-fold cross-validation (CV) results are shown in Figures 11 and 12. The former displays the RMSE separately for high (a) and low (b) solar activity years. Similar to the results in Figure 9, DCGAN-PB leads to smaller RMSE values (less than 4.04 TECu for high solar activity years and about 2 TECu or less for low solar activity years) than TELEA and NS in all low and high solar activity years. The fluctuation of RMSE values for DCGAN-PB is remarkably smaller than those for TELEA and NS as well. The statistical values (mean ± standard deviation [SD]) of RMSE from the 10 sets for TELEA, NS, and DCGAN-PB are 7.21 ± 1.49 TECu, 6.69 ± 1.49 TECu, and 3.16 ± 0.74 TECu for high solar activity years and 3.60 ± 0.91 TECu, 3.11 ± 0.92 TECu, and 1.65 ± 0.32 TECu for low solar activity years, respectively. DCGAN-PB reduces the average RMSE by about 50% and the variation by more than 50%. Figure 12 shows the BA plots of all the cross-validation test cases for TELEA, NS, and DCGAN-PB. The mean difference is −0.41 TECu for TELEA, 0.62 TECu for NS, and −0.31 TECu for DCGAN-PB. The 95% confidence interval (±1.96 SD) is [−11.79, 10.97] TECu for TELEA, [−9.79, 11.03] TECu for NS, and [−5.32, 4.71] TECu for DCGAN-PB. DCGAN-PB has the smallest mean difference of −0.31 TECu (about a 25% improvement over TELEA) and the narrowest 95% confidence interval, with a width of about 10 TECu (more than a 50% reduction compared to TELEA and NS). Note that the large differences for TELEA and NS are caused by the edge pixels, as shown in Figure 8. If these edge pixels are removed, the 95% intervals are [−9.38, 7.60] TECu for TELEA, [−7.24, 7.58] TECu for NS, and [−4.74, 4.31] TECu for DCGAN-PB, which still demonstrates the significant improvement of DCGAN-PB.

Space Weather
If a deviation of a few TECu can satisfy the particular application requirement, our DCGAN-PB model may provide sufficient accuracy. Nevertheless, the caveat is that this type of deep learning model is still at an early stage of development and needs more substantial investigation in order to be deployed for space weather forecast applications.

Discussion
Our initial Poisson blending results had some extreme abnormal values at the edges of the TEC maps. As noted by Afifi and Hussain (2015), the traditional Poisson blending technique can suffer from smudging if the blended region is at the edge of the image, since there are no image values to use beyond the image domain. We have overcome this issue by assuming a cyclic pattern of the TEC maps in the longitude direction (so the left and right edges of the TEC maps are essentially connected).
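The cyclic-longitude assumption can be implemented by padding the TEC map with wrapped columns before blending, for example as below. The padding width and the replication of the latitude edges are our own illustrative choices, not specified in the text:

```python
import numpy as np

def pad_cyclic_longitude(tec, width=4):
    """Pad a TEC map cyclically in the longitude (column) direction so
    that Poisson blending has valid neighbors at the left/right map
    edges; the latitude (row) edges are padded by replication here as
    a simple illustrative choice."""
    # Wrap columns: the left edge sees the rightmost columns and vice versa.
    wrapped = np.concatenate([tec[:, -width:], tec, tec[:, :width]], axis=1)
    # Replicate the first/last rows for the latitude edges.
    top = np.repeat(wrapped[:1, :], width, axis=0)
    bot = np.repeat(wrapped[-1:, :], width, axis=0)
    return np.concatenate([top, wrapped, bot], axis=0)
```

After blending on the padded map, the central region is cropped back to the original 64 × 64 size, so gradients at the left and right edges are always computed against real TEC values.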
For the results in sections 4.1 and 4.2, the test data are separated from the training data by more than 6 months, so the memory of the ionosphere, typically less than 1 month, is not an issue. For the cross-validation results in section 4.3, some test data lie within a month of the training data, and the memory effect may influence the final results. We therefore conducted an additional calculation of the RMSE of DCGAN-PB excluding test data in January and December, to represent results without forward and backward memory effects. The BA plots for the high and low solar activity years show good agreement between the two sets of results, with mean differences of −0.07 TECu for the high years and −0.01 TECu for the low years, and all individual differences falling inside the 95% interval, except for one low solar year close to the interval edge. The memory effect thus appears to have a negligible impact on our results.
It is difficult to provide theoretical uncertainties for deep learning models, although generative models hold the potential to reason about uncertainty, which is beyond the scope of this work. Nevertheless, the mean and SD of the RMSE of the filled TEC values of DCGAN-PB from the cross-validation experiment are 3.16 ± 0.74 TECu and 1.65 ± 0.32 TECu for the high and low solar activity years, respectively. Based on the additional Bland-Altman analysis, the bias of DCGAN-PB is about −0.31 TECu, and 95% of the differences are between −5.32 and 4.71 TECu, much improved over the other two automatic completion methods.
In the current work, we used a 2-hr temporal average and a spatial resolution of 5.6° in longitude and 2.8° in latitude for the TEC maps. Therefore, short-lived and small-scale features, such as polar cap patches, that are below the temporal or spatial resolution of this work would be difficult to recover. Since our goal is to fill the gaps in the global large-scale TEC maps, the investigation of transient and small-scale phenomena is out of the scope of this study. Nevertheless, deep learning methods such as DCGAN may be worth investigating for those phenomena using regional high temporal and spatial resolution TEC data.
The λ value used in the current DCGAN-PB model (Equation 1) is 0.05. To investigate the influence of the context loss, we tested the 2-year model with different λ values (0.0005-0.9), since the mosaic artifacts are prominent in this case. The RMSE results for 08:00 UT on 7 October 2012 and 12:00 UT on 17 March 2015 change only slightly with different λ values (usually around or less than 10%, except for 7 October 2012 with the 4 × 4 minimum gap size). However, even for the largest RMSE differences on 7 October 2012, it is hard to discern any visual difference in the mosaic artifacts among the TEC maps filled using different λ values. Therefore, the λ parameter balancing the contributions of the context loss and the prior loss works well in the tested range of 0.0005-0.9.
DCGAN training is time-consuming and needs parameter tuning for satisfactory results. To save time in this work, we use a relatively small 64 × 64 TEC matrix. The training time for 18 years of TEC images with 200 epochs is about 1 day and 4 hr using a single NVIDIA Titan V GPU card. The completion takes about 12 s without PB and 39 s with PB for 4,000 iterations. Furthermore, the tuning of the DCGAN structure and hyperparameters is empirical; thus, no optimal settings are guaranteed. We tried different learning rates and found that a learning rate of 0.00002 yielded the expected trend of training losses for the three models (2, 12, and 18 years). The 2-year model converges quickly and has a dip at 100 epochs; thus, we used 95 epochs for this model. For the 12- and 18-year models, although the training losses move in the right direction, the fluctuation of each batch update is much larger than for the 2-year model, and more than 100 epochs are needed to further improve the training performance. Therefore, 200 epochs are used for the 12- and 18-year model training. We also investigated the errors (RMSE) versus epochs for the 2-year model. The test error follows the trend of the training error with slightly larger values. A slight upward trend at the end of the test error curve was observed, indicating that overtraining might start to occur. The optimal choice of the learning rate and the number of epochs, along with other hyperparameters, is worth thorough investigation in future studies. In this work, we focused on the impact of the training size on TEC map completion performance. Our results indicate that the larger the training data size, the better the learning performance for the current network structure. We also empirically found that a learning rate of 0.00002 and a λ value of 0.05 yielded good completion performance for the proposed DCGAN-PB model.
Since the model training is very time-consuming, the hyperparameter tuning in this work is coarse and may not be optimal.

The optimization of the network architecture and hyperparameters may be addressed by new techniques such as AutoGAN, which dissects a GAN into modules (e.g., convolutional and pooling layers) and tests the performance of different combinations and complexities at a smaller scale (Gong et al., 2019; Salimans et al., 2018). Applying these new techniques to optimize our DCGAN model will be part of our future work. Nevertheless, our results demonstrate promising performance of the current model, superior to the traditional methods.
The success of learning methods is built on the assumption that the training and test data are independently sampled from an identical distribution. However, the TEC value distribution in each map is highly dependent on the solar and geomagnetic conditions. In general, the larger the training set, the greater the chance that a training sample comes from the same distribution as the test sample. Indeed, for the three test TEC maps (08:00 UT on 7 October 2012 for medium solar activity, 12:00 UT on 17 March 2015 for high solar activity, and 16:00 UT on 18 May 2019 for weak solar activity), we conducted the Kolmogorov-Smirnov (KS) two-sample test between them and the training maps. For the 2-year training data, the null hypothesis cannot be rejected in 6, 0, and 0 cases for the three test maps (i.e., the cases with the same distribution between the training map and the test map), respectively, while for the 12-year training data, the numbers increase to 11, 1, and 2. If the Bonferroni correction is applied, the numbers for the KS null hypothesis are 180, 9, and 0 for the 2-year training data and 933, 221, and 295 for the 12-year training data. These results demonstrate that the larger the training dataset, the greater the chance it matches the distribution of the test data.
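For reference, the two-sample KS statistic used in such a comparison can be computed without special libraries. The sketch below also includes the standard asymptotic p-value approximation (the probks form with the 0.12 + 0.11/√n correction from Numerical Recipes); whether this exact variant was used in the paper is our assumption:

```python
import numpy as np

def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D and an approximate
    asymptotic p-value (maximum distance between empirical CDFs)."""
    a = np.sort(np.ravel(a)); b = np.sort(np.ravel(b))
    n, m = len(a), len(b)
    allv = np.concatenate([a, b])
    # Empirical CDFs of both samples evaluated at every data point.
    cdf_a = np.searchsorted(a, allv, side="right") / n
    cdf_b = np.searchsorted(b, allv, side="right") / m
    d = float(np.max(np.abs(cdf_a - cdf_b)))
    en = np.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 1e-3:
        return d, 1.0  # identical samples: D = 0, p = 1
    # Asymptotic Kolmogorov survival function (alternating series).
    j = np.arange(1, 101)
    p = 2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * (j * lam) ** 2))
    return d, float(min(max(p, 0.0), 1.0))
```

Comparing the flattened pixel values of a test map against each training map with this test (with a Bonferroni-adjusted threshold when many maps are compared) gives counts of "same-distribution" training maps like those reported above.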
A comparison with simulation results from first-principles physics models, such as GITM (Ridley et al., 2006) and TIE-GCM (Richmond et al., 1992), would be worth future investigation, although the simulation results from these models usually differ from the ground truth. Such a simulation and comparison effort is beyond the scope of this work. However, physics-based models may be combined with deep learning to improve task-based performance. For example, physics-guided neural networks (PGNNs) add the output of a physics model of the temperature versus water density/depth relationship as an input, in addition to common environmental parameters, for lake temperature prediction (Karpatne et al., 2017). PGNN with prior physics knowledge showed better generalization than neural networks alone. Along this direction, the proposed deep learning framework may be improved by adding physical constraints (in Equation 1) for better completion performance.
DCGAN-PB, along with our previous R-DCGAN work, provides an automatic option for completing TEC maps through deep learning of the existing filled TEC maps. The filled TEC maps not only provide references and inputs for physics models in space weather prediction applications, but can also be used to study traveling ionospheric disturbances (TIDs) during geomagnetic events. The comparison of deep learning-based methods and first-principles physics models will strongly improve our understanding and prediction of ionospheric disturbances. Furthermore, the developed deep learning tools can greatly expedite the filling process for new TEC observations, which is important to space weather forecasting because timely output of complete maps is needed for (nearly) real-time applications in GPS and high-frequency (HF) communications. However, caution must be taken to avoid the artifacts of learning-based methods, such as the mosaic artifacts introduced by DCGAN. If these artifacts affect space weather forecasts to an unacceptable level, techniques to remove them, such as the Poisson blending in DCGAN-PB, must be applied.
DCGAN-PB has a simpler network structure than R-DCGAN: the former mimics the IGS TEC completion process, whereas the latter fills MIT-TEC maps using IGS TEC as a reference. Although the two models' purposes differ slightly, both can be used to fill TEC maps. A thorough comparison is worth investigating in the near future. If the two models achieve similar performance, the simpler model would be preferred, since its runtime is shorter and it has fewer tunable parameters, leading to a faster and more robust implementation.

Conclusions
In this work, DCGAN with Poisson blending (DCGAN-PB) has been proposed to learn and complete IGS TEC maps using random and MIT-TEC masks. Our results show that DCGAN is strongly influenced by the amount of training data and that PB can significantly improve the final completed maps. The proposed method also shows superior TEC completion performance over traditional single-image inpainting methods, such as TELEA and NS, using the realistic MIT-TEC masks and ten-fold cross-validation. Our work shows that deep learning methods can learn from thousands of TEC maps under different conditions and extract useful features to overcome the challenge of effectively recovering missing data in large areas not covered by the observation network. It may lead to an efficient automatic tool that generates complete TEC maps while minimizing time-consuming manual work.

Appendix A: The Architecture of DCGAN
The architecture of the generator and discriminator of DCGAN is shown in Figure A1 following the guidelines for stable DCGAN training (Radford et al., 2015). The number on the top denotes the number of channels for convolution and that at the bottom denotes the dimension of the output matrix.
For the generator (Figure A1, top), the input layer projects and reshapes a random vector z (1 × 100) into a 4 × 4 × 512 high-level feature matrix before passing it through batch normalization (Norm) (for training stability) and the rectified linear unit (ReLU) (to introduce nonlinearity). The three hidden layers have similar structures, each with a deconvolution (DeConv) layer, a Norm layer, and a ReLU layer, to convert the high-level features into details of TEC maps. The output layer is composed of another DeConv layer and a hyperbolic tangent activation function (Tanh) to obtain the final TEC map. Note that the four DeConv layers are fractionally strided convolutions rather than true deconvolutions; they convert the original 4 × 4 × 512 feature matrix into the final 64 × 64 TEC map. Thus, the generator produces a 64 × 64 TEC map (G(z)) from a 1 × 100 random vector (z).
For the discriminator (Figure A1, bottom), the TEC map, either a real IGS TEC map or a generated one (shown in Figure A1), is fed into four convolutional layers to extract high-level features. Note that the input layer uses only a leaky ReLU, while the three hidden layers have both a Norm layer and a leaky ReLU for stable training. Leaky ReLU is used here to avoid the zero-activation problem of ReLU. Finally, the output layer reshapes the 4 × 4 × 512 feature matrix into a 1 × 4,096 vector, and a fully connected layer followed by a sigmoid function produces a value between 0 and 1. Thus, the discriminator reads a TEC map and yields a single value, based on which the prediction is made as to whether it is a true TEC map (>0.5) or a fake one (<0.5).
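The 4 × 4 → 64 × 64 upsampling chain in the generator and the mirrored 64 × 64 → 4 × 4 downsampling in the discriminator follow standard stride-2 convolution arithmetic. A small sketch of the spatial sizes (kernel size 4, stride 2, and padding 1 are assumptions consistent with common DCGAN implementations, not stated explicitly in the text):

```python
def conv_out(n, kernel=4, stride=2, pad=1):
    """Output width of a strided convolution layer."""
    return (n + 2 * pad - kernel) // stride + 1

def deconv_out(n, kernel=4, stride=2, pad=1):
    """Output width of a fractionally strided (transposed) convolution layer."""
    return (n - 1) * stride - 2 * pad + kernel

# Generator: four DeConv layers double the spatial size each time.
gen = [4]
for _ in range(4):
    gen.append(deconv_out(gen[-1]))   # 4 -> 8 -> 16 -> 32 -> 64

# Discriminator: four Conv layers halve the spatial size each time.
disc = [64]
for _ in range(4):
    disc.append(conv_out(disc[-1]))   # 64 -> 32 -> 16 -> 8 -> 4
```

With these settings, each DeConv layer exactly doubles the map width and each Conv layer halves it, which is why the two networks are spatial mirrors of each other.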
The proposed DCGAN-PB model is illustrated in Figure A2. Blue arrows denote the training of DCGAN using the IGS TEC maps: the discriminator (D) takes both the TEC maps produced by the generator (G) and the true IGS TEC maps and predicts whether each is an authentic (true) TEC map or a fake (false) one. The discriminator and generator are trained in a competitive way to improve each other's performance. The brown arrows denote the mapping from z to z* in the context of a particular incomplete TEC map y, while the red arrows denote the Poisson blending applied to merge the generated TEC map G(z*) with the incomplete TEC map y.
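Poisson blending stitches the generated map into the observed one by solving a discrete Poisson equation: inside the masked region the result reproduces the Laplacian (gradient structure) of the generated map, while on the region boundary it matches the observed values, which removes visible seams. A minimal pure-Python Jacobi-iteration sketch on a toy grid (an illustration of the principle, not the paper's implementation):

```python
def poisson_blend(target, source, mask, iters=500):
    """Fill the mask==1 cells of `target` so that the result reproduces the
    discrete Laplacian of `source` inside the region while keeping the
    `target` values on the boundary (Jacobi iteration on the Poisson eq.)."""
    h, w = len(target), len(target[0])
    f = [row[:] for row in target]
    for _ in range(iters):
        prev = [row[:] for row in f]          # Jacobi: update from old iterate
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                if mask[y][x]:
                    # Guidance field: discrete Laplacian of the source map.
                    lap = (4 * source[y][x]
                           - source[y - 1][x] - source[y + 1][x]
                           - source[y][x - 1] - source[y][x + 1])
                    f[y][x] = (prev[y - 1][x] + prev[y + 1][x]
                               + prev[y][x - 1] + prev[y][x + 1] + lap) / 4.0
    return f
```

For example, if the source is a smooth ramp and the observed boundary is offset by a constant, the blended interior reproduces the ramp shifted by that constant, i.e., the gradients of the generated content are kept while the absolute level is anchored to the observations.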