Characterization of base station deployment distribution and coverage in heterogeneous networks

Considering different types of base stations (BSs) in future cellular networks are overlapping deployment with the status of dense, multi-tier and heterogeneous in general, how to optimize the real BS deployment becomes a complicated problem. Based on it, repulsive BS dataset and clustering BS dataset are statically characterized with various types of spatial point processes. It shows that the improvement of coverage probability between 1-tier and 2-tier network ﬂuctuates as the signal-to-interference ratio (SIR) threshold increases for different datasets. The authors’ proposed hybrid model ﬁts well with repulsive BS dataset, while the log-Gaussian Cox point process (LGCP) and Cauchy model are reasonable models for clustering BS dataset. Besides, in order to dynamically analyze the coverage problem affected by adding new BSs, a cell boundary constructed by an irregular circle is introduced under an equal SIR constraint, and a BS placement scheme is proposed to place new BSs at the points of minimum interference. Numerical results show that the coverage probability may increase after adding BSs in the target area of heterogeneous network


INTRODUCTION
With the commercialization of the fifth generation (5G) system, current base station (BS) deployments in cellular networks are generally overlapping and coexisted with the previous systems, showing the status of dense, multi-tier and heterogeneous in general. Considering the real BS deployments are usually from different operators, it becomes a complicated problem for optimizing the BS deployment, which is related closely with intensity of interference and quality of service (QoS). What is more serious, due to a rapid growth with mobile Internet services, customers dramatically demand faster wireless data traffic, such as high-rate downloading at hot spots [1]. Driven by this demand, it needs to deploy more BSs to enhance signal coverage and improve the QoS [2], even by adding unmanned aerial vehicle BSs [3,4]. Under this background, it is critical to find a way to optimize BS coverage problems by means of the statistical modelling and analysis of BS deployment.
In the past years, several approaches in geometry theory were used to characterize these coverage probabilities in BS deploy-suitable for inhibitive BS deployment patterns, and the Geyer saturation process was suitable for clustered BS deployment patterns. Specially, a Strauss process was tested to provide a better fit than the PPP [16]. As an effective tool, determinantal point processes (DPP), such as Gauss DPP, Cauchy DPP and generalized Gamma DPP had been characterized in homogeneous networks [17], in which the system-level coverage results for each model were compared, respectively. In particular, a Ginibre point process as a special DPP was well known to model repulsion among points [18,19]. By means of a goodness-of-fit test, it showed that the DPP models may be more suitable for the datasets in homogeneous networks because real macro BS deployments exhibited repulsion among the BSs.
Although the non-PPP models could bring a better fit for real BS deployment in some cases, it was still hard to quantitatively derive the interference statistical characteristics in theory. Aiming to this problem, Haenggi et al. [20][21][22][23] proposed a simple approximate framework, written as approximate signal to interference ratio (SIR) analysis based on PPP (ASAPPP). In this framework, it approximated the SIR distribution in a non-PPP model by scaling that in the corresponding PPP model, where the scaling coefficient was called SIR gain [24]. By means of the ASAPPP method, we could get an approximate SIR distribution based on non-PPP models for networks, but the drawback of this method was that it needed to use lots of Monte Carlo simulations to obtain the SIR gain.
In addition to theoretical modelling, some other works studied BS distribution characteristics from the statistical analysis of massive BS datasets. For example, the real BS deployment datasets in European homogeneous networks were analyzed statically in [25], in which BSs were classified into the following two types: global system for mobile communications (GSM) and universal mobile telecommunication system (UMTS), and its system-level coverage results related to the European cities showed that a log-Gaussian Cox process provided a higher accuracy in system-level coverage, compared with the PPP, Strauss process and MCP.
However, for K-tier heterogeneous networks (K > 1), since BSs belonging to different tiers, such as macro BS, pico BS and femto BS, were always deployed in a hybrid way, a practical BS deployment always exhibited clustering among the BSs [26], which indicated that it was hard to apply the repulsive property of real homogeneous BSs into multiple tier heterogeneous networks. Considering this problem, Gauss-Poisson process [27] had been recently used to model spatial distribution with attraction for clustering nodes. Besides, the Poisson cluster process [28,29], Poisson hardcore process [30] and Matern hardcore process [31] were, respectively, used to model the real BS deployment in two-tier heterogeneous cellular networks, and the coverage problem was analyzed by using the ASAPPP method. It showed that the ASAPPP method may provide simple approximations of the SIR distribution in general cellular networks [32]. But until now, most of the existing works were usually major in taking a single model to fit the BS deployment dataset and analyze its SIR distribution. It still lacked the effective algorithms that was from the statistical method to establish the precise models to fit both repulsive and clustering BS deployment dataset.
What is important, it is not sufficient for us to analyze the system-level coverage probability from the static viewpoint, which lacks adaptability for deploying new BSs. For network operators, changes in local BS deployment may directly affect its coverage results. Thus, characterizing the effect of a new BS deployment under a given-cell coverage is more important to improve the BS deployment than system-level coverage results. When we consider two types of cells include macro and femto cells in a multi-tier heterogeneous network (Hetnet), deploying new BSs may increase the interference on their neighbouring BSs in the system. Therefore, it is still hard to reflect the coverage effect of adding new BSs for the existing statical analysis of real BS distribution.
The contributions of this paper are summarized as follows:

Static characteristic analysis with a repulsive BS dataset
The repulsive BS deployment dataset in multi-tier Hetnet is collected in Smithfield. Compared with a 1-tier network scenario, the coverage probability is improved under a 2-tier network scenario because of small cell deployment. It shows that the coverage probability improves more at a high SIR region for this repulsive BS dataset, which illustrates that the efficiency of small cells is much better for repulsive BS deployment at high SIR regions. In order to find an accurate point distribution model, the repulsive BS deployment dataset is statically modelled according to the spatial point process theory, in which the good-of-fit is verified by means of the Ripley's K function, the L function, the empty space function and the nearest neighbour distance function. Considering it is hard to find a single saturation process to accurately fit the real dataset, it is feasible to combine multiple saturation processes in a hybrid way to approach the dataset. The results show that our proposed hybrid model is well fitted with the collected BS dataset in Smithfield, on the basis of the proposed searching algorithm that could effectively find the parameters to yield smaller gaps for hybrid model. Our work gives a novel way for the modelling of BS dataset fitting in the future.

Static characteristic analysis with a clustering BS dataset
The clustering BS deployment dataset in multi-tier Hetnets is collected in Glasgow. It shows that the coverage probability also improves under a 2-tier network scenario because of small cell deployment. The difference is that the coverage probability improves more at low SIR regions for this clustering BS dataset. It illustrates that the efficiency of small cells is much better for clustering BS deployment at low SIR regions, and this practice helps to guide small cell deployment in future. Furthermore, the clustering BS deployment dataset is statically modelled to find the accurate point distribution model. By means of testing a good-of-fit law for real dataset and clustering point processes, a searching algorithm is proposed to find the proper clustering point processes for its dataset, such as MCP, log-Gaussian Cox point process (LGCP), Vargamma model and Cauchy model. The error results illustrate that the LGCP and Cauchy model become the reasonable models with smaller gaps between real dataset and theoretical value.

Dynamic coverage probability analysis
Based on the statical analysis of BS deployment, we further consider the dynamic coverage analysis by adding a new BS, and analyze the change of coverage after adding a different type of a new BS. We propose a BS placement searching scheme for deploying a new BS, in which a new BS can be deployed at the minimum interference location in a region of interest, so that the interference caused by the new BS can be minimized. Numerical results show that the coverage probability may increase after adding BSs in the target area by using the proposed scheme, but as the density of the femto BS reaches a certain value, its coverage may remain unchanged even after adding more femto BSs. Our work could provide a reference for the site selection of new BSs in the future. The remainder of this paper is organized as follows: BS deployment is statically modelled in Section 2 using spatial point processes. The static spatial distribution of BS deployment is characterized in Section 3. The dynamic effect of a new BS deployment on coverage probability is analyzed in Section 4. The dynamic coverage probability results by adding BSs are compared in Section 5. Finally, conclusions are drawn in Section 6.

STATIC MODELLING OF BS DEPLOYMENT USING SPATIAL POINT PROCESSES
The conventional analysis of BS deployment is always taking an approach in which we choose a specific BS dataset from the whole city or one country, and then give system-level results that reflect user distribution. However, it is still hard to provide the guidance to actual BS deployment. In order to characterize the BS deployment problem, we choose some typical real BS distribution datasets from Ofcom database 1 , such as repulsive BS dataset at a urban area in Smithfield and clustering BS dataset at a rural area in Glasgow, as shown in Figures 1 and 2, respectively, which are scaled according to the corresponding map. First of all, we take spatial point processes to statically model BS deployment [33], and consider the following theoretical concepts in stochastic geometry [   Hardcore model, Strauss model and Hybrid model [33,35]. The parameters in each model are listed in Table 1.Especially, the Geyer model and Strauss model can be classified into saturation point process model, also written as pairwise interaction model, in which the radius of a point from its pairwise interaction with other points is trimmed to a certain value. As its interaction radius is fixed, the saturation model belongs to the Geyer model when its saturation threshold is limited, but this model may reduce to the Strauss model when its saturation threshold is approaching to infinity. As a special case, the hybrid model is combined with the Geyer process and the Strauss process, and where f ( ) and g( ) are the probability density function of Geyer process and the Strauss process, respectively.

Clustering point process
The typical clustering spatial point process models generally include a Thomas point process (Thomas), a Neyman-Scott process with variance Gamma cluster kernel (VarGamma), MCP, LGCP and Neyman-Scott process with Cauchy cluster kernel (Cauchy) [35,37]. For these clustering point processes, we can take the method of minimum contrast to fit a theoretical model to the given dataset [36]. Take the Ripley's K function as an example, the K function can be computed from the dataset, and its theoretical value of K function under certain point process model is estimated from simulations of this model. Then this model is fitted by finding the optimal parameter values, aiming to give the closest match between the theoretical and empirical curves. Specially, the function of thomas.estK, lgcp.estK and matclust.estK in the spatstat software can be used to fit Thomas, LGCP and MCP, respectively.

G function
The G function is a nearest neighbour distance function, where the estimation of G function is a statistic summarizing one aspect of the clustering point process, G (r ) = ℙ(dist (x, X ) ≤ r ). As a special case of Poisson point processes, G (r ) = 1 − e − r 2 , where r is the distance among events, and is the density of events. It can reflect the spatial clustering or spatial regularity of dataset according to the deviations between the empirical and theoretical G curves.

Ripley's K function
The Ripley's K function is used as a measure of clustering and repulsiveness of spatial point processes, and K (r ) = Ω∕ , where Ω denotes the expectation of the number of extra events within distance r of a randomly chosen event, and is the density (number per unit area) of events [37]. For a stationary Poisson process, the theoretical value K (r ) is expressed as K (r ) = r 2 .

L function
The L function is a transformation of the Ripley's K-function, where L(r ) = √ K (r )∕ . The theoretical value of the L func-tion is L(r ) = r for a uniform Poisson point pattern. L(r ) > r corresponds to a clustering point process, while L(r ) < r corresponds to a repulsive point process.

Envelope function
By means of the envelope function, we can compute envelopes of a summary statistic based on the simulations, and the results include lower envelope and upper envelope, which can be used to assess the goodness-of-fit of a point process model to point pattern data.

NUMERICAL RESULTS OF STATIC SPATIAL DISTRIBUTION
On the basis of current BS distribution in both Smithfield dataset and Glasgow dataset, we numerically obtain the coverage probability. Furthermore, we make a model fitting analysis for both of the two datasets using spatstat software [37], in which the G function, K function, L function and Envelope function can be obtained conveniently. For one assumed theoretical distribution model (among PPP, MCP, LGCP, Cauchy etc.), it may provide a confidence envelope, where the lower/upper envelope denotes the lower/upper bound, respectively. For real data, if its function is within the lower and upper bound, then the theoretical distribution model can be considered to be effective and its error depends on the gap with the theoretical function. Thus, we can verify the validity of the theoretical distribution model and the real dataset using spatstat software and R language [37].

Coverage area of BS distribution with equivalent SIR
Combined with various types of BSs in the real BSs dataset, we consider a new BS located at a point z i of the k-th tier of a Hetnet, which provides users with services within its coverage region. We assume that the signal-to-interference ratio (SIR) threshold for guaranteeing the required quality of service (QoS) is k for the k-th tier BSs. The transmitted signals in this Hetnet experience large-scale fading of propagation path-loss and shadowing, as well as small-scale fading modelled as a Rayleigh distribution with the power of {|h| 2 } = 1. Then, for a user equipment (UE) located at a point z and served by the BS at z i , the average received power is given by where is the propagation path-loss exponent, and P i denotes the transmit power of the i-th BS in the k-th tier. On the other hand, the interference imposed on the user at the point z by all the other BSs can be expressed as where Φ j is a set containing the BSs belonging to the j -th tier.
For notational convenience, we introduce Φ = ⋃ K k=1 Φ k in (3). Given the SIR threshold k , the coverage of the i-th BS in the k-th tier is given by the enclosed boundary consisting of the points, which yield the SIR equalling to the threshold k . Outside this region, the SIR is below the threshold k . Typically in a multi-tier Hetnet, the noise power is much smaller than the interference power. Therefore, the coverage area of a BS can be usually determined by the equivalent SIR according to the condition of For a two-tier Hetnet consisting of macro BSs and micro BSs, the coverage areas can be observed in Figure 3, when actual BS distribution is taken as an example. The coverage areas of BSs show irregular circles. Furthermore, there is a discrepancy between the coverages generated by the grid or Voronoi cells, and the real equal-SIR cell coverage results. Although both the Voronoi cell and grid cell are generally constructed for cell boundary analysis in cellular networks, we show an irregular circle boundary of given multi-tier cells by simulation under an equal SIR constraint, in which its irregular circles are obviously different from the Voronoi cells and grid cells. On this basis, a given-cell could be constructed as a closure line having an identical SIR, and the coverage areas in multi-tier BSs can be constructed as irregular circles.

Static coverage probability analysis
We assume the equal power ratio among BSs for a 1-tier Hetnet, while setting the power ratio between macro BS and femto BS to 1000:1 for a 2-tier Hetnet, and the path-loss exponent to 4. As shown in Figures 4 and 5, the static system-level coverage probability is compared for Smithfield dataset and Glasgow dataset, respectively, in which theoretical results are consistent with simulation results, for both repulsive BS dataset and clustering BS dataset. Moreover, we take the coverage probability results in PPP model as a comparison [8][9][10]. In Smithfield dataset, it can be observed in Figure 4 that the PPP results are approaching to the 1-tier results, but with a large deviation to the 2-tier results. However, in Glasgow dataset, it can be observed in Figure 5 that the PPP results are only approaching to the 2-tier results in relatively high SIR regions, but still with a deviation to the rest results. On the whole, it illustrates that the PPP model is not a proper model for such two datasets.  Especially, when the characteristics of BS deployment are kept with the same, it can be observed that the coverage probability for both two datasets is improved under the 2-tier network scenario, compared to the 1-tier network scenario. One reason is that BSs with smaller power could effectively reduce interference to the other BSs. Moreover, the comparison of the coverage probability difference for 1-tier and 2-tier scenarios is shown in Figure 6. It can be found that the coverage probability improves more at high SIR regions for a repulsive BS distribution, like Smithfield dataset. On the other side, the coverage probability improves more at low SIR regions for a clustering BS distribution, like Glasgow dataset. As shown in Figure 6, the SIR threshold for high SIR regions and low SIR regions is 6 dB. When the SIR is in the range of [6,20], the coverage probability improves more at high SIR regions for repulsive BS distribution (Smithfield dataset), where the range of coverage probability improvement is from 0.2620 to 0.0932. When the SIR is in the range of [−10, 6], the coverage probability improves more at low SIR regions for clustering BS distribution (Glasgow dataset), where the maximum value of coverage probability improvement is 0.5858. It illustrates that the efficiency of small cells is much better for the repulsive BS deployment under high SIR regions, while much better for the clustering BS deployment under low SIR regions. This result helps to guide small cell deployment in future.

3.2.1
Goodness-of-fit by G function It can be observed from Figure 7 that the G functions of both datasets are larger than the theoretical PPP curves, which indicates that the PPP model generally does not well match both of the datasets. In particular, for real dataset in Smithfield, the gap of G function between real data and theoretical PPP is shrinking as the r value increases. But for real dataset in Glasgow, the deviation of G function between real data and theoretical PPP is larger than for Smithfield. Thus, it is necessary to compare

Model fitting for smithfield dataset
We further consider finding the proper repulsive spatial point process model that matches the data sample.

Single model fitting
By means of a statistical tool in R language, we attempt to make goodness-of-fit for the Smithfield dataset by using some repulsive spatial point models, which can be called for analysis through spatstat software [37]. It can be observed from Figure 8 that for 0 < r < 88.76, the K function of real dataset approximates to theoretical models, such as Fiksel, Geyer and Strauss, in which the Geyer model yields the smallest gap. But for the hardcore model, the K function of real dataset is outside its theoretical upper envelope, which indicates that it is not suitable for the dataset in this size segment. On the other hand, for r ≥ 88.76, the K function of real dataset deviates from the theoretical value, but is still within its theoretical lower envelope, which is based on Fiksel, Geyer and Strauss models [33]. However, for the hardcore model, the K function of real dataset approximates to its theoretical value in this size segment.
On the other hand, Figure 9 shows the analysis of L function. For 0 < r < 88.76, the L function of real dataset still approximates to the theoretical models, such as Fiksel, Geyer and Strauss, in which the Geyer model yields the smallest gap. For r ≥ 88.76, the L function of real dataset deviates from the theoretical value, but is still within its theoretical lower envelope, which are based on Fiksel, Geyer and Strauss models. But for the hardcore model, its L function of real dataset approximates to its theoretical value in this size segment.

Hybrid model fitting
Since it is difficult to completely match the dataset by using a single model, we consider taking a hybrid model in spatsat software, which generally combines with two types of spatial point processes, and adjusts its theoretical parameters to fit the dataset. The process is shown in Algorithm 1.
In simulation, we attempt to take many different types of hybrid models, and find a hybrid model with the smallest gap, which combines with Strauss(150) and Geyer(50, 2). In such a hybrid model, the interaction radius of the Strauss process is set to 150 m, while the interaction radius of the Geyer process is set to 50 m and its saturation threshold is set to 2. The result of the hybrid model is shown in Figure 10.
The cumulative sums of absolute errors for different models are listed in the Table 2. It is observed that the gap between the real data and the theoretical value is the smallest for hybrid model, where the cumulative sums of absolute errors satisfy 7.6113 × 10 5 for K function and 1.9850 × 10 3 for L function. From the above analysis of the K function and the L function, it can be found that the saturation process fits well with for the dataset in Smithfield. In particular, the Geyer model is a type of saturation process, which considers its pairwise interaction with all other points. Especially, when its saturation threshold is set to infinity, the Geyer model may approach to the Strauss model. The analysis reveals that even though it is hard to find a single saturation process to fit the real dataset, it is feasible to consider combining multiple saturation processes in a hybrid way to approach the dataset.

Model fitting for glasgow dataset
As shown in Figure 11(a), the K function of real data in Glasgow is outside the upper envelope boundary of the PPP model, which illustrates that the PPP model could not well match the actual results. It can be observed from Figure 11(b) that the L function of real data in Glasgow satisfies that L(r ) > r, which reveals that it corresponds to a clustering point process.

Model fitting algorithm
On this basis, we also choose a series of clustering point process models to fit the real data and compare its results, respectively, with the Thomas model, the VarGamma model, LGCP and the Cauchy model. Furthermore, we attempt to use these clustering spatial point process models to fit the dataset, and the algorithm for searching reasonable models is shown in Algorithm 2.

Fitting results
We consider the clustering spatial point process models to make goodness-of-fit for the Glasgow dataset sample, such as a Thomas model, a VarGamma model, an LGCP model and a Cauchy model. By using the spatstat software [37], we further test the goodness-of-fit for the Thomas model, the VarGamma model, LGCP and the Cauchy model. Figure 12 shows the K function fitting based on four different models. It can be observed that the K function theoretical values of these models approximate to the actual values of real data, and it is also within the upper and lower boundaries of the envelope, which verify a validity of these models. Besides, the LGCP, Thomas, Cauchy and VarGamma models yield smaller gap among these models. We further test the goodness-of-fit for the Thomas model, the VarGamma model, LGCP and the Cauchy model by using L function. Figure 13 shows the L function fitting based on the above models, respectively. It can be observed from Figure 11(b) that the L function value of real data lies outside the upper and lower boundaries of the envelope, which verifies a invalidity of PPP model. But for the Thomas model, the VarGamma model, LGCP and the Cauchy model, the L function theoretical values of these models approximate to the actual values of real data, which lie within the upper and lower boundaries of the envelope, which verify a validity of these models.

Error test
The cumulative sums of absolute errors for different models are listed in the Table 3, it can be found that, among all the test models in Glasgow dataset, the minimum cumulative sum of absolute errors for K function is equal to 1.5585 × 10 6 in the LGCP model, while the minimum cumulative sum of absolute errors for L function is equal to 2.4005 × 10 3 in the Cauchy model. As a whole, the LGCP model is the most reasonable model   Especially, Figure 14 shows the relative errors of K function and L function respectively, in which the expression of relative error is defined as follows where denotes the relative error, Θ denotes the theoretical value and ℜ denotes the real value. Combined with Table 3 and Figure 14, it can be found that the errors in both LGCP model and Cauchy model are very small and close with each other, so both LGCP model and Cauchy model are reasonable models for real BS deployment in Glasgow dataset.

DYNAMIC COVERAGE ANALYSIS BY ADDING BSs
Generally, we could provide a system-level result as a whole by using the static analysis of BS deployment. However, the process of network optimization is dynamic according to the needs of operators. It is necessary to add additional micro cells or femto cells to accommodate service demand at hot spots, and   to deploy new BSs at appropriate locations. Therefore, it is critical to analyze the coverage dynamically from the viewpoint of given-cell deployment. In other words, how to add macro BS or femto BS, and its effects on the coverage problem.

BS location searching scheme
The existing studies usually pay attention to the characteristic of real BS deployment in the static coverage analysis. However, the effects of adding BS are neglected. Extending the static coverage analysis, we here consider the dynamic coverage analysis for BS deployment. In order to meet the service demand at hot-spots, it is necessary to add additional micro cells or femto cells, and to deploy new BSs at appropriate locations. In addition, it is highly desirable to find the near optimal positions for deploying the new BSs within a bounded region of interest. Intuitively, if we can place a new BS at the point in a region S of interest, where the BS receives minimum interference from the existing BSs. The transmit power of this new BS, which requires to meet the target SIR, may be minimized. Based on irregular circle cells, we adopt the following strategy in order to find the minimum interference location in a target region S , and divide the target region S into a number of sub-regions. After we find the minimum interference point in each of the sub-regions, we can then obtain the minimum interference point in the whole region S , simply by comparing the minimum interference points of the sub-regions. According to [8], the coverage results derived from the delaunay triangulation are usually closer to the practice than that derived from the gridbased method [8]. Therefore, as shown in Figure 15, we take a delaunay triangulation approach to partition the region S into N triangular sub-regions, where the value of N depends on the number of existing BSs contained in S . Consequently, the site selection problem is to find the minimum interference point within each of the delaunay triangles.
Assume that there are M existing BSs in the region of interest S ⊂ ℝ 2 . The BS indices are listed in a set as Φ = {1, 2, … , M}, and the corresponding locations of these M BSs are given by the set of {z 1 , z 2 , … z M }. Since the region S is partitioned into N delaunay triangles, we further assume that the n-th delaunay triangle, denoted by T n ⊂ S , has three vertices, z n 1 , z n 2 and z n 3 with n 1 , n 2 , n 3 ∈ Φ, representing that z n 1 , z n 2 and z n 3 indicate the locations of three existing BSs. Then, at any point z ∈ T n , where P m is the transmit power of the BS located at z m . The minimum interference point z n,min in T n is, therefore, given by the solution of the optimization problem z n,min = arg min z∈T nĪ n (z ).
In order to find z n,min , the initial iteration is given as follows Next, the following iterative gradient expression can be expressed as [38] z (t ) where (t ) denotes the number of iterations, ∇ z denotes to calculate the gradient about z, and is a small positive step size. Note that, after iterations, if z (t ) n lies outside T n , then it is projected onto the boundary of T n along ∇ zĪn (z . In summary, the algorithm for finding the minimum interference point z min in deploying a new BS is given by Algorithm 3. Especially, the time complexity of our searching algorithm is O(n 2 ), which increases as the number of statistical base stations  increases. Besides, the step size also affects the time complexity. The search speed may be slow if the step size is set to be small, but it may fluctuate over iterations if the step size is set to be large.
After adding a new BS at z min , it is necessary to check the new coverage of the target area in the Hetnet. Since BS distribution in practice is not regular, it is generally hard to derive the formula relying on theoretical models for computing the coverage probability of a multi-tier Hetnet. For this sake, we introduce a tractable but still effective numerical method, as detailed by Algorithm 4, for evaluating the coverage of Hetnet. For convenience of description, some notations introduced in Algorithm 3 are used. Furthermore, the SIR threshold for determining the coverage area of the i-th BS is denoted bȳi, wherē i = k , if the i-th BS is a k-th tier BS. The coverage area of the i-th BS is given by the area within which SIR i (z ) ≥̄i, by the Equation (4).
In order to dynamically evaluate the coverage probability of a Hetnet, a large number of UEs N U , for example N U ≥ 10, 000, are assumed to be uniformly and randomly distributed in S , which are expressed as a set of U = {u 1 , u 2 , … , u N U }, while the locations of the UEs are given by a set of {z u 1 , z u 2 , … , z u N U }. Specifically, for the u m -th UE, its serving BS should be the one that provides the UE with the highest SIR, which can be determined as follows. First, for all the BSs in Φ, we compute

2:
Uniformly and randomly distribute N U UEs in S , whose indices are given by Then, based on the SIR values caused by the M BSs, the one serving the u m -th UE can be chosen as Furthermore, if SIR k u m (z u m ) >̄k u m , the required SIR of the u m -th UE is met. Otherwise, the UE is not efficiently covered. Therefore, the coverage probability can be defined as the percentage of the UEs, which meet their SIR requirements, formulated as

DYNAMIC COVERAGE RESULTS BY ADDING BSs
In our simulations, we set the power ratio between macro BS and femto BS to 1000:1, the path-loss exponent to 4, and the SIR threshold to 4.77dB. Furthermore, we assume that active UEs are uniformly distributed in the areas of interest, with a density satisfying U ≫ , where is the density of BSs. The numerical results obtained by Algorithm 3 are shown in Figures 16-18, for a Poisson distribution model, urban and rural BS distribution models, respectively. In these figures, three new BSs are planned to be placed at the area within the solid red rectangle. A pentagram indicates the minimum interference location, while diamonds indicate the candidate locations with non-minimum interference. It should be noted that, in our simulation, the target area is completely divided into triangles by the delaunay triangulation, in order to avoid the boundary effect.
As listed in Table 4, it can be observed that the interference received at three different candidate locations, for a Poisson   distribution, a real urban area (Smithfield) and a rural area (Glasgow), respectively. Especially, candidate location 1 is reserved for a macro-cell, and candidate location 2 is reserved for a micro-cell, which is considered according to the strength of interference. By means of the proposed scheme, it provides a way to find a series of candidate locations for adding new BSs, according to the interference based on gradient decrement.Once the locations of the existing and added BSs are fixed, the coverage probability can be evaluated by using Algorithm 4, and the result is listed in Table 5 for three different types of distribution: the Poisson distribution, urban area and rural area. It can be found that adding a macro BS at the candidate location improves the coverage. We assume that the BS density is set to 8.75 atoms∕km 2 for macro BSs and 55.60 atoms∕km 2 for femto BSs. Then, the coverage improves in the range of 2.08%, 3.64%, 0.93%, respectively, for the three models: Poisson, urban, and rural area. By contrast, adding a femto BS at the candidate location can improve the local QoS, but not necessary the overall system performance. As shown in the table, the coverage improves in the range of 0.04%, 0.49%, −1.14%, respectively, for the three models considered too. Hence, depending on the situations, the performance of the entire network may improve or degrade.
The effect of femto BS density on the coverage probability in a two-tier Hetnet with different site selection methods is shown in Figure 19, where the density of macro BS 1 is set to 5 atoms∕km 2 , while the femto BS density is initialized to 2 = 0. We assume that the SIR threshold for macro BSs is 1 = 2dB, and that for femto BSs is 2 = 1dB, while the size of the simulation area is set to 1000 km 2 . The results of Figure 19 shows that our proposed scheme yields the highest coverage probability among the three site selection methods: our proposed one, the grid model [8] and the PPP model [11].
The coverage probability remains constant, when the thresholds for macro BSs and femto BSs are kept the same after adding femto BS. The coverage probability can increase if the threhold for macro BSs is greater than that for femto BSs, while it decreases when the threhold for macro BSs is less than that for femto BSs. Although adding a new BS is motivated to improve network coverage, it also results in the interference among nearby BSs. Consequently, the performance of the entire network may become saturated, when the density of BSs reaches a certain value. As shown in Figure 19, the coverage of networks may even decrease due to a resultant increase in the interference as the density of femto BSs increases. When the density of femto BSs reaches a certain value, approximately 30 atoms∕km 2 , the network coverage does not further improve by adding more femto BSs.

CONCLUSIONS
Considering the complication of optimizing BS deployment problems in future cellular networks, we chose repulsive BS dataset and clustering BS dataset to make statistical modelling, and analyzed the real BS deployment from a static and dynamic perspective respectively. The real datasets were statically characterized with various types of spatial point processes. Our work could provide a guidance for BS location optimization, such as the modelling of BS dataset fitting and small cell deployment. It showed that the coverage probability increased more at high SIR regions for the repulsive BS dataset in Smithfield. Since it was hard to find a single saturation process to fit this real dataset, we proposed an algorithm to search for reasonable joint saturation models, which was feasible to combine multiple saturation processes in a hybrid way to approach the dataset. The error results showed that our proposed hybrid model fitted well with the real repulsive BS dataset in Smithfield. The difference was that the coverage probability increased more at low SIR regions for the clustering BS dataset in Glasgow. Especially, we proposed a searching algorithm to find the proper clustering point processes, and made a good-of-fit test for this dataset. The error results illustrated that the LGCP and Cauchy model became reasonable models.
Furthermore, on the basis of statical BS deployment analysis, we investigated a dynamic analysis for the effect of adding new BSs on the coverage problem. In addition, we constructed a cell boundary with irregular circles under an equal SIR constraint. Moreover, we proposed a BS placement searching algorithm to find the location for adding new BSs, in which we placed additional BSs at the points of minimum interference in the region of interest. The numerical results showed that the coverage probability increased after adding BSs in the target area of a heterogeneous network by using the proposed scheme, but as the density of the femto BS reached a certain value, its coverage remained unchanged even after adding more femto BSs.