Evaluating Polygon Overlay to Support Spatial Optimization Coverage Modeling

Authors

  • Ran Wei (corresponding author)
    Department of Geography, University of Utah, Salt Lake City, UT, USA
    Correspondence: Ran Wei, Department of Geography, University of Utah, Salt Lake City, UT 84112-9155, USA; e-mail: ranwei@asu.edu
  • Alan T. Murray
    GeoDa Center for Geospatial Analysis and Computation, School of Geographical Sciences and Urban Planning, Arizona State University, Tempe, AZ, USA

Abstract

Minimizing costs and maximizing coverage are important goals in many planning contexts. These goals often necessitate an abstraction of a continuous demand region, resulting in potential errors when applying traditional coverage models. To reduce coverage errors caused by spatial abstraction, a number of spatial representation schemes have been proposed and applied. A new representation scheme using polygon overlay has recently received much attention because it can potentially eliminate representation errors in coverage modeling. However, this overlay-based approach is computationally challenging in terms of both the generation of demand units and the complexity of the resulting coverage model. This article investigates the operational and computational challenges of polygon overlay for delineating continuous demand in coverage models, an issue that has yet to be fully explored. We present a theoretical evaluation of the computational complexity associated with representation using polygon overlay in coverage modeling. Evaluations of two study regions provide empirical support for the computational complexity analysis. The analysis results provide insight regarding expected problem size and computational requirements if polygon overlay is relied upon to delineate demand unit boundaries in coverage modeling.

Introduction

Minimizing costs and maximizing coverage are common goals in many planning applications. Examples include locating fire stations to guarantee immediate response to calls for service (Daskin and Stern 1981; ReVelle 1991; Badri, Mortagy, and Alsayed 1998; Tavakoli and Lightner 2004; Chevalier et al. 2012; Murray and Wei 2013), placing emergency warning sirens to alert the public of impending danger (Current and O'Kelly 1992; Murray and O'Kelly 2002; Murray, O'Kelly, and Church 2008), siting cellular towers to allow widespread access of wireless broadband (Grubesic and Murray 2002; Akella et al. 2010; Berman, Drezner, and Krass 2010; Shillington and Tong 2011), and awarding franchise outlets to satisfy market demand for a product (Current and Storbeck 1988; Miliotis, Dimopoulou, and Giannikos 2002). To support facility coverage provision, two general coverage models, with an intent to minimize costs and to maximize coverage, have been developed and applied extensively in location decision-making. One is the location set covering problem (LSCP), aiming to locate the fewest number of facilities that provide complete service coverage for regional demand (Toregas et al. 1971). However, complete coverage may not be possible given a limited budget, leading to the development of another coverage model, the maximal covering location problem (MCLP). The MCLP identifies the best locations for a prespecified number of facilities (the budget) to serve the most demand possible within a given response distance or time (Church and ReVelle 1974).

Given the broad application of coverage models for addressing various planning problems, both the LSCP and the MCLP have received considerable research attention. One focus involves developing efficient solution techniques. These two models and their extensions are non-deterministic-polynomial-time-hard (NP-hard) problems (Garey and Johnson 1979), indicating that they are computationally challenging to solve optimally. Therefore, a variety of solution approaches have been developed. Exact approaches include branch-and-bound (Balas and Carrera 1996; Downs and Camm 1996) and cutting planes (Nobili and Sassano 1992). Heuristic approaches, such as genetic algorithms (Beasley and Chu 1996), Lagrangian relaxation (Beasley 1990a; Galvao and ReVelle 1996; Caprara, Fischetti, and Toth 1999), local search (Jacobs and Brusco 1995), and ant colony (Ren et al. 2010), also have been proposed. Another research focus is extension of basic models to deal with additional issues. Examples include backup coverage (Hogan and ReVelle 1986), expected coverage (Daskin 1983; Batta, Dolan, and Krishnamurthy 1989), gradual coverage (Berman and Krass 2002; Drezner, Wesolowsky, and Drezner 2004), and variable radius coverage (Plastria and Carrizosa 1999; Berman et al. 2009).

A third research orientation addresses spatial representation issues of regional demand. In many planning applications, demand for a service can exist anywhere in a continuous region. Due to geometric and computational simplicity, continuous regional demand is traditionally abstracted as discrete points in coverage modeling, such as representing U.S. census block groups by their centroids, which can result in unintended measurement and interpretation errors (Miller 1996; Church 2002; Murray and O'Kelly 2002; Murray 2010). This complication relates to the modifiable areal unit problem popularized by Openshaw and Taylor (1981). Many research efforts focus on identifying a more appropriate or detailed representation for continuous demand to reduce potential errors in coverage modeling. For example, Murray and O'Kelly (2002), Murray, O'Kelly, and Church (2008), Cromley, Lin, and Merwin (2012), and Tong and Church (2012) evaluate coverage errors associated with point-based and area-based representations. Murray (2005), Tong and Murray (2009), Alexandris and Giannikos (2010), Murray, Tong, and Kim (2010), and Tong (2012) formulate new models aiming to achieve more accurate spatial representations of continuous demand. While these works contribute to representing continuous demand more accurately, error and uncertainty due to spatial representation still remain. Recently proposed approaches identify error-free representation schemes. One is the iterative disaggregation algorithm developed in Murray and Wei (2013) for the LSCP to identify a spatial configuration with no representational errors. Cromley, Lin, and Merwin (2012) and Yin and Mu (2012) use another scheme based on polygon overlay. This approach relies on vector geographic information systems (GIS)-based overlay to identify the finest level of geographic resolution needed for a demand region in order to avoid representation errors.
The downside of such an approach is that it involves substantial GIS/geometric processing, such as polygon overlay and partitioning, procedures well known to be computationally intensive (see Waugh and Hopkins 1992; Park and Shin 2002; De Berg et al. 2008). Beyond the computational cost of delineating a demand region, the number of demand units produced by overlay may be large, exceeding the computational capabilities of commercial software used to solve the corresponding coverage model.

Although the overlay-based approach provides a theoretically error-free representation scheme for coverage modeling, its computational efficiency is key for this approach to be feasible in practical applications. This article investigates the operational and computational challenges of polygon overlay for representing continuous demand in coverage models, an issue that has yet to be explicitly studied. Analysis results provide insight into expected problem size and computational requirements if this approach is used in coverage modeling.

Coverage models

Coverage represents an important notion and category of spatial optimization models oriented toward enhancing accessibility to facilities or services. In this section, we review two basic coverage models, the LSCP and MCLP, though the discussion applies equally to other coverage models as well, particularly those previously mentioned. Presentations of mathematical formulations of the LSCP and MCLP highlight complexity issues.

The LSCP aims to site the minimum number of facilities needed to ensure complete coverage of all demand. It was first formulated by Toregas et al. (1971) to site emergency service facilities. Consider the following notation:

  i, I = index and set of demand areas (i ∈ I)
  j, J = index and set of potential facility locations (j ∈ J)
  aij = 1 if a facility at location j can suitably serve demand area i, and 0 otherwise
  Xj = 1 if a facility is sited at location j, and 0 otherwise

The aij elements indicate an evaluated coverage standard, reflecting an ability to provide suitable service response or sufficient access. Taking fire services as an example, aij = 1 reflects personnel at a fire station at location j being able to reach service demand at location i in eight minutes or less, a common response time goal. A GIS is generally used to evaluate such service standards (Church and Murray 2009). Given this notation, the LSCP formulation may be stated as follows:

  Minimize Σj Xj    (1)
  Subject to: Σj aij Xj ≥ 1  ∀i    (2)
  Xj ∈ {0, 1}  ∀j    (3)

The objective of the LSCP, (1), is to minimize the number of facilities located. Constraints (2) ensure that each demand area is covered by at least one facility. Constraints (3) impose binary integer restrictions on decision variables.
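To make formulation (1)–(3) concrete, the sketch below solves a toy LSCP instance by exhaustive search over subsets of candidate sites. The instance is invented for illustration, and brute force is practical only for tiny problems; the study itself solves its LSCP instances with an integer-programming solver.

```python
from itertools import combinations

def solve_lscp(a):
    """Brute-force LSCP: find the fewest sites covering every demand unit.

    a[i][j] = 1 if a facility at site j covers demand unit i.
    Returns a set of chosen site indices, or None if full coverage is impossible.
    """
    m, n = len(a), len(a[0])
    for size in range(1, n + 1):                      # objective (1): fewest sites
        for sites in combinations(range(n), size):
            # constraints (2): every demand unit i covered by >= 1 chosen site
            if all(any(a[i][j] for j in sites) for i in range(m)):
                return set(sites)
    return None

# Toy instance: 4 demand units, 3 candidate sites.
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
print(solve_lscp(a))  # sites 0 and 2 together cover all four demand units
```

Because subsets are enumerated in increasing size, the first feasible subset found is optimal, mirroring objective (1).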

The LSCP requires that each demand unit be completely covered. However, this requirement may not be feasible due to limited resources. Another coverage model, the MCLP, relaxes the LSCP requirements to reflect the intent to cover as much demand as possible, given limited resources (Church and ReVelle 1974). Consider the following additional notation:

  di = amount of demand associated with unit i
  p = number of facilities to be sited
  Yi = 1 if demand unit i is covered, and 0 otherwise

Because the MCLP does not require each demand unit to be covered, decision variables, Yi, are employed to track whether a demand unit i is covered. The MCLP formulation may be stated as follows:

  Maximize Σi di Yi    (4)
  Subject to: Σj aij Xj ≥ Yi  ∀i    (5)
  Σj Xj = p    (6)
  Xj ∈ {0, 1}  ∀j; Yi ∈ {0, 1}  ∀i    (7)

The objective of the MCLP, (4), is to maximize the total amount of demand served. Constraints (5) track whether a demand unit i is covered by at least one facility that can suitably serve unit i. Constraint (6) specifies that p facilities are to be sited. Constraints (7) impose binary integer restrictions on decision variables.
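A matching sketch for the MCLP, (4)–(7): enumerate all p-subsets of sites and keep the one covering the most demand. The instance and demand weights are invented for illustration; realistic instances require an integer-programming solver.

```python
from itertools import combinations

def solve_mclp(a, d, p):
    """Brute-force MCLP: site exactly p facilities to maximize covered demand.

    a[i][j] = 1 if site j covers demand unit i; d[i] = demand at unit i.
    Returns (best covered demand, chosen sites).
    """
    m, n = len(a), len(a[0])
    best = (0, set())
    for sites in combinations(range(n), p):          # constraint (6): exactly p sites
        # objective (4): total demand of units covered by >= 1 chosen site,
        # which is what the Yi variables track in constraints (5)
        covered = sum(d[i] for i in range(m) if any(a[i][j] for j in sites))
        if covered > best[0]:
            best = (covered, set(sites))
    return best

a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
d = [10, 20, 30, 40]
print(solve_mclp(a, d, p=1))  # site 2 alone covers d[2] + d[3] = 70
```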

The MCLP could be conceived of as an LSCP if p is large enough to cover all demand units. The potential facility locations, demand units, and coverage sets need to be identified in advance to apply these coverage models. Doing so in an error-free manner remains a challenge (Murray and Wei 2013). In this article, we focus on how continuous regional demand has been delineated.

Spatial representation and polygon overlay

When the purpose of planning is to ensure service coverage to all or part of a continuous region, a need exists to abstract the area into discrete spatial objects, like points or polygons. However, the abstraction process is well known to create uncertainties or errors in coverage modeling. As an example, a point representation could result in an underestimate of the number of required facilities (Murray and O'Kelly 2002), whereas an area representation may lead to an overestimate of the number of needed facilities to achieve a certain level of coverage (Murray, O'Kelly, and Church 2008; Tong and Murray 2009). Recently, a vector-based overlay approach was suggested and employed to partition a continuous demand region, where each resulting unit is a disjoint portion of coverage provided by potential facilities (Cromley, Lin, and Merwin 2012; Yin and Mu 2012). A requirement for this vector-based overlay approach is that potential facility sites are known and finite. Fig. 1 shows how this approach works when facility service coverage is circular. This example is constructed using the demand region, potential facility locations, and coverage areas associated with each potential facility location. Overlay, then, involves the physical overlay of the region boundary layer with the facility coverage layer. The result is that each demand unit is the smallest areal unit that a sited facility could possibly cover, because all potential coverage combinations are considered simultaneously in the overlay process. Such a property is extremely important because it ensures that demand unit polygons are partitioned in a way that no error in coverage representation would result based on the given potential facility locations and assumed coverage provided. The following propositions prove that the polygon overlay approach leads to the true minimum number of facilities and/or the maximum coverage for a continuous region.

Figure 1.

Polygon overlay. (a) Demand area. (b) Potential facility locations. (c) Potential facility coverage. (d) Demand units created by polygon overlay.
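The partition illustrated in Fig. 1 can be sketched with Shapely, the Python geometry library used later in the article for the case studies. The sites and radius below are invented for illustration. Unioning the circle boundaries nodes them at their intersection points; polygonizing the noded linework yields the faces of the arrangement, each an atomic demand unit, and point-in-polygon tests then produce the coverage coefficients aij.

```python
from shapely.geometry import Point
from shapely.ops import unary_union, polygonize

R = 1.0
sites = [(0.0, 0.0), (1.2, 0.0), (0.6, 1.0)]   # hypothetical facility locations
coverage = [Point(x, y).buffer(R) for (x, y) in sites]  # circular coverage polygons

# Node all coverage boundaries against one another, then extract the faces of
# the arrangement: each face is a smallest (atomic) demand unit.
boundaries = unary_union([c.exterior for c in coverage])
units = list(polygonize(boundaries))

# Build the coverage matrix a[i][j] by a point-in-polygon test on each unit.
a = [[int(u.representative_point().within(c)) for c in coverage] for u in units]
print(len(units), "demand units")
```

For three mutually overlapping circles in general position this yields seven atomic units, the Venn-diagram faces, matching the count n(n − 1) + 2 minus the unbounded exterior region.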

Proposition 1. The minimum number of facilities obtained by solving the LSCP applied to a polygon overlay representation, ZLSCP (polygon overlay), is equal to the true minimum number of facilities required to cover the entire region, ZLSCP (*).

Proof. Let Cj represent the coverage of facility j, let XLSCP(polygon overlay) = {j | Xj = 1} denote an optimal solution for covering the polygon demand set I partitioned based on coverage overlay, and let XLSCP(*) = {j | Xj = 1} denote an optimal solution for covering the continuous demand region ϕ.

Given ∪i∈I i = ϕ, XLSCP(polygon overlay) is also a feasible solution for covering the demand region ϕ by construction. Because the LSCP is a minimization problem, the objective value of an optimal solution is always less than or equal to that of any feasible solution. Therefore, ZLSCP(polygon overlay) ≥ ZLSCP(*).

Suppose that XLSCP(*) is not a feasible solution for covering demand set I. Then ∃ i ∈ I such that i ⊄ Cj, ∀ j ∈ XLSCP(*). Given that each i is a smallest coverage unit, i ∩ Cj = i or i ∩ Cj = Ø, ∀ j ∈ J, so we can conclude that ∃ i ∈ I with i ∩ Cj = Ø, ∀ j ∈ XLSCP(*). However, because XLSCP(*) is an optimal solution for covering ϕ, ∪j∈XLSCP(*) Cj ⊇ ϕ, which means ∃ i ∈ I with i ∩ ϕ = Ø. This is an obvious contradiction to ∪i∈I i = ϕ. Therefore, XLSCP(*) is also a feasible solution for covering demand set I. Because XLSCP(polygon overlay) is an optimal solution for covering demand set I, XLSCP(*) is a feasible solution for covering demand set I, and the objective value of an optimal solution for a minimization problem is always less than or equal to that of any feasible solution, ZLSCP(polygon overlay) ≤ ZLSCP(*).

Given ZLSCP (polygon overlay) ≥ ZLSCP (*) and ZLSCP (polygon overlay) ≤ ZLSCP (*), then ZLSCP (polygon overlay) = ZLSCP (*).

Proposition 2. The coverage achieved using the MCLP applied to units derived by polygon overlay, ZMCLP (polygon overlay), is equal to the true maximum coverage of the continuous region that p facilities can achieve, ZMCLP (*).

Proof. Let XMCLP(polygon overlay) = {j | Xj = 1} and YMCLP(polygon overlay) = {i | Yi = 1} denote an optimal solution for maximizing the coverage of demand set I partitioned based on coverage overlay with p facilities, and let XMCLP(*) = {j | Xj = 1} denote an optimal solution for maximizing the coverage of continuous demand region ϕ with p facilities.

XMCLP (polygon overlay) and YMCLP (polygon overlay) are feasible solutions for the continuous MCLP. Because the MCLP is a maximization problem, the objective value of the optimal solution is always greater than or equal to that of any feasible solution. Therefore, ZMCLP (polygon overlay) ≤ ZMCLP (*).

The overlay of Cj, jXMCLP (*) results in a demand set I*. From the definition of I, we know that each demand unit in I* is equivalent to a unit or a union of some units in I. Therefore, XMCLP (*) is also a feasible solution for covering demand set I using p facilities. Because XMCLP (polygon overlay) is an optimal solution for covering demand set I (p facilities), XMCLP (*) is a feasible solution for covering demand set I using p facilities, and the objective value of an optimal solution for a maximization problem is always greater than or equal to that of any feasible solution, ZMCLP (polygon overlay) ≥ ZMCLP (*).

Given ZMCLP (polygon overlay) ≤ ZMCLP (*) and ZMCLP (polygon overlay) ≥ ZMCLP (*), then ZMCLP (polygon overlay) = ZMCLP (*).

The preceding propositions establish that polygon overlay-based representation eliminates error due to spatial representation of demand in coverage modeling. The partition of the demand region using polygon overlay gives the theoretical maximum number of demand units required to avoid representation error.

Unlike the overestimate of other area-based representations or the underestimate of point-based representations (Murray and O'Kelly 2002; Murray, O'Kelly, and Church 2008), the polygon overlay approach provides a theoretical error-free representation scheme. However, an issue is whether the approach is computationally feasible in practice, because the complexity of the geometric computations involved is not trivial, and the resulting spatial optimization model is not necessarily possible to solve.

Evaluating polygon overlay

Two major computational concerns exist for applying polygon overlay to address representation issues in coverage modeling. The first is the a priori generation of demand units because it involves considerable geometric operations and processing time. The second is associated with solving the resulting coverage model. A comprehensive evaluation of the approach must consider both aspects, but this has not been done to date.

Vector-based overlay is numerically intensive and time consuming (Waugh and Hopkins 1992; Park and Shin 2002; De Berg et al. 2008). It typically consists of identifying intersection points of boundary lines, splitting boundary lines based on intersection points, constructing new composite polygons, and assigning corresponding polygon attributes. The computational complexity associated with identifying intersection points dominates these procedures (De Berg et al. 2008). Much related work in GIS and computer science addresses the enhancement of the computational efficiency of vector overlay algorithms (see Wang 1993). Plane sweep is the most popular overlay algorithm and has been implemented in many GIS software packages and libraries (Park and Shin 2002). The traditional plane sweep algorithm has an expected running time of O(ulogu + klogu), where u is the number of line segments, and k is the number of intersections. Some variations have been devised to improve the plane sweep algorithm. One is based on a monotonic chain, leading to a running time close to O(ulogv + klogv), where v is the minimum number of monotone chains (Park and Shin 2002).
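To make the roles of u and k concrete, the sketch below counts pairwise segment crossings by brute force in O(u²) time; a plane sweep finds the same k intersections in O(ulogu + klogu) time by processing endpoints and crossings in sorted order. This is an illustrative baseline, not the sweep algorithm itself, and the segment coordinates are invented.

```python
def proper_crossing(p, q, r, s):
    """True if segments pq and rs cross at a single interior point."""
    def orient(a, b, c):
        # sign of the cross product (b - a) x (c - a)
        v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        return (v > 0) - (v < 0)
    return (orient(p, q, r) * orient(p, q, s) < 0 and
            orient(r, s, p) * orient(r, s, q) < 0)

def count_intersections(segments):
    """Naive O(u^2) count of pairwise crossings, i.e. the quantity k that a
    plane sweep reports in O(u log u + k log u) time."""
    k = 0
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if proper_crossing(*segments[i], *segments[j]):
                k += 1
    return k

segs = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((3, 0), (3, 2))]
print(count_intersections(segs))  # the two diagonals cross once
```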

In the context of coverage modeling, each facility provides coverage that can be conceived of as a polygon. The layer of facility coverage polygons generally has considerable overlap, as depicted in Fig. 1c. To address representation issues, a need exists to carry out polygon overlay for these coverage polygons in order to find each unique demand unit such that no error results. The number of line segments (u) is the sum of the number of vertices in each coverage polygon. Each coverage boundary can be transformed into a series of monotone chains (see Park and Shin 2002), the total number of which is v. The number of intersections (k) is the total number of intersections among n facility coverage polygons. Given the characterization of u, v, and k, the running time for generating demand units from n potential facility coverage polygons is O(ulogv + klogv).

Next, we focus on assessing the computational challenges associated with solving the resulting coverage model. Given that both the LSCP and the MCLP are NP-hard problems (Garey and Johnson 1979), the problem size, dictated by the number of decision variables and constraints, largely determines whether the coverage model can be solved using commercial software. Taking the standard benchmark problems from the OR-Library as an example (Beasley 1990b), while the optimal solutions for small- to medium-sized LSCP instances have been found, the largest instances, involving 1,000 or more decision variables and 10,000 or more constraints, still are not readily able to be solved optimally (Lan, Depuy, and Whitehouse 2007; Yelbay, Birbil, and Bülbül 2012). From the LSCP and MCLP formulations, we can observe that the numbers of decision variables and constraints are determined by the number of potential facility locations and by the number of demand units to be covered. As a result, both the running time to perform overlay operations and the number of demand units derived using overlay have significant impacts on whether applying polygon overlay to support coverage modeling is computationally feasible.

The issue now is how to determine the number of demand units resulting from a polygon overlay operation, which is well recognized to be difficult (NCGIA 1997). Even establishing a valid bound is not easy (Saalfeld 1989). However, if the coverage standard is Euclidean distance, and facility coverage is considered to be a circle with radius R, then an upper bound exists for the number of unique polygons that result from overlay. This Euclidean assumption is not unrealistic because many types of facilities have circular service coverage, like emergency warning sirens and cellular towers (see Current and O'Kelly 1992; Akella et al. 2005). Equivalent to the plane division by circles problem in Yaglom and Yaglom (1987), the maximum number of demand units, m, into which facility coverage can be divided is

  m = n(n − 1) + 2    (8)

This bound is attained only if every pair among the n circles intersects transversally, with no three circles ever concurrent. These conditions are rarely satisfied in practice, so the bound may be very loose. As an example, the 291 facility coverage polygons shown in Fig. 2b can generate, theoretically, as many as 84,392 demand units. In practice, however, only 13,320 unique units are observed; the bound is more than six times the actual number. As a result, a need exists to establish a tighter bound for the number of demand units.

Figure 2.

Polygon overlay in Dublin, Ohio, using regularly spaced points for potential facility locations. (a) Dublin. (b) Potential facility locations. (c) Demand units created by polygon overlay.

Given that more intersections of coverage circles tend to result in more generated demand units, accounting for potential intersections in evaluating expected demand units is reasonable. To accomplish this, pairwise distances between facility coverage circles can be computed in advance, which is computationally efficient because facility sites are represented as points. Let djj′ be the distance between facility locations j and j′, R the coverage radius, and ψj = {j′ ∈ J : 0 < djj′ < 2R} the set of locations whose coverage transversally intersects with coverage j. A tighter bound for m can be derived as

  m ≤ q[t(t + 1) + 2] + s(s − 1)    (9)

where t = maxj∈J |ψj|, q = ⌊n/(t + 1)⌋, and s = n mod (t + 1). Because a facility coverage circle transversally intersects, at most, t other coverage circles, the maximum number of demand units generated by overlaying (t + 1) coverage circles is (t + 1)t + 2, if every two intersect and no three are concurrent. The quotient, q, represents how many groups of (t + 1) coverage circles exist. The remainder, s, denotes the number of leftover facility coverage circles that cannot form a full group of (t + 1). Equation (9) is equivalent to equation (8) if t = n − 1. Yet, in practice, t is smaller than n − 1 in most cases, leading to a tighter bound for m.
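Both bounds can be computed directly from the candidate sites and the coverage radius. The sketch below is illustrative: it assumes one plausible grouped form consistent with the definitions of t, q, and s, namely q full groups of (t + 1) circles contributing at most t(t + 1) + 2 units each plus s(s − 1) for the leftover circles, and the function names are ours.

```python
from math import dist  # math.dist requires Python 3.8+

def loose_bound(n):
    """Plane-division bound: at most n(n - 1) + 2 regions from n circles."""
    return n * (n - 1) + 2

def tighter_bound(points, R):
    """Grouped bound built from t, q, and s (one plausible reconstruction):
    q groups of (t + 1) circles yield at most t(t + 1) + 2 units each,
    plus s(s - 1) for the s leftover circles."""
    n = len(points)
    # |psi_j|: number of coverage circles whose boundary transversally
    # crosses circle j's boundary, i.e. 0 < d(j, j') < 2R
    t = max(sum(1 for k in range(n)
                if k != j and 0 < dist(points[j], points[k]) < 2 * R)
            for j in range(n))
    q, s = divmod(n, t + 1)
    return q * (t * (t + 1) + 2) + s * (s - 1)

print(loose_bound(291))  # 84,392, the loose theoretical maximum cited above
```

When every circle intersects every other (t = n − 1, so q = 1 and s = 0), the grouped bound reduces to the plane-division value.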

The study design

To provide empirical support for the theoretical computational complexity associated with representation derived using polygon overlay in coverage modeling, various potential problem characteristics are examined in two study regions. One is Dublin, Ohio, about 46 km2 (Fig. 2a). The planning goal is to site omnidirectional emergency sirens to cover the entire region, using the LSCP. This region has been utilized in a number of other studies to locate warning sirens, including Current and O'Kelly (1992), Murray and O'Kelly (2002), Murray (2005), Murray, O'Kelly, and Church (2008), Tong and Church (2012), and Murray and Wei (2013). The other region of analysis is Elk Grove, California (Fig. 3a), which is about 215 km2. The goal of this study is to provide complete regional coverage by locating the fewest fire stations. Again, the LSCP is utilized to determine this number. Elk Grove also has been studied in previous work, including Murray, Tong, and Grubesic (2012) and Murray and Wei (2013).

Figure 3.

Polygon overlay in Elk Grove, California, using PIPS for potential facility locations. (a) Elk Grove. (b) Potential facility locations. (c) Demand units created by polygon overlay.

The study design includes 20 combinations of potential facility locations and service coverage standards in Dublin and 16 combinations in Elk Grove to assess computational requirements. Potential facility locations are identified in two ways to ensure that an entire study region is completely covered. One employs regularly spaced points. The spacing of points ranges from 200 to 500 m at an interval of 50 m for Dublin, whereas the spacing for Elk Grove ranges from 457.2 to 1,066.8 m at an interval of 152.4 m. Fig. 2b depicts the regularly spaced points 400 m apart. The other way to identify potential facility locations is through the use of the polygon intersection point set (PIPS) (see Murray and Tong 2007 for details). Fig. 3b presents the PIPS utilized for Elk Grove. Two coverage standards are considered for each region. Standards of 976 and 1,464 m correspond to the audible range of sirens in Dublin, whereas standards of 950 and 2,850 m for effective fire station service response are used for Elk Grove. Different layers for potential facility sites and varying service coverage standards combine to create problem instances with different spatial structures, enabling computational complexity to be explored.
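A minimal sketch of the first siting scheme, regularly spaced candidate points. The extent and spacing below are invented; the actual study clips candidates to each region's boundary, and the PIPS construction is not shown.

```python
def grid_sites(xmin, ymin, xmax, ymax, spacing):
    """Regularly spaced candidate facility sites over a rectangular extent."""
    sites = []
    y = ymin
    while y <= ymax:
        x = xmin
        while x <= xmax:
            sites.append((x, y))
            x += spacing
        y += spacing
    return sites

# 400 m spacing over a 2 km x 2 km extent gives a 6 x 6 lattice of candidates.
print(len(grid_sites(0, 0, 2000, 2000, 400)))  # 36
```

Halving the spacing roughly quadruples the candidate count, which is why the spacing intervals above directly control problem size.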

Due to space limitations, only the LSCP is considered here. However, the discussion and results for the polygon overlay approach are similar for the MCLP and for other extensions, such as backup coverage, expected coverage, and gradual coverage. These models are NP-hard (Garey and Johnson 1979) and consequently challenging to solve.

Application results

The analysis was carried out on an Intel Xeon (2.53 GHz) computer running Windows with 6 GB of RAM. The study regions were partitioned into demand units by overlaying facility coverage circles using Shapely, a Python computational geometry library. The most commonly used GIS software, ArcGIS, was also tested for the overlay analysis but was found to take significantly more time for processing. In fact, some instances could not be processed using ArcGIS. A commercial optimization package, Gurobi, was used to solve the associated LSCP integer programming instances.

For each of the 20 application instances for the Dublin region, the demand units were generated and then used to structure the LSCP. The information associated with polygon overlay using 976 and 1,464 m as the audible ranges of a siren in Dublin is summarized in Tables 1 and 2, respectively. The “Number of sites” column specifies the number of potential facility locations, n. This is followed by the variables of total number of intersections, k, and the maximum number of intersections per facility coverage, t. The “Number of demand units” column reports the number of demand units, m, resulting from polygon overlay. The last two columns, “Processing time” and “LSCP solution time,” indicate the computational time to partition the study region into demand units and the time to solve the corresponding LSCP using Gurobi, respectively.

Table 1. Analysis of Polygon Overlay for Dublin, Ohio (976 m Coverage Standard)

Number of sites (n) | Number of intersections (k) | Maximal intersections per coverage (t) | Number of demand units (m) | Processing time (s) | LSCP solution time (s)
186   | 5,952   | 44  | 5,543   | 18.10    | 1.00
221   | 9,503   | 60  | 8,735   | 32.19    | 72.63
291   | 14,550  | 68  | 13,320  | 49.65    | 69.86
366   | 24,888  | 96  | 22,702  | 95.33    | 1,536.95
401   | 34,486  | 123 | 31,269  | 199.20   | 79.81
496   | 47,120  | 134 | 42,369  | 217.75   | 5,723.01
637   | 89,180  | 204 | 81,907  | 1,122.14 | 5,527.45
646   | 90,440  | 207 | 82,283  | 1,104.97 | 3,414.84
735   | 97,755  | 184 | 87,374  | 672.22   | 44,402.39
1,151 | 242,861 | 292 | 214,905 | 3,024.94 | *

Note: * denotes that the problem could not be optimally solved after running for three days.
Table 2. Analysis of Polygon Overlay for Dublin, Ohio (1,464 m Coverage Standard)

Number of sites (n) | Number of intersections (k) | Maximal intersections per coverage (t) | Number of demand units (m) | Processing time (s) | LSCP solution time (s)
186   | 12,648  | 106 | 11,086  | 45.10     | 2.19
221   | 18,564  | 133 | 16,225  | 75.33     | 3.95
291   | 32,301  | 174 | 27,815  | 155.10    | 19.64
366   | 49,776  | 210 | 42,376  | 286.97    | 226.94
413   | 69,797  | 288 | 58,997  | 913.73    | 50.81
496   | 89,280  | 278 | 75,775  | 723.21    | 128.16
735   | 201,390 | 424 | 168,012 | 3,303.22  | 136.35
866   | 377,576 | 684 | 345,650 | 29,429.18 | —
1,151 | 485,722 | 651 | 409,192 | 18,150.33 | —
1,197 | 705,033 | 932 | 641,459 | 85,727.69 | —

Note: — denotes that the solver reported an "out of memory" error message before a feasible solution was identified.

The number of potential facility locations ranges widely from 186 to 1,151 in Table 1, where an audible range of 976 m is used. Fig. 2 shows the region, potential facilities (291), and resulting demand units (13,320) associated with the third row in Table 1. What we see in Table 1 is that the number of demand units generated by polygon overlay increases from 5,543 for 186 potential facility sites to 214,905 for 1,151 potential sites. The processing time to perform the overlay operation also increases. The largest LSCP instance cannot be solved using Gurobi. These trends are also evident in Table 2, which uses 1,464 m as the audible range. The number of generated demand units is as high as 641,459 for 1,197 potential facility sites, with processing time increasing to 85,727.69 s. Three of the largest LSCP instances in Table 2 cannot be optimally solved. Comparing solution time to the number of potential locations in Tables 1 and 2 reveals that no instance with more than 1,000 potential sites could be solved. This outcome illustrates that a larger number of potential sites generally results in a larger number of demand units and, subsequently, a larger LSCP instance. Although solution time is highly dependent on problem size, it is not the only factor. Current commercial optimization solvers employ techniques to speed up solution, such as adding cuts and integrating heuristics. Thus, problem size is not always indicative of solution time, which is why some of the larger problems in Tables 1 and 2 are not necessarily more difficult to solve.

The application results for polygon overlay in Elk Grove are detailed in Tables 3 and 4, which report results for facility service ranges of 950 and 2,850 m, respectively. The number of potential sites ranges from 182 to 1,343, similar to what appears in Tables 1 and 2. The positive relationship among the number of potential sites, the number of generated demand units, and the required processing time also appears in Tables 3 and 4. A noteworthy observation in Table 4 is that the overlay representation for 1,343 facility coverage circles cannot be successfully processed because the computer runs out of memory. Among the 16 different problems summarized in Tables 3 and 4, four of the largest LSCP instances cannot be optimally solved using the commercial solver.

Table 3. Analysis of Polygon Overlay for Elk Grove, California (950 m Coverage Standard)

| Number of sites (n) | Number of intersections (k) | Maximal intersections per coverage (t) | Number of demand units (m) | Processing time (s) | LSCP solution time (s) |
|---|---|---|---|---|---|
| 182 | 1,092 | 8 | 1,256 | 4.22 | 0.00 |
| 205 | 1,435 | 11 | 1,452 | 5.01 | 0.14 |
| 251 | 2,510 | 12 | 2,562 | 8.30 | 0.01 |
| 257 | 2,570 | 13 | 2,554 | 8.40 | 0.83 |
| 362 | 6,154 | 20 | 6,084 | 18.97 | 1,239.59 |
| 575 | 13,800 | 28 | 13,506 | 46.00 | 0.12 |
| 723 | 22,413 | 41 | 22,342 | 76.82 | 12.03 |
| 1,017 | 47,799 | 56 | 47,190 | 188.49 | * |

Note: * denotes that the problem could not be optimally solved after running for three days.
Table 4. Analysis of Polygon Overlay for Elk Grove, California (2,850 m Coverage Standard)

| Number of sites (n) | Number of intersections (k) | Maximal intersections per coverage (t) | Number of demand units (m) | Processing time (s) | LSCP solution time (s) |
|---|---|---|---|---|---|
| 182 | 15,652 | 127 | 13,156 | 57.50 | 2.30 |
| 251 | 30,120 | 176 | 25,074 | 140.09 | 12.59 |
| 362 | 60,454 | 249 | 49,956 | 387.52 | 83.18 |
| 487 | 136,847 | 408 | 119,746 | 1,920.30 | 658.48 |
| 575 | 153,525 | 394 | 126,127 | 2,061.04 | 439.06 |
| 863 | 479,828 | 764 | 435,467 | 21,660.36 | — |
| 1,017 | 484,092 | 698 | 395,738 | 19,226.45 | — |
| 1,343 | 1,153,637 | 1,191 | — | — | — |

Note: — denotes that an "out of memory" error was reported before a feasible solution was identified.

Each problem instance reported in Tables 1–4 has a different spatial structure because of the differing potential facility locations and service coverage. This variation helps ensure that the computational complexity analysis of the polygon overlay-based approach is comprehensive and unbiased.

The theoretical analysis shows that the processing time for generating demand units is associated with the total number of vertices, the number of monotone chains, and the number of intersections. Because each facility coverage is a circle whose polygon approximation has 65 vertices, the total number of vertices equals 65n. The number of monotone chains is twice the number of potential sites, because every circle can be decomposed into two monotone chains (see Park and Shin 2002 for details). The running time for polygon overlay therefore depends on the number of potential sites (n) and the number of intersections (k). The processing time is plotted against n and k in Fig. 4 to show how the algorithm behaves as problem size grows. Processing time increases as n and k increase, consistent with the complexity analysis presented previously and with the expected running time of O(n log n + k log n). Because k can be precomputed efficiently, a reasonable estimate of processing time based on k and n is obtainable in advance.

Figure 4. Empirical results of computational complexity associated with overlay processing.
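Since each facility coverage is a circle with a known center and radius, k can be precomputed without performing the overlay: two distinct circles cross when the distance between their centers lies strictly between the difference and the sum of their radii. A rough sketch (an illustration assuming equal radii, as with a uniform audible range, rather than the article's actual procedure):

```python
from math import hypot

def count_circle_intersections(centers, radius):
    """Count pairwise intersection points among equal-radius circles.

    Two distinct equal-radius circles cross in 2 points when
    0 < d < 2r, touch in 1 point when d == 2r, and otherwise do not
    intersect. This is a brute-force O(n^2) pass; a spatial index
    would speed it up for problem sizes like those in Tables 1-4.
    """
    k = 0
    n = len(centers)
    for i in range(n):
        for j in range(i + 1, n):
            d = hypot(centers[i][0] - centers[j][0],
                      centers[i][1] - centers[j][1])
            if 0 < d < 2 * radius:
                k += 2
            elif d == 2 * radius:
                k += 1
    return k

# Three unit circles: the first two overlap, the third is far away.
print(count_circle_intersections([(0, 0), (1, 0), (10, 0)], 1.0))  # 2
```

Precomputing k in this way is what makes an a priori estimate of overlay processing time, via O(n log n + k log n), practical.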

As mentioned, the size of an LSCP instance, determined largely by the number of sites and the number of demand units, is closely related to the computational feasibility of solving it. As shown in Tables 1–4, no problem with a size larger than 2 × 10⁸ can be optimally solved. This outcome demonstrates that problem size is important in determining whether a model is likely to be solved using a commercial solver. Equation (9) provides a bound on the number of demand units generated by overlaying facility coverage polygons, and this bound can be used to estimate problem size. Fig. 5 depicts the differences between the actual number of demand units and the upper bounds defined by equations (8) and (9). The horizontal axis is the actual number of demand units, and the vertical axis is the upper bound divided by the actual number of demand units. Using the bound defined by equation (9), the ratios range from 1.18 to 1.67, much smaller than those derived using equation (8), which vary from 1.71 to 28.8. The bound given by equation (9) therefore generally provides a much more accurate prediction of problem size than the theoretical bound defined by equation (8), and the computational feasibility of solving the coverage model associated with a polygon overlay representation of a demand region can be more precisely estimated using it.

Figure 5. Differences between the actual number of demand units and the upper bound defined by equations (8) and (9).
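To illustrate why such bounds exist (this is a generic Euler-formula argument for circle arrangements, not necessarily the article's equations (8) and (9)): in general position, an arrangement of n circles with k intersection points and C connected components has exactly k + C bounded faces, and since C ≤ n, the number of demand units is at most k + n. A sketch, assuming equal-radius circles in general position:

```python
from math import hypot

def bounded_faces(centers, radius):
    """Count bounded faces of an equal-radius circle arrangement.

    By Euler's formula, with k intersection points (each of degree 4,
    so E = 2k edges) and C connected components, an arrangement in
    general position has k + C bounded faces. Since C <= n, this
    yields the generic bound k + n on the number of demand units.
    """
    n = len(centers)
    parent = list(range(n))

    def find(a):  # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = hypot(centers[i][0] - centers[j][0],
                      centers[i][1] - centers[j][1])
            if 0 < d < 2 * radius:  # two crossing points
                k += 2
                parent[find(i)] = find(j)
    components = len({find(i) for i in range(n)})
    return k + components

# Two overlapping unit circles: 2 crossings + 1 component = 3 faces
# (two crescents plus the lens-shaped overlap).
print(bounded_faces([(0, 0), (1, 0)], 1.0))  # 3
```

The article's equation (9), which additionally uses the maximal number of intersections per coverage (t), tightens this kind of combinatorial estimate for the configurations actually encountered.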

Discussion and conclusions

The application results demonstrate that the polygon overlay-based approach for addressing representation error in coverage models is computationally intensive and can become infeasible as problem size grows, even when using a state-of-the-art GIS and a commercial solver. For example, when the number of intersections exceeds 1 × 10⁶, the overlay operation fails; when the product of the number of potential sites and the number of demand units exceeds 2 × 10⁸, the corresponding LSCP is unlikely to be solved using commercial software. The theoretical analysis of the computational complexity associated with overlaying facility coverage indicates that processing time depends on the number of potential sites and their intersections. In addition, the number of generated demand units can be bounded by a function of the number of potential locations and the maximal number of intersections per coverage.

Several issues merit further discussion. The first concerns the upper bound derived for the number of demand units. The bounds described by equations (8) and (9) assume the demand region to be a large plane containing all coverage circles. In practice, some coverage circles are not completely within the demand region and may intersect the region's boundary, producing additional demand units. At the same time, overlay units outside the region's boundary should be excluded, reducing the number of demand units needed. Whether demand units increase or decrease on balance is difficult to predict unless the overlay operation is actually performed. Our application results indicate that accounting for the demand region usually yields fewer demand units, especially for larger problems, because coverage extends outside the regional boundary; the upper bound still applies.

The second issue is that the polygon overlay-based approach is generally more computationally challenging than other abstraction approaches utilized in previous work (see Murray and O'Kelly 2002; Murray, O'Kelly, and Church 2008; Murray and Wei 2013). For instance, an LSCP with 1,151 potential facility locations was optimally solved to completely cover the city of Dublin in Murray and O'Kelly (2002), yet, as reported in Tables 1 and 2, LSCP instances with more than 1,000 potential facility sites cannot be optimally solved here. This outcome is attributable to the tremendous number of demand units produced by the overlay approach. The benefit of the overlay approach is that no representation error exists: it produces an extremely detailed representation of continuous demand that eliminates representation errors, but the accompanying computational cost is not negligible, and problem instances may result that simply cannot be solved using exact methods. Alternatively, Cromley, Lin, and Merwin (2012) maintain that heterogeneously distributed demand can readily be accounted for using the overlay approach. Beyond this, reduction-based techniques, like those proposed in Toregas and ReVelle (1973), may enhance solution capabilities, but this possibility remains for future research.

Although the applications presented here use Euclidean distance as the coverage standard, the polygon overlay-based approach remains applicable when other metrics are used, like network distance or travel time (see Gutiérrez and García-Palomares 2011; Chevalier et al. 2012). The difference is that the coverage area is no longer a circle but rather an irregular polygon. This alteration generally increases the computational complexity of overlay operations. In addition, more demand units could be expected from overlaying irregular polygons, resulting in more computational time to solve the corresponding coverage models. Further research is necessary to better understand the potential impacts of nonbinary coverage of irregular polygons.

Although the polygon overlay-based approach is theoretically able to eliminate representation errors in coverage modeling, other errors can still exist in modeling results. Given the computational intensiveness of polygon overlay, many approximations and precision issues exist in the computational process, not to mention the errors in input data, such as inaccuracies of potential facility locations. These errors or uncertainties most likely impact the modeling results.

This article presents a theoretical and empirical analysis of the computational effort required by representation using polygon overlay to support coverage modeling. Although the spatial representation scheme derived using polygon overlay is theoretically error free, computational limitations arise when applying this approach in coverage modeling: both the overlay operation and the solution of the resulting coverage model require significant computational effort. The overlay operation has an expected running time of O(u log v + k log v), which may not be viable given a large number of potential sites or many coverage intersections. Furthermore, the size of the resulting coverage model is likely to exceed the capabilities of current commercial optimization software as the numbers of potential sites and demand units (identified using polygon overlay) increase. As computing technology advances, the computational complexity of the polygon overlay approach for addressing abstraction and representation issues in coverage modeling will likely become less of a concern; until then, however, limitations are likely to be encountered in practice.

Acknowledgements

Ran Wei acknowledges the support from the 2012–2013 Benjamin H. Stevens Graduate Fellowship in Regional Science and from an Arizona State University Graduate College Completion Fellowship.
