Corresponding author: A. Ostfeld, Faculty of Civil and Environmental Engineering, Technion–Israel Institute of Technology, 32000 Haifa, Israel.
 Reliability in general, and in water distribution systems in particular, is a measure of probabilistic performance. A system is said to be reliable if it functions properly for a given time interval and within boundary conditions. Although water distribution system reliability has attracted considerable research attention over the last three decades, there is still no consensus on what reliability measures or evaluation methodologies should be used for the design/operation of water distribution systems. No system is perfectly reliable. In every system undesirable events—failures—can cause a decline or interruption in system performance. Failures are of a stochastic nature and are the result of unpredictable events that occur in the system itself and/or in its environs. A least cost design problem with normal design loadings will result in the cheapest system, but this system will have minimum residual capacity. However, if an increased loading (i.e., higher than the normal design) is implemented, the system's capacity will be increased, thus improving its residual capacity. Finding this “virtual increased loading,” which results in a minimum cost residual system capacity that sustains a required reliability level, is the essence of the proposed methodology, which follows decomposition. The methodology is demonstrated on two example applications of increasing complexity. The main limitation of the suggested method for further extensions to real sized water distribution systems is the computational effort associated with the computation of the “inner” problem. Exploring the required computational burden divided between the “outer” and “inner” problems is a major challenge for future elaborations of this approach.
 Reliability considerations are an integral part of all decisions regarding the planning, design, and operation phases of water distribution systems. Quantitatively, the reliability of a water distribution system can be defined as the complement of the probability that the system will fail, where a failure is defined as the system's inability to supply its consumers' demands.
 A major problem, however, in reliability analysis of water distribution systems is defining reliability measures that are meaningful and appropriate, and computationally feasible. While the question, “Is the system reliable?”, is usually understood and easy to follow, the question, “Is it reliable enough?”, does not have a straightforward response, as it requires both the quantification and evaluation of reliability measures. Much effort has already been invested in reliability analysis of water supplies. These examinations, however, still commonly follow heuristic guidelines like ensuring two alternative paths to each demand node from at least one source, or having all pipe diameters greater than a minimum prescribed value. By using these guidelines, it is implicitly assumed that reliability is assured, but the level of reliability provided is not quantified or measured.
 This study presents a new approach for reliability inclusion in the design and operation of water distribution systems through decomposition. Utilizing decomposition, an optimization problem is partitioned into “outer” and “inner” problems, the “outer” being in a much smaller space than the “inner.” The methodology is demonstrated on two example applications of increasing complexity.
2. Literature Review
 Reliability of water distribution systems has gained considerable research attention over the last three decades. This research has concentrated on methodologies for reliability assessment and for reliability inclusion in the optimal design and operation of water supply systems. This section provides a summary of these efforts.
2.1. Reliability Evaluation Models
Shamir and Howard were the first to propose analytical methods for water supply system reliability. Their methodology took into consideration flow capacity, water main breaks, and maintenance for quantifying the probabilities of annual shortages in water delivery volumes. Vogel suggested the average return period of a reservoir system failure as a reliability index for water supply. A Markov failure model was utilized to compute the index, which defined failure as a year in which the yield could not be delivered. Wagner et al. [1988a] proposed analytical methods for computing the reachability (i.e., the case in which a given demand node is connected to at least one source) and connectivity (i.e., the case in which every demand node is connected to at least one source) as topological measures for water distribution systems reliability. Wagner et al. [1988b] complemented Wagner et al. [1988a] through stochastic simulation in which the system was modeled as a network whose components were subject to failure with given probability distributions. Reliability measures such as the probability of shortfall (i.e., total unmet demand), the probability of the number of failure events in a simulation period, and the probability of interfailure times and repair durations were used as reliability criteria. Bao and Mays suggested stochastic simulation by imposing uncertainty in future water demands for computing the probability that the water distribution system will meet these needs at minimum pressures. Duan and Mays used a continuous-time Markov process for reliability assessment of water supply pumping stations. They took into consideration both mechanical and hydraulic failure (i.e., capacity shortages) scenarios, all cast in a conditional probability frequency and duration analysis framework. Jacobs and Goulter used historical pipe failure data to derive the probabilities that a particular number of simultaneous pipe failures will cause the entire system to fail.
Quimpo and Shamsi employed connectivity analysis strategies for prioritizing maintenance decisions. Bouchart and Goulter developed a model for optimal valve locations to minimize the consequences of pipe failure events, recognizing that in reality, when a pipe fails, more customers are isolated than those situated at the pipe's two ends. Jowitt and Xu proposed a microflow simplified distribution model to estimate the hydraulic impact of pipe failure scenarios. Fujiwara and Ganesharajah explored the reliability of a water treatment plant, ground-level storage, a pumping station, and a distribution network in series, using the expected served demand as the reliability measure. Vogel and Bolognese developed a two-state Markov model for describing the overall behavior of water supply systems dominated by carryover storage. The model quantifies the trade-offs among reservoir system storage, yield, reliability, and resilience. Schneiter et al. explored the system capacity reliability (i.e., the probability that the system's carrying capacity is able to meet flow demands) for enhancing maintenance and rehabilitation decision making. Yang et al. [1996a] employed the minimum cut-set method for investigating the impact of link failures on source-demand connectivity. Yang et al. [1996b] complemented the reliability connectivity model of Yang et al. [1996a] with Monte Carlo simulations for pipe failure impact assessments on a consumer's shortfalls. Xu and Goulter developed a two-stage methodology for reliability assessment of water distribution systems using a linearized hydraulic model coupled with probability distributions of nodal demands, pipe roughnesses, and reservoir/tank levels. Fujiwara and Li suggested a goal programming model for flow redistribution during failure events for meeting customers' equity objectives. Tanyimboh et al. used pressure-driven simulation to compute the reliability of single-source networks under random link failures. Ostfeld et al. applied stochastic simulation to quantify the reliability of multiquality water distribution systems, using the fraction of delivered volume, demand, and quality as reliability measures. Shinstine et al. coupled a cut-set method with a hydraulic steady state simulation model to quantify the reliability of two large-scale municipal water distribution networks. Ostfeld classified existing reliability analysis methodologies and compared two extreme approaches for system reliability assessment: “lumped supply-lumped demand” versus stochastic simulation. Tolson et al. used the same approach as Xu and Goulter for optimizing the design of water distribution systems with capacity reliability constraints by linking a genetic algorithm (GA) with the first-order reliability method (FORM). Ostfeld complemented the study of Ostfeld and Shamir [1996] by designing a methodology for finding the most flexible pair of operational and backup subsystems as inputs for the design of optimal reliable networks. Recently, Torii and Lopez utilized first-order reliability methods in conjunction with an adaptive response surface approach for analyzing the reliability of water distribution systems, and Tanyimboh et al. compared the surrogate measures of statistical entropy, network resilience, resilience index, and the modified resilience index for quantifying the reliability of water networks.
2.2. Reliability Inclusion in Optimal Design and Operation of Water Supply Systems
Su et al. were the first to incorporate reliability into least cost design of water distribution systems. Their model established a link between a steady state one loading hydraulic simulation, a reliability model based on the minimum cut-set method [Tung, 1985], and the general reduced gradient GRG2 [Lasdon et al., 1984] for system optimization. Ormsbee and Kessler [1990] used a graph theory methodology for optimal reliable least cost design of water distribution systems for creating a one level system redundancy (i.e., a system design that guarantees a predefined level of service in case one of its components is out of service). Khang and Fujiwara incorporated minimum pipe diameter reliability constraints into the least cost design problem of water distribution systems, showing that at most two pipe diameters can be selected for a single link. Park and Liebman [1993] incorporated into the least cost design problem of water distribution systems the expected shortage of supply due to failure of individual pipes. Ostfeld and Shamir [1996] used backups (i.e., subsystems of the full system that maintain a predefined level of service in case of failure scenarios) for reliable optimal design of multiquality water distribution systems. Xu and Goulter coupled the first-order reliability method (FORM), which estimates capacity reliability, with GRG2 [Lasdon et al., 1984] to optimize the design of water distribution systems. Ostfeld developed a reliability assessment model for regional water supply systems, composed of storage-conveyance analysis in conjunction with stochastic simulation. Afshar et al. presented a heuristic method for the simultaneous layout and sizing of water distribution systems using the number of independent paths from source nodes to consumers as the reliability criterion. Farmani et al. applied a multiobjective evolutionary algorithm to Anytown USA [Walski et al., 1987] for trading off cost and the resilience index [Todini, 2000] as a reliability surrogate.
Dandy and Engelhardt used a multiobjective genetic algorithm to generate trade-off curves between cost and reliability for pipe replacement decisions. Agrawal et al. presented a heuristic iterative methodology for creating the trade-off curve between cost and reliability (measured as a one level system redundancy) through strengthening and expanding the pipe network. Reca et al. compared different metaheuristic methodologies for trading off cost and reliability, quantified as the resilience index [Todini, 2000]. van Zyl et al. incorporated reliability criteria for tank sizing. Duan et al. explored the impact of system data uncertainties such as pipe diameter and friction on the reliability of water networks under transient conditions. Ciaponi et al. introduced a simplified procedure based on the unavailability of pipes for comparing design solutions with reliability considerations.
 Reliable systems are those able to sustain a required level of service in case of failures. In designing a reliable system, both topological redundancy and residual capacity must be assured. Topological redundancy is provided through layout connectivity measures (e.g., reachability [Wagner et al., 1988a]). Residual capacity is provided using engineering heuristics (e.g., minimum pipe diameters or pumping unit's power) and/or modeling (e.g., a one level redundancy [Ormsbee and Kessler, 1990; Park and Liebman, 1993; Ostfeld and Shamir, 1996]). While topological redundancy is relatively easy to accomplish, guaranteeing residual capacity is the main challenge. This is because the existence of a path between two nodes in a distribution system is only a necessary condition, not a sufficient one, for supplying a required level of service.
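The topological side of this distinction can be illustrated with a short graph search. The sketch below is a minimal illustration and not part of the original study: the small network, its node names, and the helper functions are hypothetical. It checks reachability and connectivity, in the sense of Wagner et al. [1988a], with each pipe removed in turn. It says nothing about residual capacity, which requires hydraulic analysis.

```python
from collections import deque

def reachable_nodes(links, sources, failed_link=None):
    """Return the set of nodes connected to at least one source.

    links: iterable of (node_a, node_b) undirected pipes.
    failed_link: optional pipe removed to mimic a single link failure.
    """
    adjacency = {}
    for a, b in links:
        if failed_link is not None and {a, b} == set(failed_link):
            continue  # skip the failed pipe
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    visited = set(sources)
    queue = deque(sources)
    while queue:  # breadth-first search from all sources
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited

def is_reachable(links, sources, demand_node, failed_link=None):
    """Reachability: one given demand node is connected to a source."""
    return demand_node in reachable_nodes(links, sources, failed_link)

def is_connected(links, sources, demand_nodes, failed_link=None):
    """Connectivity: every demand node is connected to a source."""
    return set(demand_nodes) <= reachable_nodes(links, sources, failed_link)

# Hypothetical layout: one source S, a loop A-B-C, and a dead-end node D.
LINKS = [("S", "A"), ("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
SOURCES = ["S"]
DEMANDS = ["A", "B", "C", "D"]

# Pipes whose single failure disconnects at least one demand node.
critical_links = [
    link for link in LINKS
    if not is_connected(LINKS, SOURCES, DEMANDS, failed_link=link)
]
```

In this hypothetical layout only the pipes outside the loop are critical, which is the kind of weakness that topological redundancy is meant to remove.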
 Engineering systems such as structural trusses or transportation networks demonstrate that for a given system's layout (i.e., connectivity), its capacity (e.g., number of vehicle lanes in a transportation network) results from the imposed system's loading: the higher the loading, the greater the resulting system's capacity. A least cost problem with normal design loadings will result in the cheapest system, but this system will have minimum residual capacity. However, if an increased loading (i.e., higher than the normal design) is implemented, the system's capacity will be increased, thus improving its residual capacity. Finding this “virtual increased loading”, which results in a minimum-cost residual system capacity that sustains a required reliability level, is the essence of the proposed methodology, which follows decomposition. This framework implies decomposition between the loadings (i.e., the imposed demands) and the least cost design and operation problem. Below is a formal description of decomposition followed by its utilization in this study.
 Decomposition is the split of an optimization problem of the form given in equation (1) into the form presented in equation (2):
where x, y = decision variables; f(x, y), h(x) = objective functions; X = a nonempty subset of Rn; the feasible set of y = a nonempty subset of Rm defined in equation (3), where gi(x, y) = the ith constraint; and p = the total number of constraints.
 Decomposition formulates an “inner” and an “outer” problem. The “inner” problem is the minimization of f(x, y) over y, within its feasible subset, for a fixed value of x, while the “outer” problem is the minimization of h(x) over the subset X with respect to x. h(x) is termed the optimal value function, while gi(x, y) forms the constraints of the “inner” problem.
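Under the definitions above, the generic decomposition can be sketched as follows. This is a standard textbook form written under assumptions: the set symbol Y(x) and the inequality form of the constraints are notational choices made here and are not necessarily those of equations (1)–(3).

```latex
\begin{align}
  &\min_{x,\,y} \; f(x, y)
     \quad \text{subject to } x \in X,\; y \in Y(x) \\
  &\min_{x \in X} \; h(x),
     \qquad h(x) = \min_{y \in Y(x)} f(x, y) \\
  &Y(x) = \{\, y \in \mathbb{R}^m : g_i(x, y) \le 0,\; i = 1, \dots, p \,\}
\end{align}
```

The first line is the undecomposed problem, the second separates the “outer” search over x from the “inner” minimization over y, and the third defines the feasible set of the “inner” problem.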
where = vector of the circular flows (same dimension as the basic water distribution system's loops); Q = subspace of the circular flows; c = unit cost vector of the candidate pipe diameters; q0 = an initial vector of flows satisfying mass continuity at nodes (i.e., Kirchhoff's Law No. 1); u = vector of lengths of the candidate pipe diameters; L, P, J, I = basic loop, path, hydraulic gradient, and identity matrices, respectively; and b1, b2 = right-hand side parameter vectors. Equation (5) represents continuity of energy constraints over closed loops (i.e., Kirchhoff's Law No. 2); equation (6), minimum pressure head constraints; and equation (7), length constraints.
 Using the above formulation is the “outer” optimal value function, which is highly nonlinear and nonsmooth (for example, see the surface in Loganathan et al. [1995, Figure 2]); where the “inner” problem minimizes cTu subject to the domain defined in equations (5)–(7).
3.2. Model Formulation
 The model formulated and solved in this study is defined in (8). The reader is referred to the general formulation of the water distribution system's reliability problem prior to decomposition as given in Su et al. [1987, equations (1)–(5)].
where α = base demands coefficient vector (decision variables), αmax = upper bound for α, = optimal value function [see Ostfeld and Shamir, 1996], = water distribution system's least cost design problem (where F = objective function, = decision variables, and = feasible domain), = optimal solution, and = outcome of a reliability evaluation model receiving the value of zero if reliability criteria are met or a penalty proportional to the extent of not fulfilling reliability measures.
 The model defined in equation (8) has three interconnected parts: (1) ; (2) ; and (3) . Each of those is first described followed by the proposed overall solution scheme.
3.2.1. Least Cost Design Problem
 is the least cost design problem of a multiple loadings looped pumping and storage water distribution system according to Ostfeld and Tubaltzev [2008]:
where Zi (di) = unit length cost ($/m) of candidate pipe diameter (mm); np = number of links (pipes); and Li = the length of link i (m).
 Pump operational cost ($) (POC):
where POC = pump operational cost ($); AD = annual duration (365 days); APPV = annual pumps present value coefficient (AI = annual interest rate, TPLD = total pumps life duration (years)); = 0.746/270, the units conversion factor (kW-h/m4); = 0.75, the pump efficiency, assumed constant (-); npump = number of pumping units; T = 24 (h d−1); i = time duration index (i = 1, 2, … , 24); = energy tariff at time duration i ($/kW-h); = head added by pumping unit j at time duration i (m); and = flow supplied by pumping unit j at time duration i (m3 h−1).
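The terms defined above can be combined in a short numerical sketch. It rests on two assumptions that are not stated in the text: APPV is taken to be the standard uniform-series present worth factor, which reproduces the value APPV = 10.04 reported for AI = 5.5% and TPLD = 15 years in the example applications, and the operational cost is taken as AD × APPV × the daily energy cost. The function names and the pump loadings are hypothetical.

```python
def present_value_coefficient(annual_interest, life_years):
    """Uniform-series present worth factor (assumed form of APPV)."""
    return (1.0 - (1.0 + annual_interest) ** (-life_years)) / annual_interest

def pump_power_kw(head_m, flow_m3h, efficiency=0.75):
    """Pump power (kW) using the 0.746/270 conversion factor from the text."""
    return (0.746 / 270.0) * head_m * flow_m3h / efficiency

def pump_operational_cost(loadings, annual_days=365, appv=1.0):
    """Present value of energy cost for one pumping unit (assumed structure).

    loadings: list of (tariff $/kW-h, head m, flow m3/h, duration h).
    """
    daily_cost = sum(
        tariff * pump_power_kw(head, flow) * hours
        for tariff, head, flow, hours in loadings
    )
    return annual_days * appv * daily_cost

APPV = present_value_coefficient(0.055, 15)  # rounds to 10.04

# Hypothetical pump: four 6-hour loadings at the base tariff of 0.08 $/kW-h.
LOADINGS = [(0.08, 50.0, 270.0, 6.0)] * 4
POC = pump_operational_cost(LOADINGS, appv=APPV)
```

The agreement of the computed coefficient with the reported 10.04 supports, but does not confirm, the assumed form of APPV.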
 Pump construction cost ($) (PUCC):
where CPUMP = unit power cost of pump construction ($/HP) (1 Horsepower (HP) = 0.746 Kilowatts (kW)).
 Tank construction cost ($) (TCC):
where ntank = number of tanks; = unit water level cost of tank i ($/m); = diameter of tank i (m); and = the water level at tank i at the end of the kth time step (loading) (m).
 Model constraints are defined in equations (15)–(17) for head, maximum permitted amounts of water withdrawals for each of the sources, and tank closure, respectively:
where = total head at consumer node j at time step i (m); , = minimum and maximum allowable total head requirements at consumer node j, respectively (m); cn = consumer node; and ncn = number of consumer nodes.
where = total amount of permitted water withdrawal annually from source node j (m3); and nsn = number of source nodes.
where , = the jth tank initial and final storage water volumes, respectively (m3); = a user-selected tolerance number. Note that if = 0 and ntank = 1, the initial and final tank volumes (and thus the tank water levels) coincide. It is assumed that the tank locations are set in advance, so the optimal tank layout is not considered in the present study.
3.2.1.1. Decision Variables
 The model decision variables are the pipe diameters [d (size np)]; and the power required at each pumping unit j, at each time duration i: (i = 1,…,T).
 is set to zero if the system is feasible in terms of reliability (i.e., meets required reliability criteria). For instance, may be set to zero if the outcome of a stochastic simulation model yields a probability of less than 2% for receiving pressures below 30 m for a duration longer than 2 h at any demand node. Otherwise, a penalty is imposed that is proportional to the extent of not meeting this reliability level.
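The zero-or-penalty behavior described above can be sketched in a few lines. This is an illustrative sketch only: the function name and the penalty rate are hypothetical, and the 2% threshold is taken from the example in the text. The input is assumed to be the probability, estimated by a stochastic simulation, of pressures below 30 m lasting longer than 2 h at any demand node.

```python
def reliability_penalty(failure_probability,
                        allowed_probability=0.02,
                        penalty_rate=1.0e6):
    """Return 0 if the reliability criterion is met, else a proportional penalty.

    penalty_rate is a hypothetical cost per unit of excess probability.
    """
    # Excess over the allowed probability; zero when the criterion is met.
    violation = max(0.0, failure_probability - allowed_probability)
    return penalty_rate * violation

penalty_feasible = reliability_penalty(0.01)    # criterion met
penalty_infeasible = reliability_penalty(0.05)  # 3 percentage points over
```

A proportional penalty of this kind gives the search a gradient toward feasibility instead of rejecting infeasible designs outright.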
 Herein is minimized on . resulted from decomposition, which has already been shown to be highly nonlinear and nonsmooth [Loganathan et al., 1995]. The objective function nonlinearity and nonsmoothness will also be demonstrated in section 4.
3.2.4. Solution Scheme
 The overall solution scheme for the model given in equation (8) is schematically described in Figure 1. It consists of solving the two linked “outer” and “inner” problems using the ant colony algorithm implemented in Ostfeld and Tubaltzev [2008]. In the “outer” problem, each node represents a solution of the “inner” problem, which corresponds to an extended demand pattern and reliability model outcome. In the “inner” problem, the least cost design problem is solved. The complete ant colony algorithm formulation appears in Ostfeld and Tubaltzev [2008, Figure 2] and is therefore not repeated here.
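The interplay between the “outer” and “inner” problems can be illustrated on a toy case. The sketch below is not the ant colony algorithm of the study: exhaustive enumeration stands in for the “inner” optimizer and a coarse grid for the “outer” search. The two-pipe network, candidate capacities, costs, and penalty rate are all hypothetical.

```python
from itertools import product

# Hypothetical candidate pipes: (capacity in m3/h, cost in arbitrary units).
CANDIDATES = [(50.0, 10.0), (100.0, 18.0), (150.0, 30.0), (200.0, 45.0)]
BASE_DEMAND = 100.0  # m3/h, hypothetical

def inner_least_cost(alpha):
    """'Inner' problem: cheapest pair of parallel pipes carrying alpha * demand."""
    best = None
    for pipe_1, pipe_2 in product(CANDIDATES, repeat=2):
        capacity = pipe_1[0] + pipe_2[0]
        cost = pipe_1[1] + pipe_2[1]
        if capacity >= alpha * BASE_DEMAND and (best is None or cost < best[0]):
            best = (cost, (pipe_1[0], pipe_2[0]))
    return best

def failure_penalty(design, penalty_rate=10.0):
    """Zero if the base demand is still met with either pipe out of service."""
    residual_capacity = min(design)  # worst single pipe failure
    shortfall = max(0.0, BASE_DEMAND - residual_capacity)
    return penalty_rate * shortfall

def outer_search(alphas):
    """'Outer' problem: demand multiplier with the lowest cost plus penalty."""
    results = []
    for alpha in alphas:
        cost, design = inner_least_cost(alpha)
        results.append((cost + failure_penalty(design), alpha, design))
    return min(results, key=lambda r: r[0])

best_total, best_alpha, best_design = outer_search([1.0, 1.25, 1.5, 1.75, 2.0])
```

In this toy case the design for the normal loading (multiplier of 1.0) is the cheapest but fails the single-failure check, while an inflated loading forces the “inner” problem to buy enough residual capacity. That is the role the “virtual increased loading” plays in the methodology.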
4. Example Applications
 The methodology is demonstrated on two example applications from Ostfeld and Tubaltzev [2008] through base runs and sensitivity analyses: Example 1, a simple illustrative network, and Example 2, Anytown USA [Walski et al., 1987]. Both networks were hydraulically simulated using EPANET (U.S. Environmental Protection Agency, EPANET 2.0, 2002, http://www.epa.gov/ORD/NRMRL/wswrd/epanet.html). The ant colony “inner” and “outer” algorithms were self-coded in Visual Basic 6.0.
4.1. Example 1
 The system is shown in Figure 2. It consists of 11 pipes, two pumping stations (P1 and P2), and 10 nodes: three constant head source nodes (S1, S2, and S3), one elevated tank, and six consumers (A … F). The optimization is performed for a typical day partitioned into four loading conditions, starting at midnight.
4.1.1. Base Run
Table 1 shows the node data: elevations, base demands, and minimum pressure requirements. Table 2 describes the distribution of the demand and energy tariff coefficients for the four loading conditions considered. Table 3 provides the unit cost of the candidate pipe diameters, all with an assumed Hazen-Williams coefficient of 130, and Table 4 presents the algorithm parameters [Ostfeld and Tubaltzev, 2008]. The length of all pipes is 1000 (m), except for pipe 11: 1100 (m) and pipe 10: 100 (m); the tank initial level is 2 (m) with an assumed diameter of 36 (m); AD = 365 (days); = 20,000 (m3), j = S1, S2, S3; APPV = 10.04 (AI = 5.5%, TPLD = 15 years); CPUMP = 3200 ($/HP) [2387 ($/kW)]; base energy tariff = 0.08 ($/kW-h); UTC = 40,000 ($/m); ε = 1.0 × 10^−5; and η = 0.75. A minimum pressure of 24 (m) under any one link failure between 12:00 and 18:00 h is considered the residual system reliability requirement. Results for the best base run (BR) solution attained from multiple trials are given in Figure 3. Figure 4 shows the base run solution progress for the total power, pipe, pump, and tank construction costs. It displays the typical “stair-like” progress of the algorithm. Most reductions in the total cost of the system are made by improving the pump construction and operational costs, and fewer by the tank and pipe costs (note that the cost of pipes even increases in the best attained solution). The computational time required for a single trial was ∼50 h on an IBM ThinkPad X200, Intel® Duo CPU P8400@2.26 GHz, 2.96 GB of RAM.
Table 1. Elevation, Base Demand, and Minimum Pressure Requirements for Example 1
 It can be seen from Figure 3 that the least cost optimal reliable solution is 4.605 × 10^6 ($), which is about 18.5% higher than the least cost design solution of Ostfeld and Tubaltzev [2008] of 3.885 × 10^6 ($). The optimal reliable solution incorporates an increase in almost all pipe diameters and in the tank volume. It is interesting to note that the pump construction cost is unchanged, as is the switch of the maximum pumping power between sources 1 and 2. A minimum pressure of 24.98 (m) is attained at node F in case link E-F fails. A detailed breakdown of the optimal reliable base run solution and its comparison to the least cost design solution of Ostfeld and Tubaltzev [2008] are given in Table 5.
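The quoted cost premium for reliability follows directly from the two reported totals, as the short check below shows:

```latex
\[
  \frac{4.605 \times 10^{6} - 3.885 \times 10^{6}}{3.885 \times 10^{6}}
  = \frac{0.720}{3.885} \approx 0.185 = 18.5\%
\]
```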
Table 5. Base Run and Five Sensitivity Analysis Results for Example 1
Most Binding Pressure Constraint (m) at Failure Mode
BR = base run; SA1 = sensitivity analysis 1; NA = not applicable; 1.131 (25) = cost of $1.131 × 10^6 (25% of total cost); 24.98@F, E–F = most binding pressure constraint of 24.98 m at consumer node F, occurring in failure mode of pipe connecting nodes E to F.
 The objective of the sensitivity analysis in this study is to explore the model response to modifications made to the data, the physical system, or the solution scheme parameters. The goal is to see whether the model behaves in an explainable manner, thus strengthening confidence in using it as a decision support tool. Example 1 incorporates such sensitivity analyses, whereas Example 2 is targeted toward exploring the solution quality versus computational effort.
Table 5 provides five sensitivity runs aimed at testing the model's response to data modifications. In sensitivity analysis 1 (SA1), the failure period was extended to 10:00–20:00 h (12:00–18:00 h in the BR), resulting in an increase of the total cost to 5.176 × 10^6 ($). The minimum pressure attained was 26.44 (m) at node F in case link C-E failed. In SA2, the ant colony iteration number of the “inner” problem was reduced to 1000 (3000 in the BR), which results in a total cost increase to 5.127 × 10^6 ($). In SA3, the linear penalty incurred on infeasible solutions in the BR is modified to cubic (i.e., penalty coefficient B = 2, see Table 4), resulting in an increase of the solution cost to 5.167 × 10^6 ($). In SA4, the pheromone table size is set to 10 (20 in the BR), yielding a total cost of 4.915 × 10^6 ($) and a shift of the minimum pressure location to node A. In SA5, the iteration one ant colony coefficient (see Table 4) is reduced to 5 (10 in the BR), yielding a result similar to SA1–SA3 of 5.162 × 10^6 ($). Figure 5 shows the extended “inner” problem surface, demonstrating the nonsmooth nature of the optimal value function (i.e., , see equation (8), and Loganathan et al. [1995]). It was constructed by solving the example application with base demands at QC and QD. The plot shows the total system cost as a function of QC and QD with a minimum attained value of 2.068 × 10^6 ($) corresponding to extended demands of 257.4 and 168 (m3 h−1) at consumer nodes C and D, respectively. The plot shows multiple minima and maxima and the highly nonsmooth nature of the objective function surface. The attained result represents a good solution to this problem, as seen in Figure 5.
4.2. Example 2
 The purpose of Example 2 is to demonstrate the proposed methodology on a more complex “near real” world water distribution system, and to compare its results to a known optimal design solution with no reliability considerations [i.e., Ostfeld and Tubaltzev, 2008].
 The layout of Example 2 is given in Figure 6, which is a slight modification to Anytown USA [Walski et al., 1987], following Ostfeld and Tubaltzev [2008]. This modification contains an additional source (S1) connected to node 16, and a tank connected to node 4. Anytown consists of 34 pipes, two pumping stations (P1, P2), and 19 nodes: 14 consumer nodes, two internal nodes with zero demands (nodes 16 and 17), two constant head sources (S1, S2), and one elevated storage tank. The optimization starts at midnight and is conducted for 24 h, partitioned into four loading conditions. The demand pattern coefficients and energy tariffs, as well as the unit length cost data of the candidate pipe diameters, are the same as in Example 1 (i.e., Tables 2 and 3, respectively). The node data are provided in Table 6.
Table 6. Elevation, Base Demand, and Minimum Pressure Requirements for Example 2
 Additional data: the length of pipes 1, 2, 3, 30, and 32 is 3000 (m), and that of pipe 34 is 100 (m). The rest are 1000 (m) in length. The tank's initial level is 10 (m) with an assumed diameter of 36 (m); AD = 365 (days); = 100,000 (m3), j = S1, S2; APPV = 10.04 (AI = 5.5%, TPLD = 15 years); CPUMP = 3200 ($/HP) (2387 ($/kW)). The base energy tariff = 0.08 ($/kW-h); UTC = 40,000 ($/m); ε = 1.0 × 10^−5; and η = 0.75. A minimum pressure of 24 (m) under any one link/pump failure between 10 a.m. and 10 p.m. is considered the residual system's reliability requirement. The algorithm parameters used for this example are described in Table 7.
Table 7. Ant Colony Algorithm Parameters for Example 2
 The computational time required for a single trial on an IBM ThinkPad X200, Intel® Duo CPU P8400@2.26 GHz, 2.96 GB of RAM, varied according to the ant colony “inner” iterations number (see Table 7). For 100 “inner” iterations the computational time was ∼3.2 h, for 200 ∼6.4 h, and for 500 ∼16 h. Note that for Example 1, a single trial took ∼50 h on the same machine, but this is based on a different set of parameters (see Table 4). Applying the suite of parameters of Example 1 to Example 2 results in a single run duration of about 9.2 days. In Example 2, the computational time for a single run was intentionally reduced substantially to explore statistics on multiple runs of the algorithm with different computational efforts. The analysis was made by modifying the “inner” problem optimization effort through the ant colony “inner” iterations number. Other means of modifying the computational effort (e.g., changing the number of “inner” or “outer” ants) could also be explored. The run results (15 repetitions, 5 successful (i.e., feasible) for each “inner” problem iterations number modification) are summarized in Table 8, and are further discussed below.
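The three reported run times grow in direct proportion to the number of “inner” iterations, which a short check makes explicit. The sketch below uses only the figures quoted above. The linear extrapolation function is an assumption that holds only while all other algorithm parameters stay fixed, and its names are hypothetical.

```python
# Reported single-trial run times (hours) for Example 2, keyed by the
# number of "inner" ant colony iterations.
REPORTED_HOURS = {100: 3.2, 200: 6.4, 500: 16.0}

def hours_per_iteration(reported):
    """Average run time per 'inner' iteration over the reported points."""
    rates = [hours / iterations for iterations, hours in reported.items()]
    return sum(rates) / len(rates)

def estimated_hours(iterations, reported=REPORTED_HOURS):
    """Linear extrapolation, assuming all other parameters are unchanged."""
    return hours_per_iteration(reported) * iterations

rate = hours_per_iteration(REPORTED_HOURS)  # about 0.032 h per iteration
```

Such an estimate does not transfer to the Example 1 parameter set, which differs in more than the iterations number.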
Table 8. Multiple Runs Sensitivity Analysis for Example 2
Ant Colony “Inner” Problem Iterations Number (see Table 7)
Cost ($ × 10^6)
Most Binding Pressure Constraint (m) at Failure Mode
Here, 3.575 (18) = cost of $3.575 × 10^6 (18% of total cost); 29.7@4, P2 = most binding pressure constraint of 29.7 m at consumer node 4, occurring in failure mode of pump P2.
100 “inner” iterations: Best Cost, Average Cost, % of Successful (Feasible) Trial Runs: 45.230, 57.166, 20 (e.g., 2 out of 10)
500 “inner” iterations: Best Cost, Average Cost, % of Successful (Feasible) Trial Runs: 41.678, 48.356, 80 (e.g., 8 out of 10)
4.2.1. Run Results
 The run outcomes for Example 2 are summarized in Table 8 and in Figures 7–11. Table 8 shows statistics of multiple runs of the algorithm, each with a different ant colony “inner” iterations number: 100, 200, and 500, respectively. Table 8 shows that the best solution is attained at run number 9 with 200 “inner” ant colony iterations. It also demonstrates that as the number of “inner” ant colony iterations increases, the average result of the best ant colony improves: from $57.166 × 10^6 for 100 iterations, to $52.829 × 10^6 for 200, and $48.356 × 10^6 for 500 (see Figure 7). In addition, as the number of “inner” ant colony iterations increases, so does the likelihood of receiving a successful (i.e., feasible) trial. For example, about eight runs with 100 “inner” ant colony iterations were required to receive one feasible outcome trial.
 Almost all solutions identify consumer node 4 as the critical node, and pump P2 as the critical system component. Also note that in all runs, the percentages of total construction and operational costs are about 60% and 40%, respectively.
 An a priori engineering estimate would probably be that node 4 (the highest consumer node in the system), in conjunction with an outage of one of the sources (i.e., pump P1 or P2), forms the critical failure scenario for which backup capacity is required. This is verified in almost all runs. However, run 9, having the lowest overall cost, points to node 10 as the critical node, with an overall cost fraction of 0.56 for construction and 0.44 for operation. This outcome is not an obvious engineering guess; the proposed methodology revealed it, thus reducing the overall cost of the system.
Figure 8 shows detailed design results of run number 9, emphasizing all increased pipe diameters. Note that the overall cost is about twice as high as the best ant colony solution of Ostfeld and Tubaltzev [2008] with no reliability considerations (i.e., $40.399 × 10^6 versus $20.356 × 10^6). In addition, at nodes 6 and 11, the base demand coefficient expansion factor is binding (i.e., it is extended to its maximum value of 1.96, see Table 7). This indicates that a better solution might be found if the upper bound on the base demand coefficient expansion factor were increased.
Figure 9 describes detailed hydraulic flows and total heads of run number 9 at 5 p.m., where the lowest pressure of 28.6 (m) at node 10 is attained. Figure 9 shows that pump P1 is at its maximum power of 3542 Horsepower. The tank is supplying 1320 (m3/h), the maximum tank flow withdrawal during the entire simulation.
Figure 10 details the best obtained design results for the case of no possible pump/source failures. Figure 10 demonstrates that the overall cost is reduced to $30.596 × 10^6, and the critical node is consumer node 8 if pipe 6 is out of service. The minimum attained pressure is 27.8 (m), occurring between 12:00 p.m. and 5:00 p.m.
Figure 11 shows a trade-off curve between cost and failure durations. It shows that the overall cost of the system increases nonlinearly as the failure durations increase. The results for Figure 11 (and for Figure 10) are the best received after several runs of the algorithm with different “inner” ant colony iteration numbers.
 This study presented a conceptual decomposition scheme for optimal reliable design and operation of water distribution systems. The decomposition divides the problem into “outer” and “inner” formulations with the consumers' demands serving as the decomposition variables. Within the “inner” problem the least cost design problem is solved and simulated to meet reliability constraints, while within the “outer” problem the optimal increase in demands that results in a reliable system at minimum cost is searched. The methodology was demonstrated on the two example applications described in Ostfeld and Tubaltzev [2008].
 The two explored example applications were analyzed through base runs and sensitivity analyses, revealing, in all cases, explainable solution outcomes. Computational intensity as a barrier to solving real-sized water distribution system problems is an important issue, which demands further elaboration when using any evolutionary computation based technique, including the ant colony algorithm used in this study. However, even with a limited computational effort, as demonstrated for Example 2, good logical solutions were revealed in all runs, one of which was an improved nontrivial reduced cost outcome.
 There is no claim that any of the solutions obtained is optimal, not even locally, as with any heuristic evolutionary scheme. Nevertheless, from a practical/engineering perspective, the objective is to identify better solutions than those from other traditional methods such as good engineering judgment or trial and error. The methodology suggested in this work provides such an instrument.
 Still, further research is warranted to quantify the growth in computational complexity and the quality of the solutions that result as system size increases. These questions should be addressed by exploring additional example applications of different size, layout, and loading.
 This research was supported by the Fund for the Promotion of Research at the Technion, and by the Technion Grand Water Research Institute (GWRI). The coding assistance of Ariel Tubaltzev is gratefully acknowledged.