Minimizing the network availability upgrade cost with geodiversity guarantees

As telecommunication networks are a critical infrastructure of our society, they must evolve to provide high end-to-end availability and high resilience to large-scale disasters. Path protection mechanisms can improve end-to-end availability but, in general, might not be enough to reach the availability required by critical services. Moreover, adding geodiversity to the routing paths (i.e., selecting path pairs with higher geographical distance between them) enhances the network disaster resilience but also makes it more challenging to reach a high end-to-end availability as the resulting paths tend to be longer. So, for a network where each link is characterized by its current availability and by the cost of upgrading its availability to a new value, this paper proposes some strategies aiming to determine a set of links to be upgraded at a minimum cost ensuring a desired level of availability and geodiversity. The problem is defined as an integer non-linear programming model, a solving algorithm based on different greedy strategies is proposed and the relative performance of the different strategies is evaluated on a set of problem instances.


I. INTRODUCTION
It is well known that telecommunication networks are currently one of the critical infrastructures upon which our society depends and services are expected to be always-on. Moreover, disaster-based failures are becoming more frequent in time and wider in scope, degrading drastically the services supported by telecommunication networks [1] (a survey of strategies to protect networks against large-scale natural disasters is presented in [2]). Thus, it is imperative that telecommunication networks become both highly available and resilient to disasters.
Networks need to guarantee that all node pairs of interest (i.e., node pairs involved in critical services such as emergency calls, smart grid communications, financial transactions, etc.) have a high end-to-end availability [3], [4], [5]. Moreover, when a disaster-based failure occurs, it is important not only to quickly recover the network in the disaster area but also to minimize the disaster impact for node pairs outside the disaster area. Here, we explore the idea of path geodiversity, i.e., how to take into consideration the geographical diversity of the network topology when making routing decisions. Path geodiversity has been used to improve services availability [6] and disaster resilience [7].
Consider a network topology such that the geographical distance between any two network elements (nodes or links) is known. As in [8], [9], we consider a path geodiversity strategy where the routing between two network nodes is based on two paths geographically separated by at least a minimum distance D, so that a disaster whose geographical coverage has a diameter lower than D and that affects one intermediate element of one path cannot affect the other path. Here, we follow [9], which considers both nodes and links as intermediate elements, while [8] considers only intermediate nodes.
Note that the larger the value D is, the more resilient the network is to disasters but the more difficult it becomes to reach high end-to-end availability since the resulting paths tend to be longer.
In this work, we jointly consider the end-to-end availability and the disaster resilience of a given network topology. We address the network availability upgrade problem where the availability of some links must be upgraded to provide the required end-to-end availability and disaster resilience for a set of node pairs of interest. In [10], the authors also select a set of links to be upgraded (shielded, so that they become invulnerable). However, only end-to-end connectivity against geographically based attacks is ensured, without considering availability requirements.
The upgraded network must guarantee that each node pair of interest is provided with at least one path pair fulfilling two requirements. Concerning end-to-end availability, at least one path pair with a minimum target availability of Λ must be provided to each node pair. As pointed out in [9], the maximum geodiversity value that can be provided by a path pair to a given node pair is constrained by the geographical locations of the nodes and the geographical paths of the links. Hence, concerning disaster resilience, a target geodiversity value D with a soft requirement is used: at least one path pair with a minimum geodiversity of D must be provided to each node pair whose maximum geodiversity value is at least D; for node pairs such that the network topology cannot provide D, the maximum possible geodiversity value is used.
The paper is organized as follows. In section II, the network availability upgrade problem is described. In section III, a solving algorithm based on different greedy strategies is proposed. Section IV presents the computational results comparing the efficiency of the different greedy strategies. Finally, section V presents the main conclusions, along with some further work.

II. PROBLEM DEFINITION
Consider two given parameters: a minimum availability parameter $\Lambda$ and a minimum geodiversity parameter $D$. Consider a given biconnected network defined by an undirected graph $G = (N, E)$ where $N$ is the set of nodes and $E$ is the set of edges representing node pairs connected by a direct link. Each edge $e \in E$ is characterized by its current availability $a_e$, its upgraded availability $a^u_e$ and the cost $c_e$ required to upgrade its availability from $a_e$ to $a^u_e$. The aim is to determine a set of edges to be upgraded at a minimum cost. For a given set of node pairs of interest $K$, the upgraded availability solution must guarantee the existence of at least one pair of paths for each node pair $(s, t) \in K$ with (i) a minimum availability of $\Lambda$ and (ii) a minimum geodiversity of $D$, if the topology of $G$ allows it, or the maximum possible geodiversity value if it is lower than $D$.
For each node pair $(s, t) \in K$, consider the set of all pairs of paths available in $G$, defined by $R_{st}$, where each path pair $r \in R_{st}$ is defined by two sets of edges: $S_{r1}$ (the edge set of the first path of $r$) and $S_{r2}$ (the edge set of the second path of $r$). Then, consider the following binary variables: $x_e$ is equal to 1 if edge $e \in E$ is upgraded, and 0 otherwise; $y_{str}$ is equal to 1 if node pair $(s, t) \in K$ may be provided with path pair $r \in R_{st}$, and 0 otherwise. Using these variables, the availability of a path pair $r \in R_{st}$, represented by $\Lambda_r$, is given by:

$$\Lambda_r = 1 - \Big(1 - \prod_{e \in S_{r1}} \big[a_e + (a^u_e - a_e)\,x_e\big]\Big)\Big(1 - \prod_{e \in S_{r2}} \big[a_e + (a^u_e - a_e)\,x_e\big]\Big) \qquad (1)$$

Following [9], the geodiversity of a path pair is the minimum distance between any intermediate node or edge of one path and any node or edge of the other path. In [9], it is shown that the geodiversity of a path pair can be modelled based only on geographical distances between edges, provided that these distances are defined as follows. For each $r \in R_{st}$ between $s$ and $t$, the geographical distance between one edge $e_i \in S_{r1}$ and one edge $e_j \in S_{r2}$, represented by $\delta(e_i, e_j)$, is defined as (i) the minimum distance between any point in the geographical path of $e_i$ and any point in the geographical path of $e_j$ if they do not share $s$ or $t$, or (ii) the minimum distance between one edge and the non-common end node of the other edge if they share either $s$ or $t$.

Fig. 1 shows part of a network with the source node $s$, five other nodes (2 to 6) and five edges ($a$ to $e$), illustrating the geographical distances between edges in the different cases. Examples of case (i) are the distances $\delta(a, e)$ and $\delta(b, e)$. Note that the zero distances $\delta(a, b)$, $\delta(c, d)$ and $\delta(d, e)$ also illustrate this case, because each pair of edges shares a node which is neither $s$ nor $t$. As for case (ii), since edges $a$ and $c$ share the source node $s$, the distance $\delta(a, c)$ is the minimum of two values: the distance between node 2 (the non-common end node of $a$) and edge $c$, and the distance between node 3 (the non-common end node of $c$) and edge $a$.
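As an illustration, the availability of a path pair under a given upgrade decision can be sketched in Python. This is a minimal sketch that assumes the two paths share no edge and that edges fail independently; the function names and the dictionary-based inputs (`a`, `a_up`, `x`) are illustrative choices, and the numerical values are hypothetical.

```python
from math import prod


def path_availability(path, a, a_up, x):
    """Product of edge availabilities along one path.

    An edge contributes its upgraded availability a_up[e] when x[e] == 1
    and its current availability a[e] otherwise.
    """
    return prod(a_up[e] if x.get(e, 0) else a[e] for e in path)


def pair_availability(path1, path2, a, a_up, x):
    """Availability of a path pair: the pair fails only if both paths fail."""
    av1 = path_availability(path1, a, a_up, x)
    av2 = path_availability(path2, a, a_up, x)
    return 1.0 - (1.0 - av1) * (1.0 - av2)


# Toy data (hypothetical values, for illustration only).
a = {"e1": 0.9, "e2": 0.9, "e3": 0.8}
a_up = {"e1": 0.99, "e2": 0.99, "e3": 0.96}
before = pair_availability(["e1", "e2"], ["e3"], a, a_up, {})
after = pair_availability(["e1", "e2"], ["e3"], a, a_up, {"e3": 1})
```

In this toy case, upgrading the single edge of the second path raises the pair availability, which is the effect the upgrade variables are meant to capture.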
In this work, it will be considered that links follow the shortest path over a sphere that represents Earth.
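Under this spherical assumption, the length of a link between two end nodes is a great-circle distance. A minimal sketch using the haversine formula is given below; the mean Earth radius of 6371 km and the function name are assumptions made for illustration, since the text does not state which radius or formula was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Shortest distance over a sphere between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding slightly above 1 for antipodal points.
    return 2 * radius * math.asin(min(1.0, math.sqrt(h)))
```

Two points on the equator separated by 90 degrees of longitude are a quarter of a great circle apart, which gives a quick sanity check of the function.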
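Then, the geodiversity value of $r \in R_{st}$, represented by $D_r$, is given by:

$$D_r = \min_{e_i \in S_{r1},\ e_j \in S_{r2}} \delta(e_i, e_j) \qquad (2)$$

Note that the geodiversity value of a path pair only depends on the geographical path of each edge. Moreover, for each node pair $(s, t) \in K$, there is a maximum geographical distance, which we represent by $D^{Max}_{st}$, above which a pair of paths is infeasible (these values can be computed in advance using [9]). Therefore, we consider the geodiversity requirement using $D_{st} = \min(D, D^{Max}_{st})$.

The network upgrade problem is, then, defined by the following integer non-linear programming model:

$$\min \sum_{e \in E} c_e\, x_e \qquad (3)$$

Subject to:

$$\sum_{r \in R_{st}} y_{str} \ge 1, \quad (s, t) \in K \qquad (4)$$
$$\Lambda_r \ge \Lambda\, y_{str}, \quad (s, t) \in K,\ r \in R_{st} \qquad (5)$$
$$D_r \ge D_{st}\, y_{str}, \quad (s, t) \in K,\ r \in R_{st} \qquad (6)$$
$$x_e \in \{0, 1\}, \quad e \in E \qquad (7)$$
$$y_{str} \in \{0, 1\}, \quad (s, t) \in K,\ r \in R_{st} \qquad (8)$$

The objective function (3) is the minimization of the total cost of all upgraded edges. Constraints (4) guarantee that each $(s, t) \in K$ is provided with at least one path pair (i.e., the path pairs $r \in R_{st}$ such that the variable $y_{str}$ is set to 1), and each of these path pairs has an availability value not lower than $\Lambda$, guaranteed by constraints (5), and a geodiversity value not lower than $D_{st}$, guaranteed by constraints (6). Finally, constraints (7)-(8) are the variable domain constraints.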
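To make the model concrete, the sketch below solves a tiny instance by enumerating every subset of edges and keeping the cheapest one that gives each node pair a path pair meeting both requirements. It is only an illustration under stated assumptions: the candidate path pairs and their geodiversity values are supposed to be enumerated beforehand (the hypothetical `candidates` input), and exhaustive enumeration is exponential, so it is not the method proposed in this paper.

```python
from itertools import combinations
from math import prod


def pair_availability(pair, a, a_up, upgraded):
    """Availability of a path pair (two edge tuples) given the upgraded edge set."""
    av = [prod(a_up[e] if e in upgraded else a[e] for e in path) for path in pair]
    return 1.0 - (1.0 - av[0]) * (1.0 - av[1])


def solve_by_enumeration(edges, cost, a, a_up, candidates, d_max, D, lam):
    """Return (min_cost, upgraded_set), or (None, None) if no subset is feasible.

    candidates[(s, t)] is a list of (pair, geodiversity) tuples, a hypothetical
    pre-enumeration of the path pairs of each node pair with their geodiversity.
    """
    best_cost, best_set = None, None
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            upgraded = set(subset)
            # Each node pair needs one path pair with geodiversity >= min(D, Dmax)
            # and availability >= lam under this upgrade choice.
            feasible = all(
                any(
                    geo >= min(D, d_max[st])
                    and pair_availability(pair, a, a_up, upgraded) >= lam
                    for pair, geo in pairs
                )
                for st, pairs in candidates.items()
            )
            total = sum(cost[e] for e in upgraded)
            if feasible and (best_cost is None or total < best_cost):
                best_cost, best_set = total, upgraded
    return best_cost, best_set
```

The soft geodiversity requirement appears as `min(D, d_max[st])`: when the topology cannot provide `D` for a node pair, its own maximum value is used instead.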
Note that, in constraints (5), $\Lambda_r$ is given by expression (1). These constraints relate variables $x_e$ to variables $y_{str}$ in a non-linear way, which turns the proposed formulation into an integer non-linear programming model. In this case, standard solution techniques are not valid and appropriate exact methods (i.e., methods able to compute the optimal solutions) must be investigated.

III. SOLVING ALGORITHM
The solving algorithm proposed here, named Minimum Upgrade Cost with Availability and Geodiversity (MUCAG), uses an iterative approach based on a greedy strategy. Starting with the network configuration without any upgraded edge, the algorithm iteratively selects one edge to be upgraded until the resulting network configuration provides at least one pair of paths for each node pair $(s, t) \in K$ with a minimum availability of $\Lambda$ and a minimum geodiversity of $D_{st}$.
We designate a pair of paths with a minimum geodiversity $D_{st}$ as a pair of geodiverse paths. As already explained, a pair of geodiverse paths is always feasible for all $(s, t) \in K$, as $D_{st} = \min(D, D^{Max}_{st})$. Nevertheless, the existence of a pair of geodiverse paths with a minimum availability $\Lambda$ depends on the set of upgraded edges. So, a core task of MUCAG is, for a given network configuration and a given $(s, t) \in K$, to compute a pair of geodiverse paths $r$ whose availability $\Lambda_r$ is at least $\Lambda$. This task is implemented through an algorithm, named Guaranteed Available Pair of Geodiverse Paths (GAPGP), which is an adaptation of the algorithm in [11] for calculating the most reliable pair of link disjoint paths. For reasons that will be explained in the detailed description of MUCAG, algorithm GAPGP also computes the pair of geodiverse paths with the highest availability value if such value is lower than the required $\Lambda$.
In the following three subsections, we describe separately first the GAPGP algorithm, then, the MUCAG algorithm and, finally, the different greedy strategies tested in practice.

A. GAPGP algorithm description
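GAPGP is specified in Algorithm 1. For a given node pair $(s, t) \in K$, minimum availability value $\Lambda$, minimum geodiversity value $D_{st}$ and network configuration (defined by the values assigned to the binary variables $x_e$), GAPGP computes a pair of geodiverse paths $r \in R_{st}$ with an availability value $\Lambda_r$ which is either not lower than $\Lambda$, if such a path pair exists, or maximal, if $\max_{r \in R_{st}} \Lambda_r < \Lambda$.
GAPGP starts by computing an auxiliary edge cost $c_e$ for all $e \in E$ (lines 1-3), i.e., a routing cost that is distinct from the upgrade cost of Section II, such that enumerating the $k$ shortest paths using these costs corresponds to enumerating the $k$ paths with the highest availability. Taking each cost as the negative logarithm of the edge availability in the current network configuration achieves this, since the availability of a path $p$ then equals $e^{-c(p)}$, where $c(p)$ is the sum of the costs of its edges.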
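The cost transformation can be sketched as follows: with costs equal to the negative logarithm of the edge availabilities, a standard shortest-path search returns the most available path. This is a minimal single-path sketch using Dijkstra's algorithm, not the k-shortest-path enumeration used by GAPGP; the adjacency format and the function name are illustrative assumptions.

```python
import heapq
import math


def most_available_path(adj, avail, s, t):
    """Return (edge list, availability) of the most available s-t path.

    adj maps a node to a list of (neighbour, edge_id) tuples and avail maps
    an edge_id to its availability in the current configuration.
    """
    cost = {e: -math.log(av) for e, av in avail.items()}  # additive costs
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v, e in adj.get(u, []):
            nd = d + cost[e]
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = (u, e)
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return None, 0.0
    path, node = [], t
    while node != s:  # walk back from t to s along the predecessor links
        node, e = prev[node]
        path.append(e)
    path.reverse()
    return path, math.exp(-dist[t])
```

Because the logarithm turns products into sums, minimizing the total cost is equivalent to maximizing the product of the edge availabilities along the path.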
Then, in the main while cycle (lines 6-27), the algorithm iteratively generates a new first path $p$ with function next-shortest-path (line 7) by non-increasing order of availability value (next-shortest-path corresponds to the iterative use of Yen's [12] k-shortest path algorithm or of the loopless version [13] of the MPS algorithm [14]). For each first path $p$, a second path $q$ is computed by function path-geo-distance (line 16) as the path with the highest availability and a geodiversity of at least $D_{st}$ with $p$ (function path-geo-distance runs a shortest path algorithm in an auxiliary graph given by $G$ without the edges of $p$ and without the edges whose distance to any edge of $p$ is below $D_{st}$). If the second path $q$ exists ($q \neq \emptyset$ in line 17), the availability of path pair $(p, q)$ is evaluated to check if the current best solution $r$ must be updated (lines 18-21).
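The construction of the auxiliary graph used by path-geo-distance can be sketched as a simple edge filter. The sketch assumes that the edge-to-edge distances are available through a callable `delta` (a hypothetical interface); a shortest-path search on the surviving edges would then give the second path.

```python
def auxiliary_edges(edges, first_path, delta, d_st):
    """Edges still usable by the second path for a given first path.

    An edge is kept if it does not belong to the first path and if its
    distance to every edge of the first path is at least d_st.
    """
    p = set(first_path)
    return [
        e for e in edges
        if e not in p and all(delta(e, ep) >= d_st for ep in p)
    ]


# Toy distances in km (hypothetical values), stored symmetrically.
_dist = {
    frozenset(("a", "b")): 10.0,
    frozenset(("a", "c")): 50.0,
    frozenset(("a", "d")): 100.0,
}


def toy_delta(e1, e2):
    return _dist[frozenset((e1, e2))]


kept = auxiliary_edges(["a", "b", "c", "d"], ["a"], toy_delta, 50.0)
```

Here edge `b` is discarded because it lies closer than 50 km to the first path, while `c` and `d` remain available.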
The algorithm stops (line 6) either when the availability of the current best path pair $r$ is at least $\Lambda$ or when variable opt becomes true (which means that the path pair with the highest availability has been reached). Variable opt becomes true in one of two cases. The first case (line 25) is when function next-shortest-path (line 7) returns no path (i.e., $p = \emptyset$ in line 8), which means that all possible paths have already been enumerated. The second case (line 12) is when the availability of the current best path pair cannot be further improved (line 11) (condition also used in [11]).
To understand the condition of line 11, let $Av(p)$ represent the availability of a path $p$, i.e., $Av(p) = e^{-c(p)}$, where $c(p) = \sum_{e \in p} c_e$. Consequently, the availability of a path pair $(p, q)$ is $Av(p) + (1 - Av(p))Av(q)$. Let the current best path pair be $r = (p_w, q_w)$ with availability $\Lambda_r$, where $w$ is the order of generation of the first element of the pair (path $p$ obtained in line 7) and $q_w$ is the second path (path $q$ obtained in line 16). Finally, let the new first path generated by next-shortest-path be $p_i$, with $i > w$, and let $g = Av(p_i)$. If $g(2 - g) \le \Lambda_r$, which is the condition of line 11, then no path pair generated from this point onwards can have an availability higher than $\Lambda_r$.
The verification of this statement is straightforward. Following [11], notice first that $Av(p_i) \le Av(p_w)$. Let $q_i$ be the path with the highest availability which guarantees a geodiversity of $D_{st}$ with $p_i$. If $Av(q_i) > Av(p_i)$, this path pair would have already been obtained when $p = q_i$. On the other hand, if $Av(q_i) \le Av(p_i)$, this path pair has an availability which is at most $Av(p_i) + (1 - Av(p_i))Av(p_i) = g(2 - g)$ and, by the above condition, not higher than $\Lambda_r$. So, any path pair obtained from this point onwards has an availability not better than $\Lambda_r$.
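The stopping test can be checked numerically. In the sketch below, `g` is assumed to denote the availability of the newly generated first path, and the bound $g(2 - g)$ is the availability the pair would reach if the second path were exactly as available as the first; the function names are illustrative.

```python
def pair_availability(av_p, av_q):
    """Availability of a pair of paths that fail independently."""
    return av_p + (1.0 - av_p) * av_q


def cannot_improve(g, best_pair_availability):
    """True when no pair built on a first path of availability g can beat the best."""
    return g * (2.0 - g) <= best_pair_availability


# For any second path no more available than the first, the pair
# availability never exceeds g * (2 - g).
g = 0.9
bound = g * (2.0 - g)
samples = [pair_availability(g, q / 100.0) for q in range(0, 91)]
```

With `g = 0.9`, the bound is 0.99, so a current best pair of availability 0.995 already allows the enumeration to stop, whereas a first path of availability 0.95 (bound 0.9975) would still have to be explored.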
Algorithm 1 GAPGP (only fragments of the listing are recoverable)
11: if $g(2 - g) \le \Lambda_r$ then
27: end while

B. MUCAG algorithm description
MUCAG is specified in Algorithm 2. Starting with no upgraded edge (lines 1-3), the algorithm runs a main while cycle (lines 4-19) until set $K$ becomes empty.
In the first step of each cycle (line 5), the auxiliary sets $K'$ and $R$ are initialized as empty (note that, at the end of each cycle, $R$ includes all the best pairs of geodiverse paths which still do not attain $\Lambda$). Then, for each $(s, t) \in K$ (lines 6-12), algorithm GAPGP (line 7) computes a pair of geodiverse paths $r$ with an availability $\Lambda_r$ (recall the description in the previous subsection). If $\Lambda_r$ is below the required value $\Lambda$ (line 8), node pair $(s, t)$ is added to $K'$ (line 9) and path pair $r$ is added to $R$ (line 10). Note that when $\Lambda_r < \Lambda$, $r$ is the path pair with the highest availability of the current network configuration for node pair $(s, t)$. So, the edges involved in such path pairs are likely to be the most promising ones to be upgraded to reach the required network configuration in subsequent cycles.
Then, $K$ is set to $K'$ (line 13), i.e., the new set $K$ has only the node pairs $(s, t)$ for which there is still no pair of geodiverse paths compliant with the availability requirement. Finally, if set $K$ is not empty (lines 14-18): (i) function selectEdge (line 15) selects one edge $e$ (among the non-upgraded ones belonging to path pairs in $R$) and calculates the set $K''$ of node pairs whose path pairs reach an availability of at least $\Lambda$ with the selected edge (this set can be empty), (ii) the selected edge $e$ is upgraded (line 16) and (iii) $K''$ is removed from $K$ (line 17).
Note that if all the edges of the path pairs in set $R$ (see line 15) have already been upgraded, the algorithm ends without achieving the desired end-to-end availability for the node pairs currently in set $K$. For simplicity, this situation is not considered in the algorithm description.
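The main cycle described above can be sketched in Python as follows. This is a skeleton under stated assumptions: `gapgp` and `select_edge` are hypothetical callables standing in for the GAPGP algorithm and for the edge selection rule, and, unlike the simplified description, the sketch stops explicitly when no edge is left to upgrade.

```python
def mucag(node_pairs, edges, lam, gapgp, select_edge):
    """Greedy upgrade loop; returns (x, unsatisfied node pairs).

    gapgp(st, x) must return (path_pair, availability) for node pair st.
    select_edge(R, K, x) must return (edge, node pairs solved by that edge).
    """
    x = {e: 0 for e in edges}          # no edge upgraded at the start
    K = set(node_pairs)
    while K:
        K_aux, R = set(), {}           # node pairs and path pairs still below lam
        for st in K:
            r, av = gapgp(st, x)
            if av < lam:
                K_aux.add(st)
                R[st] = r
        K = K_aux
        if K:
            remaining = {e for r in R.values() for path in r for e in path if not x[e]}
            if not remaining:          # nothing left to upgrade: give up
                break
            e, solved = select_edge(R, K, x)
            x[e] = 1
            K -= solved
    return x, K
```

Node pairs removed through `solved` are not re-evaluated in the next cycle, which saves calls to `gapgp`; the remaining ones are checked again because the upgraded edge may have changed their best path pair.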

C. Variants for the greedy algorithm
In line 15 of Algorithm 2, the edge to be upgraded next may be selected according to different greedy strategies. Consider first that $E(R)$ represents the set of edges, among the ones still not upgraded, that belong to at least one of the path pairs in $R$. The edge to be upgraded next is one edge of $E(R)$. The different strategies described next use one (or a combination) of the following subsets of $E(R)$.
$E_M(R)$ is defined as follows: first, a counter is associated with each edge in $E(R)$, holding the number of times the edge appears in the path pairs of $R$; then, $E_M(R)$ is composed of the edges with the maximum counter value (i.e., the edges whose upgrade improves the availability of more path pairs).
$E_O(R)$ is defined as follows: first, a counter is associated with each edge in $E(R)$, holding the number of path pairs of $R$ whose availability becomes at least $\Lambda$ if the edge is upgraded; then, $E_O(R)$ is composed of the edges with the maximum counter value, or is empty if the maximum counter value is 0 (i.e., when not empty, it contains the edges whose upgrade results in the highest number of path pairs fulfilling the required availability).

Algorithm 2 MUCAG
Require: $G$, $K$, $\Lambda$, $(a_e, a^u_e, c_e) : \forall e \in E$, $D_{st} : \forall (s, t) \in K$
Ensure: $x_e : \forall e \in E$
 1: for all $e \in E$ do
 2:     $x_e \leftarrow 0$
 3: end for
 4: while $K \neq \emptyset$ do
 5:     $K' \leftarrow \emptyset$, $R \leftarrow \emptyset$
 6:     for all $(s, t) \in K$ do
 7:         $(r, \Lambda_r) \leftarrow$ GAPGP$(s, t, D_{st}, \Lambda, x_e : e \in E)$
 8:         if $\Lambda_r < \Lambda$ then
 9:             $K' \leftarrow K' \cup \{(s, t)\}$
10:             $R \leftarrow R \cup \{r\}$
11:         end if
12:     end for
13:     $K \leftarrow K'$
14:     if $|K| > 0$ then
15:         $(e, K'') \leftarrow$ selectEdge$(R, K, x_e : e \in E)$
16:         $x_e \leftarrow 1$
17:         $K \leftarrow K \setminus K''$
18:     end if
19: end while
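The two counters behind $E_M(R)$ and $E_O(R)$ can be sketched as follows. The sketch assumes that $R$ is given as a list of path pairs, each a tuple of two edge tuples, and that edges fail independently; the function name and the data layout are illustrative.

```python
from collections import Counter
from math import prod


def candidate_sets(R, a, a_up, x, lam):
    """Return (E, E_M, E_O) for the path pairs in R that are still below lam."""

    def pair_av(r, extra=None):
        # Availability of pair r if edge `extra` were upgraded as well.
        def av(e):
            return a_up[e] if (x.get(e, 0) or e == extra) else a[e]
        p1, p2 = (prod(av(e) for e in path) for path in r)
        return 1.0 - (1.0 - p1) * (1.0 - p2)

    # Occurrences of each non-upgraded edge in the path pairs of R.
    count = Counter(e for r in R for path in r for e in path if not x.get(e, 0))
    e_all = set(count)
    max_count = max(count.values(), default=0)
    e_m = {e for e in e_all if count[e] == max_count}

    # Number of path pairs that reach lam if the edge alone is upgraded.
    on = {e: sum(pair_av(r, e) >= lam for r in R if e in r[0] or e in r[1])
          for e in e_all}
    max_on = max(on.values(), default=0)
    e_o = {e for e in e_all if on[e] == max_on} if max_on > 0 else set()
    return e_all, e_m, e_o
```

When no single upgrade brings any path pair up to the target, the second set is empty, which matches the definition given in the text.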
Different edge selection rules give rise to the five variants of the greedy algorithm evaluated in Section IV: Min-Cost, Min-Cost-Max-Count, Min-Cost-Max-On, Max-Count-Max-On and Max-On-Max-Count.

IV. COMPUTATIONAL RESULTS
The computational results presented in this section consider two network topologies (also used in [9]) representative of typical telecommunication transport networks: the Germany50 topology (geographical location of nodes available at [15]) and the CORONET CONUS topology (geographical location of nodes available at http://www.monarchna.com/topology.html), referred to as Coronet in this section. Since there is no available information on the geographical path of each link, we have considered that links follow the shortest path over the terrestrial surface assuming that Earth is a sphere, as already mentioned in section II. For both topologies, the information concerning edge lengths and geographical distance between edge-edge pairs and node-edge pairs is available at http://www.av.it.pt/asou/geodiverse.htm.
In both topologies, we have considered the worst case of set $K$ composed of all node pairs. Then, for each $(s, t) \in K$, the maximum geodiversity value $D^{Max}_{st}$ was obtained by solving the Maximum Distance D of Geodiverse Paths (MDDGP) optimization problem proposed in [9]. Table I shows the topology characteristics of both networks.
Concerning the edge availability values, we have derived the current availability $a_e$ of each edge $e \in E$ from its length, following [16], using the mean time between failures $MTBF$ and the mean time to repair $MTTR$, both in hours, and the cable cut metric $CC$ in km (cable lengths are also in km). As in [4], we consider $MTTR = 24$ and $CC = 450$. Moreover, we have assumed an upgraded availability $a^u_e$ for each edge $e \in E$ equivalent to the addition of a parallel edge of the same length, i.e., $a^u_e = a_e(2 - a_e)$. Concerning the upgrade cost values, note first that, in the general case, the upgrade cost $c_e$ of each edge $e \in E$ should be composed of a fixed cost and a cost per unit length of the edge. Since the expression to calculate the actual cost is uncertain (it depends on many different factors), here we analyze the results of the different greedy strategies for the two extreme cases: (1) the upgrade cost is the number of upgraded edges (equivalent to considering a cost of 1 per upgraded edge) and (2) the upgrade cost of each edge is given by its length (equivalent to considering a cost of 1 per length unit of each edge). In the following two subsections, we present and discuss separately the computational results obtained for each network topology.
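The upgrade model and the two extreme cost cases can be written down directly. The sketch below only encodes what is stated above: a parallel edge of the same availability, and a cost of either 1 per edge or 1 per length unit; the function names and the `model` labels are illustrative.

```python
def upgraded_availability(a):
    """Availability of two parallel edges of availability a: 1 - (1 - a)**2."""
    return a * (2.0 - a)


def upgrade_cost(length_km, model):
    """Upgrade cost of one edge under the two extreme cost cases."""
    if model == "count":     # case (1): cost of 1 per upgraded edge
        return 1.0
    if model == "length":    # case (2): cost of 1 per length unit
        return float(length_km)
    raise ValueError(f"unknown cost model: {model}")
```

The identity $a(2 - a) = 1 - (1 - a)^2$ makes explicit that the upgraded edge fails only when both parallel edges fail.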

A. Results for the Germany50 network
Due to its geographical coverage and the considered edge availability values, Germany50 already provides four nines (0.9999) availability between all node pairs, even when a geodiversity value of $D^{Max}_{st}$ is required for all node pairs $(s, t) \in K$. So, we have considered a minimum availability $\Lambda = 0.99999$. Table II presents the number of upgraded edges, the upgrade cost (length based) and the CPU time for the geodiversity values $D$ of 40 km, 80 km, 120 km and 160 km (solutions providing the minimum number of upgraded edges and/or the minimum upgrade cost are highlighted in bold). Table II shows that, in terms of number of upgraded edges, Min-Cost-Max-On has the best results, on average, but both Max-On-Max-Count and Max-Count-Max-On are only slightly worse. On the other hand, in terms of upgrade cost, Max-On-Max-Count is clearly the best strategy. So, the main conclusion is that Max-On-Max-Count is the best compromise strategy for Germany50, as it finds solutions with the lowest costs and a number of upgraded edges at most 16% above the minimum values found by any strategy.
Moreover, the simplest Min-Cost strategy is, by far, the worst strategy both in terms of number of upgraded edges and upgrade cost, showing that considering the number of path pairs whose availability is improved by the selected edge (as exploited by the other strategies) leads to more efficient algorithms. The Min-Cost strategy was considered because it is strongly associated with the objective function in equation (3). As expected, it resulted in a large number of short edges being selected for upgrade.
Finally, two expected observations are that, overall: (1) higher geodiversity values $D$ impose upgrade solutions with both a higher number of upgraded edges and a higher cost (with only a few exceptions); (2) the CPU times are also higher, since the number of iterations of MUCAG (see Algorithm 2) grows with the number of upgraded edges. Fig. 2 presents the solutions found by the best compromise Max-On-Max-Count strategy for the four considered values of $D$. Besides illustrating that higher values of $D$ impose a higher number of required upgraded edges, it also shows that the set of upgraded edges required by a given $D$ is not a subset of the set of upgraded edges required by a geodiversity value higher than $D$. So, in practice, the right choice of parameter $D$ is of paramount importance since, if the required value $D$ later becomes larger, the previously upgraded edges might not be the best choices.
Fig. 3 presents the solutions found by the other four strategies for $D = 80$ km (the solution of Max-On-Max-Count is in Fig. 2b). The Min-Cost solution (Fig. 3a) illustrates the inefficiency of this strategy since it gives preference to many shorter upgraded edges, some of them with a minor [...] Table II. First, strategies that consider the edges in $E_M(R)$ (i.e., the most frequent edges in the path pairs of $R$), namely, Max-On-Max-Count (Fig. 2b) and Min-Cost-Max-Count (Fig. 3b), find solutions with lower upgrade costs. Second, strategies that consider the edges in $E_O(R)$ (i.e., edges whose upgrade makes the availability of more path pairs of $R$ become at least $\Lambda$), namely, Min-Cost-Max-On (Fig. 3c) and Max-Count-Max-On (Fig. 3d), find solutions with a lower number of short edges, leading to solutions with a lower number of upgraded edges.

B. Results for the Coronet network
The Coronet network has a much wider geographical coverage and cannot provide four nines (0.9999) availability to many node pairs without upgraded edges. So, in this case, we have considered two minimum availability values.
Table III (for $\Lambda = 0.9999$) and Table IV (for $\Lambda = 0.99999$) present the number of upgraded edges, the upgrade cost and the CPU time for the geodiversity values $D$ of 100 km, 200 km, 400 km and 600 km (as before, solutions providing the minimum number of upgraded edges and/or the minimum upgrade cost are highlighted in bold). Table III shows that, for $\Lambda = 0.9999$, the Max-On-Max-Count and Max-Count-Max-On strategies are the best compromise between the number of upgraded edges and the upgrade cost. Table IV shows that, for $\Lambda = 0.99999$, with the exception of Min-Cost, which is much worse, none of the other strategies represents a better compromise between the number of upgraded edges and the upgrade cost.
The results of both tables reinforce the observations already made for Germany50, namely, (1) Min-Cost is always the worst strategy, (2) the Min-Cost-Max-Count and Max-On-Max-Count strategies (that give preference to the most frequent edges in the path pairs of $R$) find solutions with lower upgrade costs and (3) the Min-Cost-Max-On and Max-Count-Max-On strategies (that give preference to edges whose upgrade makes the availability of more path pairs become at least $\Lambda$) find solutions with a lower number of upgraded edges. Fig. 4 presents the solutions found by the five strategies for $\Lambda = 0.9999$ and $D = 200$ km. In this figure, it is possible to check that the Min-Cost strategy selects a much higher number of upgraded edges. Due to the larger dimension of this network (when compared to the Germany50 network), the differences between the lengths of the selected edges are not so clear. However, it is possible to observe that the Min-Cost-Max-On (Fig. 4c) and the Max-Count-Max-On (Fig. 4e) strategies tend to select fewer short edges.

V. CONCLUSIONS AND FURTHER WORK
Telecommunication networks must provide high end-to-end availability and high resilience to large-scale disasters. Path protection improves end-to-end availability but might not be enough to reach the availability required by critical services. Moreover, adding path geodiversity to enhance the disaster resilience of networks makes the provision of high end-to-end availability even more challenging.
Here, we have addressed the problem of selecting a set of edges to be upgraded at a minimum cost ensuring a required level of availability and geodiversity. We have proposed a solving algorithm which uses an iterative approach based on a greedy strategy: starting with the network configuration without upgraded edges, the algorithm selects iteratively one edge to be upgraded until the resulting network configuration fulfils the required availability and geodiversity levels. Different edge selection strategies were proposed and tested on a set of problem instances. The computational tests showed that a simple strategy of edge selection only based on edge cost is very inefficient, while strategies taking into account the improvement impact of the selected edge on the end-to-end availability of node pairs lead to more efficient algorithms.
Regarding future work, an exact resolution approach and/or lower bounds for the cost of the solutions will be pursued for small network instances. This will give some insight into the quality of the approximate solutions and will also allow some tuning of the heuristics.