Tree embeddings for two-edge-connected network design

The group Steiner problem is a classical network design problem where we are given a graph and a collection of groups of vertices, and want to build a min-cost subgraph that connects the root vertex to at least one vertex from each group. What if we wanted to build a subgraph that two-edge-connects the root to each group---that is, for every group g ⊆ V, the subgraph should contain two edge-disjoint paths from the root to some vertex in g? What if we wanted the two edge-disjoint paths to end up at distinct vertices in the group, so that the loss of a single member of the group would not destroy connectivity?
 In this paper, we investigate tree-embedding techniques that can be used to solve these and other 2-edge-connected network design problems. We illustrate the potential of these techniques by giving poly-logarithmic approximation algorithms for two-edge-connected versions of the group Steiner, connected facility location, buy-at-bulk, and the k-MST problems.


Introduction
Edge survivability has long been a desired property in network design, and problems enforcing higher edge-connectivity have been well studied in the literature. We now have very strong approximation results for some of the basic problems, like the edge-survivable (and element-survivable) network design problems [29,19], which have recently been extended to the case of vertex connectivity as well [12]. The techniques that have proved useful for these results are primal-dual algorithms (which were used for the first few results here) and, subsequently, iterative rounding, which gave much stronger results.
However, higher-connectivity versions of several other network design problems still lack good approximations: let us consider the group Steiner tree problem, where given a rooted undirected graph, and subsets of vertices (called groups), the goal is to find a minimum cost subgraph that contains paths from the root to at least one vertex in each group. What if we wanted two edge-disjoint paths to at least one vertex in each group?
* Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA. Supported in part by NSF awards CCF-0448095 and CCF-0729022, and an Alfred P. Sloan Fellowship.
† Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA 15213, USA. Supported in part by NSF award CCF-0728841.
A key difficulty in addressing this problem is that all known solution methods for the singly-connected version first reduce the given problem instance to one where the graph is a tree which approximately preserves pairwise distances; one can then either write an LP relaxation and round it, or use a clever greedy algorithm and dynamic programming, to obtain an approximation. In fact, it has been a long-standing open problem to obtain a logarithmic approximation guarantee in polynomial time that does not use the method of tree approximations. Note that reducing to a tree instance is bad for us when a 2-edge-connected graph is desired, since we have lost the higher connectivity in the very first (but crucial) step.
In earlier work [23] on online survivable network design problems, we observed that when approximating the given graph by a random spanning tree [1], we need not discard the non-tree edges, but can just raise their lengths to match the distance along the tree between their end-points. Hence the random tree-embedding can now be viewed as a random embedding into a backboned graph: one that has a "backbone" spanning tree such that the cost of a non-tree edge is at least that of the tree path between its end vertices. This enables us to write linear programming relaxations as in the singly-connected versions, and moreover, the modified costs on non-tree edges give us the additional structure we can use to achieve 2-connectivity. Using this approach, in Section 3, we give an O(log^3 n log q)-approximation algorithm for instances of the 2-edge-connected group Steiner tree problem with q groups.
We show how similar ideas can be used to solve other 2-edge-connectivity problems. In Section 4, we consider the two-edge-connected version of the connected facility location problem (2-CFL). As in facility location, we are given clients with demands in an undirected network, and must open a set of facilities (paying facility opening costs) and assign the clients to some open facility (paying a connection cost equal to the shortest-path distance between them). But we must also build a two-edge-connected network (the "core") on the open facilities, paying M times the cost of the edges in this 2-connected core. The motivation is one commonly faced by network designers: it is crucial to achieve fault-tolerance in the core of the network. This problem has been studied when the network is a complete graph and the costs satisfy the triangle inequality, but nothing was known for the general graph case even for the simple case M = 1 [45]. We use our general technique to reduce the problem to backboned networks, where we give a constant approximation, hence giving us an O(log n) approximation for general graphs. We also give a poly-logarithmic approximation for the case where the connection costs and the core network costs are unrelated.
In Section 5, we give an O(log^2 n)-approximation algorithm for the 2-edge-connected buy-at-bulk problem with concave scaling costs for buying cables. In this problem, we are given a graph and a set of demand pairs (s_i, t_i) that require 2-edge-connectivity from s_i to t_i. A feasible solution is a collection of two edge-disjoint paths for every (s_i, t_i) pair, and the cost incurred by an edge e in such a solution is c(e) · Φ(l(e)) where c(e) is the length/distance of edge e, l(e) is the load on edge e (the number of demand pairs using e), and Φ(·) is a concave scaling function that models the economies of scale phenomenon. The goal is then to minimize the total cost on all the edges used. This problem was first studied (in the more general 2-vertex-connectivity setting) by Antonakopoulos et al. [2], where they showed an O(log^3 n)-approximation for the (single-sink) buy-at-bulk problem, when there is only one cable type. What we show is that the additional properties of backboned graphs can be leveraged in order to separate the problem into that of buying tree paths and covering them appropriately. This structure enables us to get our O(log^2 n)-approximation algorithm for the 2-edge-connectivity version of (multi-commodity) buy-at-bulk under any concave scaling function.
Finally, in Section 6 we also show how essentially the same techniques can be used to give a poly-logarithmic approximation for the k-2EC problem, which is a generalization of k-MST to higher connectivity. Here, we want to find a minimum-cost subgraph of a given graph G that contains at least k nodes and is 2-edge-connected. The first approximations to this problem were given only recently by Lau et al. [38], and improved by Chekuri and Korula [6] (whose solution also works for the node-connected case). We show that our framework also gives an Õ(log^3 n)-approximation for the k-2EC problem: while our guarantees are quantitatively worse than those in the previous results, our proof shows how simple ideas can be used to obtain results in the same ballpark.
1.1 Related Work
1.1.1 Higher Connectivity problems. There is a huge body of work on higher connectivity problems. A long stream of work has studied the 2-edge-connected spanning subgraph problem: Frederickson and Ja'Ja' [20] gave the first 3-approximation algorithm by augmenting a minimum spanning tree, showing also in the process that the problem of augmenting any spanning tree to make it 2-edge-connected can be approximated within a factor of 2. This was subsequently improved by Khuller and Vishkin [33], who showed a 2-approximation for the general k-edge-connected spanning subgraph problem. Then, primal-dual algorithms [34,49,22] were used to obtain O(log k) approximations for more general k-connectivity problems. Jain [29] gave an iterative rounding based 2-approximation algorithm for the general edge-connectivity survivable network design problem. These techniques have also been employed recently to obtain tight results for network design with degree constraints [38,40,3]. The element-connectivity and {0, 1, 2}-node connectivity versions were solved by Fleischer et al. [18,19]. The generalized vertex connectivity problems are less well understood: [32,9,37,17] give approximations for the k-vertex-connected spanning subgraph problem, while Cheriyan and Vetta [10] consider the general problem on instances with a complete metric. Recent papers [4,7,11,43,44] have shed more light on the subset k-node-connectivity case, and very recently, Chuzhoy and Khanna [12] showed that element connectivity can be used as a black box to give good approximations for the generalized vertex-connectivity version via an elegant sampling idea. On the inapproximability side, Kortsarz et al. [36] give Ω(2^(log^(1−ε) n)) hardness for node-connected SNDP, and [4] give T^ε hardness for node-connecting T pairs.
1.1.2 Group Steiner Tree (GST). An LP rounding algorithm for the group Steiner problem was given by Garg et al. [21]; an alternate greedy algorithm avoiding the LP rounding was given by Chekuri et al. [5]. Similar poly-logarithmic approximations are also known for the covering Steiner problem, a generalization of the group Steiner tree problem where a requirement r_i is given with each group g_i and we require a minimum cost subgraph that (one-)connects at least r_i terminals from each group g_i to the root [35,26]. Note that the covering Steiner problem does not solve the 2-ECGS problem we consider here, since the paths from the root to two nodes from a group may share edges. Poly-logarithmic integrality gaps and hardness results are known for all these group and covering Steiner problems [27,28]. Very recently and independently of our work, Khandekar et al. [31] also considered 2-connected group Steiner problems and gave an O(k log^2 n)-approximation for the setting where vertex-disjoint paths are required to two distinct vertices, when groups have size at most k.
1.1.3 Connected Facility Location (CFL). This problem has been very widely studied in the approximation algorithms literature; here we want facilities to be (singly-)connected together by a Steiner tree. Several constant-factor approximations are known, based on ideas like LP rounding [45,24], reduction to classical facility location [30], primal-dual methods [47], and random sampling of facilities [25,13,14,48,15]. We note that a special case of the 2-connected version we study here (called the "ring-star" problem or "tour-CFL") in which the underlying graph is a complete graph and the edge costs satisfy the triangle inequality was studied in [45]. The observation that an Euler tour can give a TSP tour with cost twice the Steiner tree cost implies that this is essentially equivalent to the 1-connected CFL. This proof breaks down when the graph is not complete.
A different version of the two-connected CFL problem can also be formulated, where we have to pick two edge-disjoint paths to connect each demand to its facility, and also build a two-connected subgraph on the facilities. A constant-factor approximation for this problem can be obtained from previous random sampling techniques; we give the details in the full version. Again, these techniques do not seem to extend to our case: loosely, this new version implies that demands are cheaply two-connected to each other, and hence opening up a subset of them may be a feasible solution; this is certainly not the case for the 2-CFL problem we study.
1.1.4 The two-edge-connected buy-at-bulk problem. Antonakopoulos et al. [2] first studied fault-tolerant versions of the single cable buy-at-bulk problem. They showed a constant approximation for the single-sink case and an O(log^3 n)-approximation for the multi-commodity setting of 2-vertex-connected buy-at-bulk. Subsequently, Chekuri and Korula showed an O(log |T|^b)-approximation algorithm for the single-sink 2-vertex-connected buy-at-bulk problem with b cable types and any set T of demand pairs. In this work, we show an O(log^2 n)-approximation algorithm for the multiple cable multi-commodity problem. However, we should note that the previous approximations hold for the more general setting of 2-vertex-connectivity, while our algorithm solves the 2-edge-connectivity problem.

The k-two-edge-connected subgraph problem. The k-2EC problem was first studied by Lau et al. [38] who claimed an O(log^3 k)-approximation algorithm; this was corrected to an O(log n log k)-approximation [39]. Independently, Chekuri and Korula gave an O(log n log k)-approximation for the 2-node-connectivity version of the problem. At a high level, both these algorithms use the idea of repeatedly finding dense subgraphs, and then pruning the resulting graph to have the right number of terminals. These ideas give better approximation guarantees than ours, but require more machinery; we show how simple ideas can give non-trivial approximations.

Backboned Graphs
In [23] we noted that the standard techniques used for approximating graph metrics by distributions over their subtrees implied that graphs could be well-approximated by random graphs with "nice" structure, which we called backboned graphs. While this is a trivial observation, it opens up the possibility of leveraging the added structure to design LP rounding algorithms, much like tree embeddings have been used. In this section, let us give the basic definitions we will use in the rest of the paper. The following result is a simple consequence of the results of Elkin et al. and Abraham et al. [16,1]. We give the proof in Appendix A for completeness.
Theorem 2.1. Given a network-design problem Π whose objective function is linear in the edge-costs, any β-approximation algorithm for the problem Π on backboned graphs implies a randomized β × O(log n)-approximation algorithm for Π on general graphs.

Backboned Graphs and Tree Embeddings
Given this reduction, for the subsequent sections we will assume that the input graph is a backboned graph, and will use its properties to design our algorithms.
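As an illustration of what a backboned graph looks like, the following minimal sketch raises the cost of every non-tree edge to at least the tree distance between its endpoints, as described in the Introduction. The function names (`tree_path_costs`, `make_backboned`) and the edge-list representation are assumptions made for this example; choosing the random spanning tree itself (via [16,1]) is not shown.

```python
from collections import deque

def tree_path_costs(n, tree_edges):
    """Return dist[u][v] = cost of the unique tree path between u and v."""
    adj = {v: [] for v in range(n)}
    for u, v, c in tree_edges:
        adj[u].append((v, c))
        adj[v].append((u, c))
    dist = []
    for s in range(n):
        d = {s: 0.0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y, c in adj[x]:
                if y not in d:          # a tree has a unique path, so first visit is final
                    d[y] = d[x] + c
                    queue.append(y)
        dist.append(d)
    return dist

def make_backboned(n, tree_edges, non_tree_edges):
    """Raise each non-tree edge cost to at least the tree distance of its endpoints."""
    dist = tree_path_costs(n, tree_edges)
    return [(u, v, max(c, dist[u][v])) for u, v, c in non_tree_edges]

# Backbone: the path 0-1-2-3 with unit costs (a hypothetical toy instance).
tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
extra = [(0, 3, 1.0), (0, 2, 5.0)]
backboned = make_backboned(4, tree, extra)
print(backboned)
```

In this toy instance the edge {0, 3} is raised from cost 1 to cost 3 (the length of its tree path), while {0, 2} already costs more than its tree path and is left unchanged.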

A Covering Lemma on Backboned Graphs
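We begin with some notation. Let G be a backboned graph with base tree T. For any non-tree edge f = {u, v}, let P_T(u, v) denote the base tree path from u to v, and let O_f denote the fundamental cycle with respect to T; i.e. O_f = {f} ∪ P_T(u, v). Because G is a backboned graph, observe that the cost of the cycle O_f is at most 2c(f). We now prove a simple but crucial property of 2-edge-connected subgraphs on backboned graphs.
Lemma (Covering Lemma). Let H be a subgraph of G in which the vertices r and v are 2-edge-connected. Then every edge e on the base tree path P_T(r, v) lies on the fundamental cycle O_f of some non-tree edge f ∈ E(H) \ E(T).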
Proof. Consider an edge e on the base tree path P_T(r, v). Removing the edge e would separate the base tree T into two components, one containing r (which we call C_r) and the other containing v (denoted by C_v). Since r and v are 2-edge-connected in the subgraph H and e is the only tree edge crossing C_r and C_v, there must exist a non-tree edge f = {x, y} ∈ E(H) \ E(T) such that one end vertex of f is in C_r and the other is in C_v; otherwise e would be a cut edge separating r and v in H. But then, since x and y are in different components of T \ e, it follows that e ∈ P_T(x, y) ⊆ O_f. This completes the proof.
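The property can be checked on a small example. The sketch below is only an illustration under assumed representations (a `parent` map for the rooted base tree, edge lists for H); the helper names are hypothetical. It verifies by brute force that r and v are 2-edge-connected in H, and then that every tree edge on P_T(r, v) lies on the tree path of some non-tree edge of H.

```python
def tree_path_edges(parent, u, v):
    """Tree edges (as frozensets) on the path between u and v; parent[root] is None."""
    def ancestors(x):
        chain = [x]
        while parent[chain[-1]] is not None:
            chain.append(parent[chain[-1]])
        return chain
    au, av = ancestors(u), ancestors(v)
    common = set(au) & set(av)
    path = set()
    for chain in (au, av):
        for x in chain:
            if x in common:             # reached the least common ancestor
                break
            path.add(frozenset((x, parent[x])))
    return path

def connected_without(edges, removed, s, t):
    """Is t reachable from s once the edge with index `removed` is deleted?"""
    adj = {}
    for i, (a, b) in enumerate(edges):
        if i != removed:
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return t in seen

def two_edge_connected(edges, s, t):
    """Brute force: no single edge removal separates s from t."""
    return all(connected_without(edges, i, s, t) for i in range(len(edges)))

def covered_tree_edges(parent, non_tree_edges):
    """Union of the tree paths P_T(u, v) over the given non-tree edges {u, v}."""
    covered = set()
    for u, v in non_tree_edges:
        covered |= tree_path_edges(parent, u, v)
    return covered

# Base tree: path 0-1-2-3 rooted at r = 0; H adds non-tree edges {0,2} and {1,3}.
parent = {0: None, 1: 0, 2: 1, 3: 2}
non_tree = [(0, 2), (1, 3)]
H = [(0, 1), (1, 2), (2, 3)] + non_tree
premise = two_edge_connected(H, 0, 3)
conclusion = tree_path_edges(parent, 0, 3) <= covered_tree_edges(parent, non_tree)
print(premise, conclusion)
```

Dropping the non-tree edge {1, 3} makes the tree edge {2, 3} a cut edge, and that same edge is then no longer covered by any fundamental cycle, which matches the contrapositive of the lemma.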

2-Edge-Connected Group Steiner
In this section, we consider the 2-edge-connectivity extension of the group Steiner problem, which we call 2-ECGS, and give an O(log^2 n log q)-approximation algorithm for instances with backboned graphs, where q is the number of groups. Formally, we are given a graph G = (V, E) with edge costs c : E → R, a set of groups G = {g_1, g_2, ..., g_q} where g_i ⊆ V, and a designated root r ∈ V. The objective is to find a minimum cost subgraph H and identify representatives r_i ∈ g_i (for 1 ≤ i ≤ q) such that r_i and r are connected by two edge-disjoint paths in H. (One can consider a variant of the problem where it is sufficient to have two edge-disjoint paths to the group g_i, possibly to different vertices: we consider this in Section 3.2.) At a high level, our techniques for solving the 2-ECGS problem use the underlying base tree in the backboned graph to set up a linear program using ideas from the LP relaxations for group Steiner trees [21] and the tree augmentation problem [8] (where non-tree edges must be added to 2-edge-connect a tree). Our LP identifies terminals that will be fractionally 1-connected from the root along the base tree; the non-tree edges then (fractionally) 2-connect these terminals, which is enforced by tree augmentation constraints. Our algorithm then employs the group Steiner rounding, and follows this up with a second stage of choosing non-tree edges to 2-connect the first stage subtree. The crux of our analysis is to show that the expected cost of the second stage solution is no more than an extra logarithmic factor of the original LP cost, and this argument uses the level structure of the group Steiner LP rounding in a careful way.
3.1 An O(log^2 n log q) Approximation for 2-ECGS on Backboned Graphs. Consider the following linear program (LP_2GS) for a 2-ECGS instance I on a backboned graph G = (V, E) with edge costs c(·) and base tree T. The variable x_e is an indicator variable for whether tree edge e is present in the solution or not, and y_f is an indicator for whether or not the edges of the base cycle O_f are included. Call a set S "valid" iff there exists a group g_i such that g_i ⊆ S and r ∉ S. Let ∂S denote the set of edges crossing the cut (S, V \ S).
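One way to write the program LP_2GS out in full, reconstructed from the variable descriptions above and from how constraints (3.1) and (3.2) are used in the analysis, is sketched below. The exact form is an assumption rather than a verbatim copy of the original display.

```latex
\begin{align*}
\text{minimize}\quad
  & \sum_{e \in E(T)} c(e)\, x_e \;+\; \sum_{f \in E \setminus E(T)} c(O_f)\, y_f \\
\text{subject to}\quad
  & \sum_{e \in \partial S \cap E(T)} x_e \;\ge\; 1
      && \text{for all valid } S && (3.1)\\
  & \sum_{f \,:\, e \in O_f} y_f \;\ge\; x_e
      && \text{for all } e \in E(T) && (3.2)\\
  & x_e,\; y_f \in [0,1]
\end{align*}
```

Constraint (3.1) is the group Steiner cut constraint restricted to tree edges, and (3.2) is a tree augmentation constraint: a tree edge may only be bought to the extent that it is fractionally covered by base cycles.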
Though the above LP has exponentially many constraints, it can be solved near optimally in polynomial time as there is an efficient min-cut based separation oracle to verify feasibility. It is also (almost) a relaxation:
Proof. Let Opt be some optimal solution for the given instance, and let r_i be the representative from group g_i which is 2-edge-connected to the root r. From Observation 2.1, we can construct a cycle-closed subgraph Opt' such that c(Opt') ≤ 2c(Opt) and Opt ⊆ Opt'. Also, since Opt' is cycle-closed, we know (from Observation 2.2) that Opt' contains the base tree path P_T(r, r_i) for all i ∈ [1, q]. Therefore, for any valid cut S, there is a tree edge in Opt' crossing it; this means that all constraints (3.1) would be satisfied by the integer solution corresponding to Opt'.
Furthermore, the Covering Lemma ensures that any edge e on the path P_T(r, r_i) has a "covering cycle" O_f ⊆ Opt' such that e ∈ O_f; this ensures that constraints (3.2) would also be satisfied. As a result, the solution corresponding to Opt' is feasible to LP_2GS. As for the cost, the LP solution is charged c(e) for any tree edge e in Opt' and is charged the cost of the entire cycle O_f corresponding to each non-tree edge in Opt'. Therefore, the value of the objective function for this solution is at most 2c(Opt') ≤ 4c(Opt).

Rounding the LP solution.
We first give the overview of the rounding procedure and then present the details in two stages.
• Firstly, constraints (3.1) ensure that the x_e variables form a feasible solution to the group Steiner LP on the base tree T, and so we round (in Stage 1) the x_e variables using one iteration of the Garg et al. [21] randomized rounding for the group Steiner problem (which we refer to as the GKR algorithm). At the end of Stage 1, we show our partial solution H_1 would 1-connect roughly an Ω(1/log n)-fraction of the groups to the root.
• In Stage 2, we need to pick covering cycles such that each tree edge in the partial solution H_1 is covered by some cycle. To do this, we essentially use algorithms for Set Cover to get a low-cost collection of cycles covering all tree edges picked in the first stage. This ensures that there are no cut-edges in the subgraph H_1, and therefore the resulting subgraph 2-edge-connects all the groups connected to the root in H_1.
• Finally, we repeat these two stages independently O(log^2 n) times and output all the edges bought to get a feasible solution 2-connecting all groups to the root r with very high probability.
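The number of repetitions in the last step follows from a standard amplification calculation, sketched here. The constant c and the target failure probability 1/n are assumptions chosen for illustration; p denotes the per-round probability that a fixed group is 2-edge-connected to the root.

```latex
\begin{align*}
\Pr[\text{group } g_i \text{ fails in all } t \text{ rounds}]
  &\le (1-p)^t \;\le\; e^{-pt},
  \qquad p \ge \frac{c}{\log n},\\
\Pr[\text{some group fails in all } t \text{ rounds}]
  &\le q\, e^{-pt} \;\le\; \frac{1}{n}
  \qquad \text{whenever } t \ge \frac{\log n}{c}\,\ln(qn).
\end{align*}
```

So t = O(log n · (log q + log n)) independent repetitions suffice, which is O(log^2 n) when q is polynomial in n.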
We now present the details of the two stages, as well as the analysis of the algorithm.
Stage 1 - Picking Base Tree Edges
1: solve the linear program LP_2GS; let (x*, y*) denote an optimal solution.
2: round up each fractional x*_e variable to the nearest power of 2. Then, set x*_e := 0 if x*_e ≤ 1/(2n) and scale each non-zero x*_e to 2x*_e.
3: round the x* variables using one round of the GKR rounding scheme.
4: let H_1 denote the set of edges bought by the GKR algorithm.

Stage 2 - Picking Covering Cycles
1: let H_2 := ∅.
2: set up the following set cover instance:
2a: universe: there is an element for each edge e ∈ H_1.
2b: sets: for each non-tree edge f, there is a set S_f of cost c(O_f) containing the edges of H_1 that lie on the cycle O_f.
3: obtain a set cover S whose cost is at most twice the cost of the LP relaxation.
4: for each set S_f ∈ S, add all edges of the cycle O_f to H_2.

It remains to explain how to obtain the 2-approximate set cover in Step 3 (of Stage 2). The LP relaxation (LP_H1) of the set cover problem to cover all edges in H_1 is the following (with a variable y_f ∈ [0, 1] for each non-tree edge f).
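Written out, the set cover relaxation LP_H1 would take the following standard form. This is a reconstruction from the description of the set cover instance (one element per edge of H_1, one set per non-tree edge f with cost c(O_f)), not a verbatim copy of the original display.

```latex
\begin{align*}
\text{minimize}\quad
  & \sum_{f \in E \setminus E(T)} c(O_f)\, y_f \\
\text{subject to}\quad
  & \sum_{f \,:\, e \in O_f} y_f \;\ge\; 1
      && \text{for all } e \in H_1 \\
  & y_f \in [0,1]
      && \text{for all } f \in E \setminus E(T)
\end{align*}
```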
Consider a new instance obtained by replacing each non-tree edge f = {u, v} with lca(u, v) = a by two edges f_l and f_r, the former covering all edges on the path P_T(u, a) and the latter covering edges on P_T(a, v), and making both their costs equal to c(O_f). Setting y_{f_l} and y_{f_r} for the LP relaxation to this new instance equal to y_f in the old instance gives a solution of cost at most twice the value of LP_H1. However, the constraint matrix in the LP for this new instance is a network matrix [46, Section 13.3], which is totally unimodular, and hence all optimal basic solutions are integral; moreover, any such integral solution is a 2-approximate set cover for the original instance.
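The splitting step can be sketched as follows. This is a minimal illustration assuming the base tree is given as a `parent` map rooted at r; the function names and the dictionary layout of the two new covering sets are hypothetical. Solving the resulting totally unimodular LP is not shown.

```python
def lca(parent, u, v):
    """Least common ancestor of u and v in a tree given by parent pointers."""
    seen = set()
    x = u
    while x is not None:                # collect u and all of its ancestors
        seen.add(x)
        x = parent[x]
    x = v
    while x not in seen:                # climb from v until we meet that chain
        x = parent[x]
    return x

def path_to_ancestor(parent, x, a):
    """Tree edges (child, parent) on the path from x up to its ancestor a."""
    edges = []
    while x != a:
        edges.append((x, parent[x]))
        x = parent[x]
    return edges

def split_non_tree_edge(parent, f, cycle_cost):
    """Replace f = (u, v) by f_l and f_r, each an ancestor-descendant path of cost c(O_f)."""
    u, v = f
    a = lca(parent, u, v)
    f_l = {"covers": path_to_ancestor(parent, u, a), "cost": cycle_cost}
    f_r = {"covers": path_to_ancestor(parent, v, a), "cost": cycle_cost}
    return f_l, f_r

# Toy base tree rooted at 0: edges 1-0, 2-1, 3-1, 4-2.
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 2}
f_l, f_r = split_non_tree_edge(parent, (4, 3), cycle_cost=10.0)
print(f_l, f_r)
```

After the split, every set covers a path running from a vertex straight up to one of its ancestors, which is the structure that makes the constraint matrix a network matrix.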
3.1.2 Analysis of the LP Rounding. We first show that the subgraph H_1 ∪ H_2 output by running both stages above 2-connects any fixed group to the root with non-trivial probability.

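Lemma 3.2. (Success Probability) For each group g_i, the subgraph H_1 ∪ H_2 2-edge-connects some vertex of g_i to the root r with probability Ω(1/log n).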
Proof. To show this, we observe the following properties of the subgraphs H 1 and H 2 .
(i) For each group g_i, the probability that a vertex from g_i is connected to the root r in H_1 is Ω(1/log n).
(ii) Every edge of H_1 lies on some cycle O_f that is contained in H_2.
The first part is a direct consequence of one round of the GKR group Steiner rounding algorithm. The second part follows from the way H_2 was obtained: each element/edge e ∈ H_1 has some set S_f ∈ S which covers it, and in Step 4 of Stage 2, we ensure that H_2 contains the entire cycle O_f. Therefore, consider a group g_i which is connected to the root in H_1: no edge of H_1 ∪ H_2 is a cut edge, so this group is 2-edge-connected to the root.
Now we analyze the total expected cost of the subgraph H_1 ∪ H_2. Let LPOpt_e and LPOpt_f denote the tree cost and the non-tree cost of the optimal fractional solution respectively. Let LPOpt = LPOpt_e + LPOpt_f denote the overall cost of the LP relaxation.
Proof (of Lemma 3.4). For any fixed outcome of H_1, the cost of the subgraph H_2 is at most the cost of the set cover solution S in Step 3 of Stage 2 (which, in turn, is at most twice the cost of an optimal LP solution to LP_H1). Therefore, to prove the lemma, it would suffice to exhibit a fractional solution to the linear program LP_H1, whose expected cost is at most O(log n)LPOpt_f, the expectation being over the first stage randomization.
Consider base tree T rooted at r (see Figure 3.1), and for a non-tree edge f = {u, v} let a denote the least common ancestor (lca) of u and v. If a is neither u nor v, then P_T(u, v) is a disjoint union of subpaths P_T(u, a) and P_T(a, v). In this case, let e_1 and e_2 denote the edges furthest from the root r (along the base tree) on P_T(a, u) and P_T(a, v) that are included in H_1 by the rounding in Stage 1. We then set the value y_f in the following way:
On the other hand, if the lca a ∈ {u, v}, then let e denote the edge furthest from the root r (along the base tree) on P_T(u, v) that is included in H_1.
Proof. Consider some edge e ∈ H_1, and let f = {u, v} be any non-tree edge such that e ∈ O_f. Without loss of generality, we assume that the least common ancestor a of u and v is distinct from u and v, and that e ∈ P_T(a, u) (the proof for other cases is similar). Now, recall that when we set the value of y^1_f, we considered the edge e_1 furthest from the root on P_T(a, u) that belonged to H_1, and then defined y^1_f in terms of x*_{e_1}. But since the edge e is contained in H_1 ∩ P_T(a, u), this means e_1 is at least as far from r as e along the base tree T (i.e., e_1 is e itself or a descendant of e on T). Therefore, we have that x*_{e_1} ≤ x*_e (from the structure of the group Steiner LP, edges further from the root have x_e values no larger than those of their ancestors). Consequently,

where the expectation is taken over the randomization in Stage 1.
Proof. In the following, let parent(e) (or parent(v)) denote the parent edge of any given edge or vertex with respect to the base tree, i.e. the edge incident on the given edge or vertex that is closest to the root r. Also, for any tree edge e, say that level(e) = l if x*_e = 2^(−l) (after the scaling in Step 2 of Stage 1).
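For concreteness, the preprocessing of Step 2 of Stage 1 and the resulting level of an edge can be sketched as below. The function names are hypothetical, and capping the scaled value at 1 is an assumption that the text does not state; the order of operations (round up, threshold at 1/(2n), then double) follows the step as written.

```python
import math

def preprocess(x, n):
    """Step 2 of Stage 1 for a single LP value x in [0, 1] on an n-vertex graph."""
    if x <= 0:
        return 0.0
    rounded = 2.0 ** math.ceil(math.log2(x))   # round up to the nearest power of 2
    if rounded <= 1.0 / (2 * n):               # drop negligible values
        return 0.0
    return min(1.0, 2.0 * rounded)             # scale by 2; the cap at 1 is an assumption

def level(x):
    """level(e) = l when the preprocessed value equals 2^(-l)."""
    return int(round(-math.log2(x)))

n = 8
values = [0.3, 0.1, 0.25, 0.05]
processed = [preprocess(x, n) for x in values]
levels = [level(x) for x in processed if x > 0]
print(processed, levels)
```

Because every surviving value is a power of 2 that exceeds 1/(2n), only O(log n) distinct levels can occur, which is what the analysis uses.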
Consider any non-tree edge f = {u, v}, and let a denote the least common ancestor of u and v on the base tree T. We focus on the case where a ∉ {u, v}; the other case when a ∈ {u, v} is similar. Moreover, in order to bound the expected value of y_f, it is sufficient to analyze the expected value of y^1_f; the analysis for y^2_f is symmetric. Let P denote the set of edges on P_T(u, a), pruned as follows: if an edge further from a along T has the same level as edge e, then e is not included in P.
Let Z_e denote the event that an edge e is the edge furthest from r on the path P_T(u, a) that was picked in H_1 by the Stage 1 algorithm. Then, we have
Here, the second equality follows because if level(e_j) = level(e_{j+1}), then whenever e_j is picked by the GKR rounding, e_{j+1} would also be selected (this is a property of the GKR algorithm, and this is why we rounded the x_e values in Step 2 of Stage 1). Therefore, the event Z_{e_j} can never occur (i.e. Pr[Z_e] = 0 for e ∉ P). In the last-but-one inequality, we use Pr[Z_e] ≤ x*_e because the event Z_e is dominated by the event that e is picked by the GKR scheme, which happens with probability x*_e. Finally, the last inequality holds because the edges in P all belong to distinct levels, and there are at most log n levels.
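One way to write out the chain of (in)equalities that this explanation refers to is sketched below. It assumes, without confirmation from the text, that y^1_f was defined as y*_f / x*_{e_1} when e_1 is the furthest picked edge (so that it takes the value y*_f / x*_e under the event Z_e).

```latex
\begin{align*}
\mathbb{E}\big[\,y^1_f\,\big]
  &= \sum_{e \in P_T(u,a)} \Pr[Z_e] \cdot \frac{y^*_f}{x^*_e}
   \;=\; \sum_{e \in P} \Pr[Z_e] \cdot \frac{y^*_f}{x^*_e} \\
  &\le \sum_{e \in P} x^*_e \cdot \frac{y^*_f}{x^*_e}
   \;=\; |P| \, y^*_f
   \;\le\; (\log n)\, y^*_f .
\end{align*}
```

The first equality conditions on which edge of P_T(u, a) is the furthest one picked, and the remaining steps follow the three justifications given above in order.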
We can bound the expected value of y^2_f using a symmetric argument, and therefore, by linearity of expectation, we have E[y_f] ≤ (2 log n) y*_f. Hence, the total expected cost of the fractional solution is O(log n)LPOpt_f.
Now Claims 3.1 and 3.2 show that the expected cost of a fractional solution to LP_H1 is O(log n)LPOpt_f. Since we find an integer solution which is a 2-approximation to the LP cost, the proof of Lemma 3.4 is completed.
Lemmas 3.3 and 3.4 together show that the expected cost of the subgraph H_1 ∪ H_2 is O(log n)LPOpt, and each group is 2-edge-connected to the root with probability Ω(1/log n). Therefore, if we independently repeat this process O(log n log q) times, we get 2-edge-connectivity to the root for all groups with high probability, and the expected cost is O(log^2 n log q)c(Opt). Thus we get the following theorem.
There are two more natural variants of the 2-ECGS problem:
1. For each group g_i, we want two distinct vertices v_i1 and v_i2 in g_i, and edge-disjoint paths P_i1 and P_i2 going from the root r to these two chosen vertices.
2. For each group g_i, we want two edge-disjoint paths P_i1 and P_i2 going from the root r to any two vertices in g_i, which may or may not be the same.
In the following section, we show how we can solve the first variant requiring edge-disjoint paths to two distinct vertices, and explain later how we can reduce the latter variant to this setting.

2-ECGS with Distinct Vertices.
Our algorithm is based on rounding an LP relaxation for this problem. The idea behind our LP formulation is the following structural property of any feasible solution: for any group g_i, suppose the representatives that are connected to the root are v_1 and v_2. Then the tree paths from the root r to v_1 and v_2 share a common prefix P_T(r, v_f) till some vertex v_f, and then fork into two disjoint paths, one to v_1 and one to v_2. Furthermore, it is sufficient to cover the tree edges on the common prefix P_T(r, v_f) to ensure that the group is 2-connected to r. Now, before we explain the LP formulation, we need to slightly alter the problem instance because of some technical issues which we will explain later. Given a backboned instance I, we create a new instance I' in the following manner: I' is the same as I except for the following additional vertices: for each vertex v ∈ V, we add a dummy vertex v' and include an edge {v, v'} of 0 cost. The edge {v, v'} is also included in the base tree of the modified instance. Now for any group g_i = {v_1, v_2, ..., v_t} in the original instance I, the corresponding group in I' comprises the dummy vertices {v'_1, v'_2, ..., v'_t}. It is easy to see that there is a bijection between feasible solutions to I and those to I', and the cost of the optimal solutions to both instances is the same, since the dummy edges added have 0 cost.
The reason we include these dummy vertices is to prevent an LP solution from finding two edge-disjoint paths from a single vertex of a group to the root. Such a solution is not possible in I' because for any vertex v, its corresponding dummy vertex v' cannot be 2-edge-connected to any other vertex in the new instance.
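The instance transformation can be sketched in a few lines. The representation (vertex list, weighted tree-edge list, groups as sets) and the name `add_dummy_vertices` are assumptions made for this example; a dummy copy of v is encoded as the pair (v, "dummy").

```python
def add_dummy_vertices(vertices, tree_edges, groups):
    """Build I' from I: attach a zero-cost pendant dummy to every vertex."""
    dummy = {v: (v, "dummy") for v in vertices}
    new_vertices = list(vertices) + [dummy[v] for v in vertices]
    # Each dummy edge {v, v'} has cost 0 and joins the base tree.
    new_tree_edges = list(tree_edges) + [(v, dummy[v], 0.0) for v in vertices]
    # Every group is replaced by the dummy copies of its members.
    new_groups = [{dummy[v] for v in g} for g in groups]
    return new_vertices, new_tree_edges, new_groups

vertices = ["r", "a", "b"]
tree_edges = [("r", "a", 1.0), ("a", "b", 2.0)]
groups = [{"a", "b"}]
new_vertices, new_tree_edges, new_groups = add_dummy_vertices(vertices, tree_edges, groups)
print(new_groups)
```

Since each dummy vertex has degree one, it can never be the endpoint of two edge-disjoint paths, so reaching a group twice forces the two paths to end at distinct group members.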
We are now ready to explain the LP relaxation (presented in Figure 3.2): x^d_e is an indicator variable for whether tree edge e is present on the un-forked portion of the tree path(s) from the root to some group: from the above discussion, only such edges need to be covered by non-tree cycles. The variable x^s_e indicates whether tree edge e is on a forked branch to some group representative. Finally, as used previously, y_f is the indicator variable for whether or not the base cycle O_f is included, and a set S is called "valid" iff there exists a group g_i such that g_i ⊆ S and r ∉ S. Notice that in any feasible solution, the variable x^d_e is 0 for any dummy edge e, since there are no covering cycles which contain it, and thus constraint 3.8 would be violated otherwise.
Proof. Let Opt be some optimal solution for the given instance, and let v^1_i and v^2_i be the representatives from group g_i that have edge-disjoint paths to the root r. From Observation 2.1, we can construct a cycle-closed subgraph Opt' such that c(Opt') ≤ 2c(Opt) and Opt ⊆ Opt'. Also, since Opt' is cycle-closed, we know (from Observation 2.2) that Opt' contains the base tree paths P_T(r, v^1_i) and P_T(r, v^2_i) for all i ∈ [1, q]. Let us create an LP solution in the following manner: for any tree edge e, set x^d_e = 1 if e lies on the common prefix P_T(r, v^1_i) ∩ P_T(r, v^2_i) for some group g_i, and otherwise set x^s_e = 1 if e lies on P_T(r, v^1_i) ∪ P_T(r, v^2_i) for some group g_i. Set the y_f variables according to whether or not O_f is present in Opt'. Now, consider any valid cut S separating a group g_i from r. If there are 2 tree edges in ∂S ∩ (P_T(r, v^1_i) ∪ P_T(r, v^2_i)), then it is clear that constraint 3.5 is satisfied for this cut S. If only one tree edge e crosses ∂S, then it must be on the common prefix P_T(r, v^1_i) ∩ P_T(r, v^2_i), which means that x^d_e was set to 1, and therefore constraint 3.5 is satisfied for this cut S in this case as well.
Furthermore, the Covering Lemma ensures that any edge e on the path P_T(r, v^1_i) ∩ P_T(r, v^2_i) has a "covering cycle" O_f ⊆ Opt' such that e ∈ O_f; otherwise we would have a cut edge separating group g_i from r. This ensures that constraints (3.8) would also be satisfied. Constraint 3.6 trivially holds in our solution since no tree edge e has both x^d_e and x^s_e set to 1, and constraint 3.7 holds because the common prefixes are all tree (sub-)paths anchored at the root, meaning that the set of edges with x^d_e set to 1 forms a sub-tree rooted at r and the x^d_e values are therefore downward non-increasing. As a result, the LP solution corresponding to Opt' is feasible to LP_2GSd, and incurs a charge c(e) for any tree edge e in Opt' and the cost of the entire cycle O_f corresponding to each non-tree edge in Opt'. Therefore, the value of the objective function for this LP solution is at most 2c(Opt') ≤ 4c(Opt).
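A form of the relaxation LP_2GSd that is consistent with how constraints 3.5 to 3.8 are used in the proof above is sketched below. This is a guess at the program of Figure 3.2 based only on the prose: the coefficients in (3.5) in particular are an assumption, chosen so that either two forked edges or one un-forked edge crossing a valid cut suffice.

```latex
\begin{align*}
\text{minimize}\quad
  & \sum_{e \in E(T)} c(e)\,\big(x^d_e + x^s_e\big)
    \;+\; \sum_{f \in E \setminus E(T)} c(O_f)\, y_f \\
\text{subject to}\quad
  & \sum_{e \in \partial S \cap E(T)} \big(2\,x^d_e + x^s_e\big) \;\ge\; 2
      && \text{for all valid } S && (3.5)\\
  & x^d_e + x^s_e \;\le\; 1
      && \text{for all } e \in E(T) && (3.6)\\
  & x^d_e \;\le\; x^d_{\mathrm{parent}(e)}
      && \text{for all } e \in E(T) && (3.7)\\
  & \sum_{f \,:\, e \in O_f} y_f \;\ge\; x^d_e
      && \text{for all } e \in E(T) && (3.8)\\
  & x^d_e,\; x^s_e,\; y_f \in [0,1]
\end{align*}
```

Under this form, a dummy edge lies on no cycle O_f, so (3.8) forces its x^d value to 0, matching the remark made when the variables were introduced.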
We now present our rounding algorithm (Algorithm 3: 2-ECGS with Distinct Vertices), which is essentially two stages of GKR rounding followed by a set covering phase.
Proof. Consider any cut S (with r ∉ S) separating group g'_i from root r. Create a new cut S' = S ∪ T_i, where T_i is the subtree induced by e_i in T (e_i is as defined in Step 2). Since we include all vertices of T_i in S' and the root r is not contained in T_i ∪ S, the only additional edge (if any at all) in (δS' ∩ E(T)) \ δS is e_i. Also, because the vertices in g_i \ g'_i are all contained in T_i, the cut S' separates the entire group g_i from the root r.
The following lemma is also then a consequence of the GKR scheme.
Proof. Let v_i ∈ g_i be the vertex connecting g_i to the root in H_1 (as chosen in Step 2), and let e_i be the edge closest to r on P_T(r, v_i) which has x^d_e < 1/4. Then, by the way we defined our group g'_i, no vertex in g'_i is contained in the subtree beneath e_i. Therefore, the maximal extent to which the path P_T(r, v'_i) (for any v'_i ∈ g'_i) can overlap with P_T(r, v_i) is until the parent edge e of e_i (which has x^d_e ≥ 1/4 by definition).

Lemma 3.10. The expected cost of H 3 is O(1)LPOpt.
Proof. The set H 3 is formed by solving a Set Cover relaxation for covering edges whose x d e value is at least 1/4. Therefore, the cost of a feasible solution to the LP relaxation of the associated Set Cover problem is O(1)LPOpt. Furthermore, we can make the constraint matrix in the LP totally unimodular, like we did for the 2-ECGS algorithm (Section 3.1.1), which implies that the cost of solution H 3 is O(1)LPOpt.
From Lemma 3.9, we know that for any group, the $x^d_e$ value of any edge $e$ on the common tree path up to the fork is high (at least 1/4). But all such edges are covered in $H_3$ by cycles. Therefore, the subgraph $H_1 \cup H_2 \cup H_3$ is feasible for the given instance, and Lemmas 3.6, 3.8 and 3.10 bound the expected cost.

Theorem 3.2. The above algorithm is a randomized $O(\log n \log q)$-approximation algorithm for 2-ECGS with distinct vertices on backboned graphs, and an $\tilde{O}(\log^2 n \log q)$-approximation algorithm on general graphs.
Finally, Khandekar et al. ([31], Section 1.2) observe that the variant where the two edge-disjoint paths to any group may end at the same vertex or at distinct vertices can be reduced to the setting where we force the paths to end at distinct vertices. Therefore our $\tilde{O}(\log^2 n \log q)$-approximation for the distinct vertices setting carries over to this variant as well.

2-Edge-Connected Facility Location
In the standard connected facility location (CFL) problem we are given a set of clients that we assign to some facilities that we open, and then we connect these opened facilities together by a Steiner tree (which can be thought of as the core of the network). However, the network designer would ideally like the core to be resilient to edge failures, and hence it is desirable to two-edge-connect the facilities together. In this section we give a constant-factor approximation for the 2-edge-connected CFL problem (2-CFL) on backboned networks, and hence an O(log n)-approximation for the problem on general graphs.
We refer to the three terms in the above sum as the facility opening cost, the Steiner cost and the client connection cost [15].

2-CFL on Backboned Graphs.
Let $G$ be a backboned graph with base tree $T$. As a first step towards writing an LP relaxation, we guess a facility which an optimal solution Opt opens and call it $r$. Also, Observation 2.1 says that if $H^*$ is the Steiner subgraph Opt builds to 2-edge-connect the facilities, then there is a cycle-closed subgraph $H' \supseteq H^*$ with cost $c(H') \le 2c(H^*)$; hence we seek to build a Steiner subgraph that is cycle-closed. The LP relaxation is then given in Figure 4.3.
The variable $x_e$ is an indicator variable for whether the tree edge $e$ is included in the Steiner subgraph or not, $y_f$ indicates the inclusion of the cycle $O_f$, $z_{uv}$ indicates whether demand $u$ is assigned to the facility at $v$, and $z_v$ corresponds to whether a facility is opened at $v$. Constraints (4.9) and (4.10) are the usual facility location constraints ensuring that clients are (fractionally) connected to some open facility, and (4.11) ensures that the "root" facility $r$ is opened. Constraint (4.12) ensures that open facilities are connected to the root along the base tree: if some client is connected to facilities in $S \subseteq V \setminus \{r\}$, then we need to buy tree edges crossing the cut $\partial S$ (such a tree path exists because we seek cycle-closed Steiner subgraphs). Finally, constraints (4.13) ensure that tree edges bought are "covered" by fundamental cycles; this is a valid constraint because of the Covering Lemma (Lemma 2.1).
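The integral meaning of the covering constraints (4.13) can be illustrated with a small check: every bought tree edge must lie on the fundamental cycle of some bought non-tree edge. The following is a minimal sketch only; the parent-pointer representation of the base tree and the function names `tree_path_edges` and `uncovered_tree_edges` are assumptions made for illustration and do not appear in the paper.

```python
def tree_path_edges(parent, u, v):
    """Edges of the unique base-tree path between u and v.

    `parent` maps each vertex to its parent in the base tree (root maps to None).
    Edges are returned as frozensets so that orientation does not matter.
    """
    def ancestors(x):
        chain = [x]
        while parent[chain[-1]] is not None:
            chain.append(parent[chain[-1]])
        return chain

    up_u, up_v = ancestors(u), ancestors(v)
    on_v = set(up_v)
    lca = next(x for x in up_u if x in on_v)  # lowest common ancestor
    edges = set()
    for chain in (up_u, up_v):
        for x in chain:
            if x == lca:
                break
            edges.add(frozenset((x, parent[x])))
    return edges


def uncovered_tree_edges(parent, bought_tree_edges, bought_nontree_edges):
    """Bought tree edges that lie on no fundamental cycle of a bought non-tree edge."""
    covered = set()
    for a, b in bought_nontree_edges:
        # The fundamental cycle O_f of f = {a, b} is f plus the tree path P_T(a, b).
        covered |= tree_path_edges(parent, a, b)
    return {frozenset(e) for e in bought_tree_edges} - covered


if __name__ == "__main__":
    parent = {"r": None, "a": "r", "b": "a", "c": "r"}
    print(uncovered_tree_edges(parent, [("a", "r"), ("b", "a")], [("a", "c")]))
```

In the usage example, the non-tree edge {a, c} covers the tree edge {a, r} but not {b, a}, so the latter is reported as a cut edge that still needs a covering cycle.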

Lemma 4.1. The cost of an optimal solution LPOpt of the linear program LP 2CFL is at most 4c(Opt), where Opt is a minimum cost solution to the 2-CFL instance I.
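Proof. Let Opt be some optimal solution for the given instance and let demand $u_i \in D$ be connected to facility $v_i$. Let $\mathrm{Opt}'$ denote Opt with its Steiner subgraph replaced by the cycle-closed supergraph given by Observation 2.1. Create an LP solution in the following manner: set $x_e = 1$ for each tree edge $e \in \mathrm{Opt}'$, $y_f = 1$ for each base cycle $O_f \subseteq \mathrm{Opt}'$, $z_{u_i v_i} = 1$ for each demand $u_i$, and $z_v = 1$ for each facility $v$ opened by Opt. Constraints (4.9) to (4.11) then hold by construction, and (4.12) holds because $\mathrm{Opt}'$ is cycle-closed and hence contains the tree path $P_T(r, v_i)$ (Observation 2.2). Finally, the Covering Lemma ensures that any edge $e$ on the path $P_T(r, v_i)$ has a "covering cycle" $O_f \subseteq \mathrm{Opt}'$ such that $e \in O_f$; otherwise $e$ would be a cut edge separating $v_i$ from $r$. This ensures that constraints (4.13) are also satisfied. As a result, the solution corresponding to $\mathrm{Opt}'$ is feasible for $LP_{2CFL}$. As for the cost, the LP solution is charged $c(e)$ for any tree edge $e$ in $\mathrm{Opt}'$ and is charged the cost of the entire cycle $O_f$ corresponding to each non-tree edge $f$ in $\mathrm{Opt}'$. The facility opening and connection costs are identical to those incurred by Opt. Therefore, the value of the objective function for this solution is at most $2c(\mathrm{Opt}') \le 4c(\mathrm{Opt})$.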

Rounding the LP Solution.
The LP rounding algorithm works in four stages. We first filter the solution to make sure that clients are not fractionally connected to any distant facility. Then we identify disjoint balls that are within reasonable distance of all the clients. In the third stage, we temporarily open a (possibly expensive) facility in each such ball and 2-edge-connect it to the root. Finally, in the fourth stage, we identify cheap facilities in each ball and 2-edge-connect them to the nearby temporary facilities. Here are the details.
Stage I. Filtering: Let $(x^*, y^*, z^*)$ denote an optimal LP solution. Filter on the client connection costs [41] as follows: for $u \in D$, let $C^*_u := \sum_{v \in V} c(u, v) z^*_{uv}$. Set $z^*_{uv} = 0$ if $c(u, v) > 2C^*_u$, and "double" the resulting solution $(x^*, y^*, z^*)$. That is, set $x^*_e = \min(2x^*_e, 1)$, $y^*_f = \min(2y^*_f, 1)$, $z^*_{uv} = \min(2z^*_{uv}, 1)$, and $z^*_v = \min(2z^*_v, 1)$. As usual, this ensures that any client $u$ is fractionally connected only to facilities within distance $2C^*_u$ of it.
In the next stage, we will temporarily open some facilities in each ball $B_u$ and 2-edge-connect them to the root. However, these facilities may be very expensive compared to what the LP has fractionally opened. We resolve this issue in the final step by actually opening cheap facilities from each ball and 2-edge-connecting them to the temporary facilities. The transitivity of edge-connectivity ensures that the cheap facilities are 2-edge-connected to the root. The crux of the argument is in showing that these two steps can be successfully done without blowing up the cost. In the above step, we crucially use the fact that for any ball $B_u$, the vertex $\mathrm{lca}(B_u)$ is also contained in $B_u$.
To see why this is true, consider any vertex $x \in B_u$. Since all shortest path distances are along the base tree in a backboned graph, all vertices on the path $P_T(u, x)$ are also contained in $B_u$. Thus for any pair of vertices $x, y \in B_u$, all the vertices on the tree path $P_T(x, y) \subseteq P_T(u, x) \cup P_T(u, y)$ (and in particular their lca) belong to $B_u$.
Proof. From the definition of $C^*_u$, it must be that $\sum_{v \in V : c(u,v) \le 2C^*_u} z^*_{uv} \ge 1/2$. Therefore, setting $z^*_{uv}$ to 0 when $c(u, v) > 2C^*_u$ and scaling the solution by a factor of 2 yields a solution that is feasible for the LP and incurs a cost of at most $2\,\mathrm{LPOpt}$. In fact, the fractional client connection cost, Steiner cost, and opening cost are at most $2C^*$, $2(E^* + F^*)$, and $2O^*$ respectively.
Proof. If a client $u$ belongs to $V_D$, then some facility is opened in $B_u$ (in Stages III and IV), which means that the client connection cost for $u$ is at most $2C^*_u$. If $u \notin V_D$, then by the way we constructed $V_D$, there exists $u' \in V_D$ such that $B_u \cap B_{u'} \neq \emptyset$ and $C^*_{u'} \le C^*_u$. Consequently, the client $u$ can connect to the facility opened in $B_{u'}$, and the connection cost would be at most $2C^*_u + 2C^*_{u'} + 2C^*_{u'} \le 6C^*_u$. Therefore, the total client connection cost is at most $\sum_u 6C^*_u = 6C^*$.
Proof. Consider a client $u \in V_D$. In the feasible LP solution $(x^*, y^*, z^*)$ obtained after the Stage I filtering, $u$ is fractionally connected only to facilities in $B_u$. Hence (4.12) ensures that $\mathrm{lca}(B_u)$ can send unit flow to the root along the base tree using the $x_e$ variables. Moreover, constraint (4.13) ensures that each tree edge on the path from $\mathrm{lca}(B_u)$ to $r$ is fractionally covered by fundamental cycles.
Hence, for any $u \in V_D$, $\mathrm{lca}(B_u)$ is fractionally 2-edge-connected to the root $r$, implying that the LP solution $(x^*, y^*)$ (ignoring the facility opening component) is feasible for the problem of 2-edge-connecting the set of vertices $S = \{\mathrm{lca}(B_u) \mid u \in V_D\}$ with the root, and the fractional cost is at most $2(E^* + F^*)$, the extra factor of 2 arising from doubling the variables during filtering. Since the flow-based LP formulation for edge-connectivity SNDP has an integrality gap of 2 ([29]), we can use the approximation algorithm of Jain [29] to build a subgraph $H$ that 2-edge-connects $\{r\} \cup \{\mathrm{lca}(B_u) \mid u \in V_D\}$ and incurs a cost of at most $4(E^* + F^*)$. Furthermore, the cost of the subgraph $H$ at most doubles when we make it cycle-closed.
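The Stage I filtering step used in this argument is mechanical enough to sketch in code. The snippet below is an illustrative sketch under stated assumptions: the dictionary-based representation of the LP solution and the name `filter_and_double` are hypothetical, and the sketch only performs the filtering and doubling, not the later ball construction.

```python
def filter_and_double(z_assign, z_open, x, y, dist):
    """Stage I filtering on a fractional solution.

    z_assign: {(client, facility): z_uv}, dist: {(client, facility): c(u, v)},
    z_open: {facility: z_v}, x: {tree_edge: x_e}, y: {nontree_edge: y_f}.
    Returns the fractional connection costs C*_u and the filtered, doubled solution.
    """
    # C*_u := sum over v of c(u, v) * z_uv
    frac_cost = {}
    for (u, v), val in z_assign.items():
        frac_cost[u] = frac_cost.get(u, 0.0) + dist[(u, v)] * val

    # Drop assignments to facilities farther than 2 * C*_u from u.
    filtered = {
        (u, v): (val if dist[(u, v)] <= 2 * frac_cost[u] else 0.0)
        for (u, v), val in z_assign.items()
    }

    def doubled(values):
        # "Double" the solution, capping every variable at 1.
        return {key: min(2 * val, 1.0) for key, val in values.items()}

    return frac_cost, doubled(filtered), doubled(z_open), doubled(x), doubled(y)


if __name__ == "__main__":
    z = {("u", "a"): 0.8, ("u", "b"): 0.2}
    d = {("u", "a"): 1.0, ("u", "b"): 10.0}
    print(filter_and_double(z, {"a": 0.8, "b": 0.2}, {"e": 0.3}, {"f": 0.6}, d))
```

In the usage example, client `u` has fractional connection cost 2.8, so the facility at distance 10 is dropped; by Markov's inequality at least half of the assignment mass survives the filter, which is why doubling restores a full unit of assignment.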
This brings us to the interesting part of the proof: showing that we can open cheap facilities in each ball and 2-connect them to previously opened facilities. For any ball $B_u$, the LP solution $(x^*, y^*, z^*)$ is fractionally feasible for the problem of opening a facility in $B_u$ and 2-edge-connecting it to $\mathrm{lca}(B_u)$. Indeed, $u$ is fractionally connected to facilities in $B_u$ that can send unit flow (along tree edges) to $\mathrm{lca}(B_u)$, and the tree edges are fractionally covered to an equal extent by the fundamental cycles. Hence we can send 2 units of flow from the fractionally opened facilities in $B_u$ to their least common ancestor; if we chose one of these facilities (say, at random), the cost incurred to 2-connect it to the lca would be at most $O(\mathrm{LPOpt})$. But we cannot do this analysis independently for all the balls, since that may cost LPOpt for each ball.
To resolve this problem, we now show that the LP solution can be decomposed into disjoint parts corresponding to the balls {B u , u ∈ V D }. Consider a ball B u for u ∈ V D , and consider the LP relaxation (given in Figure 4.4) for the problem P u of opening a facility in B u and 2-edge-connecting it to lca(B u ) in the cheapest possible way.
Say that a variable in the solution $(x^*, y^*, z^*)$ is critical for $B_u$ if setting it to 0 would make the resulting solution infeasible for $LP_u$. The fractional solution $(x^*_u, y^*_u, z^*_u)$ is then formed by taking all fractional variables critical for $B_u$ (and setting all other variables to 0). Clearly, from the definition of criticality, $(x^*_u, y^*_u, z^*_u)$ is feasible for $LP_u$; we now show that these solutions are (nearly) disjoint. In the following, if a variable $x_e$ or $y_f$ is critical for $B_u$, we will say that the edge $e$ or $f$ is critical for $B_u$.
… $(a_f, x)$. We now claim that the edge $f$ is critical for $e \in B_u$ only if there is no other ball $B_{u_1}$ closer to $x$ than $B_u$ such that $f$ is also critical for $B_{u_1}$. Indeed, suppose there were such a ball, as in the figure. The fact that $f$ is critical for $B_{u_1}$ means that there is an edge $e' \in P_T(a_f, x)$ that is contained in $B_{u_1}$. Hence, $a_1 = \mathrm{lca}(B_{u_1})$ is an ancestor of $x$ on the base tree $T$, and lies on the path $P_T(a_f, x)$. Now, since $B_u \cap B_{u_1} = \emptyset$, the edge $e$ must be on the path $P_T(a_1, a_f) \subseteq P_T(a_1, r)$. However, since the subgraph $H$ bought in Stage III 2-edge-connects $a_1$ to $r$, there must be a cycle $O_{f'}$ bought in $H$ that contains $e$. Hence the constraint (4.16) would not appear in $LP_u$ since $e \in H$. This contradicts the fact that $f$ was critical for $B_u$ because of $e$. Therefore, any edge $f$ can be critical for at most 2 balls: the ones closest to the end vertices of $f$. This completes the proof of Claim 4.1.
Proof. Consider a subproblem $P_u$ and the corresponding solution $(x^*_u, y^*_u, z^*_u)$ to the LP relaxation $LP_u$. We know that this solution is feasible for the problem of (fractionally) opening a facility and two-edge-connecting it to $\mathrm{lca}(B_u)$, assuming $H$ is already bought in Stage III.
Now, suppose we simulate the facility opening component at any vertex $v$ by adding a vertex $v_f$ and including a tree edge $\{v, v_f\}$ of cost $f(v)/2$ and a covering non-tree edge $\{v, v_f\}$ of the same cost. It is easy to check that the problem $P_u$ is identical to that of finding a minimum cost set of edges to add to $H$ to make some vertex $v_f$ two-connected to $\mathrm{lca}(B_u)$. Also, the solution $(x^*_u, y^*_u)$ is a feasible solution to the new instance (once the variables on the new edges are set according to $z^*_u$). But now, any fractional solution can be thought of as a 2-flow from $\mathrm{lca}(B_u)$ to the collection of facility-vertices $\{v_f \mid v \in B_u\}$. This can then be decomposed into a linear combination of integral 2-flows from these facility-vertices to $\mathrm{lca}(B_u)$. Hence by an averaging argument, there exists an integral 2-flow of cost at most $c(x^*_u, y^*_u, z^*_u)$.
The idea behind our algorithm is simple: if we first 1-connect each demand pair via the tree path, then it would suffice to buy covering cycles (to an appropriate extent to match the load on the tree edges) so that each $s_i$-$t_i$ pair has two edge-disjoint paths between its end points. With this in mind, let us for the moment assume that the tree path between each $s_i$-$t_i$ pair has already been bought, and that we only need to buy the non-tree edges at bulk to cover these tree edges. To this end, consider the following problem of choosing non-tree edges (note that the constraints are linear, but the objective function is non-linear):
Lemma 5.1. The optimal solution of the optimization problem $NLP_{BaB}$ has cost at most $c_{bab}(\mathrm{Opt})$, where Opt is an optimal solution for the given buy-at-bulk instance on a backboned graph.
Proof. Let us consider the optimal solution Opt, and set $x^i_f$ to 1 whenever a non-tree edge $f$ carries load on behalf of $s_i$-$t_i$. Clearly, this definition ensures that $\sum_i x^i_f$ is exactly equal to the total load on any non-tree edge $f$ in Opt. Therefore, the total cost incurred by our solution in $NLP_{BaB}$ is at most $c_{bab}(\mathrm{Opt})$.
To show that this is a feasible solution, suppose one of the constraints (5.17), corresponding to edge $e$ and demand pair $s_i$-$t_i$, is violated. Now, removing the edge $e$ would separate the base tree $T$ into two components, one containing $s_i$ (which we call $C_{s_i}$) and the other containing $t_i$ (denoted by $C_{t_i}$). Since $s_i$ and $t_i$ are 2-edge-connected in Opt and $e$ is the only tree edge crossing between $C_{s_i}$ and $C_{t_i}$, there must exist a non-tree edge $f = \{x, y\} \in E(H) \setminus E(T)$ carrying load for $s_i$-$t_i$ such that one end vertex of $f$ is in $C_{s_i}$ and the other is in $C_{t_i}$; otherwise $e$ would be a cut edge separating $s_i$ and $t_i$ in $H$. Therefore, our solution would have set $x^i_f = 1$, which contradicts the assumption that this was a violated constraint.
Proof. Let us incrementally create a subgraph $H$ in the following manner: all edges begin with a load of 0. For each non-tree edge $f$, for each $i$, if $x^i_f$ is 1, then increase the load in $H$ on all edges of the cycle $O_f$ by 1 (all these edges are made to carry load for $s_i$-$t_i$).
When the process has been completed, what this ensures is that for each s i -t i , for any tree edge e ∈ P T (s i , t i ), there is a cycle O f containing e which carries load for s i -t i . Therefore, by applying transitivity of edge-connectivity, it immediately follows that s i and t i are 2-edge-connected within the edges that carry load for the demand pair s i -t i .
We now compare the cost $c_{bab}(H)$ with the cost of solution $x$. For this, consider the step in the above process when non-tree edge $f$ is being considered. Clearly, as the load on edge $f$ increases from 0 to $\sum_i x^i_f$, the load on each tree edge $e \in O_f$ also increases by the same amount $\sum_i x^i_f$. Therefore, if $l'(\cdot)$ denotes the modified load on the edges (after $f$ is completely processed) and $l(\cdot)$ the original load (before processing $f$), the increase in cost of network $H$ is at most $\sum_{e \in O_f \cap E(T)} c(e)\,(\Phi(l'(e)) - \Phi(l(e))) + c(f)\,\Phi(l'(f))$. However, the concavity of the scaling function $\Phi$ ensures that $\Phi(l'(e)) - \Phi(l(e)) \le \Phi(l'(f))$ for any tree edge $e \in O_f$. Therefore, the cost increment is at most $\Phi(l'(f)) \cdot \big(c(f) + \sum_{e \in O_f \cap E(T)} c(e)\big) \le 2c(f)\,\Phi\big(\sum_i x^i_f\big)$, where the last step uses the fact that in a backboned graph the tree path between the end vertices of $f$ costs at most $c(f)$. Therefore, the total cost $c_{bab}(H)$ is at most twice the cost incurred by $x$ in $NLP_{BaB}$.
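The charging step above can be checked numerically on a toy cycle. The sketch below is only an illustration: the square root is an assumed stand-in for the concave scaling function $\Phi$ (the argument only needs concavity and $\Phi(0) = 0$), and the function names are hypothetical.

```python
import math


def phi(load):
    """Hypothetical concave scaling function with phi(0) = 0 (assumed for illustration)."""
    return math.sqrt(load)


def marginal_increase_bounded(old_load, added):
    """Check phi(l + d) - phi(l) <= phi(d), which follows from concavity and phi(0) = 0."""
    return phi(old_load + added) - phi(old_load) <= phi(added) + 1e-12


def cost_increment(tree_edge_costs, old_loads, cost_f, added):
    """Actual cost increase when load `added` is routed around the cycle O_f,
    together with the bound 2 * c(f) * phi(added) used in the proof.

    The bound assumes the backboned property: sum(tree_edge_costs) <= cost_f.
    """
    actual = sum(
        c * (phi(l + added) - phi(l)) for c, l in zip(tree_edge_costs, old_loads)
    ) + cost_f * phi(added)
    bound = 2 * cost_f * phi(added)
    return actual, bound


if __name__ == "__main__":
    # Tree path of cost 1 + 2 = 3 closed by a non-tree edge f of cost 3.
    actual, bound = cost_increment([1.0, 2.0], [4.0, 0.0], 3.0, 5.0)
    print(actual, bound, actual <= bound)
```

Tree edges that already carry load see a smaller marginal increase than the fresh non-tree edge, which is exactly why each processed edge $f$ can be charged for its whole cycle.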
Finally, it remains to show how we can get an approximately optimal integral solution for the problem $NLP_{BaB}$, since we can then use Lemma 5.2 above to convert it into a solution for buy-at-bulk. Since the problem has a concave objective function, we first convert it to one with a linear objective function via the reduction given by Meyerson et al. ([42], Section 5.7). In particular, using their reduction, we lose a constant factor in the objective function, but get $T = |D|$ copies/cable-types of each edge $f$, with cable type $t \in [1, T]$ having a "fixed cost" of $c(f) \cdot A_t$ and an "incremental cost" of $c(f) \cdot B_t$. Now the following problem of choosing the cables (i.e., setting $X^i_{f,t}$ to 0/1 corresponding to selecting cable type $t$ for edge $f$) is identical to $NLP_{BaB}$: (i) constraints (5.17) are satisfied, and (ii) the objective charges the fixed and incremental costs of the chosen cables.
Since this modified problem is now linear, we can write an LP relaxation. In the following, we have a variable $z_{f,t}$ for each edge/cable type which indicates whether we buy cable type $t$ on edge $f$. Variable $x^i_{f,t}$ denotes whether edge $f$ carries any load for $s_i$-$t_i$ (using cable type $t$).
Finally, notice that this LP resembles that of the standard group Steiner tree problem on a 2-level tree (with the z f,t edges all connected to the root, and the x i f,t edges hanging off the z f,t edges) with the groups appropriately defined based on each tree edge e needing to be covered for every terminal pair s i -t i such that e ∈ P T (s i , t i ). Therefore, if we perform the GKR rounding algorithm, we would get an integer solution (to LP BaB and therefore to NLP BaB ) of cost at most a factor O(log n) of the optimal LP solution. This coupled with Lemmas 5.1 and 5.2 gives us the following theorem.
Theorem 5.1. The above algorithm is a randomized $O(\log n)$-approximation algorithm for 2-edge-connected buy-at-bulk on backboned graphs and consequently, an $\tilde{O}(\log^2 n)$-approximation on general graphs.
6 The k-2EC problem
In the k-2EC problem, the goal is to find a minimum cost set of edges that 2-edge-connects at least $k$ of some given set $X \subseteq V$ of terminals to the designated root vertex; informally, this is the 2-connectivity variant of the well-studied k-MST problem. In [38], Lau et al. claimed an $O(\log^3 n)$ approximation algorithm for this problem, which was later shown to be incorrect. Subsequently, Lau et al. [39] gave an improved algorithm with approximation ratio $O(\log n \log k)$, and Chekuri and Korula [6] gave the same $O(\log n \log k)$ approximation for the more general 2-vertex-connectivity version, which implies an identical approximation for the k-2EC problem as well. In this section, we point out that applying techniques very similar to those for the 2-ECGS algorithm from Section 3 gives us a simple algorithm for the k-2EC problem, though with a weaker approximation guarantee of $O(\log^3 n)$.

The LP Relaxation and its Rounding.
We write an LP similar to that for the covering Steiner tree problem [26] (there is one universal group which contains all the vertices and requires a connectivity of $k$), along with covering constraints for the tree edges. For the following LP, we create a dummy leaf vertex $l_v$ (corresponding to each vertex $v$) and connect it to $v$ with an edge $\{v, l_v\}$ of 0 cost. There is also a parallel covering edge $\{v, l_v\}$ of 0 cost (just to make sure there is a feasible solution that 2-edge-connects $k$ of the dummy vertices). The group $X$ then comprises the set $\{l_v \mid v \in V\}$. In the following, parent($v$) denotes the parent edge of a vertex $v$ along the base tree $T$, and $T(e)$ denotes all the vertices in the subtree beneath edge $e$.
$x_e, y_f, z_v \in [0, 1]$
Constraint (6.22) requires that if an edge $e$ is part of the solution, there can be at most $k$ terminals in the subtree $T(e)$ which require connectivity; this is trivially true in integer solutions but is used to cut off bad fractional solutions (see [35,26]). Constraint (6.23) requires that the fractional solution be monotonically non-increasing as we move down the tree $T$. Constraint (6.24) requires that any tree edge included also be covered by a cycle; otherwise the solution would have a cut edge and would therefore not be feasible. Finally, constraint (6.25) simply says that there are at least $k$ terminals which are connected to $r$. Again, an argument identical to that for Lemma 3.1 shows that an optimal solution to this LP has cost $O(\mathrm{Opt})$.
If we wanted to settle for an $O(\log^2 n \log k)$ approximation algorithm on backboned graphs, we could round this LP exactly as in the 2-ECGS problem, except that instead of $O(\log^2 n)$ rounds of repetition, we repeat the two stages of rounding $O(\log n \log k)$ times. The reason for this change is simple. An application of Janson's inequality tells us that after a single round of Stage 1, at least $k/2$ vertices would be connected to the root by the solution $H_1$ with probability $\Omega(1/\log n)$. (This proof can be found in [35, Section 3.2], where it is shown that the probability that we choose fewer than $(1 - \delta)k$ vertices is at most $1 - \delta^2 \gamma$, with $\gamma = \Theta(1/\log n)$. Setting $\delta = 1/2$, we get that we choose at least $k/2$ vertices with probability at least $\gamma/4$.) Therefore, we only need to repeat the two stages of the rounding $O(\log n \log k)$ times to guarantee that we connect at least $k$ terminals with high probability.
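The repetition count can be made concrete with a short calculation. This is a sketch under an independence assumption that is ours, not the paper's: if each round succeeds with probability $p$ independently of the others, the number of rounds needed to push the failure probability of one halving phase below a target follows from $(1 - p)^t \le \text{failure}$. The function name `rounds_for_success` is hypothetical.

```python
import math


def rounds_for_success(p_success, failure_prob):
    """Smallest t with (1 - p_success) ** t <= failure_prob, assuming independent rounds."""
    return math.ceil(math.log(failure_prob) / math.log(1.0 - p_success))


if __name__ == "__main__":
    n = 1024
    gamma = 1.0 / math.log(n)  # stand-in for the Theta(1 / log n) term
    p = gamma / 4.0            # success probability of one round with delta = 1/2
    print(rounds_for_success(p, 1.0 / n))
```

Since $p = \Theta(1/\log n)$, a constant failure probability per halving phase needs $O(\log n)$ rounds, and $O(\log k)$ halving phases then give the $O(\log n \log k)$ repetitions quoted above.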
However, we can incorporate techniques used in [26] for the covering Steiner tree problem to get rid of a logarithmic factor. Consider the following changes to the rounding algorithm:
Case (1): At least $k/2$ units of the flow reach vertices that each receive at least $1/4$ units of flow. Scaling up the fractional solution by a factor of 4 ensures that at least $k/2$ nodes are connected deterministically in the scaled solution. Also, the covering constraints are satisfied completely for each edge bought entirely in the fractional solution: there is a good fractional solution to the set cover problem of covering each tree edge by cycles, which implies that there is an integral solution of at most twice the cost (recall from Section 3.1.1 that such set cover instances have a totally unimodular constraint matrix). Thus we can 2-edge-connect $k/2$ terminals to the root paying at most $O(\mathrm{LPOpt})$. Since we halve the requirement each time this case holds, it can apply at most $O(\log k)$ times. The total cost of the edges bought over all executions of this step is $O(\log k)\,c(\mathrm{Opt})$.
Case (2): Case (1) does not hold, but at least $3k/4$ units of flow reach vertices that each receive at least $1/(16 \log n)$ units of flow. In this case, at least $3k/4 - k/2 = k/4$ units of flow reach vertices that receive flow in the interval $[1/(16 \log n), 1/4)$. Since each such vertex receives less than $1/4$ units, the number of such vertices is at least $k$. So scaling up the solution by $16 \log n$ will connect them all deterministically; and again, the $y_f$ variables are just scaled by $O(\log n)$. Therefore, at a cost of $O(\log n)\,c(\mathrm{Opt})$, we have 2-edge-connected $k$ vertices to the root.
Case (3): Neither of the above cases holds. In this case at least $k/4$ units of the flow reach vertices that receive at most $1/(16 \log n)$ flow each. Here we scale up the flow by $O(\log n)$ and do the GKR randomized rounding. An argument similar to the one in Case (2) of [26] shows that we hit at least $k$ vertices with constant probability. But in this case, the cost of a feasible set cover solution could be as large as $O(\log^2 n)\,c(\mathrm{Opt})$: the original solution was scaled by $O(\log n)$, and furthermore, the expected cost of a fractional set cover solution is $O(\log n)\,\mathrm{LPOpt}$ as in the 2-ECGS case, because we do a GKR style rounding. Therefore, in this case, we can cover $k$ vertices at a cost of $O(\log^2 n)\,c(\mathrm{Opt})$.
Thus, the total cost in any case would be at most $O(\log^2 n)\,c(\mathrm{Opt})$. This gives us an $O(\log^2 n)$ approximation for backboned graphs and an $\tilde{O}(\log^3 n)$ approximation for general graphs.
Theorem 6.1. The above algorithm is an $O(\log^2 n)$ approximation algorithm for k-2EC on backboned graphs, and an $\tilde{O}(\log^3 n)$ approximation algorithm on general graphs.
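The three-way case distinction used in the rounding can be written as a small classifier over the per-vertex flow values. This is a minimal sketch: the list-of-flows input, the function name `classify_case`, and the use of the natural logarithm for $\log n$ are assumptions made for illustration.

```python
import math


def classify_case(flows, k, n):
    """Return 1, 2 or 3 according to how the fractional flow is distributed.

    flows: flow received by each terminal in the fractional solution.
    k: number of terminals that must be connected; n: number of vertices.
    """
    low_threshold = 1.0 / (16.0 * math.log(n))  # natural log is an assumption
    flow_to_high = sum(f for f in flows if f >= 0.25)
    flow_to_medium_or_high = sum(f for f in flows if f >= low_threshold)

    if flow_to_high >= k / 2:
        return 1  # scale by 4 and connect k/2 terminals deterministically
    if flow_to_medium_or_high >= 3 * k / 4:
        return 2  # scale by 16 log n and connect k terminals deterministically
    return 3      # scale by O(log n) and fall back to GKR randomized rounding


if __name__ == "__main__":
    print(classify_case([1.0, 1.0, 1.0], k=4, n=16))  # mass on heavy vertices
    print(classify_case([0.1] * 40, k=4, n=16))       # mass on medium vertices
    print(classify_case([0.01] * 400, k=4, n=16))     # mass on light vertices
```

Only Case (1) reduces the residual requirement and is repeated; Cases (2) and (3) finish the instance in a single step, which is why their larger scaling factors are paid only once.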
A Proofs from Section 2
Proof of Theorem 2.1: Using the construction of [1], we can draw a random spanning tree $T$ of $G$ with low expected stretch, where the tree distance $d_T$ is defined as usual: if $P_T(u, v)$ is the unique $u$-$v$ path in $T$, then $d_T(u, v) = \sum_{e \in P_T(u,v)} c(e)$. Now, suppose we consider the same graph $G$, but with the following edge costs instead: (i) a tree edge $e \in T$ has cost $c_T(e) = c(e)$, and (ii) a non-tree edge $e = \{u, v\} \in E \setminus E(T)$ has cost $c_T(e) = \max\{c(e), d_T(u, v)\}$. Then, it is simple to verify that $G$ with edge costs $c_T(\cdot)$ is backboned.
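The cost modification in this proof is easy to state in code. The following sketch assumes that the spanning tree has already been sampled and is given explicitly; the dictionary-based edge representation and the name `backboned_costs` are hypothetical.

```python
def backboned_costs(tree_edges, nontree_edges):
    """Return modified costs c_T: tree edges keep their cost, and each non-tree
    edge {u, v} gets cost max(c(e), d_T(u, v)).

    tree_edges, nontree_edges: {(u, v): cost}; tree_edges must form a spanning tree.
    """
    adjacency = {}
    for (u, v), cost in tree_edges.items():
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))

    def tree_distance(source, target):
        # Depth-first search; in a tree the first path found is the unique one.
        stack = [(source, None, 0.0)]
        while stack:
            node, previous, dist = stack.pop()
            if node == target:
                return dist
            for neighbor, cost in adjacency.get(node, []):
                if neighbor != previous:
                    stack.append((neighbor, node, dist + cost))
        raise ValueError("target is not reachable in the tree")

    new_costs = dict(tree_edges)
    for (u, v), cost in nontree_edges.items():
        new_costs[(u, v)] = max(cost, tree_distance(u, v))
    return new_costs


if __name__ == "__main__":
    tree = {("r", "a"): 1.0, ("a", "b"): 2.0}
    print(backboned_costs(tree, {("r", "b"): 1.0}))
```

In the usage example the shortcut edge {r, b} of cost 1 is raised to the tree distance 3, so no non-tree edge is cheaper than the tree path it closes.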
Consider a problem $\Pi$, and let the optimal solution to the given instance on $G$ with edge costs $c(\cdot)$ be a subgraph $H \subseteq G$. Then, from the low-stretch property of the random embedding, the expected cost of $H$ under the cost function $c_T$ is $E_T[\sum_{e \in H} c_T(e)] \le O(\log n) \cdot \sum_{e \in H} c(e)$. Therefore the expected cost of an optimal solution under edge costs $c_T$ is at most $O(\log n) \cdot \sum_{e \in H} c(e)$. Consequently, any $\beta$-approximation algorithm for the problem $\Pi$ on backboned graphs would return a subgraph $H' \subseteq G$ with expected cost (with respect to $c_T$) at most $(\beta \times O(\log n)) \cdot \sum_{e \in H} c(e)$. Since $c_T(e) \ge c(e)$ for any edge $e \in E(G)$, the expected cost of the subgraph $H'$ with respect to edge costs $c(\cdot)$ is also at most $(\beta \times O(\log n)) \cdot \sum_{e \in H} c(e) = (\beta \times O(\log n))\,c(H)$. Therefore, the solution $H'$ is a randomized $\beta \times O(\log n)$-approximate solution on the original edge costs $c(\cdot)$.
B Proofs from Section 4
B.1 2-CFL on Non-Metric Instances
We now consider instances where the connection cost for the clients is given by some distance function $d(\cdot, \cdot)$ which may itself not satisfy the triangle inequality, and the edge costs for building the 2-connected core are given by $c(\cdot)$.
We show how we can get poly-logarithmic approximations for the above "non-metric" 2-CFL problem using essentially the same techniques we used for 2-ECGS. We first guess one facility which the optimal solution opens and call it $r$. The LP is almost identical to the one given for 2-CFL on general graphs, except that the client-facility connection cost is some arbitrary function $d(\cdot, \cdot)$ instead of the tree distances $c(\cdot)$. Here is a brief overview of the rounding algorithm for 2-CFL. We skip the details of the proofs as they are very similar to the ones given in the earlier sections.
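(i) Solve the LP relaxation optimally. Then filter the client connection costs as in Stage I of the metric algorithm: drop each fractional assignment $z^*_{uv}$ whose connection cost $d(u, v)$ exceeds twice the fractional connection cost of $u$, and scale the solution by a factor of 2.
(ii) For each $u \in D$, create a group $g_u = \{v \in V \mid z^*_{uv} > 0\}$ of facilities associated with this client. It is easy to check that the solution $(x^*, y^*)$ is a feasible solution for the 2-ECGS LP with these groups.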
(iii) Perform Stage I and Stage II of the 2-ECGS algorithm once; if a group $g_u$ is 2-connected to the root, open a facility at the representative vertex $v_{g_u}$. Because the 2-ECGS algorithm ensures that an $\Omega(1/\log n)$ fraction of the groups are 2-connected to the root, and we open facilities for these groups, we know that an $\Omega(1/\log n)$ fraction of the clients have a facility opened near them. An analysis similar to the one for the 2-ECGS problem shows that the total cost spent in this step is at most $O(\log n)\,\mathrm{LPOpt}$.
(iv) We can then repeat this process O(log 2 n) times and output the union of all previous partial solutions to guarantee with high probability a feasible solution to the 2-CFL problem.
Theorem B.1. Non-metric 2-CFL admits an $O(\log^3 n)$ approximation algorithm on backboned graphs, and an $O(\log^4 n)$ approximation algorithm for general graphs.