Stochastic Steiner Tree with Non-Uniform Inflation

Abstract. We study the Steiner Tree problem in the model of two-stage stochastic optimization with non-uniform inflation factors, and give a poly-logarithmic approximation factor for this problem. In this problem, we are given a graph G = (V, E), with each edge having two costs c_M and c_T (the costs for Monday and Tuesday, respectively). We are also given a probability distribution π : 2^V → [0, 1] over subsets of V, and will be given a client set S drawn from this distribution on Tuesday. The algorithm has to buy a set of edges E_M on Monday, and after the client set S is revealed on Tuesday, it has to buy a (possibly empty) set of edges E_T(S) so that the edges in E_M ∪ E_T(S) connect all the nodes in S. The goal is to minimize c_M(E_M) + E_{S←π}[c_T(E_T(S))]. We give the first poly-logarithmic approximation algorithm for this problem. Our algorithm builds on the recent techniques developed by Chekuri et al. (FOCS 2006) for multi-commodity Cost-Distance. Previously, the problem had been studied for the cases when c_T = σ × c_M for some constant σ ≥ 1 (i.e., the uniform case), or for the case when the goal was to find a tree spanning all the vertices but Tuesday's costs were drawn from a given distribution π̂ (the so-called "stochastic MST case"). We complement our results by showing that our problem is at least as hard as the single-sink Cost-Distance problem (which is known to be Ω(log log n)-hard). Moreover, the requirement that Tuesday's costs are fixed seems essential: if we allow Tuesday's costs to depend on the scenario as in stochastic MST, the problem becomes as hard as Label Cover (which is Ω(2^{log^{1−ε} n})-hard). As an aside, we also give an LP-rounding algorithm for the multi-commodity Cost-Distance problem, matching the O(log^4 n) approximation guarantee given by Chekuri et al. (FOCS 2006).


Introduction
This paper studies the Steiner tree problem in the framework of two-stage stochastic approximation, which is perhaps best (albeit a bit informally) described as follows. On Monday, we are given a graph with two cost functions c_M and c_T on the edges, and a distribution π predicting future demands; we can build some edges E_M at cost c_M. On Tuesday, the actual demand set S arrives (drawn from the distribution π), and we must complete a Steiner tree on the set S, but any edges E_T bought on Tuesday cost c_T. How can we minimize our expected cost? The Stochastic Steiner tree problem has been studied before in the special case when Tuesday's cost function c_T is a scaled-up version of Monday's costs c_M (i.e., there is a constant inflation factor σ > 1 such that c_T(e) = σ × c_M(e)); for this case, constant-factor approximations are known [9,10,12]. While these results can be generalized in some directions (see Section 1.1 for a detailed discussion), it has been an open question whether we could handle the case when the two costs c_M and c_T are unrelated. (We will refer to this case as the non-uniform inflation case, as opposed to the uniform inflation case when the costs c_M and c_T are just scaled versions of each other.) This gap in our understanding was made more apparent by the fact that many other problems, such as Facility Location, Vertex Cover and Set Cover, were all shown to admit good approximations in the non-uniform inflation model [22,24]: in fact, the results for these problems could be obtained even when the edge cost could depend on the day as well as on the demand set appearing on Tuesday.

Theorem 1 (Main Theorem).
There is an O(log^2(min(N, λ)) · log^4 n · log log n)-approximation algorithm for the two-stage stochastic Steiner tree problem with non-uniform inflation costs with N scenarios, on a graph with n nodes. Here λ = max_{e∈E} c_T(e)/c_M(e), i.e., the maximum inflation over all edges. This is the first non-trivial approximation algorithm for this problem. Note that the cost of an edge can either increase or decrease on Tuesday; however, we would like to emphasize that our result holds only when Tuesday's costs c_T do not depend on the materialized demand set S. (Read on for a justification of this requirement.) We also show that the two-stage stochastic Steiner tree problem is at least as hard as the single-source cost-distance problem.
Theorem 2 (Hardness). The two-stage stochastic Steiner tree problem is at least Ω(log log n)-hard unless NP ⊆ DTIME(n^{log log log n}).
The hardness result in the above theorem holds even for the special case of Stochastic Steiner tree when the cost of some edges remains the same between days, and the cost of the remaining edges increases on Tuesday by some universal factor.
Finally, we justify the requirement that Tuesday's costs c_T are fixed by showing that the problem becomes very hard without this requirement. Indeed, we can show the following theorem, whose proof is deferred to the journal paper.
Theorem 3. The two-stage stochastic Steiner tree problem when Tuesday's costs are dependent on the materialized demand is at least Ω(2^{log^{1−ε} n})-hard for every fixed ε > 0.
Finally, we also give an LP-rounding algorithm for the multi-commodity Cost-Distance problem, matching the O(log 4 n) approximation guarantee given by Chekuri et al. [4]; however, we note that the LP we consider is not the standard LP for the problem.
Our Techniques. Our approach will be to reduce our problem to a more general problem which we call Group-Cost-Distance: Definition 4 (Group-Cost-Distance). Consider a (multi)graph G = (V, E) with each edge having a buying cost b_e and a renting cost c_e. Given a set of subsets S_1, S_2, ..., S_N ⊆ V, find for each i a tree T_i that spans S_i, so as to minimize the total cost (1.1): defining F = ∪_i T_i and x_e = the number of trees using edge e, we want to minimize Σ_{e∈F} (b_e + x_e · c_e).
The problem can also be called "multicast" cost-distance, since we are trying to find multicast trees on each group that give the least cost under the concave cost functions on each edge. Note that when each S_i = {s_i, t_i}, we get the (Multicommodity) Cost-Distance problem, for which the first poly-logarithmic approximation algorithms were given only recently [4]; in fact, we build on the techniques used to solve that problem to give the approximation algorithm for the Group-Cost-Distance problem.
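To make the objective concrete, here is a minimal sketch (ours, not part of the paper) that evaluates the Group-Cost-Distance cost of a candidate solution; the edge identifiers and dict-based cost maps are our own illustrative choices:

```python
from collections import Counter

def gcd_cost(trees, buy, rent):
    """Cost of a Group-Cost-Distance solution.

    trees: list of edge sets (one tree T_i per group S_i), edges as hashable ids.
    buy/rent: dicts mapping edge id -> b_e and c_e.
    Objective: sum over e in F = union of the T_i of (b_e + x_e * c_e),
    where x_e is the number of trees using edge e.
    """
    x = Counter(e for T in trees for e in T)  # x_e = multiplicity over trees
    return sum(buy[e] + x[e] * rent[e] for e in x)
```

Note how each tree pays the rental c_e separately, while the buying cost b_e is paid once; this is exactly the concave per-edge cost the text describes.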

Related Work
Stochastic Problems. For the Stochastic Steiner tree problem in the uniform inflation case, where all the edge-costs increase on Tuesday by the same factor σ, an O(log n)-approximation was given by Immorlica et al. [16], and constant-factor approximations were given by [9,10,12]. These results were extended to handle the case when the inflation factors could be random variables, and hence the probability distribution would be over tuples of the form (demand set S, inflation factor σ) [11,15].
A related result is known for the Stochastic Minimum Spanning Tree problem, where one has to connect all the vertices of the graph. In this case, we are given Monday's costs c_M, and the probability distribution is over possible Tuesday costs c_T. For this problem, Dhamdhere et al. [7] gave an O(log n + log N) approximation, where N is the number of scenarios. They solve an LP and randomly round the solution; however, their random rounding seems to crucially require that all nodes need to be connected up, and the idea does not seem to extend to the Steiner case. (Note that their problem is incomparable to ours: in this paper, we assume that Monday's and Tuesday's costs are deterministic, whereas they do not; on the other hand, in our problem, we get a random set of terminals on Tuesday, whereas they have to connect all the vertices, which makes their task easier.) Approximation algorithms for several other problems have been given in the non-uniform stochastic setting; see [22,24]. For a general overview of some techniques used in stochastic optimization, see, e.g., [10,24]. However, nothing has been known for the Stochastic Steiner tree problem with non-uniform inflation costs.
In many instances of the stochastic optimization problem, the number of possible scenarios on Tuesday (i.e., the support of the distribution π) may be exponentially large. Charikar et al. [2] gave a useful technique by which we can reduce the problem to one with a much smaller number of scenarios (polynomial in the problem size and inflation factors) by random sampling. We shall use this tool in our algorithm as well.
Buy-at-Bulk and Cost-Distance Problems. There has been a huge body of work on so-called buy-at-bulk problems which model natural economies-of-scale in allocating bandwidth; see, e.g., [3] and the references therein. The (single-source) Cost-Distance problem was defined by Meyerson, Munagala and Plotkin [20]: this is the case of Group-Cost-Distance with a root r ∈ V and each S_i = {t_i, r}. They gave a randomized O(log k)-approximation algorithm, where k = |∪_i S_i|, which was later derandomized by Chekuri, Khanna and Naor [5]. (An online poly-logarithmic competitive algorithm was given by Meyerson [19].) These results use a randomized pairing technique that keeps the expected demand at each node constant; this idea does not seem to extend to Group-Cost-Distance. The Multicommodity Cost-Distance problem (i.e., with arbitrary source-sink pairs) was studied by Charikar and Karagiozova [3], who gave an exp{√(log n log log n)}-approximation algorithm. Very recently, this was improved to a poly-logarithmic approximation ratio by Chekuri, Hajiaghayi, Kortsarz, and Salavatipour [4] (see also [13,14]). We will draw on several ideas from these results.
Embedding Graph Metrics into Subtrees. Improving a result of Alon et al. [1], Elkin et al. [8] recently showed that every graph metric can be approximated by a distribution over its spanning subtrees with a distortion of O(log^2 n log log n).
Theorem 5 (Subtree Embedding Theorem). Given a graph G = (V, E), there exists a probability distribution D_G over spanning trees of G such that for every x, y ∈ V(G), the expected distance E_{T←D_G}[d_T(x, y)] is at most β_EEST · d_G(x, y), where β_EEST = O(log^2 n log log n). Note that spanning trees T trivially ensure that d_G ≤ d_T. The parameter β_EEST will appear in all of our approximation guarantees.

Reduction to Group Cost-Distance
Note that the distribution π may be given as a black-box, and may be very complicated. However, using a theorem of Charikar, Chekuri, and Pál on using sample averages [2, Theorem 2], we can focus our attention on the case when the probability distribution π is the uniform distribution over some N sets S_1, S_2, ..., S_N ⊆ V, and hence the goal is to compute edge sets E_0 := E_M and E_1, E_2, ..., E_N (one for each scenario) such that E_0 ∪ E_i contains a Steiner tree on S_i. Scaling the objective function by a factor of N, we now want to minimize N · c_M(E_0) + Σ_{i=1}^N c_T(E_i) (2.2). Basically, the N sets will just be N independent draws from the distribution π. We set the value N = Θ(λ^2 ε^{-5} m), where λ is a parameter that measures the "relative cost of information" and can be set to max_e c_T(e)/c_M(e) for the purposes of this paper, m is the number of edges in G, and ε is a suitably small constant. Let ρ_ST be the best known approximation ratio for the Steiner tree problem [23]. The following reduction can be inferred from [2, Theorem 3] (see also [25]): Lemma 1 (Scenario Reduction). Given an α-approximation algorithm for the above instance of the stochastic Steiner tree problem with N scenarios, run it independently Θ(1/ε) times and take the best solution. With constant probability, this gives an O((1 + ε)α)-approximation to the original stochastic Steiner tree problem on the distribution π.
Before we go on, note that E_0 and each of the E_0 ∪ E_i are acyclic in an optimal solution. We now give the reduction to Group-Cost-Distance. Create a new (multi)graph whose vertex set is still V. For each edge e ∈ E in the original graph, we add two parallel edges e_1 (with buying cost b_{e1} = N · c_M(e) and renting cost c_{e1} = 0) and e_2 (with buying cost b_{e2} = 0 and renting cost c_{e2} = c_T(e)). The goal is to find a set of N trees T_1, ..., T_N, with T_i spanning the set S_i, so as to minimize Σ_{e∈F} (b_e + x_e c_e), where F = ∪_i T_i and x_e is the number of trees using e (2.3). It is easily verified that the optimal solutions to the two objective functions (2.2) and (2.3) coincide when we define the buying and renting costs as described above. Using Lemma 1, we get the reduction stated in Lemma 2. (As an aside, if the distribution π consists of N scenarios listed explicitly, we can do an identical reduction to Group-Cost-Distance, but now the value of N need not have any relationship to λ.)
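The parallel-edge construction above is mechanical; the following is a hypothetical sketch (the tuple-based data layout is our own choice) of how the Group-Cost-Distance multigraph is built from the scenario-based instance:

```python
def to_group_cost_distance(edges, cM, cT, N):
    """Build the Group-Cost-Distance multigraph for the scenario-based
    stochastic Steiner tree instance, as described in the text.

    For each original edge e we create two parallel copies:
      e1: buy cost N * cM(e), rent cost 0     (pay once on Monday, reuse freely)
      e2: buy cost 0,         rent cost cT(e) (pay per scenario on Tuesday)
    Returns a list of (u, v, buy_cost, rent_cost) multi-edges.
    """
    multi = []
    for (u, v) in edges:
        multi.append((u, v, N * cM[(u, v)], 0))   # Monday copy e1
        multi.append((u, v, 0, cT[(u, v)]))       # Tuesday copy e2
    return multi
```

A tree T_i using the Monday copy contributes only to the (shared) buying cost, while each use of the Tuesday copy pays c_T(e) separately, matching objectives (2.2) and (2.3).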

Observations and Reductions
Recall that a solution to the Group-Cost-Distance problem is a collection of trees T_i spanning S_i; their union is F = ∪_i T_i, and x_e is the number of trees that use edge e. Note that if we were just given the set F ⊆ E of edges, we could use the ρ_ST-approximation algorithm for the minimum-cost Steiner tree problem to find trees where T_i ⊆ F is the minimum-cost Steiner tree spanning S_i. We use OPT = F* to denote the set of edges used in an optimal solution for the Group-Cost-Distance instance, and hence cost(OPT) is the total optimal cost. Henceforth, we may specify a solution to an instance of the Group-Cost-Distance problem by just specifying the set of edges F = ∪_i T_i, where T_i is the tree spanning S_i in this solution.
As an aside, note that cost(F) is the optimal cost of any solution using the edges from F ⊆ E; computing cost(F) is hard given the set F, but that is not a problem since it will be used only as an accounting tool.Of course, given F, we can build a solution to the Group-Cost-Distance problem of cost within a ρ ST factor of cost(F).
We will refer to the sets S i as demand groups, and a vertex in one of these groups as a demand vertex.For simplicity, we assume that for all i, |S i | is a power of 2; this can be achieved by replicating some vertices.

The Pairing Cost-Distance Problem: A Useful Subroutine
A pairing of any set A is a perfect matching on the complete graph on A. The following tree-pairing lemma has become an indispensable tool in network design problems (see [21] for a survey): Lemma 3 ([18]). Let T be an arbitrary tree and let v_1, v_2, ..., v_{2q} be an even number of vertices in T. There exists a pairing of the v_i into q pairs so that the unique paths joining the respective pairs are edge-disjoint.
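Lemma 3 has a simple constructive proof by depth-first search, which the following sketch implements (the adjacency-dict representation is our own; the paper only uses the lemma as a black box):

```python
def tree_pairing(adj, root, marked):
    """Constructive tree-pairing (Lemma 3): pair an even number of marked
    vertices in a tree so that the tree paths between pairs are edge-disjoint.

    adj: dict node -> list of neighbours (a tree); marked: set of 2q vertices.
    Each subtree forwards at most one unmatched marked vertex to its parent,
    so every tree edge lies on at most one pair's path.
    """
    pairs = []

    def dfs(u, parent):
        pending = [u] if u in marked else []
        for w in adj[u]:
            if w != parent:
                leftover = dfs(w, u)
                if leftover is not None:
                    pending.append(leftover)
        while len(pending) >= 2:        # pair leftovers meeting at u
            pairs.append((pending.pop(), pending.pop()))
        return pending[0] if pending else None

    assert dfs(root, None) is None      # even count => nothing left over
    return pairs
```

On the path a-b-c-d with all four vertices marked, this pairs {c, d} and {a, b}, whose connecting paths are edge-disjoint.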
Let us define another problem, whose input is the same as that for Group-Cost-Distance.
Definition 6 (Pairing Cost-Distance) Given a graph G = (V, E) with buy and rent costs b e and c e on the edges, and a set of demand groups {S i } i , the Pairing Cost-Distance problem seeks to find a pairing P i of the nodes in S i , along with a path connecting each pair of nodes (x, y) ∈ P i .
Let F be the set of edges used by these paths, and let x_e be the number of pairs using the edge e ∈ F; then the cost of a solution is Σ_{e∈F} (b_e + x_e c_e). As before, given the set F, we can infer the best pairing that only uses edges in F by solving a min-cost matching problem: we let cost′(F) denote this cost, and let OPT′ be the optimal solution to the Pairing Cost-Distance instance. So, again, we can specify a solution to the Pairing Cost-Distance problem by specifying this set F. The following lemma relates the costs of the two closely related problems: Lemma 4. For any instance, the optimal cost cost′(OPT′) for Pairing Cost-Distance is at most the optimal cost cost(OPT) for Group-Cost-Distance.
Proof. Let F be the set of edges bought by OPT for the Group-Cost-Distance problem. We construct a solution for the Pairing Cost-Distance problem. Recall that OPT builds a Steiner tree T_i spanning S_i using the edges in F. By Lemma 3, we can pair up the demands in S_i such that the unique paths between the pairs in T_i are pairwise edge-disjoint. This gives us a solution to Pairing Cost-Distance which only uses edges in F; moreover, the number of times an edge is used is at most x_e, ensuring a solution of cost at most cost(OPT). □ Note that the algorithm A of Lemma 5 is not a true approximation algorithm for Pairing Cost-Distance, since we compare its performance to the optimal cost for Group-Cost-Distance; hence we will call it an α-pseudo-approximation algorithm.
Proof (of Lemma 5). In each iteration, when we connect up pairs of nodes in S_i, we think of taking the traffic from one of the nodes and moving it to the other node; hence the number of "active" nodes in S_i decreases by a factor of 2. This can only go on for O(log n) iterations before all the traffic reaches one node in the group, ensuring that the group is connected using these pairing paths. Since we pay at most α · cost(OPT) in each iteration, this results in an O(α log n) approximation for Group-Cost-Distance. □
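The halving argument in this proof can be sketched as a driver loop; here `pairing_solver` is a hypothetical stand-in for the algorithm A, returning one matching per group (assuming, as in the text, group sizes that are powers of 2):

```python
def group_via_pairing(groups, pairing_solver):
    """Lemma 5 sketch: reduce Group-Cost-Distance to O(log n) rounds of
    Pairing Cost-Distance. After each round we keep one representative
    per pair, halving the active nodes, until every group has collapsed
    to a single node (at which point its pairing paths connect it).
    Returns the number of rounds used.
    """
    rounds = 0
    groups = [list(g) for g in groups]
    while any(len(g) > 1 for g in groups):
        matchings = pairing_solver(groups)
        groups = [[x for (x, _) in m] if len(g) > 1 else g
                  for g, m in zip(groups, matchings)]
        rounds += 1
    return rounds
```

With a group of 8 nodes, the loop runs exactly 3 = log2(8) times, matching the O(log n) bound in the proof.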

An Algorithm for Pairing Cost-Distance
In this section, we give an LP-based algorithm for Pairing Cost-Distance; by Lemma 5 this will imply an algorithm for Group-Cost-Distance, and hence for Stochastic Steiner Tree.
We will prove our main result for Pairing Cost-Distance (PCD), Theorem 7. Since H = O(Nn) and we think of N ≥ n, this gives us an O(log^2 N log^3 n log log n) pseudo-approximation. Before we present the proof, let us give a high-level sketch. The algorithm for Pairing Cost-Distance follows the general structure of the proofs of Chekuri et al. [4]; the main difference is that the problem in [4] already comes equipped with {s, t}-pairs that need to be connected, whereas our problem also requires us to figure out which pairs to connect, and this requires a couple of new ingredients.
Loosely, we first show the existence of a "good" low-density pairing solution: this is a solution that only connects up some pairs of nodes in some of the sets S_i (instead of pairing up all the nodes in all the S_i's), but whose "density" (i.e., ratio of cost to pairs connected) is at most a β_EEST factor of the density of OPT. Moreover, the "good" part of this solution is that all the paths connecting the pairs pass through a single "junction" node. The existence of this single junction node (which can now be thought of as a sink) makes this problem look somewhat like a low-density "pairing" version of single-sink cost-distance. We show how to solve this final subproblem within an O(log H · log n) factor of the best possible such solution, which is at most β_EEST times OPT's density. Finally, finding these low-density solutions iteratively and using standard set-cover arguments gives us a Pairing Cost-Distance solution of cost O(β_EEST log^2 H · log n) · cost(OPT), which gives us the claimed theorem.

Defining the Density
Consider an instance of the Pairing Cost-Distance (PCD) problem in which the current demand sets are S_i. Look at a partial PCD solution that finds for each set S_i some set P_i of t_i ≥ 0 mutually disjoint pairs {x_j^i, y_j^i}_{j=1}^{t_i}, along with paths P_j^i connecting these pairs. Let P = ∪_i P_i be the (multi)set of all these t = Σ_i t_i paths. We shall use P to denote both the pairs in it and the paths used to connect them. Denote the cost of this partial solution by cost′(P) = b(∪_{P∈P} P) + Σ_{P∈P} c(P). Let |P| be the number of pairs being connected in the partial solution. The density of the partial solution P is defined as cost′(P)/|P|. Recall that H(I) = Σ_i |S_i| is the total number of terminals in the instance I.
Theorem 9. Given an algorithm to find f -dense Partial PCD solutions, we can find an O(f log H)-pseudo-approximation to Pairing Cost-Distance.
To prove this result, we will use the following theorem which can be proved by standard techniques (see e.g., [17]), and whose proof we omit.
Theorem 10 (Set Covering Lemma). Consider an algorithm working in iterations: in iteration ℓ it finds a subset P_ℓ of paths connecting up |P_ℓ| pairs. Let H_ℓ be the number of terminals remaining before iteration ℓ. If for every ℓ, the solution P_ℓ is an f-dense solution, i.e., its cost satisfies cost′(P_ℓ)/|P_ℓ| ≤ f · cost(OPT)/H_ℓ, then the total cost of the solution output by the algorithm is at most f · (1 + ln H) · cost(OPT).
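As a sanity check of the harmonic-sum accounting behind Theorem 10, one can simulate the worst case of one pair connected per iteration (this simulation is ours, not part of the paper's proof):

```python
def covering_bound(H, f, opt):
    """Simulate Theorem 10's accounting in the worst case: each iteration
    connects a single pair at the maximum allowed density f * opt / H_l,
    where H_l is the number of remaining terminals. The total telescopes
    to f * opt * (1/H + ... + 1/1), i.e., at most f * (1 + ln H) * opt.
    """
    total, h = 0.0, H
    while h > 0:
        total += f * opt / h      # one pair bought at the allowed density
        h -= 1
    return total
```

For H = 10, f = 2, opt = 3, the total is f · opt · H_10 ≈ 17.57, comfortably below f · (1 + ln 10) · opt ≈ 19.82.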
In the next section, we will show how to find a Partial PCD solution which is f = O(β_EEST log H · log n)-dense.

Finding a Low-Density Partial PCD Solution
We now show the existence of a partial pairing P of demand points which is β_EEST-dense, and where all the pairs in P will be routed on paths that pass through a common junction point. The theorems of this section are essentially identical to corresponding theorems in [4]. Proof Sketch (of Theorem 11). The claim can be proved by dropping all edges in E \ OPT′ and approximating the metric generated by the rental costs c_e in each resulting component by a random subtree drawn from the distribution guaranteed by Theorem 5. We choose a subset of OPT′, and hence the buying costs cannot be any larger. Since the expected distances increase by at most a β_EEST factor, the expected renting cost increases by at most this factor. And since this holds for a random forest, by the probabilistic method, there must exist one such forest with these properties. □ Definition 12 (Junction Tree). Consider a solution to the Partial PCD problem with paths P, which uses the edge set F. The solution is called a junction tree if the subgraph induced by F is a tree and there is a special root vertex r such that all the paths in P contain the root r.
As before, the density of a solution P is the ratio of its cost to the number of pairs connected by it. We can now prove the existence of a low-density junction tree for the Partial PCD problem; the proof of the following lemma is deferred to the journal paper.
Lemma 6 (Low-Density Existence Lemma). Given an instance of the Pairing Cost-Distance problem, there exists a solution to the Partial PCD problem which is a junction tree and whose density is at most O(β_EEST) · cost(OPT)/H. In the following section, we give an O(log H · log n)-approximation algorithm for finding a junction tree with minimum density. Since we know that there is a "good" junction tree (by Lemma 6), we can combine that algorithm with Lemma 6 to get a Partial PCD solution which is f = O(β_EEST log H · log n)-dense.

Finding a Low-Density Junction Tree
In this section, we give an LP-rounding based algorithm for finding a junction tree with density at most O(log H · log n) times that of the min-density junction tree. Our techniques continue to be inspired by [4]; however, in their paper, the pairing is fixed by the problem, and they only had to figure out which pairs to connect up in the junction tree. In our problem, we have to both figure out the pairing and then choose which pairs from this pairing to connect up; we have to develop some new ideas to handle this issue.
The Linear-Programming Relaxation. Recall the problem: we are given sets S_i, and want to find some partial pairings for each of the sets, and then want to route the chosen pairs to some root vertex r so as to minimize the density of the resulting solution. We will assume that we know r (there are only n possibilities), and that the sets S_i are disjoint (by duplicating nodes as necessary).
Our LP relaxation is an extension of that for the Cost-Distance problem given by Chekuri, Khanna, and Naor [5]. The intuition is based on the following: given a junction-tree solution F, let P denote the set of pairs connected via the root r. Now F can also be thought of as a solution to the Cost-Distance problem with root r and the terminal set ∪_{(u,v)∈P} {u, v}. Furthermore, the cost cost′(F) is the same as the optimum of the Cost-Distance problem. (This is the place where the definition of cost′ becomes crucial: we can use the fact that the cost measure cost′ pays for the number of paths using an edge, not the number of groups using it.) Let us write an IP formulation: let S = ∪_i S_i denote the set of all terminals. For each demand group S_i and each pair of vertices u, v ∈ S_i, the variable z_uv indicates whether we match vertices u and v in the junction tree solution or not. To enforce a matching, we ask that z_uv = z_vu and Σ_v z_uv ≤ 1. For each e ∈ E, the variable x_e denotes whether the edge e is used; for each path P from some vertex u to the root r, we let f_P denote whether P is the path used to connect u to the root. Let P_u be the set of paths from u to the root r.
Clearly, we want Σ_{P∈P_u: P∋e} f_P ≤ x_e for all e ∈ E and u ∈ S. Moreover, Σ_{P∈P_u} f_P ≥ Σ_{v∈S_i} z_uv for each u ∈ S_i, since if the node u is paired up with someone, it must be routed to the root. Subject to these constraints (and integrality), we want to minimize the density (Σ_{e∈E} b_e x_e + Σ_{u∈S} Σ_{P∈P_u} c(P) f_P) / (Σ_i Σ_{u,v∈S_i} z_uv). It is not hard to check that this is indeed an ILP formulation of the min-density junction tree problem rooted at r. As is usual, we relax the integrality constraints, guess the value M ≥ 1 of the denominator in the optimal solution, and obtain the linear program (LP1), with x_e, f_P, z_uv = z_vu ≥ 0. We now show that the integrality gap of this LP is small. Theorem 13. The integrality gap of (LP1) is O(log H · log n). Hence there is an O(log H · log n)-approximation algorithm for finding the minimum density junction tree solution for a given Partial PCD instance.
Proof. Consider an optimal fractional solution given by (x*, f*, z*) with value LP*. We start off with z = z*, and will alter the values over the course of the proof. Consider each set S_i, and let w_i = Σ_{u,v∈S_i} z*_uv be the total size of the fractional matching within S_i. We find an approximate maximum-weight cut in the complete graph on the nodes S_i with edge weights z_uv; this gives us a bipartite graph, which we denote by B_i, and we zero out the z_uv values for all edges (u, v) ∈ S_i × S_i that do not belong to the cut B_i. How does this affect the LP solution? Since the weight of the max cut we find is at least w_i/2, we are left with a solution where Σ_{i=1}^N Σ_{u,v∈S_i} z_uv ≥ M/2 (and hence that constraint is almost satisfied). Now consider the edges in the bipartite graph B_i, with edge weights z_uv: if this graph has a cycle, then by alternately increasing and decreasing the values of the z variables along this even cycle by some ε > 0 in one of the two directions, we can make z_{u′v′} zero for at least one edge (u′, v′) of this cycle without increasing the objective function. (Note that this operation maintains all the LP constraints.) We then delete the edge (u′, v′) from B_i, and repeat this operation until B_i is a forest.
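The approximate max-cut step above only needs a cut carrying at least half the z-weight; a standard greedy placement achieves this (our illustration, not necessarily the subroutine used in the paper):

```python
def half_max_cut(nodes, weight):
    """Greedy cut retaining at least half the total edge weight: place each
    node on the side opposite its heavier connection among already-placed
    nodes, so at least half of each node's placed-edge weight is cut.

    weight: dict mapping node pairs (u, v) to nonnegative edge weight.
    Returns the two sides of the cut as sets.
    """
    left, right = set(), set()
    for u in nodes:
        wl = sum(weight.get((u, v), 0) + weight.get((v, u), 0) for v in left)
        wr = sum(weight.get((u, v), 0) + weight.get((v, u), 0) for v in right)
        (right if wl >= wr else left).add(u)   # cut the heavier side
    return left, right
```

On a unit-weight triangle this keeps 2 of the 3 units across the cut, consistent with the w_i/2 guarantee the proof needs.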
Let us now partition the edges of the various such forests {B_i}_{i=1}^N into O(log H) classes based on their current z values. Let Z_max = max_{u,v} z_uv, and define p = 1 + 2 log H = O(log H). For each a ∈ [0..p], define the set C_a to contain all edges (u, v) with Z_max/2^{a+1} < z_uv ≤ Z_max/2^a; note that the pairs (u, v) not in ∪_{a=0}^p C_a have a cumulative z_uv value of less than (say) Z_max/4 ≤ M/4. Hence, by an easy averaging argument, there must be a class C_a with Σ_{(u,v)∈C_a} z_uv ≥ Ω(M/log H). Define Z_a = Z_max/2^{a+1}. Since we have restricted our attention to pairs in C_a, we can define B_ia = B_i ∩ C_a, which still remains a forest. For any tree T in this forest, we apply the tree-pairing lemma to the nodes of the tree T, and obtain a matching C_ia on S_i of size ⌊|V(T)|/2⌋. Defining C_a = ∪_i C_ia, we get that |C_a| · Z_a = Ω(M/log H) as well.
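The geometric bucketing step can be sketched as follows (our illustration; the clamping of the class index is an implementation choice):

```python
import math

def densest_class(z_values, H):
    """Geometric bucketing from the proof: place each pair weight z into
    class a with Z_max/2^(a+1) < z <= Z_max/2^a, for a in [0..p] with
    p = 1 + 2*log2(H), and return the class of maximum total weight.
    Since there are only O(log H) classes, the winner carries an
    Omega(1/log H) fraction of the retained z-mass.
    """
    zmax = max(z_values)
    p = 1 + 2 * math.ceil(math.log2(max(H, 2)))
    buckets = [[] for _ in range(p + 1)]
    for z in z_values:
        if z <= 0:
            continue                              # zeroed-out pairs are skipped
        a = min(int(math.floor(math.log2(zmax / z))), p)
        buckets[a].append(z)
    return max(buckets, key=sum)
```

For example, with weights [8, 8, 1] and H = 4, the two weight-8 pairs land in class a = 0 and dominate, so that class is returned.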
Finally, we create the following instance of the Cost-Distance problem. The terminal set contains all the terminals that are matched in C_a, and the goal is to connect them to the root. Set the values of the variables f̂_P = f*_P/Z_a and x̂_e = x*_e/Z_a. These settings of the variables satisfy the LP defined by [5] for the instance defined above. The integrality gap for this LP is O(log n), and so we get a solution with cost at most O(log n) · LP*/Z_a. However, this connects up |C_a| = Ω(M/(Z_a log H)) pairs, and hence the density is O(log H · log n) · LP*/M, proving the theorem. □
Reduction from Single-Sink Cost-Distance

Theorem 14. If there is a polynomial time α-approximation algorithm for the two-stage stochastic Steiner tree problem, then there is a polynomial time α-approximation algorithm for the single-source cost-distance problem.
The hardness result of Theorem 2 follows by combining the above reduction with a result of Chuzhoy et al. [6] that the single-source cost-distance problem cannot be approximated to a ratio better than Ω(log log n) under complexity-theoretic assumptions.
Proof of Theorem 14. Consider an instance of the Cost-Distance problem: we are given a graph G = (V, E), a root vertex r, and a set S of terminals. Each edge e has a buying cost b_e and a rental cost c_e. A solution specifies a set of edges E′ which spans the root and all the nodes in S: if the shortest path in (V, E′) from u ∈ S to r is P_u, then the cost of the solution is b(E′) + Σ_{u∈S} c(P_u). We take any edge with buying cost b_e and rental cost c_e, and subdivide it into two edges, giving the first of these edges a buying cost of b_e and rental cost ∞, while the other edge gets buying cost ∞ and rental cost c_e.
We reduce this instance to the two-stage stochastic Steiner tree problem where the scenarios are explicitly specified. The instance of the stochastic Steiner tree problem has the same graph. There are |S| scenarios (each with probability 1/|S|), where each scenario has exactly one unique demand from S. For an edge e which can only be bought, we set c_M(e) = b_e and c_T(e) = ∞; hence any such edge must necessarily be bought on Monday, if at all. For an edge e which can only be rented, we set c_M(e) = c_T(e) = |S| · c_e; note that there is no advantage to buying such an edge on Monday, since we can buy it on Tuesday for the same cost if needed. In the rest of the proof, we will assume that any optimal solution is lazy in this way.
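The construction in this proof can be sketched as instance-building code (the vertex naming and data layout are our own illustrative choices):

```python
def cost_distance_to_stochastic(edges, terminals):
    """Sketch of Theorem 14's reduction: subdivide each cost-distance edge
    (u, v, b, c) into a buy-only half and a rent-only half, then emit a
    stochastic Steiner tree instance with one single-terminal scenario per
    terminal. float('inf') stands in for forbidden costs.
    """
    INF = float('inf')
    S = len(terminals)
    new_edges, cM, cT = [], {}, {}
    for i, (u, v, b, c) in enumerate(edges):
        mid = ('mid', i)                       # subdivision vertex
        e1, e2 = (u, mid), (mid, v)
        new_edges += [e1, e2]
        cM[e1], cT[e1] = b, INF                # buy-only half: Monday or never
        cM[e2] = cT[e2] = S * c                # rent-only half: same price both days
    scenarios = [({t}, 1.0 / S) for t in terminals]
    return new_edges, cM, cT, scenarios
```

The |S| scaling on the rent-only halves cancels against the 1/|S| scenario probabilities, so the expected second-stage cost matches the rental term of the Cost-Distance objective.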
It can now be verified that there is an optimal solution to the Stochastic Steiner Tree problem where the subset F of edges bought in the first stage is only of the former type, and we then have to buy the "other half" of these first-stage edges to connect to the root in the second stage, hence resulting in isomorphic optimal solutions. □

Summary and Open Problems
In this paper, we gave a poly-logarithmic approximation algorithm for the stochastic Steiner tree problem in the non-uniform inflation model. Several interesting questions remain open. When working in the black-box model, we apply the scenario reduction method of Charikar et al. [2], causing the resulting number of scenarios N to be a polynomial function of the parameter λ, which is bounded by the maximum inflation factor on any edge. Hence our running time now depends on λ, and the approximation ratio depends on log λ. Can we get results where these measures depend only on the number of nodes n, and not on the number of scenarios N?
In another direction, getting approximation algorithms with similar guarantees for (a) the stochastic Steiner Forest problem, i.e., where each scenario is an instance of the Steiner forest problem, or (b) the k-stage stochastic Steiner tree problem, remains open.

Lemma 2. An α-approximation for the Group-Cost-Distance problem implies an O(α)-approximation for the non-uniform Stochastic Steiner tree problem.

Lemma 5 (Reducing Group-Cost-Distance to Pairing Cost-Distance). If there is an algorithm A for Pairing Cost-Distance that returns a solution F with cost′(F) ≤ α · cost(OPT), we get an O(α log n)-approximation for the Group-Cost-Distance problem.

Theorem 7 (Main Result for Pairing Cost-Distance). There is an α = O(β_EEST log^2 H · log n) pseudo-approximation algorithm for the Pairing Cost-Distance problem, where H = max{Σ_i |S_i|, n} and Σ_i |S_i| is the total number of terminals in the Pairing Cost-Distance instance.

Definition 8 (f-dense Partial PCD solution). Consider an instance I of the Pairing Cost-Distance problem: a Partial PCD solution P is called f-dense if its density cost′(P)/|P| is at most f · cost(OPT)/H(I).

Theorem 11. Given an instance of Pairing Cost-Distance on G = (V, E), there exists a solution F to this instance such that (a) the edges in F induce a forest, (b) F is a subset of OPT′ and hence the buying part of cost′(F) ≤ b(OPT′), and (c) the renting part of cost′(F) is at most O(β_EEST) times the renting part of cost′(OPT′).

(LP1)  minimize  Σ_{e∈E} b_e x_e + Σ_{u∈S} Σ_{P∈P_u} c(P) f_P
       subject to  Σ_i Σ_{u,v∈S_i} z_uv = M
                   Σ_{P∈P_u: P∋e} f_P ≤ x_e  for all e ∈ E, u ∈ S
                   Σ_{P∈P_u} f_P ≥ Σ_{v∈S_i} z_uv  for all u ∈ S_i, i ∈ [1..N]
                   Σ_{v∈S_i} z_uv ≤ 1  for all u ∈ S_i, i ∈ [1..N]
                   x_e, f_P, z_uv = z_vu ≥ 0