The Pipelined Set Cover Problem

Consider the following generalized min-sum set cover or multiple intents re-ranking problem proposed by Azar et al. (STOC 2009). We are given a universe of elements and a collection of subsets, with each set S having a covering requirement of K(S). The objective is to pick one element at a time such that the average covering time of the sets is minimized, where the covering time of a set S is the first time at which K(S) elements from it have been selected. There are two well-studied extreme cases of this problem: (i) when K(S) = 1 for all sets, we get the min-sum set cover problem, and (ii) when K(S) = |S| for all sets, we get the minimum-latency set cover problem. Constant factor approximations are known for both these problems. In their paper, Azar et al. considered the general problem and gave a logarithmic approximation algorithm for it. In this paper, we improve their result and give a simple randomized constant factor approximation algorithm for the generalized min-sum set cover problem.


Introduction
The min-sum set cover problem is a min-latency version of the well-known set cover problem; for ease of exposition we will consider the equivalent hitting set formulation of the set cover problem. Here, we are given a universe $U$ of $n$ elements and a collection $\mathcal{S} = \{S_1, S_2, \ldots, S_m\}$ of subsets with $S_i \subseteq U$, and the objective is to select one element at a time (i.e., find a linear ordering of the elements) such that the average hitting (or "cover") time of the sets is minimized. Formally, we pick one element at every time instant: if an element $e$ is picked at time $t$, its cover time is $\mathrm{Cov}(e) = t$; the cover time of a set $S$ is $\mathrm{Cov}(S) = \min_{e \in S} \mathrm{Cov}(e)$, and the goal is to minimize $\sum_{S \in \mathcal{S}} \mathrm{Cov}(S)$. The greedy algorithm that repeatedly picks the element covering the largest number of uncovered sets is known to be a 4-approximation for this problem [BNBH+98, FLT04], and this is the best possible unless P=NP [FLT04]. A problem that is similar in spirit is the min-latency set cover problem, where the cover time of a set $S$ is $\mathrm{Cov}(S) = \max_{e \in S} \mathrm{Cov}(e)$, the time at which all the elements of the set have been selected. This problem also admits a constant factor approximation algorithm [HL05]. In fact, this problem easily reduces to that of precedence-constrained scheduling on a single machine, for which a 2-approximation is known using various techniques [HSSW97, MQW03, CM99].
A substantial generalization of these two problems was offered recently by Azar, Gamzu and Yin [AGY09]: the multiple intents re-ranking problem or the generalized min-sum set cover problem (GenMSSC). Here each set $S \in \mathcal{S}$ also comes with a covering requirement $K(S) \in \{1, 2, \ldots, |S|\}$, and its cover time is defined to be the time at which $K(S)$ elements from $S$ are selected: $\mathrm{Cov}(S) = \min\{t : |\{e \in S : \mathrm{Cov}(e) \le t\}| \ge K(S)\}$. The goal is to minimize $\sum_{S \in \mathcal{S}} \mathrm{Cov}(S)$. Note that we get the min-sum set cover problem if we set $K(S) = 1$ for all sets $S \in \mathcal{S}$, and the min-latency set cover problem if we set $K(S) = |S|$ for all $S \in \mathcal{S}$. Azar et al. [AGY09] gave an $O(\ln r)$-approximation algorithm for this problem via a modified greedy algorithm, where $r$ is the size of the largest set in $\mathcal{S}$, and left open the question of obtaining a constant factor approximation for the problem. We resolve that question in this paper.
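To make the objective concrete, the following minimal Python sketch evaluates the cover times of a given ordering. The function names and the dictionary-based instance encoding are illustrative assumptions, not notation from [AGY09]; the ordering is assumed to list every element of the universe exactly once.

```python
from typing import Dict, Hashable, Sequence, Set


def cover_times(order: Sequence[Hashable],
                sets: Dict[str, Set[Hashable]],
                req: Dict[str, int]) -> Dict[str, int]:
    """Cov(S) for every set: the time at which the K(S)-th element of S is picked."""
    # Cov(e) = t if element e is the t-th element of the ordering (times start at 1).
    cov = {e: t for t, e in enumerate(order, start=1)}
    result = {}
    for name, members in sets.items():
        times = sorted(cov[e] for e in members)
        result[name] = times[req[name] - 1]  # time of the K(S)-th selected element
    return result


def total_cover_time(order, sets, req) -> int:
    """The GenMSSC objective: the sum of Cov(S) over all sets."""
    return sum(cover_times(order, sets, req).values())


sets = {"S1": {"a", "b"}, "S2": {"b", "c"}}
order = ["b", "a", "c"]
min_sum = total_cover_time(order, sets, {"S1": 1, "S2": 1})      # K(S) = 1
min_latency = total_cover_time(order, sets, {"S1": 2, "S2": 2})  # K(S) = |S|
```

With $K(S) = 1$ the routine evaluates the min-sum set cover objective, and with $K(S) = |S|$ the min-latency objective, matching the two extreme cases described above.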
Theorem 1.1. The generalized min-sum set cover problem (a.k.a. the multiple intents re-ranking problem) admits a randomized 485-approximation algorithm.
Our approach is based on formulating a strengthened LP relaxation for the problem, obtained by adding the so-called "knapsack-cover inequalities" [CFLP00] to the natural LP relaxation. This is necessary as one can construct examples (see Section 6.2) where the natural LP has an unbounded integrality gap. We then use a simple stage-based randomized rounding scheme which works as follows. We consider exponentially increasing prefixes of time, and round the (fractional) assignments in these prefixes to obtain partial orderings. Then, we combine these partial orderings into a single ordering. For any set $S$, our rounding guarantees an expected cover time of $O(t_S)$, where $t_S$ is its cover time in the LP relaxation.
1.1 Related Work
The fact that the greedy algorithm is a constant-factor approximation algorithm for min-sum set cover was implicit in the work of Bar-Noy et al. [BNBH+98], and was made explicit in papers by Feige et al., who also simplified the proofs, both in the conference version [FLT02], and then further in the journal version [FLT04]. They also showed that the 4-approximation is the best possible unless P=NP. Other variants of this problem have been studied in different contexts, such as when the set coverage is probabilistic (stochastic) [CFK03], or when the cost of a set depends on the set of uncovered elements at the time when it is picked [MBMW05].
At the other end of the spectrum is the min-latency set cover problem. This was formally studied by Hassin and Levin [HL05], who gave an e-approximation for the problem via techniques similar to those for the min-latency tour, a.k.a. the traveling repairman problem. Subsequently, they observed that min-latency set cover can be modeled as a special case of the classic precedence-constrained scheduling problem $1|\mathrm{prec}|\sum_j w_j C_j$, for which several 2-approximation algorithms are known using a variety of different techniques (see, e.g., [CK04, KSW99] for surveys). This special case corresponds to the so-called "bipartite constraints" case, where there are two types of jobs $J_1$ and $J_2$. All jobs in $J_1$ have $w_j = 0$, $p_j = 1$ (these correspond to elements), all jobs in $J_2$ have $w_j = 1$, $p_j = 0$ (these correspond to sets $S_j \subset J_1$), and the precedence constraints have the form that each job $j \in J_2$ must be preceded by the jobs $S_j \subset J_1$. To see the equivalence to the min-latency set cover problem, note that any valid schedule is just an ordering of jobs in $J_1$ (as jobs in $J_2$ have size 0). Moreover, only jobs in $J_2$ contribute to completion time (as jobs in $J_1$ have weight 0), and being of size 0, a job in $J_2$ can be assumed to be completed immediately after its preceding jobs in $J_1$ have been scheduled. Woeginger [Woe03] showed that this special case (or equivalently the min-latency set cover problem) is as hard to approximate as the general $1|\mathrm{prec}|\sum_j w_j C_j$ problem. Recently it has been shown [BK09] that, assuming a variant of the Unique Games Conjecture, it is hard to approximate $1|\mathrm{prec}|\sum_j w_j C_j$, and hence min-latency set cover, to better than $2 - \epsilon$ for any $\epsilon > 0$.
Multiple Intents Re-Ranking: The multiple intents re-ranking problem was introduced by Azar et al. [AGY09]. In this problem, each set $S$ has a weight vector $w_S$ of length $|S|$, and if the elements of the set are output at times $\tau_S = (t_1, t_2, \ldots, t_{|S|})$ where $t_1 < t_2 < \cdots < t_{|S|}$, then the cost of the set is $w_S \cdot \tau_S$; the goal is to find an ordering of the elements that minimizes the sum of these costs $\sum_{S \in \mathcal{S}} w_S \cdot \tau_S$. (However, as noticed in that paper, by making copies of sets, one can equivalently imagine each set $S$ to have a single requirement $K(S)$, and we are charged for the first time at which $K(S)$ elements from $S$ have been chosen; i.e., the model we use.) They showed that if all the weight vectors were increasing or decreasing, one could get constant factor approximations, even though the naïve greedy algorithm could be arbitrarily bad. They then gave an $O(\log r)$-approximation for a greedy-like algorithm using a clever harmonic interpolation idea; here $r$ is the size of the largest set in the set system. However, we can show (see Section 6.1) that their algorithm cannot give a constant-factor approximation for the general problem.

Min-sum Set Cover and GenMSSC
A key difference between min-sum set cover and the generalized version of the problem can be illustrated by looking at the max-coverage variants of both these problems. In the max-coverage problem, given a bound $k$, the goal is to choose $k$ elements that maximize the number of sets hit. While it is known that the greedy algorithm is a $(1 - 1/e)$-approximation algorithm for this problem, the max-coverage variant of the generalized problem becomes Dense-K-Subgraph hard even for the case when a set is covered when 2 of its elements are selected. Indeed, given a graph $G$, consider the following instance of GenMSSC: the elements are the vertices, and the sets are the edges. Each set $e = \{u, v\}$ has a covering requirement $K(e) = 2$. Clearly, the set of $k$ elements/vertices which "hits" the largest number of sets/edges is the collection of $k$ vertices which induces the largest number of edges. Therefore, the max-coverage version of GenMSSC is as hard as the Dense-K-Subgraph problem.
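The reduction above can be sketched in a few lines of Python. The helper names and the edge-list input format are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Set, Tuple


def graph_to_instance(edges: Iterable[Tuple[int, int]]):
    """Elements are vertices; each edge {u, v} becomes a set with requirement 2."""
    sets: Dict[str, Set[int]] = {f"e{idx}": {u, v} for idx, (u, v) in enumerate(edges)}
    req = {name: 2 for name in sets}
    return sets, req


def num_covered(chosen: Set[int], sets, req) -> int:
    """Number of sets S with at least K(S) of their elements among the chosen ones."""
    return sum(1 for name, members in sets.items()
               if len(members & chosen) >= req[name])


# A triangle on {1, 2, 3} with a pendant edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
sets, req = graph_to_instance(edges)
triangle_cover = num_covered({1, 2, 3}, sets, req)
```

Here `num_covered(chosen, ...)` equals the number of edges induced by the chosen vertices, which is exactly the Dense-K-Subgraph objective.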
Hence, while one can get constant factor approximations for the min-sum set cover problem by solving the max-coverage problem for bounds of $2^i$ (for $1 \le i \le \lceil \log n \rceil$) and combining these solutions to get a global linear ordering, naïvely using this approach would fail for the GenMSSC problem. (Hassin and Levin [HL05] use the max-coverage approach differently for their e-approximation, and it would be interesting to see if that approach can be extended to work for GenMSSC.) Our approach is based on a variation of this idea.
In particular, we use the following observation, which suffices for our purposes even though it is too weak to yield a useful guarantee for max-coverage. Consider the LP formulation for the max-coverage instance given a bound $k$, strengthened by adding the knapsack cover inequalities. Let $\ell$ denote the number of sets which are covered fractionally to an extent of at least 1/2 (or any constant) in an optimal fractional solution. Then the solution obtained by applying a round of randomized rounding (to the LP solution scaled by a suitable constant factor) covers $\Omega(\ell)$ sets. At a high level, it is this observation that forms the basis of our algorithm and its analysis. We next describe the details.

An LP Relaxation
Let $[n] = \{1, 2, \ldots, n\}$, where $n = |U|$ is the number of elements in the universe. In the LP relaxation given in Figure 3, $x_{et}$ is the indicator variable for whether element $e \in U$ is selected at time $t \in [n]$, and $y_{St}$ is the indicator variable for whether set $S$ has been covered before time $t \in [n]$. If $x_{et}$ and $y_{St}$ are restricted to only take values 0 or 1, then this is easily seen to be a valid formulation for the problem. In particular, constraints (3.1) require that only one element can be assigned to a time slot and constraints (3.2) require that each element must be assigned some time slot. Constraints (3.3) correspond to the knapsack cover constraints and require that if $y_{St} = 1$, then for every subset of elements $A$, at least $K(S) - |A|$ elements must be chosen from the set $S \setminus A$ before time $t$. As a consequence, we get that $y_{St}$ can be 1 if and only if there have been $K(S)$ elements picked from $S$ before time $t$. Therefore, the set would incur an LP cost of exactly the cover time of the set in the integral ordering (since the term $(1 - y_{St})$ would keep contributing 1 to the LP objective until the set has been covered).
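For reference, the relaxation can be written out from the description above. The following is a reconstruction from the prose rather than a copy of the figure; in particular, writing (3.1) and (3.2) as equalities and quantifying (3.3) over all $A \subseteq S$ are assumptions consistent with the text.

```latex
\begin{align*}
\min \quad & \sum_{t \in [n]} \sum_{S \in \mathcal{S}} (1 - y_{St}) \\
\text{s.t.} \quad
  & \sum_{e \in U} x_{et} = 1
      && \forall\, t \in [n] \tag{3.1} \\
  & \sum_{t \in [n]} x_{et} = 1
      && \forall\, e \in U \tag{3.2} \\
  & \sum_{e \in S \setminus A} \sum_{t' < t} x_{et'} \;\ge\; (K(S) - |A|)\, y_{St}
      && \forall\, S \in \mathcal{S},\ A \subseteq S,\ t \in [n] \tag{3.3} \\
  & x_{et},\; y_{St} \in [0, 1]
      && \forall\, e \in U,\ S \in \mathcal{S},\ t \in [n]
\end{align*}
```

Taking $A = \emptyset$ in (3.3) gives the natural covering constraint $\sum_{e \in S} \sum_{t' < t} x_{et'} \ge K(S)\, y_{St}$; the remaining choices of $A$ are the knapsack cover strengthening.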
Let Opt denote any optimal solution of the given GenMSSC instance (and, with a slight abuse of notation, also its cost), and let LPOpt denote the cost of an optimal LP solution. From the above discussion, the LP is a valid relaxation and hence we have that LPOpt ≤ Opt.

The Rounding Algorithm
Let $(x^*, y^*)$ denote the optimal LP solution. Our rounding algorithm proceeds in $O(\log n)$ stages, with the $i$-th stage operating in the time interval $[1, 2^i)$. In stage $i$, we perform one round of randomized rounding (as described below) on the fractional solution restricted to the interval $[1, 2^i)$ and obtain a set $O_i$ of elements. At the conclusion of these stages, we output the elements of $O_1$, followed by the elements of $O_2, O_3, \ldots, O_{\lceil \log n \rceil}$, with the elements of any set $O_j$ being output in an arbitrary order. (Of course, we should only keep the first occurrence of any element in the final output, but imagining elements to potentially be output multiple times will be easier for the analysis.) The rounding process for stage $i$ that generates the set $O_i$ is the following:
Algorithm 1 Randomized Rounding for stage $i$
1: let $t_i = 2^i$.
2: let $z_{e,i} \leftarrow \sum_{t' < t_i} x^*_{et'}$ be the fractional extent to which $e$ is selected before time $t_i$, for each $e \in U$.
3: let $p_{e,i} \leftarrow \min(1, 8 z_{e,i})$ for all $e \in U$.
4: mark each element $e \in U$ independently with probability $p_{e,i}$.
5: let $O_i$ be the set of marked elements.
6: if $|O_i| > 16 \cdot 2^i$ then drop all but $16 \cdot 2^i$ elements from $O_i$.
7: return $O_i$.
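The stage-based rounding can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the LP solution is assumed to be given as a dictionary `x` mapping each element to its row of values $x^*_{et}$ for $t = 1, \ldots, n$ (solving the LP is not shown), the names `round_stage` and `round_lp` are hypothetical, and elements that are never marked in any stage are appended at the end, a fallback that is not part of Algorithm 1.

```python
import math
import random
from typing import Dict, Hashable, List, Optional


def round_stage(x: Dict[Hashable, List[float]], i: int,
                rng: random.Random) -> List[Hashable]:
    """One run of Algorithm 1; x[e][t - 1] holds the LP value x*_{et}."""
    t_i = 2 ** i
    marked = []
    for e, row in x.items():
        z = sum(row[: t_i - 1])   # z_{e,i}: extent of e at times t' < t_i
        p = min(1.0, 8.0 * z)     # p_{e,i}
        if rng.random() < p:      # mark e independently with probability p
            marked.append(e)
    return marked[: 16 * t_i]     # step 6: keep at most 16 * 2^i elements


def round_lp(x: Dict[Hashable, List[float]],
             rng: Optional[random.Random] = None) -> List[Hashable]:
    """Concatenate O_1, O_2, ..., keeping only first occurrences."""
    rng = rng or random.Random(0)
    n = len(x)
    stages = max(1, math.ceil(math.log2(n)))
    order, seen = [], set()
    for i in range(1, stages + 1):
        for e in round_stage(x, i, rng):
            if e not in seen:
                seen.add(e)
                order.append(e)
    # Assumption of this sketch: never-marked elements go last, in input order.
    order.extend(e for e in x if e not in seen)
    return order


# An integral "LP solution": a, b, c, d are scheduled at times 1, 2, 3, 4.
x_int = {"a": [1, 0, 0, 0], "b": [0, 1, 0, 0], "c": [0, 0, 1, 0], "d": [0, 0, 0, 1]}
rounded = round_lp(x_int)
```

On an integral solution every marking probability is 0 or 1, so the output is deterministic: stage 1 outputs `a`, stage 2 adds `b` and `c`, and `d` is placed by the fallback.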

The Analysis
In the interests of expositional simplicity, we have not tried to optimize the constants in our analysis.
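Observation 5.1. The fractional coverage of the sets is monotonically non-decreasing. That is, $y^*_{St} \le y^*_{St'}$ for all sets $S \in \mathcal{S}$ and $1 \le t \le t' \le n$.
For each set $S$, let $t^*_S$ denote the last time $t$ at which $y^*_{St} \le \frac{1}{2}$. By Observation 5.1, the set $S$ contributes at least $\frac{1}{2}$ to the LP objective at every time $t \le t^*_S$, which gives the following.
Fact 5.1. $\sum_{S \in \mathcal{S}} t^*_S \le 2\,\mathrm{LPOpt}$.
Lemma 5.1. Consider a set $S$ and a stage $i$ such that $t^*_S \in [1, t_i)$. The probability that fewer than $K(S)$ elements from $S$ are marked in stage $i$ is at most $e^{-9/8}$.
Proof. Consider any set $S$, and let $S_g = \{e \in S \mid z_{e,i} \ge 1/8\}$. By the choice of $p_{e,i}$ in step 3 of our rounding procedure, we know that all elements in $S_g$ are definitely marked in stage $i$, and any element $e \in S \setminus S_g$ is independently marked with probability $8 z_{e,i}$. Thus, if $|S_g| \ge K(S)$, then clearly the lemma holds, so we consider the case when $|S_g| < K(S)$. Recall that we are considering a set $S$ and stage $i$ such that $t^*_S \in [1, t_i)$; since $t^*_S$ was the last time $t$ at which $y^*_{St} \le \frac{1}{2}$, it follows that $y^*_{S t_i} > \frac{1}{2}$. Hence, setting $A = S_g$ in the knapsack cover constraint (3.3) for time $t_i$, we get
$\sum_{e \in S \setminus S_g} z_{e,i} \ge (K(S) - |S_g|)\, y^*_{S t_i} > \frac{1}{2}(K(S) - |S_g|)$.
Therefore, the expected number of elements from $S \setminus S_g$ marked in stage $i$ is $\sum_{e \in S \setminus S_g} 8 z_{e,i} \ge 4(K(S) - |S_g|)$. Since these elements are marked independently of each other, we can use the following Chernoff bound [MR95] (Theorem 4.2): if $X_1, X_2, \ldots, X_n$ are independent $\{0, 1\}$-valued random variables with $X = \sum_i X_i$ such that $E[X] = \mu$, then for any $0 < \beta \le 1$,
$\Pr[X < (1 - \beta)\mu] \le \exp(-\beta^2 \mu / 2)$.
For our application, since we have $\mu \ge 4(K(S) - |S_g|) \ge 4$, we can substitute $\beta = \frac{3}{4}$ and bound the tail probability that fewer than $K(S) - |S_g|$ elements are marked from $S \setminus S_g$ by $\exp(-\frac{(3/4)^2}{2} \cdot 4) = e^{-9/8}$. As the elements in $S_g$ are all marked with probability 1, it follows that the probability that fewer than $K(S)$ elements are marked from $S$ is also at most $e^{-9/8}$.
Lemma 5.2. The probability that any elements are dropped in step 6 is at most $e^{-6}$.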
Proof. To show this, we use the following concentration inequality [BLM00] (Theorem 1, Remark 3): if $X_1, X_2, \ldots, X_n$ are independent $\{0, 1\}$-valued random variables with $X = \sum_i X_i$ such that $E[X] = \mu$, then for any $\beta > 0$,
$\Pr[X > \mu + \beta] \le \exp\left(-\frac{\beta^2}{2\mu + 2\beta/3}\right)$.
In our setting, since the probability with which an element is picked in $O_i$ is at most 8 times the extent to which it was scheduled in $[1, 2^i)$ by the fractional LP solution, the expected number of elements picked in $O_i$ (i.e., $\mu$) is at most $8 \cdot 2^i$. Therefore, by substituting $\beta = 8 \cdot 2^i$ and $\mu \le 8 \cdot 2^i$ in the above inequality, we get that the probability of picking more than $16 \cdot 2^i$ elements is at most $\exp\left(-\frac{64 \cdot 2^{2i}}{(64/3) \cdot 2^i}\right) = \exp(-3 \cdot 2^i) \le \exp(-6)$.
We now bound the cover time of the set $S$ for the above algorithm.
Lemma 5.3. The expected cover time of any set $S$ in the ordering output by the algorithm is at most $\frac{64e}{e-2}\, t^*_S$.
Proof. Let $\mathrm{Cov}_{Alg}(S)$ denote the cover time of set $S$ with respect to the ordering output by our algorithm. For ease of analysis, we will consider a set $S$ to be covered in some stage $i$ only if $t^*_S \in [1, t_i)$ and, moreover, the set $O_i$ returned is not truncated. Note that if the set $S$ is actually covered with any of these criteria not met, its cover time only improves. Let $E_{iS}$ denote the event that set $S$ is first covered in stage $i$ under this modified notion of coverage. Then we have
$E[\mathrm{Cov}_{Alg}(S)] \le \sum_{i = \lceil \log t^*_S \rceil}^{\lceil \log n \rceil} 32 \cdot 2^i \cdot \Pr[E_{iS}]$ (5.6)
since if $S$ is covered in stage $i$, its cover time is at most $\sum_{j=1}^{i} 16 \cdot 2^j \le 32 \cdot 2^i$. Also, we know that any set will certainly be covered by stage $\lceil \log n \rceil$ because the matching constraints (3.1) and (3.2) would ensure that each element is picked to an extent of 1 by time $n$. Now, the event $E_{iS}$ that a set $S$ is first covered in stage $i$ is contained in the event that $S$ is not covered in stages $\lceil \log t^*_S \rceil, (\lceil \log t^*_S \rceil + 1), \ldots, (i - 2)$, and $(i - 1)$. But for any $i$, the event that $S$ is not covered in stage $i$ occurs only when either
1. $K(S)$ elements from $S$ were not picked in $O_i$, or
2. $O_i$ was truncated in step 6.
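The former event happens with probability at most $e^{-9/8}$ from Lemma 5.1, and the latter event happens with probability at most $e^{-6}$ from Lemma 5.2. Hence, the probability that $S$ is not covered in any fixed stage is at most $e^{-9/8} + e^{-6} < e^{-1}$. Since the stages are rounded independently, we have
$\Pr[E_{iS}] \le e^{-(i - \lceil \log t^*_S \rceil)}$.
Plugging this into equation (5.6), we get
$E[\mathrm{Cov}_{Alg}(S)] \le \sum_{i \ge \lceil \log t^*_S \rceil} 32 \cdot 2^i \cdot e^{-(i - \lceil \log t^*_S \rceil)} = 32 \cdot 2^{\lceil \log t^*_S \rceil} \sum_{j \ge 0} (2/e)^j \le \frac{64e}{e-2}\, t^*_S$,
where the last step uses $2^{\lceil \log t^*_S \rceil} \le 2 t^*_S$ and $\sum_{j \ge 0} (2/e)^j = \frac{e}{e-2}$. By linearity of expectation, the expected covering time of all the sets is at most $\frac{64e}{e-2} \sum_S t^*_S$, which by Fact 5.1 is at most $\frac{128e}{e-2}\,\mathrm{LPOpt} \le 485\,\mathrm{LPOpt}$. This completes the proof of Theorem 1.1.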
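The numerical constants in this argument are easy to check mechanically. The short Python snippet below is only a sanity check of the arithmetic; the variable names are illustrative.

```python
import math

# Probability that a set is not covered in a fixed stage (Lemmas 5.1 and 5.2).
p_uncovered = math.exp(-9 / 8) + math.exp(-6)

# Geometric series: sum over j >= 0 of (2/e)^j equals e / (e - 2).
geometric = 1 / (1 - 2 / math.e)

# Per-set bound 64e/(e-2) relative to t*_S, and the final factor relative to LPOpt.
per_set = 64 * geometric
factor = 2 * per_set

print(f"p_uncovered = {p_uncovered:.4f}, e^-1 = {math.exp(-1):.4f}")
print(f"approximation factor = {factor:.2f}")
```

The per-stage failure probability stays below $e^{-1}$, and the final factor $\frac{128e}{e-2}$ lies between 484 and 485, consistent with the bound stated in Theorem 1.1.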
6 Two Examples
6.1 A Bad Example for Harmonic Interpolation-Based Greedy
We now give an example where the algorithm of [AGY09] has an approximation ratio of $\Omega(\sqrt{\log n})$ for the multiple intents re-ranking problem.
(See Section 1.1 for the problem definition; recall that it is equivalent to our problem.) Consider the following set system: the universe is $U = \{a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_t\}$. There is a "large" set $S_0 = \{a_1, a_2, \ldots, a_n\}$ with a weight vector $w_{S_0} = (1, 1, \ldots, 1)$. There are also $t$ other "small" sets $S_1, S_2, \ldots, S_t$, with the set $S_i = \{b_i\}$ having a weight vector $w_{S_i} = (H_{n/2})$. Here, $H_n$ is the $n$-th harmonic number, defined by $H_n = 1 + 1/2 + \ldots + 1/n$.
Consider the ordering $b_1, b_2, \ldots, b_t, a_1, a_2, \ldots, a_n$ of the elements (henceforth called order A). In this ordering, any small set $S_i$ would incur a (weighted) cost of exactly $i \cdot H_{n/2}$, while the large set incurs a cost of $(t + 1) + (t + 2) + \ldots + (t + n) = nt + \Theta(n^2)$. Therefore, the total weighted cost of such an order would be $O(t^2 H_{n/2} + nt + n^2)$. The harmonic interpolation method [AGY09] was to replace each weight vector $w = (w_1, w_2, \ldots, w_l)$ by a new "harmonic" weight vector $w' = (w'_1, \ldots, w'_l)$, where $w'_j = \sum_{j' \ge j} w_{j'} \cdot \frac{1}{j' - j + 1}$, and then run the greedy algorithm on these new weight vectors. Note that the harmonic weight vector for the small sets remains the same as the original weight vector, but the one for the large set changes to $w'_{S_0} = (H_n, H_{n-1}, \ldots, H_{n/2}, \ldots, 1)$. Now the greedy algorithm would not pick any of the elements from $\{b_i : i \in [1, t]\}$ during the first $n/2$ time instants, since the weight vector for $S_0$ has larger values. Therefore, each of the small sets would incur a cost of at least $\frac{n}{2} H_{n/2}$, and the large set incurs a cost of $\Omega(n^2)$. As a result, the total cost for the instance under this ordering would be $\Omega(nt H_{n/2} + n^2)$. Setting $t = n (\log n)^{-1/2}$, we see that the cost incurred by order A is $O(n^2)$ whilst the cost incurred by the harmonic algorithm is $\Omega(n^2 \cdot \sqrt{\log n})$, and this gives us an algorithmic gap of $\Omega(\sqrt{\log n})$.
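The harmonic weight transformation used in this example can be computed exactly with rational arithmetic. The sketch below implements only the formula $w'_j = \sum_{j' \ge j} w_{j'} / (j' - j + 1)$ as stated above; the function name is an assumption, and the greedy step of [AGY09] is not shown.

```python
from fractions import Fraction
from typing import List, Sequence


def harmonic_weights(w: Sequence) -> List[Fraction]:
    """Return w' with w'_j = sum over j' >= j of w_{j'} / (j' - j + 1)."""
    l = len(w)
    return [
        sum((Fraction(w[jp], jp - j + 1) for jp in range(j, l)), Fraction(0))
        for j in range(l)  # indices are 0-based here, so the offset is unchanged
    ]


# The all-ones vector of the large set S_0 becomes (H_n, H_{n-1}, ..., H_1).
large = harmonic_weights([1, 1, 1])

# A length-one weight vector (a small set) is left unchanged.
small = harmonic_weights([Fraction(3, 2)])
```

For the all-ones vector of length 3 this yields $(H_3, H_2, H_1) = (11/6, 3/2, 1)$, illustrating why the first $n/2$ entries of $w'_{S_0}$ are at least $H_{n/2}$.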
6.2 A Bad Example for the Standard LP Relaxation
Consider the LP relaxation without the knapsack cover inequalities for the GenMSSC problem. We now show that the integrality gap of this LP can be arbitrarily bad for large values of $K(S)$. Consider the following universe: $U = \{a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_l\}$. There are $l$ sets, with set $S_i = \{a_1, a_2, \ldots, a_n, b_i\}$ for each $i \in [l]$. Moreover, all the sets have a covering requirement of $K(S) = n + 1$. Note that this is just an instance of the min-latency set cover problem, since any set is covered only when all the elements contained in it are selected.
In this LP, the knapsack cover constraints (3.3) are replaced by the single constraint $K(S)\, y_{St} \le \sum_{e \in S} \sum_{t' < t} x_{et'}$ for each set $S$ and time $t$, together with $x_{et}, y_{St} \in [0, 1]$ for all $e \in U$, $S \in \mathcal{S}$, $t \in [n]$ (6.9). Consider the fractional solution that schedules $a_1, a_2, \ldots, a_n$ in the first $n$ time slots, followed by $b_1, b_2, \ldots, b_l$, and sets $y_{St} = \frac{1}{K(S)} \sum_{e \in S} \sum_{t' < t} x_{et'}$. We first analyze the LP cost of this assignment: clearly, with each element we pick from $\{a_1, a_2, \ldots, a_n\}$, the $(1 - y_{St})$ term would decrease by an additive $1/(n + 1)$; and this happens for each set, in each of the first $n$ time steps. Hence, each set would incur a cost of $1 + (1 - \frac{1}{n+1}) + (1 - \frac{2}{n+1}) + \ldots + (1 - \frac{n-1}{n+1})$ (roughly $n/2$) for the first $n$ time steps. After this, however, we cover one set at a time in time slots $(n+1), (n+2), \ldots, (n+l)$; but by this time each set has only a $1/(n+1)$ uncovered fraction which needs to incur any cost for this final covering step. Hence the total LP cost would be $\Theta(nl + \frac{1}{n+1}(nl + l^2))$. However, any integral solution might as well schedule the elements $\{a_1, a_2, \ldots, a_n\}$ before selecting the elements $\{b_1, b_2, \ldots, b_l\}$ one by one, giving a cost of $(n + 1) + (n + 2) + \ldots + (n + l) = nl + \Theta(l^2)$. Now setting $n = \sqrt{l}$ gives us an integrality gap of $\Omega(\sqrt{l})$. Notice that with the knapsack cover inequalities, the $y_{St}$ values cannot increase by $1/(n + 1)$ with each additional selected element. In fact, for this extreme case of min-latency, the above LP strengthened with the knapsack cover inequalities is equivalent to the time-indexed LP relaxation for precedence-constrained scheduling on a single machine, which has an integrality gap of 2.
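The two costs in this example can be evaluated exactly for concrete values of $n$ and $l$. The Python sketch below follows the cost accounting described in the text (each set pays $1 - k/(n+1)$ at the step where $k$ of the $a$-elements have already been picked, and then $1/(n+1)$ per step until its own $b_i$ is picked); the function names are illustrative assumptions.

```python
from fractions import Fraction


def lp_cost(n: int, l: int) -> Fraction:
    """Cost of the fractional solution: a_1..a_n first, then b_1..b_l."""
    # Each of the l sets pays 1 - k/(n+1) at time k+1, for k = 0..n-1.
    first = sum(1 - Fraction(k, n + 1) for k in range(n))
    # Set S_i then pays 1/(n+1) in each of the time slots n+1, ..., n+i.
    tail = sum(Fraction(i, n + 1) for i in range(1, l + 1))
    return l * first + tail


def integral_cost(n: int, l: int) -> int:
    """Cost of the ordering a_1..a_n, b_1..b_l: set S_i is covered at time n+i."""
    return sum(n + i for i in range(1, l + 1))


# With n = sqrt(l), the ratio grows like sqrt(l) (up to a constant factor).
gap = integral_cost(100, 10_000) / lp_cost(100, 10_000)
```

For $n = 100$ and $l = 10000$ the ratio is roughly $50 = \sqrt{l}/2$, in line with the $\Omega(\sqrt{l})$ gap claimed above.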

Closing Remarks
The proofs trivially extend to the case where each set $S$ also has a weight $w_S \in \mathbb{R}^+$, and the objective function is $\sum_S w_S\, \mathrm{Cov}(S)$.
The current approximation factor is a rather large constant, and it would be interesting to pin down the integrality gap of this LP relaxation better. For the two extreme cases of min-sum set cover and min-latency set cover, it is known that the integrality gap of this LP relaxation is 4 (see [FLT02] for a dual-fitting proof) and 2 [HSSW97], respectively. We have not tried to optimize the constants in this abstract; however, getting a substantially lower constant might require other ideas.