Secretary Problems: Weights and Discounts

The classical secretary problem studies the problem of selecting online an element (a "secretary") with maximum value in a randomly ordered sequence. The difficulty lies in the fact that an element must be either selected or discarded upon its arrival, and this decision is irrevocable. Constant-competitive algorithms are known for the classical secretary problem (see, e.g., the survey of Freeman [7]) and several variants. We study the following two extensions of the secretary problem:

• In the discounted secretary problem, there is a time-dependent "discount" factor d(t), and the benefit derived from selecting an element/secretary e at time t is d(t) · v(e). For this problem with arbitrary (not necessarily decreasing) functions d(t), we show a constant-competitive algorithm when the expected optimum is known in advance. With no prior knowledge, we exhibit a lower bound of Ω(log n / log log n), and give a nearly matching O(log n)-competitive algorithm.
• In the weighted secretary problem, up to K secretaries can be selected; when a secretary is selected (s)he must be irrevocably assigned to one of K positions, with position k having weight w(k), and assigning object/secretary e to position k has benefit w(k) · v(e). The goal is to select secretaries and assign them to positions to maximize $\sum_{e,k} w(k) \cdot v(e) \cdot x_{ek}$, where $x_{ek}$ is an indicator variable that secretary e is assigned position k. We give constant-competitive algorithms for this problem.

Most of these results can also be extended to the matroid secretary case (Babaioff et al. [2]) for a large family of matroids with a constant-factor loss, and an O(log rank) loss for general matroids. These results are based on a reduction from various matroids to partition matroids which presents a unified approach to many of the upper bounds of Babaioff et al. These problems have connections to online mechanism design (see, e.g., Hajiaghayi et al. [9]). All our algorithms are monotone, and hence lead to truthful mechanisms for the corresponding online auction problems.

Introduction
The classical secretary problem [5,7] captures the question of finding the element with the maximum value in an online fashion, when the elements are presented in a random order.
It is well known that waiting until one sees a 1/e fraction of the elements, and then picking the first element whose value exceeds the maximum value seen in that first 1/e fraction, gives an e-competitive algorithm, and this is the best possible. The problem is of interest due to its connections with online mechanism design: if we have a single good to sell and agents with varying valuations for that good arriving online (albeit in a random order), then the secretary problem captures the difficulty in picking the person with the largest valuation for that good [9,10]. In this case, the elements of the secretary problem are agents and the element value is the agent's value for the good; the goal of the mechanism designer is to maximize social welfare, or sell the good to the agent with the highest valuation. Another application is to modeling the economic decision facing an agent who wishes to select one of an online sequence of goods (e.g., an agent buying a house or a company hiring an employee). In this case, the elements are the goods and the element value is the value of the agent for the said good.

Given the above interpretations, there are certain natural cases which the secretary problem does not address: e.g., it does not capture the opportunity cost incurred due to delay in selecting an element. For example, when seeking to purchase a house, we might think of choosing a slightly suboptimal house at the beginning of the experiment (and being able to occupy it for the entire period) as being more desirable than a long wait to pick the most desirable house. We model such a problem as the discounted secretary problem, where we are given "discount" values d(t) for every time step t: the benefit derived from choosing an element with value v(e) at time t is the product d(t) · v(e). In this example, the discount function d(t) is a monotone decreasing function of t, but in general the discount function may be more complicated due to other considerations. For example, our financial situation may improve over time and waiting longer may get us a better mortgage rate, so our "discount" function d(t) may increase up to some point in time, and then decrease.

An orthogonal extension of the classical secretary problem is the weighted secretary problem. In this case, there are K heterogeneous goods {1, 2, ..., K}, with the kth good having a publicly known "weight" w(k) (with higher numbers indicating greater desirability), such that if agent e has an "intrinsic value" v(e), then assigning good k to agent e accrues an actual value of v(e) · w(k). Such "product valuations" are commonly assumed in industries like online banner advertising, where goods are banner advertising space, the weight of the good is its visibility or the number of people that are likely to see it, and the intrinsic value of the advertising company is the value it derives when one person sees its ad. Similarly, product valuations might be observed in hiring scenarios: e.g., a company may wish to hire sales managers for several regional markets of varying sizes. The weights of the goods (job positions, in this case) are the market sizes, and the value of a manager is his or her inherent ability to convert people's interest into actual sales. Again, if the elements arrive in a random order, and assigning good k to agent e accrues a benefit of v(e) · w(k), how should we choose K agents and assign goods to them to maximize the total expected benefit?

Our Results
Surprisingly, the discounted secretary problem is interesting even if we know the values of all the items in advance: given the discount function d(t), it is not a priori obvious which item the online algorithm should choose (we prove a constant lower bound, see Theorem 4.4). Our first theorem is the following. The assumption that the values of the elements are known holds, say, when the value is a function of the ordinal preference (e.g., the value of the top candidate among n candidates is n, the second-best candidate has value n − 1, and so on), which has been used in [14]. Alternatively, E[OPT] may be estimated by market research into prior experiments (which is a more realistic assumption in the mechanism design framework). Our next result shows that the knowledge assumption is essential for constant-competitive algorithms:

THEOREM 1.2. (DISCOUNTED SEC'Y: UNKNOWN-OPT) Any algorithm for discounted secretary has a (worst-case) competitive ratio of Ω(log n / log log n). Moreover, there is a nearly matching O(log n)-competitive algorithm.
For the weighted secretary problem, we show the following.

THEOREM 1.3. (WEIGHTED SEC'Y) There is a 4-competitive algorithm for the weighted secretary problem.

Footnote 3. E[OPT] is the expected benefit the optimal algorithm gets.

As the classical secretary problem is a special case of the weighted secretary problem when there is only one non-zero weight, there is clearly a lower bound of e for the weighted secretary problem. In the setting with both discounts and weights, we show that a combination of the above algorithms yields a nearly optimal result (since the Ω(log n / log log n) lower bound still holds).

THEOREM 1.4. (DISCOUNTED WEIGHTED SEC'Y) There is an O(log n)-competitive algorithm for the secretary problem with both weights on goods and discounts on times.
Finally, we consider the discounted and weighted versions of the matroid secretary problem [2], where the goal is to choose a set of items in order to maximize the total expected value, subject to the constraint that the chosen set is independent in a given matroid (see Section 5 for definitions). We show that the algorithms of Theorems 1.1, 1.2 and 1.3 can be extended to a large family of matroids (including uniform matroids, partition matroids, graphical matroids) with only a constant loss in the competitive ratio, and to all matroids with an O(log rank) loss in the competitive ratio. These results are based on reductions from various classes of matroids to partition matroids, which also give a unified approach to many of the upper bounds of [2], whilst improving some of them. For example, our techniques imply the following:

THEOREM 1.5. There is a 3e ≈ 8.15-competitive algorithm for the matroid secretary problem for graphical matroids. (The previous best known was a 16-competitive algorithm [2], and recently a 2e ≈ 5.44-competitive algorithm has been developed [11].)

While we state our results as algorithms, the setting which motivates us is actually an economic one in which the elements are strategic agents and their values are private information. In this case, it is important to consider the incentives facing agents in our proposed algorithms. Assuming single-parameter agents (i.e., agent e has value v(e) if he is picked by the algorithm, and 0 otherwise), it is well known that a mechanism is truthful (in dominant strategies) if it is bid monotonic (a winner would keep winning if she increases her bid). An alternative interpretation is that the mechanism presents the agent with a price that is independent of the agent's bid and the agent decides if she would like to win given the price. This is indeed the case for all our algorithms, thus one can interpret our algorithms as truthful mechanisms in which winners pay the threshold value they needed to bid in order to win.

Footnote 4. The competitive ratio that an algorithm achieves corresponds to the fraction of the social welfare that the truthful mechanism guarantees.

Related Work. The study of truthful online mechanism design in the competitive analysis framework was initiated by Lavi and Nisan [12], and many other papers followed this line of research, e.g., [13]. The classical secretary problem was first studied by Lindley [14] and by Dynkin in 1963 [5]. Since then, many variants have been studied (see [6,7] for a survey), including some in the computer science literature highlighting the connections to online mechanism design [9,10] and combinatorial preference structures [2,1]. Recently, Dimitrov and Plaxton [4] gave a 16-competitive algorithm for the secretary problem on transversal matroids, improving on a result of [2], and then Korula and Pál [11] improved the analysis to show that it was in fact 8-competitive. The discounted problem was studied previously in multiple contexts for specific "well-behaved" functions like $d(t) = \beta^t$ [16] or $d(t) = \sum_{i=t}^{n} \beta^t$ [15] for some fixed β < 1, whereas here we study it for general functions d(t). For the weighted case, Derman, Lieberman, and Ross [3] studied a version where there are the same number of goods as agents (so K = n) and the values of the agents are independently and identically distributed, as opposed to our setting where the values are arbitrary but arrive in random order. Similarly, a recent paper by Gershkov and Moldovanu [8] studied the variant where the values of the elements are independently and identically distributed and element arrivals are given by some renewal stochastic process, instead of the classical secretary assumption of discrete time with a uniformly random ordering. They further incorporated the discounted model into the weighted model (so the value of agent e for good k at time t is v(e) · w(k) · d(t)), and show how to maximize revenue as well as welfare in their setting.

Model, Preliminaries, and Notation
In the classical secretary problem, there is a universe U of secretaries/elements with |U| = n. Each secretary/element e ∈ U has an intrinsic value $v(e) \in \mathbb{R}_{\ge 0}$. An algorithm A for the secretary problem observes the elements of U in a random order and chooses one element $e_A$ in an online fashion. In other words, it must decide at the moment an element arrives whether or not to select it, and all decisions are irrevocable. The goal is to maximize the expected value $E[v(e_A)]$, the expectation being taken over the random order, as well as over the randomness in the algorithm, if any. In this paper, we extend the classical setting as follows:

Weighted Secretary Problems. Here, we want to choose K secretaries and match them to one of K positions. Each position k has a non-negative weight w(k) with w(1) ≥ w(2) ≥ ··· ≥ w(K), and each secretary e has an intrinsic value v(e). Our algorithm A must build online the assignment map $s_A : [K] \to U \cup \{\bot\}$, where we interpret $s_A(k) = e$ as meaning secretary e is assigned to position k, and $s_A(k) = \bot$ as meaning that no secretary has been assigned to the kth position. (Initially $s_A(k) = \bot$ for all k ∈ [K].) For ease of notation we extend the agent valuation function by letting $v(\bot) = 0$. When the secretary e arrives the algorithm finds out v(e); if it decides to choose secretary e, it must immediately and irrevocably assign e to some unassigned position k (and hence set $s_A(k) = e$). The goal of the algorithm is to output a final map $s_A$ that maximizes $E\left[\sum_{k=1}^{K} w(k) \cdot v(s_A(k))\right]$, where the expectation is over the random ordering and the random choices of the algorithm.

Discounted Secretary Problems. Here, the algorithm is given as input a discount function $d : [n] \to \mathbb{R}_{\ge 0}$ which maps "time" to "discounts". Again, the ground set is presented in random order: we formalize this as picking a bijective ordering function π : [n] → U uniformly at random from all bijective functions from time instants [n] to elements U, implying that element π(t) ∈ U appears at time instant t. In this problem, we want to choose an element $e_A$ in an online fashion to maximize the expected discounted value $E[d(\pi^{-1}(e_A)) \cdot v(e_A)]$. In the known-OPT model, the algorithm knows the expected value of the optimal offline solution ahead of time (for example, it is sufficient to assume the algorithm knows the valuations, see Lemma A.1); it just does not know the random permutation π. In the unknown-OPT model, the algorithm does not have any such prior knowledge about the input. For this problem, since the optimum offline solution is itself a random variable, a function of the random order π of arrivals, we consider the competitive ratio of an algorithm A to be the smallest value α such that $E[\mathrm{OPT}] / E[A] \le \alpha$.

It is well known that the following algorithm is e-competitive for the classical secretary problem: observe a (1/e) fraction of the elements without selecting any. In the remaining (1 − 1/e) fraction of elements, select the first element whose value is greater than all elements preceding it. In the remainder of the paper, we will refer to this algorithm as the classical secretary algorithm, and we will use it (and the sample-then-select intuition behind it) to design algorithms for the weighted and discounted cases.

Finally, all the secretary problems mentioned above can be extended to the matroid secretary case where, instead of picking a set S of one or K elements, we are given a matroid M = (U, I) and want to pick a set S ∈ I that maximizes the objective function. Details of this extension, and background on matroids, appear in Section 5.
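The sample-then-select rule of the classical secretary algorithm can be illustrated with a short simulation. This is only a sketch: the function names, the rounding of the sample size down to an integer, and the use of distinct integer values are assumptions made for illustration, not part of the paper.

```python
import math
import random


def classical_secretary(arrivals):
    """Apply the sample-then-select rule to values given in arrival order.

    Returns the selected value, or None if nothing is selected.
    """
    n = len(arrivals)
    sample_size = int(n / math.e)            # roughly a 1/e fraction (rounding down is an assumption)
    best_seen = max(arrivals[:sample_size], default=float("-inf"))
    for v in arrivals[sample_size:]:
        if v > best_seen:                    # first element beating everything that preceded it
            return v
    return None


def success_rate(n, trials, seed=0):
    """Empirical probability of selecting the maximum over random arrival orders."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        order = list(range(n))               # hypothetical distinct values 0..n-1
        rng.shuffle(order)
        if classical_secretary(order) == n - 1:
            hits += 1
    return hits / trials
```

For moderately large n the empirical success rate should be close to 1/e, in line with the e-competitiveness stated above.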

The Weighted Secretary Problem
Recall that the weighted secretary problem is to choose, in an online fashion, up to K secretaries and assign each one irrevocably to one of K positions (with weights w(1) ≥ w(2) ≥ ··· ≥ w(K)) so that no position has more than one secretary. The goal is to maximize the weighted value $\sum_{k=1}^{K} w(k) \cdot v(s(k))$, where s(k) is the secretary that is assigned to position k and v(s(k)) = 0 if position k is not filled by any secretary (i.e., $s(k) = \bot$). In this section, we show a constant-competitive algorithm for this problem. We assume, without loss of generality, that the values of the secretaries are all distinct.
We will use the following algorithm, called the interval reservation algorithm:
a. Observe the first l = n/2 secretaries in the random sequence without assigning any of them. Call these secretaries the sample set S.
b. Compute the optimal solution on the sample set, which will just fill position i with the ith most valuable secretary in S. For each position k let $a_k$ be the value of the secretary that fills it, and for k > 1 let I(k) be the interval $(a_k, a_{k-1})$. For position 1, let $I(1) = (a_1, \infty)$. Note that all of these intervals are disjoint.
c. Now consider the remaining (n − l) secretaries. When secretary e arrives, let $k_e$ be such that $v(e) \in I(k_e)$; if there is at least one free position $k \ge k_e$, then assign secretary e to the position k with the lowest such index.
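Steps a to c of the interval reservation algorithm can be sketched as follows. The function names are hypothetical, the sample size is rounded down, and secretaries whose value lies below every sample threshold are discarded; these choices are assumptions for illustration. The sketch also assumes n ≥ 2K so that every position has a threshold.

```python
def interval_reservation(arrivals, weights):
    """Sketch of the interval reservation rule.

    arrivals: distinct secretary values in arrival order.
    weights:  position weights, assumed sorted in non-increasing order.
    Returns a list whose k-th entry is the value assigned to position k (or None).
    """
    n, K = len(arrivals), len(weights)
    l = n // 2                                   # sample size (rounding down is an assumption)
    # a[0] > a[1] > ... are the values filling positions 1, 2, ... on the sample.
    a = sorted(arrivals[:l], reverse=True)[:K]
    assignment = [None] * K
    for v in arrivals[l:]:
        # v lies in I(k) = (a_k, a_{k-1}) exactly when k - 1 thresholds exceed v.
        k_e = sum(1 for threshold in a if threshold > v)   # 0-based index of I(k_e)
        if k_e >= len(a):
            continue                             # below every threshold: discarded (assumption)
        for k in range(k_e, K):
            if assignment[k] is None:            # lowest-index free position k >= k_e
                assignment[k] = v
                break
    return assignment


def weighted_value(assignment, weights):
    """Total weighted value of an assignment; empty positions contribute 0."""
    return sum(w * v for w, v in zip(weights, assignment) if v is not None)
```

For instance, with arrivals 5, 8, 1, 3, 9, 6, 7, 2 and weights 3, 2, 1, the sample thresholds are 8, 5, 3, and the later arrivals 9, 6, 7 fill positions 1, 2, 3 in that order, while 2 is discarded.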
To analyze this algorithm, we consider a slightly different algorithm, where in step c above, when considering secretary e, we assign it only to position $k_e$ if it is free, else we discard it. (Let us call this Algorithm B. Note that while the interval reservation algorithm is monotone, Algorithm B may not be monotone.)

LEMMA 3.1. The expected weighted value achieved by the interval reservation algorithm is at least that achieved by Algorithm B.

Proof. Consider a position k. For any fixed permutation, it is immediate that position k is filled in the interval reservation algorithm by an agent with value at least as large as the agent filling position k in Algorithm B. The claim follows.

LEMMA 3.2. If a secretary e is assigned to some position by Algorithm B, then the weight of this position is at least as large as the weight of the position e is assigned to in the optimal solution (if any).

Proof. Suppose secretary e is assigned to position k by Algorithm B. Then v(e) was not in I(s) for s < k, so there were k − 1 secretaries in the sample set with value greater than v(e). Since the optimal solution fills position i with the ith most valuable secretary, and there are at least k − 1 secretaries more valuable than e, the optimal solution will not assign e to a position with weight greater than w(k).

Hence, as long as any secretary chosen in the optimal solution is also chosen by Algorithm B with reasonable probability, we are fine. The next lemma shows exactly this:

LEMMA 3.3. If a secretary is assigned to some position by the optimal solution, then the probability that it is assigned to some position by Algorithm B is at least 1/4.

Proof. Consider a secretary e that is assigned to some position by the optimal solution. Given that e appears in the random order at position j > l = n/2, with probability l/(j − 1) the next more valuable secretary that appeared before e actually appears in the sample. Conditioned on this event, with probability (l − 1)/(j − 2) the next less valuable secretary that appeared before e also appeared in the sample. This is a sufficient condition for e to be assigned a position, since it implies that there is some position k whose interval I(k) contains v(e) and which will not have been filled by the time that e appears. Thus the probability that secretary e is assigned to some position is at least $\sum_{j=l+1}^{n} \frac{1}{n} \cdot \frac{l}{j-1} \cdot \frac{l-1}{j-2}$. This expression is at least 1/4 when n ≥ 4. It is easy to see that Algorithm B also satisfies this lemma for n < 4, thus proving the lemma.
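The 1/4 bound in the proof of Lemma 3.3 can be checked numerically. The sketch below assumes the bound is the sum over arrival positions j = l + 1, ..., n of (1/n) · l/(j − 1) · (l − 1)/(j − 2), which is how the two conditional probabilities in the proof combine with a uniformly random arrival position; the function name and the rounding l = ⌊n/2⌋ are assumptions.

```python
from fractions import Fraction


def assignment_probability_bound(n):
    """Exact value of sum_{j=l+1}^{n} (1/n) * l/(j-1) * (l-1)/(j-2) with l = n // 2.

    Intended for n >= 4, so that every denominator is positive.
    """
    l = n // 2
    total = Fraction(0)
    for j in range(l + 1, n + 1):
        total += Fraction(1, n) * Fraction(l, j - 1) * Fraction(l - 1, j - 2)
    return total
```

The sum telescopes to l/n − l(l − 1)/(n(n − 1)), which equals n/(4(n − 1)) for even n; for n = 4 this is 1/3.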

Time-dependent Weights, or the Discounted Secretary Problem
In the discounted secretary problem, we are given a function d(·) that maps time instants to "discounts", such that the actual benefit obtained by picking an element e of intrinsic value v(e) at time t is the product v(e) · d(t). This is a natural model when picking an item is more valuable at some times than at others: while the problem has been studied in the simple case of $d(t) = \beta^t$ for some β < 1, here we consider general time-varying functions d(t).

We first show that the problem is fairly hard in general. We show an Ω(log n / log log n) lower bound on the competitive ratio of any algorithm, and show a nearly matching O(log n) upper bound. Surprisingly, the problem becomes much easier with a small amount of information about the input. We show that knowing an estimate of E[OPT] enables one to design a constant-competitive algorithm. We remark that all our algorithms are monotone, and thus can be converted to truthful mechanisms.

A Warmup Example. Consider the simple case when $d(t) = \beta^t$ for some constant β bounded away from 1. A simple constant-competitive algorithm is one that always picks the first element. This algorithm has an expected value of $\beta \cdot \frac{1}{n} \sum_{e \in U} v(e)$. On the other hand, the expected value that OPT gets from time step j is at most $\beta^j \cdot \frac{1}{n} \sum_{e \in U} v(e)$, so that $E[\mathrm{OPT}] \le \frac{\beta}{1-\beta} \cdot \frac{1}{n} \sum_{e \in U} v(e)$. As β is a fixed constant this is constant-competitive.
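The warmup example can be checked by brute force on a small instance. The sketch below enumerates all arrival orders, compares the offline optimum with the pick-the-first-element rule under $d(t) = \beta^t$, and returns the ratio of expectations; the function name is hypothetical, and the reference bound 1/(1 − β) is the one obtained by summing the per-time-step bounds $\beta^j$ over j.

```python
from fractions import Fraction
from itertools import permutations


def warmup_ratio(values, beta):
    """Return E[OPT] / E[pick first] under d(t) = beta**t, with time steps t = 1, ..., n.

    values: non-negative values with positive sum; beta: a Fraction in (0, 1).
    """
    first_total = Fraction(0)
    opt_total = Fraction(0)
    for order in permutations(values):
        first_total += beta * order[0]                       # benefit of picking at time 1
        opt_total += max(beta ** (t + 1) * v for t, v in enumerate(order))
    return opt_total / first_total                           # the number of orders cancels
```

With β = 1/2 the ratio should never exceed 1/(1 − β) = 2, whatever the values are.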

Discounted Secretary: General Case
It turns out that the discounted secretary problem is significantly harder in the unknown-OPT model, and we show a lower bound of Ω(log n / log log n) on the competitive ratio in this case, even when the discounts are decreasing. However, we also give an algorithm with an almost-matching competitive ratio of O(log n), for arbitrary discounts.

A Nearly-Logarithmic Lower Bound
We construct a discount function d and a family of instances $I_1, \ldots, I_{2c}$ such that no randomized algorithm can be Θ(log n / log log n)-competitive on all the $I_t$'s. More formally, we use the following definitions:
• Instance Size: Let L = c be an integer and let $n = L^{4c}$. Thus L = c = Θ(log n / log log n).
• Discount Function: For t ≤ 2c, let $n_t = L^{2t}$. For the discount function, we use a step function $d(j) = L^{-t}$ for $n_{t-1} \le j < n_t$.
• Instances: Let K be a large enough integer (say $n^2$). The instance $I_1$ consists of $n/n_1$ K's and the remaining zeroes. For t < 2c, $I_{t+1}$ is obtained inductively from $I_t$ by changing an $n_t/n_{t+1}$ fraction of the $K^t$'s to $K^{t+1}$. Thus $I_t$ has $n/n_t$ $K^t$'s.
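The construction above can be made concrete for small c. The following sketch builds the step discount function and the multisets of values $I_1, \ldots, I_{2c}$; the function name, the representation of instances as lists, and the value of the discount at time j = n (which the step definition does not cover) are assumptions made for illustration.

```python
from fractions import Fraction


def build_lower_bound_family(c):
    """Return (n, K, discount, instances) for the lower-bound construction with L = c."""
    L = c
    n = L ** (4 * c)
    K = n ** 2                                        # "large enough" value, as suggested in the text
    n_t = [L ** (2 * t) for t in range(2 * c + 1)]    # n_0, ..., n_{2c}; note n_{2c} = n

    def discount(j):
        for t in range(1, 2 * c + 1):
            if n_t[t - 1] <= j < n_t[t]:
                return Fraction(1, L ** t)
        return Fraction(1, L ** (2 * c))              # j = n falls outside the steps (assumption)

    # I_1: n / n_1 copies of K, the rest zeroes.
    current = [K] * (n // n_t[1]) + [0] * (n - n // n_t[1])
    instances = [list(current)]
    for t in range(1, 2 * c):
        # Upgrade n / n_{t+1} of the K^t entries to K^(t+1).
        for i in range(n // n_t[t + 1]):
            current[i] = K ** (t + 1)
        instances.append(list(current))
    return n, K, discount, instances
```

For c = 2 this gives n = 256 and four instances, where $I_t$ contains exactly $n/n_t$ copies of $K^t$ (64, 16, 4 and 1 copies respectively).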

LEMMA 4.1. (ESTIMATE ON E[OPT(I
Proof. With probability at least (1 − 1/e) over the random permutation, one of the $n/n_t$ values equal to $K^t$ falls in the first $n_t$ slots, leading to a value of at least $K^t L^{-t}$.
Let A be a c/10-competitive algorithm. We argue that A must have a large probability of picking an item in each of the intervals $(n_t, n_{t+1})$.

LEMMA 4.2. Let A be a c/10-competitive algorithm. Then on instance $I_t$, the probability that A picks one of the first $n_t$ items is at least t/c.

Proof. We prove the claim by induction on t. For the base case, note that the most A can get from time steps $n_1 + 1$ onwards is $K/L^2 = \frac{1}{c}(K/L)$. Thus to be c/10-competitive, it must in expectation get value at least $\frac{2}{c}(K/L)$ from the first $n_1$ time steps. Since A never gets more than K/L on $I_1$, the base case holds.

Assume that the claim holds for $I_t$. Consider a run of the algorithm on instance $I_{t+1}$. Note that $I_{t+1}$ differs from $I_t$ only in $n/n_{t+1}$ of the items. The probability that any one of these items occurs in the first $n_t$ time steps is at most $n_t/n_{t+1}$. Thus except with probability $n_t/n_{t+1} = 1/L^2$, the behavior of A on $I_t$ and $I_{t+1}$ is indistinguishable in the first $n_t$ steps. Thus by the induction hypothesis, the algorithm accepts an item by time $n_t$ with probability at least $(1 - 1/L^2)(t/c)$. The expected revenue of A in the first $n_t$ time steps on instance $I_{t+1}$ is bounded by a sum of two terms, where the first term bounds the expected contribution from the $K^{t+1}$'s, and the second term bounds the most one can get from the smaller items. Since this is at most 4·OPT/c, A must get at least 6·OPT/c from time steps $n_t + 1$ onwards. As in the base case, time steps $n_{t+1} + 1$ and higher cannot contribute more than 2·OPT/c, so that the algorithm must get contribution 4·OPT/c from time steps $[n_t + 1, n_{t+1}]$. Thus the probability that the algorithm picks an item in time steps $[n_t + 1, n_{t+1}]$ is at least 2/c. Since A picks exactly one item, this event is disjoint from A picking an item in time steps $[1, n_t]$. Thus the probability that A, on instance $I_{t+1}$, picks an item in time steps $[1, n_{t+1}]$ is at least $(1 - 1/L^2)(t/c) + 2/c \ge (t + 1)/c$.

Proof. If A is c/10-competitive, then Lemma 4.2 implies that on instance $I_{2c}$, the probability that A picks an item in the first $n_{2c}$ time steps is larger than 1, which is a contradiction. Thus A cannot be c/10-competitive.

e. E[OPT]/E[A] ≤ O(log n).
Proof. We begin by noting that E[OPT] = [...] location corresponding to $d_{\max}$ with probability at least 1/n. Moreover, $\mathrm{OPT}_c$ equals [...] Thus OPT gets most of its value from the top O(log n) discount scales.

Clearly the expected value that the offline optimum gets from time steps in $P_c$ equals $\mathrm{OPT}_c$. Thus the expected value of the offline optimum for the problem restricted to times in $P_c$ is at least as large, and the classical secretary algorithm gets an expected value $\mathrm{OPT}_c/2e$ (we lose an additional factor of two since we treat the discount values in $P_c$ equally). The claim follows.
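The idea used in this proof, namely restricting attention to the time steps of a single randomly chosen discount class and running the classical secretary algorithm there, can be sketched as follows. The class definition (class c holds the discounts in $(d_{\max}/2^{c+1}, d_{\max}/2^{c}]$), the parameter `num_classes`, and all function names are assumptions for illustration; the text only indicates that discounts within a class are treated equally at the cost of a factor of two.

```python
import math
import random


def discount_class(d, d_max):
    """Index c with d_max / 2**(c+1) < d <= d_max / 2**c, or None for a zero discount."""
    if d <= 0:
        return None
    return int(math.floor(math.log2(d_max / d)))


def run_on_class(arrivals, discounts, c):
    """Run the classical secretary rule on the time steps whose discount lies in class c.

    arrivals[t] is the value of the element arriving at time t (0-based).
    Returns the time step at which an element is picked, or None.
    """
    d_max = max(discounts)
    times = [t for t, d in enumerate(discounts) if discount_class(d, d_max) == c]
    sample_size = int(len(times) / math.e)
    best_seen = max((arrivals[t] for t in times[:sample_size]), default=float("-inf"))
    for t in times[sample_size:]:
        if arrivals[t] > best_seen:
            return t
    return None


def random_class_secretary(arrivals, discounts, num_classes, rng=random):
    """Pick one of the top num_classes discount classes uniformly at random and run on it."""
    return run_on_class(arrivals, discounts, rng.randrange(num_classes))
```

In the analysis above, `num_classes` would be of order log n, so that each class is chosen with probability Ω(1/log n).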
When there are both weights and discounts, we can combine Theorem 4.2 with Theorem 3.1 to get the following result. We defer the proof to the full version.

Proof Sketch:
In the unweighted case the proof of Theorem 4.2 implies that we can ignore all but the first Θ(log n) discount classes. Using the same reasoning, in the weighted case we can ignore all but the first Θ(log(nK)) discount classes, where the extra K is due to the ability to assign K goods. Now we can just choose a c ∈ [Θ(log(nK))] uniformly at random and run the weighted secretary algorithm of Theorem 3.1 in the time steps with discount in class c. Following the analysis of Theorem 4.2, this is an O(log(nK))-competitive algorithm, and thus is also O(log n)-competitive (as K ≤ n).

Discounted Secretary with Known OPT
In this section we consider the case when the algorithm knows a good estimate for E[OPT] ahead of time. Interestingly, we can show that even if we know all the values and not just E[OPT] in advance, no online algorithm can be perfectly competitive. This is in contrast to the case without discounts (i.e., d(t) = 1 for all t), in which the naïve algorithm that knows all the values can pick an element of maximal value.

[...] We next show that if A observes one of the two valuable elements (with value 1) before time t it immediately picks it (this is the optimal online algorithm for this input). This is true as the expected value from picking the element is 1. On the other hand, if there are still k elements to come, the expected value from skipping the current element and picking the next valuable element is [...] Thus the algorithm is always (weakly) better off picking the first valuable element it sees. Now, the expected value that A gets is [...] We conclude that [...] As this expression clearly tends to [...] as n goes to infinity, the claim follows.

The Upper Bound for Known-OPT
The main result of this section shows that one can get good competitive algorithms if we have a good estimate [...]

Proof. The proof considers two cases. In the first (easier) case, suppose the algorithm A picks some element with probability at least 1/2. Then with probability at least 1/2, it gets benefit at least Z/2, and hence its expected performance is at least Z/4.

In the second case, suppose the algorithm A does not pick an element with probability at least 1/2: i.e., we are in the case where at least half the permutations cause all products {v(e) d(π(e))} for e ∈ U to be small (and hence are "rejecting"). In this case, we can show that OPT gets most of its benefit from the other, "accepting" permutations. Moreover, on these accepting permutations, we would pick an element differently from OPT only when there was some "blocking" element with v(e) d(π(e)) ≥ Z/2: but for the same reason that there are few accepting permutations, there are few accepting permutations with such a blocking element. Hence we pick the same elements OPT picks on the "accepting" permutations with high probability, and have a good performance.

Let us now make this general idea for the second case precise. Let $S_{acc} = \{\pi \in S_n : d(i)\, v(\pi(i)) \ge Z/2 \text{ for some } i\}$ be the "accepting permutations" on which the algorithm picks some element. The contribution to E[OPT] from these accepting permutations is [...] (4.1), where we used the fact that Z ≤ E[OPT]. To complete the proof, it suffices to show that A has expected value at least L/2, which is at least Z/4 by (4.1). We begin by rewriting L as [...] (4.2). (Here we implicitly assume that there is exactly one product d(i)v(j) which is highest: we break ties in some consistent fashion.) Combining (4.1) and (4.2) gives us [...] (4.3). For each such i, j pair where d(i)v(j) ≥ Z/2, let $G_{ij}$ be the set of permutations on which A chooses element j at time i, getting benefit d(i)v(j): let us call these "good". (Note that, in these permutations, d(i)v(j) must be the first product which is at least Z/2.) The performance of the algorithm is [...] (4.4). The following claim shows there are many "good" permutations.

CLAIM 4.1. For each i, j such that [...]

Proof. We show the first inequality by giving an n-to-1 map $f_{ij}$ from $S_n \setminus S_{acc}$ to $G_{ij}$; the second inequality follows from the fact that at most half the permutations are in $S_{acc}$. Define $f_{ij}(\pi)$ by changing π and mapping element j to location i, swapping it with whatever was there originally. In other words, [...] (which follows from our choice of i, j), and that no other position i′ has d(i′)v(π′(i′)) ≥ Z/2 (which follows almost as easily from the fact that $\pi \notin S_{acc}$ and hence did not have any positions i″ such that [...]

The rest of the proof of Theorem 4.5 is immediate: by (4.4) and Claim 4.1, it follows that [...], where the last step is by equation (4.3).

To use Theorem 4.5 fruitfully and obtain benefit ≈ E[OPT]/4, the bound Z should be close to E[OPT]: such an estimate could come from expert knowledge, prior runs of the problem, or some other source. A special case when such an estimate of E[OPT] can be obtained is the situation when we know the values of all the elements. In this case, we can use a sampling-based estimator for E[OPT] (as shown in Lemma A.1) to get [...], where ε, δ > 0 are arbitrary constants that only affect the amount of sampling necessary.
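The proof above suggests a simple threshold rule: accept the first element whose discounted value d(t) · v(e) is at least Z/2. The sketch below implements that rule and checks, by enumerating all arrival orders of a tiny instance, that its expected benefit is at least Z/4 when Z is set to E[OPT]. Reading the rule off the proof in this way is an assumption, as are the function names.

```python
from itertools import permutations


def threshold_pick(arrivals, discounts, Z):
    """Accept the first time step t with discounts[t] * arrivals[t] >= Z / 2.

    Returns (t, benefit), or None if no product reaches the threshold.
    """
    for t, (v, d) in enumerate(zip(arrivals, discounts)):
        if d * v >= Z / 2:
            return t, d * v
    return None


def expected_opt(values, discounts):
    """E[OPT]: average over arrival orders of the best discounted value."""
    orders = list(permutations(values))
    return sum(max(d * v for v, d in zip(order, discounts)) for order in orders) / len(orders)


def expected_threshold_benefit(values, discounts, Z):
    """Average benefit of the threshold rule over all arrival orders."""
    orders = list(permutations(values))
    total = 0.0
    for order in orders:
        picked = threshold_pick(order, discounts, Z)
        if picked is not None:
            total += picked[1]
    return total / len(orders)
```

On the hypothetical instance with values 1, 4, 10 and discounts 1, 0.5, 0.1, E[OPT] is 6 and the threshold rule with Z = 6 has expected benefit 5.5, comfortably above Z/4.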

Extensions to Matroid Secretary Problems
In this section, we show how the secretary problem on many classes of matroids can be "reduced" to partition matroids.Moreover, we extend our results for the discounted and weighted secretary problems to partition matroids, and hence to such classes of matroids as well.This reduction to partition matroids also gives a simple unified view of several results of Babaioff et al. [2], and also improves some of the results from their paper.
Recall that a matroid M = (U, I) consists of a ground set U, and a collection I of subsets of U that is closed under taking subsets and satisfies the well-known exchange conditions. A partition matroid (U, I) is one where there is some partition P of U, and S ∈ I if and only if S has at most one element from each set in P. The following definition formalizes the notion of a "reduction" from arbitrary matroids to partition matroids.

A Discounted Secretary Problems: Missing Proofs

A.1 A Sampling Lemma
The following lemma shows that given access to the element values, one can estimate the expected optimum for the discounted secretary fairly efficiently.
A quick observation: if we pick the element that appears in the location with the highest discount, the expected benefit is $d_{\max} v_{\max}/n$, and hence [...] and thus if we set $M \ge \frac{3}{\epsilon^2}\, n \ln \frac{2}{\delta}$, this probability is at most δ, which proves the lemma.

Note that Z/(1 + ε) is a lower bound for E[OPT] with probability 1 − δ, and hence when the values of the elements are known, we can use Theorem 4.5 to get [...]

B Reductions for Several Matroid Classes

Proof. Let G be a graph defining a graphic matroid. Pick an edge {u, v} from G uniformly at random, with probability 1/2 color u red and v blue, and with probability 1/2 color u blue and v red. Then independently color every other node red or blue, each with probability 1/2. Create a part in the partition matroid for each red node x, and add to it all the red-blue edges incident on x. Then run this procedure recursively on the graph induced by the edges that have both endpoints colored blue to create more sets in the partition (red-red edges are discarded).
It is easy to see that picking one bichromatic edge incident on each red node will result in a forest. It is also clear that taking the union of such a forest with any set of blue-blue edges that is itself a forest still gives a forest, implying that any set which is independent in the partition matroid we create is independent in the original graphic matroid.
For a graph G, let OPT(G) be the value of the optimal independent set in the graphic matroid defined by G. Let v(G) be a random variable denoting the value of the maximum independent set in the partition matroid constructed by this reduction. We claim that E[v(G)] ≥ (1/3) OPT(G), and prove it by induction on the number of edges of G. For the base case, if G has only one edge then we color it bichromatically with probability 1, so E[v(G)] = OPT(G) ≥ (1/3) OPT(G) as claimed. For the inductive step, let X_rb be a random variable denoting the value of the optimal independent set from the partition matroid corresponding just to the sets we created for the red nodes. Let T be the optimal forest (which without loss of generality is a tree, since otherwise we just analyze each component separately). Root T arbitrarily. After the random coloring, the edges of T that go from a red node to a blue parent are bichromatic and are assigned to distinct sets in the partition (one per red node), so the value of the optimal independent set in the partition matroid is at least the sum of their values. Every edge in T has probability 1/m (where m is the number of edges in G) of being the initial edge chosen to be colored bichromatically, and if it is the one chosen then with probability 1/2 the parent is blue and the child is red. If it is not chosen then it still has probability 1/4 of having its parent colored blue and its child red. So the total probability that it is colored in this way is at least (1/m)(1/2) + (1 − 1/m)(1/4) = 1/4 + 1/(4m), and by linearity of expectation E[X_rb] ≥ (1/4 + 1/(4m)) OPT(G). Also, since we colored at least one edge bichromatically, by induction we know that E[v(G_bb)] ≥

Proof. Create k bins (where k is the rank of the uniform matroid), and place each element in U into one of these bins uniformly at random. The i-th largest value v(i) is not in the same bin as any larger value with probability at least (1 − 1/k)^{i−1}. This implies that the expected value of the maximal weight basis of the partition matroid is at least

Proof. Create a part in the partition matroid for each vertex on the right, and put each vertex on the left (i.e., an element of U) in one of the bins corresponding to its neighbors uniformly at random. Given a maximum independent set I in the transversal matroid, there is some matching between I and the vertices on the right. Each node x in I is put in the bin corresponding to its neighbor y in this matching with probability at least 1/d. If this event does happen, then the maximum independent set in the partition matroid contains either x or an element of greater value. Thus the expected value of the maximum independent set is at least

Proof. We use the same reduction as for low-degree transversal matroids. Without loss of generality, let v(1) ≥ v(2) ≥ ··· ≥ v(n). Then the probability that element i is the most valuable in the set that it is assigned to is at least 1 − (i−1)/d, since at most i − 1 of its neighbors can contain more valuable elements and it has at least d neighbors. Considering only the elements {1, 2, ..., d/2} and using linearity of expectation, we get that the expected value of the max base in the partition matroid is at least

THEOREM 1.1. (DISCOUNTED SEC'Y: KNOWN-OPT) The discounted secretary problem has an O(1)-competitive algorithm for the case when the values of all elements are known in advance. In fact, the algorithm only needs to know E[OPT].
1)/c. The claim follows. Now we can prove the main lower bound theorem.

THEOREM 4.1. (UNKNOWN-OPT: LOWER BOUND) Every algorithm A for the discounted secretary problem has E[OPT]/E[A] ≥ Ω(log n / log log n) in the worst case.
Let d_max be the maximum discount and let v_max be the maximum value. For c ≥ 1 let I_c = (2^{−c} d_max, 2^{−(c−1)} d_max] be the interval defining the c-th discount class, and let P_c = {i ∈ [n] : d(i) ∈ I_c} be the set of times that have discount value in class c. Our algorithm A chooses a c ∈ [3 log n + 2] uniformly at random, and then runs the classical secretary algorithm in the time steps i ∈ P_c. Note that A uses no knowledge of E[OPT].

THEOREM 4.2. (UNKNOWN-OPT: UPPER BOUND) Algorithm A is O(log n)-competitive, i.
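The discount-class algorithm can be sketched in Python as follows. Several details are assumptions made for illustration: the logarithm is taken base 2 and rounded up, the classical secretary rule observes the first ⌊m/e⌋ of the m time steps in the chosen class and compares raw values (not discounted ones), and the function names are hypothetical.

```python
import math
import random
from typing import Optional, Sequence


def in_discount_class(d: float, d_max: float, c: int) -> bool:
    """True if d lies in I_c = (2^-c * d_max, 2^-(c-1) * d_max]."""
    return d_max * 2.0 ** (-c) < d <= d_max * 2.0 ** (-(c - 1))


def classical_secretary(values: Sequence[float]) -> Optional[int]:
    """Standard 1/e rule: skip the first floor(m/e) items, then take the
    first item beating everything seen so far. Returns an index or None."""
    m = len(values)
    k = int(m / math.e)
    threshold = max(values[:k], default=float("-inf"))
    for i in range(k, m):
        if values[i] > threshold:
            return i
    return None


def unknown_opt_secretary(arrivals: Sequence[float],
                          discounts: Sequence[float],
                          rng: Optional[random.Random] = None) -> Optional[int]:
    """Pick a random discount class, then run the classical rule on its
    time steps. Returns the selected time step, or None if nothing is picked.

    arrivals[t] is the value of the element arriving at time t; discounts[t]
    is d(t), which is known up front.
    """
    rng = rng or random.Random()
    n = len(arrivals)
    d_max = max(discounts)
    # Assumed reading of [3 log n + 2]: classes 1 .. ceil(3 log2 n) + 2.
    num_classes = math.ceil(3 * math.log2(n)) + 2
    c = rng.randint(1, num_classes)
    times = [t for t in range(n) if in_discount_class(discounts[t], d_max, c)]
    pick = classical_secretary([arrivals[t] for t in times])
    return None if pick is None else times[pick]
```

Within a single class all discounts agree up to a factor of 2, which is why comparing raw values inside the class loses only a constant factor.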

THEOREM 4.4. (KNOWN-OPT: LOWER BOUND) For any ε > 0, every algorithm A for the discounted secretary problem has E[OPT]/E[A] ≥ √2 − ε, even when A knows the set of values (and hence E[OPT]) in advance.

Algorithm A: Suppose Z ≤ E[OPT]. Pick the first element e seen (say, at time j) that satisfies v(e)d(j) ≥ Z/2.

THEOREM 4.5. (KNOWN-OPT: UPPER BOUND) If Z ≤ E[OPT], the algorithm A for the discounted secretary problem satisfies E[A] ≥ Z/4.
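The threshold rule of Algorithm A is short enough to state directly in code. The Python sketch below is illustrative only; the function name `threshold_secretary` and the list-based encoding of the arrival sequence are assumptions.

```python
from typing import Optional, Sequence


def threshold_secretary(arrivals: Sequence[float],
                        discounts: Sequence[float],
                        z: float) -> Optional[int]:
    """Select the first arrival whose discounted value is at least z / 2.

    arrivals[j]  -- value v(e) of the element e arriving at time j
    discounts[j] -- discount d(j) at time j
    z            -- a lower bound on E[OPT], as the algorithm requires
    Returns the time of the selected element, or None if none qualifies.
    """
    for j, (value, discount) in enumerate(zip(arrivals, discounts)):
        if value * discount >= z / 2:
            return j
    return None
```

For example, with arrivals `[1, 4, 10]`, discounts `[1.0, 0.5, 0.2]`, and `z = 4.0`, the element at time 1 is the first whose discounted value reaches the threshold of 2.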

DEFINITION 5.1. (AN α-PARTITION PROPERTY) A matroid M = (U, I) satisfies an α-partition property if one can (randomly) define a partition matroid M′ = (U′, I′) on some subset U′ of the universe U such that for any values of the elements of U, we have
• E[value of max-weight base in M′] ≥ (1/α) × (value of max-weight base in M).
• Every independent set in M′ is an independent set in M.
LEMMA B.3. (LOW-DEGREE TRANSVERSAL MATROIDS) Transversal matroids satisfy a d-partition property, where d is the maximum degree of vertices on the left.
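The reduction behind this lemma places every left vertex in the bin of a uniformly random right-hand neighbor. A minimal Python sketch, assuming the bipartite graph is given as an adjacency map from left vertices to their right-hand neighbors (the name `transversal_to_partition` and this encoding are illustrative):

```python
import random
from typing import Dict, Hashable, List, Mapping, Optional, Sequence


def transversal_to_partition(
        neighbors: Mapping[Hashable, Sequence[Hashable]],
        rng: Optional[random.Random] = None) -> Dict[Hashable, List[Hashable]]:
    """Assign each left vertex to the bin of a uniformly random neighbor.

    Returns a map from right vertices (bins) to the left vertices placed in
    them. Left vertices with no neighbors are skipped, since they can never
    be matched.
    """
    rng = rng or random.Random()
    bins: Dict[Hashable, List[Hashable]] = {}
    for left, nbrs in neighbors.items():
        if not nbrs:
            continue
        right = rng.choice(list(nbrs))
        bins.setdefault(right, []).append(left)
    return bins
```

Choosing at most one element from every bin yields a matching in the bipartite graph, so any set independent in the resulting partition matroid is independent in the transversal matroid.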

Σ_{x∈I} (1/d) v(x) = (1/d) Σ_{x∈I} v(x) = (1/d) OPT.

LEMMA B.4. (HIGH-DEGREE TRANSVERSAL MATROIDS) Any transversal matroid satisfies a 4k/d-partition property, where d is the minimum degree of vertices on the left and k is the rank of the matroid.
). Let v_max denote the value of the most valuable element, and let d_max denote the largest discount. For a given trial t, let Y_t = max_i {d(i) v(π_t(i))}. Clearly, E[Z] = E[OPT], and it just remains to show that Z is tightly concentrated about its mean.

Clearly v(G) = X_rb + v(G_bb), where G_bb is the graph induced by edges that have both endpoints colored blue. The probability that an edge in T is colored monochromatically blue is at least (1 − 1/m)(1/4) = 1/4 − 1/(4m), since if it is not picked as the initial bichromatic edge its endpoints are both colored blue with probability 1/4. By linearity of expectation this means that E[OPT(G_bb)] ≥ (1/4 −