A strongly competitive randomized paging algorithm

The paging problem is that of deciding which pages to keep in a memory of $k$ pages in order to minimize the number of page faults. We develop the partitioning algorithm, a randomized on-line algorithm for the paging problem. We prove that its expected cost on any sequence of requests is within a factor of $H_k$ of optimum. ($H_k$ is the $k$th harmonic number, which is about $\ln k$.) No on-line algorithm can perform better by this measure. Our result improves by a factor of two on the best previous algorithm.


Introduction
The paging problem arises when trying to control a two-level memory system. Such a system has $k$ pages of fast memory and $n - k$ pages of slow memory. A sequence of requests to pages is to be satisfied in their order of occurrence. In order to satisfy a request to a page, that page must be in fast memory. When a requested page is not in fast memory, a page fault occurs, and a page must be moved from fast memory to slow memory to make room for the new page to be put into fast memory. The paging problem is that of deciding which page to eject from fast memory. The cost to be minimized is the number of page faults. Belady [1] gave a simple optimum algorithm for the paging problem. Belady's algorithm ejects the page that will remain unused for the longest time. This algorithm is off-line, using knowledge of future requests. An important class of paging algorithms is the on-line algorithms. These algorithms are not allowed to use information about the future to process the pending request. Sleator and Tarjan [5] introduced the idea of comparing the performance of on-line paging algorithms with that of the off-line optimum. They showed that any on-line algorithm for the paging problem will, on some sequence of requests, have cost $k$ times that of the optimum off-line algorithm.
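As an illustration of the off-line rule attributed to Belady above, the following is a minimal Python sketch that counts page faults when the ejected page is always the one whose next request lies farthest in the future. The function name `belady_faults` and the assumption that fast memory starts full are choices made for this example, not details taken from [1].

```python
from typing import Iterable, Sequence, Set


def belady_faults(fast_memory: Iterable[int], requests: Sequence[int]) -> int:
    """Count faults when ejecting the page unused for the longest time.

    Assumes fast memory is non-empty and already full at the start.
    """
    resident: Set[int] = set(fast_memory)
    faults = 0
    for t, page in enumerate(requests):
        if page in resident:
            continue  # the request is satisfied from fast memory
        faults += 1

        def next_use(p: int) -> float:
            # Position of the next request to p after time t, or infinity.
            for s in range(t + 1, len(requests)):
                if requests[s] == p:
                    return s
            return float("inf")

        victim = max(resident, key=next_use)  # farthest next use is ejected
        resident.remove(victim)
        resident.add(page)
    return faults
```

The rule needs the whole request sequence in advance, which is exactly what an on-line algorithm is denied.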
A randomized paging algorithm is allowed to make use of a source of randomness when deciding what to do. Fiat et al. [2] extended the work of Sleator and Tarjan to the domain of randomized algorithms. To describe the results of that paper, it is useful to introduce the terminology of randomized competitiveness from Manasse et al. [3]. A randomized on-line algorithm is said to be $c$-competitive if on every sequence of requests its expected cost is within a factor of $c$ (plus a constant) of that of every other algorithm, including those that are off-line. More formally, we let $C_A(\sigma)$ be the expected cost incurred by an algorithm $A$ in processing a request sequence $\sigma$. An algorithm $B$ is $c$-competitive if there exists a constant $a$ such that for all algorithms $A$ and all sequences $\sigma$,

$$C_B(\sigma) \le c \cdot C_A(\sigma) + a.$$
The constant c is known as the competitive factor. An algorithm is said to be strongly competitive if it has the smallest possible competitive factor.
The marking algorithm of Fiat et al. [2] is a randomized algorithm for the paging problem. This algorithm is $H_k$-competitive if $k = n - 1$ and $2H_k$-competitive in general. (Here $H_k$ denotes the $k$th harmonic number: $H_k = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{k}$.) It is also shown that $H_k$ is the smallest possible competitive factor for the paging problem. These results leave a gap of a factor of two between the lower bound on the competitive factor and what is achieved by an algorithm. The main result of this paper is a new algorithm whose competitive factor matches the lower bound, and is thus strongly competitive.

The paper has two main parts. The first part (Section 2) describes a way of maintaining a dynamically changing partition of the set of pages. We show that the partitioning procedure can be used to obtain a lower bound on the cost of any algorithm handling a given sequence of requests. The second part (Section 3) describes a randomized algorithm based on partitioning and analyzes its performance.
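To give a feel for the sizes involved, the short Python sketch below computes $H_k$ exactly and returns the three factors mentioned so far: the factor $k$ that on-line algorithms without randomness cannot beat, the factor $2H_k$ of the marking algorithm in general, and the lower bound $H_k$. The function names `harmonic` and `competitive_factors` are illustrative only.

```python
from fractions import Fraction
from typing import Tuple


def harmonic(k: int) -> Fraction:
    """Exact k-th harmonic number H_k = 1 + 1/2 + ... + 1/k."""
    return sum((Fraction(1, j) for j in range(1, k + 1)), Fraction(0))


def competitive_factors(k: int) -> Tuple[int, float, float]:
    """Return (k, 2 * H_k, H_k) as plain numbers for comparison."""
    h = float(harmonic(k))
    return k, 2 * h, h
```

For instance, `harmonic(4)` evaluates to `Fraction(25, 12)`, roughly 2.08, so with four pages of fast memory the three factors are 4, about 4.17, and about 2.08.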
The paging problem is a special case of the k-server problem, the deterministic version of which was studied in Manasse et al. [3,4]. In this problem there is a set of n vertices numbered 1,2,..., n, and a distance measure among them satisfying the triangle inequality.
A collection of $k$ mobile servers resides on these vertices. Given a sequence of requests, each of which specifies a vertex that requires service, the $k$-server problem is to decide how to move the servers in response to each request. If a requested vertex is unoccupied, then some server must be moved there. The requests must be satisfied in order of their occurrence in the request sequence. The cost of handling a sequence of requests is equal to the total distance moved by the servers.
In the uniform $k$-server problem, the cost of moving a server from any vertex to any other is one. The paging problem is isomorphic to the uniform $k$-server problem. The correspondence between the two problems is as follows: the pages of address space correspond to the $n$ vertices, and the pages in fast memory correspond to those vertices occupied by servers. In the remainder of this paper we shall use the terminology of the uniform $k$-server problem rather than that of the paging problem.
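The correspondence can be made concrete with a small Python sketch: serving a request sequence in the uniform $k$-server problem costs one unit each time a server has to move, which is one page fault in paging terms. The names `uniform_server_cost` and `lowest_numbered` are hypothetical, and the policy shown is only a placeholder for whatever on-line or off-line rule is being studied.

```python
from typing import Callable, Iterable, Sequence, Set


def lowest_numbered(occupied: Set[int], requested: int) -> int:
    """Placeholder policy: move the server on the lowest-numbered vertex."""
    return min(occupied)


def uniform_server_cost(
    servers: Iterable[int],
    requests: Sequence[int],
    choose: Callable[[Set[int], int], int] = lowest_numbered,
) -> int:
    """Total distance moved when every move costs exactly one."""
    occupied: Set[int] = set(servers)  # vertices with a server = pages in fast memory
    cost = 0
    for v in requests:
        if v in occupied:
            continue  # the vertex is covered, no server moves
        w = choose(occupied, v)  # vertex whose server is moved (page ejected)
        occupied.remove(w)
        occupied.add(v)
        cost += 1  # uniform distance: each move costs one
    return cost
```

With servers on vertices 1 and 2 and requests 3, 1, 3, the placeholder policy moves a server twice, so the cost is 2.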

A Lower Bound on Optimal Cost
In this section we describe a dynamically changing labeled partition of the vertices. The partition evolves deterministically as a function of the request sequence, and can be maintained by an on-line algorithm. We show how to use this partition to obtain a lower bound on the cost incurred by any algorithm in satisfying the request sequence.
The partition is actually an ordered sequence of disjoint sets of vertices $S_\alpha, S_{\alpha+1}, \ldots, S_{\beta-1}, S_\beta$ (some of which could be empty) whose union is the set $\{1, 2, \ldots, n\}$. Each set $S_i$ (except $S_\beta$) is labeled with an integer $k_i$. Initially $\alpha = 1$ and $\beta = 2$, $S_1$ is the set of vertices that are not initially covered by a server, $S_2$ is the set of vertices that are, and $k_1 = 0$. The numbers $\alpha$ and $\beta$ increase over time.
The labels are related to the cardinalities of the sets and satisfy the following set of conditions, which we call the labeling invariant: We will show that these conditions hold initially and that they are maintained as the partition evolves.
We can now describe how the labeled partition is updated in response to a request at a vertex $v$. Let $i$ be such that $v \in S_i$. There are three cases:

Rule 1, $i = \beta$: Do nothing.

Rule 2, $\alpha < i < \beta$: Do the following assignments:

$$S_i \leftarrow S_i - \{v\}$$
$$S_\beta \leftarrow S_\beta \cup \{v\}$$
$$k_j \leftarrow k_j - 1 \qquad (i \le j < \beta)$$

It might now be the case that some label is changed from one to zero. If this happens, let $j$ be the largest integer such that $k_j = 0$. Now apply the following two assignments:

Rule 3, $i = \alpha$: Do the following assignments:

An easy induction shows that the labeling invariant remains satisfied after a request is processed in this way.
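The update rules can be sketched in Python as follows. Two parts of this sketch are assumptions reconstructed for illustration rather than steps spelled out in the text: the merge performed when a label reaches zero (here the sets from $S_\alpha$ through $S_j$ are merged into a new $S_\alpha$, with $\alpha$ set to $j$), and the form of Rule 3 (here $v$ leaves $S_\alpha$, the old $S_\beta$ receives the label $k - 1$, and $\{v\}$ becomes the new $S_\beta$). The class name `LabeledPartition` and its list layout are also hypothetical.

```python
from typing import List, Set


class LabeledPartition:
    """sets[0] is S_alpha, sets[-1] is S_beta; labels[j] is the label of sets[j]."""

    def __init__(self, n: int, k: int, covered: Set[int]) -> None:
        assert len(covered) == k
        self.k = k
        uncovered = set(range(1, n + 1)) - set(covered)
        self.sets: List[Set[int]] = [uncovered, set(covered)]
        self.labels: List[int] = [0]  # S_beta carries no label

    def request(self, v: int) -> int:
        """Update the partition for a request at v; return the rule applied."""
        i = next(idx for idx, s in enumerate(self.sets) if v in s)
        last = len(self.sets) - 1
        if i == last:  # Rule 1: v is in S_beta, do nothing
            return 1
        if i == 0:  # Rule 3 (ASSUMED form, see the lead-in)
            self.sets[0].remove(v)
            self.labels.append(self.k - 1)  # old S_beta gets label k - 1
            self.sets.append({v})  # {v} becomes the new S_beta
            return 3
        # Rule 2: move v into S_beta and decrement labels i .. beta - 1
        self.sets[i].remove(v)
        self.sets[last].add(v)
        for j in range(i, last):
            self.labels[j] -= 1
        zero = [j for j in range(1, last) if self.labels[j] == 0]
        if zero:  # ASSUMED merge: sets up to the largest such j form the new S_alpha
            j = zero[-1]
            merged = set().union(*self.sets[: j + 1])
            self.sets = [merged] + self.sets[j + 1:]
            self.labels = [0] + self.labels[j + 1:]
        return 2
```

With $n = 3$, $k = 2$, servers initially on vertices 2 and 3, and requests 1, 2, 3, 3, 1, this sketch applies Rules 3, 2, 3, 1, 2 in that order and ends with the two sets $\{2\}$ and $\{1, 3\}$.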
The following table shows the labeled partitions generated for a particular sequence of requests when $n = 9$ and $k = 6$. Each line shows the partition resulting after a request to the underlined vertex in the previous partition, as well as the rule that was applied to obtain the new partition. The leftmost set on a line is the current $S_\alpha$ and the rightmost set is the current $S_\beta$. Each set $S_i$ (except $S_\beta$) is labeled with the appropriate $k_i$.

The partitioning procedure is significant because it permits an on-line algorithm to track the performance of an optimal off-line algorithm. The optimal off-line algorithm,

For completeness we now prove that OPT is optimal for any request sequence $\sigma$. Suppose $A$ is an algorithm that starts in the same state as OPT, and makes the same moves as OPT for the first $i$ requests of $\sigma$. We now show how $A$ can be modified, without increasing its cost, so that it also makes the same moves as OPT on the $(i+1)$st request.

Suppose the request is at $v$, that OPT moves the server on $w$ to $v$, and $A$ moves a server on $x$ to $v$. Define algorithm $A'$ as follows: On the $(i+1)$st request, $A'$ moves the server on $w$ to $v$. $A'$ now mimics the moves of $A$ exactly, until one of two things happens: (1) $A$ moves its server on $w$ to another vertex $u$. Then $A'$ moves the server on $x$ to $u$, and $A$ and $A'$ are in the same state and have incurred the same cost. (2) There is a request at $x$, and $A$ moves from $u$ to $x$. Then $A'$ moves from $u$ to $w$. Algorithms $A$ and $A'$ are again in the same state, and the cost incurred by $A'$ is at most that incurred by $A$. By the definition of OPT, we know that there must be a request to $x$ before any request to $w$. This guarantees that $A'$ is well defined and costs no more than $A$. By repeatedly modifying algorithm $A$ in this manner, it can be transformed, without increasing its cost, into OPT. It follows that no algorithm $A$ can handle $\sigma$ more cheaply than OPT.
To streamline further discussion, we introduce some notation. Let

$$S^i = S_\alpha \cup S_{\alpha+1} \cup \cdots \cup S_{i-1} \cup S_i.$$
After processing $\sigma(1), \sigma(2), \ldots, \sigma(t)$ the algorithm OPT covers a particular set of vertices. This set can be computed using the partition and the remainder of the request sequence $\sigma(t+1), \sigma(t+2), \ldots$, as follows:

2. Sort the vertices in $S^{\beta-1}$ in order of earliest occurrence in $\sigma(t+1) \cdots$. That is, if a vertex $v$ occurs in $\sigma(t+1) \cdots$ before another vertex $w$, then $v$ comes before $w$ in the sorted list. All vertices requested in $\sigma(t+1) \cdots$ come before all those that are not, and two vertices not requested are ordered by vertex number, lowest first.

3. Repeat the following step for each vertex $v$ in the order defined in step 2, until $S$ contains $k$ vertices: Add $v$ to $S$ unless it would force $S$ to contain more than $k_i$ vertices from any set $S^i$.

($\alpha < i < \beta$). It must be the case that $|S_\beta| < k$, so the set $S$ contains at least one vertex from $S^{\beta-1}$. Vertex $v$ is first in the sorted list of vertices from $S^i$, so it must be in $S$. By the induction hypothesis, OPT is already covering $S$ and does not move. We must show that $S$ does not change.

Before the partition is updated, $S$ contains at most $k_j$ vertices from each $S^j$. This means that $S$ contains $v$ and at most $k_j - 1$ other vertices from $S^j$ when $i \le j \le \beta - 1$. When Rule 2 is applied to update the partition, $v$ is moved from $S_i$ to $S_\beta$, and $k_j$ is decremented for all $j$ from $i$ to $\beta - 1$. The decremented bounds offset the fact that $v$ is no longer in $S^j$ for any $j < \beta$. The net effect is that $S$ does not change. •
Using this lemma, we can obtain a lower bound on the cost of any paging algorithm. Let $D(\sigma)$ denote the number of times a vertex in set $S_\alpha$ is requested during the processing of request sequence $\sigma$.

Theorem 1 For any algorithm $A$ and request sequence $\sigma$, $C_A(\sigma) \ge D(\sigma)$.
Proof. Because OPT is optimal, the cost of algorithm A is no less than the cost of OPT.

An On-Line Algorithm
We can now describe the partitioning algorithm, a randomized $H_k$-competitive algorithm for the uniform $k$-server problem. It works by maintaining the labeled vertex partition described above, augmented with a system of marks. There will be one kind of mark

When a request arrives for a vertex $v$ in $S_i$, $\alpha < i < \beta$, before applying Rule 2, we move marks around so that there is a $j$-mark on $v$ for all $j$ satisfying $i \le j < \beta$. This mark movement is done by repeating the following step for each $j$ starting at $i$ and ending at $\beta - 1$: If $v$ has a $j$-mark then do nothing. Otherwise randomly choose some vertex $w$ that has a $j$-mark. Transfer each $l$-mark (where $l \ge j$) from $w$ to $v$. When Rule 2 is applied, $v$ ends up in $S_\beta$, and all the marks on $v$ are erased. If $\alpha$ changes, all $i$-marks ($i < \alpha$) are deleted. There are now the right number of marks of each type, confined to the right places. Recall that each $(\beta - 1)$-mark corresponds to a server. If a $(\beta - 1)$-mark is moved to $v$ from some vertex, a server is also moved from the same vertex.

When a request arrives for a vertex $v$ in $S_\alpha$, we apply update Rule 3. We then create $k - 1$ new $(\beta - 1)$-marks and distribute them randomly among the $k$ $(\beta - 1)$-eligible vertices. These eligible vertices are exactly those covered by a server before the request. The one of these that is chosen not to have a $(\beta - 1)$-mark is the vertex from which the server is moved.

The number of valid arrangements of marks is $\prod_{\alpha < i < \beta} (k_i + 1)$, because there are exactly $(k_i + 1)$ ways to place the $i$-marks. The following lemma shows that while running the partitioning algorithm, each valid arrangement of marks is equally likely.

Lemma 2 The partitioning algorithm is equally likely to produce each valid arrangement of marks.
Proof. We shall prove the lemma inductively. Clearly it is true initially, when there are no marks or eligible vertices. Now suppose the assertion is true before a request to a vertex v. The remainder of the proof shows that the lemma remains true after the request to v is processed.
If $v \in S_\beta$, nothing happens to the partition or the marks, so the lemma remains true.

If $v \in S_\alpha$, Rule 3 is applied and $\beta$ is incremented. No $i$-marks are moved for any $i < \beta - 1$, so the distribution of these marks is unchanged. The algorithm introduces $k_{\beta-1}$ new $(\beta - 1)$-marks, randomly placed on the $k_{\beta-1} + 1$ eligible vertices.
This leaves the case where $v \in S_i$ and $\alpha < i < \beta$. In this case the $j$-marks, where $\alpha < j < i$, are not changed by the partitioning algorithm, so their distribution remains the same. For the $j$-marks with $j \ge i$, the situation is more complex. The action of this step of the partitioning algorithm can be broken into two parts. The first part loads vertex $v$ with $j$-marks for all $j \ge i$. The second part moves $v$ into $S_\beta$ and removes all of the marks on $v$. We claim that after the first part of the process, the arrangement of marks is equally likely to be any valid arrangement that obeys the additional constraint that vertex $v$ has a $j$-mark for all $j \ge i$. It immediately follows that the induction hypothesis is maintained after the second part of the step.

It remains to prove our claim. Call an arrangement of marks $(i, j, v)$-constrained if it is a valid arrangement of marks and if vertex $v$ has marks of types $\{l \mid i \le l \le j\}$. (It may have other marks as well.) The initial arrangement of marks is equally likely to be any arrangement that is $(i, i - 1, v)$-constrained.
It is easy to verify that the $j$th step of the mark-moving process transforms a random $(i, j - 1, v)$-constrained arrangement into a random $(i, j, v)$-constrained arrangement. An induction on $j$ proves our claim. •

We use Lemma 2 to prove the following lemma, which bounds the probability that a server moves on a request that invokes Rule 2.
Lemma 3 While processing a request for vertex $v \in S_i$, where $\alpha < i < \beta$, the probability that the partitioning algorithm moves a server is at most $\sum_{i \le j < \beta} \frac{1}{k_j + 1}$.

Proof. Recall that the mark-moving procedure works by iterating over levels from $i$ up to $\beta - 1$. For each level $j$, the iteration step ensures that $v$ receives a $j$-mark. Eventually $v$ receives a $(\beta - 1)$-mark, which corresponds to a server. The probability that a server moves is bounded above by the expected number of iterations on which marks move. The probability that a move occurs on step $j$ equals the probability that $v$ still has no $j$-mark after the $(j - 1)$st step. As shown in Lemma 2, the $j$th step begins in a random $(i, j - 1, v)$-constrained arrangement. In this arrangement, the probability that $v$ has no $j$-mark is $1/(k_j + 1)$. This is the probability that a move occurs on the $j$th step. Summing over all $j$, we obtain the desired bound. •
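As a worked illustration of the sum used in the proof of Lemma 3, the following Python sketch adds the per-level probabilities $1/(k_j + 1)$ for a given list of labels $k_i, \ldots, k_{\beta-1}$. The function name `move_probability_bound` is hypothetical, and the labels in the example are made up rather than taken from a run of the algorithm.

```python
from fractions import Fraction
from typing import Sequence


def move_probability_bound(labels: Sequence[int]) -> Fraction:
    """Sum of 1 / (k_j + 1) over the labels k_i, ..., k_{beta-1}."""
    return sum((Fraction(1, k_j + 1) for k_j in labels), Fraction(0))
```

For example, with labels $k_i = 2$ and $k_{\beta-1} = 5$ the bound is $\frac{1}{3} + \frac{1}{6} = \frac{1}{2}$.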

Theorem 2 The partitioning algorithm is an $H_k$-competitive algorithm for the uniform $k$-server problem.