Finding a maximum matching in a sparse random graph in O(n) expected time

We present a linear expected time algorithm for finding maximum cardinality matchings in sparse random graphs. This is optimal and improves on previous results by a logarithmic factor.


Introduction
A matching M in a graph G = (V, E) is a set of vertex disjoint edges. The problem of computing a matching of maximum size is central to the theory of algorithms and has been the subject of intense study. Edmonds' landmark paper [4] gave the first polynomial time algorithm for the problem. Micali and Vazirani [8] reduced the running time to O(mn^{1/2}), where n = |V| and m = |E|. These are worst-case results. In terms of average-case results, Motwani [9] and Bast, Mehlhorn, Schäfer and Tamaki [2] give algorithms that run in O(m log n) expected time on the random graph G_{n,m}, in which each graph with vertex set [n] and m edges is equally likely.
One natural approach to finding a maximum matching is to use a simple algorithm to find an initial matching and then augment it. This will not improve execution time in the worst case, but as we will show, it can be used to obtain an O(n) expected time algorithm for graphs with constant average degree (O(m) in general). For a simple algorithm we go to the seminal paper of Karp and Sipser [6]. They describe a simple greedy algorithm and show that whp it will in linear time produce a matching that is within o(n) of the maximum. Here "with high probability" (whp) is shorthand for "with probability 1 − o(n^{−1/2})". Aronson, Frieze and Pittel [1] proved that whp the Karp-Sipser algorithm is off from the maximum by at most Õ(n^{1/5}) (this notation hides poly-logarithmic factors). In this paper we show that whp we can take the output of the Karp-Sipser algorithm and augment it in o(n) time to find a maximum cardinality matching. Our failure probability will therefore be o(1/log n), and so we get a linear expected time algorithm if we back it up with the algorithm from [2]. We will define a randomised algorithm Match and prove

Theorem 1. Let m = cn/2 where c is a sufficiently large constant. Let G = G_{n,m}. Then the algorithm Match finds a maximum matching in G in O(n) expected time.
The expectation here is over the choice of input and the random choices made by the algorithm.

The Karp-Sipser algorithm
This is a simple greedy algorithm. It adds a matching edge as follows: if the graph G has a vertex of degree 1, then it chooses one such vertex v at random, adds the unique edge (u, v) to the matching it has found so far, deletes the vertices u, v from G and continues. If the graph has minimum degree at least 2, then it picks a random edge (u, v), adds this to the matching, deletes u, v from G and continues. Vertices of degree zero are also considered to be deleted from G. The algorithm stops when G has no vertices.
We identify two phases in the execution of the Karp-Sipser algorithm. Phase one starts at the beginning and finishes when the current graph has minimum degree at least two. We note that if M_1 is the set of edges chosen in Phase 1, then there is some maximum cardinality matching that contains M_1, i.e. no mistakes have been made so far.
Algorithm 1 below is a pseudo-code description.
Algorithm 1 Karp-Sipser Algorithm
1: procedure KSGreedy(G)
2:   M_KS ← ∅
3:   while G has vertices do
4:     if G has vertices of degree one then
5:       Select a vertex v uniformly at random from the set of vertices of degree one.
6:       Let (v, u) be the edge incident to v.
7:     else
8:       Select an edge (v, u) uniformly at random.
9:     end if
10:    M_KS ← M_KS ∪ {(v, u)}.
11:    G ← G \ {v, u}.
12:    Remove any isolated vertices of G.
13:   end while
14:   return M_KS
15: end procedure

A vertex is said to be matched by a matching M if it is incident to an edge of M; otherwise it is unmatched.
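The greedy step above can be sketched in a few lines of Python. This is a minimal illustration using a dictionary-of-sets adjacency structure; it is not the array-based O(n + m) implementation described in the Data Structures section, and the function name is ours, not the paper's.

```python
import random

def karp_sipser(adj):
    """Karp-Sipser greedy matching sketch.

    adj: dict mapping each vertex to a set of neighbours (modified in place).
    Prefers a random degree-one vertex; otherwise takes a uniformly random
    edge.  Returns the list of matched edges.
    """
    matching = []
    while any(adj.values()):
        degree_one = [v for v, nbrs in adj.items() if len(nbrs) == 1]
        if degree_one:
            v = random.choice(degree_one)      # random degree-1 vertex
            u = next(iter(adj[v]))             # its unique incident edge
        else:
            # minimum degree >= 2: pick a uniformly random edge
            edges = [(x, y) for x, nbrs in adj.items() for y in nbrs if x < y]
            v, u = random.choice(edges)
        matching.append((v, u))
        for w in (v, u):                       # delete both endpoints
            for x in list(adj[w]):
                adj[x].discard(w)
            adj[w] = set()                     # now isolated, hence "deleted"
    return matching
```

On a path 1-2-3-4 the algorithm always matches the two pendant edges' components and returns a perfect matching of size 2.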
Let M_1 denote the matching produced by Phase 1 of the Karp-Sipser algorithm. Let the graph at the end of Phase 1 (the beginning of Phase 2) be denoted by Γ. Γ will comprise (i) a collection C_1, C_2, ..., C_k of vertex disjoint cycles and (ii) a graph Γ_1 in which each component has minimum degree two but is not a cycle. At the end of the Karp-Sipser algorithm there will be some unmatched vertices. We partition them into I_1 ∪ I_2. Here I_1 consists of (i) vertices that are not matched in Phase 1 and are already isolated by the end of Phase 1, and so are not in Γ, and (ii) one vertex from each odd cycle in the collection C_1, C_2, ..., C_k. Thus I_2 consists of the vertices of Γ_1 that are not matched by M_KS. This partition is significant in that to obtain a maximum matching, it is enough to augment the matching M_1 so that as many vertices in I_2 as possible are matched.
We now consider the output of the Karp-Sipser algorithm to be amended to include I_2.
As shown in [6], all but o(n) vertices of Γ are matched by the Karp-Sipser algorithm when G is the random graph G_{n,m}. This result was improved in [1], which showed that in fact whp all but Õ(n^{1/5}) vertices of Γ are matched. It was shown by Frieze and Pittel [5] that whp a maximum cardinality matching of Γ covers every vertex except one for each isolated odd cycle, and one further vertex if, after removing odd cycles, the number of vertices remaining is odd. (This is an existence result, non-algorithmic.) The number of vertices on isolated cycles is O(log n) whp, but this fact will not be needed and is therefore not proven. After running the Karp-Sipser algorithm and dealing with the isolated cycles, our task will whp be to match together Õ(n^{1/5}) unmatched vertices.

Outline Description of Match
We will take the output of the Karp-Sipser algorithm and then take the unmatched vertices I_2 = {v_1, v_2, ..., v_l} in pairs v_{2i−1}, v_{2i}, i = 1, 2, ..., ⌊l/2⌋. We then search for an augmenting path from v_{2i−1} to v_{2i} for i = 1, 2, ..., ⌊l/2⌋. We will refer to this as Phase 3 of our algorithm. We will show that this can be done in o(n) time whp. Phase 3 will make use of all of the edges of the graph Γ. The reader will be aware that the Karp-Sipser algorithm has conditioned the edges of this graph. We will show, however, that we can find a large set of edges A that have an easily described conditional distribution. This distribution will be simple enough that we can make use of A to show that we succeed whp. Intuitively, we can do this because the Karp-Sipser algorithm only "looks" at a small number of edges and discards most of the edges incident with the pair u, v chosen at each step. Dealing with conditioning is a central problem in probabilistic analysis. Often it is achieved by the use of concentration. Here the problem is more subtle. Note that one cannot simply run the Karp-Sipser algorithm on a random subgraph G_{n,m_1} ⊆ G_{n,m} and then use the m − m_1 random edges. This is because in this case Phase 1, when run on the subgraph, will leave extra unmatched vertices.
Algorithm 2 below is a pseudo-code description of Match. Before giving this, let us describe Match in words. The input is a graph G.
1. Run Phase 1 of the Karp-Sipser algorithm on G.

2. As above, let Γ denote G at the end of Phase 1. Γ comprises (i) a collection C_1, C_2, ..., C_k of vertex disjoint cycles and (ii) a graph Γ_1 in which each component has minimum degree two but is not a cycle.
3. Do a breadth first search of Γ and use it to identify and mark the vertices of C_1, C_2, ..., C_k.
4. Run Phase 2 of the Karp-Sipser algorithm on Γ. Note that the Karp-Sipser algorithm deals correctly with C_1, C_2, ..., C_k. Let I_2 = {v_1, v_2, ..., v_l} be the vertices of Γ_1 that remain unmatched at the end of Phase 2. Unmatched vertices in I_1 will remain unmatched, and the marking done in Step 3 will allow us to distinguish I_2.
5. For i = 1, 2, ..., ⌊l/2⌋, look for an augmenting path from v_{2i−1} to v_{2i} and use it to augment the current matching. Formally this entails running a procedure AugmentPath.
6. If there is an i ≤ ⌊l/2⌋ such that we fail to find an augmenting path, declare failure and run the algorithm of [2] from scratch.
Here is an alternative pseudo-code description of the above.

Algorithm 2 Match
1: procedure Match(G)
2:   Run Phase 1 of the Karp-Sipser algorithm on G, giving a matching M* and the graph Γ
3:   Do a breadth first search of Γ and mark the vertices of the isolated cycles C_1, ..., C_k
4:   Run Phase 2 of the Karp-Sipser algorithm on Γ, extending M*; let I_2 = {v_1, ..., v_l}
5:   for i = 1, ..., ⌊l/2⌋ do
6:     if AugmentPath(M*, v_{2i−1}, v_{2i}) fails then
7:       declare failure and run the algorithm of [2]
8:     end if
9:   end for
10:  return M*
11: end procedure

Procedure AugmentPath(M*, u, v) is described in Section 2. Its analysis constitutes the main subject matter of the paper.
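Steps 5-6 amount to a simple driver loop. The sketch below illustrates it, with `augment_path` a caller-supplied routine standing in for the paper's AugmentPath (the names and the None-on-failure convention are ours):

```python
def match_driver(unmatched, augment_path, matching):
    """Phase-3 driver sketch: take the unmatched vertices of I_2 in
    consecutive pairs and try to augment between them.

    augment_path(matching, u, v) is a hypothetical routine returning the
    augmented matching, or None if no augmenting path was found.
    Returns the final matching, or None on failure (where the full
    algorithm would fall back to the algorithm of [2])."""
    for i in range(len(unmatched) // 2):
        u, v = unmatched[2 * i], unmatched[2 * i + 1]
        result = augment_path(matching, u, v)
        if result is None:
            return None        # declare failure
        matching = result
    return matching
```

With a trivial `augment_path` that simply joins the pair, four unmatched vertices yield two new matching edges.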
Given an unmatched vertex u, an augmenting tree T_u will be a tree of even depth rooted at u such that for k ≥ 0, edges between vertices at depth 2k and 2k + 1 are not matching edges and edges between vertices at depth 2k + 1 and 2k + 2 are matching edges. We refer to the vertices at levels 2k as even vertices of the tree and the vertices at levels 2k + 1 as odd vertices. We let Odd(T) and Even(T) denote the sets of odd and even vertices respectively. In an augmenting tree, every vertex in Odd(T) has exactly one child in Even(T), and so in particular |Odd(T)| = |Even(T)|. We define the (disjoint) neighborhood of a set of vertices S by N(S) = {w ∉ S : (v, w) ∈ E for some v ∈ S}, so that in particular N(Even(T)) ⊇ Odd(T).
We refer to the leaves of T_u as the front of the tree and denote them by F_u or Φ(T_u). Given a pair of unmatched vertices u, v, AugmentPath will grow trees T_u, T_v until they are both "large", and then whp we will be able to connect a vertex x at the front (leaves) of T_u to a vertex y at the front of T_v by a path x, a, b, y where (a, b) is a matching edge. This produces an augmenting path from u to v. (There are some less likely ways of obtaining an augmenting path and these are also taken account of below.)
Analysing the growth of these trees is the subject of Section 2. It requires detailed knowledge of the distribution of the random graph Γ. This is discussed in Section 2.3.1.
To prove that we can whp find a quadruple x, a, b, y for T_u, T_v requires knowledge of the conditioning imposed by the Karp-Sipser algorithm on the edges not in M_KS. This is the subject of Section 3.
In Section 4 we use what we prove in Section 3 to show that for a given pair of unmatched vertices u, v and corresponding augmenting trees T_u, T_v, we will whp be able to complete an augmenting path.

Augmenting Path Algorithm
As already explained, an execution of AugmentPath searches for an augmenting path between two unmatched vertices of I_2. If such a path is found, the matching is augmented; if not, the algorithm returns Failure and we resort to the algorithm of [2]. Match executes AugmentPath until either there is at most one unmatched vertex of I_2 or there is a failure. Clearly, if the algorithm does not fail then it finds a maximum cardinality matching.
We now define some more terms associated with the tree growing process. A blossom rooted at v is a cycle of odd length in which the edges on the path starting and ending at v alternate between matching and non-matching edges. An edge (x, y) that creates a blossom when added to T_u is called a blossom edge of T_u.
Given two augmenting trees T_u, T_v, rooted at u, v respectively, a hit edge is an edge (x, y) such that x is an even node of T_u and y is an odd node of T_v. Note that given a hit edge (x, y), the subtree of T_v rooted at y can be taken from T_v and added to T_u by removing the edge from y to its parent node in T_v (see Figure 2).
The trees T_u, T_v are grown until we either find an augmenting path or the trees cannot be grown further. For each of u, v we maintain a list of blossom edges and hit edges encountered so far.
The basic operation of our algorithm is to explore a vertex on the front. Vertices are explored in the order in which they are added to the tree. Throughout the algorithm we will keep track of which vertices we have explored in Phase 3, labeling them as exposed. This is mainly to keep track of which vertices have "no randomness" left, because we have seen all the vertices they are adjacent to.
To explore x on the front F_u of T_u we look at each y incident to x such that (x, y) is not a matching edge:

T1 If y belongs to neither of the trees and is matched, add it to T_u along with its matching edge (y, y′).

T2 If y is unmatched, we have an augmenting path from u to y.

T3 If y is an even vertex of T_v, then the path from u to x in T_u, the edge (x, y) and the path from y to v in T_v form an augmenting path.

T4 If y is an odd vertex of T_v, then (x, y) is a hit edge; append it to the list of hit edges for u.

T5 If y is an even vertex of T_u, then (x, y) along with the paths from x, y to their lowest common ancestor in T_u forms a blossom; append (x, y) to the list of blossom edges for u.

T6 If y is an odd vertex of T_u, we do nothing.
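The case analysis T1-T6 is a pure classification of the edge (x, y), and can be written as a small function. The sketch below is illustrative only; the dictionaries `mate`, `in_Tu`, `in_Tv` and `parity` are our assumed bookkeeping, not structures named in the paper.

```python
def classify_edge(y, mate, in_Tu, in_Tv, parity):
    """Classify the non-matching edge (x, y) found while exploring x on the
    front of T_u, following cases T1-T6.

    mate[y]   -- y's matching partner, or None if y is unmatched
    in_Tu/in_Tv -- sets of vertices currently in each tree
    parity[y] -- 'even' or 'odd' within y's tree
    """
    if y not in in_Tu and y not in in_Tv:
        # T2: unmatched vertex found; T1: grow T_u by (x, y) and (y, mate[y])
        return 'T2-augmenting-path' if mate[y] is None else 'T1-grow'
    if y in in_Tv:
        # T3: augmenting path through the other tree; T4: record a hit edge
        return 'T3-augmenting-path' if parity[y] == 'even' else 'T4-hit-edge'
    # y is in T_u itself: T5 records a blossom edge, T6 ignores the edge
    return 'T5-blossom-edge' if parity[y] == 'even' else 'T6-ignore'
```

Each of the six outcomes corresponds to one branch, so the exploration of a vertex costs O(1) per incident edge.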
After examining all edges incident to x, we label it as exposed. Examples of the six cases for the edge (x, y) are shown in Figure 1, and examples of hit edges and blossoms are shown in Figures 2 and 3. The description of AugmentPath uses a small constant ε; any ε small enough will work, but one concrete value is given in (1). In each execution of AugmentPath the growth of both trees is split into three stages. Within a stage the growth of each tree is divided into growth spurts. In a spurt one of the trees T_u, T_v will be selected and all of the eligible vertices of its front will be explored. A tree starts in Stage 1 until it reaches a size of n^{0.6−ε}, after which Stage 2 starts. If a tree is in Stage 1 then all vertices on its front are eligible. Although such a tree is "large enough", we cannot be sure that it has enough unexposed vertices on the front. This is why we need Stage 2. During this stage only already exposed vertices on the front are eligible. Thus we do not explore the unexposed vertices at the front in this stage, and it lasts until an n^{−ε} fraction of the vertices on its front are unexposed. This means that we create at most n^{0.6−ε} new exposed vertices per tree in Stages 1 and 2. When the tree, T_u say, has the required fraction of unexposed vertices on the front, it is placed in Stage 3 and we select a set S_u of size n^{0.6−2ε} at random from the set of unexposed vertices of the front. If both trees reach Stage 3 then we try to connect S_u, S_v by a path (x, a, b, y) of length three whose middle edge is a matching edge. If we succeed then we will have created an augmenting path from u to v. Otherwise we declare failure. Note that we expose at most n^{0.6−ε} + n^{0.6−2ε} new currently unexposed vertices per tree per execution of AugmentPath.
During a spurt we explore the front of the selected tree and first deal with cases T1 and T2. If exploring a vertex finds an augmenting path (T2) then the current execution of AugmentPath is finished. If this does not happen and there are no instances of T1, then we inspect first the list of blossom edges and see if we can grow the tree using them. For each blossom found, contract the blossom into a supernode, consider this supernode to be at the front of the tree and try to grow by exploring edges available to the supernode. If there is no growth through blossom edges then we inspect the list of hit edges. If (x, y) is a hit edge as in T4 with x ∈ F_u, then we can grow T_u by adding to it the edge (x, y) plus the subtree of T_v below y. If the tree still doesn't grow then we exit the execution of AugmentPath and fail.
The restriction in Stage 2 to only exploring already exposed vertices is a little unnatural, and one would hope to be able to avoid it. We impose it here so that we can properly bound the amount of work done overall. Without the restriction, we could conceivably do a super-linear amount of work by unnecessarily exposing too many vertices.
To determine which of the trees to grow we look at the stages and sizes of the trees.If both trees are in Stages 1 or 2 we grow the smaller of the two.When only one tree is in Stage 1 we grow that tree.When one tree is in Stage 2 and the other is in Stage 3, we grow the one in Stage 2. When both are in Stage 3, we finish via a simple search for an augmenting path.
As part of our argument we will show that whp we only expose o(n^{0.8}) vertices altogether. This will ensure that we reach Stage 3, unless T2 or T3 occurs; otherwise we would have run out of exposed vertices. Typically, we expect to find an augmenting path through T3, or we find that both trees reach Stage 3. Then, at the first time both are in Stage 3, there will be many opportunities for the unexposed vertices to find a path x, a, b, y where x ∈ S_u, y ∈ S_v, (a, b) ∈ M. (In Figure 1, the edges e_i correspond to the cases Ti for i = 1, ..., 6.)

Data Structures
Aiming for a linear running time for Match means that we have to be careful in our use of data structures. We describe the data structures needed for Karp-Sipser and AugmentPath to ensure a linear running time. For Karp-Sipser, given in Algorithm 1, we need to efficiently (i) access a list of neighbours of a vertex, (ii) delete a vertex from the graph, (iii) select an edge at random and (iv) select a vertex of degree one at random. This can be done by storing the vertices of G as an array of vertices V. Each vertex V[i] stores its degree, a linked list L of edge-structures, and a pointer to its position in the array of vertices of degree one, if it has one. Each edge-structure stores a pair of vertex indices and a pointer to the edge-representative-structure for (u, v) described below.
We also store a list E of edge-representative-structures. Each edge-representative-structure corresponding to an edge (u, v) stores the vertex indices u and v and one pointer per vertex, pointing to the position of (u, v) in the edge-structure list of V[u] and V[v] respectively.
Finally we keep track of which vertices are of degree one as an array V 1 of vertex indices.
Accessing the neighborhood of any vertex u can be done in time O(deg_G(u)). Selecting an edge at random can be done by selecting an index i at random from 1 to |E| and accessing E[i]; similarly for selecting a vertex of degree one at random. Deleting a vertex u involves deleting the edges (u, v) for all neighbours v of u; this can also be achieved in time O(deg_G(u)). This shows that the total running time of Karp-Sipser is O(n + m).
When deleting a vertex u we repeat the following for all neighbours v of u. First swap the edge-representative-structure with the last element of E, repairing the pointers as necessary and decreasing the size of E by one. Then the two edge nodes of V[u] and V[v] are removed from their lists. After deleting all edges incident with u, we swap V[i] with the last element of V, repairing all pointers of the elements swapped and decreasing the size of V by one. Finally, if u is in V_1 we delete it from V_1 by swapping the element it points to with the last element of V_1 and decreasing its size by one.
The matching computed by Karp-Sipser is stored as an array M indexed by vertex indices, each entry containing the pair of vertices of the matching edge; unmatched vertices have an empty pair.
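The swap-with-last deletion scheme for the list E can be sketched compactly. The class below is our illustration of the idea, with a position map playing the role of the back-pointers described in the text; it is not the paper's exact structure.

```python
import random

class EdgeList:
    """Array of edges supporting O(1) uniform sampling and O(1) deletion
    by swapping the deleted slot with the last element, as for the list E
    of edge-representative-structures."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.pos = {e: i for i, e in enumerate(self.edges)}  # back-pointers

    def sample(self):
        return random.choice(self.edges)       # O(1) uniform random edge

    def delete(self, e):
        i = self.pos.pop(e)
        last = self.edges.pop()                # shrink the array by one
        if last != e:                          # move the last edge into the hole
            self.edges[i] = last
            self.pos[last] = i
```

Because deletion never shifts more than one element, a vertex deletion costs O(deg(u)) total, which is what the O(n + m) bound for Karp-Sipser needs.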
For the extension to Karp-Sipser we note that we can detect whether a vertex u is on an isolated cycle by running a breadth-first search algorithm and counting the number of vertices and edges seen. If the two are equal then the connected component containing u is an isolated cycle.
A list of all vertices on isolated cycles then allows us to find I_2 in time O(n).
For AugmentPath we allow ourselves more room in keeping track of things. We need to keep track of the trees T_u and T_v and determine whether a vertex x belongs to either of them. We also need to keep track of blossom and hit edges, and of supernodes formed by contracting blossoms. Each of these tasks can be done with a suitably sophisticated data structure, such as a balanced tree, with a running time of O(log n) per operation. Our analysis will show that the number of operations executed in Phase 3 is less than n^{1−ε}, and this saving of n^{−ε} will easily account for any log n factors.

Tree Expansion
In this section we will show that whp both of our augmenting trees will reach Stage 3, unless their growth stops prematurely because an augmenting path has been found.Later we will argue that when both of the trees are in Stage 3 then their fronts contain many unexposed vertices and then whp an augmenting path can be found via a short path connecting these fronts.
In particular we will show that if the absolute constant α_1 is sufficiently large, 0 < α_2 < 1/2 and c = 2m/n ≥ c_0, where c_0 is sufficiently large, then the following four lemmas hold. The proofs of the first three lemmas are heavy on computation; we have moved them to a separate section.
The algorithm uses the constant ε and in addition the proof uses a constant γ.

Lemma 5. Let Γ be such that the low probability events described in Lemmas 2, 3 and 4 do not occur. Let u, v be a pair of isolated vertices for which an augmenting path is sought, and let M be an arbitrary matching of Γ. Suppose that we do not find an augmenting path as in Steps T2, T3. Then the following hold:

Proof of Lemma 5: Suppose that (a) does not hold, that T_u is selected for a spurt and that the execution of AugmentPath fails. Neither tree can grow in size via T1-T6. We split the analysis into two cases depending on the size of T_u.
Case 1: |T_u| ≤ (α_2/10) log_c n. Note that when we explore vertices on the front we have seen one matching edge incident to them, and they have degree at least two. Furthermore u has degree at least two, so F_u starts out with at least two vertices. The only way the tree cannot grow is if we are in cases T4, T5 or T6.
All of the non-matching edges incident with the front F_u of T_u go to within T_u. Observe that there are no hit edges leaving F_u, for otherwise T_u would grow and AugmentPath would not fail. Observe next that |F_u| can only be reduced either (R1) when a non-matching edge (x, y) with x ∈ F_u has y ∈ T_u, thus creating a cycle inside T_u, or (R2) when a piece of T_u is removed by a hit edge from T_v.
Remark 6. We will see in the analysis in Case 2 below that no hit edges are needed by T_v when either we have two edges going from F_u to within T_u that form two cycles, a violation of Lemma 3, or |F_u| = 2 and we have one edge that forms a blossom. If the algorithm has failed after the contraction of the blossom then there are several cases: (i) T_u plus the blossom edge is an isolated odd cycle (contradicting the fact that u ∈ I_2); (ii) the vertices of T_u already contain a short cycle through a case of T6, which together with the blossom contradicts Lemma 3; (iii) the tree T_u had a branch which was grafted to T_v via a hit edge. But by Remark 6 we see that at this time |T_v| ≤ (α_2/10) log_c n. So either the vertices of T_v contain a small cycle, which with the blossom contradicts Lemma 3, or there are at least two hit edges, which again contradicts Lemma 3.
If F_u = {x} is of size one, then either x has degree at least two or it is a supernode from a previously contracted blossom. If x has degree at least two, we must already have encountered R1 or R2 above. In case R1 we have found one small cycle C with vertex set contained in V(T_u) \ F_u. Since there are no hit edges and T_u cannot grow, the non-matching edge incident with F_u has both endpoints in T_u, and we therefore have two cycles in V(T_u), which violates the conditions of Lemma 3, a contradiction. (The reader might be concerned that we could have found an isolated odd cycle. However u, v ∈ I_2 and so neither of them is on such a cycle.) Now consider the case where F_u = {x} is of size one and the reduction in front size was caused by a hit edge (x, y) from T_v. We see from Remark 6 that when T_v used (x, y) to grow, its size was at most (α_2/10) log_c n, and either its vertices contained a cycle C or there were at least two hit edges from T_v to T_u, which again leads to a small cycle. Now consider the non-matching edge (x, y) incident with F_u where x ∈ T_u. y cannot be in T_v, since there are no hit edges from T_u at this time, and it cannot be in T_u because of Lemma 3, a contradiction.
If the only node on the front F_u is a supernode with degree one, then either R1 or R2 has happened previously, since |F_u| ≥ 2 initially. If R1 occurred, this would imply that V(T_u) contains two cycles. If R2 occurred, then by the same argument as above both trees must have been of size at most (α_2/10) log_c n at the time. Therefore there exists a subgraph contained in V(T_u ∪ T_v), of size at most (2α_2/10) log_c n, with two cycles, contradicting Lemma 3.
Case 2: |T_u| ≥ (α_2/10) log_c n. We now show that if T_u is selected for a spurt then it will grow a new front of size at least (4c/5)(1 − n^{−ε})|T_u|. We can assume that c ≥ 10α_1/α_2. By Lemma 2 we know that the size of the new front is at least this large. The factor (1 − n^{−ε}) accounts for only growing from exposed vertices. Some of these vertices might be odd vertices of T_v and we must account for this. It is enough to show that the front of T_u cannot be adjacent to (c/10)|T_u| odd vertices of T_v. Suppose this is the case and call this set A. Let T_A be the tree obtained by taking the union of all the paths from A to v within T_v; A is not considered to be part of T_A. Now T_A is contained within the tree T′_v, which is T_v minus the matching edges on the front. Consider the last time T′_v was grown and look at the rule used to decide which tree to grow. At this point T_u is a subtree T′_u of the T_u at failure. Case 2a: if both trees were in Stage 1 or 2, then the smaller tree was grown, so T′_v was smaller than a subtree of T_u. This implies that the subgraph spanned by V(T′_u ∪ T′_v) has at least (1 + c/20) ≥ 3/2 times as many edges as vertices, and this contradicts the result of Lemma 4. We have seen that while in Stage 2 a tree grows in size by a factor of at least 3c/5, unless it reaches size n^{0.99}; see Lemma 2. This is impossible, as it would require the exposure of at least n^{0.99−ε} vertices to this point. But we only expose at most 2n^{0.6−ε} new vertices during an execution of AugmentPath, and there are only Õ(n^{1/5}) executions of AugmentPath. Thus both trees reach Stage 3.
Part (b) follows because we have shown that T u grows by a factor 3c/5 during Stage 2.

Proofs of Lemmas 2, 3, and 4
This requires some knowledge of the distribution of the random graph Γ.

Structure of Γ
Suppose that Γ has ν vertices and µ edges. By construction it has minimum degree at least two. It is shown in [1] (Lemma 2) that if G is distributed as G_{n,m}, m = cn/2, c > e (the base of natural logarithms), then Γ_1 is distributed as G^{δ≥2}_{ν,µ}, i.e. Γ has ν vertices and µ edges, and G^{δ≥2}_{ν,µ} is uniformly chosen from simple graphs with ν vertices, µ edges and minimum degree at least 2. The precise values of µ, ν are not essential, but we will describe how they are related to c. Let z, β be defined by the pair of equations preceding (3); then whp (3) holds, where ∼ denotes = (1 + o(1)). (Equation (3) follows from Lemma 8 of [1].) Note that as c → ∞ we have z ր c.
The maximum degree in G is less than log n whp, and Γ inherits this property. Equation (7) of [1] enables us to claim that if ν_k, 2 ≤ k ≤ log n, is the number of vertices of degree k, then whp, for some constant K_1 > 0, ν_k satisfies the bound given there.
In particular, this implies that if the degrees of the vertices in Γ are d_1, d_2, ..., d_ν, then whp (6) holds. Given the degree sequence, we make our computations in the configuration model; see Bollobás [3]. Let d = (d_1, d_2, ..., d_n) be a sequence of non-negative integers with d_1 + ··· + d_n = 2m even. Let W = [2m] be our set of points, partitioned into consecutive cells W_1, W_2, ..., W_n with |W_v| = d_v, and let φ(w) = v for w ∈ W_v. Given a pairing F (i.e. a partition of W into m pairs) we obtain a (multi-)graph G_F with vertex set [n] and an edge (φ(u), φ(v)) for each {u, v} ∈ F. Choosing a pairing F uniformly at random from among all possible pairings of the points of W produces a random (multi-)graph G_F.
This model is valuable because of the following easily proven fact: every simple graph G ∈ G_{n,d} arises from the same number of pairings F. It follows that if G is chosen randomly from G_{n,d}, then for any graph property P, P(G ∈ P) ≤ P(G_F ∈ P)/P(G_F is simple). Furthermore, applying Lemmas 4.4 and 4.5 of McKay [7], we see that if the degree sequence of Γ satisfies (6) then P(G_F is simple) = Ω(1). In this case the configuration model can substitute for G_{n,d} in dealing with events of probability o(1).
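Sampling from the configuration model is mechanical and can be sketched as follows. This is a generic illustration of the Bollobás construction described above, not code from the paper; the function name is ours.

```python
import random

def configuration_multigraph(degrees, rng=random):
    """Sample G_F from the configuration model.

    degrees: list of non-negative integers d_1, ..., d_n with even sum.
    Each vertex v contributes d_v copies of itself to the point set W
    (so the list of points realises the map phi); a uniform pairing of W
    is obtained by shuffling and pairing consecutive points.
    Returns a list of edges, possibly with loops and multi-edges."""
    points = [v for v, d in enumerate(degrees) for _ in range(d)]
    assert len(points) % 2 == 0, "sum of degrees must be even"
    rng.shuffle(points)
    # consecutive points form the pairing F; edges are their phi-images
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]
```

By construction the multigraph always realises the prescribed degree sequence, which is what makes the model convenient for computations conditioned on degrees.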

Proof of Lemma 2
We show that whp augmenting trees of size λ = 2l + 1, where (α_1/c) log n ≤ λ ≤ n^{0.9}, expand at a steady rate close to c.
We will first show that the probability of the event that there exists a tree T with |Even(T)| = l and |N(Even(T))| = r, where (α_1/c) log n ≤ l ≤ n^{0.9} and l ≤ r ≤ 0.91cl, is polynomially small. If the above event occurs, then the following configuration appears: (i) 2l edges of the tree connect Even(T) to Odd(T); (ii) r − l edges connect Even(T) to U = N(Even(T)) \ Odd(T); and (iii) none of the (l + 1)(n − r − l − 1) possible edges between Even(T) ∪ {root vertex} and V \ U are present. Given the vertices of T and N(Even(T)), the probability of the above event occurring in G^{δ≥2}_{n,m} is bounded above by the expression in (8). Explanation: the probability that an edge exists between vertices u and v of degrees d_u and d_v, given the existence of other edges in the tree, is at most d′_u d_v/(2µ), where d′_u is d_u less the number of edges already assumed to be incident with u. Hence, given the degree sequence, the probability that the augmenting tree exists is at most (8), where the first product corresponds to Even(T), the second product corresponds to Odd(T) and the final product corresponds to neighbours of T (not in T).
We now simplify the expression (8) obtained for the probability to (9), using r = O(l).
We now count the number of such configurations. We begin by choosing Even(T) and the root vertex of the tree in at most ν·(ν choose l) ways. We make the following observation about augmenting path trees with |Even(T)| = l: the contraction of the matching edges of the tree yields a unique tree on l + 1 vertices. We note, by Cayley's formula, that the number of trees that could be formed on l + 1 vertices is (l + 1)^{l−1}. Reversing this contraction, we now choose the sequence of l vertices, Odd(T), that connect up vertices in Even(T) in (ν − l − 1)(ν − l − 2)···(ν − 2l) ways. We pick the remaining r − l vertices from the remaining ν − 2l − 1 vertices in (ν − 2l − 1 choose r − l) ways. These r − l vertices can connect to Even(T) in l^{r−l} ways. Hence, the total number of configurations is at most the product of these factors. Combining the bounds for the probability and the number of configurations, and noting that the expression (eλl/x)^x is maximized at x = λl > 0.9999cl while r − l ≤ 0.91cl < λl, we have the bound O(ν^{3/2})q^l, where q = 0.93e^2 cλ^2 e^{−0.002c} ≤ e^{−0.001c}.
We sum the above expression over all r and l with (α_1/c) log n ≤ l ≤ n^{0.9} and l ≤ r ≤ 0.91cl, and we find that the probability is polynomially small for α_1 ≥ 5501.
We will now show that the probability of the event that there exists a tree having |Even(T)| = l and |N(Even(T))| ≥ 1.07cl, where (α_1/c) log n ≤ l ≤ n^{0.9}, is also polynomially small. It is enough to show that the probability that there exists a tree with |Even(T)| = l and |N(Even(T))| ≥ r, where (α_1/c) log n ≤ l ≤ n^{0.9} and r = 1.07cl, is polynomially small, since a tree with |N(Even(T))| > 1.09cl also contains a tree with |N(Even(T))| ≥ 1.07cl.
The probability bound (9) remains the same with the 0.92 replaced by 1.08, and, using e^{0.999}(e/1.07)^{1.07} > e^{0.002}, we get a bound of O(ν^{3/2})q^l where q = 1.08e^2 cλ^2 e^{−0.002c} ≤ e^{−0.001c}.
We sum the above expression over all l with (α_1/c) log n ≤ l ≤ n^{0.9}, and we find that the probability is polynomially small. Hence, whp, there does not exist a tree with |Even(T)| = l and |N(Even(T))| = r, where (α_1/c) log n ≤ l ≤ n^{0.9} and r ∉ [0.9cl, 1.1cl].

Proof of Lemma 3
We show that this holds in G_{n,m} and note that G^{δ≥2}_{ν,µ} is a vertex induced subgraph of G_{n,m}. Since the property is closed under edge deletion, this will imply the lemma. Because the property in question is monotone increasing, we can estimate the event's probability in G_{n,p} where p = c/n. If there are two small cycles close together, then there will be a path P of length at most k = a + b + d plus two extra edges joining the endpoints of P to internal vertices of P. The probability of this can be bounded by a first moment calculation.
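The first moment count behind this bound can be sketched as follows. This display is our reconstruction under the assumption that the lemma concerns a fixed bound k on the path length, not the paper's exact statement: choosing the ordered vertices of a path of length j costs at most n^{j+1}, the two extra edges each end at one of at most j internal vertices, and each of the j + 2 edges is present independently with probability p = c/n.

```latex
\Pr[\exists\, P] \;\le\; \sum_{j \le k} n^{j+1}\, j^{2}\, p^{\,j+2}
              \;\le\; \sum_{j \le k} j^{2}\, \frac{c^{\,j+2}}{n}
              \;=\; O\!\left(\frac{k^{3}\, c^{\,k+2}}{n}\right).
```

For fixed k and c this is O(1/n), so whp no two small cycles lie close together.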

Proof of Lemma 4
We can work in G_{n,p}, for exactly the same reasons as in Lemma 3, and a similar first moment calculation gives the required bound.

Karp-Sipser conditioning
We now study the conditioning introduced on unused edges by the Karp-Sipser algorithm. We view Γ as an ordered sequence of edges and look at an equivalent version of the Karp-Sipser algorithm. In the analysis of Karp-Sipser on random graphs we have two sources of randomness: one is the random graph itself and the other is the random choices made by the algorithm. In order to simplify the analysis we make the choices deterministic and instead randomize the order in which the edges are stored, taking them in this (random) order. This is equivalent to the original algorithm. We now state the modified Karp-Sipser algorithm. We stress that we do not propose to implement the Karp-Sipser algorithm in this way; it is merely a vehicle used in our analysis. It does, however, produce the same output if the ordering is matched to the choices made by the original Karp-Sipser algorithm. We consider its effect on Phase 2. We assume the graph Γ at the start of Phase 2 is given as Γ = (ξ_1, ..., ξ_µ), an ordered set of µ edges.
We say that edge e ∈ Γ has index i if it is the i-th edge in the list, i.e. e = ξ_i. Note that every graph in the support of G^{δ≥2}_{ν,µ} yields µ! ordered sets of edges, so from now on we think of G^{δ≥2}_{ν,µ} as a family of ordered sets of edges. Furthermore, if c is large then µ/ν will be close to c, whp.
1: procedure KS*(G)
2:   M ← ∅
3:   while G contains at least one edge do
4:     if G has vertices of degree 1 then
5:       Of all edges incident to vertices of degree 1, let e have the lowest index
6:       Let e = (v, u) where v has degree 1
7:     else
8:       Let e = (v, u) be the edge of lowest index in G
9:     end if
10:    M ← M ∪ {e}
11:    Remove u, v and all edges incident to them from G
12:  end while
13:  return M
14: end procedure

The output of the KS* algorithm consists of an ordered matching M = {e_1, e_2, . . ., e_ℓ} plus some extra witness edges W. Here e_1, e_2, . . ., e_ℓ is the order in which the matching edges are produced by the Karp-Sipser algorithm. More precisely, the output is a sequence σ_1, σ_2, . . ., σ_µ where σ_i is either (i) an edge of G, marked as being a matching edge or a witness, or (ii) ∗. The choice σ_i = ∗ signifies that the i-th edge is random with a distribution described in Section 3.2. In addition, the output includes the ordering of M by the time of choice by the Karp-Sipser algorithm.
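Although KS* is only an analytical device, a small Python sketch (our own, purely illustrative) may help fix ideas: all randomness sits in the order of the edge list, and the greedy rule always prefers the pendant edge of lowest index.

```python
def ks_star(edges):
    """Karp-Sipser on an ordered edge list: take the lowest-index edge
    incident to a degree-1 vertex if one exists, otherwise the
    lowest-index edge overall.  `edges` is a list of (u, v) pairs; the
    list order supplies all of the algorithm's randomness."""
    # adjacency: vertex -> set of indices of surviving incident edges
    adj = {}
    for i, (u, v) in enumerate(edges):
        adj.setdefault(u, set()).add(i)
        adj.setdefault(v, set()).add(i)
    alive = set(range(len(edges)))
    matching = []
    while alive:
        pendant = [i for i in alive
                   if any(len(adj[x]) == 1 for x in edges[i])]
        i = min(pendant) if pendant else min(alive)
        matching.append(edges[i])
        for x in edges[i]:                 # delete both endpoints and
            for j in list(adj[x]):         # every edge incident to them
                adj[edges[j][0]].discard(j)
                adj[edges[j][1]].discard(j)
                alive.discard(j)
    return matching
```

On a path a–b–c it matches the lowest-index pendant edge (a, b); on a triangle it matches one edge and stops, as expected of the greedy rule.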
There will therefore be a set Ψ of possible edge replacements for the positions marked ∗. Given the output, each member of Ψ is equally likely; this follows from the fact that G is sampled uniformly from a set of instances. We need, however, to know more about the structure of Ψ, and this is the content of Lemma 7 below.

Witness edges
In addition to the edges of the matching, we define further edges based on the execution of the algorithm. We split the vertices of the graph into three classes: regular, pendant and unmatched. A vertex is regular if, when it was removed from the graph, it had degree 2 or more. A vertex is pendant if, when it was removed, it had degree exactly one and it is the endpoint of a matching edge in M. Unmatched vertices are those vertices not incident to matching edges. We say that an edge e is regular if both of its endpoints are regular, i.e. it was removed from the graph in line 8. For each of these vertices we define witness edges.
• Regular vertices: v is removed from the graph when the edge e is picked as a matching edge. Since v has degree at least 2, there are other edges incident to it at the time it is removed. Pick the one with the lowest index and define it to be the regular witness edge for v.
• Pendant or unmatched vertices: Consider the last point in time when v has degree at least 2; an edge e = (x, y) is removed from the graph and v is adjacent to at least one of its endpoints (perhaps both), say x. We then define (v, x) to be the pendant witness edge for v.
• Unmatched vertices: v has a pendant witness edge, and since it is never picked for a matching, its last edge is incident to some matching edge e = (x, y), say at x. We then define (v, x) to be the removal witness edge for v.
• In case of any ambiguities, define pendant witness edges first and then removal witness edges, and use the lowest index of the edges involved to break all remaining ties. Ambiguities can arise if a vertex goes from having degree three to pendant, or from having degree 2 to degree zero when it is adjacent to both endpoints of a matched edge.
• Note that an edge e can be a regular witness edge for one vertex and a pendant or removal witness edge for another vertex.
Let W be the set of witness edges. Regular and pendant vertices are incident to matching edges and to their witness edges; unmatched vertices are incident to two witness edges. Hence the graph defined by M and W has minimum degree 2 and size at most 2ν.
We think of the graph G as an ordered set of µ boxes filled with edges. Suppose we know the output of KS*, M, W, and also the order in which the matching and witness edges were added to M and W, but the underlying graph is unknown to us. This corresponds to µ ordered boxes, of which the ones corresponding to M and W have been opened. The following lemma provides necessary and sufficient conditions for a graph G to yield M and W as the output of KS*.

Lemma 7 In the following, e = (u, v) is not an edge of the graph G′ and G = G′ ∪ {e}. Think of G′ as an ordered set of µ boxes where one box is unopened; to obtain G we open this box and find e. Suppose that when algorithm KS* is run on G it produces the ordered matching M and the witness set W, and that when algorithm KS* is run on G′ it produces the ordered matching M′ and the witness set W′.

Preservation Conditions.
1. If u and v are both regular vertices and u was removed from the graph before v, then e has a higher index than the regular witness edge for u.
2. If u is a regular vertex and v is either a pendant or an unmatched vertex, then (i) e has a higher index than the regular witness edge for u, (ii) u is removed before v by the Karp-Sipser algorithm, and (iii) when u is removed from the graph, v has degree at least 2. Additionally, if the pendant witness for v is incident to the matching edge of u, then e has a higher index than the pendant witness for v.
3. At least one of u, v is a regular vertex.

Proof of Lemma 7:
To keep track of the algorithm KS* we let G_t denote the graph after the t-th iteration, so G_0 = G and G_T = ∅ where T is the number of iterations. At time step t let Π_t be the set of edges incident with pendant vertices of G_t. Let M_t denote the set of matching edges and W_t the set of witness edges at time t. For G′ we define G′_t etc. in the same manner.

(a) Assume that u is removed from G first, at time step t_u + 1, i.e. u ∈ G_{t_u} and u ∉ G_{t_u+1}. If u, v are both G-regular then this can be assumed without loss of generality. (We use the term G-regular to stress that we are considering the execution of the Karp-Sipser algorithm on G.) If u is a G-regular vertex and v is either a G-pendant or a G-unmatched vertex then u must be removed first. Suppose not, and v is removed first. Suppose that v is G-pendant. Then when v is removed, edge e is still in G_t and it has a higher index than the G-witness edge for v. But this implies that v is a G-regular vertex, a contradiction. A similar argument handles the case when v is unmatched.
We show next that M_t = M′_t for t = 1, . . ., t_u. If this holds for all t up to t_u then after that we will have G_t = G′_t for t > t_u, since u has been removed and e is gone from the graph. This is proved by induction; the base case is easy since M_0 = M′_0 = ∅.

Case 1: Both u and v are G-regular. Since e is not incident to a G-pendant vertex and e ∉ W, we have Π_t = Π′_t. If Π_t ≠ ∅ then we select the edge from Π_t with minimum index and add it to M_t; since Π_t = Π′_t we have M_{t+1} = M′_{t+1}. If Π_t = ∅ we select the edge in G_t of minimum index. This cannot be e, since e comes after the regular G-witness edge f for u: if e came before f then f would not be the G-witness edge for u. Hence the preservation conditions hold, the same edge is chosen in both G_t and G′_t, and M_{t+1} = M′_{t+1}.
Case 2: u is a G-regular vertex and v is either a G-pendant or a G-unmatched vertex. Now u is removed from G before v and deg_{G_{t_u}}(v) ≥ 2. This is clear since e is not a pendant witness for v, and the pendant witness edge is the second-to-last edge incident to v to be removed from the graph.
Neither u nor v is a G-pendant vertex for t ≤ t_u, so e is not incident with a G-pendant vertex and hence Π_t = Π′_t. Thus if Π_t ≠ ∅ we have M_{t+1} = M′_{t+1} as before. If Π_t = ∅ we use the argument for Case 1, with one additional comment: if the G-pendant witness f for v is incident to the matching edge of u, then e must have a higher index than f, for otherwise e would be the G-pendant witness for v.
Case 3: If u, v are both G-pendant vertices and u is matched before v, then u is incident to e and to its matching edge at the time it is matched, contradicting the fact that deg_{G_{t_u}}(u) = 1. If u is G-pendant and v is unmatched then we draw the same conclusion. If both u, v are G-unmatched and u becomes isolated first, then e would have to be the G-removal witness for u, a contradiction since v is still in the graph.
We have now shown that e satisfies the preservation conditions with respect to G, and that the same ordered matching M will be generated for G and G′. In particular we have shown (10). This, together with the fact that e has a higher index than the G-regular witness edge f for u, immediately implies that W′ = W, because the Karp-Sipser algorithm will have the same choices for witnesses at each point. This completes the proof of Part (a).

(b) Assume now that the preservation conditions hold with respect to the execution of the Karp-Sipser algorithm on G′. It is enough to show that (10) holds under these assumptions; this will imply that W′ = W. Consider the execution of the Karp-Sipser algorithm on G. Assume, as in (a), that M_t = M′_t, so G_t = G′_t ∪ {e}. We show that M_{t+1} = M′_{t+1}.

Case 1: Both u and v are G′-regular. Since e is not incident to a G′-pendant vertex, we have Π_t = Π′_t. If Π_t ≠ ∅ then we select the edge from Π_t with minimum index and add it to M_t; since Π_t = Π′_t we have M_{t+1} = M′_{t+1}. If Π_t = ∅ we select the edge in G_t of minimum index. This cannot be e, since e comes after the regular G′-witness edge f for u, and f ∈ G′_t implies that f ∈ G_t at this point. Hence the same edge is chosen in both G_t and G′_t and M_{t+1} = M′_{t+1}.

Case 2: u is a G′-regular vertex and v is either a G′-pendant or a G′-unmatched vertex. By the preservation conditions u is removed from G′ before v and deg_{G′_{t_u}}(v) ≥ 2. If Π_t ≠ ∅ we have M_{t+1} = M′_{t+1} as before. If Π_t = ∅ we use the argument for Case 1.
Hence the graphs G and G′ will generate the same ordered matchings and (10) holds. This completes the proof of Part (b). Part (c) follows from (10): it immediately shows that if a vertex x is G-regular then it is G′-regular. We only have to check that if x is G′-pendant then it is G-pendant. Here we must have x ∈ {u, v}, and then x = v, since we have assumed that u is deleted first and is G′-regular. But the fact that u is deleted before v means that adding the edge e cannot change the type of vertex v when going from G′ to G.
Remark 8 Note that it is possible to add an edge to the graph that produces the same set of matching edges but a different witness set. Since we condition on both sets, and on the exact order in which they were produced, we are not concerned with such cases.
Remark 9 Part (a) of Lemma 7 shows that we can remove all edges except matching and witness edges and KS* will produce the same output on this "skeleton" graph. Thus we can partition the set of all graphs so that each part corresponds to the same skeleton graph. Part (b) of Lemma 7 allows us to construct every graph corresponding to a given skeleton graph.
Remark 10 We stress the following point about condition 2(iii). If (u, x) is a pendant edge at the time of u's removal and v was pendant at this time, then adding (u, v) would mean that G_{t_u} and G′_{t_u} have different sets of pendant vertices, contradicting our argument for Part (a).

Probability Space
We describe the probability space obtained by sampling a random graph from G^{δ≥2}_{ν,µ} and running KS* on it. We condition on the output matching edges M and also on the witness edges W; for the purpose of the proof we assume that KS* outputs W as well as M. Given the output M and W, Lemma 7 lets us find all graphs that would give M and W as the output of KS*, and generate one uniformly at random.
First note that for each box i not in M or W we can create a set E_i of edges that could go into that box; by Lemma 7 this set depends only on M and W. Note also that all the rules state that an edge can go into any box that comes after some specified box, so E_i ⊆ E_j when i < j. This leads to the following algorithm for generating a random graph from the distribution G^{δ≥2}_{ν,µ} | M, W, i.e. conditioned on the output of KS*.

1: procedure Generate-Random(M, W)
2:   for unfilled boxes i do
3:     E_i ← {all edges e that can go into box i}
4:   end for
5:   Γ ← the opened boxes, filled with the edges of M and W
6:   for unfilled boxes i in increasing order do
7:     Select e uniformly at random from E_i
8:     Remove e from E_j for all j > i
9:     Put e into box i of Γ
10:  end for
11:  return Γ
12: end procedure

Each Γ that outputs M and W can be generated by Generate-Random in exactly one way (since the graph is an ordered set of edges), and any graph Γ produced by Generate-Random will produce M and W when we run KS* on Γ. This shows that Generate-Random gives a uniformly random graph from G^{δ≥2}_{ν,µ} | M, W.
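The sampling step above can be sketched in a few lines of Python (our illustration; the sets E_i are taken as given, and the nesting E_i ⊆ E_j for i < j is assumed as in the text — removing a used edge from every later E_j is then just a matter of tracking used edges).

```python
import random

def generate_random(boxes, candidate_sets):
    """Fill the unopened boxes of an ordered edge list, conditioned on
    the KS* output.  `boxes` maps index -> edge for the opened (M ∪ W)
    boxes; `candidate_sets` maps each unfilled index i to the set E_i
    of edges allowed in box i.  Illustrative stand-in only."""
    filled = dict(boxes)
    used = set(boxes.values())
    for i in sorted(candidate_sets):          # increasing box index
        choices = [e for e in candidate_sets[i] if e not in used]
        e = random.choice(choices)            # uniform over remaining E_i
        filled[i] = e
        used.add(e)                           # removes e from all later E_j
    return [filled[i] for i in sorted(filled)]
```

With one opened box and nested candidate sets of sizes 1 and 2, the output is forced and illustrates the uniqueness claim above.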

Final Proof
In Section 3.2 we gave a description of the probability space. However, this is not enough to finish the proof of Theorem 1; we need to dig a little deeper into the analysis of the Karp-Sipser algorithm. We begin by listing some definitions and lemmas from [1] that we will need. The constants ε and γ are given in (1). Let τ_0 be the last time at which the number of vertices removed from the graph is at most n^{.8+10ε}. We refer to vertices removed before τ_0 as early and to those removed afterwards as late. We say that a matching edge e is a good matching edge if both of its endpoints are early and regular, and if the regular witness edges for both of its endpoints have indices in [µ/2].
We will show that whp when we reach Stage 3, T u and T v will both have many late vertices at the front (Lemma 13).There will also be many good matching edges outside the two trees (Lemma 14).
The good matching edges are useful since they allow us to identify a large set of potential edges in the graph. Suppose that e = (x, a) where a is an endpoint of a good matching edge (a, b) and x is a late vertex. Then a is a regular vertex removed before x, and when a is removed from the graph all vertices have degree at least two. So unless (i) x is a pendant or unmatched vertex and (ii) the pendant witness edge for x is incident to the other endpoint b of the matching edge, all of the preservation conditions are satisfied and (x, a) can be part of the graph. We say that such an edge is available. For each a, Lemma 3 implies that there is at most one vertex x such that (i) and (ii) hold. Because a is an endpoint of a good matching edge, the edge e = (x, a) can go into any open box after the regular witness edge for a, which is guaranteed to have index less than µ/2. In summary, if e is available then it can go into any open box with index greater than µ/2. Using this, we will be able to show that if T_u, T_v contain many unexposed late vertices at their fronts and there are many good matching edges, then we can find x, a, b, y such that x, y are unexposed late vertices at the fronts of T_u, T_v respectively, (a, b) is a good matching edge and x, a, b, y is a path, producing the required augmenting path from u to v.

The Batch Graph
Let Γ(t) denote the current graph after t steps of Phase 2. It is shown in [1] that Γ(t) is distributed uniformly at random over the set of all graphs with v_0(t) vertices of degree 0, v_1(t) vertices of degree 1, v(t) vertices of degree at least 2 and m(t) edges; we denote this sequence by v(t) = (v_0(t), v_1(t), v(t), m(t)). Furthermore, the sequence v(t) is a Markov chain, so the analysis of the algorithm is carried out by tracking v(t). Additionally we define z(t) as the solution of z(t)(e^{z(t)} − 1)/(e^{z(t)} − 1 − z(t)) = 2m(t)/v(t); thus z(0) = λ ∼ z of Section 2.3.1. Notice that v_1(0) = 0. Conditional on v(t), the degrees of the vertices of degree at least 2 are distributed as independent copies of a truncated Poisson random variable Z, where P(Z = k) = z(t)^k/(k!(e^{z(t)} − 1 − z(t))) for k ≥ 2. As our input is taken from G^{δ≥2}_{ν,µ}, we start in the state v(0) = (0, 0, ν, µ), i.e. with v_1(0) = 0. For t_1 < t_2 such that v_1(t_1) = v_1(t_2) = 0 and v_1(t) > 0 for t_1 < t < t_2 ≤ τ_0, we look at the set of edges and vertices removed from t_1 to t_2, i.e. the graph Γ(t_1) \ Γ(t_2), and call it a batch. Note that each batch contains the regular matching edge removed at time t_1 and is a connected graph.
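The truncated Poisson distribution above is easy to simulate; the following sketch (ours, with hypothetical helper names) samples Z conditioned on Z ≥ 2 by rejection and computes the conditional mean λ(e^λ − 1)/(e^λ − 1 − λ) used elsewhere in the paper.

```python
import math
import random

def truncated_poisson(lam):
    """Sample Z ~ Poisson(lam) conditioned on Z >= 2, by rejection."""
    while True:
        # inverse-CDF sampling of an ordinary Poisson(lam)
        u, k, p, cdf = random.random(), 0, math.exp(-lam), 0.0
        while True:
            cdf += p
            if u <= cdf:
                break
            k += 1
            p *= lam / k
        if k >= 2:                # reject values 0 and 1
            return k

def mean_truncated(lam):
    """E[Z | Z >= 2] = lam (e^lam - 1) / (e^lam - 1 - lam)."""
    return lam * math.expm1(lam) / (math.expm1(lam) - lam)
```

For example, mean_truncated(1.0) ≈ 2.392, and empirical averages of truncated_poisson samples concentrate around this formula.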
This shows that for t ≤ τ_0 each batch corresponds to a time interval of length at most O(log^3 n). Since the maximum degree in Γ is o(log n) whp, the total number of pendant vertices (at the time of removal) in a batch is O(log^4 n), which implies that the number of vertices in a batch is also O(log^4 n).
This also shows that during the first τ_0 time steps there will be at least Ω(n^{.8+10ε}/log^4 n) times when v_1(t) = 0, and thus at least that many regular edges are added to the matching.
Lemma 12 The probability ρ that there exists a vertex v ∈ G that is within distance γ log_c n of 100 batches is at most n^{−5}.

Proof of Lemma 12:
Letting dist denote distance in Γ, we bound this probability by a union bound. Explanation: here n^{(.8+10ε)k} is the number of choices for the start times of the batches B_1, B_2, . . ., B_k.
Proof of (12): Suppose that B_i is constructed at time t_i. It is a subgraph of Γ(t_i) and depends only on this graph. Using Lemma 11, we argue that (13) holds, where N_k(w) is the set of vertices within distance k of w in G_{n,p}.

Explanation:
The O(n^{−4}) term is the probability that the batch B_i is large. The term in (13) arises as follows. We can assume that |N_{γ log_c n}(w)| ≤ n^{γ+ε}, see (14) below. Suppose, as in [1], that we expose the graph Γ at the same time as we run the Karp-Sipser algorithm. It is convenient to work within the configuration model of Bollobás [3]. (Note that in this case the multigraph produced by the configuration model has an Ω(1) probability of being simple, see Section 2.3.1.) Assume that we have exposed N_{γ log_c n}(w). At the start of the construction of a batch we choose a random edge of the current graph; the probability that this edge lies in N_{γ log_c n}(w) is at most n^{γ+ε}/(n − o(n)). In the middle of the construction of a batch, one endpoint of a pendant edge is known and the other endpoint is chosen randomly from the set of configuration points associated with Γ(t). The probability that this new endpoint lies in N_{γ log_c n}(w) is also ≤ n^{γ+ε}/(n − o(n)), and there are only O(log^4 n) steps in the creation of a batch. This explains the term (O(log^4 n)/(n − o(n))) · n^{γ+ε}. It only remains to verify (14). To prove this we can resort to proving the same inequality for G_{n,c/n}, which is an easy application of Chernoff bounds. We do a BFS from w until the first time that the breadth-first tree has ≥ log n leaves. Given a maximum degree of o(log n), there will at this time be o(log^2 n) leaves.
If there are never ≥ log n leaves then |N_{γ log_c n}(w)| = o(log^3 n), so assume there are. If there are ℓ_s leaves at depth s then the number of leaves at depth s + 1 in the BFS tree is stochastically bounded by Bin(ℓ_s n, p) where p = c/n, and so with probability 1 − O(n^{−10}), say, the tree grows multiplicatively at a rate of less than 2c per round. In that case |N_{γ log_c n}(w)| ≤ n^{γ+ε} for c large, which proves (14).

Early pendant vertices can be a nuisance at the front of a tree. It follows from Lemma 7, Rule 2(ii), that if v is early then there can be relatively few choices of u such that the edge (u, v) can be found in an unopened box. The next lemma bounds the number of early vertices at the front.
Lemma 13 Let T be an augmenting tree of size |T| = Ω(n^{.5}). Then, whp, there are at most |F|/n^{2ε} early vertices at the front F of the tree.
Proof of Lemma 13: Suppose first that T is in Stage 1 and that there are s = |T|/n^{3ε} early vertices at the front of the augmenting tree T. Let F′ be the set of nodes of the tree at distance γ log_c n from the front, and consider the subtree T′ formed by F′ and the paths from nodes in F′ to the root. By Lemma 2, if T′ underwent (γ/2) log_c n spurts, its front would increase by a factor of at least (9c/10)^{(γ/2) log_c n} for c large enough. This shows that |F′| ≤ |T|/n^{γ/3}. Since there are at least |T|/n^{3ε} early vertices on the front, there exists a node v in F′ that is an ancestor of at least n^{γ/3−3ε} early vertices. Each batch contains at most log^4 n vertices, so there are at least n^{γ/3−3ε−o(1)} early vertices from distinct batches. By the definition of F′ the batches are at distance at most γ log_c n from v, and so by Lemma 12 the probability of this event is bounded above by n^{−5}.

Now assume that T is in Stage 2 and that r levels have been added since the start of Stage 2. This is the interesting case, but the argument for Stage 1 will come in handy. The problem is that (15) need not be true, as we do not grow from unexposed vertices. Let T* ⊇ T denote the tree we would grow if we had also grown from unexposed vertices in Stage 2. Let f_0, f_1, . . ., f_r be the front sizes in T from the start of Stage 2 and let f*_0, f*_1, . . ., f*_r denote the sizes of the corresponding fronts F*_i in T*. Here f_0 = f*_0 is the size of the front at the end of Stage 1. The argument already given shows that there are at most f*_i/n^{3ε} early vertices at the front of the i-th such level of T*. Now, under the worst-case assumption that the early vertices of F*_i are all unexposed, we see that there are at most n^{−3ε} Σ_{i=1}^{r} f*_i early vertices in the front F of T. The lemma follows from this and (16).

Good Matching edges
Lemma 14 There are, whp, Ω(n^{.8+10ε}/log^4 n) good matching edges in Γ.
Proof of Lemma 14: As already observed, Lemma 11 shows that there are Ω(n^{.8+10ε}/log^4 n) times t ≤ τ_0 at which v_1(t) = 0. Consider exposing the ordering of the edges of the graph as we remove edges, so that at time t all edges in Γ \ Γ(t) have been revealed. When v_1(t) = 0 an edge is picked as a matching edge, and it must lie in the available box of lowest index. Then for both endpoints we reveal the indices of the edges incident to them; the edge of lowest index at each endpoint becomes its witness edge. Note that at this point in time the contents of at most O(n^{.8+10ε}) boxes have been revealed. Since each endpoint has at least one edge incident to it, the index of its witness edge is in [µ/2] with probability at least 1/2 − o(1). This shows that the regular edge created at this time is a good matching edge with probability at least 1/4 − o(1), independently of the previous history. Thus the actual number of good matching edges stochastically dominates a binomial with expectation Ω(n^{.8+10ε}/log^4 n).

Putting it all together
We show that whp in each execution of AugmentPath the algorithm finds an augmenting path, and does so by exposing at most O(n^{.6−ε}) new vertices.

Running Time
The size of the front of an augmenting tree in the i-th execution of AugmentPath will be at most O(i · n^{.6−ε}), since we claim that whp we only explore O(n^{.6−ε}) unexposed vertices in each execution of AugmentPath. Thus all fronts easily fit into an array of size n. This also implies that the amount of work done in the i-th execution of AugmentPath is O(((i − 1)n^{.6−ε} + n^{.6−ε} log n) log n), since in the worst case we could visit all previously exposed vertices, and the maximum degree is o(log n) whp. The final log n accounts for the overhead of the more sophisticated data structures discussed in Section 2.1. So the total work is Õ(l^2 n^{.6−ε}) = Õ(n^{1−ε}) = o(n), where l is the total number of executions of AugmentPath, and l = Õ(n^{.2}) since the size of I_2 is Õ(n^{.2}) whp.

Proof of Theorem 1
Consider the i-th execution of AugmentPath and assume we have exposed only O(i · n^{.6−ε}) = o(n^{.8}) vertices. By Lemma 5 we know that whp the algorithm will be able to grow the trees by a factor ≥ 3c/4 in each execution of AugmentPath. If we find an augmenting path before both trees are in Stage 2 then at most O(n^{.6−ε}) new vertices have been exposed. We will now show that during Stage 3 the algorithm finds an augmenting path while exposing at most O(n^{.6−ε}) new vertices.
We know from Lemma 5 that T_u, T_v will enter Stage 3. Let F_u and F_v be the unexposed vertices on the fronts of T_u and T_v respectively, and let S_u and S_v be two sets of size n^{.6−2ε} selected randomly from the fronts. Now F_u and F_v represent at least an n^{−ε} fraction of the front, and whp at most an n^{−2ε} fraction of the front consists of early vertices. So in the worst case at most an n^{−ε} fraction of F_u and F_v are early vertices. Since S_u and S_v are selected at random from F_u and F_v, at least half of S_u and S_v are late vertices with probability at least 1 − o(n^{−1}).
We now show that AugmentPath can find an augmenting path by a simple search. Let A be the set of good matching edges whose endpoints are unexposed. Lemma 14 implies that initially there are at least n^{.8+9ε} good matching edges, and since at most o(n^{.8}) vertices have been exposed, we know that |A| = Ω(n^{.8+9ε}).
If there exist x ∈ S_u, y ∈ S_v and (a, b) ∈ A such that (x, a), (b, y) ∈ E, then (x, a), (a, b), (b, y) forms an augmenting path together with the paths from x and y to their roots. AugmentPath does a simple search for such a quadruple x, a, b, y.
Let A_u be the set of edges in A whose vertices are adjacent to S_u. It follows from Lemma 7 that an available edge (x, a), with x ∈ S_u and a an endpoint of an edge of A, can go into any one of the unopened boxes after µ/2. At most 2n + o(n) edges have been revealed. Revealing the contents of the boxes one by one, we have a lower bound of 1/\binom{n}{2} for the probability of seeing any fixed edge e, see (11); this holds provided we have not already seen e in an earlier box, regardless of previously opened boxes. Suppose now that we open the boxes µ/2 to 3µ/4, stopping when we have either exhausted all such boxes or found n^{.4+7ε} members of A_u. When we open a box, there is a probability of at least (n^{.8+9ε} − n^{.4+7ε})n^{.6−2ε}/\binom{n}{2} ≥ n^{.9ε−.6} of finding a new member of A_u. Thus |A_u| stochastically dominates min{n^{.4+7ε}, Bin(µ/4 − 2n − o(n), n^{.9ε−.6})}, and so whp n^{.4+7ε}/2 ≤ |A_u| ≤ n^{.4+7ε}. For each edge a ∈ A_u let ξ_a be a vertex of a that is adjacent to S_u and let η_a be the other endpoint.
We now consider the edges in the unopened boxes 3µ/4 to µ. For any unopened box, the probability that it contains an edge from {η_a : a ∈ A_u} to S_v is Ω((n^{.6−2ε} · n^{.4+7ε}/2)/\binom{n}{2}) = Ω(n^{−1+5ε}). Thus the number of such edges dominates Bin(µ/4 − 2n − o(n), n^{−1+5ε}), which is non-zero with probability 1 − o(n^{−1}). Because the maximum degree of a vertex is o(log n), the search for x, a, b, y can be done in O(n^{.6−2ε} log^2 n) = O(n^{.6−ε}) time. This completes the proof of Theorem 1.
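The quadruple search in the proof above can be sketched as follows (our illustration, not the paper's implementation; `adj` is a hypothetical adjacency map of the relevant exposed edges).

```python
def find_augmenting_quadruple(S_u, S_v, A, adj):
    """Search for x in S_u, y in S_v and a good matching edge (a, b) in A
    with (x, a) and (b, y) present in the graph; `adj` maps a vertex to
    its set of neighbours.  Returns (x, a, b, y) or None."""
    S_u, S_v = set(S_u), set(S_v)
    for a, b in A:
        for p, q in ((a, b), (b, a)):      # try both orientations of (a, b)
            xs = adj.get(p, set()) & S_u   # neighbours of p inside S_u
            ys = adj.get(q, set()) & S_v   # neighbours of q inside S_v
            if xs and ys:
                return (min(xs), p, q, min(ys))
    return None
```

With degrees o(log n) and |S_u|, |S_v| = n^{.6−2ε}, scanning the candidate edges in this way matches the O(n^{.6−2ε} log^2 n) bound quoted above.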

Conclusion
We have shown that a maximum matching can be found in O(n) expected time if the average degree is a sufficiently large constant. It is easy to extend this to the case where the average degree grows with n. It is much more challenging to extend the result to an arbitrary constant c. Karp and Sipser showed that if c < e then whp Phase 1 leaves o(n) vertices for Phase 2, and in [1] it was shown that for c < e only a few vertex-disjoint cycles are left, whp. So the problematic range is e ≤ c < c_0.

Figure 1: The trees T_u and T_v are shown with bold edges. The edges e_i correspond to case i in the algorithm, for i = 1, . . ., 6.

Figure 2: The trees T_u and T_v after using the hit edge e_4.
Figure 3:

Lemma 2 The following holds with probability 1 − O(1/n^2). For all matchings M of Γ and all augmenting trees T with c^{−1}α_1 log n ≤ |T| ≤ n^{.99}, T can be grown to a new tree T′ for which Φ(T′) ∈ [(9c/10)|T|, (11c/10)|T|].

Lemma 3 With probability 1 − Õ(1/n^{1−α_2}), Γ does not contain two cycles of lengths a and b at distance d apart, for any a, b, d such that a + b + d ≤ α_2 log_c n.

Lemma 4 With probability 1 − O(1/n^2), Γ does not contain a set S ⊆ V(Γ) with log log n ≤ |S| ≤ n^{.99} that has more than (1 + ε)|S| edges.

(a) Both T_u and T_v will reach Stage 3.
(b) The number of growth spurts undergone by either tree in Stage 2 is at most log_{3c/4} n.
Case 2b: If one tree was in Stage 1 and the other in Stage 2 or 3, then since we grew T′_v, a subtree of T_u must have been in Stage 2 or 3 and T′_v in Stage 1. This implies that |T′_v| ≤ n^{.6−ε} ≤ |T_u|. Case 2c: If one tree is in Stage 3 and the other in Stage 2, then T′_v must be in Stage 2 and T_u in Stage 3. T_v must also be in Stage 3, since we are trying to grow T_u. This contingency, T_u and T_v both in Stage 3, is not dealt with in the lemma; it is dealt with in Section 4.3. Now consider the set S = T_u ∪ A ∪ T_A; it has at least |S| + |A| − 2 edges.

We have Z_1 + Z_2 + · · · + Z_ν = 2µ. The resulting Z_1, Z_2, . . ., Z_ν have the same distribution as the degrees of Γ_1; this follows from Lemma 4 of [1]. If we choose λ so that E(Po(λ; ≥ 2)) = 2µ/ν, i.e. λ(e^λ − 1)/(e^λ − 1 − λ) = 2µ/ν, then the conditional probability P(Z_1 + Z_2 + · · · + Z_ν = 2µ) = Ω(1/√ν), and so we have to pay a factor of O(√ν) for removing the conditioning, i.e. to use the simple inequality P(A | B) ≤ P(A)/P(B).

Here the summand is of order (k/n)^{εk}, and since k = O(n^{.99}) the summand can be upper bounded by 2^{−k} for k ≥ √n and by n^{−εk/200} for k ≤ √n. The union bound then gives an upper bound of o(n^{−3}).

(a) If e ∉ M ∪ W then M′ = M and W′ = W, and e satisfies preservation conditions 1, 2 and 3 below, where the type of a vertex is determined by the execution of the Karp-Sipser algorithm on G.
(b) If e satisfies preservation conditions 1, 2 and 3 below, where the type of a vertex is determined by the execution of the Karp-Sipser algorithm on G′, then M = M′ and W = W′.
(c) In both cases (a) and (b) above, the type of u or v is the same with respect to the execution of the Karp-Sipser algorithm on G or G′.
