Optimizing Higher-Order Pattern Unification

Abstract. We present an abstract view of existential variables in a dependently typed lambda-calculus based on modal type theory. This allows us to justify optimizations to pattern unification such as linearization, which eliminates many unnecessary occurs-checks. The presented modal framework explains a number of features of the current implementation of higher-order unification in Twelf and provides insight into several optimizations. Experimental results demonstrate significant performance improvement in many example applications of Twelf, including those in the area of proof-carrying code.


Introduction
Unification lies at the heart of automated reasoning systems, logic programming, and rewrite systems. Its performance therefore crucially affects the overall efficiency of each of these applications. This need for efficient unification algorithms has led to many investigations in the first-order setting. However, the efficient implementation of higher-order unification, especially for the dependently typed λ-calculus, is still a central open problem limiting the potential impact of higher-order reasoning systems such as Twelf [15], Isabelle [12], or λProlog [9].
The most comprehensive study so far on efficient and robust implementation techniques for higher-order unification has been carried out by Nadathur and colleagues for the simply-typed λ-calculus in the programming language λProlog [7,8]. The Teyjus compiler [10] embodies many of the insights gained, in particular an adequate representation of lambda terms and mechanisms to delay and compose substitutions. Higher-order unification is implemented via Huet's algorithm [5], and special mechanisms are molded into the WAM instruction set to support branching and carrying unification problems. To perform an occurs-check only when necessary, the compiler distinguishes between the first occurrence and subsequent occurrences of a variable and compiles them into different WAM instructions. While the occurs-check may be omitted for the first occurrence of a variable, full unification is used for all subsequent occurrences. This approach seems to work well in the simply-typed setting; however, it is not clear how to generalize it to dependent types.
In this paper, we discuss the efficient implementation of higher-order pattern unification for the dependently typed lambda-calculus. Unlike Huet's general higher-order unification algorithm, which involves branching and backtracking, higher-order pattern unification [6,13] is deterministic and decidable. An important step toward the efficient implementation of higher-order pattern unification was the development based on explicit substitutions and de Bruijn indices [3] for the simply-typed lambda-calculus. This allows a clear distinction between bound and existential variables and reduces the problem to essentially first-order unification. Although the use of de Bruijn indices leads to a simple formal system, readability may be obstructed and critical principles are obfuscated by the technical notation. In addition, some techniques like pre-cooking of terms and optimizations such as lowering and grafting remain ad hoc. This makes it more difficult to transfer these optimizations to other calculi.
We present an abstract view of existential variables in the dependently typed lambda-calculus based on modal type theory. Our calculus does not require de Bruijn indices, nor does it require closures M[σ] as first-class terms. This leads to a simple, clean framework which allows us to explain a number of features of the current implementation of higher-order unification in Twelf [15] and provides insight into several optimizations. In this paper, we particularly focus on one optimization called linearization, which eliminates many unnecessary occurs-checks. We have implemented this optimization of higher-order unification as part of the Twelf system. Experimental results demonstrate significant performance improvements in many example applications, including those in the area of proof-carrying code.
The paper is organized as follows: First we give some background on modal logic and modal type theory, and discuss its relation to the dependently typed lambda calculus (Section 2). In Section 3 we discuss higher-order pattern unification; in particular, we focus on the optimization called linearization, which eliminates unnecessary occurs-checks. In Section 4 we discuss experimental results. Related work is discussed in Section 5.

Motivation
We start by presenting a foundation for dependently typed existential variables based on modal logic. Following the methodology of Pfenning and Davies [14], we can assign constructive explanations to modal operators. A key characteristic of this view is the distinction between propositions that are true and propositions that are valid. A proposition is valid if its truth does not depend on the truth of any other propositions. This leads to the basic hypothetical judgment

  (A1 valid, . . ., An valid) ⊢ C true

Under the multiple-world interpretation of modal logic, C valid corresponds to C true in all reachable worlds. This means C is true without any assumptions, except those that are assumed to be true in all worlds. We can generalize this idea to also capture truth relative to a set of specified assumptions by writing C valid Ψ, where Ψ abbreviates C1 true, . . ., Cn true. In terms of the multiple-world semantics, this means that C is true in any world where C1 through Cn are all true, and we say C is valid relative to the assumptions in Ψ. Hypotheses about relative validity are more complex now, so our general judgment form is

  (A1 valid Ψ1, . . ., An valid Ψn); Γ ⊢ C true

While it is interesting to investigate this modal logic in its own right, it does not come alive until we introduce proof terms. In this paper, we investigate the use of a modal proof term calculus as a foundation for existential variables. We view existential variables u as modal variables of type A in a context Ψ, while bound variables are treated as ordinary variables. This allows us to distinguish between existential variables u::(Ψ ⊢ A) for relative validity assumptions A valid Ψ, declared in a modal context, and x:A for ordinary truth assumptions A true, declared in an (ordinary) context. If we have an assumption A valid Ψ, we can only conclude A true if we can verify all assumptions in Ψ.
In other words, if we know A true in Ψ, and all elements of Ψ can be verified from the assumptions in Γ, then we can conclude A true in Γ. As we will see in the next section, this transition from one context Ψ to another context Γ can be achieved via a substitution from Ψ to Γ.

Dependently typed lambda calculus based on modal logic
In this section, we introduce a dependently typed lambda calculus. Existential variables u are treated as modal variables, and x denotes ordinary variables. c and a are constants, which are declared in a signature. This is a conservative extension of LF [4], so we suppress some routine details such as signatures.
Note that the substitution σ is part of the syntax of existential variables. This eliminates the need for pre-cooking [3], which raises existential variables to the correct context.
The principal judgments are listed below. As usual, we omit similar judgments on types and kinds and all judgments concerning definitional equality.

  ∆; Γ ⊢ M : A    M is a valid object of type A
  ∆; Γ ⊢ σ : Ψ    σ is a valid substitution for the context Ψ
Note that substitutions σ are defined only on ordinary variables x and not on modal variables u. We write id Γ for the identity substitution (x1/x1, . . ., xn/xn) for a context Γ = (•, x1:A1, . . ., xn:An). We use π for a substitution which may permute the variables, i.e., π = (xΦ(1)/x1, . . ., xΦ(n)/xn) where Φ is a total permutation defined on the elements of a context Γ = (•, x1:A1, . . ., xn:An). We only consider well-typed substitutions, so π must respect possible dependencies in its domain. We also streamline the calculus slightly by always substituting simultaneously for all ordinary variables. This is not essential, but it saves some tedium in relating simultaneous and iterated substitution. Moreover, it is closer to the actual implementation, where we use de Bruijn indices and postpone explicit substitutions. The typing rules are given in Figure 1. Note that the rule for modal variables is the rule (*) presented in the previous section, annotated with proof terms and slightly generalized because of the dependent type theory we are working in. This rule also justifies our implementation choice of using existential variables only in the form u[σ].
Our convention is that substitutions as defined operations on expressions are written in prefix notation [σ]P, for P an object, family, kind, or substitution. These operations are capture-avoiding as usual. Moreover, we always assume that all free variables in P are declared in σ. Substitutions that are part of the syntax are written in postfix notation, u[σ]. Note that such explicit substitutions occur only at variables u labeling relative validity assumptions.
Substitutions are defined in a standard manner; we omit the corresponding definitions at the level of types and kinds for the sake of brevity.

  [σ]x = M                       where M/x is in σ
  [σ](λy:A. M) = λy:[σ]A. [σ]M   provided y is not declared or free in σ
  [σ](M N) = ([σ]M) ([σ]N)
  [σ](u[τ]) = u[[σ]τ]

The side conditions can always be verified by (tacitly) renaming bound variables. We do not need an operation for applying a substitution σ to a context. The last equation makes it clear that [σ]τ corresponds to composition of substitutions, which is sometimes written as τ ∘ σ.
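To make the definition above concrete, here is a minimal Python sketch on an invented tuple-based term representation (our own illustration, not Twelf's internal data structures). We assume bound variables are already renamed apart from the domain and free variables of the substitution, so no capture can occur:

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', m, n) | ('evar', u, sigma)
# A substitution sigma is a dict mapping ordinary variables to terms; the
# explicit substitution carried by an existential variable u is such a dict.

def subst(sigma, term):
    """Apply [sigma] to a term; at u[tau] we compose: [sigma](u[tau]) = u[[sigma]tau]."""
    tag = term[0]
    if tag == 'var':
        return sigma.get(term[1], term)
    if tag == 'lam':
        # side condition assumed: the bound variable is not declared/free in sigma
        return ('lam', term[1], subst(sigma, term[2]))
    if tag == 'app':
        return ('app', subst(sigma, term[1]), subst(sigma, term[2]))
    if tag == 'evar':
        return ('evar', term[1], compose(sigma, term[2]))  # substitution stops at u
    raise ValueError(tag)

def compose(sigma, tau):
    """[sigma]tau, i.e. tau followed by sigma (sometimes written tau . sigma)."""
    return {x: subst(sigma, m) for x, m in tau.items()}

def identity(context):
    """id_Gamma: the identity substitution (x1/x1, ..., xn/xn)."""
    return {x: ('var', x) for x in context}
```

Note how the substitution is suspended at an existential variable rather than pushed into an unknown term, mirroring the syntactic closure u[σ].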
The following substitution principles hold for substitutions σ. They are suggested by the modal interpretation and are proved by simple structural inductions. We elide the corresponding principles for families and kinds.

Theorem 1. If ∆; Γ ⊢ σ : Ψ and ∆; Ψ ⊢ M : A, then ∆; Γ ⊢ [σ]M : [σ]A.
A new and interesting operation arises from the substitution principles for relative validity. The new operation of substitution is compositional, but two interesting situations arise: when a variable u is encountered, and when we substitute into a λ-abstraction. For the sake of brevity, we only give the substitution on objects:

  [[M/u]]x = x
  [[M/u]](λx:A. N) = λx:[[M/u]]A. [[M/u]]N
  [[M/u]](N1 N2) = ([[M/u]]N1) ([[M/u]]N2)
  [[M/u]](v[σ]) = v[[[M/u]]σ]    provided v is different from u

We remark that the rule for substitution into a λ-abstraction does not require a side condition. This is because the object M is defined in a different context, which is accounted for by the explicit substitution stored at occurrences of u. This ultimately justifies implementing substitution for existential variables by mutation.
Finally, consider the case of substituting into a closure, which is the critical case of this definition:

  [[M/u]](u[σ]) = [σ′]M    where σ′ = [[M/u]]σ
This is clearly well-founded, because σ is a subexpression (so [[M/u]]σ terminates) and the application of an ordinary substitution has been defined previously without reference to the new form of substitution.
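The two substitution operations and the critical closure case can be sketched together in Python; this is a self-contained illustration on an invented tuple representation (names and constructors are ours), with types omitted:

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a) | ('evar', u, sigma).

def subst(sigma, term):
    """Ordinary simultaneous substitution (capture assumed impossible)."""
    tag = term[0]
    if tag == 'var':
        return sigma.get(term[1], term)
    if tag == 'lam':
        return ('lam', term[1], subst(sigma, term[2]))
    if tag == 'app':
        return ('app', subst(sigma, term[1]), subst(sigma, term[2]))
    if tag == 'evar':  # suspend at the evar: compose into its substitution
        return ('evar', term[1],
                {x: subst(sigma, n) for x, n in term[2].items()})
    raise ValueError(tag)

def msubst(m, u, term):
    """[[m/u]]term: substitute m (defined in u's own context Psi) for u."""
    tag = term[0]
    if tag == 'var':
        return term
    if tag == 'lam':
        # no side condition: m lives in a different context, and every
        # occurrence of u carries its own explicit substitution
        return ('lam', term[1], msubst(m, u, term[2]))
    if tag == 'app':
        return ('app', msubst(m, u, term[1]), msubst(m, u, term[2]))
    if tag == 'evar':
        v, sigma = term[1], term[2]
        sigma2 = {x: msubst(m, u, n) for x, n in sigma.items()}
        if v == u:
            return subst(sigma2, m)  # critical case: [[m/u]](u[sigma]) = [sigma']m
        return ('evar', v, sigma2)
    raise ValueError(tag)
```

The test below exercises the point made above: substituting under a λ-abstraction needs no renaming, because the explicit substitution at u mediates between the contexts.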
Using the given definitions, we can then show that the new substitution operation for relative validity satisfies the substitution principles. Again, this is motivated by the logical interpretation and follows by simple inductions after a straightforward generalization to encompass all syntactic categories.

Theorem 2. If ∆; Ψ ⊢ M : A and ∆, u::(Ψ ⊢ A), ∆′; Γ ⊢ N : B, then ∆, [[M/u]]∆′; [[M/u]]Γ ⊢ [[M/u]]N : [[M/u]]B.

Normal Forms
There are two notions of normal form that are useful in the implementation. The first corresponds to β-normal form. We simultaneously define normal objects (U), neutral objects (R), and normal substitutions η.

Normal Objects
We obtain the canonical objects (long βη-normal forms) by requiring normal objects of the form R to be of base type (that is, not of function type). In the implementation we use a stronger normal form where existential variables (represented here by modal variables) must also be of atomic type. This is accomplished by a technique called lowering. Lowering replaces a variable u::(Ψ ⊢ Πx:A1. A2) by a new variable u′::(Ψ, x:A1 ⊢ A2). This process is repeated until all existential variables have a type of the form Ψ ⊢ b N1 . . . Nk. This operation was proved correct for the simply-typed case by Dowek et al. [3], but remains somewhat mysterious there. Here, it is justified by the modal substitution principle.

(Lowering) A modal variable u::(Ψ ⊢ Πx:A1. A2) may be replaced by a fresh variable u′::(Ψ, x:A1 ⊢ A2), substituting λx:A1. u′[id Ψ, x/x] for u; for part (2) we use instead that ∆, u::(Ψ ⊢ Πx:A1. A2) lets us recover u′ as an instance of u. Since we can lower all modal variables, we can change the syntax of normal forms so that terms u[η] are also normal objects of base type, rather than neutral objects. This is, in fact, what we chose in the implementation.
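Operationally, lowering is a simple loop on the type of an existential variable. The following Python sketch (with an invented representation of types, not the Twelf one) moves Π-bindings from the type into the context until the type is atomic:

```python
# Types: ('base', b, spine) for an atomic type b N1...Nk,
#        ('pi', x, a1, a2) for Pi x:a1. a2.
# A modal variable u::(psi |- a) is represented by the pair (psi, a),
# where psi is a list of (variable, type) declarations.

def lower(psi, a):
    """Repeatedly replace u::(psi |- Pi x:a1. a2) by u'::(psi, x:a1 |- a2)
    until the type is atomic."""
    psi = list(psi)                # do not mutate the caller's context
    while a[0] == 'pi':
        _, x, a1, a2 = a
        psi.append((x, a1))        # the Pi binding joins the local context
        a = a2
    return psi, a
```

Since this depends only on the type, it can be done once at variable-creation time, which is why the implementation can assume all existential variables are of atomic type.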

Existential Variables
As mentioned several times above, in the implementation the modal variables in ∆ represent existential variables (also known as meta-variables), while the variables in Γ are universal variables (also known as parameters).
Existential variables are created in an ambient context Ψ and then lowered. We do not explicitly maintain a context ∆ of these existential variables, but it is important that a proper order for them exists. Existential variables are created with a mutable reference, which is updated with an assignment when we need to carry out a substitution [[M/u]].
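The representation by mutable references can be sketched as follows; this is a hypothetical illustration in Python (class and function names are ours, not those of the Twelf sources):

```python
# Existential variables as mutable cells: the modal substitution [[M/u]] is
# carried out by a single destructive assignment, which is sound because
# substitution under binders needs no side condition (previous section).

class EVar:
    __slots__ = ('name', 'inst')
    def __init__(self, name):
        self.name = name
        self.inst = None           # None while uninstantiated

def instantiate(u, m):
    """Perform [[m/u]] globally by updating u's reference."""
    assert u.inst is None, "an existential variable is instantiated at most once"
    u.inst = m

def deref(t):
    """Follow the chain of instantiations to expose the current head."""
    while isinstance(t, EVar) and t.inst is not None:
        t = t.inst
    return t
```

Every occurrence of u shares the same cell, so one assignment updates all occurrences at once; terms are inspected through `deref`.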
In certain operations, and particularly after type reconstruction, we need to abstract over the existential variables in a term. Since the LF type theory provides no means to quantify over u::(Ψ ⊢ A), we raise such variables until they have the form u′::(• ⊢ A′). In the context of type reconstruction we can then quantify over them as ordinary variables x′:A′. However, this is not satisfactory, as it requires first raising the types of existential variables for abstraction, and later lowering them again during unification to undo the effect of raising. To treat existential variables efficiently, we would like to quantify over modal variables u directly.
The judgmental reconstruction in terms of modal logic suggests two ways to incorporate modal variables. One way is via a new quantifier Π□u::(Ψ ⊢ A1). A2; the other is via a general modal operator □Ψ. Proof-theoretically, the former is slightly simpler, so we pursue it here. The new quantifier thus has the form Π□u::(Ψ ⊢ A1). A2 and is defined by the following rules. The main complication of this extension is that variables u can now be bound, and substitution must be capture-avoiding. In the present implementation, this is handled by de Bruijn indices.
Toward efficient higher-order pattern unification

Preliminaries
In the following, we consider the pattern fragment of the modal lambda-calculus. Higher-order patterns are terms where existential variables must be applied to distinct bound variables. This fragment was first identified by Miller [6] for the simply-typed lambda-calculus, and later extended by Pfenning [13] to the dependently typed and polymorphic case. We require that all terms are in normal form, and that the types of existential variables have been lowered and are atomic. We call a normal term U an atomic pattern if all subterms of the form u[σ] are such that σ = (y1/x1, . . ., yk/xk) where y1, . . ., yk are distinct bound variables. This is already implicitly assumed for x1, . . ., xk, because all variables defined by a substitution must be distinct.
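The pattern condition for a single occurrence u[σ] is easy to check. A minimal Python sketch, assuming the substitution is a dict from (distinct) domain variables to terms and that we know which variables are bound at the occurrence:

```python
# An occurrence u[sigma] satisfies the pattern condition when sigma maps its
# domain variables to *distinct bound* variables.

def is_atomic_pattern_occurrence(sigma, bound_vars):
    """sigma: dict mapping domain variables to terms ('var', y).
    bound_vars: set of bound variables in scope at this occurrence."""
    image = list(sigma.values())
    if not all(t[0] == 'var' and t[1] in bound_vars for t in image):
        return False                           # non-variable or free argument
    names = [t[1] for t in image]
    return len(names) == len(set(names))       # the y_i must be distinct
```

A term is an atomic pattern when this check succeeds at every subterm u[σ].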
Higher-order pattern unification can be done in two phases (see [13,3] for a more detailed account). During the first phase, we decompose the terms until one of the two terms we unify is an existential variable u[σ]. This decomposition phase is straightforward and closely resembles first-order unification. During the second phase, we need to find an actual instantiation for the existential variable u. There are two main cases to distinguish: (1) unifying two existential variables, u[σ] ≐ v[σ′], and (2) unifying an existential variable with another kind of term, u[σ] ≐ M. The latter case is transformed into u ≐ [σ]⁻¹M, assuming u does not occur in M and all variables v[τ] in M are pruned so that the free variables in τ all occur in the image of σ (see [6,3] for details). Note that we view [σ]⁻¹M as a new meta-level operation, like substitution, because it may be defined even if σ is not invertible in full generality.
The main efficiency problem in pattern unification lies in treating this last case, because we must traverse the term M. First, we must perform the occurs-check to prevent cyclic terms. Second, we may need to prune the substitutions associated with existential variables occurring in M. Third, we need to ensure that all bound variables occurring in M occur in the range of σ, since otherwise [σ]⁻¹M does not exist. To illustrate the problem, consider inverting a substitution σ = (y1/x1, . . ., yk/xk): applying [σ]⁻¹ replaces each yi in M by the corresponding xi and fails on any other variable. In the next section, we show how linearization can be used to enforce two criteria that eliminate the need to traverse M. First, we enforce that every existential variable occurs only once, thereby eliminating the occurs-check. Second, we require that the substitution σ associated with an existential variable is always a permutation π. This ensures that the substitutions are always invertible and eliminates the need for pruning.
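The three checks above can be seen in the following Python sketch of [σ]⁻¹M; it is an illustration on an invented term representation (pruning of other existential variables' substitutions is only marked, not carried out):

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', m, n) | ('evar', u, sigma).

def invert(sigma):
    """Invert sigma = (y1/x1, ..., yk/xk): defined exactly when the yi are
    distinct variables; the result maps each yi back to xi."""
    inv = {}
    for x, m in sigma.items():
        if m[0] != 'var' or m[1] in inv:
            return None                    # not invertible
        inv[m[1]] = ('var', x)
    return inv

def invert_apply(sigma, u, term):
    """[sigma]^-1 term, with occurs-check on u. Returns None if u occurs,
    sigma is not invertible, or a free variable of term is outside the
    image of sigma. (A real implementation would prune evars instead of
    failing at them.)"""
    inv = invert(sigma)
    if inv is None:
        return None
    def go(t, local):                      # local: variables bound inside term
        tag = t[0]
        if tag == 'var':
            if t[1] in local:
                return t                   # locally bound: unchanged
            return inv.get(t[1])           # None if not in the image of sigma
        if tag == 'lam':
            body = go(t[2], local | {t[1]})
            return None if body is None else ('lam', t[1], body)
        if tag == 'app':
            m, n = go(t[1], local), go(t[2], local)
            return None if m is None or n is None else ('app', m, n)
        if tag == 'evar':
            if t[1] == u:
                return None                # occurs-check failure
            tau = {}
            for x, s in t[2].items():
                s2 = go(s, local)
                if s2 is None:
                    return None            # here a real implementation prunes
                tau[x] = s2
            return ('evar', t[1], tau)
        raise ValueError(tag)
    return go(term, set())
```

Note that the whole of `term` is traversed; this is precisely the cost that linearization removes.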

Linearization
One critical optimization in unification is to perform the occurs-check only when necessary. While unification with the occurs-check is at best linear in the sum of the sizes of the terms being unified, unification without the occurs-check is linear in the size of the smaller term. In fact, the occurs-check can be omitted if the terms are linear, i.e., every existential variable occurs only once.
Let us consider the following clause from a program which evaluates expressions of a small functional language, Mini-ML. It says that functions evaluate to themselves. Using the modal term language introduced above, this can be expressed as follows:

  exp    : type.
  lam    : (exp → exp) → exp.
  eval   : exp → exp → type.
  ev_lam : Π□e::(y:exp ⊢ exp).
             eval (lam (λx:exp. e[x/y])) (lam (λx:exp. e[x/y])).
The existential variable e in the clause ev_lam is quantified by Π□e::(y:exp ⊢ exp).
To enforce that every existential variable occurs only once, the clause head of ev_lam can be translated into the linear head

  eval (lam (λx:exp. e[x/y])) (lam (λx:exp. e′[x/y]))

together with a variable definition equating e′ and e, where e′ is a new existential variable. Then a constant-time assignment algorithm can be used for assigning a linear clause head to a goal, and the variable definitions are solved by conventional unification. As a result, the occurs-check is only performed when necessary.
In the dependently typed lambda-calculus, there are several difficulties in performing this optimization. First of all, existential variables carry their context Ψ and type A. If we introduce a new existential variable, the question arises what type should be assigned to it. As type inference is undecidable in the dependently typed case, this may be expensive. In general, we may even obtain a term which is not necessarily well-typed.
Let us modify the previous example and annotate the expressions with their types, thus enforcing that any evaluation of a Mini-ML expression is well-typed. Linearizing the resulting clause head again yields a linear head and a set of variable definitions. Due to the linearization, the linear clause head is clearly not well-typed. However, it is well-typed modulo the variable definitions: it becomes well-typed once all existential variables have been instantiated during assignment and the variable definitions have been solved. It would be interesting to accord first-class type-theoretic status to the variable definitions, but we leave this to future work, since the implementation treats them only in a very special manner, explained in Section 3.3. Note that some of these variable definitions are in fact redundant, which is another, orthogonal optimization (see [11] for an analysis on a fragment of LF).
It is worth pointing out that this situation does not arise in the simply-typed case. The idea of factoring out duplicate existential variables can be generalized to replacing arbitrary subterms by new existential variables and creating variable definitions. In particular, the process of linearization also replaces any existential variable v[σ] where σ is not a permutation by a new variable u[id Ψ] and a variable definition u[id Ψ] ≐ v[σ]. The linearization itself is quite straightforward, and we omit the details here. In the actual implementation, we do not generate types A and contexts Ψ for the new, linearly occurring existential variables, but ensure that all such variables are instantiated and disappear by the time the variable definitions have been solved.
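The linearization pass described above can be sketched in a few lines of Python; as before, the term representation and the fresh-name scheme are our own illustration, and we approximate u[id Ψ] by an identity substitution over the occurrence's domain:

```python
def linearize(term):
    """Make every existential variable occur at most once, and ensure every
    remaining occurrence u[pi] carries a permutation substitution. Duplicate
    or non-permutation occurrences are replaced by fresh variables plus
    variable definitions, to be solved later by full pattern unification."""
    seen, defs, counter = set(), [], [0]

    def fresh():
        counter[0] += 1
        return '_l%d' % counter[0]         # hypothetical fresh-name scheme

    def is_perm(sigma):
        names = [m[1] for m in sigma.values() if m[0] == 'var']
        return len(names) == len(sigma) == len(set(names))

    def go(t):
        tag = t[0]
        if tag == 'var':
            return t
        if tag == 'lam':
            return ('lam', t[1], go(t[2]))
        if tag == 'app':
            return ('app', go(t[1]), go(t[2]))
        if tag == 'evar':
            u, sigma = t[1], t[2]
            if u not in seen and is_perm(sigma):
                seen.add(u)                # first pattern occurrence: keep it
                return t
            v = fresh()                    # otherwise: factor out
            vt = ('evar', v, {x: ('var', x) for x in sigma})  # ~ v[id]
            defs.append((vt, t))           # variable definition v[id] = u[sigma]
            return vt
        raise ValueError(tag)

    return go(term), defs
```

After this pass, assignment against the linear result needs neither an occurs-check nor pruning; the deferred work sits in `defs`.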

Assignment for higher-order patterns
In this section, we give a refinement of general higher-order pattern unification which exploits the ideas presented above. The algorithm proceeds in three phases. In the first phase, we unify a linear atomic higher-order pattern L with an object U. The following judgments capture the assignment between a linear atomic higher-order pattern L and a normal object U. We write θ for simultaneous substitutions [[U1/u1, . . ., Un/un]] for existential variables, which have straightforward definitions and properties.
  ∆; Γ ⊢ L ≐ U / (θ, E)     assignment for normal objects
  ∆; Γ ⊢ R ≐ R′ / (θ, E)    assignment for neutral objects

where R is a linear neutral object. E denotes residual equations which may be generated during assignment. The assignment algorithm itself is given below.
Note that we do not need to worry about capture in the rule lam, since existential variables and bound variables are defined in different contexts. In the rule app, we are allowed to take the union of the two substitutions θ1 and θ2, as the linearity requirement ensures that their domains are disjoint. Note that the case of unifying an existential variable u[π] with another term U is now simpler and more efficient than in the general higher-order pattern case. In particular, it does not require a traversal of U (see rule existsL). Since the inverse of the substitution π can be computed directly and is total, we know [π]⁻¹U exists and can simply generate the substitution [π]⁻¹U/u. Finally, we may need to postpone solving some unification problems and generate residual equations if the non-linear term is an existential variable (see rule existsR).
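A minimal Python sketch of this assignment phase follows; the rule names in the comments (lam, app, existsL, existsR) are the paper's, while the function names and term representation are ours. Rather than traversing U, the existsL case suspends the total inverse of π:

```python
def assign(l, t, theta, resid):
    """Assign the linear atomic pattern l against normal object t.
    theta collects bindings for existential variables in l; resid collects
    residual equations. Returns False on a clash."""
    if l[0] == 'evar':
        u, pi = l[1], l[2]
        if t[0] == 'evar':
            resid.append((l, t))           # existsR: postpone as residual
            return True
        # existsL: pi is a permutation, so its inverse is total; we suspend
        # [pi]^-1 t instead of traversing t (no occurs-check: l is linear)
        inv = {m[1]: ('var', x) for x, m in pi.items()}
        theta[u] = ('susp', inv, t)
        return True
    if l[0] != t[0]:
        return False                       # head clash: assignment fails
    if l[0] == 'var':
        return l[1] == t[1]
    if l[0] == 'lam':                      # rule lam: no capture possible
        return assign(l[2], t[2], theta, resid)
    if l[0] == 'app':                      # rule app: theta domains disjoint,
        return (assign(l[1], t[1], theta, resid) and   # so updates are a union
                assign(l[2], t[2], theta, resid))
    raise ValueError(l[0])
```

Each existential variable in l is visited exactly once and bound in constant time; all remaining work is deferred to the residual equations and variable definitions.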
The result of the assignment algorithm is a substitution θ1 for the existential variables in L, and potentially some residual equations E. In the second phase, we apply θ1 to the variable definitions D which were generated during linearization, and solve [[θ1]]D using conventional pattern unification. We only need to pay attention to the case where we unify an existential variable u[id Ψ] with another variable u′[σ]. In this case, we simply generate the substitution [[u′[σ]/u]], as the inverse of id Ψ is the identity substitution again. This ensures that all existential variables introduced during linearization are instantiated once assignment succeeds.
As the final result of solving the variable definitions D, we obtain an additional substitution θ2. In the third phase, we solve the remaining residual equations E, which were generated during phase 1, under θ1 • θ2.

Experiments
In this section, we discuss experimental results for different programs written in Twelf. All experiments were carried out on a machine with the following specifications: 1.60GHz Intel Pentium processor, 256 KB cache, running SML of New Jersey 110.0.3 under Red Hat Linux 7.1. Times are measured in seconds. In the tables below, the column "opt" refers to the optimized version with linearization and assignment, while the column "stand" refers to the standard implementation using general higher-order pattern unification.

Higher-order logic programming
In this section, we present two experiments with higher-order logic programming. The first one uses an implementation of a meta-interpreter for ordered linear logic by Polakow and Pfenning [16]. In the second experiment, we evaluate our unification algorithm using an implementation of foundational proof-carrying code developed at Princeton University. As the results for the meta-interpreter demonstrate, the performance improvement ranges between 40% and 152%. Roughly 45% of the time there were no variable definitions at all. Of the non-trivial equations, roughly 45% were not unifiable. This means that overall, in approximately 20%-30% of the cases, the assignment algorithm succeeded and the failure of unification was delayed. It is worth noting that 77% to 80% of the calls went through the assignment algorithm presented in Section 3.3.

As the results demonstrate, the performance of the theorem prover is not greatly influenced by the optimized unification algorithm. The main reason is that we have many dynamic assumptions, which need to be unified with the current goal. However, we use the standard higher-order pattern unification algorithm for this operation and use the optimized algorithm only for selecting a clause. For dynamic assumptions we cannot maintain the linearity requirement, and linearizing the dynamic assumptions at run-time seems too expensive. The second example is theorem proving in the natural deduction calculus. In contrast to the previous experiments with the sequent calculus, there is a substantial performance improvement of approximately 70%. Although linear head compilation substantially improves performance, more optimizations, such as tabling and indexing, are needed to solve more complex theorems.

Related Work
The language most closely related to our work is λProlog. Two main implementations of λProlog exist, Prolog/MALI and Teyjus. In the Prolog/MALI implementation, the occurs-check is left out entirely [2], while Teyjus [8,7] eliminates some unnecessary occurs-checks statically during compilation. Besides the presence of dependent types in Twelf, there are several other differences between our implementation and Teyjus. 1) Teyjus compiles first and subsequent occurrences of existential variables into different instructions. Therefore, assignment and unification are freely mixed during execution. This may lead to expensive failure in some cases, since unification is still called. In our approach, we perform a simple, fast assignment check and delay unification entirely. As the experimental results demonstrate, only a small percentage of cases fails after passing the assignment test, and most cases benefit from a fast, simple assignment check. 2) We always assume that the types of existential variables are lowered. This can be done at compile time and incurs no run-time overhead. In Huet's unification algorithm, projection and imitation rules are applied at run-time to construct the correct prefixes of λ-abstractions.
3) Our approach can easily incorporate definitions and constraint domains. This is important since unifying definitions and constraint expressions may be expensive. In fact, we generalize and extend the idea of linearization in the implementation and factor out not only duplicate existential variables but also any difficult subexpressions, such as definitions and constraint expressions. Therefore, our approach seems more general than the one adopted in Teyjus.

Conclusion
We have presented a modal foundation for existential variables which underlies our higher-order pattern unification implementation in Twelf. This leads to a simple framework in which many optimizations, such as lowering, grafting, and linearization, can be justified. As the experiments show, performance is improved substantially. This is especially important in large-scale applications such as proof-carrying code, and it allows us to explore the full potential of logical frameworks in real-world applications.
In the future, we plan to investigate and implement further optimizations to reduce the performance gap between higher-order and first-order systems. One optimization which is particularly important to sustain performance in large-scale examples is term indexing. However, indexing of higher-order terms is still an open problem. We believe the presented strategy is a suitable basis for adapting indexing techniques from the first-order setting, for two reasons: 1) the presented strategy reduces the higher-order unification problem to a simple class of problems which is essentially first-order; 2) the experimental results show that many unification problems arising in practice fall into this class and can be solved by this strategy. This indicates that a term indexing algorithm based on this assignment strategy may also be effective in practice.
Besides a logic programming engine, the Twelf system also provides a theorem prover based on iterative deepening search. For it, we consider two examples: theorem proving in an intuitionistic sequent calculus and theorem proving in the classical natural deduction calculus.