Higher-order representation of substructural logics

We present a technique for higher-order representation of substructural logics such as linear or modal logic. We show that such logics can be encoded in the (ordinary) Logical Framework, without any linear or modal extensions. Using this encoding, metatheoretic proofs about such logics can easily be developed in the Twelf proof assistant.


Introduction
The Logical Framework (or LF) [6] provides a powerful and flexible framework for encoding deductive systems such as programming languages and logics. LF employs an elegant account of binding structure by identifying object-language variables with LF variables, object-language contexts with (fragments of) the LF context, and object-language binding occurrences with LF lambda abstraction. This account of binding, often called higher-order abstract syntax [12], automatically handles most operations that pertain to binding, including alpha-equivalence, substitution, and variable-freshness conventions [3].
Since the object-language context is maintained implicitly, as part of the built-in LF context, the structural properties of LF contexts (such as weakening and contraction) automatically apply to the object language as well. Ordinarily this is desirable, but it poses a problem for encoding substructural logics that do not possess those properties. For example, linear logics (by design) satisfy neither weakening nor contraction, so it would seem that they cannot be encoded in LF.
One solution to this problem is to extend LF with linear features. Linear LF [4] extends LF with linear assumptions and connectives. This provides the ability to encode linear logics. However, linearity has yet to be implemented in Twelf [13], the proof assistant that implements LF, in part due to unresolved complications that linearity creates in its metalogical apparatus. Consequently, Linear LF is not currently an option for those engaged in formalizing metatheory. Moreover, Linear LF does not give us any assistance with other substructural logics, such as affine, strict, or modal logic.
Another option is to break with standard LF practice and model object-language contexts explicitly [5]. Explicit contexts can be reconciled with higher-order abstract syntax, thereby retaining many of the benefits of LF. Once contexts are explicit, it is easy to state inference rules that handle the context in an appropriate way for a substructural logic. However, the explicit context method is clumsy to work with and sacrifices some of the advantages of LF. For example, although substitution is still free (since the syntax of terms is unchanged), the substitution lemma is not. The explicit context method is typically used internally within a proof, rather than in the "official" formalization of a logic.
In this paper we advocate a more general and workable approach in which we look at substructural logic from a slightly different perspective. Rather than viewing a substructural logic from the perspective of its contexts (that is, collections of assumptions), we suggest it is profitable to look at it from the perspective of its individual assumptions.
The essence of linear logic is not that type-checking splits the context when it checks a (multiplicative) term with multiple subterms. The essence of linear logic is that an assumption is used exactly once. The latter property can be stated on an assumption-by-assumption basis, without reference to contexts. Thus, wherever an assumption is introduced, as part of the typing rule that introduced it, we can check that that assumption is used linearly.
At first glance, it might appear that linearity must be a meta-judgement, tracing the use of assumptions throughout a typing derivation. That would make it very awkward to use in practice. Fortunately, however, to check the linear use of assumptions, we need look only at proof terms; there is no need to examine typing derivations.
The idea of linearity as a judgement over proof terms dates to the early days of LF. Avron et al. [2,1] suggested that linearity can be expressed by imposing a lattice structure on proof terms and defining linear proof terms as those that are strict and distributive, when viewed as a function of their linear variables.
In this paper, we suggest a simpler formulation of linearity, based on tracking variables through the proof terms of linear logic. This allows for a clean, practical definition of linearity.
We express linear logic using two judgements, the usual typing judgement:
of : term -> tp -> type.
and a linearity judgement:
linear : (term -> term) -> type.
The judgement linear(λx.Mx) should be read as "the variable x is used linearly (i.e., is used exactly once) in Mx." In this paper, we illustrate the use of a substructural judgement (such as linear) in three settings: linear logic, dependently typed linear logic, and judgemental modal logic [11]. Many other substructural logics, including affine logic and strict logic, can be handled analogously. Some others, such as ordered logic [15,14], cannot, because the rules of the logic make it impossible to handle assumptions independently. We briefly discuss the latter in Section 5.
The full Twelf development can be found on-line at: www.cs.cmu.edu/~crary/papers/2009/substruct.tar
In our discussion, we assume familiarity with the Logical Framework, and with linear and modal logic. Some familiarity with Twelf may also be helpful. The sections on adequacy are technical, but the remainder of the paper should be accessible to the casual practitioner.

Linear Logic
We begin by representing the syntax of linear logic in the usual fashion. The LF encoding, with the standard on-paper notation written alongside it for reference, is shown in Figure 1. The type atom ranges over a fixed set of atomic propositions.
On paper, we represent linear logic with the typing judgment Γ; ∆ ⊢ M : A. In this, the first context, Γ, represents the unrestricted context (i.e., truth), and the second context, ∆, represents the linear context (i.e., resources). To simplify the notation, we adopt the convention that the linear context is unordered. Thus (∆, ∆′) refers to a context that can be split into two pieces ∆ and ∆′ that may possibly be interleaved. We also adopt the convention that all the variables appearing in either context must be distinct.
The encoding of the static semantics, as discussed previously, is given by two judgements:
of : term -> tp -> type.
linear : (term -> term) -> type.
We read "of M A" as "M is of type A," and we read "linear ([x:term] M x)" as "x is used linearly in (M x)." Note that [x:term] is Twelf's concrete syntax for LF lambda abstraction (λx:term.). Twelf can usually infer the domain type, leaving just [x].
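To make the shape of the signature concrete, here is a minimal Twelf sketch of a fragment of the syntax together with the two judgements. The constants llam, lapp, unit, top, and zero appear elsewhere in the text; the name lolli is an assumption made for illustration, and the fragment is not a reproduction of Figure 1.

```twelf
% Object-language types (a fragment; lolli is an assumed name).
tp    : type.
lolli : tp -> tp -> tp.          % linear implication
top   : tp.                      % unit for "with"
zero  : tp.                      % unit for plus

% Object-language terms, with binding represented by LF functions.
term : type.
llam : (term -> term) -> term.   % linear lambda
lapp : term -> term -> term.     % linear application
unit : term.                     % introduction form for top

% The two judgements of the encoding.
of     : term -> tp -> type.
linear : (term -> term) -> type.

% [x:term] and [x] denote the same LF abstraction; Twelf infers the domain.
id-long  : term = llam ([x:term] x).
id-short : term = llam ([x] x).
```

The two definitions at the end denote the same LF object and differ only in whether the domain annotation is written out.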
We proceed rule-by-rule to show the encoding of the static semantics.
Variables
The rule for linear variables states that a linear variable may be used provided there are no other linear variables in scope:
Γ; x:A ⊢ x : A
There is no typing rule for variables in the encoding; that is handled automatically by higher-order representations. However, there is a linearity rule stating that x is linear in x, that is, that linear ([x] x) holds.
The rule for unrestricted variables states that an unrestricted variable may be used provided there are no linear variables in scope. As with linear variables, there is no typing rule for unrestricted variables in the encoding. There is also no linearity rule for unrestricted variables. Note that {x:term} is Twelf's concrete syntax for the dependent function space (Πx:term.). Again, Twelf can usually infer the domain type, leaving just {x}.
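The variable case can be sketched in Twelf as a single axiom. The rule name linear/var and the closed constant c are assumptions made for illustration; the queries record how many derivations Twelf's search should find.

```twelf
term : type.
c    : term.                     % hypothetical closed term, for the queries only

linear : (term -> term) -> type.

% x is linear in x (rule name assumed).
linear/var : linear ([x] x).

% The bound variable itself is used exactly once.
%query 1 * linear ([x] x).

% A term that never mentions x offers no way to derive linearity here,
% which is how the absence of weakening shows up in the encoding.
%query 0 * linear ([x] c).
```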

Linear implication
The typing rule has the usual typing premise, plus a second premise that requires that the argument be used linearly in the body. The linearity rule says that a variable y is linear in a function (llam ([x] M y x)) if it is linear in its body (M y x) for any choice of x.
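A sketch of how the two rules just described might be written in Twelf follows. The rule names of/llam, linear/llam, and linear/var, the type constructor lolli, and the base type a are assumptions; only llam and the two judgements are taken from the text.

```twelf
tp    : type.
a     : tp.                      % hypothetical base type, for the queries only
lolli : tp -> tp -> tp.          % assumed name for linear implication

term : type.
llam : (term -> term) -> term.

of     : term -> tp -> type.
linear : (term -> term) -> type.

linear/var : linear ([x] x).

% Typing premise first, then the linearity check on the bound variable.
of/llam : of (llam ([x] M x)) (lolli A B)
           <- ({x} of x A -> of (M x) B)
           <- linear ([x] M x).

% y is linear in a function if it is linear in the body for any x.
linear/llam : linear ([y] llam ([x] M y x))
               <- ({x} linear ([y] M y x)).

% The identity function is well typed.
%query 1 * of (llam ([x] llam ([y] y))) (lolli a (lolli a a)).
%query 0 * of (llam ([x] llam ([y] x))) (lolli a (lolli a a)).
```

The first query fails to be the plain identity only in appearance: its outer variable x is unused, so it is in fact rejected by the outer linearity premise, and Twelf would report a mismatch with the expected count of 1. Readers trying the block should therefore replace the first query's term with llam ([x] x) at type (lolli a a), which has exactly one derivation; the second query, which discards y, correctly has none.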
The elimination rule splits the linear context between the function and argument. This is encoded using three rules. The typing rule is standard. There are two linearity rules, one for each way a linear variable might be used. The first linearity rule says that x is linear in (lapp (M x) N) if it is linear in (M x) and does not appear in N. (Since implicitly bound meta-variables such as M and N are quantified on the outside, stating N without a dependency on x means that N is closed with respect to x.) The second linearity rule provides the symmetric case.
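The three rules for application can be sketched as follows. The names of/lapp and linear/lapp2 occur later in the text; linear/lapp1, linear/var, lolli, and the closed constant c are assumptions. The premise order of of/lapp is chosen so that the proof term reads of/lapp P2 P1, as in the adequacy proof.

```twelf
tp    : type.
lolli : tp -> tp -> tp.          % assumed name for linear implication

term : type.
lapp : term -> term -> term.
c    : term.                     % hypothetical closed term, for the queries only

of     : term -> tp -> type.
linear : (term -> term) -> type.

linear/var : linear ([x] x).

of/lapp : of (lapp M N) B
           <- of M (lolli A B)
           <- of N A.

% x is used in the function; N carries no dependency on x.
linear/lapp1 : linear ([x] lapp (M x) N)
                <- linear ([x] M x).

% x is used in the argument; M carries no dependency on x.
linear/lapp2 : linear ([x] lapp M (N x))
                <- linear ([x] N x).

% x used once, on either side of the application.
%query 1 * linear ([x] lapp x c).
%query 1 * linear ([x] lapp c x).

% x used twice: neither rule applies, so contraction is ruled out.
%query 0 * linear ([x] lapp x x).
```

The last query fails because unification cannot instantiate N (or M) with a term that mentions the bound variable x.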

Multiplicative conjunction
The introduction rule for tensor is encoded using three rules, in a similar fashion to function application. For the elimination rule, the typing rule of the encoding requires that x and y be linear in N. As in previous cases where the linear context is split, there are two linearity rules, depending on whether a linear variable is used in the let-bound term or the body.

Additive conjunction
The introduction rule for "with" does not split the context. In the encoding, there is one linearity rule, requiring that linear variables be linear in both constituents of the pair. The elimination rules are straightforward.

Disjunction
The introduction rules for plus are straightforward. The elimination rule splits the context into two pieces, one for the discriminant and one used by both arms. In the encoding, the typing rule requires that each arm's bound variable be used linearly. The linearity rules provide the two cases, one in which the variable is used linearly in the discriminant, and one in which it is used linearly in both arms.

Exponentiation
The introduction rule for exponentiation requires that the linear context be empty. In the encoding, this means there is no linearity rule, since variables cannot be linear in exponents. The elimination rule splits the context and adds the newly bound variable to the unrestricted context. In the encoding, the unrestricted nature of x is handled by not checking that x is linear in (N x). The linearity rules work in the usual fashion.

Units
The unit for tensor is 1. The encoding is straightforward, with no linearity rule for introduction, since variables cannot be linear in *. The unit for "with", ⊤, is more interesting. It stands for an unknown collection of resources, and consequently has an introduction form but no elimination form. The typing rule of the encoding is:
of/unit : of unit top.
The encoding also provides that any variable is linear in unit.
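The contrast between the multiplicative and additive pairs, and the special behaviour of ⊤, can be sketched in Twelf as below. Only unit is named in the text; the constructors tpair and pair and all rule names here are assumptions.

```twelf
term  : type.
tpair : term -> term -> term.    % tensor pair (assumed name)
pair  : term -> term -> term.    % "with" pair (assumed name)
unit  : term.                    % introduction form for top

linear : (term -> term) -> type.

linear/var : linear ([x] x).

% Tensor splits the context: x goes to exactly one component.
linear/tpair1 : linear ([x] tpair (M x) N)
                 <- linear ([x] M x).
linear/tpair2 : linear ([x] tpair M (N x))
                 <- linear ([x] N x).

% "With" shares the context: x must be linear in both components.
linear/pair : linear ([x] pair (M x) (N x))
               <- linear ([x] M x)
               <- linear ([x] N x).

% Any variable is linear in unit.
linear/unit : linear ([x] unit).

%query 0 * linear ([x] tpair x x).
%query 1 * linear ([x] pair x x).

% Two derivations: either component may be the one that consumes x.
%query 2 * linear ([x] tpair unit unit).
```

The last query shows that linearity derivations need not be unique, since either unit can be taken to absorb the assumption.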
The unit for plus, 0, represents falsehood. Accordingly, it has an elimination form but no introduction form. The elimination form behaves a little bit like ⊤; any resources not used to prove 0 may be discarded. In the encoding there are two linearity rules: a variable is linear in (any M) if it is linear in M or if it does not appear in M at all. The typing rule is:
of/any : of (any M) T <- of M zero.
Note that it is tempting but incorrect to simplify the linearity rules to the single rule:
linear/any-wrong : linear ([x] any (M x)).
That rule would allow x to be used multiple times in (M x), which is not permitted. It would be tantamount to moving the entire linear context into the unrestricted context, rather than merely discarding any unused resources.
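The two linearity rules for any might be written as follows. The rule names linear/any1 and linear/any2, the tensor-pair constructor tpair with its rules, and the closed constant c are assumptions introduced so that the queries can exhibit a double use of x.

```twelf
term  : type.
any   : term -> term.
tpair : term -> term -> term.    % assumed name, used only to duplicate x
c     : term.                    % hypothetical closed term

linear : (term -> term) -> type.

linear/var    : linear ([x] x).
linear/tpair1 : linear ([x] tpair (M x) N) <- linear ([x] M x).
linear/tpair2 : linear ([x] tpair M (N x)) <- linear ([x] N x).

% x is linear in (any M) if it is linear in M ...
linear/any1 : linear ([x] any (M x))
               <- linear ([x] M x).
% ... or if it does not appear in M at all.
linear/any2 : linear ([x] any M).

%query 1 * linear ([x] any x).
%query 1 * linear ([x] any c).

% A double use of x is still rejected, unlike under linear/any-wrong.
%query 0 * linear ([x] any (tpair x x)).
```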

Adequacy
It seems intuitively clear that the preceding is a faithful representation of linear logic. We wish to go further and make the correspondence rigorous, following the adequacy argument of Harper et al. [6]. Adequacy establishes an isomorphism between the object language (linear logic in this case) and its encoding in LF. As usual, an isomorphism is a bijection that respects the relevant operations.
For syntax, the only primitively meaningful operation is substitution. (Other operations are given by defined semantics.) Thus, an isomorphism for syntax is a bijective translation that respects substitution. Our translation for syntax (written ⌜−⌝) is standard, so we will omit the obvious details of its definition and simply state its adequacy theorem for reference:
1. Let Type be the set of linear logic types. Then there exists a bijection ⌜−⌝ between Type and LF canonical forms P such that ⊢LF P : tp. (Variables cannot appear within types, so there is no substitution to respect.)
2. Let S be a set of variables and let Term_S be the set of linear logic terms whose free variables are contained in S. Then there exists a bijection ⌜−⌝ between Term_S and LF canonical forms P such that ⌜S⌝ ⊢LF P : term. Moreover, ⌜−⌝ respects substitution: ⌜[N/x]M⌝ = [⌜N⌝/x]⌜M⌝.
For semantic adequacy, we wish to establish a bijective translation between typing derivations and LF canonical forms.³ The usual statement of adequacy for typing is something to the effect of Non-Theorem 2.4. Since that statement does not work in the presence of linearity, we instead establish a correspondence between each linear-logic typing derivation on the one hand, and an LF proof of typing paired with a collection of LF proofs of linearity on the other (Definition 2.5 and Theorem 2.6). Alas, this is notationally awkward when compared with the usual adequacy theorem. Proving adequacy is typically straightforward but tedious once it is stated correctly. The same is true here, but the tedium is a bit more pronounced because of the need to manipulate encoding structures, rather than just canonical forms. We give a few cases by way of example:

Proof Sketch
First, by induction on derivations, we construct the translation and show it is type correct.
• Suppose ∇ is the derivation: • Suppose ∇ is the derivation: The other case is symmetric.
So let ⌜∇⌝ = (of/lapp P2 P1, H), where, for each y in Domain(∆1, ∆2), H(y) is built with linear/lapp1 when y is in ∆1 and with linear/lapp2 when y is in ∆2.
It remains to show that ⌜−⌝ is a bijection. To do so, we exhibit an inverse translation. The interesting cases are those that split the context. We give the application case as an example. Suppose (of/lapp P2 P1, H) is an encoding structure. We must sort ∆ into two pieces, ∆1 and ∆2, according to which linearity rule each H(y) uses. We can show, by induction over LF canonical forms, that the inverse translation is fully defined over encoding structures. It is easy to verify that ⌜−⌝ and the inverse translation are inverses. Therefore ⌜−⌝ is bijective.
When the linear context is empty, the H portion of an encoding structure is empty, and we recover the usual notion of adequacy:
Corollary 2.7 There exists a bijection between derivations of the judgement Γ; · ⊢ M : A and LF canonical forms P such that ⌜Γ⌝ ⊢LF P : of ⌜M⌝ ⌜A⌝.

Metatheory
To demonstrate the practicality of our encoding, we proved the subject reduction theorem in Twelf. We give the definition of reduction in Figure 2. Reduction is encoded with the judgement:
reduce : term -> term -> type.
We will not discuss the encoding of reduction and its adequacy, as they are standard. We prove subject reduction by a series of four metatheorems. To make the development more accessible to readers not familiar with Twelf's logic programming notation for proofs, we give those metatheorems in English. The next lemma is usually glossed over in proofs on paper:
Lemma 2.9 (Reduction of closed terms) Suppose the ambient context is made up of bindings of the form x:term (and other bindings not subordinate to reduce). If ({x:term} reduce M1 (M2 x)) is derivable, then there exists M2':term such that M2 = ([_] M2').
Lemma 2.10 (Subject reduction for linear) Suppose the ambient context is made up of bindings of the form x:term,dx:of x A (and other bindings not subordinate to reduce or of).If ({x} reduce (M x) (M' x)) and ({x} of x A -> of (M x) B) and linear ([x] M x) are derivable, then linear ([x] M' x) is derivable.

Proof Sketch
By induction on the first derivation. Cases involving substitution (most of the beta-reduction cases) use Lemma 2.8. Multiple-subterm compatibility cases use Lemma 2.9 to show that reduction of subterms not mentioning a linear variable will not create such a reference.
Theorem 2.11 (Subject reduction for of) Suppose the ambient context is made up of bindings of the form x:term,dx:of x A (and other bindings not subordinate to reduce or of).If reduce M M' and of M T are derivable, then of M' T is derivable.

Proof Sketch
By induction on the first derivation. Cases with linearity premises (reduce/llam, reduce/lett, and reduce/case) use Lemma 2.10 to show that the linearity premises are preserved by reduction.
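For readers who do use Twelf's notation, the statements of Lemma 2.10 and Theorem 2.11 could be declared as type families along the following lines. This is only a sketch of the statements: the family names sr-linear and sr-of are hypothetical, no proof cases are given, and the %worlds and %total declarations that would complete a Twelf metatheorem are omitted.

```twelf
tp   : type.
term : type.

of     : term -> tp -> type.
linear : (term -> term) -> type.
reduce : term -> term -> type.

% Lemma 2.10: reduction preserves linearity of a variable.
sr-linear : ({x} reduce (M x) (M' x))
             -> ({x} of x A -> of (M x) B)
             -> linear ([x] M x)
             -> linear ([x] M' x)
             -> type.
%mode sr-linear +D1 +D2 +D3 -D4.

% Theorem 2.11: reduction preserves typing.
sr-of : reduce M M' -> of M T -> of M' T -> type.
%mode sr-of +D1 +D2 -D3.
```

The mode declarations mark the hypotheses as inputs and the conclusion as an output, which is how Twelf reads a relation as a statement of the form "for all inputs there exists an output."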
5 "Subordinate" is a term of art in Twelf. Informally, s is subordinate to t if s can contribute to t. More precisely, a type family s is subordinate to a type family t if there exist types S and T belonging to s and t such that objects of type S can appear within objects of type T [17]. If s is not subordinate to t, then assumptions whose types belong to s can be ignored while considering t.

Proof
Immediate from Subject Reduction and Adequacy.

Dependently Typed Linear Logic
Adding dependent types to linear logic is straightforward syntactically. The revised syntax is shown in Figure 3. We delete atomic propositions, and replace them with constants that take a single term parameter. (That parameter may be a unit or tuple, which provides implicit support for zero or multiple parameters.)
In the static semantics, a new wrinkle arises. Now that terms can appear within types, the typing rules must ensure that linear variables are not used within types. However, a variable can appear within a term's type without appearing in the term itself. This is obvious because our lambda abstractions are unlabelled, but it would still be the case even if all bindings were labelled with types. This is because of the equivalence rule, which allows a term's type to be replaced by any equivalent type. Using the equivalence rule, a term's type can mention any variable in scope.
One solution to this problem is to make linearity a judgement over typing derivations, rather than over proof terms.However, that would make linearity a dependently typed meta-judgement, which would be too cumbersome to work with in practice.It is better to maintain linear as a judgement over proof terms.
Instead, we change our view of unrestricted variables.In non-dependently typed linear logic, we viewed unrestrictedness as merely the absence of a linearity restriction.Now we will view unrestrictedness as conferring an affirmative capability; specifically, the capability to appear within types.
We add a new judgement unrest that applies to unrestricted variables. We extend that judgement to terms by saying that a term is unrestricted if all its free variables are unrestricted:
unrest/lapp : unrest (lapp M N) <- unrest M <- unrest N.
...
Note that, within the unrest judgement, all bound variables are taken to be unrestricted, even linear ones.
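A sketch of the unrest judgement for the implication fragment follows. The rule unrest/lapp is taken from the text; the declaration of unrest and the rule unrest/llam are assumptions about how the elided rules might look, guided by the remark that bound variables are taken to be unrestricted.

```twelf
term : type.
llam : (term -> term) -> term.
lapp : term -> term -> term.

unrest : term -> type.

unrest/lapp : unrest (lapp M N)
               <- unrest M
               <- unrest N.

% Assumed rule: the bound variable is taken to be unrestricted in the body,
% even though llam binds a linear variable.
unrest/llam : unrest (llam ([x] M x))
               <- ({x} unrest x -> unrest (M x)).

% A closed term is unrestricted; the check ignores how often x is used.
%query 1 * unrest (llam ([x] lapp x x)).
```

The query succeeds exactly once, which reflects that unrest is directed by the syntax of the term and has a single derivation when it has one at all.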
Only unrestricted terms are permitted to serve as the parameter to a constant. On paper, the corresponding rule assumes some pre-specified collection of axioms of the form c : A → type. In our encoding, the well-formedness judgement for types is wf : tp -> type. The constant rule relies on cparam rules: we assume there exists a unique cparam rule for each axiom c : A → type. The remaining wf rules are uninteresting (but note that the rule for pi introduces an unrestricted variable).
Our existing typing rules must be altered in two ways. First, now that types can be ill-formed, several rules must add a wf premise. This is straightforward. Second, the rules for the exponential must be rewritten to use the unrest judgement. We also have new rules for unrestricted functions and application, and finally a rule for equivalence.
The addition of dependent types complicates the proof of subject reduction in a number of ways, but nearly all are orthogonal to linearity. One issue that does relate to linearity is that we require one additional lemma to show that unrestrictedness is preserved by reduction:
Lemma 3.1 (Subject reduction for unrest) Suppose the ambient context is made up of bindings of the form x:term,ex:unrest x and bindings of the form x:term (and other bindings not subordinate to reduce or unrest). If reduce M M' and unrest M are derivable, then unrest M' is derivable.

Adequacy
Adequacy for dependently typed linear logic proceeds in much the same fashion as before. We must make four changes. First, we revise syntactic adequacy of types, now that types are not closed:
1. Let S be a set of variables and let Type_S be the set of linear logic types whose free variables are contained in S. Then there exists a bijection ⌜−⌝ between Type_S and LF canonical forms P such that ⌜S⌝ ⊢LF P : tp. Moreover, ⌜−⌝ respects substitution: ⌜[N/x]A⌝ = [⌜N⌝/x]⌜A⌝.
2. Let S be a set of variables and let Term_S be the set of linear logic terms whose free variables are contained in S. Then there exists a bijection ⌜−⌝ between Term_S and LF canonical forms P such that ⌜S⌝ ⊢LF P : term. Moreover, ⌜−⌝ respects substitution: ⌜[N/x]M⌝ = [⌜N⌝/x]⌜M⌝.

Modal Logic
There are (at least) two ways to specify modal logic. One is using an explicit notion of Kripke worlds and accessibility [16]. Such a formulation does not behave as a substructural logic (in that all assumptions are available throughout their scope) and can be encoded in LF without difficulty [8].
A second, which we consider here, is judgemental modal logic [11].
Judgemental modal logic distinguishes between two sorts of assumption, truth and validity. Although judgemental modal logic has no explicit notion of Kripke worlds, one can think of truth as applying to only the current world, and validity as applying to all worlds. Consequently, the introduction rule for □A, which internalizes validity, must require that no truth assumptions are used. This is accomplished with the rule: from Γ; · ⊢ M : A, conclude Γ; ∆ ⊢ box M : □A. Here, Γ is the validity context and ∆ is the truth context. Whatever truth assumptions exist are discarded while type checking M. Since assumptions in ∆ are unavailable in M despite being in scope, judgemental modal logic behaves as a substructural logic.
We express this restriction using a judgement reminiscent of linear, indicating that an assumption is used locally to the current world: local : (term -> term) -> type.
The judgement local([x] Mx) should be read as "the variable x is used locally (i.e., not within boxes) in Mx." The syntax of modal logic is given in Figure 4. In the interest of brevity, we omit discussion of the possibility modality here. A treatment of possibility appears in the full Twelf development.
Variables
The rules for variables allow the use of any variable in the context. As usual, there is no typing rule for variables in the encoding, but there are two locality rules. First, x is local in x, that is, local ([x] x) holds. Second, we wish to say that x is local in every variable (truth or validity) other than x. The easiest way to express this is to generalize to all terms M that do not contain x, using the rule local/closed : local ([x] M).
The important thing in the encoding of necessity is the absence of any locality rule for bx. The only way to show that a variable is local in (bx M) is using the local/closed rule, which requires that the variable not appear in M, as desired. In the elimination rule for necessity, since the variable introduced by letbx is a validity assumption, we do not check that it is local in the body.
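The locality rules can be collected into a small Twelf signature. The rules local/closed, local/lam, and local/app are the ones given with Figure 4; the name local/var for the variable rule is an assumption. The queries check the intended behaviour: a variable may be used several times, but never inside a box.

```twelf
term : type.
lam  : (term -> term) -> term.
app  : term -> term -> term.
bx   : term -> term.

local : (term -> term) -> type.

local/var    : local ([x] x).           % rule name assumed
local/closed : local ([x] M).           % M carries no dependency on x

local/lam : local ([y] lam ([x] M y x))
             <- ({x} local ([y] M y x)).

local/app : local ([x] app (M x) (N x))
             <- local ([x] M x)
             <- local ([x] N x).

% There is deliberately no rule for bx.

% Unlike a linear variable, a local variable may be used more than once.
%query 1 * local ([x] app x x).

% A use inside a box is not local: only local/closed could apply, and it
% requires that x not occur.
%query 0 * local ([x] bx x).
```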
Metatheory
Subject reduction for modal logic follows the same development as for linear logic in Section 2.2, with local standing in for linear. One lemma must be generalized: since local variables can appear multiple times in modal logic, composition of locality must allow the local variable to appear (locally) in the scope of substitution (M1 below), as well as in the substitutend (M2 below):

Adequacy
1. Let Type be the set of modal logic types. Then there exists a bijection ⌜−⌝ between Type and LF canonical forms P such that ⊢LF P : tp. (Variables cannot appear within types, so there is no substitution to respect.)
2. Let S be a set of variables and let Term_S be the set of modal logic terms whose free variables are contained in S. Then there exists a bijection ⌜−⌝ between Term_S and LF canonical forms P such that ⌜S⌝ ⊢LF P : term. Moreover, ⌜−⌝ respects substitution: ⌜[N/x]M⌝ = [⌜N⌝/x]⌜M⌝.
Semantic adequacy again encounters a challenge; this time the opposite problem from the one we saw with linear logic. In the encoding of linear logic there were too few typing derivations; here there are too many.
The problem lies in the local judgement. Unlike linear, which expressed a property that could be satisfied in many ways, local expresses a fact that essentially can be satisfied in only one way, by the variable not appearing in any boxes. In this regard, local is more like unrest than linear. However, unlike unrest, derivations of local are not unique.
The problem stems from the fact that the local/closed rule can apply to terms that also have another rule. For example, suppose M and N are closed terms. Then local ([x] app M N) has at least two derivations: local/closed and (local/app local/closed local/closed).
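The two derivations can be written down directly as LF objects. In this sketch the closed constant c stands in for both M and N and is hypothetical, as are the names d1 and d2; the rules are the ones given with Figure 4.

```twelf
term : type.
app  : term -> term -> term.
c    : term.                     % hypothetical closed term

local : (term -> term) -> type.

local/closed : local ([x] M).

local/app : local ([x] app (M x) (N x))
             <- local ([x] M x)
             <- local ([x] N x).

% Two distinct canonical forms of the same type.
d1 : local ([x] app c c) = local/closed.
d2 : local ([x] app c c) = local/app local/closed local/closed.

% Proof search finds both derivations.
%query 2 * local ([x] app c c).
```

Because d1 and d2 inhabit the same type, a bijection between typing derivations and canonical forms cannot hold without identifying them.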
One solution to the problem would be to restrict local/closed to variables (and add another rule for closed boxes). This would ensure that local derivations are unique (like unrest derivations). We could impose the restriction by creating a judgement (say, var) to identify variables, and then rewriting the local/closed rule so that it applies only to terms that have a var derivation. However, this solution has a significant shortcoming: the substitution lemma would no longer be a free consequence of higher-order representation. Under such a regime, variable assumptions would take the form ({x:term} of x A -> var x -> ...whatever...). Consequently, we would only obtain substitution for free when the substitutend possesses a var derivation; that is, when the substitutend is another variable. The general substitution lemma would have to be proved and used explicitly.
A better solution is to rephrase adequacy to quotient out the excess derivations:

Proof Sketch
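We give one case in each direction, by way of example. Suppose ∇ is the derivation:
In the reverse direction, consider of/bx P. The second criterion of encoding structures is vacuously satisfied for an empty truth context, so P belongs to an encoding structure for Γ; · ⊢ M : A. Let ∇ be the inverse translation of P. Then let the inverse translation of of/bx P be the derivation that extends ∇, whose conclusion is Γ; · ⊢ M : A, with the introduction rule for necessity, concluding Γ; ∆ ⊢ box M : □A.
It is easy to verify that, for the appropriate ∇ and P, translating ∇ and then translating back yields ∇, and translating P back and then forward yields a form equivalent (≅) to P. Therefore the two translations are inverses.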

Conclusion
The Logical Framework is not only (nor even primarily) a type theory.More importantly, it is a methodology for representing deductive systems using higher-order representation of syntax and semantics, and a rigorous account of adequacy.Where applicable, the LF methodology provides a powerful and elegant tool for formalizing programming languages and logics.
There are two reasons it might not apply. First, limitations of existing tools for LF, such as Twelf, might prevent one from carrying out the desired proofs once a system were encoded in LF. Second, there might be an inherent problem representing the desired deductive system adequately using a higher-order representation. When a language cannot be cleanly represented in a higher-order fashion, it often indicates that something about the language is suspect, such as an incorrect (or at least nonstandard) notion of binding and/or scope.
In some cases, however, languages with unconventional notions of binding or scope are nevertheless sensible. Substructural logics are probably the most important example. In this paper, we show that many substructural logics can be given a clean higher-order representation by isolating each logic's "substructuralness" (e.g., linearity or locality) and expressing that as a judgement over proof terms.
Our strategy applies to other substructural logics as well. For example, affine logic and strict logic can each be encoded along very similar lines as linear logic. We conjecture that contextual modal logic [9] is encodable along similar lines as judgemental modal logic. This is a good avenue for future work. The logic of bunched implications [10] is another.
On the other hand, since our method relies on enforcing "substructuralness" on an assumption-by-assumption basis, there are some substructural logics it does not support, such as ordered logic [15,14]. In ordered logic, the context is taken to be ordered and assumptions must be processed in order. We cannot enforce this restriction on assumptions independently, as the very nature of the restriction is that assumptions are not independent. The usability of one assumption can depend on the disposition of every other assumption in scope.

Definition 2.5
An encoding structure for Γ; ∆ ⊢ M : A is a pair (P, H) of an LF canonical form P and a finite mapping H from variables to LF canonical forms, such that:
• ⌜Γ, ∆⌝ ⊢LF P : of ⌜M⌝ ⌜A⌝, and
• Domain(H) = Domain(∆), and
• For each variable y in Domain(∆), ⌜S_y⌝ ⊢LF H(y) : linear ([y:term] ⌜M⌝), where S_y = Domain(Γ, ∆) \ {y}.
Theorem 2.6 (Semantic adequacy) There exists a bijection ⌜−⌝ between derivations of the judgement Γ; ∆ ⊢ M : A and encoding structures for Γ; ∆ ⊢ M : A.

Figure 4: Modal logic syntax

local/closed : local ([x] M).

Implication
The introduction rule for implication is: from Γ; (∆, x:A) ⊢ M : B, conclude Γ; ∆ ⊢ λx.M : A → B. This is encoded using two rules, reminiscent of the ones for linear implication:
of/lam : of (lam ([x] M x)) (arrow A B) <- ({x} of x A -> of (M x) B) <- local ([x] M x).
local/lam : local ([y] lam ([x] M y x)) <- ({x} local ([y] M y x)).
The function's argument is a truth assumption, so it must be used locally in the body. The elimination rule for implication is straightforward: from Γ; ∆ ⊢ M : A → B and Γ; ∆ ⊢ N : A, conclude Γ; ∆ ⊢ M N : B.
of/app : of (app M N) B <- of M (arrow A B) <- of N A.
local/app : local ([x] app (M x) (N x)) <- local ([x] M x) <- local ([x] N x).

Necessity
Recall the introduction rule for necessity: from Γ; · ⊢ M : A, conclude Γ; ∆ ⊢ box M : □A. This is encoded with the single rule:
of/bx : of (bx M) (box A) <- of M A.

3 Translation of contexts is defined: ⌜x1:A1, . . ., xn:An⌝ = x1:term, dx1:of x1 ⌜A1⌝, . . ., xn:term, dxn:of xn ⌜An⌝.
Non-Theorem 2.4 There exists a bijection between derivations of the judgement Γ ⊢ M : A and LF canonical forms P such that ⌜Γ⌝ ⊢LF P : of ⌜M⌝ ⌜A⌝.
Unfortunately, this simple statement of adequacy does not work in the presence of linearity. Consider the judgement ·; x:a ⊢ M : ⊤ ⊗ ⊤, where M is the tensor pair of two copies of the introduction form for ⊤. It has two derivations, depending on which conjunct is chosen to consume the assumption.
In the application case, (of/lapp P2 P1, H) is an encoding structure for Γ; ∆ ⊢ O : B. Then O has the form M N, and ⌜Γ; ∆⌝ ⊢LF P1 : of ⌜M⌝ ⌜A ⊸ B⌝, and ⌜Γ; ∆⌝ ⊢LF P2 : of ⌜N⌝ ⌜A⌝.
… linear/lapp2 R}. Note that ∆ = ∆1, ∆2. Also note that no variable in ∆1 appears free in N, or vice versa. Therefore it is easy to show that no assumption in ∆1 appears free in P2, and vice versa. Hence⁴ ⌜Γ; ∆1⌝ ⊢LF P1 : of ⌜M⌝ …
The application rule concludes Γ; (∆1, ∆2) ⊢ M N : B, with second premise Γ; ∆2 ⊢ N : A.