Semantic characterizations of number theories

We show that a number-theoretic formula is a theorem of First-Order Arithmetic if and only if it is true, as a statement about numbers, in all Henkin-structures that are closed under abstract jump (i.e. strict-$\Pi^1_1$ definitions), and that a number-theoretic formula is a theorem of Arithmetic with existential induction if and only if it is true in all Henkin-structures that contain their abstract RE (i.e. strict-$\Pi^1_1$ definable) sets.

The induction axioms of FA are the instances of induction for all (first-order) $V_{FA}$-formulas.
The primitive recursive definitions of FA are the universal closures of defining equations for all primitive recursive functions.³ We write PR for the set of these formulas. For a primitive recursive function $f$, we let degree($f$) denote the length of the (shortest in PR) primitive recursive definition of $f$.
It is easy to see that FA is a conservative extension of Peano's Arithmetic. Note that Peano's third axiom follows from $\neg(0 = 1)$ by induction, and the fourth axiom follows from the defining equations for the predecessor function.
$\Sigma_1$-Arithmetic, $\Sigma_1$A, is like FA except that induction is restricted to existential $V_{FA}$-formulas. Computationally, $\Sigma_1$A is the same as Primitive Recursive Arithmetic, PRA (which has induction only for $V_{FA}$-equations), since they prove the same $\Pi^0_2$ formulas, that is, they have the same provably recursive functions [Par77].
It will be useful to identify conditions for doing away with the axiom $\neg(0 = 1)$. For a number theory $A$, let $A^-$ denote $A$ without $\neg(0 = 1)$.
LEMMA 1 Let $\varphi$ be a $V_{FA}$-formula in which equality has no negative occurrences, and let $A$ be one of the number theories above. If $A \vdash \varphi$, then $A^- \vdash \varphi$.
PROOF: Troelstra observed [Tro73] that if a formula $\varphi$ is provable in $A$, then the result of replacing (hereditarily) in $\varphi$ every negated subformula $\neg\psi$ by $\psi \to 0 = 1$ is a theorem of $A^-$. Let $\varphi'$ be a prenex-disjunctive normal form for $\varphi$, so the equivalence $\varphi \leftrightarrow \varphi'$ is provable in first-order logic. Then $A \vdash \varphi$ implies $A \vdash \varphi'$, from which $A^- \vdash \varphi'$, since $\varphi'$ is without negation, whence $A^- \vdash \varphi$. •
Let neq be the primitive recursive characteristic function of inequality: neq($x, y$) = 0 iff $x \neq y$. For a $V_{FA}$-formula $\varphi$ let $\tilde{\varphi}$ be the result of replacing in $\varphi$ each negatively occurring equation, $t = s$, by $\neg(\mathrm{neq}(t, s) = 0)$. Then $\tilde{\varphi}$ has no negative occurrences of equality, and we have, immediately from the definitions, PRA $\vdash \varphi \leftrightarrow \tilde{\varphi}$. Combined with Lemma 1 this implies
LEMMA 2 Let $\varphi$ be a $V_{FA}$-formula, and let $A$ be one of the number theories above. If $A \vdash \varphi$ then $A^- \vdash \varphi \leftrightarrow \tilde{\varphi}$ and $A^- \vdash \tilde{\varphi}$.
³ Our development remains valid if we expand the set of functions to include all partial recursive functions, where the defining equations are Herbrand-Gödel functional programs, as in [Kle52], provably coherent in Primitive Recursive Arithmetic. A Herbrand-Gödel program $P$ is coherent if its operational semantics generates a single-valued relation. Coherence is an undecidable property, but there is a collection $C$ of functional programs such that membership in $C$ is decidable in real time, and such that every partial recursive function has a program in $C$. The programs obtained from any one of the standard proofs of this fact (as e.g. in [Kle52]) are all provably coherent in Primitive Recursive Arithmetic.
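The function neq can be made concrete with a small sketch. The text only fixes that neq($x, y$) = 0 iff $x \neq y$; the value 1 on equal arguments, the helper names (`primitive_recursion`, `pred`, `monus`, `is_zero`), and the use of Python's built-in addition are assumptions made for illustration.

```python
def primitive_recursion(base, step):
    """Return f with f(0, *v) = base(*v) and f(n + 1, *v) = step(n, f(n, *v), *v)."""
    def f(n, *v):
        acc = base(*v)
        for i in range(n):
            acc = step(i, acc, *v)
        return acc
    return f

# pred(0) = 0, pred(n + 1) = n
pred = primitive_recursion(lambda: 0, lambda n, prev: n)
# monus(0, x) = x, monus(n + 1, x) = pred(monus(n, x)); so monus(n, x) is x minus n, cut off at 0
monus = primitive_recursion(lambda x: x, lambda n, prev, x: pred(prev))
# is_zero(0) = 1, is_zero(n + 1) = 0
is_zero = primitive_recursion(lambda: 1, lambda n, prev: 0)

def neq(x, y):
    # The two cut-off differences sum to |x - y|, which is nonzero exactly when x != y.
    # (Addition is itself primitive recursive; the built-in + is used here for brevity.)
    return is_zero(monus(y, x) + monus(x, y))

print(neq(3, 5), neq(4, 4))  # 0 1
```

With this reading, a negatively occurring equation $t = s$ and the formula $\neg(\mathrm{neq}(t, s) = 0)$ agree on every pair of numerals, which is the content of the equivalence stated above.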

Henkin-structures
The language of second-order logic is an extension of the language of first-order logic (with equality and function identifiers) with relational variables of all finite arities, and with quantification over them. Our basic proof-system for second-order logic, SOL$_0$, is obtained from the first-order predicate calculus with equality by treating relational variables on a par with object variables (without comprehension); see for example [Pra65, §V.1] for details. Given a class $\Phi$ of formulas, $\Phi$-Comprehension is the schema $\exists R\,\forall \vec{x}\,(R(\vec{x}) \leftrightarrow \varphi)$, where $R$ is a relational variable that does not occur free in $\varphi$, arity($R$) = arity($\vec{x}$), and $\varphi \in \Phi$. If $\Phi$ is a collection of second-order formulas, we write SOL($\Phi$) for SOL$_0$ augmented with $\Phi$-comprehension. SOL will denote SOL$_0$ with comprehension for all (second-order) formulas.
Since the collection of second-order formulas that are valid (under the standard interpretation of relational quantification) is not in the arithmetical hierarchy, let alone effectively enumerable, even SOL is incomplete for standard validity. However, second-order logic is complete for the broader class of Henkin-structures [Hen50]. A Henkin-structure, $\mathcal{H}$, consists of a first-order structure over some universe $A$, augmented with, for each $r \geq 1$, a collection $\mathcal{H}_r$ of $r$-ary relations over $A$. Semantic satisfaction, $\mathcal{H} \models \varphi$, is defined using $\mathcal{H}_r$ as the range of quantifiers over $r$-ary relations. If $\Phi$ is a class of second-order formulas, then $\mathcal{H}$ is closed under $\Phi$ if, for each $\varphi \equiv \varphi[\vec{x}, \vec{R}]$ in $\Phi$, with free object variables $\vec{x} = (x_1 \ldots x_k)$, and free relational variables $\vec{R} = (R_1 \ldots R_l)$ (with $r_i$ = arity($R_i$)), and all $Q_1 \in \mathcal{H}_{r_1}, \ldots, Q_l \in \mathcal{H}_{r_l}$, the set $\{(a_1 \ldots a_k) \in A^k \mid \mathcal{H} \models \varphi[\vec{a}/\vec{x}, \vec{Q}/\vec{R}]\}$ is in $\mathcal{H}_k$.
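The closure condition can be checked mechanically on a finite structure. The following is a minimal sketch under stated assumptions: the universe is finite, each formula is represented by a Python predicate standing in for its satisfaction relation (so the formulas are taken to have no relational quantifiers), and the names `definable_set` and `is_closed_under` are hypothetical.

```python
from itertools import product

def definable_set(universe, k, phi, relations):
    """All k-tuples a over `universe` for which phi(a, relations) holds."""
    return frozenset(a for a in product(universe, repeat=k) if phi(a, relations))

def is_closed_under(universe, H, formulas):
    """Check the closure condition for a finite Henkin-structure.

    H maps each arity r to a set of r-ary relations (frozensets of r-tuples).
    Each formula is a triple (k, arities, phi): the number k of free object
    variables, the arities of its free relational variables, and a predicate
    standing in for satisfaction of the formula.
    """
    for k, arities, phi in formulas:
        # Try every assignment of relations from H to the free relational variables.
        for Q in product(*(H[r] for r in arities)):
            if definable_set(universe, k, phi, Q) not in H[k]:
                return False
    return True

universe = (0, 1)
# phi[x, R] is "not R(x)": quantifier-free, with one free unary relational variable.
complement = (1, (1,), lambda a, Q: a not in Q[0])

all_unary = {frozenset(s) for s in ([], [(0,)], [(1,)], [(0,), (1,)])}
sparse_unary = {frozenset(), frozenset([(0,)])}

print(is_closed_under(universe, {1: all_unary}, [complement]))     # True
print(is_closed_under(universe, {1: sparse_unary}, [complement]))  # False
```

The second structure fails because the complement of the empty relation is the full universe, which is missing from its collection of unary relations.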
The proof in [Hen50] establishes the following.

THEOREM 3 [Henkin] Let $\Phi$ be a class of second-order formulas. A second-order formula is valid in all Henkin-structures closed under $\Phi$ iff it is provable in SOL($\Phi$).

Computational formulas
A second-order formula is computational if it is of the form $\forall \vec{R}\,\exists \vec{x}\,\psi$, where $\vec{R}$ are relational variables and $\psi$ is quantifier-free, i.e. if it is strict-$\Pi^1_1$ in the sense of [Bar69, Bar75]. A computational formula without free relational variables is relationally closed (r-closed for short). Comp (respectively, Comp$_0$) will denote the set of formulas equivalent in SOL$_0$ to a computational formula (respectively, to an r-closed computational formula).
The term "computational" is induced by the fact that each computational formula describes a computation process, which becomes apparent when the formula is converted into an equivalent "computational normal formula," as follows. Let $V$ be a vocabulary, and fix a tuple $\vec{R}$ of relational identifiers. Let $\chi$ be a syntactic parameter for quantifier-free $V$-formulas. A computational normal formula is a formula $\varphi$ of the form [...] where each $\iota_n$ is the disjunction of formulas of the form $\chi \to$ [...], each $\kappa_n$ is the disjunction of formulas of the form $\chi \wedge R_j(\vec{y})$ [...], and each $\theta_n$ is the conjunction of formulas of one of the forms $\chi$ or $R(\vec{x})$. The formula $\varphi$ states (about its free variables) that every process that initializes the values of $\vec{R}$ as prescribed by the $\iota_n$'s, and inductively closes these relations as prescribed by the $\kappa_n$'s, will reach values that satisfy some "target condition" $\theta_n$.
Each computational normal formula defines, uniformly for all $V$-structures, the operational semantics of a certain finite state machine (see [Lei89] for details). The connection with computational formulas is given by the straightforward observation that every computational formula is equivalent, in SOL$_0$, to a computational normal formula.
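The "initialize, close, then test a target" reading can be illustrated with a short sketch. It covers only the simplest case (one initialization, deterministic closure rules, a single target test) over a finite cut-off universe; it is not the finite state machine construction of [Lei89]. The names `least_closure`, `reaches_target`, `successor_rule` and the constant `BOUND` are hypothetical.

```python
def least_closure(initial, rules):
    """Least set of facts containing `initial` and closed under every rule.

    Each rule maps the current set of facts to the set of facts it derives.
    """
    facts = set(initial)
    while True:
        new = set().union(*(rule(facts) for rule in rules)) - facts
        if not new:
            return facts
        facts |= new

def reaches_target(initial, rules, target):
    """Run the closure process and test the target condition on the result."""
    return target(least_closure(initial, rules))

BOUND = 10  # hypothetical cut-off so that the universe {0, ..., BOUND} is finite

def successor_rule(facts):
    # Closure clause R(u) -> R(s(u)), restricted to the cut-off universe.
    return {("R", u + 1) for (name, u) in facts if name == "R" and u < BOUND}

# Initialize with R(0), close under successor, and ask whether R(7) is reached.
print(reaches_target({("R", 0)}, [successor_rule], lambda f: ("R", 7) in f))  # True
# Starting from R(3) instead, R(1) is never reached.
print(reaches_target({("R", 3)}, [successor_rule], lambda f: ("R", 1) in f))  # False
```

Because the target here is a positive atom, testing it in the least closed relation agrees with testing it in every relation that satisfies the initialization and closure clauses.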
The significance of computational formulas is further manifest in the following.

THEOREM 4 [Kreisel]
Every computational $V_{FA}$-formula is equivalent in the standard $V_{FA}$-structure to an existential formula.
Hence, every r-closed computational $V_{FA}$-formula defines in the standard $V_{FA}$-structure an RE set.

PROOF:
Using familiar sequence-coding, every computational formula is equivalent in the standard $V_{FA}$-structure to a computational formula of the form $\forall R\,\exists x\,\psi$, where $R$ is unary, and $\psi \equiv \psi[R]$ is quantifier-free with no free variables other than $x$ and $\vec{u}$. Let [...]. We have [...]. The forward direction of the last equivalence holds by König's Lemma, and the backward direction is straightforward. The latter formula is equivalent in the standard structure to an existential formula, since the universal quantifier is bounded. •
The proof above of the equivalence of computational formulas to existential formulas clearly applies to any countable admissible structure [Bar69, Bar75 (Theorem VIII.3.1)]. However, this equivalence fails to hold in general for structures which do not contain a code for every completed computation over elements of the structure. For example, if $V_s = \{0, s\}$, then it is easy to see that every RE set of natural numbers is defined in the standard $V_s$-structure⁴ by a computational formula (compare Lemma 14 below), whereas the sets defined in the standard $V_s$-structure by existential formulas, or even by first-order formulas, are all recursive. (In fact, even for the vocabulary $V_+ = \{0, s, +\}$, every set of natural numbers defined in the standard $V_+$-structure by a first-order formula is recursive, by [Pre30].) Thus, computational formulas might be regarded as the appropriate generalization to all structures of recursive enumerability; they reduce to existential formulas over structures in which computations are representable internally, but are in general stronger.
Of interest is also the strength of computational formulas as queries. A $k$-ary query (or global relation) over a class $C$ of structures assigns to each structure $S \in C$ a $k$-ary relation over the universe $|S|$ of $S$. If $C$ consists of $V$-structures, and $\varphi$ is a $V$-formula whose free variables are among $u_1 \ldots u_k$, then $\lambda u_1 \ldots u_k.\varphi$ determines a query over $C$, that assigns to [...]
Note that if a computational formula $\varphi$ has free relational variables $\vec{Q}$, then it determines a computational process that uses $\vec{Q}$ as oracles, and is equivalent over countable admissible structures to an existential formula with $\vec{Q}$ free. Thus, $\varphi$ defines an abstract notion of relative RE, that is, an abstract form of Kleene's jump.

Direct second-order interpretation of number theories
In this section we define a second-order interpretation of $V_{FA}$ which is direct, in the sense that the target formalism has defining equations for primitive recursive functions, in contrast to the "full" interpretation we define in the sequel. While the full interpretation is more logically pristine, the direct interpretation is easier to formulate and verify.
⁴ where 0 is interpreted as zero and $s$ as the successor function
⁵ [Fag75] proves that graph connectivity is not a first-order definable query; an elegant simple proof of this can be found in [GV85]. [Imm87] observes that all first-order queries over finite ordered structures are computable in deterministic log-space.

Definition of the direct interpretation
Let [...], then the extension of $N$ in $S$ is a copy of the natural numbers.
If $\alpha_1, \ldots, \alpha_k$ are formulas or terms, we write var($\alpha_1, \ldots, \alpha_k$) for the set of variables that occur free in $\alpha_1, \ldots, \alpha_k$.
If $\varphi$ is a $V_{FA}$-formula, then $\varphi^N$ denotes $\varphi$ with quantifiers relativized to $N$. Assuming $N[x]$ we get induction with respect to $x$ for a formula $\psi$ by instantiation of the universal relational quantifier to the relation $\lambda x.\psi$:

$\psi[0] \wedge \forall u\,(\psi[u] \to \psi[s(u)]) \to \psi[x]$.

This is legitimate by comprehension for $\psi$.
However, if $\varphi$ is a first-order $V_{FA}$-formula, then $\varphi^N$ is in general not first-order (because $N$ is not), so first-order comprehension does not yield induction for the interpretation of first-order $V_{FA}$-formulas! We define the direct interpretation of $V_{FA}$ as having $N$ as the formula defining the target universe, and with the $V_{FA}$-identifiers interpreted by themselves.
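Relativization of quantifiers to $N$ is a purely syntactic transformation, sketched below. The tuple encoding of formulas, the tags (`"forall"`, `"exists"`, `"and"`, `"or"`, `"implies"`, `"not"`, `"N"`), and the function name `relativize` are assumptions made for illustration; the clauses are the usual ones, with universal quantifiers guarded by an implication and existential quantifiers by a conjunction.

```python
def relativize(formula):
    """Relativize every quantifier of `formula` to the predicate N.

    Formulas are nested tuples: ("forall", x, body), ("exists", x, body),
    ("and", a, b), ("or", a, b), ("implies", a, b), ("not", a); anything
    else is treated as an atomic formula and returned unchanged.
    """
    if not isinstance(formula, tuple):
        return formula
    op = formula[0]
    if op == "forall":
        _, x, body = formula
        return ("forall", x, ("implies", ("N", x), relativize(body)))
    if op == "exists":
        _, x, body = formula
        return ("exists", x, ("and", ("N", x), relativize(body)))
    if op in ("and", "or", "implies", "not"):
        return (op, *(relativize(sub) for sub in formula[1:]))
    return formula  # atomic formula, e.g. an equation

# "for all x there is y with s(x) = y"
phi = ("forall", "x", ("exists", "y", ("eq", ("s", "x"), "y")))
print(relativize(phi))
```

The output contains the atom `("N", x)` under every quantifier, which is why the relativized form of a first-order formula is no longer first-order once $N$ is unfolded into its second-order definition.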
Given a formalism $C$ for second-order (or higher-order) logic (with constant and function identifiers), the directly-interpreted number theory of $C$ is [...]
It is not hard to delineate the direct number theory of (impredicative) Second-Order Logic, SOL. Recall that Impredicative Analysis, i.e. Second-Order Arithmetic, is the extension of FA with quantification over relations, with induction formulated as a single axiom, $\forall x.\,N[x]$, and with comprehension for all formulas in the language. The following is essentially due to Prawitz [Pra65].
THEOREM 5 A first-order $V_{FA}$-formula $\varphi$ is a theorem of Impredicative Analysis iff $\varphi^N$ is provable in SOL + PR + $\neg(0 = 1)$, i.e. iff $\varphi^N$ is provable in SOL + PR.

Correctness of the direct interpretation
The following proposition states the correctness of the direct interpretation of $V_{FA}$ in SOL(Comp$_0$).
PROOF: By induction on degree($f$). If $f$ is one of the initial functions, then the proposition is trivial. The case where $f$ is defined by composition is straightforward. [...] $(v, \vec{u}, f(v, \vec{u}))$.
Arguing within SOL(Comp$_0$) we have, by induction assumption, [...] and [...]. Assume [...]. By (2) and (3), [...], which by the second defining equation [...]. This proves the first conjunct in (4). The second conjunct is immediate from (1) and the first defining equation for $f$. Thus we get the conclusion of (4), $N[f(v, \vec{u})]$, based on assumption (3), which is precisely the statement of the proposition. •

Soundness of the direct interpretation
In this section we prove that the direct interpretation of FA is sound for SOL( [...] which is clearly an r-closed computational formula. $\varphi$ implies $\varphi'$ trivially. For the converse, [...], which by Lemma 8 implies $\varphi$. •
LEMMA 10 For every $V_{FA}$-formula $\varphi$, comprehension for $\varphi^N$ is provable in SOL(Comp).
PROOF: By induction on $\varphi$. Without loss of generality, we assume that $\wedge$, $\neg$, and $\exists$ are the only logical constants. If $\varphi$ is quantifier-free then it is computational, and the lemma is trivial.
If $\varphi \equiv \psi \wedge \chi$ then $\varphi^N \equiv \psi^N \wedge \chi^N$. By induction assumption SOL(Comp) proves $\exists P\,\forall \vec{u}\,(P(\vec{u}) \leftrightarrow \psi^N)$ and $\exists Q\,\forall \vec{v}\,(Q(\vec{v}) \leftrightarrow \chi^N)$, where $\vec{u}$ and $\vec{v}$ list var($\psi$) and var($\chi$), respectively. By comprehension for quantifier-free formulas [...] The case $\varphi \equiv \neg\psi$ is treated similarly.

Generalization: $\varphi \equiv \chi \to \forall x\,\psi$ is derived from $\chi \to \psi$. By induction assumption $N[\mathrm{var}(\varphi), x] \to (\chi^N \to \psi^N)$ is provable. Therefore, $N[\mathrm{var}(\varphi)] \to (\chi^N \to (\forall x\,\psi)^N)$ is provable, by Generalization, since $x$ must not be free in $\chi$. This concludes the induction step and the proof.

Full second-order interpretations of number theories
We define an interpretation of the vocabulary of SOL that has each $V_{FA}$-identifier interpreted by an r-closed computational formula that defines its graph. The target formalism of the interpretation cannot make do with no constants at all, since to interpret 0 and $s$ in the absence of constants we would need second-order constant-free formulas, $\varphi$ with only $x$ free, and $\psi$ with only $x$ and $y$ free, such that SOL $\vdash \exists! x\,\varphi$ and SOL $\vdash \forall y\,(M[y] \to \exists! x\,\psi)$, where $M$ is a formula interpreting $N$. Clearly, no such formulas exist. We therefore assume that 0 and $s$ are present in the target vocabulary.

Graphs of primitive recursive functions
For each $V_{FA}$-identifier $f$, we define a formula $G_f$, by induction on degree($f$), as follows. [...]
We need the following generalization of Lemma 9.

LEMMA 14 Let $\varphi$ be a conjunction of formulas of the form $G_f[\vec{t}]$, formulas of the form $N[t]$, and quantifier-free [r-closed] formulas. Then $\exists \vec{x}\,\varphi$ is equivalent in SOL(Comp$_0$) to an [r-closed] computational formula.

In particular, every formula of the form $G_f[\vec{t}]$ is equivalent in SOL(Comp$_0$) to an r-closed computational formula.

$\varphi \equiv G_{f_1}[\vec{t}_1, s_1] \wedge \ldots \wedge G_{f_k}[\vec{t}_k, s_k] \wedge N[q_1] \wedge \ldots \wedge N[q_l] \wedge \alpha$,
where $\alpha$ is quantifier-free. We prove the lemma by main induction on the number $l$ of conjuncts $N[q_i]$, secondary induction on $m = \max_{i \leq k}[\mathrm{degree}(f_i)]$, and ternary induction on the number $n$ of conjuncts $G_{f_i}$ with degree($f_i$) = $m$.
If $l > 0$, then [...] As in the proof of Lemma 9, $\varphi$ is equivalent to [...] which is equivalent, by elementary quantifier rules, to [...] where $u$ is fresh and $\beta$ is the quantifier-free formula $(R(0) \wedge (R(u) \to R(s(u)))) \to R(q_l)$. By induction assumption, $\varphi'$ is equivalent in SOL(Comp$_0$) to an r-closed computational formula.
If $k > 0$ (and $l = 0$), then $\varphi$ is of the form $\exists \vec{x}\,(\psi \wedge G_f[\vec{t}, s])$, where degree($f$) = $m$. We proceed by cases for the definition of $f$. If $f$ is initial, then $G_f[\vec{t}, s]$ is quantifier-free, and we are done by induction assumption. If $f$ is defined by composition, then $G_f[\vec{t}, s]$ [...] for which the lemma holds by induction assumption.

[...] $\to Q(\vec{t}, s)$.
The formula $\chi$ is equivalent to

[...] $\vee\, Q(\vec{t}, s)$.
Since degree($g$), degree($h$) $< m$, each one of the disjuncts of $\chi'$ is equivalent in SOL(Comp$_0$) to an r-closed computational formula, by induction assumption, and therefore, by Lemma 7, $G_f[\vec{t}, s]$ is also equivalent to an r-closed computational formula, proving Claim 1.
CLAIM 2. $\varphi$ is equivalent, in SOL(Comp$_0$), to $\varphi' =_{df} \forall Q\,\exists \vec{x}\,(\psi \wedge \chi')$.
For the converse, assume $\varphi'$. Then, by instantiating $Q$ to $\lambda v, \vec{u}, w.\,G_f[v, \vec{u}, w]$ [...]
To prove the lemma it remains to show that $\varphi' \equiv \forall Q\,\exists \vec{x}\,(\psi \wedge \chi')$ is equivalent in SOL(Comp$_0$) to an r-closed computational formula. We have [...] $\vee\, \exists \vec{x}\,(\psi \wedge Q(\vec{t}, s))$.
By induction assumption, each one of the disjuncts is equivalent in SOL(Comp$_0$) to an r-closed computational formula, so $\varphi'$ is also equivalent to such a formula, by Lemma 7. •

Full second-order interpretation of arithmetic
The full second-order interpretation of $V_{FA}$ has $N$ as the formula that defines the target universe, with 0 interpreted by 0, $s$ interpreted by $s$, and every other $V_{FA}$-identifier $f$ interpreted by the formula $G_f$, in the usual sense. The following is a more detailed description of the latter point.
Let us say that an equation is simple if it is of the form $f(\vec{u}) = v$, where $\vec{u}$, $v$ are variables, and that a formula is simple if all equations therein are simple or are equations between atomic terms (i.e., variables or constants). For an equation $E$, let $E^s$ be a simple formula, equivalent to $E$, obtained by hereditarily replacing equations by equivalent existential simple formulas. For example, $f(g(u)) = v$ is replaced by $\exists w\,(g(u) = w \wedge f(w) = v)$.
For a $V_{FA}$-formula $\varphi$ the interpretation $\varphi^I$ of $\varphi$ arises from $\varphi$ by replacing each equation $E$ (except simple equations and equations between atomic terms) by $E^s$, then replacing each simple equation $f(\vec{u}) = v$ by $G_f[\vec{u}, v]$, then relativizing quantifiers to $N$. Note that in the standard $V_{FA}$-structure $\forall \vec{x}, z\,(G_f[\vec{x}, z] \leftrightarrow f(\vec{x}) = z)$, and so $\varphi \leftrightarrow \varphi^I$.
Given a second-order (or higher-order) formalism $C$ (with constants 0 and $s$), the fully-interpreted number theory of $C$, FNT[$C$], is [...]

Correctness of the full interpretation
We show that the full interpretation of $V_{FA}$ in SOL(Comp), $\varphi \mapsto \varphi^I$, is correct, that is, that the interpretation of each $V_{FA}$-identifier is the graph of a function over the interpreted universe $N$, provably in SOL(Comp). We shall prove half of this already in SOL(Comp$_0$), so our interpretation is "semi-correct" for SOL(Comp$_0$).

LEMMA 15 If $f$ is a $V_{FA}$-identifier, then [...]
(where arity($\vec{x}$) = arity($f$)).
PROOF: By induction on degree($f$). The induction basis is trivial, and the case where $f$ is defined by composition is straightforward.

LEMMA 19 If $E \in$ PR, then SOL(Comp$_0$) $\vdash N[\mathrm{var}(E)] \to E$
To prove the converse, assume $f(a, \vec{x}) = z$, and assume $\forall \vec{u}, y\,(G_g[\vec{u}, y] \to Q(0, \vec{u}, y))$ and $\forall \vec{u}, v, w, y\,(Q(v, \vec{u}, w) \wedge G_h[v, \vec{u}, w, y] \to Q(s(v), \vec{u}, y))$. (5) and (6) imply $\psi[a]$. Therefore, $N[a, \vec{x}]$ and $f(a, \vec{x}) = z$ imply that (5) and (6) [...]
We summarize in the following theorems the semantic readings, based on Theorem 3, of Theorems 31 and 32. First, we have the following semantic characterizations of FA:
THEOREM 33 Let $\varphi$ be a closed $V_{FA}$-formula. The following conditions are equivalent: