Circumscription and implicit definability

We explore some connections between the technique of circumscription in artificial intelligence and the notion of implicit definition in mathematical logic. Implicit definition can be taken as the informal intent, but not necessarily the formal result, of circumscription. This raises some questions for logical theory and suggests some implications for artificial intelligence practice. The principal implication is that when circumscription ‘works’ its conclusions can be explicitly described.


Introduction
The second approach toward principled ignorance taken by artificial intelligence is that of ruling out possibilities by adopting a narrow view of what is possible. For example, frequently we are forced to take action in spite of incomplete knowledge about our circumstances and the consequences of possible actions. We may wish to follow rules that indicate which actions are appropriate in which circumstances, but due to the incompleteness of our knowledge, none of these may be clearly applicable. We may have other rules in the finite realm which specify particular standard assumptions to make, but even after following these, we may still not have enough information. Unlimited exploration of the possibilities is out of the question, for there are too many. In these straits, it is common to take what we do know or have assumed as our "best guess" about the situation. We assume that what we do know is all there is, that what we do not know to be true must be false. This is the idea of both circumscription and the "closed world assumption," each of which involves jumping to conclusions based on those delimited by our explicit knowledge.
For example, we may know of several things that lie on a table, but may need to know about all things on the table to tell if we have succeeded in removing everything from the table. In this case, our best guess is to assume that the several objects we know to be on the table are the only things there, to assume that nothing else is on the table.
One may wish to succeed in some action that has certain conditions of success, for example, rowing in a boat to the other side of a river. As McCarthy's well-known anecdotes show, we will never act unless we can rule out the myriad ways the action might fail. To do this, we assume that if something was wrong, we would know about it, and conclude that nothing is wrong from our ignorance of any problems.
Or finally, we may wish to take several actions in a row as a means to accomplishing a difficult goal, and to tell what action to do next, we need to know what circumstances result from the preceding actions. We may know some consequences of possible actions explicitly, through "laws of motion" of the form "Conditions C will obtain after taking action A in situation S." Such rules may tell us about the overt effects of actions, but it is infeasible to have such rules explicitly mention the vast majority of conditions left unchanged by actions. Hence to tell what conditions result, we must guess that each action changes nothing except those things we can tell it changes from the laws of motion and our other knowledge about relations between things.
These assumptions adopted in taking the narrow view serve to convert many former possibilities into temporary impossibilities, which can then be avoided by the methods mentioned previously. But to make use of "best guesses" in artificial intelligence, we also need ways of telling exactly what our best guess should be, when it should be changed, and how to compute it from our knowledge. In particular, we must address the following questions.
(1) Precisely which conclusions make sense as "best guesses" from given information?
(2) Is there an explicit, or at least concise, characterization of these conclusions and their dependence on the given information?
(3) How can these conclusions be computed quickly? Or approximated if exact answers are uncomputable or infeasible?
(4) Is there an easy way to tell when changes to the explicit information invalidate the guesses, and when the changes leave the guesses the same?
(5) When the guesses must change, can they be quickly updated to fit the new information?

McCarthy introduced (in [1977], modified in [1980]) the formal notion of circumscription as a solution to problem (1).
The formal definition of circumscription is as follows. Fix a first order logical language L, and let A be a set of axioms stated in L, that is, a set of closed formulas. For convenience, when A is finite, we will sometimes confuse A with a sentence A obtained as a conjunction of the members of A.
Let P be a predicate symbol of L of type or arity n. We will write P(x) to abbreviate $P(x_1, \ldots, x_n)$. We also write A(Q) to mean the result of substituting the symbol Q for P everywhere in A, so that A = A(P).
When A is finite, we define B(A, P), the circumscription schema for P in A, to be the sentence schema

$$[A(Q) \land \forall x (Q(x) \supset P(x))] \supset \forall x (P(x) \supset Q(x)).$$

We also write B(A, P) to mean the set of sentences abbreviated by the schema B(A, P), that is, the set resulting from substituting every expression of L of type n for Q in B(A, P). We further define K(A, P), the circumscription of P in A, to be the union of A and B(A, P), that is, $K(A, P) = A \cup B(A, P)$. When A is finite, we can write K(A, P) to mean the schema $A \land B(A, P)$.
Aside from slight differences in notation, the definition above of B(A, P) is exactly McCarthy's. McCarthy does not define the circumscription for infinite sets of axioms, and neither do we for the moment. We follow Reiter's [1982] lead in introducing notation for K(A, P), though ours differs from his. McCarthy's discussion apparently abandons the original axioms A for their conditional appearance in B(A, P), focusing on the consequences of B(A, P) alone. Here our terminology diverges from that of McCarthy, for we say that a sentence is a consequence of circumscribing P in A if it is among Th(K(A, P)), while McCarthy would mean that the sentence appears among Th(B(A, P)).
Two simple examples will be sufficient for most of our purposes. Both are adapted from ... Here circumscription yields not one minimal conception of what blocks are, but two. In addition to these questions, circumscription has never been studied with respect to problems (2)-(5). Artificial intelligence is lucky this time, however, for circumscription is intimately related to the logical notion of implicit definability, about which logicians know many things: not enough, perhaps, to satisfy all the practical needs of artificial intelligence, but enough to supply insights needed to better pursue those needs.

Implicit definability
Mathematical logic has long possessed the notion of implicit definition of predicates. Retaining our previous notation, a set of axioms A (finite or infinite) implicitly defines P just in case A forces a "unique" meaning for P. There are several equivalent precise statements of this intuitive idea.

Second, A implicitly defines P iff whenever $M$ and $M'$ are two models of A(P) ($M \models A$, $M' \models A$) that are isomorphic for all relations besides P, then they are also isomorphic for P. In particular, if $M'$ is an automorphism of $M$, then P is the same in both, that is, $M(P) = M'(P)$.

If we note that $A(Q) \supset \forall x (P(x) \equiv Q(x))$ implies the formula

$$[A(Q) \land \forall x (Q(x) \supset P(x))] \supset \forall x (P(x) \supset Q(x)),$$

we see that implicit definability implies that the circumscription schema B(A, P) is valid. That is, the implicit definability schema says that P is unique; the circumscription schema says that P is minimal; and of course uniqueness is a special case of minimality. Informally we might think of circumscription as producing a unique result (it does, after all, produce a single theory); that is, we might say that the intent of K(A, P) is to be a theory which implicitly defines P, a theory stating both that A(P) holds and that P is the unique minimal predicate forced by A. Unfortunately, the "smallest" value for P may actually be several different minimal values for P, a possibly different one in each model of A(P). In this case, K(A, P) will fail to implicitly define P, so circumscription is more properly thought of as a generalization of implicit definition. In fact, few theories define predicates implicitly, even if they are constructed with the intent of doing so. While in the conjunctive block example above K(A, Block) does implicitly define Block, in the disjunctive example K(A, Block) does not implicitly define Block. There are instead two nonisomorphic models of K(A, Block), one in which b1 is the only block, and one in which b2 is the only block.
... Q(x)), which would also require the circumscribed predicate to be uniquely defined. Unfortunately, these strengthenings do not result in more useful results from circumscription, for they simply yield inconsistent theories when applied to axioms like those of the disjunctive example above.
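The contrast between the two block examples can be checked by brute force on a finite domain. The following is only a minimal sketch under stated assumptions: we take the conjunctive axiom to be Block(b1) and Block(b2), the disjunctive axiom to be Block(b1) or Block(b2), and the domain to consist of just the two named objects. The helper names `extensions` and `minimal_extensions` are hypothetical, and enumerating subsets of a fixed finite domain is a stand-in for, not an implementation of, the circumscription schema.

```python
from itertools import combinations

DOMAIN = ("b1", "b2")  # assumed two-element domain


def extensions(domain):
    """Yield every candidate extension of the predicate as a frozenset."""
    for size in range(len(domain) + 1):
        for subset in combinations(domain, size):
            yield frozenset(subset)


def minimal_extensions(axiom, domain=DOMAIN):
    """Return the extensions satisfying `axiom` that have no satisfying proper subset."""
    satisfying = [ext for ext in extensions(domain) if axiom(ext)]
    return [ext for ext in satisfying
            if not any(other < ext for other in satisfying)]


def conjunctive(block):
    # Assumed axiom: Block(b1) and Block(b2)
    return "b1" in block and "b2" in block


def disjunctive(block):
    # Assumed axiom: Block(b1) or Block(b2)
    return "b1" in block or "b2" in block


conj_minimal = minimal_extensions(conjunctive)
disj_minimal = minimal_extensions(disjunctive)
print(len(conj_minimal))  # one minimal extension: Block is uniquely determined
print(len(disj_minimal))  # two minimal extensions: no unique meaning for Block
```

Under these assumptions the conjunctive axiom has a single minimal extension, while the disjunctive axiom has two incomparable ones, which is the finite shadow of the failure of implicit definability described above.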

The general rarity of implicit definitions leads to many questions about circumscription. When does K(A, P) define P implicitly? If it does not, should we care? Does not K(A, P) produce useful conclusions anyway? Put another way, what does K(A, P) accomplish when it fails to implicitly define P?
Are there any benefits to implicit definition succeeding? And could some other additions to A succeed at implicitly defining P when circumscription fails? We treat some of these questions below.

Explicit definability
Since axiom schemata are less convenient than finite axiom sets, and since the circumscription schema defines P rather indirectly, we can pursue question (2) and ask if K(A, P) entails some concise characterization of P. Specifically, can we find among the consequences of K(A, P) an explicit definition for P of the form $\forall x (P(x) \equiv \phi(x))$, where $\phi$ is some formula not involving P? For example, Reiter [1982], drawing on work of Clark [1978], has shown that for the special case in which A(P) is Horn in P, circumscribing P in A(P) entails the predicate completion of P. That is, in this case A(P) can be rewritten in the form $\psi \land \forall x (\phi(x) \supset P(x))$, and K(A, P) entails the formula

$$\forall x (P(x) \equiv \phi(x)).$$

Our question here is whether $\phi$ and $\psi$ may be chosen so that they do not mention P.
The answer to this question about the existence of explicit definitions is given by a famous result of mathematical logic. Beth's Definability Theorem states that implicit and explicit definability are equivalent: or formally, that A (finite or infinite) implicitly defines P iff there is a formula $\phi$ involving only the symbols of A exclusive of P such that $A \vdash \forall x (P(x) \equiv \phi(x))$. This means that an explicit definition of P in K(A, P) does not exist when K(A, P) fails to implicitly define P. This is part of the reason why having implicit definitions succeed is a good thing. Recognition of Beth's theorem changes one's outlook on circumscription, from the ill-posed problem of choosing "good" or "useful" instances of the schema to the concrete problem of computing the explicit definition of P.
But even if P admits an explicit definition, it may not be possible to eliminate the schema in favor of the explicit definition. That is, even if K(A, P) is consistent and if $K(A, P) \vdash \forall x (P(x) \equiv \phi(x))$ as desired, it may be that $\mathrm{Th}(K(A, P)) \neq \mathrm{Th}(A \cup \{\forall x (P(x) \equiv \phi(x))\})$, for the circumscription schema forces minimality of P, while an explicit definition need not. In general, statement of the minimality of P requires infinitely many axioms, e.g. the circumscription schema. However, in the fortunate cases in which the explicit definition of P consists of a list $l$ of possible values (as in the conjunctive block example above), the circumscription schema may be completely replaced by a finite axiomatization of P's minimality. Since $l$ is finite, each of its proper subsets may be explicitly listed, and an axiom set $\mu(A, P, l)$ constructed which declares that each proper subset fails to satisfy A(P). In this case, $\mathrm{Th}(K(A, P)) = \mathrm{Th}(A \cup \{\forall x (P(x) \equiv \phi(x))\} \cup \mu(A, P, l))$.
Actually, while Beth's theorem is famous, it is never used in logic except as an excuse for ignoring implicit definitions, since it means that implicit definitions offer no greater expressiveness in defining functions than do explicit definitions. But Beth's theorem is an easy consequence of Craig's Interpolation Lemma, which is also famous, and which appears in many logical studies with many applications. This lemma states that if $\alpha \supset \beta$, then there is a formula $\gamma$ involving only symbols common to both $\alpha$ and $\beta$ such that $\alpha \supset \gamma$ and $\gamma \supset \beta$. That is, $\gamma$ "interpolates" between $\alpha$ and $\beta$.
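For propositional formulas an interpolant can be computed directly, which may help make the lemma concrete. The sketch below is an illustration under assumptions, not the first-order lemma itself: formulas are represented as Python functions from truth assignments to booleans, and the interpolant is obtained by existentially quantifying away the symbols that occur only in the antecedent. The names `assignments` and `interpolant`, and the example formulas, are hypothetical.

```python
from itertools import product


def assignments(symbols):
    """Yield every truth assignment over `symbols` as a dict."""
    for values in product([False, True], repeat=len(symbols)):
        yield dict(zip(symbols, values))


def interpolant(alpha, alpha_symbols, beta_symbols):
    """Build gamma over the shared symbols only.

    gamma holds in a world iff some choice of values for the symbols
    private to alpha makes alpha true. If alpha implies beta, then
    alpha implies gamma and gamma implies beta.
    """
    shared = [s for s in alpha_symbols if s in beta_symbols]
    private = [s for s in alpha_symbols if s not in beta_symbols]

    def gamma(world):
        base = {s: world[s] for s in shared}
        return any(alpha({**base, **extra}) for extra in assignments(private))

    return gamma, shared


# Assumed example: alpha is (p and q), beta is (q or r); alpha implies beta.
def alpha(w):
    return w["p"] and w["q"]


def beta(w):
    return w["q"] or w["r"]


gamma, shared = interpolant(alpha, ["p", "q"], ["q", "r"])
print(shared)  # the only common symbol is q
```

In this example the interpolant mentions only q and is true exactly when q is true, so it sits between the two formulas as the lemma requires.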

Even if it is little used in logic, Beth's theorem is of practical interest for artificial intelligence, since it yields a procedure for finding explicit definitions if they exist. The conceptually simplest procedure in our case is to enumerate the consequences of K(A, P) and watch for one of the desired form. But there is a better way. Instead of this pure enumeration, we can use the first formulation of implicit definability and try to prove $\forall x (P(x) \equiv Q(x))$ from K(A(P), P) and K(A(Q), Q) for some Q not in L. A proof will only exist if K(A, P) implicitly defines P, and an explicit definition can be extracted from the proof. Some proof systems (see [Smullyan 1968]) even carry the explicit definition along as an interpolant. Needless to say, this method is much more directed than the pure enumeration. Unfortunately, neither method can guarantee results. By the completeness of the predicate calculus, if an explicit definition exists, these methods will find one. But if an explicit definition does not exist, then we see that the methods will run on forever, searching in vain for the desired definition or for a non-existent proof. Worse, there is no way to modify the procedures to check first that an explicit definition exists. Whether a theory explicitly (or implicitly) defines a predicate is undecidable, so the only way to be sure an explicit definition does not exist is to search all the consequences.
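In the propositional case both the test for implicit definability and the extraction of an explicit definition are finite computations, so the idea can be sketched without any of the termination problems just described. The following is a minimal sketch under assumptions: theories are Python predicates on truth assignments, the explicit definition is returned as a truth table over the remaining atoms, and the names `models` and `explicit_definition` are hypothetical. It illustrates the equivalence stated by Beth's theorem and is not a first-order procedure.

```python
from itertools import product


def models(atoms, theory):
    """Return every truth assignment over `atoms` that satisfies `theory`."""
    worlds = (dict(zip(atoms, values))
              for values in product([False, True], repeat=len(atoms)))
    return [world for world in worlds if theory(world)]


def explicit_definition(atoms, theory, target):
    """Return (others, rows) defining `target`, or None.

    `rows` is the set of assignments to the other atoms under which
    `target` is true. None is returned when two models agree on the other
    atoms but disagree on `target`, i.e. when `target` is not implicitly
    defined by the theory.
    """
    others = [a for a in atoms if a != target]
    table = {}
    for world in models(atoms, theory):
        key = tuple(world[a] for a in others)
        if table.setdefault(key, world[target]) != world[target]:
            return None
    return others, {key for key, value in table.items() if value}


ATOMS = ["p", "q", "r"]

# Assumed theory 1: p is equivalent to (q and r), so p is implicitly defined.
defined = explicit_definition(ATOMS, lambda w: w["p"] == (w["q"] and w["r"]), "p")

# Assumed theory 2: q implies p, which leaves p open whenever q is false.
undefined = explicit_definition(ATOMS, lambda w: (not w["q"]) or w["p"], "p")

print(defined)
print(undefined)
```

For the first theory the returned table has the single row in which q and r are both true, that is, the explicit definition of p as the conjunction of q and r; for the second no definition exists.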

(Proof: We can translate questions about satisfiability into questions about implicit definitions by considering, for a formula $\phi$ not involving P, a formula like $\psi$: $\phi \lor [\lnot\phi \land \forall x (P(x) \equiv x = x)]$. If $\psi$ implicitly defines P, we know that $\phi$ is not satisfiable, for if it were, P could be anything. And if $\phi$ is not satisfiable, then $\psi$ implicitly (even explicitly) defines P.
Hence $\phi$ is unsatisfiable iff $\psi$ implicitly defines P. The undecidability of satisfiability means that implicit definition must also be undecidable.)

If implicit definition is undecidable, can we find decidable special cases? In particular, can we find conditions on A(P) which guarantee that K(A, P) implicitly defines P? One such result is that the circumscription of a theory A(P) implicitly defines P if A(P) is monotone in P.
Monotone in P means that the theory is equivalent to one in which P occurs only positively. Theories monotone in P are important because they can be used to define monotone (non-decreasing) operators corresponding to P. Monotone operators in turn are the basis of inductive definitions, in which one identifies the least fixed point of the operator as the set inductively defined by the operator (see [Aczel 1977]). If the monotone operator corresponding to P has a least fixed point, then that least fixed point is the predicate implicitly defined by K(A, P). This allows us to define the circumscription of an infinite theory A as the theory corresponding to the least fixed point of the P operator derived from A. Unfortunately, whether a theory is monotone in P is also undecidable, so we must again look for tractable special cases. In fact, Horn in P is a special case of monotone in P, so circumscription implicitly defines P in each consistent A(P) that is Horn in P. This means that the Reiter-Clark result on predicate completion in Prolog can be strengthened to yield an explicit definition $\forall x (P(x) \equiv \phi(x))$ such that $\phi$ does not mention P.
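On a finite domain the least fixed point of a monotone operator can be reached by iterating the operator from the empty set, which gives a concrete picture of the inductive definitions mentioned above. The following is a minimal sketch under assumptions: the Horn-in-P axioms (P(0), and P(n) implies P(n + 2)) and the domain 0..9 are invented for illustration, and `least_fixed_point` and `step` are hypothetical names.

```python
def least_fixed_point(operator):
    """Iterate a monotone operator on sets, starting from the empty set,
    until the result stops changing. On a finite domain this terminates
    and yields the least fixed point."""
    current = frozenset()
    while True:
        following = frozenset(operator(current))
        if following == current:
            return current
        current = following


DOMAIN = range(10)  # assumed finite domain 0..9


def step(p):
    """Operator for the assumed axioms: P(0), and P(n) implies P(n + 2)."""
    derived = {0}
    derived |= {n + 2 for n in p if n + 2 in DOMAIN}
    return derived


least = least_fixed_point(step)
print(sorted(least))  # → [0, 2, 4, 6, 8]
```

The operator is monotone because P occurs only positively in the assumed axioms: enlarging the input set can only enlarge the derived set. The set reached is the smallest one closed under the axioms, which is the extension that circumscribing P would single out in this finite setting.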

Disjunctive Definability
If circumscription fails to implicitly define its intended predicate, then no concise characterization of the predicate exists as an explicit definition. But the disjunctive block example above suggests that we might be willing to settle for something weaker as a concise characterization: namely, a disjunctive definition of the form $\forall x (P(x) \equiv \phi_1(x)) \lor \ldots \lor \forall x (P(x) \equiv \phi_k(x))$, which we might abbreviate as

$$\bigvee_{i=1}^{k} \forall x (P(x) \equiv \phi_i(x)).$$

One can view a disjunctive definition as a classification of all models with respect to P, just as in the block example, models were classified into two categories: those in which b1 is the only block, and those in which b2 is the only block. In fact, if P is explicitly definable in each model of K(A, P), then a disjunctive definition must exist. That is, even though K(A, P) has infinitely many models, they fall into a finite set of equivalence classes with respect to P. To see this, consider adding to K(A, P) sentences of the form $\lnot \forall x (P(x) \equiv \phi(x))$ for each expression $\phi$ not involving P. If K(A, P) is consistent, then this extended theory must be inconsistent since each model of K(A, P) will falsify one of the added sentences, so by compactness there is a finite inconsistent subset. Then K(A, P) entails the disjunction of the negations of the added statements in this finite set.

If a disjunctive definition of P in K(A, P) exists, by the completeness of the predicate calculus it will be provable from K(A, P), so we can find it by enumerating the consequences of K(A, P) and looking for formulas of the form $\bigvee_i \forall x (P(x) \equiv \phi_i(x))$. But not only is the existence of a disjunctive definition undecidable, whether a disjunctive definition of a given size k exists is also undecidable, and consequently there is no algorithm for computing the smallest disjunctive definition. In particular, even if we believe that a disjunctive definition exists, there is no way to tell from K(A, P) how large any disjunctive definition must be except by finding some disjunctive definition and taking its size as an upper bound. Thus short of finding a small expression, we cannot be sure that the desired disjunctive definition does not have a million disjuncts. Moreover, for each k there are theories whose smallest disjunctive definition of P is of size k. Consider, for example, the axiom

$$\forall x (P(x) \equiv [x = 1]) \lor \ldots \lor \forall x (P(x) \equiv [x = k]).$$
(I suspect there are smaller theories with this property.) However, there are direct procedures for finding disjunctive definitions of specific sizes, procedures extending the interpolation-based procedure for finding explicit definitions. If we seek a definition of P in A(P) with k disjuncts, we assume ... when A is Horn in P. In such cases, the closed world assumption always preserves the consistency of A, and fits well with some procedures for automated deduction. (See [Reiter 1978].) For the case in which the closed world assumption produces an inconsistent extended theory, Minker [1982] has defined the generalized closed world assumption to extend A with all the ground cases (positive and negative) common to all the consistent completions of A with respect to P. Lipski [1977, 1983] has studied the notions of external and internal interpretations of databases.
The internal interpretation appears related to the closed world assumption, and admits characterizations in terms of topological boolean algebras and modal logics. It would be nice to know more about the relation between this internal interpretation and circumscription.
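The difference between Horn and non-Horn theories under the closed world assumption can be seen in a small propositional sketch. Everything here is an assumption made for illustration: ground atoms are treated as propositional letters, entailment is checked by truth tables, the two example theories are a single fact Block(b1) and the disjunction Block(b1) or Block(b2), and the names `models`, `entails`, and `closed_world` are hypothetical.

```python
from itertools import product


def models(atoms, axioms):
    """Return every truth assignment over `atoms` satisfying all axioms."""
    result = []
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(axiom(world) for axiom in axioms):
            result.append(world)
    return result


def entails(atoms, axioms, atom):
    """True iff `atom` holds in every model of the axioms."""
    return all(world[atom] for world in models(atoms, axioms))


def closed_world(atoms, axioms):
    """Extend the axioms with the negation of every atom they do not entail."""
    negated = [a for a in atoms if not entails(atoms, axioms, a)]
    extra = [lambda world, a=a: not world[a] for a in negated]
    return axioms + extra


ATOMS = ["Block(b1)", "Block(b2)"]

# Assumed Horn theory: the single fact Block(b1).
horn = [lambda w: w["Block(b1)"]]

# Assumed non-Horn theory: Block(b1) or Block(b2).
disjunctive = [lambda w: w["Block(b1)"] or w["Block(b2)"]]

horn_models = models(ATOMS, closed_world(ATOMS, horn))
disjunctive_models = models(ATOMS, closed_world(ATOMS, disjunctive))
print(len(horn_models))         # consistent: exactly one model remains
print(len(disjunctive_models))  # inconsistent: no model remains
```

For the Horn theory the extension is consistent and pins down a single model. For the disjunction neither atom is entailed, so both negations are added and the extended theory contradicts the original axiom.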

Conservation questions
If circumscription is to be routinely useful in artificial intelligence practice, then it is worth knowing how new axioms affect circumscriptively obtained results. Let $A' \supseteq A$ be an extended set of axioms. It is easy to see that A' may implicitly define P even if A does not. For instance, let A be the axioms of the disjunctive block example, and A' these axioms plus the conjunctive example axioms. Since the conjunctive axioms entail the disjunctive ones, the resulting theory has the same conclusions as the conjunctive example axioms alone, and these implicitly define the predicate Block. Conversely, A may implicitly define P but A' may not. For instance, let A be the conjunctive block axioms, and A' be these plus $Block(b_3) \lor Block(b_4)$. And of course, circumscription is non-monotonic in that K(A, P) need not entail K(A', P), and K(A', P) need not entail K(A, P).
Indeed, in many cases of interest, these sets will be inconsistent. For instance, augmenting the conjunctive block axioms with a third block b3 yields a theory inconsistent with the two-block circumscription. Unfortunately, conservation questions are largely unstudied. We take the present opportunity to ask several obvious ones.
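The loss of conclusions under extension can be illustrated on a finite domain. The following is a minimal sketch under assumptions: the conjunctive axioms are taken to be Block(b1) and Block(b2), the extension adds Block(b3) or Block(b4) as in the text, the domain is limited to the four named objects, and `minimal_extensions` is a hypothetical helper that enumerates subsets rather than applying the circumscription schema.

```python
from itertools import combinations

DOMAIN = ("b1", "b2", "b3", "b4")  # assumed finite domain


def minimal_extensions(axioms, domain=DOMAIN):
    """Return the subset-minimal extensions of Block satisfying all axioms."""
    candidates = [frozenset(c)
                  for size in range(len(domain) + 1)
                  for c in combinations(domain, size)]
    satisfying = [e for e in candidates if all(ax(e) for ax in axioms)]
    return [e for e in satisfying
            if not any(other < e for other in satisfying)]


# Assumed conjunctive axioms: Block(b1), Block(b2).
conjunctive = [lambda block: "b1" in block,
               lambda block: "b2" in block]

# Extension from the text: additionally Block(b3) or Block(b4).
extended = conjunctive + [lambda block: "b3" in block or "b4" in block]

before = minimal_extensions(conjunctive)
after = minimal_extensions(extended)

# "b3 is not a block" holds in every minimal extension before the new axiom,
# but no longer holds in every minimal extension afterwards.
b3_excluded_before = all("b3" not in e for e in before)
b3_excluded_after = all("b3" not in e for e in after)
print(len(before), b3_excluded_before)
print(len(after), b3_excluded_after)
```

Before the extension there is one minimal extension and b3 is excluded; afterwards there are two, so the predicate is no longer uniquely determined and the earlier negative conclusion about b3 is withdrawn.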

First, if A entails A', then clearly K(A, P) implicitly defines P iff K(A', P) does too, and moreover Th(K(A, P)) = Th(K(A', P)). In some cases, the same is true even though A does not entail A', for example, if A' is A plus the explicit definition of P in K(A, P). But when, precisely, does K(A, P) entail K(A', P)? If and only if $A' \subseteq \mathrm{Th}(K(A, P))$?

Language extensions
Even if implicit definability fails, there is sometimes still an explicit definition, but in an extended language, in which the extended explicit definition can rule out "non-standard" interpretations. For example, Gödel showed how truth in arithmetic may be "implicitly defined" in arithmetic, yet lack an explicit definition within arithmetic, while Tarski showed how a simple extension to the language of arithmetic permits an explicit definition of truth in arithmetic (but not truth in the extended language). There are many questions that may be interesting here. For instance, if K(A, P) fails to implicitly define P, when can some simple extension to L permit an explicit definition? And how can it be found? Is it practical to use this idea in the meta-theoretical systems popular in artificial intelligence? Is it possible to adapt the ideas of Kripke's [1975] theory of truth to implicit definitions of other predicates?

Conclusion
We have surveyed the motivations for circumscription and its connections with logical notions like implicit definability. I hope that some of the questions raised above will prove as interesting to the logician as some of the logical techniques may prove to the artificial intelligence researcher, for much work remains to be done. We recall the main sorts of open problems:
1. What are important cases in which K(A, P) produces explicit definitions or small disjunctive definitions? How large are these definitions relative to the size of A? Can these special cases be recognized, and are they of common importance (e.g. Horn databases)? What is the cost of finding these definitions given their existence?
2. What does circumscription do when it fails to implicitly define P? Can its consequences be characterized in some interesting way?
3. How should revision of circumscriptive conclusions be mechanized?
One important topic slighted above is the analog of circumscription for systems other than logical languages. For example, in the abstract, circumscription can be applied in all sorts of representational systems by formulating data-structures via inductive definitions and looking for least fixed points or least solutions. See [Aczel 1977] for a treatment of inductive definitions in general, and [Scott 1982] for a "propositional" treatment of data-structures.
Many have wondered about the relation between circumscription and non-monotonic logic. It seems fairly clear now that there is little relation. The preceding shows that circumscription is fundamentally a logical topic, the study of minimal solutions to axiomatized predicates. [Doyle 1982] shows that the reasoned assumptions appearing in reason maintenance systems and in non-monotonic logic are preferences of the agent concerning its own "state of mind," and that they comprise a fundamentally psychological or decision theoretic topic, the study of value and choice in the mental operations of the agent.
We close by noting that little is known about when and how to use circumscription in reasoning and decision-making. But that is not a problem about the logical nature or mechanization of circumscription, and we leave it to future artificial intelligence research.