STRICT LOWER AND UPPER BOUNDS ON ITERATIVE COMPUTATIONAL COMPLEXITY *

Publisher Summary

In this chapter, a non-asymptotic theory of iterative computational complexity is constructed with strict lower and upper bounds. Complexity is a measure of cost. The relevant costs depend on the model under analysis. The costs can be taken as units of time, number of comparisons, size of storage, or number of arithmetic operations. A number of different costs can be relevant to a model. The complexity of an algorithm, of a class of algorithms, or of a problem can be analyzed. The subject dealing with the analysis of a class of algorithms or of a problem is called computational complexity. Computational complexity comes in many flavors depending on the class of algorithms, the problem, and the costs. Three types of computational complexity are considered here. In each of these, the costs are taken as the arithmetic operations. Algebraic computational complexity deals with a problem and a class of algorithms that solve the problems at finite cost. The branch of complexity theory that deals with non-finite cost problems is called analytic computational complexity.

Computational complexity comes in many flavors depending on the class of algorithms, the problem, and the costs. We limit ourselves here to mentioning three types of computational complexity. In each of these the costs are taken as the arithmetic operations. Algebraic computational complexity deals with a problem and a class of algorithms which solve the problems at finite cost. Typically the problem belongs to a class of problems which is indexed by an integer n. Let comp(P_n) be the complexity of solving the nth problem in the class. We are interested in lower bounds L(P_n) and upper bounds U(P_n) on comp(P_n),

(1.1) L(P_n) ≤ comp(P_n) ≤ U(P_n).
The upper bounds are obtained by exhibiting an algorithm for solving P_n with complexity U(P_n). Lower bounds are obtained by theoretical considerations, and "non-trivial" lower bounds are difficult to obtain. For example, if P_n is the problem of multiplying two n by n matrices and if the cost of each arithmetic operation is taken as unity, then O(n^2) ≤ comp(P_n) ≤ O(n^p), p = lg 7.
(We use lg to represent log_2.) Borodin and Munro [75] survey the state of knowledge in algebraic complexity. Exact solutions of "most" problems in science, engineering, and applied mathematics cannot be obtained with finite cost even if infinite-precision arithmetic is assumed. Indeed, linear problems and the evaluation of rational functions, which can be solved at finite cost, are the exception. Even when the problem can be solved rationally, we may choose to solve it by iteration. An example is the solution of large sparse linear systems. Typically, non-linear problems cannot be solved at finite cost.
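The exponent p = lg 7 in the matrix multiplication bound above comes from Strassen's algorithm, which trades eight recursive half-size products for seven. A minimal sketch (assuming n is a power of two; matrices are plain Python lists of lists, and the helper names are ours):

```python
# Sketch of Strassen's algorithm, the source of the O(n^lg 7) upper bound.
# Assumes n is a power of two; matrices are lists of rows.

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split both matrices into quadrants.
    A11 = [row[:h] for row in A[:h]]; A12 = [row[h:] for row in A[:h]]
    A21 = [row[:h] for row in A[h:]]; A22 = [row[h:] for row in A[h:]]
    B11 = [row[:h] for row in B[:h]]; B12 = [row[h:] for row in B[:h]]
    B21 = [row[:h] for row in B[h:]]; B22 = [row[h:] for row in B[h:]]
    # Seven recursive products instead of eight.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot
```

The recursion satisfies T(n) = 7T(n/2) + O(n^2), which resolves to O(n^{lg 7}) ≈ O(n^{2.81}).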
We call the branch of complexity theory that deals with non-finite cost problems analytic computational complexity.
Often the algorithms are iterative and we then refer to iterative computational complexity.
In this paper we propose a new methodology for iterative computational complexity. Our aim is to create at least a partial synthesis between iterative complexity and other types of complexity.
A basic quantity in iterative complexity has been the efficiency index of an algorithm or class of algorithms. In this paper we introduce a new quantity, the complexity index, which is the reciprocal of the efficiency index. The complexity index is directly proportional to the complexity of an algorithm or class of algorithms. We show under what conditions the complexity index is a good measure of complexity.
Our methodology is non-asymptotic in the number of iterations.
Earlier analyses of complexity applied only as the number of iterations went to infinity, which is of course not realistic in practice.
We summarize the contents of this paper. In Section 2 we analyze a simplified model of the errors of an iterative process and show that complexity is the product of two factors, the complexity index and the error coefficient function. Bounds on the error coefficient function are derived in the following Section and used to derive rigorous conditions for comparing the complexity of two different algorithms. In Section 4 we show how the results of the simple model can be applied to a realistic model of one-point iteration. Lower and upper bounds on the complexity index for several important classes of iterations appear in Section 5.
In a short concluding Section we state the extensions and generalizations to be reported in future papers.

BASIC CONCEPTS
We analyze algorithms for the following problem. Let f be a non-linear real or complex scalar function with a simple zero a.
Let x_0 be given and let an algorithm φ generate a sequence of approximations x_1,...,x_k to α. We terminate the algorithm when x_k is a sufficiently good approximation to α. This will be made precise below.
The appropriate setting for this investigation is to consider f as a non-linear operator on a Banach space of finite or infinite dimension. Since many of the basic ideas can be illustrated when f is a non-linear scalar function we shall assume throughout this paper that this holds. We must remark however that some of the most interesting and important results deal with the dependence of complexity on problem dimension and we do not deal with that dependence here.
Let e_i > 0 represent some measure of the error of x_i. Assume that the e_i satisfy the error equation

(2.1) e_i = A_i e_{i-1}^p, p ≥ 1, i = 1,2,...,k.
We call p the non-asymptotic order and A_i the error coefficient. We require 0 < A_L ≤ A_i ≤ A_U < ∞ for all values of e_i, including the possibility that e_0 be arbitrarily small. Then p is unique. Many iterations satisfy the model given by (2.1). In Section 6 we mention extensions to this model.

THE CONSTANT ERROR COEFFICIENT MODEL
We simplify the model of (2.1) and show what kind of results may then be obtained. In Section 4 we return to the analysis of (2.1). Let A_i = A be independent of i, so that e_i = A e_{i-1}^p. We call this the constant error coefficient model, while (2.1) is the variable error coefficient model.
We consider first the case p > 1. It is easy to verify that e_i = A^{(p^i - 1)/(p-1)} e_0^{p^i}. Choose ε, 0 < ε < 1, and let k be the smallest index for which e_k ≤ ε e_0. Define ε̂ ≤ ε so that

(2.5) e_k = ε̂ e_0.
ε is a basic parameter which measures the increase in precision to be obtained in the iteration. We choose ε̂ to avoid ceiling and floor functions later in this paper. It is convenient to assume ε ≤ 2^{-2} (we use this in Theorem 3.1) but this is non-restrictive in practice.
From the constant coefficient model and (2.5) it follows that p^k = 1 + t/lg w, where t = lg(1/ε̂) and w = (A^{1/(p-1)} e_0)^{-1}. This is independent of the logarithm base, but it is convenient to take all logarithms to base 2. Then, if e_i is the relative error, t measures the number of bits to be gained in the iteration.
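The step count of the constant coefficient model can be checked by direct simulation. The sketch below assumes the model e_i = A·e_{i-1}^p with p > 1 and the identity p^k = 1 + t/lg w, where t = lg(e_0/e_k) and w = (A^{1/(p-1)}·e_0)^{-1}; the particular values of A, p, e_0, ε are illustrative only:

```python
# Numerical check of the constant error coefficient model e_i = A*e_{i-1}^p:
# iterate until e_k <= eps*e_0, then verify that the step count satisfies
# p^k = 1 + t/lg(w), with t = lg(e_0/e_k) and w = (A^(1/(p-1)) * e_0)^(-1).
import math

def steps_needed(A, p, e0, eps):
    """Smallest k with e_k <= eps*e0 under e_i = A*e_{i-1}**p."""
    e, k = e0, 0
    while e > eps * e0:
        e = A * e ** p
        k += 1
    return k, e

A, p, e0, eps = 0.5, 2.0, 0.1, 2.0 ** -32   # illustrative values
k, ek = steps_needed(A, p, e0, eps)
t = math.log2(e0 / ek)                      # bits actually gained
w = 1.0 / (A ** (1.0 / (p - 1)) * e0)       # w > 1 is required for convergence
# the identity is exact for this model (up to floating-point rounding):
assert abs(p ** k - (1.0 + t / math.log2(w))) < 1e-6
```

Note that k comes out of a while-loop, with no ceilings or floors: stopping at the first e_k at or below the target is exactly what makes ε̂ (the achieved ratio e_k/e_0) the natural parameter.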
We denote the complexity of iteration i by c_i. In this paper we assume c_i = c is independent of i. We defer a discussion of the estimation of c until Section 5. The important case of variable cost will be considered in a future paper. We define the complexity of the algorithm by

(2.9) comp = ck.
Combining the step count with (2.9) yields

(2.10) comp = z g(w), z = c/lg p, g(w) = lg(1 + t/lg w).

We call g the error coefficient function. Equation (2.10) will be fundamental in our further analysis.
We have decomposed complexity into the product of two factors. The complexity index, which is independent of both the error coefficient and the starting error, is relatively easy to compute for any given algorithm. (However, lower bounds on the complexity index for classes of algorithms require upper bounds on order, which is a difficult problem solved only for special cases (Kung and Traub [73], Meersman [75], and Woźniakowski [75b]).) We shall show, in a sense to be made precise in the next section, that the error coefficient is insensitive for a large portion of its domain and that complexity is determined primarily by the complexity index.
We shall also show there are cases where complexity is determined primarily by the error coefficient function. When a suitable uniform bound on w holds for a class of iterations Φ, we shall say that Φ is normal.
An example of a normal class of iterations may be found in Woźniakowski [75b]. To simplify notation we shall henceforth write w_φ as w whether or not we are dealing with a normal class. Now, g(w) is a monotonically decreasing function and lim_{w→1+} g(w) = ∞, lim_{w→∞} g(w) = 0.
To study the size of g(w) we somewhat arbitrarily divide the range of w, given by (3.1), into three sub-ranges.
On the middle sub-range, 2 ≤ w ≤ 2^{t/(p-1)}, we have lg p ≤ g(w) ≤ lg(1+t). To get some feel for the length of these sub-ranges, observe that if e_i represents relative error, then in single-precision computation on a "typical" digital computer we might take ε̂ = 2^{-32}. Then t = 32 and, if p = 2, 2^{t/(p-1)} = 2^{32}.
From the bounds on the error coefficient function and (2.10) we immediately obtain the following bounds on complexity.
As w approaches unity, then for ε̂ fixed, comp ~ -z lglg w. In this case the effect of the error coefficient A and the initial error e_0 cannot be neglected.
Note that for any p > 1 the complexity of an iteration can be greater than if p = 1 (see (2.12)) provided w is sufficiently close to unity.
For any w ≥ 2, complexity is bounded from above by z lg(1+t) and is therefore essentially independent of the error coefficient A and the initial error e_0. For w ≥ 2, complexity is insensitive to w and we need only crude bounds on w.
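Numerically, the flatness of complexity for w ≥ 2 is easy to see. The sketch below takes the error coefficient function in the form g(w) = lg(1 + t/lg w) for w > 1 (so that g(2) = lg(1+t), matching the bound above) and evaluates it for the single-precision value t = 32:

```python
# The error coefficient function g(w) = lg(1 + t/lg w), w > 1, for t = 32:
# g blows up as w -> 1+, but varies slowly once w >= 2.
import math

def g(w, t=32):
    return math.log2(1.0 + t / math.log2(w))

for w in (1.01, 1.1, 2.0, 16.0, 2.0 ** 32):
    print(f"w = {w:>12g}   g(w) = {g(w):6.2f}")
```

A crude estimate of w (say, within a factor of two) changes g(w) by about one bit or less once w ≥ 2, whereas near w = 1 the dependence on w is severe; g(2) = lg 33 ≈ 5.04.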

We have taken w = 2 as one of our endpoints for convenience, but this is of course arbitrary. Any value of w sufficiently far from unity will do. If w = 2^{1/v}, then g(w) = lg(1+vt).
Then the effects of the nearness of w to unity and of ε̂ to zero are equal if v = t, that is, if w = 2^{1/t}. For this choice of w, comp = z lg(1+t^2) ~ 2z lg t = 2z lglg(1/ε̂). We have chosen the sub-ranges of w so that the endpoints are simple. We could also choose values of w that make the complexity formula simple.

We discuss some of the implications of this theorem. As t → ∞, comp_1/comp_2 → z_1/z_2 for any fixed values of w_1, w_2. The ratio z_1/z_2 has been the usual way that iterations have been compared (see Traub [64, Appendix C], where efficiency indices are used). Theorem 3.2 shows that z_1/z_2 can be a very poor measure of comp_1/comp_2; see for example (3.7).
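The contrast between the ratio of complexity indices and the ratio of complexities can be made concrete. The sketch below assumes complexity has the form comp = z·lg(1 + t/lg w); the index values z_1, z_2 and the values w_1, w_2 are hypothetical:

```python
# As t grows, comp1/comp2 -> z1/z2 for fixed w1, w2; but for moderate t and
# w1 near unity, the ratio of complexity indices can be badly misleading.
import math

def comp(z, w, t):
    return z * math.log2(1.0 + t / math.log2(w))

z1, z2 = 1.0, 2.0          # hypothetical complexity indices: z1/z2 = 0.5
w1, w2 = 1.001, 4.0        # iteration 1 starts much closer to divergence
for t in (32, 1024, 2 ** 20):
    r = comp(z1, w1, t) / comp(z2, w2, t)
    print(t, round(r, 3))
```

With z_1/z_2 = 0.5 but w_1 = 1.001, the actual ratio at t = 32 exceeds 1.5: the iteration with the smaller index is in fact the more expensive one, and the limiting ratio z_1/z_2 is approached only for enormous t.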

THE VARIABLE ERROR COEFFICIENT MODEL
We turn to the variable error coefficient model. A complete analysis of this model is beyond the scope of this paper. Here we confine ourselves to the very simple assumption

(4.2) A_L ≤ A_i ≤ A_U, i = 1,...,k.

The value of c is discussed in Section 5. Note that a sufficient condition for convergence is w_U > 1, but with only this condition complexity could be extremely large.

EXAMPLE 4.2. We seek to calculate a^{1/2}, that is, to solve f(x) = x^2 - a. Let a = 2^m X, m even, 1 ≤ X < 2. Then a^{1/2} = 2^{m/2} X^{1/2}, 1 ≤ X^{1/2} < 2^{1/2}. We use Newton-Raphson iteration, x_i = (x_{i-1} + a/x_{i-1})/2. Then A_i = 1/(2x_{i-1}). If x_0 > a^{1/2}, then a^{1/2} < x_i ≤ x_0 for all i, and hence 1/(2x_0) ≤ A_i < 2^{-m/2-1}, i = 1,...,k. Let x_0 = 2^{m/2+1}. Then w_U ≥ 2 and comp ≤ c lg(1+t). To derive a lower bound on complexity one must make an assumption about the closest machine-representable number to a^{1/2}. We do not pursue that here.
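Example 4.2 is easy to run. For f(x) = x^2 - a, the Newton error satisfies e_i = A_i e_{i-1}^2 exactly, with variable coefficient A_i = 1/(2x_{i-1}), so the coefficients can be collected and checked against the bounds claimed above; the concrete a and x_0 below are our own illustrative choices:

```python
# Newton-Raphson for f(x) = x^2 - a.  The error obeys e_i = A_i * e_{i-1}^2
# exactly, with variable coefficient A_i = 1/(2*x_{i-1}).
import math

def newton_sqrt(a, x0, eps):
    root = math.sqrt(a)
    e0 = abs(x0 - root)
    x, coeffs = x0, []
    while abs(x - root) > eps * e0:
        coeffs.append(1.0 / (2.0 * x))   # A_i, computed before the step
        x = 0.5 * (x + a / x)            # Newton step
    return x, coeffs

a = 2.0 ** 10 * 1.5        # a = 2^m * X with m = 10 (even), X = 1.5
x0 = 2.0 ** 6              # x0 = 2^(m/2+1) > sqrt(a)
x, coeffs = newton_sqrt(a, x0, 2.0 ** -32)
# Starting above the root, Newton stays above it, so every A_i is trapped
# between 1/(2*x0) and 1/(2*sqrt(a)):
assert all(1 / (2 * x0) <= A <= 1 / (2 * math.sqrt(a)) for A in coeffs)
```

Starting above the root keeps every iterate in (a^{1/2}, x_0], so the A_i lie in a narrow interval and the constant coefficient analysis applies with A_U = 1/(2a^{1/2}).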

BOUNDS ON THE COMPLEXITY INDEX
We have shown that provided w is not too close to unity, then for fixed c, complexity depends only on the complexity index z. In this section we turn our attention to the complexity index.
Recall that z = c/lg p. We begin our analysis of z by considering the cost per step, c. We distinguish between two kinds of problems.
We say a problem is explicit if the formula for f is given explicitly. For example, the calculation of a^{1/2} by solving f = x^2 - a is an explicit problem. The complexity of explicit problems has been studied by Paterson [72] and Kung.

We assume the same set of functionals is used at each step of the iteration. The set of functionals used by an iteration algorithm φ is called the information set 𝔑. Woźniakowski [75a] gives many examples of 𝔑. Let the information complexity u = u(f,𝔑) be the cost of evaluating the functionals in the information set 𝔑 and let the combinatory complexity d = d(φ) be the cost of combining them (see Kung and Traub [74b]). We assume that each arithmetic operation costs unity and denote the number of operations for one evaluation of f^{(j)} by c(f^{(j)}).

It would only be reasonable to use an iteration of very high order for very small ε̂. Observe that z(ψ_n) is a very "flat" function of n.

We can obtain a lower bound on the complexity of the class of multipoint iterations by using an upper bound on the maximal order of any multipoint iteration and a lower bound on the combinatory complexity. Kung and Traub [74a] conjecture that any iteration without memory which uses n pieces of information per step has order p ≤ 2^{n-1}.
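To illustrate the complexity index z = c/lg p, one can compare Newton iteration (order 2, information set {f, f'}) with the secant iteration (order (1+√5)/2, one new f evaluation per step). The per-step costs below are hypothetical placeholder values, not measurements:

```python
# Complexity index z = c / lg p for two classic iterations; the per-step
# costs cf (for f), cfp (for f'), and d (combinatory work) are hypothetical.
import math

def z(cost_per_step, order):
    return cost_per_step / math.log2(order)

cf, cfp, d = 10.0, 10.0, 2.0       # assumed illustrative costs
newton = z(cf + cfp + d, 2.0)                     # p = 2
secant = z(cf + d, (1.0 + math.sqrt(5.0)) / 2.0)  # p = golden ratio
print(f"z(Newton) = {newton:.2f}, z(secant) = {secant:.2f}")
```

With these costs the secant iteration has the smaller complexity index despite its lower order; the comparison flips when c(f') is much cheaper than c(f).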

SUMMARY AND EXTENSIONS TO THE MODEL
We have constructed a non-asymptotic theory of iterative computational complexity with strict lower and upper bounds.
In order to make the complexity ideas as accessible as possible we have limited ourselves to scalar non-linear problems.
The natural setting for this work is a Banach space of finite or infinite dimension and we shall do our analysis in this setting in a future paper. We have focussed on the simplified model e_i = A e_{i-1}^p. More realistic models include some of the following features:

1. e_i = A_i e_{i-1}^p under various assumptions on the structure of A_i.

2. e_i = A_i e_{i-1}^{p_1} ··· e_{i-m}^{p_m}. This is the appropriate model for iterations with memory.
4. Include round-off error. Then e_i will not converge to zero.
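Feature 2, the memory model, is exhibited by the secant iteration, whose error near a simple zero behaves like e_i ≈ A·e_{i-1}·e_{i-2}; its order is the positive root of p^2 = p + 1, the golden ratio ≈ 1.618. The sketch below (our illustration, using high-precision decimals so that the asymptotic regime is visible) estimates the order from successive errors for f(x) = x^2 - 2:

```python
# Secant iteration on f(x) = x^2 - 2: an iteration with memory whose error
# model is e_i ~ A * e_{i-1} * e_{i-2}.  The ratio ln(e_i)/ln(e_{i-1})
# should approach the order p = (1 + sqrt(5))/2 ~ 1.618.
from decimal import Decimal, getcontext

getcontext().prec = 900                  # enough digits to see the order
root = Decimal(2).sqrt()

def secant_errors(x0, x1, steps):
    errs = [abs(x0 - root), abs(x1 - root)]
    for _ in range(steps):
        f0, f1 = x0 * x0 - 2, x1 * x1 - 2
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        errs.append(abs(x1 - root))
    return errs

errs = secant_errors(Decimal(1), Decimal(2), 14)
ratios = [errs[i].ln() / errs[i - 1].ln() for i in range(2, len(errs))]
print(float(ratios[-1]))                 # close to the golden ratio
```

Early ratios fluctuate, but the later ones settle near 1.618, a direct empirical reading of the non-asymptotic order for a memory iteration.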
We plan to analyze these more realistic models in the future.
We also intend to investigate additional basic properties of complexity. Our various results will be used to analyze the complexity of important problems in science and engineering.