MICROPRODUCTION FUNCTIONS WITH UNIQUE COEFFICIENTS AND ERRORS: A RECONSIDERATION AND RESPECIFICATION

Estimated microproduction functions confront two major problems—those of (1) unknown functional forms and (2) the measurement of capital independent of the distribution of output among the factors of production. The latter problem has emerged unresolved from the earlier Cambridge capital controversy. In the presence of these two problems, all specifications of microproduction functions have involved nonunique coefficients and error terms. We provide a method of deriving time-varying coefficients that produces unique coefficients and error terms. Specifically, we respecify the microproduction function in such a way that its coefficients are the sums of (i) the appropriate partial derivatives and (ii) exact representations of excluded-variable biases. By decomposing the total coefficients, we obtain the unique coefficients and a unique error term. Our treatment of heterogeneous capital is not subject to the criticisms of that concept that emerged during the Cambridge controversy.


INTRODUCTION
The use of aggregate production functions in macroeconomics, including in growth models, dynamic stochastic general equilibrium (DSGE) models [see Woodford (2003)], and the related literature, such as real business cycle (RBC) models [see Kydland and Prescott (1982)] and new Keynesian models [see Clarida et al. (1999)], remains pervasive despite the extremely demanding conditions that need to be satisfied to ensure the existence of those functions. 1 As pointed out by Fisher (2005, p. 490), "even under constant returns, the conditions for aggregation are so very stringent as to make the existence of aggregate production functions a non-event." However, as Fisher went on to argue, these conditions do not apply to the estimation of microproduction functions. Nevertheless, the estimation of microproduction functions also presents considerable difficulties. Estimates of the marginal products of the factor inputs of microproduction functions depend on the correct specification of the underlying microproduction functions themselves. Yet previous empirical work has not addressed the critical problem of unknown functional form. Simply put, the precise functional form of any microproduction function is impossible to verify, but if the problem of unknown functional form is not addressed in a satisfactory way, the coefficients and the error terms of microproduction functions will not be unique. By uniqueness of the coefficients and the error terms we mean that these are invariant to changes in the relationship between the independent variables and the error term of any equation, with no changes in the equation itself and its dependent and independent variables. 2 We argue that previous empirical work on microproduction functions either has not addressed the unknown functional form problem in a satisfactory way or has overlooked it altogether, leading to coefficients and error terms that are not unique.
A second problem confronts the specification of microproduction functions, and it too leads to nonunique coefficients and error terms. A key argument contained in every production function is capital input. Indeed, the quantity of capital is closely associated with the two broad theories of distribution and relative prices that have been developed in the literature: the classical and neoclassical theories. Classical economists argued that the determinants of labor remuneration could be studied separately from those of nonlabor remuneration. The latter remuneration was considered to be what was left over (that is, the surpluses of outputs) after wages were accounted for; hence, this theory of distribution is called the "surplus" method. The neoclassical theory uses supply and demand functions based on the "substitutability" of factors of production. 3 This substitutability would in turn result from the existence of alternative methods of production of each commodity and the choice made by consumers among different goods [see Garegnani (1990, p. 1)]. The issue of the validity of these respective theories formed the core of the Cambridge-Cambridge capital controversy (hereafter, Cambridge controversy), which raged from the 1940s through the 1970s. An important, but unresolved, issue in the Cambridge controversy concerns the difficulty of measuring capital independent of the distribution of output among factors of production; for a detailed description of how this difficulty of capital measurement arises in the two theories, see Garegnani (1990). As Garegnani (1990) showed, the issue of how to measure capital needs to be resolved before the marginal products of the factors of production are derived under the neoclassical set-up. However, the measurement of capital itself has been a highly contested issue in the literature, even with respect to microproduction functions.
As indicated in the preceding, two key problems surround the estimation of microproduction functions. First, the problem of unknown functional form needs to be dealt with. Second, the problem of measuring capital independent of the distribution of output among the factors of production also needs to be addressed. In what follows, we account for both of those problems. Therefore, we are able to specify microproduction functions with unique coefficients and error terms. The problem of unknown functional form is dealt with by rewriting a given microproduction function with unknown functional form as an exact linear-in-variables but nonlinear-in-coefficients equation in which the slope coefficients are equated to the respective partial derivatives (with unknown functional forms) of the microproduction function. The problem of measuring capital independent of the distribution of output among the factors of production is dealt with by treating capital as an immeasurable factor. Effectively, we treat capital as an omitted variable. By doing so, we are able to derive omitted-variable biases and extract these biases from the total coefficient of labor to derive a bias-free estimate of the marginal product of labor. Specifically, we address the problems stemming from heterogeneous capital goods by treating different kinds of capital goods as separate immeasurable factors of production. 4 We argue that a respecification of the neoclassical microproduction function so that specification biases are separated from the correct partial derivatives of that function can take us a considerable distance in resolving some of the issues raised in the detailed debate leading to the Cambridge controversy.
Models used in this paper do not assume a given technology. Different technology bases of time series of cross-sectional data are maintained in this paper without explaining away any "contrary" empirical tests by advocates on either side of the debate. According to Sraffa (1961, pp. 305-306), "Theoretical measures require absolute precision. Any imperfections . . . were not merely upsetting, but knocked down the whole theoretical basis. . . . The work of J. B. Clark, Böhm-Bawerk and others was intended to produce pure definitions of capital, as required by their theories. If we found contradictions . . . these pointed to defects in the theory." To follow Sraffa, we first use certain measures with absolute precision to build a theoretical model and then we account for the appropriate measurement errors. As we show, solving the unknown functional-form problem is a straightforward exercise. Because our knowledge of real-world economic relations is incomplete and our measurements of economic variables are imperfect, we cannot avoid excluded-regressor and measurement-error biases. We first account for them with absolute precision and then propose a method of estimating the biases; the method also removes these biases from the estimates of quantities affected by those biases [see Tavlas (2005, 2007)].
The remainder of the paper is divided into three sections. Section 2 develops the methods of resolving the problems of spurious correlations, unknown functional forms, excluded regressors, and measurement errors associated with the neoclassical microproduction functions. These methods help us remove unavoidable specification biases from the neoclassical microproduction functions. Section 3 develops methods for obtaining consistent estimates of the several features of bias-corrected neoclassical microproduction functions. Section 4 concludes.

Microproduction Function
This function can be written as
$$y^*_{cit} = f_{cit}(x^*_{1cit}, \ldots, x^*_{L_{it}-1,cit}), \qquad (1)$$
where $c$ indexes different commodities produced by U.S. firms using different technologies, $i$ indexes firms, $t$ indexes time, $y^*_{cit}$ is the $i$th firm's output, some of $x^*_{1cit}, \ldots, x^*_{L_{it}-1,cit}$ are the inputs used by the $i$th firm to produce $y^*_{cit}$, and the remaining $x^*$'s represent "all relevant preexisting conditions." (We explain these conditions later.) Let $t = 1, \ldots, T$ and let $i = 1, 2, \ldots, n_t$, indicating the variable number of firms owing to entry and exit. A firm's attempt to maximize output by assigning the use of its various factors to different techniques of production results in the dependence of maximized output only on the total amounts of such factors. This dependence can be written as a functional relationship of the form given in (1). This line of argument has been used by Felipe and Fisher (2003, p. 210) to show that the existence of a production function at the firm level is ensured. However, differentiability will not be ensured.
In what follows, the subscript $c$ is suppressed for simplicity of presentation and the symbol $f_{it}(\cdot)$ is shorthand for the function on the right-hand side of (1); the function is defined only for nonnegative values of the input and output levels. The functional form of (1) is unknown. For $t = 1, \ldots, T$ and $i = 1, \ldots, n_t$, data on $(y^*_{it}, x^*_{1it}, \ldots, x^*_{K-1,it})$ are available, but they may contain measurement errors. To represent these errors, we use the notation $y_{it} = y^*_{it} + \nu^*_{0it}$, $x_{jit} = x^*_{jit} + \nu^*_{jit}$, $j = 1, \ldots, K-1$, where the variables without an asterisk are the observables, the variables with an asterisk are the unobservable true values, and the $\nu^*_{jit}$'s are measurement and other errors. Data on $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ are assumed not to be available. Consequently, these variables play the role of excluded regressors. We call $(x_{1it}, \ldots, x_{K-1,it})$ the included regressors.
We can treat (1) as the real-world relationship between inputs and outputs if it is expressed in terms of unique coefficients and error term without misspecifying its true functional form. (See Section 2.10; that section also defines the term "uniqueness.")

Economic Regularity Conditions
The question that arises here is whether a measure derived from model (1) is economically meaningful if the model does not satisfy certain regularity conditions (such as being increasing and concave in inputs) stated in Diewert (1971, pp. 484-485, Condition I). In other words, do regularity conditions (e.g., that the function be real-valued, nondecreasing in inputs, right continuous, and quasiconcave in inputs) need to be satisfied for any measure derived from (1) to be economically meaningful?
To answer this question, let us add a nonunique error term to Diewert's production function. Doing so gives
$$y_{it} = \psi_{it}(x_{1it}, \ldots, x_{K-1,it}) + \varepsilon_{it}, \qquad (1a)$$
where $\psi_{it}(x_{1it}, \ldots, x_{K-1,it}) = h_1\big(\sum_{j=1}^{K-1}\sum_{k=1}^{K-1} a_{jk}\, x_{jit}^{1/2} x_{kit}^{1/2}\big)$, $h_1(\cdot)$ is a continuous, monotonically increasing function that tends to plus infinity and has $h_1(0) = 0$, and $\varepsilon_{it}$ is the error term arbitrarily added to $h_1(\cdot)$ to account for the net effect of excluded regressors on $y_{it}$. For now, let us ignore measurement errors. To estimate model (1a) by nonlinear least squares, we need to make six assumptions, as provided in Greene (2008, pp. 286-287). These assumptions, however, are incorrect because $\varepsilon_{it}$ is equal to [expression missing in the source], which is not distributed with mean zero independent of $h_1(\cdot)$. Even though $h_1(\cdot)$ satisfies the regularity conditions in Diewert, it does not do so when combined with $\varepsilon_{it}$. This situation arises because $\varepsilon_{it}$ is arbitrary. Indeed, any assumption we might make about $\varepsilon_{it}$ is arbitrary, yielding arbitrary results. In other words, when $y_{it}$ equals $h_1(\cdot)$, it is a quadratic function of the square roots of the included regressors and it satisfies Diewert's regularity conditions for $K - 1$ regressors.
What about omitted regressors? These are dealt with by adding $\varepsilon_{it}$. There is, however, no reason that $\varepsilon_{it}$ should satisfy the regularity conditions. Also, if $\varepsilon_{it}$ has a large variance, then $\psi_{it}$ is a crude approximation to $y_{it}$.
To deal with these problems, let us change the argument a bit. Delete $\varepsilon_{it}$ from (1a) and include $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ in $h_1(\cdot)$. Doing so gives
$$y^*_{it} = h_2\Big(\sum_{j=1}^{L_{it}-1}\sum_{k=1}^{L_{it}-1} a_{jk}\,(x^*_{jit})^{1/2}(x^*_{kit})^{1/2}\Big). \qquad (1b)$$
Diewert (1971) showed that this function satisfies the regularity conditions. Note that (1b) does not contain any excluded regressors: it contains a complete set of arguments, while satisfying Diewert's regularity conditions. Nevertheless, we do not know the functional form of (1b). To deal with the problem of unknown functional form, we rewrite (1b) as
$$y^*_{it} = \beta^*_{0it} + \sum_{\ell=1}^{L_{it}-1} x^*_{\ell it}\,\beta^*_{\ell it}, \qquad (1c)$$
where $\beta^*_{\ell it} = \partial h_2(\cdot)/\partial x^*_{\ell it}$ if $x^*_{\ell it}$ is continuous and $= \Delta h_2(\cdot)/\Delta x^*_{\ell it}$ with the right sign if $x^*_{\ell it}$ is discrete and the values of all the regressors of (1b) other than $x^*_{\ell it}$ are held constant, $\Delta$ is the first difference operator, and $\beta^*_{0it} = y^*_{it} - \sum_{\ell=1}^{L_{it}-1} x^*_{\ell it}\,\beta^*_{\ell it}$. We develop in the following a method of estimating model (1c). Thus, estimation of (1b) is not difficult. The problem is with the chosen functional form, the quadratic form in the square roots of inputs. This functional form is restrictive. Also, assuming the constancy of the coefficients $a$ in (1b) can amount to misspecifying the true functional form of (1). If we replace $a$ in (1b) with the time-varying coefficients ($a_{it}$), then (1b) may not satisfy the regularity conditions. In any case, the functional form of (1b) with or without constant coefficients is more restrictive than that of (1), and hence (1b) can have an incorrect functional form. A measure derived from a production function with the regularity conditions incorporated is not economically meaningful if it has an incorrect functional form. What (1b) implies is that there is a possibility that any production function forced to satisfy the regularity conditions has an incorrect functional form. When the functional form of the real-world production function is unknown, we cannot be sure that any production function satisfying the regularity conditions has the correct functional form and is unique. We call this the first problem with models (1a) and (1b).
If the real-world production functions always satisfy the regularity conditions, then there is no need to impose them on (1). In this case, we have no problem.
In the econometrics literature, quantile approaches, nonparametric approaches, polynomials, etc., are used to estimate (1) whenever its functional form is unknown. A problem with these approaches is that they approximate (1) with models that have nonunique coefficients and error terms.

Solution for the Unknown Functional Form Problem
To briefly summarize, we specified an equation that does not have any omitted variables, namely (1). Yet we do not observe all of the variables in (1). In the process of eliminating from (1) those variables for which we do not have observations, we do not want to misspecify its true functional form, nor do we want to ignore omitted-variable biases. We now proceed to solve the unknown functional form problem while ensuring that we avoid any misspecifications. As mentioned, our aim is to derive unique coefficients and a unique error term.
Our starting point is based on a fundamental theorem that states that one way to represent any real-world relation, whatever its true functional form, is by a linear-in-variables model with time-varying coefficients. Thus we write
$$y^*_{it} = \sum_{\ell=0}^{L_{it}-1} x^*_{\ell it}\,\alpha^*_{\ell it} \qquad (2)$$
with $x^*_{0it} = 1\ \forall i, t$. We can go from (1) to (2) in many different ways. Not all of them assign unique coefficients to (2). The coefficients of (2) can be unique if we go from (1) to (2) by defining
$$\alpha^*_{\ell it} = \partial y^*_{it}/\partial x^*_{\ell it} \ \text{if } x^*_{\ell it} \text{ is continuous}, \qquad \alpha^*_{\ell it} = \Delta y^*_{it}/\Delta x^*_{\ell it} \ \text{with the right sign if } x^*_{\ell it} \text{ is discrete}, \qquad (3a)$$
$$\alpha^*_{0it} = y^*_{it} - \sum_{\ell=1}^{L_{it}-1} x^*_{\ell it}\,\alpha^*_{\ell it}, \qquad (3b)$$
where $\ell = 1, \ldots, L_{it}-1$, all the regressors of (1) other than $x^*_{\ell it}$ are kept constant, $\Delta$ is the first difference operator, and $\alpha^*_{0it}$ is the intercept. The coefficients on the continuous regressors of (2) are unique, as we show in the following. Equation (2) is true for all possible production functions, and so with this equation we do not have to make any specific assumption about the functional form of the real-world microproduction function in (1). We adopt the definitions (3a) and (3b) because they do not misspecify the true functional form of (1). The advantage of (2) and definitions (3a) and (3b) is that they are correct even when the true functional form of $f_{it}(\cdot)$ is unknown. 5 Note that the partial derivative of $y^*_{it}$ with respect to $x^*_{jit}$ implied by model (1a) or (1b) is not the same as $\alpha^*_{jit}$ in (3a), even when (1a)'s error term has mean zero and is independent of its regressors. We call this the second problem with models (1a) and (1b). Previously, researchers attempted to estimate the partial derivatives of $h_1(\cdot)$ using (1a), but not the coefficients (3a) on the $x^*_{jit}$'s in (2). It can easily be verified that with definitions (3a) and (3b), (2) is exact. This equation is nonlinear if its coefficients are nonlinear, even though it is linear in variables. By construction, the coefficients of (2) differ among individual firms at a point in time and through time. 6 Therefore, (2) hypothesizes different capital intensities across firms and differs from Samuelson's (1962) model based on the assumption of equal factor proportions in all industries.
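The exactness claim can be checked numerically. The following is a minimal sketch, not the authors' estimation method: it assumes a hypothetical nonlinear function `f` with two continuous inputs, takes each slope to be the partial derivative of output with respect to that input (approximated by central differences), and takes the intercept to be output minus the sum of inputs times slopes. All names and numbers are illustrative assumptions.

```python
import math

def f(x1, x2):
    # Hypothetical nonlinear "true" production function (illustrative assumption).
    return 2.0 * x1 ** 0.3 * x2 ** 0.6 + 0.5 * math.sqrt(x1 * x2)

def partial(func, args, j, h=1e-6):
    # Central-difference approximation to the partial derivative w.r.t. argument j,
    # holding the other arguments constant.
    up, dn = list(args), list(args)
    up[j] += h
    dn[j] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)

def tvc_coefficients(func, args):
    # Slopes: partial derivatives at this observation.
    # Intercept: output minus the sum of inputs times slopes.
    y = func(*args)
    slopes = [partial(func, args, j) for j in range(len(args))]
    intercept = y - sum(x * a for x, a in zip(args, slopes))
    return intercept, slopes

def linear_in_variables(intercept, slopes, args):
    # Right-hand side of the linear-in-variables representation.
    return intercept + sum(x * a for x, a in zip(args, slopes))

# Three hypothetical firm-period observations.
points = [(1.0, 2.0), (4.0, 0.5), (3.0, 3.0)]
results = []
for p in points:
    a0, a = tvc_coefficients(f, p)
    results.append((p, a0, a, f(*p), linear_in_variables(a0, a, p)))

for p, a0, a, y, y_hat in results:
    print(p, round(a0, 4), [round(v, 4) for v in a], abs(y - y_hat) < 1e-9)
```

The representation reproduces output exactly at every observation because the intercept absorbs whatever the slope terms leave over, while the slopes themselves change from one observation to the next. That is the sense in which the equation is linear in variables but nonlinear in coefficients.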

Elimination of Spurious Correlations
To avoid spurious correlations, it is necessary to control for all relevant preexisting conditions. Consider (2) again. We are interested in its partial derivatives. But these partial derivatives can be nonzero when the correlation between $y^*_{it}$ and $x^*_{jit}$ is spurious. To avoid this result, we need to control for all relevant preexisting conditions. How do we do this? These preconditions are all contained in (1). To control for those preexisting conditions, we proceed as follows. It is known that the regressors, $x^*_{1it}, \ldots, x^*_{K-1,it}$, are the genuine causes of $y^*_{it}$ if the statistical correlation between $y^*_{it}$ and each of $x^*_{1it}, \ldots, x^*_{K-1,it}$ does not disappear when we control for all relevant preexisting conditions [see Skyrms (1988, p. 59)]. A formal statement of this condition, in terms of whether a variable and $y^*_{it}$ are dependent or independent given the context of (1), is given in Pearl (2000, p. 55, Definition 2.7.1). 7 Thus, the dependence between $x^*_{jit}$ with $j \geq L_{it}$ and $y^*_{it}$ is eliminated when we control for all relevant preexisting conditions by including $(x^*_{1it}, \ldots, x^*_{L_{it}-1,it})$ in (1). Typically, production decisions are made by simultaneously determining inputs and outputs, in which case the assertion that the inputs have a genuine causal influence on the outputs violates the principle that causes should precede their effects in time. 8 In such cases, the conditions prevailing before the production decisions are made constitute the preexisting conditions. But the problem here is that in the context of (1), we do not have the complete list of all the relevant preexisting conditions and do not know how to represent them. The included variables, $x^*_{1it}, \ldots, x^*_{K-1,it}$, may not be adequate to control for all the relevant preexisting conditions. Therefore, we assume that some of $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ in (1) represent all the relevant preexisting conditions. To allow this assumption to be true, we keep the variables $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ and their number $(L_{it} - K)$ unspecified. We also allow this number to depend on $i$ and $t$. This practice differs from econometricians' usual practice of including nonunique error terms in their models to represent the net effect of excluded regressors on the respective dependent variables, as explained earlier in our discussion of (1a).

Measurement Problems
There are major measurement problems with capital. Usually, the term capital refers to a collection of heterogeneous goods. All these goods (i) are the produced outputs of some firms, (ii) are used as inputs for further production, and (iii) depreciate over time. If the quantity of a factor cannot be defined before the determination of the factor shares in an output (or distribution) and relative prices, then it is not possible to use a theory that allows the determination of distribution and relative prices only after the methods of production and the factor endowments of the economy are known [see Garegnani (1990, pp. 9-10)]. Consider, for example, a case where a measure of the amount of capital varies with the rate of profits. There is a problem here if, as in neoclassical theory, this rate of profits is itself assumed to be determined by the amount of capital being used. Thus, there is circularity in the neoclassical economists' argument. This problem underlies the importance of measuring capital independently of distribution. This requirement gives rise to major measurement problems, as shown by Garegnani (1990). In econometric practice, capital is usually measured in value terms by treating capital as a single factor of production. Unfortunately, the value of capital goods is not invariant to changes in distribution [see Garegnani (1990, p. 10)]. This problem is inevitable if capital is treated as a single factor of production.
Let us now see whether we have any problem if each kind of capital good is used as a separate input in production function (1). Garegnani (1990, p. 11) showed that this treatment takes the physical composition of the initial capital endowment as given. This datum contradicts the condition of an equilibrium physical composition of the capital stock, expressed under free competition in terms of equality in the effective rates of return over the supply prices of the capital goods. Such contradictions do not arise when excluded-variable and measurement-error biases are correctly treated using the models with the correct functional forms and unique coefficients and error terms as representations of the real-world relations.

Sraffa's Method
A measure of the amount of capital is produced by reducing all machines to dated labor. For example, a machine manufactured in period t can be treated as the labor and commodity inputs used in its production multiplied by the rate of profits; 9 and these commodity inputs can be further reduced to the labor inputs that produced them in t -1 plus the commodity inputs multiplied by the rate of profits; and so on, until the nonlabor component is reduced to a negligible amount. Obviously, this measure of a machine still includes the rate of profits. The effect of this measure is to reverse the direction of causation implied by neoclassical economics, according to which an increase in the amount of machines employed causes a fall in the rate of profits because of diminishing returns. The nature of Sraffa's method is that a change in the rate of profits would change the measured amount of machines in highly nonlinear ways. Therefore, Sraffa's measure of capital would change with income distribution even if capital did not change physically, so the production function involving Sraffa's measure of capital is not technical.
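To make the dependence on the rate of profits concrete, here is a minimal sketch of a dated-labor valuation. It assumes a simple compounding rule in which labor applied k periods before the machine is completed is weighted by (1 + r)^k; this rule, the labor profile, and the wage are illustrative assumptions rather than a formula taken from the text.

```python
def dated_labor_value(labor_by_age, wage, r):
    # labor_by_age[k] = hours of labor applied k periods before completion.
    # Assumed rule: each dated quantity is compounded at the rate of profits r.
    return sum(wage * hours * (1.0 + r) ** k for k, hours in enumerate(labor_by_age))

# Hypothetical physical profile of one machine: it does not change below.
machine = [10.0, 6.0, 3.0, 1.0]

# The same physical machine valued at different rates of profits.
values = {r: dated_labor_value(machine, 1.0, r) for r in (0.0, 0.05, 0.10, 0.20)}

for r, v in values.items():
    print(f"r = {r:.2f}  measured capital = {v:.4f}")
```

The physical profile `machine` is held fixed throughout, yet the measured amount rises with r as a polynomial in (1 + r), which illustrates why such a measure is not independent of distribution.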

Definitions
Our way out of these difficulties is to treat the services provided by different kinds of capital as different factors of production that cannot be measured independently of distribution. We assume that some of $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ in (2) represent these immeasurable factors. The dependent variable of model (1) is the output produced by the $i$th firm in period $t$. The variable $x^*_{1it}$ is labor reduced to homogeneous units by stating it in terms of hours of the same skill and intensity. Usually, data on these homogeneous units of labor are not available. In this case, we use $x_{1it}$, a measure of labor in man-hours, as a proxy for $x^*_{1it} = x_{1it} - \nu^*_{1it}$. The remaining arguments of the function in (1) represent other factors of production. With these definitions, we avoid some of Sraffa's criticisms (discussed later) by not letting the microproduction function in (1) relate the flow variable, output, to the stock variable, capital.
PROPOSITION 1. If $f_{it}(\cdot)$ in (1) is nonlinear, then each coefficient of (2) can be a function of all of the regressors $x^*_{1it}, \ldots, x^*_{L_{it}-1,it}$. The proof is immediate.
The consequences of this proposition are that (i) the coefficients of (2) are functions of its regressors and (ii) the technically different microproduction functions in (2) may not be additively separable in factors.
PROPOSITION 2. Suppose that $f_{it}(\cdot)$ exists. Then the coefficient on a regressor of (2) defined in (3a) is unique and exact unless the regressor is discrete, in which case the coefficient is a discrete approximation.

Reswitching and Its Implications
Let a technique be a particular physical capital/labor ratio, as in Cohen and Harcourt (2003, p. 202). Reswitching occurs when one technique of production is cost-minimizing at both low and high rates of profit, while another technique is cost-minimizing at intermediate rates. 10 Robinson (1953-1954) and Sraffa (1960) introduced the phenomena of reswitching to deny a simple (monotonic) nonincreasing relationship between capital intensity and the rate of profits, a relationship the neoclassical production and distribution (NPD) theories rely on. Samuelson (1966) provided examples of the phenomena of reswitching.
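A small numerical sketch shows what such a pattern looks like. The two dated-labor profiles below resemble the textbook example commonly attributed to Samuelson (1966), but the specific numbers, the unit wage, and the compounding rule (1 + r)^k should all be treated as illustrative assumptions.

```python
def cost(dated_labor, r, wage=1.0):
    # dated_labor maps "periods before output" -> units of labor applied then.
    # Assumed rule: each dated quantity is compounded at the rate of profits r.
    return sum(wage * units * (1.0 + r) ** k for k, units in dated_labor.items())

# Hypothetical techniques producing one unit of the same good.
TECH_A = {2: 7.0}          # 7 units of labor, two periods before output
TECH_B = {3: 2.0, 1: 6.0}  # 2 units three periods before, 6 units one period before

def cheaper(r, tol=1e-12):
    # Returns the cost-minimizing technique at rate of profits r.
    cost_a, cost_b = cost(TECH_A, r), cost(TECH_B, r)
    if abs(cost_a - cost_b) < tol:
        return "tie"
    return "A" if cost_a < cost_b else "B"

# Low, intermediate, and high rates of profits.
choices = {r: cheaper(r) for r in (0.25, 0.75, 1.50)}

# Cost differences at the two candidate switch points.
gap_at_50 = cost(TECH_A, 0.5) - cost(TECH_B, 0.5)
gap_at_100 = cost(TECH_A, 1.0) - cost(TECH_B, 1.0)

print(choices, gap_at_50, gap_at_100)
```

Under these assumptions technique A minimizes cost at the low and the high rate, technique B at the intermediate one, and the two costs coincide at rates of 50% and 100%, so the ranking of techniques is not monotonic in the rate of profits.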

PROPOSITION 3. Equation (2) can produce the phenomena of reswitching.
Proof. Equation (1) is about the direct production of each of the goods produced in the economy. In Section 2.7, we have clarified how different kinds of capital goods are represented in (1). The premises of (1) are consistent with the presence of many heterogeneous capital goods and various capital intensities across firms. Even though we have not shown how to derive the wage-profit rate frontiers from model (1), its nonlinearities imply that such frontiers are nonlinear and may cross over each other more than once, which means that for a low rate of profits one may choose a capital-intensive technique. As we consider higher and higher values of the rate of profits, the technique with lower capital intensity may be chosen, and for still higher rates of profits the original technique of higher capital intensity may be chosen again [see Petri (2004, p. 220)]. Under these circumstances, the demand curve for capital is not always downward sloping. The conditions under which reswitching occurs are given in Petri (2004, Chap. 6). These conditions involve the variables that are not introduced in this paper. Unless we do so, we cannot provide a complete proof of Proposition 3. However, it is obvious that if any of Petri's conditions imply misspecification of the functional forms of the real-world production functions, then the phenomena of reswitching do not conform to such functions. For example, in one of Petri's (2004, p. 207, [6.1]) equations used to show the occurrence of the phenomena of reswitching, a matrix of technical coefficients of nonlabor inputs and a vector of technical coefficients of labor inputs in the production of goods appear. If the phenomena of reswitching are such that they can only occur when these technical coefficients imply the correct functional forms of the underlying production functions, then the production function in (1) is compatible with reswitching. This is because the production function in (1) has the correct functional form.
The result of choosing a capital-intensive technique for both low and high rates of profit runs contrary to the neoclassical theory of value and income distribution. This is Sraffa's and Robinson's criticism. We have introduced the definition of reswitching to show that this criticism does not apply to (1). Felipe and Fisher (2003, p. 220) pointed out that the phenomena of reswitching only appear paradoxical to anyone who believes that aggregate factors are related to aggregate output satisfying the properties that one expects of microproduction function (1).

PROPOSITION 4. If the usual practice of treating the net effect of excluded regressors on $y^*_{it}$ as the error term is followed, then model (2) will have nonunique coefficients on its included regressors and a nonunique error term. The excluded regressors $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ are also not unique.
Proof. For simplicity, set $K = 2$ and $L_{it} = 3$ so that there is only one included regressor and one excluded regressor in (2). Treat the effect $x^*_{2it}\alpha^*_{2it}$ of the excluded regressor $x^*_{2it}$ on $y^*_{it}$ as the error term, denoted by $u_{it}$. The operations of adding and subtracting the term $x^*_{1it}\alpha^*_{2it}$ on the right-hand side of equation (2) do not change the equation, its dependent variable, or its included regressor, but they do change the coefficient on the included regressor, the excluded regressor, and the error term. Thus, under the conditions of Proposition 4, the coefficients on the included regressors, the excluded regressors, and the error term in (2) are not unique; for further discussion, see Swamy and Hall (2012).
Proposition 4 shows that the coefficients and error term of model (1a) and the coefficients of (1b) are not unique. The same is true of the coefficients and error terms of DSGE and RBC models. 11 For consistent estimation of these models, the assumption that their included regressors are independent of their respective error terms is needed. The consequence of the nonuniqueness of their error terms is that this assumption can be shown to be false by making a change in one or more of their coefficients and making the offsetting change in their respective error terms in the constant coefficients case (see note 2). A model with this property cannot be a real-world relationship or a correctly specified model. We call this the third problem with models (1a) and (1b). Any assumption about nonunique error terms is arbitrary and gives arbitrary results.
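The add-and-subtract step in the proof can be illustrated with numbers. The sketch below assumes constant coefficients and made-up data purely for readability; it builds two representations of the same dependent variable with the same included regressor but different slopes and different "error terms".

```python
# Hypothetical constant coefficients and data (illustrative assumptions only).
alpha0, alpha1, alpha2 = 1.0, 0.8, 0.5
x1 = [1.0, 2.0, 3.0, 4.0]   # included regressor
x2 = [2.0, 1.5, 4.0, 3.0]   # excluded regressor

# The dependent variable generated by the full relation.
y = [alpha0 + alpha1 * a + alpha2 * b for a, b in zip(x1, x2)]

# Representation A: the effect of the excluded regressor is the error term.
slope_a = alpha1
error_a = [alpha2 * b for b in x2]
y_a = [alpha0 + slope_a * a + e for a, e in zip(x1, error_a)]

# Representation B: add and subtract x1 * alpha2 on the right-hand side.
slope_b = alpha1 + alpha2
error_b = [alpha2 * (b - a) for a, b in zip(x1, x2)]
y_b = [alpha0 + slope_b * a + e for a, e in zip(x1, error_b)]

def close(u, v, tol=1e-12):
    # Element-wise comparison of two equally long lists.
    return all(abs(p - q) <= tol for p, q in zip(u, v))

same_equation = close(y, y_a) and close(y, y_b)
print(same_equation, slope_a, slope_b, error_a, error_b)
```

Both representations reproduce `y` exactly, yet the slope on the included regressor is 0.8 in one and 1.3 in the other. Nothing in the data distinguishes them, so any assumption imposed on one "error term" is contradicted by the other.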

A Model with Unique Coefficients and Error Term
Uniqueness: The coefficients and error term of a model are unique if it is impossible to change them without changing the model equation, its dependent variable, and included regressors and nonunique otherwise. This is an alternative form of the next to the last sentence of the opening paragraph of the Introduction.
There can be functional relationships among the regressors of (2). These relationships between each of the excluded regressors $(x^*_{Kit}, \ldots, x^*_{L_{it}-1,it})$ and the included regressors $(x^*_{1it}, \ldots, x^*_{K-1,it})$ can be written as
$$x^*_{git} = \lambda^*_{0git} + \sum_{j=1}^{K-1} x^*_{jit}\,\lambda^*_{jgit}, \qquad g = K, \ldots, L_{it}-1, \qquad (4)$$
where $\lambda^*_{jgit} = \partial x^*_{git}/\partial x^*_{jit}$, keeping the included regressors other than $x^*_{jit}$ constant, if $x^*_{jit}$ is continuous, $\lambda^*_{jgit} = \Delta x^*_{git}/\Delta x^*_{jit}$ with the right sign if $x^*_{jit}$ is discrete, and $\lambda^*_{0git} = x^*_{git} - \sum_{j=1}^{K-1} x^*_{jit}\,\lambda^*_{jgit}$. These definitions do not misspecify the true functional form of equation (4), which is exact. Here again we are relying on the theorem that states that any nonlinear equation with unknown functional form can be represented by a time-varying coefficient model. We do not necessarily observe excluded variables, or even know what they are, but we do know that if the relationships between these variables and the included ones exist, then they can be written in the form of (4). Without using these auxiliary regressions in (4) it is not possible to derive a model with unique coefficients and error term, as shown by Swamy and Hall (2012).

PROPOSITION 5. Each coefficient of equation (4) can be a function of all of its regressors.
The proof is analogous to that of Proposition 1.
Substituting the expression on the right-hand side of the equality sign in (4) for x * git in (2) gives Note that the intercept λ * 0git of equation (4) is the portion of the excluded variable x * git remaining after the effect K−1 j =1 x * jit λ * jgit of the included regressors on it has been removed from it. It can be seen from equation (5) that in conjunction with the included regressors, x * 1it , . . . , x * K−1,it , these portions, λ * 0git , g = K, . . . , L it − 1, of excluded regressors are at least sufficient to determine y * it exactly. This proves that the variables λ * 0git , g = K, . . . , L it − 1, form "sufficient sets" of excluded regressors [see Pratt and Schlaifer (1988)]. It should also be noted that the coefficients α * 0it , α * 1it , . . . , α * L it −1,it , as well as the coefficients of equation (4), have the correct functional forms. Therefore, the second term L it −1 g=K λ * 0git α * git on the right-hand side of the equality sign in (5) is a function of sufficient sets of excluded regressors with the correct functional form. Pratt and Schlaifer (1988, p. 34) show that it is correct to take this function as the error term of model (5). This error term is different from the log of A t shown in Petri (2004, p. 330). The error term of model (1a) is not a function of "sufficient sets" of excluded regressors with the correct functional form and hence the model is misspecified, giving rise to a further problem.
The second term, $\sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}$, of the coefficient of $x^*_{jit}$ in (5) arises as a result of excluding $x^*_{git}$, $g = K, \ldots, L_{it}-1$, of (2) from (5). It is for this reason that the term $\sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}$ is called "excluded-variables bias." When we are given that $x^*_{git}$, $g = K, \ldots, L_{it}-1$, are the regressors excluded from (5), this bias is unique. 12 Model (1a) does not account for the excluded-variables bias and has nonunique coefficients and error terms. We call this the fifth problem with this model.
This proposition can be false if the $\alpha^*_{it}$'s in (5) are replaced by the $\beta^*_{it}$'s in (1c) because the functional form of (1b) is restrictive.
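The exactness of the substitution of (4) into (2) can be checked numerically. The following is a minimal sketch for a single (i, t) observation, assuming that (2) is linear in its time-varying coefficients, as the substitution described above implies. All numbers, the counts of included and excluded regressors, and the variable names are hypothetical and chosen only for illustration.

```python
import random

random.seed(0)  # reproducible hypothetical numbers

N_INCLUDED = 2  # included regressors x*_1, ..., x*_{K-1} (hypothetical count)
N_EXCLUDED = 3  # excluded regressors x*_K, ..., x*_{L-1} (hypothetical count)


def draw(n, lo, hi):
    """Return n hypothetical values drawn uniformly from [lo, hi]."""
    return [random.uniform(lo, hi) for _ in range(n)]


alpha0 = 0.5
alpha_incl = draw(N_INCLUDED, 0.1, 1.0)   # coefficients of included regressors
alpha_excl = draw(N_EXCLUDED, 0.1, 1.0)   # coefficients of excluded regressors
x_incl = draw(N_INCLUDED, 1.0, 5.0)
x_excl = draw(N_EXCLUDED, 1.0, 5.0)

# Arbitrary slopes lam[j][g] of the auxiliary relations. The intercepts
# lam0[g] are defined as the remainder, so each relation holds exactly.
lam = [draw(N_EXCLUDED, -1.0, 1.0) for _ in range(N_INCLUDED)]
lam0 = [
    x_excl[g] - sum(x_incl[j] * lam[j][g] for j in range(N_INCLUDED))
    for g in range(N_EXCLUDED)
]

# Output written with all regressors, included and excluded.
y_full = (
    alpha0
    + sum(x * a for x, a in zip(x_incl, alpha_incl))
    + sum(x * a for x, a in zip(x_excl, alpha_excl))
)

# Output written with included regressors only: the error term collects the
# lam0 portions, and each included coefficient carries its bias term.
error_term = sum(lam0[g] * alpha_excl[g] for g in range(N_EXCLUDED))
bias = [
    sum(lam[j][g] * alpha_excl[g] for g in range(N_EXCLUDED))
    for j in range(N_INCLUDED)
]
y_decomposed = alpha0 + error_term + sum(
    x_incl[j] * (alpha_incl[j] + bias[j]) for j in range(N_INCLUDED)
)

print(abs(y_full - y_decomposed) < 1e-9)
```

The two expressions agree for any choice of the slopes, which illustrates why the decomposition itself imposes no restriction: only the interpretation of the intercept portions and the bias terms changes with the slopes chosen.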

Accounting for Measurement Errors
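The set of observable counterparts ($x_{1it}, \ldots, x_{K-1,it}$) of the unobservable regressors ($x^*_{1it}, \ldots, x^*_{K-1,it}$) is divided into two subsets, denoted by $S_1$ and $S_2$. All the observable regressors of (5) that take the value zero with probability zero are included in $S_1$ and the remaining observable regressors of (5) that take the value zero with positive probability are included in $S_2$. 13 To set up model (5) for estimation, we write it as

$$y_{it} = \gamma_{0it} + \sum_{j=1}^{K-1} x_{jit}\gamma_{jit}, \tag{6}$$

where the dependent variable and the regressors are observable and the coefficients are not the same as the first $K$ coefficients of (2), because we have excluded some regressors from (2) and introduced measurement errors into the included variables to obtain (6). We now display the exact relationships among the coefficients of (2), (4), (5), and (6). These relationships are

$$\gamma_{0it} = \alpha^*_{0it} + \sum_{g=K}^{L_{it}-1}\lambda^*_{0git}\alpha^*_{git} + \nu^*_{0it} - \sum_{j\in S_2}\nu^*_{jit}\Bigl(\alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}\Bigr) \tag{7}$$

and

$$\gamma_{jit} = \begin{cases} \alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git} - \Bigl(\alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}\Bigr)\dfrac{\nu^*_{jit}}{x_{jit}} & \text{if } x_{jit}\in S_1, \\[2ex] \alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git} & \text{if } x_{jit}\in S_2. \end{cases} \tag{8}$$

With the exception of the three components containing measurement errors, we have already interpreted all the components of the coefficients of model (6). The interpretations of these components are as follows: (i) $\nu^*_{0it}$ is the measurement error in the dependent variable $y^*_{it}$; (ii) $-\sum_{j\in S_2}\nu^*_{jit}\bigl(\alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}\bigr)$ is the sum of measurement-error bias components of the coefficients of $x_{jit}\in S_2$; and (iii) $-\bigl(\alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}\bigr)\nu^*_{jit}/x_{jit}$ is the measurement-error bias component of the coefficient of $x_{jit}\in S_1$.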
To recapitulate, first, following Sraffa (1961), we have used certain unobservable measures with absolute precision to build the theoretical model in (5) with the correct functional form and unique coefficients and error term. Next, we have inserted the appropriate measurement errors in the right places in (5) to obtain (6). Because our knowledge of the real economic relations is not complete and our observations on their variables are not perfect, we cannot avoid excluded-variable and measurement-error biases. Yet we have accounted for them with precision. These results imply that model (6) is correctly specified. In the next section, we use the appropriate method to estimate these specification biases and remove them from the estimates of the coefficients of (6).
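The way measurement errors are absorbed into the coefficients of (6) can be illustrated with a small numerical sketch. It assumes additive measurement errors (observed value = true value + error), which is an assumption made here for illustration. The regressor names, the numbers, and the treatment of the two subsets (a proportional adjustment for a regressor in S1 and a shift of the intercept for a regressor in S2) are hypothetical.

```python
# Total coefficients of the included regressors in the error-free model,
# i.e. partial derivative plus excluded-variable bias (hypothetical values).
total_coef = {"labor": 0.7, "r_and_d": 0.3}
intercept_true = 1.2  # intercept plus error term of the error-free model

x_true = {"labor": 4.0, "r_and_d": 0.0}   # unobserved true regressors
nu = {"labor": 0.25, "r_and_d": 0.1}      # additive measurement errors (assumed)
nu_y = -0.05                              # measurement error in output (assumed)

S1 = ["labor"]     # observed value is zero with probability zero
S2 = ["r_and_d"]   # observed value can be zero with positive probability

x_obs = {k: x_true[k] + nu[k] for k in x_true}
y_true = intercept_true + sum(total_coef[k] * x_true[k] for k in total_coef)
y_obs = y_true + nu_y

# Coefficients of the observable model.
gamma = {}
for k in S1:
    # Division by the observed regressor is safe because it is never zero.
    gamma[k] = (1.0 - nu[k] / x_obs[k]) * total_coef[k]
for k in S2:
    # No division here: the measurement-error term moves to the intercept.
    gamma[k] = total_coef[k]
gamma0 = intercept_true + nu_y - sum(nu[k] * total_coef[k] for k in S2)

y_rebuilt = gamma0 + sum(x_obs[k] * gamma[k] for k in x_obs)
print(abs(y_rebuilt - y_obs) < 1e-12)
```

Under these assumptions the observable model reproduces observed output exactly, so the measurement errors are accounted for rather than ignored.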

ESTIMATION OF MODEL (6)
So far, we have not made any parametric assumptions. We have displayed in (2) a representation of the real-world microproduction function, which is shown to be valid, and displayed in (6) its form that can be estimated. We then showed exactly what the relationship between (2) and (6) is. To a large extent, we have covered most of the difficulties raised in the capital controversy, which essentially resulted from a theoretical debate, and shown how a microproduction function may be specified to overcome these difficulties. We now go a little beyond the controversy to discuss how such a function can be consistently estimated.
Simultaneous estimation of $\gamma_{jit}$ and its components is the only method that can provide good estimates of the components of $\gamma_{jit}$ in (7) and (8). To perform this estimation, we need to make a parametric assumption regarding the relationship between the coefficients of (6) and their observable drivers. Here, it should be noted that any method of decomposing $\gamma_{jit}$ will not give good estimates of its components unless the correlations between the regressors and the $\gamma_{jit}$'s of (6) are taken into account. To take such correlations into account, we assume the following.
Assumption I. For $j = 0, 1, \ldots, K-1$,

$$\gamma_{jit} = \sum_{h=0}^{p-1} z_{hit}\pi_{jh} + \varepsilon_{jit}, \tag{9}$$

where $z_{0it} = 1$ for all $i$ and $t$, and the $z_{hit}$ are called "the coefficient drivers." It is assumed that conditional on these coefficient drivers, the $\varepsilon_{jit}$'s are distributed with mean zero and are serially and contemporaneously correlated, as in Swamy et al. (2010).
Assumption II. The regressors of model (6) are conditionally independent of their coefficients given the coefficient drivers.
Assumption III. For each $j$, the $p$ coefficient drivers in (9) are grouped into three sets, denoted by $A_{1jit}$, $A_{2jit}$, and $A_{3jit}$, such that for $j = 0$, $\sum_{h\in A_{10it}} z_{hit}\pi_{0h}$, $\sum_{h\in A_{20it}} z_{hit}\pi_{0h}$, and $\sum_{h\in A_{30it}} z_{hit}\pi_{0h} + \varepsilon_{0it} + \sum_{j\in S_2}\bigl(\sum_{h\in A_{3jit}} z_{hit}\pi_{jh} + \varepsilon_{jit}\bigr)$ have the same sign, the same magnitude, and the same cross-sectional and temporal movements as $\alpha^*_{0it}$, $\sum_{g=K}^{L_{it}-1}\lambda^*_{0git}\alpha^*_{git}$, and $\nu^*_{0it} - \sum_{j\in S_2}\nu^*_{jit}\bigl(\alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}\bigr)$, respectively; for $j \in S_1$, $\sum_{h\in A_{1jit}} z_{hit}\pi_{jh}$, $\sum_{h\in A_{2jit}} z_{hit}\pi_{jh}$, and $\sum_{h\in A_{3jit}} z_{hit}\pi_{jh} + \varepsilon_{jit}$ have the same sign, the same magnitude, and the same cross-sectional and temporal movements as $\alpha^*_{jit}$, $\sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}$, and $-\bigl(\alpha^*_{jit} + \sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}\bigr)(\nu^*_{jit}/x_{jit})$, respectively; for $j \in S_2$, $\sum_{h\in A_{1jit}} z_{hit}\pi_{jh}$ and $\sum_{h\in A_{2jit}} z_{hit}\pi_{jh}$ have the same sign, the same magnitude, and the same cross-sectional and temporal movements as $\alpha^*_{jit}$ and $\sum_{g=K}^{L_{it}-1}\lambda^*_{jgit}\alpha^*_{git}$, respectively, during estimation and forecasting periods.
This assumption requires that the number $p$ be much larger than 3. Assumption I says that each coefficient of (6), essentially consisting of three components, is a linear function of a set of observable drivers. For each $(j, i, t)$, Assumption III specifies three groups of drivers and establishes a connection between the components of the coefficient and the three groups of drivers. These groupings can be different for different coefficients and can vary across $i$ and over time. The three groups chosen for a coefficient on a nonconstant regressor divide its cross-sectional and time variations into three types, one type coming from the true nonlinearity in the corresponding coefficient of (2) and the other two types coming from excluded regressors and measurement errors, respectively. This assumption having been made, it is then possible to remove the bias movements from the total variations in the coefficients of (6) and attempt to get back to the first $K$ coefficients in (2). Computational details of separating the estimates of the $\alpha^*_{jit}$'s from those of the rest of the components in (8) are provided in Swamy et al. (in press).
Substituting the expression on the right-hand side of the equality sign in (9) for $\gamma_{jit}$ in (6) gives

$$y_{it} = \sum_{h=0}^{p-1} z_{hit}\pi_{0h} + \varepsilon_{0it} + \sum_{j=1}^{K-1} x_{jit}\Bigl(\sum_{h=0}^{p-1} z_{hit}\pi_{jh} + \varepsilon_{jit}\Bigr). \tag{10}$$

Swamy et al. (2010) show that the estimators of the coefficients and the predictors of the errors of model (10) given by an iteratively rescaled generalized least squares (IRSGLS) method have desirable sampling properties. 14 Under Assumption III, the coefficients and error terms of model (10) have the correct interpretations. This model gives good fits and does not give perfect fits or overfits to data on its variables. The "Shaikh critique" is reproduced in Petri (2004, pp. 330-332). Even though Shaikh and Petri (2004, pp. 324-340) criticized aggregate production functions, some of their criticisms can also be applied to microproduction functions. Such criticisms do not apply to the model in (5)-(10). Comparing the production functions Shaikh criticized with the model in (5)-(10) shows that his critique is aimed at the real production-function man who uses production functions that have incorrect functional forms and nonunique coefficients and error terms with no realistic meanings. The model in (5)-(10) is not that type of production function.

Elimination of Excluded-Variable and Measurement-Error Biases
Under Assumptions I-III, the components of the coefficients of model (6) that are devoid of incorrect functional-form, excluded-variable, and measurement-error (specification) biases are

$$\alpha^*_{jit} \approx \sum_{h\in A_{1jit}} z_{hit}\pi_{jh}, \tag{11}$$

where the symbol $\approx$ means "approximately equal to." This approximation arises when it is not possible to satisfy Assumption III exactly. An estimate of $\alpha^*_{jit}$, denoted by $\hat{\alpha}^*_{jit}$, is given by $\sum_{h\in A_{1jit}} z_{hit}\hat{\pi}_{jh}$, where $\hat{\pi}_{jh}$ is the IRSGLS estimate of $\pi_{jh}$.
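The arithmetic of separating the bias-free component from a fitted coefficient can be sketched as follows. The driver values, the estimates, and the assignment of drivers to the three groups are all hypothetical. The estimates are simply taken as given rather than computed by IRSGLS, and the predicted error of the coefficient equation is left out for brevity.

```python
# Hypothetical driver values for one (j, i, t); z[0] = 1 is the constant driver.
z = [1.0, 0.8, 1.5, -0.4, 2.0, 0.3]

# Hypothetical estimates of the driver coefficients (taken as given here).
pi_hat = [0.40, 0.10, -0.05, 0.20, 0.03, -0.10]

# Hypothetical grouping of driver indices into the three sets of Assumption III.
groups = {
    "A1": [0, 1],  # drivers tracking the bias-free component
    "A2": [2, 3],  # drivers tracking the excluded-variable bias
    "A3": [4, 5],  # drivers tracking the measurement-error bias
}


def component(indices):
    """Sum of driver value times estimated coefficient over one group."""
    return sum(z[h] * pi_hat[h] for h in indices)


alpha_hat = component(groups["A1"])
excluded_variable_bias = component(groups["A2"])
measurement_error_bias = component(groups["A3"])

# Fitted total coefficient (without the predicted error of the coefficient).
gamma_fitted = component(range(len(z)))

print(round(alpha_hat, 6), round(gamma_fitted, 6))
```

The three components add up to the fitted coefficient by construction, so the quality of the bias-free estimate rests entirely on how well the grouping of drivers satisfies Assumption III.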
The marginal product of a factor: Let $x^*_{j'it}$ be any one of the inputs $x^*_{jit}$, $j = 1, \ldots, K-1$. Then the marginal product ($MP_{j'it}$) of $x^*_{j'it}$ is the rate of change of its total product with respect to variations of its quantity holding the values of all the regressors $x^*_{\ell it}$, $\ell = 1, \ldots, L_{it}-1$, other than $x^*_{j'it}$ constant. It follows from Assumptions I-III that this marginal product without specification biases is equal to $\alpha^*_{j'it}$, which is approximately equal to $\sum_{h\in A_{1j'it}} z_{hit}\pi_{j'h}$:

$$MP_{j'it} = \frac{\partial y^*_{it}}{\partial x^*_{j'it}} = \alpha^*_{j'it} \approx \sum_{h\in A_{1j'it}} z_{hit}\pi_{j'h}, \tag{12}$$

where substitutability between the inputs is assumed. This assumption may not hold, as we show in the following. Let $\widehat{MP}_{j'it}$ denote an estimate of $MP_{j'it}$ obtained by replacing the $\pi$'s in (12) with their IRSGLS estimates. In (12), we can get the exact partial derivative if the correct functional form of $f_{it}(\cdot)$ in (1) is known. The situations where this condition holds exactly are rarely, if ever, obtained. The marginal product in (12) of an input is estimated under the assumption of substitutability between the inputs.
The output elasticity of a factor is

$$\frac{\partial \log y^*_{it}}{\partial \log x^*_{j'it}} = \frac{x^*_{j'it}}{y^*_{it}}\, MP_{j'it}, \tag{13}$$

an approximate estimate of which is $(x_{j'it}/y_{it})\,\widehat{MP}_{j'it}$.
The rate of technical substitution (RTS): Let $x^*_{j'it}$ and $x^*_{j''it}$ be any two of the inputs $x^*_{jit}$, $j = 1, \ldots, L_{it}-1$. Let the partial derivatives $\alpha^*_{j'it}$ and $\alpha^*_{j''it}$ of $f_{it}(\cdot)$ in (1) with respect to $x^*_{j'it}$ and $x^*_{j''it}$ be denoted by $MP_{j'it}$ and $MP_{j''it}$, respectively. Then the slope of the tangent at a point on the isoquant is the rate at which $x^*_{j'it}$ must be substituted for $x^*_{j''it}$ in order to maintain the corresponding output level. In absolute value, this slope is

$$RTS_{j'j''it} = -\frac{dx^*_{j''it}}{dx^*_{j'it}} = \frac{MP_{j'it}}{MP_{j''it}}. \tag{14}$$

An estimate of $RTS_{j'j''it}$ is $\widehat{MP}_{j'it}/\widehat{MP}_{j''it}$. We have already pointed out that the assumption of substitutability between the inputs involved in (14) may not hold. We discuss this point further in the following.
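Once estimates of the marginal products are available, the output elasticity and the RTS are simple ratios. The following sketch uses hypothetical function names and hypothetical numbers. It takes the marginal-product estimates as given and says nothing about how they were obtained.

```python
def output_elasticity(x, y, mp_hat):
    """Approximate output elasticity: (input / output) * estimated marginal product."""
    if y == 0:
        raise ValueError("output must be nonzero")
    return (x / y) * mp_hat


def rate_of_technical_substitution(mp_hat_first, mp_hat_second):
    """Ratio of two estimated marginal products; undefined if the second is zero."""
    if mp_hat_second == 0:
        raise ValueError("second marginal product must be nonzero")
    return mp_hat_first / mp_hat_second


# Hypothetical observed input, observed output, and marginal-product estimates.
elasticity = output_elasticity(x=10.0, y=50.0, mp_hat=2.0)
rts = rate_of_technical_substitution(mp_hat_first=2.0, mp_hat_second=0.5)
print(elasticity, rts)
```

Both ratios inherit whatever approximation error is present in the marginal-product estimates, and the RTS is meaningful only where the inputs are in fact substitutable.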
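Elasticity of substitution: The proportionate rate of change of the input ratio $x^*_{j''it}/x^*_{j'it}$ divided by the proportionate rate of change of the ratio $MP_{j'it}/MP_{j''it}$ (or the rate at which substitution of $x^*_{j'it}$ for $x^*_{j''it}$ takes place along the isoquant) is

$$\frac{d\log(x^*_{j''it}/x^*_{j'it})}{d\log(MP_{j'it}/MP_{j''it})} = \frac{f_{j'it}f_{j''it}\,(x^*_{j'it}f_{j'it} + x^*_{j''it}f_{j''it})}{x^*_{j'it}x^*_{j''it}\,(f_{j'j''it}f_{j'it}f_{j''it} + f_{j''j'it}f_{j'it}f_{j''it} - f_{j'j'it}f^2_{j''it} - f_{j''j''it}f^2_{j'it})}, \tag{15}$$

where $f_{j'it} = \alpha^*_{j'it}$, $f_{j''it} = \alpha^*_{j''it}$; $f_{j'j'it}$ and $f_{j''j''it}$ are the second direct partial derivatives of $f_{it}(\cdot)$ in (1); and $f_{j'j''it}$ and $f_{j''j'it}$ are the second cross partial derivatives of $f_{it}(\cdot)$.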
We cannot find the derivatives of a function without knowing its functional form. For this reason, we cannot apply the second formula in (15) to $f_{it}(\cdot)$. However, we can obtain an approximate estimate, $\Delta\log(x_{j''it}/x_{j'it})/\Delta\log(\widehat{MP}_{j'it}/\widehat{MP}_{j''it})$, where $\Delta$ is written for the change in $\log(\cdot)$, of the first formula in (15) if the derivatives in this formula exist. These derivatives may not exist.
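The approximate estimate based on changes in logs can be sketched as below. The function name and data layout are hypothetical. A Cobb-Douglas form with made-up exponents is used only to generate test numbers for which the answer is known to be one. It is not the specification advocated in this paper. The function returns None when the ratio of marginal products does not change, a case in which the ratio of changes is undefined.

```python
import math


def arc_elasticity_of_substitution(obs_a, obs_b):
    """Change in log input ratio over change in log marginal-product ratio.

    Each observation is a tuple (x1, x2, mp1, mp2). Returns None when the
    marginal-product ratio does not change between the two observations.
    """
    x1a, x2a, mp1a, mp2a = obs_a
    x1b, x2b, mp1b, mp2b = obs_b
    d_log_inputs = math.log(x2b / x1b) - math.log(x2a / x1a)
    d_log_mps = math.log(mp1b / mp2b) - math.log(mp1a / mp2a)
    if d_log_mps == 0.0:
        return None
    return d_log_inputs / d_log_mps


def cobb_douglas_mps(x1, x2, a=0.3, b=0.7):
    """Marginal products of a hypothetical Cobb-Douglas function (test data only)."""
    y = x1 ** a * x2 ** b
    return a * y / x1, b * y / x2


point_a = (2.0, 3.0) + cobb_douglas_mps(2.0, 3.0)
point_b = (2.5, 2.6) + cobb_douglas_mps(2.5, 2.6)
sigma = arc_elasticity_of_substitution(point_a, point_b)
undefined = arc_elasticity_of_substitution(point_a, point_a)
print(sigma, undefined)
```

In applied work the marginal products would be replaced by their estimates, so the resulting figure is approximate twice over: once from the discrete changes and once from the estimation.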
The phenomenon of reswitching has led some economists to express extreme doubt about the validity of the factor substitution mechanism and "thus [in] the [very] foundation . . . of the supply-and-demand approach to distribution" [see Petri (2004, pp. 10-16, 221-222)]. In Samuelson's (1966) example of reswitching, two different techniques, labeled "a" and "b," are considered. At different values of an interest rate (or a rate of profits) along any discrete downward-sloping segment of the demand for capital (per unit of labor), the value of "capital" is different for a physically unchanging technique. Changes in the value of "capital" for a physically unchanging technique arise from inventory revaluations of the same physical stock due to new capital goods prices. Also, in the same example at lower values of the interest rate, the cost-minimizing technique "switches" from a to b and then "reswitches" back to a because of differences in the physical stock of capital goods [see Cohen and Harcourt (2003, pp. 202-203)]. This result implies that, given the technical production coefficients, the relationship between the rate of profits (or rate of interest) and the price of a commodity relative to that of another commodity need not be monotonic, as pointed out by Sraffa [see Petri (2004, pp. 210-211)]. Furthermore, the demand for capital may be either negative or positive with respect to an interest rate, with extreme equilibrium distributive values such as zero wages or zero interest rate [Garegnani (1970); Lazzarini (2008, p. 12)]. All this is to say that there are situations where the derivatives in the first formula of (15) do not exist. Our finding is that these derivatives can exist if the conditions under which reswitching occurs are unreal, in the sense that they assign wrong functional forms to (1), and do not exist otherwise.
PROPOSITION 7. The price of the services of a factor of production, say the $j$th, will be equal to its marginal value product at least approximately if its employer has the ability to obtain an approximate estimate of its marginal value product in (12) and has the willingness to pay the factor its approximate marginal value product.
The proof is immediate.
Harcourt (personal correspondence, 2011) pointed out that "Basically the theoretical arguments [involved in the Cambridge controversy] are concerned with the conceptual base of two competing visions of how capitalism works, and especially about the processes of accumulation and distribution. The mainstream builds on an (Irving) Fisherian base where the consumer is king and drives the system along through lifetime consuming and saving decisions. The alternative is classical, Marxian, Keynesian, Kaleckian whereby the capitalist class . . . dominate[s] and profit-making and accumulating are ends . . . [and] ways of life. The origin of profits is to be found by the creation of the potential surplus in the sphere of production and its realization in the sphere of distribution and exchange by the accumulating and saving behavior principally of the capitalist class. That is where the meaning of capital comes in in the two strands, with measurement a corollary. As it is a doctrinal debate, if it is possible to show that fundamental conceptual conjectures do not go through even in the most abstract and ideal circumstances (that is what the capital theory results showed) then empirical testing is beside the point because the conjectures and inferences from the abstract models are not in a form for empirical testing." Proposition 7 questions either vision of how capitalism works. The consumer may overpay for the goods and services he buys, because he has no idea of the prices corresponding to normal profits to the capitalist class. This class obtains normal profits if it charges the consumer the prices for the goods and services corresponding to normal profits and pays its employees their marginal value products. Proposition 7 shows that this may not happen. Proposition 7 is based on the correct conceptual base of how capitalism works. Model (1) is the most abstract model.
It is incorrect to say that empirical testing is beside the point because the conjectures and inferences based on model (1) are empirically testable via model (10). Several economists pointed out that theory without measurement and measurement without theory are the two extremes and should be avoided.
Sraffa argued that empirical work could be approximate but theory had to be precise. We agree with the modified statement that theory can get as precise as model (5) and only empirical work based on Assumptions I-III is approximate. Model (5) does not suffer from incorrect functional-form biases but suffers from excluded-variable biases. Model (6) does not suffer from incorrect functional-form biases but suffers from excluded-variable and measurement-error biases. Proposition 7 is a statement of value-free, objective social science. Temporal aggregation of model (2) produces biases from collapsing the long period into the short period. This paper shows the relevance of the Cambridge controversy to econometric practice; for an alternative, or perhaps a complementary view, see Harcourt (2007).
Homogeneous production functions and their properties: The function $f_{it}(\cdot)$ in (1) is homogeneous of degree $\kappa$ if

$$f_{it}(\tau x^*_{1it}, \ldots, \tau x^*_{L_{it}-1,it}) = \tau^{\kappa} f_{it}(x^*_{1it}, \ldots, x^*_{L_{it}-1,it}), \tag{16}$$

where $\kappa$ is constant and $\tau$ is any positive real number. If $f_{it}(\cdot)$ is homogeneous of degree $\kappa$, returns to scale are increasing, constant, or decreasing according as $\kappa >$, $=$, or $< 1$. Unfortunately, equation (5) [...]
PROPOSITION 8. (i) If $f_{it}(\cdot)$ in (1) is homogeneous of degree one and if the firm were to pay the suppliers of each input its marginal physical product, then total output would be just exhausted. (ii) In the case where $f_{it}(\cdot)$ is not linear homogeneous, what exhaust total output are the terms on the right-hand side of (5).
Proof. Although a proof of Proposition 8(i) is given in microeconomic textbooks, the equality of the left- and right-hand-side quantities of (5) proves Proposition 8(ii).
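The textbook result behind Proposition 8(i) can be checked numerically. The sketch below uses a hypothetical function that is homogeneous of degree one, chosen purely for illustration and not taken from this paper, together with marginal products approximated by central finite differences. It checks both the homogeneity property and the exhaustion of output by marginal-product payments.

```python
def f(x1, x2):
    """Hypothetical production function, homogeneous of degree one."""
    return x1 ** 0.3 * x2 ** 0.7


def marginal_product(func, inputs, k, h=1e-6):
    """Central finite-difference approximation to the k-th partial derivative."""
    up = list(inputs)
    down = list(inputs)
    up[k] += h
    down[k] -= h
    return (func(*up) - func(*down)) / (2.0 * h)


inputs = [4.0, 9.0]
output = f(*inputs)

# Paying each input its marginal product exhausts output (Euler's theorem).
payments = sum(
    x * marginal_product(f, inputs, k) for k, x in enumerate(inputs)
)

# Scaling all inputs by tau scales output by tau when the degree is one.
tau = 2.0
scaled_output = f(*(tau * x for x in inputs))

print(abs(payments - output) < 1e-6, abs(scaled_output - tau * output) < 1e-9)
```

With exponents that do not sum to one, the first check fails, which is the situation covered by Proposition 8(ii).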
The capitalist class can acquire the output shares of the first, second, and fourth terms on the right-hand side of (5).
Total factor productivity (TFP): The ratio of $y^*_{it}$ to a function of inputs given by (5) defines the TFP measure in (17), where the denominator is the same as the right-hand side of (5) without its intercept. An important advantage of this denominator is that it is devoid of incorrect functional-form and measurement-error biases. The function of inputs used by others in the denominator of their TFP ratio suffers from both these biases [see Blazek and Sickles (2010)]. Another important advantage of the measure in (17) is that it considers all relevant inputs including excluded inputs. 15 Under Assumptions I-III, an approximate estimate of (17) can be computed from the IRSGLS estimates $\hat{\pi}$ of the $\pi$'s. 16

CONCLUSIONS
Without misspecifying their functional forms, we derive new representations of microproduction functions that are not subject to the criticisms of neoclassical production functions made by Sraffa, Robinson, Garegnani, and Fisher, among others. These new functions allow capital and labor to be heterogeneous and allow different firms producing physically homogeneous outputs to have different capital intensities. Our treatment of heterogeneous capital is not subject to the criticisms of Sraffa, Robinson, and Garegnani. Within the framework of the new functions, reswitching of techniques can occur, provided the conditions under which it occurs do not misspecify the functional forms of the real-world production functions. The new functions are not subject to spurious correlations and allow for excluded regressors. The new representations of the microproduction functions have unique coefficients and error term. Whenever substitutability between factors of production is possible, the formulas for the marginal product of a factor, the rate of technical substitution, and the elasticity of substitution derived in this paper are not subject to incorrect functional-form, excluded-variable, and measurement-error biases. We have also derived a new formula for total factor productivity that does not contain measurement-error biases.

NOTES
1. Felipe and Fisher (2003) survey the relevant literature.
2. See Basmann (1988, p. 73). For example, the relationship between the independent variable $x_t$ and the error term $\varepsilon_t$ is changed when the equation $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$ is changed to $y_t = \beta_0 + (\beta_1 + a)x_t + (\varepsilon_t - ax_t)$. Here $(\beta_0, \beta_1)$ and $\varepsilon_t$ are not invariant. To make them invariant we have to have an additional equation relating $\varepsilon_t$ to $x_t$ with the correct functional form. See Swamy and Hall (2012, Appendix). This simple example shows that consistent estimation of model parameters is not possible if the coefficients and error term of the model are not unique.
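The nonuniqueness example in note 2 can be verified with a few lines of arithmetic. The numbers below are hypothetical. The point is only that the rewritten equation reproduces the same values of the dependent variable for any constant a, so the data alone cannot distinguish between the two sets of coefficient and error term.

```python
# Hypothetical coefficients, regressor values, and errors.
beta0, beta1 = 1.0, 2.0
x = [0.5, 1.0, 1.5, 2.0]
eps = [0.1, -0.2, 0.05, 0.0]

# Original equation: y_t = beta0 + beta1 * x_t + eps_t.
y = [beta0 + beta1 * xt + et for xt, et in zip(x, eps)]


def rewritten(a):
    """Same y_t with slope (beta1 + a) and error term (eps_t - a * x_t)."""
    return [beta0 + (beta1 + a) * xt + (et - a * xt) for xt, et in zip(x, eps)]


same_for_all_a = all(
    abs(yt - rt) < 1e-12
    for a in (-1.0, 0.3, 5.0)
    for yt, rt in zip(y, rewritten(a))
)
print(same_for_all_a)
```

Each value of a gives a different slope and a different error term, yet the dependent variable is unchanged, which is why an additional equation relating the error term to the regressor is needed to pin the coefficients down.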