LM Test of Neglected Correlated Random Effects and Its Application

This article aims at achieving two distinct goals. The first is to extend the existing LM test of overdispersion to the situation where the alternative hypothesis is characterized by the correlated random effects model. We show that the test against the random effects model has a certain max-min type optimality property; we call such a test the LM test of overdispersion. The second goal of the article is to draw a connection between panel data analysis and the analysis of multiplicity of equilibrium in games. Because such multiplicity can be viewed as a particular form of neglected heterogeneity, we propose an intuitive specification test for a class of two-step game estimators.


INTRODUCTION
Econometric model specification is usually tested by exploiting one of three principles: (i) the Wald test, which is based on the asymptotic distribution of parameter estimates; (ii) the likelihood ratio (LR) test; and (iii) the Lagrange multiplier (LM) procedure, which is based on the derivative of the log-likelihood (the score) with the hypothesis imposed. (The relationship among the three tests was reviewed by, e.g., Engle (1984).) The LM procedure seems to dominate the other two in terms of convenience when the objective of the test is to detect neglected heterogeneity in a given econometric model. The LM test statistic is based solely on the parameter estimates under the null hypothesis that there is no neglected heterogeneity and, as such, it eliminates the burden of specifying and estimating the model under the alternative hypothesis that there is some neglected heterogeneity.
The possibility of using the LM test as a way of detecting individual heterogeneity was discussed by Breusch and Pagan (1980), Chesher (1984), Lee and Chesher (1986), and Hahn, Newey, and Smith (2014), among others. These tests can be viewed as a version of White's (1982) information matrix test, as discussed by Chesher (1984). Alternatively, they can be understood as a test of overdispersion as in Cox (1983), which is our interpretation in this article. Implicit in these tests is the assumption that the neglected heterogeneity (under the alternative hypothesis) is more or less independent of the randomness that generates the main model, although the assumption is made more explicit in Hahn, Newey, and Smith (2014, Lemmas 3.1 and 3.2). In the panel data context, this assumption amounts to the random effects specification, as developed in Balestra and Nerlove (1966) or Maddala (1971). In other words, in panel data applications we may roughly say that these tests are designed to detect random effects, but not fixed effects (or, equivalently, correlated random effects). It is therefore of interest to revisit the LM test and analyze the test of neglected heterogeneity under more general conditions.

This article aims at achieving two distinct goals. The first is to extend the existing LM test to the situation where the alternative hypothesis is characterized by the fixed effects model, where we adopt the convention of equating the fixed effects model with the correlated random effects model (see, e.g., Chamberlain 1984). We begin by analyzing a model where the (local) alternative hypothesis is characterized by an arbitrary correlated random effects model, in which the mean independence assumption is violated. We show that the test to detect such arbitrary correlated random effects takes the form of a test of a conditional moment restriction. We then go on to analyze a more restricted version of correlated random effects, where the effects are assumed to be mean independent of the covariates of interest. For the latter case, we obtain the interesting result that the test against the random effects model, where the conditional variance of the effects does not depend on the covariates, has a certain max-min type optimality property. We will call such a test the LM test of overdispersion. In addition, while the LM test of overdispersion has no power in linear models where the mean independence assumption is violated, we show that the test has some robustness against such violations in nonlinear models. As an example, we show that the test has power against the arbitrary correlated random effects alternative in the case of the logit model commonly used in applications.
These results suggest that the most natural applications for the LM test of overdispersion are nonlinear models where the mean independence assumption is expected to hold. We find such an application in the context of two-step semiparametric estimation of Bayesian game models commonly applied to the analysis of environments with social or strategic interactions among agents. This leads to the second goal of the article, which is to draw a connection between panel data analysis and the analysis of multiplicity of equilibrium in games. Such multiplicity can be viewed as a particular form of neglected heterogeneity, and we propose an intuitive specification test for a class of two-step game estimators. Moreover, the canonical specification of these games is a system of logit discrete choice models, for which we show the test is robust against more general forms of neglected heterogeneity that may arise from elsewhere in the model.

Two-step estimators of Bayesian game models exploit the two requirements of Bayes-Nash equilibrium (see, e.g., Aguirregabiria and Mira 2002; Bajari et al. 2011). First, equilibrium requires that players' strategies are optimal given their beliefs about the distribution of opponent play and, second, that these beliefs are consistent with the true equilibrium distribution, conditional on any common knowledge determinants of player payoffs. Under some assumptions about the nature of unobserved heterogeneity, the latter requirement allows the econometrician to, in the first step, nonparametrically estimate player beliefs directly from the data by estimating conditional equilibrium choice probabilities. Treating these estimates as data, the problem of estimating the model of interdependent actions reduces to one of estimating a single-agent choice model in the second step. This two-step procedure allows the model to be estimated without ever having to solve it, an often computationally expensive task. (This insight has been particularly important in the literature on estimating dynamic games (Aguirregabiria and Mira 2007; Bajari, Benkard, and Levin 2007; Pakes, Ostrovsky, and Berry 2007; Pesendorfer and Schmidt-Dengler 2008), where calculating equilibrium is so computationally expensive that "full solution" approaches to estimation are generally infeasible.) The first-stage estimation of conditional choice probabilities generally requires pooling across a large number of observed game outcomes, which necessitates restrictive assumptions about two types of unobserved heterogeneity. First, we require that there be no "persistent unobserved state variables"; that is, we rule out, for example, random and fixed effects in player payoff functions. Second, while the underlying game is permitted to have multiple equilibria, we have to assume a single equilibrium is played in the data. If the first-stage estimates pool across games where multiple equilibria are played, the resulting choice probabilities will be a weighted average of these equilibria, where the weights are given by the equilibrium selection probabilities. Nonlinearity in the second-step estimator then leads to inconsistent parameter estimates. Some progress has been made on relaxing these assumptions (see Aguirregabiria and Nevo 2013 for a survey); however, these fixes often apply only to relatively restrictive classes of games, and outside these classes custom solutions must be developed. Moreover, implementing these solutions often requires increasing the computational complexity and burden of estimation.
These considerations suggest that a simple specification test of these assumptions may be an attractive tool for practitioners. The LM test of overdispersion naturally applies here. Our proposed LM test can be viewed as a simple triage tool that allows the researcher to estimate the relatively inexpensive, restrictive model and then use the LM test to decide whether the more complicated model is needed. We demonstrate the performance of our test in detecting failure of the single equilibrium assumption in a Monte Carlo exercise based on the application of Sweeting (2009), who analyzes the timing of radio commercials among rival stations.

TEST OF NEGLECTED CORRELATED RANDOM EFFECTS-LM TEST REVISITED
Assume that we observe a random sample (Y_i, X_i), i = 1, . . . , n. The Y and X can be vectors. For example, in panel data analysis where each individual is observed over T time periods, we will have Y_i = (Y_i1, . . . , Y_iT) and X_i = (X_i1, . . . , X_iT). We will assume that the conditional density of Y given X is given by the density f(y|x, θ, γ), where (θ, γ) is the parameter that characterizes the density. Under the null hypothesis, γ is fixed at ξ, but under the alternative hypothesis, γ may be a random variable potentially correlated with X. Our purpose is to develop an LM test to detect such neglected heterogeneity in γ. For simplicity, we will assume that γ is a scalar; discussion of the case of multivariate γ can be found in Appendix B. Following the convention in the literature, for example, Chesher (1984), we will also treat both θ and ξ as known/fixed, which is the basis of our analysis throughout the rest of this section. Note that, because θ is held fixed, we can write f(y|x, θ, γ) = f(y|x, γ) without loss of generality.
We will assume that under the alternative hypothesis, γ is drawn from a distribution whose conditional density given X = x is
$$\frac{1}{\eta}\, g\!\left(\left.\frac{\gamma - \xi}{\eta}\,\right|\, x\right), \qquad (1)$$
where η > 0 and g(·|x) is a density in its first argument for every x; we write b(x) ≡ ∫ w g(w|x) dw for its conditional mean. Note that the conditional density of γ given X is allowed to depend on the realization of X.
In the existing literature, including Breusch and Pagan (1980), Chesher (1984), Lee and Chesher (1986), and Hahn, Newey, and Smith (2014), it is (implicitly or explicitly) assumed that (i) γ and X are independent of each other and (ii) the mean of γ is ξ, so that b(x) = 0.
Our specification and analysis differ from the existing literature in that both of these restrictions are relaxed. In what follows we consider two cases: (i) when b(x) is unknown and not necessarily zero, and (ii) when b(x) = 0.

Case 1: Arbitrary b(x)
When the conditional density function of γ is (1/η) g((γ − ξ)/η | x) as in (1), the density function of the observation Y given X = x is
$$\int f(y|x, \gamma)\,\frac{1}{\eta}\, g\!\left(\left.\frac{\gamma - \xi}{\eta}\,\right|\, x\right) d\gamma. \qquad (2)$$
By the change of variable γ = ξ + ηw, we have
$$\int f(y|x, \xi + \eta w)\, g(w|x)\, dw. \qquad (3)$$
The null hypothesis of interest is that there is no heterogeneity in the parameter γ. In the model above, this is equivalent to the test of the null H0: η = 0. Our LM test is based on the derivative of the log of (3) with respect to η, evaluated at η = 0. We can easily see that at η = 0, we have
$$\frac{\partial}{\partial \eta} \log\!\left(\int f(y|x, \xi + \eta w)\, g(w|x)\, dw\right)\bigg|_{\eta = 0} = s_\gamma(y|x, \xi)\, b(x), \qquad (4)$$
where
$$s_\gamma(y|x, \xi) \equiv \frac{\partial \log f(y|x, \gamma)}{\partial \gamma}\bigg|_{\gamma = \xi}.$$
The analysis leading to (4) implies that the LM test reduces to a test of the moment restriction
$$E\!\left[s_\gamma(Y|X, \xi)\, b(X)\right] = 0. \qquad (5)$$
If we are to have power against all possible distributions of γ, we need the above moment restriction to hold for all (square integrable) b(x)'s. Obviously, this is equivalent to the test of the conditional moment restriction
$$E\!\left[s_\gamma(Y|X, \xi)\,\middle|\, X = x\right] = 0 \quad \text{for almost every } x. \qquad (6)$$
We have shown that a test of H0: E[s_γ(Y|X, ξ)|X] = 0 can be viewed as an LM test of neglected heterogeneity with arbitrary b(x), which does not seem to be known to the profession, at least not explicitly. That being said, we avoid further discussion of how this test should be implemented, for two reasons. First, there already exist many articles that show how generic tests of conditional moment restrictions can be implemented. These articles consider a restriction of the form E[g(Z)|X] = 0 for any given function g(·); a partial list includes Newey (1985), Bierens (1990), and Donald, Imbens, and Newey (2003). Users interested in testing E[s_γ(Y|X, ξ)|X] = 0 can simply use one of the existing tests; all they need to do is replace g by s_γ. Given that there is not much we can add to the technical discussion of generic tests of conditional moment restrictions, we deem it sufficient to point to the existing literature. Second, the test of the conditional moment restriction (6) may be avoided from a practical point of view. We find in Section 2.2.2 that for some nonlinear models, the LM test based on the restriction b(x) = 0 does have power even when b(x) ≠ 0. This suggests that unless the LM test is applied to linear models, a user may settle for the seemingly restrictive test discussed in Section 2.2. Obviously this is reasonable when the user's sole objective is to detect neglected heterogeneity, and it does not mean that the restriction E[s_γ(Y|X, ξ)|X] = 0 is unimportant in itself.
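Although implementation is deferred to the cited literature, a minimal sketch may help fix ideas. The following illustrates one simple finite-instrument (J-type) check of (6), assuming the researcher supplies the scores s_γ(Y_i|X_i, ξ) evaluated at the null values (treated as known, as in the text) and a small set of instrument functions of x; the function names and the choice of instruments are illustrative rather than prescriptions from the article.

```python
import numpy as np
from scipy.stats import chi2

def conditional_moment_j_test(s_gamma, X, instrument_fns):
    """J-type check of E[s_gamma(Y|X, xi) | X] = 0 based on a finite set of
    instruments a_1(x), ..., a_m(x).  The vector s_gamma collects the scores
    evaluated at the null values of (theta, xi); no adjustment is made here
    for the sampling error of plugged-in estimates."""
    s_gamma = np.asarray(s_gamma, dtype=float)
    n = len(s_gamma)
    A = np.column_stack([fn(X) for fn in instrument_fns])  # n x m instrument matrix
    g = A * s_gamma[:, None]                               # per-observation moments
    g_bar = g.mean(axis=0)                                 # sample moment vector
    V = g.T @ g / n                                        # variance estimate (mean zero under H0)
    J = n * g_bar @ np.linalg.solve(V, g_bar)              # quadratic-form statistic
    return J, chi2.sf(J, df=A.shape[1])                    # compare with chi2(m)

# Illustrative usage with hypothetical instruments: a constant and the
# within-unit average of the covariates.
# J, p = conditional_moment_j_test(s_gamma, X,
#                                  [lambda x: np.ones(len(x)),
#                                   lambda x: np.asarray(x).mean(axis=1)])
```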

Example 1. Consider the model
$$Y_{it} = X_{it}'\theta + \gamma + \varepsilon_{it}, \qquad t = 1, \dots, T,$$
where ε_it is iid N(0, ω²) with ω² = 1 known. We have
$$s_\gamma(y|x, \xi) = \frac{1}{\omega^2}\sum_{t=1}^{T}\left(y_t - x_t'\theta - \xi\right) = \sum_{t=1}^{T}\left(y_t - x_t'\theta - \xi\right),$$
and the test of interest is of the conditional moment restriction
$$E\!\left[\sum_{t=1}^{T}\left(Y_{it} - X_{it}'\theta - \xi\right)\,\middle|\, X_i\right] = 0.$$

Remark 1. In Example 1, a simple practical version of the test is a J-test for a finite set of moments implied by (6), for example
$$E\!\left[\left(\sum_{t=1}^{T}\left(Y_{it} - X_{it}'\theta - \xi\right)\right) a(X_i)\right] = 0$$
for a chosen vector of instrument functions a(·). (Here we treat θ and ξ as known, although in practice we would plug in consistent estimators of θ and ξ; the critical value in practice would therefore have to reflect the sampling variation of the plugged-in estimators as well as that of the test statistic itself.)

Case 2: b(x) = 0

We now consider the special case where we impose the restriction that b(x) = 0. The restriction leads to the problem that the LM test statistic in (4) becomes identically equal to zero. To tackle this problem, we follow Chesher (1984) and Lee and Chesher (1986), for example, and work instead with √η; that is, we write γ = ξ + √η w, so that the density of the observation becomes
$$\int f(y|x, \xi + \sqrt{\eta}\, w)\, g(w|x)\, dw.$$
It is straightforward to show that the LM test should be based on the √n-standardized sample average of
$$\frac{1}{2}\left(s_{\gamma\gamma}(Y_i|X_i, \xi) + s_\gamma(Y_i|X_i, \xi)^2\right)\sigma^2(X_i), \qquad (8)$$
where s_γγ(y|x, ξ) ≡ ∂² log f(y|x, γ)/∂γ² |_{γ=ξ} and σ²(x) ≡ ∫ w² g(w|x) dw. Writing
$$\tau^2(x) \equiv E\!\left[\left\{\tfrac{1}{2}\left(s_{\gamma\gamma}(Y_i|X_i, \xi) + s_\gamma(Y_i|X_i, \xi)^2\right)\right\}^2 \,\middle|\, X_i = x\right],$$
our test statistic would take the form
$$T_n(\sigma^2) = \frac{\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \frac{1}{2}\left(s_{\gamma\gamma}(Y_i|X_i, \xi) + s_\gamma(Y_i|X_i, \xi)^2\right)\sigma^2(X_i)\right)^2}{\frac{1}{n}\sum_{i=1}^{n} \tau^2(X_i)\, \sigma^4(X_i)}. \qquad (10)$$
Under some regularity conditions, it can be shown that T_n(σ²) converges in distribution to a χ²(1) random variable as n → ∞ under the null hypothesis that η = 0 (this is a straightforward consequence of Lemma A.1 in Appendix A).
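As a concrete illustration of the statistic above with σ²(x) = 1 (the choice justified in the next subsection), here is a minimal sketch in which the user supplies the first and second derivatives of log f(y|x, γ) with respect to γ evaluated at the null; the studentization by the sample second moment of the summand is one simple choice, and the sketch treats θ and ξ as known, ignoring the estimation error that a practical implementation would have to account for.

```python
import numpy as np
from scipy.stats import chi2

def lm_overdispersion_stat(s_gamma, s_gamma_gamma):
    """LM test of overdispersion with sigma^2(x) = 1.  Inputs are n-vectors of
    d log f / d gamma and d^2 log f / d gamma^2 evaluated at gamma = xi.
    The summand 0.5 * (s_gg + s_g^2) has mean zero under the null; its squared
    standardized sample average is compared with a chi2(1) critical value."""
    s_gamma = np.asarray(s_gamma, dtype=float)
    s_gamma_gamma = np.asarray(s_gamma_gamma, dtype=float)
    u = 0.5 * (s_gamma_gamma + s_gamma ** 2)
    n = len(u)
    T = (np.sqrt(n) * u.mean()) ** 2 / np.mean(u ** 2)
    return T, chi2.sf(T, df=1)
```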

2.2.1 Max-Min Consideration. The test statistic in (10) is inconvenient because it requires that σ²(x) be specified. In this section, we provide a justification for using σ²(x) = 1 from a (local) max-min perspective. For expositional simplicity, we assume that ξ and τ²(·) are known. Now suppose that nature chooses an alternative density g_*(w|x) for the standardized heterogeneity, with ∫ w g_*(w|x) dw = 0 and ∫ w² g_*(w|x) dw = σ*²(x), so that the data are generated by the density
$$k(y|x, \xi; \kappa) = \int f(y|x, \xi + \kappa w)\, g_*(w|x)\, dw, \qquad \kappa = n^{-1/4}. \qquad (11)$$
Then, under some regularity conditions, it can be shown (see Lemma A.1 in Appendix A) that under the sequence of distributions characterized by the density (11), the √n-standardized sample average of (8) is asymptotically normal with mean E[τ²(X_i) σ²(X_i) σ*²(X_i)] and variance E[τ²(X_i) σ⁴(X_i)]. This implies that under the local alternative (11) chosen by nature, the test statistic T_n(σ²) in (10) converges to a noncentral χ² distribution with the noncentrality parameter equal to
$$\phi\!\left(\sigma^2(\cdot), \sigma_*^2(\cdot)\right) = \frac{\left(E\!\left[\tau^2(X_i)\, \sigma^2(X_i)\, \sigma_*^2(X_i)\right]\right)^2}{E\!\left[\tau^2(X_i)\, \sigma^4(X_i)\right]}.$$
Suppose that c is the asymptotic critical value of the test statistic T_n(σ²), and suppose that P_{n,g*} denotes the probability measure with density equal to (11) with ∫ w² g_*(w|x) dw = σ*²(x) chosen by nature. Then, the asymptotic local power of the test statistic T_n(σ²), lim_{n→∞} P_{n,g*}[T_n(σ²) ≥ c], is determined by the noncentrality parameter φ(σ²(·), σ*²(·)), which depends only on σ²(·) and σ*²(·). Our max-min justification is given in the following result:

Proposition 1. Consider the problem
$$\max_{\sigma^2(\cdot)}\ \min_{\sigma_*^2(\cdot)}\ \phi\!\left(\sigma^2(\cdot), \sigma_*^2(\cdot)\right),$$
where σ²(x) and σ*²(x) are normalized so that their overall scales are fixed. The solution is such that σ²(·) is a constant function.
Proof. In Appendix A.
Our discussion above suggests that taking σ²(X_i) = 1 is reasonable and can be justified by the max-min principle. Assuming that τ²(X_i) > 0 almost everywhere, we can see that φ(1, σ*²(·)) > 0 for any σ*²(X_i) as long as σ*²(X_i) > 0 with positive probability. Given that the probability that σ*²(X_i) > 0 is positive for any reasonable σ*²(X_i), we can conclude that the test statistic T_n(σ²) with σ²(X_i) = 1 has local power against any reasonable alternative. Proposition 1 has the intuitive interpretation that the random effects specification has the max-min property from a testing perspective. The local power depends only on the conditional second moment, and therefore the random effects specification, under which the distribution of γ does not depend on x, can be argued to be the optimal specification for the purpose of testing.
2.2.2 Additional Consideration. We ask whether the test of overdispersion has any power against local alternatives where the mean independence assumption may be violated. A test of the conditional moment restriction (6) is not expected to have any power if the alternative satisfies the mean independence assumption b(x) = 0. Although the LM test of overdispersion is developed under the assumption that b(x) = 0, if it has power even when b(x) ≠ 0, we may argue that it is more robust than the test of the conditional moment restriction. We provide two examples and discuss the power of the LM test of overdispersion.
Given the max-min consideration, we will only consider the case where σ²(·) = 1, and examine the power of the test statistic (10) under local alternatives where mean independence may be violated.

Proposition 2. Under the sequence of distributions characterized by the density (3) with η = 1/√n, the test statistic (10) with σ²(·) = 1 is asymptotically noncentral χ² with the noncentrality parameter
$$\frac{\left(E\!\left[\upsilon(X_i)\, b(X_i)\right]\right)^2}{E\!\left[\tau^2(X_i)\right]}, \qquad \upsilon(x) \equiv E\!\left[\frac{1}{2}\left(s_{\gamma\gamma}(Y_i|X_i, \xi) + s_\gamma(Y_i|X_i, \xi)^2\right) s_\gamma(Y_i|X_i, \xi)\,\middle|\, X_i = x\right].$$

Proof. Using the same argument as in Appendix A with κ = n^{-1/2}, we can show that the √n-standardized sample average of (8) with σ²(·) = 1 is asymptotically normal with mean equal to E[υ(X_i) b(X_i)] and variance equal to E[τ²(X_i)], from which the conclusion follows.

Below we present two examples, one in which the local power is zero because υ(X_i) = 0, and one in which this is not necessarily the case.
Example 2 (Example 1, continued). The textbook linear model in Example 1 is such that s_γγ(y|x, ξ) = −T, so that
$$\upsilon(X_i) = E\!\left[\frac{1}{2}\left(s_\gamma(Y_i|X_i, \xi)^2 - T\right) s_\gamma(Y_i|X_i, \xi)\,\middle|\, X_i\right] = \frac{1}{2}\,E\!\left[s_\gamma(Y_i|X_i, \xi)^3\,\middle|\, X_i\right],$$
and using the symmetry of the normal distribution, we obtain υ(X_i) = 0.

Example 3 (Panel Logit). Consider the panel logit model where
$$P(Y_{it} = 1 \mid X_i, \gamma) = \Lambda(X_{it}'\theta + \gamma), \qquad \Lambda(u) = \frac{e^u}{1 + e^u},$$
and the Y_it are independent across t conditional on (X_i, γ). We can see that υ(X_i) is proportional to the conditional third moment of the score s_γ(Y_i|X_i, ξ), which does not vanish for the logit model, so the test of overdispersion retains local power even when b(x) ≠ 0.
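For reference, the following is a sketch of the computations under the specification just stated, writing Λ_it ≡ Λ(X_it'θ + ξ) and using the definition of υ(·) given in Proposition 2; it records a routine calculation included for the reader's convenience.

$$s_\gamma(Y_i|X_i, \xi) = \sum_{t=1}^{T}\left(Y_{it} - \Lambda_{it}\right), \qquad s_{\gamma\gamma}(Y_i|X_i, \xi) = -\sum_{t=1}^{T}\Lambda_{it}\left(1 - \Lambda_{it}\right),$$
$$\upsilon(X_i) = \frac{1}{2}\,E\!\left[s_\gamma(Y_i|X_i, \xi)^3\,\middle|\, X_i\right] = \frac{1}{2}\sum_{t=1}^{T}\Lambda_{it}\left(1 - \Lambda_{it}\right)\left(1 - 2\Lambda_{it}\right),$$

which is generically nonzero (for instance, it is nonzero whenever all Λ_it lie on the same side of 1/2), in contrast to the normal linear model of Example 2, where the symmetry of the error distribution makes the corresponding third moment vanish.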

MONTE CARLO
A potentially important application of our test arises in the context of estimating static binary choice models with social and strategic interactions. These models have been employed to analyze a wide variety of applications including firm entry (Seim 2006), the timing of radio commercials (Sweeting 2009), labor force participation (Bjorn and Vuong 1984), teen sex (Card and Giuliano 2013), and many more. These models can be conveniently estimated by a two-step procedure. In the first step, the economist estimates the so-called conditional choice probabilities (CCPs). These CCPs are then used as an ingredient in the second step to estimate the structural parameters. (The basic two-step structure of this model was pioneered by Hotz and Miller (1993) in the context of dynamic discrete choice models and has subsequently been extended to static discrete games (Aguirregabiria and Mira 2003; Bajari et al. 2010) and dynamic games (Aguirregabiria and Mira 2007; Bajari, Benkard, and Levin 2007; Pakes, Ostrovsky, and Berry 2007; Pesendorfer and Schmidt-Dengler 2008).) The first-stage CCP estimates require pooling of observations across games and agents, which leads to two sources of potential neglected heterogeneity that may invalidate the approach: first, unobserved individual or game/group level payoff heterogeneity and, second, multiple equilibria being played across games. Both of these make the CCPs inconsistent estimates of equilibrium choice probabilities and, in turn, lead to inconsistency of the second-stage estimates. We will illustrate how the LM test statistic can be used to detect such problems, exploiting the intuition of, for example, Bajari et al. (2011) that multiple equilibria in games can be viewed as a particular form of unobserved heterogeneity.
We start with a brief description of the model. Assume that the econometrician has access to data {D_ig, X_ig}, i = 1, . . . , K_g, g = 1, . . . , n, on binary actions, D, and covariates, X, for K_g agents interacting in each of n groups/games. The objective is to estimate payoffs u(d_ig, d_−ig, x_ig; θ) that rationalize observed choice behavior, where D_−ig is the vector of choices of all agents except i. We assume utilities depend on the sample average of the choices of other players and on covariates in the following linear, separable way:
$$u(d_{ig}, d_{-ig}, x_{ig}; \theta) = d_{ig}\left(x_{ig}'\beta + J\,\frac{1}{K_g - 1}\sum_{j \neq i} d_{jg}\right), \qquad \theta = (\beta, J),$$
with the payoff to d_ig = 0 normalized to zero. Following Bajari et al. (2010), we assume the data are generated by a Bayes-Nash equilibrium of a game of incomplete information in which players choose actions d_ig ∈ {0, 1} to maximize
$$E\!\left[u(d_{ig}, D_{-ig}, x_{ig}; \theta)\,\middle|\, X_g = x_g\right] + \varepsilon_{ig}(d_{ig}),$$
where ε_ig(d_ig) is a private information payoff shock, iid across agents and games, drawn from a commonly known distribution, and we use the convention X_g = (X_1g, . . . , X_{K_g g}). The probability f(d_ig = 1|x_g; θ) that player i chooses d_ig = 1 is then
$$f(d_{ig} = 1 \mid x_g; \theta) = \Pr\!\left(x_{ig}'\beta + J\, E\!\left[\frac{1}{K_g - 1}\sum_{j \neq i} D_{jg}\,\middle|\, X_g = x_g\right] + \varepsilon_{ig}(1) \ge \varepsilon_{ig}(0)\right).$$
The structure above leads to a natural two-step estimator for θ, as discussed in the beginning of this section, using (ideally nonparametric) estimates of the CCP. (In common usage, the CCP for this example would be the individual conditional choice probability E[D_jg|X_g]. Here, the assumption that payoffs depend only on average rival actions, and that these enter the payoff function linearly, means that the scalar conditional average choice function defined by (13) is sufficient for the vector of individual choice probabilities.) The first-stage object is the function ξ(·) defined by the conditional moment equation
$$E\!\left[\frac{1}{K_g - 1}\sum_{j \neq i} D_{jg} - \xi(X_g)\,\middle|\, X_g\right] = 0. \qquad (13)$$
When there are multiple equilibria, E[(1/(K_g − 1)) Σ_{j≠i} D_jg | X_g = x_g] computed within the selected equilibrium can be viewed as a random variable γ(x_g); the moment Equation (13) then defines a pseudo-parameter ξ(x), which is an average of the γ(x) weighted by the probability of equilibrium selection. It follows that the LM test statistic of interest in this application is simply the LM test of overdispersion of Section 2, applied with γ(x_g) playing the role of the neglected heterogeneity and ξ(X_g) playing the role of its null value. Obviously, one needs to adjust for the error of estimating θ as well as ξ(X_i). The ξ(X_i) is often estimated nonparametrically, and as such, analysis along the lines of Newey (1994) is necessary to characterize the asymptotic variance. From a practical perspective, it can be accommodated either by the bootstrap result as in Chen, Linton, and van Keilegom (2003), or by the numerical equivalence result as in Ackerberg, Chen, and Hahn (2012). (In a supplementary appendix available upon request, we present a (straightforward) characterization of how the procedure in Ackerberg, Chen, and Hahn (2012) can be used in our context.) In principle, one may want to consider Wald or LR tests as well, using the recent progress on identification and two-step estimation of models with multiple equilibria and unobserved payoff heterogeneity, in particular by Kasahara and Shimotsu (2009). However, in contrast to our LM test, both the Wald and LR tests require the econometrician to be able to specify the number of equilibria/types. Both full solution approaches, where the economist actually solves for equilibrium at every parameter guess in the estimation search algorithm, and two-step approaches admitting multiple equilibria/payoff heterogeneity entail a substantial increase in the computational burden of estimation. In this sense, our LM test can be viewed as a triage tool that allows the researcher to judge whether incurring such costs is necessary. Of course, any such pretesting strategy should account for the inherent potential distortions.
Formal treatment of this problem is outside the scope of this article, and any actual implementation will depend on the context of the application, so we do not address these issues further here.
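To make the mechanics concrete, the following is a minimal sketch of how the statistic might be computed in this application, assuming a logit second stage with a scalar payoff shifter and the first-stage CCP entering the index through J, and treating the neglected heterogeneity as a game-level effect so that the scores are summed within each game; the parameter names, the grouping, and the simple studentization are illustrative choices, and the sketch ignores the first- and second-stage estimation error that would in practice be handled by the bootstrap or the numerical-equivalence approach cited above.

```python
import numpy as np
from scipy.special import expit  # logistic CDF

def game_overdispersion_stat(D, X, ccp_hat, beta, J):
    """Illustrative LM overdispersion statistic for the two-step game
    estimator.  D and X are (n_games, K) arrays of binary choices and payoff
    shifters; ccp_hat is the (n_games,) first-stage estimate of the average
    rival choice probability for each game's covariate profile.  The neglected
    heterogeneity is treated as a game-level effect entering the logit index
    through J, so scores are summed within games (games play the role of the
    'individuals' in the panel logit example).  Estimation error in (beta, J)
    and in ccp_hat is ignored here."""
    D = np.asarray(D, dtype=float)
    X = np.asarray(X, dtype=float)
    ccp_hat = np.asarray(ccp_hat, dtype=float)
    Lam = expit(beta * X + J * ccp_hat[:, None])          # fitted choice probabilities
    s_g = J * (D - Lam).sum(axis=1)                        # d log-lik / d gamma, by game
    s_gg = -(J ** 2) * (Lam * (1.0 - Lam)).sum(axis=1)     # second derivative, by game
    u = 0.5 * (s_gg + s_g ** 2)
    n = len(u)
    return (np.sqrt(n) * u.mean()) ** 2 / np.mean(u ** 2)  # compare with chi2(1)
```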
Our Monte Carlo exercise is inspired by the application of Sweeting (2009), who estimated a strategic model of the timing of radio commercials by radio stations. (Sweeting (2009) discussed how "overdispersion" in the distribution of the CCP can be exploited for identification in the context of the two-step estimator, although he does not employ this approach and argues informally that the data support the single equilibrium assumption. Sweeting did develop a formal modified log-likelihood test for the nested fixed point estimator, but such a test cannot be developed for the two-step estimator because the CCPs are not identified under the alternative of multiple equilibria.) Radio stations may have an incentive to coordinate the timing of their commercials so that listeners have no incentive to switch stations when commercials come on: if they do switch, they will just tune in to another commercial. We consider the simplified version of the model discussed earlier and assume there are K stations competing in each of n games. For ease of exposition, we assume the players are local in the sense that each player competes in only one game. Stations compete by choosing a binary action d_i ∈ {0, 1}, which can be interpreted as a choice to play commercials at :50 or :55 minutes past the hour. The payoffs associated with each choice are special cases of the payoffs presented above, with a scalar payoff shifter X_ig and strategic interaction coefficient J. We assume X_ig ∈ {−1, 0, 1}, drawn iid uniform across players and games. Finally, we assume the ε's are drawn iid from a Type 1 extreme value distribution, so the probability of observing d_i = 1 takes the logit form
$$\Pr\!\left(d_{ig} = 1 \mid x_g\right) = \Lambda\!\left(\beta x_{ig} + J\, E\!\left[\frac{1}{K - 1}\sum_{j \neq i} D_{jg}\,\middle|\, X_g = x_g\right]\right), \qquad \Lambda(u) = \frac{e^u}{1 + e^u}.$$
In the second stage, we formulate a (pseudo) maximum likelihood estimator based on the logit conditional choice probabilities:
$$(\hat\beta, \hat J) = \arg\max_{\beta, J}\ \sum_{g=1}^{n}\sum_{i=1}^{K}\left[d_{ig}\log\Lambda\!\left(\beta X_{ig} + J\,\hat P(X_g^{i})\right) + (1 - d_{ig})\log\!\left(1 - \Lambda\!\left(\beta X_{ig} + J\,\hat P(X_g^{i})\right)\right)\right],$$
where we have used exchangeability of the X's to rearrange X_g as X_g^i = (X_ig, X_−ig) and drop the i index on the CCP, and P̂ is the first-stage nonparametric estimate of the conditional average choice probability defined in (13). Letting θ̂ = (β̂, Ĵ) denote the maximum likelihood estimates, our test statistic is the LM statistic of overdispersion from Section 2, evaluated at θ̂ and P̂. For our Monte Carlo experiments, we set K = 5. We first solve for all the stable equilibria of our game conditional on X_g, and we then generate data assuming the X_g's are drawn from a discrete uniform distribution in the population. To generate choice data under the single equilibrium assumption, we randomly select an equilibrium from the calculated set of equilibria for each X ∈ supp(X) and then generate the profile of choices by drawing from the distribution induced by this equilibrium for every game in which the realization of X_g is X. That is, conditional on X, a single equilibrium is played, though multiple equilibria may be played across the different X's. To generate the data under the alternative, where the single equilibrium assumption fails, we randomly select an equilibrium from the calculated set of equilibria for each realization of X_g in the data and then generate choices by drawing from the distribution induced by this equilibrium. In both the case where the single equilibrium assumption holds and the case where it does not, we use the same equilibrium selection mechanism. To define the mechanism, we start by using iterative updating of strategies to find an equilibrium associated with an initial starting strategy profile that has all players choosing an action with probability one. There are 32 (= 2⁵) such profiles, and all converge to one of the, at most, two stable equilibria of the game.
For each X_g, in the case where the single equilibrium assumption holds, and for each g, in the case where it fails, the selection mechanism chooses uniformly from the 32 starting profiles and selects the equilibrium, one of the at most two found by our iterative updating algorithm, to which that profile converges. Intuitively, this causes the mechanism to, for example, tend to select the "high" equilibrium more often when covariate values tend to be higher. Defining the selection mechanism in this way is a natural way to generate dependence of the equilibrium selection on the covariates.
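The equilibrium computation behind this data-generating process can be sketched as follows, assuming logit best responses with a scalar covariate coefficient beta and interaction coefficient J (illustrative names); the routine iterates the best-response map in choice probabilities from every deterministic starting profile and collects the distinct fixed points reached.

```python
import itertools
import numpy as np
from scipy.special import expit

def find_equilibria(x, beta, J, tol=1e-10, max_iter=10000):
    """Iterative best-response computation for one covariate profile x of
    length K.  Starting from each of the 2**K deterministic strategy profiles,
    iterate p_i <- Lambda(beta * x_i + J * average of rivals' p_j) until
    convergence and keep the distinct fixed points (candidate stable
    equilibria) that are reached."""
    x = np.asarray(x, dtype=float)
    K = len(x)
    equilibria = []
    for start in itertools.product([0.0, 1.0], repeat=K):
        p = np.array(start)
        for _ in range(max_iter):
            rival_mean = (p.sum() - p) / (K - 1)      # average of rivals' probabilities
            p_new = expit(beta * x + J * rival_mean)
            if np.max(np.abs(p_new - p)) < tol:
                p = p_new
                break
            p = p_new
        if not any(np.allclose(p, q, atol=1e-6) for q in equilibria):
            equilibria.append(p)
    return equilibria

# Example: candidate equilibria for one covariate profile under a
# hypothetical parameterization.
# eqs = find_equilibria([1, 0, -1, 0, 1], beta=0.5, J=4.0)
```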
Tables 1 and 2 show the results of our Monte Carlo experiments. Table 1 displays the mean and standard deviation of parameter estimates from simulations assuming different sample sizes and true underlying interaction effects, and Table 2 reports rejection rates for our test.

NOTES: Fraction of 2000 simulated datasets where the null (a single equilibrium is played for each covariate configuration X) is rejected at the indicated significance level. Test statistic distribution estimated using 250 bootstrap repetitions. n is the number of observed games; as described in the text, each game has five players, and payoff shifters X_ig ∈ {0, 1} are drawn uniform iid across players and games. Equilibrium selection is determined by iteratively updating best responses starting from a randomly selected deterministic strategy profile conditional on: (1) X under the null and (2) X_g under the alternative.

The estimates in Table 1 illustrate the impact of the failure of the single equilibrium assumption. Both the impact of fundamentals (β1) and the strategic effects (J) are attenuated. The reason for this is simple. Due to multiplicity, actual choice probabilities are either all high or all low depending on which equilibrium is selected. The CCP estimate lies in between, as a selection-probability-weighted average of these equilibria. In other words, our CCP estimates suggest that equilibrium choices are more independent than they really are and thus more consistent with a game with relatively low strategic interaction effects. Table 2 shows that our multiplicity test performs quite well in this example. When strategic effects are weaker, variation in outcomes becomes less and less dominated by multiplicity and our test has less power in detecting a violation of the single equilibrium assumption. In our example, this happens both because the direct effect of X_i on a player's payoffs is larger relative to the strategic effects and also, indirectly, because the equilibrium selection mechanism is relatively more dependent on X_g.

A.1 Regularity Conditions and Preliminary Result
We first provide regularity conditions.

Assumption A.1. Assume that (Y_i, X_i) are iid.

We denote by p(x) the density function of X_i and by X the support of X_i.

Assumption A.2. We assume that the conditional density of w given x, g_*(w|x), satisfies the following conditions. (i)

Assumption A.3. We assume that (i) the conditional density f(y|x, γ) is twice differentiable with respect to the argument γ; (ii) there exists a function F(y, x) such that for all x ∈ X and satisfying the following: is the bound of Assumption A.2(iii).

A.2 Limit of T_n(σ²) in (10) Under Local Alternatives
We prove that the statistic based on (8), that is,
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} \frac{1}{2}\left(s_{\gamma\gamma}(Y_i|X_i, \xi) + s_\gamma(Y_i|X_i, \xi)^2\right)\sigma^2(X_i),$$
is asymptotically normal under the local alternative, with mean equal to E[τ²(X_i) σ²(X_i) σ*²(X_i)] and variance equal to E[τ²(X_i) σ⁴(X_i)].

Lemma A.1. Assume Assumptions A.1-A.2. Then, under the sequence of distributions characterized by the density
$$k(y|x, \xi; \kappa) = \int f(y|x, \xi + \kappa w)\, g_*(w|x)\, dw, \qquad \kappa = n^{-1/4},$$
the test statistic T_n(σ²) in (10) converges to a noncentral χ² distribution with the noncentrality parameter equal to φ(σ²(·), σ*²(·)).

We denote P_κ, E_κ, and V_κ as the probability measure, the expectation operator, and the variance operator, respectively, defined by the density k(y|x, ξ; κ) p(x). We first derive the limits of the mean and the variance of the summand of (8) under P_κ.

Part (i): We apply the second-order Taylor approximation to k(y|x, ξ; κ) − f(y|x, ξ) as a function of κ around κ = 0, where the intermediate value κ̄ lies between 0 and κ. Under Assumptions A.3 and A.2, the remainder term is uniformly dominated. In the resulting expansion of the mean, the first equality holds by Assumption A.1, the second equality holds since ∫ f_γγ(y|x, ξ) dy = 0 under Assumption A.3, and the third equality holds by the second-order Taylor approximation under Assumption A.3. Notice that by (A.3) and (A.4), the integrand is bounded by an integrable function, for some finite constant M(ξ). Then, by the dominated convergence theorem, the mean converges to the limit stated above.

Part (ii): As for the variance in (ii), we can also use the dominated convergence theorem under Assumption A.3 to obtain the stated limit.

Proof of Lemma A.1. We derive the limiting distribution under the sequence of local alternatives. For this, we let U_i denote the standardized summand of (8) and consider the triangular sequence {U_i : 1 ≤ i ≤ n} and its underlying probability measure sequence P_{n^{-1/4}}. Since {U_i}_{i=1,...,n} ∼ iid(0, 1) under P_{n^{-1/4}}, the Lindeberg-Feller central limit theorem (CLT) for triangular arrays applies. By Parts (i) and (ii) and (A.6), under P_{n^{-1/4}} the standardized sample average of (8) converges to a normal limit with the mean and variance given above, as required for Lemma A.1.
We consider two cases separately: (i) when X_i is discrete and (ii) when X_i is continuous.

B.1 Score
Now suppose that ξ, w ∈ R^K, while η is a scalar. We interpret ξ + √η w as a random deviation from ξ in the random direction w, with the magnitude of the deviation controlled by √η.