Selection of Loss Function in Covariance Structure Analysis: Case of the Spherical Model

ABSTRACT In this paper, we derive the asymptotic properties of estimators obtained from various kinds of loss functions in covariance structure analysis. We first show that the estimators, except those based on OLS-type loss functions, have the same asymptotic distribution when the dimension of the covariance matrix, p, is fixed and the sample size tends to infinity. Then, focusing on the spherical model, we show that this equivalence does not hold when both n and p become large. Specifically, we show that some estimators lose consistency, and even the consistent estimators have different asymptotic variances. Among the estimators considered, the maximum likelihood estimator shows the best performance, while the less famous invGLS(ub) estimator performs better than the commonly used GLS estimator. We also demonstrate the validity of the likelihood ratio test for the spherical and diagonal models in a high-dimensional framework.


Introduction
Many disciplines use covariance structure analysis or structural equation modeling (SEM), including psychometrics, economics, sociology, and marketing. For covariance structure analysis using real-world data, the maximum likelihood (ML) estimator is the most widely used technique. This may be partly because most software programs used for SEM choose the ML estimator as a default, although several alternative estimators derived from different loss functions have been proposed in the existing literature. However, a systematic comparison between estimators has not yet been conducted. Therefore, in this study, we derive the asymptotic properties of estimators based on several forms of loss functions under the assumption of normality. Specifically, we consider two asymptotic frameworks. First, since the dimension of the covariance matrix, denoted p, is small and the sample size, denoted n, is large in typical covariance structure analysis, we consider the case where n tends to infinity while p is fixed. However, researchers have recently focused on cases in which both p and n are large. Therefore, as the second asymptotic framework, we consider the case in which both n and p tend to infinity with $p/n \to c$, where c is a constant with $0 < c < 1$. Typical examples with large p and n are factor models with many manifest variables and dynamic panel data models with a large time-series dimension. For factor models, Bai and Li (2012) derive the asymptotic properties of the ML estimator for exploratory factor models when both the number of manifest variables (p) and the sample size (n) are large. Deng et al. (2018) survey the recent literature on SEM with many variables; see also, for example, Tian and Yuan (2019) and Yuan et al. (2019) for recent studies on SEM with many variables. For dynamic panel data models, Bai (2013a, 2013b) derives the asymptotic properties of the ML estimator when the cross-section and time-series dimensions become large. In this paper, we first demonstrate that nine out of the twelve estimators considered have the same asymptotic distribution when p is fixed and n is large. However, when both n and p are large, by considering the spherical model, we show that only five estimators are consistent, while the remaining estimators are inconsistent. Moreover, the five consistent estimators have different asymptotic distributions, and the ML estimator is the most efficient. We also find that the invGLS(ub) estimator, which has received little attention in the literature and will be defined in Section 2, has better bias and efficiency properties than the more famous GLS estimator. This finding suggests that the invGLS(ub) estimator deserves more attention in the literature.
We also study the specification test based on the likelihood ratio (LR) test. The SEM literature has repeatedly reported that the LR test suffers from a serious over-rejection problem unless p is much smaller than n (see Bentler & Yuan, 1999; Boomsma, 1982; Fouladi, 2000; Herzog et al., 2007; Hu et al., 1992; Moshagen, 2012; Nevitt & Hancock, 2004). Recently, Hayakawa (2019) has employed a Monte Carlo simulation to demonstrate that using a test proposed by Browne (1974), denoted $T_{RLS}$, can address this over-rejection problem. However, he has not provided a theoretical justification under the large p and large n asymptotic framework. Therefore, in this study, we consider the spherical and diagonal models and show that $T_{RLS}$ is identical to some tests proposed in the literature. Specifically, for the spherical model, $T_{RLS}$ is identical to the sphericity test proposed by John (1971, 1972) and Ledoit and Wolf (2002), and for the diagonal model, $T_{RLS}$ is identical to the independence test proposed by Schott (2005). Hence, for these two particular models, we establish the theoretical validity of $T_{RLS}$ under a high-dimensional framework. Further, we conduct a Monte Carlo simulation to confirm the theoretical implications.
The remainder of this paper is organized as follows. Section 2 introduces various types of loss functions. Sections 3 and 4 derive the asymptotic properties of the estimators introduced in Section 2 when p is fixed and n is large, and when both p and n are large, respectively. Section 5 studies the likelihood ratio test. In Section 6, we conduct a Monte Carlo simulation. Section 7 summarizes the main conclusions.

Loss functions
Suppose that the $p \times 1$ vector $y_i = (y_{1i}, \ldots, y_{pi})'$ follows $y_i \sim \text{i.i.d. } N(\mu, \Sigma(\theta_0))$, $(i = 1, \ldots, N)$, where $\Sigma(\theta)$ is a $p \times p$ covariance matrix characterized by the $q \times 1$ unknown parameter $\theta$, and $\theta_0$ denotes the true value of $\theta$. Let $S_n$ be the sample covariance matrix of $y_i$, defined as $S_n = n^{-1}\sum_{i=1}^{N}(y_i - \bar{y})(y_i - \bar{y})'$, where $n = N - 1$ and $\bar{y} = N^{-1}\sum_{i=1}^{N} y_i$. Throughout this study, we assume $p < n$ to ensure the existence of $S_n^{-1}$. Additionally, we define $s_n = \mathrm{vec}(S_n)$ and $\sigma(\theta) = \mathrm{vec}[\Sigma(\theta)]$. Note that we use the vec operator rather than the vech operator in vectorization to avoid using many duplication matrices.
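For concreteness, the following minimal Python sketch (the function name is ours) shows how $S_n$ and $s_n$ are formed from an $N \times p$ data matrix:

```python
import numpy as np

def sample_cov_and_vec(Y):
    """Return S_n (divisor n = N - 1) and s_n = vec(S_n) for an N x p matrix Y."""
    N, p = Y.shape
    n = N - 1
    Yc = Y - Y.mean(axis=0)           # subtract y-bar from each observation
    S_n = (Yc.T @ Yc) / n             # sample covariance with divisor n = N - 1
    s_n = S_n.reshape(-1, order="F")  # vec: stack the columns of S_n
    return S_n, s_n
```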
We now define the loss functions. The most popular loss function is given by

$$L_{ML}(\theta) = \frac{1}{p}\left\{\mathrm{tr}\left[S_n\Sigma(\theta)^{-1}\right] - \log\left|S_n\Sigma(\theta)^{-1}\right| - p\right\}, \qquad (1)$$

which is often called the Wishart likelihood function (Browne, 1974). (Note that the original function is not divided by p; we adopt this normalization because we consider the case in which both n and p are large.) The ML estimator is obtained as the minimizer of $L_{ML}(\theta)$ and has been used extensively in many fields. For instance, when estimating factor models in psychometrics, the ML estimator is the default estimator in many software packages (e.g., EQS, LISREL, AMOS, and lavaan in R). From a theoretical viewpoint, several studies employ $L_{ML}(\theta)$ as a loss function when deriving the asymptotic properties of estimators for the cases in which both n and p are large; for instance, Bai (2013a, 2013b) and Bai and Li (2012) do so. $L_{ML}(\theta)$ is also called the (normalized) Stein loss function (Ledoit & Wolf, 2018; Stein, 1975, 1986). Stein's loss function has the feature that it is proportional to the Kullback-Leibler divergence between $N(0, S_n)$ and $N(0, \Sigma(\theta))$.
Although $L_{ML}(\theta)$ is the most widely used, there are several alternative loss functions. Perhaps the second most popular is the generalized least squares (GLS) loss function,

$$L_{GLS}(\theta) = \frac{1}{2p}\,\mathrm{tr}\left\{\left[\left(S_n - \Sigma(\theta)\right)S_n^{-1}\right]^2\right\}, \qquad (2)$$

which Browne (1974), for example, studies. The following loss function uses an equal weight for all components of $s_n$ and is defined as

$$L_{OLS}(\theta) = \frac{1}{p}\,\mathrm{tr}\left[\left(S_n - \Sigma(\theta)\right)^2\right]. \qquad (3)$$

Ledoit and Wolf (2012, 2015) study the loss function $L_{OLS}(\theta)$, which is often referred to as the Frobenius loss function.
Further, we consider loss functions in which we replace $S_n$ and $\Sigma(\theta)$ in the above with their inverses:

$$L_{invML}(\theta) = \frac{1}{p}\left\{\mathrm{tr}\left[S_n^{-1}\Sigma(\theta)\right] - \log\left|S_n^{-1}\Sigma(\theta)\right| - p\right\}, \qquad (4)$$
$$L_{invGLS}(\theta) = \frac{1}{2p}\,\mathrm{tr}\left\{\left[\left(S_n - \Sigma(\theta)\right)\Sigma(\theta)^{-1}\right]^2\right\}, \qquad (5)$$
$$L_{invOLS}(\theta) = \frac{1}{p}\,\mathrm{tr}\left[\left(S_n^{-1} - \Sigma(\theta)^{-1}\right)^2\right], \qquad (6)$$

where, for $L_{invGLS}(\theta)$, we use $S_n^{-1} - \Sigma(\theta)^{-1} = -\Sigma(\theta)^{-1}\left(S_n - \Sigma(\theta)\right)S_n^{-1}$. Note that Tsukuma (2005, eq. (1.3)) considers a function that is similar to $L_{invML}(\theta)$. Additionally, $L_{invML}(\theta)$ can be interpreted as Stein's loss applied to the precision matrix $\Sigma(\theta)^{-1}$.
Although $L_{invGLS}(\theta)$ is rarely used in the SEM literature, it is closely related to the econometrics literature. Specifically, the loss function $L_{invGLS}(\theta)$ can be seen as the objective function of the continuously updating generalized method of moments (GMM) estimator (Hansen et al., 1996), which is one of the most famous estimators in econometrics. To see this, let $g_i(\theta) = \mathrm{vec}\left[(y_i - \mu)(y_i - \mu)'\right] - \sigma(\theta)$. Then, we have the moment condition $E[g_i(\theta_0)] = 0$. Since the variance of $g_i(\theta)$ is given by $(I_{p^2} + K_{p,p})\left(\Sigma(\theta) \otimes \Sigma(\theta)\right)$ under normality, where $K_{p,p}$ is the commutation matrix such that $K_{p,p}\,\mathrm{vec}(A) = \mathrm{vec}(A')$, the continuously updating GMM estimator is the minimizer of $L_{invGLS}(\theta)$, as in (5).
In addition, $L_{ML}(\theta)$ and $L_{invGLS}(\theta)$ are linked by an exact relationship given in Browne (1974), in which $L_{ML}(\theta)$ equals $L_{invGLS}(\theta)$ plus a remainder term; this relationship plays an important role in investigating the behavior of the LR test (Hayakawa, 2019). Ledoit and Wolf (2012, 2018) investigate the loss function $L_{invOLS}(\theta)$, which is referred to as the inverse Frobenius loss function. It measures the distance between $S_n$ and $\Sigma(\theta)$ in terms of the precision matrix.
Another loss function is the symmetrized one,

$$L_{SYM}(\theta) = \frac{1}{2p}\left\{\mathrm{tr}\left[S_n\Sigma(\theta)^{-1}\right] + \mathrm{tr}\left[S_n^{-1}\Sigma(\theta)\right] - 2p\right\}. \qquad (7)$$

Moakher and Batchelor (2006) and Ledoit and Wolf (2018) study this loss function. Note that $L_{SYM}(\theta)$ is equal to the Jeffreys (1946) divergence between the multivariate normal distribution with zero mean and covariance matrix $\Sigma(\theta)$ and the multivariate normal distribution with zero mean and covariance matrix $S_n$, after rescaling by p.
In the literature on high-dimensional frameworks, it is well known that the inverse of the sample covariance matrix, $S_n^{-1}$, is a biased estimator of $\Sigma_0^{-1}$, where $\Sigma_0 = \Sigma(\theta_0)$. Specifically, under normality, it follows that $E(S_n^{-1}) = \frac{n}{n-p-1}\Sigma_0^{-1}$. Hence, it is easily conjectured that loss functions that involve $S_n^{-1}$ result in undesirable properties. However, since we know the form of the bias under normality, we can correct for it to obtain unbiased versions of these loss functions (denoted with the suffix "ub").
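The bias factor of $S_n^{-1}$ is easy to verify numerically. A small Monte Carlo sketch (sizes and seed are our illustrative choices):

```python
import numpy as np

# Check E[S_n^{-1}] = n/(n-p-1) * Sigma_0^{-1} under normality (illustrative).
rng = np.random.default_rng(0)
N, p, reps = 50, 10, 2000
n = N - 1
avg_inv = np.zeros((p, p))
for _ in range(reps):
    Y = rng.standard_normal((N, p))          # Sigma_0 = I_p
    Yc = Y - Y.mean(axis=0)
    S = (Yc.T @ Yc) / n
    avg_inv += np.linalg.inv(S) / reps
print(avg_inv[0, 0], n / (n - p - 1))        # both approx 49/38 = 1.29
```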
Let $\hat{\theta}_j$ be the minimizer of each loss function, where j represents the type of loss function, that is, j = ML, GLS, and so on. In the following, we denote $\hat{\theta}_{ML}$ as ML, $\hat{\theta}_{invGLSub}$ as invGLSub, and so on.
As an example, let us consider the spherical model given by $\Sigma(\theta) = \theta I_p$, where $0 < \theta < \infty$. For this model, the estimators that minimize the loss functions are available in closed form; for the derivations, see the appendix.
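Since the closed forms are convenient for computation, here is a minimal Python sketch of the minimizers that follow directly from the loss functions above (the function name is ours; the invMLub and SYMub lines assume the unbiased losses simply replace $S_n^{-1}$ by $\frac{n-p-1}{n}S_n^{-1}$, and the GLSub and invGLSub corrections, which involve further moment adjustments, are left to the paper's appendix):

```python
import numpy as np

def spherical_estimators(S, n):
    """Closed-form minimizers of theta for Sigma(theta) = theta * I_p (sketch)."""
    p = S.shape[0]
    S_inv = np.linalg.inv(S)
    tr_S, tr_S2 = np.trace(S), np.trace(S @ S)
    tr_Si, tr_Si2 = np.trace(S_inv), np.trace(S_inv @ S_inv)
    k = (n - p - 1) / n  # debiasing factor for S^{-1} under normality
    return {
        "ML(OLS)": tr_S / p,                 # ML and OLS coincide here
        "GLS": tr_Si / tr_Si2,
        "invML(invOLS)": p / tr_Si,          # invML and invOLS coincide
        "invGLS": tr_S2 / tr_S,
        "SYM": np.sqrt(tr_S / tr_Si),
        # assumed unbiased-loss variants: S^{-1} replaced by k * S^{-1}
        "invMLub(invOLSub)": p / (k * tr_Si),
        "SYMub": np.sqrt(tr_S / (k * tr_Si)),
    }
```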

Asymptotic properties under the fixed p and large n framework

In this section, we derive the asymptotic distribution of the estimators introduced in the previous section, based on the following assumptions (Browne, 1974):

Assumption 1. (i) All elements of $\Sigma(\theta)$ and all partial derivatives of the first three orders with respect to the elements of $\theta$ are continuous and bounded in a neighborhood of $\theta = \theta_0$.
To derive the asymptotic properties of the estimators with fixed p and large n, we consider a unified loss function $L(A_n(\theta), B_n(\theta), c_n(\theta))$, given in (14), where $A_n(\theta)$ and $B_n(\theta)$ are symmetric and positive definite. All of the loss functions above are special cases of $L(A_n(\theta), B_n(\theta), c_n(\theta))$ for suitable choices of $A_n(\theta)$, $B_n(\theta)$, and $c_n(\theta)$. Additionally, since $L_E(\theta) = L_{Eub}(\theta) + o(1)$ for E = GLS, invML, invGLS, invOLS with large n and fixed p, the GLSub, invMLub, invGLSub, and invOLSub estimators have the same asymptotic distributions as GLS, invML, invGLS, and invOLS, respectively.
Remark 1. This result indicates that all estimators except the OLS-related estimators have the same asymptotic distribution. This result seems natural because both the sample covariance matrix $S_n$ and the estimated covariance matrix $\Sigma(\hat{\theta}_j)$ are consistent for $\Sigma(\theta_0)$, and $S_n^{-1}$ and $\Sigma(\hat{\theta}_j)^{-1}$ are consistent for $\Sigma(\theta_0)^{-1}$. Hence, when p is small and n is large, the choice of the loss function does not matter, at least asymptotically, as long as the OLS-related estimators are not used.
Remark 2. This result is not new. Swain (1975) proposes a class of estimators, defined in terms of the eigenvalues of $S_n^{-1/2}\Sigma(\theta)S_n^{-1/2}$, that includes the above estimators except the OLS-related ones, and demonstrates that all members of the class have the same asymptotic distribution as the ML estimator. However, we use the unified loss function approach above because all the loss functions, including the OLS-related ones, are special cases of (14).
The following theorem is an immediate result of Theorem 1, which will be useful in the next section.
Theorem 2 states that, for the spherical model, $\sqrt{n}(\hat{\theta}_j - \theta_0)$ has the same limiting normal distribution for j = ML, GLS, OLS, invML, invGLS, invOLS, SYM, GLSub, invMLub, invGLSub, invOLSub, and SYMub.

Remark 3. In this simple example, all estimators, including OLS, have the same asymptotic distribution. However, as we show below, this equivalence no longer holds in the high-dimensional case where both n and p become large, and each estimator has distinct asymptotic properties.

Asymptotic properties under the high-dimensional framework
In this section, we investigate the asymptotic properties of the estimators under the large p and large n framework. Some studies have derived the high-dimensional asymptotic properties of the ML estimator for certain models. Specifically, Bai and Li (2012) and Bai (2013a, 2013b) derive the asymptotic properties of the ML estimator for exploratory factor models and dynamic panel data models, respectively. However, they consider only the loss function of the ML estimator given by (1). In addition, for these specific models, the theoretical analysis is quite complicated, making it intractable for a general form of $\Sigma(\theta)$. Since the main purpose of this study is to investigate the consequences of the selection of the loss function, it is important to derive the asymptotic properties of all the estimators introduced in Section 2. Therefore, we focus on the spherical model given by $\Sigma(\theta) = \theta I_p$. Even in this simple example, the derivation of the asymptotic properties of all estimators is not trivial, as shown in the appendix, and notable differences in asymptotic properties emerge. Many statistical studies have adopted the large n and large p asymptotic framework; for the spherical or diagonal models, for instance, Srivastava (2005) and Srivastava et al. (2014) use the large n and large p framework. However, our interest differs from that of these studies: whereas those papers are mainly interested in testing for sphericity or diagonality of a covariance matrix, our focus is on the estimation of parameters.
Theorems 3 and 4 present the asymptotic properties of the estimators when both n and p are large. Note that, for the spherical model, the ML and OLS estimators coincide, as do invML and invOLS, and invMLub and invOLSub; we therefore denote these three estimators as "ML(OLS)," "invML(invOLS)," and "invMLub(invOLSub)," respectively.
Next, we derive the asymptotic distribution of consistent estimators.
Remark 4. From Theorem 3, we find that some estimators are inconsistent under the high-dimensional framework. An implication of this result is that the GLS, invML, invGLS, and SYM estimators will be biased unless p is sufficiently small relative to n. Theorem 4 demonstrates that even the consistent estimators have different asymptotic variances, with $\hat{\theta}_{ML}$ being the most efficient. These results should be contrasted with the results under the fixed p and large n framework, where all the estimators are consistent and have the same asymptotic variance.
Remark 5. Note that $\hat{\theta}_{ML}$ can be consistent even with fixed n, as long as p is large. Hence, in this sense, more assumptions are made about $\hat{\theta}_{ML}$ than are necessary. In addition, note that the assumption $0 < c < 1$ can be relaxed to $0 < c < \infty$ for the ML and invGLSub estimators.

Remark 6. To investigate the differences between the estimators, we compute their theoretical limiting values and asymptotic variances with $\theta_0 = 2$ and $c = 0.025, \ldots, 0.975$, as shown in Figure 1. (Since the ML, invGLS, and invGLSub estimators do not use the inverse of the sample covariance matrix $S_n$, the condition $0 < c < 1$ is not necessary for them, and the extended region $c \geq 1$ could also be considered; however, as there are no large differences when c is extended to $c \geq 1$, we consider only $0 < c < 1$.) Figure 1 shows that the asymptotic biases of the inconsistent estimators become larger as c increases. Among the inconsistent estimators, we find that SYM has the smallest bias, GLS has the largest bias, and invGLS and invML lie between them. Hence, in terms of bias, SYM shows the best performance among the inconsistent estimators, followed by invGLS and invML. With regard to efficiency, we find that the asymptotic variances of ML and invGLSub decrease monotonically as c increases, with the difference in magnitude between them being marginal. However, the asymptotic variances of GLSub, invMLub, and SYMub do not decrease monotonically with c: although they decrease as c increases from 0, they conversely increase once c exceeds a certain value. Given this nonmonotonic behavior, GLSub, invMLub, and SYMub are not preferable. Hence, considering both bias and efficiency, we conclude that ML has the best performance, followed by invGLS(ub), whereas the well-known GLS estimator is not recommended because it has the worst performance in terms of both bias and efficiency.
Remark 7. Using Monte Carlo simulations, many studies have demonstrated that although the ML estimator performs well, the GLS estimator performs poorly when p is not sufficiently small compared to n, despite the two having the same fixed-p asymptotic distribution (see the next section). Although the conventional asymptotic results with large n and fixed p cannot explain this theoretically, the large n and large p asymptotic results can: the ML estimator remains consistent as p increases, whereas the GLS estimator becomes more biased, and is inconsistent, as p/n increases.
Remark 8. In the SEM literature, the GLS estimator has received much more attention than the invGLS estimator, although the difference between them lies simply in the weighting matrix, that is, whether $S_n^{-1}$ or $\Sigma(\theta)^{-1}$ is used. However, Theorems 3 and 4 demonstrate that the less famous invGLS estimator performs better than the more famous GLS estimator: the bias of the invGLS estimator is smaller than that of the GLS estimator (although both are inconsistent), and the invGLSub estimator is more efficient than the GLSub estimator. This suggests that the less known invGLS estimator should receive more attention in the SEM literature.

Remark 9. For the derivation of these results, we assume normality. However, since Bai (2013a, 2013b) and Bai and Li (2012) show consistency and asymptotic normality without the normality assumption, we could relax the normality assumption at the cost of considerably greater complexity in the proofs; we do not pursue this in this study.
Remark 10. The notable result associated with Theorems 3 and 4 is that although the choice of the loss function is inconsequential when p is fixed, this is not the case when p is large. Among the estimators, the ML estimator is found to be the best because it is consistent for both fixed and large p cases. We also find that the invGLS(ub), which is less known than the GLS estimator in the literature, comes next. This has an important implication for practice, that is, the invGLS estimator deserves more attention than the GLS estimator and could be a useful alternative to the ML estimator.

Likelihood ratio test
In covariance structure analysis, it is common practice to conduct a goodness-of-fit test, or LR test, to confirm that the covariance matrix under consideration is identical to the population covariance matrix. The most popular LR test statistic is given by

$$T_{ML} = np\,L_{ML}(\hat{\theta}_{ML}) = n\left\{\mathrm{tr}\left[S_n\Sigma(\hat{\theta}_{ML})^{-1}\right] - \log\left|S_n\Sigma(\hat{\theta}_{ML})^{-1}\right| - p\right\}.$$

However, studies have repeatedly shown that $T_{ML}$ suffers from the over-rejection problem unless p is sufficiently small compared to n. Recently, Hayakawa (2019) has demonstrated that one of the tests considered by Browne (1974) can address the over-rejection problem. This alternative test statistic is given by

$$T_{RLS} = np\,L_{invGLS}(\hat{\theta}_{ML}) = \frac{n}{2}\,\mathrm{tr}\left\{\left[\left(S_n - \Sigma(\hat{\theta}_{ML})\right)\Sigma(\hat{\theta}_{ML})^{-1}\right]^2\right\}.$$

Note that the ML estimator, rather than $\hat{\theta}_{invGLS}$, is used to compute the test statistic. Browne (1974) shows that both $T_{ML}$ and $T_{RLS}$ asymptotically follow the chi-square distribution with $df = p(p+1)/2 - q$ degrees of freedom under normality and correct specification when p is fixed and n is large. Moreover, Browne (1974) demonstrates that $T_{ML}$ and $T_{RLS}$ are related through the expansion linking $L_{ML}(\theta)$ and $L_{invGLS}(\theta)$ noted in Section 2. Hayakawa (2019) uses a Monte Carlo simulation to show that $T_{RLS}$ can address the over-rejection problem of $T_{ML}$, even when p is comparable to n, in the context of confirmatory factor, dynamic panel data, spherical, and diagonal models. However, he does not provide a theoretical justification for this result. In this study, we demonstrate that $T_{RLS}$ is valid even under large p and large n in the context of (i) the spherical model and (ii) the diagonal (independence) model.
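Under the standard forms of the two statistics given above, both are straightforward to compute. A sketch (the function name is ours):

```python
import numpy as np

def lr_statistics(S, Sigma_hat, n):
    """T_ML and Browne's T_RLS from S_n and Sigma(theta_hat_ML) (sketch)."""
    p = S.shape[0]
    M = S @ np.linalg.inv(Sigma_hat)
    _, logdet = np.linalg.slogdet(M)          # log|S * Sigma_hat^{-1}|
    T_ML = n * (np.trace(M) - logdet - p)
    R = (S - Sigma_hat) @ np.linalg.inv(Sigma_hat)
    T_RLS = 0.5 * n * np.trace(R @ R)
    return T_ML, T_RLS
```

In the fixed-p framework, both statistics would then be referred to the $\chi^2_{p(p+1)/2-q}$ distribution.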
(i) Spherical model

For the spherical model given by $\Sigma(\theta) = \theta I_p$, substituting $\hat{\theta}_{ML} = \mathrm{tr}(S_n)/p$ into $T_{RLS}$ gives $T_{RLS} = (np/2)\,U_{np}$, where

$$U_{np} = \frac{1}{p}\,\mathrm{tr}\left[\left(\frac{S_n}{\mathrm{tr}(S_n)/p} - I_p\right)^2\right].$$

It is interesting to note that $U_{np}$ is exactly the test statistic proposed by John (1971, 1972) and Ledoit and Wolf (2002); thus, we can consider $U_{np}$ as a special case of $T_{RLS}$. John (1972) shows that, under the null hypothesis, as $n \to \infty$ with fixed p,

$$\frac{np}{2}\,U_{np} \xrightarrow{d} \chi^2_{p(p+1)/2-1}.$$

Ledoit and Wolf (2002) show that, under the null hypothesis, as both n and p tend to infinity with $p/n \to c$ $(0 < c < \infty)$,

$$n\,U_{np} - p \xrightarrow{d} N(1, 4).$$

This result indicates that testing for sphericity via $U_{np}$ is valid for both fixed and large values of p, and it justifies the use of $T_{RLS}$ even when p is large.
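A sketch of the sphericity test based on $U_{np}$ with the Ledoit-Wolf (2002) normal limit (the function name and one-sided rejection rule are ours):

```python
import numpy as np
from scipy.stats import norm

def lw_sphericity_test(S, n):
    """John's U statistic; under H0, n*U - p is approximately N(1, 4)."""
    p = S.shape[0]
    A = S / (np.trace(S) / p)                # scale out theta_hat = tr(S)/p
    D = A - np.eye(p)
    U = np.trace(D @ D) / p
    z = (n * U - p - 1.0) / 2.0              # standardize against N(1, 4)
    return U, z, 1.0 - norm.cdf(z)           # large U rejects sphericity
```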
(ii) Diagonal model

As the second example, we consider the case where $\Sigma(\theta) = \mathrm{diag}(\theta_1, \ldots, \theta_p)$ with $0 < \theta_{j0} < \infty$ $(j = 1, \ldots, p)$. Since we assume normality, the diagonality of $\Sigma(\theta)$ implies that the elements of $y_i$ are mutually independent. Hence, this model is often called the independence model in the psychometrics literature and is used as the baseline model to compute goodness-of-fit indices such as the comparative fit index (CFI).
The ML estimator for the diagonal model is $\hat{\theta}_{ML} = (s_{n,11}, \ldots, s_{n,pp})'$, where $s_{n,jk}$ denotes the $(j,k)$ element of $S_n$.
Hence, it follows that $\Sigma(\hat{\theta}_{ML}) = \hat{\Sigma} = \mathrm{diag}(s_{n,11}, \ldots, s_{n,pp})$, and thus $T_{RLS}$ can be written in terms of $R_n = \hat{\Sigma}^{-1/2} S_n \hat{\Sigma}^{-1/2}$, the correlation coefficient matrix associated with $S_n$, where $r_{n,jk}$ is the $(j,k)$ element of $R_n$. We then obtain

$$T_{RLS} = \frac{n}{2}\,\mathrm{tr}\left[\left(R_n - I_p\right)^2\right] = n \sum_{j<k} r_{n,jk}^2,$$

which coincides (up to the factor n) with the independence test statistic proposed by Schott (2005).

The over-rejection problem of T_ML

Next, we investigate why $T_{ML}$ suffers from the over-rejection problem. Using the relationship between $L_{ML}(\theta)$ and $L_{invGLS}(\theta)$, we can write $T_{ML}$ as $T_{RLS}$ plus a remainder term B. Thus, if p is fixed and n is large, B tends to zero, and the two statistics are asymptotically equivalent. For the case with large p and large n, we consider an alternative expression in which the statistics are normalized by p. From (22) and (23), we have $t_{RLS} \xrightarrow{d} N(0, 1)$ as both p and n increase with $p/n \to c$ $(0 < c < 1)$. However, we can easily see that the second term, $\frac{1}{p}B$, diverges under $p/n \to c$ $(0 < c < 1)$. This result is consistent with the findings from simulation studies showing that $T_{ML}$ suffers from the over-rejection problem unless p is much smaller than n. In contrast to $T_{ML}$, as shown above, $T_{RLS}$ for the spherical and diagonal models remains valid regardless of whether p is fixed or diverges, as long as $p/n \to c$ $(0 < c < 1)$. Hence, although the models considered are somewhat limited, this result provides a theoretical justification for using $T_{RLS}$ rather than $T_{ML}$, especially when p is large.
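A sketch of the correlation-based form of $T_{RLS}$ for the diagonal model derived above (the function name is ours):

```python
import numpy as np

def t_rls_diagonal(S, n):
    """T_RLS for the diagonal model: n * sum_{j<k} r_{n,jk}^2 (sketch)."""
    p = S.shape[0]
    d = np.sqrt(np.diag(S))
    R = S / np.outer(d, d)                   # correlation matrix R_n
    off = R[np.triu_indices(p, k=1)]         # r_{n,jk} for j < k
    return n * np.sum(off ** 2)
```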

Monte Carlo simulation
In this section, we investigate the finite-sample behavior of the estimators defined in Section 2. Specifically, we consider three covariance structure models. The first is the spherical model given by $\Sigma(\theta) = \theta I_p$; this is the case investigated in Sections 3, 4, and 5. The second is the diagonal model given by $\Sigma(\theta) = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_p)$, described in Section 5. The third is the first-order autoregressive (AR(1)) model given by $\Sigma(\theta) = \{\sigma^2\rho^{|i-j|}\}$. Although the large (n, p) asymptotic properties of the estimators for the last two models have not been derived, their behavior in such models deserves attention. We also investigate the effects of non-normality. The three structures can be constructed as shown in the sketch below.
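A minimal sketch of the three covariance structures (function names are ours):

```python
import numpy as np

def sigma_spherical(theta, p):
    return theta * np.eye(p)

def sigma_diagonal(theta_vec):
    return np.diag(theta_vec)

def sigma_ar1(sigma2, rho, p):
    idx = np.arange(p)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])
```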

Spherical model
We consider the model $\Sigma(\theta_0) = \theta_0 I_p$ with $\theta_0 = 2$. For the data distribution, we consider the normal and $\chi^2$ distributions. Specifically, the data are generated as

$$y_i = \Sigma(\theta_0)^{1/2}\zeta_i, \qquad (24)$$

where $\zeta_i$ is a $p \times 1$ random vector that determines the distributional properties of $y_i$. Following Yuan and Bentler (1997) and Yanagihara (2007), we generate $\zeta_i$ with zero mean and identity covariance matrix under both distributions; $\kappa_4$ denotes the multivariate kurtosis of Mardia (1970).
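A sketch of the data generation in (24) (the chi-square degrees of freedom below are our illustrative choice; the paper's exact scheme follows Yuan and Bentler (1997) and Yanagihara (2007)):

```python
import numpy as np

def generate_spherical_data(N, p, theta0=2.0, dist="normal", seed=0):
    """Draw y_i = sqrt(theta0) * zeta_i with E[zeta_i] = 0, Var[zeta_i] = I_p."""
    rng = np.random.default_rng(seed)
    if dist == "normal":
        zeta = rng.standard_normal((N, p))
    else:                                    # standardized chi-square draws
        df = 3                               # df is an illustrative choice
        zeta = (rng.chisquare(df, (N, p)) - df) / np.sqrt(2 * df)
    return np.sqrt(theta0) * zeta
```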
For the sample size, we consider $n \in \{100, 500\}$, and for the dimension p, we fix the ratio $c = p/n$ with $c \in \{0.02, 0.1, 0.3, 0.5, 0.7, 0.9\}$. We compute the means, standard deviations, and empirical rejection frequencies (in %) of the test of $H_0: \theta = \theta_0$ at the 5% significance level for ML, GLS, invML, invGLS, SYM, GLSub, invMLub, invGLSub, and SYMub, using 5,000 replications. Tables 1 and 2 present the results. For the empirical rejection frequencies, we consider two cases: one based on Theorem 2 and the other based on Theorem 4.

In Theorem 3, we showed that ML, GLSub, invMLub, invGLSub, and SYMub are consistent even when both n and p tend to infinity, while the remaining estimators are inconsistent. The simulation results in Table 1 confirm this finding: ML is almost unbiased for any value of c, and the other consistent estimators also have little bias. The inconsistent estimators, as Theorem 3 predicts, tend to be biased as c approaches one (see also Figure 1). Regarding dispersion, we find that the standard deviations of ML, GLS, invML(invOLS), invGLS, and invGLSub decrease monotonically as c increases. However, as shown in Figure 1, the standard deviations of SYM, GLSub, invMLub, and SYMub do not decrease monotonically as c increases. This result is consistent with Theorem 4. In terms of inference, only ML has the correct empirical rejection frequency for every value of c; this is because the asymptotic variances of the ML estimator under fixed p and under large p are essentially the same. For the consistent estimators GLSub, invMLub, invGLSub, and SYMub, although the empirical rejection frequencies are close to the nominal level when c is moderate (e.g., $c \leq 0.5$), they tend to exhibit over-rejection as c approaches one, except for invGLSub. Summarizing the simulation results, ML shows the best performance in terms of bias, efficiency, and accuracy of inference, followed by invGLSub; the remaining estimators have problems, especially when c is close to one.

Next, we consider the effect of non-normality. From Table 2, ML shows the best performance for all combinations of p and n. However, the picture changes markedly for GLSub, invMLub, invGLSub, and SYMub: although these estimators have little bias under normality (Table 1), they tend to become more biased as c increases under non-normality. This is because the unbiased loss functions are derived under the assumption of normality.

Diagonal model
As the second model, we consider the diagonal model given by $\Sigma(\theta_0) = \mathrm{diag}(\theta_{10}, \theta_{20}, \ldots, \theta_{p0})$. We set the parameter $\theta_0 = (\theta_{10}, \theta_{20}, \ldots, \theta_{p0})'$ with $\theta_{j0} = j$. The data are generated as shown in (24). For the sample size, we consider $n = 100$, and for the dimension, we consider $p \in \{2, 10, 30, 50, 70, 90\}$. The rest of the setup is identical to that of the spherical model. The simulation results are presented in Table 3, with findings similar to those of the spherical case. Among the estimators, ML has the smallest bias in almost all cases for both distributions. Although GLSub, invMLub, invGLSub, and SYMub, which are derived from unbiased loss functions, have relatively small biases under normality, this is not the case under non-normality. Thus, in terms of bias and robustness to distributions, ML is the most preferable estimator.

AR(1) model
As the third model, we consider the AR(1) model given by $\Sigma(\theta) = \{\sigma^2\rho^{|i-j|}\}$ with $\theta = (\sigma^2, \rho)'$. The remaining setup is identical to that of the diagonal model. The simulation results are listed in Table 4. In contrast to the previous two models, all the estimators of $\rho$ have little bias for all combinations of n and p. However, with regard to the estimation of $\sigma^2$, whereas ML and the other consistent estimators have little bias for all combinations of n and p under normality, the others are biased. This result implies that not all parameters are affected by a large p. Under non-normality, as in the previous cases, ML has little bias, while GLSub, invMLub, invGLSub, and SYMub tend to become more biased as n and p increase.
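As an aside on computation: for the AR(1) model the minimizers are not available in closed form, so each estimate is obtained numerically. A sketch for the ML loss $L_{ML}(\theta)$ in (1) (function names, starting values, and bounds are ours):

```python
import numpy as np
from scipy.optimize import minimize

def sigma_ar1(sigma2, rho, p):
    idx = np.arange(p)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def ml_fit_ar1(S):
    """Minimize L_ML(theta) over theta = (sigma2, rho) for the AR(1) model."""
    p = S.shape[0]
    def l_ml(params):
        M = S @ np.linalg.inv(sigma_ar1(params[0], params[1], p))
        _, logdet = np.linalg.slogdet(M)
        return (np.trace(M) - logdet - p) / p
    res = minimize(l_ml, x0=np.array([1.0, 0.2]),
                   bounds=[(1e-6, None), (-0.99, 0.99)])
    return res.x                             # (sigma2_hat, rho_hat)
```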

LR tests
Finally, we consider the LR tests. The empirical rejection frequencies of $T_{ML}$ and $T_{RLS}$ (in %) for the three models under normality are listed in Table 5. From the table, we find that $T_{ML}$ has an empirical rejection frequency close to the nominal level only when $(n, p) = (100, 2)$ and $(100, 10)$; in the other cases, $T_{ML}$ suffers from a serious over-rejection problem. In contrast, $T_{RLS}$ has correct empirical rejection frequencies in almost all cases.

Conclusions
In this study, we considered the consequences of loss function choices in covariance structure analysis. We demonstrated that when the dimension of the covariance matrix p is fixed and the sample size n tends to infinity, all estimators considered, except the OLS-related estimators, have the same asymptotic distribution. However, when both n and p tend to infinity, some estimators become inconsistent, and even the consistent estimators have different asymptotic variances. Table 6 summarizes the results derived in this study. From the theoretical results, we find that the ML estimator is consistent and more efficient than the other estimators when both n and p tend to infinity, and that the less known invGLS estimator has the second-best performance. This finding has important implications for practice: although the GLS estimator is much more famous than the invGLS estimator in the SEM literature, these results suggest that the invGLS estimator is preferable to the GLS estimator and should be used more frequently. We also showed that the alternative LR test, $T_{RLS}$, proposed by Browne (1974) and reevaluated by Hayakawa (2019), is valid even when both n and p are large, while the conventional LR test, $T_{ML}$, is valid only when p is much smaller than n.

Disclosure statement
No potential conflict of interest was reported by the author(s).