Multiple data sources fusion for enterprise quality improvement by a multilevel latent response model

Quality improvement of an enterprise needs a model to link multiple data sources, including the independent and interdependent activities of individuals in the enterprise, enterprise infrastructure, climate, and administration strategies, as well as the quality outcomes of the enterprise. This is a challenging problem because the data are at two levels (the individual and enterprise levels) and each individual's contribution to the enterprise quality outcome is usually not explicitly known. These challenges make general regression analysis and conventional multilevel models inapplicable to the problem. This article proposes a new multilevel model that treats each individual's contribution to the enterprise quality outcome as a latent variable. Under this new formulation, an algorithm that integrates the Fisher scoring algorithm and generalized least squares estimation is developed to estimate the model parameters. Extensive simulation studies demonstrate the superiority of the proposed model over the competing approach in terms of the statistical properties in parameter estimation. The proposed model is applied to a real-world application of nursing quality improvement and helps identify key nursing activities and unit (a hospital unit is an enterprise in this context) quality-improving measures that help reduce patient falls.


Introduction
An enterprise consists of individuals. Examples of an enterprise include a company or a company division, such as the technology division and marketing division, consisting of employees; a military team consisting of soldiers on a specific mission; and a hospital or a hospital unit, such as the emergency unit and surgical unit, consisting of medical professionals. Quality improvement for an enterprise is different from quality improvement for an individual in terms of the following aspects.
1. The quality of an enterprise is affected by the independent and interdependent activities of individual employees.
2. The quality of an enterprise is also affected by factors at the enterprise level such as the infrastructure, culture, and administration strategies.
3. Each individual contributes a certain amount to the quality of the enterprise; however, this individual contribution is usually not explicitly known.
For example, consider a hospital unit as an enterprise. Although the quality of a hospital unit has many dimensions, here we focus solely on one dimension, the quality of nursing in patient care, which can be measured in terms of the number of falls and medication errors made by the unit during a certain time period and also patient satisfaction with the care provided by the unit's staff. Because our focus is on the quality of nursing, we consider the individuals associated with the unit to be nurses, although a hospital unit also includes other medical professionals. It is not difficult to see that the aforementioned three aspects in the quality improvement of an enterprise fit well in the context of nursing.
1. The quality of nursing in a unit is affected by the independent nursing activities of individual nurses as well as their interdependent activities in coordinating a patient's care (Lamb et al., 2008).
2. The quality of nursing in a unit is also affected by unit infrastructure, climate, and administration strategies.
3. Each nurse contributes a certain amount to the nursing quality of the unit. However, this contribution is usually not known because delivery of patient care requires team work and it is difficult to explicitly apportion the quality outcome of the unit to each individual nurse.
0740-817X © 2014 "IIE"
For quality improvement of an enterprise, three data sources may be utilized and they exist at two levels. Specifically, at the enterprise level, there are two data sources on quality outcomes and quality-affecting factors, such as enterprise infrastructure, climate, and administration strategies. At the individual level, there is one data source on individual independent and interdependent activities. To accomplish the goal of quality improvement of an enterprise, a statistical model is needed to link these multiple data sources. The model should be able to address the questions of how individual activities affect the quality of the enterprise, how factors at the enterprise level affect the quality, and how an individual's activities/performance interact with the enterprise-level factors to jointly affect the enterprise quality. Answers to these questions are keys to formulating action plans for quality improvement of the enterprise.
To address the above questions, regression models provide a potential tool. Compared with many "black-box" methods that focus primarily on best prediction of the outcome variable, regression models are a "white-box" method offering better interpretability by explicitly revealing the variables that lead to the best prediction for the outcome. This is especially important when the final objective is to formulate an action plan. Furthermore, parametric regression models allow for statistical inference in variable or model selection, which offers a rigorous way of generalizing the model estimated from a specific data set to the general population. Finally, multilevel regression, as a specific type of regression model, was developed to handle multilevel data sources.
In the "language" of regression, the quality outcomes are called responses; the enterprise-level quality-affecting factors and the individual-level activity variables can both be called predictors. In general regression analysis, all of the responses and predictors should be at the same level. Therefore, to be able to apply general regression analysis to our data, an intuitive approach is to transform the data sources to a single level. Specifically, since out of the three data sources only the predictors of individual activities are at the individual level, we might consider aggregating these predictors to the enterprise level by using some pre-defined summary statistics, such as the sample mean and variance of the data of all of the individuals in each enterprise. Then, general regression analysis can be applied at the enterprise level. One drawback of this approach is loss of information, since the pre-defined summary statistics may not be able to capture the full spectrum of behavior of the individuals in each enterprise. Another, more severe drawback is aggregation bias (Robinson, 1950; Goldstein, 1995; Draper and Smith, 1998); i.e., a predictor at different levels may have different effects on the response. For example, in the nursing context, there is an important predictor called exchanging, which is originally defined for each individual nurse (i.e., an individual-level predictor) to characterize activities of exchanging information with others regarding patient care. If exchanging were aggregated to the unit level, then the resulting variable (i.e., the average level of information exchange of a unit) would be a proxy measure of the unit's normative environment/facility in promoting information exchange and thus may have an effect on the nursing quality of the unit different from the effect of an individual nurse's activities in exchanging information.
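The aggregation step described above can be sketched in a few lines. This is a minimal illustration only: the function name `aggregate_to_enterprise`, the unit labels, and the scores are hypothetical and are not taken from the study data.

```python
from statistics import mean, variance


def aggregate_to_enterprise(individual_values):
    """Summarize one individual-level predictor per enterprise.

    individual_values maps an enterprise label to the list of values
    observed for its individuals. The summary statistics (sample mean and
    sample variance) are the pre-defined ones mentioned in the text.
    """
    summary = {}
    for enterprise, values in individual_values.items():
        summary[enterprise] = {
            "mean": mean(values),
            # The sample variance needs at least two individuals.
            "variance": variance(values) if len(values) > 1 else 0.0,
        }
    return summary


# Hypothetical "exchanging" scores for the nurses of two units.
exchanging = {"unit_A": [3.0, 4.0, 5.0], "unit_B": [2.0, 2.0, 4.0, 4.0]}
unit_level = aggregate_to_enterprise(exchanging)
```

After this step each unit is described by two numbers only, which is where the loss of information discussed above comes from.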
When predictors are at different levels, multilevel regression provides a more appropriate tool than general regression analysis. Multilevel regression has been discussed in a diverse range of literature under a variety of titles. For example, such models are referred to as multilevel regression or multilevel linear models in sociological research (Mason et al., 1983; Goldstein, 1995), as mixed-effects models in biometric research (Elston and Grizzle, 1962; Laird and Ware, 1982; Singer, 1998), as random coefficient models in econometrics (Rosenberg, 1973; Longford, 1993), and as covariance components models in statistics (Dempster et al., 1981). A multilevel regression model allows the inclusion of predictors at two (e.g., individual and enterprise) or more levels. The basic idea is to build a separate regression model for each enterprise by linking the individual-level predictors with the response and then modeling the variation among enterprises by considering the regression coefficients as multivariate responses explained by enterprise-level predictors. However, existing multilevel models are not directly applicable to our problem because they require that the response variable be at the individual level.
To the best of our knowledge, there is no effective model to link enterprise-level quality responses with enterprise- and individual-level predictors for quality improvement of an enterprise. This research aims to bridge this gap by proposing a new multilevel regression model with enterprise-level responses, which treats an individual's contributions to the responses as latent variables. Furthermore, the proposed model is applied to a real-world application of nursing quality improvement. Most existing research in nursing quality is either qualitative or quantitative but only utilizes the unit-level information (Aiken et al., 1994; Aiken et al., 2002; Institute of Medicine, 2004; Kazanjian et al., 2005; Lake et al., 2006; Laschinger and Leiter, 2006; Aiken et al., 2008; Friese et al., 2008; Mark et al., 2008), because the activities of individual nurses have long been considered to be non-quantifiable (Lamb et al., 2008). This application analyzes the data collected by two co-authors (Lamb and Schmitt) in a Robert Wood Johnson Foundation (RWJF)-sponsored project, in which the research team designed the first instrument to measure nurses' independent activities and their interdependent activities in coordinating patient care using a comprehensive collection of variables. The data also include measurements of unit-level quality-affecting factors and quality outcomes. By linking these multiple data sources together, the findings may have a profound impact on nursing quality improvement.
The remainder of the article is organized as follows: Section 2 presents the new multilevel model development; Section 3 performs simulation studies to assess the model performance; Section 4 presents the findings in applying the proposed model to nursing quality improvement; Section 5 concludes the article.

Proposed model: multilevel regression with enterprise-level response

Model formulation
Let $i$ be the index for enterprises and $m$ be the total number of enterprises; i.e., $i = 1, \ldots, m$. Let $j$ be the index for individuals and $n_i$ be the total number of individuals in enterprise $i$; i.e., $j = 1, \ldots, n_i$. Let $y_i$ denote the response (quality outcome) of enterprise $i$. Let $\tilde{y}_{ij}$ denote the contribution of individual $j$ to $y_i$; $\tilde{y}_{ij}$ is latent. In this article, we focus on the situation in which the contributions of individuals to the enterprise-level response are additive; i.e., $y_i = \sum_{j=1}^{n_i} \tilde{y}_{ij}$. Although model formulation (this section) and estimation and inference (next section) are discussed for this simple additive relationship between $\tilde{y}_{ij}$, $j = 1, \ldots, n_i$, and $y_i$, they can be readily extended to address two other situations. In the first, $y_i$ is a weighted sum of the $\tilde{y}_{ij}$, $j = 1, \ldots, n_i$, with known weights. In the second, $y_i$ is a general function of the $\tilde{y}_{ij}$, $j = 1, \ldots, n_i$ (i.e., $y_i = f(\tilde{y}_{ij}, j = 1, 2, \ldots, n_i)$), where the functional form, $f$, is known and can be reasonably approximated by a linear function through the Taylor expansion. In this article, we focus on $y_i = \sum_{j=1}^{n_i} \tilde{y}_{ij}$ for clarity of presentation and because this relationship is appropriate for the real-world application in Section 4. Also, we focus on a single response; a multi-response model is a straightforward extension of the single-response model we propose. Furthermore, assume that there are a total of $Q$ individual-level predictors and $P$ enterprise-level predictors. Let $x_{qij}$ denote the measurement on the $q$th individual-level predictor for individual $j$ in enterprise $i$. Let $s_{pi}$ be the measurement on the $p$th enterprise-level predictor for enterprise $i$.
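To illustrate how the general-function case can reduce to a weighted sum, the following is a sketch of a first-order Taylor expansion. It assumes that $f$ is differentiable and that an expansion point $\mathbf{a} = (a_1, \ldots, a_{n_i})$ is available; both $\mathbf{a}$ and the weights $w_{ij}$ are notation introduced here for illustration and do not appear in the original formulation.

```latex
y_i = f(\tilde{y}_{i1}, \ldots, \tilde{y}_{in_i})
\approx f(\mathbf{a}) + \sum_{j=1}^{n_i} w_{ij}\,(\tilde{y}_{ij} - a_j),
\qquad
w_{ij} = \left.\frac{\partial f}{\partial \tilde{y}_{ij}}\right|_{\mathbf{a}}
```

Up to a known constant, this is a weighted sum of the $\tilde{y}_{ij}$ with known weights $w_{ij}$. When $f$ is the plain sum, every $w_{ij}$ equals one and the approximation is exact, which recovers the additive relationship used in this article.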
The model formulation consists of two stages. At Stage 1, each individual's latent response is linked to a set of individual-level predictors. At Stage 2, the regression coefficients in the Stage 1 model for each enterprise are response variables that are hypothesized to be explained by enterprise-level predictors. Specifically, the Stage 1 model is to regress the latent response on the individual-level predictors; that is,

$$\tilde{y}_{ij} = \beta_{0i} + \sum_{q=1}^{Q} \beta_{qi} x_{qij} + \varepsilon_{ij}, \tag{1}$$

where $\varepsilon_{ij} \sim N(0, \sigma^2)$. This is to assume that the residual errors, $\varepsilon_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, n_i$, are normal and independent and identically distributed (i.i.d.) across different individuals and enterprises. Note that model (1) keeps $i$ (index for enterprises) in the subscript for the model coefficients, $\beta_{0i}, \ldots, \beta_{Qi}$. This is to acknowledge the uniqueness of each enterprise; i.e., we consider that the effect of individual-level predictors on the quality response may vary across enterprises. The Stage 2 model is to regress the regression coefficients in Equation (1) on the enterprise-level predictors; that is,

$$\beta_{qi} = \gamma_{q0} + \sum_{p=1}^{P} \gamma_{qp} s_{pi} + u_{qi}, \quad q = 0, 1, \ldots, Q, \tag{2}$$

where $u_{qi} \sim N(0, \tau_{qq})$ and $\mathrm{cov}(u_{qi}, u_{q'i}) = \tau_{qq'}$ for $q \neq q'$. Note that the distribution parameters, $\tau_{qq}$ and $\tau_{qq'}$, do not have enterprise index $i$ in their subscripts. This is to assume that the random effects of different enterprises, i.e., $u_{qi}$, $i = 1, \ldots, m$, are i.i.d. across the enterprises. This model aims to characterize how enterprise-level predictors may modify the effect that the individual-level predictors have on the response. By inserting Equation (2) into Equation (1), a combined model can be obtained:

$$\tilde{y}_{ij} = \gamma_{00} + \sum_{q=1}^{Q} \gamma_{q0} x_{qij} + \sum_{p=1}^{P} \gamma_{0p} s_{pi} + \sum_{q=1}^{Q} \sum_{p=1}^{P} \gamma_{qp} s_{pi} x_{qij} + u_{0i} + \sum_{q=1}^{Q} u_{qi} x_{qij} + \varepsilon_{ij}, \tag{3}$$

where $\gamma_{00}$ is the grand mean; $\gamma_{q0}$ is the fixed main effect from an individual-level predictor; $\gamma_{0p}$ is the fixed main effect from an enterprise-level predictor; $\gamma_{qp}$ is the fixed interaction effect between an individual-level predictor and an enterprise-level predictor; $u_{0i}$ and $u_{qi}$ are random effects; and $\varepsilon_{ij}$ is the residual error. The two-stage model formulation is depicted in Fig. 1.
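The equivalence between the two-stage formulation and the combined model can be checked numerically. The sketch below assumes one individual-level and one enterprise-level predictor ($Q = P = 1$), uncorrelated random effects, and arbitrary illustrative values for the fixed effects; the function names are hypothetical.

```python
import random


def two_stage_contribution(x, s, gamma, u, eps):
    """Latent contribution built stage by stage (Q = 1, P = 1).

    gamma[q][p] holds the fixed effects and u[q] the random effects
    of the enterprise the individual belongs to.
    """
    beta0 = gamma[0][0] + gamma[0][1] * s + u[0]  # Stage 2: intercept
    beta1 = gamma[1][0] + gamma[1][1] * s + u[1]  # Stage 2: slope
    return beta0 + beta1 * x + eps                # Stage 1


def combined_contribution(x, s, gamma, u, eps):
    """Same quantity written as fixed part + random part + residual."""
    fixed = (gamma[0][0] + gamma[1][0] * x
             + gamma[0][1] * s + gamma[1][1] * s * x)
    random_part = u[0] + u[1] * x
    return fixed + random_part + eps


rng = random.Random(0)
gamma = [[1.0, 0.5], [2.0, -0.3]]          # illustrative fixed effects
s = rng.gauss(0.0, 1.0)                    # enterprise-level predictor
u = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
xs = [rng.gauss(0.0, 1.0) for _ in range(5)]
eps = [rng.gauss(0.0, 1.0) for _ in range(5)]

latent_a = [two_stage_contribution(x, s, gamma, u, e) for x, e in zip(xs, eps)]
latent_b = [combined_contribution(x, s, gamma, u, e) for x, e in zip(xs, eps)]

# Only the enterprise-level sum is observed; the contributions stay latent.
y_i = sum(latent_a)
```

Both functions return the same value for every individual, and only `y_i` would be available to the analyst in the setting considered here.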

Model estimation
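To simplify the discussion in this section, we adopt an equivalent but more succinct representation of the combined model in Equation (3); that is,

$$\tilde{y}_{ij} = \mathbf{x}_{ij}^T \boldsymbol{\Upsilon} \mathbf{s}_i + \mathbf{x}_{ij}^T \mathbf{u}_i + \varepsilon_{ij}, \tag{4}$$

where $\mathbf{x}_{ij}$ is a vector of measurements on the $Q$ individual-level predictors for individual $j$ in enterprise $i$, i.e., $\mathbf{x}_{ij} = [1, x_{1ij}, \ldots, x_{Qij}]^T$; $\mathbf{u}_i$ is a vector of the random effects, i.e., $\mathbf{u}_i = [u_{0i}, u_{1i}, \ldots, u_{Qi}]^T$; $\mathbf{s}_i$ is a vector of measurements on the $P$ enterprise-level predictors for enterprise $i$, i.e., $\mathbf{s}_i = [1, s_{1i}, \ldots, s_{Pi}]^T$; and $\boldsymbol{\Upsilon}$ is a matrix of the fixed effects; that is, the $(Q+1) \times (P+1)$ matrix whose $(q, p)$th element is $\gamma_{qp}$, $q = 0, \ldots, Q$, $p = 0, \ldots, P$. Furthermore, let $\mathbf{T}$ denote the covariance matrix of $\mathbf{u}_i$. According to model (4), the parameters to be estimated are $\Theta = \{\boldsymbol{\Upsilon}, \mathbf{T}, \sigma^2\}$. The Maximum Likelihood Estimation (MLE) method can be employed. Specifically, it can be derived from Equation (4) that $\tilde{y}_{ij} \sim N(\mathbf{x}_{ij}^T \boldsymbol{\Upsilon} \mathbf{s}_i,\ \mathbf{x}_{ij}^T \mathbf{T} \mathbf{x}_{ij} + \sigma^2)$. Furthermore, based on the relationship $y_i = \sum_{j=1}^{n_i} \tilde{y}_{ij}$, the distribution of $y_i$ can be derived; that is,

$$y_i \sim N\left(\mathbf{x}_i^T \boldsymbol{\Upsilon} \mathbf{s}_i,\ \mathbf{x}_i^T \mathbf{T} \mathbf{x}_i + n_i \sigma^2\right), \quad \mathbf{x}_i = \sum_{j=1}^{n_i} \mathbf{x}_{ij}. \tag{5}$$

Because the $y_i$, $i = 1, \ldots, m$, are observable, the log-likelihood function of $\Theta$ can be built upon the $y_i$ variables; that is,

$$l(\Theta) = -\frac{m}{2} \log(2\pi) - \frac{1}{2} \sum_{i=1}^{m} \log\left(\mathbf{x}_i^T \mathbf{T} \mathbf{x}_i + n_i \sigma^2\right) - \frac{1}{2} \sum_{i=1}^{m} \frac{\left(y_i - \mathbf{x}_i^T \boldsymbol{\Upsilon} \mathbf{s}_i\right)^2}{\mathbf{x}_i^T \mathbf{T} \mathbf{x}_i + n_i \sigma^2}. \tag{6}$$

It is difficult to find closed-form expressions for the maximizer of the log-likelihood in Equation (6). Therefore, an iterative algorithm needs to be adopted. We adopt the Fisher Scoring (FS) algorithm (Longford, 1987). The FS algorithm begins with user-specified initial values for the parameters (i.e., $\Theta^{(0)} = \{\boldsymbol{\Upsilon}^{(0)}, \mathbf{T}^{(0)}, \sigma^{2(0)}\}$ in our case) and then repeatedly updates the parameters by $\Theta^{(k+1)} = \Theta^{(k)} + \alpha^{(k+1)} \mathbf{p}_k$ until some convergence condition is met; e.g., $\|\Theta^{(k+1)} - \Theta^{(k)}\|_2 < \varepsilon$, where $\varepsilon = 10^{-4}$ is a common choice. Clearly, the key in adapting the FS algorithm to our problem setting is to obtain $\mathbf{p}_k$ and $\alpha^{(k+1)}$, called the step direction and step size, respectively. In what follows, we first discuss how to obtain $\mathbf{p}_k$ and then discuss how to select $\alpha^{(k+1)}$.

The step direction, $\mathbf{p}_k$, can be expressed as $\mathbf{p}_k = -\mathbf{H}^{-1} \mathbf{F}$, where $\mathbf{F} = \partial l(\Theta)/\partial \Theta$ and $\mathbf{H} = E(\partial^2 l(\Theta)/\partial \Theta\, \partial \Theta^T)$, both evaluated at $\Theta = \Theta^{(k)}$. The derivation of $\partial l(\Theta)/\partial \Theta$ and $E(\partial^2 l(\Theta)/\partial \Theta\, \partial \Theta^T)$ can be found in the Appendix. One potential limitation in the use of $\mathbf{p}_k = -\mathbf{H}^{-1} \mathbf{F}$ is that we need to calculate the inverse of matrix $\mathbf{H}$, which may be computationally inefficient if the dimension of $\Theta$ is large. To overcome this limitation, we propose the following strategy. Note that $E(\partial^2 l(\Theta)/\partial \boldsymbol{\Upsilon}\, \partial \sigma^2) = 0$ and $E(\partial^2 l(\Theta)/\partial \boldsymbol{\Upsilon}\, \partial \mathbf{T}) = 0$; i.e., the update of $\boldsymbol{\Upsilon}$ is independent of $\mathbf{T}$ and $\sigma^2$ in each iteration of the FS algorithm. This suggests another way to update $\boldsymbol{\Upsilon}$. Specifically, Equation (5) can be rewritten as a linear model in $\mathrm{vec}(\boldsymbol{\Upsilon})^T$ (Equation (7)), where $\mathrm{vec}(\cdot)$ is an operator that converts a matrix into a row vector by concatenating the columns of the matrix. Then, we can use the generalized least squares method to estimate the $\mathrm{vec}(\boldsymbol{\Upsilon})^T$ in Equation (7). The incorporation of this strategy into the FS algorithm results in the following new algorithm. At the $k$th iteration, the FS algorithm is used to obtain estimates for $\mathbf{T}$ and $\sigma^2$, i.e., $\mathbf{T}^{(k)}$ and $\sigma^{2(k)}$, which are used to compute $\mathbf{V}^{(k)}$; then, the generalized least squares method is used to obtain an estimate for $\mathrm{vec}(\boldsymbol{\Upsilon})^T$, i.e., $\mathrm{vec}(\boldsymbol{\Upsilon}^{(k)})^T$. A complete description of this new algorithm is given in Fig. 2.

Next, we discuss how to select the step size, $\alpha^{(k+1)}$. The selection of $\alpha^{(k+1)}$ is a classic but challenging problem in optimization. In our case, the approach introduced in Rosenberg (1973) is recommended. Specifically, the FS algorithm can start with $\alpha^{(k+1)} = 1$. If $l(\Theta^{(k+1)}) > l(\Theta^{(k)})$, then accept $\alpha^{(k+1)} = 1$; otherwise, try $\alpha^{(k+1)} = \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \ldots$, until $l(\Theta^{(k+1)}) > l(\Theta^{(k)})$. This approach works well empirically. However, a potential drawback is that sometimes it may be impossible to find a positive $\alpha^{(k+1)}$ that makes $l(\Theta^{(k+1)}) > l(\Theta^{(k)})$, so the algorithm will break down. A major reason is that the iterates of the FS algorithm may be far away from the optimal solution. To avoid this, the initial values, $\Theta^{(0)} = \{\boldsymbol{\Upsilon}^{(0)}, \mathbf{T}^{(0)}, \sigma^{2(0)}\}$, should be well chosen.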
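The alternating scheme described above (Fisher scoring for the variance parameters, generalized least squares for the fixed effects, and step halving for the step size) can be sketched for the simplest special case. The sketch assumes a single fixed effect $\gamma$, a single random-effect variance $\tau$, and no intercept, so that $y_i \sim N(\gamma s_i x_i,\ x_i^2 \tau + n_i \sigma^2)$ with $x_i = \sum_j x_{ij}$. All function names are hypothetical, the score and expected information are derived for this scalar case only, and the code is not a transcription of the algorithm in Fig. 2.

```python
import math


def variances(x, n, tau, sigma2):
    """Marginal variance of each y_i: x_i^2 * tau + n_i * sigma2."""
    return [xi * xi * tau + ni * sigma2 for xi, ni in zip(x, n)]


def log_likelihood(y, z, x, n, gamma, tau, sigma2):
    """Gaussian log-likelihood of the observed y_i; z_i = s_i * x_i."""
    if tau < 0.0 or sigma2 <= 0.0:
        return -math.inf  # outside the parameter space, never accepted
    v = variances(x, n, tau, sigma2)
    return -0.5 * sum(math.log(2.0 * math.pi * vi) + (yi - gamma * zi) ** 2 / vi
                      for yi, zi, vi in zip(y, z, v))


def gls_gamma(y, z, v):
    """Generalized least squares estimate of gamma for known variances v."""
    return (sum(zi * yi / vi for yi, zi, vi in zip(y, z, v))
            / sum(zi * zi / vi for zi, vi in zip(z, v)))


def fisher_direction(y, z, x, n, gamma, tau, sigma2):
    """Step direction for (tau, sigma2): inverse expected information times score."""
    g_t = g_s = i_tt = i_ts = i_ss = 0.0
    for yi, zi, xi, ni, vi in zip(y, z, x, n, variances(x, n, tau, sigma2)):
        d = (yi - gamma * zi) ** 2 / vi ** 2 - 1.0 / vi
        g_t += 0.5 * d * xi * xi          # score with respect to tau
        g_s += 0.5 * d * ni               # score with respect to sigma2
        i_tt += 0.5 * xi ** 4 / vi ** 2   # expected information entries
        i_ts += 0.5 * xi * xi * ni / vi ** 2
        i_ss += 0.5 * ni * ni / vi ** 2
    det = i_tt * i_ss - i_ts ** 2
    return (i_ss * g_t - i_ts * g_s) / det, (i_tt * g_s - i_ts * g_t) / det


def fit(y, s, x, n, tau=1.0, sigma2=1.0, tol=1e-4, max_iter=200):
    """Alternate a Fisher scoring step on (tau, sigma2) with a GLS update of gamma."""
    z = [si * xi for si, xi in zip(s, x)]
    gamma = gls_gamma(y, z, variances(x, n, tau, sigma2))
    for _ in range(max_iter):
        ll_old = log_likelihood(y, z, x, n, gamma, tau, sigma2)
        p_t, p_s = fisher_direction(y, z, x, n, gamma, tau, sigma2)
        alpha = 1.0
        while alpha > 1e-8:  # step halving: 1, 1/2, 1/4, ...
            t_new, s_new = tau + alpha * p_t, sigma2 + alpha * p_s
            if log_likelihood(y, z, x, n, gamma, t_new, s_new) > ll_old:
                break
            alpha /= 2.0
        else:
            break  # no improving step size found; stop iterating
        g_new = gls_gamma(y, z, variances(x, n, t_new, s_new))
        change = math.sqrt((t_new - tau) ** 2 + (s_new - sigma2) ** 2
                           + (g_new - gamma) ** 2)
        gamma, tau, sigma2 = g_new, t_new, s_new
        if change < tol:
            break
    return gamma, tau, sigma2
```

Returning minus infinity for inadmissible variance values is a design choice of this sketch: it lets the step-halving loop reject steps that leave the parameter space without a separate constraint check.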
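To select good initial values, we adapt the method developed in Sun et al. (2007) for conventional multilevel regression with individual-level responses to our problem, in which the individual-level responses are latent. Specifically, our method includes two steps. The first step is to select $\boldsymbol{\Upsilon}^{(0)}$ and $\sigma^{2(0)}$. For this purpose, we ignore the random effects, so Equation (5) becomes $y_i \sim N(\mathbf{x}_i^T \boldsymbol{\Upsilon} \mathbf{s}_i,\ n_i \sigma^2)$, where $\mathbf{x}_i = \sum_{j=1}^{n_i} \mathbf{x}_{ij}$. Estimates for $\boldsymbol{\Upsilon}$ and $\sigma^2$ can be easily obtained by MLE; these estimates are used as $\boldsymbol{\Upsilon}^{(0)}$ and $\sigma^{2(0)}$, respectively.
The second step is to select $\mathbf{T}^{(0)}$. Specifically, according to Equation (4), the enterprise-level response can be written in terms of the random effects, $\mathbf{u}_i$, and the aggregated residual error, $\varepsilon_i = \sum_{j=1}^{n_i} \varepsilon_{ij}$ (Equation (9)). Based on Equation (9), the least squares estimate for $\mathbf{u}_i$ can be obtained (Equation (10)). Inserting Equation (9) into Equation (10) gives an expression whose last two terms are negligible, because $\varepsilon_i$ is independent of any element in $\mathbf{u}_i$. The resulting estimate is used as $\mathbf{T}^{(0)}$.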
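The first step of the initialization has a closed form, which can be sketched for the scalar special case with one fixed effect and no intercept, i.e., $y_i \sim N(\gamma z_i,\ n_i \sigma^2)$ with $z_i = s_i x_i$. The function name `initial_fixed_and_residual` is hypothetical. The second step is not sketched here.

```python
def initial_fixed_and_residual(y, z, n):
    """Start values for gamma and sigma2 when random effects are ignored.

    Under y_i ~ N(gamma * z_i, n_i * sigma2), the MLE of gamma is a
    weighted least squares estimate with weights 1 / n_i, and the MLE of
    sigma2 is the mean of the squared residuals scaled by 1 / n_i.
    """
    gamma0 = (sum(zi * yi / ni for yi, zi, ni in zip(y, z, n))
              / sum(zi * zi / ni for zi, ni in zip(z, n)))
    sigma2_0 = sum((yi - gamma0 * zi) ** 2 / ni
                   for yi, zi, ni in zip(y, z, n)) / len(y)
    return gamma0, sigma2_0


# Two enterprises with one individual each: gamma0 = 2.0, sigma2_0 = 1.0.
start = initial_fixed_and_residual([1.0, 3.0], [1.0, 1.0], [1, 1])
```

Because the ignored random-effect variation is absorbed into the residuals, the resulting start value for $\sigma^2$ would typically be larger than the final estimate.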

Model inference
After model parameters have been estimated, the next step is to perform hypothesis testing to check the statistical significance of the parameters. Three types of hypotheses usually need to be tested in our case: tests for fixed effects, tests for random effects, and tests for model comparison. Specifically, to test a fixed effect, $\gamma_{qp}$ (i.e., $H_0: \gamma_{qp} = 0$ versus $H_1: \gamma_{qp} \neq 0$), the test statistic $t = \hat{\gamma}_{qp} / \sqrt{\mathrm{var}(\hat{\gamma}_{qp})}$ can be used, $q = 0, 1, \ldots, Q$, $p = 0, 1, \ldots, P$. Here, $\hat{\gamma}_{qp}$ is the MLE for $\gamma_{qp}$, and $\mathrm{var}(\hat{\gamma}_{qp})$ can be asymptotically approximated by the corresponding element in matrix $\mathbf{H}^{-1}$. Recall that both $\hat{\gamma}_{qp}$ and $\mathbf{H}$ have been obtained from the model estimation method proposed in Section 2.2. This test statistic is asymptotically standard normal; this property can be used to calculate the P-value of the test. Furthermore, to test the random effects is equivalent to testing the covariance matrix of the random effects, $\mathbf{T}$; i.e., $H_0: \mathbf{T} = \mathbf{T}_0$ versus $H_1: \mathbf{T} = \mathbf{T}_1$. For example, to test whether a random effect $u_{qi}$ exists, $q = 0, 1, \ldots, Q$, we can set $\mathbf{T}_0$ to be a reduced form of $\mathbf{T}_1$ by making the $q$th row and column of

Simulation studies
An intuitive, competing approach to the proposed model is to aggregate the individual-level predictors to the enterprise level by using some summary statistics and then perform general regression analysis at the enterprise level.
Limitations of this aggregate model have been conceptually discussed in Section 1. In this section, we will use simulation data to compare the performance of the aggregate model and the proposed model in terms of the statistical properties in fixed and random effect estimation.
To generate the simulation data, the following true model is used. Consider that an individual-level response, $y_{ij}$, is linked to one individual-level predictor, $x_{ij}$, by a regression $y_{ij} = \beta_i x_{ij} + \varepsilon_{ij}$, and $\beta_i$ is linked to one enterprise-level predictor, $s_i$, by a regression $\beta_i = \gamma s_i + u_i$. These two regressions can be combined into one; i.e., $y_{ij} = \gamma s_i x_{ij} + u_i x_{ij} + \varepsilon_{ij}$. The parameters of the true model include the fixed effect, $\gamma$, the variance of the random effect, $\tau = \mathrm{var}(u_i)$, and the variance of the residual error, $\sigma^2 = \mathrm{var}(\varepsilon_{ij})$. $\gamma$, $\tau$, and $\sigma^2$ are all set to one. Furthermore, data are generated from the true model for each unit. Specifically, for unit $i$, the following steps are performed: (i) draw two samples from $N(0, 1)$ and make them the values for $u_i$ and $s_i$, respectively, and compute $\beta_i$; (ii) draw $n_i$ samples from $N(0, 1)$ for $\varepsilon_{i1}, \ldots, \varepsilon_{in_i}$, and another $n_i$ samples from $N(0, 1)$ for $x_{i1}, \ldots, x_{in_i}$, and compute $y_{i1}, \ldots, y_{in_i}$; (iii) compute $y_i = \sum_{j=1}^{n_i} y_{ij}$. Based on the data, $y_i$, $x_{ij}$, and $s_i$, $j = 1, \ldots, n_i$, $i = 1, \ldots, m$, the proposed model can be applied to estimate the true model parameters $\gamma$, $\tau$, and $\sigma^2$. Alternatively, the aggregate model can also be applied, which builds an ordinary regression of $y_i$ on predictors $s_i$, $x_i = \sum_{j=1}^{n_i} x_{ij}$, and $s_i x_i$. In this aggregate model, $\gamma$ is estimated by the coefficient of predictor $s_i x_i$. The aggregate model is not able to separate between- and within-enterprise variations by providing separate estimates for $\tau$ and $\sigma^2$. Instead, it estimates the overall variation by the residual variance. In addition, as data on the individual-level response, $y_{ij}$, are available in simulation, conventional multilevel regression is also applied to estimate $\gamma$, $\tau$, and $\sigma^2$. These estimates can be used as the gold standard to assess the impact of treating the individual-level response as latent in the proposed model. The results of the comparison between the three models are presented as follows.
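The data-generating steps (i) to (iii) and the aggregate model can be sketched as follows. The function names are hypothetical, the normal equations are solved with a small hand-written Gaussian elimination to keep the example dependency-free, and the inclusion of an intercept in the aggregate regression is an assumption of this sketch rather than something stated in the text.

```python
import random


def simulate(m, n_i, gamma=1.0, tau=1.0, sigma2=1.0, seed=0):
    """Generate (y_i, s_i, x_i) for m enterprises from the true model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        u = rng.gauss(0.0, tau ** 0.5)             # step (i)
        s = rng.gauss(0.0, 1.0)
        beta = gamma * s + u
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_i)]   # step (ii)
        ys = [beta * xj + rng.gauss(0.0, sigma2 ** 0.5) for xj in xs]
        rows.append((sum(ys), s, sum(xs)))         # step (iii); y_ij is dropped
    return rows


def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination."""
    k = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            f = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= f * aug[col][c]
    coef = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(aug[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (aug[r][k] - tail) / aug[r][r]
    return coef


def aggregate_ols(rows):
    """Ordinary regression of y_i on 1, s_i, x_i, and s_i * x_i."""
    design = [[1.0, s, x, s * x] for _, s, x in rows]
    y = [r[0] for r in rows]
    k = 4
    xtx = [[sum(d[a] * d[b] for d in design) for b in range(k)] for a in range(k)]
    xty = [sum(d[a] * yi for d, yi in zip(design, y)) for a in range(k)]
    return solve(xtx, xty)


coef = aggregate_ols(simulate(m=100, n_i=50))
gamma_hat = coef[3]   # coefficient of s_i * x_i estimates gamma
```

Repeating the last two lines over many seeds and averaging `gamma_hat` would give the kind of bias and standard error summaries reported for the aggregate model.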

Fixed effect estimation
For each model, the average estimate for $\gamma$ over 100 repetitions of the simulation was obtained. Furthermore, the deviation of the average estimate from the true $\gamma = 1$ was computed to assess the bias in the estimation. The deviations for the gold standard, proposed, and aggregate models are 0.028, 0.021, and 0.023, respectively, when $m = 10$ (number of enterprises) and $n_i = n = 50$ (enterprise sample size). The deviations are small, which is also true for other $m$ and $n$ values. This suggests that the proposed model gives unbiased estimators for fixed effects. In fact, this property of the proposed model can be theoretically proved. Specifically, according to Equation (10), the estimates for the fixed effects are $\mathrm{vec}(\hat{\boldsymbol{\Upsilon}})^T = (\mathbf{Z}^T \mathbf{V} \mathbf{Z})^{-1} \mathbf{Z}^T \mathbf{V} \mathbf{Y}$. Then,

$$E\left[\mathrm{vec}(\hat{\boldsymbol{\Upsilon}})^T\right] = (\mathbf{Z}^T \mathbf{V} \mathbf{Z})^{-1} \mathbf{Z}^T \mathbf{V}\, E(\mathbf{Y}) = (\mathbf{Z}^T \mathbf{V} \mathbf{Z})^{-1} \mathbf{Z}^T \mathbf{V} \mathbf{Z}\, \mathrm{vec}(\boldsymbol{\Upsilon})^T = \mathrm{vec}(\boldsymbol{\Upsilon})^T,$$

where the second "=" follows from Equation (8). The gold standard model also gives unbiased estimators for fixed effects, which is a well-known property of conventional multilevel regression. In the aggregate model, the data on the response variable (i.e., the $y_i$) are independent but non-identically distributed, because $\mathrm{var}(y_i) = x_i^2 \tau + n \sigma^2$. Even though the data are not i.i.d., ordinary least squares estimation can still give unbiased estimators for the regression coefficients (Demidenko, 2004), so the estimates for fixed effects by the aggregate model are also unbiased.
Furthermore, we compare the three models in terms of the standard error of the fixed effect estimate. Figure 3(a) shows the standard error averaged over 100 repetitions of the simulation by each model (y-axis) with respect to the number of enterprises, m (x-axis). The enterprise sample size, n, was fixed to be 50. It can be seen that the standard error by the proposed model is very close to that by the gold standard model, whereas that by the aggregate model is much larger. A large standard error leads to the risk of failing to detect significant fixed effects. Also, Fig. 3(a) shows that increasing the number of enterprises, m, can significantly reduce the standard errors for all three models. Furthermore, we varied the enterprise sample size by generating the sample size of each enterprise, n i , from a uniform distribution on the interval [1, 10] (Rosenberg, 1973; Lamb et al., 2008). The simulation was repeated and the results are shown in Fig. 4. Comparing Fig. 4 with Fig. 3, it can be seen that smaller and unbalanced enterprise sample sizes increase the standard errors but only slightly. In other words, the enterprise sample size influences the standard errors much less than the number of enterprises. This observation is consistent with existing knowledge about multilevel regression (Draper and Smith, 1998).

Random effect estimation
The aggregate model does not include any random effects. Therefore, the comparison is between the proposed and gold standard models. For each model, the deviation of the average estimate for τ from the true τ = 1 was computed to assess the bias in the estimation. The deviations for the gold standard and proposed models are −0.015 and −0.024, respectively, when m = 100 and n = 50. The magnitudes of these deviations become smaller when m increases. This empirically suggests that both models might give unbiased estimators for τ. Furthermore, we compared the two models in terms of the standard error of the estimate for τ. The result is given in Fig. 3(b), from which it is clear that the standard error by the proposed model is very close to that by the gold standard model. Increasing the number of enterprises, m, can significantly reduce the standard errors for both models, whereas increasing n does not have this effect (results not shown here).

Residual error estimation
For each model, the deviation of the average estimate for $\sigma$ from the true $\sigma = 1$ was computed to assess the bias in the estimation. The deviations for the gold standard, proposed, and aggregate models are −0.003, −0.177, and 13.394, respectively, when $m = 100$ and $n = 50$. The magnitudes of these deviations display only small changes when $m$ and $n$ increase. This implies that the proposed model may be biased in estimating the residual error $\sigma$; fortunately, the magnitude of the bias is small. In contrast, the aggregate model overestimates $\sigma$ with a large bias. The large bias is due to the fact that the aggregate model cannot separate within- and between-enterprise variation, so that the estimate for the residual error is a combination of the two variation sources. Furthermore, we compared the three models in terms of the standard error of the estimate for $\sigma$. The same phenomena/trends were observed as for the standard error of the fixed effect estimate.
Furthermore, we performed simulation studies with more individual- and enterprise-level predictors. Specifically, in the first set of simulations, we included $Q$ individual-level predictors ($Q > 1$), while keeping the number of enterprise-level predictors at one. Therefore, the true model used to generate the simulation data was $y_{ij} = \sum_{q=1}^{Q} \gamma_q s_i x_{qij} + \sum_{q=1}^{Q} u_{qi} x_{qij} + \varepsilon_{ij}$, where $\gamma_q = 1$, $\tau_{qq} = \mathrm{var}(u_{qi}) = 1$, and $\sigma^2 = \mathrm{var}(\varepsilon_{ij}) = 1$. $s_i$ and $x_{qij}$ were sampled from the $N(0, 1)$ distribution. Based on the simulation data, the proposed, aggregate, and gold standard models were applied to estimate the $Q$ fixed effects, the variances of the $Q$ random effects (the aggregate model cannot estimate these), and the residual variance. Because all three models have been theoretically proven to give an unbiased estimator for each fixed effect, the simulation result for assessing the bias of each model in fixed effect estimation is not shown here.
The models were compared in terms of the standard errors of the fixed effect estimates, the biases and standard errors of the random effect variance estimates, and the bias and standard error of the residual variance estimate. Due to page limits, instead of showing the standard error of each fixed effect estimate, we show the average over the standard errors of the estimates for the Q fixed effects. A similar consideration applies to the random effects; i.e., we show the average over the biases/standard errors of the estimates for the Q random effects' variances. The results for Q = 8 are shown in the "Q = 8, P = 1" column of Tables 1 to 5.
In the second set of simulations, we included $P$ enterprise-level predictors ($P > 1$), while keeping the number of individual-level predictors at one. Therefore, the true model used to generate the simulation data was $y_{ij} = \sum_{p=1}^{P} \gamma_p s_{pi} x_{ij} + u_i x_{ij} + \varepsilon_{ij}$, where $\gamma_p = 1$, $\tau = \mathrm{var}(u_i) = 1$, and $\sigma^2 = \mathrm{var}(\varepsilon_{ij}) = 1$. $s_{pi}$ and $x_{ij}$ were sampled from the $N(0, 1)$ distribution. The three models were applied to the simulation data. The results for $P = 8$ are in the "Q = 1, P = 8" column of Tables 1 to 5. Note that because there are eight fixed effects, only the average over the standard errors of the estimates for these fixed effects is shown in Table 1 due to page limits. In addition, the results for one individual-level predictor and one enterprise-level predictor, discussed previously, are copied here for the purpose of comparison.
The following observations can be drawn from the performed simulations:
1. The proposed model is consistently better than the aggregate model in terms of all the statistical properties chosen for comparison and regardless of the number of predictors.
2. The proposed model performs close to the gold standard model, especially when the number of individual-level predictors, Q, is small. This is because Q represents the number of random effects; a large Q increases the variation sources, introducing more uncertainty in the model estimation.
3. With a fixed number of enterprises (i.e., m = 100 in Table 6), the fewer the parameters to be estimated, the better the estimation. This is true for all three models. Furthermore, the enterprise sample size, n, influences the model performance to a much smaller extent than the number of enterprises.

Data collection and selection of predictors and response variable
The case study in this section uses the data collected by two co-authors in an RWJF-sponsored project. Data were collected mainly in the format of surveys handed out to 614 nurses in 32 hospital units (human subject approval was obtained). Using the "language" of this article, nurses are the "individuals" and units are the "enterprises."

Selection of individual-level predictors
The survey included 11 questions designed to measure independent nursing activities and interdependent activities in coordinating patient care. Examples of the 11 questions include "I organize the supplies that I need to be able to keep the care of my patients on track," "I initiate actions to get my nursing team members to do what is needed to keep my patients on their plan of care," and "I communicate information to my interdisciplinary team members they need to know to carry out their patient care activities or to make changes in their plan of care." Each question included four aspects for capturing the nurses' perceptions of (i) the amount of time spent on the activity in a usual shift; (ii) the priority placed on the activity for a usual shift; (iii) the amount of time spent on the activity in the last shift worked; and (iv) the amount of time spent on this activity compared with the perception of the amount of time needed in the last shift worked. In the survey, the nurses were asked to respond to the four aspects for each question. The response to each aspect was on a 1 to 5 numerical scale, where 1 represents the least amount of time spent (for aspects (i), (iii), and (iv)) or the lowest priority (for aspect (ii)) and 5 represents the greatest amount of time spent or the highest priority. In the data analysis described in this section, we took the average over the responses to the four aspects and considered the average as the response to each question. In this way, the response to each question was a combined measure of the amount of time spent on, and the priority of, the activity to which this question corresponds. The 11 questions were designed to be indicators for six underlying constructs, including organizing one's own activities and resources, checking patient progress and response, doing the work of others to keep care on track, assisting each other's work, mobilizing people and resources, and exchanging information with team members.
The six constructs are called organizing, checking, backfilling, assisting, mobilizing, and exchanging, or, in short, o, c, b, a, m, and e in this section. Note that constructs o, c, and b correspond to the independent nursing activities, whereas a, m, and e correspond to the interdependent nursing activities; i.e., activities that need coordination between nurses and with other healthcare professionals.
To verify the designed/hypothetical correspondence between the 11 questions and the six constructs, we performed factor analysis with a rotation method called Procrustes (Harman, 1976). This method identifies the factors that underlie the questions, such that the correspondence between the factors and the questions maximally overlaps with the designed/hypothetical correspondence between the six constructs and the questions. Six factors were identified, and their correspondence with the 11 questions is almost the same as the designed/hypothetical correspondence, except that a is identified to correspond to questions Q5 and Q6, whereas it was designed to correspond to only Q5. Question Q6 was "I communicate information to my nursing team members that they need to know to carry out their patient care activities or to make changes in the plan of care." It is reasonable to believe that this question is an indicator for both e and a. Therefore, we used the six identified factors as individual-level predictors in our model.

Selection of unit-level predictors
The survey also included 30 questions capturing the nurses' perceptions about the infrastructure, climate, and administration strategies in their respective units. Examples of the 30 questions include "Our information technology helps me to find the information I need quickly," "Physicians respond quickly when we call them for a change in an order or change in patient status," and "The physical layout of the unit allows us to get the supplies we need easily." The response to each question was on a 1 to 5 numerical scale, where 1 to 5 represent "strongly disagree" to "strongly agree." Note that although these questions asked for an individual nurse's response, it is more appropriate to include them as unit-level predictors rather than individual-level predictors. Therefore, for each question, we took the average over the responses from all of the nurses in each unit. Furthermore, we performed principal component analysis (Jolliffe, 2001) on the 30 unit-level predictors and kept the first principal component (PC) as the final unit-level predictor included in our model. This served two purposes: (i) it reduced the number of unit-level predictors; and (ii) because the first PC is a linear combination of the 30 unit-level predictors, it hypothetically serves as an overall measure of the extent to which each unit has characteristics that facilitate nursing quality improvement.
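The two reduction steps above, averaging within units and then extracting the first PC, can be illustrated with a short sketch. It assumes three questions instead of 30 and uses plain power iteration on the sample covariance matrix; the helper names `unit_means` and `first_pc` and the toy responses are hypothetical, and any standard PCA routine could be used instead.

```python
import math

def unit_means(rows):
    """rows: list of (unit_id, [response per question]); returns unit -> mean responses."""
    sums, counts = {}, {}
    for unit, answers in rows:
        acc = sums.setdefault(unit, [0.0] * len(answers))
        for k, value in enumerate(answers):
            acc[k] += value
        counts[unit] = counts.get(unit, 0) + 1
    return {u: [v / counts[u] for v in acc] for u, acc in sums.items()}

def first_pc(data, iters=500):
    """First principal component of a units-by-questions matrix.

    Uses power iteration on the sample covariance matrix. The start vector
    is assumed not to be orthogonal to the leading eigenvector, and the sign
    of the component is arbitrary.
    """
    n, p = len(data), len(data[0])
    col_means = [sum(row[k] for row in data) / n for k in range(p)]
    centered = [[row[k] - col_means[k] for k in range(p)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        v = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break
        w = [x / norm for x in v]
    scores = [sum(r[k] * w[k] for k in range(p)) for r in centered]
    return w, scores

# Toy responses (not real data): two nurses in each of three units.
rows = [("A", [4, 5, 4]), ("A", [5, 4, 4]),
        ("B", [2, 3, 2]), ("B", [3, 2, 2]),
        ("C", [3, 4, 3]), ("C", [4, 4, 3])]
means = unit_means(rows)
units = list(means)
loadings, s = first_pc([means[u] for u in units])
```

Here `s` holds one score per unit and plays the role of the single unit-level predictor. Because the sign of a principal component is arbitrary, its direction should be checked against the original questions before interpreting high and low values.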

Selection of response variable/quality outcome
Unit-level data for measuring the quality of nursing in each unit were collected separately from the survey. We included one quality measure, the total number of falls per 1000 patient days, or "falls" in short, as the response variable in our model. A summary of the predictors and response variable that were selected is given in Table 6.

Modeling and results
We generated a scatterplot for each predictor in Table 6 with respect to the response. For an individual-level predictor, we plotted the unit average because the response was at the unit level. The scatterplots (see online supplement) show linear trends, which supports the use of a regression model for this particular application. Furthermore, we applied the proposed multilevel model to the data. The Stage 1 model included all six individual-level predictors; i.e., $\tilde{y}_{ij} = \beta_{0i} + \beta_{1i}(o)_{ij} + \beta_{2i}(c)_{ij} + \beta_{3i}(b)_{ij} + \beta_{4i}(a)_{ij} + \beta_{5i}(m)_{ij} + \beta_{6i}(e)_{ij} + \varepsilon_{ij}$. The Stage 2 model regresses each Stage 1 model coefficient on the unit-level predictor; i.e., $\beta_{qi} = \gamma_{q0} + \gamma_{q1}s_i + u_{qi}$, $q = 0, 1, \ldots, 6$. Substituting the Stage 2 model into the Stage 1 model gives the combined model

$$\tilde{y}_{ij} = \gamma_{00} + \gamma_{01}s_i + \gamma_{10}(o)_{ij} + \gamma_{11}(so)_{ij} + \gamma_{20}(c)_{ij} + \gamma_{21}(sc)_{ij} + \gamma_{30}(b)_{ij} + \gamma_{31}(sb)_{ij} + \gamma_{40}(a)_{ij} + \gamma_{41}(sa)_{ij} + \gamma_{50}(m)_{ij} + \gamma_{51}(sm)_{ij} + \gamma_{60}(e)_{ij} + \gamma_{61}(se)_{ij} + u_{0i} + u_{1i}(o)_{ij} + u_{2i}(c)_{ij} + u_{3i}(b)_{ij} + u_{4i}(a)_{ij} + u_{5i}(m)_{ij} + u_{6i}(e)_{ij} + \varepsilon_{ij}, \qquad (11)$$

where $(so)_{ij} = s_i \times (o)_{ij}$ and the coefficient $\gamma_{11}$ reflects the interaction effect between $s$ and $o$; the other product terms are defined analogously. $\tilde{y}_{ij}$ is the latent contribution of nurse $j$ in unit $i$ to the total number of falls in this unit. $y_i = \sum_{j=1}^{n_i} \tilde{y}_{ij}$ is the total number of falls in unit $i$ and is observable.

By applying the model estimation method proposed in Section 2, we obtained estimates for the fixed effects, denoted by the $\gamma$ coefficients in Equation (11), and an estimate for the covariance matrix of the random effects, denoted by the $u$ coefficients. Here, considering the sample size limitation, we assumed that this covariance matrix was diagonal. Therefore, the proposed method actually gave an estimate for the variance/standard deviation of each random effect. In addition, the proposed method gave an estimate for $\sigma^2$, the variance of the residual $\varepsilon_{ij}$. These estimates are shown in Table 7. It can be seen that the model in Table 7 (called the full model hereafter) has many fixed effects with large P-values and random effects with small standard deviations. This implies that the model may be simplified. We performed model selection using the LRT suggested in Section 2.3.
The selected model was further subjected to model adequacy checks, and there were no apparent violations of the model assumptions (see online supplement for details). However, unit 26 was found to be an outlier. After removing unit 26, the final model was obtained, as shown in Table 8. The $R^2$ value for this model is 0.94, showing a good fit.
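To make the latent-response structure concrete, the following sketch simulates one unit under the two-stage model described in this section: Stage 2 draws the unit's coefficients, Stage 1 draws each nurse's latent contribution, and only their sum is treated as observed. The function name `simulate_unit`, the Gaussian draws with a diagonal random-effects covariance, and all numerical values are assumptions made for illustration; this is not the estimation algorithm of Section 2.

```python
import random

def simulate_unit(rng, s_i, X_i, gamma, tau, sigma):
    """Simulate latent contributions and the observed total for one unit.

    s_i   : unit-level predictor
    X_i   : list of nurses, each a list of Q individual-level predictor values
    gamma : list of (gamma_q0, gamma_q1) pairs for q = 0..Q
    tau   : random-effect standard deviations for q = 0..Q (diagonal covariance)
    sigma : residual standard deviation
    """
    # Stage 2: beta_qi = gamma_q0 + gamma_q1 * s_i + u_qi
    beta = [g0 + g1 * s_i + rng.gauss(0.0, t) for (g0, g1), t in zip(gamma, tau)]
    # Stage 1: latent contribution of each nurse j in the unit
    latent = []
    for x in X_i:
        mean_ij = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
        latent.append(mean_ij + rng.gauss(0.0, sigma))
    # Only the unit total y_i is observable; the latent terms are not.
    return latent, sum(latent)

rng = random.Random(0)
# Hypothetical setting: one individual-level predictor, two nurses.
gamma = [(1.0, 2.0), (0.5, -1.0)]
X_i = [[1.0], [3.0]]
latent, y_i = simulate_unit(rng, s_i=0.5, X_i=X_i, gamma=gamma,
                            tau=[0.3, 0.1], sigma=0.2)
```

An analyst would see only `y_i`, `s_i`, and `X_i` for each unit, which is why a conventional multilevel regression that needs `latent` as its outcome cannot be fitted directly.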
To facilitate the interpretation of the final model, we now write out the Stage 1 and Stage 2 models corresponding to the final model; i.e., $\tilde{y}_{ij} = \beta_{0i} + \beta_{4i}(a)_{ij} + \beta_{6i}(e)_{ij} + \varepsilon_{ij}$ (Stage 1); $\beta_{0i} = \gamma_{00} + \gamma_{01}s_i + u_{0i}$, $\beta_{4i} = \gamma_{40} + \gamma_{41}s_i$, $\beta_{6i} = u_{6i}$ (Stage 2). Some interesting conclusions can be drawn.
1. Of the six individual-level predictors, only $a$ and $e$ remain in the final model, and both are interdependent nursing activities in coordinating patient care. This indicates that the coordination between nurses and with other healthcare professionals may be more important for reducing falls, compared with independent nursing activities.
2. $\beta_{0i}$ is the mean number of falls in unit $i$ when the levels of nursing activities $a$ and $e$ are equal to their respective global means. $\beta_{0i}$ is affected by $s$ according to the Stage 2 model; also, the coefficient for $s$, $\gamma_{01}$, is negative. This implies that a unit with a high level of $s$ will have a lower mean number of falls than a unit with a low level of $s$, even when the nurses in the two units have the same levels of "assisting" and "exchanging" activities. Recall that $s$ is, hypothetically, a composite measure of the extent to which the unit has characteristics that facilitate nursing quality improvement. Our finding confirms this hypothesis. In addition, $\beta_{0i}$ includes a random effect, $u_{0i}$, whose variance is significant. This implies that hospital units vary in their mean number of falls even after controlling for $s$. For example, if units have $s = 0.0317$, then on average these units will have a mean number of falls equal to $\gamma_{00} + 0.0317\gamma_{01} = 3.53$.
3. $\beta_{4i}$ reflects the strength of association between a nurse's assisting activity in unit $i$ and the number of falls. $\beta_{4i}$ is affected by $s$ according to the Stage 2 model; also, the coefficient for $s$, $\gamma_{41}$, is negative. This implies that in a unit with a high level of $s$, increasing the assisting activity of nurses will reduce the number of falls more than in a unit with a low level of $s$. In addition, $\beta_{4i}$ does not include a random effect. This implies that after controlling for $s$, hospital units behave similarly in terms of the strength of association between nurses' assisting activities and the number of falls; i.e., little variability in the strength of association remains to be explained. Furthermore, $\gamma_{40} + \gamma_{41}s$ can be used to estimate the average strength of association between nurses' assisting activities and the number of falls across the population of units with the same $s$. Note that because $\gamma_{40} < 0$, $\gamma_{41} < 0$, and $s > 0$, the strength of association is always negative, implying that nurses' assisting activities will reduce the number of falls regardless of the unit to which the nurses belong.
4. $\beta_{6i}$ reflects the strength of association between a nurse's exchanging activity in unit $i$ and the number of falls. $\beta_{6i}$ includes a random effect, implying that hospital units vary in terms of the strength of association between nurses' exchanging activities and the number of falls. However, this variability cannot be accounted for by $s$. Furthermore, as $\beta_{6i}$ does not include any fixed effect, on average there is little association between nurses' exchanging activities and the number of falls.
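The way the unit-level predictor moderates the assisting effect can be illustrated numerically. The sketch below evaluates the Stage 2 expression for the assisting coefficient at two levels of s; the function name `conditional_coefficient` and the coefficient values are hypothetical, chosen only to satisfy the sign pattern stated above, and they are not the Table 8 estimates.

```python
def conditional_coefficient(gamma_q0, gamma_q1, s):
    """Expected Stage 1 coefficient gamma_q0 + gamma_q1 * s for units sharing score s."""
    return gamma_q0 + gamma_q1 * s

# Hypothetical values for illustration only (NOT the Table 8 estimates);
# both are negative, matching the signs reported for gamma_40 and gamma_41.
gamma_40, gamma_41 = -0.20, -0.50

low_s, high_s = 0.5, 2.0
slope_low = conditional_coefficient(gamma_40, gamma_41, low_s)    # about -0.45
slope_high = conditional_coefficient(gamma_40, gamma_41, high_s)  # about -1.20
```

With both coefficients negative and s positive, the slope is negative at every level of s and becomes more negative as s grows, which is the pattern interpreted above as assisting activities reducing falls more strongly in units with a high level of s.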

Conclusions
This article proposed a multilevel model to link individual- and enterprise-level predictors with an enterprise-level quality outcome for enterprise quality improvement. Unlike conventional multilevel regression, which requires the outcome to be at the individual level, the proposed model treats each individual's contribution to the enterprise quality outcome as a latent variable. An algorithm was proposed to estimate the model parameters, which integrates the Fisher scoring (FS) algorithm and generalized least squares estimation. Simulation studies were conducted to assess the performance of the proposed model in comparison with the aggregate model, which aggregates the individual-level predictors to the enterprise level, and the gold standard model, which assumes that each individual's contribution to the enterprise quality outcome is explicitly known. These studies showed that the proposed model performs close to the gold standard model in terms of the biases and standard errors of the estimates for the fixed effects, the variances of the random effects, and the residual variance. In contrast, the aggregate model leads to much larger standard errors of the estimates for the fixed effects and a much larger bias and standard error of the residual variance estimate; also, the aggregate model cannot separate the within- and between-enterprise variations by providing separate estimates for the random effects variances and the residual variance. The proposed model was applied to a real-world application of nursing quality improvement. Our findings showed that the interdependent nursing activities in coordinating patient care, especially the assisting and exchanging activities, have a significant impact on reducing patient falls. Also, our findings confirmed that the unit infrastructure, climate, and administration strategies that are hypothesized to improve nursing quality do help significantly reduce falls.
In addition, the assisting activity of each nurse and the quality-improving infrastructure, climate, and administration strategies of the nurse's unit promote each other in reducing falls.
Finally, we point out several future research directions. Multiple quality outcomes, such as falls, medication errors, and patient satisfaction, may be considered together, which would require developing a multilevel model with multiple latent responses. Also, a generalized multilevel model may be more appropriate considering that the response variables may not all be strictly normal. Furthermore, robust model estimation with a large number of predictors and limited sample sizes may be studied. In addition to the methodological development, it would be of interest to investigate the difference between hospital units in terms of how the individual- and unit-level predictors affect nursing quality outcomes and to formulate specific quality improvement plans for each unit. Also, instead of using a composite measure as the unit-level predictor, including each specific quality-assuring measure as a predictor may reveal more insights by allowing the effectiveness of each measure to be assessed.

Funding
This work is partly supported by the Interdisciplinary Nursing Quality Research Initiative program at the Robert Wood Johnson Foundation and the National Science Foundation under grants CMMI-0825827 and CMMI-1069246.

Supplemental Material
Supplemental data for this article can be accessed at the publisher's website.