Mixed-Effects Models with Crossed Random Effects for Multivariate Longitudinal Data

Abstract Multivariate models for longitudinal data attempt to examine change in multiple variables as well as their interrelations over time. In this study, we present a Mixed-Effects Model with Crossed Random effects (MEM-CR) for individuals and variables, and compare it with two existing structural equation models for multivariate longitudinal data, namely the Curve-of-Factor-Scores (CUFFS) and the Factor-of-Curve-Scores (FOCUS). We compare these models in two types of longitudinal studies based on balanced and unbalanced data: panel studies and cohort-sequential designs, respectively. We illustrate the performance of these statistical techniques using empirical data from two studies (MHS, a panel study, and NLSY79, a cohort-sequential design) with discrete and continuous time metric modeling, respectively. We conclude that MEMs-CR provides relevant information about the developmental trajectories of individuals and variables in multivariate longitudinal data under either type of data condition. We discuss the theoretical and methodological implications of our findings.

Longitudinal studies allow researchers to model psychological processes as they unfold over time and identify possible causes and interrelations of such changes (e.g., Schaie, 1965). These studies are characterized by repeated measurements on the same individuals across different measurement occasions. Thus, the corresponding statistical models aim at capturing the developmental trajectories, considering both intra-individual change and inter-individual differences in such change. In the case of multiple processes, McArdle (2009) described different modeling approaches for multivariate longitudinal data, being two of the most common models are the Curve-of-Factor Scores (CUFFS) and the Factor-of-Curve Scores (FOCUS). These models include second-order growth factors that can explain the general developmental processes of multiple variables over time (McArdle, 1988). As an alternative model for multivariate longitudinal data, we propose Mixed-Effects Models with Crossed Random effects (MEMs-CR). Typically, MEMs-CR is used to model multivariate nested data, such as students from schools within neighborhoods (e.g., Garner & Raudenbush, 1991;Raudenbush, 1993;Leckie, 2013), or individuals and items within experimental designs (e.g., Baayen et al., 2008;Hoffman & Rovine, 2007;Mart ınez-Huertas et al., 2022;Quen e & van den Bergh, 2004). In this paper, we extend the use of MEMs-CR models to multivariate longitudinal data and compare it with the CUFFS and FOCUS models.

Curve-of-Factor Scores (CUFFS)
The CUFFS model is an extension of univariate latent growth models in which multiple items or variables are measured at each measurement occasion, so the growth represents an underlying construct (McArdle, 1988). Following McArdle (1988; see also Ferrer et al., 2008;Isiordia & Ferrer, 2018;Hancock et al., 2001), the CUFFS model is a combination of a measurement model and a growth model. Given three variables X, Y, and Z, the equation for each variable can be written as where X, Y, and Z, represent manifest variables for person i at a time t, k represents the factor loading, and e represents the unique factor score. The latent factor f ti is a function of the growth process of individual i at the measurement occasion t for the three measured variables X, Y, and Z. These latent factors are then specified as a function of a latent intercept (f 0i ), and latent slope (f si ), a specific curve parameter (b t ), and an error term or time specific variability at measurement occasion t for individual i (r 2 fti ). The latent intercept and slope factors are expressed as: where both factors have average intercepts (l 0i ) and slopes over time (l si ) general to the three measured variables X, Y, and Z, as well as residuals with variance among individuals (e (2) f0 , and e (2) fs , respectively).

Factor-of-Curve Scores (FOCUS)
The FOCUS model is also a multivariate extension of the univariate latent growth model. It allows examining changes in multiple variables over time as well as the interrelations of such changes. Instead of specifying the interrelations of intercepts and slopes as covariances, this model specifies higher-order common factors that capture the relations among lower-order developmental processes in terms of intercepts and slopes (McArdle, 1988). While the CUFFS model has been applied more frequently than the FOCUS (Isiordia et al., 2017), the latter model has important features for the study of multivariate longitudinal data. In a standard specification, the FOCUS model has different firstorder univariate latent growth models for each variable and then different second-order latent factors representing common variance in the intercepts and slopes of the lower-order factors (McArdle, 1988). Equation 3 represents the latent growth curves for three variables X, Y, and Z: where X ti , Y ti , and Z ti represent the observed scores for each variable measured at time t for individual i. Each observed variable is a function of a latent intercept (x 0i , y 0i , and z 0i ) and a latent slope (x si , y si , and z si ). These intercepts and slopes are then specified in terms of second-order factors f 0i and f si : These latent factors represent the common variance among the intercepts (f 0i ) and slopes (f si ), respectively. Here, the coefficient k denotes the factor loading linking the intercepts and slopes to the corresponding second-order factors, which, themselves, have variances and covariances.
The underlying idea of both CUFFS and FOCUS models is to describe a general developmental process underlying multivariate longitudinal data. An important difference between both models is the specification of their first-order latent factors. In the CUFFS model, these first-order latent factors represent a latent state of the developmental process for all the measured variables, whereas in the FOCUS model they represent different growth curves per variable. Consequently, the second-order latent factors describe the initial state and the general developmental change per measurement occasion representing a different underlying developmental process. A detailed explanation of these models can be found in the literature (McArdle, 1988(McArdle, , 2009).

Mixed-Effects Model with Crossed Random Effects (MEM-CR)
Mixed-Effects Models (MEMs) have been used to analyze longitudinal data (e.g., Bryk & Raudenbush, 1987;Cudeck, 1996;Singer & Willett, 2003). Whilst most of the implementations have been done in univariate processes, only occasionally these models have been applied to bivariate (e.g., Garner & Raudenbush, 1991) or multivariate processes (e.g., Raudenbush et al., 1995;Hoffman, 2015;MacCallum et al., 1997;Snijders & Bosker 2012). MEMs-CR is a mixed model that considers different sources of variability simultaneously. From this perspective, we propose MEMs-CR as a novel and valuable alternative tool for the analysis of multivariate longitudinal data.
MEMs-CR has been applied to multivariate experimental data that include, for example, neighborhoods and schools (Garner & Raudenbush, 1991;Leckie, 2013;Raudenbush, 1993) or individuals and items (Baayen et al., 2008;Hoffman & Rovine, 2007;Quen e & van den Bergh, 2004). Usually, the inclusion of random effects in MEMs-CR is due to the adverse statistical effects in the parameter estimates when such random effects are ignored (e.g., Hoffman, 2015;Hox et al., 2018;Meyers & Beretvas, 2006). Moreover, including random effects has been proposed from a confirmatory perspective to study the empirical variability of within-subject effects in individuals or items (Barr, 2013). In the present study, we consider measurement occasions in longitudinal multivariate data similarly as those withinsubject effects. Specifically, we extend the use of crossed random effects for studying variation in both individuals and variables. This implementation allows examining a general trajectory common to all the participants and variables, but also the specific trajectories of both participants and variables. Equation (5) 1 describes a MEMs-CR where each individual i is measured on several variables v at different occasions t: Here, the fixed effects represent the mean intercept or grand mean of the sample for all variables (l 0iv ), and the general slope (l 1iv ) effects of measurement occasion (or age). The subscripts of the fixed effects indicate that they vary for individuals and variables, which are the random effects. Specifically, y 0iv is a function of the random variance of the intercept of individual i and variable v, while y 1iv is a function of the random variance of the slope of individual i and variable v. Thus, the error term (e ivt ) depends on the individual, the variable, and the age. The random effects of this model are: r (2) 0i is the variance of the intercepts of individuals, r (2) 0v is the variance of the intercepts of variables, r (2) 1i is the variance of the slopes of individuals, and r (2) 1v is the variance of the slopes of variables. When specifying both individuals and variables as random effects in MEMs-CR it is important to consider the socalled exchangeability principle in hierarchical linear models (Lindley & Smith, 1972;Raudenbush, 1993). According to Raudenbush (1993), the use of two different crossed random effects (such as individuals and variables) implies the assumptions of exchangeability for the levels of such random effects. The exchangeability of participants is related to the independent and identically distributed probability distribution of the variability of individuals around the effect of age (when age is used as the time metric). This means that there is a general fixed effect that explains the developmental change of all individuals and that individuals present some differences around that effect. This is a common interpretation for random effects in MEMs and SEMs. However, the exchangeability of variables, which is also related to the independent and identically distributed probability distribution of the variability of variables around the effect of age, is not available in other approaches for multivariate longitudinal data analysis. This means that there is a general fixed effect that explains the developmental change in all the variables and that variables present some differences around that effect. Thus, variables present idiosyncratic variability around that mean effect of time with their own independent and identically distributed probability distribution.
In multivariate longitudinal data, individuals and variables are fully crossed: all the variables are measured in all the individuals, except for cases of missing data (e.g., an individual could have been measured on some but not all the variables at some given occasion). Thus, individuals and variables can be parameterized as crossed random effects. However, time can be modeled differentially depending on the type of longitudinal design (e.g., panel study vs. cohortsequential design, see next section where time is coded as discrete or continuous). As we describe in a subsequent section, different longitudinal designs can lead to different samplings of time in which individuals are assessed on different measures at different time points, which is especially important for partitioning the variance in SEMs and MEMs-CR.

Balanced vs. Unbalanced Data and Discrete vs. Continuous Time in Longitudinal Designs
To study how individuals change over time and capture key characteristics of psychological processes, researchers use different designs for data collection (e.g., Dormann & Griffin, 2015;Finkel, 1995;Johal et al., 2021). Two of the most common longitudinal designs are panel studies and cohort sequential designs.
Panel designs involve repeated assessments of a given sample at approximately equal time intervals. The UK's Household Longitudinal Study (UKHLS) is an example of a large panel study (Buck & McFall, 2011;Lynn, 2009). This is a longitudinal study from a representative sample of UK households with annual assessments. In this panel study, participants are assessed at approximately the same measurement occasion, which is used as the unit to evaluate intra-individual change (e.g., Lacey et al., 2019;Tippett et al., 2013;Whitley et al., 2016).
In cohort sequential designs, on the other hand, each individual is measured on a few occasions and these measurements cover only a fraction of the time range of the study (Bell, 1953(Bell, , 1954Duncan et al., 1996;McArdle, 1994;Schaie, 1965). An important example of a cohort-sequential design is the National Institute of Mental Health MRI Study of Normal Brain Development (NIH MRI; Evans & The Brain Development Cooperative Group, 2006). This is a longitudinal study from a representative sample of US children involving psychological and imaging data. In this study, children of different ages (cohorts) are assessed at different points of their development (i.e., the measurement of some cohorts starts earlier than others) to cover an extensive period of the developmental process. This type of study is capable of sampling different developmental points using different overlapped cohorts. Data from cohort-sequential designs are typically analyzed using age as the underlying unit of intra-individual change (e.g., Aubert-Broche et al., 2014;Pangelinan et al., 2011;Waber et al., 2007).
Panel studies and cohort sequential designs have their own advantages and disadvantages, but both are useful designs for collecting longitudinal data. Table 1 briefly summarizes some key features of both designs. Intra-individual change is typically defined by measurement occasions in panel studies, whereas it is commonly defined by age in cohort-sequential designs. This is because the measurement occasion and the interval between measurements can vary across individuals in cohort-sequential designs, while they are constant in panel studies. Thus, cohort-sequential designs tend to cover larger ranges of time than common panel studies. A direct consequence of data collection is the differential presence of missing data. In panel studies, missing data typically increases along measurement occasions due to attrition. In cohort sequential designs, missing data is considerably higher per measurement occasion or age, as only a fraction of individuals is measured at each specific occasion, with possible differences in the intervals between occasions.

The Present Study
Two common models for analyzing multivariate longitudinal data are CUFFS and FOCUS models (McArdle, 1988). Here, we propose MEMs-CR as an alternative model taking advantage of the fact individuals and variables are fully crossed in longitudinal designs (except for cases of missing data). Using individuals and variables as random effects in MEMs-CR allows for studying their variability thus capturing their idiosyncrasies. CUFFS, FOCUS, and MEMs-CR models are hypothesized to describe the general trajectories similarly. This is because they all assume that there is a general trajectory common to all the variables in the study. We maintain, however, that the MEMs-CR model captures the variability of both individuals and variables uniquely. Moreover, this approach provides greater flexibility for specifying the underlying time metric as either discrete or continuous, relative to SEM models (please see Driver et al., 2017;Oud & Jansen, 2000;Voelkle et al., 2012; for some proposals on SEM using continuous time). In the next section, we illustrate the use of MEMs-CR with multivariate longitudinal data and compare it with the CUFFS and FOCUS models.

Empirical Examples
The first empirical example consists of data from a panel study (Motivation in High School Project;Ferrer & McArdle, 2003) involving 253 participants (107 women) ranging from 14 to 17 years (M ¼ 14.48, SD ¼ .84). The second empirical example involves data from a cohort-sequential design (NLSY79 study; Center for Human Resource Research, 2009;Chase-Lansdale et al., 1991) with 9,261 participants (4,602 women) ranging from 3 to 22 years (M ¼ 9.79, SD ¼ 2.83). Table 2 presents the descriptive statistics of the variables used in the present analyses (see also other descriptive analyses of the NLSY79 study data in Supplementary Table S1). The panel study uses measurement occasion as the time metric that defines intra-individual change, whereas the cohort-sequential design uses age. For each data set, we implement the MEMs-CR, CUFFS, and FOCUS models using two different perspectives: (1) we specified the CUFFS and FOCUS models to approximate MEMs-CR, and then compared the parameter estimates of the trajectories and the model fit across models; and (2) we specified each model attempting to maximize its unique features. For the former, we examined convergence between the three models, trying to maintain their equivalence. For the latter, we tried to maximize the strength of each model and examined their divergences (please see the Supplementary Materials). In our analyses, we interpret the estimates of the models for pedagogical purposes but recommend an integrative perspective between model misfit and theory.
We implemented the CUFFS and FOCUS models using the R's lavaan package (Rosseel, 2012) for measurement occasion and Mplus 7 using TSCORES for age (Mehta & Neale, 2005;Mehta & West, 2000;. For the MEMs-CR models, we used the lme4 package (Bates et al., 2015) in R software (R Development Core Team, 2019) in both data sets. Additionally, we computed the standard errors of the random effects from MEMs-CR using the arm package (Gelman et al., 2013). To facilitate the comparison of parameter estimates, we standardized all the variables based on their first measurement. All analyses were carried out using maximum likelihood (ML) estimation and REML. While previous research has found slight differences between ML and REML estimations (Jiang, 1996;Mart ınez-Huertas et al., 2022;Morrell, 1998;Thompson, 1962;West et al., 2014), we did not find substantive differences in the analyses for this report. Thus, we only report ML results for MEMs-CR to ease the comparison with CUFFS and FOCUS models. Code for all analyses is available in the Supplementary Materials. Data from the MHS is available upon request. Data from the NLYS is available from https://www.bls.gov/nls/nlsy79.htm.

Panel Study: Motivation in High School Project
To illustrate the use of these methods in a panel study, we selected data from the Motivation in High School Project (Ferrer & McArdle, 2003). This study was conducted to examine changes in self-perceptions among high school students. A total of 261 adolescent students were assessed on By design (and attrition) Note: Differences between designs are present in relative terms to ease their comparison. four measurement occasions during a school semester, with intervals of about six weeks. In this example, we use four variables from the Self-Profile of Adolescent Scale (Harter, 1985), namely, perceived competence, perceived appearance, general self-worth, and physical self-worth. According to Harter's theory (1985), a general longitudinal trajectory can be expected for all these variables, together with differences across individuals and variables. More information about these variables and details of the study are available elsewhere (Ferrer & Gonzales, 2014;Ferrer & McArdle, 2003;Ferrer et al., 2008;Isiordia & Ferrer, 2018). Figure 1 presents longitudinal data for all variables for four individuals. This figure illustrates considerable differences in the trajectories across individuals and variables. In addition, missing data in this study is scarce. Table 3 presents the results from the CUFFS model using a linear parameterization to match SEMs with MEMs-CR. 2 The parameter estimates include a general intercept not different from zero (l f0 ¼ À.069, p ¼ .56) and a statistically significant linear slope (l fls ¼ .033, p < .05). Both the intercept and the slope of the general growth curve represent self-perceptions present significant variances (r (2) f0 ¼ .422 and r (2) fs ¼ .017, respectively) indicating individual differences in both initial status and growth over time. There is also a covariation between these latent factors (r f0-fs ¼ À.028, p < .05), which translates into a medium correlation (q f0fs ¼ À.33): individuals with lower initial scores in the selfperception profile tend to have higher rates of change across time. Regarding the factorial structure, invariance is specified across time points, and the factor loadings indicate that the four observed variables contribute equally and significantly to the factor on each occasion. Table 4 presents results from the FOCUS model using a linear parameterization to match SEMs with MEMs-CR. Results showed an overall factor intercept not different from zero (l f0 ¼ À.004, p ¼ .94), due to the variables being standardized based on their first-time point. Similarly, the overall factor slope was not different from zero (l fs ¼ .020, p ¼ .10). Both the intercept and slope factors showed significant variances (r (2) f0 ¼ .548, p < .001, and r (2) fs ¼ .023, p < .001, respectively) indicating individual differences in the initial levels and the longitudinal growth. A statistically significant covariance between the factors (r f0-fs ¼ À.043, p < .001; q f0fs ¼ À.382) denotes that individuals with lower initial scores in the self-perception factor tend to have higher rates of change across time. Loadings for the intercept and slope factors were all statistically significant indicating that all the self-perception variables contributed to the definition of the second-order common factors. Given the standardized firstorder slope loadings that were estimated, a latent (not-linear) basis will be explored later because it could be a better choice for these data.
Results from the MEMs-CR are presented in Table 5. Because all the variables were standardized based on the first time point, the intercept was not statistically significant   The loadings of the second-order intercept factor were fixed to 1. c The loadings of the second-order slope factor were fixed to (0, 1, 2, 3). The First-order parameters are estimated but not shown.
2 As we will see later, it would be recommended to use a latent basis in some situations.
(l 0iv ¼ À.032, p ¼ .56). Similarly, the linear slope of measurement occasion was not different from zero (l 1iv ¼ .036, p ¼ .17). The correlation between the general intercept and the slope was À.094, indicating that individuals with lower initial levels of the various self-perception variables tend to present higher longitudinal growth. As we will see later, the lack of statistical significance is mainly related to the probable overparameterization of the random structure imposed in this analysis. The random effects show differences in the intercepts across individuals. Moreover, they also show differences in their slopes, as the estimated cluster-specific variability (r (2) 1i ¼ .035) is substantial relative to the corresponding fixed effect. In contrast, the variances in the intercept and slopes across the variables were negligible. The second to last set of rows in Table 5 reports the estimated intercepts and slopes for each variable. These estimates show relative variation across variables. For example, a significant effect of time was observed in the slopes of perceived competence (.080, 95%CI [.055-.104]), perceived appearance (.031, 95%CI [.007-.056]), and psychical selfworth (.048, 95%CI [.024-.073]). On the contrary, global self-worth showed an effect of time not different from zero (À.015, 95%CI [À.040 to .001]) and the most negative intercept (À.071, 95%CI [À.089 to À.053]). Figure 2a displays the mean predicted trajectories from CUFFS, FOCUS, and MEM-CR. The estimated linear trend of the three models is similar, except for the FOCUS model, which showed a less pronounced increase over time. Also, the estimated mean effect at the first measurement occasion was slightly smaller for the CUFFS than for the MEM-CR. Figure 2b presents the specific longitudinal trajectories predicted for each variable, highlighting the differences across them. Whereas perceived competence, perceived appearance, and psychical self-worth showed a positive linear trend across measurement occasions, global self-worth showed a negative not-different from zero trend). Figures 2c and d present the longitudinal trajectories of two different individuals. Individual 1 (panel c) shows a higher level in the general variable throughout the study but does not show longitudinal changes. On the contrary, Individual 2 (panel d) shows a mean level at the beginning of the study but presents a large change along the study. There are small differences in the longitudinal trajectories predicted by the models, but the predicted changes are very similar for both individuals.

Cohort-Sequential Design Study: National Longitudinal Survey of Youth 1979
To illustrate the use of the three proposed models with data from a cohort-sequential design, we used the child sample of the National Longitudinal Survey of Youth 1979 (NLSY79) study (Center for Human Resource Research, 2009;Chase-Lansdale et al., 1991). A total of 9,261 children were assessed on up to five measurement occasions, with different intervals for each individual. In the present study, we use six variables of the study: PIAT reading comprehension, WISC-R memory for digit span (backward), WISC-R memory for digit span (forward), PIAT mathematics, the Peabody picture vocabulary test-revised, and PIAT reading recognition. These variables represent processes thought to show substantial growth over the developmental period of this study (3 to 22 years). Moreover, although all variables represent cognitive abilities, they are expected to show important differences in their trajectories and across individuals (McArdle et al., 2002). We refer to the guide for data users of the NLSY79 study (Center for Human Resource Research, 2009) for a detailed description of these variables. For all analyses, the participants' age was centered on the youngest age in the sample and was used as a continuous-time metric. Figure 3 presents the observed longitudinal data for all the variables for four individuals. As was the case in the previous example, the data show considerable differences across both individuals and variables. Table 6 presents the results from fitting the CUFFS model to the NLSY79 data. Given the data, we specified a quadratic function using age as the underlying time metric, .169 .720 --Second-order intercept factor loadings k ÃÃ p < .01. a Standardized factor loadings were reported. The loadings of the first-order intercept factors were fixed to 1, and the loadings of the first-order slope factors were fixed to (0, 1, 2, 3). and this function was used for the remaining models as well. The first set of estimates represents the factor loadings relating each of the variables to the first-order factor. These loadings are all high and uniform, indicating a similar contribution to the factor from all variables, except for the two variables related to memory. The general intercept was negative (l f0 ¼ À1.819, p < .001), representing the initial status of the factor at the youngest age. The linear and the quadratic effects of age were both statistically significant (l fls ¼ .593, p < .001 and l qs ¼ À.016, p < .001, respectively). Jointly, these parameters indicate an overall increase in the rate of change common to the six variables of the study, but also a general negative quadratic effect reflecting a deceleration in the trajectory across the entire study. The results also showed statistically significant variances for the general intercept (r f0 ¼ 2.923, p < .001) and the general linear and quadratic age effects (r f0 ¼ .330, p < .001 and r f0 ¼ .001, p < .001, respectively). All three covariance parameters of the model were also statistically significant. These associations indicate that individuals with lower initial scores tend to have higher rates of linear change and smaller rates of quadratic change across time. Individuals with larger rates of linear change across time tend to present smaller quadratic rates of change. Table 7 presents the results from the FOCUS model. These results indicate a negative mean for the intercept (l f0 ¼ À2.501, p < .001), which represents the prediction for the initial status (at the youngest age) and has a statistically significant variance (r (2) f0 ¼ .207, p < .001). The linear age slope was positive (l fls ¼ .815, p < .001) with a statistically significant variance (r (2) fl ¼ .006, p < .001), but the quadratic effect was negative (l qs ¼ À.030, p < .001) with a variance that was fixed to zero due to converge problems. 3 The negative covariance between the intercept and the linear slope indicates that individuals with lower initial scores have slightly higher rates of change across age (r f0-fls ¼ À.008, p < .001). These parameters indicate a negative mean initial status, an overall linear increase, and a negative quadratic decrease over the age that is common to all six variables. Loadings for the slope factors were all fixed to one so that the six variables contributed to the definition of the age effects of the second-order common factors. Residual variances of the first-order factors could be interpreted as the deviation from the estimated parameters of the trajectories common to the six variables (as discussed later, these residual variances share some parallelisms with random effects of MEMs-CR). Table 8 presents the results from the MEMs-CR model. The mean of the intercept was statistically significant (l 0iv ¼ À1.99, p < .001), representing the prediction of the model for the initial status. The linear effect of age was also statistically significant (l 1iv ¼ .62, p < .05), but the quadratic effect was not (l 2iv ¼ À.02, p ¼ .97), as was the case for the FOCUS model. The correlations between the intercept and the linear and quadratic age slopes were positive and medium (q ¼ .340 and q ¼ .278, respectively), whereas the correlation between both slopes was negative (q ¼ À.269). These correlations suggest that individuals with lower initial levels of the various cognitive abilities tend to present higher linear and quadratic longitudinal growths, but individuals with larger linear changes across time also tend to present smaller quadratic longitudinal growth.
The random effects show that individuals present some variance in their intercepts and linear slopes, but not in the quadratic trajectories. In contrast, the random effects for the variables show large differences in the initial status, the linear and the quadratic slopes across variables. Some variables showed a more pronounced quadratic trajectory (PIAT reading comprehension, PIAT mathematics, Peabody picture vocabulary test-revised, and PIAT reading recognition), whereas others (WISC-R memory for digit span backward, and WISC-R memory for digit span forward) showed The covariances involving quadratic change were also fixed to zero.
primarily a linear change. Similarly, all the variables displayed large variability among individuals, both in terms of the intercept and the slopes. Figure 4a displays the mean predicted trajectories from the CUFFS, FOCUS, and MEM-CR models. The three estimated trajectories are similar, except for the CUFFS model which shows a less pronounced quadratic effect over time, and for the asymptote of MEMs-CR, which is lower than those from the two SEM models. Figure 4b presents the predicted trajectories for each variable. All the variables presented a clear quadratic effect of age, except for WISC-R memory for digit span-backward and forward-that presented a linear trend over time. Similarly, all variables presented different intercepts and changes over time, with PIAT reading comprehension showing the largest changes throughout the study. Figures 4c and d  longitudinal trajectories of two different individuals. In this case, both individuals show a similar longitudinal trajectory with small differences in their intercepts and asymptotes. The three models predict similar changes throughout the study, but they have a slightly different trajectory for older ages; that is, the influence of the quadratic effect was slightly different between the models. It is worth noting that fixing the variance of the second-order factor quadratic effect of the FOCUS model to zero generated the same quadratic effect for all the individuals.

presents the
The similarity of trajectories for the variables across the three models points to some parallelisms between the random effects of MEMs-CR and the residual variances of the first-order factors of the FOCUS model. For example, those variables with lower negative intercepts in MEMs-CR showed larger residual variances in their FOCUS intercept factors (i.e., they had the largest deviation from the estimated mean for the general intercept). Similarly, those variables with smaller linear increases over time in MEMs-CR showed larger residual variances in their FOCUS linear effect factors (i.e., largest deviation from the estimated mean for the general linear effect of age). It seems reasonable to say that the CUFFS model estimates the general trajectory common to all the variables of the study, whereas the FOCUS and the MEMs-CR can also extract information about the variability of different variables around a common trajectory. In this line, MEMs-CR allows to estimate coefficients related to both the general and the specific trajectories in the same model, while the FOCUS model generates almost the same trajectory for all the variables and, through the residual variances, allows one to identify whether the such trajectory is adequate for each variable. As described later, if estimating the trajectory common to all the variables is not the goal, there are ways to estimate the specific trajectories of each variable in the FOCUS model.

Comparing CUFFS, FOCUS, and MEMs-CR Model Fit
We do not think that it is possible to directly compare the fit of models from different families, like SEMs and MEMs, using the log-likelihood calculation (or its derivates like AIC or BIC) because of the differences in how the models and their likelihoods are computed 4 (e.g., using FIML estimation in SEMs and ML in MEMs, or using different software and packages for each model). We believe, however, that one reasonable solution is to compare the fit of CUFFS, FOCUS, and MEMs-CR models using the residuals of the predictions of the trajectories because these are common to the three models. In the previous section, we presented the trajectories of the different models for each data set. Here, we examine model fit based on the predictions of individual trajectories through the residuals, which were computed as the root mean squared error (RMSE) between the estimated scores in each model and the observed scores for each indi-  The RMSE was computed separately for each model in each data set 5 (see the distribution of the residuals in Figure 5). In the panel study, the RMSE was computed based on 3,388 observations (i.e., from 263 individuals measured on four variables across up to four-time points). The CUFFS model showed a mean RMSE of .39 (SD ¼ .19, Mdn ¼ .35), ranging from .11 to 1.25. For the FOCUS model, the mean RMSE was .66 (SD ¼ .20, Mdn ¼ .64), ranging from .15 to 1.99. For the MEM-CR, the mean RMSE was .40 (SD ¼ .34, Mdn ¼ .31), ranging from 0 to 2.24. These results indicate that the CUFFS and MEM-CR models yielded smaller mean residuals than the FOCUS, but the MEM-CR produced the largest range of residuals. Thus, while the mean residual of MEMs-CR is similar to CUFFS, there is significantly more variability in its error predictions, relative to both SEMs.
In the cohort-sequential study, the RMSE was computed based on $159,248 observations (i.e., 9,621 individuals measured on six variables across up to five-time points). The CUFFS model showed a mean RMSE of .68 (SD ¼ .21, Mdn ¼ .65), ranging from .14 to 2.31. For the FOCUS, the mean RMSE was .86 (SD ¼ .37, Mdn ¼ .78), ranging from .19 to 5.14. The MEM-CR showed a mean RMSE of .33 (SD ¼ .28, Mdn ¼ .27), ranging from 0 to 3.66. In these data, both the MEM-CR and the FOCUS presented the largest range of residuals, compared to the CUFFS model. But the MEM-CR generated the lowest mean RMSE, which indicates that its predictions were more accurate than those from the SEMs models.

Maximizing Unique Features of MEMs-CR, CUFFS, and FOCUS Models
In this section, we expand the previous specification for each model and focus on their unique features. Whereas in the previous section we attempted to find convergence among the MEMs-CR, CUFFS, and FOCUS models, our goal here is to highlight their differences.
For the panel data, following the results from the CUFFS and FOCUS models, we specified a linear slope for the MEMs-CR. This was not an optimal specification because this function may be biasing the longitudinal effects of measurement occasion in this data set. This is an important limitation of MEMs-CR models, as they cannot accommodate a latent basis like SEMs. However, there are strategies ÃÃ p < .01. First-order factor intercepts were fixed to 0, and their variances were fixed to 1. Residual variances of observed variables were constrained to be equal. 5 The available information for computing RMSE was not exactly the same for all models due to differences in the number of available data across individuals. Thus, model predictions were more probable when individuals had more observations. Thus, RMSE was computed using more information for MEMs-CR than for the SEMs. Although this is a limitation when comparing the fit across models, we decided to use all the available information in each model as a natural analytical strategy. that can be implemented to maximize the strength of each model. We report those here.
Supplementary Tables S2 and S3 present the results for the CUFFS and FOCUS models using a latent basis, respectively. As expected, a latent basis revealed a non-linear effect of measurement occasion. The estimate representing measurement occasion as a latent basis was larger than that using a linear trend (see Tables 3 and 4), but the FOCUS model did not obtain a statistically significant effect of time. As we discuss later, the lack of statistical significance in the FOCUS model and some versions of MEMs-CR could be related to the overparameterization of the time effect of variables in this data set (which seem to present a similar effect of measurement occasion).
For the MEMs-CR model, we followed two different strategies. First, we reduced the complexity of the model's random structure. Second, we used a latent basis using the estimates of the SEM modes as the time metric. Results can be found in Supplementary Tables S4-S6. First, given that we did not find variability in the random slopes of variables (see Table 5), this random effect was fixed to zero. This yielded a measurement occasion that was statistically significant (l 1iv ¼ .04, p < .05), and similar across all variables, and with no other changes from the previous results (Supplementary Table S4). Second, we used the expected values of the latent basis of the SEM models (i.e., the mean of the factor loadings of the second-and first-factors CUFFS and FOCUS, respectively) as the levels of the fixed effect of measurement occasion. This did not result in a statistically significant measurement occasion, but the actual estimate was similar to that of the CUFFS model (Supplementary Table S5). As was the case before, no variability was found between the trajectories of the variables. Thus, we fitted a simpler model without random slopes for variables. This produced a statistically significant measurement occasion effect, which was close to the estimate from the CUFFS model (l 1iv ¼ .12, p < .001) with a slope similar across all variables (Supplementary Table S6).
For the cohort-sequential data, we used the procedure TSCORES in Mplus to model the effects of continuous time in CUFFS and FOCUS models. We were interested in estimating a trajectory common to all variables to match the SEM models and the MEMs-CR. Given that the purpose of the CUFFS is to estimate the trajectory common to all variables, an ideal specification to analyze these data is the one presented in Table 6. For the FOCUS model, it is possible to estimate the trajectories of all variables together with the common factors of time effects, but this specification does not generate a general trajectory. For MEMs-CR, a model selection strategy could be followed to determine the most appropriate random structure to analyze the data. For the MEMS-CR we followed a bottom-up model selection based on likelihood ratio tests and found that using both intercepts and random slopes for all the effects was the most appropriate random structure (Supplementary Table S8).
The results of the selected model were those reported in Table 8.
For the FOCUS model, we estimated the trajectories of all the variables and fitted second-order factors with means fixed to zero and free variances (see Supplementary  Table S7). Results indicated that the first-order factor means were statistically significant and presented a similar trend: negative intercepts, positive linear effects of age, and small negative quadratic effects of age. Here, we report the estimates of PIAT reading comprehension and WISC-R memory for digit span (backward) and compare them with estimates from the MEMs-CR model. The PIAT reading comprehension showed a negative mean for the intercept (l rc0 ¼ À2.591, p < .001) with a non-significant variance (r (2) rc0 ¼ .080, p ¼ .08), a positive linear effect of age (l rc1 ¼ .837, p < .001) with a statistically significant variance (r (2) rc1 ¼ .014, p < .01), and a small negative quadratic effect of age (l rc2 ¼ À.033, p < .001) with statistically significant variance (r (2) rc2 ¼ .000, p < .001). The WISC-R memory for digit span (backward) presented a negative but lower intercept mean (l mds0 ¼ À1.112, p < .001) with a not-statistically significant variance (r (2) mds0 ¼ .078, p ¼ .69), a smaller positive linear effect of age (l mds1 ¼ .395, p < .001) with a notstatistically significant variance (r (2) mds1 ¼ .041, p ¼ .28), and a smaller negative quadratic effect of age (l mds2 ¼ À.016, p < .001) with a non-significant variance (r (2) mds2 ¼ .000, p ¼ .24). These estimates are similar to those from the MEMs-CR (see Table 8), and they correspond with the residual variances of the first-order factors of the FOCUS model reported in Table 7. On the other hand, the means of the second-order factors of the FOCUS model were fixed to zero to estimate a general trajectory, although their variances were statistically significant (r f0 ¼ .295, p < .001; r fs ¼ .055, p < .001; r fq ¼ .000, p < .001), showing individual variability in these estimates.

Summary of Findings
In this study, we discussed three models for analyzing multivariate longitudinal data. Two of these, the CUFFS and the FOCUS, are SEMs. The third model, MEMs-CR for individuals and variables, was a multilevel model. We implemented these models in two types of longitudinal data with the goal of illustrating the use of the less standard MEMs-CR for the study of multivariate processes. Our results indicate that MEMs-CRs are a useful technique for the analysis of multivariate longitudinal data using both discrete (measurement occasion) and continuous (age) time metrics. In the following paragraphs, we summarize the general findings.
First, although some differences were apparent among the three models, the findings pertaining to the general growth of the variables were equivalent among all models. The estimates representing measurement occasion and age were similar across the models, with some exceptions regarding statistical significance. This overall convergence indicates that the three models described a similar underlying model of change, representing a general longitudinal trajectory for all individuals and variables, in addition to the specific idiosyncrasies of each model. Furthermore, the predictions of individual trajectories were also similar across models. In other words, the three models captured the relevant variability and thus were able to make accurate predictions, with very small differences between models at the individual level. Here, it is worth mentioning that model fit measures based on the individual predictions showed differences between the panel study and the cohort sequential design. In the panel study, we found that MEMs-CR and CUFFS models had lower residuals than FOCUS, while MEMs-CR had lower residuals than both SEMs in the cohort sequential design. In either case, the range of residuals was larger in the MEMs-CR in both data sets. Thus, while the mean performance was better for MEMs-CR than for the SEMs in general but especially in the cohort sequential design, the MEMs-CRs present larger variability in their performance. This variability could be explained by the larger amount of available information to compute RMSE in MEMs-CR compared to SEMs, being less probable to obtain model predictions for participants with very few observations in SEMs.
Second, the MEMs-CR provided relevant estimates of both the general and the specific trajectories of the individuals and the variables in the data. This implies that it is possible to obtain a general average effect common to all individuals and variables. This fixed effect should be understood as the general effect of time (or age) representing the intra-individual change along the longitudinal study. Nevertheless, it is also possible to obtain a specific trajectory for each of the individuals and each of the variables in the study. That is, the random effects of MEMs-CR can inform about the unique trajectories of individuals and variables, given sufficient variability in the data. These random effects can be used to study the idiosyncrasies of the longitudinal trajectories of specific individuals or variables.
For the SEM models, to the best of our knowledge, it is not possible to obtain both the general and the specific trajectories of the variables. The CUFFS model allows detecting the trajectory common to all the variables. Conversely, the FOCUS model was precisely developed to identify the trajectories specific to the variables, while adding second-order factors that capture general patterns of change. In the first set of results, the parametrization of the FOCUS was adapted to match the one of MEMs-CR (that is, estimating the general trajectory of the variables, but not their specific trajectories). Using this specification, we observed that the general trajectory was similar to that of the MEMs-CR and that the residual variances of the first-order factors were comparable to the random effects of the MEMs-CR. In the second set of results, we used a popular specification of the FOCUS model, where the specific trajectories of variables were estimated, and different second-order factors (whose means were fixed to zero) were added. As expected, the specific trajectories of the variables of the FOCUS shared the same patterns as those from the MEMs-CR, but we were not able to obtain a general trajectory using this specification. Thus, we believe that a MEM-CR can provide relevant information pertaining to both the general and the specific trajectories of the different variables in a study.
Third, we found differences in convergence among the models. We encountered some convergence issues in the SEMs and solved most of them by imposing constraints on some parameters. In the case of the CUFFS model, we only had to fix some factor loadings and some means to set the metric of the latent factors. In the case of the FOCUS model, different relevant constraints were imposed to identify the model and solve convergence problems. In particular, the variance and covariances of the second-order factor of the quadratic effect, which was substantive factors, were fixed to zero, and all the factor loadings between the firstand the second-order factors were fixed to one. In contrast, all MEMs-CR converged and yielded reasonable estimates. Similarly, there were differences in computation time among the different models, with MEMs-CR being the most efficient. Computation time was very fast with the panel data, probably related to the lower number of participants, variables, and complexity of the discrete-time metric. With the cohort-sequential data, however, the differences were much larger when the SEMs were fitted using age as the underlying metric. This is likely due to the larger number of participants, variables, and complexity of the continuous time metric in the NLSY79 data set. In this data set, computation time was significantly larger for SEMs, whose estimation time was around 10 min for CUFFS and around 2 h for the FOCUS when convergence problems were solved, compared to MEMs-CR, whose estimation time was around 5 min and did not require to solve convergence problems. Along these lines, McNeish and Matta (2020) reported that, relative to MEMs, data sparsity could significantly affect SEMs when the overlap between the measurement occasions is small. The results of the present study could be reflecting these differences.

Theoretical and Methodological Considerations
Our findings using data from the panel study, the Motivation in High School Project (Ferrer & McArdle, 2003), showed a small linear effect across measurement occasions. This represented an increase in the overall construct measured by the four variables: perceived competence, perceived appearance, global self-worth, and physical self-worth. This result can be inferred from the fixed effects of MEMs-CR as well as the means of the second-order factors of CUFFS and FOCUS models. In addition, individuals showed substantial variability in their initial status, and this was reported by all three models. Furthermore, the MEMs-CR model showed that there was not much variability across the variables in their initial status or their slopes. This information was unique to the MEM-CR model.
An important difference among the models was regarding the linear effect of time. This effect was statistically significant for the CUFFS and FOCUS models, but not for the MEMs-CR model. There are two different but complementary possible explanations for this result. First, our additional analysis (Supplementary Materials) showed that a latent basis was more appropriate to describe these data because the measurement occasion effect was not linear. This is a limitation of MEMs-CR compared to SEMs. But the estimates of these models were similar when a latent basis was used in the MEMs-CR. Second, it is also possible that the MEMs-CR was underpowered due to the inclusion of random effects for variables, which in this case, could be unnecessary. It is well-known that under-parameterizing or over-parameterizing the random structure of MEMs-CR can lead to a loss of power (Hoffman, 2015;Hox et al., 2018;Meyers & Beretvas, 2006). Similarly, we also found that deleting the random slopes of variables increased the statistical significance of the fixed effect of the MEM-CR (Supplementary Materials). However, while the variability between the variables was scarce, the MEMs-CR yielded interesting differences across them: perceived competence showed larger growth than the other variables, while global self-worth presented a decline not different from zero throughout the study. Thus, although a general growth can explain part of the longitudinal trajectory of self-perceptions in this study (see Harter, 1985), the different variables also presented idiosyncratic patterns of change.
Results using data from the cohort-sequential design, the National Longitudinal Survey of Youth 1979 (NLSY79; Center for Human Resource Research, 2009;Chase-Lansdale et al., 1991), indicated a large linear effect, together with a smaller quadratic effect of age, along the study common to all the variables. As was the case with the panel study, these results can be inferred from the fixed effects of MEMs-CR and the means of the second-order factors of the CUFFS and FOCUS models. These models also showed variability across individuals in the key parameters. In addition, the MEMs-CR model was also able to detect important variation across the variables: PIAT mathematics presented the largest longitudinal linear and quadratic growth along the study, while both backward and forward memory for digit span showed mainly a linear effect (this latter result could be related to the age where the variables were sampled). Thus, an important finding from the MEMs-CR model was revealing a general pattern of growth common to these cognitive abilities together with idiosyncratic differences among them. While the CUFFS estimated a general trajectory common to all the variables, the FOCUS model was also capable of estimating such general and specific patterns of growth. This, however, required to estimate the model with two different parametrizations. We observed similarities between the random effects of MEMs-CR and the residual variances of the first-order factors of the FOCUS model when the parameters of the second-order factors were estimated to obtain a general longitudinal trajectory. When the objective was to estimate the specific trajectories of variables using the FOCUS model, it was not possible to obtain a general trajectory common to all variables. In this line, there was a large similarity between the estimations of MEMs-CR and both versions of the FOCUS model, with the advantage of the MEM-CR of estimating both types of trajectories simultaneously.
The three models in our analyses conceptualize longitudinal change as a hierarchical process where there is a general growth common to all the variables together with some differences across them. In panel studies characterized by balanced data, an SEM approach could be more informative than MEMs-CR and not as computationally demanding. On the other hand, in cohort sequential designs, which involve unbalanced data, a MEMs-CR approach could be more informative and efficient, as it is a less complex model. As described previously, these differences can be directly related to the sparsity of the data. MEMs tend to perform appropriately with data from panel studies using discrete time metrics (e.g., Fine et al., 2019), as well as data from more demanding longitudinal designs using continuous time metrics (e.g., McNeish & Matta, 2020), at least in univariate models. Considering the findings of the present study, we can extend those conclusions to MEMs-CR, with the main difference in favor of MEMs-CR, relative to the CUFFS and FOCUS models, in that it can jointly study the variability of individuals and variables at the same time in the same model.

Limitations and Future Directions
To illustrate the use of MEMs-CR for analyzing multivariate longitudinal data, we assumed that both data sets included random effects in the intercepts and slopes for both individuals and variables. In our additional analyses, we saw that reducing the complexity of the random structure could alleviate the lack of statistical significance of some fixed effects. To evaluate the presence of those effects in empirical analyses, different strategies have been developed to select the optimum random structure for MEMs-CR for individuals and items (e.g., Barr et al., 2013;Mart ınez-Huertas et al., 2022;Matuschek et al., 2017) or to compute average estimations for parameters of interest (e.g., Mart ınez-Huertas et al., 2022). All these strategies should be tested in multivariate longitudinal data, especially for longitudinal designs with a continuous time metric. Moreover, the robustness of MEMs-CR has been tested in experimental research, including psycholinguistics, where these models are usually estimated with hundreds of individuals and items. Thus, it is important to further examine if such robustness can be extended to longitudinal studies in which the number of repeated measures varies across individuals and variables, and the number of available variables is smaller. Similarly, we presented the 95%CI of the estimated random effects for variables to illustrate how to make inferences about the longitudinal trajectories of specific variables. However, we are aware that it is not a very common practice in MEMs-CR, and we encourage further research to validate the standard error estimations of the levels of the target random effects.
There could be relevant differences between experimental research and correlational studies regarding the exchangeability principle in MEMs and its assumptions (Lindley & Smith, 1972;Raudenbush, 1993). This is because items and variables in experimental research are designed to control for extraneous variables, which are expected to produce not-substantive variability among the fixed effects. In correlational designs, such as most longitudinal studies, this control is not always possible because most psychological variables present substantive variability that is precisely the target of the models themselves. A similar point could be made regarding neighborhoods and schools in crossed random effects, where the level of such clusters is large. The number of variables that is realistic in longitudinal studies naturally implies a reduced number of levels in such random effects. Moreover, it is worth to mention that some SEMs, like the CUFFS model, requires longitudinal invariance to assume that the underlying latent variables are the same across measurement occasions. While probably invariance cannot be reasonably expected in many longitudinal studies (e.g., McArdle et al., 2002), it is a requirement when fitting some SEMs.
Our comparison involving three models for multivariate data is small and, thus, we endorse future research including other multivariate models. Similarly, it would also be helpful to extend our analyses to different conditions of missing data, including planned missing data (e.g., Rhemtulla & Hancock, 2016;Rhemtulla et al., 2014). This is because the presence of different patterns of missing data could influence the estimations of SEM and the MEMs-CR models, and thus could lead to relevant differences between them, especially when the underlying time metric is continuous, such as age. These concerns are directly related to samplingtime variations when discrete time points are used to summarize an underlying continuous-time metric (Miller & Ferrer, 2017). In our analyses, we only evaluated linear and quadratic longitudinal trajectories in MEMs-CR, as they were reasonable trajectories for the data. Future research should consider other more complex non-linear functions.
Similarly, Bayesian hierarchical models are a promising alternative to study the trajectories of individuals and/or variables. For example, Bayesian approaches have been recommended within multilevel SEMs because they overcome convergence problems and improve parameter estimations (e.g., Depaoli & Clifton, 2015;Muth en & Asparouhov, 2012). Similarly, when dealing with small samples, Bayesian estimates are becoming very popular and could be a useful way of analyzing longitudinal data, but there are some handicaps related to the specification of the prior distribution which deserves further research (e.g., McNeish, 2016). In the context of multivariate longitudinal data analysis, we think that Bayesian hierarchical models could alleviate some problems related to convergence given that modeling crossed random effects can be computationally demanding. But future research should compare different prior distributions and analyze their performance in crossed random effects for both discrete and continuous time metrics.

Conclusions
Our analyses comparing longitudinal multivariate models using data from a panel study and a cohort-sequential design was a first attempt to examine the usefulness of MEMs-CR to model longitudinal multivariate data. Based on our findings, we endorse the use of MEMs-CR models with random effects in both individuals and variables to study a general trajectory as well as unique characteristics of individuals and variables. This endorsement is, in part, due to the flexibility of this modeling approach for coding time as discrete or continuous. Our analyses indicate that, in situations with unbalanced data, with large sparsity, such as in a cohort-sequential design, using MEMs-CR is recommended. However, more research is needed to determine the specific contexts in which MEMs-CR can be considered a first choice to analyze multivariate longitudinal data as well as the robustness of its estimation, relative to other models.