Dataset for: A Bayesian Confirmatory Factor Model for Multivariate Observations in the Form of Two-Way Tables of Data

Researchers collected multiple measurements on schizophrenia (SZ) patients and their relatives, as well as control subjects and their relatives, to study vulnerability factors for schizophrenics and their near relatives. Observations on individuals from the same family are correlated, as are the multiple outcome measures on the same individual. Traditional data analyses model outcomes separately and thus do not provide information about the interrelationships among outcomes. We propose a novel Bayesian Family Factor Model (BFFM), which extends the classical confirmatory factor analysis (CFA) model to explain the correlations among observed variables using a combination of family-member factors and outcome factors. Traditional methods for fitting CFA models, such as full information maximum likelihood (FIML) estimation using quasi-Newton optimization (QNO), can have convergence problems and Heywood cases (lack of convergence) caused by empirical under-identification. In contrast, modern Bayesian Markov chain Monte Carlo (MCMC) handles these inference problems easily. Simulations compare the BFFM to FIML-QNO in settings where the true covariance matrix is identified, close to not identified, and not identified. For these settings, FIML-QNO fails to fit the data in $13\%$, $57\%$ and $85\%$ of the cases, respectively, while MCMC provides stable estimates. When both methods successfully fit the data, estimates from the BFFM have smaller variances and comparable mean squared errors. We illustrate the BFFM by analyzing data from schizophrenics and their family members.
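
To make the idea of combining family-member factors and outcome factors concrete, the following is a minimal sketch of the covariance matrix that a generic CFA of this shape would imply for one family's two-way table (members by outcomes). It is not the BFFM as specified by the authors: the function name `implied_covariance`, the equal loadings, the uncorrelated unit-variance factors, and all numeric values are illustrative assumptions.

```python
import numpy as np


def implied_covariance(n_members, n_outcomes,
                       member_loading=0.6,   # assumed value, illustration only
                       outcome_loading=0.5,  # assumed value, illustration only
                       unique_var=0.4):      # assumed value, illustration only
    """Covariance implied by a generic CFA, Sigma = L Phi L^T + Psi.

    Each observed variable (member m, outcome k) loads on exactly two
    factors: the factor for member m and the factor for outcome k.
    """
    n_obs = n_members * n_outcomes
    n_fac = n_members + n_outcomes

    loadings = np.zeros((n_obs, n_fac))
    for m in range(n_members):
        for k in range(n_outcomes):
            row = m * n_outcomes + k                     # row-major position in the table
            loadings[row, m] = member_loading            # family-member factor
            loadings[row, n_members + k] = outcome_loading  # outcome factor

    phi = np.eye(n_fac)              # assumption: uncorrelated, unit-variance factors
    psi = unique_var * np.eye(n_obs)  # assumption: equal unique variances
    return loadings @ phi @ loadings.T + psi


# Hypothetical family of 3 members measured on 4 outcomes.
sigma = implied_covariance(n_members=3, n_outcomes=4)
```

Under these assumptions, two outcomes measured on the same member covary only through that member's factor, the same outcome measured on two members covaries only through the outcome factor, and entries that share neither a member nor an outcome are uncorrelated.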