figshare
Files (4):
JCGS-21-409-Suppl.pdf (544.22 kB)
JCGS_simu_clusterMLD.R (25.69 kB)
README.docx (13.27 kB)
README.pdf (83.9 kB)

clusterMLD: An Efficient Hierarchical Clustering Method for Multivariate Longitudinal Data

dataset
posted on 2022-11-22, 22:00 authored by Junyi Zhou, Ying Zhang, Wanzhu Tu

Longitudinal data clustering is challenging because the grouping has to account for the similarity of individual trajectories in the presence of sparse and irregular times of observation. This paper puts forward a hierarchical agglomerative clustering method based on a dissimilarity metric that quantifies the cost of merging two distinct groups of curves, which are represented by B-splines fitted to the repeatedly measured data. Extensive simulations show that the proposed method has superior performance in determining the number of clusters, in classifying individuals into the correct clusters, and in computational efficiency. Importantly, the method is suitable not only for clustering multivariate longitudinal data with sparse and irregular measurements but also for intensely measured functional data. To this end, we provide an R package for the implementation of such analyses. To illustrate the use of the proposed clustering method, two large data sets from real-world clinical studies are analyzed.
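The merge-cost idea in the abstract can be illustrated with a small Python sketch. This is not the authors' R package or the deposited simulation script, and it is not their dissimilarity metric. It assumes, purely for illustration, that the cost of merging two groups is the increase in residual sum of squares when one least-squares B-spline is fitted to the pooled observations instead of one spline per group. The knot vector, the spline degree, the small ridge term, the simulated data, and every function name (`bspline_basis`, `rss`, `cluster`, `simulate`) are hypothetical choices, and the example handles a single outcome on a time scale of [0, 1].

```python
import random

DEGREE = 3
KNOTS = [0.0] * 4 + [0.5] + [1.0] * 4  # assumed: clamped cubic knots on [0, 1]

def bspline_basis(t, knots=KNOTS, degree=DEGREE):
    """Values of all B-spline basis functions at t (Cox-de Boor recursion)."""
    spans = len(knots) - 1
    b = [1.0 if knots[i] <= t < knots[i + 1] else 0.0 for i in range(spans)]
    if t == knots[-1]:  # close the last non-empty span on the right
        last = max(i for i in range(spans) if knots[i] < knots[i + 1])
        b[last] = 1.0
    for d in range(1, degree + 1):
        nxt = []
        for i in range(spans - d):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (t - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - t) / right_den * b[i + 1] if right_den > 0 else 0.0
            nxt.append(left + right)
        b = nxt
    return b

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def rss(points, ridge=1e-6):
    """Residual sum of squares of one spline fitted to pooled (t, y) points.
    The tiny ridge term (an assumption) keeps the normal equations solvable."""
    rows = [bspline_basis(t) for t, _ in points]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, (_, y) in zip(rows, points)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((y - sum(bi * ci for bi, ci in zip(r, beta))) ** 2
               for r, (_, y) in zip(rows, points))

def cluster(subjects, k):
    """Greedy agglomeration: repeatedly merge the pair with the smallest merge cost."""
    groups = [([i], list(pts)) for i, pts in enumerate(subjects)]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                # assumed cost: pooled RSS minus the two separate RSS values
                cost = rss(groups[i][1] + groups[j][1]) - rss(groups[i][1]) - rss(groups[j][1])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        merged = (groups[i][0] + groups[j][0], groups[i][1] + groups[j][1])
        groups = [g for n, g in enumerate(groups) if n not in (i, j)] + [merged]
    return sorted(sorted(ids) for ids, _ in groups)

def simulate(seed=0):
    """Hypothetical data: 8 subjects, irregular visit times, two opposite trends."""
    rng = random.Random(seed)
    subjects = []
    for s in range(8):
        sign = 1.0 if s < 4 else -1.0
        times = sorted(rng.random() for _ in range(rng.randint(6, 10)))
        subjects.append([(t, sign * (1.0 + t) + rng.gauss(0.0, 0.1)) for t in times])
    return subjects

clusters = cluster(simulate(), k=2)
print(clusters)
```

The sketch refits every candidate pair at each step, which is acceptable for a handful of subjects but scales poorly, and it takes the number of clusters `k` as given rather than estimating it as the paper's method does.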

Funding

The authors thank the Associate Editor and two anonymous referees for their many insightful comments that have greatly improved the quality of this work. This manuscript was prepared using the SPRINT Research Materials obtained from the NHLBI Biologic Specimen and Data Repository Information Coordinating Center, and the PREDICT-HD data obtained from the PREDICT-HD investigators and coordinators of the Huntington’s Disease Study Group. The work does not necessarily reflect the opinions or views of the SPRINT research team, the NHLBI, or the Huntington’s Disease Study Group. This research is partly supported by National Institutes of Health grants R01HL095086, U24AA026969, U54GM115458, and R01NS103475.
