uasa_a_1292915_sm1721.pdf (415.98 kB)

Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach

journal contribution
posted on 2018-06-06, 22:30 authored by Dungang Liu, Heping Zhang

Ordinal outcomes are common in scientific research and everyday practice, and we often rely on regression models to make inferences. A long-standing problem with such regression analyses is the lack of effective diagnostic tools for validating model assumptions. The difficulty arises from the fact that an ordinal variable takes discrete values that are labeled with numbers but are not themselves numerical values; the labels merely represent ordered categories. In this article, we propose a surrogate approach to defining residuals for an ordinal outcome Y. The idea is to define a continuous variable S as a “surrogate” of Y and then obtain residuals based on S. For the general class of cumulative link regression models, we study the residual’s theoretical and graphical properties. We show that the residual has null properties similar to those of the common residuals for continuous outcomes. Our numerical studies demonstrate that the residual has power to detect misspecification with respect to (1) mean structures; (2) link functions; (3) heteroscedasticity; (4) proportionality; and (5) mixed populations. The proposed residual also enables us to develop numeric measures for goodness of fit using classical distance notions. Our results suggest that, compared to a previously defined residual, our residual can reveal deeper insights into model diagnostics. We stress that this work focuses on residual analysis rather than hypothesis testing. The latter has limited utility because it provides only a single p-value, whereas our residual can reveal which components of the model are misspecified and suggest how to improve them. Supplementary materials for this article are available online.
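
The surrogate idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a probit cumulative link model written in latent-variable form, Z = beta * x + eps with eps standard normal and Y = j whenever Z falls between consecutive cutpoints; the sign convention, the function names, the parameter values, and the use of the true parameters in place of fitted ones are all assumptions made for illustration. The surrogate S is drawn from the distribution of Z truncated to the interval implied by the observed Y, and the residual is S minus its conditional mean given x.

```python
import random
from statistics import NormalDist, fmean, pvariance

STD_NORMAL = NormalDist()  # standard normal, i.e. a probit link (assumed here)


def simulate_ordinal(x, beta, cutpoints, rng):
    """Draw Y in 1..J from the latent model: Y = j iff a_{j-1} < Z <= a_j."""
    z = beta * x + rng.gauss(0.0, 1.0)
    return sum(z > a for a in cutpoints) + 1


def surrogate_residual(y, x, beta, cutpoints, rng):
    """Sample S from Z | Y = y, X = x (a truncated normal); return S - E(S | X)."""
    mean = beta * x
    # Interval of the latent variable that corresponds to category y.
    p_lo = 0.0 if y == 1 else STD_NORMAL.cdf(cutpoints[y - 2] - mean)
    p_hi = 1.0 if y > len(cutpoints) else STD_NORMAL.cdf(cutpoints[y - 1] - mean)
    # Inverse-CDF sampling restricted to that interval.
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    s = mean + STD_NORMAL.inv_cdf(u)
    return s - mean


def demo_residuals(n=5000, seed=0, beta=1.0, cutpoints=(-1.0, 0.0, 1.0)):
    """Simulate data from a correctly specified model and return its residuals."""
    rng = random.Random(seed)
    residuals = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        y = simulate_ordinal(x, beta, cutpoints, rng)
        residuals.append(surrogate_residual(y, x, beta, cutpoints, rng))
    return residuals


residuals = demo_residuals()
print("mean:", fmean(residuals), "variance:", pvariance(residuals))
```

When the working model matches the data-generating model, as in this sketch, the residuals should behave like draws from a standard normal regardless of x, which is what makes plots of the residual against covariates interpretable in the same way as for continuous outcomes. Because S is sampled, the residuals change from run to run unless the seed is fixed.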

Funding

This work is partially supported by the grant R01 DA016750 from the NIH. Liu’s research is also partially supported by a junior faculty fund from Lindner College of Business. The real data used in this article were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000092.v1.p1 (accession number phs000092.v1.p1). The data collection was funded by NIH grants U01 HG004422, U01 HG004446, U10 AA008401, P01 CA089392, R01 DA013423, U01 HG004438, and HHSN268200782096C.

History