Many leveled ordinal models for frequency of alcohol and drug use

ABSTRACT Background: The numbers of days people consume alcohol and other drugs over a fixed interval, such as 28 days, are often collected in surveys of substance use. The presence of an upper bound on these variables can result in response distributions with “ceiling effects.” Also, if some peoples’ substance use behaviors are characterized by weekly patterns of use, summaries of substance days-of-use over longer periods can exhibit multiple modes. Objective: To highlight advantages of ordinal models with a separate level for each distinct survey response, for bounded, and potentially multimodal, count data. Methods: We fitted a Bayesian proportional odds ordinal model to longitudinal cannabis days-of-use reported by 443 individuals who used illicit drugs (206 female, 214 male, 23 non-binary). We specified an ordinal level for each unique response to allow the exact numeric distribution implied by the predicted ordinal response to be inferred. We then compared the fit of the proportional odds model with binomial, negative binomial, hurdle negative binomial and beta-binomial models. Results: Posterior predictive checks and the leave one out information criterion both suggested that the proportional odds model gave a better fit to the cannabis days-of-use data than the other models. Cannabis use among the target population declined during the COVID-19 pandemic in Australia, with the odds of a member of the population exceeding any specified frequency of cannabis use in Wave 4 estimated to be 73% lower than in Wave 1 (median odds ratio 0.27, 90% credible interval 0.19, 0.38). Conclusion: Ordinal models can be suitable for complex count data.


Introduction
Survey questionnaires such as the Australian Treatment Outcomes Profile (1) collect information on frequency of alcohol and drug use with questions such as "How many days of the last 28 did you use cannabis?." Valid answers to such questions are limited to the integers from 0 to 28.Count model families such as Poisson or, more commonly, negative binomial models (2,3) or their zero-inflated or hurdle variants (4,5), often fitted to these data, do not accommodate the "ceiling effects" (see (6)) imposed by the upper bound to valid responses.While the set of possible outcomes is the same as would be modeled by a binomial model, where each observation represents the sum of 28 Bernoulli trials, the binomial assumption of independence between days in each 28 day interval is unlikely to be reasonable for frequency of substance use.Beta-binomial distributed responses are a more plausible assumption a priori (7).Another feature sometimes observed in days-of-use data over a 28 day period is a preponderance of observations consistent with repeated weekly patterns of use.For example, if some people regularly consumed alcohol on weekends, but rarely on other days, modes in daysof-use at four and 8 days out of 28 might be observed.Although the binomial and beta binomial models have the same support as these data, they cannot capture this multi-modality.
An alternative to modeling substance days-of-use as a numeric variable is to transform the numeric responses to a set of ordered categories (i.e.collapsed counts) and fit an ordinal model (8)(9)(10).The use of an ordinal model allows the distribution of the response variable to be constrained to plausible values, but also allows the baseline odds of progressing beyond each threshold of use to differ freely.Typically, ordinal response variables are defined with relatively few categories (e.g. less than 10).The simple strategy of using an ordinal category for each reported value of a bounded count variable appears to have been largely overlooked.Specifying a separate level for each observed value is less arbitrary than allocating the 29 possible response values into some smaller number of ordinal categories.More importantly, allocating a unique ordinal level for each possible response value facilitates the calculation of numeric summaries such as posterior means, quantiles and variances without needing to resort to crude approximations (see (11)).This strategy also allows more comprehensive comparison of model fit between ordinal and count data models.Modern Bayesian methods are an attractive option for fitting complex, longitudinal data (12).Bürkner and Vuorre (13) provide an accessible introduction to fitting ordinal Bayesian models.Models fitted to longitudinal ordinal data using frequentist approaches have also been described, including with application to frequency of substance use (9).
We apply the proposed many leveled ordinal modeling approach to reported cannabis days-of-use by individuals over four 28 day intervals during the COVID-19 pandemic in Australia using a Bayesian approach.We compare results obtained using the ordinal model with equivalent models assuming negative binomial, hurdlenegative binomial, binomial and beta-binomial response distributions.

Data considered
Online surveys were used to collect data on substance use of Australians during the COVID-19 pandemic for the Australians' Drug use: Adapting to Pandemic Threats (ADAPT) study.People were eligible to be recruited to the study if they were at least 18 years of age and had used illicit drugs regularly (i.e., at least once a month) in 2019 (i.e., "pre-COVID").Participants had the option to complete a one-off baseline survey or to consent to subsequent follow-up surveys (2 months, 6 months and 12 months post baseline).This paper uses data from the latter group (n = 452).Participants were reimbursed $15 AUD (GiftPay voucher sent via e-mail) for each survey that they completed, and also went in the draw to win one of three $100 AUD GiftPay vouchers.Data were collected on use of a range of substances, price and ease of obtaining illicit substances, as well as how respondents had been impacted by the pandemic at each wave of the survey.Data used in this study were provided by respondents that gave consent for their information to be used for research and publication using a webpage checkbox.Ethical approval was granted by the UNSW Sydney Human Research Ethics Committee (HC200264).

Outcome variable
We modeled self-reported cannabis days-of-use over the 28 days prior to each survey date as the outcome variable.Respondents' cannabis use days were deduced from responses to two questions.Abstinence from cannabis use over the 28 days was deduced from the question "When did you last use cannabis?." Respondents that indicated they had never used cannabis, or they last used more than 4 weeks before the survey date was assigned a value of zero cannabis use days.Non-zero values were assigned, based on responses to the question "In the past four weeks (28 days) how many days have you used cannabis?." Responses were selected from a drop-down menu with options including "1 (once)", "2 (once a fortnight)", "3", "4 (once a week)", "5" and so on.That is, some options were described as the result of a specific pattern of use, while others were not.For the proportional odds model, the response variable was defined as an ordinal type with a separate category for each of the 29 possible values from 0 to 28.An integer-valued version of the same variable was used when other model families were fitted.The data are longitudinal with surveys from 452 people initially considered in Wave 1. From these original respondents, 443, 305, 265 and 251 sufficiently complete questionnaires were received during Waves 1 to 4, respectively, so that a total of 1264 observations were used to fit the model.

Explanatory variables
The cannabis days-of-use outcome variable was regressed on five explanatory variables.These were survey wave, isolation, gender, rurality and state.Survey wave was treated as a categorical variable with four levels, Wave 1 was May and June 2020, Wave 2 was July to September 2020, Wave 3 was predominantly November and December 2020 and Wave 4 was between May and July 2021.Isolation was defined as a dichotomous variable and coded as "yes" if the participant reported homeisolation or 14 day quarantine during the period of interest, otherwise the variable was coded as "no."Gender was coded as a factor with three levels, male, female and nonbinary.Rurality was a dichotomous variable, coded as either city or rural/regional.There were few respondents from the state of Tasmania, nor from the Northern Territory or the Australian Capital Territory.Therefore, we combined respondents from Tasmania with Victoria, those from the Northern Territory with South Australia and those from the Australian Capital Territory with New South Wales.Person-level random intercepts were specified to allow for variability in peoples' cannabis use frequency not explained by the covariates.

Statistical analysis
We fitted Bayesian regression models to the data described above.We assumed the probability that person i uses cannabis on D ij ¼ 0; . . .; 28 days during wave j is described by a proportional odds ordinal model with 29 levels.The brms package (14) which invokes Stan (15) was used to fit all Bayesian models.The adequacy of the fit of the proportional odds model was compared with models assuming alternative response distributions regressed on the same explanatory variables using posterior predictive checks ( 16) and the Pareto-smoothed importance sampling leave-one-out information criterion (PSIS-LOO-IC) (17).The assumption of proportional odds was tested by fitting 28 separate Bernoulli models for the odds of exceeding 0; . . .; 27 days of cannabis use and comparing model coefficients with the proportional odds ordinal model.

Dependence of linear predictors on explanatory variables
As with generalized linear mixed models (GLMMs), distributional parameters were related to a linear combination of the explanatory variables, the linear predictor, via a monotonic link function.Let x ij be a row vector of explanatory variables for COVID-19 isolation status, wave, state, rurality, person i on wave j and β be a column vector of coefficients.We specified the linear predictor, η ij , as where b i is a person-level random intercept.Note that, since participant i reports a single set of covariates at wave j, vector x ij specifies the values of all covariates for that case (see e.g (18), Chapter 8).Also, we emphasize, that while x ij is the same in all models, the parameters β and b i differ between models and distributional parameters.
In GLMMs, the dependence of the outcome variable on predictors is usually modeled explicitly through a single location distributional parameter such as the mean of the assumed response distribution.The values of other distributional parameters are assumed to be the same for all responses.Models for "location, scale and shape" (19) are a generalization of this approach.The brms package allows all distributional parameters of the specified response distribution to be regressed on explanatory variables.For the ordinal model, we chose to regress only the location parameter on the explanatory variables.However, for some of the other models, described below, we regressed multiple distributional parameters on the explanatory variables.For models with explanatory variable dependence on more than one distributional parameter, correlation between the random effects, b i , for each person, i, in the regression models for the distributional parameters was modeled explicitly (see (20)).

Proportional odds model
The proportional odds ordinal model ( 21) is a cumulative logit link model in that the linear predictor, η ij , is related to the cumulative distribution function of the response variable rather than the probability mass function.Let r be any of the integer number of cannabis days-of-use that might be exceeded (i.e.r ¼ 0; 1; . . .; 27).Then, the probability that person i uses cannabis on no more than r days during wave j can be expressed as where η ij are as defined in Equation ( 1).The threshold parameters, θ r with θ 0 � θ 1 � θ 2 � . . .� θ 27 , control the odds that people exceed (or do not exceed) each successive ordinal category of cannabis days-of-use.

Other models
Corresponding negative binomial, hurdle-negative binomial, binomial and beta-binomial models were also fitted to the cannabis-use-days data for comparison with the proportional odds model.Probability mass functions, expressed in terms of the distributional parameters described below, are provided in supplemental materials, Table S1.
Negative binomial.The negative binomial is a discrete distribution defined on the non-negative integers.With mean, μ, and shape parameter, α, the negative binomial has variance μ þ αμ 2 (22).We obtained a better fit when regressing both μ and α on the explanatory variables.
Hurdle-negative binomial.According to the hurdlenegative binomial model, the probability that person i abstains completely from using cannabis on wave j, ψ ij , is modeled using a binomial GLMM with logit link function.Then, conditional on non-zero cannabis use, the distribution of positive cannabis days-of-use is assumed to be described by a truncated negative binomial with location parameter, μ ij , the mean of the (untruncated) negative binomial and shape parameter α ij (23).According to the hurdle-negative binomial model, the number of cannabis use days, D ij are realiza- Binomial.According to the fitted binomial model, the expected probability, π ij , that person i uses cannabis on a given day in wave j was assumed to be defined by Equation (1).Then, the number of days person i consumed cannabis over 28 days during wave j, D ij , is assumed to be the realization of a Binomial 28; π ij À � distribution, equal to the sum of 28 independent Bernoulli π ij À � trials.

Beta-binomial.
The beta-binomial distribution can be parameterized in terms of an underlying (binomial) probability of using cannabis on any given day, π, and an overdispersion parameter, ϕ.Then, the beta-binomial probability of using on any day in a particular 28 day interval, π � , is a beta random variable (see Table S1).We regressed both π and ϕ on the explanatory variables with logit and log links, respectively.We denote this model, Beta-Bin 28;

Prior distributions and Monte Carlo details
We used horseshoe priors ( 24) for population-level (i.e.fixed effects) coefficient parameters to provide a level of protection against overfitting.For other parameters, we used brms default priors.These are t distributions with 3 degrees of freedom and a scale parameter of 2.5 for both the ordinal level thresholds, θ r , and the standard deviations of the random effects distributions, σ b .For models with multiple distributional parameters, the Lewandowski-Kurowicka-Joe (LKJ) prior distribution (25) was assumed for the random effect correlation matrices.The LKJ prior is the brms default in this case.
Posterior distributions of model parameters were approximated with four chains of 5000 iterations each generated by Stan's No U-turn Sampler.The first 500 iterations from each chain were discarded as burn-in or warm-up after which every fifth iteration was kept, resulting in a total of 3,600 samples from each conditioned model on which to base posterior inference.Convergence of the no U-turn sampler was assessed by visual inspection of mixing of the four chains in trace plots and the rankbased R statistic convergence diagnostic (26).Four thinned chains, each with 25 000 iterations, were required for convergence of the binomial model.Instructions and code for fitting all models are provided in an online repository (27).

Distribution of cannabis days-of-use
Abstinence from cannabis use was reported on approximately 28% of 28 day intervals across the four waves combined, while daily cannabis use was reported on almost 25% of intervals.More notably, the distribution of cannabis days-of-use was distinctly multimodal, with modes at multiples of 4 days (Figure 1).

Posterior predictive checks
The observed empirical cumulative distribution function (ECDF) for cannabis days-of-use across the four waves is depicted by the black and yellow double line in each panel of Figure 2. The multiple modes, evident in Figure 1, manifest as large steps in the observed ECDF at multiples of 4 days, relative to smaller steps in the ECDF observed at other values (Figure 2).The thin blue lines in Figure 2  Comparing the observed and posterior predictive ECDFs for the negative binomial and hurdle negative binomial models (Figure 2) reveals that both of these models consistently predict cannabis use days exceeding 28 for around 10% of 28 day intervals on each Markov chain Monte Carlo iteration.The binomial model predicts both fewer responses near zero and fewer near 28 than were reported.This suggests that the reported cannabis days-of-use are overdispersed compared with what would be expected if responses conformed to a binomial distribution.The posterior predictive distribution of the beta-binomial model reproduced the  coarse-scale distribution in reported cannabis days-ofuse.However, only the proportional odds model is able to approximate the heterogeneity in step size characteristic of the cannabis days-of-use ECDF.The difference in the posterior predictive distributions of the betabinomial and ordinal models is more clearly shown in rootograms (Supplemental materials, Figure S1).

Days of cannabis use Cumulative distribution function
It would be expected that an ordinal model with 28 thresholds and person-level random effects would reproduce the cannabis use data pooled across all survey waves quite well (Figure 2).A more rigorous test of this model is to check whether the posterior distributions of the thresholds for cannabis use days are reasonable for different values of the explanatory variables.Separately observed ECDFs for the four survey waves (yellow and black double lines in Figure 3) reveal complete abstinence from cannabis over the 28 day intervals increased from 21% in Wave 1 to 37% in Wave 4. Conversely, 28% of Wave 1 respondents reported daily cannabis use compared with 21% of respondents in Wave 4. The posterior predicted distribution of the ECDF for each wave is reasonably consistent with the observed data.
The corresponding plot for the beta-binomial model is shown in Supplemental materials (Figure S2).
Defining a separate ordinal category for each response value enables the calculation of posterior predictive distributions of numeric summaries for the proportional odds model, such as means and standard deviations.The posterior predictive standard deviations by survey wave of the beta-binomial and proportional odds models are more consistent with observed data than the others (Supplemental Materials, Figure S3).
The proportional odds model relies on the assumption that a single set of parameter estimates for the covariate effects is valid for the log odds of not exceeding each threshold, θ r .We test this assumption by comparing the parameter estimates from the proportional odds model with 28 separate Bernoulli models fitted for exceeding each threshold (see (29)).Qualitatively, the proportional odds assumption seems quite reasonable in this case (see Supplemental Materials, Figure S4).The proportional odds model can be a reasonable choice even when there is mild evidence of departure from the proportional odds assumption (18,30).

PSIS-LOO-IC
Although the same set of explanatory variables was used for each model, the five models do not have the same number of parameters.For example, the proportional odds model includes the 28 threshold parameters that are not included in the other models.On the other hand, the negative binomial models and the beta-binomial model each have multiple sub-models with full sets of parameters, including person-level random effects, estimated for each (Table 1).Since the proportional odds model includes a separate ordinal category for each numeric response, we can compare its goodness-of-fit with the other four models using information criteria.According to PSIS-LOO-IC (Table 1), the many leveled proportional odds model fits cannabis days-of-use clearly better than the other models, accounting for differences in the number of parameters between models.PSIS-LOO-IC allows an estimate of a model's effective number of parameters to be calculated, including the contribution of the random effects.This will not be the same as the nominal number of parameters, since the parameters are not estimated independently.As noted by Vehtari et al. (17), these estimates of the effective number of parameters, denoted P-LOO in Table 1, can be used as a measure of model complexity but should not be over-interpreted.
Fitting all models using default Gaussian priors instead of horseshoe priors for the coefficients gives model diagnostics and information criteria that lead to the same conclusion as described here.

Inference
As the best model, we base posterior inference on the proportional odds model.For the ADAPT study, primary interest lay in frequency of use of cannabis and other drugs at the four survey waves as well as the effect of home isolation and quarantine.The conditioned model suggests the odds of exceeding cannabis use thresholds declined progressively during the portion of the COVID-19 pandemic covered by the four survey waves.For example, among members of the population studied, the odds of exceeding a given threshold of cannabis days use during Wave 4 is estimated to have been around 73% lower than during Wave 1 (posterior median odds ratio 0.27, 90% CI 0.19, 0.38).There is little evidence of a change in frequency of cannabis use in response to isolation or quarantine after accounting for survey wave (Figure 4

Discussion
The computation performance of today's personal computers allows ordinal models with many levels to be fitted.We have demonstrated that the fits of ordinal models can be compared with models for count data if a separate level is specified for each possible integer value.For the cannabis days-of-use data considered, PSIS-LOO-IC and posterior predictive checks both suggest that the specified proportional odds model fits better than other models commonly used for substance days-of-use data with the same explanatory variables.Shortcomings of negative binomial, hurdle-negative binomial and binomial models, in particular, were evident in the investigation described here.
A key strength of ordinal models is clinically convenient interpretation of effect sizes (31,32).For the proportional odds model, effect sizes can be expressed as odds ratios as shown in Figure 4 that inform relative odds of exceeding all specified thresholds.The effects of the explanatory variables on the binomial model can also be conveniently summarized as odds ratios (with slightly different interpretation).However, effects of the same variables on the negative binomial, hurdle-negative binomial and betabinomial, as specified in this investigation, cannot be easily summarized because they have multiple dependent distributional parameters.Regressing multiple distributional parameters against explanatory variables will often fit data better than conventional regression models, but makes the effects of explanatory variables on the response much harder to interpret (18).
The assumption of proportional odds will not be reasonable in all applications.While the proportional odds model is the most commonly applied ordinal model, a range of others have been described and are sometimes used.Substance days-of-use data, for example, can be modeled using a continuation ratio ordinal model, where the attainment of a final category requires the subject first passes through each lower category.A person that uses cannabis on 2 days in a 28 day interval necessarily progresses from initially no use to a first day of cannabis use before deciding to use it on another day and then stops.Models with category-specific effects can be fitted to continuation ratio models, making them, in this sense, less restrictive than proportional odds.However, categoryspecific effects invariably complicate model interpretation.According to PSIS-LOO-IC, the continuation ratio fitted the cannabis use days data slightly better than the proportional odds ratio (not shown).We favor the proportional odds model in this application because it affords arguably clearer inferential statements.
It seems likely that the wording of available responses on the questionnaire in the ADAPT study exaggerated the true extent of multimodality in cannabis days-of-use by prompting respondents to preferentially choose responses corresponding to multiples of four."Heaping" of responses to quantitative survey questions is common in discrete quantitative survey data where respondents are sometimes unable to precisely recall the correct response (33).Wang and Heitjan (34) use zero-inflated Poisson and zero-inflated negative binomial models to demonstrate that heaping can produce biased inference.However, in any application, the extent of heaping will generally be equivocal.For example, modes in daily cigarette consumption, assumed by Wang and Heitjan to be due to heaping, are alternatively interpreted by Farrell and colleagues (35) as essentially correct with observed modes resulting from the number of cigarettes in a packet and "self rationing." Multiple modes (less pronounced than described in this study) might be expected in substance days-of-use over a 28 day interval because of repeated weekly patterns of use.A comprehensive discussion of heaping is beyond the scope of this study.Nevertheless, we point out that a simple approach to addressing the effects of heaping would be to define an ordinal model with fewer levels defined specifically to minimize the influence of heaping.Where the proportional odds assumption is valid, inference about the effects of predictors is invariant to cutoffs chosen when categories are formed by collapsing counts (36).Therefore, in applications where the proportional odds assumption holds, it may be reasonable to conclude that inference would be somewhat resistant to the effects of heaping.Ordinal regression models are conferred a degree of robustness because only the order of the response variable is modeled (30,32).Investigation of the robustness of ordinal models to plausible heaping is an interesting topic for future study.
Finally, while we have focused on models for frequency of substance use, we would expect the approach that we have described could be fruitfully applied to analyses of other bounded count variables such as days with poor mental health, migraines, absenteeism, interrupted sleep and exercise (37)(38)(39)(40)(41)(42).The advantage of ordinal models over beta binomials in these applications will be greatest when data are multimodal.Ordinal models with many levels could also be applied to composite measures calculated by summing responses to multiple survey items, such as the WHO ASSIST (43) measure, where other model families are less reasonable.
are ECDFs, resulting from 25 draws from the posterior predictive distribution of each model.Inconsistency between a model's posterior predictive distribution and observed data is suggestive of lack-offit (see Gelman et al. (28) Chapter 6).

FrequencyFigure 1 .
Figure 1.Reported cannabis days-of-use during 28 day intervals across four survey waves of the Australians' Drug use: Adapting to Pandemic Threats (ADAPT) study.Note the logarithmic y-scale.

Figure 2 .
Figure 2. Posterior predictive checks based on the empirical cumulative distribution function (ECDF) of days cannabis was used across all four waves of the Australians' Drug use: Adapting to Pandemic Threats (ADAPT) study.Observed ECDFs of days of cannabis use over four 28 day intervals are shown by the double black and yellow line.The thin blue lines are ECDFs based on 25 draws from the posterior predictive distributions of the five models conditioned on the same data.Predictions from the negative binomial and hurdlenegative binomial model have been truncated at 40 days for convenience of presentation.

Figure 3 .
Figure 3. Posterior predictive check of empirical cumulative distribution functions (ECDFs) for the proportional odds model disaggregated by survey wave.Observed ECDFs depicted by double black and yellow lines.Thin blue lines are ECDFs predicted from 25 random draws from the posterior distribution.
a). Summaries of posterior predictive means by survey wave as implied by the proportional odds model (Figure 4 b) indicate the declines in mean cannabis days-of-use (e.g.posterior means of 13.6 days in Wave 1 and 10.6 days in Wave 4) were noteworthy but not extreme.

Figure 4 .
Figure 4. (a) Posterior distributions of odds ratios of exceeding a specified frequency of cannabis use in Waves 2, 3 and 4, relative to Wave 1 and when in isolation or quarantine relative to when not.Open circles are posterior medians, thick green bands are 50% credible intervals and thin gray lines are 90% credible intervals.Scale is log transformed.(b) Posterior predictive means of conditional mean cannabis-days-of-use by survey wave (open circles) with 90% credible intervals.

Table 1 .
Pareto smoothed importance sampling leave-one-out information criterion (PSIS-LOO-IC) for fits of the five models to the days cannabis use data with estimated standard errors in parentheses.P-LOO is the estimated number of effective parameters in each model.Lower values of PSIS-LOO-IC indicate better fit.