Initial Validation of a Computerized Adaptive Test for Substance Use Disorder Identification in Adolescents

Abstract
Purpose: Computerized adaptive tests (CATs) are highly efficient assessment tools that couple low patient and clinician time burden with high diagnostic accuracy. A CAT for substance use disorders (CAT-SUD-E) has been validated in adult populations but has yet to be tested in adolescents. The purpose of this study was to perform an initial evaluation of the K-CAT-SUD-E (i.e., Kiddy-CAT-SUD-E) in an adolescent sample compared to a gold-standard diagnostic interview.
Methods: Adolescents (N = 156; aged 11–17) with diverse substance use histories completed the K-CAT-SUD-E electronically and the substance-related disorders portion of a clinician-conducted diagnostic interview (K-SADS) via a tele-videoconferencing platform. The K-CAT-SUD-E assessed both current and lifetime overall SUD and substance-specific diagnoses for nine substance classes.
Results: Using the K-CAT-SUD-E continuous severity score and diagnoses to predict the presence of any K-SADS SUD diagnosis, the classification accuracy ranged from excellent for current SUD (AUC = 0.89, 95% CI = 0.81, 0.95) to outstanding for lifetime SUD (AUC = 0.93, 95% CI = 0.82, 0.97). Regarding current substance-specific diagnoses, the classification accuracy was excellent for alcohol (AUC = 0.82), cannabis (AUC = 0.83), and nicotine/tobacco (AUC = 0.90). For lifetime substance-specific diagnoses, the classification accuracy ranged from excellent (e.g., opioids, AUC = 0.84) to outstanding (e.g., stimulants, AUC = 0.96). K-CAT-SUD-E median completion time was 4 min 22 s compared to 45 min for the K-SADS.
Conclusions: This study provides initial support for the K-CAT-SUD-E as a feasible, accurate diagnostic tool for assessing SUDs in adolescents. Future studies should further validate the K-CAT-SUD-E in a larger sample of adolescents and examine its acceptability, feasibility, and scalability in youth-serving settings.


Introduction
Substance use disorders (SUDs) present a major health crisis in the United States (Substance Abuse & Mental Health Services Administration, 2021). Use of drugs and alcohol is typically initiated during adolescence, a developmental period marked by substantial change across multiple functional domains (Miech et al., 2023; Steinberg, 2005). Earlier and more frequent use of substances in adolescents is associated with increased likelihood of developing SUDs, which are often accompanied by a wide array of negative outcomes (e.g., mental illness, suicide, accidents/injuries, academic failure, interpersonal conflict, legal involvement), making identification and delivery of interventions for adolescents with SUDs especially important (Forman-Hoffman et al., 2017; Odgers et al., 2008). Yet only 7.6% of the estimated 1,600,000 adolescents aged 12-17 in the United States with SUDs received any substance use treatment (Substance Abuse & Mental Health Services Administration, 2021). This treatment gap is partly due to inadequate availability of standardized screening and diagnosis of SUDs in healthcare and other youth-serving settings to identify youth in need of SUD services (McLellan & Meyers, 2004).
To solve this problem, national organizations such as the American Academy of Pediatrics (AAP) and the Substance Abuse and Mental Health Services Administration (SAMHSA) recommend incorporation of the Screening, Brief Intervention, and Referral to Treatment (SBIRT) model into routine adolescent health care (Levy et al., 2020; Levy et al., 2016). Several barriers prevent this from being standard practice, however. Common barriers specific to substance use screening in pediatric practice include lack of time and lack of familiarity with available screening tools and their use (Levy et al., 2020). Several adolescent-appropriate screening and diagnostic tools exist for substance use, and they vary in terms of which substances are assessed and in diagnostic detail. Noteworthy tools include single-substance screeners (e.g., Alcohol Use Disorders Identification Test [AUDIT]), general screeners (e.g., Car, Relax, Alone, Forget, Friends, Trouble [CRAFFT]; Problem Oriented Screening Instrument for Teenagers substance use scale [POSIT]), screeners that include multiple substance categories (e.g., Brief Screener for Tobacco, Alcohol, and other Drugs [BSTAD]; Screening to Brief Intervention [S2BI]), and full-scale diagnostic interviews (e.g., Kiddy Schedule for Affective Disorders and Schizophrenia for School-Age Children [K-SADS]).
While structured interviews (e.g., K-SADS) can be used to accurately diagnose SUDs, their widespread use is limited by the availability of trained providers/clinicians and the duration of the interview process (Priester et al., 2016). Although screeners (e.g., BSTAD, CRAFFT, POSIT, S2BI) can be rapidly completed, and some have been shown to provide reliable indication of problematic substance use in adolescents (Kelly et al., 2014; Knight et al., 2002; Levy et al., 2021), a screener cannot provide reliable SUD diagnoses and positive screens typically require additional assessment. Therefore, with regard to substance use assessment, pediatric healthcare providers face a choice between rapid detection of the likelihood of problematic use via screening, requiring follow-up assessment/diagnosis, or lengthy determination of diagnosable substance use disorder criteria, requiring trained clinicians and significant time from patients.
Recent advances in computerized adaptive tests (CATs) for mental health have begun to provide tools capable of striking a middle ground in the desire for increased assessment speed, increased diagnostic accuracy, and decreased patient burden (Gibbons et al., 2008, 2012). In contrast to fixed-length mental health assessments, these tests adaptively select only a few items from a larger item bank until converging on precision of measurement of the mental health domain(s) of interest. CATs have been validated for a range of mental health domains, and importantly, they have been validated for assessment of psychopathology in adolescents (Gibbons et al., 2020b). Gibbons et al. (2020a) recently demonstrated the diagnostic reliability of a CAT for substance use disorders (CAT-SUD) in an adult population, which was recently expanded (CAT-SUD-E) to include useful diagnostic detail to inform treatment planning (Hulvershorn et al., 2022). That sample included only adults aged 18 and older. The purpose of this paper is to report on the performance of this expanded CAT-SUD-E in an adolescent sample (i.e., ages 11-17 years; Kiddy-CAT-SUD-E, K-CAT-SUD-E), and its comparison against the Kiddy Schedule for Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime Version (K-SADS; Kaufman et al., 2013).

Design
This study was conducted from June 2020 to October 2021 in compliance with the ethical principles of the Declaration of Helsinki and the International Conference on Harmonization's Good Clinical Practices Guidelines. The Institutional Review Board at Indiana University approved the study, and individuals provided verbal informed consent prior to initiation of study procedures. The study design involved administration of both the self-reported K-CAT-SUD-E and a clinician-conducted interview using the SUD section of the K-SADS. The order of administration for the K-CAT-SUD-E and K-SADS was randomized across participants.
The sample was recruited from the community using multiple strategies, including physical and online advertisement as well as word-of-mouth referrals from previous participants. Eligibility criteria included being 11-17 years of age, being fluent in English, and having access to an electronic device permitting the use of Zoom and K-CAT-SUD-E applications. Attempts were made to recruit participants with a diverse set of substance use histories, especially adolescents with substantial past or present substance use (i.e., more than just experimental use). Initial phone screens were used to obtain adolescent assent and parent/guardian consent, in addition to helping ensure a sample likely enriched in SUD diagnoses, by asking guardians (yes/no) if they were aware of any adolescent substance use. Some substance use was indicated in 59% of participants during the phone screen, prior to K-SADS/K-CAT-SUD-E assessment.
All K-SADS interviews were conducted virtually using the secure Zoom for Healthcare tele-videoconferencing platform. The K-SADS is a semi-structured diagnostic interview for youth aged 6 to 18 years old that assesses for current and past SUDs based on the criteria in DSM-5 (Kaufman et al., 2013). K-SADS assessors were two experienced pediatric behavioral health clinicians working under the supervision of a licensed clinical psychologist. The assessors and supervisors were blind to the results of the K-CAT-SUD-E when cases were staffed, and diagnoses were assigned based upon the K-SADS interviews with the adolescent and their guardian. The K-SADS is a gold standard psychiatric diagnostic assessment in youth, which has been mapped to DSM-5 disorders/symptoms, has been adapted to multiple languages, and has been utilized in thousands of scientific papers. Furthermore, the K-SADS provides advantages over available youth substance use assessments, including incorporating both parent and child interviews, determining use at multiple time points (e.g., current, lifetime), and allowing clinician freedom to use probing/follow-up questions during assessment.
Participants were compensated with a $100 gift card and had the option of receiving a copy of their K-SADS results. Of 167 participants, 11 were not included in the analytical sample (n = 156) for at least one of the following reasons: incomplete K-CAT-SUD-E, incorrect K-CAT-SUD version (not Expanded), incorrect K-CAT-SUD-E timeframe (not past 30 days), or strong suspicion of invalid answers or ineligible age.

The CAT-SUD
Calibration and validation of the CAT-SUD in adults was reported previously (Gibbons et al., 2020a). Briefly, an item bank covering relevant health domains, consisting mostly of Likert-type scales, was administered, and the item-response data were calibrated using a multidimensional item response theory (MIRT) model, specifically a full-information item bifactor model, originally described by Gibbons and Hedeker (1992) for binary item responses and later generalized to ordinal (polytomous) item response data as used here (Gibbons et al., 2007). Using the item parameter estimates, tuning parameters for CAT administration (precision threshold, item information termination criteria, and probability of selecting the most or second most informative item to spread use of all of the items in the final item bank) were optimized, minimizing participant burden and maximizing the correlation between the adaptive CAT and administration of the full 168-item bank. The adaptive CAT-SUD was developed into a web application and validated against the Composite International Diagnostic Interview (CIDI) for SUDs. Cross validation of the area under the ROC curve (AUC) of CAT-SUD severity scores and CIDI-based diagnostic (SUD) outcomes yielded excellent discrimination (AUC = 0.80-0.89; Gibbons et al., 2020a; Hosmer & Lemeshow, 2000).
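The adaptive administration loop described above can be illustrated with a deliberately simplified sketch. It is not the CAT-SUD algorithm: it assumes a unidimensional two-parameter logistic (2PL) model with binary items in place of the ordinal bifactor model, and the precision threshold (0.35), the probability of choosing the most informative item (0.7), the item cap, and all function names are hypothetical values chosen for illustration only.

```python
import math
import random

GRID = [g / 10.0 for g in range(-40, 41)]  # severity (theta) grid from -4 to 4


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at a given theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def posterior_mean_sd(responses, items):
    """Grid-based posterior mean and SD of theta under a standard normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)
        for idx, y in responses:
            p = p_endorse(theta, *items[idx])
            w *= p if y == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(items, answer, se_stop=0.35, p_best=0.7, max_items=20, rng=None):
    """Administer items until the posterior SD falls to se_stop (hypothetical rule)."""
    rng = rng or random.Random(0)
    responses = []
    remaining = set(range(len(items)))
    theta, se = posterior_mean_sd(responses, items)
    while remaining and len(responses) < max_items and se > se_stop:
        ranked = sorted(remaining,
                        key=lambda i: item_information(theta, *items[i]),
                        reverse=True)
        # Usually take the most informative item, sometimes the second most
        # informative one, to spread exposure across the item bank.
        pick = ranked[0]
        if len(ranked) > 1 and rng.random() > p_best:
            pick = ranked[1]
        remaining.remove(pick)
        responses.append((pick, answer(pick)))
        theta, se = posterior_mean_sd(responses, items)
    return theta, se, [i for i, _ in responses]
```

In this sketch `answer` is any callable that returns 0 or 1 for an item index; in a real administration it would be the participant's response.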

The CAT-SUD expanded (CAT-SUD-E)
Initial testing of the CAT-SUD-E in an adult sample was previously reported (Hulvershorn et al., 2022). Briefly, the CAT-SUD-E includes the original 168-item bank from the CAT-SUD and expands upon the CAT-SUD by assessing multiple substance classes (by featuring a series of branching logic questions about drug use and DSM-5 criteria for specific drugs of abuse) and by assessing two separate timeframes, current (past 30 days) and past (prior to past 30 days). Following initial diagnostic questions regarding age (to determine adult ≥ 18 or adolescent < 18), hospitalizations, drug treatment, and probation, participants were asked if they had used (yes/no) any of several substance classes during the selected timeframe. These substance classes included alcohol, cannabis/spice, opioids, cocaine/crack, methamphetamine/amphetamine, sedatives, hallucinogens, and nicotine. A response of "yes" to any substance class in either timeframe was followed by an additional set of substance-specific, DSM-5 criteria-based questions (Hulvershorn et al., 2022). Sample images of the CAT-SUD-E user/participant experience can be found in Supplementary Figures, and information regarding CAT-MH and K-CAT products can be found at the Adaptive Testing Technologies website (https://adaptivetestingtechnologies.com).
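The branching step can be sketched as follows. This is only an illustration of the logic as described in the text, not the product's implementation; the data layout and the function name `follow_up_modules` are hypothetical, while the substance classes and timeframes are those listed above.

```python
SUBSTANCE_CLASSES = [
    "alcohol", "cannabis/spice", "opioids", "cocaine/crack",
    "methamphetamine/amphetamine", "sedatives", "hallucinogens", "nicotine",
]
TIMEFRAMES = ["current", "past"]  # past 30 days vs. prior to the past 30 days


def follow_up_modules(use_responses):
    """Return the (timeframe, substance) pairs that trigger criteria questions.

    use_responses maps (timeframe, substance) to True ("yes") or False ("no");
    unanswered pairs are treated as "no" in this sketch.
    """
    return [
        (timeframe, substance)
        for timeframe in TIMEFRAMES
        for substance in SUBSTANCE_CLASSES
        if use_responses.get((timeframe, substance), False)
    ]
```

Only endorsed classes lead to substance-specific DSM-5 criteria questions, so a participant reporting no use in either timeframe would receive none of them.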
As with the CAT-SUD, logistic regression was used to examine the association between CAT-SUD-E severity scores and clinician-based SUD outcomes using the Structured Clinical Interview for DSM-5, Research Version (SCID). However, in the case of the CAT-SUD-E, AUCs of ROC curves were determined separately for each substance class and for both timeframes. Classification accuracy in adults was found to range from acceptable to outstanding (see Hosmer & Lemeshow, 2000 for thresholds), depending on substance class and timeframe (Gibbons et al., 2020b).

Modifications for adolescents
To test the K-CAT-SUD-E in adolescents aged 11-17 years, we made the following modifications to the procedure used for adults (Hulvershorn et al., 2022): 1) Adolescents and guardians were separately interviewed using the Substance-Related Disorders Supplement of the K-SADS (instead of the SCID), and both adolescent and guardian responses during the K-SADS interview were used to determine whether the adolescent met diagnostic criteria for SUD(s); and 2) Inhalant and over-the-counter medication substance classes were added to the K-CAT-SUD-E, as both are present in the K-SADS interview and both are used by adolescents, including at disproportionately high rates in the case of inhalants (Substance Abuse & Mental Health Services Administration, 2021). The K-CAT-SUD-E computes a continuous severity score (0-100) and current and lifetime overall and individual SUD diagnoses for nine substance classes (alcohol, cannabis, opioid, stimulant, sedative, inhalants, hallucinogen, nicotine/tobacco, OTC/other, cocaine).

Statistical analysis
The analysis was conducted in line with the previous study conducted in an adult sample (Hulvershorn et al., 2022). Briefly, we used logistic regression to examine the association between K-CAT-SUD-E severity scores and K-CAT-SUD-E diagnoses (individually and overall), and clinician-based SUD diagnoses (using the K-SADS, individually and overall). This was done separately for current and lifetime for the overall and individual SUD diagnoses. For each regression model, we computed the probability of a K-SADS SUD diagnosis, and then the area under the ROC curve (AUC), using the K-CAT-SUD-E severity score and corresponding K-CAT-SUD-E diagnosis as predictors. In general, an AUC of 0.5 suggests no discrimination (e.g., inability to distinguish adolescents with and without SUDs based on the test), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding discrimination (Hosmer & Lemeshow, 2000). For each analysis, sensitivity was also reported at fixed specificity values of 0.8 and 0.9, and sensitivity, specificity, and kappa were also computed at the point on the ROC curve of maximum classification accuracy (MCA). Classification accuracy in terms of AUC was based on out-of-sample threefold cross validated results with bootstrap bias corrected 95% confidence intervals. All computations were conducted in Stata 17.
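The summary statistics named above can be made concrete with a small sketch. The study's computations were done in Stata 17; the Python below is an illustration only, using the standard library and hypothetical function names. It takes predicted probabilities as given, and omits the logistic regression fit, the threefold cross validation, and the bootstrap confidence intervals. The handling of values exactly on a threshold, and the label for AUCs below 0.7, are assumptions.

```python
def auc(scores, labels):
    """AUC as the chance a random case outscores a random non-case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def describe_auc(value):
    """Verbal label using the thresholds quoted from Hosmer & Lemeshow (2000)."""
    if value > 0.9:
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    return "below acceptable"  # assumption: the text gives no label here


def metrics_at(scores, labels, threshold):
    """Sensitivity, specificity, accuracy, and Cohen's kappa at one cut point."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected) if expected < 1.0 else 0.0
    return {"threshold": threshold, "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp), "accuracy": accuracy, "kappa": kappa}


def max_classification_accuracy(scores, labels):
    """Cut point with the highest accuracy (lowest such cut point on ties)."""
    candidates = [metrics_at(scores, labels, t) for t in sorted(set(scores))]
    return max(candidates, key=lambda m: m["accuracy"])


def sensitivity_at_specificity(scores, labels, target):
    """Best sensitivity among cut points whose specificity is at least target."""
    candidates = [metrics_at(scores, labels, t) for t in sorted(set(scores))]
    eligible = [m["sensitivity"] for m in candidates if m["specificity"] >= target]
    return max(eligible) if eligible else 0.0
```

Here `scores` would be the predicted probabilities of a K-SADS diagnosis and `labels` the K-SADS diagnoses coded 1 or 0.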
Of note, the K-SADS refers to substance use within the past 12 months, whereas the K-CAT-SUD-E refers to the past 30 days, so our estimates of agreement represent a lower bound on what can be achieved using the same timeframes. Stimulants also were categorized somewhat differently between instruments: the K-SADS diagnoses included questions about stimulants and cocaine, whereas the K-CAT-SUD-E included the categories methamphetamine/amphetamine and cocaine/crack. To this end, we created a category encompassing all stimulant substance categories from both the K-SADS and K-CAT-SUD-E. This also places an upper bound on the computed level of agreement. As the K-SADS includes limited assessment of nicotine/tobacco use disorder criteria, a new set of items was created by adapting DSM-5 SUD criteria for nicotine and tobacco into a format similar to K-SADS SUD items for other substances, for a subset of K-SADS interviews (n = 53). Analyses were conducted with and without the diagnostic supplement. In addition, summary diagnoses were not initially made for K-SADS cocaine or "other" substance categories and were later added for a subset of participants (n = 48).

Descriptive statistics
Table 1 displays sample demographic characteristics based on K-SADS diagnoses. One hundred fifty-six participants (ages 11-17; 48% female) completed the K-CAT-SUD-E and a K-SADS diagnostic interview. Median K-CAT-SUD-E completion time was 4 min and 22 s (interquartile range from 1:04 to 7:40). Conversely, K-SADS completion time was a median of 45 min, ranging from approximately 15 to 75 min, depending on adolescent substance use history. No adverse events from undergoing the assessments were noted.

Overall diagnostic accuracy
For the overall current SUD K-SADS diagnosis, the K-CAT-SUD-E (severity score and any diagnosis excluding our expanded nicotine/tobacco K-SADS diagnostic criteria) had AUC = 0.89, 95% CI = 0.81, 0.95. Maximum classification accuracy was 87%, and at that point on the ROC curve sensitivity was 0.75, specificity was 0.92, and kappa was 0.66. Adjusting specificity to 0.90 yields sensitivity of 0.75. Including our K-SADS expanded nicotine/tobacco criteria, AUC was identical at 0.89, 95% CI = 0.80, 0.95.

Discussion
The current study demonstrated initial performance of the K-CAT-SUD-E for identifying and quantifying SUD symptoms in 11- to 17-year-old adolescents. The classification accuracy of the K-CAT-SUD-E ranged from excellent to outstanding for current and lifetime overall SUD and current and lifetime individual substance-specific diagnoses. This work represents a significant contribution to the menu of assessment options available to clinicians tasked with diagnosing SUDs and determining which, if any, additional clinical services may be appropriate for an individual adolescent.
With regard to general classification accuracy, the K-CAT-SUD-E appears to perform similarly to existing brief screeners of substance use for adolescents. Across the three substances most commonly used by adolescents (alcohol, nicotine, cannabis), AUCs for the K-CAT-SUD-E for current use (0.82-0.90) and lifetime use (0.91-0.92) were similar to those for the BSTAD (i.e., the only other tool for which AUCs were reported for multiple substance classes; 0.87-0.96; Kelly et al., 2014). The reduction in accuracy for current use presented here is likely a reflection of the different timeframes used to define current use for the K-CAT-SUD-E compared to the K-SADS (i.e., past month versus past year), and the shift away from combustible cigarettes toward a variety of vaping products may make comparisons with older studies problematic. Comparing K-CAT-SUD-E diagnoses of alcohol use disorder and cannabis use disorder to those provided by the S2BI, at similar specificities (0.90 vs. 0.92-0.94), K-CAT-SUD-E sensitivity was superior for both alcohol (0.74 vs. 0.53) and cannabis (0.85 vs. 0.81; Levy et al., 2021). In addition, the AUC for K-CAT-SUD-E lifetime alcohol use disorder (0.91) was similar to those reported for the CRAFFT (0.88), POSIT (0.93), and AUDIT (0.91), and superior to that reported for the CAGE (0.77; Knight et al., 2003). Furthermore, the K-CAT-SUD-E also showed excellent to outstanding classification accuracy (AUCs = 0.84-0.96) for substance classes not assessed or not reported for existing screeners (i.e., opioids, stimulants, sedatives). Lastly, compared to CAT-SUD-E performance in adults, overall diagnostic accuracy was lower for adolescents, but importantly, accuracy for lifetime use of the three most commonly used substances was higher for adolescents (0.91-0.92 vs. 0.85-0.90; Hulvershorn et al., 2022). Collectively, the K-CAT-SUD-E provides measurement accuracy comparable to existing tools across multiple substance classes, including tools targeting a specific substance (i.e., AUDIT), provides assessment for substance classes not found in other tools (e.g., inhalants), and unlike existing tools, uses adaptive technology to limit patient demand to just those items needed for diagnosis.
The K-CAT-SUD-E introduces a diagnostic tool for substance use disorders that can be completed by adolescents in any location, with speed approaching that of a simple screener and diagnostic accuracy approaching that of an interview requiring both a clinician and parent. As such, it could be utilized in many settings, including academic and juvenile justice contexts as well as primary care consistent with SBIRT recommendations. That the K-CAT-SUD-E resulted in similar diagnostic accuracy with adolescent report alone, compared to interviews with both adolescents and parents, further supports its efficiency. Adolescent substance use is often underreported in clinical settings due to, among other things, stigma, rapport with administrators/providers, confidentiality concerns, and presence of parents (Gryczynski et al., 2019), the latter of which is fairly common during substance use screening (Levy et al., 2020). In adolescents, self-administration of assessments can be more reliable than clinician-administration for sensitive topics like substance use, and youth might actually prefer reporting using an electronic device (e.g., tablet) compared to having an interview (Kelly et al., 2014). The rapid, remote nature of the K-CAT-SUD-E addresses a number of these concerns with adolescent assessment. Finally, the K-CAT-SUD-E has the unique advantage of incorporating other mental health domains (e.g., depression, suicidality, posttraumatic stress) seamlessly and efficiently through administration of other CAT modules.
Limitations of the current study include sample size, rates of substance use, and timeframe discrepancy. The sample size used (n = 156) is smaller than samples used to validate other substance use tools in adolescents (n's = 517-538; Kelly et al., 2014; Knight et al., 2002; Levy et al., 2020). Of note, by targeting a sample enriched for substance use, the current study did have higher rates of any use (59% vs. 21-50%) and of past year SUDs for alcohol (10% vs. 3-5%), nicotine (19% vs. 4%), and cannabis (19% vs. 8-11%). Nevertheless, the combination of lower sample size and low rates of use of less popular substance categories precluded analysis of current use for all but the three most frequently used substances. Future studies with larger samples could address this limitation, as well as permit analyses based on gender or age within an adolescent sample. With regard to timeframe, the difference in definitions of current use for the K-CAT-SUD-E (past 30 days) vs. K-SADS (past year) likely limited the diagnostic accuracy of recent use. However, the range of K-CAT-SUD-E timeframe settings, from use as

Table 2. Substance-specific diagnostic accuracy: current and lifetime.
Note. MCA = maximum classification accuracy; Se = sensitivity; Sp = specificity; Pr(Dx) = probability of SUD diagnosis. Model includes current and lifetime K-CAT-SUD-E diagnoses and severity scores. Current substances with too few instances to calculate include hallucinogens, inhalants, opioids, other/OTC, sedatives, and stimulants; lifetime substances with too few instances to calculate include hallucinogens, inhalants, and other/OTC.