Data Files (Compressed)
Data for "Application of Knowledge-Driven Ensemble Algorithms to Improve Preventative Measures for Underage Tobacco Consumption in the U.S." On March 16, 2024, we accessed publicly available retrospective data from the National Youth Tobacco Survey (NYTS), which is administered by the Centers for Disease Control and Prevention (CDC). The Office of Management and Budget, RTI International’s Institutional Review Board (IRB), and CDC’s IRB approved the design, privacy provisions, and implementation of the original survey conducted by the CDC. We did not have access to information that could identify individual participants during or after data collection. All participants had written parental consent. The raw datasets are 2021_NYTS_raw and 2022_NYTS_raw. The codebook 2022-NYTS-Codebook_508, obtained from the CDC website, describes the original dataset. The label-encoded preprocessed data are in x_resampled and y_resampled; the names of files containing one-hot encoded data begin with ohe. The 2021 one-hot encoded data are named in the same way.
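For readers unfamiliar with the two encodings named in the file descriptions, the following is a minimal sketch of how label encoding and one-hot encoding differ. It is a generic toy illustration, not the preprocessing code used to produce these files; the variable names and the example responses are hypothetical and are not taken from the NYTS data.

```python
# Toy illustration only: these responses are hypothetical,
# not values from the NYTS files.
responses = ["never", "past 30 days", "ever", "never"]

# Fix a category order so the encodings are reproducible.
categories = sorted(set(responses))

# Label encoding: each category is replaced by a single integer code.
label_codes = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_codes[r] for r in responses]

# One-hot encoding: each category gets its own 0/1 column,
# and exactly one column is 1 in every row.
one_hot_encoded = [
    [1 if r == cat else 0 for cat in categories] for r in responses
]

print(categories)
print(label_encoded)
print(one_hot_encoded)
```

Label encoding keeps one column per variable but implies an ordering among the integer codes, whereas one-hot encoding avoids that implied ordering at the cost of more columns.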