tx1c00187_si_001.pdf (3.11 MB)

A Classification Model to Identify Direct-Acting Mutagenic Polycyclic Aromatic Hydrocarbon Transformation Products

journal contribution
Posted on 2021-10-18, 21:30. Authored by Trevor W. Sleight, Caitlin N. Sexton, Giannis Mpourmpakis, Leanne M. Gilbertson, Carla A. Ng
Polycyclic aromatic hydrocarbons (PAHs) are a complex group of environmental contaminants, many having long environmental half-lives. As these compounds degrade, the changes in their structure can result in a substantial increase in mutagenicity compared to the parent compound. Over time, each individual PAH can potentially degrade into several thousand unique transformation products, creating a complex, constantly evolving set of intermediates. Microbial degradation is the primary mechanism of their transformation and ultimate removal from the environment, and this process can result in mutagenic activation similar to the metabolic activation that can occur in multicellular organisms. The diversity of the potential intermediate structures in PAH-contaminated environments renders hazard assessment difficult for both remediation professionals and regulators. A mixture of structural and energetic descriptors has proven effective in existing studies for classifying which PAH transformation products will be mutagenic. However, most existing studies of environmental PAH mutagens primarily focus on nitrogenated derivatives, which are prevalent in the atmosphere and not as relevant in soil. Additionally, PAH products commonly found in the environment can range from as large as five rings to as small as a single ring, requiring a broadly inclusive methodology to comprehensively evaluate mutagenic potential. We developed a combination of supervised and unsupervised machine learning methods to predict environmentally induced PAH mutagenicity with improved performance over currently available tools. K-means clustering with principal component analysis allows us to identify molecular clusters that we hypothesize to have similar mechanisms of action. Recursive feature elimination identifies the most influential descriptors. The cluster-specific regression outperforms available classifiers in predicting direct-acting mutagens resulting from the microbial biodegradation of PAHs and provides direction for future studies evaluating the environmental hazards resulting from PAH biodegradation.
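
The workflow the abstract describes (principal component analysis, K-means clustering, recursive feature elimination, then one regression per cluster) might be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the abstract does not name a software library, so scikit-learn is assumed; "regression" is read here as logistic regression; the descriptor matrix is synthetic random data, not real PAH descriptors; and the helper names `fit_cluster_models` and `predict_mutagenic` as well as all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def fit_cluster_models(X, y, n_clusters=2, n_components=2, n_features=3, seed=0):
    """Cluster molecules in PCA space, then fit one RFE-wrapped classifier per cluster."""
    scaler = StandardScaler().fit(X)
    Xs = scaler.transform(X)
    pca = PCA(n_components=n_components, random_state=seed).fit(Xs)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans.fit(pca.transform(Xs))

    models = {}
    for c in range(n_clusters):
        mask = kmeans.labels_ == c
        if len(np.unique(y[mask])) < 2:
            # A single-class cluster cannot train a classifier; store its label.
            models[c] = int(y[mask][0])
            continue
        # RFE repeatedly drops the descriptor with the smallest coefficient
        # until n_features descriptors remain for this cluster.
        rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=n_features)
        models[c] = rfe.fit(Xs[mask], y[mask])
    return scaler, pca, kmeans, models


def predict_mutagenic(fitted, X):
    """Assign each molecule to a cluster and apply that cluster's model."""
    scaler, pca, kmeans, models = fitted
    Xs = scaler.transform(X)
    clusters = kmeans.predict(pca.transform(Xs))
    out = np.empty(len(X), dtype=int)
    for c, model in models.items():
        mask = clusters == c
        if mask.any():
            out[mask] = model if isinstance(model, int) else model.predict(Xs[mask])
    return out


# Synthetic stand-in for a descriptor table: 200 "molecules", 6 "descriptors".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:100] += 3.0  # shift half the rows so two groups exist
# Each group's label depends on a different descriptor, mimicking
# the hypothesis that clusters have different mechanisms of action.
y = np.where(np.arange(200) < 100, X[:, 0] > 3.0, X[:, 1] > 0.0).astype(int)

fitted = fit_cluster_models(X, y)
preds = predict_mutagenic(fitted, X)
print("training accuracy:", (preds == y).mean())
for c, model in fitted[3].items():
    if not isinstance(model, int):
        print("cluster", c, "kept descriptors:", np.flatnonzero(model.support_))
```

The per-cluster `support_` masks show which descriptors each cluster's model retained, which is one plausible way to read "most influential descriptors" in this setting. The accuracy printed here is measured on the training data, so a real evaluation would need held-out molecules.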
