figshare

Enhancing QSAR Predictive Power and Explainability: Meta-Modeling and SHAP Feature Importance Analysis for Drug Discovery (Python Scripts)

software
posted on 2024-12-12, 07:45, authored by Ardo Sanjaya

Quantitative structure-activity relationship (QSAR) modeling predicts biological endpoints, such as toxicity and potency, from molecular structures. This study compares the predictive ability and reproducibility of molecular fingerprints, improves prediction accuracy by combining fingerprints, and highlights explainability in QSAR applications. We evaluated the performance of various fingerprints in predicting pIC50 across ten proteins. Concordance Correlation Coefficient (CCC) analysis assessed prediction agreement and reproducibility. SHapley Additive exPlanations (SHAP) analysis explored the feature importance of molecular substructures in base models and the relevance of each fingerprint in meta-models. A Streamlit web app showcased prediction and feature importance visualization. No single fingerprint demonstrated superiority. Meta-models combining Morgan6 with other fingerprints outperformed single models for seven of the ten protein targets. SHAP analysis showed that fingerprint importance depends on the protein target. The web app highlighted critical substructures using Donepezil as a case study. Although the fingerprints model different molecular aspects, their predictive performance was comparable, with high reproducibility and agreement. Combining fingerprints improved prediction accuracy, and SHAP analysis emphasized their context-dependent contribution. Feature importance analysis enhanced QSAR model interpretability, providing actionable insights for drug discovery.
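
The agreement measure named in the abstract can be illustrated with a minimal sketch of Lin's concordance correlation coefficient. This is not taken from the posted scripts: the function name `concordance_ccc` and the sample pIC50 values are hypothetical, and the sketch assumes the population (1/n) form of the variance and covariance.

```python
from statistics import fmean


def concordance_ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient between two equal-length series.

    Illustrative helper (hypothetical name), using population (1/n) moments:
        CCC = 2 * cov / (var_true + var_pred + (mean_true - mean_pred) ** 2)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    if len(y_true) < 2:
        raise ValueError("at least two paired values are required")

    mean_t = fmean(y_true)
    mean_p = fmean(y_pred)

    # Population variances and covariance (divide by n, not n - 1).
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])

    denominator = var_t + var_p + (mean_t - mean_p) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both series are the same constant")
    return 2 * cov / denominator


# Hypothetical pIC50 values, for illustration only.
observed = [5.2, 6.1, 7.4, 8.0, 6.6]
predicted = [5.0, 6.3, 7.1, 7.8, 6.9]
ccc = concordance_ccc(observed, predicted)
```

Unlike the Pearson correlation, CCC penalizes systematic offsets between the two series: predictions shifted by a constant still correlate perfectly but score below 1. That makes it suitable for checking whether two models, or a model and the measurements, actually agree in value.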
