figshare

DyePerm_Dataset

Version 7 2025-12-05, 07:02
dataset
posted on 2025-12-05, 07:02 authored by Bo WANG
DyePermDB is a curated dataset of 202 fluorescent and chromogenic dyes with experimentally supported membrane-permeability annotations. The resource integrates structural identifiers, physicochemical attributes, qualitative solubility, toxicity notes, and literature evidence drawn from PubChem, DrugBank, and primary publications. Each dye carries one of three permeability labels (“Yes”, “Yes (conditional)”, “No”), which were independently reviewed and cross-validated by domain experts.

To assess dataset quality and structural coherence, we performed descriptive statistical analyses, XGBoost-based permeability classification using FP4 fingerprints, and feature-importance evaluation with random forests. These analyses revealed strong structure–permeability signals driven by heteroatom content and SMILES-derived features. The repository includes the full dataset, a DrugBank-linked subset, reproducible train/test splits, and Python scripts for all modelling tasks.

This dataset supports cheminformatics research, QSAR/QSPR modelling, fluorescent probe selection, and dye-oriented drug repurposing studies.
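The description mentions reproducible train/test splits over the three permeability labels. As a minimal sketch of how such a split could be produced, and not the repository's actual script, the example below stratifies records by label using only the Python standard library. The field name `permeability`, the 20% test fraction, the seed, and the toy records are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# The three permeability labels named in the dataset description.
LABELS = ("Yes", "Yes (conditional)", "No")


def stratified_split(records, label_key="permeability", test_fraction=0.2, seed=42):
    """Split records into train/test lists with similar label proportions.

    `label_key`, `test_fraction`, and `seed` are illustrative assumptions,
    not values taken from the repository.
    """
    by_label = defaultdict(list)
    for rec in records:
        label = rec[label_key]
        if label not in LABELS:
            raise ValueError(f"unexpected permeability label: {label!r}")
        by_label[label].append(rec)

    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    train, test = [], []
    for label in LABELS:  # fixed label order keeps the result deterministic
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy records with made-up names; the real dataset has 202 dyes.
toy = (
    [{"name": f"dye_yes_{i}", "permeability": "Yes"} for i in range(10)]
    + [{"name": f"dye_cond_{i}", "permeability": "Yes (conditional)"} for i in range(5)]
    + [{"name": f"dye_no_{i}", "permeability": "No"} for i in range(5)]
)
train, test = stratified_split(toy)
print(len(train), len(test))  # prints "16 4"
```

Stratifying by label matters for a small, three-class dataset like this one, since a plain random split could leave the rarest label underrepresented in the test set.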
