Developing Deep Learning-based Large-scale Organic Reaction Classification Model via Sigma-profiles

Version 2 2024-06-28, 02:29
Version 1 2023-11-25, 06:00
dataset
posted on 2024-06-28, 02:29, authored by Wenlong Wang

The "Train_AE.zip" contains the scripts for training an auto-encoder.

The "Train_DL_Models.zip" contains the scripts for training deep learning-based models.

The "sigma_profiles_dict.npy" contains the sigma-profiles of millions of different molecules. The SMILES string of a molecule serves as the key for looking up its sigma-profile.
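
A minimal sketch of how such a file might be read, assuming the dictionary was written with NumPy's `np.save` (which stores a Python dict as a pickled zero-dimensional object array). The helper name `load_npy_dict`, the toy file, and the toy profile values are illustrative and are not part of the dataset; the exact SMILES form used for the keys (for example, whether it is canonicalized) is not stated here and should be checked against the real file.

```python
import os
import tempfile

import numpy as np


def load_npy_dict(path):
    """Load a Python dict that was saved with np.save (a 0-d object array)."""
    # Pickled objects can only be read with allow_pickle=True;
    # only enable this for files from a source you trust.
    arr = np.load(path, allow_pickle=True)
    obj = arr.item()  # unwrap the 0-d object array into the stored object
    if not isinstance(obj, dict):
        raise TypeError(f"expected a dict, got {type(obj).__name__}")
    return obj


# Toy stand-in for the real file: SMILES -> made-up profile values.
toy = {"CCO": np.array([0.0, 1.5, 0.5]), "C": np.array([0.0, 2.0, 0.0])}
toy_path = os.path.join(tempfile.mkdtemp(), "toy_sigma_profiles_dict.npy")
np.save(toy_path, toy, allow_pickle=True)

profiles = load_npy_dict(toy_path)
ethanol_profile = profiles.get("CCO")  # None if the SMILES key is absent
```

For the real data, the same helper would be called with the path to "sigma_profiles_dict.npy". Using `dict.get` avoids a `KeyError` when a molecule is missing from the dictionary.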

The "sorted_agent_dict.npy" contains statistics on how frequently each agent occurs in the USPTO_TPL dataset. The agents are listed in descending order of frequency.

The "sorted_agent_combination_dict.npy" contains statistics on how frequently each combination of agents occurs in the USPTO_TPL dataset. The combinations are listed in descending order of frequency.
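
As an illustration of what frequency dictionaries in descending order can look like, the sketch below counts agents and agent combinations over a few made-up reactions. It is not the author's script: the function names, the example agents, and the choice of a sorted tuple as the key for a combination are all assumptions, and the key format of the real files should be inspected before relying on it.

```python
from collections import Counter


def sorted_frequency_dict(items):
    """Count occurrences and return a dict ordered from most to least frequent."""
    counts = Counter(items)
    # most_common() sorts by count in descending order;
    # a regular dict preserves that insertion order (Python 3.7+).
    return dict(counts.most_common())


def top_k(freq_dict, k):
    """Return the first k (key, count) pairs of an already sorted frequency dict."""
    return list(freq_dict.items())[:k]


# Hypothetical agent lists, one list per reaction.
reactions_agents = [["THF", "NaH"], ["THF"], ["DMF", "NaH"], ["THF", "NaH"]]

# Frequency of individual agents across all reactions.
agent_freq = sorted_frequency_dict(
    agent for agents in reactions_agents for agent in agents
)

# Frequency of agent combinations; a sorted tuple is used as the key
# so that the order of agents within a reaction does not matter.
combo_freq = sorted_frequency_dict(
    tuple(sorted(agents)) for agents in reactions_agents
)
```

With dictionaries ordered this way, the most frequent entries are simply the first items, which is what `top_k` returns.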


The "USPTO_TPL_own_version.xlsx" contains the reactions used for training, validation, and testing.


Funding

National Natural Science Foundation of China (22078041)

National Natural Science Foundation of China (22278053)

Dalian High-level Talents Innovation Support Program (2021RQ105)

Research Funds for China Central Universities (DUT22LAB608)
