9 files

ORDerly benchmarks of chemical reactions

Version 4 2024-02-05, 23:17
Version 3 2023-08-29, 16:24
Version 2 2023-06-12, 16:51
Version 1 2023-06-06, 21:36
dataset
posted on 2024-02-05, 23:17 authored by Daniel Wigh, Joe Arrowsmith, Kobi Felton, Alexander Pomberger, Alexei A. Lapkin

Benchmark datasets generated with ORDerly for chemical reaction prediction tasks

ORDerly-forward: Forward reaction prediction (predict reaction products given reactants, solvents, and agents)

ORDerly-retro: Retrosynthesis prediction (predict reactants given a desired product)

ORDerly-condition: Reaction condition prediction (predict solvents and agents given reactants and products). Note that reactions with rare solvents and agents (frequency <100) have been removed.

ORDerly-condition-with-rare: Reaction condition prediction (predict solvents and agents given reactants and products). Reactions with rare solvents and agents have not been removed.

Config: Contains the .log and .json files showing the parameters used in cleaning and the impact on dataset size after each cleaning step.

Note that all datasets here were created using the reaction string and our chemically informed logic to assign reaction roles.
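
The difference between ORDerly-condition and ORDerly-condition-with-rare is the rare solvent/agent filter (frequency <100). The following is a minimal sketch of such a filter, not the ORDerly implementation itself. It assumes each reaction is a dict with hypothetical "solvents" and "agents" keys holding lists of SMILES strings, and that frequency is counted once per reaction across solvents and agents together; the actual code may count differently.

```python
from collections import Counter


def remove_rare_conditions(reactions, min_frequency=100):
    """Drop reactions using any solvent or agent seen in fewer than
    min_frequency reactions. Keys "solvents" and "agents" are hypothetical."""
    counts = Counter()
    for rxn in reactions:
        # Count each molecule at most once per reaction.
        counts.update(set(rxn["solvents"]) | set(rxn["agents"]))

    kept = []
    for rxn in reactions:
        molecules = set(rxn["solvents"]) | set(rxn["agents"])
        # Keep the reaction only if every condition molecule is common enough.
        if all(counts[m] >= min_frequency for m in molecules):
            kept.append(rxn)
    return kept


# Toy example with a low threshold so the effect is visible.
toy_reactions = [
    {"solvents": ["O"], "agents": ["[Na+].[OH-]"]},
    {"solvents": ["O"], "agents": ["[Na+].[OH-]"]},
    {"solvents": ["CCO"], "agents": []},  # ethanol appears only once
]
kept_reactions = remove_rare_conditions(toy_reactions, min_frequency=2)
```

With a threshold of 2, the toy reaction in ethanol is dropped because ethanol appears in only one reaction, while the two aqueous reactions are kept.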

Paper: https://chemrxiv.org/engage/chemrxiv/article-details/64ca5d3e4a3f7d0c0d78ca42

NeurIPS workshop paper: https://openreview.net/forum?id=R8FQMsECIS

Code: https://github.com/sustainable-processes/orderly

The supplementary datasets used for this work can be found here: https://doi.org/10.6084/m9.figshare.23502372.v3

Feel free to email me, Daniel Wigh, at dsw46@cam.ac.uk or my supervisor Alexei A. Lapkin.

Funding

UCB Pharma

Engineering and Physical Sciences Research Council via project EP/S024220/1 EPSRC Centre for Doctoral Training in Automated Chemical Synthesis Enabled by Digital Molecular Technologies.

European Regional Development Fund via the project "Innovation Centre in Digital Molecular Technologies"
