DigiMOF: A Database of Metal–Organic Framework Synthesis Information Generated via Text Mining
Posted on 2023-05-18 - 04:15
The vastness of materials space, particularly for metal–organic
frameworks (MOFs), makes it difficult to identify promising materials
for specific applications efficiently. Although high-throughput computational
approaches, including machine learning, have been useful for the
rapid screening and rational design of MOFs, they tend to neglect
descriptors related to synthesis. One way to improve the efficiency
of MOF discovery is to data-mine published MOF papers and extract the
materials informatics knowledge they contain.
Here, by adapting the chemistry-aware natural language processing
tool ChemDataExtractor (CDE), we generated an open-source database
of MOFs focused on their synthetic properties: the DigiMOF database.
Using the CDE web scraping package alongside the Cambridge Structural
Database (CSD) MOF subset, we automatically downloaded 43,281 unique
MOF journal articles, extracted 15,501 unique MOF materials, and text-mined
over 52,680 associated properties including the synthesis method,
solvent, organic linker, metal precursor, and topology. Additionally,
we developed an alternative data extraction technique to obtain and
transform the chemical names assigned to each CSD entry in order to
determine linker types for each structure in the CSD MOF subset. These
data enabled us to match MOFs to a list of known linkers provided
by Tokyo Chemical Industry UK Ltd. (TCI) and to analyze the cost of these
important chemicals. This centralized, structured database reveals
the MOF synthetic data embedded within thousands of MOF publications
and contains further topology, metal type, accessible surface area,
largest cavity diameter, pore limiting diameter, open metal sites,
and density calculations for all 3D MOFs in the CSD MOF subset. The
DigiMOF database and associated software are publicly available for
other researchers to rapidly search for MOFs with specific properties,
conduct further analysis of alternative MOF production pathways, and
create new parsers to search for additional desirable properties.
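The linker matching and property search described above can be illustrated with a minimal Python sketch. It is not the DigiMOF software and does not use the ChemDataExtractor API. The record layout (`MofRecord`), the helper names, and every value shown, including the prices, are hypothetical placeholders chosen only to show the idea of normalizing linker names before comparing them with a supplier list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MofRecord:
    """Hypothetical record layout; field names are illustrative only."""
    name: str
    synthesis_method: Optional[str]
    solvent: Optional[str]
    linker: Optional[str]
    metal: Optional[str]


def normalize(name: str) -> str:
    """Lower-case and keep only letters and digits so spelling variants compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_linker_prices(records: List[MofRecord],
                        price_list: Dict[str, float]) -> Dict[str, float]:
    """Map each MOF name to a price when its linker appears in price_list."""
    lookup = {normalize(linker): price for linker, price in price_list.items()}
    matched = {}
    for record in records:
        if record.linker is None:
            continue  # nothing was extracted for this field
        key = normalize(record.linker)
        if key in lookup:
            matched[record.name] = lookup[key]
    return matched


def search(records: List[MofRecord], **criteria: str) -> List[MofRecord]:
    """Return the records whose fields equal every given criterion."""
    return [r for r in records
            if all(getattr(r, field) == value for field, value in criteria.items())]


# Illustrative records; these values are not taken from the DigiMOF database.
records = [
    MofRecord("MOF-5", "solvothermal", "DMF", "terephthalic acid", "Zn"),
    MofRecord("HKUST-1", "solvothermal", "ethanol", "trimesic acid", "Cu"),
    MofRecord("unnamed-entry", None, None, None, "Co"),
]

# Placeholder prices in arbitrary units; not real supplier data.
placeholder_prices = {"Terephthalic acid": 10.0, "Trimesic-acid": 20.0}

matched = match_linker_prices(records, placeholder_prices)
copper_mofs = search(records, metal="Cu")
print(matched)
print([r.name for r in copper_mofs])
```

Normalizing names before the lookup is a deliberately simple choice. Real chemical names vary far more than in capitalization and hyphenation, so a production workflow would likely need synonym handling or structure-based identifiers.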
CITE THIS COLLECTION
Glasby, Lawson T.; Gubsch, Kristian; Bence, Rosalee; Oktavian, Rama; Isoko, Kesler; Moosavi, Seyed Mohamad; et al. (2023). DigiMOF: A Database of Metal–Organic Framework Synthesis Information Generated via Text Mining. ACS Publications. Collection. https://doi.org/10.1021/acs.chemmater.3c00788
AUTHORS (9)
Lawson T. Glasby
Kristian Gubsch
Rosalee Bence
Rama Oktavian
Kesler Isoko
Seyed Mohamad Moosavi
Joan L. Cordiner
Jason C. Cole
Peyman Z. Moghadam
KEYWORDS
throughput computational approaches; pore limiting diameter; performing efficient identification; neglect descriptors related; largest cavity diameter; known linkers provided; create additional parsers; chemical names assigned; automatically downloaded 43; accessible surface area; determine linker types; additional desirable properties; structured database reveals; cambridge structural database; data enabled us; mofs ), creates; open metal sites; csd mof subset; mof subset; organic linker; cde ),; synthetic properties; specific properties; metal type; metal precursor; mof publications; mof discovery; source database; specific applications; rational design; rapid screening; publicly available; promising materials; one way; mofs focused; materials space; match mofs; machine learning; important chemicals; extracted 15; density calculations; csd entry; critical problem; associated software; although high; 3d mofs