Version 3 2025-02-03, 16:49
Version 2 2024-11-08, 06:06
Version 1 2024-11-08, 05:50
dataset
posted on 2025-02-03, 16:49, authored by Jordan Winetrout, Zilu Li, Qi Zhao, Landon Gaber, Vinu U. Unnikrishnan, Vikas Varshney, Yanxun Xu, Yusu Wang, Hendrik Heinz
<p dir="ltr">This repository contains the full 3D structure database associated with the publication "<b>Prediction of Carbon Nanostructure Mechanical Properties and the Role of Defects Using Machine Learning</b>".</p><p dir="ltr">The dataset (SI_complete_dataset) contains 1179 3D atomic structures of CNT bundles, 958 structures of CNT junctions, and 50 structures of carbon fiber cross-sections. Each structure comes with mechanical properties (e.g., strain at break, Young's modulus, and tensile strength) obtained from its complete stress-strain curve up to failure. The models contain up to 80,000 atoms, and the ground truth data were derived using the reactive INTERFACE force field, IFF-R.</p><p dir="ltr">The database is extensible and can include larger carbon nanostructures with labels, including data from multiple computational and experimental techniques as they become available. The goal is real-time prediction of stress-strain properties of carbon nanostructures of arbitrary 3D configurations.</p><p dir="ltr">The fileshare also contains a second folder (HS-GNN_and_small_test_model.zip) with the hierarchical spatial graph neural network (HS-GNN) and a run script for XGBoost to train and apply the machine learning models for property predictions as described in the publication. The files contain the complete machine learning pipeline and results for a small test (toy) set.</p><p dir="ltr">Third, we share the Supporting Files from the publication, which contain sample run scripts and the force field files to reproduce molecular dynamics simulations of stress-strain curves of the carbon nanostructures up to failure using IFF-R.</p><p dir="ltr">Fourth, we share the pre-processed graphs trained on 90% of the dataset of CNT bundles, which can be used to reproduce the predictions of mechanical properties using HS-GNN.</p>
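<p dir="ltr">As an illustration of how the three labels named above (Young's modulus, tensile strength, and strain at break) relate to a stress-strain curve, the following is a minimal sketch and not the procedure used in the publication. It assumes the curve is available as two equal-length lists of strain (dimensionless) and stress (taken here to be in GPa). The function name <code>summarize_curve</code>, the <code>elastic_limit</code> parameter, the through-origin fit, and the choice of the peak-stress point as the break point are all hypothetical choices made for this example. The curve itself is synthetic and is not taken from the dataset.</p>

```python
def summarize_curve(strain, stress, elastic_limit=0.02):
    """Return Young's modulus, tensile strength, and strain at break.

    Assumptions (not from the publication): the modulus is the
    least-squares slope through the origin over 0 < strain <= elastic_limit,
    and the break point is taken as the point of maximum stress.
    """
    if len(strain) != len(stress) or len(strain) < 2:
        raise ValueError("strain and stress must have equal length >= 2")

    # Points inside the assumed linear-elastic window.
    elastic = [(e, s) for e, s in zip(strain, stress) if 0 < e <= elastic_limit]
    if not elastic:
        raise ValueError("no data points inside the elastic window")

    # Least-squares fit of stress = E * strain (line through the origin).
    numerator = sum(e * s for e, s in elastic)
    denominator = sum(e * e for e, _ in elastic)
    youngs_modulus = numerator / denominator

    # Index of the highest stress value; ties resolve to the first one.
    peak_index = max(range(len(stress)), key=stress.__getitem__)

    return {
        "youngs_modulus": youngs_modulus,          # same unit as stress
        "tensile_strength": stress[peak_index],    # maximum stress
        "strain_at_break": strain[peak_index],     # strain at maximum stress
    }


# Synthetic, idealized brittle curve: linear up to 10% strain, then failure.
demo_strain = [i * 0.01 for i in range(13)]
demo_stress = [1000.0 * e if i <= 10 else 0.0 for i, e in enumerate(demo_strain)]

result = summarize_curve(demo_strain, demo_stress, elastic_limit=0.025)
print(result)
```

<p dir="ltr">Real curves from molecular dynamics are noisy, so in practice the elastic window and the failure criterion would need to be chosen to match the definitions used in the publication.</p>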