Improving deep representation learning for crystal structures by learning and hybridizing human-designed descriptors

Version 5 2023-08-31, 20:08
Version 4 2023-07-01, 23:14
Version 3 2022-12-11, 23:53
Version 2 2022-08-09, 04:13
Version 1 2022-04-26, 04:23
journal contribution
posted on 2023-08-31, 20:08, authored by Sheng Gong

Update: on 07/11/2023, we uploaded an extended version that adds Matformer and MEGNet as two more models studied in this work.


--------------------------------------------------------------------------------------------------------------------------------------------------------


Update: on 12/11/2022, we uploaded a corrected version of the datasets and models. In the previous version, the models and datasets for internal energy, Cv, and poly_electronic from dealignn were wrong.


______________________________________________



This folder contains all the datasets and trained models for the paper entitled "Examining graph neural networks for crystal structures: limitations and opportunities for capturing periodicity" (https://arxiv.org/abs/2208.05039).


Please note that, because the paper is more about insights into existing GNN methods than about completely new code, we provide revised versions of CGCNN and ALIGNN in this repository. Anyone interested in the revised code should download it from here and run each version in the same environment as the original CGCNN or ALIGNN.


CGCNN: https://github.com/txie-93/cgcnn


ALIGNN: https://github.com/usnistgov/alignn


decgcnn: contains the revised CGCNN code for de-CGCNN.

         Please specify the feature list at the beginning of main.py and predict.py, and enter the path of the descriptors file in both scripts. "main.py" and "predict.py" are used in the same way as in the original CGCNN.


dealignn: contains the revised ALIGNN code for de-ALIGNN.

          Please specify the feature list and the path of the descriptors file at the beginning of fine-tuning.py.

          Also, please put the script "degraph.py" in "/anaconda3/envs/ALIGNN/lib/python3.8/site-packages/jarvis/core" or the equivalent path in your environment.
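Because the site-packages path differs between machines, the location of jarvis/core can be looked up from inside the ALIGNN environment rather than typed by hand. The following is a minimal sketch, not part of the released code: the helper names package_dir and install_script are hypothetical, and it assumes that "jarvis.core" is importable in the active environment and that degraph.py is in the current directory when install_script is called.

```python
import importlib.util
import shutil
from pathlib import Path


def package_dir(name):
    """Return the directory of an installed package, e.g. 'jarvis.core'."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. 'jarvis') is missing.
        spec = None
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"package {name!r} not found in this environment")
    return Path(list(spec.submodule_search_locations)[0])


def install_script(script, package="jarvis.core"):
    """Copy a script (hypothetical helper) into the given package directory."""
    target = package_dir(package) / Path(script).name
    shutil.copy2(script, target)
    return target
```

With the ALIGNN environment activated, calling install_script("degraph.py") would copy the file into that environment's jarvis/core directory and return the destination path.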


datasets: contains all the datasets and test results.

          descriptors_mp: all descriptors of all structures in the MP database

          learning_descriptors: dataset for learning descriptors reported in Figure 2.

          mp_selected_prop.csv: dataset of all properties of all structures in the MP database

          all_kappa: all kappa from TEDesignLab database

          test_(model_name)_(prop_name/descriptor_name).csv: test set and prediction result of (prop_name/descriptor_name) from (model_name)

          1d: contains the datasets of 1d chains in Figure 3

                sample_short/long: structures of the short/long chains

                test_results_(default/nconv/neigh)_(short/long).csv: test results for default/more conv. layers/more neighbors CGCNN for the short/long chains.


trained_models: contains all the trained models for property predictions in Figure 4.

                (model_name)_(prop_name).pt: trained model of (prop_name) from (model_name)
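The two naming patterns above, test_(model_name)_(prop_name/descriptor_name).csv and (model_name)_(prop_name).pt, can be split programmatically when iterating over many files. The sketch below is an illustration only: the function names are hypothetical, the example file names in the test are made up to match the pattern, and it assumes that model names contain no underscore (property names such as poly_electronic may). It does not apply to the test_results_... files in the 1d folder.

```python
from pathlib import Path


def parse_test_csv(filename):
    """Split 'test_<model>_<target>.csv' into (model, target)."""
    path = Path(filename)
    # Assumption: the model name has no underscore, so the first two
    # underscores separate 'test', the model, and the target.
    prefix, model, target = path.stem.split("_", 2)
    if prefix != "test" or path.suffix != ".csv":
        raise ValueError(f"not a test CSV name: {filename!r}")
    return model, target


def parse_model_file(filename):
    """Split '<model>_<prop>.pt' into (model, prop)."""
    path = Path(filename)
    if path.suffix != ".pt":
        raise ValueError(f"not a .pt model file: {filename!r}")
    model, prop = path.stem.split("_", 1)
    return model, prop
```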



Please note that all MP properties in test_XX.csv and all predictions from XX.pt are normalized. The normalizer can be found at datasets/normalizer.npy. Normalization is computed as (property - median of property) / (90th percentile - 10th percentile).
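To convert normalized values back to physical units, the formula above can be inverted. The sketch below is a minimal illustration with NumPy, not the authors' code: the function names are hypothetical, and the median and percentiles are passed in as plain numbers because the internal layout of normalizer.npy is not described here.

```python
import numpy as np


def normalize(values, median, p10, p90):
    """Apply (property - median) / (90th percentile - 10th percentile)."""
    values = np.asarray(values, dtype=float)
    return (values - median) / (p90 - p10)


def denormalize(scaled, median, p10, p90):
    """Invert normalize(): recover the property in its original units."""
    scaled = np.asarray(scaled, dtype=float)
    return scaled * (p90 - p10) + median


def fit_stats(values):
    """Compute (median, 10th percentile, 90th percentile) of a sample."""
    values = np.asarray(values, dtype=float)
    p10, p90 = np.percentile(values, [10, 90])
    return float(np.median(values)), float(p10), float(p90)
```

For the released files, the statistics should be taken from datasets/normalizer.npy rather than recomputed with fit_stats, since the sample they were computed on is not stated here.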

Funding

Toyota Research Institute
