Materials Project Time Split Data

Version 4 2022-06-04, 20:22
Version 3 2022-06-04, 03:32
Version 2 2022-06-04, 02:47
Version 1 2022-06-04, 02:43
dataset
posted on 2022-06-04, 20:22 authored by Sterling G. Baird, Taylor Sparks

Full and dummy snapshots (2022-06-04) of the data for mp-time-split, retrieved via the new Materials Project API and encoded via matminer convenience functions. The dataset is restricted to experimentally verified compounds with no more than 52 sites. No other filtering criteria were applied. The snapshots were developed for sparks-baird/mp-time-split as a benchmark dataset for materials generative modeling. Compressed versions of the files (.gz) are also available.

dtypes

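```python
from pprint import pprint

from matminer.utils.io import load_dataframe_from_json

# Path to one of the JSON snapshot files
filepath = "insert/path/to/file/here.json"

# Load the snapshot into a pandas DataFrame
expt_df = load_dataframe_from_json(filepath)

# Print the Python type of each column's value in the first row
pprint(expt_df.iloc[0].apply(type).to_dict())
```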

The keys of the printed dictionary are the DataFrame columns: `discovery`, `energy_above_hull`, `formation_energy_per_atom`, `material_id`, `references`, `structure`, `theoretical`, `year`.

index/mpids

The DataFrame index is the numeric part of each `material_id` (just the number for the index). Note that for `material_id`-s that begin with "mvc-", the "mvc" is dropped and the hyphen is kept as a minus sign. This distinguishes "mvc-" entries from "mp-" entries while still allowing the index to be sorted. E.g. `mvc-001` -> -1.


 {146: MPID(mp-146), 925: MPID(mp-925), 1282: MPID(mp-1282), 1335: MPID(mp-1335), 12778: MPID(mp-12778), 2540: MPID(mp-2540), 316: MPID(mp-316), 1395: MPID(mp-1395), 2678: MPID(mp-2678), 1281: MPID(mp-1281), 1251: MPID(mp-1251)} 
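The index convention described above can be sketched in a few lines of plain Python. This is only an illustration of the stated rule, not the code used to build the snapshots: the function name `mpid_to_index` is hypothetical, and it assumes that every identifier has the form `mp-<number>` or `mvc-<number>`.

```python
def mpid_to_index(material_id: str) -> int:
    """Map a Materials Project ID string to an integer index.

    Hypothetical helper illustrating the convention described above:
    "mp-146" -> 146 and "mvc-001" -> -1.
    """
    # Split "mp-146" into prefix "mp" and number "146"
    prefix, sep, number = material_id.partition("-")
    if not sep or not number.isdigit():
        raise ValueError(f"Unrecognized material_id: {material_id!r}")
    value = int(number)
    if prefix == "mp":
        return value
    if prefix == "mvc":
        # Drop "mvc" but keep the hyphen as a minus sign
        return -value
    raise ValueError(f"Unrecognized material_id prefix: {prefix!r}")


# Sorting by the integer index places "mvc-" entries before "mp-" entries
ids = ["mp-1282", "mvc-001", "mp-146"]
sorted_ids = sorted(ids, key=mpid_to_index)
```

Because "mvc-" entries map to negative integers and "mp-" entries to positive ones, the two kinds of identifier stay distinguishable in the index without needing a string prefix.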

Funding

CAREER: SusChEM: Data Mining to Reduce the Risk in Discovering New Sustainable Thermoelectric Materials

Directorate for Mathematical & Physical Sciences

