
PubMed classification v3 202202

dataset
posted on 2022-09-16, 09:27, authored by Peter Sjögårde

For each level there is a table of cluster labels (e.g. labels_lev1_[date].csv), linked to the clusters by an ID (e.g. lev1_cluster_id).

The February 2022 update is a completely new version of the classification, based on new clustering and labeling.


A visualization of the clusters is available on GitHub.


PMID_cluster_relation_[date].csv contains the relation between PMIDs and clusters. Four levels are included:

Level 1 - Topics - Most granular

Level 2 - Specialties

Level 3 - Disciplines

Level 4 - Discipline groups - Most coarse
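
As an illustration of how the relation file and a label table fit together, the sketch below joins PMIDs to Level 1 labels on lev1_cluster_id. It is a minimal example under assumptions, not a description of the actual file layout: the column names "pmid" and "label" are hypothetical, and the rows are made-up toy data standing in for PMID_cluster_relation_[date].csv and labels_lev1_[date].csv. Only the identifier lev1_cluster_id is taken from the description above.

```python
import csv
import io

# Toy stand-ins for the real CSV files. All values are invented, and the
# column names "pmid" and "label" are assumptions made for illustration.
relation_csv = io.StringIO(
    "pmid,lev1_cluster_id\n"
    "1001,7\n"
    "1002,7\n"
    "1003,9\n"
)
labels_csv = io.StringIO(
    "lev1_cluster_id,label\n"
    "7,example topic A\n"
    "9,example topic B\n"
)


def label_pmids(relation_file, labels_file, id_column="lev1_cluster_id"):
    """Map each PMID to the label of its cluster at one level."""
    # Build a lookup from cluster ID to label.
    labels = {row[id_column]: row["label"] for row in csv.DictReader(labels_file)}
    # Resolve each PMID's cluster ID through the lookup (None if unlabeled).
    return {
        row["pmid"]: labels.get(row[id_column])
        for row in csv.DictReader(relation_file)
    }


pmid_to_label = label_pmids(relation_csv, labels_csv)
```

With the real files, the StringIO objects would be replaced by opened file handles, and id_column would be changed to the corresponding ID column to look up labels at another level, assuming the other levels follow the same naming pattern.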


Added_pubs_version_[nih occ version].csv contains the same information as above, but for publications that were assigned to clusters after the initial clustering procedure.
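
If the added-publications file shares the columns of the main relation file, the two can be read as one table. The sketch below assumes exactly that; the column name "pmid" and all rows are hypothetical toy data, not taken from the dataset.

```python
import csv
import io
from itertools import chain

# Toy stand-ins for PMID_cluster_relation_[date].csv and
# Added_pubs_version_[nih occ version].csv. Values and the "pmid"
# column name are invented; identical headers are an assumption.
main_csv = io.StringIO("pmid,lev1_cluster_id\n1001,7\n1002,9\n")
added_csv = io.StringIO("pmid,lev1_cluster_id\n2001,7\n")


def combined_rows(*files):
    """Read several CSV files with the same header into one list of rows."""
    return list(chain.from_iterable(csv.DictReader(f) for f in files))


rows = combined_rows(main_csv, added_csv)
```

Keeping the two sources as separate inputs makes it easy to tell later which publications came from the initial clustering and which were added afterwards.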


See the figshare collection for further description.
