Artificial Intelligence and the Fight Against COVID-19
Dataset posted on 23.06.2020 by Juan Mateos-Garcia, Joel Klinger, Konstantinos Stathoulopoulos
Datasets analysed in a paper mapping AI research activity in the fight against COVID-19. The collection includes:
-rxiv metadata: A dataset with metadata about 1.8 million papers from arXiv, bioRxiv and medRxiv as of the end of May 2020, enriched with dummy variables indicating whether each paper is related to AI and/or COVID-19 research (updated 22/06/2020 to fix some ids).
-rxi_geo: A dataset with geographical metadata for papers based on the institutional affiliations of their authors after matching with the GRID database.
-covid_semantic: A dataset with topic information about COVID-19 papers based on a semantic analysis of their abstracts, including the clusters to which papers have been assigned and their topic mixes (updated 22/06/2020 to fix some ids).
-citation_metadata: Two JSON objects. One contains a lookup between COVID-19 related papers in the rXiv corpus and the papers they cite. The other contains metadata about the cited papers, including their fields of study.
-mag_fos: A dataset with the Microsoft Academic Graph field of study hierarchy we use in our analysis (added 22 June 2020)
Each zipped folder includes a data dictionary.
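As a rough illustration of how the two citation_metadata JSON objects might be combined, the sketch below counts the fields of study of the papers cited by COVID-19 papers. The data is a toy stand-in and every key and field name (paper_a, cited_1, fields_of_study) is hypothetical, as is the helper count_cited_fields; the data dictionary in the zipped folder gives the real structure.

```python
import json
from collections import Counter

# Toy stand-ins for the two JSON objects. All ids and field names are
# hypothetical; consult the data dictionary for the actual schema.
citation_lookup_json = """{
  "paper_a": ["cited_1", "cited_2"],
  "paper_b": ["cited_2", "cited_3"]
}"""
cited_metadata_json = """{
  "cited_1": {"title": "Example 1", "fields_of_study": ["Virology"]},
  "cited_2": {"title": "Example 2",
              "fields_of_study": ["Machine learning", "Virology"]}
}"""


def count_cited_fields(lookup, metadata):
    """Count how often each field of study appears among cited papers.

    A cited paper is counted once per citing paper, so a heavily cited
    paper contributes more to the totals.
    """
    counts = Counter()
    for cited_ids in lookup.values():
        for cited_id in cited_ids:
            record = metadata.get(cited_id)
            if record is None:
                continue  # cited paper has no metadata entry; skip it
            counts.update(record.get("fields_of_study", []))
    return counts


lookup = json.loads(citation_lookup_json)
metadata = json.loads(cited_metadata_json)
field_counts = count_cited_fields(lookup, metadata)
print(field_counts.most_common())
```

With the real files, the two json.loads calls would be replaced by json.load on the opened files. Skipping cited ids that lack a metadata entry is a choice made for this sketch, not a documented property of the dataset.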
For information about data processing and analysis, see: https://github.com/nestauk/ai_covid_19