
File(s) stored somewhere else

Please note: Linked content is NOT stored on Figshare and we can't guarantee its availability, quality, security or accept any liability.

DECM Machine Ready Corpus

Version 3 2022-12-08, 15:46
Version 2 2020-05-25, 15:15
Version 1 2020-05-25, 12:33
dataset
posted on 2020-05-25, 15:15, authored by Patricia Murrieta-Flores, Diego Jiménez-Badillo, Bruno Emanuel da Graça Martins, Mariana Favila-Vázquez, Raquel Liceras-Garrido

The DECM Corpus is a digital corpus of the texts of the Relaciones Geográficas de Nueva España (the Geographic Reports of New Spain). It is released in several versions: a machine-ready version, a gold-standard annotated dataset, and an automatically annotated version ready for text mining and machine learning experiments.

This is the DECM Machine Ready Corpus. This version includes text-only files (.txt) containing each of the 10 volumes originally edited by René Acuña, the 2 volumes edited by Mercedes de la Garza, the Suma de Visita edited by Del Paso y Troncoso, a file with the original text of the Crown mandate (Instrucción), and metadata for this collection. It contains only the original text of each of the Relaciones Geográficas (RGs) as transcribed by the scholars, excluding any editorial notes, commentary, or historical work. It can therefore be used directly for corpus linguistics analyses, visualisations, etc.
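As an illustration of the kind of direct use described above, the following is a minimal Python sketch that counts word frequencies across the plain-text files. It is not part of the dataset or the project's tooling. The directory name `decm_machine_ready`, the function name `token_counts`, and the assumption that the files are UTF-8 encoded are all hypothetical and should be adjusted to the actual download.

```python
from collections import Counter
from pathlib import Path
import re

# Hypothetical folder holding the unpacked .txt files; change to your own path.
CORPUS_DIR = Path("decm_machine_ready")

# Runs of Unicode letters only, so accented Spanish words stay whole
# and digits, underscores and punctuation are skipped.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def token_counts(corpus_dir):
    """Return lower-cased word counts over every .txt file in corpus_dir."""
    counts = Counter()
    for path in sorted(Path(corpus_dir).glob("*.txt")):
        # UTF-8 is an assumption; undecodable bytes are replaced, not fatal.
        text = path.read_text(encoding="utf-8", errors="replace")
        counts.update(token.lower() for token in TOKEN_RE.findall(text))
    return counts


if __name__ == "__main__":
    if CORPUS_DIR.is_dir():
        # Print the 20 most frequent word forms as tab-separated lines.
        for word, n in token_counts(CORPUS_DIR).most_common(20):
            print(f"{word}\t{n}")
```

The sketch treats the whole collection as one bag of words. Counting per file instead (for example, per volume) would only require keeping one `Counter` per path.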

Funding

Digging into Early Colonial Mexico Project. T-AP Digging into Data Call. ESRC: ES/R003890/1
