figshare
TableReferee2.pdf (37.88 kB)

A corpus of 42 books from European languages embracing four families.

Version 2 2017-09-22, 02:12
Version 1 2017-09-22, 01:56
dataset
posted on 2017-09-22, 02:12 authored by Candelario Hernández Gómez, Rogelio Basurto-Flores, Lev Guzmanv
A corpus of 42 books, three for each of 14 European languages, taken from www.gutenberg.org. The titles of the works and the names of the authors are written in the romanized form given on that site. The texts were chosen for no other reason than to be representative of each language, avoiding repetitive texts such as poetry as far as possible.
