Diorisis.zip (185.44 MB)
The Diorisis Ancient Greek Corpus
Dataset posted on 2018-05-02, 16:40, authored by Alessandro Vatri and Barbara McGillivray
An annotated corpus of literary Ancient Greek sourced from the Perseus Canonical Greek Lit repository (https://github.com/PerseusDL/canonical-greekLit), “The Little Sailing” digital library (http://www.mikrosapoplous.gr/en/texts1en.html), and the Bibliotheca Augustana digital library (http://www.hs-augsburg.de/~harsch/augustana.html#gr).
The corpus consists of 820 texts, spanning from the beginnings of the Ancient Greek literary tradition (Homer) to the fifth century AD, and contains 10,206,421 words.
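
As a first sanity check after downloading, it may help to count the files in the archive and compare the total with the 820 texts stated above. The following is a minimal sketch, not part of the dataset itself: it assumes Diorisis.zip has been saved to the working directory, and the helper name summarize_archive is hypothetical. It makes no assumption about the internal layout or file format of the archive; it only lists what is there.

```python
import os
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(path):
    """Return (number of files, Counter of file extensions) for a zip archive.

    Hypothetical helper for illustration; it only inspects member names
    and does not extract or parse any corpus file.
    """
    with zipfile.ZipFile(path) as archive:
        # Skip directory entries so that only actual files are counted.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    # Group files by lowercase extension; files without one go under "(none)".
    extensions = Counter(
        PurePosixPath(name).suffix.lower() or "(none)" for name in names
    )
    return len(names), extensions


if __name__ == "__main__":
    # Assumption: the archive was downloaded manually to this location.
    archive_path = "Diorisis.zip"
    if os.path.exists(archive_path):
        file_count, extensions = summarize_archive(archive_path)
        print(f"{file_count} files in {archive_path}")
        for extension, count in extensions.most_common():
            print(f"{extension}: {count}")
    else:
        print(f"{archive_path} not found; download it first.")
```

If the file count differs from 820, that does not necessarily indicate a problem: the archive may include documentation or auxiliary files alongside the texts.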