3 files

NLPL Faroese/Danish Parallel Corpus

dataset
posted on 28.05.2020, 16:54 by Leon Derczynski
Danish-Faroese parallel text: 5000 sentence pairs.

Data taken from:
1. Europarl, so that these sentences are also parallel with all other Europarl languages
2. Tatoeba, for informal-register and conversational text
3. Dimma, to capture Faroese cultural items and names (excerpts used under ophavsretsloven, the Danish Copyright Act)

To acknowledge this dataset, cite:
Leon Derczynski, "NLPL Faroese/Danish Parallel Corpus" (2020). doi:10.6084/m9.figshare.12384047

Funding

NeIC NordForsk NLPL
