3 files

NLPL Faroese/Danish Parallel Corpus

Version 2 2020-05-28, 16:54
Version 1 2020-05-28, 16:50
dataset
posted on 2020-05-28, 16:54 authored by Leon Derczynski
Danish:Faroese parallel text; 5000 sentence pairs.

Data taken from:
1. Europarl, so that the text is parallel with all Europarl languages
2. Tatoeba, for informal-register and conversational text
3. Dimma, to capture Faroese cultural items and names (excerpts used under ophavsretsloven, the Danish Copyright Act)

Acknowledge:
Leon Derczynski, "NLPL Faroese/Danish Parallel Corpus" (2020). doi:10.6084/m9.figshare.12384047

Funding

NeIC NordForsk NLPL
