Development of Word Embeddings for Uzbek Language

preprint
posted on 2021-01-13, 02:13 authored by B. Mansurov, A. Mansurov
In this paper, we describe the process of developing word embeddings for the Cyrillic variant of the Uzbek language. The result of our work is the first publicly available set of word vectors trained with the word2vec, GloVe, and fastText algorithms on a high-quality web crawl corpus developed in-house. The resulting word embeddings can be used in many downstream natural language processing tasks.
