
UzBERT: pretraining a BERT model for Uzbek

Preprint posted on 2021-08-22, 18:40, authored by B. Mansurov and A. Mansurov.
Pretrained language models based on the Transformer architecture have achieved state-of-the-art results in various natural language processing tasks such as part-of-speech tagging, named entity recognition, and question answering. However, no such monolingual model for the Uzbek language is publicly available. In this paper, we introduce UzBERT, a pretrained Uzbek language model based on the BERT architecture. Our model greatly outperforms multilingual BERT on masked language model accuracy. We make the model publicly available under the MIT open-source license.
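Because the released model follows the standard BERT masked-language-model interface, it can be queried with the Hugging Face Transformers fill-mask pipeline. The sketch below is illustrative only: the model identifier and the example sentence are assumptions, not details taken from the abstract.

    from transformers import pipeline

    # Load the pretrained Uzbek BERT model for masked-token prediction.
    # The repository name below is an assumed, illustrative identifier.
    fill_mask = pipeline("fill-mask", model="coppercitylabs/uzbert-base-uncased")

    # Ask the model to fill in the masked token of an Uzbek sentence
    # ("Tashkent is the [MASK] of Uzbekistan"); the sentence is illustrative.
    predictions = fill_mask("Toshkent O'zbekistonning [MASK] hisoblanadi.")

    for p in predictions:
        # Each prediction carries the proposed token and its probability score.
        print(p["token_str"], round(p["score"], 4))

Masked-language-model accuracy, the metric the abstract reports, measures how often the model's top-ranked prediction for a masked token matches the original token.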
