figshare
2 files
Baza za repozitorij.xlsx (540.17 kB)
1 9 Baza za FigShare - Copy.xlsx (556.88 kB)

The database of English words and their Croatian equivalents

Version 2 2022-11-04, 08:40
Version 1 2022-06-07, 11:27
dataset
posted on 2022-11-04, 08:40 authored by Irena Bogunović, Jasmina Jelčić Čolakovac, Mirjana Borucinsky

The current database contains English words which appear in Croatian in their original, unadapted form (e.g. show, boxer, zombie, skin). The list of words is based on The Database of English words in Croatian (Bogunović & Kučić 2022; https://repository.pfri.uniri.hr/islandora/object/pfri:2495), and was complemented with words obtained from the corpus hrWaC (Ljubešić & Erjavec 2011; Ljubešić & Klubička 2014) using the platform SketchEngine (Kilgarriff et al. 2004). The same platform was used to check the list of English words against the corpora ENGRI (Bogunović et al. 2021; Bogunović & Kučić 2021) and hrWaC by consulting concordances and using CQL. The tagger Xf was used to filter out all English sentences embedded in Croatian texts. Corpus results were then manually checked using the random sample and filter tools to remove items such as proper nouns, false cognates, and false pairs. For each English word, the database also lists its Croatian equivalents, where they exist, together with their frequencies in the corpora. The choice of the Croatian equivalent depended greatly on the available corpus data on word frequency and on Croatian online dictionaries. Single-word and multi-word English expressions are listed separately in the database for visual clarity and to simplify word search. The database is by no means a final product or a definitive account of English words in Croatian, but it is representative of their current status in the Croatian language. Further efforts will be made to update the database and incorporate new data.
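The separation of single-word and multi-word expressions described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `split_by_word_count` is hypothetical, and the two multi-word items in the sample list are invented for illustration rather than taken from the database.

```python
from typing import Iterable, List, Tuple


def split_by_word_count(expressions: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition expressions into single-word and multi-word lists.

    Hypothetical helper: an expression counts as multi-word if it still
    contains a space after whitespace normalisation.
    """
    single: List[str] = []
    multi: List[str] = []
    for expr in expressions:
        # Collapse runs of whitespace and trim the ends.
        cleaned = " ".join(expr.split())
        if not cleaned:
            continue  # skip empty entries
        if " " in cleaned:
            multi.append(cleaned)
        else:
            single.append(cleaned)
    return single, multi


# The first four items are the examples given in the description;
# the last two are assumed multi-word items added for illustration only.
candidates = ["show", "boxer", "zombie", "skin", "fast food", "know  how"]
single_words, multi_words = split_by_word_count(candidates)
print(single_words)
print(multi_words)
```

Whitespace is normalised before the check so that stray double spaces in a spreadsheet cell do not affect which list an expression falls into.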

Funding

English words in Croatian: Identification, affective-semantic norming and investigation into cognitive processing via behavioural and neuroscientific methods

Croatian Science Foundation


History