motion-charts-futurate.zip (136.28 kB)
R codes and dataset for Visualisation of Diachronic Constructional Change using Motion Chart
Version 2 2019-03-10, 06:18
Version 1 2019-03-10, 03:22
dataset
posted on 2019-03-10, 06:18 authored by Gede Primahadi Wijaya Rajeg
Publication
Primahadi Wijaya R., Gede. 2014. Visualisation of diachronic constructional change using Motion Chart. In Zane Goebel, J. Herudjati Purwoko, Suharno, M. Suryadi & Yusuf Al Aried (eds.). Proceedings: International Seminar on Language Maintenance and Shift IV (LAMAS IV), 267-270. Semarang: Universitas Diponegoro. doi: https://doi.org/10.4225/03/58f5c23dd8387
Description of R codes and data files in the repository
This repository is imported from its GitHub repo, and the versioning of this figshare record follows the GitHub repo's Releases. Check the Releases page for updates; the next version is planned to include a unified version of the codes from the first release, rewritten with the tidyverse.
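For orientation, the data preparation described in this section (combining the two raw frequency files into one table, then normalising the frequencies per million words) might look roughly like the following base-R sketch. It is not the repository's code: the wide file layout, the toy counts, the corpus sizes, the helper names to_long and per_million, and the column names be_going_to, size, will_pmw and be_going_to_pmw are all assumptions made for illustration.

```r
# Toy stand-ins for will_INF.txt and go_INF.txt.
# ASSUMPTION: one row per collocate and one column per decade;
# the layout of the real files may differ.
will_wide <- data.frame(coll = c("be", "have"),
                        `1810` = c(120, 40),
                        `1820` = c(300, 90),
                        check.names = FALSE, stringsAsFactors = FALSE)
go_wide <- data.frame(coll = c("be", "have"),
                      `1810` = c(2, 1),
                      `1820` = c(9, 3),
                      check.names = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: stack the decade columns of a wide table into rows.
to_long <- function(wide, value_name) {
  decades <- setdiff(names(wide), "coll")
  long <- data.frame(
    decade = rep(decades, each = nrow(wide)),
    coll = rep(wide$coll, times = length(decades)),
    freq = unlist(wide[decades], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  names(long)[names(long) == "freq"] <- value_name
  long
}

# One row per decade and collocate, with one frequency column per construction.
# The description calls the first frequency column "BE going to"; a syntactic
# name is used here for convenience.
input_raw <- merge(to_long(go_wide, "be_going_to"),
                   to_long(will_wide, "will"),
                   by = c("decade", "coll"))

# ASSUMPTION: coha_size.txt gives the number of words per decade.
# These sizes are invented for illustration.
coha_size <- data.frame(decade = c("1810", "1820"),
                        size = c(1e6, 5e6),
                        stringsAsFactors = FALSE)

# Normalise a raw frequency to a rate per million words.
per_million <- function(freq, corpus_size) freq / corpus_size * 1e6

futurate <- merge(input_raw, coha_size, by = "decade")
futurate$will_pmw <- per_million(futurate$will, futurate$size)
futurate$be_going_to_pmw <- per_million(futurate$be_going_to, futurate$size)

print(futurate)
```

In the repository itself, the corresponding steps are carried out by 1-script-create-input-data-raw.r and 2-script-create-motion-chart-input-data.R, which read the actual data files rather than toy values.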
The raw input data consists of two files, will_INF.txt and go_INF.txt. They contain the co-occurrence frequencies of the top-200 infinitival collocates of will and be going to, respectively, across the twenty decades of the Corpus of Historical American English (COHA), from the 1810s to the 2000s.
These two input files are used in the R script 1-script-create-input-data-raw.r. The script preprocesses and combines the two files into a long-format data frame with the following columns: (i) decade, (ii) coll (for "collocate"), (iii) BE going to (the frequency of the collocate with be going to) and (iv) will (the frequency of the collocate with will). The resulting data frame is available in input_data_raw.txt.
Then, the script 2-script-create-motion-chart-input-data.R processes input_data_raw.txt to normalise the co-occurrence frequencies of the collocates per million words (the COHA size and the normalising base frequency are available in coha_size.txt). The output of this second script is input_data_futurate.txt.
Next, input_data_futurate.txt provides the input data for generating (i) the static motion chart shown as an image plot in the publication (using the script 3-script-create-motion-chart-plot.R) and (ii) the dynamic motion chart (using the script 4-script-motion-chart-dynamic.R).
The repository adopts the project-oriented workflow in RStudio; double-click on the Future Constructions.Rproj file to open an RStudio session whose working directory is the folder holding the contents of this repository.
Funding
English Department, Faculty of Arts, Udayana University, Indonesia
Categories
- Programming languages
- Digital heritage
- Data communications
- English language
- Language studies not elsewhere classified
- Historical, comparative and typological linguistics
- Natural language processing
- Linguistic structures (incl. phonology, morphology and syntax)
- Computational linguistics
- Linguistics not elsewhere classified
Keywords
- linguistic motion charts
- motion charts
- GoogleVis
- constructional change
- diachronic corpus linguistics
- data visualisations
- data visualizations
- data visualisation
- Corpus of Historical American English
- COHA
- R programming language
- reshape2
- English future constructions
- collocational change
- Programming Languages
- Digital Humanities
- Data Communications
- English Language
- Language Studies not elsewhere classified
- Language in Time and Space (incl. Historical Linguistics, Dialectology)
- Natural Language Processing
- Linguistic Structures (incl. Grammar, Phonology, Lexicon, Semantics)
- Computational Linguistics
- Linguistics not elsewhere classified
- Linguistics