Kaye-J-Dissertation-nocover.pdf (19.1 MB)

An Evaluation of the Research Potential of Geo-indexed Internet Archive Data 1996-2010 (MSc Dissertation)

journal contribution
posted on 21.10.2013 by John Kaye

2013 MSc Dissertation for Birkbeck Geographic Information Science

This study uses the Geoindex JISC UK Web Domain Dataset (1996-2010), a 61 GB text-based dataset containing around 700,000,000 instances of postcodes found in the HTML of archive.org’s .uk domain collection. This data opens up the possibility of using the archive as a geographic dataset in its own right. The study evaluates the use and value of the archive as a dataset for researchers by processing and examining the data at various levels of aggregation and across different geographic areas. It evaluates data quality and provides summaries of the dataset, analysis examples, some likely research use cases, and recommendations for future work with this dataset.

Derived datasets are also published under my name on figshare.

