#DH2016 Sun-Tues Figshare.xlsx (833.36 kB)

An Archive of #DH2016 Tweets Published from Sunday 10 to Tuesday 12 July 2016 GMT

posted on 14.07.2016, 08:07 by Ernesto Priego

The Digital Humanities 2016 conference is taking/took place in Kraków, Poland, between Monday 11 July and Saturday 16 July 2016. #DH2016 is/was the official conference hashtag.

What This Output Is

This is an Excel spreadsheet file with three sheets containing a total of 3478 Tweets publicly published with the hashtag #DH2016.

The archive starts with a Tweet published on Sunday July 10 2016 00:03:41 +0000 and finishes with a Tweet published on Tuesday July 12 2016 23:55:47 +0000.
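
As a rough illustration of the period covered, the two timestamps above can be parsed and compared. This is a minimal sketch: the format string matches the way the timestamps are written in this description (the raw export may lay them out differently), and the fixed UTC+2 offset for Kraków in July 2016 is an assumption rather than something taken from the file.

```python
from datetime import datetime, timedelta, timezone

# First and last timestamps as quoted in the description above.
FIRST = "Sunday July 10 2016 00:03:41 +0000"
LAST = "Tuesday July 12 2016 23:55:47 +0000"

# Format matching the description's wording; adjust if the export differs.
FMT = "%A %B %d %Y %H:%M:%S %z"

# Assumption: conference local time was Central European Summer Time (UTC+2).
CEST = timezone(timedelta(hours=2), name="CEST")


def parse(stamp: str) -> datetime:
    """Parse a timestamp written as in the description into an aware datetime."""
    return datetime.strptime(stamp, FMT)


first, last = parse(FIRST), parse(LAST)
span = last - first  # length of the period covered by the archive

print("Archive span:", span)
print("First Tweet, local time:", first.astimezone(CEST).isoformat())
print("Last Tweet, local time:", last.astimezone(CEST).isoformat())
```

Under the UTC+2 assumption the last Tweet falls just after midnight on 13 July in local time, so which day a Tweet belongs to depends on the time zone used for the split.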

The original collection has been organised into conference days; one sheet per day (GMT and Central European Times included). A breakdown of Tweets per day:

Sunday 10 July 2016: 179 Tweets
Monday 11 July 2016: 981 Tweets
Tuesday 12 July 2016: 2318 Tweets
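
The breakdown above can be checked against the stated total with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and nothing here reads the spreadsheet itself.

```python
# Per-day Tweet counts as listed in the breakdown above.
tweets_per_day = {
    "Sunday 10 July 2016": 179,
    "Monday 11 July 2016": 981,
    "Tuesday 12 July 2016": 2318,
}

# Total stated in the description of the file.
STATED_TOTAL = 3478

total = sum(tweets_per_day.values())
print("Sum of daily counts:", total)
print("Matches stated total:", total == STATED_TOTAL)

# Share of the archive contributed by each day.
shares = {day: count / total for day, count in tweets_per_day.items()}
for day, share in shares.items():
    print(f"{day}: {share:.1%}")
```

The same check could be repeated on the row counts of each sheet after any further cleaning, to see how much refining changes the daily totals.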

Methodology and Limitations

The Tweets contained in this file were collected by Ernesto Priego using Martin Hawksey's TAGS 6.0.
Only users with at least 1 follower were included in the archive. Retweets have been included (Retweets count as Tweets). The collection spreadsheet was customised to reflect the time zone and geographical location of the conference.

The profile_image_url and entities_str metadata were removed before public sharing in this archive.

Please bear in mind that the conference hashtag has been spammed, so some Tweets collected may be from spam accounts. Some automated refining has been performed to remove Tweets not related to the conference, but the data is likely to require further refining and deduplication.
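
One possible first step in the further deduplication mentioned above is to keep a single row per Tweet identifier. The sketch below is not the refining that was applied to this archive: it assumes each row has been read into a dictionary and that the identifier column is named `id_str`. That column name, `from_user` and the sample rows are all assumptions made for illustration.

```python
from typing import Iterable


def deduplicate(rows: Iterable[dict], key: str = "id_str") -> list[dict]:
    """Keep the first row seen for each Tweet id, preserving the original order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for row in rows:
        tweet_id = str(row[key])
        if tweet_id not in seen:
            seen.add(tweet_id)
            unique.append(row)
    return unique


# Made-up rows for illustration only; they are not taken from the archive.
sample = [
    {"id_str": "1", "from_user": "user_a", "text": "Opening keynote #DH2016"},
    {"id_str": "2", "from_user": "user_b", "text": "RT @user_a: Opening keynote #DH2016"},
    {"id_str": "1", "from_user": "user_a", "text": "Opening keynote #DH2016"},
]

cleaned = deduplicate(sample)
print(len(sample), "rows in,", len(cleaned), "rows out")
```

Because a Retweet carries its own identifier, this approach keeps Retweets as separate rows, in line with the decision to count Retweets as Tweets. Identifying spam accounts would need a separate, manual review.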

Both research and experience show that the Twitter search API is not 100% reliable. Large Tweet volumes affect the search collection process. The API might "over-represent the more central users", not offering "an accurate picture of peripheral activity" (González-Bailón et al. 2012).

Apart from the filters and limitations already declared, it cannot be guaranteed that this file contains each and every Tweet tagged with #dh2016 during the indicated period, and the dataset is shared for archival, comparative and indicative educational research purposes only.

Only content from public accounts is included and was obtained from the Twitter Search API. The shared data is also publicly available to all Twitter users via the Twitter Search API and available to anyone with an Internet connection via the Twitter and Twitter Search web client and mobile apps without the need of a Twitter account.

Each Tweet and its contents were published openly on the Web with the queried hashtag and are the responsibility of the original authors.

No private personal information is shared in this dataset. The collection and sharing of this dataset is enabled and allowed by Twitter's Privacy Policy. The sharing of this dataset complies with Twitter's Developer Rules of the Road.

This dataset is shared to archive, document and encourage open educational research into scholarly activity on Twitter.

Other Considerations

Tweets published publicly by scholars during academic conferences are often tagged (labeled) with a hashtag dedicated to the conference in question.

The purpose and function of hashtags is to organise and describe information/outputs under the relevant label in order to enhance the discoverability of the labeled information/outputs (Tweets in this case).

A hashtag is metadata that users freely choose to use so that their content is associated with, directly linked to and categorised under the chosen hashtag.

Though the reasons for Tweeters' use of hashtags can be neither generalised nor predicted, it can be argued that scholarly Twitter users form specialised, self-selecting public professional networks that tend to observe scholarly practices and accepted modes of social and professional behaviour.

In general terms it can be argued that scholarly Twitter users willingly and consciously tag their public Tweets with a conference hashtag as a means to network and to promote, report from, reflect on, comment on and generally contribute publicly to the scholarly conversation around conferences. As Twitter users, conference Twitter hashtag contributors have agreed to Twitter's Privacy and data sharing policies.  

Professional associations like the Modern Language Association recognise Tweets as citable scholarly outputs. Archiving scholarly Tweets is a means to preserve this form of rapid online scholarship, which can otherwise very likely become unretrievable as time passes; Twitter's search API has well-known temporal limitations for retrospective historical search and collection.

Beyond individual tweets as scholarly outputs, the collective scholarly activity on Twitter around a conference or academic project or event can provide interesting insights for the contemporary history of scholarly communications. To date, collecting in real time is the only relatively accurate method to archive tweets at a small scale.

Though these datasets have limitations and are not thoroughly systematic, it is hoped they can contribute to developing new insights into the discipline's presence on Twitter over time.

The CC-BY license has been applied to the output in the repository as a curated dataset. Authorial/curatorial/collection work has been performed on the file in order to make it available as part of the scholarly record. The data contained in the deposited file is otherwise freely available elsewhere through different methods, and anyone not wishing to attribute the data to the creator of this output is, needless to say, free to do their own collection and clean their own data.