On Developing Extraction Rules for Mining Informal Scientific References from Altmetric Data Sources

Research impact is traditionally measured through citation counts (including more recent measures such as the h-index)

• These measures do not capture impact on media, public discourse, or government policy

• We work on developing altmetrics: non-traditional metrics that aim to measure research impact from alternative data sources such as news or social media

• We collected a corpus of around 500 documents totaling about 130 MB. The documents were retrieved from the web using the keyword Tamiflu. The corpus sources include news, articles, blogs, and government sources

• We search our heterogeneous corpus for mentions of scientists, research organizations, industry R&D departments, and labs

• We manually annotate these mentions and craft Java Annotation Patterns Engine (JAPE) grammars using the text analytics toolkit General Architecture for Text Engineering (GATE)
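
The kind of extraction rule described above can be illustrated with a minimal sketch. This is not JAPE and not the grammar developed in this work: JAPE rules match over GATE annotations (such as tokens and gazetteer lookups), whereas this Python approximation matches over raw characters. The trigger-word lists, the `Mention` type, the function `extract_mentions`, and the sample sentence are all hypothetical and chosen only for illustration. The idea shown is the usual pattern shape: a trigger (a title such as "Dr." or a head noun such as "University") combined with adjacent capitalized tokens.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    kind: str   # "Scientist" or "Organization" (hypothetical label set)
    text: str   # matched surface string
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


# Hypothetical trigger words; a real grammar would typically use gazetteer lists.
TITLES = r"(?:Dr|Professor|Prof)\.?"
ORG_HEADS = r"(?:University|Institute|Laboratory|Laboratories|Centre|Center)"
CAP = r"[A-Z][A-Za-z\-]+"  # one capitalized token

# Rule 1: a title followed by one to three capitalized tokens.
SCIENTIST = re.compile(rf"\b{TITLES}\s+(?P<name>{CAP}(?:\s+{CAP}){{0,2}})")

# Rule 2: up to three capitalized tokens, an organization head noun,
# and an optional "of <Capitalized ...>" tail.
ORG = re.compile(
    rf"\b(?P<org>(?:{CAP}\s+){{0,3}}{ORG_HEADS}\b(?:\s+of\s+{CAP}(?:\s+{CAP})*)?)"
)


def extract_mentions(text: str) -> List[Mention]:
    """Return scientist and organization mentions in document order."""
    mentions = []
    for m in SCIENTIST.finditer(text):
        mentions.append(
            Mention("Scientist", m.group("name"), m.start("name"), m.end("name"))
        )
    for m in ORG.finditer(text):
        mentions.append(
            Mention("Organization", m.group("org"), m.start("org"), m.end("org"))
        )
    return sorted(mentions, key=lambda mention: mention.start)


if __name__ == "__main__":
    # Invented sample sentence, not taken from the corpus.
    sample = (
        "Dr. Jane Smith of the University of Oxford said Tamiflu "
        "was studied at the Example Institute."
    )
    for mention in extract_mentions(sample):
        print(mention.kind, "|", mention.text)
```

Rules of this shape are easy to inspect and to refine against manually annotated mentions, but they miss mentions that lack a trigger word, which is one reason hand-crafted grammars are usually developed iteratively against an annotated corpus.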