How institutions can use ReCiter to provide machine learning-based publications management as an enterprise service

Albert, Paul; Wheeler, Terrie; Dutta, Sarbajit

doi:10.6084/m9.figshare.9897032.v1


presentation

posted on 2019-09-24, 13:10 authored by Paul Albert, Terrie Wheeler, Sarbajit Dutta

Staff at medical institutions are regularly called upon to produce and maintain lists of scholarly publications authored by individuals ranging from NIH-funded principal investigators to people affiliated with other institutions, such as alumni and residents. This work tends to be done on an ad hoc basis and is time-consuming, especially when profiled individuals have common names. Often, feedback from the authors themselves is not captured in a central location where it could be reused for future requests.

ReCiter is a highly accurate, rule-based system for inferring which publications in PubMed a given person has authored. ReCiter includes a Java application, a DynamoDB-hosted database, and a set of RESTful microservices, which collectively allow institutions to maintain accurate and up-to-date author publication lists for thousands of people. The software is optimized for disambiguating authorship in PubMed and, optionally, Scopus.

ReCiter rapidly and accurately identifies articles by a given person, including those written at previous affiliations. It does this by leveraging institutionally maintained identity data (e.g., departments, relationships, email addresses, and year of degree). With the more complete and efficient searches that result from combining these types of data, individuals at institutions can save time and be more productive. By running ReCiter daily, an institution can ensure that the desired users are the first to learn when a new publication has appeared in PubMed.

ReCiter is freely available and open source under the Apache 2.0 license: https://github.com/wcmc-its/ReCiter

For our presentation, we will demonstrate:

* How to install ReCiter
* How to load ReCiter with identity data
* How to run ReCiter
* Its API outputs
* How ReCiter integrates with a third-party interface for capturing feedback, which is fed back into ReCiter to further improve accuracy
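
The abstract describes inferring authorship by combining institutionally maintained identity data with article metadata. As a rough illustration of that general idea only, the following Python toy scores candidate articles against one person's identity record. It is not ReCiter's code, rules, or API: the class names, fields, weights, threshold, and sample records are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Identity:
    """Hypothetical identity record held by an institution."""
    last_name: str
    first_initial: str
    emails: Set[str] = field(default_factory=set)
    departments: Set[str] = field(default_factory=set)
    degree_year: Optional[int] = None


@dataclass
class Article:
    """Hypothetical, simplified view of one candidate article."""
    pmid: str
    author_last_name: str
    author_first_initial: str
    author_email: str = ""
    affiliation: str = ""
    year: int = 0


# Illustrative weights only; these are assumptions, not ReCiter's values.
WEIGHTS = {"name": 1.0, "email": 5.0, "department": 2.0, "after_degree": 0.5}


def score(identity: Identity, article: Article) -> float:
    """Sum the weights of every piece of evidence that matches."""
    name_matches = (
        article.author_last_name.lower() == identity.last_name.lower()
        and article.author_first_initial.lower() == identity.first_initial.lower()
    )
    if not name_matches:
        return 0.0
    total = WEIGHTS["name"]
    known_emails = {e.lower() for e in identity.emails}
    if article.author_email and article.author_email.lower() in known_emails:
        total += WEIGHTS["email"]
    if any(d.lower() in article.affiliation.lower() for d in identity.departments):
        total += WEIGHTS["department"]
    if identity.degree_year is not None and article.year >= identity.degree_year:
        total += WEIGHTS["after_degree"]
    return total


def suggest(identity: Identity, articles: List[Article], threshold: float = 3.0) -> List[str]:
    """Return PMIDs scoring at or above the threshold, best first."""
    scored = [(score(identity, a), a.pmid) for a in articles]
    kept = [(s, pmid) for s, pmid in scored if s >= threshold]
    return [pmid for s, pmid in sorted(kept, key=lambda t: -t[0])]


# Made-up sample data for demonstration.
person = Identity("Smith", "J", {"jsmith@example.edu"}, {"Cardiology"}, 2005)
candidates = [
    Article("A1", "Smith", "J", "jsmith@example.edu",
            "Department of Cardiology, Example University", 2010),
    Article("A2", "Smith", "J", "", "Department of Physics, Other University", 1999),
    Article("A3", "Jones", "J"),
]
suggested = suggest(person, candidates)
```

In this toy, a common name alone is not enough to suggest an article; corroborating evidence such as a known email address or department is what lifts a candidate over the threshold. Feedback captured from authors could, in the same spirit, be stored and used to override or adjust such scores on later runs.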