Final project for DST4L (Data Science for Librarians), a course held at the John G. Wolbach Library, Harvard-Smithsonian Center for Astrophysics, in Spring 2013.
The goal of this project was to identify items in the Internet Archive, an open-access repository of out-of-copyright texts, that match existing bibliographic stub records in the NASA Astrophysics Data System (ADS). ADS developers could then use the identified matches to ingest the corresponding content.
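
As a rough illustration of the matching task described above, the sketch below pairs ADS stub records with Internet Archive items that share a normalized title and publication year. It is not the project's actual method: the record layout (`bibcode`, `title`, `year`, `identifier` keys), the function names `normalize_title` and `match_records`, and the exact-match criterion are all assumptions made for this example.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", ascii_only.lower())
    return " ".join(cleaned.split())


def match_records(ads_stubs, ia_items):
    """Map each ADS bibcode to the IA identifiers with the same title and year.

    Both inputs are lists of dicts; the key names are hypothetical and would
    need to be adapted to the real ADS and Internet Archive metadata.
    """
    # Index IA items by (normalized title, year) for constant-time lookup.
    index = {}
    for item in ia_items:
        key = (normalize_title(item["title"]), item["year"])
        index.setdefault(key, []).append(item["identifier"])

    # Keep only the stubs that have at least one candidate match.
    matches = {}
    for stub in ads_stubs:
        key = (normalize_title(stub["title"]), stub["year"])
        if key in index:
            matches[stub["bibcode"]] = index[key]
    return matches
```

Exact matching on title and year is deliberately strict. Real catalogue data would likely need fuzzier comparison (for example on volume numbers or partial titles) and manual review of the candidates before any ingest.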