HuMaIN: Human- and Machine-Intelligent Network of Software Elements
poster
modified on 2020-01-31, 15:56
<p>Biodiversity
information extraction (IE) from imaged text in digitized museum specimen
records is a challenging task due to both the large number of labels and the
complexity of the characters and information to be extracted.</p>
<p>The HuMaIN project
investigates software-enabled solutions that support the combination of machine
and human intelligence to accelerate IE from specimen labels.</p>
<p>Among other
contributions, the project proposed the use of self-aware workflows to
orchestrate machine and human tasks (the SELFIE model), Optical Character
Recognition (OCR) ensembles and Natural Language Processing (NLP) methods to
increase confidence in extracted text, named-entity recognition (NER)
techniques for extracting Darwin Core (DC) terms, and a simulator for studying
these workflows with real-world data. The software has been tested on and
applied to large datasets from museums in the USA and Australia.</p>
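<p>As a rough illustration of how an OCR ensemble can raise confidence in
extracted text and decide when a human task is needed, the sketch below takes
the outputs of several OCR engines for one label line, picks the most common
transcription, and uses the share of agreeing engines as a confidence score.
It is not the HuMaIN implementation: the function names <code>vote</code> and
<code>route</code>, the whitespace normalization, and the 0.6 threshold are
all assumptions made for this example.</p>

```python
from collections import Counter


def vote(candidates):
    """Return (text, agreement) for a list of OCR outputs of the same line.

    Hypothetical helper: agreement is the fraction of engines that produced
    the winning transcription after collapsing runs of whitespace.
    """
    if not candidates:
        raise ValueError("need at least one OCR output")
    normalized = [" ".join(c.split()) for c in candidates]
    text, count = Counter(normalized).most_common(1)[0]
    return text, count / len(normalized)


def route(candidates, threshold=0.6):
    """Accept the machine result or hand the line to a human task.

    The threshold is an assumed value, not one taken from the project.
    """
    text, agreement = vote(candidates)
    if agreement >= threshold:
        return ("machine", text, agreement)
    return ("human", text, agreement)


if __name__ == "__main__":
    outputs = ["Quercus alba", "Quercus  alba", "Quercus a1ba"]
    print(route(outputs))
```

<p>Exact-match voting is the simplest possible ensemble. A more realistic
version would likely align the outputs character by character or token by
token before voting, so that engines can disagree on one word without
discarding the rest of the line.</p>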
Funding
SI2-SSE: Human- and Machine-Intelligent Software Elements for Cost-Effective Scientific Data Digitization
Directorate for Computer & Information Science & Engineering
