Automated Identification and Conversion of Chemical Names to Structure Searchable Information

journal contribution
posted on 17.03.2013, 00:24, authored by Antony Williams

Chemistry-related information is communicated through both print and electronic media. Chemical entities can appear as structure depictions or, more commonly, as systematic names (typically IUPAC or CAS names), as trade names, or as one of a plethora of registry numbers (CAS, EINECS/EC number or others). For a chemist, the preferable form of communication is a depiction of the chemical structure with an electronic molecular connection table as its basis. Electronic representations of chemical structures are one of the informatics underpinnings for any organization operating in the domain of chemistry or biology, and they enable the creation of structure- and substructure-searchable databases of chemical structures and associated data and knowledge. An enormous wealth of information is embedded in both print and electronic documents in the form of chemical names, and finding a means to convert those alphanumeric text descriptors into a richer chemical structure representation has long been the mission of a large group of investigators. The challenges and hurdles to success are profound. We review the present state of this research and the efforts underway to recover the value of information textually trapped in publications, patents, databases and Internet pages across the multiple domains of chemistry.

History