Corpus and Software

posted on 20.02.2017, 04:20 by Simon Baker

We are constantly exposed to a large number of chemicals present in food, water, air, dust, soil and consumer products. These chemicals enter our bodies via several routes: ingestion, inhalation and dermal absorption. Many of these chemicals are known or suspected to have toxic effects that can cause disorders and diseases. Chemical risk assessment is the process of evaluating such risks, and includes exposure assessment.

Exposure assessment methods include both indirect methods, such as exposure modelling and exposure calculations based on environmental measurements and questionnaire data, and direct measurements, such as human biomonitoring (HBM) and personal monitoring. HBM is the measurement of exposure biomarkers (chemicals or chemical metabolites) and effect biomarkers (indicators of effects caused by chemical exposure) in human body tissues or fluids, such as blood, hair and urine. These methods are used to assess the total exposure to a chemical and to evaluate the importance of different exposure routes.

We have annotated a corpus of 3686 scientific publication abstracts with a novel classification taxonomy specific to exposure assessment. The taxonomy is divided into two main branches: Biomonitoring and Exposure routes.
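
As a rough illustration of how a two-branch taxonomy like this might be handled in code, the sketch below stores it as a small hierarchy and expands an abstract's labels to include their parent branch. Only the two branch names come from the description; the child labels are borrowed from the terms used earlier on this page and are assumptions, not the actual taxonomy nodes. The function name `expand_labels` is hypothetical, and the sketch assumes an abstract can carry more than one label.

```python
# Hypothetical two-level taxonomy. Only the two top-level branches are
# confirmed by the description; the child labels are illustrative assumptions.
TAXONOMY = {
    "Biomonitoring": ["Exposure biomarker", "Effect biomarker"],
    "Exposure routes": ["Ingestion", "Inhalation", "Dermal absorption"],
}

# Map each child label to the branch it belongs to.
PARENT = {
    child: branch
    for branch, children in TAXONOMY.items()
    for child in children
}


def expand_labels(labels):
    """Return the given labels plus the parent branch of each child label."""
    expanded = set()
    for label in labels:
        if label in TAXONOMY:
            # Already a top-level branch.
            expanded.add(label)
        elif label in PARENT:
            # Child label: keep it and add its branch.
            expanded.add(label)
            expanded.add(PARENT[label])
        else:
            raise ValueError(f"unknown taxonomy label: {label!r}")
    return sorted(expanded)


print(expand_labels(["Inhalation", "Effect biomarker"]))
```

Expanding labels to their ancestors in this way lets a classifier be trained or evaluated at either the branch level or the finer level, whichever the real annotation supports.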

You can also download the pre-processed corpus, which includes extracted part-of-speech tags, grammatical relations, named entities, Medical Subject Headings, chemical lists, verb clusters and n-grams.

The pre-processed corpus includes all features extracted by our software. This saves you the effort of extracting the features yourself: you only need to train your models, using either our provided software or any other tools that you prefer.