This ZIP-compressed file contains 200 source documents (in plain text, one sentence per line) and 200 annotation documents (in brat standoff format). Documents are named using PubMed document IDs; for example, "15939911.txt" contains text from the document "A young man with palpitations and Ebstein's anomaly of the tricuspid valve" by Marcu and Donohue. The text is taken from PubMed Central full-text documents but has been edited to include only clinical case report details. All annotations were created manually.
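As a rough illustration of how the annotation documents relate to the source text, the sketch below parses text-bound annotations (the "T" lines of brat standoff format) and checks their character offsets against the text. The sample sentence, the annotation lines, and the labels "Age" and "Sign_symptom" are made up for the example and are not copied from the dataset; the names TextBound and parse_text_bound are likewise hypothetical helpers, not part of any released tooling.

```python
from typing import List, NamedTuple, Tuple


class TextBound(NamedTuple):
    """One text-bound ("T") annotation from a brat standoff file."""
    ann_id: str
    label: str
    spans: List[Tuple[int, int]]  # one (start, end) pair per fragment
    text: str


def parse_text_bound(ann_content: str) -> List[TextBound]:
    """Collect the text-bound annotations; other line types are skipped."""
    entities = []
    for line in ann_content.splitlines():
        # Events (E), relations (R), attributes (A) and notes (#) are ignored here.
        if not line.startswith("T"):
            continue
        ann_id, type_and_offsets, surface = line.split("\t", 2)
        label, offsets = type_and_offsets.split(" ", 1)
        spans = []
        # Discontinuous annotations separate their fragments with ';'.
        for fragment in offsets.split(";"):
            start, end = fragment.split()
            spans.append((int(start), int(end)))
        entities.append(TextBound(ann_id, label, spans, surface))
    return entities


# Illustrative stand-ins for a .txt document and its annotation document.
sample_text = "A 28-year-old man presented with palpitations.\n"
sample_ann = (
    "T1\tAge 2 13\t28-year-old\n"
    "T2\tSign_symptom 33 45\tpalpitations\n"
)

entities = parse_text_bound(sample_ann)
for entity in entities:
    # Offsets are character positions in the source text, end-exclusive.
    covered = " ".join(sample_text[start:end] for start, end in entity.spans)
    print(entity.ann_id, entity.label, repr(covered))
```

To try this on the real data, read a source document and its annotation document into strings and pass the latter to parse_text_bound. This assumes the annotation file shares the PubMed ID of the text file and uses brat's usual ".ann" extension, and that both are read as UTF-8 without newline translation so that the offsets line up.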
"MACCROBAT2020" is the second release of this dataset, following "MACCROBAT2018". The consistency and format of the annotations have been improved in the newest version.