The EMC Dutch Clinical Corpus contains four types of anonymized clinical documents: entries from general practitioners, specialists’ letters, radiology reports, and discharge letters. UMLS terms identified in the corpus are annotated for three contextual properties: negation, temporality, and experiencer.

The corpus was used to develop the ContextD algorithm.


EMC Dutch Clinical Corpus

The corpus is available for non-commercial research purposes. Please contact the corpus maintainers for further information.


Citing ContextD and the EMC Dutch Clinical Corpus

If you have used ContextD or the corpus in your study, please cite:

Zubair Afzal, Ewoud Pons, Ning Kang, Miriam CJM Sturkenboom, Martijn J Schuemie, Jan A Kors. ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus. BMC Bioinformatics 2014, 15:373. doi:10.1186/s12859-014-0373-3