To extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on the automatic identification of these contextual properties has been done on English clinical text. ContextD is an adaptation of the English ConText algorithm to the Dutch language.
We also created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists’ letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties.
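The term identification step described above can be illustrated with a minimal sketch. It is not the code used to build the corpus: the names `TermMatch` and `find_terms`, the case-insensitive and whole-word matching, the preference for longer terms, and the sample sentence and term list are all assumptions made for illustration. The three contextual properties are left empty, as they would be before annotation.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TermMatch:
    """A medical term found in a document (hypothetical structure)."""
    term: str
    start: int
    end: int
    # Contextual properties, left unset until an annotator or algorithm fills them in.
    negation: Optional[str] = None
    temporality: Optional[str] = None
    experiencer: Optional[str] = None


def find_terms(text: str, term_list: List[str]) -> List[TermMatch]:
    """Find exact occurrences of the listed terms in the text.

    Assumptions of this sketch: matching is case-insensitive, a term must not
    be embedded in a longer word, and longer terms win over overlapping
    shorter ones.
    """
    matches: List[TermMatch] = []
    for term in sorted(term_list, key=len, reverse=True):
        pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
        for m in pattern.finditer(text):
            # Skip a candidate that overlaps a match that was already accepted.
            if any(m.start() < prev.end and prev.start < m.end() for prev in matches):
                continue
            matches.append(TermMatch(m.group(0), m.start(), m.end()))
    # Return the matches in document order.
    return sorted(matches, key=lambda match: match.start)


if __name__ == "__main__":
    sentence = "Geen koorts, wel hoofdpijn."  # made-up example sentence
    for match in find_terms(sentence, ["koorts", "hoofdpijn", "pijn"]):
        print(match.term, match.start, match.end)
```

In this example "pijn" is not reported separately because it only occurs inside the longer word "hoofdpijn". Whether the term "koorts" is negated by the preceding "Geen" is exactly the kind of contextual property that ContextD is meant to determine.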
To try ContextD, click here.
EMC Dutch Clinical Corpus
The Dutch clinical corpus is available through the following link.
Citing ContextD and the EMC Dutch Clinical Corpus
If you have used ContextD or the corpus in your study, please cite:
Zubair Afzal, Ewoud Pons, Ning Kang, Miriam CJM Sturkenboom, Martijn J Schuemie, Jan A Kors. ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus. BMC Bioinformatics 2014, 15:373. doi:10.1186/s12859-014-0373-3