
Automated NLP Pipeline for Covid-19 Signs & Symptoms from EHR Notes



The COVID-19 pandemic swept across the world, rapidly infecting millions of people. An efficient tool that accurately recognizes important COVID-19 clinical concepts in the free text of electronic health records (EHRs) would be highly valuable for accelerating COVID-19 research. To this end, the existing clinical NLP tool CLAMP was quickly adapted to COVID-19 information.

An automated tool called COVID-19 SignSym was built to extract signs and symptoms from clinical text, together with the following eight attributes of each one:

  1. body location

  2. severity

  3. temporal expression

  4. subject

  5. condition

  6. uncertainty

  7. negation

  8. course
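As a rough illustration of what one extracted mention might look like, the sketch below holds a sign/symptom and the eight attributes listed above in a single record. The class name, field names, and example values are assumptions made for this illustration; they are not the tool's actual output schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SignSymptomMention:
    """One sign/symptom mention plus its eight attributes.

    Hypothetical structure for illustration only; not the real
    COVID-19 SignSym output format.
    """
    text: str                                   # the sign/symptom itself
    body_location: Optional[str] = None
    severity: Optional[str] = None
    temporal_expression: Optional[str] = None
    subject: Optional[str] = None
    condition: Optional[str] = None
    uncertainty: Optional[str] = None
    negation: Optional[str] = None
    course: Optional[str] = None


# Hand-written example for the sentence
# "Patient denies severe chest pain since yesterday."
mention = SignSymptomMention(
    text="pain",
    body_location="chest",
    severity="severe",
    temporal_expression="since yesterday",
    subject="patient",
    negation="denies",
)

# Attributes that do not occur in the sentence stay None.
print(asdict(mention))
```

Keeping every attribute optional reflects that most mentions in a note carry only a few of the eight attributes.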


Extraction information

Melax Tech released a fully automated Natural Language Processing (NLP) pipeline that extracts COVID-19-related sign and symptom mentions from electronic health record (EHR) notes. The pipeline is built on top of the award-winning CLAMP tool and can be quickly customized on a local dataset for optimal performance.

The extracted information is also mapped to standard clinical concepts in the OHDSI OMOP common data model. Evaluations on clinical notes and medical dialogues demonstrate promising results.
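The idea of mapping an extracted mention to a standard concept can be sketched as a normalized dictionary lookup. How the tool actually performs this mapping is not described here, so the exact-match strategy, the function names, and the table below are all assumptions. The integer IDs are placeholders and are not real OMOP concept IDs.

```python
from typing import Dict, Optional

# Placeholder lookup table. The IDs are made up for illustration and are
# NOT real OMOP concept IDs; a real table would be built from the OMOP
# standardized vocabularies.
PLACEHOLDER_CONCEPTS: Dict[str, int] = {
    "fever": 1,
    "cough": 2,
    "shortness of breath": 3,
}


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so surface variants share one key."""
    return " ".join(term.lower().split())


def map_to_concept(
    term: str, table: Dict[str, int] = PLACEHOLDER_CONCEPTS
) -> Optional[int]:
    """Return the concept ID for a mention, or None when no entry matches."""
    return table.get(normalize(term))


print(map_to_concept("  Shortness  of Breath "))
print(map_to_concept("anosmia"))
```

Returning `None` for unmapped terms keeps the gap visible, so unmatched mentions can be reviewed rather than silently assigned a wrong concept.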


The COVID-19 SignSym tool presents an efficient solution for extracting critical sign and symptom information from EHRs. Its comprehensive attributes capture vital details, providing practical value in accelerating COVID-19 research and improving patient care. With its adaptability, accuracy, and ongoing refinement, this tool has the potential to contribute significantly to the advancement of healthcare beyond COVID-19.

This tool is freely accessible to the community as a downloadable package of APIs.

Our fully automated NLP pipeline, COVID-19 SignSym, provides fundamental support for the secondary use of EHRs, thus accelerating global research on COVID-19.

