The NLP Solution for De-identification of Clinical Notes from EHR Systems on Azure Databricks


Melax Tech is collaborating with a major medical university to de-identify millions of clinical notes from its EHR systems using our de-identification solution. The de-identification pipeline combines machine learning and deep learning models, context rules, and organization-specific resources, such as physician name lists, to recognize PHI entities in the notes. The solution also provides numerous post-processing options, including patient-level date shifting, synthetic value replacement, and other advanced transforms needed for complete de-identification. The solution is deployed on Databricks clusters, where clinical notes are processed in parallel to achieve the throughput required for large corpora of clinical documents.
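
To make the idea of patient-level date shifting concrete, here is a minimal Python sketch. It is not the Melax Tech implementation: it assumes dates appear in ISO `YYYY-MM-DD` form, derives one fixed offset per patient from a keyed hash of a patient identifier, and uses hypothetical names (`SECRET_KEY`, `MAX_SHIFT_DAYS`, `patient_offset_days`, `shift_dates`) chosen only for illustration. Because every date in a patient's notes moves by the same number of days, the intervals between clinical events are preserved.

```python
import hashlib
import hmac
import re
from datetime import date, timedelta

# Hypothetical values for illustration; a real deployment would load the key
# from a secure secret store and choose its own shift window.
SECRET_KEY = b"example-secret"
MAX_SHIFT_DAYS = 365

# Assumption: only ISO-style dates (YYYY-MM-DD) are handled in this sketch.
DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def patient_offset_days(patient_id: str) -> int:
    """Derive a stable offset in the range 1..MAX_SHIFT_DAYS for one patient."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % MAX_SHIFT_DAYS + 1


def shift_dates(note: str, patient_id: str) -> str:
    """Shift every ISO date in the note back by the patient's fixed offset."""
    offset = timedelta(days=patient_offset_days(patient_id))

    def _replace(match: re.Match) -> str:
        try:
            original = date(int(match[1]), int(match[2]), int(match[3]))
            return (original - offset).isoformat()
        except (ValueError, OverflowError):
            # Not a valid calendar date, or the shift is out of range: leave as-is.
            return match.group(0)

    return DATE_PATTERN.sub(_replace, note)


if __name__ == "__main__":
    note = "Admitted 2020-01-10, discharged 2020-01-15."
    print(shift_dates(note, "patient-001"))
```

Since `shift_dates` is a pure function of the note text and the patient identifier, it could in principle be applied to each note independently, which is the property that makes this kind of transform easy to run in parallel across a cluster.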