
Improve Informed Consent and Clinical Trial Protocol Documents with Natural Language Processing


Clinical trials are crucial for examining the safety and efficacy of new treatment or prevention options. However, patient recruitment and trial data management impose a heavy burden. Natural language processing (NLP), which aims to extract structured information from unstructured text, can automate and simplify clinical trial recruitment and management.

Melax Tech provides NLP and semantic modeling capabilities for working with the textual documents involved in initiating and managing clinical trials. We have done extensive work for both industry and the National Cancer Institute in extracting inclusion/exclusion criteria from clinical protocols and in understanding permissions, privacy protections, and data use restrictions from informed consent forms.


Clinical trials for personalized cancer therapy provide tailored treatments based on a patient’s specific characteristics (e.g., genetic status) and have shown great promise for improving outcomes for cancer patients. Hundreds of clinical trials are investigating drugs and drug combinations that target specific genetic alterations in tumors. Melax Tech developed a CLAMP-based system for extracting genetic alteration information for personalized cancer therapy from trial protocol documents such as those found on […]. This system can be extended to applications in which clients need to extract genetic information and inclusion/exclusion criteria from clinical protocol documents. Easy access to this information can help patients find open trials and help research scientists understand the design and availability of open trials.

For informed consent documents, we have been one of the leading industry/academic partnerships working on the complex problem of extracting information about permissions, human subject protections, data sharing and use, and other major concepts of interest to the clinical research and regulatory community. In late 2019, Melax Tech was awarded an NIH/NCI SBIR contract to extend our prior work on informed consent and develop information models and NLP tools usable by the community at large. To do this, we leveraged our CLAMP technology, demonstrating the use of deep learning (DL) transformer-based NLP models to locate relevant permission attributes in consent material. Using formal semantics based on our team's previous work on the Informed Consent Ontology (ICO), a part of the OBO Foundry, these permission attributes can be mapped to formal vocabularies being developed in concert with important groups such as the Global Alliance for Genomics and Health. We are extending this work to extract named entities and relationships from informed consent material. We also developed a mechanism to report on the “completeness” of permission information contained in consent material, laying a foundation for what we term “regulatory decision support” tools aimed at improving study participants' comprehension of the material.
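As a rough illustration of the "completeness" idea described above, the following minimal Python sketch checks which expected permission attributes have at least one extracted text span. It is not Melax Tech's implementation: the attribute names, the `completeness_report` function, and the input format (a dictionary from attribute name to extracted spans) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Hypothetical permission attributes; the actual information model
# is not specified in this text.
EXPECTED_ATTRIBUTES = (
    "data_sharing",
    "future_research_use",
    "recontact",
    "withdrawal",
    "privacy_protection",
)


@dataclass
class CompletenessReport:
    found: List[str]    # attributes with at least one extracted span
    missing: List[str]  # attributes with no extracted span
    score: float        # fraction of expected attributes that were found


def completeness_report(
    extracted: Dict[str, List[str]],
    expected: Sequence[str] = EXPECTED_ATTRIBUTES,
) -> CompletenessReport:
    """Summarize which expected attributes an upstream NLP step located.

    `extracted` maps an attribute name to the text spans found for it
    (assumed to come from some extraction model; not shown here).
    """
    found = [name for name in expected if extracted.get(name)]
    missing = [name for name in expected if not extracted.get(name)]
    score = len(found) / len(expected) if expected else 0.0
    return CompletenessReport(found=found, missing=missing, score=score)


# Example with made-up spans for a single consent form.
example = {
    "data_sharing": ["Your data may be shared with other researchers."],
    "withdrawal": ["You may withdraw at any time."],
    "recontact": [],  # attribute searched for, but nothing was found
}
report = completeness_report(example)
print(report.missing)
```

A report of this kind could flag a consent form that never addresses, say, recontact or future research use, which is the sort of gap a regulatory decision support tool would surface for review.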

Figure 1 shows a high-level overview of the semantic information model derived from this work.

The work to date can support the automatic generation of per-trial “question answering” (Q&A) systems that trial sponsors and study coordinators can use to recruit participants. A Q&A system facilitates patient discussions on mobile and web recruitment platforms. It could also support automated tools on websites for potential trial participants. This is by no means an exhaustive list of potential applications.


Our technology streamlines patient recruitment and data management by extracting critical information from complex documents. We identify genetic alterations for personalized cancer therapy, improve patient engagement, and decipher informed consent forms. This approach can accelerate medical advances and improve patient care, and it has broad potential for future healthcare applications. To learn more about the availability of this work, request a demo today!

