Leveraging Amazon Comprehend Medical AI for medical text interpretation

Tuesday, June 4, 2024
Stefanos Peros
Software engineer

We recently tackled an important use case for one of our healthcare customers: matching patients with clinical trials based on their medical profiles. The abundance of unstructured medical text in both patient profiles and clinical trial descriptions posed a major challenge, but it is also a natural fit for Natural Language Processing (NLP), which enables efficient parsing and understanding of large amounts of textual data. Amazon Comprehend Medical does exactly that: it’s a cloud service that uses machine learning and has been pre-trained to understand and extract health data from medical text.

How it works

Figure 1: Amazon Comprehend Medical data flow

Figure 1 illustrates the key processing steps of Amazon Comprehend Medical. The process begins when a user submits unstructured medical text to the service, either directly through the console or by calling the API. This text can be anything from clinical notes, electronic health records (EHRs), and prescriptions to audio transcripts of doctor-patient interactions and medical publications.

The bulk of the analysis happens next: machine learning models that are pre-trained on a vast amount of medical text transform the unstructured input into structured, actionable data. The service first scans the text to identify words or phrases that represent medical entities, such as symptoms, diagnoses, and medications. Once entities are identified, it analyzes their context to a) extract relevant attributes (e.g. dosage for medications or acuteness for medical conditions), and b) link entities to medical ontologies, such as the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) or International Classification of Diseases (ICD) codes.

Amazon Comprehend Medical also detects protected health information (PHI) in the text (e.g. names, geographical identifiers, and social security numbers) and categorizes it, so that the appropriate protection or redaction measures can be applied in support of health information privacy regulations such as the Health Insurance Portability and Accountability Act (HIPAA). Finally, the API responds with the structured data in the form of a JSON object.
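To make the shape of that JSON response more concrete, here is a minimal Python sketch that flattens the output of the `DetectEntitiesV2` operation into one small record per entity. It assumes the boto3 `comprehendmedical` client and its `detect_entities_v2` call. The helper names (`make_client`, `summarize_entities`), the `min_score` threshold, and the hand-written `SAMPLE_RESPONSE` are illustrative assumptions, not part of the service or of our production code.

```python
from typing import Any


def make_client(region_name: str = "us-east-1"):
    """Create a Comprehend Medical client (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parsing helper below works offline

    return boto3.client("comprehendmedical", region_name=region_name)


def summarize_entities(
    response: dict[str, Any], min_score: float = 0.5
) -> list[dict[str, Any]]:
    """Flatten a DetectEntitiesV2 response into one small dict per entity."""
    summary = []
    for entity in response.get("Entities", []):
        # Skip entities the model is not confident about.
        if entity.get("Score", 0.0) < min_score:
            continue
        summary.append(
            {
                "text": entity["Text"],
                "category": entity["Category"],
                # Traits carry context such as NEGATION or SYMPTOM.
                "traits": [t["Name"] for t in entity.get("Traits", [])],
                # Attributes are related values such as DOSAGE or FREQUENCY.
                "attributes": {
                    a["Type"]: a["Text"] for a in entity.get("Attributes", [])
                },
            }
        )
    return summary


# Hand-written sample in the documented response shape (NOT real service output).
SAMPLE_RESPONSE = {
    "Entities": [
        {
            "Text": "metformin",
            "Category": "MEDICATION",
            "Type": "GENERIC_NAME",
            "Score": 0.99,
            "Traits": [],
            "Attributes": [
                {"Type": "DOSAGE", "Text": "500 mg", "Score": 0.98},
                {"Type": "FREQUENCY", "Text": "twice daily", "Score": 0.97},
            ],
        },
        {
            "Text": "chest pain",
            "Category": "MEDICAL_CONDITION",
            "Type": "DX_NAME",
            "Score": 0.95,
            "Traits": [
                {"Name": "SYMPTOM", "Score": 0.9},
                {"Name": "NEGATION", "Score": 0.9},
            ],
        },
        {"Text": "unclear", "Category": "MEDICAL_CONDITION", "Score": 0.2},
    ]
}

if __name__ == "__main__":
    # With credentials configured, a real call would look like:
    #   response = make_client().detect_entities_v2(Text="...")
    for row in summarize_entities(SAMPLE_RESPONSE):
        print(row)
```

The same client also exposes `detect_phi` for PHI detection and `infer_icd10_cm` for ICD-10-CM ontology linking. Their responses follow a similar `Entities` layout, so a helper like this can be adapted with small changes.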

How we used it

In order to match patients to clinical trials, a necessary first step was to identify which clinical trials correspond to the medical conditions of each patient. Since medical conditions are described as unstructured medical text, we needed a way to convert these conditions to a standard format: ICD codes. On the patient side, we ask patients upon registration to select from a number of predefined medical conditions, which we have mapped to the corresponding ICD codes in our database. On the trial side, we fetch available clinical trials from public sources and use Amazon Comprehend Medical to classify them into ICD codes based on their described medical conditions. This enables us to quickly eliminate irrelevant clinical trials for each patient, resulting in a smaller subset that we further refine using large language models, among other AI techniques. To save costs and improve performance, we cache previously retrieved medical condition / ICD-code pairs from Amazon Comprehend Medical in a DynamoDB table.
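The classify, cache, and filter flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not our production implementation. It relies on the boto3 `infer_icd10_cm` call and its `ICD10CMConcepts` response field. The function names, the `min_score` threshold, the trial dictionary keys (`id`, `condition`), and the exact-code matching rule are all hypothetical. A plain dictionary stands in for the DynamoDB table.

```python
from typing import Any, Iterable, MutableMapping


def icd_codes_for(
    text: str,
    client: Any,
    cache: MutableMapping[str, list[str]],
    min_score: float = 0.7,
) -> list[str]:
    """Return ICD-10-CM codes for a condition description, using the cache first."""
    key = text.strip().lower()  # normalize so trivially different texts share an entry
    if key in cache:
        return cache[key]  # cache hit: no API call, no cost

    response = client.infer_icd10_cm(Text=text)
    codes = set()
    for entity in response.get("Entities", []):
        concepts = entity.get("ICD10CMConcepts", [])
        if not concepts:
            continue
        # Keep only the highest-scoring candidate code for each entity.
        best = max(concepts, key=lambda concept: concept["Score"])
        if best["Score"] >= min_score:
            codes.add(best["Code"])

    cache[key] = sorted(codes)
    return cache[key]


def matching_trials(
    patient_codes: Iterable[str],
    trials: Iterable[dict[str, str]],
    client: Any,
    cache: MutableMapping[str, list[str]],
) -> list[str]:
    """Return ids of trials that share at least one ICD code with the patient."""
    wanted = set(patient_codes)
    return [
        trial["id"]
        for trial in trials
        if wanted & set(icd_codes_for(trial["condition"], client, cache))
    ]
```

To run this against the real service, pass `boto3.client("comprehendmedical")` as `client`. To persist the cache, the dictionary can be replaced by any object that offers the same mapping interface, for example a thin wrapper around a DynamoDB table. Matching on exact codes is the strictest option. Comparing only the code prefix before the dot would match at the broader category level, at the price of more candidate trials to refine later.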


Through our work with Amazon Comprehend Medical, we were able to address our customer’s needs with a solution that is cutting-edge, efficient, and budget-friendly.

We always try to keep up with the newest offerings from the major cloud service providers (AWS, GCP, Azure), so we can quickly adapt to new challenges and find fresh ways to make a real difference for our customers. As we move forward, our main focus is to continue embracing these advanced AI services in order to improve service quality and operational efficiency and to unlock new possibilities for R&D, resulting in a more informed, efficient, and innovative environment for us and our customers.
