Last week I attended the webcast “Medical Large Language Models for Clinical Text Summarization, Information Extraction, and Question Answering” from John Snow Labs. I’m sharing my notes here.


LLMs and NLP in general are providing new tools to solve existing problems in healthcare. Here is a list of some new tools that are available today:

  • Question Answering
  • Text Summarization
  • Text Generation
  • Information extraction (e.g. from clinical notes)
  • Relation extraction (e.g. symptoms related to a disease)
  • Entity recognition (like ICD-10 code extraction)
  • Chatbots

Many open-source LLMs available today perform close to the best commercial state-of-the-art models, like GPT-4 and GPT-3.5-turbo from OpenAI or Claude from Anthropic. The latter tend to be general purpose and powerful, but extremely expensive to train. The open-source models tend to lag in broad, general performance but can be fine-tuned for specific tasks. This field is moving fast, which means that there is much potential for innovation, but it’s also a challenge to keep up with the state of the art.

Here is an overview of tasks that Medical LLMs can perform today:

LLMs in Healthcare

1. Closed-book Q&A

Question answering using BioGPT JSL as the base model and a knowledge base.

Demo available here: https://demo.johnsnowlabs.com/healthcare/MEDICAL_LLM/
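To make the "base model plus knowledge base" idea a bit more concrete, here is a minimal sketch of the lookup step only. This is not how BioGPT JSL works internally (the webcast didn't go into that level of detail); the knowledge base, the `retrieve` and `answer` functions, and the crude word-overlap scoring are all my own assumptions for illustration. In a real system the retrieved passage would be handed to the LLM as context instead of being returned directly.

```python
import re
from collections import Counter

# Hypothetical mini knowledge base: (source id, passage). Illustrative only.
KNOWLEDGE_BASE = [
    ("kb-001", "Metformin is a first-line oral medication for type 2 diabetes."),
    ("kb-002", "Hypertension is commonly treated with ACE inhibitors or diuretics."),
    ("kb-003", "Asthma symptoms include wheezing, coughing and shortness of breath."),
]

# Very small stopword list so that filler words do not count as matches.
STOPWORDS = {"the", "of", "a", "is", "are", "what", "for", "and", "or", "with"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def retrieve(question, kb=KNOWLEDGE_BASE, min_overlap=1):
    """Return the (source_id, passage) sharing the most tokens with the question,
    or None if no passage reaches min_overlap shared tokens."""
    q_tokens = Counter(tokenize(question))
    best, best_score = None, 0
    for source_id, passage in kb:
        overlap = q_tokens & Counter(tokenize(passage))  # multiset intersection
        score = sum(overlap.values())
        if score > best_score:
            best, best_score = (source_id, passage), score
    return best if best_score >= min_overlap else None


def answer(question):
    """Toy answer step: return the retrieved passage together with its source id.
    A real system would pass the passage to the LLM as context here."""
    hit = retrieve(question)
    if hit is None:
        return {"answer": None, "source": None}
    source_id, passage = hit
    return {"answer": passage, "source": source_id}


print(answer("What are the symptoms of asthma?"))
print(answer("Who won the football game?"))
```

The second call returns no answer at all, which is the behaviour you would want when the knowledge base has nothing relevant to say.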

2. Medical text generation

BioGPT JSL can also be used to generate text, such as an elaborate response to a more open-ended question, synthetic text, or even synthetic patients. Demo available here: https://demo.johnsnowlabs.com/healthcare/MEDICAL_LLM/

(Nice implementation with Streamlit, btw.)

3. Clinical text Summarization

One of the most popular uses of LLMs in the medical field is clinical summarization. An obvious use case is summarizing a patient’s medical history, since it can get very long and contain repeated information. This kind of summarization has its own specific requirements, like the need to preserve the medical terminology and the order of events.

Demo available here: https://demo.johnsnowlabs.com/healthcare/MEDICAL_LLM/

4. Biomedical Research text Summarization

A tool to help deal with the avalanche of new research papers being published. It makes it easier to automate the process of reading and summarizing the papers.

5. Patient Question text Summarization

A way to outline the patient’s question to the doctor, so that the doctor can quickly understand the patient’s concern.

No-Code Medical Chatbots

Another interesting concept is the no-code tooling that JSL is developing, available at https://www.johnsnowlabs.com/nlp-lab/.

It can be used to create natural language interfaces to query databases, of both clinical records and biomedical research.

This seems very interesting to apply in a Business Intelligence + healthcare context. It could help accelerate the adoption of BI practices in the field and make them more widespread.

Three key requirements were proposed for these chatbots to comply with current and upcoming regulations (like the new legislation currently being discussed in the EU Parliament):

  • No Data Sharing: Clinical data is sensitive. No data is shared through the cloud, APIs, or third parties. Solutions must be secured behind the firewall.
  • No BS: No hallucinations or unexplained results. Every answer can be explained and backed by a reference.
  • No Test Gaps: Test for robustness, fairness, bias, data leakage, and toxicity, and be able to prove the results of these tests to customers and regulators.
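To give a feel for what a robustness test under "No Test Gaps" might look like, here is a small sketch of my own (not something shown in the webcast). It applies harmless perturbations to the input and checks whether the model's output stays the same. `toy_model` is a hypothetical keyword-based stand-in for a real model, and the perturbation names and the `robustness_report` helper are assumptions made for illustration.

```python
def toy_model(text):
    """Hypothetical stand-in for a real model: a keyword-based condition tagger."""
    t = text.lower()
    if "diabetes" in t:
        return "diabetes"
    if "hypertension" in t or "high blood pressure" in t:
        return "hypertension"
    return "unknown"


# Perturbations that should NOT change the meaning of a clinical note.
PERTURBATIONS = {
    "uppercase": str.upper,
    "extra_whitespace": lambda s: "  " + s.replace(" ", "  ") + "  ",
    "trailing_punctuation": lambda s: s + " !!!",
}


def robustness_report(model, inputs, perturbations=PERTURBATIONS):
    """For each perturbation, list the inputs whose prediction changes
    when the perturbation is applied."""
    report = {}
    for name, perturb in perturbations.items():
        report[name] = [
            text for text in inputs if model(perturb(text)) != model(text)
        ]
    return report


NOTES = ["Patient has type 2 diabetes.", "History of high blood pressure."]
report = robustness_report(toy_model, NOTES)
for name, failures in report.items():
    print(name, "->", "OK" if not failures else f"FAILED on {failures}")
```

Here the toy model fails the extra-whitespace check on the second note, because its multi-word keyword match breaks when the spacing changes. That is exactly the kind of gap such a test is meant to surface, and the report is something you could show to a customer or regulator.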

I’m amazed and overwhelmed by the potential of LLMs in the medical field. The possibilities are limitless.

The VOD of the webcast can be found here: https://www.carahsoft.com/learn/event/45747-medical-large-language-models-for-clinical-text-summarization-information-extraction-and-question-answering