diogo carapito

  • medical doctor, studied @NOVA Medical School
  • 3rd year General and Family Medicine resident @ARSLVT, Portugal
  • postgraduate in Information Management and Business Intelligence in Healthcare @NOVA IMS
  • curious about data science and large language models

Summer is over! Time to MLOps

This summer was rough. The beginning was going well. I finished my project for Predictive Methods of Data Mining at NOVA IMS. I’m pretty happy with the results. We ended up using Streamlit, and we got a pretty good score on Kaggle with our rather simple neural network. =D I also watched the LLM Bootcamp 2023, which gave me a lot of insight into the current state of LLMs. But my side projects were slowing down....

October 29, 2023

Medical Large Language Models

Last week I attended the Medical Large Language Models for Clinical Text Summarization, Information Extraction, and Question Answering from John Snow Labs. I’m sharing my notes here. LLMs: LLMs and NLP in general are providing new tools to solve existing problems in healthcare. Here is a list of some new tools that are available today: question answering; text summarization; text generation; information extraction (e.g. from clinical notes); relation extraction (e.g. symptoms related to a disease); entity recognition (like ICD-10 code extraction); chatbots. Many open source LLMs available today perform close to the best commercial state-of-the-art models, like GPT-4, GPT-3....

June 25, 2023


Last time I posted a blog post, I almost went nuts trying to make it work. I couldn’t remember how to publish a post. I don’t know what it was, maybe GitHub Pages, maybe the Quarto framework, but definitely my dumbness was a big part of it. I just can’t grasp yet how all these unintuitive git shenanigans work. So I’ve been postponing my new blog post, knowing what awaited me....

June 18, 2023

mgfhub.com is now live!

I’m excited to announce that mgfhub.com is now live! It’s sort of a search tool with data visualization components for the KPIs (“indicadores”) that exist in Portuguese Primary Care. I have imagined it for more than a year, and it’s trying to answer questions that I have in my daily life as a Family Medicine Resident when I’m working with KPIs: How many KPIs exist? How do KPIs work? How to quickly find specific KPIs (e....

May 2, 2023

NLP Summit Healthcare 2023

This week I attended the NLP Summit Healthcare 2023, a free virtual event organized by John Snow Labs. It was a great event with a lot of interesting talks. I’ll share some of my key takeaways. 1. Best practices when developing NLP models. In the opening keynote, Dr. David Gondek, Chief Data Scientist at John Snow Labs, summarized some best practices that I found interesting as I’m beginning my NLP journey:...

April 7, 2023

webapps and tutorials

This week has been exciting. I have this NLP project cooking inside my head (for some time now) and I’ve been speeding through many YouTube tutorials on both backend and frontend structure. For the backend, I’ve been cruising through Pinecone, LangChain and the OpenAI API, thanks to tutorials from James Briggs and Data Independent. Google Colab has been my best friend. Even though the backend is the new exciting stuff, I have a soft spot for the way it’s presented....

March 11, 2023

hello, world!

I’ve been charging in different directions on my journey to build a bridge between health and tech (NLP and LLMs, I’m looking at you). There is so much potential and I have so many ideas! So, lately I’ve been: grinding through fast.ai Practical Deep Learning for Coders (just finished the 4th lesson this week); exploring website domain name stuff and setting up a website for my previous project (Primary care KPIs exploration tool, mgfhub....

March 5, 2023