Case Study - Named Entity Recognition for Medical Texts
We improved NER performance on Swedish medical texts for a leading university hospital using data augmentation and fine-tuned BERT models, enabling more accurate clinical information extraction.
- Client: Swedish university hospital
- Service: Machine Learning
Overview
Swedish patient records hold large amounts of information locked in free text — hard to search, even harder to aggregate. Turning this unstructured information into useful data is a real challenge.
That is where this project comes in. It was carried out by two of our current employees, with the goal of efficiently extracting valuable data from Swedish patient records using Named Entity Recognition (NER).
In the project, several BERT models and data augmentation techniques were evaluated to see how much they could improve NER results on Swedish patient records.
The results showed that data augmentation could significantly improve system performance, especially on smaller datasets. Interestingly, training on an augmented 50% of the training data gave results comparable to training on the full original dataset without augmentation.
This project shows how the right technology and methods can help us extract valuable information from Swedish patient records, which can contribute to better-informed, data-driven healthcare.
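To give a feel for what data augmentation can look like for NER, here is a minimal Python sketch of one common approach, mention replacement: an entity in a sentence is swapped for another entity of the same type seen elsewhere in the training data. This write-up does not specify which techniques were evaluated in the project, so the technique itself, the function names, the DRUG and SYMPTOM labels, and the toy sentences are all assumptions made for illustration, not the project's actual method or data.

```python
import random

# Invented toy sentences in BIO format; the labels DRUG and SYMPTOM are hypothetical.
train = [
    (["Patienten", "fick", "paracetamol", "mot", "huvudvärk"],
     ["O", "O", "B-DRUG", "O", "B-SYMPTOM"]),
    (["Ibuprofen", "gavs", "vid", "svår", "yrsel"],
     ["B-DRUG", "O", "O", "B-SYMPTOM", "I-SYMPTOM"]),
]


def entity_end(tags, start):
    """Return the index just past the entity that begins at tags[start]."""
    label = tags[start][2:]
    end = start + 1
    while end < len(tags) and tags[end] == "I-" + label:
        end += 1
    return end


def collect_mentions(sentences):
    """Map each entity label to the list of mentions (token tuples) found in the data."""
    mentions = {}
    for tokens, tags in sentences:
        i = 0
        while i < len(tokens):
            if tags[i].startswith("B-"):
                j = entity_end(tags, i)
                mentions.setdefault(tags[i][2:], []).append(tuple(tokens[i:j]))
                i = j
            else:
                i += 1
    return mentions


def replace_mentions(tokens, tags, mentions, rng, p=0.5):
    """Return a copy of the sentence where each entity is, with probability p,
    replaced by a randomly chosen mention of the same label."""
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        if tags[i].startswith("B-"):
            label = tags[i][2:]
            j = entity_end(tags, i)
            span = tuple(tokens[i:j])
            candidates = mentions.get(label, [])
            if candidates and rng.random() < p:
                span = rng.choice(candidates)
            # Re-emit BIO tags so they match the length of the new mention.
            new_tokens.extend(span)
            new_tags.extend(["B-" + label] + ["I-" + label] * (len(span) - 1))
            i = j
        else:
            new_tokens.append(tokens[i])
            new_tags.append(tags[i])
            i += 1
    return new_tokens, new_tags


if __name__ == "__main__":
    rng = random.Random(0)
    mentions = collect_mentions(train)
    for tokens, tags in train:
        print(replace_mentions(tokens, tags, mentions, rng))
```

The augmented sentences would be added to the original training set before fine-tuning. Because each replacement keeps the label and regenerates the tags, the token and tag sequences stay aligned even when a one-word mention is swapped for a multi-word one.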
Tech
- Machine Learning
- Data Augmentation
- PyTorch
- Python
It was incredibly rewarding to work on this project. Not only did it give us the chance to contribute to better healthcare, but we could also explore and apply new techniques to improve performance within text extraction.

Tech Lead & Co-Founder
Next steps
Want to use ML to structure text data or build better decision support? Contact us and we’ll set up an intro call.