Case Study - Named Entity Recognition for Medical Texts
In this project, two of our employees improved the performance of Named Entity Recognition (NER) on Swedish medical texts.
- Client: Sahlgrenska Universitetssjukhuset
- Service: Machine Learning
Overview
In the digital age, data has become an important asset, yet in Swedish healthcare an enormous amount of valuable information remains locked in free text in patient records. Transforming this unstructured text into usable data is a major challenge.
That is where this project comes in. Carried out by two of our current employees, it aimed to extract valuable data from Swedish patient records efficiently using Named Entity Recognition (NER).
Several BERT models and data augmentation techniques were evaluated for their potential to significantly improve NER results on Swedish patient records.
The results showed that data augmentation could significantly improve the performance of the system, especially on smaller data sets. Interestingly, we were also able to achieve comparable results by augmenting only 50% of the training data as by using the entire original data set without augmentation.
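To make the idea of data augmentation for NER more concrete, here is a minimal sketch of one common technique, mention replacement, where each labelled entity is swapped for another entity of the same type drawn from the training data. This is an illustrative assumption: the text above does not state which augmentation techniques were evaluated in the project, and the function names, BIO tag format, labels, and toy English sentences below are all hypothetical.

```python
import random
from collections import defaultdict


def extract_mentions(tokens, tags):
    """Return (start, end, label) spans for entities in a BIO-tagged sentence."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:
                spans.append((start, i, label))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == label:
            continue  # still inside the current entity
        else:
            if start is not None:
                spans.append((start, i, label))
            start, label = None, None
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


def build_mention_bank(dataset):
    """Collect every entity mention in the data set, grouped by label."""
    bank = defaultdict(list)
    for tokens, tags in dataset:
        for start, end, label in extract_mentions(tokens, tags):
            bank[label].append(tokens[start:end])
    return bank


def augment(tokens, tags, bank, rng):
    """Create a new sentence by replacing each entity with a same-label mention."""
    new_tokens, new_tags, cursor = [], [], 0
    for start, end, label in extract_mentions(tokens, tags):
        # Copy the non-entity tokens before this mention unchanged.
        new_tokens.extend(tokens[cursor:start])
        new_tags.extend(tags[cursor:start])
        # Fall back to the original mention if the bank has none for this label.
        candidates = bank.get(label) or [tokens[start:end]]
        replacement = rng.choice(candidates)
        new_tokens.extend(replacement)
        new_tags.extend(["B-" + label] + ["I-" + label] * (len(replacement) - 1))
        cursor = end
    new_tokens.extend(tokens[cursor:])
    new_tags.extend(tags[cursor:])
    return new_tokens, new_tags


# Toy, made-up examples (not real patient data); labels are hypothetical.
dataset = [
    (["Patient", "received", "paracetamol", "for", "severe", "headache"],
     ["O", "O", "B-DRUG", "O", "B-SYMPTOM", "I-SYMPTOM"]),
    (["Ibuprofen", "relieved", "the", "fever"],
     ["B-DRUG", "O", "O", "B-SYMPTOM"]),
]
bank = build_mention_bank(dataset)
rng = random.Random(0)  # fixed seed so runs are repeatable
for tokens, tags in dataset:
    print(augment(tokens, tags, bank, rng))
```

In a setup like this, the augmented sentences would be added to the original training set before fine-tuning the model. How many augmented copies to generate per sentence is a tuning choice that this sketch leaves open.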
This project shows how the right technology and methods can help extract valuable information from Swedish patient records, contributing to better-informed, data-driven healthcare.
Tech
- Machine Learning
- Data Augmentation
- PyTorch
- Python
It was incredibly rewarding to work on this project with Sahlgrenska. Not only did it give us the chance to contribute to better healthcare, but we could also explore and apply new techniques to improve text extraction performance.

Tech Lead & Co-Founder
Next steps
Want to use ML to structure text data or build better decision support? Contact us and we’ll set up an intro call.