Case Study - Named Entity Recognition for Medical Texts
In this project, our employees improved the performance of Named Entity Recognition on Swedish medical texts.
- Client
- Sahlgrenska Universitetssjukhuset
- Year
- Service
- Machine Learning
Overview
In the digital age, data has become an important asset. In the Swedish healthcare system, however, an enormous amount of invaluable information is locked away as free text in patient records. Transforming this unstructured information into useful data poses a major challenge.
That is where this project comes in. Carried out by two of our current employees, it aimed to extract valuable data from Swedish patient records efficiently using Named Entity Recognition (NER).
The work investigated several different BERT models and data augmentation techniques that have the potential to significantly improve NER results on Swedish patient records.
The results showed that data augmentation could significantly improve the performance of the system, especially on smaller data sets. Interestingly, training on 50% of the training data with augmentation achieved results comparable to training on the entire original data set without augmentation.
This project shows how the right technology and methods can help us extract valuable information from Swedish patient records, which can contribute to better-informed, data-driven healthcare.
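To give a feel for what data augmentation can mean in an NER setting, here is a minimal sketch of one common technique, mention replacement: each tagged entity is swapped for another mention of the same type found elsewhere in the training data. This is an illustration only. The case study does not state which augmentation techniques were used, and the function names, the BIO tag labels (`DRUG`, `SYMPTOM`) and the toy sentences below are all assumptions made up for the example, not data from the project.

```python
import random
from collections import defaultdict


def find_spans(tags):
    """Return (start, end, label) for each BIO-tagged entity; end is exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        continues = start is not None and tag == "I-" + label
        if not continues:
            if start is not None:
                spans.append((start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def collect_mentions(sentences):
    """Map each entity type to the mentions (token lists) seen in the data."""
    mentions = defaultdict(list)
    for tokens, tags in sentences:
        for start, end, label in find_spans(tags):
            mentions[label].append(tokens[start:end])
    return mentions


def replace_mentions(tokens, tags, mentions, rng):
    """Build a new sentence by swapping each entity for a same-type mention."""
    new_tokens, new_tags, cursor = [], [], 0
    for start, end, label in find_spans(tags):
        # Copy the untouched text before this entity.
        new_tokens += tokens[cursor:start]
        new_tags += tags[cursor:start]
        # Fall back to the original mention if no candidates are known.
        candidates = mentions.get(label) or [tokens[start:end]]
        replacement = rng.choice(candidates)
        new_tokens += replacement
        new_tags += ["B-" + label] + ["I-" + label] * (len(replacement) - 1)
        cursor = end
    new_tokens += tokens[cursor:]
    new_tags += tags[cursor:]
    return new_tokens, new_tags


# Invented toy sentences (not real patient data), with hypothetical labels.
train = [
    (["Patienten", "fick", "Alvedon", "mot", "feber"],
     ["O", "O", "B-DRUG", "O", "B-SYMPTOM"]),
    (["Ordinerades", "Ipren", "vid", "svår", "huvudvärk"],
     ["O", "B-DRUG", "O", "B-SYMPTOM", "I-SYMPTOM"]),
]

rng = random.Random(0)  # fixed seed so runs are repeatable
mentions = collect_mentions(train)
augmented = [replace_mentions(tokens, tags, mentions, rng) for tokens, tags in train]
for tokens, tags in augmented:
    print(list(zip(tokens, tags)))
```

The augmented sentences would be added to the original training set before fine-tuning a model. Because the tags are rebuilt together with the tokens, the labels stay aligned even when a replacement mention has a different length than the original.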
What we did
- Machine Learning
- Data Augmentation
- PyTorch
- Python
It was incredibly rewarding to work on this project with Sahlgrenska. Not only did it give us the chance to contribute to better healthcare, but we could also explore and apply new techniques to improve the performance of text extraction.

Software Engineer