IMFD Innovation collaborates with the Superintendence of Social Security on a project using advanced NLP techniques
January 2025. One of the most interesting challenges in natural language processing (NLP) is free-text recognition. This is the challenge taken on by the Innovation and Technology Transfer Department of the IMFD in a collaboration with the Superintendence of Social Security (SUSESO). The main focus of this work was optimizing claims entry from the citizen's perspective. The objective of the collaboration is to develop and evaluate NLP models for the automatic classification of the subjects, sub-subjects and causes of SUSESO claims.
"When a person enters a claim at the Superintendence of Social Security (SUSESO), there is a range of options to select for a cause. A user could mistakenly enter the wrong option and end up with a negative outcome at the end of this process", explains Hernán Sarmiento, IMFD Innovation Engineer and project leader. After this first step, there is a free-text field, called the "story", in which the user explains why they are filing the claim.
"The question is how we can take advantage of this free-text story to automatically assign the claim to one of the causes that the user should select. Here we come across stories that have spelling mistakes, typos and grammatical errors, so during this project we tried to evaluate experimentally whether we could train classification models and see how accurately we could predict some of the existing cause options", emphasizes the engineer.
This initiative was one of seven components promoted by SUSESO, called "Natural Language Processing for the Optimization of Claims Entry", which seeks not only to streamline the process but also to enable a faster and more accurate response for those seeking solutions to their problems.
SUSESO is the autonomous State agency in charge of overseeing compliance with social security regulations and guaranteeing respect for people's rights, especially those of workers, pensioners and their families. This was the first time the organization worked on solutions based on data science and artificial intelligence, in this case natural language processing (NLP), as part of its efforts to improve service and modernize its processes.
Main results
The project was divided into three key stages: exploratory data analysis, in which claims were reviewed to identify patterns; linguistic characterization of the stories, where NLP tools were used; and finally model training and validation, which involved training artificial intelligence models to automate the classification of claims.
In Stage III, once the models had been trained, they were evaluated on a test set of one thousand stories corresponding to 17 causes. Different language models were compared to measure their ability to predict a cause from a story.
The main results indicate that it is possible to build language models that learn from the stories in claim entries, and that this can improve the ability to correctly classify these claims under a specific cause.
Hernán Sarmiento points out that "being able to use existing language models and adapt them to the stories that SUSESO has can in some way improve the classification, improving by more than 10 times on selecting a cause at random". He adds that "with this finding, we can say that the stories may indeed contain some kind of very domain-specific language that allows this classification to be improved automatically".
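To put the "more than 10 times" comparison in perspective, the short sketch below works out the arithmetic. It rests on assumptions that are not stated in the article: that the random baseline means choosing uniformly among the 17 causes of the test set, and that the improvement refers to accuracy. The function names are hypothetical, and the article does not report the actual accuracy obtained.

```python
from fractions import Fraction


def random_baseline_accuracy(num_classes: int) -> Fraction:
    """Expected accuracy of picking one of num_classes labels uniformly at random."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return Fraction(1, num_classes)


def accuracy_for_improvement(num_classes: int, factor: int) -> Fraction:
    """Accuracy needed to beat the uniform random baseline by `factor`, capped at 1."""
    return min(Fraction(1), random_baseline_accuracy(num_classes) * factor)


NUM_CAUSES = 17  # number of causes in the test set, as stated in the article
FACTOR = 10      # "more than 10 times"; assumed here to be a ratio of accuracies

baseline = random_baseline_accuracy(NUM_CAUSES)
threshold = accuracy_for_improvement(NUM_CAUSES, FACTOR)

print(f"Uniform random baseline: {float(baseline):.1%}")
print(f"Accuracy at 10x the baseline: {float(threshold):.1%}")
```

Under these assumptions, a uniform random guess would be right about 5.9% of the time, so a tenfold improvement would correspond to an accuracy of roughly 58.8% or more. If the causes are unevenly distributed, a random or majority-class baseline would differ from this figure.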
Innovation at the service of society
"This is about turning something that sometimes seems far away, such as artificial intelligence, into something concrete. This is where innovation comes in: improving the user experience of the people who come to knock on our door to resolve complaints. We are very pleased with the results of the project, because it can have an impact on citizens with a high technological level", said Pamela Gana, SUSESO's Superintendent.
The IMFD team behind this project is led by Hernán Sarmiento, IMFD Innovation Transfer Engineer, and includes data scientists Francisca Cona and Camila Henríquez and, as consultant, Jocelyn Dunstan, an academic at the Department of Computer Science UC in a position shared with the Institute of Computational Mathematical Engineering UC, and a researcher at AC3E and IMFD.
"It was a project that addresses a need and a problem that is common to citizens. Working with SUSESO allows us to address problems that really exist in society through scientific methodologies and experimentation, all with the aim of improving certain processes", Sarmiento emphasizes.
For the IMFD, this project is part of a series of initiatives that seek to apply advanced data science research developed in Chilean academia to problems that affect our society, in line with the objectives of the institute and its innovation department.
