
IMFD and CENIA student team wins first place in NLP competition DIPROMATS 2024

July 2024.- A group of four students, led by Marcelo Mendoza, DCC UC academic and IMFD and Cenia researcher, won first place in DIPROMATS 2024, the shared task on propaganda and narrative detection at the 2024 Iberian Languages Evaluation Forum (IberLEF).

The team, made up of Miguel Fernández (IMFD and PhD DCC UC), Maximiliano Ojeda (IMFD and PhD DCC UC), Lily Guevara (CENIA RL5 and USM Engineering) and Diego Varela (CENIA RL5 and USM Engineering), obtained the best results in one of the task's two categories. The task consisted of developing systems capable of detecting and characterizing propagandistic content in tweets written in English or Spanish by authorities from the US, Europe, Russia and China.

Propaganda

The deceptive intent of propaganda may be less obvious and more damaging than that of disinformation. Its content need not be false, and its effects may only become perceptible over time. The abuse of propagandistic content in the information ecosystem manipulates public opinion, which can seriously harm the democratic system.

“Propagandistic content is understood as a message deliberately designed to influence a certain audience. It is constructed on purpose and can jeopardize democratic discussion on certain issues: for example, it can distort what foreigners really do in Chile, it can affect minorities, and it slowly erodes democracy,” says Miguel Fernández, from IMFD and a PhD student at DCC UC. This is why the phenomenon is of particular research interest and was selected by DIPROMATS for this competition.

Miguel Fernández (IMFD and PhD student, DCC UC)

“In this challenge we used techniques based on artificial intelligence and natural language processing to detect the use of persuasive language and propaganda techniques in text,” explains Marcelo Mendoza, IMFD researcher.

Marcelo Mendoza, academic at the UC Department of Computer Science and IMFD and Cenia researcher.

“We trained a model based on Transformer technology, an architecture that allows machine learning models to understand text, and we also used techniques such as data augmentation,” explains Diego Varela, from CENIA RL5 and USM Engineering.

Diego Varela (CENIA RL5 and USM Engineering)

These techniques make it possible to perform analyses that would otherwise be impossible to carry out, given the large amount of information that must be handled. “As humans, we may have very few resources, or not all of us have the resources, to identify this type of phenomenon, but thanks to these technologies we can achieve it,” highlights Lily Guevara, from CENIA RL5 and USM Engineering.

Lily Guevara (CENIA RL5 and USM Engineering)

Transformers are useful for identifying patterns in text, and the way to identify propaganda is to examine how language behaves within the text: what comes before, what comes after, and which words go together, that is, the use of language as a whole. That is why they are among the most widely used techniques today, explains Maximiliano Ojeda, from IMFD and DCC UC.
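
As a rough illustration of how a Transformer relates each word to the words around it, the following minimal sketch computes scaled dot-product self-attention over a short phrase. It is not the team's system: the tokens, the three-dimensional embeddings and the function names are hypothetical, and the learned query, key and value projections of a real Transformer are replaced here by the raw vectors themselves.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values.

    Assumption: a real Transformer applies learned projections first;
    this sketch skips them to keep the example short.
    """
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Similarity of this word to every word in the phrase, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # New representation: context words mixed in proportion to the weights.
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs


# Hypothetical toy embeddings, chosen by hand for illustration only.
tokens = ["they", "threaten", "our", "freedom"]
embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.8],
]

weights, contextual = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word draws on every word in the phrase. With these hand-picked vectors, "threaten" gives more weight to "freedom" than to "they", which is the kind of word-to-word relationship the quote refers to.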

DIPROMATS 2024 is a challenge organized by the Natural Language Processing and Information Retrieval Research Group of the National University of Distance Education of Spain (UNED), in which the research community defined new research challenges and proposed tasks to advance the state of the art in natural language processing (NLP).