SIGMOD/PODS 2024: Chile will host the most important data science meeting in the world

Santiago, Chile. In June 2024, the eyes of data science specialists worldwide will be on our country. Over six days of intensive work, the SIGMOD/PODS 2024 conference will bring together in Santiago more than 800 scientists from universities, research centers and companies from Chile and abroad, who will address, at both a theoretical and an applied level, the challenges of managing large volumes of data.

"It is not by chance that Chile was chosen as the venue. For several decades now, the Chilean community has had an important presence at this conference, with at least 10 researchers living in the country who regularly present their scientific results at SIGMOD/PODS. Several of these papers have received important awards at this meeting", says Pablo Barceló, director of the Institute of Mathematical and Computational Engineering of the Catholic University of Chile (IMC UC), researcher at the Millennium Institute Foundational Research on Data (IMFD) and the National Center for Artificial Intelligence (Cenia), and co-director of the local organizing committee of SIGMOD/PODS 2024.

Nayat Sánchez, director of INRIA Chile and co-director of the SIGMOD/PODS 2024 organizing committee, emphasizes that "the effort to bring this SIGMOD/PODS to Chile has not been minor and the fact that we have achieved it is a recognition of the work of scientists who are considered leaders in the region and the world. That prestige earned over the years has made our country host this conference, a milestone that will allow us to further promote Chile's positioning in the data area".

Barceló adds that "it is from Chile that research has emerged that today makes it possible, for example, to extract information to understand behavior in social networks in just seconds".

SIGMOD/PODS is also attended by experts from technology companies, such as Amazon, Apple, Huawei, Microsoft, Google, Alibaba and Oracle, that not only closely follow advances in data science but also carry out their own applied research. These companies are also official sponsors of the conference.

SCIENCE MADE IN CHILE AT SIGMOD/PODS 2024

Of the four keynote lectures at the event, two will be given by Chileans. Ricardo Baeza-Yates, full professor at the University of Chile and senior researcher at the Millennium Institute Foundational Research on Data (IMFD), will address the challenges and limitations in the fields of data and machine learning. Marcelo Arenas, full professor at the Catholic University of Chile and associate researcher at the Millennium Institute Foundational Research on Data (IMFD), will talk about his recent work on how to provide explanations for decisions made by artificial intelligence models.

The results of innovative applications and studies carried out by Chilean scientists will also be presented. The Chilean works focus on graph databases, models that store interconnected information, which allows faster and more accurate results when making queries.

One of the presentations will show advances that seek to improve search systems on graph databases, so that they yield information that is not only accurate but also richer in depth and nuance. The researchers are Diego Arroyuelo and Juan Reutter (P. Universidad Católica de Chile, IMFD); Benjamín Bustos, Aidan Hogan and Gonzalo Navarro (Universidad de Chile, IMFD); and Adrián Gómez-Brandón (Universidade da Coruña, Spain, IMFD).

MillenniumDB, a new search engine for graph databases that has already proven to be two to ten times faster than other systems currently in use (such as those of Amazon or Neo4j), will also be presented. Its development involved 14 academics, researchers and engineers, and was led by Domagoj Vrgoč and Carlos Rojas (P. Universidad Católica de Chile, IMFD).

Aidan Hogan (Universidad de Chile, IMFD) and Domagoj Vrgoč (P. Universidad Católica de Chile, IMFD) will lead a tutorial session in which they will show how to query large-scale graph databases.

Another innovation to be presented is REmatch, a tool that extracts from text documents the information matching a given pattern. REmatch was developed by Cristian Riveros and Domagoj Vrgoč, together with Vicente Calisto, Gustavo Toro and Nicolás Van Sint Jan, all from P. Universidad Católica de Chile, and Kyle Bossonney (Oxford University).

THE NOBELS OF COMPUTING

In the long history of SIGMOD/PODS, advances have been presented that today are essential for the existence of online commerce, search engines, social networks and artificial intelligence. So relevant are the innovations presented at this conference that, in its five decades, four scientists have received the Turing Award, also known as the "Nobel Prize for Computing".

In the 1970s, in the early years of the conference, efforts were focused on implementing the fundamental work done by Edward F. Codd, the creator of relational databases. These systems are used by all industries and sectors that handle large volumes of information: banking, online shopping systems, health records, retail inventory management, and many, many more. In the world of data science, Codd set a milestone and his research earned him the 1981 Turing Award.

In the 1980s, another researcher, Jim Gray, furthered Codd's research by addressing the problem of database integrity. His work was key to the use of mechanisms that allowed information to be consulted by multiple users at the same time and is present today in applications such as online banking transaction processing and e-commerce. Gray received the Turing Award in 1998.

The most recent laureate is Michael Stonebraker, who delved into the area of databases, creating new management systems that revolutionized the market, such as Postgres, which allows connecting more complex and diverse information. His innovations are being applied by companies and services such as Instagram, TripAdvisor, Uber and Spotify, among many others. For his innovation in this field, Stonebraker received the Turing Award in 2015.

THE IMPACT OF THE WEB AND SOCIAL NETWORKS

The following decades brought more illustrious names. In the 1990s, the emergence of the World Wide Web generated new fields of research, such as the creation of systems for extracting and exchanging information from large-scale data, where millions of users converged. In this field, the work of Mexican researcher Héctor García-Molina of Stanford University (USA) stands out. He mentored the project of two doctoral students who revolutionized the Internet: Sergey Brin and Larry Page, creators of the Google search engine.

"The 2000s is the decade where the term Big Data was coined and the challenges involved in its management were identified, such as volume of information, speed of production and diversity of data. This drives work on new methods and systems for fields such as astronomy or DNA sequencing," explains Pablo Barceló.

From 2010 onwards, the rise of powerful artificial intelligence (AI) algorithms built on large data repositories has led part of the community of scientists gathered around SIGMOD/PODS to focus on the great challenge of bias in AI.

"There is an urgent need to develop research in the use of data and AI in a responsible manner, addressing the risks of manipulating large volumes of information to make decisions. Many of the papers presented at SIGMOD/PODS address these challenges in a theoretical and applied manner," concludes Nayat Sánchez.