IMFD Researchers’ Work Awarded at SIGMOD/PODS

June 2023 – Every year, the Association for Computing Machinery (ACM), founded in 1947 as the first scientific and educational society in the field of computing, organizes the SIGMOD/PODS Conference. The event is considered one of the most important international forums in data management, where researchers gather to explore new ideas, results, techniques, and experiences. The conference also recognizes the best papers presented there. In the 2023 edition, held from June 18 to 23 in Seattle (USA), one of these awards went to a study co-authored by Domagoj Vrgoč, an academic from the Institute of Mathematical and Computational Engineering at the Pontifical Catholic University of Chile and a researcher at the Millennium Institute Foundational Research on Data (IMFD), and Renzo Angles, an academic from the Department of Computer Science at the University of Talca and an IMFD researcher.

According to the organizers, the study titled “PG-Schema: Schemas for Property Graphs” was chosen as the “Best Industrial Track Paper” for its exceptional quality, originality, and contribution to the field of graph databases. The study is the result of collaboration among researchers from academic institutions such as the University of Warsaw (Poland), the University of Bayreuth (Germany), and the University of Edinburgh (Scotland), and from companies such as Amazon Web Services, TigerGraph, Neo4j, and RelationalAI, among others.

“SIGMOD/PODS is one of the largest conferences in the world and one of the most prestigious in the field of databases. It brings together nearly 2,000 participants each year. It has a section called SIGMOD, which covers the more practical aspects, and PODS, which focuses on the more theoretical aspects,” says Domagoj Vrgoč, who holds a PhD in computer science from the University of Edinburgh. Regarding the paper itself, he emphasizes that the award recognizes the study as “work with significant impact on the industry and collaboration with industry professionals. In fact, the paper has more than 20 authors, representing a large collaboration that took considerable time and addresses a specific problem in the field.”

Renzo Angles comments that this article was developed by members of the “Property Graph Schema Working Group” within the Linked Data Benchmark Council (LDBC). “In mid-2019, we began discussing the characteristics of graph-based data models and the absence of a standard way to represent their structure or schema. In this regard, the article proposes a formalism to specify schemas for property graphs, that is, a language that allows for precise description of the types of nodes, edges, and properties existing in a graph-based database, as well as specifying simple and complex constraints on these types and their relationships.”

The Potential of the Study

Graph databases are a significant branch of the database field, in which data is modeled as entities and the relationships between them. “Each entity you want to represent, such as a person, a city, or a workplace, becomes a node in your graph. And when you want to connect data, you create edges that indicate the relationships between different entities. This means that the model doesn’t have a fixed structure; when you want to add a new entity, you simply connect it through edges,” explains Vrgoč.

This means that a fixed structure is not required, unlike in relational databases, the more classical area of research. “Graph databases do not have a schema, which is understood as a description that tells you ‘everything looks like this,’ and is strongly present in the world of relational databases,” comments the academic. He adds that in a graph database there may be “nodes representing people, but some include only name and country, while others show only a name and age. That’s why it is not necessary for everything to be structured. In contrast, in relational databases, everything must have the same attributes.”
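The flexibility described above can be illustrated with a minimal Python sketch. The `Node` and `Edge` classes, the labels, and the sample data are all hypothetical and chosen only for illustration; they do not come from the paper or from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


# Two "Person" nodes with different property sets, as in the example
# from the article: one has name and country, the other name and age.
nodes = [
    Node("p1", "Person", {"name": "Ana", "country": "Chile"}),
    Node("p2", "Person", {"name": "Luis", "age": 41}),
    Node("c1", "City", {"name": "Talca"}),
]

# Adding a new entity only requires connecting it through edges.
edges = [
    Edge("p1", "c1", "LIVES_IN"),
    Edge("p1", "p2", "KNOWS", {"since": 2019}),
]


def property_keys_by_label(nodes):
    """Collect the distinct sets of property names seen for each label."""
    shapes = {}
    for node in nodes:
        shapes.setdefault(node.label, set()).add(frozenset(node.properties))
    return shapes


shapes = property_keys_by_label(nodes)
```

Here `shapes["Person"]` contains two different sets of property names, something a relational table with fixed columns would not allow without null values or extra tables.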

According to Vrgoč, this flexibility gives graph databases many advantages but can also pose a problem “when dealing with very large knowledge graphs where a schema describing the type of data is needed.”

The study helps fill that gap. “The paper is called PG-Schema because it refers to a language that allows defining the schema for a graph database format widely used in the industry called ‘property graphs.’ The work is, in fact, a language that compactly describes what kind of data I have in my database without showing all that data. It is based on a certain syntax, develops a semantics, and facilitates that definition.”
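To give a rough idea of what a schema contributes, the following Python sketch describes node and edge types compactly and checks a small graph against that description. It is only an illustration under stated assumptions: the dictionaries `NODE_TYPES` and `EDGE_TYPES`, the function `check_graph`, and the sample data are hypothetical, and this is neither the PG-Schema syntax nor its semantics.

```python
# Hypothetical schema description: NOT PG-Schema syntax.
# Each node label lists required and optional properties with their types.
NODE_TYPES = {
    "Person": {"required": {"name": str}, "optional": {"age": int, "country": str}},
    "City": {"required": {"name": str}, "optional": {}},
}

# Each edge label lists the expected (source label, target label) pair.
EDGE_TYPES = {
    "LIVES_IN": ("Person", "City"),
}


def check_graph(nodes, edges):
    """Return a list of human-readable schema violations.

    nodes: dict mapping node id -> (label, properties dict)
    edges: list of (source id, edge label, target id)
    """
    problems = []
    for node_id, (label, props) in nodes.items():
        node_type = NODE_TYPES.get(label)
        if node_type is None:
            problems.append(f"{node_id}: unknown label {label}")
            continue
        allowed = {**node_type["required"], **node_type["optional"]}
        for key in node_type["required"]:
            if key not in props:
                problems.append(f"{node_id}: missing property {key}")
        for key, value in props.items():
            if key not in allowed:
                problems.append(f"{node_id}: unexpected property {key}")
            elif not isinstance(value, allowed[key]):
                problems.append(f"{node_id}: property {key} has wrong type")
    for src, label, dst in edges:
        expected = EDGE_TYPES.get(label)
        if expected is None:
            problems.append(f"edge {src}-[{label}]->{dst}: unknown edge label")
            continue
        if src not in nodes or dst not in nodes:
            problems.append(f"edge {src}-[{label}]->{dst}: dangling endpoint")
            continue
        actual = (nodes[src][0], nodes[dst][0])
        if actual != expected:
            problems.append(
                f"edge {src}-[{label}]->{dst}: expected "
                f"{expected[0]} -> {expected[1]}, found {actual[0]} -> {actual[1]}"
            )
    return problems


nodes = {
    "p1": ("Person", {"name": "Ana", "country": "Chile"}),
    "p2": ("Person", {"age": 41}),  # violates the schema: no name
    "c1": ("City", {"name": "Talca"}),
}
edges = [
    ("p1", "LIVES_IN", "c1"),
    ("c1", "LIVES_IN", "p2"),  # violates the schema: wrong direction
]

problems = check_graph(nodes, edges)
```

The two dictionaries describe what kind of data the graph may contain without listing the data itself, which is the role the article attributes to a schema. The language in the paper is described as supporting both simple and complex constraints, which this sketch does not attempt to cover.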

Vrgoč’s contribution to the paper consisted of establishing a grammar for that language: “The work I did with a subgroup of that international team, mainly with Filip Murlak (University of Warsaw, Poland) and Wim Martens (University of Bayreuth, Germany), was to design a base language that allows describing what I have in a node, what I can have in an edge, how they are linked, how my graph looks in general. Then, with the rest of the team, we developed several extensions that ultimately led to this language.”

“The development of the article took quite some time because there was a discussion between the practical needs expressed by industry members of the group and the theoretical foundations proposed by members from academia. The final result is a schema specification language that allows for representing graph schemas with different types of constraints while respecting important theoretical conditions,” says Renzo Angles, who is confident that the article will have an impact on the development of schema specification languages for graph-based database systems.

The researchers hope that, due to its potential, this language will be incorporated into a new ISO standard for graph query languages. “Our work is a proposal that provides input to the group defining that standard, but it is not yet something established in the industry. For now, it is a research work,” states Domagoj Vrgoč.

The authors of the study are: Renzo Angles (University of Talca); Angela Bonifati (Univ. of Lyon); Stefania Dumbrava (ENSIIE); George Fletcher (Eindhoven University of Technology, the Netherlands); Alastair Green (Mr); Jan Hidders (Birkbeck, University of London); Bei Li (Google); Leonid Libkin (University of Edinburgh & RelationalAI); Victor Marsault (UPEM / CNRS); Wim Martens (University of Bayreuth); Filip Murlak (University of Warsaw, Poland); Stefan Plantikow (Neo4j); Ognjen Savkovic (Free University of Bozen-Bolzano); Michael Schmidt (Amazon Web Services); Juan Sequeda (data.world); Sławek Staworko (RelationalAI); Dominik Tomaszuk (University of Bialystok); Hannes Voigt (Neo4j); Domagoj Vrgoč (Pontifical Catholic University of Chile); Mingxi Wu (TigerGraph Inc); Dušan Živković (Integral Data Solutions).

 

The event took place from June 18 to 23, 2023, in Seattle, USA.