David Darais: Data Privacy by Programming Language Design – Instituto Milenio Fundamentos de los Datos

IMFD TECHNICAL TALK: Data Privacy by Programming Language Design

ABSTRACT: Data privacy is a growing concern for individuals, businesses, governments, and organizations. For example, some companies actively sell private information to third parties without customer consent, provoking increasing privacy concerns. At the same time, personal data can produce positive effects in society: companies perform many useful services for customers based on private data, government entities use personal information for the public good, and medical researchers use patient data to perform important research.

The state of the art in privacy protection for individuals is differential privacy, which enables the statistical analysis of data with a mathematical guarantee of privacy. Successful differential privacy approaches have been developed for aggregate statistics, database queries, and convex machine learning.
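
For reference, the mathematical guarantee mentioned above is usually stated as follows. This is the standard textbook definition of pure differential privacy, not a formula taken from the talk; the symbols M (a randomized mechanism), D and D' (datasets differing in one individual's record), S (a set of possible outputs), and ε (the privacy parameter) are the conventional notation and are assumptions here.

```latex
% A randomized mechanism M satisfies \varepsilon-differential privacy if,
% for all neighboring datasets D, D' and all sets of outputs S:
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

A smaller ε means the output distribution changes less when any one individual's data is added or removed, which corresponds to a stronger privacy guarantee.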

To achieve differential privacy, random noise is (typically) introduced during the manipulation of data, which reduces the accuracy of the results. In some cases, random noise is not enough, and more aggressive techniques, such as data clipping, must be used. An ongoing challenge in differentially private algorithm design is to balance privacy guarantees against the accuracy of results.
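
As a rough illustration of how noise and clipping work together, the following minimal Python sketch computes a differentially private sum using the Laplace mechanism. It is not Duet code and is not taken from the talk; the function names (`laplace_noise`, `clip`, `private_sum`) and the example data are hypothetical, and the sensitivity bound assumes that neighboring datasets differ by adding or removing one record.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample from Laplace(0, scale) as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip(x: float, lo: float, hi: float) -> float:
    """Force x into the interval [lo, hi]."""
    return max(lo, min(hi, x))


def private_sum(values, lo: float, hi: float, epsilon: float,
                rng: random.Random) -> float:
    """Noisy sum of `values` after clipping each one to [lo, hi].

    Assumption: neighboring datasets differ by one added or removed
    record, so one record changes the clipped sum by at most
    max(|lo|, |hi|). Laplace noise with scale sensitivity / epsilon
    then gives epsilon-differential privacy for this query.
    """
    clipped = [clip(v, lo, hi) for v in values]
    sensitivity = max(abs(lo), abs(hi))
    return sum(clipped) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 35, 41, 29, 120]  # hypothetical data; 120 is clipped to 100
    print(private_sum(ages, lo=0, hi=100, epsilon=1.0, rng=rng))
```

Without clipping, a single extreme value could change the sum by an unbounded amount, so no finite amount of noise would hide it. Clipping bounds that influence, at the cost of biasing the result, which is one concrete form of the privacy/accuracy trade-off described above.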

We present Duet, a general-purpose programming language for enforcing differential privacy. Duet consists of two mutually embedded programming languages and uses a multi-tiered analysis to automatically provide state-of-the-art bounds on privacy leakage for any program written in the language. In case studies, we demonstrate the effectiveness of Duet by implementing differentially private convex machine learning algorithms and empirically comparing the accuracy of the trained models against their non-private counterparts. In future work, we aim to achieve differentially private training of non-convex machine learning models (e.g., neural networks) with high accuracy, an unsolved challenge in provably private algorithm design.

BIO: David Darais is an Assistant Professor at the University of Vermont. David’s research focuses on tools for achieving reliable software in critical, security-sensitive, and privacy-sensitive systems. David received his BS, MS, and Ph.D. from the University of Utah, Harvard University, and the University of Maryland, respectively. http://david.darais.com/

VENUE: Auditorio Ramón Picarte, DCC U. de Chile, Beauchef 851, Edificio Poniente, third floor, Santiago.

DATE/TIME: Wednesday, May 15, 12:00 to 13:00.

INQUIRIES: comunicaciones-imfd@imfd.cl