Winning the NIST Contest: A scalable and general approach to differentially private synthetic data

Ryan McKenna
Gerome Miklau
https://orcid.org/0000-0003-1369-9239
Daniel Sheldon
https://orcid.org/0000-0002-4257-2432

Abstract

We propose a general approach for differentially private synthetic data generation that consists of three steps: (1) select a collection of low-dimensional marginals, (2) measure those marginals with a noise-addition mechanism, and (3) generate synthetic data that preserves the measured marginals well. Central to this approach is Private-PGM, a post-processing method that estimates a high-dimensional data distribution from noisy measurements of its marginals. We present two mechanisms, NIST-MST and MST, that are instances of this general approach. NIST-MST was the winning mechanism in the 2018 NIST differential privacy synthetic data competition, and MST is a new mechanism that works in more general settings while still performing comparably to NIST-MST. We believe our general approach should be of broad interest and can be adopted in future mechanisms for synthetic data generation.
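
The three steps can be illustrated with a toy sketch. This is not NIST-MST, MST, or Private-PGM: the function names (`select_marginals`, `measure`, `generate`, `synthesize`) are hypothetical, step 1 simply takes every one-way marginal, and step 3 is a naive stand-in that samples each attribute independently from its noisy marginal. The noise scale `sigma` is a free parameter here; calibrating it to a privacy budget is assumed to happen elsewhere.

```python
import random
from collections import Counter


def select_marginals(domain):
    # Step 1 (placeholder): use every one-way marginal.
    # A real mechanism would choose which marginals to measure.
    return [(attr,) for attr in domain]


def measure(records, attrs, domain, sigma, rng):
    # Step 2: exact cell counts for the marginal, plus Gaussian noise
    # of standard deviation sigma added independently to each cell.
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    cells = [()]
    for a in attrs:
        cells = [c + (v,) for c in cells for v in domain[a]]
    return {c: counts[c] + rng.gauss(0.0, sigma) for c in cells}


def generate(measurements, n, rng):
    # Step 3 (naive stand-in, NOT Private-PGM): clip negative noisy
    # counts to zero and sample each attribute independently.
    columns = {}
    for attrs, noisy in measurements.items():
        (attr,) = attrs
        values = [cell[0] for cell in noisy]
        weights = [max(w, 0.0) for w in noisy.values()]
        if sum(weights) == 0:
            weights = [1.0] * len(values)  # fall back to uniform
        columns[attr] = rng.choices(values, weights=weights, k=n)
    return [{a: columns[a][i] for a in columns} for i in range(n)]


def synthesize(records, domain, sigma, n, seed=0):
    # Select, measure, generate. Only the noisy measurements reach step 3.
    rng = random.Random(seed)
    measurements = {
        m: measure(records, m, domain, sigma, rng)
        for m in select_marginals(domain)
    }
    return generate(measurements, n, rng)
```

Because step 3 reads only the noisy measurements, it is pure post-processing of step 2's output. Independent sampling discards all correlations between attributes, which is the gap that measuring higher-order marginals and fitting a joint distribution to them is meant to close.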

Article Details

How to Cite
McKenna, Ryan, Gerome Miklau, and Daniel Sheldon. 2021. “Winning the NIST Contest: A Scalable and General Approach to Differentially Private Synthetic Data”. Journal of Privacy and Confidentiality 11 (3). https://doi.org/10.29012/jpc.778.
Section
Privacy Challenges