How Can We Analyze Differentially-Private Synthetic Datasets?

Anne-Sophie Charest

doi:10.29012/jpc.v2i2.589

Published: Apr 1, 2011

Keywords:

synthetic datasets, differential privacy, beta-binomial synthetizer

Anne-Sophie Charest

Department of Statistics, Carnegie Mellon University

Abstract

Synthetic datasets generated within the multiple imputation framework are now commonly used by statistical agencies to protect the confidentiality of their respondents. More recently, researchers have also proposed techniques to generate synthetic datasets that offer the formal guarantee of differential privacy. While combining rules have been derived for the first type of synthetic datasets, little has been said about the analysis of differentially-private synthetic datasets generated with multiple imputations. In this paper, we show that the usual combining rules cannot be used to analyze synthetic datasets that have been generated to achieve differential privacy. We consider specifically the case of generating synthetic count data with the beta-binomial synthetizer, and illustrate our discussion with simulation results. As a simple alternative, we also propose a Bayesian model that explicitly models the mechanism used to generate the synthetic data.

How to Cite

Charest, Anne-Sophie. 2011. “How Can We Analyze Differentially-Private Synthetic Datasets?”. Journal of Privacy and Confidentiality 2 (2). https://doi.org/10.29012/jpc.v2i2.589.

Issue

Vol. 2 No. 2 (2011)

Section

Articles

Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.

Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.

Funding data

Army Research Office
Grant numbers DAAD19-02-1-0389; W911NF-09-1-0273
National Science Foundation
Grant numbers BCS0941518
