Journal of Privacy and Confidentiality 2021-12-24T09:27:13-08:00 Lars Vilhuber Open Journal Systems
<p>The <em>Journal of Privacy and Confidentiality</em> is an open-access multi-disciplinary journal whose purpose is to facilitate the coalescence of research methodologies and activities in the areas of privacy, confidentiality, and disclosure limitation. The JPC seeks to publish a wide range of research and review papers, not only from academia, but also from government (especially official statistical agencies) and industry, and to serve as a forum for exchange of views, discussion, and news.</p>

Synthetic Data Generation with Differential Privacy via Bayesian Networks 2021-02-28T14:19:06-08:00 Ergute Bao Xiaokui Xiao Jun Zhao Dongping Zhang Bolin Ding
<p>This paper describes PrivBayes, a differentially private method for generating synthetic datasets that was used in the 2018 Differential Privacy Synthetic Data Challenge organized by NIST.</p>
2021-12-24T00:00:00-08:00 Copyright (c) 2021 Ergute Bao, Xiaokui Xiao, Jun Zhao, Dongping Zhang, Bolin Ding

Winning the NIST Contest: A scalable and general approach to differentially private synthetic data 2021-03-09T08:32:41-08:00 Ryan McKenna Gerome Miklau Daniel Sheldon
<p>We propose a general approach for differentially private synthetic data generation that consists of three steps: (1) <strong>select</strong> a collection of low-dimensional marginals, (2) <strong>measure</strong> those marginals with a noise addition mechanism, and (3)&nbsp;<strong>generate</strong> synthetic data that preserves the measured marginals well. Central to this approach is Private-PGM, a post-processing method that is used to estimate a high-dimensional data distribution from noisy measurements of its marginals. We present two mechanisms, NIST-MST and MST, that are instances of this general approach.
NIST-MST was the winning mechanism in the 2018 NIST differential privacy synthetic data competition, and MST is a new mechanism that can work in more general settings while still performing comparably to NIST-MST. We believe our general approach should be of broad interest and can be adopted in future mechanisms for synthetic data generation.</p>
2021-12-24T00:00:00-08:00 Copyright (c) 2021 Ryan McKenna, Gerome Miklau, Daniel Sheldon

LinkedIn's Audience Engagements API 2021-08-17T07:06:12-07:00 Ryan Rogers Subbu Subramaniam Sean Peng David Durfee Seunghyun Lee Santosh Kumar Kancha Shraddha Sahay Parvez Ahammad
<p>We present a privacy system that leverages differential privacy to protect LinkedIn members' data while also providing audience engagement insights that enable marketing analytics applications. We detail the differentially private algorithms and other privacy safeguards used to provide results that can be used with existing real-time data analytics platforms, specifically with the open-source Pinot system. Our privacy system provides user-level privacy guarantees. As part of our privacy system, we include a budget management service that enforces a strict differential privacy budget on the results returned to the analyst. This budget management service brings the latest research in differential privacy into a product to maintain utility given a fixed differential privacy budget.</p>
2021-12-24T00:00:00-08:00 Copyright (c) 2021 Ryan Rogers, Subbu Subramaniam, Sean Peng, David Durfee, Seunghyun Lee, Santosh Kumar Kancha, Shraddha Sahay, Parvez Ahammad

Differentially Private Set Union 2021-08-25T09:08:47-07:00 Sivakanth Gopi Pankaj Gulhane Janardhan Kulkarni Judy Hanwen Shen Milad Shokouhi Sergey Yekhanin
<p>We study the basic operation of set union in the global model of differential privacy. In this problem, we are given a universe $U$ of items, possibly of infinite size, and a database $D$ of users.
Each user $i$ contributes a subset $W_i \subseteq U$ of items. We want an $(\epsilon,\delta)$-differentially private algorithm which outputs a subset $S \subset \cup_i W_i$ such that the size of $S$ is as large as possible. The problem arises in countless real-world applications; it is particularly common in natural language processing (NLP) applications in the form of vocabulary extraction. For example, discovering words, sentences, $n$-grams, etc., from private text data belonging to users is an instance of the set union problem. Known algorithms for this problem proceed by collecting a subset of items from each user, taking the union of such subsets, and disclosing the items whose noisy counts fall above a certain threshold. Crucially, in the above process, the contribution of each individual user is always independent of the items held by other users, resulting in a wasteful aggregation process in which some item counts end up far above the threshold. We deviate from the above paradigm by allowing users to contribute their items in a <em>dependent fashion</em>, guided by a <em>policy</em>. In this new setting, ensuring privacy is considerably more delicate. We prove that any policy which has certain <em>contractive</em> properties results in a differentially private algorithm. We design two new algorithms for differentially private set union, one using Laplace noise and the other Gaussian noise, which use $\ell_1$-contractive and $\ell_2$-contractive policies respectively, and we provide concrete examples of such policies. Our experiments show that the new algorithms in combination with our policies significantly outperform previously known mechanisms for the problem.</p>
2021-12-24T00:00:00-08:00 Copyright (c) 2021 Sivakanth Gopi, Pankaj Gulhane, Janardhan Kulkarni, Judy Hanwen Shen, Milad Shokouhi, Sergey Yekhanin
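
The "known algorithms" that the Differentially Private Set Union abstract describes (collect a bounded subset from each user, count, add noise, release items above a threshold) can be illustrated with a minimal sketch. This is not the authors' code and not their policy-based method. The names `threshold_set_union`, `max_contrib`, and `threshold` are hypothetical, and the threshold is left as a caller-supplied parameter: choosing it so that the output actually satisfies $(\epsilon,\delta)$-differential privacy is outside the scope of this sketch.

```python
import random
from collections import Counter


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def threshold_set_union(user_items, max_contrib, epsilon, threshold, rng):
    """Baseline sketch: independent contributions, noisy counts, threshold.

    `threshold` is an assumption supplied by the caller; this function
    does NOT calibrate it to any delta.
    """
    counts = Counter()
    for items in user_items:
        # Each user contributes at most `max_contrib` distinct items,
        # chosen without looking at any other user's items.
        distinct = sorted(set(items))
        chosen = rng.sample(distinct, min(max_contrib, len(distinct)))
        counts.update(chosen)

    # Adding or removing one user changes the count vector by at most
    # `max_contrib` in L1 norm, so Laplace noise uses this scale.
    scale = max_contrib / epsilon

    # Only items that at least one user contributed are ever considered.
    return {
        item
        for item, count in counts.items()
        if count + laplace(scale, rng) > threshold
    }


users = [["the", "cat"], ["the", "dog"], ["the"], ["the", "cat"]]
released = threshold_set_union(
    users, max_contrib=2, epsilon=1.0, threshold=3.0, rng=random.Random(0)
)
print(released)
```

Because each user samples items without regard to what others hold, a very common item such as `"the"` can accumulate a count far above the threshold. That surplus is the waste the abstract refers to, and it is what the contractive, dependent policies are meant to reduce.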