Journal of Privacy and Confidentiality
https://journalprivacyconfidentiality.org/index.php/jpc

The Journal of Privacy and Confidentiality is an open-access, multi-disciplinary journal whose purpose is to facilitate the coalescence of research methodologies and activities in the areas of privacy, confidentiality, and disclosure limitation. The JPC seeks to publish a wide range of research and review papers, not only from academia, but also from government (especially official statistical agencies) and industry, and to serve as a forum for the exchange of views, discussion, and news.

Publisher: Cornell University, ILR School
Language: en-US
ISSN: 2575-8527

Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0, https://creativecommons.org/licenses/by-nc-nd/4.0/), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including the copyright notice, is given to the source.

Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.

A Unified Interpretation of the Gaussian Mechanism for Differential Privacy Through the Sensitivity Index
https://journalprivacyconfidentiality.org/index.php/jpc/article/view/807

The Gaussian mechanism (GM) represents a universally employed tool for achieving differential privacy (DP), and a large body of work has been devoted to its analysis. We argue that the three prevailing interpretations of the GM, namely (ε, δ)-DP, f-DP, and Rényi DP, can be expressed using a single parameter ψ, which we term the sensitivity index. ψ uniquely characterises the GM and its properties by encapsulating its two fundamental quantities: the sensitivity of the query and the magnitude of the noise perturbation. With strong links to the ROC curve and the hypothesis-testing interpretation of DP, ψ offers the practitioner a powerful method for interpreting, comparing, and communicating the privacy guarantees of Gaussian mechanisms.

Georgios Kaissis, Moritz Knolle, Friederike Jungmann, Alexander Ziller, Dmitrii Usynin, Daniel Rueckert
Copyright (c) 2022 Georgios Kaissis, Moritz Knolle, Friederike Jungmann, Alexander Ziller, Dmitrii Usynin, Daniel Rueckert (http://creativecommons.org/licenses/by-nc-nd/4.0)
Published: 2022-07-29 | Volume 12, Issue 1 | DOI: 10.29012/jpc.807
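The entry above gives no formulas, but a natural reading (our assumption, not stated in the abstract) is ψ = Δ₂/σ, the query's L2 sensitivity divided by the noise standard deviation. A minimal sketch, under that assumption and using the standard Gaussian-mechanism conversions from the DP literature, of how the single ratio yields all three interpretations:

```python
# Minimal sketch, assuming psi = Delta_2 / sigma (our reading of the
# abstract; this is not code from the paper).
import numpy as np
from scipy.stats import norm

def sensitivity_index(l2_sensitivity: float, sigma: float) -> float:
    """psi: the query's L2 sensitivity normalised by the noise scale."""
    return l2_sensitivity / sigma

def delta_for_epsilon(psi: float, eps: float) -> float:
    # (eps, delta)-DP view: the exact privacy profile of the Gaussian
    # mechanism, written in terms of psi alone.
    return norm.cdf(psi / 2 - eps / psi) - np.exp(eps) * norm.cdf(-psi / 2 - eps / psi)

def roc_power(psi: float, alpha: float) -> float:
    # f-DP / hypothesis-testing view: the best achievable true-positive
    # rate of a membership-inference test at false-positive rate alpha.
    return norm.cdf(norm.ppf(alpha) + psi)

def rdp_epsilon(psi: float, order: float) -> float:
    # Renyi-DP view: the mechanism satisfies (order, order * psi^2 / 2)-RDP.
    return order * psi ** 2 / 2

# Example: a query with L2 sensitivity 1 and noise sigma = 2 gives psi = 0.5.
psi = sensitivity_index(l2_sensitivity=1.0, sigma=2.0)
print(delta_for_epsilon(psi, eps=1.0), roc_power(psi, alpha=0.05), rdp_epsilon(psi, order=2.0))
```

Under this reading, two Gaussian mechanisms with equal ψ are indistinguishable in all three frameworks, which is what makes a single index useful for comparing and communicating guarantees.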
A Taxonomy of Attacks on Privacy-Preserving Record Linkage
https://journalprivacyconfidentiality.org/index.php/jpc/article/view/764

Record linkage is the process of identifying records that correspond to the same real-world entities across different databases. Due to the absence of unique entity identifiers, record linkage is often based on quasi-identifying values of entities (individuals), such as their names and addresses. However, regulatory, ethical, and legal obligations can limit the use of such personal information in the linkage process in order to protect the privacy and confidentiality of entities.

Privacy-preserving record linkage (PPRL) aims to develop techniques that enable the linkage of records without revealing any sensitive or confidential information about the entities that are represented by these records. Over the past two decades, various PPRL techniques have been proposed to securely link records between different databases by encrypting and/or encoding sensitive values. However, some PPRL techniques, such as the popular Bloom filter encoding, have been shown to be susceptible to privacy attacks. These attacks exploit the weaknesses of PPRL techniques by trying to reidentify encrypted and/or encoded sensitive values. In this paper, we propose a taxonomy for analysing such attacks on PPRL, categorising them across twelve dimensions, including the type of adversary, the type of attack, the knowledge assumed of the adversary, the vulnerabilities of encoded and/or encrypted values that an attack exploits, and how the success of an attack is assessed. Our taxonomy can be used by data custodians to analyse the privacy risks associated with different PPRL techniques in terms of existing as well as potential future attacks on PPRL.

Anushka Vidanage, Thilina Ranbaduge, Peter Christen, Rainer Schnell
Copyright (c) 2022 Anushka Vidanage, Thilina Ranbaduge, Peter Christen, Rainer Schnell (http://creativecommons.org/licenses/by-nc-nd/4.0)
Published: 2022-07-29 | Volume 12, Issue 1 | DOI: 10.29012/jpc.764

The Discrete Gaussian for Differential Privacy
https://journalprivacyconfidentiality.org/index.php/jpc/article/view/784

A key tool for building differentially private systems is adding Gaussian noise to the output of a function evaluated on a sensitive dataset. Unfortunately, using a continuous distribution presents several practical challenges. First and foremost, finite computers cannot exactly represent samples from continuous distributions, and previous work has demonstrated that seemingly innocuous numerical errors can entirely destroy privacy. Moreover, when the underlying data is itself discrete (e.g., population counts), adding continuous noise makes the result less interpretable.

With these shortcomings in mind, we introduce and analyze the discrete Gaussian in the context of differential privacy. Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise. We also present a simple and efficient algorithm for exact sampling from this distribution. This demonstrates its applicability for privately answering counting queries, or more generally, low-sensitivity integer-valued queries.

Clement Canonne, Gautam Kamath, Thomas Steinke
Copyright (c) 2022 Clement Canonne, Gautam Kamath, Thomas Steinke (http://creativecommons.org/licenses/by-nc-nd/4.0)
Published: 2022-07-29 | Volume 12, Issue 1 | DOI: 10.29012/jpc.784
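The abstract mentions the exact sampler but not its details. The following simplified rejection sampler is our floating-point illustration of the propose-and-accept idea (draw from a discrete Laplace, accept with a Gaussian-shaped probability); the published algorithm, as we understand it, works in exact integer arithmetic, so treat this as a sketch rather than the paper's sampler:

```python
# Hedged sketch: discrete Gaussian sampling by rejection from a discrete
# Laplace proposal.  Floating point is used for brevity, so this is
# illustrative rather than an exact sampler.
import numpy as np

def discrete_laplace(t: int, rng: np.random.Generator) -> int:
    # The difference of two i.i.d. geometric variables on {0, 1, ...}
    # with success probability p = 1 - exp(-1/t) has the discrete
    # Laplace law P[Y = y] proportional to exp(-|y| / t).
    p = 1.0 - np.exp(-1.0 / t)
    return (rng.geometric(p) - 1) - (rng.geometric(p) - 1)

def discrete_gaussian(sigma2: float, rng: np.random.Generator) -> int:
    # Rejection sampling: propose Y from a discrete Laplace with scale
    # t close to sigma, accept with probability
    # exp(-(|Y| - sigma2/t)^2 / (2 * sigma2)), which equals the ratio of
    # the discrete Gaussian target to the proposal up to a constant.
    t = int(np.floor(np.sqrt(sigma2))) + 1
    while True:
        y = discrete_laplace(t, rng)
        if rng.random() < np.exp(-((abs(y) - sigma2 / t) ** 2) / (2.0 * sigma2)):
            return y

# A noisy integer answer to a counting query, as in the abstract.
rng = np.random.default_rng(0)
print(42 + discrete_gaussian(sigma2=4.0, rng=rng))
```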
A Latent Class Modeling Approach for Differentially Private Synthetic Data for Contingency Tables
https://journalprivacyconfidentiality.org/index.php/jpc/article/view/768

We present an approach to construct differentially private synthetic data for contingency tables. The algorithm achieves privacy by adding noise to selected summary counts, e.g., two-way margins of the contingency table, via the geometric mechanism. We posit an underlying latent class model for the counts, estimate the parameters of the model based on the noisy counts, and generate synthetic data using the estimated model. This approach allows the agency to create multiple imputations of synthetic data with no additional privacy loss, thereby facilitating estimation of uncertainty in downstream analyses. We illustrate the approach using a subset of the 2016 American Community Survey Public Use Microdata Sample.

Michelle Nixon, Andres Barrientos, Jerome Reiter, Aleksandra Slavkovic
Copyright (c) 2022 Michelle Nixon, Andres Barrientos, Jerome Reiter, Aleksandra Slavkovic (http://creativecommons.org/licenses/by-nc-nd/4.0)
Published: 2022-07-29 | Volume 12, Issue 1 | DOI: 10.29012/jpc.768

Overlook: Differentially Private Exploratory Visualization for Big Data
https://journalprivacyconfidentiality.org/index.php/jpc/article/view/779

Data exploration systems that provide differential privacy must manage a privacy budget that measures the amount of privacy lost across multiple queries. One effective strategy for managing the privacy budget is to compute a one-time private synopsis of the data, to which users can make an unlimited number of queries. However, existing systems using synopses are built for offline use cases, where a set of queries is known ahead of time and the system carefully optimizes a synopsis for it. The synopses that these systems build are costly to compute and may also be costly to store.

We introduce Overlook, a system that enables private data exploration at interactive latencies for both data analysts and data curators. The key idea in Overlook is a virtual synopsis that can be evaluated incrementally, without extra storage or expensive precomputation. Overlook simply executes queries using an existing engine, such as a SQL DBMS, and adds noise to their results. Because Overlook's synopses do not require costly precomputation or storage, data curators can also use Overlook to explore the impact of privacy parameters interactively. Overlook offers a rich visual query interface based on the open-source Hillview system. Overlook achieves accuracy comparable to existing synopsis-based systems, while offering better performance and removing the need for extra storage.

Mihai Budiu, Pratiksha Thaker, Parikshit Gopalan, Udi Wieder, Matei Zaharia
Copyright (c) 2022 Mihai Budiu, Pratiksha Thaker, Parikshit Gopalan, Udi Wieder, Matei Zaharia (http://creativecommons.org/licenses/by-nc-nd/4.0)
Published: 2022-07-29 | Volume 12, Issue 1 | DOI: 10.29012/jpc.779
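A toy version of the execute-then-perturb pattern described above (run the query on an ordinary engine, then noise the result) might look as follows. This is our sketch, not Overlook's code: the table and column names are invented, Laplace noise on a count stands in for whatever mechanism Overlook applies, and a real deployment must also track the budget spent across queries.

```python
# Illustrative sketch of noising the result of a query run on an
# existing SQL engine; not taken from the Overlook system.
import sqlite3
import numpy as np

rng = np.random.default_rng()

def noisy_count(conn: sqlite3.Connection, sql: str, epsilon: float) -> float:
    # A COUNT(*) query has sensitivity 1, so Laplace noise with scale
    # 1/epsilon makes this single answer epsilon-DP.
    (true_count,) = conn.execute(sql).fetchone()
    return true_count + rng.laplace(scale=1.0 / epsilon)

# Hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (age INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?)", [(a,) for a in range(18, 80)])
print(noisy_count(conn, "SELECT COUNT(*) FROM visits WHERE age >= 65", epsilon=0.5))
```

Because nothing is precomputed, each histogram bucket of a visualization can be answered on demand in this way, which matches the abstract's description of a synopsis that is evaluated incrementally rather than stored.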