A Practical Method to Reduce Privacy Loss When Disclosing Statistics Based on Small Samples

Raj Chetty; John N Friedman

doi:10.29012/jpc.716

Published: Oct 22, 2019

Keywords: Differential Privacy, Administrative Data

Raj Chetty

Harvard University and NBER

John N Friedman

Brown University and NBER

Abstract

We develop a simple method to reduce privacy loss when disclosing statistics such as OLS regression estimates based on samples with small numbers of observations. We focus on the case where the dataset can be broken into many groups (“cells”) and one is interested in releasing statistics for one or more of these cells. Building on ideas from the differential privacy literature, we add noise to the statistic of interest in proportion to the statistic's maximum observed sensitivity, defined as the maximum change in the statistic from adding or removing a single observation across all the cells in the data. Intuitively, our approach permits the release of statistics in arbitrarily small samples by adding sufficient noise to the estimates to protect privacy. Although our method does not offer a formal privacy guarantee, it generally outperforms widely used methods of disclosure limitation such as count-based cell suppression both in terms of privacy loss and statistical bias. We illustrate how the method can be implemented by discussing how it was used to release estimates of social mobility by Census tract in the Opportunity Atlas. We also provide a step-by-step guide and illustrative Stata code to implement our approach.
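The core idea described above can be illustrated with a minimal Python sketch. It is not the authors' Stata code and simplifies their procedure in several ways, all of which are assumptions made here for illustration: the statistic is a cell mean rather than an OLS estimate, sensitivity is measured only by removing an observation (not adding one), the noise is Laplace with scale equal to the maximum observed sensitivity divided by a privacy parameter `epsilon`, and the function names are hypothetical.

```python
import random


def leave_one_out_sensitivity(values):
    """Largest change in a cell's mean from removing any single observation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations per cell")
    total = sum(values)
    mean = total / n
    return max(abs(mean - (total - v) / (n - 1)) for v in values)


def max_observed_sensitivity(cells):
    """Maximum of the per-cell sensitivities across all cells in the data."""
    return max(leave_one_out_sensitivity(v) for v in cells.values())


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cell_means(cells, epsilon, seed=None):
    """Return each cell mean plus noise scaled to the max observed sensitivity.

    Assumption: noise scale = sensitivity / epsilon, borrowed from the
    standard Laplace mechanism; the paper's exact scaling may differ.
    """
    rng = random.Random(seed)
    scale = max_observed_sensitivity(cells) / epsilon
    return {
        name: sum(v) / len(v) + laplace_noise(scale, rng)
        for name, v in cells.items()
    }


# Hypothetical example data: two small cells.
cells = {"tract_a": [1.0, 2.0, 3.0], "tract_b": [0.0, 10.0]}
released = noisy_cell_means(cells, epsilon=1.0, seed=0)
```

Because the noise scale is taken from the most sensitive cell in the data, every cell receives noise of the same magnitude in this sketch, including cells with very few observations. As the abstract notes, this kind of approach does not by itself provide a formal privacy guarantee.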

How to Cite

Chetty, Raj, and John N Friedman. 2019. “A Practical Method to Reduce Privacy Loss When Disclosing Statistics Based on Small Samples.” Journal of Privacy and Confidentiality 9 (2). https://doi.org/10.29012/jpc.716.

Issue

Vol. 9 No. 2 (2019): Differential Privacy, including Special Issue on the Theory and Practice of Differential Privacy 2017

Section

Articles

Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.

Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.
