Calibrating Noise to Sensitivity in Private Data Analysis
Main Article Content
Abstract
We continue a line of research initiated in Dinur and Nissim (2003); Dwork and Nissim (2004); and Blum et al. (2005) on privacy-preserving statistical databases.
Consider a trusted server that holds a database of sensitive information. Given a query function $f$ mapping databases to reals, the so-called {\em true answer} is the result of applying $f$ to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user.
Previous work focused on the case of noisy sums, in which $f = \sum_i g(x_i)$, where $x_i$ denotes the $i$th row of the database and $g$ maps database rows to $[0,1]$. We extend the study to general functions $f$, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the {\em sensitivity} of the function $f$. Roughly speaking, this is the amount that any single argument to $f$ can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case.
The first step is a very clean definition of privacy---now known as differential privacy---and measure of its loss. We also provide a set of tools for designing and combining differentially private algorithms, permitting the construction of complex differentially private analytical tools from simple differentially private primitives.
Finally, we obtain separation results showing the increased value of interactive statistical release mechanisms over non-interactive ones.
Article Details
Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.
Funding data
-
National Science Foundation
Grant numbers IIS-1447700 -
Alfred P. Sloan Foundation
-
Google