Statistical agencies routinely publish aggregate data in the form of contingency tables. In this paper, we consider the problem of releasing contingency tables so that the privacy of the individual respondents in the table is preserved. We first uncover fundamental problems with existing cell suppression algorithms that are used for this purpose. We then present a rigorous definition of privacy and a generic algorithmic framework for cell suppression under this definition. Using this framework, we build a complete cell suppression solution for the special case of boolean private attributes. We study the utility of our approach both theoretically and experimentally. Along the way, we demonstrate a connection to the query auditing problem in statistical databases and make a foundational contribution to this problem as well. In particular, we analyze an unexamined assumption from the literature regarding the prior knowledge of attackers.
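To make the setting concrete, the following is a minimal, hypothetical sketch of the kind of cell suppression the abstract refers to: primary suppression hides counts below a disclosure threshold, and complementary suppression hides additional cells so that a suppressed value cannot be recovered from published row margins. This is an illustration only, not the paper's algorithm or definition of privacy; the threshold, function names, and the rows-only complementary rule are all assumptions for the example.

```python
# Hypothetical illustration of cell suppression on a 2-D contingency table.
# This is NOT the paper's algorithm; it sketches the standard idea of
# primary + complementary suppression using made-up names and a made-up
# threshold, and handles only row margins (real methods must also protect
# column margins and higher-order totals).

THRESHOLD = 5  # assumed minimum publishable count


def primary_suppress(table):
    """Replace counts below THRESHOLD with None (suppressed)."""
    return [[None if c < THRESHOLD else c for c in row] for row in table]


def complementary_suppress(table):
    """If a row has exactly one suppressed cell, that value could be
    inferred from the published row margin, so suppress one more cell
    (here, the smallest remaining one) in that row."""
    out = [row[:] for row in table]
    for row in out:
        suppressed = [i for i, c in enumerate(row) if c is None]
        if len(suppressed) == 1:
            j = min((i for i, c in enumerate(row) if c is not None),
                    key=lambda i: row[i])
            row[j] = None
    return out


table = [[12, 3, 40],
         [25, 30, 18]]
published = complementary_suppress(primary_suppress(table))
# The cell with count 3 is suppressed, and the cell with count 12 is
# suppressed as its complementary cell in the same row.
```

The paper's starting point is that rule-of-thumb schemes like this one can fail to protect respondents against attackers with prior knowledge, which motivates the rigorous privacy definition developed in the body of the work.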
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.