https://journalprivacyconfidentiality.org/index.php/jpc/issue/feed
Journal of Privacy and Confidentiality
2024-06-24T06:03:51-07:00
Lars Vilhuber and/or Rachel Cummings, managing-editor@journalprivacyconfidentiality.org
Open Journal Systems

<p>The <em>Journal of Privacy and Confidentiality</em> is an open-access multi-disciplinary journal whose purpose is to facilitate the coalescence of research methodologies and activities in the areas of privacy, confidentiality, and disclosure limitation. The JPC seeks to publish a wide range of research and review papers, not only from academia, but also from government (especially official statistical agencies) and industry, and to serve as a forum for exchange of views, discussion, and news.</p>

https://journalprivacyconfidentiality.org/index.php/jpc/article/view/880
Differentially Private Fine-tuning of Language Models
2023-11-27T09:22:41-08:00
Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

<p>We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve state-of-the-art privacy-versus-utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially private adaptations of these approaches outperform previous private algorithms in three important dimensions: utility, privacy, and the computational and memory cost of private training.
On many commonly studied datasets, the utility of private models approaches that of non-private models. For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$. In comparison, absent privacy constraints, RoBERTa-Large achieves an accuracy of $90.2\%$. Our findings are similar for natural language generation tasks. Privately fine-tuned on DART, GPT-2-Small, GPT-2-Medium, GPT-2-Large, and GPT-2-XL achieve BLEU scores of $38.5$, $42.0$, $43.1$, and $43.8$, respectively (privacy budget $\epsilon = 6.8$, $\delta = 10^{-5}$), whereas the non-private baseline is $48.1$. All our experiments suggest that larger models are better suited for private fine-tuning: while they are well known to achieve superior accuracy non-privately, we find that they also better maintain their accuracy when privacy is introduced.</p>

2024-06-24T00:00:00-07:00
Copyright (c) 2024 Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

https://journalprivacyconfidentiality.org/index.php/jpc/article/view/873
Private Query Release via the Johnson-Lindenstrauss Transform
2024-03-20T08:38:18-07:00
Aleksandar Nikolov

<p>We introduce a new method for releasing answers to statistical queries with differential privacy, based on the Johnson-Lindenstrauss lemma. The key idea is to randomly project the query answers to a lower-dimensional space so that the distance between any two vectors of feasible query answers is preserved up to an additive error. Then we answer the projected queries using a simple noise-adding mechanism, and lift the answers up to the original dimension.
Using this method, we give, for the first time, purely differentially private mechanisms with optimal worst-case sample complexity under average error for answering a workload of $k$ queries over a universe of size $N$. As further applications, we give the first purely private efficient mechanisms with optimal sample complexity for computing the covariance of a bounded high-dimensional distribution, and for answering 2-way marginal queries. We also show that, up to the dependence on the error, a variant of our mechanism is nearly optimal for every given query workload.</p>

2024-06-24T00:00:00-07:00
Copyright (c) 2024 Aleksandar Nikolov

https://journalprivacyconfidentiality.org/index.php/jpc/article/view/879
Differentially Private Synthetic Control
2024-03-06T13:36:30-08:00
Saeyoung Rho, Rachel Cummings, Vishal Misra

<p>Synthetic control is a causal inference tool used to estimate the treatment effects of an intervention by creating synthetic counterfactual data. This approach combines measurements from other similar observations (i.e., the donor pool) to predict a counterfactual time series of interest (i.e., the target unit) by analyzing the relationship between the target and the donor pool before the intervention. As synthetic control tools are increasingly applied to sensitive or proprietary data, formal privacy protections are often required. In this work, we provide the first algorithms for differentially private synthetic control with explicit error bounds. Our approach builds upon tools from non-private synthetic control and differentially private empirical risk minimization. We provide upper and lower bounds on the sensitivity of the synthetic control query, and explicit error bounds on the accuracy of our private synthetic control algorithms. We show that our algorithms produce accurate predictions for the target unit and that the cost of privacy is small.
Finally, we empirically evaluate our algorithm, showing favorable performance in a variety of parameter regimes, and provide guidance to practitioners for hyperparameter tuning.</p>

2024-06-24T00:00:00-07:00
Copyright (c) 2024 Saeyoung Rho, Rachel Cummings, Vishal Misra

https://journalprivacyconfidentiality.org/index.php/jpc/article/view/896
Generalized Rainbow Differential Privacy
2023-12-27T14:08:33-08:00
Yuzhou Gu, Ziqi Zhou, Onur Günlü, Rafael G. L. D'Oliveira, Parastoo Sadeghi, Muriel Médard, Rafael F. Schaefer

<p>We study a new framework for designing differentially private (DP) mechanisms via randomized graph colorings, called rainbow differential privacy. In this framework, datasets are nodes in a graph, and two neighboring datasets are connected by an edge. Each dataset in the graph has a preferential ordering for the possible outputs of the mechanism, and these orderings are called rainbows. Different rainbows partition the graph of connected datasets into different regions. We show that if a DP mechanism at the boundary of such regions is fixed and behaves identically for all same-rainbow boundary datasets, then a unique optimal $(\epsilon,\delta)$-DP mechanism exists (as long as the boundary condition is valid) and can be expressed in closed form. Our proof technique is based on an interesting relationship between dominance ordering and DP, which applies to any finite number of colors and to $(\epsilon,\delta)$-DP, improving upon previous results that apply only to at most three colors and to $\epsilon$-DP. We justify the homogeneous boundary condition assumption by giving an example with a non-homogeneous boundary condition for which no optimal DP mechanism exists.</p>

2024-06-24T00:00:00-07:00
Copyright (c) 2024 Yuzhou Gu, Ziqi Zhou, Onur Günlü, Rafael G. L.
D'Oliveira, Parastoo Sadeghi, Muriel Médard, Rafael F. Schaefer

https://journalprivacyconfidentiality.org/index.php/jpc/article/view/859
On the Connection Between the ABS Perturbation Methodology and Differential Privacy
2023-03-07T06:53:42-08:00
Parastoo Sadeghi, Chien-Hung Chien

<p>This paper explores analytical connections between the perturbation methodology of the Australian Bureau of Statistics (ABS) and the differential privacy (DP) framework. We consider a single static counting query function and find the analytical form of the perturbation distribution with symmetric support for the ABS perturbation methodology. We then analytically measure the DP parameters, namely the $(\epsilon, \delta)$ pair, for the ABS perturbation methodology under this setting. The results and insights obtained about the behaviour of $(\epsilon, \delta)$ with respect to the perturbation support and variance are used to judiciously select the variance of the perturbation distribution so as to give a good $\delta$ in the DP framework for a given desired $\epsilon$ and perturbation support. Finally, we propose a simple sampling scheme to implement the perturbation probability matrix in the ABS Cellkey method. The post-sampling $(\epsilon, \delta)$ pair is numerically analysed as a function of the Cellkey size. It is shown that the best results are obtained for a larger Cellkey size, because the post-sampling $(\epsilon, \delta)$ measures remain almost identical to the theoretical results.</p>

2024-07-02T00:00:00-07:00
Copyright (c) 2024 Chien-Hung Chien, Parastoo Sadeghi
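The setting of the last abstract, releasing a single counting query with additive noise calibrated to a target $(\epsilon, \delta)$, can be illustrated with a minimal sketch. This uses the textbook Gaussian mechanism rather than the ABS perturbation distribution studied in the paper (whose support is bounded and symmetric); the function name and parameters are illustrative only.

```python
import math
import random

def gaussian_mechanism(true_count, epsilon, delta, sensitivity=1.0):
    """Release a noisy counting-query answer satisfying (epsilon, delta)-DP.

    Uses the classical Gaussian-mechanism calibration
        sigma = sqrt(2 * ln(1.25 / delta)) * sensitivity / epsilon,
    which is valid for epsilon < 1. A counting query changes by at most
    1 when one record changes, hence sensitivity = 1 by default.
    """
    sigma = math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon
    return true_count + random.gauss(0.0, sigma)
```

For example, with $\epsilon = 0.5$ and $\delta = 10^{-5}$, the noise standard deviation is roughly $9.7$, so released counts are unbiased but blurred at that scale; the bounded-support ABS perturbation instead trades this unbounded tail for the $(\epsilon, \delta)$ behaviour analysed in the paper.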