The Risk of Linked Census Data to Transgender Youth: A Simulation Study
Abstract
Every ten years, the United States Census Bureau collects data on all people living in the US, including information on age, sex, race, ethnicity, and household relationship.
We conducted a simulation study to investigate the risk of disclosing a change in how an individual's sex was recorded in successive censuses. In a simulated population based on a reconstruction of the 2010 decennial census of Texas, we compared the number of transgender individuals under 18 identified by linking simulated census data from 2010 and 2020 under alternative approaches to disclosure avoidance, including swapping in 2020 (as used in 2010) and TopDown in 2020 (as used for the actual data released from the 2020 enumeration).
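The linkage step described above can be illustrated with a toy sketch. This is not the authors' simulation code: the `Person` record, the `link_sex_changes` function, the matching key (address plus age advanced by ten years), and the sample records are all hypothetical. It also assumes person-level records are already in hand, which skips the reconstruction stage that a reconstruction-abetted attack would need first.

```python
from collections import Counter, namedtuple

# Hypothetical person-level record; real census microdata has more fields.
Person = namedtuple("Person", "address age sex")


def link_sex_changes(census_2010, census_2020, gap=10):
    """Return (2010 record, 2020 record) pairs that match uniquely on
    address and on age advanced by `gap` years, but differ in recorded sex."""
    # Index the 2020 records by (address, age).
    index_2020 = {}
    for person in census_2020:
        index_2020.setdefault((person.address, person.age), []).append(person)

    # Count 2010 records per key so that ambiguous matches can be skipped.
    counts_2010 = Counter((p.address, p.age) for p in census_2010)

    flagged = []
    for person in census_2010:
        candidates = index_2020.get((person.address, person.age + gap), [])
        unique = counts_2010[(person.address, person.age)] == 1 and len(candidates) == 1
        if unique and candidates[0].sex != person.sex:
            flagged.append((person, candidates[0]))
    return flagged


# Made-up records for illustration only.
census_2010 = [
    Person("1 Elm St", 6, "F"),
    Person("1 Elm St", 35, "F"),
    Person("2 Oak Ave", 4, "M"),
    Person("3 Pine Rd", 7, "M"),
]
census_2020 = [
    Person("1 Elm St", 16, "M"),
    Person("1 Elm St", 45, "F"),
    Person("2 Oak Ave", 14, "M"),
    Person("9 Far Ln", 17, "M"),  # moved, so no address match
]

flagged = link_sex_changes(census_2010, census_2020)
for before, after in flagged:
    print(before, "->", after)
```

In this toy data only the child at "1 Elm St" is flagged. Disclosure avoidance such as swapping or TopDown perturbs the published data, so fewer of these exact matches survive.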
Our simulation assumed that in Texas, 0.2% of the 3,095,857 residents who were under the age of 8 in the 2010 census were transgender and would have a different sex reported in the 2020 census, and that 23% of them would reside at the same address in both censuses. This implied that 1,424 trans youth were at risk of having their transgender status disclosed by a reconstruction-abetted linkage attack. We found that without any disclosure avoidance in 2010 or 2020, such an attack identified 657 transgender youth. With 5% swapping in both 2010 and 2020, it identified 605 individuals, an 8% decrease. With swapping in 2010 and TopDown in 2020 as configured for the actual data release, it identified 194 individuals, a 68% decrease relative to swapping in both years. Our simulation found that the TopDown configuration attains the maximum achievable level of privacy protection against such an attack.
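The arithmetic behind these figures can be checked with a short sketch. The inputs are the numbers quoted in the abstract. The variable names and the `pct_decrease` helper are illustrative only, and the percentages are rounded to whole numbers on the assumption that the abstract does the same.

```python
# Inputs quoted in the abstract.
under_8_in_2010 = 3_095_857   # Texas residents under age 8 in the 2010 census
share_trans = 0.002           # assumed 0.2% with a different sex reported in 2020
share_same_address = 0.23     # assumed 23% at the same address in both censuses

# Size of the at-risk group: transgender youth who did not move.
at_risk = under_8_in_2010 * share_trans * share_same_address


def pct_decrease(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100 * (before - after) / before


# Individuals identified by the simulated attack under each scenario.
no_protection = 657       # no disclosure avoidance in either year
swap_both_years = 605     # 5% swapping in 2010 and 2020
swap_then_topdown = 194   # swapping in 2010, TopDown in 2020

swap_effect = pct_decrease(no_protection, swap_both_years)
topdown_effect = pct_decrease(swap_both_years, swap_then_topdown)

print(round(at_risk))         # 1424
print(round(swap_effect))     # 8
print(round(topdown_effect))  # 68
```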
Our results demonstrate the importance of disclosure avoidance for census data and suggest that the TopDown approach used by the Census Bureau is a substantial improvement compared to the previous approach, achieving the maximum level of privacy protection possible against such a linkage attack.
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.