Journal of Privacy and Confidentiality https://journalprivacyconfidentiality.org/index.php/jpc <p>The <em>Journal of Privacy and Confidentiality</em>&nbsp;is an open-access multi-disciplinary journal whose purpose is to facilitate the coalescence of research methodologies and activities in the areas of privacy, confidentiality, and disclosure limitation. The JPC seeks to publish a wide range of research and review papers, not only from academia, but also from government (especially official statistical agencies) and industry, and to serve as a forum for exchange of views, discussion, and news.</p> Cornell University, ILR School en-US Journal of Privacy and Confidentiality 2575-8527 <p>Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)</a>, unless choosing a more lenient license (for instance, public domain). Furthermore, the authors of articles published by&nbsp;the journal grant&nbsp;the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.</p> <p>Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.</p> Remembering Stephen Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/685 <p>Stephen Fienberg was instrumental in creating this journal. 
This special issue of the journal commemorates the intersection between statistics, computer science, privacy, and confidentiality, as he envisioned so many years ago.</p> Aleksandra Slavković Lars Vilhuber ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-28 2018-12-28 8 1 10.29012/jpc.685 The Future of the Journal of Privacy and Confidentiality https://journalprivacyconfidentiality.org/index.php/jpc/article/view/708 <p>The Journal of Privacy and Confidentiality (JPC) is the only journal to actively solicit contributions from the multi-faceted community of researchers and practitioners for whom privacy is a primary intellectual or operational concern, for dissemination across this broad community. This includes computer scientists, statisticians, lawyers, social scientists, policy-makers, health researchers, survey designers, and data-rich corporate players. While not every publication is aimed so broadly, the Journal aims to provide a common forum for all these constituent populations.</p> <p>With the publication of the current issue we re-launch the Journal of Privacy and Confidentiality. We reaffirm our dedication to drawing from multiple disciplines in which privacy and confidentiality are of primary intellectual and operational concern, and to maintaining our status as an open access journal providing a forum for communication across and between these disciplines.</p> Cynthia Dwork ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-28 2018-12-28 8 1 10.29012/jpc.708 Relaunching the Journal of Privacy and Confidentiality https://journalprivacyconfidentiality.org/index.php/jpc/article/view/706 <p>This issue is the first to appear after a longer intermission. 
We have replatformed the journal, but we continue the original mission of publishing innovative materials from many disciplines in the&nbsp;areas of privacy, confidentiality, and disclosure limitation.&nbsp;</p> Lars Vilhuber ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-23 2018-12-23 8 1 10.29012/jpc.706 The Fienberg Problem: How to Allow Human Interactive Data Analysis in the Age of Differential Privacy https://journalprivacyconfidentiality.org/index.php/jpc/article/view/687 <p>Differential Privacy is a popular technology for privacy-preserving analysis of large datasets. DP is powerful, but it requires that the analyst interact with data only through a special interface; in particular, the analyst does not see raw data, an uncomfortable situation for anyone trained in classical statistical data analysis. In this note we discuss the (overly) simple problem of allowing a <em>trusted</em> analyst to choose an "interesting" statistic for popular release (the actual computation of the chosen statistic will be carried out in a differentially private way).</p> Cynthia Dwork Jonathan Ullman ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-21 2018-12-21 8 1 10.29012/jpc.687 Differentially private posterior summaries for linear regression coefficients https://journalprivacyconfidentiality.org/index.php/jpc/article/view/683 <p>In Bayesian regression modeling, analysts often summarize inferences using posterior probabilities and quantiles, such as the posterior probability that a coefficient exceeds zero or the posterior median of that coefficient. However, with potentially unbounded outcomes and explanatory variables, regression inferences based on typical prior distributions can be sensitive to values of individual data points. Thus, releasing posterior summaries of regression coefficients can result in disclosure risks. 
In this article, we propose some differentially private algorithms for reporting posterior probabilities and posterior quantiles of linear regression coefficients. The algorithms use the general strategy of subsample and aggregate, a technique that requires randomly partitioning the data into disjoint subsets, estimating the regression within each subset, and combining results in ways that satisfy differential privacy.&nbsp; We illustrate the performance of some of the algorithms using repeated sampling studies. The non-private versions also can be used for Bayesian inference with big data in non-private settings.</p> Gilad Amitai Jerome Reiter ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-12 2018-12-12 8 1 10.29012/jpc.683 Public-Use vs. Restricted-Use: An Analysis Using the American Community Survey https://journalprivacyconfidentiality.org/index.php/jpc/article/view/661 <p>Statistical agencies frequently publish microdata that have been altered to protect confidentiality.&nbsp;Such data retain utility for many types of broad analyses but can yield biased or insufficiently precise&nbsp;results in others. Research access to de-identified versions of the restricted-use data with little or no&nbsp;alteration is often possible, albeit costly and time-consuming. We investigate the advantages and&nbsp;disadvantages of public-use and restricted-use data from the American Community Survey (ACS)&nbsp;in constructing a wage index. The public-use data used were Public Use Microdata Samples, while&nbsp;the restricted-use data were accessed via a Federal Statistical Research Data Center. We discuss&nbsp;the advantages and disadvantages of each data source and compare estimated CWIs and standard&nbsp;errors at the state and labor market levels. 
We find the results from the publicly available data&nbsp;are generally good relative to the restricted-use data, with greater similarity for larger areas and&nbsp;less similarity for smaller areas. Standard errors are higher in the public-use data but may still&nbsp;be underestimated.</p> Saki Kinney Alan F Karr ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-24 2018-12-24 8 1 10.29012/jpc.661 A Privacy Preserving Algorithm to Release Sparse High-dimensional Histograms https://journalprivacyconfidentiality.org/index.php/jpc/article/view/657 <p>Differential privacy has emerged as a popular model to provably limit privacy risks associated with a given data release. However, releasing high dimensional synthetic data under differential privacy remains a challenging problem. In this paper, we study the problem of releasing synthetic data in the form of a high dimensional histogram under the constraint of differential privacy.<br>We develop an $(\epsilon, \delta)$-differentially private categorical data synthesizer called <em>Stability Based Hashed Gibbs Sampler</em> (SBHG). SBHG works by combining a stability based sparse histogram estimation algorithm with Gibbs sampling and feature selection to approximate the empirical joint distribution of a discrete dataset. SBHG offers a competitive alternative to state-of-the-art synthetic data generators while preserving the sparsity structure of the original dataset, which leads to improved statistical utility as illustrated on simulated data. 
Finally, to study the utility of the synthetic datasets generated by SBHG, we also perform logistic regression using the synthetic datasets and compare the classification accuracy with that obtained using the original dataset.</p> Bai Li Vishesh Karwa Aleksandra Slavković Rebecca Carter Steorts ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-28 2018-12-28 8 1 10.29012/jpc.657 Statistical Approximating Distributions Under Differential Privacy https://journalprivacyconfidentiality.org/index.php/jpc/article/view/666 <p>Statistics computed from data are viewed as random variables. When they are used for tasks like hypothesis testing and confidence intervals, their true finite sample distributions are often replaced by approximating distributions that are easier to work with (for example, the Gaussian, which results from using approximations justified by the Central Limit Theorem). When data are perturbed by differential privacy, the approximating distributions also need to be modified. 
Prior work provided various competing methods for creating such approximating distributions with little formal justification beyond the fact that they worked well empirically.</p> <p>In this paper, we study the question of how to generate statistical approximating distributions for differentially private statistics, and provide finite sample guarantees for the quality of the approximations.</p> Yue Wang Daniel Kifer Jaewoo Lee Vishesh Karwa ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-21 2018-12-21 8 1 10.29012/jpc.666 Statistical Disclosure Limitation: New Directions and Challenges https://journalprivacyconfidentiality.org/index.php/jpc/article/view/684 <p>An overview of traditional types of data dissemination at statistical agencies is&nbsp;provided, including definitions of disclosure risks, the quantification of disclosure risk&nbsp;and data utility, and common statistical disclosure limitation (SDL) methods. However,&nbsp;with technological advancements and the increasing push by governments for open and accessible data, new forms of data dissemination are currently being explored. We&nbsp;focus on web-based applications such as flexible table builders and remote analysis&nbsp;servers, synthetic data and remote access. Many of these applications introduce new&nbsp;challenges for statistical agencies as they are gradually relinquishing some of their&nbsp;control over what data is released. There is now more recognition of the need for&nbsp;perturbative methods to protect the confidentiality of data subjects. These new forms&nbsp;of data dissemination are changing the landscape of how disclosure risks are&nbsp;conceptualized and the types of SDL methods that need to be applied to protect the data. In particular, inferential disclosure is the main disclosure risk of concern and&nbsp;encompasses the traditional types of disclosure risks based on identity and attribute&nbsp;disclosures. 
These challenges have led to statisticians exploring the computer science&nbsp;definition of differential privacy and privacy-by-design applications. We explore how differential privacy can be a useful addition to the current SDL framework within&nbsp;statistical agencies.</p> Natalie Shlomo ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-24 2018-12-24 8 1 10.29012/jpc.684 Reminiscences of Steve Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/691 <p>My professional relationship with Steve began in the early 1990s, when I came to NISS as Associate Director and he was a member of the Board of Trustees. We sometimes disagreed, or perhaps more accurately, I failed to grasp his wisdom. Something must have worked, though, because Steve also chaired the committee that selected me to be Director of NISS.</p> <p>Our scientific collaboration arose in the late 1990s, when I was PI, and he co-PI, on two grants from NSF's Digital Government initiative. These grants, as did the entire collaboration, stemmed from Steve's fervent belief that deep mathematics can be brought to bear on pressing personal and societal problems. The first had to do with web-based query systems now known as restricted data access systems (RDAS), and specifically with table servers. We were frontiersmen together in formulating and applying risk-utility frontiers, released table frontiers and unreleasable table frontiers.</p> <p>With his usual prescience, Steve knew before data breaches were daily news that privacy and confidentiality are major concerns. We wrote only a few papers together, but we exchanged sometimes wildly complementary ideas for more than twenty years. 
I still remember a meeting with a number of federal statistical agencies at which what I proposed as a risk measure was exactly what Steve construed as a utility measure.</p> <p>From the science grew a multi-year, multi-continent friendship that drew in Joyce and Senora as well. It mattered not whether the last encounter was three weeks or three years ago. Sadly, only one of the four of us now remains, but in keeping with the advice of Dr. Seuss, instead of crying because what ended, I smile because what happened.</p> Alan F. Karr ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-11-30 2018-11-30 8 1 10.29012/jpc.691 In Honour of Steve and Joyce Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/695 <p>During this period of preparing for the special issue of the Journal of Privacy and Confidentiality in honour of Steve Fienberg, we received news of the tragic events that occurred at the Tree of Life Synagogue in Pittsburgh on October 27th, 2018 and the sudden senseless death of Joyce Fienberg. Whilst Steve was a great support and mentor to me as I embarked on my PhD research at the Hebrew University and the University of Southampton in 2004, he was married to an extraordinary woman who showed endless kindness to me and all of Steve’s students and mentees. I had a wonderful visit to CMU during my sabbatical period in November 2011 spending much quality time with both Steve and Joyce.</p> <p>As my mentor, Steve marked my PhD dissertation in 2007, provided me with advice and support as I embarked on an academic career and provided many recommendation and promotion letters over the years. I can honestly credit Steve with where I am at today in my academic career. Steve was instrumental in bringing differential privacy to the forefront of research in statistical disclosure limitation and provided many opportunities to bring statisticians and computer scientists together for collaborations. 
Our most recent initiative was the Data Linkage and Anonymisation Programme at the Isaac Newton Institute for Mathematical Sciences at the University of Cambridge from July through December 2016. Steve was to participate in the programme but alas his illness got the better of him during that time. In fact, Steve was to participate in all three programmes that were running at the Institute: Data Linkage and Anonymisation, Theoretical Foundations for Statistical Network Analysis, and Probability and Statistics in Forensic Science, which only goes to show the breadth and depth of his research activities and achievements. He was sorely missed.</p> <p>I can only hope that these words of devotion and appreciation will provide some comfort to Steve and Joyce’s family. I end with a Hebrew blessing - Zichronom livracha – may their memories be a blessing.</p> Natalie Shlomo ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-21 2018-12-21 8 1 10.29012/jpc.695 Reminiscences of Steve Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/692 <p>Steve Fienberg had an enormous influence on how I think about statistical science and a huge impact on my career. Steve's research is of course legendary; he made fundamental contributions to Bayesian inference, categorical data analysis, disclosure control, official statistics, and record linkage, to name key areas where our research interests overlapped. The statistical insights in this work are brilliant and important -- and directly inspired several of my papers, including my first paper in <em>JASA.</em> The innovations in Steve's work target and respond to real, high impact problems in official statistics with leading edge methodology intended to make it into practice. 
His example of being an academic statistician who improves the practice of official statisticians on the ground is one that I try to emulate in my official statistics research.</p> <p>Although I was not one of Steve's students, I was the beneficiary of his mentorship. At conferences and workshops, and at reciprocating visits to CMU and Duke, Steve generously gave me many hours of his time, answering questions about research, official statistics, advising students, career options and paths, and life in general. I valued his advice dearly and followed it. I also have benefitted from Steve's mentorship in a more indirect way: I have been a post-doc mentor to two of his Ph.D. students. I can say from first-hand experience that Steve's students are wonderful, both as scholars and as people, as I would have expected coming from Steve's tutelage.</p> Jerome P. Reiter ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-11-30 2018-11-30 8 1 10.29012/jpc.692 Steve Fienberg and Student Research https://journalprivacyconfidentiality.org/index.php/jpc/article/view/693 <p><span style="font-weight: 400;">I was nervous when giving my first presentation to our US Census research group as a young graduate student: &nbsp;Steve was in the room, and I was not sure if my record linkage research was any good. While I was presenting, Steve seemingly paid no attention to what I was saying. &nbsp;I would glance in his direction when presenting each slide, and every time, he was typing something on his iPad and not watching my presentation at all. That's okay, I thought. &nbsp;He's probably really busy, and shouldn't be wasting his time on my talk. </span></p> <p><span style="font-weight: 400;">After the meeting, I got back to my desk and checked my inbox, only to find seven emails from Steve, each with a paper that I should read that related to different parts of my work. 
&nbsp;His feedback in these emails helped shape the directions of my future research. </span></p> <p><span style="font-weight: 400;">That was Steve. &nbsp;He always made time for sharing constructive feedback on student research, and he could multitask better than anyone I've ever met.</span></p> Samuel L Ventura ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-11-06 2018-11-06 8 1 10.29012/jpc.693 Reminiscences https://journalprivacyconfidentiality.org/index.php/jpc/article/view/702 <p>I first met Steve during a talk I was giving at Carnegie Mellon in 2003 describing very early thoughts on a cryptography-flavored approach to privacy in public databases.&nbsp; Some of these ideas arose during Adam Smith's internship with me at Microsoft.&nbsp; Steve was critical ("Your utility is going to be in the toilet"), but I think he was intrigued by the cryptographic approach, since after the talk he proposed that we have a workshop ("You bring your guys and I'll bring mine").&nbsp; This occurred during the summer of 2005 in the hillside town of Bertinoro, Italy.&nbsp; The workshop almost broke down on the second day: the statisticians thought the cryptographers, with their talk of "the adversary" and its arbitrary auxiliary information, were completely paranoid, while the cryptographers were frustrated by the absence of a formal notion of privacy and a measure of its loss in the statistical work.&nbsp; 
Fortunately, there is little to do in Bertinoro at night, other than to drink grappa in the piazza, and this eased the tension considerably.&nbsp; Later in the workshop Steve proposed to Alan Karr and me that we found a journal and, to paraphrase Gertrude Stein, we have and this is it.</p> Cynthia Dwork ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-02 2018-12-02 8 1 10.29012/jpc.702 Tribute to Steve Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/707 <p>As readers of this Journal know, I paid my tribute to Steve Fienberg in my 2016 Julius Shiskin Lecture:</p> <p>"Finally, I would like to acknowledge the role of Stephen Fienberg of Carnegie Mellon University. I'm sure almost everyone in this auditorium can cite a path-breaking contribution of Steve's that had a major impact on statistics and the federal statistical system. I want to highlight the foresight that he had in gathering researchers from the SDL community and the emerging computer science data-privacy community in Bertinoro, Italy, in 2005. This is where I first met Cynthia Dwork and the team of young cryptographers who were shattering the received wisdom in SDL with methods that Steve recognized as revolutionary. I’m also going to spend much of this lecture on those methods. The last time Steve and I talked about this, at this year's JSM, he confided to me that our big mistake was that 'we did not grow the community fast enough.' I hope this lecture helps solve that problem too." (JPC, 2017, Vol. 7, No. 3, https://doi.org/10.29012/jpc.v7i3.404)</p> <p>To which I would now add that I hope this volume in his honor also expands the community of scholars working on these important issues. 
As compelling as the cryptographers are, privacy-preserving data analysis must have equal participation from domain scientists, technologists, and statisticians. Good science and strong privacy protections do compete for the same scarce resource (the information in confidential databases), but efficient, workable solutions require input from all these specialists.</p> John Abowd ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-18 2018-12-18 8 1 10.29012/jpc.707 Steve's influence on my work https://journalprivacyconfidentiality.org/index.php/jpc/article/view/704 <p>Steve's influence on my work</p> Vishesh Karwa ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-18 2018-12-18 8 1 10.29012/jpc.704 Reminiscences of Steve Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/705 <p>I was fortunate enough to meet Steve at a variety of privacy workshops over&nbsp;the years. More than anyone I have met, he encouraged early-career researchers&nbsp;and showed interest in their work. Part of the reason was Steve's enthusiasm for&nbsp;statistical disclosure control, which came across in every conversation. Once,&nbsp;he (half-jokingly) told me that everyone should be limited to a lifetime total of&nbsp;20 published pages, so that people focus their energies on deep progress in the&nbsp;problems they care about. Late in his career, Steve also mentioned that he&nbsp;started telling prospective students that he doesn't need to publish anymore. Of course, he continued his research in privacy -- not because he needed to, but&nbsp;because he wanted to. One of our shared common interests was in developing&nbsp;techniques for proper statistical analysis of differentially private data, and overcoming the associated computational difficulties. 
I am grateful for his support&nbsp;and encouragement -- to a first-year faculty member, it means a lot when a legend tells&nbsp;you he has read all of your papers, and I will miss his enthusiasm.</p> Dan Kifer ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-18 2018-12-18 8 1 10.29012/jpc.705 Reminiscences about Steve Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/696 <p>Rest in peace, Steve Fienberg.</p> Adam Smith ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-19 2018-12-19 8 1 10.29012/jpc.696 Memories of Steve Fienberg https://journalprivacyconfidentiality.org/index.php/jpc/article/view/709 <p>In 2015, Steve and I traveled together to Ithaca from Pittsburgh for a workshop on differential privacy (DP). This is one of my fondest memories with Steve for many reasons. The trip involved many of my favorite things, which Steve and I both shared --- statistics, hockey, great food, and good conversation. For me, this is one of my most memorable trips with Steve, as I learned so much from our conversations together.&nbsp;</p> Rebecca Carter Steorts ##submission.copyrightStatement## http://creativecommons.org/licenses/by-nc-nd/4.0 2018-12-28 2018-12-28 8 1 10.29012/jpc.709