Beyond Legal Frameworks and Security Controls For Accessing Confidential Survey Data: Engaging Data Users in Data Protection


Amy M. Pienta
https://orcid.org/0000-0003-1174-6118
Joy Bohyun Jang
Margaret C. Levenstein

Abstract

With a growing demand for data reuse and open data within the scientific ecosystem, protecting the confidentiality and privacy of survey data is increasingly important. It requires more than legal procedures and technological controls; it requires social and behavioral intervention. In this research note, we delineate the disclosure risks of various types of survey data (i.e., longitudinal data, social network data, sensitive information and biomarkers, and geographic data), the current motivation for data reuse, and challenges to data protection. Despite rigorous efforts to protect data, threats to the confidentiality of microdata remain to be mitigated. Unintentional data breaches, protocol violations, and the misuse of data are observed even in well-established restricted data access systems, which indicates that all such systems may rely heavily on trust. Creating and maintaining that trust is critical to secure data access. We suggest four ways of building trust: User-Centered Design Practices; Promoting Trust for Protecting Confidential Data; General Training in Research Ethics; and Specific Training in Data Security Protocols. We illustrate them with a new project, ‘Researcher Passport’, by the Inter-university Consortium for Political and Social Research. Continuous user-focused improvements in restricted data access systems are necessary so that we promote a culture of trust among the research and data user community, train users both in the general topic of responsible research and in the specific requirements of these systems, and offer systematic and holistic solutions.

Article Details

How to Cite
Pienta, Amy, Joy Jang, and Margaret Levenstein. 2023. “Beyond Legal Frameworks and Security Controls For Accessing Confidential Survey Data: Engaging Data Users in Data Protection”. Journal of Privacy and Confidentiality 13 (2). https://doi.org/10.29012/jpc.845.
Section
NAHDAP-ICPSR restricted data workshop
