Beyond Legal Frameworks and Security Controls For Accessing Confidential Survey Data: Engaging Data Users in Data Protection
Abstract
With a growing demand for data reuse and open data within the scientific ecosystem, protecting the confidentiality and privacy of survey data is increasingly important. It requires more than legal procedures and technological controls; it requires social and behavioral intervention. In this research note, we delineate the disclosure risks of various types of survey data (i.e., longitudinal data, social network data, sensitive information and biomarkers, and geographic data), the current motivation for data reuse, and challenges to data protection. Despite rigorous efforts to protect data, threats to the confidentiality of microdata remain. Unintentional data breaches, protocol violations, and the misuse of data are observed even in well-established restricted data access systems, which indicates that all such systems may rely heavily on trust. Creating and maintaining that trust is critical to secure data access. We suggest four ways of building trust: User-Centered Design Practices; Promoting Trust for Protecting Confidential Data; General Training in Research Ethics; and Specific Training in Data Security Protocols, with the example of a new project, ‘Researcher Passport’, by the Inter-university Consortium for Political and Social Research. Continuous user-focused improvements in restricted data access systems are necessary so that we promote a culture of trust among the research and data user community, train users both in the general topic of responsible research and in the specific requirements of these systems, and offer systematic and holistic solutions.
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.
Funding data
- National Institute on Drug Abuse, Grant number 75N95019C00017
- Directorate for Social, Behavioral and Economic Sciences, Grant number 1839868
- Alfred P. Sloan Foundation