Technical and Policy Approaches to Balancing Patient Privacy and Data Sharing in Clinical and Translational Research
  1. Bradley Malin, PhD*,
  2. David Karp, MD, PhD,
  3. Richard H. Scheuermann, PhD
  1. From the *Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN; †Division of Rheumatology, Department of Internal Medicine, and ‡Division of Biomedical Informatics, Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX.
  1. Received September 28, 2009, and in revised form November 4, 2009.
  2. Accepted for publication November 6, 2009.
  3. Reprints: Bradley Malin, PhD, Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Suite 600, 2525 W End, Nashville, TN 37203. E-mail: b.malin{at}
  4. Supported by the following grants from the US National Institutes of Health: N01AI40076, R01LM009989, U01HG004603, UL1RR023468, and UL1RR024982.


Introduction Clinical researchers need to share data to support scientific validation and information reuse and to comply with a host of regulations and directives from funders. Various organizations are constructing informatics resources in the form of centralized databases to ensure reuse of data derived from sponsored research. The widespread use of such open databases is contingent on the protection of patient privacy.

Methods We review privacy-related problems associated with data sharing for clinical research from technical and policy perspectives. We investigate existing policies for secondary data sharing and privacy requirements in the context of data derived from research and clinical settings. In particular, we focus on policies specified by the US National Institutes of Health and the Health Insurance Portability and Accountability Act and touch on how these policies are related to current and future use of data stored in public database archives. We address aspects of data privacy and identifiability from a technical, although approachable, perspective and summarize how biomedical databanks can be exploited and seemingly anonymous records can be reidentified using various resources without hacking into secure computer systems.

Results We highlight which clinical and translational data features, specified in emerging research models, are potentially vulnerable or exploitable. In the process, we recount a recent privacy-related concern, associated with the publication of aggregate statistics from pooled genome-wide association studies, that has had a significant impact on the data sharing policies of National Institutes of Health-sponsored databanks.

Conclusion Based on our analysis and observations, we provide a list of recommendations covering technical, legal, and policy mechanisms that open clinical databases can adopt to strengthen data privacy protection as they move toward wider deployment and adoption.

Key Words
  • clinical research
  • translational research
  • databases
  • privacy
