Health Data De-Identification Rules in Need of Update?

November 13, 2008

We’re heading into flu season, though we don’t yet know exactly when, where, or how hard the disease will strike. As the New York Times reported, this year Google may be able to help us predict outbreaks as much as a week to 10 days before the Centers for Disease Control and Prevention can. Google Flu Trends compiles individuals’ searches on flu-related terms from across the U.S. and creates visuals that show their volume and geographic source. As it turns out, those trends are closely correlated with actual outbreaks reported by the medical establishment. Good news for syndromic surveillance, but is it good for privacy? Google Flu Trends assures us that its data “can never be used to identify individual users”. Perhaps.

We would all rest easier if Google were more transparent about how it ensures that identification won’t happen. And such assurances are getting harder to back every day. The Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule includes guidelines on how to “de-identify” health data to protect personal privacy while still allowing the data to support social goods such as improving the quality and safety of health procedures, public health, and medical research. But the Privacy Rule hasn’t kept up with the times, even though its authors intended it to evolve. One problem is that it applies only to a small subset of the ever-evolving list of players who handle health data. It doesn’t cover Google or a host of other companies and organizations that now access and use personal health data. Among those left out are the most influential new entities, including many that manage personal health records (PHRs), and regional health information organizations (RHIOs).

Another problem is that machine (and human) learning has advanced considerably in the last decade, making it easier and cheaper to crack once-formidable codes. Compounding the challenge is a vast proliferation of digital data; the more data points you have about a person, the easier it is to establish his or her identity. The data explosion goes well beyond health data and includes credit card purchases, emails, and automatic highway toll payments. Any of these data points, and countless others, could in theory be combined to learn about an individual, including by linking him or her to “de-identified” health data. In September, CDT’s Health Privacy Project brought together some of the nation’s best thinkers on data security and policy for a one-day workshop on the de-identification of health data, the most exhaustive held since the HIPAA Privacy Rule was drafted earlier this decade.
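
The linkage risk described above, in which outside data points are combined to tie a person to “de-identified” health data, can be illustrated with a short sketch. This is only a toy example under stated assumptions: the records, names, and field choices (ZIP code, birth date, sex) are all hypothetical and do not come from the workshop or from any real dataset.

```python
from collections import defaultdict

# Hypothetical "de-identified" health records: names are removed, but
# quasi-identifiers (ZIP code, birth date, sex) are retained.
health_records = [
    {"zip": "02138", "birth_date": "1960-07-31", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1975-01-12", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_date": "1975-01-12", "sex": "M", "diagnosis": "influenza"},
]

# Hypothetical outside data that carries names alongside the same fields.
public_records = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1960-07-31", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_date": "1975-01-12", "sex": "M"},
    {"name": "Carol Example", "zip": "02140", "birth_date": "1982-03-05", "sex": "F"},
]

# Assumed set of fields shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def key(record):
    """Build the combination of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(health, public):
    """Return (name, diagnosis) pairs whose quasi-identifier combination
    is unique in both datasets, so the match is unambiguous."""
    health_by_key = defaultdict(list)
    for record in health:
        health_by_key[key(record)].append(record)

    public_by_key = defaultdict(list)
    for record in public:
        public_by_key[key(record)].append(record)

    matches = []
    for k, people in public_by_key.items():
        candidates = health_by_key.get(k, [])
        if len(people) == 1 and len(candidates) == 1:
            matches.append((people[0]["name"], candidates[0]["diagnosis"]))
    return matches


reidentified = link(health_records, public_records)
print(reidentified)  # [('Alice Example', 'asthma')]
```

In this toy data only one person is re-identified: the second combination of values is shared by two health records, so it stays ambiguous. Each additional field added to the key tends to make more combinations unique, which is the sense in which more data points make identification easier.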

In early 2009 CDT will publish a white paper based on the workshop’s proceedings to tee up the many facets of the de-identification issue for the next administration and Congress, and to suggest possible improvements to existing policies and practices for protecting data. The paper will also be available publicly through this site.

Presenters at the CDT Health Privacy Project’s De-Identification of Health Data Workshop included the following:

Mark Kohan and Sofia Plotzker, IMS Health
Bill Braithwaite, MD, PhD – Chief Medical Officer of Anakam, Inc. and HIPAA contributing author
Justine Carr, MD – Senior Vice President for Quality, Patient Safety, Compliance and Medical Affairs, Caritas Christi Health Care System; Co-Vice Chair, NCVHS Work Group on Uses of Health Data
Stanley W. Crosley, JD – Chief Privacy Officer, Eli Lilly and Company
Cynthia Dwork, PhD – Principal Researcher, Microsoft Research
Kenneth W. Goodman, PhD – Professor and Director, University of Miami Bioethics Program; Director, Project HealthDesign Ethical, Legal and Social Issues (ELSI) unit
Linda Goodwin, RN, PhD – Informatics Program Director, Duke University School of Nursing
Shaun Grannis, MD, MS – Medical Informatics Researcher at the Regenstrief Institute, Inc. and Assistant Professor of Family Medicine at Indiana University School of Medicine
Mark A. Rothstein, JD – Herbert F. Boehl Chair of Law and Medicine and Director, Institute for Bioethics, Health Policy and Law, University of Louisville School of Medicine
Latanya Sweeney, PhD – Associate Professor of Computer Science, Technology and Policy and Director of the Data Privacy Lab, Carnegie Mellon University
Peter Swire, JD – Professor of Law at the Moritz College of Law of the Ohio State University and Senior Fellow at the Center for American Progress

Lygeia Ricciardi is a Special Advisor to the Health Privacy Project at CDT.