Selectively Redacting Sensitive Places from Location Data to Protect Reproductive Health Privacy

August 25, 2022 / Nick Doty

The Supreme Court’s ruling in Dobbs v. Jackson Women’s Health Organization, and state laws that criminalize abortion or deputize private citizens to sue any person who performs an abortion, have once again emphasized how our use of mobile devices and online services generates a litany of data that can reveal intimate details about our health.

For example, cellphones, fitness trackers, map services, and transit apps create records that could indicate when an individual visited a mental health facility, rehab center, or abortion provider.

To reduce the harm of prosecution or persecution as a result of accessing or providing health care services, one technical mitigation is selective redaction of sensitive places from location data. When providers delete records of people’s visits to health care facilities, they can reduce the likelihood that location data will be used to disclose or prosecute abortion care. This potential mitigation has been proposed, discussed, and, in some cases, already deployed. For example, Google has announced and begun to implement deletions of sensitive places from its Location History timelines.

This type of concrete, technical protection is welcome and necessary.

In designing and implementing this mitigation, we recommend that providers apply comprehensive threat modeling and risk analysis to their particular applications and datasets. Threats assessed should include law enforcement gaining access to data through warrants or subpoenas, as well as the risk that hackers or private actors will buy or sell data and use it for vigilantism, bounty hunting, or tip-offs to law enforcement.

We provide here some recommendations for how to implement the general technique of selectively redacting location data, which can help more effectively protect people’s privacy when visiting potentially sensitive places:

Minimize location data.

Broader data minimization — including not collecting location data, keeping it on-device rather than transferring it, or not retaining location data as long — is simpler, more comprehensible, and more protective against many threats. This practice may not always be feasible for particular uses, but it’s the best place to start.
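
As one illustration of "not retaining location data as long," the following is a minimal sketch of a retention purge. The record type, the field names, and the 30-day window are all hypothetical choices made for this example, not a recommended policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class LocationRecord:
    # Hypothetical record shape; real schemas will differ.
    user_id: str
    lat: float
    lon: float
    recorded_at: datetime


def purge_expired(records, now, retention=timedelta(days=30)):
    """Return only the records recorded within the retention window ending at `now`."""
    cutoff = now - retention
    return [r for r in records if r.recorded_at >= cutoff]


now = datetime(2022, 8, 25, tzinfo=timezone.utc)
records = [
    LocationRecord("u1", 40.0, -75.0, now - timedelta(days=90)),  # past the window
    LocationRecord("u1", 40.1, -75.1, now - timedelta(days=2)),   # still retained
]
kept = purge_expired(records, now)
```

A purge like this only helps if it also reaches backups, logs, and derived datasets, which is a deployment question this sketch does not address.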

Use a broad list of sensitive location categories: include more than a single category, and err on the side of including places when uncertain. Lists must be audited and kept up to date, but also remain private.

If only a single category of location is redacted, then a gap in a location history could itself support a high-confidence inference that the person visited exactly that kind of sensitive location. While we can’t assume that lists of sensitive locations won’t be known to attackers (including hackers, law enforcement, vigilantes, or bounty hunters), there are also sensitive locations that are not widely known and should not be broadly published.

Remove data not just at the sensitive place itself, but across a larger segment of both space and time, lest the gap be revealing.

For example, suppose a location history shows a person traveling to a point 50 meters from a health care facility, then a two-hour gap, and then the person traveling home from a point 50 meters from the facility. The gap itself can be used to infer where they were, especially if visiting such a health care facility is one of the only reasons such data would be redacted.
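
The idea of widening the redaction in both space and time can be sketched as follows. This is an illustrative example under stated assumptions, not a description of any provider's implementation: the `Point` type, the 1,000-meter radius, and the one-hour padding are hypothetical values chosen only to make the example concrete.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Point:
    lat: float
    lon: float
    time: datetime


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    earth_radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def redact_with_buffer(points, sensitive_places, radius_m=1000.0, pad=timedelta(hours=1)):
    """Drop points inside the spatial buffer, plus any point within `pad` of such a point in time."""
    hit_times = [
        p.time
        for p in points
        if any(haversine_m(p.lat, p.lon, s_lat, s_lon) <= radius_m
               for s_lat, s_lon in sensitive_places)
    ]
    return [p for p in points if not any(abs(p.time - t) <= pad for t in hit_times)]


start = datetime(2022, 8, 25, 9, 0)
clinic = (40.0, -75.0)  # hypothetical coordinates
trace = [
    Point(40.05, -75.0, start),                                   # several km away, hours earlier
    Point(40.0001, -75.0, start + timedelta(hours=3)),            # roughly 11 m from the place
    Point(40.02, -75.0, start + timedelta(hours=3, minutes=30)),  # outside the radius, inside the time pad
    Point(40.05, -75.0, start + timedelta(hours=6)),              # several km away, hours later
]
kept = redact_with_buffer(trace, [clinic])
```

Here only the first and last points survive. The right radius and padding depend on the threat model and the density of nearby places, so the defaults above should be read as placeholders.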

Address not just confirmed locations, but also any lookups or views of sensitive places.

Intent or interest in sensitive places will itself be revealing, especially if some location data is redacted. If someone looks up driving directions to a particular clinic, and has a gap in their recorded location history soon after, that could be used as evidence of where they intended to go. Redactions should be applied to search and browsing histories and server logs.
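
A minimal sketch of applying the same redaction to lookups might look like the following. It assumes a hypothetical log format in which each entry carries an optional place identifier and the raw query text. The matching terms are placeholders, and simple substring matching will both over- and under-match in practice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    # Hypothetical log shape covering searches, directions requests, and place views.
    user_id: str
    kind: str
    place_id: Optional[str]
    query: str


def redact_lookups(entries, sensitive_place_ids, sensitive_terms=()):
    """Drop entries that reference a sensitive place by ID or mention a sensitive term."""
    terms = [t.lower() for t in sensitive_terms]
    kept = []
    for entry in entries:
        if entry.place_id in sensitive_place_ids:
            continue
        if any(term in entry.query.lower() for term in terms):
            continue
        kept.append(entry)
    return kept


entries = [
    LogEntry("u1", "directions", "place-123", "directions to clinic"),
    LogEntry("u1", "search", None, "Example Clinic opening hours"),
    LogEntry("u1", "search", None, "coffee near me"),
]
kept = redact_lookups(entries, {"place-123"}, sensitive_terms=["example clinic"])
```

The same filter would need to run over every store that records lookups, including server logs, for the redaction to be consistent.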

Redact all location data, not just a location’s association with a sensitive place.

When redacting sensitive places, the location data itself (where in space a person was at some time) needs to be removed, not just the fact that a physical location is associated with a health facility or other sensitive place. Location data can still be revealing to someone who knows where a particular health care provider is located. Removing associations with a sensitive place might reduce the ease with which large-scale identification can take place (queries for everyone who visited any rehab facility, for example), but we should not assume that law enforcement or private actors won’t also make searches based on raw location or nearby points of interest.

Removing identifiers has limited utility, especially for longitudinal location data.

As is well known, location data over time is highly identifying: a person might, for example, go home at night and to their school or place of work during the day, so a trace with names or account identifiers removed can often still be tied to one individual. Aggregated data for a large group of people (for instance, that there were 75 people at a given address over a certain time period) is less sensitive for these particular threats.
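
To make the contrast with per-person traces concrete, here is a small sketch of aggregating visits into counts per place and day. The input tuple format is hypothetical, and the minimum-count threshold is an assumption added for this example (the recommendation above says only that large-group aggregates are less sensitive).

```python
from collections import defaultdict


def aggregate_visits(visits, min_count=10):
    """Count distinct users per (place_id, day) and drop groups smaller than `min_count`.

    `visits` is an iterable of (user_id, place_id, day) tuples (hypothetical format).
    """
    users_by_group = defaultdict(set)
    for user_id, place_id, day in visits:
        users_by_group[(place_id, day)].add(user_id)
    return {
        group: len(users)
        for group, users in users_by_group.items()
        if len(users) >= min_count
    }


visits = [(f"u{i}", "A", "2022-08-25") for i in range(12)]
visits += [("u1", "B", "2022-08-25"), ("u2", "B", "2022-08-25")]
counts = aggregate_visits(visits)
```

The output keeps the count of 12 for place A and omits place B, whose two visitors would be too easy to single out.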

Involve users, but don’t overburden them with choices.

While it can empower users to be able to easily see, review, and explicitly choose to retain or delete location data, we should not shift all responsibility in this area to the user. People may not be aware of location data that is collected or could be deleted, and may not realize the sensitivity or remember all the relevant sources of such data. User involvement can be a way for people to decide when they specifically want to keep some data (either locally or on remote servers).

Audit and iterate on the redaction process.

As with any safety or compliance measure, ongoing testing is necessary to make sure redactions are effectively applied. Iteration is important because the set of sensitive places, and the ways that law enforcement or private parties access and use location data, may change.
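
One simple form such testing could take is a check that runs over already-redacted data and reports any remaining points near a listed sensitive place. The sketch below assumes points are plain (latitude, longitude) pairs and uses a hypothetical 1,000-meter audit radius. It covers only spatial leaks, not the timing gaps or lookup records discussed above.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    earth_radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def audit_redaction(points, sensitive_places, radius_m=1000.0):
    """Return the points that still fall within `radius_m` of any sensitive place."""
    return [
        (lat, lon)
        for lat, lon in points
        if any(haversine_m(lat, lon, s_lat, s_lon) <= radius_m
               for s_lat, s_lon in sensitive_places)
    ]


sensitive_places = [(40.0, -75.0)]                   # hypothetical coordinates
redacted_output = [(40.0001, -75.0), (40.05, -75.0)]  # first point should have been removed
leaks = audit_redaction(redacted_output, sensitive_places)
```

A non-empty result signals that the redaction step missed data and should be investigated before the dataset is stored or transferred.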

Apply removals to all data sets and all data transfers.

Transfers of raw data to private parties and governmental agencies who could misuse it, or be compelled to transfer the data further, are serious risks. Transparency about transfers and law enforcement requests can help evaluate the need for and effectiveness of this kind of mitigation.

Providers should also consult research on the privacy impacts of location data, because what location data reveals depends on its precision, uniqueness, and extent, as well as on the characteristics of the place and even of other people.

Even following these recommendations, there are some important limitations that developers who pursue mitigation through selective redaction should keep in mind:

Sensitive location lists are inevitably incomplete, leading to some uncertainty about the protection this mitigation provides.
The sensitivity of a location will vary from person to person, so selective removal could also inhibit use of helpful features for people who, for example, work in a sensitive place or need logs of their visits.
Because location data is unfortunately widely collected and shared, removal of sensitive places from one data store does not guarantee that no other source remains accessible to law enforcement or private actors. Selective removal mitigations must not be communicated in a way that provides a false sense of security. At the same time, redaction should not be rejected just because some other entity may also have similar data.
Location data (including data revealing interest or intent regarding a location) is not the only type of data that can reveal sensitive characteristics about ourselves or our health care choices. This mitigation, among others, can be applied to other types of data, including messaging, online activity (such as web browsing and search history), purchase records, and health-related data.

The tech community should continue to propose, implement, and evaluate potential techniques to protect reproductive health information. Based on threat modeling, research, and practical experience, we can gather recommendations about those common techniques, but this is inevitably an iterative and ongoing process while people seeking and providing reproductive health care are under attack.
