Decentralizing the Analysis of Health Data
Numerous state and federal government agencies have established programs that analyze health information to support research and policy objectives. These programs should be undertaken in a way that maximizes the confidentiality and security of patient data and preserves the trust of both health care providers and the public. While a strong policy framework is critical, the technical architecture of information exchange is another important factor. Currently, many government programs using health claims for secondary purposes collect and retain the data in a centralized fashion. The key message of this paper is that decentralized alternatives can achieve most secondary use program goals in a manner that is more protective of privacy and security in the long term.
Centralized databases typically compile data into one system and manage it from that location. Decentralized systems, by contrast, typically leave data with its original sources and perform analyses by querying the data those entities hold. A fundamental problem with the centralized architecture used by many programs is that it typically requires maintaining and sharing multiple copies of patient data. This pattern repeats each time a new research or policy need prompts the creation of another centralized database. Continually building and copying huge repositories of medical data is risky, inefficient, and a poor long-term strategy.
Distributed networks can often cost less and take less time to establish than centralized databases because a distributed network minimizes data transfer and leverages existing infrastructure. Using a distributed network can also reduce the risk and severity of data breaches compared to centralized databases. Leaving data sets with the original data sources minimizes the number of copies of sensitive data sets in circulation.
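The distributed approach described above can be illustrated with a minimal sketch. It assumes a hypothetical network in which each data holder runs a query locally and returns only an aggregate count. The class name, function name, field names, and sample records are all invented for illustration and do not describe any particular program or system. A real network would also need authentication, agency approval of queries, and protections for small counts, none of which are shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class DataHolder:
    """Hypothetical data source (e.g., a health plan) that keeps its records locally."""
    name: str
    records: List[Record]

    def run_count(self, predicate: Callable[[Record], bool]) -> int:
        # The query runs where the data lives; only the count leaves the holder.
        return sum(1 for record in self.records if predicate(record))


def distributed_count(holders: List[DataHolder],
                      predicate: Callable[[Record], bool]) -> Dict[str, int]:
    # Collect one aggregate per holder; no patient-level rows are copied.
    return {holder.name: holder.run_count(predicate) for holder in holders}


# Illustrative, made-up records held by two separate sources.
holders = [
    DataHolder("plan_a", [{"age": 70, "dx": "diabetes"}, {"age": 45, "dx": "asthma"}]),
    DataHolder("plan_b", [{"age": 66, "dx": "diabetes"}, {"age": 30, "dx": "diabetes"}]),
]


def is_older_diabetic(record: Record) -> bool:
    # Example research question: patients aged 65 or older with a diabetes diagnosis.
    return record["dx"] == "diabetes" and record["age"] >= 65


counts = distributed_count(holders, is_older_diabetic)
total = sum(counts.values())
```

In this sketch the requester sees only per-source counts and their total. The underlying records never leave the holders, which is the property that limits the number of copies in circulation.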
CDT recommends that the U.S. Department of Health and Human Services (HHS) collaborate with health care plans and providers, researchers, state agencies, and technology vendors to initiate projects that evaluate the effectiveness of distributed models. CDT also recommends that federal and state agencies ensure their regulations leave open the possibility of using systems, subject to agency approval, that do not rely on centralized databases.