
19-12-2011 | Article

Attacks on health databases reveal privacy weaknesses



MedWire News: Attacks on medical databases that result in the identification of patients occur quite frequently, research shows.

The overall success rate of re-identification attacks on health databases is 34%, a rate the study investigators acknowledge is high.

However, they suggest the results should be interpreted cautiously as the confidence intervals around the estimates are quite large, mainly because the attacks were on small databases.
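The link between database size and interval width can be illustrated with a minimal sketch. It uses the Wilson score interval for a single proportion, which is an assumption made for illustration and not necessarily the method the researchers used. The counts are hypothetical and are chosen only so that both cases show the same 34% observed rate.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical record counts (not from the study): same observed rate, different sizes.
for successes, n in [(17, 50), (1700, 5000)]:
    low, high = wilson_interval(successes, n)
    print(f"n={n}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

With the same observed rate, the interval for the smaller hypothetical database is much wider than the one for the larger database, which is why estimates drawn from small databases carry more uncertainty.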

"Therefore, there is considerable uncertainty around these numbers," write Khaled El Emam (University of Ottawa, Canada) and colleagues in the journal PLoS One.

In de-identification, information in the records is reduced in order to decrease the chances of discovering the patient's identity. De-identification in the context of public health data is important for many privacy statutes and regulations.

There is concern, however, that information within the database can be re-identified with relative ease, "casting doubt on the ability to protect personal information from privacy invasions," state El Emam and colleagues.

In their review, the researchers attempted to characterize the attacks on health data and to compare these privacy attacks with breaches in other types of data.

Fourteen studies reported distinct attacks that resulted in the re-identification of patients in the database, including six attacks on health data. Of these attacks, 11 were performed by researchers testing the relative strength of the privacy controls and did not exploit the risks. Four of the six attacks on health databases were performed by researchers.

Across all of the attacks, the overall proportion of records for which the identity of individuals was ascertained was 26%. For attacks on health records, the overall re-identification success rate was 34%.

"Given such high re-identification rates, it is not surprising that there is a general belief that re-identification is easy," write the researchers.

That said, the results "mask a more nuanced picture that makes it difficult to draw strong conclusions about the ease of re-identification."

They point out that, of the 14 studies, just two de-identified the data based on current standards.

Overall, the researchers highlight the difficulties in determining the real risk of re-identification, given that unsuccessful attacks might not be published because the results would be perceived as uninteresting. Successful attacks might also not be published because the results would be perceived as embarrassing or cause problems for the data custodians.

"This makes it difficult to draw strong conclusions from the combined effect estimate of the proportion of records re-identified," conclude El Emam and colleagues.

By MedWire Reporters