The Department of Health broke the law in 2016 when it published datasets containing a billion lines of historical health data relating to around three million Australians, says Australian Information Commissioner and Privacy Commissioner Timothy Pilgrim.
(The report was finalised before Pilgrim last week retired from his role at the Office of the Australian Information Commissioner.)
Datasets drawn from the Pharmaceutical Benefits and Medicare Benefits schemes and posted on the government’s open data portal carried too high a risk of re-identifying medical providers, the commissioner concluded following an 18-month investigation into the incident.
The commissioner also found the department’s processes for assessing the risks around the data’s release to be “inadequate”.
“The decision making process the Department of Health followed before releasing this data did not involve a clear and documented approval process, rigorous risk management processes, or a significant degree of cross-government coordination,” Pilgrim’s report states.
“This incident holds important lessons for custodians of valuable datasets containing personal information. Determining whether information has been appropriately de-identified requires careful, expert, and likely independent evaluation. Who the information is released to must also be considered,” it adds.
The data, which the commissioner said was made available “in good faith for the public interest”, featured information relating to a 10 per cent sample of people who had made a claim for payment of Medicare Benefits since 1984, or for payment of Pharmaceutical Benefits since 2003.
Within a month of the supposedly de-identified data’s publication, Melbourne University researcher Dr Vanessa Teague and colleagues Chris Culnane and Benjamin Rubinstein managed to re-identify some service provider ID numbers. The datasets were swiftly pulled offline by the department.
Further analysis found information in the dataset could be combined with information from other sources to identify individuals.
“While the Department of Health took steps to de-identify the personal information of Medicare service providers, these measures were ultimately not sufficient. The encryption method for provider numbers was flawed, allowing Medicare service providers to potentially be re-identified from the information,” the commissioner’s report states.
The Privacy Act was breached in a number of ways, the commissioner says: the de-identification steps taken by the department were “inadequate relative to the sensitivity of the information and the context of its release”, and personal information was disclosed “for a purpose other than that for which it was collected”.
Although Teague and her colleagues demonstrated that individuals in the sample could be re-identified – a process made even easier with available commercial datasets – the commissioner concluded that re-identifying patients was not “reasonable to achieve” and so the dataset did not contain personal patient information.
“The affected 10 per cent of Australians should be told,” Teague said in a tweet today.
Pilgrim said the incident provided an opportunity for government to improve its approach to open data since the “de-identification of large and rich datasets for publication to the world at large is extremely difficult”.
Agencies should consider limiting the release of unit-level data about individuals to “trusted recipients, rather than to the world at large”, he added.
Following the snafu, then Attorney-General George Brandis announced in 2016 that the government would introduce legislation to criminalise re-identification, meaning Teague and colleagues could have faced up to two years in prison for their work.
The bill’s passage through parliament looks uncertain.
In December 2016 the Department of the Prime Minister and Cabinet published a new process that government agencies must follow before releasing sensitive unit record datasets as open data, a process the commissioner today said would have avoided the Department of Health incident.