
NHS Digital Data Sharing Remote Audit: Clinical Practice Research Datalink

This report records the key findings of a remote data sharing audit of Clinical Practice Research Datalink (CPRD) between April and July 2022.

Audit summary

Purpose

This report records the key findings of a remote data sharing audit of Clinical Practice Research Datalink (CPRD) between April and July 2022 against:

  • the data sharing framework contract (DSFC) CON-325063-H0M5Y-v2.01 
  • the data sharing agreement (DSA) DARS-NIC-15625-T8K6L-v11.2

This DSA covers the provision of the following datasets:  

Dataset Classification of data Dataset period
Emergency Care Data Set (ECDS) Pseudo/Anonymised, Non-sensitive Oct 2017 – 2022/23 Q1
COVID-19 Hospitalisation in England Surveillance System Pseudo/Anonymised, Non-sensitive Latest available
COVID-19 Second Generation Surveillance System Pseudo/Anonymised, Non-sensitive Latest available
Mental Health Minimum Data Set Pseudo/Anonymised, Non-sensitive 2006/07 - 2013/14 (to 08/2014)
Mental Health and Learning Disabilities Data Set Pseudo/Anonymised, Non-sensitive Aug 2014 – 2016_M9
Hospital Episode Statistics (HES) Admitted Patient Care Pseudo/Anonymised, Non-sensitive 1997/98 – 2022/23 Q1
HES Critical Care Pseudo/Anonymised, Non-sensitive 2008/09 – 2022/23 Q1
HES Outpatients Pseudo/Anonymised, Non-sensitive 2003/04 – 2022/23 Q1
HES Accident and Emergency Pseudo/Anonymised, Non-sensitive 2007/08 – 2019/20 Q3
Diagnostic Imaging Dataset Pseudo/Anonymised, Non-sensitive Latest available
Patient Reported Outcome Measures (Linkable to HES) Pseudo/Anonymised, Non-sensitive Latest available
HES Civil Registration (Deaths) bridge Pseudo/Anonymised, Non-sensitive Latest available
Bridge File: HES to Mental Health Minimum Data Set  Pseudo/Anonymised, Non-sensitive Bridge File
Bridge File: HES to Diagnostic Imaging Dataset Pseudo/Anonymised, Non-sensitive Latest available
MRIS Bespoke Pseudo/Anonymised, Non-sensitive 2019/2020
Civil Registration (Deaths) – Secondary Care Cut Pseudo/Anonymised, Sensitive Latest available
Mental Health Services Data Set Pseudo/Anonymised, Sensitive 2016/17 – 2020/21
CPRD/University Hospital Birmingham linkage file Pseudo/Anonymised, Non-sensitive Latest available

 

The Controller is the Department of Health and Social Care (DHSC) and the Processors are CPRD and NTT DATA UK Limited. Neither the DHSC nor NTT DATA UK was involved in the audit, as neither organisation is involved in the production of datasets for customers or in the IQVIA audit process.

CPRD is a centre of the Medicines and Healthcare products Regulatory Agency (MHRA), an executive agency of the Department of Health and Social Care. 

The DSA allows CPRD to onwardly share anonymised data with third parties, both in the UK and worldwide, under sub-licence arrangements. CPRD is required to assess the re-identification risk and apply data minimisation techniques as required to ensure re-identification is not possible by means reasonably likely to be used. 

Interviews during the audit were conducted through video conferencing. The audit also included a significant amount of off-line analysis of the datasets conducted by Office for National Statistics (ONS) staff on behalf of NHS Digital.

Whilst this report is based on the criteria expressed in the NHS Digital Data Sharing Remote Audit Guide version 1, there are some deviations given the specific nature of the audit. 


Audit type and scope

Audit type Focussed
Scope areas

The purpose of the audit is to confirm that suitable steps are being taken to reduce the risk of re-identification in those datasets made available to sub-licensees. Such steps will include the minimisation of the data being shared, as well as undertaking audits.

Only datasets that are controlled (either solely or jointly) by NHS Digital and that form part of CPRD’s standard linkage offerings are within the audit scope. Processes relating to the release of unlinked CPRD primary care data (which are controlled by CPRD) are out of scope.

Restrictions The audit excludes consideration of CPRD’s research data governance process by which applications for data are received, reviewed and approved. 

Analysis Overview

The audit focussed on 2 main elements:

  • an assessment of risk by ONS staff around re-identification of individuals
  • a review of the audit activities undertaken by CPRD with respect to sub-licensees.

In the first element, ONS examined two representative datasets, along with the agreed protocols, using different approaches to simulate how an attacker might use the data to try to find personal information about one or more individuals contained within a dataset. Methods considered included:

  • attacker attempts to identify an individual not known to them by building up a picture using the datasets supplied
  • attacker attempts to identify a known individual with a relatively rare medical condition.

Such attacks also considered the potential impact of data already in the public domain.
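As a purely illustrative sketch (not the method ONS used, which this report does not describe in detail), one common way to gauge this kind of risk is to count how many records share the same combination of potentially identifying fields. Records that are unique on such a combination are the ones an attacker could most plausibly single out. The field names, sample records and function names below are hypothetical and were chosen only for the example.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """For each record, count how many records share its quasi-identifier values."""
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    counts = Counter(keys)
    return [counts[key] for key in keys]


def unique_fraction(records, quasi_identifiers):
    """Fraction of records that are unique on the chosen quasi-identifiers."""
    sizes = group_sizes(records, quasi_identifiers)
    if not sizes:
        return 0.0
    return sum(1 for size in sizes if size == 1) / len(sizes)


# Hypothetical, made-up records; not drawn from any real dataset.
records = [
    {"age_band": "40-49", "region": "North", "condition": "asthma"},
    {"age_band": "40-49", "region": "North", "condition": "asthma"},
    {"age_band": "70-79", "region": "South", "condition": "rare_x"},
    {"age_band": "40-49", "region": "South", "condition": "asthma"},
]

# Fewer fields available to the attacker means larger groups and fewer unique records.
print(unique_fraction(records, ["age_band"]))            # 0.25
print(unique_fraction(records, ["age_band", "region"]))  # 0.5
```

In this toy example, adding a second field doubles the share of records that stand alone, which is why minimising the fields released lowers the chance of singling someone out.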

The approach CPRD currently takes to auditing sub-licensees was reviewed by the Audit Team. This approach has matured since it was first described in the post audit review of the 2018 data sharing audit.
 

Overall risk statement

Based on evidence presented during the audit and the type of data being shared, the following risk has been assigned from the options of Critical - High - Medium - Low.

Current risk statement: Low

In deriving this risk, the Audit Team will consider compliance, duty of care, confidentiality and integrity, as appropriate.


Data recipient’s acceptance statement

CPRD has reviewed this report and confirmed that it is accurate. 

Data recipient’s action plan

Since NHS Digital has not identified any nonconformities, observations or outstanding information, there is no requirement for CPRD to produce a corrective action plan. Therefore, no post audit review will be conducted by NHS Digital.


Findings

Risk of re-identification

The analysis adapted standard intruder scenarios to the project, including the use of published information and spontaneous recognition. It concluded that identification is inherently difficult, whether by an attacker attempting to identify an individual not known to them by building up a picture using the datasets supplied, or by an attacker attempting to identify a known individual with a relatively rare medical condition. The risk of re-identification is further reduced because there is little information in the public domain to match against fields in the 2 representative examples, and these examples contain few key identifying variables.

Audit process

The following table identifies the 6 opportunities for improvement raised as part of the audit. 

Ref Finding Link to area Clause Designation
1 CPRD should make use of video conferencing when reviewing available audit evidence and obtaining further responses to questions. Operational Management   Opportunity for improvement
2 CPRD should request confirmatory evidence in response to email answers. Statements as to what corroborative evidence was supplied or presented by the auditee should be included in internal audit reports. Operational Management   Opportunity for improvement
3 The Client Audit Acceptance Criteria could be extended to include questions around:
  • how the organisation would undertake the deletion of data (electronically and physically) when required
  • incidents / near misses / breaches that may indirectly or directly impact the supplied data.
CPRD may also wish to consider how it interprets responses when the auditee is using cloud to hold data.
Operational Management   Opportunity for improvement
4 CPRD should consider how it scores auditees for responses when the organisation is unable to provide evidence of outputs at the time of the audit. Operational Management   Opportunity for improvement
5 The Client Audit Procedure should be expanded to define the process by which recommendations and actions are followed up and reported. As part of this activity, CPRD should consider updating its webpage to facilitate the capturing of post audit statements. Operational Management   Opportunity for improvement
6 CPRD should ensure that the number of audits defined in its Client Audit Procedure is achieved. Operational Management   Opportunity for improvement

Use of data

CPRD confirmed that the datasets were only being processed and used for the purposes defined in the DSA and were only being linked with those datasets explicitly allowed in the DSA.

Data location

CPRD confirmed that processing and storage locations, including disaster recovery and backups, of the datasets were limited to the location shown in the following table. These locations conform with the territory of use defined in clause 2c of the DSA.

Organisation Territory of use
CPRD Worldwide

 


Disclaimer

The audit was based upon a sample of the data recipient’s activities, as observed by the Audit Team. The findings detailed in this audit report may not include all possible nonconformities which may exist. In addition, as the audit interviews were conducted through a video conference platform, certain controls that would normally be assessed whilst onsite could not be witnessed.

NHS Digital has prepared this audit report for its own purposes. As a result, NHS Digital does not assume any liability to any person or organisation for any loss or damage suffered or costs incurred by it arising out of, or in connection with, this report, however such loss or damage is caused. NHS Digital does not assume liability for any loss occasioned to any person or organisation acting or refraining from acting as a result of any information contained in this report.

Last edited: 29 June 2023 10:27 am