NHS Digital Data Sharing Remote Audit: Genomics England
This report records the key findings of a remote data sharing audit of Genomics England Limited during March 2022.
Audit summary
Purpose
This report records the key findings of a remote data sharing audit of Genomics England Limited (GE) between 7 and 14 March 2022. It provides an evaluation of how GE conforms to the requirements of both:
- the data sharing framework contract (DSFC) CON-368648-M3S4Z v2.01
- the data sharing agreement (DSA) DARS-NIC-12784-R8W7V-v8.6
This DSA covers the provision of the following datasets:
Dataset | Classification of data | Dataset period |
---|---|---|
Bridge file: Hospital Episode Statistics (HES) to Mental Health Minimum Data Set | Pseudo/Anonymised, Non-sensitive | Historic Data Request |
Bridge file: HES to Diagnostic Imaging Dataset | Pseudo/Anonymised, Non-sensitive | Historic Data Request |
HES Critical Care | Identifiable, Non-sensitive | 2008/09 - 2021/22_M10 |
Diagnostic Imaging Dataset | Identifiable, Non-sensitive | 2008/09 - 2019/20_M13 |
Emergency Care Data Set (ECDS) | Identifiable, Sensitive | 2017/18 - 2020/21_M10 |
Mental Health Minimum Data Set | Identifiable, Sensitive | 2006/07 - 2014/15 |
Mental Health and Learning Disabilities Data Set | Identifiable, Sensitive | 2014/15 - 2015/16 |
Medical Research Information Service (MRIS) - Members and Postings Report | Identifiable, Sensitive | May 2016 - March 2020 |
HES Admitted Patient Care | Identifiable, Sensitive | 1989/90 - 2021/22_M10 |
HES Outpatients | Identifiable, Sensitive | 2003/04 - 2021/22_M10 |
HES Accident and Emergency | Identifiable, Sensitive | 2007/08 - 2019/20_M12 |
MRIS - Flagging Current Status Report | Identifiable, Sensitive | May 2016 - March 2020 |
MRIS - Cohort Event Notification Report | Identifiable, Sensitive | May 2016 - March 2020 |
MRIS - Cause of Death Report | Identifiable, Sensitive | May 2016 - March 2020 |
MRIS - List Cleaning Report | Identifiable, Sensitive | May 2016 - March 2020 |
Patient Reported Outcome Measures (Linkable to HES) | Identifiable, Sensitive | 2009/10 - 2019/20_M13 |
Mental Health Services Data Set | Identifiable, Sensitive | 2016/17 - 2020/21 |
Demographics | Identifiable, Sensitive | Latest Available |
Civil Registration - Deaths | Identifiable, Sensitive | Latest Available |
Cancer Registration Data | Identifiable, Sensitive | Latest Available |
The Controller is GE and the Processors are Amazon Web Services (AWS), UKCloud Limited, Lifebit Biotech Limited (Lifebit) and Microsoft UK (undeclared on DSA). AWS, UKCloud Limited and Microsoft UK do not have access to the data and only provide cloud hosting services.
GE was established by the Department of Health and Social Care to deliver the 100,000 Genomes Project. This project has sequenced 100,000 whole genomes from NHS patients with rare diseases and their families, as well as patients with cancers.
The 100,000 Genomes Project has transformed patient outcomes. Patients have received diagnoses of their conditions for the first time. There has been a better understanding of the effectiveness of certain treatments for certain patients, refining NHS treatments and enabling precision medicine. Following completion of the project, GE is now providing the Genomic Medicine Service (GMS) to the NHS, a key component of which is the National Genomic Informatics System (NGIS).
All participants and patients, about whom GE holds information, have opted in for their data to be used for research and have consented to this data being linked with other health data.
The DSA allows GE to sub-license data which has been de-identified to academic and commercial organisations, subject to the terms, checks and controls carried out by GE in relation to sub-licensing. All commercial research has to be approved on a project-by-project basis by the GE Access Review Committee.
All research analysis on the de-identified datasets is only allowed to be carried out within GE’s Research Environment. Movement of files into and out of the Research Environment is governed via an Airlock Policy.
This report also considers whether GE and Lifebit conform to their own policies, processes and procedures.
The interviews during the audit were conducted through video conferencing.
This is an exception report based on the criteria expressed in the NHS Digital Data Sharing Remote Audit Guide version 1.
Audit type and scope
Audit type | Routine |
---|---|
Scope areas | Information transfer |
Restrictions | Access control - limited visibility of physical controls |
Overall risk statement
Based on evidence presented during the audit and the type of data being shared, the following risk has been assigned from the options of Critical, High, Medium and Low:
Current risk statement: Medium
This risk represents a deviation from the terms and conditions of the contractual documents, signed by both parties. In deriving this risk, the Audit Team will consider compliance, duty of care, confidentiality and integrity, as appropriate.
Data recipient’s acceptance statement
GE confirms this report is accurate regarding compliance with the clauses laid out in the DSA. The audit was not a review of GE’s overall strategy, operational performance or approach to data privacy and data security beyond the scope of the DSA with NHS Digital.
GE takes compliance with the DSA seriously and will work closely with NHS Digital to put actions in place to close the findings. GE anticipates that many findings and required changes to the DSA will be rapidly resolved or implemented.
For finding 1, GE has participant consent to allow PROMS data to be accessed by academic and commercial entities for health research purposes, approved by its independent Access Review Committee. The audit flagged a discrepancy between the terms of consents and what was permitted under the terms of the DSA, so access to PROMS data for commercial researchers will be removed pending any further DSA updates.
Data recipient’s action plan
GE will establish a corrective action plan to address each finding shown in the findings table below. NHS Digital will validate this plan and the resultant actions at a post audit review with GE to confirm the findings have been satisfactorily addressed. The post audit review will also consider the outstanding evidence, at which point the Audit Team may raise further findings.
Findings
The following table identifies the 6 agreement nonconformities, 5 opportunities for improvement and 6 points for follow-up raised as part of the audit.
Ref | Finding | Link to area | Clause | Designation |
---|---|---|---|---|
1 | Patient Reported Outcome Measures (PROMS) data has been shared with commercial organisations which is prohibited by the DSA. | Use and Benefits | DSA, Annex A, Sections 5b and 6 | Agreement nonconformity |
2 | Data are being stored within secure cloud-based UK data centres whose locations were not declared on the DSA. | Information Transfer | DSA, Annex A, Section 2b | Agreement nonconformity |
3 | The Audit Team found that two users employed by Lifebit, selected as part of a sample, had not completed data protection training in the last 12 months. | Operational Management | DSFC, Schedule 2, Section A, Clause 1.2.2 | Agreement nonconformity |
4 | Dormant accounts are not being managed in line with the requirements of the DSFC. Also, there is no regular review of access to the data via GE user accounts and privileged accounts. | Access Control | DSFC, Schedule 2, Section A, Clause 4.1 | Agreement nonconformity |
5 | There is no comprehensive Information Asset Register (IAR) to cover the data supplied under the DSA. Instead, information is spread across different documents. | Operational Management | DSFC, Schedule 2, Section A, Clause 3.2 | Agreement nonconformity |
6 | The DSA needs to: | Use and Benefits | DSA, Annex A, Sections 2c and 5b | Agreement nonconformity |
7 | Publications that are prepared using data provided by NHS Digital should recognise the source of the data as being from NHS Digital, where possible. | Use and Benefits | | Opportunity for improvement |
8 | GE should consider implementing multi-factor authentication for all third-party accounts. | Access Control | | Opportunity for improvement |
9 | GE should perform a risk assessment to ensure that any risk arising from the availability of user-owned datasets, which can be uploaded to a private location on AWS, is acceptable or managed. | Risk Management | | Opportunity for improvement |
10 | GE should include the sub-licensing process in its future internal audit programme to ensure it is fully compliant with the requirements of the DSFC, DSA and also GE’s own policies and procedures. For example, the application process, the approval process, the use of accounts, the Airlock process and any outputs. | Operational Management | | Opportunity for improvement |
11 | GE should update the Data Protection Framework and remove the reference to the De-identification Policy which has been archived. | Operational Management | | Opportunity for improvement |
12 | At the post audit review, the Audit Team will: | Operational Management | | Follow-up |
13 | At the post audit review, the Audit Team will look at GE's implementation of work to reduce the number of touchpoints of the data. GE has commissioned this work to improve the handling, and ultimately the destruction, of the data. | Information Transfer | | Follow-up |
14 | At the post audit review, the Audit Team will check that the latest sub-licensing agreements (GeCIP and Data Access Agreement) have been provided to NHS Digital for review. The last time these agreements were supplied to DARS was in 2019. | Operational Management | | Follow-up |
15 | At the post audit review, the Audit Team will review evidence that the latest revision to the Data Protection Impact Assessment (DPIA) has been reviewed and approved. | Operational Management | | Follow-up |
16 | At the post audit review, the Audit Team will check a certificate of destruction (CoD) has been completed by GE to cover the data held at a cloud provider, and the CoD has been approved by NHS Digital. | Data Destruction | | Follow-up |
17 | At the post audit review, the Audit Team will review the most recent validation report and supporting action plan. | Access Control | | Follow-up |
Use of data
GE confirmed there was one dataset that was not being processed and used for the purposes defined in the DSA (see finding 1). However, the datasets provided by NHS Digital were only being linked with those datasets explicitly allowed in the DSA for participants who have consented for their data to be used for research purposes.
Data location
GE confirmed that processing and storage locations, including disaster recovery and backups, of the datasets were limited to the locations shown in the following table. However, de-identified data was being accessed from locations which did not conform with the territory of use defined in clause 2c of the DSA (see finding 6).
Organisation | Territory of use |
---|---|
UKCloud | England / Wales |
AWS | England / Wales |
Microsoft (undeclared) | England / Wales |
Backup retention
The duration for which data may be retained on backup media is:
Organisation | Media type | Period |
---|---|---|
UKCloud | Disk | The data are currently being deleted (see finding 16) |
AWS | Disk | 7 days |
Microsoft | Disk | 30 days |
Good Practice
During the audit, the Audit Team noted the following area of good practice:
- GE was able to clearly demonstrate the value that the data supplied under this DSA has had in benefiting health and social care.
Disclaimer
The audit was based upon a sample of the data recipient’s activities, as observed by the Audit Team. The findings detailed in this audit report may not include all possible nonconformities which may exist. In addition, as the audit interviews were conducted through a video conference platform, certain controls that would normally be assessed whilst onsite could not be witnessed.
NHS Digital has prepared this audit report for its own purposes. As a result, NHS Digital does not assume any liability to any person or organisation for any loss or damage suffered or costs incurred by it arising out of, or in connection with, this report, however such loss or damage is caused. NHS Digital does not assume liability for any loss occasioned to any person or organisation acting or refraining from acting as a result of any information contained in this report.
Last edited: 30 November 2022 5:27 pm