NHS Digital Data Sharing Remote Audit: Genomics England

This report records the key findings of a remote data sharing audit of Genomics England Limited during March 2022. 

Audit summary

Purpose

This report records the key findings of a remote data sharing audit of Genomics England Limited (GE) between 7 and 14 March 2022. It provides an evaluation of how GE conforms to the requirements of both:

  • the data sharing framework contract (DSFC) CON-368648-M3S4Z v2.01
  • the data sharing agreement (DSA) DARS-NIC-12784-R8W7V-v8.6

This DSA covers the provision of the following datasets:

Dataset | Classification of data | Dataset period
Bridge file: Hospital Episode Statistics (HES) to Mental Health Minimum Data Set | Pseudo/Anonymised, Non-sensitive | Historic Data Request
Bridge file: HES to Diagnostic Imaging Dataset | Pseudo/Anonymised, Non-sensitive | Historic Data Request
HES Critical Care | Identifiable, Non-sensitive | 2008/09 - 2021/22_M10
Diagnostic Imaging Dataset | Identifiable, Non-sensitive | 2008/09 - 2019/20_M13
Emergency Care Data Set (ECDS) | Identifiable, Sensitive | 2017/18 - 2020/21_M10
Mental Health Minimum Data Set | Identifiable, Sensitive | 2006/07 - 2014/15
Mental Health and Learning Disabilities Data Set | Identifiable, Sensitive | 2014/15 - 2015/16
Medical Research Information Service (MRIS) - Members and Postings Report | Identifiable, Sensitive | May 2016 - March 2020
HES Admitted Patient Care | Identifiable, Sensitive | 1989/90 - 2021/22_M10
HES Outpatients | Identifiable, Sensitive | 2003/04 - 2021/22_M10
HES Accident and Emergency | Identifiable, Sensitive | 2007/08 - 2019/20_M12
MRIS - Flagging Current Status Report | Identifiable, Sensitive | May 2016 - March 2020
MRIS - Cohort Event Notification Report | Identifiable, Sensitive | May 2016 - March 2020
MRIS - Cause of Death Report | Identifiable, Sensitive | May 2016 - March 2020
MRIS - List Cleaning Report | Identifiable, Sensitive | May 2016 - March 2020
Patient Reported Outcome Measures (Linkable to HES) | Identifiable, Sensitive | 2009/10 - 2019/20_M13
Mental Health Services Data Set | Identifiable, Sensitive | 2016/17 - 2020/21
Demographics | Identifiable, Sensitive | Latest Available
Civil Registration - Deaths | Identifiable, Sensitive | Latest Available
Cancer Registration Data | Identifiable, Sensitive | Latest Available

 

The Controller is GE and the Processors are Amazon Web Services (AWS), UKCloud Limited, Lifebit Biotech Limited (Lifebit) and Microsoft UK (undeclared on DSA). AWS, UKCloud Limited and Microsoft UK do not have access to the data and only provide cloud hosting services.

GE was established by the Department of Health and Social Care to deliver the 100,000 Genomes Project. This project has sequenced 100,000 whole genomes from NHS patients with rare diseases and their families, as well as patients with cancers.

The 100,000 Genomes Project has transformed patient outcomes. Patients have received diagnoses of their conditions for the first time. There has been a better understanding of the effectiveness of certain treatments for certain patients, refining NHS treatments and enabling precision medicine. Following completion of the project, GE is now providing the Genomic Medicine Service (GMS) to the NHS, a key component of which is the National Genomic Informatics System (NGIS).

All participants and patients about whom GE holds information have opted in for their data to be used for research and have consented to this data being linked with other health data.

The DSA allows GE to sub-license data which has been de-identified to academic and commercial organisations, subject to the terms, checks and controls carried out by GE in relation to sub-licensing. All commercial research has to be approved on a project-by-project basis by the GE Access Review Committee.

Research analysis on the de-identified datasets may only be carried out within GE’s Research Environment. Movement of files into and out of the Research Environment is governed by an Airlock Policy.

This report also considers whether GE and Lifebit conform to their own policies, processes and procedures.

The interviews during the audit were conducted through video conferencing.

This is an exception report based on the criteria expressed in the NHS Digital Data Sharing Remote Audit Guide version 1.


Audit type and scope

Audit type: Routine

Scope areas:

  • Information transfer
  • Access control
  • Data use and benefits
  • Risk management
  • Operational management and control
  • Data destruction

Restrictions:

  • Access control - limited visibility of physical controls

Overall risk statement

Based on the evidence presented during the audit and the type of data being shared, the following risk has been assigned from the options Critical, High, Medium and Low:

Current risk statement: Medium

This risk represents a deviation from the terms and conditions of the contractual documents, signed by both parties. In deriving this risk, the Audit Team will consider compliance, duty of care, confidentiality and integrity, as appropriate.


Data recipient’s acceptance statement

GE confirms this report is accurate regarding compliance with the clauses laid out in the DSA. The audit was not a review of GE’s overall strategy, operational performance or approach to data privacy and data security beyond the scope of the DSA with NHS Digital.

GE takes compliance with the DSA seriously and will work closely with NHS Digital to put actions in place to close the findings. GE anticipates that many of the findings and required changes to the DSA will be rapidly resolved or implemented.

For finding 1, GE has participant consent to allow PROMS data to be accessed by academic and commercial entities for health research purposes, approved by its independent Access Review Committee. The audit flagged a discrepancy between the terms of consents and what was permitted under the terms of the DSA, so access to PROMS data for commercial researchers will be removed pending any further DSA updates.

Data recipient’s action plan

GE will establish a corrective action plan to address each finding shown in the findings table below. NHS Digital will validate this plan and the resultant actions at a post audit review with GE to confirm the findings have been satisfactorily addressed. The post audit review will also consider the outstanding evidence, at which point the Audit Team may raise further findings.


Findings

The following table identifies the 6 agreement nonconformities, 5 opportunities for improvement and 6 points for follow-up raised as part of the audit. 

Ref | Finding | Link to area | Clause | Designation
1 | Patient Reported Outcome Measures (PROMS) data has been shared with commercial organisations, which is prohibited by the DSA. | Use and Benefits | DSA, Annex A, Sections 5b and 6 | Agreement nonconformity
2 | Data are being stored within secure cloud-based UK data centres whose locations were not declared on the DSA. | Information Transfer | DSA, Annex A, Section 2b | Agreement nonconformity
3 | The Audit Team found that two users employed by Lifebit, selected as part of a sample, had not completed data protection training in the last 12 months. | Operational Management | DSFC, Schedule 2, Section A, Clause 1.2.2 | Agreement nonconformity
4 | Dormant accounts are not being managed in line with the requirements of the DSFC. Also, there is no regular review of access to the data via GE user accounts and privileged accounts. | Access Control | DSFC, Schedule 2, Section A, Clause 4.1 | Agreement nonconformity
5 | There is no comprehensive Information Asset Register (IAR) to cover the data supplied under the DSA. Instead, information is spread across different documents. | Operational Management | DSFC, Schedule 2, Section A, Clause 3.2 | Agreement nonconformity
6 | The DSA needs to: (a) document clearly the Airlock review process; (b) update the territory of use from England and Wales to Worldwide, as sub-licensees are accessing the de-identified data globally and Lifebit is accessing the de-identified data outside England and Wales; (c) reflect a joint GE and NHS Digital understanding around data minimisation and the status of the data (for example, personal data). | Use and Benefits | DSA, Annex A, Sections 2c and 5b | Agreement nonconformity
7 | Publications that are prepared using data provided by NHS Digital should recognise the source of the data as being from NHS Digital, where possible. | Use and Benefits | | Opportunity for improvement
8 | GE should consider implementing multi-factor authentication for all third-party accounts. | Access Control | | Opportunity for improvement
9 | GE should perform a risk assessment to ensure any derived risk is acceptable or managed through the availability of user-owned datasets, which can be uploaded to a private location on AWS. | Risk Management | | Opportunity for improvement
10 | GE should include the sub-licensing process in its future internal audit programme to ensure it is fully compliant with the requirements of the DSFC, the DSA and GE’s own policies and procedures. For example, the application process, the approval process, the use of accounts, the Airlock process and any outputs. | Operational Management | | Opportunity for improvement
11 | GE should update the Data Protection Framework and remove the reference to the De-identification Policy, which has been archived. | Operational Management | | Opportunity for improvement
12 | At the post audit review, the Audit Team will: (a) confirm that an Information Asset Owner (IAO) and an Information Asset Administrator (IAA) have been formally identified for the data assets supplied under this DSA; (b) review the training needs analysis for specialist roles such as IAO, IAA, Data Protection Officer (DPO) and Senior Information Risk Owner (SIRO). | Operational Management | | Follow-up
13 | At the post audit review, the Audit Team will look at the implementation by GE to reduce the number of touchpoints of the data. The work has been commissioned by GE for better handling of the data and ultimately the destruction of the data. | Information Transfer | | Follow-up
14 | At the post audit review, the Audit Team will check that the latest sub-licensing agreements (GeCIP and Data Access Agreement) have been provided to NHS Digital for review. The last time these agreements were supplied to DARS was in 2019. | Operational Management | | Follow-up
15 | At the post audit review, the Audit Team will review evidence that the latest revision to the Data Protection Impact Assessment (DPIA) has been reviewed and approved. | Operational Management | | Follow-up
16 | At the post audit review, the Audit Team will check that a certificate of destruction (CoD) has been completed by GE to cover the data held at a cloud provider, and that the CoD has been approved by NHS Digital. | Data Destruction | | Follow-up
17 | At the post audit review, the Audit Team will review the most recent validation report and supporting action plan. | Access Control | | Follow-up

Use of data

GE confirmed there was one dataset that was not being processed and used for the purposes defined in the DSA (see finding 1). However, the datasets provided by NHS Digital were only being linked with those datasets explicitly allowed in the DSA for participants who have consented for their data to be used for research purposes.

Data location

GE confirmed that the processing and storage locations of the datasets, including disaster recovery and backups, were limited to the locations shown in the following table. However, de-identified data was being accessed from locations which did not conform with the territory of use defined in clause 2c of the DSA (see finding 6).

Organisation | Territory of use
UKCloud | England / Wales
AWS | England / Wales
Microsoft (undeclared) | England / Wales

Backup retention

The duration for which data may be retained on backup media is:

Organisation | Media type | Period
UKCloud | Disk | The data are currently being deleted (see finding 16)
AWS | Disk | 7 days
Microsoft | Disk | 30 days

Good Practice

During the audit, the Audit Team noted the following area of good practice:

  • GE was able to clearly demonstrate the value that the data supplied under this DSA has had in benefiting health and social care.

Disclaimer

The audit was based upon a sample of the data recipient’s activities, as observed by the Audit Team. The findings detailed in this audit report may not include all possible nonconformities which may exist. In addition, as the audit interviews were conducted through a video conference platform, certain controls that would normally be assessed whilst onsite could not be witnessed.

NHS Digital has prepared this audit report for its own purposes. As a result, NHS Digital does not assume any liability to any person or organisation for any loss or damage suffered or costs incurred by it arising out of, or in connection with, this report, however such loss or damage is caused. NHS Digital does not assume liability for any loss occasioned to any person or organisation acting or refraining from acting as a result of any information contained in this report.

Last edited: 30 November 2022 5:27 pm