Scope

The genomics ecosystem consists of all devices and system components connected to the clinical network involved in the end-to-end genomics diagnostics data flow. It comprises test ordering, genome extraction, plating, transmission, analysis, storage and reporting.

Only genomics test requests and orders made electronically are in scope.


Overview

DNA, or deoxyribonucleic acid, is the hereditary material in humans and almost all other organisms; and nearly every cell in a person’s body has the same DNA. The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The order, or sequence, of these bases determines the information available for building and maintaining an organism, similar to the way in which letters of the alphabet appear in a certain order to form words and sentences.

A genome is an organism's complete set of DNA, including all of its genes as well as its hierarchical, three-dimensional structural configuration.

Genomics is the study of the genes in our DNA, their functions and their influence on the growth, development and working of the body – using a variety of techniques to look at the body’s DNA and associated compounds. Clinical genomics (previously called clinical genetics) refers to services in which doctors (typically clinical geneticists) and genetic counsellors work with other health professionals to diagnose genetic conditions and assess the risk that a patient will inherit or develop a genetic condition.

Genomics laboratory hubs – The national genomic testing service is delivered through a network of seven genomic laboratory hubs (GLHs), each responsible for co-ordinating services for a particular part of the country.


Genomics use cases

The use cases below show the possible end-to-end genomics data flow, from ordering a test to analysis and reporting:

  1. GP/primary care – Requests for genome testing can be made electronically from the surgery GP system via ordering systems like ICE, EMIS, Vision or SystmOne, and sent to the trust.
  2. Secondary care (inpatient) – Requests received via patient management systems like Electronic Patient Records (EPR) are fed into ordering tools (ICE, EMIS), where the orders generated are forwarded to the appropriate recipient, depending on whether the testing is via the whole genome sequencing (WGS) or non-whole genome sequencing (NWGS) ecosystem, as described below:
    1. Whole genome ecosystem – Orders are sent electronically to the GLH, which may involve transcribing, before being sent to the test order management system within Genomics England Ltd (GEL) for processing. Samples are sent to the GLH for DNA extraction and plating, before being sent to GEL for analysis and sequencing, followed by test outcome interpretation via the interpretation portal. On completion, the results are sent to the requestor.
    2. Non-whole genome ecosystem – The test request and ordering process is similar to that for whole genome sequencing (WGS), but in this case testing, analysis, sequencing and reporting are all carried out in the GLH(s) within the NHS environment. On completion, test results are sent back to the requestor.
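
The difference between the two ecosystems can be sketched as data. The following is a minimal illustration only, assuming a simplified list of stages taken from the use cases above; the names `Workflow`, `STAGES` and `organisations_involved` are hypothetical and do not correspond to any real NHS or GEL system.

```python
from enum import Enum


class Workflow(Enum):
    WGS = "WGS"    # whole genome sequencing
    NWGS = "NWGS"  # non-whole genome sequencing


# Hypothetical, simplified stage lists: (organisation, stage) pairs
# summarising the two use cases described in the text.
STAGES = {
    Workflow.WGS: [
        ("GLH", "receive order, transcribe if needed"),
        ("GEL", "test order management"),
        ("GLH", "DNA extraction and plating"),
        ("GEL", "analysis and sequencing"),
        ("GEL", "interpretation portal"),
    ],
    Workflow.NWGS: [
        ("GLH", "receive order"),
        ("GLH", "testing, analysis and sequencing"),
        ("GLH", "reporting"),
    ],
}


def organisations_involved(workflow: Workflow) -> list[str]:
    """Return the organisations an order passes through, in first-seen order."""
    seen: list[str] = []
    for organisation, _stage in STAGES[workflow]:
        if organisation not in seen:
            seen.append(organisation)
    return seen


print(organisations_involved(Workflow.WGS))
print(organisations_involved(Workflow.NWGS))
```

Listing the organisations per workflow makes the trust boundary visible: in this sketch only WGS orders leave the GLH for GEL.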

Genomics diagnostics components

The genomics ecosystem consists of all connected medical devices involved in the end-to-end genomics testing flow, comprising ordering, sample collection and transmission, sample analysis, and reporting. To improve the security posture of the clinical network, these components should be segmented on the network.

These components are split between NHS laboratory hubs, GEL and 3rd parties depending on whether the required test is WGS or NWGS. The components can be grouped into:

Application services – These are the standard test/order request systems used to request and order various types of testing for patients, including genomics testing, either in primary care or in secondary care systems in hospitals. Examples include EPR systems and ordering tools (for example ICE or EPIC) used by patient care teams to request pathology testing for a patient.

GEL – GEL has its own test order management system (TOMS), used to manage all aspects of WGS processing.

DNA/RNA extraction devices – These are devices used in DNA extraction such as robotic arms.

DNA purification and analysis devices – Plating is specific to WGS and GLHs provide the plates. Plating is done by plating robots.

Genome sequencing devices – Sequencing is used within WGS and NHS labs. It involves both:

  • library preparation, a chemical process in which RNA reagents are added to the test, performed either manually or by automated fluid handlers
  • the sequencing process that determines the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time

Sequencing is performed by a 3rd party provider using next-generation sequencing devices. The outputs are sequencing data files (BAM and VCF), which are transferred to GEL storage repositories.
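
To give a sense of what a VCF file carries, the sketch below reads the first five fixed columns (CHROM, POS, ID, REF, ALT) from tab-separated VCF data lines and skips header lines beginning with `#`. It is a minimal illustration, not a full VCF parser; the `Variant` class, the function names and the example record are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alts: list[str]


def parse_vcf_line(line: str) -> Variant:
    """Parse one VCF data line using only its first five fixed columns."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _variant_id, ref, alt = fields[:5]
    # ALT may list several alternative alleles separated by commas.
    return Variant(chrom, int(pos), ref, alt.split(","))


def read_variants(lines) -> list[Variant]:
    """Skip blank lines and '#' header lines, then parse the remaining lines."""
    return [
        parse_vcf_line(line)
        for line in lines
        if line.strip() and not line.startswith("#")
    ]


# Illustrative content only, not real patient or reference data.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t12345\t.\tA\tG,T\t50\tPASS\t.\n",
]

variants = read_variants(EXAMPLE_VCF)
print(variants)
```

In practice an established library would normally be used rather than hand-written parsing. The sketch only shows why these files need the same handling as other patient data: each line records how an individual's genome differs from the reference.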

Analysis – Once sequencing data is received, a bioinformatic pipeline is used to identify key information in combination with other artefacts and a set of algorithms. The resulting variants are stored in Congenica, a transient, web-hosted 3rd party data store, where they are held for a maximum of 60 days. Scientists log into the Congenica portal to access the data for analysis.

TOMS – This is the test ordering management system used by GEL to manage all aspects of the WGS testing from receiving orders to reporting.

Interpretation portal (test outcome analysis) – The interpretation portal is a web portal hosted in GEL that provides access to test data for various authorised personnel.

Clinical interpretation portal (CIP) API – The CIP provides programmatic access to interpreted data for research purposes.

Order and results application – An order and request system (for example ICE) is a software application used to manage requests and orders for various patient-related tasks made by various entities such as GP surgeries and NHS trusts. In trusts, these systems are used to manage test requests/orders sent to laboratories, the testing process, and distribution of the test result to the requestor. They link GP practices directly to test laboratories electronically.

Laboratory information management system (LIMS) – LIMS is software used in laboratories and hospitals for the effective management of requests/orders, samples and reports. Its core functions are the:

  • reception and log in of a sample and its associated customer data
  • assignment, scheduling, and tracking of the sample and the associated analytical workload
  • processing and quality control associated with the sample and the utilized equipment and inventory
  • storage of data associated with the sample analysis
  • inspection, approval, and compilation of the sample data for reporting and/or further analysis

LIMS could either be genomics specific, or can be shared with pathology services.


Genomics diagnostics component breakdown

Below is a breakdown of various properties of the components identified in the genomics diagnostics pillar:

LIMS

Basic functionality - Used to generate a request for a genomics test for a patient

Software/hardware - Software

Typical locations - Trusts

Data type - Personal, confidential and sensitive

Storage - Databases

Underlying operating system (OS) - Mostly Windows

Authentication - Local accounts or role-based access control (RBAC)

Active Directory integrated - Yes

Network information - Historically static IP addresses

Communications protocol - HTTPS

TOMS

Basic functionality - Used to create order forms for pathology test requests

Software/hardware - Order software

Typical locations - GEL

Data type - Personal, confidential and sensitive

Storage - Databases

Underlying operating system (OS) - Mostly Windows

Authentication - Local accounts or role-based access control (RBAC)

Active Directory integrated - Yes

Network information - Historically static IP addresses

Communications protocol - HTTPS

DNA extraction robots

Basic functionality - Extraction of DNA samples

Software/hardware - Hardware

Typical locations - GLH

Data type - Raw data

Storage - Temporary embedded storage

Underlying operating system (OS) - Mostly Windows or embedded

Authentication - Local accounts or role-based access control (RBAC)

Active Directory integrated - No

Network information - Historically static IP addresses

Communications protocol - HL7 or proprietary, for example POCT1-A2

DNA fluid handling robots

Basic functionality - Handles fluids

Software/hardware - Hardware

Typical locations - GLH

Data type - Raw data

Storage - Temporary embedded storage

Underlying operating system (OS) - Mostly Windows or embedded

Authentication - Local accounts or role-based access control (RBAC)

Active Directory integrated - No

Network information - Historically static IP addresses

Communications protocol - HL7

DNA sequencing devices

Basic functionality - Sequencing DNA samples

Software/hardware - Hardware

Typical locations - 3rd party service provider and GLH

Data type - Personal confidential data stored for a temporary period only

Storage - Embedded storage and/or databases

Underlying operating system (OS) - Mostly Windows or embedded

Authentication - Local accounts or role-based access control (RBAC)

Active Directory integrated - Yes

Network information - Historically static IP addresses

Communications protocol - HL7

Analysis tools

Basic functionality - Analyse DNA/RNA samples

Software/hardware - Hardware

Typical locations - Various

Data type - Personal confidential data stored for a temporary period only

Storage - Embedded storage and/or databases

Interpretation portal

Basic functionality - Connects interfaces

Software/hardware - Software

Typical locations - GEL

Storage - Databases

Clinical Variant Ark (CVA) variant store

Basic functionality - Stores de-identified data

Software/hardware - Software

Typical locations - GEL only, but other variant stores are held in various locations

Data type - De-identified personal confidential data

Storage - Direct-attached storage (DAS), network-attached storage (NAS), storage area network (SAN) or cloud storage

Underlying operating system (OS) - Mostly SQL or Oracle database

Authentication - Unique user account and additional multi-factor authentication (MFA)

Active Directory integrated - Yes

Network information - Historically static IP addresses

Communications protocol - HL7

Sequencing data store

Basic functionality - Stores identifiable data

Software/hardware - Software

Typical locations - GEL and GLH

Data type - Personal confidential and sensitive data

Storage - Databases and archives

Underlying operating system (OS) - Mostly Windows or embedded

Authentication - role-based access control (RBAC)

Active Directory integrated - Yes

Network information - Historically static IP addresses

Communications protocol - HL7

CIP-API

Basic functionality - Provides access to interpreted data

Software/hardware - Software

Typical locations - GEL

Data type - Personal confidential and sensitive data

Storage - Databases and archives

Underlying operating system (OS) - Mostly Windows or embedded

Authentication - role-based access control (RBAC)

Active Directory integrated - Yes

Network information - Historically static IP addresses

Communications protocol - HL7


Genomics diagnostic traffic flow

In the NHS, genomics testing is split into 2 workflows:

  • Whole genome sequencing (WGS) – the WGS workflow includes testing performed by Genomics England Limited (GEL).
  • Non-whole genome sequencing (NWGS) – the NWGS workflow is limited to testing performed within the NHS and the genomics laboratory hubs only, and doesn’t extend to GEL. 

The steps in these workflows are:

  1. Order request – Typically, an order for genome testing can be generated from different sources (for example a GP system or hospital LIMS) and sent to one of many NHS genome laboratory hubs where the request is transcribed and sent to the test order management system (TOMS) for processing.
  2. Sample preparation and transportation – A DNA sample is collected from the patient and undergoes DNA extraction. The sample is then sent to the laboratory for testing with the appropriate shipping manifest documentation, ensuring that the container and request form are labelled with the patient’s name, date of birth, unit number, and date and time of sample, and that adequate clinical information is provided on the form. Samples go to the external lab by post or courier.
  3. Sample library – The DNA sample is processed, transferred onto plates and quality checked.
  4. Sequencing – The DNA sample undergoes sequencing, producing the FASTQ raw sequencing data file (a text-based format for storing a nucleotide sequence and its corresponding quality scores) and 2 subsequent sequencing data files. These are the BAM file, which contains an individual’s genome, and the VCF file, which contains the DNA sequence changes identified when that genome is compared with a known reference data set.
  5. Analysis – Once sequencing data is received, a pipeline is used to identify key information via a set of algorithms, resulting in different variants; the information is then sent to a scientist for analysis.
  6. Reporting – After analysis, the scientists produce a report, which is usually reviewed by another scientist before being passed back to the genome testing requestor.
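
Step 4 describes FASTQ as a text-based format holding a nucleotide sequence and its quality scores. As a rough illustration, the sketch below parses a single four-line FASTQ record. It assumes the common Phred+33 quality encoding, which the guidance itself does not specify; the function name and the example read are made up.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (read_id, sequence, scores)."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Assumption: Phred+33 encoding, so score = ASCII code - 33.
    scores = [ord(character) - 33 for character in quality]
    return header[1:], sequence, scores


# Illustrative record only, not real sequencing output.
record = ["@read1\n", "GATTACA\n", "+\n", "IIIIHH#\n"]
read_id, sequence, scores = parse_fastq_record(record)
print(read_id, sequence, scores)
```

Each base has exactly one quality score, which is why the sketch rejects records where the two lines differ in length.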

Whole genomics sequencing flow

Genomics Whole Genome Sequence (WGS) process

Figure 2. A sample topology diagram showing the end-to-end process flow for WGS for the genomics diagnostics pillar.

Image description

Whole genome sequence process flow:

1A. Typically, an order for genome testing, whether whole genome sequencing (WGS) or non-whole genome sequencing (NWGS), can be generated from different sources, for example a general practice system or a hospital laboratory information management system (LIMS).

1B. The order request is sent to one of many NHS genome laboratory hubs (GLHs) for processing.

1C. For WGS, the request is transcribed and sent to the GLH test order management system (TOMS) for processing.

2A. Deoxyribonucleic acid (DNA) sample is collected from the patient, undergoes DNA extraction and sample is sent to the external lab for testing by post/courier.

2B. DNA sample is processed, transferred onto plates and quality checked. Plating is specific to WGS; GLHs provide the plates, and plating is done by plating robots.

2C. Sample is sent to a 3rd party provider, where it undergoes DNA/ribonucleic acid (RNA) sequencing, which determines the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time. The outputs of sequencing are:

  1. Raw sequencing data file
  2. Subsequent sequencing data files - namely Binary Alignment Map (BAM) and Variant Call Format (VCF) files.

2D. BAM and VCF files are transferred to Genomics England Limited (GEL) sequencing data storage repositories.

2E. Once sequencing data is received, a pipeline is used to identify key information via a set of algorithms, resulting in different variants; the information is then sent to a scientist for analysis.

3A. After analysis, the scientists produce a report, which is usually reviewed by another scientist before being passed back to the genome testing requestor (GP office or hospital trust).

3B. Clinical Variant Ark (CVA) Variant Store and knowledge base – This contains de-identified reference data only and acts as a knowledge base for information on known genome variants.

3C. The clinical interpretation portal (CIP) provides programmatic access to interpreted data for research purposes.

3D. Interpretation portal (test outcome analysis) – The interpretation portal is a web portal hosted in GEL that provides access to test data for various authorised personnel.


Asset inventory of genomics diagnostics components in an NHS organisation (sample)

Genomics device category | Device name | Vendor | IP address | Underlying operating system (OS) | MAC address or manufacturer OUI | VLAN | Location
Sequencing platform | NovaSeq X | 3rd party service provider | 192.168.10.1 | Linux CentOS | | 10 | Not located within a trust
Microarray scanners | Micro Array Sequencing | 3rd party service provider | 192.168.10.2 | Linux CentOS | | 10 | 3rd party service provider laboratories
In-vitro devices | NextSeq 550 | 3rd party service provider | 192.168.10.3 | Linux CentOS | | 10 | 3rd party service provider laboratories

Table 2: Sample asset inventory list in a GEL laboratory.

OUI - organisationally unique identifier

VLAN - virtual local area network

The information used in the above table (device names, vendors and IP addresses, for example) was created for illustration purposes only; bigger labs will have a significantly larger number of devices. Non-WGS includes 20+ different technologies, all with slight variations in their process flows.
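
An inventory like the sample above is easier to check when held as structured records. The sketch below is one possible representation, using the illustrative rows from Table 2; the `Asset` class and the helper functions are hypothetical names, and the MAC/OUI field is left empty because the sample table gives no values for it.

```python
import ipaddress
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    category: str
    device_name: str
    vendor: str
    ip_address: str
    operating_system: str
    vlan: int
    location: str
    mac_or_oui: Optional[str] = None  # blank in the sample table


# Rows copied from the illustrative sample inventory (Table 2).
INVENTORY = [
    Asset("Sequencing platform", "NovaSeq X", "3rd party service provider",
          "192.168.10.1", "Linux CentOS", 10, "Not located within a trust"),
    Asset("Microarray scanners", "Micro Array Sequencing", "3rd party service provider",
          "192.168.10.2", "Linux CentOS", 10, "3rd party service provider laboratories"),
    Asset("In-vitro devices", "NextSeq 550", "3rd party service provider",
          "192.168.10.3", "Linux CentOS", 10, "3rd party service provider laboratories"),
]


def devices_by_vlan(assets):
    """Group device names by VLAN, rejecting malformed IP addresses."""
    grouped = defaultdict(list)
    for asset in assets:
        ipaddress.ip_address(asset.ip_address)  # raises ValueError if malformed
        grouped[asset.vlan].append(asset.device_name)
    return dict(grouped)


def missing_mac(assets):
    """List devices whose MAC address or OUI has not been recorded."""
    return [asset.device_name for asset in assets if not asset.mac_or_oui]


print(devices_by_vlan(INVENTORY))
print(missing_mac(INVENTORY))
```

Checks of this kind help spot gaps, such as unrecorded MAC addresses, before the inventory is used to plan network segmentation.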


Genomics connected medical device (CMD) data flow

Connectivity | TOMS | Integration portal | Sequencing data | CVA variant store | 3rd party service provider sequencing | CIP-API | Secondary care
1 TOMS | | HTTPS | | | | |
2 Interpretation portal | HTTPS | | | | | HTTPS | HTTPS
3 Sequencing data store | | | | | HTSGET | |
4 CVA variant store | | | HTSGET | HTSGET | HTSGET | |
5 3rd party service provider sequencing | | | HTSGET | | | |
6 CIP-API | | HTTPS | | | | |
7 Analysis | | | | | | | SMTP

Table 3: Sample genomics components communication information.
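
One way to use communication information like Table 3 is as a default-deny allow-list when reviewing segmentation rules. The sketch below is an assumption-laden illustration: the source/destination pairs are one reading of the sample table, with row and column labels copied as printed, and should be checked against the original before any real use. `ALLOWED_FLOWS` and `is_allowed` are hypothetical names.

```python
# (source, destination) -> protocol, as read from the sample table.
# Assumption: rows are sources and columns are destinations.
ALLOWED_FLOWS = {
    ("TOMS", "Integration portal"): "HTTPS",
    ("Interpretation portal", "TOMS"): "HTTPS",
    ("Interpretation portal", "CIP-API"): "HTTPS",
    ("Interpretation portal", "Secondary care"): "HTTPS",
    ("Sequencing data store", "3rd party service provider sequencing"): "HTSGET",
    ("CVA variant store", "Sequencing data"): "HTSGET",
    ("CVA variant store", "CVA variant store"): "HTSGET",
    ("CVA variant store", "3rd party service provider sequencing"): "HTSGET",
    ("3rd party service provider sequencing", "Sequencing data"): "HTSGET",
    ("CIP-API", "Integration portal"): "HTTPS",
    ("Analysis", "Secondary care"): "SMTP",
}


def is_allowed(source: str, destination: str, protocol: str) -> bool:
    """Default deny: a flow is allowed only if it is listed with that protocol."""
    return ALLOWED_FLOWS.get((source, destination)) == protocol


print(is_allowed("Analysis", "Secondary care", "SMTP"))
print(is_allowed("Analysis", "TOMS", "HTTPS"))
```

Anything not listed is rejected, so new flows have to be added deliberately rather than being permitted by omission.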


Sample logical grouping of genomics CMD

Logical group | Applicable criteria | Assets
Test order and request | Requesting entity system | GP order and request systems
Order and reporting | Test order form for genomics testing | Test order management system
Sequencing data | Stores BAM or VCF genome sequencing files (identified) | GEL sequencing data store
Variant data | Stores variant data information | CVA variant data store
DNA extraction | CMD used for DNA extraction from blood samples | DNA plating robots and devices
DNA plating | CMD used for DNA plating from blood samples | DNA plating robots and devices

Table 4: Sample logical grouping of genomics components.
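
The logical groups can also be kept as a simple lookup so that each asset can be traced to the group or groups it belongs to. This is a minimal sketch using the sample entries from Table 4 as printed; `LOGICAL_GROUPS` and `groups_for` are hypothetical names.

```python
# Logical group -> assets, copied from the sample grouping (Table 4).
LOGICAL_GROUPS = {
    "Test order and request": ["GP order and request systems"],
    "Order and reporting": ["Test order management system"],
    "Sequencing data": ["GEL sequencing data store"],
    "Variant data": ["CVA variant data store"],
    "DNA extraction": ["DNA plating robots and devices"],
    "DNA plating": ["DNA plating robots and devices"],
}


def groups_for(asset: str) -> list[str]:
    """Return every logical group that lists the given asset."""
    return [group for group, assets in LOGICAL_GROUPS.items() if asset in assets]


print(groups_for("DNA plating robots and devices"))
print(groups_for("Unlisted device"))
```

An empty result flags an asset that has not yet been assigned to a group and so needs a segmentation decision.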


Last edited: 24 October 2023 5:07 pm