Part of SNOMED CT PaLM Mapping Best Practice

Mapping overview

This section provides an overview of the principles, workflow, and strategies that support the terminology mapping process.

Mapping principles and workflow

These overarching principles, tasks, and processes are derived from the pathology and laboratory medicine terminology mapping requirement, and are understood to apply in the wider terminology mapping context.

Terminology mapping process

1. Pre-processing source and target terminology

This includes:

data cleansing
computational rules to enhance the terminology - using large language models (LLMs)
parsing terminology strings into component terms
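
As an illustration only, the cleansing and parsing steps above could be sketched as follows. The function names and the choice of separators are assumptions made for this example, not part of the PaLM tooling.

```python
import re


def cleanse(term: str) -> str:
    """Trim, lower-case, and collapse runs of whitespace in a term."""
    return re.sub(r"\s+", " ", term.strip().lower())


def parse_components(term: str) -> list[str]:
    """Split a cleansed term into component terms.

    Assumption: commas, semicolons, colons, brackets and a spaced hyphen
    are treated as separators. Real source data may need other rules.
    """
    parts = re.split(r"[,;:()\[\]]|\s-\s", cleanse(term))
    return [part.strip() for part in parts if part.strip()]


print(parse_components("  Creatinine  (Serum) "))  # ['creatinine', 'serum']
```

Keeping cleansing separate from parsing means the same cleansing rule can be applied to both source and target terminology before any matching takes place.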

2. Algorithmic terminology mapping

This includes:

deterministic mapping rules
semantic and lexical matching of component terms
using component terminology building blocks
using terminology in supplemental data elements
using referential terminology mappings
configurable term matching algorithms
configurable terminology translation rules
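
A minimal sketch of the first two items above, assuming a deterministic exact-match rule is tried first and a simple token-overlap (Jaccard) score is used for lexical matching. The function names, the threshold value, and the urea target term are hypothetical and chosen only for illustration.

```python
def normalise(term: str) -> str:
    """Lower-case a term and collapse whitespace."""
    return " ".join(term.lower().split())


def jaccard(a: set, b: set) -> float:
    """Token overlap between two sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_term(source: str, targets: list, threshold: float = 0.6):
    """Return (target, method). Ties are broken alphabetically so that
    the same input always yields the same output."""
    src = normalise(source)
    # Deterministic rule: exact match on the normalised string.
    for target in targets:
        if normalise(target) == src:
            return target, "exact"
    # Lexical matching: highest token overlap at or above the threshold.
    src_tokens = set(src.split())
    scored = sorted(
        ((jaccard(src_tokens, set(normalise(t).split())), t) for t in targets),
        key=lambda pair: (-pair[0], pair[1]),
    )
    if scored and scored[0][0] >= threshold:
        return scored[0][1], "lexical"
    return None, "unmapped"


targets = [
    "Substance concentration of creatinine in serum",
    "Substance concentration of urea in serum",  # illustrative target only
]
print(map_term("creatinine substance concentration serum", targets))
```

The threshold is exposed as a parameter to reflect the idea of a configurable term matching algorithm. A suitable value would need to be established through testing and expert review.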

3. Terminology mapping assurance and governance

This includes:

configurable peer review workflow
governance
audit trail
ownership of reviews and mappings
change management
version control
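
One possible shape for a mapping record that carries ownership, an audit trail, and a version number is sketched below. The class, its field names, and the rule that an author cannot review their own mapping are all assumptions made for this example. A real workflow would be configured to local governance requirements.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MappingRecord:
    source_code: str
    target_sctid: str
    author: str
    version: int = 1
    status: str = "proposed"
    audit_trail: list = field(default_factory=list)

    def __post_init__(self):
        self._log(self.author, "proposed")

    def _log(self, actor: str, action: str) -> None:
        """Append an entry to the audit trail. Entries are never edited."""
        self.audit_trail.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "version": self.version,
        })

    def review(self, reviewer: str, approve: bool) -> None:
        # Assumed rule: peer review requires a second person.
        if reviewer == self.author:
            raise ValueError("a mapping cannot be reviewed by its own author")
        self.status = "approved" if approve else "rejected"
        self._log(reviewer, self.status)

    def amend(self, editor: str, new_target: str) -> None:
        """Change the target, bump the version, and return to 'proposed'."""
        self.version += 1
        self.target_sctid = new_target
        self.author = editor
        self.status = "proposed"
        self._log(editor, "amended")
```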

4. Maintenance and implementation

This includes:

data loads
mapping updates/maintenance
architectural implications
implementation implications

The four main steps are underpinned by interface and tooling functionality that together enable a user-centred workflow.

This guide first covers the pathology and laboratory medicine terminology mapping requirement, then explains what makes a SNOMED CT PaLM reportable. The workflow is covered in the sections that follow.

Mapping strategies

Terminology mapping should be as deterministic as possible. That is to say, the same input will always produce the same output.

Probabilistic methods that integrate randomness and uncertainty into decision-making, such as those employed by large language models (LLMs), can still be useful in handling variation or missing data. For the pathology and laboratory medicine terminology mapping requirement, however, they are considered second tier.

Consequently, the three main strategies recommended are:

semantic and lexical mapping
using terminology component building blocks
using referential terminology mapping artefacts
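
The preference for deterministic strategies, with probabilistic methods held as a second tier, can be sketched as an ordered chain in which each result records the tier that produced it. Everything here is hypothetical: the lookup table, the function names, and in particular `stub_suggester`, which is a fixed stand-in for a probabilistic method so that the example stays reproducible.

```python
from typing import Callable, Optional, Sequence

Strategy = Callable[[str], Optional[str]]

# Illustrative lookup only. The SCTID is the creatinine example used in this guide.
LOOKUP = {"serum creatinine": "1107001000000108"}


def exact_lookup(term: str) -> Optional[str]:
    """First tier: deterministic lookup on the normalised term."""
    return LOOKUP.get(" ".join(term.lower().split()))


def stub_suggester(term: str) -> Optional[str]:
    """Second tier: placeholder for a probabilistic suggestion."""
    return "1107001000000108" if "creat" in term.lower() else None


def map_with_tiers(term: str,
                   first_tier: Sequence[Strategy],
                   second_tier: Sequence[Strategy]) -> dict:
    """Try every first-tier strategy before any second-tier one."""
    for tier, strategies in ((1, first_tier), (2, second_tier)):
        for strategy in strategies:
            target = strategy(term)
            if target is not None:
                return {"source": term, "target": target,
                        "tier": tier, "strategy": strategy.__name__}
    return {"source": term, "target": None, "tier": None, "strategy": None}


print(map_with_tiers("S-Creat", [exact_lookup], [stub_suggester])["tier"])  # 2
```

Recording the tier and strategy alongside each output lets an expert reviewer see which suggestions came from a deterministic rule and which did not.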

A basic overview of these strategies is provided later in section 7 - Algorithmic mapping, together with practical examples of associated techniques that support the pathology and laboratory medicine terminology mapping requirement.

During testing, the PaLM Mapping Project team demonstrated that a combination of strategies optimised the automation of terminology mapping outputs for expert review and assurance.

The PaLM Mapping Project team collaborated with NHS England’s Data Science team to ensure that the terminology pre-processing steps and deterministic mapping techniques were those best suited to the pathology and laboratory medicine terminology mapping requirement. The Data Science team’s paper, ‘Advanced Analysis Components to Support SNOMED PaLM Mapping Project’, outlines the role of advanced analytics components in supporting mapping and provides considerations relevant to the project. This document references the paper heavily, and the paper provides further detail on each mapping strategy.

Using terminology component building blocks

At this point, it is useful to provide a basic explanation of how using terminology component building blocks can support mapping, as this strategy is integral to both the pre-processing of source and target terminology, and to algorithmic terminology mapping.

This strategy involves using the structure of the data and applying translation rules to facilitate mapping.

As described in section 5 - What makes a SNOMED CT PaLM reportable?, SNOMED CT PaLM reportables carry coded attributes relating to a lab test’s property, component, specimen, and technique. These can be viewed as terminology component ‘building blocks’. Equivalent local source data exists that represents each building block.

This example shows the source data relating to a lab’s local ‘serum creatinine’ reportable.

	Hospital unit of measure (UoM)	Hospital local reportable code	Hospital specimen code	Hospital technique code
Hospital data	umol/l	creatinine	serum	n/a
Lab terminology building blocks	property	component	specimen	technique

Note - umol/l defines the property as a 'substance concentration' - see Using configurable terminology translation algorithms
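
The translation from a unit of measure to a property can be sketched as a simple lookup rule. Only the umol/l entry comes from the note above. The other entries, and the function names, are illustrative additions, and a real rule set would be configured and assured by experts.

```python
# Hypothetical translation table: unit of measure -> property building block.
UOM_TO_PROPERTY = {
    "umol/l": "substance concentration",  # from the note above
    "mmol/l": "substance concentration",  # illustrative addition
    "g/l": "mass concentration",          # illustrative addition
}


def normalise_uom(uom: str) -> str:
    """Lower-case, remove spaces, and fold the micro sign to 'u'."""
    cleaned = uom.strip().lower().replace(" ", "")
    # Both the micro sign (U+00B5) and Greek mu (U+03BC) occur in practice.
    return cleaned.replace("\u00b5", "u").replace("\u03bc", "u")


def property_for(uom: str):
    """Return the property for a unit, or None if no rule applies."""
    return UOM_TO_PROPERTY.get(normalise_uom(uom))


print(property_for(" umol/L "))  # substance concentration
```

Returning `None` for an unrecognised unit, rather than guessing, keeps the rule deterministic and leaves the gap visible for review.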

The equivalent SNOMED CT PaLM reportable is shown below:

Creatinine - SNOMED CT PaLM reportable

Image description

Image is taken from NHS England's SNOMED CT browser.

The interface displays a SNOMED concept with its human-readable descriptions on the left and attribute-value pairs on the right.

Human readable description:

Substance concentration of creatinine in serum (observable entity).

SCTID: 1107001000000108

1107001000000108 | Substance concentration of creatinine in serum (observable entity) |

Creatinine substance concentration in serum

Substance concentration of creatinine in serum (observable entity)

Creatinine molar concentration in serum

Attribute-value pairs:

Component → Creatinine

Inheres in → Serum

Direct site → Serum specimen

Property → Substance concentration (property)

Applying techniques that use these data elements at both the pre-processing and algorithmic mapping stages greatly enhances the mapping output.
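
To make the building block idea concrete, the sketch below compares the local serum creatinine data with the attribute-value pairs of the SNOMED CT PaLM reportable shown above. The dictionary layout, the `Technique` key, and the normalisation rules (dropping a trailing bracketed tag and a trailing 'specimen') are assumptions made for this example.

```python
import re


def norm(value):
    """Lower-case a value and drop a trailing bracketed tag such as '(property)'."""
    if value is None:
        return None
    return re.sub(r"\s*\([^)]*\)\s*$", "", value.strip().lower())


def blocks_match(source: dict, target: dict) -> bool:
    """Compare source building blocks with target attribute-value pairs."""
    specimen = norm(target.get("Direct site"))
    if specimen and specimen.endswith(" specimen"):
        specimen = specimen[: -len(" specimen")]
    return (
        norm(source.get("property")) == norm(target.get("Property"))
        and norm(source.get("component")) == norm(target.get("Component"))
        and norm(source.get("specimen")) == specimen
        # Absent on both sides counts as a match (technique is n/a here).
        and norm(source.get("technique")) == norm(target.get("Technique"))
    )


# Local data, with the property already derived from the unit umol/l.
source = {"property": "substance concentration", "component": "creatinine",
          "specimen": "serum", "technique": None}

# Attribute-value pairs for SCTID 1107001000000108.
target = {"Component": "Creatinine", "Inheres in": "Serum",
          "Direct site": "Serum specimen",
          "Property": "Substance concentration (property)"}

print(blocks_match(source, target))  # True
```

Matching on each building block separately, rather than on the whole description string, means a difference in word order between local and target terms does not prevent a match.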

Last edited: 22 May 2025 3:26 pm
