
Part of SNOMED CT PaLM Mapping Best Practice

Mapping overview

This section provides an overview of the principles, workflow, and strategies that support the terminology mapping process.


Mapping principles and workflow

These overarching principles, tasks, and processes are derived from the pathology and laboratory medicine terminology mapping requirement, and are understood to apply to terminology mapping more widely.

Terminology mapping process

1. Pre-processing source and target terminology

This includes:

  • data cleansing
  • computational rules to enhance the terminology - using large language models (LLMs)
  • parsing terminology strings into component terms
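
As an illustration of the data cleansing and parsing steps above, the following is a minimal sketch rather than the project's actual tooling. The function names and the small `SPECIMEN_TERMS` vocabulary are assumptions made for the example; a real vocabulary would be curated by terminologists.

```python
import re

# Hypothetical specimen vocabulary, for illustration only.
SPECIMEN_TERMS = {"serum", "plasma", "urine", "blood"}


def cleanse(raw: str) -> str:
    """Lower-case the string, replace punctuation with spaces, collapse whitespace."""
    text = raw.strip().lower()
    text = re.sub(r"[^a-z0-9/%\s]", " ", text)  # keep '/' and '%' for units
    return re.sub(r"\s+", " ", text).strip()


def parse_components(raw: str) -> dict:
    """Split a local terminology string into component and specimen terms."""
    tokens = cleanse(raw).split()
    specimen = [t for t in tokens if t in SPECIMEN_TERMS]
    component = [t for t in tokens if t not in SPECIMEN_TERMS]
    return {
        "component": " ".join(component) or None,
        "specimen": specimen[0] if specimen else None,
    }


parsed = parse_components("  Serum-CREATININE ")
```

Here `parsed` holds `creatinine` as the component and `serum` as the specimen, which can then be matched separately at the algorithmic mapping stage.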

2. Algorithmic terminology mapping

This includes:

  • deterministic mapping rules
  • semantic and lexical matching of component terms
  • using component terminology building blocks
  • using terminology in supplemental data elements
  • using referential terminology mappings
  • configurable term matching algorithms
  • configurable terminology translation rules
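
One way the deterministic rules and lexical matching listed above could fit together is sketched below. It is an assumption-laden example, not the project's algorithm: the rule table, the target strings, the Jaccard token-overlap score, and the `threshold` parameter are all chosen for illustration.

```python
def tokens(term: str) -> frozenset:
    """Split a term into a set of lower-cased word tokens."""
    return frozenset(term.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token overlap between two terms: shared tokens / all tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def map_term(source: str, rules: dict, targets: list, threshold: float = 0.5):
    """Apply deterministic rules first, then fall back to lexical matching."""
    key = source.strip().lower()
    if key in rules:
        return rules[key], "rule"
    # Sort by descending score, then alphabetically, so ties resolve the same way every run.
    scored = sorted(((jaccard(key, t), t) for t in targets), key=lambda p: (-p[0], p[1]))
    if scored and scored[0][0] >= threshold:
        return scored[0][1], "lexical"
    return None, "unmapped"


# Hypothetical rule table and target terms.
RULES = {"creat": "creatinine in serum"}
TARGETS = ["creatinine in serum", "creatinine in urine", "sodium in serum"]

by_rule = map_term("Creat", RULES, TARGETS)
by_lexical = map_term("serum creatinine", {}, TARGETS)
unmapped = map_term("potassium", {}, TARGETS)
```

Because the threshold and rule table are plain parameters, the matching behaviour stays configurable, and the fixed tie-breaking order means the same input yields the same output on every run.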

3. Terminology mapping assurance and governance

This includes:

  • configurable peer review workflow
  • governance
  • audit trail
  • ownership of reviews and mappings
  • change management
  • version control
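
To make the audit trail, ownership, and version control items above more concrete, here is a minimal sketch of a mapping record. The class names, status values, and the rule that a reviewer must differ from the owner are assumptions for illustration; `CREA` is a hypothetical local code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    actor: str
    action: str  # for example "approved", "rejected", "revised"
    timestamp: datetime


@dataclass
class MappingRecord:
    source_code: str
    target_sctid: str
    owner: str
    version: int = 1
    status: str = "proposed"
    history: list = field(default_factory=list)  # append-only audit trail

    def _log(self, actor: str, action: str) -> None:
        self.history.append(AuditEvent(actor, action, datetime.now(timezone.utc)))

    def review(self, reviewer: str, approve: bool) -> None:
        """Record a peer review decision by someone other than the owner."""
        if reviewer == self.owner:
            raise ValueError("peer review requires a reviewer other than the owner")
        self.status = "approved" if approve else "rejected"
        self._log(reviewer, self.status)

    def revise(self, actor: str, new_target_sctid: str) -> None:
        """Change the target, bump the version, and send the record back for review."""
        self.target_sctid = new_target_sctid
        self.version += 1
        self.status = "proposed"
        self._log(actor, "revised")


record = MappingRecord("CREA", "1107001000000108", owner="alice")
record.review("bob", approve=True)
```

Each change appends an event rather than overwriting earlier state, so the record carries its own history of who did what and when.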

4. Maintenance and implementation

This includes:

  • data loads
  • mapping updates/maintenance
  • architectural implications
  • implementation implications

The four main steps are underpinned by interface and tooling functionality that together enable a user-centred workflow. 

This guide first covers the pathology and laboratory medicine terminology mapping requirement, then explains what makes a SNOMED CT PaLM reportable. The workflow is then covered in the following sections:

  1. Pre-processing source and target terminology.
  2. Algorithmic terminology mapping.
  3. Terminology mapping assurance and governance.
  4. Terminology mapping maintenance and implementation.

Mapping strategies

Terminology mapping should be as deterministic as possible. That is to say, the same input should always produce the same output.

Probabilistic methods that integrate randomness and uncertainty into decision-making, such as those employed by large language models (LLMs), can still be useful in handling variation or missing data. For the pathology and laboratory medicine terminology mapping requirement, however, they are considered second tier.
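
The tiering described here could be sketched as follows. This is an illustrative assumption, not the project's implementation: `map_with_tiers`, the strategy functions, and the local code `creat` are hypothetical, and the probabilistic step is a stand-in that returns a fixed placeholder.

```python
from typing import Callable, Optional, Sequence

Strategy = Callable[[str], Optional[str]]


def map_with_tiers(
    source: str,
    first_tier: Sequence[Strategy],
    second_tier: Optional[Strategy] = None,
) -> Optional[dict]:
    """Try deterministic strategies in order; use the second tier only if all fail."""
    for strategy in first_tier:
        target = strategy(source)
        if target is not None:
            return {"target": target, "tier": 1, "strategy": strategy.__name__}
    if second_tier is not None:
        target = second_tier(source)
        if target is not None:
            return {"target": target, "tier": 2, "strategy": second_tier.__name__}
    return None


def exact_lookup(source: str) -> Optional[str]:
    """Hypothetical deterministic rule: a fixed local-code lookup."""
    return {"creat": "1107001000000108"}.get(source.strip().lower())


def probabilistic_suggestion(source: str) -> Optional[str]:
    """Stand-in for an LLM or other probabilistic matcher (returns a placeholder)."""
    return "candidate-for-review"


first = map_with_tiers("CREAT", [exact_lookup], probabilistic_suggestion)
second = map_with_tiers("unknown test", [exact_lookup], probabilistic_suggestion)
```

Recording the tier and strategy alongside each result lets reviewers see which outputs came from deterministic rules and which came from a second-tier suggestion.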

Consequently, the three main strategies recommended are: 

  • semantic and lexical mapping 
  • using terminology component building blocks 
  • using referential terminology mapping artefacts 

A basic overview of these strategies is provided later in section 7 - Algorithmic mapping, together with practical examples of associated techniques that support the pathology and laboratory medicine terminology mapping requirement. 

During testing, the PaLM Mapping Project team demonstrated that a combination of strategies optimised the automation of terminology mapping outputs for expert review and assurance.

The PaLM Mapping Project team collaborated with NHS England’s Data Science team to ensure that the pre-processing steps and deterministic mapping techniques chosen were those best suited to the pathology and laboratory medicine terminology mapping requirement. The Advanced Analysis Components to Support SNOMED PaLM Mapping Project paper, produced by the Data Science team, outlines the role of advanced analytics components in supporting mapping and provides considerations relevant to the project. The paper is referenced heavily throughout this document and provides further detail on each mapping strategy.


Using terminology component building blocks

At this point, it is useful to provide a basic explanation of how using terminology component building blocks can support mapping, as this strategy is integral to both the pre-processing of source and target terminology, and to algorithmic terminology mapping. 

This strategy involves using the structure of the data and applying translation rules to facilitate mapping.

As described in section 5 - What makes a SNOMED CT PaLM reportable?, SNOMED CT PaLM reportables carry coded attributes relating to a lab test’s property, component, specimen, and technique. These can be viewed as terminology component ‘building blocks’. Equivalent local source data exists that represents each building block. 

This example shows the source data relating to a lab’s local ‘serum creatinine’ reportable.

|                                 | Hospital unit of measure (UoM) | Hospital local reportable code | Hospital specimen code | Hospital technique code |
| Hospital data                   | umol/l                         | creatinine                     | serum                  | n/a                     |
| Lab terminology building blocks | property                       | component                      | specimen               | technique               |

Note - umol/l defines the property as a 'substance concentration' - see Using configurable terminology translation algorithms
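
The source row above can be turned into building blocks with a small translation rule, sketched below. The field names and the `UOM_TO_PROPERTY` table are assumptions for illustration. Only the `umol/l` entry comes from this example; a real table would hold many more units.

```python
# Translation rule table: unit of measure -> property (single entry from the example above).
UOM_TO_PROPERTY = {"umol/l": "Substance concentration"}

# Values that mean "no technique recorded" in the hypothetical source extract.
EMPTY_VALUES = {"", "n/a"}


def to_building_blocks(row: dict) -> dict:
    """Translate one local source row into the four terminology building blocks."""
    uom = row["uom"].strip().lower()
    technique = row["technique"].strip().lower()
    return {
        "property": UOM_TO_PROPERTY.get(uom),  # None if no rule exists for the unit
        "component": row["reportable"].strip().lower(),
        "specimen": row["specimen"].strip().lower(),
        "technique": None if technique in EMPTY_VALUES else technique,
    }


hospital_row = {
    "uom": "umol/l",
    "reportable": "creatinine",
    "specimen": "serum",
    "technique": "n/a",
}
blocks = to_building_blocks(hospital_row)
```

A unit with no rule yields a `None` property rather than a guess, so gaps in the translation table stay visible for review.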

The equivalent SNOMED CT PaLM reportable is shown below:

Creatinine - SNOMED CT PaLM reportable

Image description

Image is taken from NHS England's SNOMED CT browser.

The interface displays a SNOMED concept with its human-readable descriptions on the left and attribute-value pairs on the right.

Human readable description:

Substance concentration of creatinine in serum (observable entity).

SCTID: 1107001000000108

1107001000000108 | Substance concentration of creatinine in serum (observable entity) |

Creatinine substance concentration in serum

Substance concentration of creatinine in serum (observable entity)

Creatinine molar concentration in serum

Attribute-value pairs:

Component → Creatinine

Inheres in → Serum

Direct site → Serum specimen

Property → Substance concentration (property)

By applying techniques at both the pre-processing and algorithmic mapping stage that use these data elements, the mapping output is greatly enhanced.
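
As a sketch of how these data elements might be used, the example below compares source building blocks with the attribute-value pairs of candidate target concepts. The record layout, the pairing of the specimen block with the "Inheres in" attribute, and the second (urine) concept with its placeholder identifier are assumptions for illustration.

```python
import re

# Assumed pairing of source building blocks with target attribute names.
# Technique is left out because the example concept carries no technique attribute.
BLOCK_TO_ATTRIBUTE = {
    "property": "Property",
    "component": "Component",
    "specimen": "Inheres in",
}


def norm(value: str) -> str:
    """Lower-case and drop a trailing semantic tag such as ' (property)'."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", value).strip().lower()


def matches(blocks: dict, concept: dict) -> bool:
    """True if every populated source block equals the concept's attribute value."""
    for block, attribute in BLOCK_TO_ATTRIBUTE.items():
        wanted = blocks.get(block)
        if wanted is None:
            continue  # nothing recorded in the source, so nothing to compare
        found = concept["attributes"].get(attribute)
        if found is None or norm(found) != norm(wanted):
            return False
    return True


def candidates(blocks: dict, concepts: list) -> list:
    """Return identifiers of the concepts whose attributes match the source blocks."""
    return [c["sctid"] for c in concepts if matches(blocks, c)]


CONCEPTS = [
    {
        "sctid": "1107001000000108",
        "attributes": {
            "Component": "Creatinine",
            "Inheres in": "Serum",
            "Direct site": "Serum specimen",
            "Property": "Substance concentration (property)",
        },
    },
    {
        "sctid": "hypothetical-urine-concept",  # placeholder, not a real SCTID
        "attributes": {
            "Component": "Creatinine",
            "Inheres in": "Urine",
            "Property": "Substance concentration (property)",
        },
    },
]

source_blocks = {
    "property": "Substance concentration",
    "component": "creatinine",
    "specimen": "serum",
    "technique": None,
}
result = candidates(source_blocks, CONCEPTS)
```

With these inputs only the serum concept is returned, because the specimen block rules out the urine variant even though component and property both match.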


Last edited: 22 May 2025 3:26 pm